Espnet ASR Demo & Quantization Document

老牛走世界 2022-04-03 49 views
linux

  • This document describes how to run the Espnet (v1) ASR demo and how to quantize its model
  • Test environment:

    Ubuntu   CUDA   GCC
    21.04    11.6   11.2

Installation

Note: Follow the original installation guide provided by Espnet. The notes below cover only the points that need extra attention.

Requirements

    sox         sndfile     ffmpeg          flac
    installed   installed   not installed   not installed
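
The status of the command-line tools above can be checked before starting. The following is a minimal sketch, not part of the Espnet guide: the helper name check_tool is hypothetical, and it only looks for executables on PATH, so it does not cover sndfile, which is a library rather than a command.

```shell
# Hypothetical helper: report whether a command-line tool is on PATH.
check_tool() {
    if command -v "$1" > /dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: not installed"
    fi
}

# Check the command-line requirements listed above.
for tool in sox ffmpeg flac; do
    check_tool "$tool"
done
```

Each line of output names one tool followed by its status, which can be compared against the table above.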

Install Kaldi

Follow the installation guide exactly.
Notes:

  • The Kaldi installation has two parts: 1. tools installation 2. src installation. Make sure to install both, in that order
  • Once installed, many compiled .o object files can be found in directories such as: <kaldi-root>/src/{featbin,fgmmbin,fstbin,etc.}

Install Espnet

Follow the installation guide exactly.
Notes:

  • Kaldi should be linked into <espnet>/tools (check guide)
  • Option A) Setup Anaconda environment is chosen in this document, so a virtual environment named espnet is created with python==3.8
  • The installed CUDA version is 11.6, which is not compatible with pytorch 1.10.1, so espnet is installed with $ make TH_VERSION=1.10.1 CUDA_VERSION=11.3, which specifies the pytorch and CUDA versions explicitly
  • Custom tools in [Optional] Custom tool installation are not installed
  • Install chainer in the espnet conda environment with pip install chainer==6.0.0 (cupy is not installed due to some errors)
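
The version pinning in the notes above can be wrapped in two small helpers. This is a sketch under assumptions: the function names espnet_make_cmd and show_torch_versions are hypothetical, the first only prints the make command from the notes rather than running it, and the second assumes pytorch is importable by the python of the active environment.

```shell
# Hypothetical helper: print the make command for a given
# pytorch version ($1) and CUDA version ($2) without running it.
espnet_make_cmd() {
    printf 'make TH_VERSION=%s CUDA_VERSION=%s\n' "$1" "$2"
}

# Hypothetical helper: print the pytorch version and the CUDA version
# it was built against. Run it inside the espnet conda environment.
show_torch_versions() {
    python -c 'import torch; print(torch.__version__, torch.version.cuda)'
}

# Versions used in this document.
espnet_make_cmd 1.10.1 11.3
```

After the build, calling show_torch_versions inside the espnet environment is one way to see which pytorch and CUDA versions were actually installed.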

Run ASR Demo

Notes: some

  1. Prepare the audio file
    e.g. the test.wav file in espnet/utils
    Put the .wav file in espnet/egs/tedlium2/asr1
  2. Perform decoding
    a. cd espnet/egs/tedlium2/asr1 and source ./path.sh
    b. recog_wav.sh --models <downloaded-model> test.wav
    Notes: By default the script downloads the model with the gdown package, which can fail with a timeout error if the network connection drops. In that case, download the model file, e.g. model.streaming.v1.tar.gz, manually from Google Drive (see the Espnet readme)
    Then modify the download_from_google_drive.sh file in the espnet/utils directory as follows:
    a. Create a variable manual_download_dir that holds the path of the downloaded model file, e.g. manual_download_dir="/home/glinttsd/espnet/egs/tedlium2/asr1/model.streaming.v1.tar.gz"
    b. Replace the code on lines 46-47 with
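        # Use the manually downloaded archive if it exists
        if [ -f "${manual_download_dir}" ]; then
            echo "Using locally downloaded file: ${manual_download_dir}"
            decompress "${manual_download_dir}" "${download_dir}"
        else
            # Otherwise download from Google Drive as before
            echo "File download from url: ${share_url}"
            gdown --id "${file_id}" -O "${tmp}"
            decompress "${tmp}" "${download_dir}"
        fi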
    
    This skips the download step and decompresses the local model file directly.
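
Before pointing manual_download_dir at a manually downloaded archive, it may help to confirm that the file is complete. A minimal sketch, assuming the model is a gzip-compressed tar archive as the .tar.gz name suggests; the function name is_valid_model_archive is hypothetical and not part of Espnet.

```shell
# Hypothetical helper: succeed only if $1 is an existing file whose
# contents can be listed as a gzip-compressed tar archive.
is_valid_model_archive() {
    [ -f "$1" ] && tar -tzf "$1" > /dev/null 2>&1
}

# Example usage (path is illustrative):
# is_valid_model_archive model.streaming.v1.tar.gz || echo "archive missing or corrupt"
```

A truncated or interrupted download normally fails this listing, so the problem shows up before decoding starts.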

Model Quantization

Espnet provides dynamic quantization through the pytorch API.

To enable dynamic quantization, add the following options to the espnet/utils/recog_wav.sh file at lines 248-249:

        --quantize-asr-model True \
        --quantize-dtype "qint8" \

Decoding can now be performed as described in the previous section.
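
To confirm that the two options were actually added before decoding, the script can be searched for them. This is a sketch only: has_quantize_flags is a hypothetical helper, and it checks only that both option names appear somewhere in the given file, not that they sit on the right lines.

```shell
# Hypothetical helper: succeed only if the file $1 mentions both
# quantization options described above.
has_quantize_flags() {
    grep -q -e '--quantize-asr-model' "$1" &&
        grep -q -e '--quantize-dtype' "$1"
}

# Example usage (run from the directory that contains espnet/):
# has_quantize_flags espnet/utils/recog_wav.sh && echo "quantization options present"
```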

More usage can be found here
