DeepSpeech

From The World according to Vissie

Setup

https://github.com/mozilla/DeepSpeech/releases/tag/v0.7.4

Acoustic models:

wget https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/deepspeech-0.7.4-models.pbmm
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/deepspeech-0.7.4-models.tflite

The model with the ".pbmm" extension is memory mapped and thus memory efficient and fast to load. The model with the ".tflite" extension is converted to use TFLite, has post-training quantization enabled, and is more suitable for resource constrained environments.

In addition, download the scorer, which takes the place of the language model and trie used in older releases:

wget https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/deepspeech-0.7.4-models.scorer

The release also includes example audio files:

wget https://github.com/mozilla/DeepSpeech/releases/download/v0.7.4/audio-0.7.4.tar.gz
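
The four downloads above all follow the same URL pattern, so they can be scripted. The following is a minimal Python sketch, not part of the DeepSpeech tooling: the helper names release_urls and fetch_missing are made up for illustration, and it assumes the release assets keep the names used in the wget commands above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Values taken from the wget commands above.
VERSION = "0.7.4"
BASE = "https://github.com/mozilla/DeepSpeech/releases/download"


def release_urls(version=VERSION):
    """Map each release file name to its download URL (hypothetical helper)."""
    names = [
        f"deepspeech-{version}-models.pbmm",
        f"deepspeech-{version}-models.tflite",
        f"deepspeech-{version}-models.scorer",
        f"audio-{version}.tar.gz",
    ]
    return {name: f"{BASE}/v{version}/{name}" for name in names}


def fetch_missing(dest=".", version=VERSION):
    """Download only the files not already present in dest.

    Returns the list of file names that were downloaded.
    """
    dest = Path(dest)
    fetched = []
    for name, url in release_urls(version).items():
        target = dest / name
        if not target.exists():
            urlretrieve(url, str(target))
            fetched.append(name)
    return fetched
```

Calling fetch_missing() in the working directory should leave the same files there as the wget commands, and skips anything that was already downloaded.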

Running deepspeech then failed with a Python import error:

ImportError: libf77blas.so.3: cannot open shared object file: No such file or directory

Installing the ATLAS library fixed it:

sudo apt-get install libatlas-base-dev

Running inference

Example audio files

deepspeech --model ./deepspeech-0.7.4-models.tflite --audio ./audio-0.7.4/audio/8455-210777-0068.wav
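
The same inference can be run from Python. This is a minimal sketch, assuming the deepspeech 0.7.x Python package (with a build that matches the model format) and numpy are installed. The function names read_wav and transcribe are made up for illustration. Unlike the deepspeech command, the sketch does not convert audio, so the file must already be 16-bit mono at the model's sample rate.

```python
import wave


def read_wav(path):
    """Return (sample_rate, channels, sample_width_bytes, raw_frames)."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate(),
            wav.getnchannels(),
            wav.getsampwidth(),
            wav.readframes(wav.getnframes()),
        )


def transcribe(model_path, wav_path, scorer_path=None):
    """Transcribe one WAV file (hypothetical helper, not part of DeepSpeech)."""
    # Imported here so read_wav() stays usable without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path:
        # The scorer is optional; it supplies the language model.
        model.enableExternalScorer(scorer_path)

    rate, channels, width, frames = read_wav(wav_path)
    if (rate, channels, width) != (model.sampleRate(), 1, 2):
        raise ValueError(
            f"expected {model.sampleRate()} Hz mono 16-bit audio, "
            f"got {rate} Hz, {channels} channel(s), {width * 8}-bit"
        )

    # Model.stt() expects the samples as a 16-bit integer array.
    audio = np.frombuffer(frames, dtype=np.int16)
    return model.stt(audio)
```

For example, transcribe("./deepspeech-0.7.4-models.tflite", "./audio-0.7.4/audio/8455-210777-0068.wav", "./deepspeech-0.7.4-models.scorer") would return the transcript as a string.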

Microphone VAD Streaming

Reference

https://github.com/mozilla/DeepSpeech-examples/blob/r0.7/mic_vad_streaming/README.rst

Setup

cd ./DeepSpeech-examples/mic_vad_streaming/
sudo apt install portaudio19-dev
sudo apt-get install python3-scipy
sudo apt-get install sox
sudo pip3 install -r ./requirements.txt 

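Download Examples

Clone the examples repository into the directory that holds the model files. This has to be done before the Setup step above, which changes into the cloned directory:

git clone https://github.com/mozilla/DeepSpeech-examples.git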

Run/test

cd ./DeepSpeech-examples/mic_vad_streaming/
python3 ./mic_vad_streaming.py -m ../../deepspeech-0.7.4-models.tflite -s ../../deepspeech-0.7.4-models.scorer -d 0 -r 48000