Week 8 + Week 9 (ASR - German)

1 minute read

Describe my work briefly

I am happy to share my progress for Week 8 and Week 9. This period was challenging: I spent it integrating all the modules into an end-to-end Automatic Speech Recognition (ASR) pipeline and resolving the issues that came up.

An Automatic Speech Recognition pipeline has four significant steps:

Pre-Processing
Feature Extraction
Language Model
Acoustic Model

The following is depicted in the image below:

To work with a German dataset, care has to be taken that the scripts are UTF-8 compatible. While most of the example scripts from Kaldi support UTF-8, several still assume ASCII. When I integrated the modules above, the code produced a lot of issues. Since the errors were logged only at a high level, they were difficult to debug. Also, Kaldi is moving to Python 3 over time, but many scripts still support only Python 2. This was challenging because I had implemented the pipeline in Python 3. Apart from these, there were other issues specific to the data.
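As a rough illustration of the UTF-8 point, here is a minimal sketch of the kind of check that can help before running a pipeline on German text. It is not part of Kaldi or of my actual scripts: the helper names `find_non_utf8_files` and `read_transcript` are made up for this example, and it assumes the transcripts are plain text files.

```python
from pathlib import Path


def find_non_utf8_files(paths):
    """Return the paths whose bytes do not decode as UTF-8 (hypothetical helper)."""
    bad = []
    for path in paths:
        try:
            # Decoding the raw bytes fails on e.g. Latin-1 encoded umlauts.
            Path(path).read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(str(path))
    return bad


def read_transcript(path):
    """Read a transcript with an explicit encoding (hypothetical helper).

    Passing encoding="utf-8" avoids depending on the locale default,
    which is a common source of errors with characters such as ä, ö, ü, ß.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]
```

Running `find_non_utf8_files` over the data directory first makes encoding problems show up as a list of file names rather than as a high-level failure somewhere deep in the pipeline.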

I would like to thank the open community of developers and the Kaldi help group for their guidance.

Now I have an end-to-end German ASR pipeline. Next week I will keep the model training. It is also the second evaluation week. Keeping my fingers crossed.

I will keep you posted about my progress!


Aashish Agarwal
