Semi-Supervised Training in Deep Learning Acoustic Model

We studied the semi-supervised training in a fully connected deep neural network (DNN), unfolded recurrent neural network (RNN), and long short-term memory recurrent neural network (LSTM-RNN) with respect to the transcription quality, the importance data sampling, and the training data amount. We found that DNN, unfolded RNN, and LSTM-RNN are increasingly more sensitive to labeling errors. For example, with the simulated erroneous training transcription at 5%, 10%, or 15% word error rate (WER) level, the semi-supervised DNN yields 2.37%, 4.84%, or 7.46% relative WER increase against the baseline model trained with the perfect transcription; in comparison, the corresponding WER increase is 2.53%, 4.89%, or 8.85% in an unfolded RNN and 4.47%, 9.38%, or 14.01% in an LSTM-RNN. We further found that the importance sampling has similar impact on all three…

Link to Full Article: Semi-Supervised Training in Deep Learning Acoustic Model

Pin It on Pinterest

Share This

Join Our Newsletter

Sign up to our mailing list to receive the latest news and updates about and the Informed.AI Network of AI related websites which includes Events.AI, Neurons.AI, Awards.AI, and Vocation.AI

You have Successfully Subscribed!