The speech age

Researchers at MIT have developed a new approach to training speech recognition systems that does not depend on transcriptions, as current systems do. Instead, their system analyses correspondences between images and spoken descriptions of those images, as captured in a large collection of audio recordings. The system then learns which acoustic features of the recordings correlate with which image characteristics. Traditionally, speech recognition systems, such as those that convert speech to text on smartphones, are produced by machine learning systems that go over many thousands of utterances and their transcriptions to learn a mapping between acoustic features and words. While this method works quite well, the professional-grade transcription it requires is costly and time-consuming to produce. For this reason, speech recognition is usually limited to a few…
