    From Andrew Ng's Facebook page:
    Our Deep Speech system for speech recognition attains 16.5% error on Switchboard (Hub5'00), outperforming previously published results. We also focus on realistic noisy environments (speech in a noisy crowd, car, etc.). In this regime Deep Speech significantly outperforms commercial systems. Key to this approach were (i) our scalable multi-GPU infrastructure for training an RNN, and (ii) using 7,000 hours of clean speech data, and using that to synthesize a massive 100,000 hours of data (by adding the clean data to different types of noise) to train the models. I think end-to-end deep learning is the future of speech. Paper here: http://arxiv.org/abs/1412.5567
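The "adding the clean data to different types of noise" step can be illustrated with a minimal sketch. This is not the Deep Speech pipeline itself: the function names `rms` and `mix_at_snr`, the use of a target signal-to-noise ratio to set the noise level, and the looping of short noise clips are all assumptions made for illustration. The post only says that clean speech is superimposed on noise.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Overlay `noise` on `clean` at a chosen signal-to-noise ratio.

    Hypothetical helper: scaling by a target SNR is an assumption,
    not something the post or this sketch attributes to the paper.
    """
    if not clean or not noise:
        return list(clean)
    # Loop the noise clip if it is shorter than the speech, then trim.
    repeats = len(clean) // len(noise) + 1
    noise = (list(noise) * repeats)[:len(clean)]
    noise_level = rms(noise)
    if noise_level == 0.0:
        return list(clean)
    # Pick the gain so that rms(clean) / rms(gain * noise) matches snr_db.
    target_noise_level = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_level / noise_level
    return [c + gain * n for c, n in zip(clean, noise)]
```

Under these assumptions, mixing each clean utterance with several noise clips at several SNR values multiplies the number of training examples, which is the general idea behind growing a few thousand hours of clean speech into a much larger noisy corpus.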
    • 2014-12-19 14:13:50Z