A Summary of Speech Recognition Research Methods, 2011-2016

Note: I searched Google Scholar with keywords such as "speech recognition" for relevant papers published internationally over the past five years, and very briefly summarize the models/methods used in a subset of them.

2011
Models/methods used:
1. The context-independent Deep Belief Network (DBN) / hidden Markov model (HMM) hybrid architecture
2. Deep Belief Networks (DBNs)
3. Context-Dependent Deep-Neural-Network HMMs (CD-DNN-HMMs)
Corresponding papers:
1. Dahl G E, Yu D, Deng L, et al. Large vocabulary continuous speech recognition with context-dependent DBN-HMMs[C]// IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011: 4688-4691.
2. Sainath T N, Kingsbury B, Ramabhadran B, et al. Making Deep Belief Networks effective for large vocabulary continuous speech recognition[C]// IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). IEEE, 2011: 30-35.
3. A) Seide F, Li G, Yu D. Conversational speech transcription using context-dependent deep neural networks[C]// INTERSPEECH 2011, Florence, Italy. 2011: 437-440.
3. B) Dahl G E, Yu D, Deng L, et al. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 30-42.

2012
Models/methods used:
1. Deep Neural Networks (DNNs)
2. Deep Belief Networks (DBNs)
3. Kernel deep convex networks
Corresponding papers:
1. Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
2. Mohamed A, Dahl G E, Hinton G. Acoustic modeling using Deep Belief Networks[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 14-22.
3. Deng L, Tur G, He X, et al. Use of kernel deep convex networks and end-to-end learning for spoken language understanding[C]// IEEE Spoken Language Technology Workshop (SLT). 2012: 210-215.

2013
Models/methods used:
1. Deep Recurrent Neural Networks (RNNs)
2. Deep Convolutional Neural Networks (CNNs)
3. Deep Bidirectional LSTM (DBLSTM) Recurrent Neural Networks
4. Deep Neural Networks (DNNs), with the logistic units replaced by rectified linear units
Corresponding papers:
1. Graves A, Mohamed A R, Hinton G. Speech recognition with deep recurrent neural networks[C]// IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2013: 6645-6649.
2. Sainath T N, Mohamed A R, Kingsbury B, et al. Deep convolutional neural networks for LVCSR[C]// ICASSP. 2013: 8614-8618.
3. Graves A, Jaitly N, Mohamed A R. Hybrid speech recognition with Deep Bidirectional LSTM[C]// IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). 2013: 273-278.
4. Zeiler M D, Ranzato M, Monga R, et al. On rectified linear units for speech processing[C]// ICASSP. 2013: 3517-3521.

2014
Models/methods used:
1. LSTM Recurrent Neural Networks (RNNs)
2. A well-optimized RNN training system
3. Attention-based RNNs
4. Combined time- and frequency-domain convolution in Convolutional Neural Networks (CNNs)
5. Deep Convolutional Neural Networks (CNNs)
6. Bi-directional recurrent DNNs
Corresponding papers:
1. Graves A, Jaitly N. Towards end-to-end speech recognition with recurrent neural networks[C]// International Conference on Machine Learning (ICML). 2014: 1764-1772.
2. Hannun A, Case C, Casper J, et al. Deep Speech: Scaling up end-to-end speech recognition[J]. arXiv preprint, 2014.
3. Chorowski J, Bahdanau D, Cho K, et al. End-to-end continuous speech recognition using attention-based recurrent NN: First results[J]. arXiv preprint, 2014.
4. Toth L. Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition[C]// ICASSP. 2014: 190-194.
5. Sainath T N, Kingsbury B, Saon G, et al. Deep convolutional neural networks for large-scale speech tasks[J]. Neural Networks, 2015, 64: 39-48.
6. Hannun A Y, Maas A L, Jurafsky D, et al. First-pass large vocabulary continuous speech recognition using bi-directional recurrent DNNs[J]. arXiv preprint, 2014.

2015
Models/methods used:
1. Sequence-to-sequence neural net models
2. Attention-based recurrent networks
3. Long Short-Term Memory (LSTM) recurrent neural networks (RNNs)
Corresponding papers:
1. Yao K, Zweig G. Sequence-to-sequence neural net models for grapheme-to-phoneme conversion[J]. arXiv preprint, 2015.
2. Chorowski J, Bahdanau D, Serdyuk D, et al. Attention-based models for speech recognition[C]// Advances in Neural Information Processing Systems (NIPS). 2015.
3. A) Rao K, Peng F, Sak H, et al. Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks[C]// IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015.
3. B) Sak H, Senior A, Rao K, et al. Fast and accurate recurrent neural network acoustic models for speech recognition[J]. arXiv preprint, 2015.

2016
Models/methods used:
1. Recurrent neural networks (RNNs)
2. Large-scale kernel methods
Corresponding papers:
1. Chan W, Jaitly N, Le Q V, et al. Listen, Attend and Spell[J]. arXiv preprint, 2015.
2. Lu Z, Guo D, Garakani A B, et al. A comparison between deep neural nets and kernel acoustic models for speech recognition[C]// IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2016.
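Several entries above name an architectural change rather than a new model; the 2013 DNN result, for example, comes from replacing the logistic (sigmoid) hidden units with rectified linear units (ReLUs), as in Zeiler et al. (2013). A minimal sketch of the two activation functions (plain Python, illustrative only):

```python
import math

def logistic(x):
    # Classic sigmoid unit: squashes input into (0, 1) and saturates
    # for large |x|, which shrinks gradients in deep acoustic models.
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Rectified linear unit: identity for positive input, zero otherwise;
    # non-saturating for x > 0, which eases training of deeper DNNs.
    return max(0.0, x)

# The substitution is literally this swap in each hidden layer;
# the rest of the DNN-HMM pipeline is unchanged.
print(logistic(0.0))          # 0.5
print(relu(-3.0), relu(2.5))  # 0.0 2.5
```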
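The attention-based recognizers cited for 2014-2016 (Chorowski et al.; Chan et al.) share one core computation: the decoder state scores every encoder frame, and a softmax over the scores gives mixing weights for a context vector. A minimal sketch of that step (plain Python; dot-product scoring and the names here are illustrative, the cited papers use learned scorers):

```python
import math

def attention_weights(decoder_state, encoder_frames):
    # Score each encoder frame against the current decoder state
    # (dot-product scoring used here for simplicity).
    scores = [sum(d * e for d, e in zip(decoder_state, frame))
              for frame in encoder_frames]
    # Softmax turns scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def context_vector(weights, encoder_frames):
    # Weighted sum of encoder frames: the "attended" summary the
    # decoder conditions on when emitting the next output symbol.
    dim = len(encoder_frames[0])
    return [sum(w * frame[i] for w, frame in zip(weights, encoder_frames))
            for i in range(dim)]

frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = attention_weights([2.0, 0.0], frames)
print(context_vector(w, frames))
```

Frames that align with the decoder state receive larger weights, so the context vector is dominated by the relevant stretch of audio.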