Toward a better understanding of deep neural network based acoustic modelling: An empirical investigation
X Wang, L Wang, J Chen, L Wu
Proceedings of the AAAI Conference on Artificial Intelligence, 2016
Abstract
Recently, deep neural networks (DNNs) have outperformed traditional acoustic models on a variety of speech recognition benchmarks. Although a tremendous breadth and depth of related work has been established, system differences across research groups still make it hard to assess, from the literature alone, how much a particular architectural variant improves performance when building DNN acoustic models. Our work aims to uncover which variations among baseline systems are most relevant for automatic speech recognition (ASR) performance via a series of systematic tests on the limits of the major architectural choices. By holding all other components fixed, we are able to explore the design and training decisions without being confounded by other influencing factors. Our experimental results suggest that a relatively simple DNN architecture and optimization technique produces strong results. These findings, along with previous work, not only help build a better understanding of why DNN acoustic models perform well or how they might be improved, but also help establish a set of best practices for new speech corpora and language understanding task variants.
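The controlled design described in the abstract (vary one choice while holding every other component fixed) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the `Config` fields, the baseline values, and the `SWEEP` alternatives are assumptions, not the factors or settings used in the paper.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Config:
    # Hypothetical baseline values, not taken from the paper.
    hidden_layers: int = 4
    hidden_units: int = 1024
    optimizer: str = "sgd"


# Hypothetical alternatives to try for each factor.
SWEEP = {
    "hidden_layers": [2, 6],
    "hidden_units": [512, 2048],
    "optimizer": ["momentum"],
}


def one_factor_variants(baseline, sweep):
    """Yield (factor, value, config) where config differs from baseline in one field."""
    for factor, values in sweep.items():
        for value in values:
            if value == getattr(baseline, factor):
                continue  # identical to the baseline, nothing to compare
            yield factor, value, replace(baseline, **{factor: value})


def changed_fields(baseline, variant):
    """Return the names of the fields on which two configs disagree."""
    a, b = asdict(baseline), asdict(variant)
    return [name for name in a if a[name] != b[name]]


if __name__ == "__main__":
    base = Config()
    for factor, value, cfg in one_factor_variants(base, SWEEP):
        print(f"{factor}={value}: {cfg}")
```

Because each variant differs from the baseline in exactly one field, any difference in a measured outcome between a variant and the baseline can be attributed to that field alone, which is the point of holding the other components fixed.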
ojs.aaai.org