Shinji Takaki
2020 – today
2023
- [c37] Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda: Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System. ICASSP 2023: 1-5

2022
- [i14] Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda: Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System. CoRR abs/2211.11222 (2022)

2021
- [j11] Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: PeriodNet: A Non-Autoregressive Raw Waveform Generative Model With a Structure Separating Periodic and Aperiodic Components. IEEE Access 9: 137599-137612 (2021)
- [c36] Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: Periodnet: A Non-Autoregressive Waveform Generation Model with a Structure Separating Periodic and Aperiodic Components. ICASSP 2021: 6049-6053
- [i13] Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components. CoRR abs/2102.07786 (2021)

2020
- [j10] Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi: Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences. IEEE Access 8: 138149-138161 (2020)
- [j9] Xin Wang, Shinji Takaki, Junichi Yamagishi, Simon King, Keiichi Tokuda: A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 28: 157-170 (2020)
- [j8] Xin Wang, Shinji Takaki, Junichi Yamagishi: Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 28: 402-415 (2020)
- [c35] Kazuhiro Nakamura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: Fast and High-Quality Singing Voice Synthesis System Based on Convolutional Neural Networks. ICASSP 2020: 7239-7243
- [c34] Takato Fujimoto, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis. ICASSP 2020: 7644-7648
2010 – 2019
2019
- [j7] Toru Nakashika, Shinji Takaki, Junichi Yamagishi: Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra. IEEE ACM Trans. Audio Speech Lang. Process. 27(2): 244-254 (2019)
- [c33] Xin Wang, Shinji Takaki, Junichi Yamagishi: Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis. ICASSP 2019: 5916-5920
- [c32] Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language. ICASSP 2019: 6905-6909
- [c31] Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi: STFT Spectral Loss for Training a Neural Speech Waveform Model. ICASSP 2019: 7065-7069
- [c30] Yi Zhao, Atsushi Ando, Shinji Takaki, Junichi Yamagishi, Satoshi Kobashikawa: Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise. INTERSPEECH 2019: 3292-3296
- [c29] Shuhei Kato, Yusuke Yasuda, Xin Wang, Erica Cooper, Shinji Takaki, Junichi Yamagishi: Rakugo speech synthesis using segment-to-segment neural transduction and style tokens - toward speech synthesis for entertaining audiences. SSW 2019: 111-116
- [i12] Yi Zhao, Atsushi Ando, Shinji Takaki, Junichi Yamagishi, Satoshi Kobashikawa: Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise -. CoRR abs/1903.12316 (2019)
- [i11] Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi: Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform. CoRR abs/1903.12392 (2019)
- [i10] Xin Wang, Shinji Takaki, Junichi Yamagishi: Neural source-filter waveform models for statistical parametric speech synthesis. CoRR abs/1904.12088 (2019)
- [i9] Kazuhiro Nakamura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks. CoRR abs/1910.11690 (2019)
- [i8] Seyyed Saeed Sarfjoo, Xin Wang, Gustav Eje Henter, Jaime Lorenzo-Trueba, Shinji Takaki, Junichi Yamagishi: Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model. CoRR abs/1911.03952 (2019)

2018
- [j6] Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu: Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder. IEEE Access 6: 60478-60488 (2018)
- [j5] Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigating very deep highway networks for parametric speech synthesis. Speech Commun. 96: 1-9 (2018)
- [j4] Jaime Lorenzo-Trueba, Gustav Eje Henter, Shinji Takaki, Junichi Yamagishi, Yosuke Morino, Yuta Ochiai: Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis. Speech Commun. 99: 135-143 (2018)
- [j3] Xin Wang, Shinji Takaki, Junichi Yamagishi: Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 26(8): 1406-1419 (2018)
- [c28] Shinji Takaki, Yoshikazu Nishimura, Junichi Yamagishi: Unsupervised Speaker Adaptation for DNN-based Speech Synthesis using Input Codes. APSIPA 2018: 649-658
- [c27] Xin Wang, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela, Junichi Yamagishi: A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis. ICASSP 2018: 4804-4808
- [c26] Kentaro Sone, Shinji Takaki, Toru Nakashika: Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model. Odyssey 2018: 261-266
- [i7] Toru Nakashika, Shinji Takaki, Junichi Yamagishi: Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra. CoRR abs/1803.09946 (2018)
- [i6] Xin Wang, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela, Junichi Yamagishi: A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis. CoRR abs/1804.02549 (2018)
- [i5] Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu: Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder. CoRR abs/1807.11679 (2018)
- [i4] Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi: STFT spectral loss for training a neural speech waveform model. CoRR abs/1810.11945 (2018)
- [i3] Xin Wang, Shinji Takaki, Junichi Yamagishi: Neural source-filter-based waveform model for statistical parametric speech synthesis. CoRR abs/1810.11946 (2018)
- [i2] Yusuke Yasuda, Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language. CoRR abs/1810.11960 (2018)

2017
- [c25] Xin Wang, Shinji Takaki, Junichi Yamagishi: An autoregressive recurrent mixture density network for parametric speech synthesis. ICASSP 2017: 4895-4899
- [c24] Hieu-Thi Luong, Shinji Takaki, Gustav Eje Henter, Junichi Yamagishi: Adapting and controlling DNN-based speech synthesis using input codes. ICASSP 2017: 4905-4909
- [c23] Xin Wang, Shinji Takaki, Junichi Yamagishi: An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis. INTERSPEECH 2017: 1059-1063
- [c22] Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi: Direct Modeling of Frequency Spectra and Waveform Generation Based on Phase Recovery for DNN-Based Speech Synthesis. INTERSPEECH 2017: 1128-1132
- [c21] Takuhiro Kaneko, Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi: Generative Adversarial Network-Based Postfilter for STFT Spectrograms. INTERSPEECH 2017: 3389-3393
- [c20] Toru Nakashika, Shinji Takaki, Junichi Yamagishi: Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra. INTERSPEECH 2017: 4021-4025

2016
- [j2] Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis. IEICE Trans. Inf. Syst. 99-D(10): 2471-2480 (2016)
- [c19] Lauri Juvela, Xin Wang, Shinji Takaki, Sangjin Kim, Manu Airaksinen, Junichi Yamagishi: The NII speech synthesis entry for Blizzard Challenge 2016. Blizzard Challenge 2016
- [c18] Shinji Takaki, Junichi Yamagishi: A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis. ICASSP 2016: 5535-5539
- [c17] Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi: Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks. INTERSPEECH 2016: 352-356
- [c16] Lauri Juvela, Xin Wang, Shinji Takaki, Manu Airaksinen, Junichi Yamagishi, Paavo Alku: Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks. INTERSPEECH 2016: 2283-2287
- [c15] Xin Wang, Shinji Takaki, Junichi Yamagishi: Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System. INTERSPEECH 2016: 2856-2860
- [c14] Xin Wang, Shinji Takaki, Junichi Yamagishi: A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora. SSW 2016: 118-121
- [c13] Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech. SSW 2016: 146-152
- [c12] Shinji Takaki, Sangjin Kim, Junichi Yamagishi: Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis. SSW 2016: 153-159
- [c11] Xin Wang, Shinji Takaki, Junichi Yamagishi: Investigating Very Deep Highway Networks for Parametric Speech Synthesis. SSW 2016: 166-171
- [p1] Shinji Takaki, Junichi Yamagishi: Constructing a Deep Neural Network Based Spectral Model for Statistical Speech Synthesis. Recent Advances in Nonlinear Speech Processing 2016: 117-125

2015
- [c10] Shinji Takaki, Sangjin Kim, Junichi Yamagishi, JongJin Kim: Multiple feed-forward deep neural networks for statistical parametric speech synthesis. INTERSPEECH 2015: 2242-2246
- [c9] Eita Nakamura, Shinji Takaki: Characteristics of Polyphonic Music Style and Markov Model of Pitch-Class Intervals. MCM 2015: 109-114
- [i1] Zhenzhou Wu, Shinji Takaki, Junichi Yamagishi: Deep Denoising Auto-encoder for Statistical Speech Synthesis. CoRR abs/1506.05268 (2015)

2014
- [j1] Shinji Takaki, Yoshihiko Nankaku, Keiichi Tokuda: Contextual Additive Structure for HMM-Based Speech Synthesis. IEEE J. Sel. Top. Signal Process. 8(2): 229-238 (2014)
- [c8] Kei Sawada, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Keiichi Tokuda: Overview of NITECH HMM-based text-to-speech system for Blizzard Challenge 2014. Blizzard Challenge 2014

2013
- [c7] Shinji Takaki, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Keiichi Tokuda: Overview of NITECH HMM-based speech synthesis system for Blizzard Challenge 2013. Blizzard Challenge 2013
- [c6] Takaya Makino, Shinji Takaki, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda: Separable lattice 2-D HMMS introducing state duration control for recognition of images with various variations. ICASSP 2013: 3203-3207
- [c5] Shinji Takaki, Yoshihiko Nankaku, Keiichi Tokuda: Contextual partial additive structure for HMM-based speech synthesis. ICASSP 2013: 7878-7882

2012
- [c4] Shinji Takaki, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Keiichi Tokuda: Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2012. Blizzard Challenge 2012

2011
- [c3] Kei Hashimoto, Shinji Takaki, Keiichiro Oura, Keiichi Tokuda: Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2011. Blizzard Challenge 2011
- [c2] Shinji Takaki, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda: An optimization algorithm of independent mean and variance parameter tying structures for HMM-based speech synthesis. ICASSP 2011: 4700-4703

2010
- [c1] Shinji Takaki, Yoshihiko Nankaku, Keiichi Tokuda: Spectral modeling with contextual additive structure for HMM-based speech synthesis. SSW 2010: 100-105
last updated on 2024-09-28 01:29 CEST by the dblp team
all metadata released as open data under CC0 1.0 license