default search action
Shigeki Karita
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [i13]Min Ma, Yuma Koizumi, Shigeki Karita, Heiga Zen, Jason Riesa, Haruko Ishikawa, Michiel Bacchiani:
FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks. CoRR abs/2408.06227 (2024) - 2023
- [c25]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna:
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. INTERSPEECH 2023: 5496-5500 - [c24]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani:
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. WASPAA 2023: 1-5 - [i12]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani:
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. CoRR abs/2303.01664 (2023) - [i11]Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna:
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. CoRR abs/2305.18802 (2023) - [i10]Shigeki Karita, Richard Sproat, Haruko Ishikawa:
Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency. CoRR abs/2306.04530 (2023) - 2022
- [c23]Yotaro Kubo, Shigeki Karita, Michiel Bacchiani:
Knowledge Transfer from Large-Scale Pretrained Language Models to End-To-End Speech Recognizers. ICASSP 2022: 8512-8516 - [c22]Yuma Koizumi, Shigeki Karita, Arun Narayanan, Sankaran Panchapagesan, Michiel Bacchiani:
SNRi Target Training for Joint Speech Enhancement and Recognition. INTERSPEECH 2022: 1173-1177 - [i9]Yotaro Kubo, Shigeki Karita, Michiel Bacchiani:
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers. CoRR abs/2202.07894 (2022) - 2021
- [c21]Shigeki Karita, Yotaro Kubo, Michiel Adriaan Unico Bacchiani, Llion Jones:
A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition. Interspeech 2021: 2092-2096 - [c20]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. Interspeech 2021: 4089-4093 - [c19]Yuma Koizumi, Shigeki Karita, Scott Wisdom, Hakan Erdogan, John R. Hershey, Llion Jones, Michiel Bacchiani:
DF-Conformer: Integrated Architecture of Conv-Tasnet and Conformer Using Linear Complexity Self-Attention for Speech Enhancement. WASPAA 2021: 161-165 - [i8]Shigeki Karita, Yotaro Kubo, Michiel Adriaan Unico Bacchiani, Llion Jones:
A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition. CoRR abs/2106.05111 (2021) - [i7]Yuma Koizumi, Shigeki Karita, Scott Wisdom, Hakan Erdogan, John R. Hershey, Llion Jones, Michiel Bacchiani:
DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement. CoRR abs/2106.15813 (2021) - [i6]Yuma Koizumi, Shigeki Karita, Arun Narayanan, Sankaran Panchapagesan, Michiel Bacchiani:
SNRi Target Training for Joint Speech Enhancement and Recognition. CoRR abs/2111.00764 (2021) - 2020
- [c18]Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Yalta, Tomoki Hayashi, Shinji Watanabe:
ESPnet-ST: All-in-One Speech Translation Toolkit. ACL (demo) 2020: 302-311 - [c17]Takafumi Moriya, Tsubasa Ochiai, Shigeki Karita, Hiroshi Sato, Tomohiro Tanaka, Takanori Ashihara, Ryo Masumura, Yusuke Shinohara, Marc Delcroix:
Self-Distillation for Improving CTC-Transformer-Based ASR Systems. INTERSPEECH 2020: 546-550 - [i5]Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe:
ESPnet-ST: All-in-One Speech Translation Toolkit. CoRR abs/2004.10234 (2020) - [i4]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. CoRR abs/2010.12973 (2020) - [i3]Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang:
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans. CoRR abs/2012.13006 (2020)
2010 – 2019
- 2019
- [c16]Shigeki Karita, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto:
A Comparative Study on Transformer vs RNN in Speech Applications. ASRU 2019: 449-456 - [c15]Shigeki Karita, Shinji Watanabe, Tomoharu Iwata, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Semi-supervised End-to-end Speech Recognition Using Text-to-speech and Autoencoders. ICASSP 2019: 6166-6170 - [c14]Marc Delcroix, Shinji Watanabe, Tsubasa Ochiai, Keisuke Kinoshita, Shigeki Karita, Atsunori Ogawa, Tomohiro Nakatani:
End-to-End SpeakerBeam for Single Channel Target Speech Recognition. INTERSPEECH 2019: 451-455 - [c13]Shigeki Karita, Nelson Enrique Yalta Soplin, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, Tomohiro Nakatani:
Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration. INTERSPEECH 2019: 1408-1412 - [c12]Atsunori Ogawa, Marc Delcroix, Shigeki Karita, Tomohiro Nakatani:
Improved Deep Duel Model for Rescoring N-Best Speech Recognition List Using Backward LSTMLM and Ensemble Encoders. INTERSPEECH 2019: 3900-3904 - [i2]Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang:
A Comparative Study on Transformer vs RNN in Speech Applications. CoRR abs/1909.06317 (2019) - 2018
- [c11]Takuya Higuchi, Keisuke Kinoshita, Nobutaka Ito, Shigeki Karita, Tomohiro Nakatani:
Frame-by-Frame Closed-Form Update for Mask-Based Adaptive MVDR Beamforming. ICASSP 2018: 531-535 - [c10]Shigeki Karita, Atsunori Ogawa, Marc Delcroix, Tomohiro Nakatani:
Sequence Training of Encoder-Decoder Model Using Policy Gradient for End-to-End Speech Recognition. ICASSP 2018: 5839-5843 - [c9]Atsunori Ogawa, Marc Delcroix, Shigeki Karita, Tomohiro Nakatani:
Rescoring N-Best Speech Recognition List Based on One-on-One Hypothesis Comparison Using Encoder-Classifier Model. ICASSP 2018: 6099-6103 - [c8]Shigeki Karita, Shinji Watanabe, Tomoharu Iwata, Atsunori Ogawa, Marc Delcroix:
Semi-Supervised End-to-End Speech Recognition. INTERSPEECH 2018: 2-6 - [c7]Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai:
ESPnet: End-to-End Speech Processing Toolkit. INTERSPEECH 2018: 2207-2211 - [c6]Marc Delcroix, Shinji Watanabe, Atsunori Ogawa, Shigeki Karita, Tomohiro Nakatani:
Auxiliary Feature Based Adaptation of End-to-end ASR Systems. INTERSPEECH 2018: 2444-2448 - [i1]Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai:
ESPnet: End-to-End Speech Processing Toolkit. CoRR abs/1804.00015 (2018) - 2017
- [c5]Shoko Araki, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Takuya Higuchi, Takuya Yoshioka, Dung T. Tran, Shigeki Karita, Tomohiro Nakatani:
Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming. HSCMA 2017: 16-20 - [c4]Dung T. Tran, Marc Delcroix, Shigeki Karita, Michael Hentschel, Atsunori Ogawa, Tomohiro Nakatani:
Unfolded Deep Recurrent Convolutional Neural Network with Jump Ahead Connections for Acoustic Modeling. INTERSPEECH 2017: 1596-1600 - [c3]Shigeki Karita, Atsunori Ogawa, Marc Delcroix, Tomohiro Nakatani:
Forward-Backward Convolutional LSTM for Acoustic Modeling. INTERSPEECH 2017: 1601-1605 - 2015
- [c2]Takuya Yoshioka, Shigeki Karita, Tomohiro Nakatani:
Far-field speech recognition using CNN-DNN-HMM with convolution in time. ICASSP 2015: 4360-4364 - [c1]Shigeki Karita, Kumi Nakamura, Kazuhiro Kono, Yoshimichi Ito, Noboru Babaguchi:
Owner authentication for mobile devices using motion gestures based on multi-owner template update. ICME Workshops 2015: 1-6
Coauthor Index
aka: Michiel Adriaan Unico Bacchiani
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-07 22:15 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint