Boris Ginsburg

2020 – today

2024
[c54]
  - https://dblp.org/rec/conf/icassp/BurchiPBGT24
Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte:
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer. ICASSP 2024: 10211-10215
[c53]
  - https://dblp.org/rec/conf/icassp/0089BGLBG24
Yang Zhang, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg:
A Chat about Boring Problems: Studying GPT-Based Text Normalization. ICASSP 2024: 10921-10925
[c52]
  - https://dblp.org/rec/conf/icassp/XuCJG24
Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg:
Transducers with Pronunciation-Aware Embeddings for Automatic Speech Recognition. ICASSP 2024: 12026-12030
[c51]
  - https://dblp.org/rec/conf/icassp/NorooziMKBG24
Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg:
Stateful Conformer with Cache-Based Inference for Streaming Automatic Speech Recognition. ICASSP 2024: 12041-12045
[c50]
  - https://dblp.org/rec/conf/icassp/PuvvadaKDBG24
Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg:
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition. ICASSP 2024: 12111-12115
[c49]
  - https://dblp.org/rec/conf/icassp/KoluguriKZMRNBG24
Nithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg:
Investigating End-to-End ASR Architectures for Long Form Audio Transcription. ICASSP 2024: 13366-13370
[c48]
  - https://dblp.org/rec/conf/icassp/ChenHAHPLGBG24
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg:
SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation. ICASSP 2024: 13521-13525
[c47]
  - https://dblp.org/rec/conf/icml/NeekharaHVGRDKM24
Paarth Neekhara, Shehzeen Samarah Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian J. McAuley:
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations. ICML 2024
[i79]
  - https://dblp.org/rec/journals/corr/abs-2404-04295
Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg:
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition. CoRR abs/2404.04295 (2024)
[i78]
  - https://dblp.org/rec/journals/corr/abs-2404-06654
Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, Boris Ginsburg:
RULER: What's the Real Context Size of Your Long-Context Language Models? CoRR abs/2404.06654 (2024)
[i77]
  - https://dblp.org/rec/journals/corr/abs-2405-12983
Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte:
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer. CoRR abs/2405.12983 (2024)
[i76]
  - https://dblp.org/rec/journals/corr/abs-2406-06220
Vladimir Bataev, Hainan Xu, Daniel Galvez, Vitaly Lavrukhin, Boris Ginsburg:
Label-Looping: Highly Efficient Decoding for Transducers. CoRR abs/2406.06220 (2024)
[i75]
  - https://dblp.org/rec/journals/corr/abs-2406-07096
Andrei Andrusenko, Aleksandr Laptev, Vladimir Bataev, Vitaly Lavrukhin, Boris Ginsburg:
Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter. CoRR abs/2406.07096 (2024)
[i74]
  - https://dblp.org/rec/journals/corr/abs-2406-11704
Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan M. Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick LeGresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Long, Ameya Sunil Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, Chen Zhu:
Nemotron-4 340B Technical Report. CoRR abs/2406.11704 (2024)
[i73]
  - https://dblp.org/rec/journals/corr/abs-2406-12946
Vahid Noroozi, Zhehuai Chen, Somshubra Majumdar, Steve Huang, Jagadeesh Balam, Boris Ginsburg:
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models. CoRR abs/2406.12946 (2024)
[i72]
  - https://dblp.org/rec/journals/corr/abs-2406-17957
Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg:
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment. CoRR abs/2406.17957 (2024)
[i71]
  - https://dblp.org/rec/journals/corr/abs-2406-18871
Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, He Huang, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee:
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment. CoRR abs/2406.18871 (2024)
[i70]
  - https://dblp.org/rec/journals/corr/abs-2406-19674
Krishna C. Puvvada, Piotr Zelasko, He Huang, Oleksii Hrinchuk, Nithin Rao Koluguri, Kunal Dhawan, Somshubra Majumdar, Elena Rastorgueva, Zhehuai Chen, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg:
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data. CoRR abs/2406.19674 (2024)
[i69]
  - https://dblp.org/rec/journals/corr/abs-2406-19954
Zhehuai Chen, He Huang, Oleksii Hrinchuk, Krishna C. Puvvada, Nithin Rao Koluguri, Piotr Zelasko, Jagadeesh Balam, Boris Ginsburg:
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5. CoRR abs/2406.19954 (2024)
[i68]
  - https://dblp.org/rec/journals/corr/abs-2407-03495
Kunal Dhawan, Nithin Rao Koluguri, Ante Jukic, Ryan Langman, Jagadeesh Balam, Boris Ginsburg:
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations. CoRR abs/2407.03495 (2024)
[i67]
  - https://dblp.org/rec/journals/corr/abs-2407-04368
Wen Ding, Fei Jia, Hainan Xu, Yu Xi, Junjie Lai, Boris Ginsburg:
Romanization Encoding For Multilingual ASR. CoRR abs/2407.04368 (2024)
[i66]
  - https://dblp.org/rec/journals/corr/abs-2407-21077
Somshubra Majumdar, Vahid Noroozi, Sean Narenthiran, Aleksander Ficek, Jagadeesh Balam, Boris Ginsburg:
Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models. CoRR abs/2407.21077 (2024)
[i65]
  - https://dblp.org/rec/journals/corr/abs-2408-13106
He Huang, Taejin Park, Kunal Dhawan, Ivan Medennikov, Krishna C. Puvvada, Nithin Rao Koluguri, Weiqing Wang, Jagadeesh Balam, Boris Ginsburg:
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks. CoRR abs/2408.13106 (2024)
[i64]
  - https://dblp.org/rec/journals/corr/abs-2409-01438
Weiqing Wang, Kunal Dhawan, Taejin Park, Krishna C. Puvvada, Ivan Medennikov, Somshubra Majumdar, He Huang, Jagadeesh Balam, Boris Ginsburg:
Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR. CoRR abs/2409.01438 (2024)
[i63]
  - https://dblp.org/rec/journals/corr/abs-2409-05601
Nithin Rao Koluguri, Travis M. Bartley, Hainan Xu, Oleksii Hrinchuk, Jagadeesh Balam, Boris Ginsburg, Georg Kucsko:
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation. CoRR abs/2409.05601 (2024)
[i62]
  - https://dblp.org/rec/journals/corr/abs-2409-06656
Taejin Park, Ivan Medennikov, Kunal Dhawan, Weiqing Wang, He Huang, Nithin Rao Koluguri, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg:
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens. CoRR abs/2409.06656 (2024)
[i61]
  - https://dblp.org/rec/journals/corr/abs-2409-09785
Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Zelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke:
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition. CoRR abs/2409.09785 (2024)
[i60]
  - https://dblp.org/rec/journals/corr/abs-2409-11538
Ke Hu, Zhehuai Chen, Chao-Han Huck Yang, Piotr Zelasko, Oleksii Hrinchuk, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg:
Chain-of-Thought Prompting for Speech Translation. CoRR abs/2409.11538 (2024)
[i59]
  - https://dblp.org/rec/journals/corr/abs-2409-12352
Jinhan Wang, Weiqing Wang, Kunal Dhawan, Taejin Park, Myungjong Kim, Ivan Medennikov, He Huang, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg:
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR. CoRR abs/2409.12352 (2024)
[i58]
  - https://dblp.org/rec/journals/corr/abs-2409-13523
Piotr Zelasko, Zhehuai Chen, Mengru Wang, Daniel Galvez, Oleksii Hrinchuk, Shuoyang Ding, Ke Hu, Jagadeesh Balam, Vitaly Lavrukhin, Boris Ginsburg:
EMMeTT: Efficient Multimodal Machine Translation Training. CoRR abs/2409.13523 (2024)
[i57]
  - https://dblp.org/rec/journals/corr/abs-2409-20007
Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, Chao-Han Huck Yang, Jagadeesh Balam, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee:
Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data. CoRR abs/2409.20007 (2024)
[i56]
  - https://dblp.org/rec/journals/corr/abs-2410-01131
Ilya Loshchilov, Cheng-Ping Hsieh, Simeng Sun, Boris Ginsburg:
nGPT: Normalized Transformer with Representation Learning on the Hypersphere. CoRR abs/2410.01131 (2024)
[i55]
  - https://dblp.org/rec/journals/corr/abs-2410-02597
Hainan Xu, Travis M. Bartley, Vladimir Bataev, Boris Ginsburg:
Three-in-One: Fast and Accurate Transducer for Hybrid-Autoregressive ASR. CoRR abs/2410.02597 (2024)
2023
[c46]
  - https://dblp.org/rec/conf/asru/MeisterNKBLG23
Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg:
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of End-to-End ASR Models. ASRU 2023: 1-7
[c45]
  - https://dblp.org/rec/conf/asru/RekeshKKMNHHPKBG23
Dima Rekesh, Nithin Rao Koluguri, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Krishna C. Puvvada, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg:
Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition. ASRU 2023: 1-8
[c44]
  - https://dblp.org/rec/conf/icassp/BadlaniAGVSSGC23
Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro:
Vani: Very-Lightweight Accent-Controllable TTS for Native And Non-Native Speakers With Identity Preservation. ICASSP 2023: 1-2
[c43]
  - https://dblp.org/rec/conf/icassp/BartleyJPKG23
Travis M. Bartley, Fei Jia, Krishna C. Puvvada, Samuel Kriman, Boris Ginsburg:
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models. ICASSP 2023: 1-5
[c42]
  - https://dblp.org/rec/conf/icassp/HussainNHLG23
Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg:
ACE-VC: Adaptive and Controllable Voice Conversion Using Explicitly Disentangled Self-Supervised Speech Representations. ICASSP 2023: 1-5
[c41]
  - https://dblp.org/rec/conf/icassp/LaptevBGG23
Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg:
Powerful and Extensible WFST Framework for RNN-Transducer Losses. ICASSP 2023: 1-5
[c40]
  - https://dblp.org/rec/conf/icassp/XuJMWG23
Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg:
Multi-Blank Transducers for Speech Recognition. ICASSP 2023: 1-5
[c39]
  - https://dblp.org/rec/conf/icassp/ZhangPLG23
Yang Zhang, Krishna C. Puvvada, Vitaly Lavrukhin, Boris Ginsburg:
Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio. ICASSP 2023: 1-5
[c38]
  - https://dblp.org/rec/conf/iclr/LeePGCY23
Sang-gil Lee, Wei Ping, Boris Ginsburg, Bryan Catanzaro, Sungroh Yoon:
BigVGAN: A Universal Neural Vocoder with Large-Scale Training. ICLR 2023
[c37]
  - https://dblp.org/rec/conf/icml/XuJMH0G23
Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg:
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations. ICML 2023: 38462-38484
[c36]
  - https://dblp.org/rec/conf/interspeech/AntonovaBG23
Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg:
SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings. INTERSPEECH 2023: 1404-1408
[c35]
  - https://dblp.org/rec/conf/interspeech/GitmanLLG23
Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg:
Confidence-based Ensembles of End-to-End Speech Recognition Models. INTERSPEECH 2023: 1414-1418
[c34]
  - https://dblp.org/rec/conf/interspeech/BataevKSLG23
Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg:
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator. INTERSPEECH 2023: 2928-2932
[c33]
  - https://dblp.org/rec/conf/interspeech/HuangBG23
He Huang, Jagadeesh Balam, Boris Ginsburg:
Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling. INTERSPEECH 2023: 2933-2937
[c32]
  - https://dblp.org/rec/conf/interspeech/HsiehGG23
Cheng-Ping Hsieh, Subhankar Ghosh, Boris Ginsburg:
Adapter-Based Extension of Multi-Speaker Text-To-Speech Model for New Speakers. INTERSPEECH 2023: 3028-3032
[c31]
  - https://dblp.org/rec/conf/interspeech/RastorguevaLG23
Elena Rastorgueva, Vitaly Lavrukhin, Boris Ginsburg:
NeMo Forced Aligner and its application to word alignment for subtitle generation. INTERSPEECH 2023: 5257-5258
[c30]
  - https://dblp.org/rec/conf/interspeech/JiaKBG23
Fei Jia, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg:
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification. INTERSPEECH 2023: 5321-5325
[c29]
  - https://dblp.org/rec/conf/iwslt/HrinchukBBG23
Oleksii Hrinchuk, Vladimir Bataev, Evelina Bakhturina, Boris Ginsburg:
NVIDIA NeMo Offline Speech Translation Systems for IWSLT 2023. IWSLT@ACL 2023: 442-448
[c28]
  - https://dblp.org/rec/conf/waspaa/JukicBG23
Ante Jukic, Jagadeesh Balam, Boris Ginsburg:
Flexible Multichannel Speech Enhancement for Noise-Robust Frontend. WASPAA 2023: 1-5
[i54]
  - https://dblp.org/rec/journals/corr/abs-2302-08137
Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg:
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations. CoRR abs/2302.08137 (2023)
[i53]
  - https://dblp.org/rec/journals/corr/abs-2302-14036
Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg:
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator. CoRR abs/2302.14036 (2023)
[i52]
  - https://dblp.org/rec/journals/corr/abs-2303-07578
Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro:
VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation. CoRR abs/2303.07578 (2023)
[i51]
  - https://dblp.org/rec/journals/corr/abs-2303-10384
Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg:
Powerful and Extensible WFST Framework for RNN-Transducer Losses. CoRR abs/2303.10384 (2023)
[i50]
  - https://dblp.org/rec/journals/corr/abs-2304-06795
Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg:
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations. CoRR abs/2304.06795 (2023)
[i49]
  - https://dblp.org/rec/journals/corr/abs-2305-05084
Dima Rekesh, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Ankur Kumar, Boris Ginsburg:
Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition. CoRR abs/2305.05084 (2023)
[i48]
  - https://dblp.org/rec/journals/corr/abs-2306-02317
Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg:
SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings. CoRR abs/2306.02317 (2023)
[i47]
  - https://dblp.org/rec/journals/corr/abs-2306-08753
Kunal Dhawan, Dima Rekesh, Boris Ginsburg:
Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources. CoRR abs/2306.08753 (2023)
[i46]
  - https://dblp.org/rec/journals/corr/abs-2306-15824
Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg:
Confidence-based Ensembles of End-to-End Speech Recognition Models. CoRR abs/2306.15824 (2023)
[i45]
  - https://dblp.org/rec/journals/corr/abs-2307-07057
He Huang, Jagadeesh Balam, Boris Ginsburg:
Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling. CoRR abs/2307.07057 (2023)
[i44]
  - https://dblp.org/rec/journals/corr/abs-2308-05218
Yang Zhang, Krishna C. Puvvada, Vitaly Lavrukhin, Boris Ginsburg:
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio. CoRR abs/2308.05218 (2023)
[i43]
  - https://dblp.org/rec/journals/corr/abs-2309-09950
Nithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg:
Investigating End-to-End ASR Architectures for Long Form Audio Transcription. CoRR abs/2309.09950 (2023)
[i42]
  - https://dblp.org/rec/journals/corr/abs-2309-10922
Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg:
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition. CoRR abs/2309.10922 (2023)
[i41]
  - https://dblp.org/rec/journals/corr/abs-2309-13426
Yang Zhang, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg:
A Chat About Boring Problems: Studying GPT-based text normalization. CoRR abs/2309.13426 (2023)
[i40]
  - https://dblp.org/rec/journals/corr/abs-2310-02943
Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg:
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models. CoRR abs/2310.02943 (2023)
[i39]
  - https://dblp.org/rec/journals/corr/abs-2310-09424
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg:
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation. CoRR abs/2310.09424 (2023)
[i38]
  - https://dblp.org/rec/journals/corr/abs-2310-09653
Paarth Neekhara, Shehzeen Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian J. McAuley:
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations. CoRR abs/2310.09653 (2023)
[i37]
  - https://dblp.org/rec/journals/corr/abs-2310-12371
Taejin Park, He Huang, Coleman Hooper, Nithin Rao Koluguri, Kunal Dhawan, Ante Jukic, Jagadeesh Balam, Boris Ginsburg:
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation. CoRR abs/2310.12371 (2023)
[i36]
  - https://dblp.org/rec/journals/corr/abs-2310-12378
Taejin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Rao Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg:
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System. CoRR abs/2310.12378 (2023)
[i35]
  - https://dblp.org/rec/journals/corr/abs-2312-17279
Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg:
Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition. CoRR abs/2312.17279 (2023)
2022
[c27]
  - https://dblp.org/rec/conf/icassp/TatanovBG22
Oktai Tatanov, Stanislav Beliaev, Boris Ginsburg:
Mixer-TTS: Non-Autoregressive, Fast and Compact Text-to-Speech Model Conditioned on Language Model Embeddings. ICASSP 2022: 7482-7486
[c26]
  - https://dblp.org/rec/conf/icassp/KoluguriPG22
Nithin Rao Koluguri, Taejin Park, Boris Ginsburg:
TitaNet: Neural Model for Speaker Representation with 1D Depth-Wise Separable Convolutions and Global Context. ICASSP 2022: 8102-8106
[c25]
  - https://dblp.org/rec/conf/interspeech/BakhturinaZG22
Evelina Bakhturina, Yang Zhang, Boris Ginsburg:
Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization. INTERSPEECH 2022: 491-495
[c24]
  - https://dblp.org/rec/conf/interspeech/AntonovaBG22
Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg:
Thutmose Tagger: Single-pass neural model for Inverse Text Normalization. INTERSPEECH 2022: 550-554
[c23]
  - https://dblp.org/rec/conf/interspeech/ParkKJBG22
Taejin Park, Nithin Rao Koluguri, Fei Jia, Jagadeesh Balam, Boris Ginsburg:
NeMo Open Source Speaker Diarization System. INTERSPEECH 2022: 853-854
[c22]
  - https://dblp.org/rec/conf/interspeech/LaptevMG22
Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg:
CTC Variations Through New WFST Topologies. INTERSPEECH 2022: 1041-1045
[c21]
  - https://dblp.org/rec/conf/interspeech/ParkKBG22
Taejin Park, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg:
Multi-scale Speaker Diarization with Dynamic Scale Weighting. INTERSPEECH 2022: 5080-5084
[c20]
  - https://dblp.org/rec/conf/slt/MajumdarALG22
Somshubra Majumdar, Shantanu Acharya, Vitaly Lavrukhin, Boris Ginsburg:
Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition. SLT 2022: 130-135
[c19]
  - https://dblp.org/rec/conf/slt/LaptevG22
Aleksandr Laptev, Boris Ginsburg:
Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-to-End Automatic Speech Recognition. SLT 2022: 152-159
[i34]
  - https://dblp.org/rec/journals/corr/abs-2203-15917
Evelina Bakhturina, Yang Zhang, Boris Ginsburg:
Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization. CoRR abs/2203.15917 (2022)
[i33]
  - https://dblp.org/rec/journals/corr/abs-2203-15974
Taejin Park, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg:
Multi-scale Speaker Diarization with Dynamic Scale Weighting. CoRR abs/2203.15974 (2022)
[i32]
  - https://dblp.org/rec/journals/corr/abs-2206-04658
Sang-gil Lee, Wei Ping, Boris Ginsburg, Bryan Catanzaro, Sungroh Yoon:
BigVGAN: A Universal Neural Vocoder with Large-Scale Training. CoRR abs/2206.04658 (2022)
[i31]
  - https://dblp.org/rec/journals/corr/abs-2208-00064
Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg:
Thutmose Tagger: Single-pass neural model for Inverse Text Normalization. CoRR abs/2208.00064 (2022)
[i30]
  - https://dblp.org/rec/journals/corr/abs-2210-03255
Somshubra Majumdar, Shantanu Acharya, Vitaly Lavrukhin, Boris Ginsburg:
Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition. CoRR abs/2210.03255 (2022)
[i29]
  - https://dblp.org/rec/journals/corr/abs-2210-15781
Fei Jia, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg:
AmberNet: A Compact End-to-End Model for Spoken Language Identification. CoRR abs/2210.15781 (2022)
[i28]
  - https://dblp.org/rec/journals/corr/abs-2211-00585
Cheng-Ping Hsieh, Subhankar Ghosh, Boris Ginsburg:
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers. CoRR abs/2211.00585 (2022)
[i27]
  - https://dblp.org/rec/journals/corr/abs-2211-03541
Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg:
Multi-blank Transducers for Speech Recognition. CoRR abs/2211.03541 (2022)
[i26]
  - https://dblp.org/rec/journals/corr/abs-2211-05103
Travis M. Bartley, Fei Jia, Krishna C. Puvvada, Samuel Kriman, Boris Ginsburg:
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models. CoRR abs/2211.05103 (2022)
[i25]
  - https://dblp.org/rec/journals/corr/abs-2212-08703
Aleksandr Laptev, Boris Ginsburg:
Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition. CoRR abs/2212.08703 (2022)
2021
[j2]
  - https://dblp.org/rec/journals/jcisd/KorshunovaGTI21
Maria Korshunova, Boris Ginsburg, Alexander Tropsha, Olexandr Isayev:
OpenChem: A Deep Learning Toolkit for Computational Chemistry and Drug Design. J. Chem. Inf. Model. 61(1): 7-13 (2021)
[c18]
  - https://dblp.org/rec/conf/icassp/JiaMG21
Fei Jia, Somshubra Majumdar, Boris Ginsburg:
MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection. ICASSP 2021: 6818-6822
[c17]
  - https://dblp.org/rec/conf/icmcs/LuoWCX0KOBDFGHK21
Jian Luo, Jianzong Wang, Ning Cheng, Edward Xiao, Jing Xiao, Georg Kucsko, Patrick K. O'Neill, Jagadeesh Balam, Slyne Deng, Adriana Flores, Boris Ginsburg, Jocelyn Huang, Oleksii Kuchaiev, Vitaly Lavrukhin, Jason Li:
Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition. ICME 2021: 1-6
[c16]
  - https://dblp.org/rec/conf/interspeech/ONeillLMNZKBDFS21
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko:
SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition. Interspeech 2021: 1434-1438
[c15]
  - https://dblp.org/rec/conf/interspeech/BakhturinaLGZ21
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang:
Hi-Fi Multi-Speaker English TTS Dataset. Interspeech 2021: 2776-2780
[c14]
  - https://dblp.org/rec/conf/interspeech/BeliaevG21
Stanislav Beliaev, Boris Ginsburg:
TalkNet: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis. Interspeech 2021: 3760-3764
[c13]
  - https://dblp.org/rec/conf/interspeech/ZhangBGG21
Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg:
NeMo Inverse Text Normalization: From Development to Production. Interspeech 2021: 4468-4472
[c12]
  - https://dblp.org/rec/conf/interspeech/ZhangBG21
Yang Zhang, Evelina Bakhturina, Boris Ginsburg:
NeMo (Inverse) Text Normalization: From Development to Production. Interspeech 2021: 4857-4859
[c11]
  - https://dblp.org/rec/conf/nips/BakhturinaLG21
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg:
A Toolbox for Construction and Analysis of Speech Datasets. NeurIPS Datasets and Benchmarks 2021
[i24]
  - https://dblp.org/rec/journals/corr/abs-2104-02014
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko:
SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition. CoRR abs/2104.02014 (2021)
[i23]
  - https://dblp.org/rec/journals/corr/abs-2104-04896
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg:
NeMo Toolbox for Speech Dataset Construction. CoRR abs/2104.04896 (2021)
[i22]
  - https://dblp.org/rec/journals/corr/abs-2104-05055
Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg:
NeMo Inverse Text Normalization: From Development To Production. CoRR abs/2104.05055 (2021)
[i21]
  - https://dblp.org/rec/journals/corr/abs-2104-08189
Stanislav Beliaev, Boris Ginsburg:
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction. CoRR abs/2104.08189 (2021)
[i20]
  - https://dblp.org/rec/journals/corr/abs-2105-08049
Yang Zhang, Vahid Noroozi, Evelina Bakhturina, Boris Ginsburg:
SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services. CoRR abs/2105.08049 (2021)
[i19]
  - https://dblp.org/rec/journals/corr/abs-2107-10708
Aleksei Kalinov, Somshubra Majumdar, Jagadeesh Balam, Boris Ginsburg:
CarneliNet: Neural Mixture Model for Automatic Speech Recognition. CoRR abs/2107.10708 (2021)
[i18]
  - https://dblp.org/rec/journals/corr/abs-2108-09889
Tuan Manh Lai, Yang Zhang, Evelina Bakhturina, Boris Ginsburg, Heng Ji:
A Unified Transformer-based Framework for Duplex Text Normalization. CoRR abs/2108.09889 (2021)
[i17]
  - https://dblp.org/rec/journals/corr/abs-2110-03098
Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg:
CTC Variations Through New WFST Topologies. CoRR abs/2110.03098 (2021)
[i16]
  - https://dblp.org/rec/journals/corr/abs-2110-04410
Nithin Rao Koluguri, Taejin Park, Boris Ginsburg:
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context. CoRR abs/2110.04410 (2021)
[i15]
  - https://dblp.org/rec/journals/corr/abs-2110-05798
Paarth Neekhara, Jason Li, Boris Ginsburg:
Adapting TTS models For New Speakers using Transfer Learning. CoRR abs/2110.05798 (2021)
2020
[c10]
  - https://dblp.org/rec/conf/icassp/KrimanBGHKLLLZ20
Samuel Kriman, Stanislav Beliaev, Boris Ginsburg, Jocelyn Huang, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Yang Zhang:
Quartznet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions. ICASSP 2020: 6124-6128
[c9]
  - https://dblp.org/rec/conf/icassp/HrinchukPG20
Oleksii Hrinchuk, Mariya Popova, Boris Ginsburg:
Correction of Automatic Speech Recognition with Transformer Sequence-To-Sequence Model. ICASSP 2020: 7074-7078
[c8]
  - https://dblp.org/rec/conf/interspeech/MajumdarG20
Somshubra Majumdar, Boris Ginsburg:
MatchboxNet: 1D Time-Channel Separable Convolutional Neural Network Architecture for Speech Commands Recognition. INTERSPEECH 2020: 3356-3360
[i14]
  - https://dblp.org/rec/journals/corr/abs-2007-09286
Boris Ginsburg:
On regularization of gradient descent, layer imbalance and flat minima. CoRR abs/2007.09286 (2020)
[i13]
  - https://dblp.org/rec/journals/corr/abs-2010-13886
Fei Jia, Somshubra Majumdar, Boris Ginsburg:
MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection. CoRR abs/2010.13886 (2020)

2010 – 2019

2019
[c7]
  - https://dblp.org/rec/conf/interspeech/LiLGLKCNG19
Jason Li, Vitaly Lavrukhin, Boris Ginsburg, Ryan Leary, Oleksii Kuchaiev, Jonathan M. Cohen, Huyen Nguyen, Ravi Teja Gadde:
Jasper: An End-to-End Convolutional Neural Acoustic Model. INTERSPEECH 2019: 71-75
[i12]
  - https://dblp.org/rec/journals/corr/abs-1904-03288
Jason Li, Vitaly Lavrukhin, Boris Ginsburg, Ryan Leary, Oleksii Kuchaiev, Jonathan M. Cohen, Huyen Nguyen, Ravi Teja Gadde:
Jasper: An End-to-End Convolutional Neural Acoustic Model. CoRR abs/1904.03288 (2019)
[i11]
  - https://dblp.org/rec/journals/corr/abs-1905-11286
Boris Ginsburg, Patrice Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Huyen Nguyen, Jonathan M. Cohen:
Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks. CoRR abs/1905.11286 (2019)
[i10]
  - https://dblp.org/rec/journals/corr/abs-1909-09577
Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen:
NeMo: a toolkit for building AI applications using Neural Modules. CoRR abs/1909.09577 (2019)
[i9]
  - https://dblp.org/rec/journals/corr/abs-1910-10697
Oleksii Hrinchuk, Mariya Popova, Boris Ginsburg:
Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model. CoRR abs/1910.10697 (2019)
2018
[j1]
  - https://dblp.org/rec/journals/cmbbeiv/DubrovinaKGHK18
Anastasia Dubrovina, Pavel Kisilev, Boris Ginsburg, Sharbell Y. Hashoul, Ron Kimmel:
Computational mammography using deep neural networks. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 6(3): 243-247 (2018)
[c6]
  - https://dblp.org/rec/conf/iclr/JinGK18
Peter H. Jin, Boris Ginsburg, Kurt Keutzer:
Spatially Parallel Convolutions. ICLR (Workshop) 2018
[c5]
  - https://dblp.org/rec/conf/iclr/MicikeviciusNAD18
Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory F. Diamos, Erich Elsen, David García, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu:
Mixed Precision Training. ICLR (Poster) 2018
[i8]
  - https://dblp.org/rec/journals/corr/abs-1805-10387
Oleksii Kuchaiev, Boris Ginsburg, Igor Gitman, Vitaly Lavrukhin, Carl Case, Paulius Micikevicius:
OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models. CoRR abs/1805.10387 (2018)
[i7]
  - https://dblp.org/rec/journals/corr/abs-1811-00707
Jason Li, Ravi Gadde, Boris Ginsburg, Vitaly Lavrukhin:
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation. CoRR abs/1811.00707 (2018)
2017
[c4]
  - https://dblp.org/rec/conf/iclr/KuchaievG17
Oleksii Kuchaiev, Boris Ginsburg:
Factorization tricks for LSTM networks. ICLR (Workshop) 2017
[c3]
  - https://dblp.org/rec/conf/iclr/VincentSFGD17
Kevin Vincent, Kevin Stephano, Michael A. Frumkin, Boris Ginsburg, Julien Demouth:
On Improving the Numerical Stability of Winograd Convolutions. ICLR (Workshop) 2017
[i6]
  - https://dblp.org/rec/journals/corr/KuchaievG17
Oleksii Kuchaiev, Boris Ginsburg:
Factorization tricks for LSTM networks. CoRR abs/1703.10722 (2017)
[i5]
  - https://dblp.org/rec/journals/corr/abs-1708-01715
Oleksii Kuchaiev, Boris Ginsburg:
Training Deep AutoEncoders for Collaborative Filtering. CoRR abs/1708.01715 (2017)
[i4]
  - https://dblp.org/rec/journals/corr/abs-1708-03888
Yang You, Igor Gitman, Boris Ginsburg:
Scaling SGD Batch Size to 32K for ImageNet Training. CoRR abs/1708.03888 (2017)
[i3]
  - https://dblp.org/rec/journals/corr/abs-1709-08145
Igor Gitman, Boris Ginsburg:
Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification. CoRR abs/1709.08145 (2017)
[i2]
  - https://dblp.org/rec/journals/corr/abs-1710-03740
Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory F. Diamos, Erich Elsen, David García, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu:
Mixed Precision Training. CoRR abs/1710.03740 (2017)
2016
[c2]
  - https://dblp.org/rec/conf/nips/RichardsonHGZ16
Elad Richardson, Rom Herskovitz, Boris Ginsburg, Michael Zibulevsky:
SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques. NIPS 2016: 1534-1542
[i1]
  - https://dblp.org/rec/journals/corr/RichardsonHGZ16
Elad Richardson, Rom Herskovitz, Boris Ginsburg, Michael Zibulevsky:
SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques. CoRR abs/1609.00629 (2016)

2000 – 2009

2002
[c1]
  - https://dblp.org/rec/conf/tacas/ArmoniFFGGKLMSTVZ02
Roy Armoni, Limor Fix, Alon Flaisher, Rob Gerth, Boris Ginsburg, Tomer Kanza, Avner Landver, Sela Mador-Haim, Eli Singerman, Andreas Tiemeyer, Moshe Y. Vardi, Yael Zbar:
The ForSpec Temporal Logic: A New Temporal Property-Specification Language. TACAS 2002: 296-311
