2nd Eval4NLP 2021: Punta Cana, Dominican Republic
- Yang Gao, Steffen Eger, Wei Zhao, Piyawat Lertvittayakumjorn, Marina Fomicheva (eds.): Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, Eval4NLP 2021, Punta Cana, Dominican Republic, November 10, 2021. Association for Computational Linguistics 2021, ISBN 978-1-954085-88-6
- Lucie Gianola, Hicham El Boukkouri, Cyril Grouin, Thomas Lavergne, Patrick Paroubek, Pierre Zweigenbaum: Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing. 1-10
- Qingkai Zeng, Mengxia Yu, Wenhao Yu, Tianwen Jiang, Meng Jiang: Validating Label Consistency in NER Data Annotation. 11-15
- Urja Khurana, Eric T. Nalisnick, Antske Fokkens: How Emotionally Stable is ALBERT? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task. 16-31
- Alexey Tikhonov, Igor Samenko, Ivan P. Yamshchikov: StoryDB: Broad Multi-language Narrative Dataset. 32-39
- Chester Palen-Michel, Nolan Holley, Constantine Lignos: SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation. 40-50
- Nicolas Garneau, Luc Lamontagne: Trainable Ranking Models to Evaluate the Semantic Accuracy of Data-to-Text Neural Generator. 51-61
- Yo Ehara: Evaluation of Unsupervised Automatic Readability Assessors Using Rank Correlations. 62-72
- Heather Lent, Semih Yavuz, Tao Yu, Tong Niu, Yingbo Zhou, Dragomir Radev, Xi Victoria Lin: Testing Cross-Database Semantic Parsers With Canonical Utterances. 73-83
- Enzo Terreau, Antoine Gourru, Julien Velcin: Writing Style Author Embedding Evaluation. 84-93
- Oleg V. Vasilyev, John Bohannon: ESTIME: Estimation of Summary-to-Text Inconsistency by Mismatched Embeddings. 94-103
- Yang Liu, Alan Medlar, Dorota Glowacka: Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings. 104-113
- Emma Manning, Nathan Schneider: Referenceless Parsing-Based Evaluation of AMR-to-English Generation. 114-122
- Ayush Garg, Sammed S. Kagi, Vivek Srivastava, Mayank Singh: MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation. 123-132
- Marcos V. Treviso, Nuno Miguel Guerreiro, Ricardo Rei, André F. T. Martins: IST-Unbabel 2021 Submission for the Explainable Quality Estimation Shared Task. 133-145
- Raphael Rubino, Atsushi Fujita, Benjamin Marie: Error Identification for Machine Translation with Metric Embedding and Attention. 146-156
- Christoph Wolfgang Leiter: Reference-Free Word- and Sentence-Level Translation Evaluation with Token-Matching Metrics. 157-164
- Marina Fomicheva, Piyawat Lertvittayakumjorn, Wei Zhao, Steffen Eger, Yang Gao: The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results. 165-178
- Benjamin Murauer, Günther Specht: Developing a Benchmark for Reducing Data Bias in Authorship Attribution. 179-188
- David Chen, Maury Courtland, Adam Faulkner, Aysu Ezen-Can: Error-Sensitive Evaluation for Ordinal Target Variables. 189-199
- Vivek Srivastava, Mayank Singh: HinGE: A Dataset for Generation and Evaluation of Code-Mixed Hinglish Text. 200-208
- Oskar Wysocki, Malina Florea, Dónal Landers, André Freitas: What is SemEval evaluating? A Systematic Analysis of Evaluation Campaigns in NLP. 209-229
- Tasnim Kabir, Marine Carpuat: The UMD Submission to the Explainable MT Quality Estimation Shared Task: Combining Explanation Models with Sequence Labeling. 230-237
- Melda Eksi, Erik Gelbing, Jonathan Stieber, Chi Viet Vu: Explaining Errors in Machine Translation with Absolute Gradient Ensembles. 238-249
- Peter Polák, Muskaan Singh, Ondrej Bojar: Explainable Quality Estimation: CUNI Eval4NLP Submission. 250-255