Dimensionality Reduction for Machine Learning-based Argument Mining

Andrés Segura-Tinoco, Iván Cantador


Abstract
Recent approaches to argument mining have focused on training machine learning algorithms on annotated text corpora, using high-dimensional linguistic feature vectors as input. Unlike previous work, this paper presents a preliminary investigation of the potential benefits of reducing the dimensionality of the input data. Through an empirical study that tests SVD, PCA and LDA techniques on a new argumentative corpus in Spanish for an underexplored domain (e-participation), and that uses a novel, rich argument model, we show positive results in terms of both computational efficiency and argumentative information extraction effectiveness for the three major argument mining tasks: argumentative fragment detection, argument component classification, and argumentative relation recognition. In a space whose dimension is around 3-4% of the number of input features, the argument mining methods reach 95-97% of the performance achieved by using the entire corpus, and even surpass it in some cases.
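To make the "3-4% of the number of input features" figure concrete, the following is a minimal sketch of a PCA-style reduction computed through an SVD, using only NumPy. It is not the authors' implementation (their code is in the software archive listed below); the function name `pca_reduce`, the `ratio` parameter, and the random matrix standing in for linguistic feature vectors are all assumptions made for illustration.

```python
import numpy as np


def pca_reduce(X, ratio=0.04):
    """Project X (n_samples x n_features) onto its top principal components.

    `ratio` is the fraction of the original feature count to keep
    (hypothetical parameter; 0.04 mirrors the 3-4% range in the abstract).
    """
    n_samples, n_features = X.shape
    # Centered data has rank at most min(n_samples - 1, n_features).
    k = max(1, min(int(round(ratio * n_features)), n_samples - 1, n_features))
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD of the centered data: rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    return Xc @ components.T, components, mean


# Synthetic stand-in for a feature matrix: 200 samples, 500 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 500))

Z, components, mean = pca_reduce(X, ratio=0.04)
print(Z.shape)  # (200, 20): 4% of 500 features
```

A downstream classifier would then be trained on `Z` instead of `X`; unseen feature vectors would be projected with the same `mean` and `components` so that training and test data share one reduced space.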
Anthology ID:
2023.argmining-1.9
Volume:
Proceedings of the 10th Workshop on Argument Mining
Month:
December
Year:
2023
Address:
Singapore
Editors:
Milad Alshomary, Chung-Chi Chen, Smaranda Muresan, Joonsuk Park, Julia Romberg
Venues:
ArgMining | WS
Publisher:
Association for Computational Linguistics
Pages:
89–99
URL:
https://aclanthology.org/2023.argmining-1.9
DOI:
10.18653/v1/2023.argmining-1.9
Cite (ACL):
Andrés Segura-Tinoco and Iván Cantador. 2023. Dimensionality Reduction for Machine Learning-based Argument Mining. In Proceedings of the 10th Workshop on Argument Mining, pages 89–99, Singapore. Association for Computational Linguistics.
Cite (Informal):
Dimensionality Reduction for Machine Learning-based Argument Mining (Segura-Tinoco & Cantador, ArgMining-WS 2023)
PDF:
https://aclanthology.org/2023.argmining-1.9.pdf
Software:
 2023.argmining-1.9.Software.zip