Authors: Besay Montesdeoca ; Julián Luengo ; Jesús Maillo ; Diego García-Gil ; Salvador García and Francisco Herrera

Affiliation: Dept. of Computer Science and Artificial Intelligence, University of Granada, Granada, E-18071, Spain

Keyword(s): Big Data, Missing Values, Imputation, k-Means, Fuzzy k-Means.

Abstract: Although most techniques and algorithms assume that the data is accurate, measurements in our analog world are far from perfect. Since our capacity to store and process data grows every day, these imperfections will accumulate, leading to poorer decisions and hindering any knowledge extraction process carried out over the raw data. One of the most disruptive imperfections is the presence of missing values. Many inductive algorithms assume that the data is complete, so when they face missing data they either do not work properly or extract lower-quality knowledge. At this point there is no sophisticated missing values treatment implemented in any major Big Data framework. In this contribution, we present two novel clustering-based imputation methods that achieve better results than simply removing the faulty examples or filling in the missing values with the mean, and that can be easily ported to Spark’s MLlib.

CC BY-NC-ND 4.0

Paper citation in several formats:
Montesdeoca, B.; Luengo, J.; Maillo, J.; García-Gil, D.; García, S. and Herrera, F. (2019). A First Approach on Big Data Missing Values Imputation. In Proceedings of the 4th International Conference on Internet of Things, Big Data and Security - IoTBDS; ISBN 978-989-758-369-8; ISSN 2184-4976, SciTePress, pages 315-323. DOI: 10.5220/0007738403150323

@conference{iotbds19,
author={Besay Montesdeoca and Julián Luengo and Jesús Maillo and Diego García{-}Gil and Salvador García and Francisco Herrera},
title={A First Approach on Big Data Missing Values Imputation},
booktitle={Proceedings of the 4th International Conference on Internet of Things, Big Data and Security - IoTBDS},
year={2019},
pages={315--323},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007738403150323},
isbn={978-989-758-369-8},
issn={2184-4976},
}

TY - CONF
JO - Proceedings of the 4th International Conference on Internet of Things, Big Data and Security - IoTBDS
TI - A First Approach on Big Data Missing Values Imputation
SN - 978-989-758-369-8
IS - 2184-4976
AU - Montesdeoca, B.
AU - Luengo, J.
AU - Maillo, J.
AU - García-Gil, D.
AU - García, S.
AU - Herrera, F.
PY - 2019
SP - 315
EP - 323
DO - 10.5220/0007738403150323
PB - SciTePress