Search Results (392)

Search Parameters:
Keywords = voting classifier

17 pages, 20963 KiB  
Article
Landslide Susceptibility Mapping Based on Ensemble Learning in the Jiuzhaigou Region, Sichuan, China
by Bangsheng An, Zhijie Zhang, Shenqing Xiong, Wanchang Zhang, Yaning Yi, Zhixin Liu and Chuanqi Liu
Remote Sens. 2024, 16(22), 4218; https://doi.org/10.3390/rs16224218 - 12 Nov 2024
Viewed by 364
Abstract
Accurate landslide susceptibility mapping is vital for disaster forecasting and risk management. To address the problem of limited accuracy of individual classifiers and lack of model interpretability in machine learning-based models, a coupled multi-model framework for landslide susceptibility mapping is proposed. Using Jiuzhaigou County, Sichuan Province, as a case study, we developed an evaluation index system incorporating 14 factors. We employed three base models (logistic regression, support vector machine, and Gaussian Naive Bayes), assessed through four ensemble methods: Stacking, Voting, Bagging, and Boosting. The decision mechanisms of these models were explained via a SHAP (SHapley Additive exPlanations) analysis. Results demonstrate that integrating machine learning with ensemble learning and SHAP yields more reliable landslide susceptibility mapping and enhances model interpretability. This approach effectively addresses the challenges of unreliable landslide susceptibility mapping in complex environments.
(This article belongs to the Special Issue Remote Sensing Data for Modeling and Managing Natural Disasters)

11 pages, 951 KiB  
Article
A Machine Learning Model Based on Thyroid US Radiomics to Discriminate Between Benign and Malignant Nodules
by Antonino Guerrisi, Elena Seri, Vincenzo Dolcetti, Ludovica Miseo, Fulvia Elia, Gianmarco Lo Conte, Giovanni Del Gaudio, Patrizia Pacini, Angelo Barbato, Emanuele David and Vito Cantisani
Cancers 2024, 16(22), 3775; https://doi.org/10.3390/cancers16223775 - 8 Nov 2024
Viewed by 429
Abstract
Background/Objectives: Thyroid nodules are a very common finding, mostly benign but sometimes malignant, and thus require accurate diagnosis. Ultrasound and fine needle biopsy are the most widely used and reliable diagnostic methods to date, but their ability to distinguish benign from malignant nodules is sometimes limited, for ultrasound mainly by the operator’s experience. Radiomics (quantitative feature extraction from medical images) and machine learning offer promising avenues to improve diagnosis. The aim of this work was to develop a machine learning model based on thyroid ultrasound images to classify nodules into benign and malignant classes. Methods: For this purpose, ultrasonography images from 142 subjects were collected. Among these subjects, 40 patients (28.2%) belonged to the class “malignant” and 102 patients (71.8%) belonged to the class “benign”, according to histological diagnosis from fine-needle aspiration. This image set was used for the training, cross-validation and internal testing of three different machine learning models. A robust radiomic approach was applied, under the hypothesis that the radiomic features could capture the disease heterogeneity between the two groups. Three models consisting of four ensembles of machine learning classifiers (random forests, support vector machines and k-nearest neighbor classifiers) were developed for the binary classification task of interest. The best performing model was then externally tested on a cohort of 21 new patients.
Results: The best model (ensemble of random forest) showed Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) (%) of 85 (majority vote), 83.7 ** (mean) [80.2–87.2], accuracy (%) of 83, 81.2 ** [77.1–85.2], sensitivity (%) of 70, 67.5 ** [64.3–70.7], specificity (%) of 88, 86.5 ** [82–91], positive predictive value (PPV) (%) of 70, 66.5 ** [57.9–75.1] and negative predictive value (NPV) (%) of 88, 87.1 ** [85.5–88.8] (* p < 0.05, ** p < 0.005) in the internal test cohort. It achieved an accuracy of 90.5%, a sensitivity of 100%, a specificity of 86.7%, a PPV of 75% and an NPV of 100% in the external testing cohort. Conclusions: The model consisting of four ensembles of random forest classifiers identified all the malignant nodules and the large majority of the benign ones in the external testing cohort.
(This article belongs to the Section Methods and Technologies Development)
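The external-cohort figures quoted in the abstract above can be checked with a short worked example. The confusion-matrix counts below are not stated in the abstract; they are inferred as the only split of 21 patients that matches all five reported percentages, and the function name `diagnostic_metrics` is made up for this sketch.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard binary diagnostic metrics as fractions in [0, 1]."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of malignant cases detected
        "specificity": tn / (tn + fp),  # share of benign cases cleared
        "ppv": tp / (tp + fp),          # precision of a "malignant" call
        "npv": tn / (tn + fn),          # precision of a "benign" call
    }


# Inferred counts (assumption): 6 malignant and 15 benign nodules, 21 in total.
metrics = diagnostic_metrics(tp=6, fp=2, tn=13, fn=0)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts the sketch gives 90.5% accuracy, 100% sensitivity, 86.7% specificity, 75% PPV and 100% NPV, which agrees with the abstract.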

17 pages, 8064 KiB  
Article
Ensemble Pretrained Convolutional Neural Networks for the Classification of Insulator Surface Conditions
by Arailym Serikbay, Mehdi Bagheri, Amin Zollanvari and B. T. Phung
Energies 2024, 17(22), 5595; https://doi.org/10.3390/en17225595 - 8 Nov 2024
Viewed by 535
Abstract
Overhead transmission line insulators are non-conductive materials that separate conductors from grounded transmission towers. Once in operation, they frequently experience environmental pollution and electrical or mechanical stress. Since adverse operational conditions can lead to insulation failure, regular inspections are essential to prevent power outages. To this end, this paper proposes a novel technique based on deep convolutional neural networks (CNNs) to classify high-voltage insulator surface conditions based on their image. Successful applications of CNNs in computer vision have led to several pretrained architectures for image classification. To use these pretrained models, a practitioner typically fine-tunes and selects one final model via a model selection stage and discards all other models. In contrast with many existing studies that use such a “winner-takes-all” approach, here, we identify the best subset of seven popular pretrained CNN architectures that are combined by soft voting to form an ensemble classifier. From a machine learning (ML) perspective, this focus is warranted because the convolutional base of each pretrained architecture operates as a feature extractor and an ensemble of them works as a combination of various feature extraction rules. Our numerical experiments demonstrate the advantage of the identified ensemble model over individual pretrained architectures.
(This article belongs to the Section F: Electrical Engineering)
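As a rough illustration of the idea in the abstract above, the sketch below averages class probabilities (soft voting) and searches every non-empty subset of models for the one with the best validation accuracy. The abstract does not say how the best subset was identified, so the exhaustive search is an assumption, and the probabilities, labels and function names are hypothetical.

```python
from itertools import combinations


def soft_vote(prob_rows):
    """Average the models' class-probability vectors and return the argmax class."""
    n_classes = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / len(prob_rows) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def subset_accuracy(subset, val_probs, labels):
    """Validation accuracy of the soft-voting ensemble built from `subset`."""
    hits = 0
    for i, label in enumerate(labels):
        hits += soft_vote([val_probs[m][i] for m in subset]) == label
    return hits / len(labels)


def best_soft_voting_subset(val_probs, labels):
    """Try every non-empty subset of models; keep the first one with the top accuracy."""
    best_subset, best_accuracy = None, -1.0
    n_models = len(val_probs)
    for size in range(1, n_models + 1):
        for subset in combinations(range(n_models), size):
            accuracy = subset_accuracy(subset, val_probs, labels)
            if accuracy > best_accuracy:
                best_subset, best_accuracy = subset, accuracy
    return best_subset, best_accuracy


# Hypothetical validation outputs: val_probs[model][sample] = [p(class 0), p(class 1)].
val_probs = [
    [[0.90, 0.10], [0.40, 0.60], [0.60, 0.40], [0.70, 0.30]],  # model 0
    [[0.60, 0.40], [0.55, 0.45], [0.20, 0.80], [0.80, 0.20]],  # model 1
    [[0.30, 0.70], [0.60, 0.40], [0.45, 0.55], [0.40, 0.60]],  # model 2
]
labels = [0, 1, 1, 0]

best_subset, best_accuracy = best_soft_voting_subset(val_probs, labels)
print(best_subset, best_accuracy)
```

In this toy data, models 0 and 1 each misclassify one sample on their own but are all correct when averaged together, while adding the weaker model 2 lowers accuracy again. With seven candidate models the same loop would cover 127 subsets.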

24 pages, 453 KiB  
Article
An Effective Ensemble Approach for Preventing and Detecting Phishing Attacks in Textual Form
by Zaher Salah, Hamza Abu Owida, Esraa Abu Elsoud, Esraa Alhenawi, Suhaila Abuowaida and Nawaf Alshdaifat
Future Internet 2024, 16(11), 414; https://doi.org/10.3390/fi16110414 - 8 Nov 2024
Viewed by 1280
Abstract
Phishing email assaults have been a prevalent cybercriminal tactic for many decades. Various detectors have been suggested over time that rely on textual information. However, to address the growing prevalence of phishing emails, more sophisticated techniques are required to use all aspects of emails to improve the detection capabilities of machine learning classifiers. This paper presents a novel approach to detecting phishing emails. The proposed methodology combines ensemble learning techniques with various variables, such as word frequency, the presence of specific keywords or phrases, and email length, to improve detection accuracy. We provide two approaches for the planned task: the first technique employs ensemble learning soft voting, while the second employs weighted ensemble learning. Both strategies use distinct machine learning algorithms to concurrently process the characteristics, reducing their complexity and enhancing the model’s performance. An extensive assessment and analysis are conducted, considering unique criteria designed to minimize biased and inaccurate findings. Our empirical experiments demonstrate that using ensemble learning to merge attributes in the evolution of phishing emails showcases the competitive performance of ensemble learning over other machine learning algorithms. This superiority is underscored by achieving an F1-score of 0.90 in the weighted ensemble method and 0.85 in the soft voting method, showcasing the effectiveness of this approach.
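The difference between the two strategies named above can be sketched in a few lines: plain soft voting is the special case of a weighted average in which every model has the same weight. The probabilities and weights here are hypothetical, and the abstract does not say how the authors chose their weights.

```python
def weighted_soft_vote(prob_rows, weights):
    """Weighted average of class-probability vectors; returns (class index, averaged vector)."""
    total = sum(weights)
    n_classes = len(prob_rows[0])
    combined = [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=combined.__getitem__), combined


# Hypothetical outputs of three classifiers for one email: [p(legitimate), p(phishing)].
probs = [[0.80, 0.20], [0.40, 0.60], [0.45, 0.55]]

equal = weighted_soft_vote(probs, [1, 1, 1])     # plain soft voting
weighted = weighted_soft_vote(probs, [1, 3, 3])  # trust the last two models more
print(equal[0], weighted[0])
```

With equal weights the confident first model carries the decision (class 0, legitimate). Once the other two models are weighted three times higher, the decision flips to class 1 (phishing).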

14 pages, 2089 KiB  
Article
A Fast and Cost-Effective Electronic Nose Model for Methanol Detection Using Ensemble Learning
by Bilge Han Tozlu
Chemosensors 2024, 12(11), 225; https://doi.org/10.3390/chemosensors12110225 - 29 Oct 2024
Viewed by 582
Abstract
Methanol, commonly used to cut costs in the production of counterfeit alcohol, is extremely harmful to human health, potentially leading to severe outcomes, including death. In this study, an electronic nose system was designed using 11 inexpensive gas sensors to detect the proportion of methanol in an alcohol mixture. A total of 168 odor samples were taken and analyzed from eight types of ethanol–methanol mixtures prepared at different concentrations. Only 4 features out of 264 were selected using the feature selection method based on feature importance. These four features were extracted from the data of MQ-3, MQ-4, and MQ-137 sensors, and the classification process was carried out using the data of these sensors. A Voting Classifier, an ensemble model, was used with Linear Discriminant Analysis, Support Vector Machines, and Extra Trees algorithms. The Voting Classifier achieved 85.88% classification accuracy before and 81.85% after feature selection. With its cost effectiveness, fast processing time, and practicality, the recommended system shows great potential for detecting methanol, which threatens human health in counterfeit drink production.
(This article belongs to the Special Issue Gas Sensors and Electronic Noses for the Real Condition Sensing)
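For readers unfamiliar with the term, a voting classifier in its simplest (hard-voting) form returns the label that most base classifiers predict. The abstract does not say whether hard or soft voting was used, so the sketch below is only an illustration of the hard variant; the mixture labels and predictions are invented.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the most frequent label; on a tie, the label seen first wins."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical predictions for three samples, in the order (LDA, SVM, Extra Trees).
votes_per_sample = [
    ["low", "low", "high"],
    ["high", "low", "high"],
    ["low", "low", "low"],
]
ensemble_labels = [hard_vote(votes) for votes in votes_per_sample]
print(ensemble_labels)
```

A single dissenting classifier is outvoted in the first two samples, which is the mechanism that lets an ensemble correct errors its members do not share.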

16 pages, 3470 KiB  
Article
YOLOv8-Based Estimation of Estrus in Sows Through Reproductive Organ Swelling Analysis Using a Single Camera
by Iyad Almadani, Mohammed Abuhussein and Aaron L. Robinson
Digital 2024, 4(4), 898-913; https://doi.org/10.3390/digital4040044 - 27 Oct 2024
Viewed by 573
Abstract
Accurate and efficient estrus detection in sows is crucial in modern agricultural practices to ensure optimal reproductive health and successful breeding outcomes. A non-contact method using computer vision to detect a change in a sow’s vulva size holds great promise for automating and enhancing this critical process. However, achieving precise and reliable results depends heavily on maintaining a consistent camera distance during image capture. Variations in camera distance can lead to erroneous estrus estimations, potentially resulting in missed breeding opportunities or false positives. To address this challenge, we propose a robust six-step methodology, accompanied by three stages of evaluation. First, we carefully annotated masks around the vulva to ensure an accurate pixel perimeter calculation of its shape. Next, we meticulously identified keypoints on the sow’s vulva, which enabled precise tracking and analysis of its features. We then harnessed the power of machine learning to train our model using annotated images, which facilitated keypoint detection and segmentation with the state-of-the-art YOLOv8 algorithm. By identifying the keypoints, we performed precise calculations of the Euclidean distances: first, between each labium (horizontal distance), and second, between the clitoris and the perineum (vertical distance). Additionally, by segmenting the vulva’s size, we gained valuable insights into its shape, which helped with performing precise perimeter measurements. Equally important was our effort to calibrate the camera using monocular depth estimation. This calibration helped establish a functional relationship between the measurements on the image (such as the distances between the labia and from the clitoris to the perineum, and the vulva perimeter) and the depth distance to the camera, which enabled accurate adjustments and calibration for our analysis. 
Lastly, we present a classification method for distinguishing between estrus and non-estrus states in subjects based on the pixel width, pixel length, and perimeter measurements. The method calculated the Euclidean distances between a new data point and reference points from two datasets: “estrus data” and “not estrus data”. Using custom distance functions, we computed the distances for each measurement dimension and aggregated them to determine the overall similarity. The classification process involved identifying the three nearest neighbors of the datasets and employing a majority voting mechanism to assign a label. A new data point was classified as “estrus” if the majority of the nearest neighbors were labeled as estrus; otherwise, it was classified as “non-estrus”. This method provided a robust approach for automated classification, which aided in more accurate and efficient detection of the estrus states. To validate our approach, we propose three evaluation stages. In the first stage, we calculated the Mean Squared Error (MSE) between the ground truth keypoints of the labia distance and the distance between the predicted keypoints, and we performed the same calculation for the distance between the clitoris and perineum. Then, we provided a quantitative analysis and performance comparison, including a comparison between our previous U-Net model and our new YOLOv8 segmentation model. This comparison focused on each model’s performance in terms of accuracy and speed, which highlighted the advantages of our new approach. Lastly, we evaluated the estrus–not-estrus classification model by defining the confusion matrix. By using this comprehensive approach, we significantly enhanced the accuracy of estrus detection in sows while effectively mitigating human errors and resource wastage. 
The automation and optimization of this critical process hold the potential to revolutionize estrus detection in agriculture, which will contribute to improved reproductive health management and elevate breeding outcomes to new heights. Through extensive evaluation and experimentation, our research aimed to demonstrate the transformative capabilities of computer vision techniques, paving the way for more advanced and efficient practices in the agricultural domain.
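The classification step described in this abstract (three nearest neighbours across two reference sets, then a majority vote) can be sketched as follows. The abstract mentions custom per-dimension distance functions that it does not specify, so this sketch substitutes the plain Euclidean distance over (width, length, perimeter); the reference measurements and the function name are hypothetical.

```python
import math
from collections import Counter


def classify_estrus(point, estrus_refs, non_estrus_refs, k=3):
    """Label `point` by majority vote among its k nearest reference points."""
    labeled = [(math.dist(point, ref), "estrus") for ref in estrus_refs]
    labeled += [(math.dist(point, ref), "non-estrus") for ref in non_estrus_refs]
    nearest = sorted(labeled, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    # "estrus" needs a strict majority; otherwise the point is "non-estrus".
    return "estrus" if votes["estrus"] > k / 2 else "non-estrus"


# Hypothetical reference measurements in pixels: (width, length, perimeter).
estrus_refs = [(60, 110, 300), (62, 115, 310), (58, 108, 295)]
non_estrus_refs = [(40, 90, 230), (42, 92, 240), (45, 95, 250)]

print(classify_estrus((57, 105, 290), estrus_refs, non_estrus_refs))
print(classify_estrus((44, 94, 245), estrus_refs, non_estrus_refs))
```

Because the three measurements have different ranges, the perimeter dominates an unscaled Euclidean distance. That may be one reason the authors aggregate per-dimension distances instead, although the abstract does not say so.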

23 pages, 1033 KiB  
Article
A Hybrid Ensemble Approach for Greek Text Classification Based on Multilingual Models
by Charalampos M. Liapis, Konstantinos Kyritsis, Isidoros Perikos, Nikolaos Spatiotis and Michael Paraskevas
Big Data Cogn. Comput. 2024, 8(10), 137; https://doi.org/10.3390/bdcc8100137 - 14 Oct 2024
Viewed by 781
Abstract
The present study explores the field of text classification in the Greek language. A novel ensemble classification scheme based on generated embeddings from Greek text made by the multilingual capabilities of the E5 model is presented. Our approach incorporates partial transfer learning by using pre-trained models to extract embeddings, enabling the evaluation of classical classifiers on Greek data. Additionally, we enhance the predictive capability while keeping costs low by employing a soft voting combination scheme that exploits the strengths of XGBoost, K-nearest neighbors, and logistic regression. This method significantly improves all classification metrics, demonstrating the superiority of ensemble techniques in handling the complexity of Greek textual data. Our study contributes to the field of natural language processing by proposing an effective ensemble framework for the categorization of Greek texts, leveraging the advantages of both traditional and modern machine learning techniques. This framework has the potential to be applied to other less-resourced languages, thereby broadening the impact of our research beyond Greek language processing.

13 pages, 2288 KiB  
Article
A Longitudinal Model for a Dynamic Risk Score to Predict Delayed Cerebral Ischemia after Subarachnoid Hemorrhage
by Jan F. Willms, Corinne Inauen, Stefan Yu Bögli, Carl Muroi, Jens M. Boss and Emanuela Keller
Bioengineering 2024, 11(10), 988; https://doi.org/10.3390/bioengineering11100988 - 30 Sep 2024
Viewed by 660
Abstract
Background: Accurate longitudinal risk prediction for DCI (delayed cerebral ischemia) occurrence after subarachnoid hemorrhage (SAH) is essential for clinicians to administer appropriate and timely diagnostics, thereby improving treatment planning and outcome. This study aimed to develop an improved longitudinal DCI prediction model and evaluate its performance in predicting DCI between days 4 and 14 after aneurysm rupture. Methods: Two DCI classification models were trained: (1) a static model based on routinely collected demographics and SAH grading scores and (2) a dynamic model based on results from laboratory and blood gas analysis anchored at the time of DCI. A combined model was derived from these two using a voting approach. Multiple classifiers, including Logistic Regression, Support Vector Machines, Random Forests, Histogram-based Gradient Boosting, and Extremely Randomized Trees, were evaluated through cross-validation using anchored data. A leave-one-out simulation was then performed on the best-performing models to evaluate their longitudinal performance using time-dependent Receiver Operating Characteristic (ROC) analysis. Results: The training dataset included 218 patients, with 89 of them developing DCI (41%). In the anchored ROC analysis, the combined model achieved a ROC AUC of 0.73 ± 0.05 in predicting DCI onset; the static and the dynamic model achieved ROC AUCs of 0.69 ± 0.08 and 0.66 ± 0.08, respectively. In the leave-one-out simulation experiments, the dynamic and voting models showed a highly dynamic risk score for DCI occurrence over the course of disease (intra-patient score range was 0.25 [0.24, 0.49] and 0.17 [0.12, 0.25] for the dynamic and the voting model, respectively). In the time-dependent ROC analysis, the dynamic model performed best until day 5.4, and afterwards the voting model showed the best performance.
Conclusions: A machine learning model for longitudinal DCI risk assessment was developed comprising a static and a dynamic sub-model. The longitudinal performance evaluation highlighted substantial time dependence in model performance, underscoring the need for a longitudinal assessment of prediction models in intensive care settings. Moreover, clinicians need to be aware of these performance variations when performing a risk assessment and weight the different model outputs accordingly.

22 pages, 2550 KiB  
Article
Ensemble Fusion Models Using Various Strategies and Machine Learning for EEG Classification
by Sunil Kumar Prabhakar, Jae Jun Lee and Dong-Ok Won
Bioengineering 2024, 11(10), 986; https://doi.org/10.3390/bioengineering11100986 - 29 Sep 2024
Viewed by 871
Abstract
Electroencephalography (EEG) helps to assess the electrical activities of the brain so that the neuronal activities of the brain are captured effectively. EEG is used to analyze many neurological disorders, as it is a low-cost technique. To diagnose and treat every neurological disorder, lengthy EEG signals are needed, and different machine learning and deep learning techniques have been developed so that the EEG signals can be classified automatically. In this work, five ensemble models are proposed for EEG signal classification, and the main neurological disorder analyzed in this paper is epilepsy. The first proposed ensemble technique utilizes an equidistant assessment and ranking determination mode with the proposed Enhance the Sum of Connection and Distance (ESCD)-based feature selection technique for the classification of EEG signals; the second proposed ensemble technique utilizes the concept of Infinite Independent Component Analysis (I-ICA) and multiple classifiers with majority voting concept; the third proposed ensemble technique utilizes the concept of Genetic Algorithm (GA)-based feature selection technique and bagging Support Vector Machine (SVM)-based classification model; the fourth proposed ensemble technique utilizes the concept of Hilbert Huang Transform (HHT) and multiple classifiers with GA-based multiparameter optimization; and the fifth proposed ensemble technique utilizes the concept of Factor analysis with Ensemble layer K nearest neighbor (KNN) classifier. The best results are obtained when the Ensemble hybrid model using the equidistant assessment and ranking determination method with the proposed ESCD-based feature selection technique and Support Vector Machine (SVM) classifier is utilized, achieving a classification accuracy of 89.98%.
(This article belongs to the Special Issue Machine Learning Technology in Predictive Healthcare)

18 pages, 7989 KiB  
Article
Intelligent Dance Motion Evaluation: An Evaluation Method Based on Keyframe Acquisition According to Musical Beat Features
by Hengzi Li and Xingli Huang
Sensors 2024, 24(19), 6278; https://doi.org/10.3390/s24196278 - 28 Sep 2024
Viewed by 715
Abstract
Motion perception is crucial in competitive sports like dance, basketball, and diving. However, evaluations in these sports heavily rely on professionals, posing two main challenges: subjective assessments are uncertain and can be influenced by experience, making it hard to guarantee timeliness and accuracy, and increasing labor costs with multi-expert voting. While video analysis methods have alleviated some pressure, challenges remain in extracting key points/frames from videos and constructing a suitable, quantifiable evaluation method that aligns with the static–dynamic nature of movements for accurate assessment. Therefore, this study proposes an innovative intelligent evaluation method aimed at enhancing the accuracy and processing speed of complex video analysis tasks. Firstly, by constructing a keyframe extraction method based on musical beat detection, coupled with prior knowledge, the beat detection is optimized through a perceptually weighted window to accurately extract keyframes that are highly correlated with dance movement changes. Secondly, OpenPose is employed to detect human joint points in the keyframes, quantifying human movements into a series of numerically expressed nodes and their relationships (i.e., pose descriptions). Combined with the positions of keyframes in the time sequence, a standard pose description sequence is formed, serving as the foundational data for subsequent quantitative evaluations. Lastly, an Action Sequence Evaluation method (ASCS) is established based on all action features within a single action frame to precisely assess the overall performance of individual actions. Furthermore, drawing inspiration from the Rouge-L evaluation method in natural language processing, a Similarity Measure Approach based on Contextual Relationships (SMACR) is constructed, focusing on evaluating the coherence of actions. 
By integrating ASCS and SMACR, a comprehensive evaluation of dancers is conducted from both the static and dynamic dimensions. During the method validation phase, the research team judiciously selected 12 representative samples from the popular dance game Just Dance, meticulously classifying them according to the complexity of dance moves and physical exertion levels. The experimental results demonstrate the outstanding performance of the constructed automated evaluation method. Specifically, this method not only achieves precise assessment of dance movements at the individual keyframe level but also significantly enhances the evaluation of action coherence and completeness through the innovative SMACR. Across all 12 test samples, the method accurately selects 2 to 5 keyframes per second from the videos, reducing the computational load to 4.1–10.3% of that of traditional full-frame matching methods, while overall evaluation accuracy decreases by only 3%, fully demonstrating the method’s combination of efficiency and precision. Through precise musical beat alignment, efficient keyframe extraction, and the introduction of intelligent dance motion analysis technology, this study significantly improves upon the subjectivity and inefficiency of traditional manual evaluations, enhancing the scientific rigor and accuracy of assessments. It provides robust tool support for fields such as dance education and competition evaluations, showcasing broad application prospects.
(This article belongs to the Collection Sensors and AI for Movement Analysis)
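The abstract says SMACR draws on ROUGE-L, which scores two sequences by their longest common subsequence (LCS). The sketch below shows that underlying idea only and is not SMACR itself: it uses the balanced F1 form rather than ROUGE-L's weighted F-measure, and it replaces the paper's numeric pose descriptions with hypothetical move labels.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def lcs_f1(reference, performed):
    """Balanced F-score of LCS-based precision and recall (0.0 if nothing matches)."""
    lcs = lcs_length(reference, performed)
    if lcs == 0:
        return 0.0
    recall = lcs / len(reference)
    precision = lcs / len(performed)
    return 2 * precision * recall / (precision + recall)


# Hypothetical keyframe pose labels for a reference routine and a dancer's attempt.
reference = ["step", "turn", "jump", "clap", "spin"]
performed = ["step", "jump", "turn", "clap"]

print(lcs_length(reference, performed), round(lcs_f1(reference, performed), 3))
```

Because a subsequence must preserve order, swapping "turn" and "jump" costs one match even though both moves were performed. This is why an LCS-based score is sensitive to the coherence of an action sequence and not only to its content.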

12 pages, 1932 KiB  
Article
Ensemble Learning with Highly Variable Class-Based Performance
by Brandon Warner, Edward Ratner, Kallin Carlous-Khan, Christopher Douglas and Amaury Lendasse
Mach. Learn. Knowl. Extr. 2024, 6(4), 2149-2160; https://doi.org/10.3390/make6040106 - 24 Sep 2024
Viewed by 849
Abstract
This paper proposes a novel model-agnostic method for weighting the outputs of base classifiers in machine learning (ML) ensembles. Our approach uses class-based weight coefficients assigned to every output class in each learner in the ensemble. This is particularly useful when the base classifiers have highly variable performance across classes. Our method generates a dense set of coefficients for the models in our ensemble by considering the model performance on each class. We compare our novel method to commonly used ensemble approaches such as voting and weighted averages. In addition, we compare our approach to class-specific soft voting (CSSV), which was also designed to address variable performance but generates a sparse set of weights by solving a linear system. We choose to illustrate the power of this approach by applying it to an ensemble of extreme learning machines (ELMs), which are well suited for this approach due to their stochastic, highly varying performance across classes. We illustrate the superiority of our approach by comparing its performance to that of simple majority voting, weighted majority voting, and class-specific soft voting using ten popular open-source multiclass classification datasets.
(This article belongs to the Section Learning)
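To make the notion of class-based weight coefficients concrete, the sketch below scales each model's output for class c by a weight specific to that model and class before summing. The abstract does not give the authors' formula for the coefficients, so the weights here are hypothetical stand-ins (for example, per-class validation recall), as are the probabilities and function names.

```python
def plain_soft_vote(prob_rows):
    """Unweighted soft voting: argmax of the averaged class probabilities."""
    n_classes = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / len(prob_rows) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def class_weighted_vote(prob_rows, class_weights):
    """Score class c as the sum over models m of class_weights[m][c] * prob_rows[m][c]."""
    n_classes = len(prob_rows[0])
    scores = [
        sum(w[c] * row[c] for w, row in zip(class_weights, prob_rows))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical per-class reliability: model 0 is weak on class 2, model 1 on class 0.
class_weights = [[0.9, 0.6, 0.2], [0.3, 0.6, 0.9]]
# Hypothetical class probabilities from the two models for one sample.
probs = [[0.30, 0.20, 0.50], [0.45, 0.25, 0.30]]

plain = plain_soft_vote(probs)
class_weighted = class_weighted_vote(probs, class_weights)
print(plain, class_weighted)
```

Plain averaging follows model 0's vote for class 2, while the class-weighted score discounts that vote because model 0 is unreliable on class 2, and class 0 wins instead. A single weight per model cannot express this, since it would scale all of a model's classes equally.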

19 pages, 7056 KiB  
Article
A Data-Centric Approach to Understanding the 2020 U.S. Presidential Election
by Satish Mahadevan Srinivasan and Yok-Fong Paat
Big Data Cogn. Comput. 2024, 8(9), 111; https://doi.org/10.3390/bdcc8090111 - 4 Sep 2024
Viewed by 640
Abstract
The application of analytics on Twitter feeds is a very popular field for research. A tweet with a 280-character limitation can reveal a wealth of information on how individuals express their sentiments and emotions within their network or community. Upon collecting, cleaning, and mining tweets from different individuals on a particular topic, we can capture not only the sentiments and emotions of an individual but also the sentiments and emotions expressed by a larger group. Using the well-known Lexicon-based NRC classifier, we classified nearly seven million tweets across seven battleground states in the U.S. to understand the emotions and sentiments expressed by U.S. citizens toward the 2020 presidential candidates. We used the emotions and sentiments expressed within these tweets as proxies for their votes and predicted the swing directions of each battleground state. When compared to the outcome of the 2020 presidential election, we were able to accurately predict the swing directions of four battleground states (Arizona, Michigan, Texas, and North Carolina), thus revealing the potential of this approach in predicting future election outcomes. The week-by-week analysis of the tweets using the NRC classifier corroborated well with the various political events that took place before the election, making it possible to understand the dynamics of the emotions and sentiments of the supporters in each camp. These research strategies and evidence-based insights may be translated into real-world settings and practical interventions to improve election outcomes.
(This article belongs to the Special Issue Machine Learning in Data Mining for Knowledge Discovery)
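As a rough illustration of the lexicon-based idea described in this abstract, the sketch below tallies positive and negative word hits per tweet and picks the candidate with the higher net sentiment. It is a minimal sketch only: `TOY_LEXICON`, `score_tweet`, `net_sentiment`, and `predicted_lean` are hypothetical names, the six lexicon entries are invented for illustration and are not taken from the NRC lexicon, and the article's actual pipeline is not reproduced here.

```python
from collections import Counter

# Hypothetical toy lexicon: invented entries, NOT the NRC lexicon.
TOY_LEXICON = {
    "great": "positive",
    "hope": "positive",
    "win": "positive",
    "bad": "negative",
    "fear": "negative",
    "lose": "negative",
}


def score_tweet(text):
    """Count positive and negative lexicon hits in one tweet."""
    counts = Counter()
    for token in text.lower().split():
        word = token.strip(".,!?#@\"'")  # drop surrounding punctuation
        label = TOY_LEXICON.get(word)
        if label:
            counts[label] += 1
    return counts


def net_sentiment(tweets):
    """Positive hits minus negative hits, summed over a list of tweets."""
    total = Counter()
    for tweet in tweets:
        total.update(score_tweet(tweet))
    return total["positive"] - total["negative"]


def predicted_lean(tweets_by_candidate):
    """Return the candidate whose tweets carry the highest net sentiment."""
    return max(tweets_by_candidate,
               key=lambda name: net_sentiment(tweets_by_candidate[name]))
```

For example, `predicted_lean({"A": ["great win"], "B": ["bad fear"]})` selects `"A"`, since candidate A's tweets have the higher net score.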

21 pages, 2757 KiB  
Article
Classifying Unconscious, Psychedelic, and Neuropsychiatric Brain States with Functional Connectivity, Graph Theory, and Cortical Gradient Analysis
by Hyunwoo Jang, Rui Dai, George A. Mashour, Anthony G. Hudetz and Zirui Huang
Brain Sci. 2024, 14(9), 880; https://doi.org/10.3390/brainsci14090880 - 30 Aug 2024
Viewed by 1375
Abstract
Accurate and generalizable classification of brain states is essential for understanding their neural underpinnings and improving clinical diagnostics. Traditionally, functional connectivity patterns and graph-theoretic metrics have been used for this purpose. However, cortical gradient features, which reflect global brain organization, offer a complementary approach. We hypothesized that a machine learning model integrating these three feature sets would effectively discriminate between baseline and atypical brain states across a wide spectrum of conditions, even though the underlying neural mechanisms vary. To test this, we extracted features from brain states associated with three meta-conditions: unconsciousness (NREM2 sleep, propofol deep sedation, and propofol general anesthesia), psychedelic states induced by hallucinogens (subanesthetic ketamine, lysergic acid diethylamide, and nitrous oxide), and neuropsychiatric disorders (attention-deficit hyperactivity disorder, bipolar disorder, and schizophrenia). We used support vector machines with nested cross-validation to construct our models. The soft voting ensemble model achieved an average balanced accuracy (the mean of specificity and sensitivity) of 79% (62–98% across all conditions), outperforming the individual base models (70–76%). Notably, our models exhibited varying degrees of transferability across different datasets, with performance depending on the specific brain states and feature sets used. Feature importance analysis across meta-conditions suggests that the underlying neural mechanisms vary significantly, necessitating tailored approaches for accurate classification of specific brain states. This finding underscores the value of our feature-integrated ensemble models, which leverage the strengths of multiple feature types to achieve robust performance across a broader range of brain states. While our approach offers valuable insights into the neural signatures of different brain states, future work is needed to develop and validate even more generalizable models that can accurately classify brain states across a wider array of conditions.
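The two quantities named in this abstract, balanced accuracy and soft voting, can be sketched in a few lines. This is a minimal sketch for binary labels and assumes each base model outputs a positive-class probability per sample. The function names and the 0.5 threshold are illustrative assumptions, not details taken from the article.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive).

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def soft_vote(prob_lists, threshold=0.5):
    """Average positive-class probabilities across models, then threshold.

    prob_lists holds one list of per-sample probabilities for each model.
    """
    n_models = len(prob_lists)
    averaged = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]
```

Soft voting averages probabilities before thresholding, so a confident model can outweigh two uncertain ones, which a hard majority vote over labels would not allow.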

15 pages, 12817 KiB  
Article
Aeolian Desertification Dynamics from 1995 to 2020 in Northern China: Classification Using a Random Forest Machine Learning Algorithm Based on Google Earth Engine
by Caixia Zhang, Ningjing Tan and Jinchang Li
Remote Sens. 2024, 16(16), 3100; https://doi.org/10.3390/rs16163100 - 22 Aug 2024
Viewed by 798
Abstract
Machine learning methods have improved in recent years and provide increasingly powerful tools for understanding landscape evolution. In this study, we used the random forest method based on Google Earth Engine to evaluate the desertification dynamics in northern China from 1995 to 2020. We selected Landsat series image bands, remote sensing inversion data, climate baseline data, land use data, and soil type data as variables for majority voting in the random forest method. The method's average classification accuracy was 91.6 ± 5.8% (mean ± SD), and the average kappa coefficient was 0.68 ± 0.09, suggesting good classification results. The random forest classifier results were consistent with the results of visual interpretation for the spatial distribution of different levels of desertification. From 1995 to 2000, the area of aeolian desertification increased at an average rate of 9977 km² yr⁻¹; it then decreased at average rates of 2535, 3462, 1487, and 4537 km² yr⁻¹ during 2000–2005, 2005–2010, 2010–2015, and 2015–2020, respectively.
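The abstract reports both overall accuracy and a kappa coefficient. For reference, Cohen's kappa corrects the observed agreement for the agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from a confusion matrix. The function name and the example matrix are hypothetical and unrelated to the study's data.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)
```

With the illustrative matrix `[[45, 5], [10, 40]]`, the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is about 0.70. Kappa is lower than raw accuracy whenever some agreement is attributable to chance.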

16 pages, 3087 KiB  
Article
Predicting the Performance of Ensemble Classification Using Conditional Joint Probability
by Iqbal Murtza, Jin-Young Kim and Muhammad Adnan
Mathematics 2024, 12(16), 2586; https://doi.org/10.3390/math12162586 - 21 Aug 2024
Cited by 1 | Viewed by 609
Abstract
In many machine learning applications, single classifiers do not achieve satisfactory performance. In such cases, an ensemble classifier is constructed from several weak base learners. Unfortunately, constructing an ensemble is largely empirical: one builds an ensemble and, if its performance is not satisfactory, discards it. This paper considers the challenging analytical problem of estimating the performance of an ensemble classifier from the prediction performance of its base learners. The proposed formulation aims to estimate ensemble performance without actually building the ensemble; it is derived from probability theory by manipulating the decision probabilities of the base learners. For this purpose, the output of a base learner (true positive, true negative, false positive, or false negative) is treated as a random variable. The effects of logical disjunction-based and majority voting-based decision combination strategies are then analyzed from the perspective of conditional joint probability. To evaluate the performance forecast by the proposed methodology, publicly available standard datasets were employed. The results show that the derived formulations effectively estimate the performance of ensemble classification. In addition, the theoretical and experimental results show that the logical disjunction-based decision outperforms majority voting on imbalanced datasets and in cost-sensitive scenarios.
(This article belongs to the Section Mathematics and Computer Science)
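To make the two combination strategies concrete, the sketch below estimates ensemble sensitivity and specificity from per-learner rates under a simplifying assumption that the base learners err independently given the true class. That independence assumption is introduced here for illustration only; the paper's formulation uses conditional joint probabilities and is not reproduced. The function names are hypothetical.

```python
from itertools import product
from math import prod


def or_rule(rates):
    """Logical disjunction: predict positive if ANY learner predicts positive.

    rates is a list of (sensitivity, specificity) pairs, one per learner.
    """
    sensitivity = 1 - prod(1 - sens for sens, _ in rates)  # miss only if all miss
    specificity = prod(spec for _, spec in rates)          # all must say negative
    return sensitivity, specificity


def _p_majority(ps):
    """Probability that more than half of independent events with probs ps occur."""
    n = len(ps)
    total = 0.0
    for outcome in product([0, 1], repeat=n):
        if sum(outcome) > n / 2:
            total += prod(p if hit else 1 - p for hit, p in zip(outcome, ps))
    return total


def majority_rule(rates):
    """Majority voting over an odd number of learners."""
    if len(rates) % 2 == 0:
        raise ValueError("use an odd number of learners to avoid ties")
    sensitivity = _p_majority([sens for sens, _ in rates])
    specificity = _p_majority([spec for _, spec in rates])
    return sensitivity, specificity
```

For three learners with sensitivity and specificity of 0.8 each, the disjunction rule yields roughly (0.992, 0.512) and majority voting roughly (0.896, 0.896). Under these assumptions the disjunction rule trades specificity for sensitivity, which is the kind of trade-off that matters when missed positives are costly.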