1. Introduction
Emotion is a reflection of people’s psychological and physical states. It plays a crucial role in decision-making, perception, and human-computer interaction (HCI) systems [1,2]. Many studies on emotion recognition have been conducted in the last few decades [3,4].
The methods of emotion recognition are usually divided into two categories: one is based on physiological signals, and the other is based on non-physiological signals. Non-physiological signals include facial expressions, speech signals, body movements, and so on [5,6]. Studies based on non-physiological signals have produced significant results. For example, virtual markers based on an optical flow algorithm were used to classify six facial emotions (happiness, sadness, anger, fear, disgust, and surprise) [7], achieving a maximum accuracy of 99.81% with a CNN classifier. Niu et al. [8] proposed fused features combining oriented FAST and rotated BRIEF (ORB) features and local binary pattern (LBP) features to classify seven facial emotions, with an accuracy of 79.8%. However, emotion recognition through facial expressions or behavior analysis is usually built on posed emotions, such as photos of actors rather than faces expressing real emotional states, and datasets of real facial emotions are scarce. Moreover, the expression and regulation of emotional cues differ across countries [9], which may affect the accuracy of emotion classification. Therefore, research on recognizing emotions through physiological signals is being actively conducted.
Physiological signals are another approach to emotion recognition. They include heart rate, functional magnetic resonance imaging, electromyography (EMG), electroencephalogram (EEG), and so on. Among them, emotion recognition based on EEG signals is particularly attractive because it detects emotion directly from brain activity [10]. For example, Thammasan et al. [11] proposed a continuous music-emotion classification algorithm based on different features. Arnau et al. [12] used principal component analysis to select features, achieving accuracies of 67.7% and 69.6% for valence and arousal. An emotion recognition model based on a mixture classification technique for physically challenged or immobilized people was proposed in [13]; this model uses an asymmetric distribution, which helps extract EEG signals with a symmetric or asymmetric form.
In 2008, Lacasa et al. first proposed the visibility graph (VG) to map time series data into a complex network [14]. As research developed, many improved VG algorithms were proposed. One of the modified visibility graphs, the horizontal visibility graph (HVG), was proposed by Luque et al. [15]; the HVG can represent the chaotic characteristics of EEG signals. Wang et al. used a limited penetrable visibility graph (LPVG) to analyze Alzheimer’s disease [16]. Zhu et al. proposed the weighted horizontal visibility graph (WHVG), which introduces edge weights [17]. Recently, the visibility graph has been employed to analyze EEG signals. Bhaduri calculated the scale-freeness of the visibility graph of EEG data patterns varying from normal (eyes closed) to epileptic [18]; this work provided the first quantitative analysis technique for the degree of fractality. Zhu et al. used difference visibility graphs to analyze and classify the EEG signals of sleep stages [19], with an accuracy of 87.5% for the six-state classification. Cai et al. [20] developed a novel multiplex LPHVG method to explore brain fatigue behavior, yielding novel insights into the brain behavior associated with fatigued driving. By employing the visibility graph algorithm, Bajestani et al. examined the EEG signals of patients with autism spectrum disorder (ASD) [21]; the ASD class was discerned with an accuracy of 81.67%.
The effectiveness of complex network features in the classification of EEG signals has been demonstrated. However, few studies currently use complex network features for EEG-based emotion recognition.
In this paper, a novel approach based on complex network features is presented for emotion recognition. Weighted complex networks based on a new angle measurement are constructed. The innovation of this method is that a new weighting scheme is used to construct the directed visibility graph, and on this basis a fusion feature is used to improve the effectiveness of the features. EEG signals are mapped into two complex weighted networks built in opposite directions: the forward weighted horizontal visibility graph (FWHVG) and the backward weighted horizontal visibility graph (BWHVG). Two feature matrices are extracted from the two weighted complex networks, and their fusion is used to classify the EEG signals. The fusion matrix is fed into three classifiers for training and testing.
2. Related Works
2.1. Emotion Classification of DEAP Dataset
Emotion datasets with different modalities have been established by researchers, such as the DREAMER, AMIGOS, MAHNOB-HCI, and DEAP datasets. The DEAP emotion database is used in this paper, and many emotion recognition methods have been proposed based on it. Lee et al. [22] proposed an emotion recognition model using the photoplethysmogram (PPG) signals of DEAP for short recognition intervals. Electrodermal activity (EDA) signals of the DEAP dataset were used to design a sensor-based model for emotion recognition [23], achieving an accuracy of 85% for four-class emotional states. Kim et al. [24] proposed a long short-term memory network based on EEG signals to consider changes in emotion over time; they performed two-level and three-level classification experiments based on valence and arousal, obtaining classification rates of 90.1% and 88.3% for valence and arousal in the two-level task and 86.9% and 84.1% in the three-level task.
2.2. Emotion Recognition Based on Feature Extraction
Researchers have conducted studies to extract different features for the EEG-based emotion recognition task, and machine learning and deep learning techniques are applied to classify emotional states. Numerous attributes, including power spectral density (PSD) features, fractal dimension (FD) features, entropy features, wavelet features, the differential asymmetry feature (DASM), the rational asymmetry feature (RASM), and the differential causality feature (DCAU), have been widely employed to characterize EEG [25,26]. Yin et al. proposed a deep learning model that fuses a graph convolutional neural network (GCNN) and long short-term memory (LSTM) networks; differential entropy was extracted to construct a feature cube as the input of the model, and the average classification accuracies were 90.45% and 90.60% for valence and arousal on the DEAP dataset [27]. A dynamical graph convolutional neural network (DGCNN) using a graph to model multichannel EEG features was proposed in [28]; five kinds of features, including differential entropy, DASM, RASM, PSD, and DCAU, were investigated to evaluate the method, and the accuracy was 86.23% for valence and 84.54% for arousal on the DREAMER database. Goshvarpour et al. [29] extracted the approximate and detailed coefficients of the wavelet transform and calculated the second-order difference plot of the coefficients, obtaining an average classification rate of 80.24% on four different emotion classes.
2.3. Emotion Models
A number of researchers have proposed different ways to express emotions, including discrete emotion models, dimensional emotion models, and other emotion models. In the discrete models, researchers consider a theory of basic emotions, such as the Ekman emotion model [30] and the Panksepp emotion model [31]. There is a dispute about the number of basic emotions: Tuomas et al. believe that fear, anger, disgust, and happiness are the four basic emotions [32], while Cowen et al. maintain that there are 27 basic emotions [33]. In the dimensional models, emotions are described by multiple dimensions, such as the circumplex model [34], a two-dimensional model of arousal and valence; when dominance is added, it can be extended to a 3D emotion model [35]. Many researchers have proposed further emotion models according to their analytical perspectives, such as the Ortony-Clore-Collins (OCC) model and the hidden Markov model (HMM) [36,37].
3. Materials and Methods
3.1. Dataset
The DEAP dataset [38], a multimodal dataset created by Koelstra et al., is used in this paper. The dataset is publicly available, and many researchers have performed their analyses on it. The DEAP dataset consists of two parts, namely the online ratings and the participant ratings, and contains 1280 multivariate biosignal recordings, such as electroencephalogram, photoplethysmogram, electromyogram, and electrodermal activity.
Table 1 describes the participant rating part. The participant ratings were acquired from 32 participants with an average age of 26.9 years; each subject watched 40 one-minute long music videos. After watching each video, participants assessed it on scales ranging from 1 (low) to 9 (high). The emotional response includes five dimensions: valence, arousal, dominance, liking, and familiarity. Valence is an indicator of pleasantness. Arousal is a measure of the intensity of the emotion, varying from unexcited to excited. Dominance represents the feeling of being in control of the emotion. Liking asks for the participants’ liking of the video and reflects their tastes rather than their feelings. Familiarity is the participants’ familiarity with each of the videos, rated on a scale of 1 to 5. For valence, arousal, dominance, and liking, the threshold is set to different values in different studies. Here, the midpoint of the 9-point rating scale is used to generate two classes, as is common for the DEAP dataset: the label is low when the rating is less than 5, and high when the rating is greater than or equal to 5.
Forty physiological channels were recorded for each participant, including 32 EEG channels and eight peripheral channels. The data include 60-s trial data and 3-s baseline data; the 60-s trial data were used in this paper. DEAP provides a preprocessed dataset in which the data were down-sampled to 128 Hz and a 4.0–45.0 Hz bandpass frequency filter was applied. Since emotions are generally described by arousal and valence, we only consider these two factors.
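To make the data handling concrete, the following minimal sketch loads one participant's preprocessed DEAP recording and binarizes the valence and arousal ratings at the midpoint of the scale. The file name, dictionary keys, and array layout follow the publicly distributed "preprocessed_python" release of DEAP (40 trials × 40 channels × 8064 samples at 128 Hz; label columns ordered valence, arousal, dominance, liking); these details are assumptions about that release rather than part of this paper and should be adapted to a local copy.

```python
# Hypothetical loading sketch for the DEAP "preprocessed_python" files.
import pickle
import numpy as np

def load_participant(path="data_preprocessed_python/s01.dat"):
    with open(path, "rb") as f:
        record = pickle.load(f, encoding="latin1")   # dict with "data" and "labels"
    data = record["data"][:, :32, 128 * 3:]          # keep 32 EEG channels, drop the 3-s baseline
    ratings = record["labels"][:, :2]                # columns 0 and 1: valence, arousal
    labels = (ratings >= 5).astype(int)              # 1 = high, 0 = low (threshold at 5)
    return data, labels

data, labels = load_participant()
print(data.shape, labels.shape)                      # expected: (40, 32, 7680) (40, 2)
```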
3.2. Emotion Recognition Framework
The block diagram of the proposed method for EEG emotion recognition in this paper is shown in
Figure 1. Thirty-two EEG channels are selected to classify emotional states. The procedure is divided into four steps, namely, preprocessing, feature extraction, feature fusion, and classification. The preprocessing includes data partitioning and channel selection. From the preprocessed EEG data, time-domain features and network statistical properties are extracted. The two types of features are then combined and normalized. Finally, three classifiers are trained on these features to obtain the emotion recognition results.
3.3. Visibility Graph Networks
3.3.1. Horizontal Visibility Graph
The VG algorithm can map a time series to a complex network. For an EEG signal x = {x1, x2, ..., xN} with N data samples, each sample can be considered as a node of the graph represented in a histogram; the height of each bar represents the value of the corresponding data node. There is a connection between two nodes if the tops of the two bars are mutually visible. For any two nodes (ti, xi) and (tj, xj), the edge between ti and tj is connected if every data node (tk, xk) between them fulfils the following criterion of convexity [14]:
xk < xj + (xi − xj)(tj − tk)/(tj − ti)  (1)
HVG is a modification of the VG algorithm. In HVG, two data nodes (ti, xi) and (tj, xj) have horizontal visibility if they fulfil Equation (2) [15]:
xi, xj > xk for all k with ti < tk < tj  (2)
where (tk, xk) is any data node between (ti, xi) and (tj, xj).
The complex network can be expressed by an adjacency matrix A: if ti and tj are connected, aij = 1; otherwise, aij = 0, as shown in Figure 2.
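As an illustration of the construction described above, the short sketch below builds the HVG adjacency matrix for the example series used in Section 3.3.2. It is a straightforward brute-force implementation of the horizontal visibility criterion (Equation (2)), not the authors' code.

```python
import numpy as np

def hvg_adjacency(x):
    """Adjacency matrix of the horizontal visibility graph of a 1-D series:
    nodes i and j are connected when every intermediate sample lies strictly
    below both x[i] and x[j]."""
    n = len(x)
    a = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if all(x[k] < min(x[i], x[j]) for k in range(i + 1, j)):
                a[i, j] = a[j, i] = 1
    return a

x = [7.0, 4.0, 8.0, 6.5, 7.6, 9.0]   # example series from Section 3.3.2
print(hvg_adjacency(x))
```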
3.3.2. Directed Weighted Horizontal Visibility Graph
An HVG with edge weights is known as a weighted horizontal visibility graph (WHVG), in which the links between nodes are no longer binary values (0 and 1). Two edge weights are commonly used at present, namely the distance [39] and a radian function [40]. We propose a novel directed weighted horizontal visibility graph (DWHVG) whose edge weight is based on a visibility angle measurement. The weighted complex network can be expressed by a weight matrix W, in which the edge weight wij is the angle between nodes i and j. It can be described as follows: if nodes i and j are visible, the segment connecting the vertex of bar i and the vertex of bar j is called ab, and the segment connecting the vertex of bar i and the bottom of bar j is called ac. The edge weight wij is the angle between ab and ac, as shown in Figure 3a; for FWHVG the angle is measured at the earlier node, and for BWHVG it is measured at the later node. Equation (3) gives the edge weight of FWHVG and Equation (4) the edge weight of BWHVG:
wij = arctan[(xj − xi)/(tj − ti)] + arctan[xi/(tj − ti)]  (3)
wij = arctan[(xi − xj)/(tj − ti)] + arctan[xj/(tj − ti)]  (4)
The HVG algorithm is undirected, but in our method the edge weight is related to the direction. When a time series is mapped forward to a weighted horizontal visibility graph, the result is called the forward weighted horizontal visibility graph (FWHVG), as shown in Figure 3; when it is mapped backward, the result is called the backward weighted horizontal visibility graph (BWHVG), as shown in Figure 4. For the random time series x = {7.0, 4.0, 8.0, 6.5, 7.6, 9.0}, the HVG is shown in Figure 2, and graphical illustrations of the FWHVG and BWHVG are given in Figure 3 and Figure 4. Figure 3a and Figure 4a show angle measurements of FWHVG and BWHVG between some of the nodes, while Figure 3b and Figure 4b show the networks mapped by FWHVG and BWHVG. The edge weights differ between the two directed weighted horizontal visibility graphs.
The following example illustrates how the edge weight is calculated. As shown in Figure 2, x1 = 7.0 and x3 = 8.0 are mutually visible. The angles between x1 and x3 for FWHVG and BWHVG are shown in Figure 3a and Figure 4a. The edge weight of FWHVG between the two nodes is w13 = arctan[(8.0 − 7.0)/2] + arctan(7.0/2) ≈ 1.756; thus, the edge weight between node 1 and node 3 is 1.756 in FWHVG, and the weighted matrix of FWHVG is obtained by calculating this weight for every pair of visible nodes. The edge weight of BWHVG between the two nodes is w13 = arctan[(7.0 − 8.0)/2] + arctan(8.0/2) ≈ 0.862; the edge weight between node 1 and node 3 is therefore 0.862 in BWHVG, and the weighted matrix of BWHVG is calculated in the same way.
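The worked example can be reproduced with the sketch below. The closed-form angle weights are reconstructed here from the geometric description and the reported values (1.756 and 0.862), with the sampling interval taken as one, so they should be read as our interpretation of Equations (3) and (4) rather than a verbatim implementation.

```python
import numpy as np

def directed_weighted_hvg(x):
    """Forward and backward angle-weighted HVG matrices; the weight of a visible
    pair is the angle, at the earlier node (FWHVG) or the later node (BWHVG),
    between the line to the top and the line to the base of the other bar."""
    n = len(x)
    fw, bw = np.zeros((n, n)), np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if all(x[k] < min(x[i], x[j]) for k in range(i + 1, j)):   # horizontal visibility
                dt = float(j - i)
                fw[i, j] = np.arctan((x[j] - x[i]) / dt) + np.arctan(x[i] / dt)
                bw[i, j] = np.arctan((x[i] - x[j]) / dt) + np.arctan(x[j] / dt)
    return fw, bw

x = [7.0, 4.0, 8.0, 6.5, 7.6, 9.0]
fw, bw = directed_weighted_hvg(x)
print(round(fw[0, 2], 3), round(bw[0, 2], 3))   # 1.756 0.862, matching the worked example
```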
3.4. Feature Extraction
The main objective of feature extraction is to obtain reliable data for emotion recognition. For this reason, time-domain features and complex network features are extracted from EEG data.
3.4.1. Time-Domain Features
Nawaz et al. [41] compared different features for emotion recognition to identify those that can effectively discriminate emotions. Their study showed that time-domain features are more suitable for emotion recognition than power, entropy, fractal dimension, and wavelet energy features. However, time-domain features have received little attention so far. In this paper, we analyze the validity of time-domain features for emotion recognition in depth.
In the current study, six time-domain features are adapted from [41]; a short code sketch of these statistics is given after the list below. Suppose x(t), t = 1, 2, ..., N, represents an EEG signal with N data samples.
- (1)
Mean: Mean represents the average of the time series:
- (2)
Standard deviation: It represents the deviation of the data from the mean and is calculated as the square root of the average squared difference between each EEG sample and the mean:
- (3)
First difference: It represents the relationship between the current sample and the previous one and reflects changes in the waveform. The first difference is calculated from the absolute differences between successive samples:
- (4)
Second difference: It captures the relationship among three adjacent data points and is a measure sensitive to variations in signal amplitude. The calculation of the second difference is similar to that of the first difference.
In the following, X(t) represents the normalized series X(t) = (x(t) − μ)/σ, where the mean μ and the standard deviation σ are given in Equations (7) and (8).
- (5)
First difference of normalized EEG: It is the relationship between the current sample and the previous sample of the normalized EEG signal:
- (6)
Second difference of normalized EEG: It represents the relationship among three adjacent data points of the normalized EEG signal:
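A compact sketch of these six statistics is given below. Since the paper's exact equations are not reproduced above, averaged (rather than summed) absolute differences and the usual z-score normalization are assumed; both choices are common for these descriptors and can be adjusted.

```python
import numpy as np

def time_domain_features(x):
    """Six time-domain features of one EEG segment (Section 3.4.1)."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()                            # (1) mean
    sigma = x.std()                          # (2) standard deviation
    d1 = np.mean(np.abs(np.diff(x)))         # (3) first difference
    d2 = np.mean(np.abs(x[2:] - x[:-2]))     # (4) second difference
    xn = (x - mu) / sigma                    # normalized series X(t)
    d1n = np.mean(np.abs(np.diff(xn)))       # (5) first difference of normalized EEG
    d2n = np.mean(np.abs(xn[2:] - xn[:-2]))  # (6) second difference of normalized EEG
    return np.array([mu, sigma, d1, d2, d1n, d2n])
```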
3.4.2. Network Statistical Properties
The original series is mapped into weighted networks. Then the network metrics can be extracted.
- (7)
Average weighted degree
In unweighted networks, the number of edges connecting one node with other nodes is called its degree; in general, the larger the degree of a node, the greater its importance in the network. In a weighted network, the weighted degree di is extended to the strength of node ti [21]. The average weighted degree can be represented as Equation (15), where wij is the edge weight between nodes ti and tj (a code sketch of these four metrics follows this list).
- (8)
Deviation of weighted degree
The deviation of the weighted degree can be calculated as follows [42]:
- (9)
Weighted clustering coefficient
The clustering coefficient and the clustering coefficient entropy [43] describe the relationship between a node and its neighbors. The weighted clustering coefficient of the network is calculated as the average of the weighted clustering coefficients of all nodes in the network, as shown in Equation (17), where Ci is the weighted clustering coefficient of node ti, wik is the weight between nodes ti and tk, wjk is the weight between nodes tj and tk, and wij is the weight between nodes ti and tj.
- (10)
Weighted clustering coefficient entropy
The weighted clustering coefficient entropy EC can be calculated as follows, where PC,i is the probability associated with the weighted clustering coefficient of node ti.
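The sketch below computes the four network properties from the weight matrix of one segment. The exact form of the weighted clustering coefficient in Equation (17) is not reproduced above, so a common geometric-mean variant is assumed here; treat the details as illustrative rather than as the paper's definition.

```python
import numpy as np

def network_metrics(w):
    """Average weighted degree, its deviation, weighted clustering coefficient,
    and clustering coefficient entropy of a (symmetrized) weight matrix w."""
    w = np.asarray(w, dtype=float)
    n = w.shape[0]
    strength = w.sum(axis=1)                 # weighted degree (node strength)
    avg_degree = strength.mean()             # (7) average weighted degree
    dev_degree = strength.std()              # (8) deviation of weighted degree

    c = np.zeros(n)
    for i in range(n):
        nbrs = np.flatnonzero(w[i])
        pairs = [(j, k) for a, j in enumerate(nbrs) for k in nbrs[a + 1:]]
        if pairs:
            c[i] = np.mean([(w[i, j] * w[i, k] * w[j, k]) ** (1 / 3) for j, k in pairs])
    avg_cc = c.mean()                        # (9) weighted clustering coefficient

    total = c.sum()                          # (10) weighted clustering coefficient entropy
    p = c[c > 0] / total if total > 0 else np.array([])
    cc_entropy = float(-np.sum(p * np.log(p))) if p.size else 0.0
    return avg_degree, dev_degree, avg_cc, cc_entropy
```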
3.5. Feature Fusion
After extracting the features of complex networks, two kinds of visibility graph features are fused. The procedure can be described as follows:
- (1)
Set a sliding time-window to divide the EEG signals into M segments.
- (2)
Map the EEG segments to FWHVGs and extract the complex network features; for each feature, this yields one feature vector of length M.
- (3)
Then map the EEG segments to BWHVGs and extract the complex network features again, yielding a second feature vector of length M.
- (4)
Finally, the fusion feature vector is calculated as Equation (21), where g1, g2, ..., gM are the elements of the fusion feature vector, obtained by fusing the corresponding elements of the FWHVG and BWHVG feature vectors.
The values of different features may vary greatly, so the features are normalized to reduce these differences. Mapping each feature vector to the range between 0 and 1 avoids classification errors caused by large differences in feature scale. The normalized result is expressed by Equation (22), g′i = (gi − gmin)/(gmax − gmin), where gi is an element of the fusion feature vector G, and gmax and gmin represent the maximum and minimum values of G. The normalized feature vectors are then assembled into a normalized feature matrix for the four complex network features, another for the six time-domain features, and a combined matrix containing both.
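A minimal sketch of the fusion and normalization steps follows. Because Equation (21) is not reproduced in this text, the element-wise mean of the FWHVG and BWHVG feature vectors is used here as an assumed fusion rule; Equation (22) is implemented as the standard min-max mapping to [0, 1].

```python
import numpy as np

def fuse_and_normalize(g_forward, g_backward):
    """Fuse per-segment FWHVG and BWHVG feature values and min-max normalize."""
    g_forward = np.asarray(g_forward, dtype=float)    # feature vector from the M FWHVG segments
    g_backward = np.asarray(g_backward, dtype=float)  # feature vector from the M BWHVG segments
    g = (g_forward + g_backward) / 2.0                # assumed element-wise fusion (Equation (21))
    g_min, g_max = g.min(), g.max()
    return (g - g_min) / (g_max - g_min)              # min-max normalization (Equation (22))
```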
3.6. Classification
Support vector machine (SVM), optimized fitted k-nearest neighbors (OF-KNN), and decision tree (DT) classifiers are used for classification in this part. These three classifiers were chosen based on their promising empirical results in previous work [41,44,45,46]. In addition, Section 4.4.1 compares the effectiveness of different scenarios based on [41], so we used the same classifiers as that reference. Complementary information from different classifiers may also lead to higher accuracy.
3.6.1. Support Vector Machines (SVM)
We use the library for support vector machines (LIBSVM) in our work, a widely used implementation that further improves on the basic SVM [47]. LIBSVM solves the two-class problem by constructing an optimal separating hyperplane; this hyperplane is linear, and the distance between the two groups is maximized. There are two important parameters, the kernel function parameter γ and the penalty factor C: the kernel function transfers the training samples into a higher-dimensional feature space, and the penalty factor represents the degree of penalty for misclassified samples. C is 2 and γ is 1 in this paper. SVM is a small-sample learning method with a simple algorithm and good robustness; however, it is difficult to apply to large-scale training samples.
3.6.2. Optimized Fitted K-Nearest Neighbors (OF-KNN)
KNN is a popular machine learning algorithm that is very reliable for EEG data classification. KNN looks for the k samples (called k-neighbors) nearest to an incoming sample and then predicts its class from the most common class among those neighbors [48]. The KNN classifier’s performance depends mostly on the choice of the distance metric and the number of nearest neighbors k. In this paper, we used a variant of KNN called optimized fitted KNN, which finds hyperparameters that minimize the five-fold cross-validation loss by using automatic hyperparameter optimization. To pick the best estimate, the Bayesian optimization acquisition function ‘expected-improvement-plus’ is used, and the best estimated feasible point is calculated using the ‘best-point’ function. This algorithm has high accuracy and is insensitive to outliers; however, when the classes are unbalanced, there can be a large prediction bias.
3.6.3. Decision Tree (DT)
DT can turn complicated decision-making problems into simple processes with minimal computation time [49]. Its advantages include being relatively easy to interpret and having good classification performance on many datasets. It performs learning by splitting the input data into finer subgroups and assigning decision rules to the subgroups in the model outputs. DT can produce feasible and effective results for large data sources in a relatively short time, but it is not suitable for strongly correlated data.
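As a rough stand-in for the three classifiers described above, the sketch below uses scikit-learn: SVC (which wraps LIBSVM) with C = 2 and γ = 1, a grid-searched KNN in place of the Bayesian-optimized fitted KNN, and a plain decision tree, all evaluated with 10-fold cross-validation. The library calls and the parameter grid are illustrative assumptions, not the authors' MATLAB setup.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

def evaluate(features, labels):
    """Report 10-fold cross-validated accuracy for SVM, a tuned KNN, and DT."""
    svm = SVC(kernel="rbf", C=2, gamma=1)
    knn = GridSearchCV(KNeighborsClassifier(),
                       {"n_neighbors": list(range(1, 31)),
                        "weights": ["uniform", "distance"]},
                       cv=5)                          # stand-in for Bayesian optimization
    tree = DecisionTreeClassifier()
    for name, clf in [("SVM", svm), ("OF-KNN", knn), ("DT", tree)]:
        acc = cross_val_score(clf, features, labels, cv=10).mean()
        print(f"{name}: {acc:.4f}")
```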
4. Results
4.1. Evaluation Metrics
Four classification metrics, namely accuracy (Acc), sensitivity (Sen), specificity (Spe), and precision (Pre), are used in this study [50].
(1) Accuracy
Accuracy is the most commonly used evaluation criterion. It represents the proportion of samples that are classified correctly:
Acc = (TP + TN)/(TP + TN + FP + FN)
(2) Sensitivity
Sensitivity, also called recall, is the proportion of positive samples that are classified as positive by the model:
Sen = TP/(TP + FN)
(3) Specificity
Specificity is the proportion of negative instances that are correctly classified:
Spe = TN/(TN + FP)
(4) Precision
Precision is the proportion of samples predicted as positive that are truly positive:
Pre = TP/(TP + FP)
where TP, TN, FP, and FN stand for true positive, true negative, false positive, and false negative, respectively.
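These four metrics can be computed directly from the confusion counts, as in the short helper below (assuming binary labels coded as 1 for the positive class and 0 for the negative class).

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and precision from confusion counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)     # sensitivity (recall)
    spe = tn / (tn + fp)     # specificity
    pre = tp / (tp + fp)     # precision
    return acc, sen, spe, pre
```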
4.2. Preprocessing
EEG signals are usually collected with noise in real life, which makes it challenging to design algorithms for emotion classification. EEG recording equipment may be affected by the surrounding environment, and muscle activity and eye movements can also introduce noise. The input signal used for emotion recognition should therefore be a noise-filtered signal. The DEAP database provides a preprocessed version in which the data have been down-sampled to 128 Hz and a 4.0–45.0 Hz bandpass frequency filter has been applied. We set a 10-s long sliding time-window with 50% overlap to divide the one-minute long EEG signals; following this segmentation, each one-minute long EEG signal is divided into eleven 10-s long EEG segments.
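The segmentation described above (10-s windows with 50% overlap, giving eleven segments per one-minute trial at 128 Hz) can be sketched as follows; the function is a simple illustration rather than the authors' code.

```python
import numpy as np

def segment(signal, fs=128, win_sec=10, overlap=0.5):
    """Split one EEG channel into fixed-length windows with fractional overlap."""
    win = int(win_sec * fs)
    step = int(win * (1 - overlap))
    return np.array([signal[s:s + win]
                     for s in range(0, len(signal) - win + 1, step)])

segments = segment(np.random.randn(60 * 128))
print(segments.shape)   # (11, 1280)
```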
4.3. Analysis of Visibility Graph Networks
The emotion classes are assigned according to the arousal and valence ratings given by the subjects. Each dimension can be divided into two classes, i.e., low or high, based on a threshold of 5 [51]: the labels are low valence and low arousal when the rating is less than 5, and high valence and high arousal when the rating is greater than or equal to 5. The adjacency matrices of the networks obtained by applying VG to EEG signals with high valence and low valence are shown in Figure 5. As mentioned in Section 3.3.1, when a time series is mapped to an unweighted complex network, it can be expressed by an adjacency matrix: when two nodes are visible to each other, the corresponding value of the matrix is 1; otherwise, it is 0. The white dots in Figure 5 and Figure 6 indicate pairs of nodes that are visible to each other, and the black portions represent no visibility. For each set of data, 1280 samples were selected. The network connections of the EEG signal with low valence (Figure 5a) are tighter, and the clusters are much bigger, which indicates that its clustering characteristic is more obvious than that of the EEG signal with high valence (Figure 5b) used for comparison.
The adjacency matrices of the networks based on the HVG method are shown in Figure 6. The information obtained from Figure 6 is similar to that in Figure 5: the network connections in Figure 6a are tighter than those in Figure 6b, and the clusters are much bigger. There are fewer white dots in Figure 6 than in Figure 5, which means that the number of connected edges in Figure 6 is smaller than that in Figure 5; this indicates that the network mapped by VG is more complicated than that mapped by HVG. From the above analysis, we can see that the visibility network is effective for emotion recognition. HVG retains part of the information in the VG while having a more straightforward structure, so the HVG method is chosen as the basis of our procedure.
Figure 7 shows a local close-up of the weight matrices based on the forward and backward weighted complex networks. When a time series is mapped to a weighted complex network, it can be expressed by a weight matrix. The color represents the edge weight: the larger the value, the darker the color. Here, 128 samples were selected for easier comparison, and the four images all come from the same time series; the weight matrices were normalized. The figures show the different edge weights produced by the different methods. Figure 7a,b are the weight matrices of the forward weighted visibility graph (FWVG) and the backward weighted visibility graph (BWVG), and Figure 7c,d are the weight matrices of the forward and backward weighted horizontal visibility graphs. The edge weights of elements near the diagonal of the matrices are much larger than those far from the diagonal, and elements with large weights are located in different places in the different graphs.
As mentioned above, 32 EEG channels are used to classify emotional states. That means that, for each complex network feature, we obtain a 32-dimensional feature vector per segment. In this paper, four network properties are used for emotion recognition, as listed in Section 3.4.2. For one feature, the feature matrix over the 32 EEG channels is 440 (segments) × 32 (channels); for the four features, it is 440 (segments) × 128 (32 channels × 4 features). There was little difference among the classification results of the four features taken separately, so we randomly selected one feature to compare the effectiveness of the different methods; the average weighted degree feature is used here. Figure 8 shows box plots of this feature over the 32 EEG channels based on HVG and DWHVG. The abscissa represents the 32 EEG channels; the red box plots correspond to the average weighted degree of EEG signals with low valence, and the black box plots to EEG signals with high valence. It can be observed from the box plots that the differences in median and quartiles in Figure 8b are more obvious than those in Figure 8a.
4.4. Classification Results
Five-fold and 10-fold cross-validation were performed on each participant’s samples, and the mean across folds was taken as the result for that subject. The average performance over all participants was calculated as the final result.
4.4.1. Comparison of Time-Domain Features
In [41], only 14 channels were selected for classifying emotional states; the selected EEG channels are AF3, F3, F7, FC5, T7, P7, O1, O2, P8, T8, FC6, F8, F4, and AF4. The first 20 s of data were excluded from the EEG samples, and the remaining 40-s long EEG signal was divided into four 10-s long segments without overlap. Grid search was used to scan the available set of parameters to identify the best ones: the parameters of SVM and KNN were {C, γ} ∈ {10^−4, 10^−3, 10^−2, 10^−1, 1, 10, 10^2, 10^3, 10^4, 10^5, 10^6} and k ∈ {5, 4, 3, 2, 1}. To determine the effect of different data lengths and sliding-window types, four scenarios were compared in this section, using the same EEG channels and classifiers. Five-fold cross-validation was used, as in [41].
Scenario 1: The plan used in [41].
Scenario 2: The remaining 40-s long EEG signal was divided by a 10-s long sliding time-window with 50% overlap.
Scenario 3: A 10-s long sliding time-window partitioned one-minute long EEG signal into six segments without overlap.
Scenario 4: One-minute long EEG signal was segmented by a 10-s long sliding time-window with 50% overlap.
In Scenario 1, there are 160 (40 videos × 4 segments) feature vectors for each participant on each channel; with 5-fold cross-validation, there are 128 training samples and 32 testing samples. In Scenario 2, the 280 (40 videos × 7 segments) feature vectors are divided into five equal folds of 56 samples each. There are 240 (40 videos × 6 segments) feature vectors in Scenario 3, which 5-fold cross-validation splits into 192 training samples and 48 testing samples. In Scenario 4, the 440 (40 videos × 11 segments) feature vectors are divided into 352 training samples and 88 testing samples.
Average accuracies of the different scenarios for the valence and arousal classification tasks are presented in Table 2. When the sliding time-window with an overlap rate of 50% is used for data segmentation, the classification accuracy is higher, and the average emotion recognition rates on the 60-s long EEG signals are better than those on the remaining 40-s long EEG signals. In Scenario 4, the classification accuracies for valence are 95.68%, 94.60%, and 85.19% with SVM, KNN, and DT, and the corresponding accuracies for arousal are 93.41%, 94.22%, and 81.23%.
4.4.2. Analysis of Complex Network Features
In this section, the one-minute long EEG signals were divided by a 10-s long sliding time-window with 50% overlap, and 32-channel EEG recordings were used for classifying emotional states. The 10-fold cross-validation method was used in the following experiments. The performance estimation for the complex network features of HVG and of the proposed method is shown in Table 3 and Table 4.
As seen in Table 3 and Table 4, the OF-KNN method clearly outperforms SVM and DT in classifying valence and arousal, and DT has the worst performance. With OF-KNN, the average classification accuracies of the proposed method for valence and arousal are 97.53% and 97.75%, respectively, while the HVG algorithm achieves 96.51% and 96.21%. The classification accuracies of the proposed method in valence and arousal are thus 1.02% and 1.24% higher than those of the HVG method. Most of the evaluation metrics in Table 3 are better than those in Table 4.
4.4.3. Performance of Combined Features
In Section 4.4.1, only 14-channel EEG recordings were selected. In order to analyze the data more objectively, the remaining 18 channels were added for emotion recognition in this section, as in Section 4.4.2. Table 5 shows the performance estimation for the time-domain features of the one-minute long EEG signals over 32 channels, and Table 6 lists the classification performance of the combined features, which include the time-domain features and the complex network features of the proposed method.
The OF-KNN method is superior to SVM and DT in classifying both the time-domain features and the combined features. With OF-KNN, the overall average accuracies of the time-domain features are 97.78% and 97.37% for valence and arousal, and those of the combined features are 98.12% and 98.06%, respectively. The classification accuracies of the combined features in valence and arousal are thus 0.42% and 0.69% higher than those of the time-domain features, and 0.59% and 0.31% higher than those of the proposed complex network features (listed in Table 4). With OF-KNN, most of the evaluation metrics of the combined features are also more stable than those of the time-domain features. For example, in the arousal dimension, the standard deviations of Acc, Sen, Spe, and Pre based on the time-domain features in Table 5 are 2.35%, 3.85%, 2.37%, and 3.45%, whereas those of the combined features in Table 6 are 1.81%, 2.13%, 1.38%, and 2.09%.
4.4.4. Effectiveness of Different Classifiers
The final experimental results for valence and arousal are shown in Figure 9 and Figure 10. The OF-KNN classifier distinguishes EEG signals in the valence and arousal dimensions better than the other two classifiers, while the lowest classification accuracy is obtained with the DT classifier. When the SVM classifier is used, the classification accuracies of the combined features drop compared with the time-domain features and the visibility graph features, so combining the two types of features does not always improve the classification accuracy. The evaluation metrics of OF-KNN are better than those of SVM and DT and fluctuate less; the fluctuations of the OF-KNN metrics are smaller than those of SVM, whose metrics vary considerably. This result partially reflects that the OF-KNN classifier outperformed SVM and DT in EEG-based emotion recognition in this paper.
5. Discussion
Many researchers have extracted features from EEG signals to identify the emotional state. Among these methods, time-domain features, entropy, and wavelet transform are widely used. In this study, we investigated the effectiveness of complex network metrics and time-domain features on emotion recognition.
For the time-domain features, four scenarios were compared to determine the effect of different data lengths and sliding-window types on emotion classification. The results showed that the method reached its highest accuracy when the EEG signals were segmented by a 10-s long sliding time-window with 50% overlap. As mentioned above, each participant watches 40 one-minute long videos and therefore has 40 one-minute long EEG recordings. When six time-domain features are extracted from each channel, 192-dimensional (32 channels × 6 features) feature matrices are produced.
In the case of the complex network metrics, we constructed the DWHVG based on a new angle measurement method, which turns the undirected HVG into a direction-dependent network. EEG signals were mapped into FWHVGs and BWHVGs from the two directions, and on this basis the fusion feature was used to improve the effectiveness of the features. Extracting the four network metrics from each channel of EEG data produces 128-dimensional (32 channels × 4 features) feature matrices. The results show that the proposed method is effective in recognizing emotion.
SVM, OF-KNN, and DT classifiers were used for classification, and the results reflect that the OF-KNN classifier outperformed SVM and DT with our method. When the combination of the two types of features was fed into the three classifiers, only OF-KNN showed a better classification rate. This confirms that complex network features are effective in recognizing emotion and provides a new research direction for emotion recognition.
A comparison of the proposed method with existing methods is presented in Table 7. The emotion recognition problems in the references of Table 7 are all binary classification tasks, and the EEG signals all come from the DEAP database. Different feature extraction methods were compared in [41]; with the KNN classifier, the time-domain statistical characteristics achieved accuracies of 77.62% and 78.96% for valence and arousal, respectively. Gao et al. [52] proposed a channel-fused dense convolutional network for EEG-based emotion recognition, a deep-learning framework that obtained recognition accuracies over 92% for both the valence and arousal classification tasks. Cui et al. [53] used an end-to-end regional-asymmetric convolutional neural network (RACNN) to reach accuracies of 96.65% and 97.11% for valence and arousal. An emotion recognition system transforming 1D chain-like EEG vector sequences into 2D mesh-like matrix sequences was proposed in [54]; the experimental results demonstrated that the classification accuracies of the hybrid neural networks reached 93.64% and 93.26% in the valence and arousal dimensions. Liu et al. [55] used a multi-level features guided capsule network (MLF-CapsNet), in which a one-second long sliding time-window divided the one-minute long EEG signal into 60 segments; the maximum recognition rates on valence and arousal were 97.97% and 98.31%, respectively. When combined with the time-domain features, the proposed method achieved accuracies of 98.12% and 98.06% for valence and arousal.
According to the values of arousal and valence, emotion states can also be divided into four types: high arousal high valence (HAHV), high arousal low valence (HALV), low arousal high valence (LAHV), and low arousal low valence (LALV). Zhang et al. [45] employed an empirical mode decomposition (EMD) strategy to decompose EEG signals and then calculated the sample entropies of the first four intrinsic mode functions (IMFs); the average accuracy for the 4-class task was 93.20%. In [26], nonlinear features were extracted from EEG signals and a feature selection method was used to enhance the classification performance; MLP, KNN, and SVM were combined through a voting algorithm into an ensemble classifier, achieving a classification rate of 84.56% on the DEAP dataset and 90% on their own dataset. In [56], the highest classification accuracy achieved by an ANN with entropy-based features for 4-class emotion recognition was 93.75%.
The limitations of this study are as follows. The preprocessed dataset provided by the DEAP database was used in this paper, so we did not take the effect of noise into account; a study on noise robustness should be considered in future work. Moreover, the proposed method was only verified on the DEAP dataset and should be evaluated on other datasets. Finally, only two-level classification experiments on valence and arousal were considered in this paper; the multi-class problem should also be taken into consideration.
6. Conclusions
This paper proposed a novel method based on an improved visibility graph network to recognize emotion, classifying the two emotional dimensions of arousal and valence. In this model, a weighted visibility graph construction method based on visibility angle measurement transforms the undirected network into a directed one. Then, the feature matrices extracted from the two directions of the DWHVG were integrated into new feature matrices through feature fusion.
Thirty-two channel recordings of EEG signals were used in this implementation. In addition, we also extracted the time-domain features. Three different machine learning classifiers, SVM, OF-KNN, and DT, were used to compare the feature extraction methods.
In the valence and arousal dimensions, the average emotion recognition rates based on the complex network features of our proposed method reached 97.53% and 97.75% with 10-fold cross-validation. When combined with the time-domain features, the average accuracies reached 98.12% and 98.06%. This confirms that the proposed method is effective in recognizing emotion.
In the process of emotion recognition, different combinations of channels give different recognition results. In the future, we will explore how to use fewer EEG channels to achieve higher classification accuracy. Moreover, multi-class classification of the emotional dimensions is also worth studying.
Author Contributions
T.K. and J.S. designed the algorithms, performed the experiments, and analyzed the experimental data. The other authors contributed to data analysis, checking, and correction, and R.M. co-supervised the students. All authors reviewed and approved the final manuscript. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Open Research Fund of Key Laboratory of Ministry of Education (UASP2001) and the Fundamental Research Funds for the Central Universities (2242020k30044).
Institutional Review Board Statement
Ethical review and approval were waived for this study because the information used in this paper comes from a publicly available dataset. The dataset was collected within the framework of the European Community’s Seventh Framework Program (FP7/2007–2011) under grant agreement no. 216444 (PetaMedia).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
Conflicts of Interest
The authors declare no conflict of interest.
References
- Jenke, R.; Peer, A.; Buss, M. Feature extraction and selection for emotion recognition from EEG. IEEE Trans. Affect. Comput. 2014, 5, 327–339. [Google Scholar] [CrossRef]
- Zheng, X.W.; Liu, X.F.; Zhang, Y.C. A portable HCI system-oriented EEG feature extraction and channel selection for emotion recognition. Int. J. Intell. Syst. 2021, 36, 152–176. [Google Scholar] [CrossRef]
- Salankar, N.; Mishra, P.; Garg, L. Emotion recognition from EEG signals using empirical mode decomposition and second-order difference plot. Biomed. Signal Process. Control 2021, 65, 102389. [Google Scholar] [CrossRef]
- Krumhuber, E.G.; Kuster, D.; Namba, S. Emotion recognition from posed and spontaneous dynamic expressions: Human observers versus machine analysis. Emotion 2021, 21, 447–451. [Google Scholar] [CrossRef] [PubMed]
- Sally, O.; Andrea, H.; Thomas, P. Psychometric challenges and proposed solutions when scoring facial emotion expression codes. Behav. Res. Methods 2014, 46, 992–1006. [Google Scholar]
- Zhu, Z.; Dai, W.; Hu, Y.; Li, J. Speech Emotion recognition model based on Bi-GRU and focal loss. Pattern Recognit. Lett. 2020, 11, 358–365. [Google Scholar] [CrossRef]
- Hassouneh, A.; Mutawa, A.M.; Murugappan, M. Development of a real-time emotion recognition system using facial expressions and EEG based on machine learning and deep neural network methods. Inform. Med. Unlocked 2020, 20, 100372. [Google Scholar] [CrossRef]
- Niu, B.; Gao, Z.; Guo, B. Facial expression recognition with LBP and ORB features. Comput. Intell. Neurosci. 2021, 2021, 8828245. [Google Scholar] [CrossRef]
- Valentina, F.; Jordi, V.; Alfredo, M. Errors, Biases and Overconfidence in Artificial Emotional Modeling. In Proceedings of the WI′19: IEEE/WIC/ACM International Conference on Web Intelligence (WI′19 Companion), Thessaloniki, Greece, 14–17 October 2019. [Google Scholar]
- Gupta, V.; Chopda, M.D.; Pachori, R.B. Cross-subject emotion recognition using flexible analytic wavelet transform from EEG signals. IEEE Sens. 2019, 19, 2266–2274. [Google Scholar] [CrossRef]
- Thammasan, N.; Moriyama, K.; Fukui, K.I.; Numao, M. Continuous music-emotion recognition based on electroencephalogram. IEICE Trans. Inf. Syst. 2016, 99, 1234–1241. [Google Scholar] [CrossRef] [Green Version]
- Arnau, G.P.; Arevalillo, H.M.; Ramzan, N. Fusing highly dimensional energy and connectivity features to identify affective states from EEG signals. Neurocomputing 2017, 244, 81–89. [Google Scholar] [CrossRef] [Green Version]
- Krishna, N.M.; Sekaran, K.; Vamsi, A.V.N.; Ghantasala, G.S.P.; Chandana, P.; Kadry, S.; Blazauskas, T.; Damasevicius, R.; Kaushik, S. An efficient mixture model approach in brain-machine interface systems for extracting the psychological status of mentally impaired persons using EEG signals. IEEE Access 2019, 7, 77905–77914. [Google Scholar] [CrossRef]
- Lacasa, L.; Luque, B.; Ballesteros, F.; Luque, J.; Nuño, J.C. From time series to complex networks: The visibility graph. Proc. Natl. Acad. Sci. USA 2008, 105, 4972–4975. [Google Scholar] [CrossRef] [Green Version]
- Luque, B.; Lacasa, L.; Ballesteros, F. Horizontal visibility graphs: Exact results for random time series. Phys. Rev. E 2009, 80, 046103. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Wang, J.; Yang, C.; Wang, R.; Yu, H.; Cao, Y.; Liu, J. Functional brain networks in Alzheimer’s disease: EEG analysis based on limited penetrable visibility graph and phase space method. Phys. A Stat. Mech. Its Appl. 2016, 460, 174–187. [Google Scholar] [CrossRef]
- Zhu, G.H.; Li, Y.; Wen, P. Epileptic seizure detection in EEGs signals using a fast weighted horizontal visibility algorithm. Comput. Methods Programs Biomed. 2014, 115, 64–75. [Google Scholar] [CrossRef] [PubMed]
- Bhaduri, S.; Ghosh, D. Electroencephalographic data analysis with visibility graph technique for quantitative assessment of brain dysfunction. Clin. EEG Neurosci. 2015, 46, 218–223. [Google Scholar] [CrossRef] [PubMed]
- Zhu, G.H.; Li, Y.; Wen, P. Analysis and classification of sleep stages based on difference visibility graphs from a single-channel EEG signal. IEEE J. Biomed. Health 2014, 18, 1813–1820. [Google Scholar] [CrossRef]
- Cai, Q.; Gao, Z.K.; Yang, Y.X.; Dang, W.-D.; Grebogi, C. Multiplex limited penetrable horizontal visibility graph from EEG signals for driver fatigue detection. Int. J. Neural Syst. 2019, 29, 1850057. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Bajestani, G.S.; Behrooz, M.; Khani, A.G.; Nouri-Baygi, M.; Mollaei, A. Diagnosis of autism spectrum disorder based on complex network features. Comput. Methods Programs Biomed. 2019, 177, 277–283. [Google Scholar] [CrossRef] [PubMed]
- Lee, M.S.; Lee, Y.K.; Lim, M.T.; Kang, T.-K. Emotion recognition using convolutional neural network with selected statistical photoplethysmogram features. Appl. Sci. 2020, 10, 3501. [Google Scholar] [CrossRef]
- Machot, F.A.; Elmachot, A.; Ali, M.; Machot, E.A.; Kyamakya, K. A deep-learning model for subject-independent human emotion recognition using electrodermal activity sensors. Sensors 2019, 19, 1659. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Kim, Y.; Choi, A. EEG-based emotion classification using long short-term memory network with attention mechanism. Sensors 2020, 20, 6727. [Google Scholar] [CrossRef] [PubMed]
- Zheng, W.L.; Zhu, J.Y.; Lu, B.L. Identifying stable patterns over time for emotion recognition from EEG. IEEE Trans. Affect. Comput. 2017, 10, 417–429. [Google Scholar] [CrossRef] [Green Version]
- Soroush, M.Z.; Maghooli, K.; Setarehdan, S.K.; Nasrabadi, A.M. Emotion classification through nonlinear EEG analysis using machine learning methods. Int. Clin. Neurosci. J. 2018, 5, 135–149. [Google Scholar] [CrossRef]
- Yin, Y.; Zheng, X.; Hu, B.; Zhang, Y.; Cui, X. EEG emotion recognition using fusion model of graph convolutional neural networks and LSTM. Appl. Soft Comput. 2020, 100, 106954. [Google Scholar] [CrossRef]
- Song, T.; Zheng, W.; Song, P.; Cui, Z. EEG emotion recognition using dynamical graph convolutional neural networks. IEEE Trans. Affect. Comput. 2020, 11, 532–541. [Google Scholar] [CrossRef] [Green Version]
- Goshvarpour, A.; Abbasi, A. An Emotion recognition approach based on wavelet transform and second-order difference plot of ECG. J. AI Data Min. 2017, 5, 211–221. [Google Scholar]
- Tracy, J.L.; Randles, D. Four Models of Basic Emotions: A Review of Ekman and Cordaro, Izard, Levenson, and Panksepp and Watt. Emot. Rev. 2011, 3, 397–405. [Google Scholar] [CrossRef] [Green Version]
- Panksepp, J.; Watt, D. What is basic about basic emotions? Lasting lessons from affective neuroscience. Emot. Rev. 2011, 3, 387–396. [Google Scholar] [CrossRef]
- Tuomas, E.; Vuoskoski, K. A review of music and emotion studies: Approaches, emotion models, and stimuli. Music Percept. Interdiscip. J. 2013, 30, 307–340. [Google Scholar]
- Cowen, A.S.; Keltner, D. Self-report captures 27 distinct categories of emotion bridged by continuous gradients. Proc. Natl. Acad. Sci. USA 2017, 114, E7900–E7909. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Russell, J.A. Affective space is bipolar. J. Personal. Soc. Psychol. 1979, 37, 345–356. [Google Scholar] [CrossRef]
- Liu, Y.; Sourina, O. Real-time subject-dependent EEG-based emotion recognition algorithm. In Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), San Diego, CA, USA, 5–8 October 2014. [Google Scholar]
- Wu, P.; Li, X.; Shen, S.; He, D. Social media opinion summarization using emotion cognition and convolutional neural networks. Int. J. Inf. Manag. 2019, 51, 101016. [Google Scholar] [CrossRef]
- Kang, M.; Ahn, J.; Lee, K. Opinion mining using ensemble text hidden Markov models for text classification. Expert Syst. Appl. 2018, 94, 218–227. [Google Scholar] [CrossRef]
- Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.-S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A database for emotion analysis using physiological signals. IEEE Trans. Affect. Comput. 2012, 3, 18–31. [Google Scholar] [CrossRef] [Green Version]
- Jiang, W.; Wei, B.Y.; Zhan, J. A visibility graph power averaging aggregation operator: A methodology based on network analysis. Comput. Ind. Eng. 2016, 101, 260–268. [Google Scholar] [CrossRef]
- Supriya, S.; Siuly, S.; Wang, H. Weighted visibility graph with complex network features in the detection of epilepsy. IEEE Access 2016, 4, 6554–6566. [Google Scholar] [CrossRef] [Green Version]
- Nawaz, R.; Cheah, K.H.; Nisar, H. Comparison of different feature extraction methods for EEG-based emotion recognition. Biocybern. Biomed. Eng. 2020, 40, 910–926. [Google Scholar] [CrossRef]
- Gao, Z.K.; Cai, Q.; Yang, Y.X. Time-dependent limited penetrable visibility graph analysis of nonstationary time series. Phys. A 2017, 476, 43–48. [Google Scholar] [CrossRef]
- Gao, Z.K.; Cai, Q.; Yang, Y.X. Visibility graph from adaptive optimal-kernel time-frequency representation for classification of epileptiform EEG. Int. J. Neural Syst. 2017, 27, 1750005. [Google Scholar] [CrossRef]
- Mohammadi, Z.; Frounchi, J.; Amiri, M. Wavelet-based emotion recognition system using EEG signal. Neural Comput. Appl. 2017, 28, 1985–1990. [Google Scholar] [CrossRef]
- Zhang, Y.; Ji, X.M.; Zhang, S.H. An approach to EEG-based emotion recognition using combined feature extraction method. Neurosci. Lett. 2016, 633, 152–157. [Google Scholar] [CrossRef] [PubMed]
- Bhatti, A.M.; Majid, M.; Anwar, S.M.; Khan, B. Human emotion recognition and analysis in response to audio music using brain signals. Comput. Hum. Behav. 2016, 65, 267–275. [Google Scholar] [CrossRef]
- Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1–27. [Google Scholar] [CrossRef]
- Chatterjee, S.; Choudhury, N.A.; Bose, R. Detection of epileptic seizure and seizure-free EEG signals employing generalized S-Transform. IET Sci. Meas. Technol. 2017, 11, 847–855. [Google Scholar] [CrossRef]
- Maazouzi, F.; Bahi, H. Using multi decision tree technique to improving decision tree classifier. Int. J. Bus. Intell. Data Min. 2012, 7, 274–287. [Google Scholar] [CrossRef]
- Marina, S.; Guy, L. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437. [Google Scholar]
- Yin, Z.; Zhao, M.; Wang, Y.; Yang, J.; Zhang, J. Recognition of emotions using multimodal physiological signals and an ensemble deep learning model. Comput. Methods Programs Biomed. 2017, 140, 93–110. [Google Scholar] [CrossRef]
- Gao, Z.; Wang, X.; Yang, Y.; Li, Y.; Ma, K.; Chen, G. A channel-fused dense convolutional network for EEG-based emotion recognition. IEEE Trans. Cogn. Dev. Syst. 2020, 1, 2976112. [Google Scholar] [CrossRef]
- Cui, H.; Liu, A.; Zhang, X.; Chen, X.; Wang, K.; Chen, X. EEG-based emotion recognition using an end-to-end regional-asymmetric convolutional neural network. Knowl. Based Syst. 2020, 205, 106243. [Google Scholar] [CrossRef]
- Chen, J.; Jiang, D.; Zhang, Y.; Zhang, P. Emotion recognition from spatiotemporal EEG representations with hybrid convolutional recurrent neural networks via wearable multi-channel headset. Comput. Commun. 2020, 154, 58–65. [Google Scholar] [CrossRef]
- Liu, Y.; Ding, Y.; Li, C.; Cheng, J.; Song, R.; Wan, F.; Chen, X. Multi-channel EEG-based emotion recognition via a multi-level features guided capsule network. Comput. Biol. Med. 2020, 123, 103927. [Google Scholar] [CrossRef] [PubMed]
- Ahirwal, M.K.; Kose, M.R. Emotion recognition system based on EEG signal: A comparative study of different features and classifiers. In Proceedings of the 2018 Second International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 15–16 February 2018. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).