1 Introduction

Countries are primarily classified as rich or poor based on their per-capita GDP (gross domestic product). The per-capita GDP of a nation represents its economic output per person; it is calculated by dividing the country's GDP by its mid-year population. Almost all nations raise their GDP through industrialization. However, industrialization comes at a cost, as it dramatically increases energy consumption. Historical primary-energy consumption is well established to be tightly coupled with carbon emissions; in this way, carbon emissions are also indirectly linked to GDP (Pedersen et al. 2021).

In the World Bank database, GDP values are unavailable for certain years for many countries, e.g., Hungary, Poland, Afghanistan, Yemen, and Iraq (https://data.worldbank.org/). These countries have missing GDP records because they were war-torn or politically unstable during different periods, and accordingly it was often impossible to collect the macroeconomic data needed to compute their GDP for those years. However, the tightly coupled, directly proportional relationship between economic growth and carbon emissions can be effectively exploited to predict the missing per-capita GDP of such nations from their carbon emissions. Moreover, the available data for these countries in the World Bank database are scarce. Hence, a machine learning (ML) model trained only on these data to predict missing GDP (from carbon emissions) may be neither robust nor precise.

The difficulty arising from this data scarcity can be alleviated by transfer learning (TL), since data from many other countries of similar economic status are readily available. Transfer learning is closely related to domain adaptation, meta-learning, and multi-task learning (Shell and Coupland 2015; Pan and Yang 2009). It has seen many real-world applications (Shao et al. 2014), e.g., image classification (Sunil et al. 2023), object/text recognition (Ghosh et al. 2023), video description (Rafiq et al. 2023), and industrial IoT (Deb et al. 2020). TL is inspired by human behavior, which reuses previously acquired knowledge to solve a newer but related task; for example, a person transfers the knowledge acquired from riding a bicycle when learning to ride a motorbike. Similarly, TL utilizes knowledge previously acquired in one domain to solve tasks in a newer but related domain. The performance of ordinary ML methodologies, in contrast, depends heavily on the training data: their generalization capability deteriorates profoundly if the training (source-domain) and testing (target-domain) data differ substantially in distribution, and in such situations they may not deliver the expected performance.

To overcome this limitation, this paper introduces a novel approach that preprocesses the source domain (training data) in accordance with the target domain (testing data) in an unsupervised manner using an isolation forest (IF). It thus proposes a novel unsupervised transfer learning (UTL) method, termed "target-specific multi-source UTL (TSmUTL)", for predicting the missing per-capita GDP of a nation. The term 'multi-source' indicates that the source-domain data are collected from multiple sources; here, from a number of countries with large economic, geographic, and political diversity among themselves. Carbon-emission and per-capita GDP data from a total of 78 countries are collected to form the multi-source domain data. This dataset is preprocessed separately for each target domain by removing anomalous data points with an IF. The IF is trained on the data of a specific target domain and then employed to identify those source-domain data points that do not conform to the majority distribution of that target domain. This identification of non-relevant data points relies on the IF's capability in outlier/anomaly detection.Footnote 1 Because the presence of such anomalous data points in the source domain degrades the prediction precision of ML methods on the target domain, anomaly detection is crucial. This paper uses the IF because it is an unsupervised tree-based approach that can be trained effectively even with little data; it is also highly potent in detecting anomalies, overcoming the swamping and masking problems in datasets. This preprocessing enhances the generalization ability of ML models in precisely predicting the missing per-capita GDP of a target-domain country.
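The preprocessing step described above can be illustrated with a minimal sketch using scikit-learn's IsolationForest: the forest is fit on the (small) target-domain data and then used to discard source-domain points that do not conform to the target distribution. The variable names, synthetic data, and parameter settings below are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of target-specific source filtering with an isolation
# forest; rows are (CO2 per capita, per-capita GDP) pairs.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Small target domain (e.g., the few available years of one country).
target = rng.normal(loc=[5.0, 4000.0], scale=[1.0, 500.0], size=(30, 2))

# Multi-source pool: 200 points resembling the target plus 50 points from
# a very different economy, which should be filtered out.
source = np.vstack([
    rng.normal(loc=[5.0, 4000.0], scale=[1.0, 500.0], size=(200, 2)),
    rng.normal(loc=[20.0, 50000.0], scale=[1.0, 500.0], size=(50, 2)),
])

# Fit the IF on the target domain only, then keep source inliers (+1).
iforest = IsolationForest(n_estimators=100, random_state=0).fit(target)
keep = iforest.predict(source) == 1
filtered_source = source[keep]  # enlarged, target-relevant training set

print(source.shape[0], filtered_source.shape[0])
```

The filtered source set then serves as the training data for the downstream regression model, which is evaluated on the target domain.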

Thus, the main contribution of this work is the design of a target-specific multi-source UTL for predicting a nation's missing per-capita GDP. The proposed methodology is equally potent for any nation, irrespective of whether it is a developed, developing, or least-developed economy, since the TL procedure is specifically curated to the data distribution of that particular nation. It increases the size of the relevant training dataset through IF-based anomaly detection in order to build a robust predictive ML model. The approach is empirically evaluated using different ML regression methodologies on eleven countries belonging to different economic strata, and it proves its worth in the majority of cases by significantly improving prediction precision.

Moreover, to evaluate the performance of an ML model accurately, this paper uses the root mean square relative error (RMSRE) instead of the widely popular root mean square error (RMSE). RMSE is the default choice for regression ML models, but this paper deviates from that default because the per-capita GDP of a country can be as low as $100 in the twentieth century and as high as $10,000 in the twenty-first century. In this scenario, RMSE is highly susceptible to the magnitude of the per-capita GDP, whereas RMSRE gives equal weight to each year's per-capita GDP irrespective of magnitude. The per-capita GDP of a nation can be missing at any point in its modern history, as early as 1960 or as late as 2016; to estimate it precisely at any point, RMSRE is the better choice due to its magnitude unbiasedness.Footnote 2
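The magnitude sensitivity of RMSE versus RMSRE can be shown with a small numeric sketch (the function names and toy data here are assumptions for illustration): a 10% error on a $100 GDP and a 10% error on a $10,000 GDP contribute equally to RMSRE, while RMSE is dominated by the larger absolute error.

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root mean square error: sensitive to the absolute magnitude of errors.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def rmsre(y_true, y_pred):
    # Root mean square relative error: weights each year equally
    # regardless of the GDP magnitude in that year.
    return np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2))

# Two years with a 10% prediction error each, at very different GDP levels.
y_true = np.array([100.0, 10000.0])
y_pred = np.array([110.0, 11000.0])

print(rmse(y_true, y_pred))   # dominated by the $1,000 absolute error
print(rmsre(y_true, y_pred))  # 0.1: both years weighted equally
```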

The rest of the manuscript is organized as follows. Related works are thoroughly reviewed in Sect. 2. Section 3 provides the basic definitions related to the current work. In Sect. 4, the proposed approach is discussed in detail. In Sect. 5, the performance of the proposed approach is empirically investigated for different regression techniques, and the results are compared. Section 6 reflects on the prime features of the proposed TSmUTL approach, explaining the main reasons behind its methodological superiority and better performance over its competitors; it also lists a few limitations of the proposed approach and highlights policy implications. Finally, Sect. 7 concludes the work and outlines several future research directions.

2 A concise review of related works

This section presents a concise survey of existing research on UTL, domain adaptation, and the CO2–GDP relationship. The survey is divided into two sub-sections: the first reviews the latest research in transfer learning, focusing especially on UTL and domain adaptation, and the second discusses existing works that analyze the relationship between CO2 and GDP.

2.1 Review of latest transfer learning research

TL is now a quite mature research field with valuable contributions from numerous researchers. This section provides a succinct review of the latest research works related to the topics covered in this paper.

A three-stage reject inference learning methodology for credit risk management was proposed by Shen et al. (2020a). They first used three-way decision theory to select rejected credit samples, then integrated high-level representations of accepted and selected rejected credit samples for transfer learning, and finally used the reconstructed accepted credit samples for credit scoring. The methodology was validated on Chinese credit data. Pirbonyeh et al. (2019) proposed a linear unsupervised transfer learning method to minimize the distribution difference between the transformed source domain and the target domain. They used three mechanisms to preserve the local structures of the untransformed source domain: the first minimized the distance among similar data pairs of the untransformed source domain, the second preserved its clusters, and the third combined the previous two. To optimize this nonlinear problem, they used two methods, an iterative method and a relaxed version, each aimed at achieving the optimal weight approximation. They evaluated their approach on standard UCI datasets, e.g., Yeast, Ionosphere, and Cancer, and on image classification problems such as Land Mine and Office-Caltech.

A particle swarm optimization based unsupervised transfer learning method was proposed by Sanodiya et al. (2020). The approach selects optimal features from both the target and source domains to overcome the limitation of degenerated feature transformations, in which the differences between the target- and source-domain distributions are minimized; addressing these distribution differences is very difficult if the original features in both domains are distorted. In addition, the approach aims to reduce the marginal and conditional distribution differences, maximize the variance of the target domain, preserve the source domain's discriminative information, and exploit similar geometric properties for manifold modeling. They evaluated their approach on different face recognition datasets. Michau and Fink (2021) proposed UTL for anomaly detection using adversarial deep learning to align distributions. They also introduced a multidimensional scaling loss for conserving the relationship between the target-domain and source-domain datasets.

For community detection, Xie et al. (2019a) proposed a deep transitive network for the extraction of nonlinear features. They incorporated UTL in which the Kullback–Leibler divergence between embedded instances is minimized, and evaluated the approach on real and synthetic networks. Shamsi et al. (2020) proposed TL mutation for predicting the phenotypic consequences of amino acid variation in proteins. It first used supervised TL, transferring knowledge from survival data to protein function, and then employed UTL to extend this information to a homologous protein. Through this approach, they exploited only single-point mutagenesis data to predict higher-order mutations, and predicted the mutational effects of homologous proteins using UTL. These TL approaches were generalized to random field models in order to transfer knowledge among them. Xie et al. (2019b) proposed a transfer integrated locally kernel embedding for click-through rate prediction, aiming to improve user experience and benefit advertising by handling high-dimensional data sparsity and imbalance to improve prediction precision.

Waytowich et al. (2016) proposed a UTL approach to address the frequent calibration sessions required in brain–computer interfaces due to inter- and intra-individual brain signal variations. Their spectral transfer using information geometry combined and improved the predictions of an ensemble of information-geometry classifiers built from individual subjects' training data. They validated the method on a rapid serial visual presentation task in both off-line and real-time feedback analyses. Lovric et al. (2021) employed UTL for embedding in the domain of toxicology. They investigated three dimensionality reduction techniques, viz., variational autoencoders, principal component analysis, and uniform manifold approximation and projection, to embed molecular fingerprints. They used these embedded features for training various classification models, viz., logistic regression and random forest. Their empirical evaluation suggested that UTL yields reasonable model quality for classifying toxicity outcomes.

A residual joint adaptation adversarial network employing UTL for fault diagnosis was proposed by Jiao et al. (2020), addressing the changing industrial environments caused by equipment wear, environmental interference, and varying working conditions. Their network incorporated a one-dimensional residual network for adaptive feature learning, which processes raw mechanical signals directly. They evaluated the model on two experimental platforms, i.e., a rolling bearing and a planetary gearbox. Torres-Soto and Ashley (2020) employed multi-task deep learning for detecting cardiac rhythm in wearable devices. For real-time atrial fibrillation detection in wearable photoplethysmography devices, their approach jointly assesses arrhythmia events and signal quality, and employed UTL through convolutional denoising autoencoders to enhance prediction precision. Bird et al. (2020) demonstrated the importance of TL in electromyographic (EMG, muscular wave) and electroencephalographic (EEG, brainwave) classification domains with both convolutional neural networks (CNNs) and multilayer perceptrons (MLPs). They used a multi-objective evolutionary search method to obtain the best hyperparameters of these networks. For both CNN and MLP networks, transfer learning from EEG to EMG and from EMG to EEG proved beneficial.

Li et al. (2020) proposed adversarial tight match for unsupervised domain adaptation, in which the distribution divergence between the source and target domains is reduced by jointly optimizing maximum density divergence, a loss function that aims to maximize intra-class density and minimize inter-domain divergence. For machine fault diagnosis, Zhang et al. (2021) proposed deep learning based open set domain adaptation. The methodology extracts domain-invariant features using adversarial learning; to further improve generalization, entropy minimization was employed, and an instance-level weighting mechanism was introduced to embed the similarities between testing data and known health conditions. Kernel maximum mean discrepancy has been studied as a means of reducing the discrepancy between target and source domains; it was used as a transfer criterion in joint distribution adaptation and transfer component analysis (shallow methods), and in deep methods such as deep adaptation networks and deep domain confusion. Kernel selection remains a challenge, however, and deep adaptation networks tried to combine multiple kernels. Thus, Li et al. (2021) proposed an optimal deep transfer network that employs an ensemble of deep transfer networks using different kernels for maximum mean discrepancy.

A joint feature and label adversarial network for wafer map defect recognition was proposed by Yu et al. (2020). Defect recognition in wafer maps helps locate faults in semiconductor manufacturing processes for yield enhancement. This semi-supervised deep transfer learning algorithm used convolutional neural networks (CNNs) on feature maps to extract transferrable features, and then introduced a generative adversarial network (GAN) based pseudo-label learning block and multilayer domain adaptation to reduce distribution discrepancy. In transfer representation learning, the commonly used methodology is projection to a high-dimensional space using kernel-based nonlinear mapping, followed by dimensionality reduction. Due to the lack of interpretability and the difficulty of kernel selection, Xu et al. (2019) used a Takagi–Sugeno–Kang fuzzy system (TSK-FS) for transfer representation learning. It consists of two parts: it first transforms the target and source domains into a fuzzy feature space, and then employs principal component analysis and linear discriminant analysis to preserve geometric properties and discriminant information. For domain discrepancy measurement using distances between the probability distributions of the target and source domains, Zellinger et al. (2021) addressed the problem of deriving generalization bounds. They analyzed domain adaptation under the assumption of moment distances, realizing weaker similarity between target- and source-domain distributions, and obtained generalization bounds based on finite moments and smoothness conditions.

A multi-source TL algorithm addressing the time-varying characteristics of time series data (which cause differences between new and old data) was proposed by Gu and Dai (2021). They introduced the theory of domain adaptation and analyzed the target risk of time series forecasting in multi-source settings. They used maximum mean discrepancy in the multi-source TL algorithm to measure similarity between target and source domains, and exploited the Kullback–Leibler divergence for similarity measurement among domains in active learning with multi-source TL.

A deep transfer learning based neural network coupled with multiscale distribution adaptation for cross-domain state-of-charge estimation was proposed by Bian et al. (2020). To extract nonlinear characteristics from battery measurements of two different domains, their neural network was composed of a convolutional block followed by a bidirectional recurrent neural network. To minimize the discrepancy between the nonlinear features of the two domains, they used multiscale distribution adaptation to impose constraints on the neural layers. Mardani and Jafar (2021) used iterative Fisher linear discriminant analysis in their proposed cross- and multiple-domain visual transfer learning for domain adaptation. Here, they transferred the target and source domains into a lower-dimensional subspace, benefiting from joint domain adaptation and Fisher linear discriminant analysis in reducing domain discrepancy. To contain data drift across domains, the method employed an adaptive classifier and used pseudo target labels to iteratively converge the model.

An adaptive component embedding for domain adaptation to alleviate domain discrepancy was proposed by Jing et al. (2020). By aligning first-order statistics, it embedded the domain data into a shared invariant feature subspace while preserving geometric properties; to further mitigate the domain shift, it also aligned second-order statistics. Thereafter, for classification, a structural risk functional was optimized in a reproducing kernel Hilbert space. For 2D-to-3D registration through digitally reconstructed radiographs, Wang et al. (2021) proposed a pseudo-Siamese multi-view point-based registration methodology to overcome the scarcity of real fluoroscopic images. Through this methodology, they aimed to overcome the requirement for real X-ray images by generating similar digitally reconstructed radiographs with computed tomography. By transferring real features to synthetic features, they demonstrated a reduction in the domain differences between authentic X-ray and synthetic images. Balaha et al. (2022) used a deep TL and feature classification approach for early detection and prognosis of COVID-19. They used both normal data augmentation methods and generative adversarial network based augmentation in the early diagnostic phase. Seven different convolutional neural networks using transfer learning were trained to detect infection from computed tomography images, and for prognostic markers, 28 ML methods were applied to patients' laboratory test data. A cross-network deep network embedding for cross-network node classification incorporating domain adaptation was proposed by Shen et al. (2020b). The network was aimed at learning network-invariant and label-discriminative node vector representations. By mapping strongly connected nodes to similar latent vectors, the methodology leverages network structure in learning proximity among nodes; for cross-networks, it leveraged labels and attributes to learn proximities among nodes.
For surgical margin detection, Santilli et al. (2021) used domain adaptation to build a self-supervised model from scarce and weakly labelled data. The approach aimed to contextualize intelligent knife (iKnife) features from easily available cancer data; their domain adaptation approach transferred the learnt information to classify breast cancer. The iKnife detects the tissue being burnt by capturing and analyzing the vapor emanating from a high-frequency electrocautery tool.

A classifier trained on domain-invariant features of the source domain cannot guarantee the alignment of the target-domain class distribution with the source domain, a limitation addressed by Lee et al. (2021). They proposed class-conditional domain-invariant learning, motivated by generalizing the upper bound for domain adaptation. In the learned feature space, samples of the same class are mapped close together for class-conditional alignment, and the method can be applied to any deep learning approach with domain-invariant learning. Gupta and Jalal (2022) presented a comprehensive review and performance analysis of state-of-the-art TL-based techniques and their extensions in scene text reading systems. Yao et al. (2023) presented a survey of TL techniques for machinery tasks comprising diagnostics and prognostics, focusing on three types of TL methods, viz., feature matching, model and parameter transfer, and adversarial adaptation.

Table 1 summarizes the recent literature on UTL and domain adaptation, focusing on the methodology used/proposed and the application investigated. From these analyses, it is observed that TL can be highly beneficial in utilizing existing knowledge from a related domain to overcome data scarcity in a newer domain when building robust ML models. The next sub-section explores the real-world problem that this work addresses efficiently and effectively by developing a novel TL approach.

Table 1 Summarizing the recent research on domain adaptation and UTL approaches

2.2 Review of the latest research on CO2 and GDP relationship

Many countries in the World Bank database have missing GDP values because they were war-torn, isolated, or politically/economically unstable. This paper aims to fill that gap by exploiting a country's carbon emission data to predict its missing GDP. The relationship between carbon emissions and GDP has been widely studied in the literature, and this sub-section reviews that work: it first discusses the Environmental Kuznets Curve (EKC) hypothesis and then focuses on the correlation between GDP and carbon emissions. Hoa and Limskul (2013) studied the economic impact of CO2 emissions and its mitigation policies. Narayan et al. (2016) used cross-correlation estimation to analyse the association between economic growth and CO2 emissions based on data from 181 countries. Abid (2017) empirically analysed 41 EU countries and 58 Middle Eastern and African countries to test the EKC hypothesis; the experiments illustrated a monotonically increasing relationship between CO2 emissions and GDP.

The carbon emissions of a nation may or may not follow the inverted U-shaped EKC. The EKC postulates an inverted U-shaped correlation between CO2 emissions and GDP: environmental degradation (carbon emission) is substantial at the beginning of industrialization, when a country's economy is weaker and its GDP smaller, but as the country develops and achieves higher GDP, its carbon emissions start to decline (Dinda 2004), since the country then focuses more on a better living standard and a clean environment than on income growth alone. Many existing studies support the EKC hypothesis (Esteve and Tamarit 2012; Shahbaz et al. 2012; Churchill et al. 2020; Ike et al. 2020; Sarkodie and Ozturk 2020), while many others reject it (Robalino-López et al. 2014, 2015, 2016; Altıntaş and Kassouri 2020; Dogan et al. 2020). Esteve and Tamarit (2012) analyzed the Spanish EKC for the years 1857 to 2007 and found that "income elasticity" was less than one, even though per-capita CO2 consumption was monotonically increasing. Shahbaz et al. (2012) supported the EKC hypothesis for Pakistan during 1971–2009 by analyzing the interactions among CO2 emissions, economic growth, trade openness, and energy consumption. Churchill et al. (2020) demonstrated an inverted U-shaped EKC for eight Australian states and territories, where carbon emissions peaked in 2010 and declined afterwards. Ike et al. (2020) investigated the effect of oil production on carbon emissions in 15 oil-producing countries, validating the EKC hypothesis for median- and higher-emission countries; they also showed that electricity production increases carbon emissions while trade reduces them. Sarkodie and Ozturk (2020) validated the EKC hypothesis for Kenya.
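Empirical EKC studies commonly test the inverted-U shape by regressing log emissions on log income and its square; a negative coefficient on the squared term is consistent with the hypothesis. The sketch below illustrates this with ordinary least squares on synthetic data (the data and variable names are illustrative assumptions, not taken from the surveyed papers).

```python
# Hedged sketch of a quadratic EKC specification:
#   log(CO2) = b0 + b1*log(GDP) + b2*log(GDP)^2 + noise,  EKC <=> b2 < 0
import numpy as np

rng = np.random.default_rng(1)
log_gdp = rng.uniform(6.0, 11.0, size=300)  # synthetic log per-capita GDP
log_co2 = (-20.0 + 5.0 * log_gdp - 0.3 * log_gdp**2
           + rng.normal(0.0, 0.1, size=300))  # inverted-U plus noise

# Ordinary least squares via the normal equations (lstsq).
X = np.column_stack([np.ones_like(log_gdp), log_gdp, log_gdp**2])
b0, b1, b2 = np.linalg.lstsq(X, log_co2, rcond=None)[0]

print(b2 < 0)                    # True here: consistent with an inverted U
turning_point = -b1 / (2 * b2)   # income level at which emissions peak
print(turning_point)
```

The sign of `b2` and the implied turning point are what the surveyed studies examine (with far richer econometric controls) when validating or rejecting the EKC for a given country or region.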

Robalino-López et al. (2014) suggested that Ecuador could fulfill the EKC in the near future if its economic growth were coupled with increased usage of efficient fossil fuel technology and renewable energies. In Robalino-López et al. (2015), the authors found that the EKC is not fulfilled for Venezuela, but suggested it could be achieved in the medium term if economic growth were combined with increased renewable energy usage, along with appropriate changes in the productive sectoral structure and the energy matrix. In Robalino-López et al. (2016), they analyzed ten South American countries from 1980 to 2010 and concluded that the EKC was not supported in this region, although CO2 emissions increase with GDP in all of the countries except Venezuela. Altıntaş and Kassouri (2020) analysed fourteen European countries and found that the EKC hypothesis was not valid for them. Dogan et al. (2020) re-investigated the EKC for BRICST (Brazil, Russia, India, China, South Africa, and Turkey) by considering the ecological footprint as a proxy for the environment; the empirical results rejected the EKC hypothesis for the region excluding Russia (due to data unavailability). From the literature surveyed above, it can be seen that the EKC hypothesis is not universally applicable due to geopolitical constraints, e.g., resource availability and utilization of renewable energies.

Wagner (2008) examined the relationship between GDP and carbon emissions to address several econometric challenges; later, Wagner (2015) extended the work with a detailed depiction of the carbon emission elasticity of GDP along with its monotonic behavior. Pao and Tsai (2011) analyzed the correlations among energy consumption, foreign direct investment, carbon emissions, and GDP for Russia, Brazil, China, and India. The unidirectional causalities among income, energy consumption, and CO2 emissions of the member countries of the Gulf Cooperation Council were investigated by Salahuddin and Gow (2014). Ajmi et al. (2015) stated that the relationships among CO2 emissions, energy consumption, and income exhibit time-dependent variations.

Variations in CO2 emissions between rural populations and urban conglomerates in Asian countries were examined in detail by Krey et al. (2012). Using dynamic simultaneous-equation data models, Can et al. (2019) pointed out that India would quadruple its CO2 emissions by 2050 if it maintained its forecasted economic growth, establishing that GDP growth is accompanied by growth in CO2 emissions. By analyzing data from 105 countries, Sugiawan et al. (2019) cautioned that any cut in a nation's CO2 emissions would negatively impact its sustainable well-being. Mensah et al. (2019) used pooled mean group estimation with a panel autoregressive distributed lag model to demonstrate bilateral long-term and short-term causal links, for all panels, between fossil fuel consumption and economic growth as well as between fossil fuel consumption and CO2 emissions.

Chandran Govindaraju and Tang (2013) suggested that a nation's CO2 emissions are proportional to its economic might, which reflects its level of industrialization. Arvin et al. (2015) investigated the relationship between CO2 emissions, urbanization, and economic growth in the context of the G-20 countries. Heidari et al. (2015) reported a nonlinear relationship between per-capita energy consumption, per-capita CO2 emissions, and per-capita gross domestic product (GDP). Begum et al. (2015) and Sohag et al. (2015) analysed the connections between economic growth, population growth, technological innovation, trade openness, and CO2 emissions in the case of Malaysia. Saidi and Hammami (2015) demonstrated the positive impact of energy consumption and CO2 emissions on economic growth by analysing data from 58 countries collected over the period 1990–2012, using simultaneous-equation models. Omri et al. (2014) also analysed the causal interaction between CO2 emissions and economic growth using dynamic simultaneous-equation data models and showed their positive impact on each other.

Udemba et al. (2021) found a unidirectional relationship between energy utilization and income (GDP), and therefore suggested that conservative energy policies should be avoided by the Indian government lest they negatively impact GDP. Shahnazi and Shabani (2021) concluded that the carbon emissions of a nation and its neighbours are positively correlated. By analyzing 32 countries in Sub-Saharan Africa, Adedoyin et al. (2021) showed that CO2 emissions were increased by both real GDP and non-renewable energy. For the emerging industrialized seven (E7) economies, Gyamfi et al. (2021) empirically found that both real GDP and coal rent had a positive effect on CO2 emissions; they also reported that a 1% increase in GDP causes a 0.400% increase in pollution emissions.

Magazzino et al. (2021) investigated the relationships among Information and Communication Technology (ICT) usage, economic growth, electricity consumption, and environmental pollution for 16 European Union countries. They stated that ICT usage increases electricity consumption, which in turn increases both CO2 emissions and GDP: a 1% increase in the economic growth rate raises per-capita electricity consumption by 0.13%. Nguyen et al. (2021) found that for the developed G-6 countries, capital market expansion, trade openness, and economic growth were the main causes of increased carbon emissions. Chen et al. (2018) empirically demonstrated that per-capita GDP is proportional to CO2 emissions. In the context of Iran, Mousavi et al. (2017) showed, using historical data, that the increase in CO2 emissions is largely due to economic activity. Chaabouni and Saidi (2017) experimentally established a bidirectional causality between CO2 emissions and per-capita GDP; in a case study of 51 countries, they found that a 1% increase in CO2 emissions grows the economy by 0.011%, while 1% economic growth corresponds to a 0.263% increase in CO2 emissions. Acheampong (2018) analysed the dynamic causal relationships among carbon emissions, energy consumption, and economic growth in 116 countries, showing that carbon emissions positively cause economic growth. Stern (2010) estimated the global carbon-income elasticity at 1.509, suggesting that increased carbon emissions increase GDP. Liddle (2015) also pointed out that the per-capita CO2 elasticity of GDP is positive and monotonic. Sarkodie and Owusu (2017) reported that, for Ghana, 1% increases in energy use, GDP, and population increase CO2 emissions by 0.58%, 0.73%, and 1.30%, respectively.
All of the above and many other research works may be the reason why Article 2 of the 'Paris Agreement' is lenient towards developing economies: it allows them more time to reach their peak carbon emissions, as this was deemed necessary for their development needs (http://unfccc.int/files/essential_background/convention/application/pdf/english_paris_agreement.pdf).

A novel way of predicting GDP using CO2 emissions from solid, liquid, and gaseous fuels was reported by Marjanovic et al. (2016). Shukla et al. addressed a similar problem by employing fuzzy sets and random fuzzy variables to model uncertainty in the input dataset (2018a, 2018b): using carbon emission data, they predicted GDP in Shukla et al. (2018a) and the human development index in Shukla et al. (2018b). Using single-source TL, Kumar and Muhuri (2019) predicted the missing per-capita GDP of numerous nations from their carbon emissions. Aghamaleki and Baharlou (2018) employed isolation forest (IF) with TL for noisy web data classification. Kumar et al. (2020) used IF for anomaly detection in multi-source UTL for missing GDP prediction using carbon emissions, and further extended this work in Kumar et al. (2021), highlighting the strengths and weaknesses of their approach. The approach proposed in this paper is evaluated over eleven different target domains with varying distributions: for a thorough evaluation of the proposed TSmUTL, data from eleven different countries, drawn not only from developing economies but also from developed economies, are used as target domains. Improving upon previous studies, the proposed approach automates the transfer learning step of extracting target-domain-specific relevant training data from readily available similar domains for robust ML modelling, which is pragmatic for many other new real-world applications facing data scarcity.

3 Basic definitions

This paper uses TL to estimate the per-capita GDP of a nation from its carbon emissions. The TL categorization, based on domain and task specifications, is defined as follows (Pan and Yang 2009; Behbood et al. 2015):

  • Domain: A domain comprises \(\left\{\text{F},\text{P}\left(\text{X}\right)\right\}\), where \(\text{X}=\{{\text{x}}_{1},\dots ,{\text{x}}_{\text{n}}\}\in \text{F}\); here \(\text{F}\) denotes the feature space and \(\text{P}\left(\text{X}\right)\) denotes the marginal probability distribution.

  • Task: A task is associated with every domain and is represented as \(\{\text{Y},\text{f}\left(.\right)\}\), where \(\text{Y}=\left\{{\text{y}}_{1},\dots ,{\text{y}}_{\text{m}}\right\}\). Here, \(\text{Y}\) denotes the label space, while the objective function is denoted by \(\text{f}\left(.\right)\). This objective function \(f(.)\) is learned during training over data–label pairs \(({\text{x}}_{\text{i}},{\text{y}}_{\text{i}})\) so as to predict new, unseen instances.

  • Transfer learning: In the TL framework, there are two types of domains, the source domain \({\text{D}}_{\text{s}}\) and the target domain \({\text{D}}_{\text{t}}\), associated with a source task \({\text{T}}_{\text{s}}\) and a target task \({\text{T}}_{\text{t}}\), respectively. To solve the target domain problem more precisely, the learning function \({\text{f}}_{\text{t}}(.)\) is optimized by also extracting the necessary knowledge from the source domain and source task (i.e. \({\text{D}}_{\text{s}}\) and \({\text{T}}_{\text{s}}\)). However, if \({\text{D}}_{\text{s}}={\text{D}}_{\text{t}}\) and \({\text{T}}_{\text{s}}={\text{T}}_{\text{t}}\) (i.e. both the domains and their tasks are equal), then the problem reduces to classical machine learning.

  • Transductive transfer learning: This type of TL occurs when \({D}_{s}\) is not equal to \({D}_{t}\), i.e. either \({F}_{s}\ne {F}_{t}\), or \({P}_{s}\left(X\right)\ne {P}_{t}\left(X\right)\), or both.

  • Inductive transfer learning: In this type of TL, the source task \({\text{T}}_{\text{s}}\) is unequal to the target task \({\text{T}}_{\text{t}}\). A few labelled instances are essential in the target task to induce \({\text{f}}_{\text{t}}(.)\) for better performance in the target domain.

  • Unsupervised transfer learning (UTL): This type of TL exists when \({\text{D}}_{\text{s}}\) \(\ne\) \({\text{D}}_{\text{t}}\) (i.e. the source domain is not equal to the target domain) and no labels are available in either domain.

4 Proposed target specific multi-source UTL

This section provides the details of the proposed target specific multi-source UTL (TSmUTL). The first sub-section describes the dataset, the second explains the proposed TSmUTL via a flowchart and its algorithmic steps, and the last explains the RMSRE evaluation metric.

4.1 Datasets

The data used in this paper are taken from the World Bank database (https://data.worldbank.org/), in which the per-capita GDP values of various nations such as Afghanistan, Albania, Angola, Hungary, Iraq, Libya, Myanmar, Poland, Syria, Vietnam, and Yemen are missing. The durations for which the GDP of these countries is missing in the World Bank database are listed in Table 2. To predict the missing GDP of these nations, only their carbon emission data are used, with the input parameters described in Table 2. As the task is to estimate per-capita GDP, Table 2 also shows the output parameter.

Table 2 Input parameters, output parameters, and countries with missing per-capita GDP

All the input and output parameters (Table 2) of each nation span the years 1960 to 2016. Carbon emissions are extracted from the World Bank database (https://data.worldbank.org/) as percentages (%) for gaseous, liquid, and solid fuels, and are then converted from percentages to real values by taking per-capita CO2 emissions in metric tons as the base value. Carbon emissions from these three kinds of fuel are used for GDP prediction because the type of fuel used also affects the GDP of a country significantly (Garba and Bellingham 2021). Use of traditional solid fuel in households is very common when energy resources are very limited; as economic progress is achieved, household energy needs are met by shifting to liquid fuel (e.g. kerosene); and finally, when the economy becomes prosperous, clean cooking fuel (gaseous, e.g. LPG/PNG) is widely used, indicating high living standards. Therefore, this paper not only exploits the indirect relationship between carbon emissions and GDP, but also utilizes the non-linear relationships among carbon emissions from gaseous, liquid, and solid fuels for precise estimation of the missing GDP of a nation. The final input parameters used to train an ML model and the output label are depicted in Table 2.
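As an illustration of the percentage-to-absolute conversion described above, the following Python sketch multiplies each fuel's emission share by the per-capita CO2 total; the column names are hypothetical, and the actual World Bank indicator names differ:

```python
import pandas as pd

# Hypothetical column names for illustration; the actual World Bank
# indicator names differ. Shares are percentages of total CO2 emissions.
df = pd.DataFrame({
    "co2_total_pc": [0.8, 1.1],   # per-capita CO2 emissions (metric tons)
    "solid_pct":  [60.0, 50.0],   # emissions from solid fuel (%)
    "liquid_pct": [30.0, 35.0],   # emissions from liquid fuel (%)
    "gas_pct":    [10.0, 15.0],   # emissions from gaseous fuel (%)
})

# Convert each percentage share into an absolute per-capita value
# (metric tons), using the per-capita CO2 total as the base value.
for fuel in ("solid", "liquid", "gas"):
    df[f"{fuel}_pc"] = df[f"{fuel}_pct"] / 100.0 * df["co2_total_pc"]
```

The three resulting absolute columns then serve as the model inputs, with per-capita GDP as the label.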

4.2 Target specific multi-source UTL (TSmUTL)

The data of the countries with missing GDP (Table 2) are not sufficient for training a robust supervised machine learning model. To overcome this limitation, this paper proposes a novel target specific multi-source UTL (TSmUTL) methodology that employs IF to increase the relevant dataset size for any specific target task. The multi-source domain data used in this paper comprise data collected from a total of 78 countries, including the European Union, which is a grouping of 28 countries (as the data extend up to the year 2016, before Brexit). These countries are the following:

‘Burkina Faso’, ‘Bangladesh’, ‘Egypt’, ‘Cameroon’, ‘Tunisia’, ‘Cuba’, ‘Kenya’, ‘Jordan’, ‘Iran’, ‘Nepal’, ‘South Africa’, ‘India’, ‘Pakistan’, ‘Zambia’, ‘Ghana’, ‘Zimbabwe’, ‘Nigeria’, ‘Sudan’, ‘Indonesia’, ‘Thailand’, ‘Haiti’, ‘Ethiopia’, ‘Austria’, ‘Canada’, ‘Australia’, ‘Sri Lanka’, ‘Eritrea’, ‘New Zealand’, ‘Japan’, ‘France’, ‘Germany’, ‘Italy’, ‘United Kingdom’, ‘Norway’, ‘Spain’, ‘Sweden’, ‘Madagascar’, ‘Netherlands’, ‘Portugal’, ‘Finland’, ‘Greece’, ‘Central African Republic’, ‘Benin’, ‘Chad’, ‘Fiji’, ‘Equatorial Guinea’, ‘China’, ‘Papua New Guinea’, ‘Egypt’, ‘Malaysia’, ‘Belarus’, ‘Armenia’, ‘Bulgaria’, ‘Bosnia and Herzegovina’, ‘Czech Republic’, ‘United States’, ‘Tajikistan’, ‘Georgia’, ‘Turkey’, ‘Uzbekistan’, ‘Ukraine’, ‘Argentina’, ‘Brazil’, ‘Venezuela’, ‘Mexico’, ‘Switzerland’, ‘Peru’, ‘Algeria’, ‘Morocco’, ‘Bhutan’, ‘Mauritius’, ‘Luxembourg’, ‘Uganda’, ‘Mozambique’, ‘Belgium’, ‘Philippines’, ‘Ireland’, ‘European Union’

These countries are chosen randomly, while ensuring that developing and developed countries from the diverse geographies of six continents (excluding Antarctica) are well represented. The multi-source domain data thus combine data from 78 countries with diverse data distributions. Hence, if they are used directly to build an ML model to predict the missing GDP of a target domain country (Table 2), the result may be very poor precision. To overcome this limitation, this paper proposes target specific multi-source UTL (TSmUTL). The flowchart of this technique is depicted in Fig. 1, and its steps are provided in Algorithm 1.

figure a

Algorithm 1: Target specific multi-source UTL (TSmUTL)

Fig. 1
figure 1

Proposed TSmUTL approach

The first step of this algorithm uses the target domain data to train an IF model. In Step 2, non-useful/redundant data points of the multi-source domain dataset which do not follow the data distribution of the target domain are removed using the IF trained in Step 1. This pre-processed multi-source domain is then used to train different ML models in Step 3. In Step 4, these trained ML models are measured for their precision in predicting the available GDP of the target domain using a suitable evaluation metric. In Step 5, the most precise ML model is employed to estimate the missing GDP of the target domain.
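The five steps above can be sketched as follows. This is a minimal illustration using synthetic stand-in data and scikit-learn's IsolationForest and SVR; the actual features, contamination values, and candidate models are those described in Table 2 and Sect. 5:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-ins: three emission features per year (57 years of
# target data), and a multi-source pool containing a divergent cluster.
X_target = rng.normal(0.0, 1.0, size=(57, 3))
y_target = rng.normal(1000.0, 100.0, size=57)      # available per-capita GDP
X_source = np.vstack([rng.normal(0.0, 1.0, size=(500, 3)),
                      rng.normal(5.0, 1.0, size=(100, 3))])
y_source = rng.normal(1000.0, 100.0, size=600)

# Step 1: train the IF on the target-domain inputs only.
iso = IsolationForest(contamination=0.1, random_state=0).fit(X_target)

# Step 2: keep only multi-source points that the IF marks as inliers (+1)
# with respect to the target-domain distribution.
mask = iso.predict(X_source) == 1
X_kept, y_kept = X_source[mask], y_source[mask]

# Step 3: train a regression model (SVR here) on the filtered pool.
model = SVR().fit(X_kept, y_kept)

# Step 4: score the model on the target's available labels via RMSRE;
# Step 5 would then use the best-scoring model for the missing years.
pred = model.predict(X_target)
rmsre = np.sqrt(np.mean(((pred - y_target) / y_target) ** 2))
```

In the full method, Step 3 trains several regressors (GRNN, SVR, ELM) and Step 5 selects the one with the lowest RMSRE.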

4.3 Evaluation metric

To evaluate the performance of these models, the root mean square relative error (RMSRE) is used as the evaluation metric instead of the root mean square error (RMSE). RMSE and RMSRE (Cody 1969; Despotovic et al. 2015) are mathematically defined in Eqs. (1) and (2), respectively.

$${\text{RMSE}}=\sqrt{\frac{{\sum }_{{\text{i}}=1}^{N}{\left({E}_{\text{i}}-{O}_{\text{i}}\right)}^{2}}{N}}$$
(1)
$${\text{RMSRE}}=\sqrt{\frac{{\sum }_{{\text{i}}=1}^{N}{\left(\frac{{E}_{\text{i}}-{O}_{\text{i}}}{{O}_{\text{i}}}\right)}^{2}}{N}}$$
(2)

In Eqs. (1) and (2), \({E}_{\text{i}}\) is the estimated/predicted value, \({O}_{\text{i}}\) is the actual/available value, and \(N\) is the total number of observations. The smaller the values of RMSE and RMSRE, the better the prediction accuracy; that is, smaller RMSE and RMSRE values are desirable.
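For concreteness, both metrics can be implemented in a few lines of Python; this is a direct transcription of Eqs. (1) and (2), where RMSRE assumes the observed values are non-zero:

```python
import math

def rmse(estimated, observed):
    """Root mean square error, Eq. (1)."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)

def rmsre(estimated, observed):
    """Root mean square relative error, Eq. (2); observed values must be non-zero."""
    n = len(observed)
    return math.sqrt(sum(((e - o) / o) ** 2 for e, o in zip(estimated, observed)) / n)
```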

5 Experimental evaluation

In this section, the proposed TSmUTL approach is first empirically evaluated for its efficacy in precisely predicting missing per-capita GDP. Experiments are conducted for 11 different countries, viz. Afghanistan, Albania, Angola, Hungary, Iraq, Libya, Myanmar, Poland, Syria, Vietnam, and Yemen, which have missing per-capita GDP (Table 2). Finally, the current results are discussed with respect to the previous study (Kumar et al. 2021), and the usefulness of the current findings for policy making is suggested.

For the removal of outliers by the trained IF (Pedregosa et al. 2011) in the TSmUTL approach (Fig. 1 and Algorithm 1), various contamination values are selected for a thorough analysis of the proposed approach in effectively removing non-useful data points from the multi-source domain dataset. The contamination values lie in the range [0, 0.2]: the minimum value 0 means that all data points of the multi-source domain dataset are used (no outliers are removed), while the maximum value 0.2 means that at least 20% of the data points of the multi-source domain are removed as outliers. The upper limit is kept at 0.2 because this is sufficient to depict the effect of anomaly/outlier removal in the proposed TSmUTL. Since the data of each country span the years 1960 to 2016 (57 years), if for some contamination value the removal of outliers leaves fewer than 57 data points in the multi-source domain, no experiments are conducted for that contamination value. The empirical findings for each country in Table 2 are thoroughly described below.
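The contamination sweep described above can be sketched as follows, again with synthetic stand-in data. Note that scikit-learn requires contamination > 0, so the no-removal case (contamination 0) corresponds to simply skipping the IF filtering step:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
X_target = rng.normal(size=(57, 3))    # one country's inputs (stand-in)
X_source = rng.normal(size=(600, 3))   # multi-source pool (stand-in)

MIN_POINTS = 57  # one country's worth of yearly observations (1960-2016)

surviving = {}
for contamination in np.arange(0.025, 0.225, 0.025):  # 2.5% ... 20% grid
    iso = IsolationForest(contamination=float(contamination),
                          random_state=0).fit(X_target)
    kept = X_source[iso.predict(X_source) == 1]
    if len(kept) < MIN_POINTS:
        continue  # too few points survive filtering; skip this setting
    surviving[round(float(contamination), 3)] = len(kept)
```

Each surviving contamination setting would then be used to train and score the candidate regressors.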

From Fig. 2a it can be observed that, in the case of Afghanistan, the lowest RMSRE value for GRNN is observed at 17.5% contamination (0.175), and for SVR the lowest RMSRE is observed at 1% contamination with significant improvement, while ELM showed no improvement when outliers were removed, as its best performance is without any outlier removal by IF, at zero contamination. The experimental results for Albania are depicted in Fig. 2b, where the lowest RMSRE for GRNN, SVR, and ELM is obtained at the 20% contamination value of IF, as all three models showed a continuous improvement (decrease in RMSRE) from zero to 20% contamination. Figure 2c depicts the experimental results for Angola, where the lowest RMSRE values for GRNN and SVR are also witnessed at the 20% contamination value of IF with significant improvement, while the best ELM results are obtained at 2.5% contamination.

Fig. 2
figure 2figure 2

RMSRE error variations with respect to contamination percentage in IF for GRNN, SVR and ELM for each country: Afghanistan, Albania, Angola, Hungary, Iraq, Libya, Myanmar, Poland, Syria, Vietnam, and Yemen

In the case of Hungary (Fig. 2d), ELM improved steadily as the contamination value of IF was increased, achieving its lowest RMSRE at 17.5% contamination, while the best GRNN result is without any contamination and the best SVR performance is obtained at 2.5% contamination. As explained before, if for a contamination value the outlier removal from the multi-source domain leaves fewer points than a single country's data (i.e. 57 data points), the results for that contamination value are discarded. Thus, for Iraq the contamination values greater than 7.5% are discarded, as they remove almost all points as outliers and leave too few data points in the multi-source dataset, as shown in Fig. 2e. GRNN and SVR achieved their best accuracies at 1% contamination, and the best ELM results are without any contamination. For Libya (Fig. 2f), GRNN and SVR converge to the lowest RMSRE at 20% contamination, while ELM attained its best result at 2.5% contamination. For Myanmar (Fig. 2g), only GRNN improved its performance after outlier removal, at 1% contamination; the best performance of ELM and SVR is without any outlier removal, at zero contamination of IF.

For Poland, though all the ML methods seem to converge at higher contamination, the best performance for all three models (GRNN, SVR, and ELM) is obtained by setting 2.5% contamination in IF (Fig. 2h). For Syria (Fig. 2i), all three methods attained their best performance (lowest RMSRE) after removal of the outliers detected by IF at 1% contamination. For Vietnam (Fig. 2j), GRNN kept improving as the contamination value of IF was increased and converges to its optimum at 20% contamination, while ELM achieved its lowest RMSRE at 5% contamination and the best SVR RMSRE is without any outlier removal, at 0% contamination. For Yemen (Fig. 2k), GRNN performed best using the complete multi-source data without any outlier removal, while the best SVR and ELM accuracies are at 12.5% and 17.5% contamination, respectively.

Finally, from the above analysis of the results visualized in Fig. 2, it can be observed that the TSmUTL methodology improves the performance of at least two of the three regression techniques (viz. GRNN, SVR, and ELM) for 10 of the 11 countries; the exception is Myanmar (Fig. 2g), for which only GRNN shows improvement. For each country with missing GDP, whose experiments are depicted in Fig. 2, the best performing model is selected based on the lowest RMSRE in order to precisely predict the missing per-capita GDP of that country.

The best performing models for each of these countries are compiled in Table 3, along with the lowest RMSRE obtained at the corresponding contamination value of IF. As can be seen from Table 3, the proposed TSmUTL approach attained the lowest RMSRE, and hence better accuracy, for eight of the eleven countries; the exceptions are Afghanistan, Myanmar, and Vietnam. These best models are then used to predict the missing per-capita GDP of these countries, as depicted in Fig. 3.

Table 3 Best models for missing per-capita GDP prediction
Fig. 3
figure 3figure 3figure 3figure 3

Unavailable and estimated per-capita GDP of eleven different Countries: Afghanistan, Albania, Angola, Hungary, Iraq, Libya, Myanmar, Poland, Syria, Vietnam, and Yemen

In Fig. 3, two plots are shown for each country with missing GDP listed in Table 2. The first plot for each country depicts the missing duration of per-capita GDP with dotted red lines, while the solid blue line depicts the available per-capita GDP values. The second plot for each country depicts the per-capita GDP values predicted by the best model from Table 3; these predicted values are plotted with solid red lines.

For example, Fig. 3a depicts the duration of missing per-capita GDP of Afghanistan and is titled 'Afghanistan missing', while Fig. 3b depicts the predicted per-capita GDP for this missing duration, plotted with a solid red line, and is accordingly titled 'Afghanistan prediction'. Exactly the same presentation is used for the rest of the countries in Fig. 3. The prediction of Afghanistan's missing per-capita GDP in Fig. 3b shows that the predicted per-capita GDP initially rises but thereafter declines drastically, mainly due to the increase in internal turmoil/war-like conditions at that time. The prediction of Albania's missing per-capita GDP from 1960 to 1983 (Fig. 3d) shows that its economic condition varied only slightly and was at the same level as the per-capita GDP available from 1984 onward, depicted by the blue line in the figure. Similarly, from Fig. 3f it can be seen that Angola's predicted missing per-capita GDP shows a very minute increase over its missing duration from 1960 to 1979, with the 1979 prediction almost equal to the available per-capita GDP value in 1980. The predicted per-capita GDP of Hungary for its missing duration (1960 to 1990) in Fig. 3h shows a slight increase over the years.

The prediction of Iraq's missing per-capita GDP in Fig. 3j, from 1965 to 1967 and during its isolation from the international community after the Persian Gulf war (1990-1991), depicts a decline of its economy almost to its lowest level. For Libya's missing per-capita GDP prediction in Fig. 3l, there appears to be an almost linear increase in per-capita GDP that smoothly joins the available per-capita GDP from 1990 onwards; these predictions depict a constant rise in Libya's economic activity over its missing duration from 1960 to 1989. Myanmar, which remained largely isolated from the international community due to its military rule, shows ups and downs in its predicted missing per-capita GDP (Fig. 3n) from 1960 to 1999, indicating that its economic activity had slight variations during its military rule without any drastic improvement. Poland's missing per-capita GDP from 1960 to 1989, predicted in Fig. 3p, shows an upward trend up to 1979 and thereafter a slight decline till 1989, where it joins the blue line (the available per-capita GDP) in 1990, which in turn shows an upward trend mainly from 1995 onwards. This is corroborated by Gomułka (2016), who stated in their published work that "Macroeconomic destabilization in 1989 of Poland economy was worse than any of the real socialism based economies." Syria's missing per-capita GDP from 2008 to 2016, predicted in Fig. 3r, depicts a heavy decline of its economy, which appears to be synchronous with the civil unrest that started in the country during that period. Vietnam, which was in continuous war (1954-1975) until its unification in 1976, has its per-capita GDP missing from 1960 to 1984. This missing duration is predicted in Fig. 3t, which depicts extremely low economic activity that increases from 1975, declines afterwards, increases again from 1981 onwards, and joins the available per-capita GDP values (blue line) at their level in 1985. Yemen's per-capita GDP was missing from 1960 to 1989, and its predicted per-capita GDP in Fig. 3v indicates very little increase in its economic activity over this missing duration. Colton (2010) analyzed the structural changes in Yemen's economy after 1970, which collapsed after the Gulf crisis of 1990 and left most of Yemen's population with no means of survival. The prediction of missing per-capita GDP in Fig. 3v accords with Colton's analysis, as the red line (predicted per-capita GDP) increases slowly from 1960 to 1989 and joins the available per-capita GDP (blue line) in 1990, which collapses thereafter.

6 Overall discussions

This section provides a succinct comparative discussion of the TSmUTL approach proposed in this paper with the one provided in Kumar et al. (2021), in terms of methodologies/assumptions, prediction results, evaluation, and limitations. It also discusses a few policy implications.

6.1 Comparing the proposed TSmUTL with (Kumar et al. 2021)

6.1.1 Methodological supremacy of the proposed TSmUTL over (Kumar et al. 2021)

The TL-based GDP prediction approach proposed in Kumar et al. (2021) used a common source domain for training IF during multi-source UTL. This restricts the generalization capability of that approach when the target domain varies. To use their approach, it was necessary to collect an additional domain, termed Source Domain 1 in Kumar et al. (2021), which was required to be similar to the target domain for multi-source UTL. This made the approach vulnerable to the kind of similarity employed in collecting Source Domain 1 with respect to the target domain, and thus restricted its generalization to arbitrary target domains. For example, the Source Domain 1 used in Kumar et al. (2021) was collected from developing countries only; hence, it could not be used when the target domain was a developed country.

The current work aims to improve upon the above weakness of Kumar et al. (2021) by proposing TSmUTL, where the necessity of designing an extra source domain is completely removed. In sharp contrast to Kumar et al. (2021), this paper not only uses target specific multi-source UTL, but also replaces the evaluation metric to give equal importance to each year irrespective of the magnitude of the labels (per-capita GDP) in that year. The approach introduced in the current paper has been evaluated over eleven different target domains with varying distributions: for a thorough validation of the proposed TSmUTL, data from eleven different countries, from both developing and developed economies, have been used as target domains. Improving upon the previous study (Kumar et al. 2021), the proposed approach automates the transfer learning step of extracting target-domain-specific relevant training data from readily available similar domains for robust ML modelling. Accordingly, the current approach can pragmatically be used in many other new real-world applications to overcome their data scarcity issues.

6.1.2 Supremacy of the proposed TSmUTL over (Kumar et al. 2021) in terms of results

There is a stark contrast between the prediction of Afghanistan's missing per-capita GDP during 1982 to 2001 by the approach proposed in Kumar et al. (2021) and by the TSmUTL proposed in this paper. In Kumar et al. (2021), the predicted GDP was almost constant, with high magnitude, for the whole missing duration, even during the turmoil. Moreover, the available GDP values of 1981 and 2002 were smaller than the predictions for 1982 and 2001, and such a sudden variation in per-capita GDP is highly unlikely.

On the contrary, the GDP values predicted by the TSmUTL approach proposed in this paper show noticeable variations during that duration, depicting an increase initially and thereafter a huge decrease over the years. Similarly, for Iraq from 1965 to 1967, the GDP values predicted in Kumar et al. (2021) depicted a sudden increase, which was completely unrealistic, whereas the per-capita GDP values predicted by our proposed TSmUTL (Fig. 3j) show very small variation during those years, which is more realistic.

Also, during the years 1991 to 2003 (Gulf war duration 1990-1991), the GDP values predicted by our proposed TSmUTL are much lower than those predicted in Kumar et al. (2021). In Fig. 3n, the GDP values of Myanmar predicted by TSmUTL vary considerably, with smaller magnitude, during its isolation from the international community, whereas the GDP values of Myanmar predicted in Kumar et al. (2021) were very high and almost constant. Further, TSmUTL predicted far lower GDP for Syria in Fig. 3r during its ongoing civil war, since the country has been almost destroyed by the war; unrealistically, the approach proposed in Kumar et al. (2021) predicted high GDP values for this country even during its civil war.

The predicted GDP values of Vietnam were almost constant with high magnitudes in Kumar et al. (2021), but as can be seen in Fig. 3t, our proposed TSmUTL predicted lower GDP values with some increase for a few years in between; recall that Vietnam was in continuous war from 1954 to 1975, until its unification in 1976. The per-capita GDP of Yemen predicted in Kumar et al. (2021) was also very high during its missing duration, in contrast to the published work of Colton (2010), whereas the per-capita GDP predicted by our proposed TSmUTL approach (see Fig. 3v) is more realistic, with lower magnitude in synchronization with its economic turmoil.

We can conclude that the approach proposed in Kumar et al. (2021) could not be extended to economies of varying scales, a major limitation of that study. The TSmUTL approach proposed in this paper suitably addresses this limitation: it has been successful in predicting the missing GDP values of countries with insufficient macroeconomic data by considering target specific source domains with GDP data from multiple countries (both developed and developing) for training the ML models. This is the main reason behind the success of the proposed TSmUTL approach in predicting, with better accuracy, the GDP values of war-torn and isolated countries.

6.1.3 Supremacy of the proposed TSmUTL over (Kumar et al. 2021) in terms of evaluation and employed metric

Further, accurate evaluation of the GDP values predicted by the proposed TSmUTL approach is ensured by using the root mean square relative error (RMSRE) instead of the widely popular root mean square error (RMSE); Kumar et al. (2021), on the other hand, considered RMSE, the default choice of many regression models. The current paper has deliberately avoided RMSE for evaluating the accuracy of the prediction results. The reason is that the per-capita GDP of a country can be as low as $100 in the twentieth century and as high as $10,000 in the twenty-first century. In such scenarios, RMSE is highly susceptible to the magnitude of the per-capita GDP, because it is skewed towards the years in which the per-capita GDP values are higher and does not give every year equal importance.

Suppose a country had an actual per-capita GDP of $1000 in 1990 and of $5000 in 2015, while a trained ML model estimates/predicts (both terms are used interchangeably) the per-capita GDP for 1990 and 2015 as $500 and $4000, respectively. The percentage errors in the predictions for 1990 and 2015 are then 50% \( \left( {\frac{{\left( {\$ 1000 - \$ 500} \right)*100}}{{\$ 1000}}} \right) \) and 20% \( \left( {\frac{{\left( {\$ 5000 - \$ 4000} \right)*100}}{{\$ 5000}}} \right) \), respectively. As RMSE is magnitude dependent, it gives more weight to the error of 2015, which has the larger magnitude ($5000 − $4000 = $1000), than to the error of 1990, which is only $500 ($1000 − $500). This skewness of RMSE designates as best those ML models which accurately predict the per-capita GDP of the years with large magnitudes, giving less importance to the years in which the per-capita GDP magnitude is smaller. As a country can have missing per-capita GDP at any point of time (Table 2), an evaluation metric should give equal importance to each year irrespective of the per-capita GDP magnitude at that point of time.
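The arithmetic of this example can be verified directly; the short Python snippet below reproduces the 50% and 20% percentage errors and contrasts the resulting RMSE and RMSRE values:

```python
import math

observed  = [1000.0, 5000.0]   # actual per-capita GDP in 1990 and 2015
predicted = [500.0, 4000.0]    # hypothetical model estimates

# Percentage errors: 50% for 1990 and 20% for 2015.
pct_errors = [abs(p - o) / o * 100.0 for p, o in zip(predicted, observed)]

rmse  = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / 2)
rmsre = math.sqrt(sum(((p - o) / o) ** 2 for p, o in zip(predicted, observed)) / 2)
# RMSE (about 790.6) is dominated by the $1000 absolute error of 2015,
# although 1990's relative error (50%) is worse; RMSRE (about 0.381)
# weighs both years by their relative errors instead.
```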

On the other hand, RMSRE gives equal importance to each year's per-capita GDP irrespective of its magnitude. The per-capita GDP of a nation can be missing at any point in its modern history, be it as early as 1960 or as late as 2016; to precisely estimate the missing per-capita GDP at any such point, RMSRE is the better choice due to its magnitude unbiasedness. This is why this paper uses RMSRE as the evaluation metric of the proposed approach, which has also played a major role in ensuring a robust and precise prediction of the missing per-capita GDP in any previous year (be it 1960 or 2016), irrespective of the magnitude of the per-capita GDP at that point of time.

In summary, the improvements introduced in the proposed TSmUTL approach, specifically the target specific transfer learning and the adoption of the root mean square relative error (RMSRE) over RMSE, significantly impact model training and selection, resulting in better generalization than Kumar et al. (2021). TSmUTL enhances model generalization by ensuring that the source domain data are relevant to the target domain: it strategically filters out irrelevant source-domain data points which could otherwise compromise generalization on the target domain during training. Additionally, the use of RMSRE ensures that each year's GDP is given equal importance regardless of its magnitude, avoiding the bias towards higher GDP values inherent in Kumar et al. (2021). These enhancements enable TSmUTL to generalize effectively across different scenarios, providing more precise and unbiased GDP estimates, and make it highly effective for real-world problems in which only the relevant portion of the source domain should be used for training.

6.2 Limitations

Though the proposed TSmUTL approach is successful in predicting the missing per-capita GDP of war-torn and isolated countries with small errors, it has a few limitations, as mentioned below:

  (a) It requires some labelled data in the target domain to validate the best model.

  (b) The source domain should have the same feature space as the target domain.

  (c) In this approach, a completely divergent source domain will be eliminated during TL. Accordingly, some instances in the source domain need to have a distribution similar to that of the target domain; hence, the chosen source domain should be somewhat similar to the target domain to achieve enhanced TL.

6.3 Policy implications

It can be seen from the missing-GDP predictions for Iraq and Myanmar (Fig. 3j and n, respectively), and similarly from many other countries, that whenever a country is isolated from the rest of the world, its GDP falls drastically. As discussed in Sect. 2, many works in the literature (such as Nguyen et al. 2021; Magazzino et al. 2021) highlight that trade openness, economic growth, ICT usage and CO2 emissions are interlinked. Thus, when a country goes into isolation, for whatever reason, its GDP is heavily affected. For example, since Afghanistan has been in isolation since 2021, its economy will suffer and its GDP values will fall drastically in the subsequent years. When a country is cut off from the world by sanctions or for any other reason, its macroeconomic data is unavailable and its GDP cannot be calculated. In such situations it is very difficult to gauge the impact of any new government policy or international policy on that country's population. Our proposed model addresses this limitation capably: it can analyze the economic impact of such policies by estimating GDP even in the absence of macroeconomic statistics. This crucial, data-driven information goes a long way in helping lawmakers design and modify policies for their intended, precisely targeted impact on the population.

7 Conclusion

There are many nations whose GDP values are missing from the World Bank database, either because they are war-torn, isolated/inaccessible, or politically unstable. Using the indirect relationship between carbon emission and GDP, this paper has aimed to predict the missing GDP values of such nations from their carbon emissions from solid, liquid, and gaseous fuels. The available data for these countries is too scarce to build a robust and precise predictive regression framework. To overcome this limitation, carbon emission data from multiple sources (i.e., multiple countries) has been collected in this work. However, a regression ML model cannot use this multi-source collection directly to predict a nation's missing GDP precisely, owing to its large data-distribution differences. Therefore, this paper has proposed target-specific multi-source unsupervised transfer learning through the application of the isolation forest. By detecting and removing anomalies, the proposed methodology extracts relevant training data from the multi-source collection; including these anomalous points would otherwise have profoundly degraded the predictive capability of a regression ML model.
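The isolation-forest-based instance selection described above could be sketched roughly as follows. This is a simplified reading, assuming the forest is fit on the target-domain features and then used to screen source-domain rows; the `select_source_instances` helper, the synthetic two-feature data, and the contamination value are all illustrative, not the paper's exact procedure:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def select_source_instances(source_X, source_y, target_X,
                            contamination=0.25, seed=0):
    # Fit the forest on target-domain features, then keep only those
    # source-domain rows it scores as inliers (+1); anomalies (-1),
    # i.e. rows unlike the target distribution, are dropped.
    forest = IsolationForest(contamination=contamination, random_state=seed)
    forest.fit(target_X)
    keep = forest.predict(source_X) == 1
    return source_X[keep], source_y[keep]

# Synthetic demo: target-domain emission features cluster near the origin;
# the multi-source pool mixes similar rows with a divergent cluster far away.
rng = np.random.default_rng(0)
target_X = rng.normal(0.0, 1.0, size=(200, 2))
similar = rng.normal(0.0, 1.0, size=(30, 2))
divergent = rng.normal(10.0, 1.0, size=(10, 2))
source_X = np.vstack([similar, divergent])
source_y = np.arange(len(source_X), dtype=float)

kept_X, kept_y = select_source_instances(source_X, source_y, target_X)
# The divergent cluster is screened out of the training pool, leaving only
# source instances whose distribution resembles the target domain.
```

A regression model trained on `kept_X, kept_y` then sees only source-domain countries whose emission profiles resemble the target country's, which is the intent of the target-specific filtering step.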

Experiments were conducted using different regression ML techniques, covering a wide range of countries with different types of economies. To evaluate the performance of these models, the root mean square relative error is used instead of the root mean square error; this accords equal importance to any year's missing GDP irrespective of the GDP's magnitude in that year. The proposed methodology has proved very effective in accurately predicting a nation's missing GDP by improving the learning capability of the regression ML model. The predictions are found to agree with other published literature describing the economies of these nations during the years with missing data. This unsupervised, automated transfer learning technique for extracting relevant training data from a large pool of related datasets can help build more robust ML models that are applicable to many other new real-world problems by overcoming their data-scarcity issues.

In future research, this approach may be made more potent and employed to solve other similar real-world problems. Future research may also aim to extend this work by exploiting the concepts of deep transfer learning. In addition, further research may target more robust data-modelling techniques that precisely capture noise and inherent uncertainties. Importantly, our proposed approach is general in nature; hence, as stated earlier, it may suitably be utilized in diverse real-life areas with similar applications, such as the health, banking and financial sectors.