Assessing the measurement invariance of Free Will and Determinism Plus scale across four languages: a registered report.

Duan S; Zhou C; Liu Q; Gong Y; Dou Z; Li J; Chuan-Peng H

doi:10.1098/rsos.220876


Affiliations

1. School of Psychology, Nanjing Normal University, Nanjing 210024, People's Republic of China (Duan S, Zhou C, Chuan-Peng H).
2. GeseDNA Research Team, Beijing 100016, People's Republic of China (Liu Q, Dou Z).
3. Faculty of Education, Monash University, Melbourne 3800, Australia (Gong Y).
4. School of Teacher Education, Dali University, Dali 671003, People's Republic of China (Li J).


Royal Society Open Science, 13 Nov 2024, 11(11):220876
https://doi.org/10.1098/rsos.220876 PMID: 39539506 PMCID: PMC11557234


This article is based on a previously available preprint.

Abstract

Free will is assumed to be the core of an individual's self-concept. Belief in free will has been studied extensively and was found to be correlated with many behavioural and psychological outcomes. Although developed and validated in the West, the Free will and Determinism Plus (FAD-Plus) scale has been translated, used, and interpreted as a measurement of free will beliefs in multiple cultures. However, the cross-cultural measurement invariance of FAD-Plus has not been examined. Given the cultural differences in understanding the concept of 'free will', items of FAD-Plus may have different interpretations in different cultures, which may compromise its cross-cultural measurement invariance. To provide empirical evidence for the lack of cross-cultural measurement invariance, we collected data in China and analyzed these data together with open datasets of FAD-Plus in three other languages: Japanese, French and English. We found only partial measurement invariance between the Chinese and English datasets, as well as the Japanese and English datasets. These results provided the first assessment of cross-cultural measurement invariance of FAD-Plus. We discussed the potential implications of the current study for future studies in the field.


Siqi Duan, Conceptualization, Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review and editing,#¹,²,† Chenghao Zhou, Conceptualization, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review and editing,#¹,³,† Qinglan Liu, Data curation, Investigation, Resources,⁴ Yixin Gong, Writing – original draft, Writing – review and editing,⁵ Zenan Dou, Investigation, Resources,⁴ Jingguang Li, Investigation, Resources,⁶ and Hu Chuan-Peng, Conceptualization, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review and editing¹


Associated Data

Data Availability Statement

Materials: All materials used for data collection in China are available at [76].

Code and Raw data: All de-identified raw data (in CSV format), and related R scripts are also available at: [77] and [76].

The datasets used in our study came from multiple sources. More specifically, we included several open or shared datasets from the following studies (classified by language):

The Chinese dataset was newly collected and is available at [76].

The English dataset was from three recent papers: [36–38].

The French dataset was from: [17].

The Japanese dataset was from two papers: [19,23].

Supplementary material is available online [78].


Keywords: FAD-plus, measurement invariance, measurement equivalence, free will, cross-cultural research


1. Introduction

Free will is the core of an individual’s self-concept as a subject capable of rational, independent thinking, and decision-making [1]. The lay belief in free will, that people have the capacity to act freely or could have chosen to do otherwise [2], has been studied extensively in psychology [1,3–5]. Belief in free will is the general belief of lay people that human behaviour is free from internal and external constraints across situations for both self and others [4]. It is a unique psychological construct as it focuses on the capacity for choice and constraints from both the individual and environment (see Feldman [4] for detailed discussion). Previous studies revealed that belief in free will is associated with a variety of psychological/behavioural outcomes, e.g. life satisfaction, across different cultures [6–9].

Quantitative studies of belief in free will rest on measurement, and researchers have developed multiple scales for this purpose. As the ‘free will notion … is the heart of Western religious, philosophical, and legal understandings of moral responsibility’ [10], it is not surprising that scales of belief in free will were all developed in the West. Since the first measurement of free will beliefs [11], various scales of belief in free will have been developed. Examples include the 19-item Free Will-Determinism Scale from Stroessner and Green [12], which has a three-factor structure (religious-philosophical determinism, psychosocial determinism and libertarianism), and the 29-item Free Will Inventory (FWI) [13], which measures the strength of people’s belief in free will, determinism, dualism and responsibility.

The scales of free will beliefs, however, are used in different cultural contexts. In some cases, researchers compared beliefs in free will across different cultures based on the observed scores [14,15]. The most widely used scale is the Free will and Determinism Plus (FAD-Plus) scale, which was developed by Paulhus and Carey [16]. FAD-Plus inherited the compatibilism from Stroessner and Green [12] and Rokas [10]. It includes four dimensions: free will, scientific determinism, fatalistic determinism and unpredictability. The free will sub-scale measures people’s belief in free will, which has items such as ‘People have complete control over the decisions they make’. The scientific determinism sub-scale measures the belief in scientific causality, with items such as ‘People’s biological makeup determines their talents and personality’. Fatalistic determinism measures the belief in fate, with items like ‘I believe that the future has already been determined by fate’. The scientific determinism and fatalistic determinism sub-scales are distinguished by the type of deterministic thinking [16]. The fourth sub-scale, unpredictability, measures the belief in the unpredictable and unknowable, which is a conceptual dimension independent of free will, belief or determinism [16].

FAD-Plus has already been translated into at least five different languages [17–20], including several Chinese versions (e.g. [21,22]) and two Japanese versions [19,23]. These translated versions of FAD-Plus are vital for comparing and exploring the associations between free will beliefs and a variety of psychological outcomes across different cultures [18–22,24]. For example, Paulhus and Carey [16] found that the Free Will dimension has a strong positive correlation with Extraversion and that Fatalistic Determinism has a strong positive correlation with Neuroticism in a Canadian sample. In Goto’s [19] study, the Free Will dimension showed a strong positive correlation with patience in a Japanese sample. An Italian study found that Scientific Determinism has a strong correlation with Self-Representations, Extraversion and Other Representations [18]. After reviewing 36 studies from regions including the United States, Singapore, Hong Kong, India, Turkey and Germany, Lam [1] found that, overall, people believed in free will across regions. Lam [1] also found that beliefs in free will are associated with different actions in different cultural backgrounds. However, like other measurements of belief in free will, FAD-Plus was developed in English based on a narrow sample of the population, i.e. North American or European undergraduates in their 20s [1]. So far, no study has compared these translated versions of FAD-Plus to the English version in terms of measurement invariance.

Measurement invariance (or measurement equivalence) is a psychometric notion that a scale or task measures the same concept across different groups, situations and times [25–27]. Without testing measurement invariance, we cannot directly compare the scores or latent constructs of a scale across different cultures since they may measure different psychological concepts [28,29]. Milfont and Fischer [29] identified four levels of measurement invariance. The first and lowest level is configural measurement invariance, which means the scale shares the same structure (i.e. the number of latent constructs) across each measured group. For FAD-Plus, configural invariance means that data from different languages should exhibit the same four-dimensional structure. The second level is metric invariance (also called weak invariance), which means the scale has the same factor loadings for each item. If this level of measurement invariance holds for FAD-Plus, then the versions in different languages not only share a four-dimensional structure, but each item’s loading on its dimension is also similar. A higher level, scalar invariance (or strong invariance), requires equal item intercepts, indicating equal mean scores for each item across different groups. At this level of measurement invariance, researchers can compare the means of latent variables (e.g. four dimensions of FAD-Plus) across different groups. The fourth and final level, strict invariance (also known as error variance invariance or full uniqueness measurement invariance), means that item residual variances are equal across groups. This level of measurement invariance is not essential for comparing the means of the scale across different groups [30]; thus, we will not include this level of measurement invariance in our analyses.
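The four levels described above can be summarized in standard multi-group factor-analytic notation. The symbols used here (x, τ, Λ, η, ε, Θ and the group index g) are conventional CFA notation introduced for illustration only; they are not taken from the original text.

```latex
% Measurement model for group g (g = 1, ..., G)
\begin{align*}
  x_{g} &= \tau_{g} + \Lambda_{g}\,\eta_{g} + \varepsilon_{g},
  & \operatorname{Cov}(\varepsilon_{g}) &= \Theta_{g} \\[4pt]
  \text{Configural:} &\quad \text{same pattern of zero and non-zero entries in every } \Lambda_{g} \\
  \text{Metric (weak):} &\quad \Lambda_{1} = \Lambda_{2} = \dots = \Lambda_{G} \\
  \text{Scalar (strong):} &\quad \text{metric, plus } \tau_{1} = \tau_{2} = \dots = \tau_{G} \\
  \text{Strict:} &\quad \text{scalar, plus } \Theta_{1} = \Theta_{2} = \dots = \Theta_{G}
\end{align*}
```

Each level adds equality constraints to the previous one, so the models are nested and a higher level can hold only if all lower levels hold.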

We infer that even metric invariance (i.e. weak invariance) may not hold across the FAD-Plus scale in different languages, for the following reasons. First, the phrase ‘free will’ is not a native concept in many non-Western languages [31]. As mentioned above, the concept of free will is deeply rooted in Western cultural tradition and was imported to other cultures. For example, in Chinese and Japanese, the phrase ‘free will’ (自由意志) is translated philosophical jargon that combines two existing words, ‘free’ (自由) and ‘will’ (意志). In such cases, ‘free will’ often functions as an academic or technical term, which many people, particularly those without higher education, may struggle to understand or may misinterpret. Moreover, in China, ancient philosophers expressed ideas that are similar to ‘free will’ but with different connotations. For example, Confucius, one of the most important ancient philosophers in China, has a famous saying: ‘七十而从心所欲不逾矩’, which was translated as ‘at seventy, I followed what my heart desired without overstepping the line’ [32]. Confucius highlighted an ideal state in which the will of a decent person (君子) is naturally consistent with the norms, instead of against external/internal constraints. This tradition may cause different interpretations of items such as ‘people have complete free will’. For a Confucian, ‘complete free will’ might refer to achieving that ideal state, a meaning that might not occur to a responder from the West. Another difference may exist in the interpretation of ‘fate’: while in the Western world, fate is usually associated with the will of God, in Buddhism’s view, fate is part of the laws of nature (自然法则). These differences may cause different interpretations for items such as ‘Fate already has a plan for everyone’ and ‘I believe that the future has already been determined by fate’. These cultural and societal factors may lead to further differences in the interpretation of items within the FAD-Plus. These differences, in turn, may cause different loadings of items on latent factors, or worse, cause different configural structures of FAD-Plus. In other words, these cultural differences may result in non-invariance even at the lowest level. Furthermore, measurement invariance is important not only for comparisons between countries/regions but also for sub-groups that use the same language or live within the same region. For example, education may play an important role in free will belief, as higher education is usually associated with more knowledge about science. Thus, it is important to compare the measurement invariance of FAD-Plus across sub-groups with different educational levels (e.g. different Chinese samples).

The unknown status of measurement invariance makes it difficult to interpret the results from cross-cultural comparison of belief in free will (see similar views in other cross-cultural studies [33,34]). If the cross-language measurement invariance of FAD-Plus does not hold, researchers would need to reconsider what FAD-Plus measures in different cultures, how to conceptualize this construct in a global context, and how to design studies to measure it, such as through a multi-national collaboration project (e.g. [35]). The present study aims to assess the measurement invariance of FAD-Plus across four different languages. We collected a new dataset in China using an improved version of the Chinese FAD-Plus and retrieved open data from three other languages: English, French and Japanese. The Chinese data were collected from a diverse Chinese sample, using a Chinese translation of FAD-Plus that follows the standard translation procedure. Data for FAD-Plus in other languages were retrieved online, such as the English version [36–38], or generously shared by the authors, e.g. the Japanese [19,23] and the French version [17]. This project provided insights into measuring free will beliefs across different cultures.


2. Method

2.1. Data

The Chinese data were collected by three Chinese teams (SD, CHZ, YG and HCP; QLL and ZD; JL) immediately after Stage 1 of the Registered Report was accepted in principle, and each team aimed to recruit at least 400 valid participants. One team collected data from 572 high school students, but only 356 passed the validation checks. The other two teams collected valid data from 455 and 437 participants. The sample sizes exceeded 200 per group, as recommended by Koh and Zumbo [39] for measurement invariance analysis, and this was also the case for our sub-group analysis between different sites. Participants covered a wide range of ages, educational levels and locations. The three teams were based in three cities: SD, CHZ, YG and HCP collected data in Nanjing (in South China), QLL and ZD are based in Beijing (in North China) and JL is based in Dali (in Southwest China). These three teams targeted different samples: college students, general young adult internet users who were undertaking online meditation training, and high school students, respectively. We collected socio-demographic variables, including gender, age, educational level, monthly family income and foreign experience. These variables were used to assess the diversity of our samples. This study was conducted according to the Declaration of Helsinki. The research protocol was approved by the Institute Review Board of Nanjing Normal University (No. NNU202110002). All participants were clearly informed, and their consent was obtained before collecting any data. See electronic supplementary material, S1.3, for more details about the data collection procedure.

Datasets of FAD-Plus in English, French and Japanese were retrieved from available sources. For the English dataset, we combined data from three studies on OSF: Earp et al. [36], which included four studies (1A, 1B, 1C and 2) conducted in England and the US that examined the relationship between belief in free will and humility; Post and Zwaan [38], which studied the value of believing in free will in North America and the Netherlands; and Nadelhoffer et al. [37], which explored the relationship between cheating behaviour and belief in free will in the US. In total, the English dataset comprised 3256 participants with a mean age of 28.85 (s.d. = 15.15). The French data were collected in Belgium by Caspar et al. [17], who examined the reliability of the French translation of FAD-Plus with a sample size of 904 and a mean age of 26.29 (s.d. = 8.68). The Japanese data were collected using two versions of the translation [19,23], which we separated into two datasets: the early version with 3727 participants (as listed in table 1) and a mean age of 38.15 (s.d. = 10.62), and the later version with 800 participants and a similar mean age.

2.2. Material

2.2.1. The Free will and Determinism Plus (FAD-Plus) scale

As mentioned above, the FAD-Plus was developed by Paulhus and Carey [16]. This scale consists of four subscales: Free Will (7 items), Scientific Determinism (7 items), Fatalistic Determinism (5 items) and Unpredictability (8 items). Each item is rated on a 5-point Likert scale: ‘1 = strongly disagree, 5 = strongly agree’. No item needs reverse scoring.
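As an illustration of the scale structure just described, the following minimal sketch scores one respondent on the four subscales. The item-to-subscale key is hypothetical (contiguous item numbers are placeholders that only reproduce the 7/7/5/8 item counts), and averaging rather than summing the ratings is an assumption; the published scoring key should be used in practice.

```python
from statistics import mean

# HYPOTHETICAL item key: contiguous item numbers are placeholders that only
# match the subscale sizes (7, 7, 5, 8). Replace with the published key.
SUBSCALE_ITEMS = {
    "free_will": list(range(1, 8)),                 # 7 items
    "scientific_determinism": list(range(8, 15)),   # 7 items
    "fatalistic_determinism": list(range(15, 20)),  # 5 items
    "unpredictability": list(range(20, 28)),        # 8 items
}


def score_fad_plus(responses):
    """Return the mean rating per subscale for one respondent.

    `responses` maps item number (1-27) to a rating from 1 to 5.
    No item is reverse-scored, so ratings are averaged directly.
    """
    for item, rating in responses.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"item {item}: rating {rating} is outside 1-5")
    return {
        name: mean(responses[i] for i in items)
        for name, items in SUBSCALE_ITEMS.items()
    }


# Toy respondent: neutral on every item except item 1.
example = {i: 3 for i in range(1, 28)}
example[1] = 5
scores = score_fad_plus(example)
```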

The Chinese translation of FAD-Plus used in the current study was an improved version based on three earlier translations [9,21,22,40]. More specifically, this version resolved the inconsistencies between three previous translations and was re-translated according to the translation and back-translation method (e.g. ITC [41]). For additional details on the translation process, see electronic supplementary material, S1.3 Procedure of Chinese Data Collection and https://osf.io/t7p43/.

The French version of FAD-Plus was translated following the back-translation procedure; see Caspar et al. [17] for more details. The Japanese version of FAD-Plus has two different sub-versions, both following the back-translation procedure; please see Goto [23] for more details. The English version used the original FAD-Plus by Paulhus and Carey [16].

2.2.2. The Big Five Inventory

Similar to the original study by Paulhus and Carey [16] and Caspar et al. [17], we validated the FAD-Plus using the Big Five personality traits, which include Conscientiousness, Agreeableness, Openness, Neuroticism, and Extraversion. The original study found that the Free Will subscale of the FAD-Plus was positively correlated with the Extraversion, Neuroticism, and Agreeableness subscales of the Big Five [16].

In the new Chinese data, we used the Big Five Inventory−2 (BFI−2) as a measure of the Big Five personality. BFI−2 is a revised version of the Big Five Inventory [42]. It scores on a 5-point Likert scale. Participants were asked to rate their agreement with each statement on a scale from ‘1 = strongly disagree’ to ‘5 = strongly agree’. The Chinese version of the BFI−2 was translated by Zhang et al. [43].

BFI data were also available for the French dataset [44]; see [17] for details. However, BFI data were not available in the Japanese and English datasets.

2.2.3. Locus of control

Locus of control was also used to test the criterion validity of the FAD-Plus in the original study by Paulhus and Carey [16]. This study reported that the Free Will subscale was positively correlated with internal control, the Fatalistic Determinism subscale was correlated with aspects of external control, and the Scientific Determinism subscale was correlated with both internal and external control.

We used the Chinese version of the multi-dimensional locus of control (MLOC) inventory, which was translated from Levenson [45] and has demonstrated good psychometric properties in Chinese samples [46]. The MLOC consists of 24 items, with each of its three dimensions represented by eight items: internality, control by powerful others and control by chance forces. It is scored on a 6-point Likert scale, ranging from −3 (strongly disagree) to 3 (strongly agree). However, due to an implementation error, the rating scale of the MLOC in the first test used a 7-point Likert scale instead of the original 6-point scale. This error was corrected before the retest. Consequently, the test-retest reliability of the MLOC was not calculated, and we only used data from the retest.

In the Japanese datasets, data on locus of control were collected; see Goto [23] for more details. Note that the Japanese locus of control scale is a 7-point Likert questionnaire with seven items. These seven items constitute two subscales: external and internal control. No locus of control data were available in the French and English datasets.

2.3. Procedure

We collected the Chinese data online via Qualtrics XM. The scales were presented in the following order: FAD-Plus, BFI−2 and MLOC (see §2.5.1 for the deviation from the planned order). To calculate the test-retest reliability, we re-tested a subset of participants, with an interval of approximately four weeks. The data collection procedures in other languages were described in the original papers and are not repeated here.

2.4. Data analysis

We used R (4.3.1) and other R packages to analyze the data. All data were preprocessed using tidyverse (1.3.0 [47]). The following R packages were used in our analysis: CTT (2.3.3 [48]), dplyr (1.0.2 [47]), lavaan (0.6–7 [49]), psych (2.0.12 [50]), semPlot (1.1.2 [51]), semTools (0.5–3 [52]), NNLM (0.4.3 [53]) and RcppML (0.5.6 [54]). All the scripts are available at: https://github.com/Chuan-Peng-Lab/FAD_Plus_Stage2.

2.4.1. Descriptive

See table 1 for the descriptions of Chinese datasets and all variables. The correlations between items and their corresponding dimensions can be found at https://osf.io/n5h7t/.

Table 1.

Information on distinct source of data from different language datasets.

			FAD-Plus						BFI			LOC
dataset ID	subdataset	FAD-plus version	sample size (N)	mean age	gender	educational attainment	family income (per month, RMB)	foreign experience	sample size (N)	mean age	gender	sample size (N)	mean age	gender
CHN	dataset 1 (By SD, CHZ, YG, and HCP in Nanjing)	CHN	437	21.91 ± 2.28 NA = 2	F = 223 M = 214	n₁ = 1; n₃ = 4; n₄ = 11; n₅ = 323; n₆ = 96; n₇ = 2	mean = 24106 median = 12000	yes = 39 no = 398	437	21.91 ± 2.28 NA = 2	F = 223 M = 214
	dataset 2 (By QLL and ZD in Beijing)	CHN	455	31.00 ± 8.38	F = 378 M = 77	n₂ = 2; n₃ = 22; n₄ = 27; n₅ = 248; n₆ = 129; n₇ = 27	mean = 34255 median = 15000	yes = 133 no = 322	455	31.00 ± 8.38	F = 378 M = 77
	dataset 3 (By JL in Dali)	CHN	356	16.94 ± 1.35	F = 229 M = 127	n₃ = 347; n₅ = 2; n₆ = 7	mean = 7214 median = 5000	yes = 4 no = 352	356	16.94 ± 1.35	F = 229 M = 127
	retest-dataset (By SD, CHZ, YG, & HCP in Nanjing)	CHN	188	22.10 ± 2.03	F = 80 M = 108	n₃ = 1; n₄ = 3; n₅ = 137; n₆ = 46; n₇ = 1	mean = 14914 median = 12000	yes = 17 no = 169	188	22.10 ± 2.03	F = 80 M = 108	188	22.10 ± 2.03	F = 80 M = 108
ENG^a		ENG	3256	28.85 ± 15.15 NA = 28	M = 1733 F = 1478 NA = 276				—	—	—	—	—	—
FRN^b		FRN	904	26.29 ± 8.68 NA = 7	M = 579 F = 325				195	25.73 ± 8.49	M = 138 F = 57	—	—	—
JPN_1^c		JPN_v1	3727	38.15 ± 10.62 NA = 2925	M = 246 F = 546 NA = 2935				—	—	—	802	38.15 ± 10.62	M = 246 F = 546 NA = 10
JPN_2^d		JPN_v2	800	38.16 ± 10.62	M = 245 F = 545 NA = 10				—	—	—	800	38.16 ± 10.62	M = 245 F = 545 NA = 10

Note: The rating scale for MLOC was incorrect in our first test at all three Chinese sites due to an implementation error (see procedure). Therefore, we did not include the first test data for LOC. However, all this data was recorded and is available at https://osf.io/utrmv/.

Levels of educational attainment: 1: primary school or less; 2: middle school or equivalent; 3: high school or equivalent; 4: some college, vocational school after high school; 5: college graduate, with bachelor’s degree or in college/university; 6: master, with master’s degree or in a master program; 7: doctor and higher, with doctor degree or in the PhD program.

^a ENG data source: [36–38].

^b FRN data source: [17].

^c JPN_1 data source: [19,23].

^d JPN_2 data source: [23].

2.4.2. Reliability

Based on classical test theory, we calculated the internal reliability of the subscales and the whole test for FAD-Plus, with Cronbach’s alpha and McDonald’s omega coefficients as indicators. The test-retest reliability was reported based on data from participants who completed the test twice. Cronbach’s alpha and McDonald’s omega were calculated using the R package psych (2.0.12 [50]).
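For readers unfamiliar with the coefficient, the sketch below computes Cronbach’s alpha from its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It is a plain re-implementation for illustration, not the psych routine used in the analysis, and it does not cover McDonald’s omega; the function name and toy ratings are made up.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up ratings: five respondents, three items.
toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(toy)
```

When every item carries identical scores, the total-score variance is k² times the item variance and alpha equals 1, which is a convenient sanity check.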

2.4.3. Validity

As for the validity, we tested the construct and criterion validity. While measurement invariance is also a facet of validity, it was reported separately to emphasize its importance.

2.4.3.1. Construct validity

Using the original four-factor model from Paulhus and Carey [16], we tested the model fit and extracted the loadings of items. We compared the constructs from different languages’ datasets. Specifically, we evaluated the relationship between each item and its corresponding dimension across distinct datasets.

2.4.3.2. Criterion validity

Similar to Paulhus and Carey [16], we estimated criterion validity by calculating correlations between the four dimensions of FAD-Plus and five dimensions of the BFI, as well as with the subscales of the MLOC (electronic supplementary material, table S2).

To compare these correlations with the original study, we employed a bootstrap sampling approach [55]. This involved generating a distribution of correlation coefficients (r-values) based on 5000 bootstrap samples for each pair of variables. We randomly selected observations with replacements from the newly collected data and calculated the correlations that index the criterion validity. The resulting distribution of r-values allowed us to estimate the mean r-value and its 95% confidence interval. More importantly, we can infer whether each correlation significantly differs from the originally reported r-value in Paulhus and Carey [16]. More specifically, if the original r-value falls outside the 95% confidence interval of the corresponding bootstrap distribution, we infer that the correlation from the new data is statistically different from the original correlation. Additionally, we calculated the p-value for the observed correlation coefficient based on its position within the bootstrap distribution.
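The bootstrap comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the interval is a simple percentile interval (the text does not say which interval type was used), the function names and toy data are hypothetical, and the p-value step is omitted.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def bootstrap_r(x, y, n_boot=5000, seed=1):
    """Mean and 95% percentile interval of r over bootstrap resamples."""
    if pearson_r(x, y) is None:
        raise ValueError("correlation is undefined for constant data")
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        # Resample observation pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples
            rs.append(r)
    rs.sort()
    lo = rs[int(0.025 * n_boot)]
    hi = rs[int(0.975 * n_boot) - 1]
    return sum(rs) / n_boot, lo, hi


def differs_from_original(original_r, lo, hi):
    """True if the originally reported r falls outside the interval."""
    return not (lo <= original_r <= hi)


# Toy data with a strong positive relationship.
x = list(range(50))
y = [2 * i + (i * 7) % 5 for i in x]
mean_r, lo, hi = bootstrap_r(x, y)
```

Under this rule, a previously reported correlation of, say, 0.10 would be flagged as different for the toy data, because it lies far below the lower bound of the interval.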

2.4.4. Measurement invariance

For measurement invariance, we compared the different non-English datasets with the English ones. Besides the traditional multi-group confirmatory factor analysis (CFA) method [56], we also tested partial metric invariance [57], since weak measurement invariance might not hold. The analytical workflow for measurement invariance was planned and carried out as in figure 1. Firstly, we tested Paulhus and Carey’s [16] original four-factor model in all datasets. Once the original model fit the datasets, we continued to test the measurement invariance with the traditional multi-group CFA method. In the event of failing to achieve metric invariance, we proceeded to test partial measurement invariance instead.


Figure 1.

Procedures of detection of measurement invariance (MI) in two different language datasets.
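The decision sequence in this workflow can be written as a small function. This is only a sketch of the order of steps stated in the text: the boolean inputs stand in for the outcomes of CFA model tests that would be run elsewhere, the function name and return labels are hypothetical, and placing the scalar step after full metric invariance is an assumption based on the levels described in the Introduction.

```python
def highest_invariance_level(baseline_fits, configural_ok, metric_ok,
                             partial_metric_ok=False, scalar_ok=False):
    """Return a label for the highest invariance level supported.

    baseline_fits: dict mapping dataset name to whether the original
    four-factor model fit that dataset acceptably.
    The remaining flags are outcomes of multi-group model comparisons.
    """
    # Step 1: the four-factor model must fit every dataset.
    if not all(baseline_fits.values()):
        return "no acceptable baseline model"
    # Step 2: configural invariance (same structure across groups).
    if not configural_ok:
        return "non-invariant"
    # Step 3: metric invariance; fall back to partial metric if it fails.
    if not metric_ok:
        return "partial metric" if partial_metric_ok else "configural"
    # Step 4 (assumed ordering): scalar invariance on top of metric.
    return "scalar" if scalar_ok else "metric"


# Made-up outcome: metric invariance fails, partial metric invariance holds.
result = highest_invariance_level(
    {"ENG": True, "CHN": True},
    configural_ok=True,
    metric_ok=False,
    partial_metric_ok=True,
)
```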

Although we obtained partial measurement invariance for the FAD-Plus data between the three non-English datasets and the English dataset, we employed the strategy from Iurino and Saucier [58]. Specifically, we created a new dataset by randomly selecting half the data in each of the four different language datasets. This new, multi-language dataset was used for exploratory factor analysis (EFA, see electronic supplementary material, S3.3). Refer to figure 1 for the workflow of our measurement invariance analysis.
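The random half-split used to build the multi-language dataset can be sketched as below. The row layout (a `language` field per respondent), the fixed seed, and rounding down for odd group sizes are assumptions made for illustration; the toy group sizes are made up.

```python
import random


def sample_half_per_language(rows, seed=2024):
    """Randomly keep half of the rows within each language group."""
    rng = random.Random(seed)
    by_language = {}
    for row in rows:
        by_language.setdefault(row["language"], []).append(row)
    selected = []
    for language in sorted(by_language):
        group = by_language[language]
        # Sample without replacement; odd group sizes round down.
        selected.extend(rng.sample(group, len(group) // 2))
    return selected


# Toy data: four language groups of different sizes.
toy_rows = [
    {"id": i, "language": language}
    for language, n in [("CHN", 6), ("ENG", 8), ("FRN", 4), ("JPN", 10)]
    for i in range(n)
]
half = sample_half_per_language(toy_rows)
```

Sampling within each language, rather than from the pooled data, keeps every language represented in proportion to its original sample size.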

In addition to comparing the measurement invariance across different languages, we also examined measurement invariance among different groups that use the same language. We compared the measurement invariance of different groups in our Chinese samples because we collected relatively diverse datasets representing different sub-groups of the Chinese population (college students, adults outside the university community and high school students). For data from other languages, for example, two Japanese versions, their measurement invariance was also explored and reported.

Besides the multi-group CFA, we conducted additional analysis to assess measurement invariance using item response theory (IRT) and non-negative matrix factorization (NMF). See supplemental materials for more details and results.

2.5. Disclosure

We submitted this project as a registered report and the current document is at Stage 2 of the registration. The in-principle accepted Stage 1 protocol is available at https://osf.io/umhvp.

This project is based on our first attempt to translate FAD-Plus into Chinese and examine the psychometric properties of the Chinese version [22]. All de-identified data and scripts are available at https://osf.io/utrmv/. Also, we reported additional methodological details, results, and plots in the ‘Supplementary analyses and results for measurement invariance’ (§2) of the electronic supplementary material. Materials used for the data collection in China can be accessed at https://osf.io/t7p43/. To maintain full transparency, we have made submission records, reviewers’ comments, response letters and decision letters available in the ‘submission_history’ folder on https://osf.io/utrmv/.

2.5.1. Deviations from the Stage 1 protocol

We documented all deviations from the Stage 1 protocol throughout the data collection process for transparency. All these deviations were unanticipated and did not significantly affect the strength of the evidence. Firstly, one team collected fewer valid participants than planned. The plan was for each team to recruit at least 400 valid participants following Stage 1. However, one team recruited 356 valid participants from a survey of 572 high school students. Due to practical reasons (i.e. authors’ agreement with local high schools), we were unable to collect more data after the initial round of data collection. Despite these deviations, with an overall sample size exceeding 1200, the data from the three cities (Nanjing, Beijing and Dali) met the requirement for measurement invariance analysis and provided a diverse demographic range, which included variations in age, education and location.

Secondly, the data were collected via Qualtrics XM instead of jsPsych hosted on GitHub, owing to unstable access to GitHub in some regions of mainland China. This change caused an unexpected implementation error: the order of FAD-Plus, BFI-2 and MLOC deviated from our initial randomized design. All participants completed the questionnaires in the same order, and questionnaire order usually does not lead to significant consequences (e.g. [59]).

Thirdly, a 7-point scale was mistakenly used for the Chinese MLOC scale, instead of the intended 6-point Likert scale. We corrected this error in the Nanjing retest sample (with a retest rate of 60.64%). Only the MLOC retest data were analyzed in the main text. We also provided correlational analyses between four factors of FAD-Plus and both the 6-point and 7-point scales to ensure transparency (see electronic supplementary material, table S1).

Fourthly, we used a bootstrap method to compare the correlation values in the original study and the correlations calculated from the new datasets. These correlations served as indices of criterion validity for FAD-Plus. We employed the bootstrap method because it allowed us to rigorously evaluate whether the observed correlations in the new datasets significantly differ from the original reported values. These results provided additional evidence regarding the cross-cultural variability in the relationships between the dimensions of the FAD-Plus and the personality measures (BFI and MLOC).
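The logic of this bootstrap comparison can be illustrated with a minimal sketch. It assumes a percentile bootstrap that resamples respondents with replacement 5000 times and treats a correlation as differing from the original when the originally reported r lies outside the 95% bootstrap interval. The simulated scores, the function names and the example values are hypothetical and only serve the illustration; the analysis scripts actually used are those deposited on OSF.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("zero variance")
    return sxy / (sxx * syy) ** 0.5


def bootstrap_r(x, y, n_boot=5000, seed=2024):
    """Percentile bootstrap: mean r and 95% interval over resampled respondents."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            rs.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
        except ValueError:
            continue  # skip degenerate resamples without variance
    rs.sort()
    lower = rs[int(0.025 * n_boot)]
    upper = rs[int(0.975 * n_boot) - 1]
    return statistics.fmean(rs), lower, upper


def differs_from_original(original_r, lower, upper):
    """True when the originally reported r lies outside the bootstrap interval."""
    return not (lower <= original_r <= upper)


# Hypothetical simulated scores standing in for one FAD-Plus dimension
# and one criterion measure (not the study's data).
sim = random.Random(1)
fw = [sim.gauss(0, 1) for _ in range(120)]
criterion = [0.5 * v + sim.gauss(0, 0.87) for v in fw]

mean_r, lower, upper = bootstrap_r(fw, criterion)
print(f"mean r = {mean_r:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Calling `differs_from_original` with a published coefficient and the interval returned by `bootstrap_r` gives the filled versus unfilled distinction described for figure 2.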

3. Results

3.1. Descriptive

Data from Chinese participants were collected by three Chinese research teams. In total, 1248 participants were included, with 830 females and 418 males, a mean age of 23.81 years and a standard deviation of 7.85 years. Detailed descriptions of each dataset can be found in table 1. Participants in Dataset 2 were from a diverse age range and educational levels. Dataset 3, as planned, was collected among high school students and had a lower mean age than the other two datasets. Dataset 1 was from a university community, and it included both undergraduate students and graduate students. Only participants in Dataset 1 took the retest.

When compared to datasets in other languages, the newly acquired Chinese data generally exhibit a satisfactory sample size and a mean age consistent with the English and French datasets (table 1).

3.2. Reliability

Internal reliability for the subscales and the entire test of FAD-Plus was assessed using Cronbach’s alpha and McDonald’s omega. In comparison with the original α values reported by Paulhus and Carey [16], all datasets showed satisfactory reliability. The test-retest reliability values, calculated using Pearson’s correlation coefficients for each dimension, are also listed in table 2.

Table 2.

Cronbach’s alpha and McDonald’s omega with different datasets in distinct dimensions.

	original (α)	CHN (α/ω)	CHN test-retest (r)	ENG (α/ω)	FRN (α/ω)	JPN_1 (α/ω)	JPN_2 (α/ω)
whole	-	0.794/0.803		0.803/0.861	0.714/0.694	0.825/0.838	0.790/0.830
FD	0.82	0.751/0.809	0.684	0.860/0.889	0.739/0.782	0.753/0.798	0.711/0.781
SD	0.69	0.671/0.785	0.653	0.769/0.825	0.597/0.677	0.678/0.748	0.641/0.741
UP	0.72	0.745/0.807	0.613	0.791/0.847	0.709/0.762	0.786/0.824	0.792/0.853
FW	0.70	0.749/0.825	0.745	0.842/0.891	0.710/0.782	0.663/0.753	0.660/0.762
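For readers less familiar with the α coefficient reported in table 2, the following minimal sketch computes Cronbach's alpha from its standard definition, α = k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The response matrix is a small hypothetical example, not data from any of the datasets, and the function name is ours.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item (columns of the response matrix).
    item_variances = [statistics.variance(col) for col in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to a four-item subscale (5-point scale).
toy_responses = [
    [4, 5, 4, 3],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(toy_responses)
print(round(alpha, 3))
```

McDonald's omega, in contrast, is derived from the loadings of a fitted factor model and is therefore not reproduced in this sketch.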

3.3. Validity

3.3.1. Construct validity

The results of the CFA indicated that all language datasets fit the four-factor model well. The loadings are presented in table S5 of the electronic supplementary material.

3.3.2. Criterion validity

Correlational analyses were performed between the dimensions of FAD-Plus and the factors of the BFI for the Chinese and French datasets, respectively. Additionally, correlations between the FAD-Plus and MLOC in the Chinese retest dataset were computed. The Japanese dataset was not analyzed due to the use of a different LOC scale.

For the FW (Free Will) dimension, we observed that all correlations between FW and both MLOC and BFI differed from the original study. For the correlations that were significant in the original study (MLOC_I, BFI_A and BFI_E), the Chinese dataset exhibited higher correlations. For those originally non-significant correlations (MLOC_C, MLOC_P, BFI_C, BFI_N and BFI_O), the Chinese dataset showed significant correlations. The results from the French data, however, showed less divergence from the original results: only the correlation between FW and BFI_C was significant, whereas the original study did not report this effect (see figure 2). These results suggested that the FW dimension may measure a different construct in the Chinese dataset than in Western cultural contexts.

Figure 2.

Correlations between the four dimensions of the FAD-Plus and the three dimensions of the multidimensional locus of control (MLOC) (upper) and the five dimensions of the Big Five Inventory (BFI) (lower). ORG: the originally reported results (English); CHN: mean r-value after 5000 bootstraps from the newly collected Chinese data; FRN: mean r-value after 5000 bootstraps from the open-access French-language data of [17]. The error bars represent 95% confidence intervals. Filled dots indicate that the original r-value falls within the 95% bootstrap CI, while unfilled dots indicate that the original r-value does not fall within the 95% bootstrap CI, i.e. a significant difference from the original (ORG) results.

For the FD (Fatalistic Determinism) dimension, the Chinese dataset also displayed a different pattern from the original study in two out of three MLOC dimensions and four out of five BFI dimensions. Although most correlations were in the same direction in the Chinese dataset and the original study, the relationship between FD and openness (BFI_O) was reversed: the original study reported a positive correlation, whereas we found a negative correlation. The French dataset also showed a similar pattern for these relationships, with one exception, namely the relationship between FD and agreeableness (BFI_A): while the original study reported a positive correlation, the French data showed a non-significant correlation.

For the SD (Scientific Determinism) and UP (Unpredictability) dimensions, we found less divergence between our data and the original study. In the Chinese dataset, SD showed different correlations in three out of five dimensions of BFI, but we did not observe significant differences in the correlations between SD and the three dimensions of MLOC. The UP dimension also showed different correlations in two out of five dimensions of BFI, but we did not observe differences in the correlations for the three dimensions of MLOC. In the French data, we did not find differences in the correlations between SD and any BFI dimension, but the correlations between UP and two out of five dimensions of BFI differed from the original.

3.4. Measurement invariance

Table 3 summarizes the outcomes of the measurement invariance analyses for the Chinese, French and Japanese datasets when compared to the English dataset. We observed only partial measurement invariance (p = 0.07) for the Chinese version compared to the English version, suggesting that the factor loadings may not differ significantly across these cultural contexts. For the Chinese and English comparison (ENG-CHN), the configural model fit the data well, indicating that configural invariance (the same basic factor structure) was achieved. However, when moving to more constrained models, the weak model (χ² = 9197.0, df = 659, CFI = 0.784, RMSEA = 0.076) exhibited a decrease in CFI and an increase in RMSEA compared to the configural model, indicating that factor loadings may not be equivalent between these two groups. Similarly, for the French and English comparison (ENG-FRN) and the comparisons between the Japanese datasets and the English dataset, configural invariance was achieved, but the more constrained models, including the partial model, showed a decline in fit (p < 0.05). However, as reported in the Stage 1 electronic supplementary material, based on the IRT method, the comparison between Japanese dataset 2 and the English dataset showed partial MI with p = 0.011. In line with the Stage 1 plan, we used the IRT method to identify items without differential item functioning (DIF) and found that only item SD2 (‘People’s biological makeup determines their talents and personality’) exhibited measurement invariance between the Chinese and English datasets.

Table 3.

Results of the measurement invariance analyses comparing different languages’ datasets.

ENG-CHN	χ²	df	CFI	RMSEA	ΔCFI	ΔRMSEA
configural	8299.7	636	0.806	0.073
partial	10500 ·	679 (677)	0.751 (0.751)	0.080 (0.080)	0.000	0.000
weak	9197.0***	659	0.784	0.076	0.022	0.003
ENG-FRN	χ²	df	CFI	RMSEA	ΔCFI	ΔRMSEA
configural	7288.5	636	0.813	0.071
partial	8854.8*** (8842)	679 (677)	0.770 (0.770)	0.076 (0.076)	0.000	0.000
weak	7418.8***	659	0.810	0.070	0.003	0.001
ENG-JPN_1	χ²	df	CFI	RMSEA	ΔCFI	ΔRMSEA
configural	11286	636	0.803	0.069
partial	13555*** (13529)	679 (677)	0.762 (0.762)	0.074 (0.074)	0.000	0.000
weak	11986***	659	0.790	0.070	0.013	0.001
ENG-JPN_2	χ²	df	CFI	RMSEA	ΔCFI	ΔRMSEA
configural	7491.9	636	0.810	0.073
partial	9271.9* (9262.8)	679 (677)	0.762 (0.762)	0.079 (0.079)	0.000	0.000
weak	7971.0***	659	0.797	0.074	0.013	0.001

Note: ‘***’ p < 0.005; ‘**’ p < 0.01; ‘*’ p < 0.05; ‘.’ p < 0.1; ‘ ’ p < 1.
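As a worked illustration of how the change-in-fit columns for the weak models relate to the fit indices, the short sketch below recomputes ΔCFI and ΔRMSEA as differences between each weak model and its configural model, using the values printed in table 3. It assumes that ΔRMSEA is reported as an absolute difference, which matches the unsigned values in the table; the variable and function names are ours.

```python
# (CFI, RMSEA) values copied from table 3.
TABLE3 = {
    "ENG-CHN": {"configural": (0.806, 0.073), "weak": (0.784, 0.076)},
    "ENG-FRN": {"configural": (0.813, 0.071), "weak": (0.810, 0.070)},
    "ENG-JPN_1": {"configural": (0.803, 0.069), "weak": (0.790, 0.070)},
    "ENG-JPN_2": {"configural": (0.810, 0.073), "weak": (0.797, 0.074)},
}


def fit_change(configural, constrained):
    """Return (delta CFI, delta RMSEA) of a constrained model relative to the configural model."""
    delta_cfi = round(configural[0] - constrained[0], 3)
    # Assumption: the table reports the unsigned change in RMSEA.
    delta_rmsea = round(abs(constrained[1] - configural[1]), 3)
    return delta_cfi, delta_rmsea


changes = {
    pair: fit_change(fits["configural"], fits["weak"])
    for pair, fits in TABLE3.items()
}

for pair, (d_cfi, d_rmsea) in changes.items():
    print(f"{pair}: dCFI = {d_cfi:.3f}, dRMSEA = {d_rmsea:.3f}")
```

For the ENG-CHN pair, for example, the CFI drops from 0.806 to 0.784, which gives the ΔCFI of 0.022 listed in the weak row.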

We also conducted MI analyses within subgroups of the same languages, specifically within the Chinese and Japanese datasets. We found that three Chinese sub-groups exhibited partial MI, and the two Japanese datasets exhibited partial MI as well. For additional details, see electronic supplementary material, 2.2.

4. Discussion

This registered report aimed to examine the measurement invariance of the FAD-Plus scale across four languages. We compared a newly collected Chinese dataset and available Japanese and French data with an open English dataset of FAD-Plus. We found that the four-factor model of FAD-Plus held for the four language datasets. Our multi-group CFA revealed partial measurement invariance between the Chinese and English datasets, as well as between one of the Japanese datasets and the English dataset. The reliability of FAD-Plus was comparable to the original study in all four languages. However, the criterion validity of the FAD-Plus in the Chinese and French datasets showed mixed results when compared to the original study.

Our primary analyses focused on the measurement invariance between the following pairs: Chinese and English, Japanese dataset 1 and English, Japanese dataset 2 and English, and French and English. The results revealed that while the four-factor model of FAD-Plus held across all datasets, only weak partial invariance was found for the Chinese-English pair and for the Japanese dataset 2-English pair. For the other two dataset pairs, we did not find partial metric invariance. Our further analyses with IRT revealed that only one item from the Scientific Determinism subscale (SD2, ‘People’s biological makeup determines their talents and personality’) exhibited measurement invariance between the Chinese and English datasets, and that no item exhibited measurement invariance for the Japanese dataset 2-English pair. Given that partial metric invariance was only achieved by allowing the factor loadings of 21 out of 27 items to vary across groups, our results suggest near non-invariance of FAD-Plus across different languages and call for caution when interpreting FAD-Plus scores obtained in different languages.

The lack of measurement invariance might have three major sources: item bias, method bias and construct bias [60]. Item bias refers to anomalies at the item level, such as poor translations [61] or the inclusion of terms that have a culture-specific interpretation. Method bias results from differences in the methods used, such as differences in sampling procedures across populations [62,63], differences in non-response patterns [64,65], variations in familiarity with stimuli across groups, and differences in questionnaire administration. Our data suggested that the non-invariance was not due to poor item quality or method bias, as the results did not improve even when we applied multi-group CFA to data from the same language (e.g. datasets from Chinese or Japanese, see electronic supplementary material, results 2.2).

The third source, construct bias, is probably the reason for our near non-invariance results. Construct bias is the most fundamental form of bias and means that the construct itself is interpreted differently across groups [66], which aligns with our reasoning prior to data collection and with cross-cultural differences in the interpretation of ‘free will’ [1]. Indeed, our criterion validity results suggest that the constructs measured by the Chinese FAD-Plus may differ from those in the original study. In the original study as well as in other studies conducted in Western culture, free will belief was expected to be moderately correlated with the internal control dimension of MLOC and not necessarily negatively correlated with the powerful others or chance dimensions of MLOC, because free will belief and internal control are two distinct constructs [4,16]. However, we found that FW had a strong, rather than moderate, positive correlation with internal control (r = 0.577, 95% CI [0.467, 0.675]) and had negative correlations with powerful others (r = −0.288, 95% CI [−0.414, −0.149]) and chance (r = −0.324, 95% CI [−0.446, −0.194]). These negative correlations between FW and powerful others, and between FW and chance, are similar to those between internal control and powerful others (r = −0.232, 95% CI [−0.370, −0.084]) and between internal control and chance (r = −0.392, 95% CI [−0.513, −0.257]), suggesting that FW might be perceived similarly to internal control in Chinese samples, rather than as a distinct construct. Note that our criterion validity results may be caused by the lack of measurement invariance for the BFI/MLOC across different cultural contexts. There was evidence that the BFI-2 (the version we used) was ‘largely invariant’ across Chinese and US samples [43], but the measurement invariance of MLOC is still an open question. These findings suggested that a thorough examination of the measurement invariance of a measure of belief in free will requires validating not only the scale itself but also the related scales across diverse cultures.

Given the relatively poor results of the MI of FAD-Plus, we carried out an exploratory analysis using EFA and NMF, as planned (see figure 1). Results from the EFA indicated that the four-factor model fit well, with the exception of item UP20 (‘Luck plays a big role in people’s lives’). These results suggested that the four-factor model of FAD-Plus may remain the best candidate model across all these languages. However, NMF revealed that, whereas a four-factor model was best for the English and French data, the best model for the Chinese and Japanese data was a three-factor model (see §3 of the electronic supplementary material). The slight differences in results for the Chinese and Japanese datasets between EFA and NMF may be attributed to the internal algorithms of each method: EFA assumes normally distributed data and aims to explain the covariance among observed variables; in contrast, NMF is a more data-driven approach that decomposes the data matrix into non-negative factors, identifying the most distinct and clear-cut factors or patterns within the data. The nuanced differences highlighted by NMF suggest that a four-factor model may be acceptable for the Chinese and Japanese datasets, whereas a three-factor model is preferable from a purely data-driven perspective. This lack of convergence between EFA and NMF underscores the importance of integrating various data analysis approaches in cross-cultural studies (e.g. [67]).
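To make the description of NMF more concrete, the following minimal sketch factorizes a small non-negative matrix V into two non-negative matrices W and H such that V ≈ WH, and compares the reconstruction error for different numbers of factors. It uses the textbook multiplicative-update rule (Lee and Seung) as a stand-in and is not the routine implemented in RcppML [54]; the toy matrix, the function names and the use of reconstruction error to compare ranks are assumptions made only for this illustration.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, n_iter=500, seed=0, eps=1e-9):
    """Factorize v (n x m) into non-negative w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # Multiplicative update of h: h <- h * (w'v) / (w'wh)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Multiplicative update of w: w <- w * (vh') / (whh')
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def reconstruction_error(v, w, h):
    """Frobenius norm of v - wh."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5


# Hypothetical toy matrix (6 respondents x 4 items) built from two response patterns.
V = [
    [5, 4, 0, 1], [10, 8, 0, 2], [15, 12, 0, 3],
    [0, 1, 5, 4], [0, 2, 10, 8], [0, 3, 15, 12],
]

w1, h1 = nmf(V, rank=1)
w2, h2 = nmf(V, rank=2)
err1 = reconstruction_error(V, w1, h1)
err2 = reconstruction_error(V, w2, h2)
print(f"rank 1 error = {err1:.3f}, rank 2 error = {err2:.3f}")
```

Because every update multiplies non-negative quantities, W and H remain non-negative throughout, which is the property that distinguishes NMF from the covariance-based EFA discussed above.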

Taken together, our results revealed that only partial metric invariance can be achieved for the FAD-Plus across different languages and across different datasets within the same language. These results call attention to the clarity of the four constructs of FAD-Plus and their validity [18,20] and cross-cultural generalizability [1]. The unsatisfactory results of the MI analysis highlight a new area for future research, which could draw on new practices such as big-team science [68,69] and adversarial collaboration [70].

More specifically, we suggest that future work could combine top-down (theory-driven) and bottom-up (data-driven) approaches to deepen our understanding of belief in free will. Firstly, an adversarial collaboration (e.g. [71]) is needed for a better conceptualization of free will belief and for determining which dimensions should be measured alongside the belief in free will. Although the configural MI of FAD-Plus seems to be achieved, we should not ignore the fact that there are other free will belief scales that include dimensions such as dualism (FWI [13]). Moreover, recent studies revealed that dimensions such as attitudes towards free will [5] and dualism [14] are closely related to free will beliefs. Thus, a collaborative theory-driven approach is needed to inform a better conceptualization of belief in free will and its nomological networks and construct validity [72]. Adversarial collaboration provides a good opportunity for researchers, with their own conceptual frameworks, to work together and work towards developing a unified framework for measuring belief in free will. Meanwhile, given that previous studies of free will beliefs have largely been Western, it is necessary to include researchers from more diverse cultures to obtain a more generalizable concept of free will from the very beginning.

Secondly, the theory-driven approach should be complemented by a bottom-up approach, in which participants’ interpretations and experiences of the concepts should be elicited and analyzed. For example, to study the positive emotions elicited when experiencing or observing a sudden intensification of communal sharing relationships, Seibt et al. [73] tested their theoretical model by presenting videos to participants and asking them to report their feelings on several dimensions. Given the abstractness of free will, it may not be easy to evoke participants’ experiences using videos; however, it is possible to explore beliefs about free will by studying lay people’s intuitions [4] or their prototypes of free will [74]. Implementing bottom-up approaches requires large-scale data collection, which can be achieved through large-scale collaborative science initiatives, such as the Psychological Science Accelerator [75].

In conclusion, our findings shed light on the nuances of measurement invariance in the context of the FAD-Plus scale across different languages. These findings call for studies on the measurement of belief in free will that use state-of-the-art practices such as adversarial collaboration, cross-cultural studies and/or big-team science.

Ethics

The research protocol has been approved by the Institute Review Board of Nanjing Normal University (No. NNU202110002).

Data accessibility

Materials: All materials used for data collection in China are available at [76].

Code and Raw data: All de-identified raw data (in CSV format), and related R scripts are also available at: [77] and [76].

The datasets used in our study were from multiple sources. More specifically, we included several open or shared datasets from the following studies (classified by language):

The Chinese dataset was newly collected and is available at [76].

The English dataset was from three recent papers: [36–38].

The French dataset was from: [17].

The Japanese dataset was from two papers: [19,23].

Supplementary material is available online [78].

Declaration of AI use

We have not used AI-assisted technologies in creating this article.

Authors’ contributions

S.D.: conceptualization, data curation, formal analysis, methodology, visualization, writing— original draft, writing—review and editing; C.Z.: conceptualization, formal analysis, methodology, visualization, writing—original draft, writing—review and editing; Q.L.: data curation, investigation, resources; Y.G.: writing— original draft, writing—review and editing; Z.D.: investigation, resources; J.L.: investigation, resources; H.C.-P.: conceptualization, investigation, methodology, project administration, resources, supervision, validation, writing— review and editing.

All authors gave final approval for publication and agreed to be held accountable for the work performed therein.

Conflict of interest

We declare we have no competing interests.

Funding

No funding has been received for this article.

References

1. Lam A. 2021. Folk conceptions of free will: a systematic review and narrative synthesis of psychological research. PsyArXiv. (10.31234/osf.io/nuyjw)

2. Nichols S. 2004. The folk psychology of free will: fits and starts. Mind Lang. 19, 473–502. (10.1111/j.0268-1064.2004.00269.x)

3. Chandrashekar SP et al. 2021. Agency and self-other asymmetries in perceived bias and shortcomings: replications of the bias blind spot and link to free will beliefs. Judgm. Decis. Mak. 16, 1392–1412. (10.1017/S1930297500008470)

4. Feldman G. 2017. Making sense of agency: belief in free will as a unique and important construct. Soc. Personal. Psychol. Compass 11, e12293. (10.1111/spc3.12293)

5. Genschow O, Cracco E, Schneider J, Protzko J, Wisniewski D, Brass M, Schooler JW. 2023. Manipulating belief in free will and its downstream consequences: a meta-analysis. Pers. Soc. Psychol. Rev. 27, 52–82. (10.1177/10888683221087527)

6. Baumeister RF, Sparks EA, Stillman TF, Vohs KD. 2008. Free will in consumer behavior: self‐control, ego depletion, and choice. J. Consum. Psychol. 18, 4–13. (10.1016/j.jcps.2007.10.002)

7. Baumeister R, Monroe A. 2014. Recent research on free will: conceptualizations, beliefs, and processes. Adv. Exp. Soc. Psychol. 1–52. (10.1016/B978-0-12-800284-1.00001-1)

8. Crescioni AW, Baumeister RF, Ainsworth SE, Ent M, Lambert NM. 2016. Subjective correlates and consequences of belief in free will. Philos. Psychol. 29, 41–63. (10.1080/09515089.2014.996285)

9. Li C, Wang S, Zhao Y, Kong F, Li J. 2016. The freedom to pursue happiness: belief in free will predicts life satisfaction and positive affect among Chinese adolescents. Front. Psychol. 7, 2027. (10.3389/fpsyg.2016.02027)

10. Rakos RF, Laurene KR, Skala S, Slane S. 2008. Belief in free will: measurement and conceptualization innovations. Behav. Soc. Iss. 17, 20–40. (10.5210/bsi.v17i1.1929)

11. Nettler G. 1959. Cruelty, dignity, and determinism. Am. Sociol. Rev. 24, 375. (10.2307/2089386)

12. Stroessner SJ, Green CW. 1990. Effects of belief in free will or determinism on attitudes toward punishment and locus of control. J. Soc. Psychol. 130, 789–799. (10.1080/00224545.1990.9924631)

13. Nadelhoffer T, Shepard J, Nahmias E, Sripada C, Ross LT. 2014. The free will inventory: measuring beliefs about agency and responsibility. Conscious. Cogn. 25, 27–41. (10.1016/j.concog.2014.01.006)

14. Wisniewski D, Deutschländer R, Haynes JD. 2019. Free will beliefs are better predicted by dualism than determinism beliefs across different cultures. PLoS One 14, e0221617. (10.1371/journal.pone.0221617)

15. Zhao X, Wente A, Flecha MF, Galvan DS, Gopnik A, Kushnir T. 2021. Culture moderates the relationship between self-control ability and free will beliefs in childhood. Cognition 210, 104609. (10.1016/j.cognition.2021.104609)

16. Paulhus DL, Carey JM. 2011. The FAD-plus: measuring lay beliefs regarding free will and related constructs. J. Pers. Assess. 93, 96–104. (10.1080/00223891.2010.528483)

17. Caspar EA, Verdin O, Rigoni D, Cleeremans A, Klein O. 2017. What do you believe in? French translation of the FAD-plus to assess beliefs in free will and determinism and their relationship with religious practices and personality traits. Psychol. Belg. 57, 1–16. (10.5334/pb.321)

18. Fino E, Iliceto P. 2023. Do people have control over the decisions they make? Psychometric properties of the Italian version of the free will and scientific determinism questionnaire (FAD-plus-I). Curr. Psychol. 42, 11268–11286. (10.1007/s12144-021-02268-4)

19. Goto T, Ishibashi Y, Kajimura S, Oka R, Kusumi T. 2015. Development of free will and determinism scale in Japanese. Shinrigaku Kenkyu 86, 32–41. (10.4992/jjpsy.86.13233)

20. Yilmaz O, Bahçekapili HG, Harma M. 2018. Different types of religiosity and lay intuitions about free will/determinism in Turkey. Int. J. Psychol. Relig. 28, 89–102. (10.1080/10508619.2018.1425062)

21. Li J, Zhao Y, Lin L, Chen J, Wang S. 2018. The freedom to persist: belief in free will predicts perseverance for long-term goals among Chinese adolescents. Pers. Individ. Dif. 121, 7–10. (10.1016/j.paid.2017.09.011)

22. Liu QL, Wang F, Yan W, Peng K, Sui J, Hu CP. 2020. Questionnaire data from the revision of a Chinese version of free will and determinism plus scale. J. Open Psychol. Data 8. (10.5334/jopd.49)

23. Goto T. 2021. Comparing the psychometric properties of two Japanese-translated scales of the free will and determinism-plus scale. Front. Psychol. 12, 720601. (10.3389/fpsyg.2021.720601)

24. Caspar EA, Vuillaume L, Magalhães De Saldanha da Gama PA, Cleeremans A. 2017. The influence of (Dis)belief in free will on immoral behavior. Front. Psychol. 8, 20. (10.3389/fpsyg.2017.00020)

25. Luong R, Flake JK. 2023. Measurement invariance testing using confirmatory factor analysis and alignment optimization: a tutorial for transparent analysis planning and reporting. Psychol. Methods 28, 905–924. (10.1037/met0000441)

26. Schmitt N, Kuljanin G. 2008. Measurement invariance: review of practice and implications. Hum. Resour. Manag. Rev. 18, 210–222. (10.1016/j.hrmr.2008.03.003)

27. Vandenberg RJ, Lance CE. 2000. A review and synthesis of the measurement invariance literature: suggestions, practices, and recommendations for organizational research. Organ. Res. Methods 3, 4–70. (10.1177/109442810031002)

28. Bieda A, Hirschfeld G, Schönfeld P, Brailovskaia J, Zhang XC, Margraf J. 2017. Universal happiness? Cross-cultural measurement invariance of scales assessing positive mental health. Psychol. Assess. 29, 408–421. (10.1037/pas0000353)

29. Milfont TL, Fischer R. 2010. Testing measurement invariance across groups: applications in cross-cultural research. Int. J. Psychol. Res. 3, 111–130. (10.21500/20112084.857)

30. Putnick DL, Bornstein MH. 2016. Measurement invariance conventions and reporting: the state of the art and future directions for psychological research. Dev. Rev. 41, 71–90. (10.1016/j.dr.2016.06.004)

31. Berniūnas R, Beinorius A, Dranseika V, Silius V, Rimkevičius P. 2021. The weirdness of belief in free will. Conscious. Cogn. 87, 103054. (10.1016/j.concog.2020.103054)

32. Confucius. 1998. The analects (ed. DC Lau), 1st edn. Penguin Classics.

33. Boer D, Hanke K, He J. 2018. On detecting systematic measurement error in cross-cultural research: a review and critical reflection on equivalence and invariance tests. J. Cross Cult. Psychol. 49, 713–734. (10.1177/0022022117749042)

34. Flake JK, Fried EI. 2020. Measurement schmeasurement: questionable measurement practices and how to avoid them. Adv. Meth. Pract. Psychol. Sci. 3, 456–465. (10.1177/2515245920952393)

35. Zickfeld JH et al. 2019. Kama muta: conceptualizing and measuring the experience often labelled being moved across 19 nations and 15 languages. Emotion 19, 402–424. (10.1037/emo0000450)

36. Earp BD, Everett JAC, Crone D, Nadelhoffer T, Caruso GD, Shariff A, Sinnott-Armstrong W. 2018. Determined to be humble? Exploring the relationship between belief in free will and humility. PsyArXiv. (10.31234/osf.io/3bxra)

37. Nadelhoffer T, Shepard J, Crone DL, Everett JAC, Earp BD, Levy N. 2020. Does encouraging a belief in determinism increase cheating? Reconsidering the value of believing in free will. Cognition 203, 104342. (10.1016/j.cognition.2020.104342)

38. Post L, Zwaan RA. 2014. What is the value of believing in free will? Two replication studies. See https://osf.io/mnwgb/.

39. Koh KH, Zumbo BD. 2008. Multi-group confirmatory factor analysis for testing measurement invariance in mixed item format data. J. Mod. Appl. Stat. Methods 7, 471–477. (10.22237/jmasm/1225512660)

40. Zhao X, Liu L, Zhang X, Shi J, Huang Z. 2014. The effect of belief in free will on prejudice. PLoS One 9, e91572. (10.1371/journal.pone.0091572)

41. ITC guidelines for translating and adapting tests (second edition). 2017. Int. J. Test. 18, 101–134. (10.1080/15305058.2017.1398166)

42. Soto CJ, John OP. 2017. The next big five inventory (BFI-2): developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and predictive power. J. Pers. Soc. Psychol. 113, 117–143. (10.1037/pspp0000096)

43. Zhang B, Li YM, Li J, Luo J, Ye Y, Yin L, Chen Z, Soto CJ, John OP.. 2022. The big five inventory-2 in China: a comprehensive psychometric evaluation in four diverse samples. Assessment 29, 1262–1284. (10.1177/10731911211008245) [Abstract] [CrossRef] [Google Scholar]

44. Plaisant O, Courtois R, Réveillère C, Mendelsohn GA, John OP.. 2010. Validation par analyse factorielle du big five inventory français (BFI-Fr) analyse convergente avec le NEO-PI-R. Ann. Méd. 168, 97–106. (10.1016/j.amp.2009.09.003) [CrossRef] [Google Scholar]

45. Levenson H. 1973. Multidimensional locus of control in psychiatric patients. J. Consult. Clin. Psychol. 41, 397–404. (10.1037/h0035357) [Abstract] [CrossRef] [Google Scholar]

46. Wang X, Wang X, Ma H. (eds). 1999. Rating scales for mental health (in Chinese). Beijing: Chinese Mental Health Journal Publisher. [Google Scholar]

47. Wickham Het al. . 2019. Welcome to the tidyverse. J. Open Sou. Sci. 4, 1686. (10.21105/joss.01686) [CrossRef] [Google Scholar]

48. Willse JT. 2018. CTT. (10.32614/CRAN.package.CTT) [CrossRef]

49. Rosseel Y. 2012. Lavaan: an R package for structural equation modeling. J. Stat. Softw. 48, 1–36. (10.18637/jss.v048.i02) [CrossRef] [Google Scholar]

50. Revelle W. 2021. psych: Procedures for Psychological, Psychometric, and Personality Research. See https://CRAN.R-project.org/package=psych.

51. Epskamp S. 2015. SemPlot: unified visualizations of structural equation models. Struct. Equ. Model. 22, 474–483. (10.1080/10705511.2014.937847) [CrossRef] [Google Scholar]

52. Jorgensen TD et al. 2021. semTools: useful tools for structural equation modeling (0.5-5) [Computer software]. See https://CRAN.R-project.org/package=semTools.

53. Lin X, Boutros PC. 2020. Optimization and expansion of non-negative matrix factorization. BMC Bioinformatics 21, 7. (10.1186/s12859-019-3312-5)

54. DeBruine Z. 2021. RcppML: Rcpp machine learning library (0.3.7) [computer software]. See https://cran.r-project.org/web/packages/RcppML/index.html.

55. Efron B. 1979. Bootstrap methods: another look at the jackknife. Ann. Statist. 7. (10.1214/aos/1176344552)

56. Meredith W. 1993. Measurement invariance, factor analysis and factorial invariance. Psychometrika 58, 525–543. (10.1007/BF02294825)

57. Kim ES, Yoon M. 2011. Testing measurement invariance: a comparison of multiple-group categorical CFA and IRT. Struct. Equ. Model. 18, 212–228. (10.1080/10705511.2011.557337)

58. Iurino K, Saucier G. 2020. Testing measurement invariance of the moral foundations questionnaire across 27 countries. Assessment 27, 365–372. (10.1177/1073191118817916)

59. Barry MJ, Walker-Corkery E, Chang Y, Tyll LT, Cherkin DC, Fowler FJ. 1996. Measurement of overall and disease-specific health status: does the order of questionnaires make a difference? J. Health Serv. Res. Policy 1, 20–27. (10.1177/135581969600100105)

60. Davidov E, Meuleman B, Cieciuch J, Schmidt P, Billiet J. 2014. Measurement equivalence in cross-national research. Annu. Rev. Sociol. 40, 55–75. (10.1146/annurev-soc-071913-043137)

61. Harkness JA, Villar A, Edwards B. 2010. Translation, adaptation, and design. In Survey methods in multinational, multiregional, and multicultural contexts, pp. 117–140. Hoboken, NJ, USA: John Wiley & Sons, Inc. (10.1002/9780470609927)

62. Häder S, Gabler S. 2003. Sampling and estimation. In Cross-cultural survey methods. Hoboken, NJ, USA: John Wiley & Sons, Inc.

63. Heeringa SG, O'Muircheartaigh C. 2010. Sample design for cross-cultural and cross-national survey programs. In Survey methods in multinational, multiregional, and multicultural contexts, pp. 251–267. Hoboken, NJ, USA: John Wiley & Sons, Inc. (10.1002/9780470609927)

64. Billiet J, Philippens M, Fitzgerald R, Stoop I. 2007. Estimation of nonresponse bias in the European social survey: using information from reluctant respondents. J. Off. Stat. 23, 135.

65. Couper M, De Leeuw E. 2003. Nonresponse in cross-cultural and cross-national surveys. In Cross-cultural survey methods. Hoboken, NJ, USA: John Wiley & Sons, Inc.

66. Fischer R, Karl JA, Luczak-Roesch M. Why equivalence and invariance are both different and essential for scientific studies of culture: a discussion of mapping processes and theoretical implications. OSF. (10.31234/osf.io/fst9k)

67. Camilleri JA, Eickhoff SB, Weis S, Chen J, Amunts J, Sotiras A, Genon S. 2021. A machine learning approach for the factorization of psychometric data with application to the Delis Kaplan Executive Function System. Sci. Rep. 11, 16896. (10.1038/s41598-021-96342-3)

68. Forscher PS, Wagenmakers EJ, Coles NA, Silan MA, Dutra N, Basnight-Brown D, IJzerman H. 2023. The benefits, barriers, and risks of big-team science. Perspect. Psychol. Sci. 18, 607–623. (10.1177/17456916221082970)

69. Oshiro B et al. 2024. Structural validity evidence for the Oxford Utilitarianism Scale across 15 languages. Psychol. Test Adapt. Dev. 5, 175–191. (10.1027/2698-1866/a000061)

70. Clark CJ, Tetlock PE. 2023. Adversarial collaboration: the next science reform. In Ideological and political bias in psychology: nature, scope, and solutions (eds Frisby CL, Redding RE, O'Donohue WT, Lilienfeld SO), pp. 905–927. Cham: Springer. (10.1007/978-3-031-29148-7_32)

71. Ellemers N, Fiske ST, Abele AE, Koch A, Yzerbyt V. 2020. Adversarial alignment enables competing models to engage in cooperative theory building toward cumulative science. Proc. Natl Acad. Sci. USA 117, 7561–7567. (10.1073/pnas.1906720117)

72. Cronbach LJ, Meehl PE. 1955. Construct validity in psychological tests. Psychol. Bull. 52, 281–302. (10.1037/h0040957)

73. Seibt B, Schubert TW, Zickfeld JH, Zhu L, Arriaga P, Simão C, Nussinson R, Fiske AP. 2018. Kama muta: similar emotional responses to touching videos across the United States, Norway, China, Israel, and Portugal. J. Cross Cult. Psychol. 49, 418–435. (10.1177/0022022117746240)

74. Buchtel EE. 2023. Morality as fish: defining morality as a prototype concept. Psychol. Inq. 34, 80–85. (10.1080/1047840X.2023.2248859)

75. Moshontz H et al. 2018. The psychological science accelerator: advancing psychology through a distributed collaborative network. Adv. Methods Pract. Psychol. Sci. 1, 501–515. (10.1177/2515245918797607)

76. Hu CP, Siqi D, Cheng-Hao Z, Yixin G. 2021. Measurement invariance of FAD+ / materials. See https://osf.io/t7p43/.

77. Chuan-Peng-Lab. 2024. FAD_Plus_Stage2. GitHub. See https://github.com/Chuan-Peng-Lab/FAD_Plus_Stage2.

78. Duan S, Zhou C, Liu Q, Gong Y, Dou Z, Li J. 2024. Data from: Assessing the Measurement Invariance of Free will and Determinism Plus Scale Across Four Languages: A Registered Report. Figshare. (10.6084/m9.figshare.c.7467685)

Articles from Royal Society Open Science are provided here courtesy of The Royal Society


