Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity

Yu, Yanqin; Zuo, Xianbo; He, Miao; Gao, Jinping; Fu, Yuchuan; Qin, Chuanqi; Meng, Liuyan; Wang, Wenjun; Song, Yaling; Cheng, Yong; Zhou, Fusheng; Chen, Gang; Zheng, Xiaodong; Wang, Xinhuan; Liang, Bo; Zhu, Zhengwei; Fu, Xiazhou; Sheng, Yujun; Hao, Jiebing; Liu, Zhongyin; Yan, Hansong; Mangold, Elisabeth; Ruczinski, Ingo; Liu, Jianjun; Marazita, Mary L.; Ludwig, Kerstin U.; Beaty, Terri H.; Zhang, Xuejun; Sun, Liangdan; Bian, Zhuan

doi:10.1038/ncomms14364

Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity

Article
Open access
Published: 24 February 2017

Volume 8, article number 14364, (2017)
Cite this article

Download PDF

You have full access to this open access article

From

View current issue

Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity

Download PDF

Yanqin Yu¹^na1,
Xianbo Zuo^2,3,4,5,6^na1,
Miao He^1,7^na1,
Jinping Gao^2,3,4,5,6^na1,
Yuchuan Fu⁸,
Chuanqi Qin^1,8,
Liuyan Meng¹,
Wenjun Wang^2,3,4,5,6,
Yaling Song¹,
Yong Cheng¹,
Fusheng Zhou^2,3,4,5,6,
Gang Chen^2,3,4,5,6,
Xiaodong Zheng^2,3,4,5,6,
Xinhuan Wang¹,
Bo Liang^2,3,4,5,6,
Zhengwei Zhu^2,3,4,5,6,
Xiazhou Fu⁹,
Yujun Sheng^2,3,4,5,6,
Jiebing Hao¹⁰,
Zhongyin Liu¹¹,
Hansong Yan¹²,
Elisabeth Mangold¹³,
Ingo Ruczinski¹⁴,
Jianjun Liu^2,3,4,5,6,
Mary L. Marazita^15,16,17,
Kerstin U. Ludwig^13,18,
Terri H. Beaty¹⁹,
Xuejun Zhang^{2,3,4,5,6,20,21},
Liangdan Sun^2,3,4,5,6,22^na2 &
…
Zhuan Bian¹^na2

9502 Accesses
7 Altmetric
1 Mention
Explore all metrics

Abstract

Non-syndromic cleft lip with palate (NSCLP) is the most serious sub-phenotype of non-syndromic orofacial clefts (NSOFC), which are the most common craniofacial birth defects in humans. Here we conduct a GWAS of NSCLP with multiple independent replications, totalling 7,404 NSOFC cases and 16,059 controls from several ethnicities, to identify new NSCLP risk loci, and explore the genetic heterogeneity between sub-phenotypes of NSOFC. We identify 41 SNPs within 26 loci that achieve genome-wide significance, 14 of which are novel (RAD54B, TMEM19, KRT18, WNT9B, GSC/DICER1, PTCH1, RPS26, OFCC1/TFAP2A, TAF1B, FGF10, MSX1, LINC00640, FGFR1 and SPRY1). These 26 loci collectively account for 10.94% of the heritability for NSCLP in Chinese population. We find evidence of genetic heterogeneity between the sub-phenotypes of NSOFC and among different populations. This study substantially increases the number of genetic susceptibility loci for NSCLP and provides important insights into the genetic aetiology of this common craniofacial malformation.

Genome-wide meta-analyses of nonsyndromic orofacial clefts identify novel associations between FOXE1 and all orofacial clefts, and TP63 and cleft lip with or without cleft palate

Article 04 January 2017

Genome-wide association study identifies a new susceptibility locus for cleft lip with or without a cleft palate

Article 16 March 2015

Rare variants analyses suggest novel cleft genes in the African population

Article Open access 20 June 2024

Introduction

Orofacial clefts (OFCs) are the most common craniofacial malformations in humans and present a major public health burden, imposing substantial health care and financial burdens on the individual, their family and society¹. In general, the highest birth prevalence rates of OFCs are reported in Asia (especially in China and Japan), often as high as 1 in 500 and affecting more than 2.6 million people in China². According to whether the patients have other malformations or anomalies, OFCs can be divided into syndromic and non-syndromic forms. Approximately 70% of cleft lip (CL) with or without cleft palate (CP) cases and 50% of CP only (CPO) cases occur, as isolated entities with no other abnormal phenotypes are considered to be non-syndromic (referred to as NSOFC)^1,2,3. NSOFC is further classified into non-syndromic cleft lip with palate (NSCLP), non-syndromic CL only (NSCLO) and non-syndromic CPO (NSCPO), based on the anatomical morphology⁴. As they share common epidemiological patterns and occur during the same embryological period, NSCLP and NSCLO are often grouped together as non-syndromic CL with or without CP (NSCL/P), differing only in severity⁵. However, there is some evidence showing that NSCLP and NSCLO might harbour different genetic aetiologies^6,7,8,9.

Multiple genome-wide association study (GWAS) and relative extension studies of NSCL/P have been performed, and 22 susceptibility loci were identified^{7,9,10,11,12,13,14,15,16}, including the 1q32 (IRF6) locus, which was observed in previous candidate gene studies and subsequently confirmed in several GWASs^{7,10,12,14,15,16}. However, only one GWAS of NSCL/P was conducted in a Chinese population¹⁵ and thus the heritability in the risk of NSOFC remains unexplained in China, especially for the three distinct sub-groups of NSCLP, NSCLO and NSCPO in both Chinese and European populations.

To facilitate the understanding of the genetic architecture and gain a better understanding of the genetic basis underlying the sub-phenotypes of NSOFC, here we perform a NSCLP GWAS using two independent case–control samples from China and replicate interesting markers in a total of 23,463 samples from sub-phenotypes of NSOFC and multiple ethnic groups. We identify 14 new loci and confirm 12 previously reported ones for NSCLP. These susceptibility variants identified in the current study collectively account for 10.94% of the heritability for NSCLP in Chinese population. In addition, evidence of genetic heterogeneity is observed between the three sub-phenotypes of NSOFC and among different populations.

Results

Identification of 26 NSCLP-associated loci

In the discovery stage, we genotyped 900,015 single-nucleotide polymorphisms (SNPs) using the Illumina HumanOmniZhongHua-8 BeadChip in 2,096 cases and 4,051 controls of Chinese ancestry (cohort 1). After quality control, 803,202 SNPs (call rate>95% and minor allele frequency (MAF)>1%) in 2,033 NSCLP cases and 4,051 controls of Chinese ancestry were used in the GWAS discovery analysis (Fig. 1 and Supplementary Table 1). The Manhattan plot of P-values using Cochran–Armitage trend test with adjustment for gender is shown in Supplementary Fig. 1. All cases and controls were assessed by principal components analysis for population stratification and were confirmed to be of Chinese ancestry (Supplementary Fig. 2). Quantile–quantile plots were constructed and genomic control values were calculated (λ_GC=1.04) (Supplementary Fig. 3). Both of these results indicate negligible inflation of the genome-wide association signals caused by population stratification, further suggesting that the deviated tail of the P-values’ distribution reflects some true genetic associations with NSCLP. We then conducted logistic regression analysis to assess the genotype–phenotype association.

To perform a fast-track replication study, we selected and genotyped 152 SNPs (P<1 × 10⁻⁴) within 79 loci for a follow-up analysis in an additional 1,346 NSCLP cases and 4,542 controls of Chinese ancestry (cohort 2). Of the 146 successfully genotyped SNPs, 64 showed nominal association (P<0.05 using logistic regression) in the validation stage and 61 of them showed a consistent direction in their estimated effects on risk between the discovery (cohort 1) and validation (cohort 2) stages (Supplementary Table 2). A fixed-effects meta-analysis of the combined cohorts 1 and 2, totalling 3,379 NSCLP cases and 8,593 controls, identified 14 new loci (20 SNPs) (P<5.00 × 10⁻⁸ using Cochran–Mantel–Haenszel test), namely 2p25.1, 4p16.2, 4q28.1, 5p12, 6p24.3, 8p11.23, 8q22.1, 9q22.32, 12q13.13, 12q13.2, 12q21.1, 14q22.1, 14q32.13 and 17q21.32, and three suggestive loci 2q35, 8q22.2 and 20q13.2 (Table 1, Fig. 2 and Supplementary Table 3). We also confirmed 12 reported loci (21 SNPs): 1p22.1, 1q32.2, 2p24.2, 8q21.3, 8q24.21, 9q22.2, 10q25.3, 13q31.1, 16p13.3, 17p13.1, 17q22 and 20q12 (P<5.00 × 10⁻⁸) (Fig. 2 and Supplementary Table 4). All these 26 susceptibility loci collectively account for 10.94% of the NSCLP heritability. In addition, conditional analyses were performed for all 26 loci and we identified a secondary signal in one previously reported locus at 1q32.2 (Supplementary Table 5). After reviewing the published GWASs of NSCL/P and the present study, we summarize the susceptibility loci identified in different populations in Supplementary Fig. 4.

Table 1 Association results for markers from the 14 novel NSCLP risk loci at each stage.

Full size table

**Figure 2: Manhattan plot of the association evidence of the 26 NSCLP risk loci in the discovery stage.**

Replications of the 26 loci in sub-phenotype groups of NSOFC

We successfully genotyped 40 of the 41 SNPs (1 SNP, rs481931 at 1p22.1, was unsuccessfully genotyped) from the 26 loci in cohort 3 (NSCLO) and cohort 4 (NSCPO). Two novel (14q32.13 and 17q21.32) and eight reported loci (1p22.1, 1q32.2, 2p24.2, 8q21.3, 9q22.2, 10q25.3, 17p13.1 and 20q12) showed significant associations (P_Bonferroni<1.25 × 10⁻³ using logistic regression test and Bonferroni correction; 0.05 out of 40) with NSCLO (Supplementary Table 6). All the associated SNPs from the above ten loci have concordant associations in the effect sizes and direction in both NSCLP and NSCLO (Supplementary Tables 3, 4 and 6). Two loci (13q31.1 and 15q13.3) were reported to be more strongly associated with NSCLP than NSCLO^7,8,9. We also found rs9545308 at 13q31.1 to be significantly associated with NSCLP (P_{NSCLP meta}=2.00 × 10⁻⁹, odds ratio (OR)=1.29) but not with NSCLO (P_NSCLO=4.95 × 10⁻³, OR=1.23) in our Chinese samples. The marker at 15q13.3 was not successfully replicated in NSCLP and thus was not chosen to be replicated in NSCLO in our study.

One novel (9q22.32) and two reported loci (1q32.2 and 8q24.21) showed significant associations with NSCPO (Supplementary Table 7). The marker in 1q32.2 showed opposite directions of association between the NSCLP and NSCPO groups (rs9430019; OR_NSCLP=1.25 and OR_NSCLO=0.66), whereas the markers in the 8q24.21 and 9q22.32 loci were concordant in the estimated direction of association with NSCLP (Supplementary Tables 3, 4 and 7). It is worth mentioning that the recent GWAS¹⁷ and sequencing study¹⁸ revealed an aetiological missense variant in GRHL3 (1p36.11) for NSCPO. The additional locus 9q22.33 (FOXE1) was identified potentially accounting for linkage to both NSCL/P and NSCPO¹⁹. The markers at 1p36.11 and 9q22.33 were not significant at the GWAS stage in our study and thus were not replicated in NSCLP and NSCPO.

Tests for heterogeneity showed that the SNPs at 1, 8 and 5 loci yielded significant evidence of heterogeneity (P<1.25 × 10⁻³ using logistic regression test and Bonferroni correction; 0.05 out of 40) between NSCLO and NSCLP, NSCPO and NSCLP, and NSCPO and NSCLO, respectively (Table 2 and Table 3). Interestingly, gender stratification analysis revealed that one previously identified locus (1q32.2) showed strong evidence of heterogeneity (P=1.38 × 10⁻⁴) in the evidence of association with NSCLP from male and female cases. The marker on 8q21.3 was observed to exhibit significant evidence of heterogeneity in its estimated effect between older mothers (>35 years) and the reference age of mothers (25–35 years) (Supplementary Table 8).

Table 2 SNPs showing significance in stratified analysis among the three anatomical types of orofacial clefts: NSCLP, NSCLO and NSCPO in Chinese population.

Full size table

Table 3 Markers achieving genome-wide significance in GWAS of 3,379 NSCLP cases and 8,593 controls of Chinese Han and prioritized genes in each significant SNP.

Full size table

Replications of 16 NSCLP loci in multi-ethnic groups

We further checked for associations of the 14 novel loci and two recently reported NSCL/P loci (16p13.3 in China¹⁵ and 2p24.2 in a multi-ethnic study¹⁶) using cohorts 5–7. Different loci showed evidence of association in different cohorts (P<0.05 using logistic regression test), specifically three loci in Central Europeans, seven loci in Asians and five loci in European ancestry (Table 1 and Supplementary Table 9). For the majority of the 16 loci mentioned, the direction and magnitude of the effect of ORs were consistent across Chinese and non-Chinese samples. However, we observed an apparent difference in risk allele frequencies (AFs) for most of these 16 risk loci. For example, AFs in the cases of the markers at 4p16.2 (rs1907989, AF_Chinese=0.46, AF_European=0.57) and 17q21.32 (rs1838105, AF_Chinese=0.45, AF_European=0.39) showed a certain degree of difference between the Chinese and Central European populations (Table 1), whereas the AFs of the markers at 8q24.21 (rs987525, AF_Chinese=0.07, AF_European=0.38 and rs7017252, AF_Chinese=0.08, AF_European=0.55) were highly different between the Chinese and European populations (Supplementary Table 10).

Biological implications analyses for the 26 NSCLP loci

Of the 135 SNPs associated with the risk of NSCLP at these 26 loci (r²≥0.7 with the index SNPs), 33, 99 and 113, respectively, were found in known or predicted regulatory elements such as promoters, enhancers or motifs biochemically characterized to regulate transcription for 50 reference genes (Supplementary Data 1). Expression profiles of the 49 genes within 500 kb and showing strong linkage disequilibrium (LD) with index SNPs (r²≥0.7) in the 26 loci based on craniofacial gene expression patterns in mouse (EMAGE database) showed that a total of these 20 genes were related to embryonic development, among which 12 were expressed in the related tissue for OFC (Supplementary Table 11). Twelve genes were found to produce CL/CP malformations in mutant mouse models (Supplementary Table 12). Seven genes were reported to be associated with nine recognized malformation syndromes including OFC as a clinical phenotype (Supplementary Table 13). By manually reviewing the literature related to these 49 genes and their functions, annotated systematically using several databases, 28 of these genes were classified into different biological categories according to morphogenesis, development, molecular and cellular function, and 17 of these categories contained 3 or more genes. The most relevant catalogue was the morphogenesis- and development-related traits, which included the bone, limb, brain, ear and other organs. Several important signalling pathways were also reviewed, among which epidermis/epithelium development and morphogenesis, and the fibroblast growth factor receptor (FGFR) signalling pathway involved the most genes, whereas the bone morphogenic proteins (BMP), WNT and Notch signalling pathways have also been reported to be associated with lip and palate development (Supplementary Table 14).

Discussion

In this study, we identified 14 novel risk loci associated with NSCLP and confirmed 12 previously reported loci in the Chinese population. Several lines of evidence support the hypothesis that these newly identified loci contribute to orofacial clefting. At 4p16.2, two markers in moderate LD with one another (r²=0.65) are located 50 kb upstream of MSX1 (Supplementary Fig. 5b). Mutations in MSX1 contribute to CL/P and tooth agenesis in humans²⁰. Msx1⁻ homozygotic mice exhibit CP, deficiency of the alveolar mandible and maxilla, and failure of tooth development²¹. The orthologous Spry1 of the optimized gene SPRY1 at 4q28.1 has been shown to exhibit facial clefting and CP in transgenic mice²². The marker at 5p12 is adjacent to FGF10, whose encoded protein has been reported to play roles in the epithelial–mesenchymal transition, and Fgf10 knockout mice show the symptoms of CP²³. The 6p24.3 region harbours two candidate genes for the OFC: OFCC1 was reported as a potential OFC susceptibility gene in humans, based on the observation of three unrelated OFC patients all harbouring a chromosomal break within or close to the gene²⁴. TFAP2A mutations have been found in patients diagnosed with branchio-oculo-facial syndrome, which is characterized by branchial defects, ocular anomalies and facial defects, including CL/P²⁵.

The SNP at 8p11.23 is located in an enhancer within FGFR1. In humans, mutations of FGFR1 have been found in syndromic CL/P²⁶, whereas in mice a homozygous hypomorphic allele at Fgfr1 caused CP²⁷. A marker at 9q22.32 is located at the intronic region of PTCH1, mutations of which have been identified in patients of Gorlin syndrome with CL/P as one of its clinical manifestations²⁸. Ptch1^DL mouse models showed craniofacial defects, including underdeveloped palatal shelves and clefting of the secondary palate²⁹. The marker rs705704 is located at the locus 12q13.2, which includes the gene RPS26, which may be relevant to orofacial clefting, as mutations of RPS26 have been reported in Diamond–Blackfan anaemia (DBA) with CPO as one of its clinical phenotypes³⁰. Two highly correlated markers (r²=0.99) at 14q32.13 are located between GSC and DICER1. GSC modulates the epithelial–mesenchymal transition and mutations in GSC lead to a syndrome defined by short stature, auditory canal atresia, mandibular hypoplasia and skeletal abnormalities³¹, whereas DICER1 mutations have been reported to cause pleuropulmonary blastoma and multinodular goiter-1, with or without Sertoli–Leydig cell tumours (MNG1)³², and Dicer1 conditional knockout mice exhibit secondary palate clefting and other severe craniofacial dysmorphisms³³. The excellent candidate gene WNT9B at 17q21.32 has already been functionally implicated in craniofacial development, as mice with Wnt9b targeted mutation were described as presenting CL/P phenotypes³⁴.

Among the other new signals, two markers at 2p25.1 are in perfect LD (r²=1) with one another and are located 12 kb upstream of TAF1B, encoded protein of which is important for polymerase (Pol) I transcription³⁵. At 8q22.1, a synonymous codon SNP rs957448 (KIAA1429) is correlated (r²=0.65) with rs12681366 (an intronic SNP of RAD54B). Human RAD54B was first identified as a homologue of RAD54, which plays an important role in DNA repair³⁶. The strongest associated marker at 12q13.13 is located 500 bp downstream of KRT18, which encodes a protein in the large family of cytoskeletal proteins with specific expression in epithelial cells³⁷. At 12q21.1, the signals are near the TMEM19 gene, involving the SNPs rs2304269 and rs7967428, which are in strong LD with each other (r²=0.98). Rs2304269 and rs7967428 are respectively located at one active promoter and five strong enhancers in epidermal keratinocytes according to ENCODE data. The 14q22.1 signal is close to LINC00640, a gene of unknown function. In addition, to gain further insight into the possible involvement of genes at some identified loci in the development of NSCLP, immunohistochemistry (IHC) analysis performed in mice at different embryonic stages found positive IHC staining of three genes of interest (Rad54b, Rps26 and Fam49a) in the palatal mesenchymal cells and epithelium cells (Supplementary Fig. 6).

Notably, in our study, two members of the FGF signalling pathway, including FGF10 at 5p12 and FGFR1 at 8p11.23, as well as three FGF signalling regulatory genes (SPRY1 (ref. 38) at 4q28.1, PTCH1 (ref. 39) at 9q22.32 and WNT9B (ref. 40) at 17q21.32) were found to be associated with the risk of NSCLP. We performed a network analysis of notable genes in the 26 NSCLP associated loci, which showed that several FGF signalling related genes such as FGFR1 and FGF10 are connected (Fig. 3). The FGF signalling pathway was proposed to contribute to NSCL/P⁴¹ and previous candidate gene studies have provided evidence in humans and animal models^41,42. The findings of our association study strengthen the hypothesis that the FGF signalling pathway might play important roles in craniofacial development. Intriguingly, we also found a potential link between ribosomopathies and the genes in our NSCLP-associated loci, including RPS26, RAD54B and TAF1B. Mutations in RPS26 were reported to affect the functions of the proteins in ribosomal RNA processing in DBA patients and DBAs belong to a class of diseases called ribosomopathies^30,43. Moreover, RPS26 and RAD54B were reported to regulate p53 (refs 44, 45) and the p53 pathway is importantly involved in ribosome biogenesis⁴³. In addition, TAF1B was reported as a component of RNA Pol I basal transcription factor, which is essential for Pol I recruitment to the ribosomal RNA gene promoter³⁵.

**Figure 3: Functional similarity network of genes in the 26 NSCLP associated loci.**

For the 12 significant associated loci in the study that had been previously reported, the strongest signals occurred for 2 SNPs in near-perfect LD (r²=0.92) in 2p24.2, located in the 3′-untranslated region of FAM49A. It is worth mentioning that both the LD block and ±500 kb on either side of the index SNPs in this region only contain the single gene FAM49A, although a few non-coding RNA genes are located in this region. FAM49A is a protein-coding gene whose paralogue, FAM49B, is located in a previously reported susceptibility locus near the gene desert region of 8q24, which shows a strong association with the risk of NSCL/P in European populations^10,46,47. ENCODE data indicate that SNP rs7552 alters the regulatory motifs of TBX5 and BRCA1, and the highly correlated SNP rs4832651 (r²=0.98) lies within a conserved enhancer for mammary epithelial cell activity. Although Myc-oncogene has been reported as the probable target effect gene in the 8q24 region for NSCL/P⁴⁷, the functions of both FAM49A and FAM49B remain poorly defined. These genes might play a role in the aetiology of NSCL/P and whether their functions vary across different populations is clearly worth further investigation. In addition, as expected, the second strongest signals were near IRF6 at 1q32.2 and this association signal has been independently replicated in numerous GWAS studies and candidate gene studies^{2,6,10,12,14,15,16}. Of the remaining ten loci, 1p36.13 and3q12.1 approached genome-wide significance, 15q22.2 showed suggestive evidence of association and the additional seven loci were only analysed in the NSCLP GWAS stage (Supplementary Table 4).

Comparisons of NSCLP, NSCLO and NSCPO have yielded clear evidence of genetic heterogeneity among the three sub-groups of NSOFC. The two sub-groups (NSCLP and NSCLO) generally grouped together appeared to share more genetic risk factors, which is consistent with previous findings^4,6,48, and these results argue for distinct origins of development of the lip and primary palate versus the secondary palate^1,49. In addition, although 1p36.11 and 9q22.33 were not confirmed in NSCPO in our study, 1q32.2, 8q24.21 and 9q22.32 were first demonstrated to have an effect on NSCPO in the Chinese population. Importantly, our study provided evidence that 1q32.2 exhibits an overlapping effect on all three sub-phenotypes of NSCLP, NSCLO and NSCPO. The evidence of association at 1q32.2 was stronger among males than females, which may reflect the higher prevalence rate of NSCLP among males (male:female=2.6:1 in our study). Stratification of the results by maternal gestational age revealed that older mothers may have a higher risk of having a child with NSCLP, as suggested by some previous studies of congenital disorders such as NSOFC^50,51.

Plausible reasons for the failure to replicate all of the associated loci in different ethnic groups could be due to the limited sample size, the combined sub-groups of NSCL/P used in previous analyses, the differential tagging of unobserved causal variants across ethnic groups or the existence of true genetic heterogeneity across ethnic groups. Further studies using larger sample sizes or analytical approaches, such as a trans-ethnic genome-wide meta-analysis approach⁵² with more detailed classification of sub-phenotypes, are warranted to further investigate this hypothesis.

Of the 26 genetic risk factors, 19 had reported associations with a total of 34 other diseases/traits. These associations could mainly be categorized into six different groups, including developmental, immune, metabolic, neoplastic, endocrine and degenerative categories (Supplementary Data 2). To further assess the possible independence among these various birth defects/diseases/traits within these particular SNPs, we examined LD patterns between these SNPs in Asian, African and European populations using data from the 1,000 Genomes Project. As a result, three susceptibility loci were identified to be shared by NSCLP and other diseases/traits, including schizophrenia at 8p11.23, asthma, polycystic ovary syndrome, rheumatoid arthritis, vitiligo, type 1 diabetes autoantibodies and alopecia at 12q13.2, and height at 9q22.32. The SNPs reported to be associated with NSCLP and other diseases/traits were in strong LD (r²≥0.7), which showed significant and non-independent association of the risk of NSCLP and other diseases/traits. Interestingly, NSCLP and adult height shared the same index SNP (rs10512248 at PTCH1 in 9q22.32), suggesting that some shared genetic factors might underlie these two very distinct phenotypes. Furthermore, some reported GWAS loci had susceptibility genes shared between NSCLP and other diseases/traits, such as MAFB at 20q12 for Dupuytren’s disease, low-density lipoprotein cholesterol and total cholesterol, which showed a clearly independent association with NSCLP index SNPs and suggested pleiotropic effects of these genes on other diseases/traits.

Overall, our current study has advanced the understanding of the genetic architecture controlling the risk of NSOFC by substantially increasing the number of genetic risk factors and has highlighted potential candidate genes through subsequent genetic and biological analyses. This study has also provided further insight into the possible pleiotropic effects of genetic risk factors on different sub-phenotypes, in different populations and among different diseases/traits. Through a comprehensive analysis of cases and controls from a Chinese population, we have identified 14 new genetic risk factors and validated associations in a large majority of previously reported loci. Further sequencing and functional investigations will probably identify causal mutational events and true susceptibility genes in or near these tagging SNPs and further elucidate the disease pathogenesis of these common congenital birth defects.

Methods

Samples

In the current study, we carried out a two-stage GWAS and further replications of NSOFC. The discovery stage included 2,096 NSCLP cases and 4,051 controls (cohort 1). Replication studies were performed in an additional 1,346 unrelated NSCLP cases and 4,542 controls (cohort 2). Further replications consisted of cohort 3 (1,104 NSCLO cases versus 3,312 controls), cohort 4 (1,104 NSCPO cases versus 3,312 controls shared with cohort 3), cohort 5 (399 NSCL/P cases versus 1,318 controls), cohort 6 (861 NSCL/P case–parent trios) and cohort 7 (557 NSCL/P case–parent trios). Samples of cohorts 1–4 were recruited from the Chinese population through collaboration with multiple hospitals in Hubei, Henan and Anhui province. All cases were interviewed and clinically assessed by at least two experienced clinicians, and a detailed questionnaire was completed to identify any further anomalies, such as congenital heart disease, hypospadias, accessory auricle, lip pits and so on, which would suggest an underlying syndrome. We collected clinical information from the subjects through a full clinical checkup and additional demographic information from the cases was obtained through a structured questionnaire that mainly included four parts: basic information, clinical feature, maternal situation and life style during the first trimester of pregnancy, and genetic background of the patients. All controls were healthy individuals without OFC or family history of OFC (including first-, second- and third-degree relatives). Peripheral blood samples were collected after the written informed consents were obtained from all the participants or their guardians. The study was approved by the institutional ethics committee of each hospital (Hospital of Stomatology Wuhan University, The Second Charity Hospital of Henan Province, Stomatological Hospital of Nanyang, Stomatological Hospital of Xiangyang and The First Affiliated Hospital of Anhui Medical University) and was conducted according to the Declaration of Helsinki principles. The replication data in cohort 5 from the GWAS in Central Europeans was provided by Mangold et al.¹³, whereas the replication data in cohorts 6 and 7 were from the GWAS of case–parent trios of Asians and European ancestry provided by Beaty et al.¹². All the controls and cases for each replication cohort were sampled from the same locality and the same population in each study, to assure minimal population stratification effects for each replication.

DNA extraction

Approximately 4 ml EDTA anticoagulated venous blood sample was collected from each participant. Genomic DNAs of the cases were extracted from peripheral blood lymphocytes using the standard SDS–proteinase K-phenol/chloroform method. For the controls, DNAs were isolated by standard procedures using Flexi Gene DNA kits (Qiagen) according to the manufacturer’s protocol. After quality control, DNAs were diluted to working concentrations of 45–55 ng μl⁻¹ for genome-wide genotyping and 20–30 ng μl⁻¹ for the validation studies, respectively.

Genotyping and quality controls in GWAS

The discovery-stage genotyping was conducted according to the Infinium HD protocol using the Illumina HumanOmniZhongHua-8 v1.1 BeadChip (Illumina, San Diego, CA, USA) at the Key Laboratory of Dermatology at Anhui Medical University (Ministry of Education), Hefei, Anhui, China. Genotyping was performed as described in the Infinium HD protocol from Illumina⁵³.

In the GWAS stage, a total of 900,015 SNPs were genotyped in 2,096 cases and 4,051 controls. A standard quality-control criterion was applied to select SNPs and samples for further analysis. SNPs were excluded if they had (i) a call rate <95% in cases or controls; (ii) an MAF of <1% in the population; or (iii) significant deviation from Hardy–Weinberg equilibrium (HWE) in the controls (P≤10⁻⁴). In addition, all the SNPs on the X, Y and mitochondrial chromosomes, as well as the copy number variation-related SNPs and probes, were excluded from statistical analysis. Meanwhile, samples were removed if they (i) had an overall genotyping rate of <98%; (ii) were duplicates or showed familial relationships based on pairwise identity by state using PLINK 1.07 (ref. 54), the sample with higher call rate was left between the related samples (PI_HAT>0.025); (iii) showed inconsistent genetic gender with epidemiological or clinical data; (iv) and were ancestry outliers or heterozygosity outliers. Samples were assessed for population stratification using the software package EIGENSTRAT⁵⁵. The original script from EIGENSTRAT was modified to extract the principal components for plotting. In total, 63 samples were removed from analysis. After quality control, the genotype data of 803,202 autosomal SNPs in 2,033 cases and 4,051 controls remained for further analysis.

SNP selection for replication studies

SNPs were selected for NSCLP replication according to the following steps: (i) we first picked out all the top SNPs with P<1.0 × 10⁻⁴ in the initial stage and excluded the SNPs with ambiguous genotype scatter plots; (ii) then we selected at least one SNP with the lowest P-values in each of the novel loci, which defined by using the PLINK option ‘-indep-pairwise 50 5 0.2’; (iii) in addition, one to four top SNPs were chosen from the previously reported loci; (iv) we also selected SNPs that were located within or close to the susceptibility genes with gene expression profiling evidence for OFC or for syndromes with OFC symptoms. In total, 152 SNPs were selected for the NSCLP replication stage. Furthermore, all the promising SNPs were selected for the NSCLO and NSCPO replications. These SNPs had the lowest P-values in NSCLP-meta stage and showed P<0.05, as well as with call rate >90%, MAF>0.01 and P_HWE>10⁻⁴ in NSCLP replication stage; thus, 41 SNPs reached genome-wide significance. The above 41 SNPs were distributed in 12 previously reported NSCL/P associated loci and 14 novel loci. In cohort 5–7 replications, 24 SNPs were selected for replication in 3 GWAS data sets from Central Europeans, Asians and European ancestry groups. Of the 24 SNPs, 20 were picked out from the 14 novel loci and 4 were from 2 newly reported NSCL/P loci (16p13.3 (ref. 15) and 2p24.2 (ref. 16)) and all of them were from the 41 significant SNPs.

Genotyping and quality control in replication studies

Genotyping analyses of replications in cohorts 2–4 were conducted by using the Sequenom MassARRAY system, at the Key Laboratory of Dermatology at Anhui Medical University (Ministry of Education), Hefei, Anhui, China. Locus-specific PCR primers (Supplementary Table 15) were designed using MassARRAY Assay Design 3.0 software, following the manufacturer’s instructions (Sequenom)⁵³. Quality control was performed in each data set separately using PLINK 1.07. In each case–control replications (cohorts 2–4), we excluded SNPs with a call rate <90% in cases or controls, or deviation from HWE proportions (P≤1 × 10⁻⁴) in the controls.

To evaluate the quality of the genotype data for the validation analyses, 100 randomly selected samples from the GWAS stage were re-genotyped using the Sequenom system. The concordance rate between the genotypes from the Illumina HumanOmniZhongHua-8 v1.1 BeadChip and the Sequenom MassARRAY assay analyses was >99%. The cluster plots from the Illumina and Sequenom analyses were checked to confirm their good quality. After quality control, 146 SNPs were remained for NSCLP replication and 40 SNPs were left for further replications in cohort 3 and 4 analyses, respectively.

Statistical analyses

In the GWAS stage, we examined potential genetic relatedness based on pairwise identity by state for all of the successfully genotyped samples using the PLINK 1.07 software. For the duplicated samples and all pairs of first-, second- and third-degree relatives detected, the subject from each pair with the lower call rate was removed from further analysis. All cases and controls were assessed by principal components analysis for population stratification and were confirmed to be of Chinese ancestry. Quantile–quantile plots were constructed and calculations of genomic control values (λ_GC=1.04 indicated a negligible inflation of the genome-wide statistical results due to population stratification) were performed by using the software R (http://www.r-project.org/) to evaluate the overall significance of genome-wide association results and the potential impact of population stratification, respectively, in the discovery stage.

Association of GWAS and replication analysis were performed using the Cochran–Armitage trend test. Single-marker association analyses were performed to test for disease–SNP associations using logistic regression in each stage. Fixed effects meta-analyses of cohorts 1 and 2 (NSCLP combine) were performed using the Cochran–Mantel–Haenszel test, where P-values and heterogeneity index Q-values from Cochran’s Q statistics were also obtained. Assessment of heterogeneity across studies was carried out by evaluating the P_het values from Cochran’s Q statistics (Bonferroni-corrected heterogeneity Q-values P_het of<0.05 were considered significant)^52,56. OR values were measured as OR per allele and presented for the minor allele of a SNP, unless otherwise stated. A threshold of P<5 × 10⁻⁸ was adopted to define novel loci with genome-wide significance. The regional association plots for each susceptibility locus were generated in R using information from the HapMap project (CHB and JPT samples). After applying quality control and removing those SNPs with MAF<1%, HWE<0.0001 and call rate <95% from GWAS data set, 2,033 cases and 4,051 controls with 803,202 SNPs were used for the disease variation assessment in the genome-wide level. Furthermore, the samples passed quality control from the discovery and NSCLP replication (3,379 cases and 8,593 controls) with the 41 markers attaining genome-wide significance (P<5 × 10⁻⁸) were used for disease variation estimating of the 26 NSCLP risk loci. The proportion of variance in NSCLP risk was examined via the residual maximum likelihood method in the program genome-wide complex trait analysis and estimated assuming a disease prevalence of 0.001 (1 out of 1,000) and log additive risk^52,57. All power calculations were performed using the genetic power calculator assuming a disease prevalence of 0.001 and log-additive risk. We carried out conditional analyses to identify additional association signals after accounting for the effects of known and newly discovered susceptibility loci. To investigate more than two association signals per locus, we used a stepwise procedure in which additional SNPs were added to the model according to their conditional P-values, as programmed in EMMAX. We estimated the LD metrics r² and D’ using 6,084 individuals from METSIM, who passed genotyping quality control. To replicate associations of the 24 SNPs in different ethnicities, GWAS data from three previously published NSCL/P populations (Central Europeans, Asians and European ancestry groups) were extracted. Replication in the Central European NSCL/P samples was based on a data set published in Mangold et al.¹³ SNPs that had not been genotyped in this study were imputed using IMPUTE2 software⁷. Genotype imputation for the case–parent trios described in Beaty et al.¹² was run by the GENEVA Coordinating Center⁵⁸, using a worldwide 1,000 Genomes Project reference panel and the IMPUTE2 software in 2012. Imputed genotypes and accompanying marker annotation and quality metrics files are available through the authorized access portion of the dbGaP posting.

Stratification analyses

Genotype–phenotype stratification analyses were conducted by using PLINK 1.07 software for the 41 significant associated markers in NSCLP-meta stage. Genotype data were extracted from GWAS and NSCLP replication stages. Then, we performed stratification analyses on gender and maternal gestational age in NSCLP. P-value below 1.22 × 10⁻³ using logistic regression (0.05 out of 41, Bonferroni correction) was considered to be statistically significant.

Heterogeneity analyses among NSCLP, NSCLO and NSCPO were performed by using PLINK 1.07 software based on the 40 significant associated markers in NSCLP meta-stage. Genotype data were extracted from discovery and replication stages of NSCLP, NSCLO and NSCPO (cohorts 1–4). We first divided the cases into three sub-phenotypes NSCLP, NSCLO and NSCPO, then extracted genotype of each case from the above four cohorts and calculated the association between each combination of two sub-phenotypes. P-value below 1.25 × 10⁻³ using logistic regression (0.05 out of 40, Bonferroni correction) was considered to be statistically significant.

Locus annotation and candidate gene prioritization

To prioritize candidate genes, besides the nearest genes to the index SNPs, the following methods were used to help prioritize potential causal genes in each associated region. All genes located in the same LD block as the index SNPs (r²≥0.7) were selected⁵² and annotated for function in molecular, cellular, animal model and tissue/organ levels using several databases, including PubMed (http://www.ncbi.nlm.nih.gov/pubmed/), EMAGE (http://www.emouseatlas.org/emage/home.php), MGI (http://www.informatics.jax.org/), OMIM (http:/ www.omim.org/), Gene (http://www.ncbi.nlm.nih.gov/gene/), UCSC (http://genome.ucsc.edu/) and Ensembl (http://www.ensembl.org/index.html). The nearest genes on both sides of the index SNP were annotated when no gene was located within the LD block. A total of 135 SNPs at these 26 NSCLP risk loci (all with r²≥0.7 with the SNPs found to be genome-wide significant here) with MAF>0.05 and P_HWE>1 × 10⁻⁴ were annotated by using the following methods: regulatory features from ENCODE Consortium/ENCODE/Roadmap Epigenomics Project (http://www.roadmapepigenomics.org/)^59,60.

Network analysis

We expanded the global network by including the Human Net protein interaction database⁶¹ and literature-curated interactions from STRING^62,63 to derive an expanded global network based on known protein–protein interactions using the notable genes of the 26 NSCLP associated loci from the present study (Fig. 3).

GWAS catalogue reviews

We evaluated all the SNPs within ±500 kb of the index SNPs (from the 26 loci) and with P<5 × 10⁻⁸ recorded in National Human Genome Research Institute GWAS catalogue (http://www.genome.gov/gwastudies) updated on 20 February 2015. The LD patterns of the index SNPs and the recorded SNPs in GWAS catalogue were inquired using SNAP version 2.2 in Asian (CHB+JPT) and European (CEU) populations using data from the 1,000 Genomes Project Pilot 1.

Expression studies in the mouse

Eight- to 14-week-old wild-type Kunming mice were housed in approved specific pathogen-free conditions and mated for 12 h, the presence of a vaginal plug was designated as E0.5. Pregnant mice were randomly divided into four groups and killed at embryonic stages E13.5–E16.5, respectively. Embryos with death or other malformations were ruled out. Normal fetuses were harvested and fixed in 4% paraformaldehyde overnight at 4 °C for IHC. The 4 μm paraffin sections were deparaffinized, rehydrated and subjected to antigen retrieval with high pressure method. A mixture of 30% H₂O₂ and methanol (1/9, v/v) was performed to inhibit endogenous peroxidase activity. The rabbit polyclonal antibodies to Fam49a (LS-C167900, LSbio; 1:100 dilution), Rad54b (orb100108, Biorbyt; 1:200 dilution), Rps26 (14909-1-AP, Proteintech; 1:800 dilution), Taf1b (12818-1-AP, Proteintech; 1:600 dilution) and Thap2 (orb186252, Biorbyt; 1:200 dilution) were incubated with the sections at 4 °C overnight, respectively, and were then detected with the Rabbit SP kit (SP9001, Zhongshan Golden Bridge Biotech). The sections were then counterstained with haematoxylin. The results were assessed by an investigator who was blinded to the group allocation. All experimental procedures were carried out in accordance with the Institutional Animal Care and Use Committee of the Laboratory Animal Center of Wuhan University, China. The study was approved by the Ethics Committee, School and Hospital of Stomatology of Wuhan University, China.

Data availability

The data that support the findings of this study are available from the corresponding author on request.

Additional information

How to cite this article: Yu, Y. et al. Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity. Nat. Commun. 8, 14364 doi: 10.1038/ncomms14364 (2017).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Mossey, P. A., Little, J., Munger, R. G., Dixon, M. J. & Shaw, W. C. Cleft lip and palate. Lancet 374, 1773–1785 (2009).
Article Google Scholar
Dixon, M. J., Marazita, M. L., Beaty, T. H. & Murray, J. C. Cleft lip and palate: understanding genetic and environmental influences. Nat. Rev. Genet. 12, 167–178 (2011).
Article CAS Google Scholar
Fu, X. et al. Loss-of-function mutation in the X-linked TBX22 promoter disrupts an ETS-1 binding site and leads to cleft palate. Hum. Genet. 134, 147–158 (2015).
Article CAS Google Scholar
Harville, E. W., Wilcox, A. J., Lie, R. T., Vindenes, H. & Abyholm, F. Cleft lip and palate versus cleft lip only: are they distinct defects? Am. J. Epidemiol. 162, 448–453 (2005).
Article Google Scholar
Leslie, E. J. & Marazita, M. L. Genetics of cleft lip and cleft palate. Am. J. Med. Genet. C Semin. Med. Genet. 163C, 246–258 (2013).
Article Google Scholar
Rahimov, F. et al. Disruption of an AP-2alpha binding site in an IRF6 enhancer is associated with cleft lip. Nat. Genet. 40, 1341–1347 (2008).
Article CAS Google Scholar
Ludwig, K. U. et al. Genome-wide meta-analyses of nonsyndromic cleft lip with or without cleft palate identify six new risk loci. Nat. Genet. 44, 968–971 (2012).
Article CAS Google Scholar
Jia, Z. et al. Replication of 13q31.1 association in nonsyndromic cleft lip with cleft palate in Europeans. Am. J. Med. Genet. A 167A, 1054–1060 (2015).
Article Google Scholar
Ludwig, K. U. et al. Meta-analysis reveals genome-wide significance at 15q13 for nonsyndromic clefting of both the lip and the palate, and functional analyses implicate GREM1 as a plausible causative gene. PLoS Genet. 12, e1005914 (2016).
Article Google Scholar
Birnbaum, S. et al. Key susceptibility locus for nonsyndromic cleft lip with or without cleft palate on chromosome 8q24. Nat. Genet. 41, 473–477 (2009).
Article CAS Google Scholar
Grant, S. F. et al. A genome-wide association study identifies a locus for nonsyndromic cleft lip with or without cleft palate on 8q24. J. Pediatr. 155, 909–913 (2009).
Article CAS Google Scholar
Beaty, T. H. et al. A genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4. Nat. Genet. 42, 525–529 (2010).
Article CAS Google Scholar
Mangold, E. et al. Genome-wide association study identifies two susceptibility loci for nonsyndromic cleft lip with or without cleft palate. Nat. Genet. 42, 24–26 (2010).
Article CAS Google Scholar
Beaty, T. H. et al. Confirming genes influencing risk to cleft lip with/without cleft palate in a case-parent trio study. Hum. Genet. 132, 771–781 (2013).
Article CAS Google Scholar
Sun, Y. et al. Genome-wide association study identifies a new susceptibility locus for cleft lip with or without a cleft palate. Nat. Commun. 6, 6414 (2015).
Article Google Scholar
Leslie, E. J. et al. A multi-ethnic genome-wide association study identifies novel loci for non-syndromic cleft lip with or without cleft palate on 2p24.2, 17q23 and 19q13. Hum. Mol. Genet. 25, 2862–2872 (2016).
CAS PubMed PubMed Central Google Scholar
Leslie, E. J. et al. A genome-wide association study of nonsyndromic cleft palate identifies an etiologic missense variant in GRHL3. Am. J. Hum. Genet. 98, 744–754 (2016).
Article CAS Google Scholar
Mangold, E. et al. Sequencing the GRHL3 coding region reveals rare truncating mutations and a common susceptibility variant for nonsyndromic cleft Palate. Am. J. Hum. Genet. 98, 755–762 (2016).
Article CAS Google Scholar
Moreno, L. M. et al. FOXE1 association with both isolated cleft lip with or without cleft palate, and isolated cleft palate. Hum. Mol. Genet. 18, 4879–4896 (2009).
Article CAS Google Scholar
Wattanarat, O. & Kantaputra, P. N. Preaxial polydactyly associated with a MSX1 mutation and report of two novel mutations. Am. J. Med. Genet. A 170, 254–259 (2016).
Article CAS Google Scholar
Satokata, I. & Maas, R. Msx1 deficient mice exhibit cleft palate and abnormalities of craniofacial and tooth development. Nat. Genet. 6, 348–356 (1994).
Article CAS Google Scholar
Yang, X. et al. Conditional expression of Spry1 in neural crest causes craniofacial and cardiac defects. BMC Dev. Biol. 10, 48 (2010).
Article Google Scholar
Rice, R. et al. Disruption of Fgf10/Fgfr2b-coordinated epithelial-mesenchymal interactions causes cleft palate. J. Clin. Invest. 113, 1692–1700 (2004).
Article CAS Google Scholar
Davies, S. J. et al. Mapping of three translocation breakpoints associated with orofacial clefting within 6p24 and identification of new transcripts within the region. Cytogenet. Genome Res. 105, 47–53 (2004).
Article CAS Google Scholar
Gunes, N. et al. Branchio-oculo-facial syndrome in a newborn caused by a novel TFAP2A mutation. Genet. Couns. 25, 41–47 (2014).
CAS PubMed Google Scholar
Dode, C. et al. Loss-of-function mutations in FGFR1 cause autosomal dominant Kallmann syndrome. Nat. Genet. 33, 463–465 (2003).
Article CAS Google Scholar
Trokovic, N., Trokovic, R., Mai, P. & Partanen, J. Fgfr1 regulates patterning of the pharyngeal region. Genes Dev. 17, 141–153 (2003).
Article CAS Google Scholar
Lo Muzio, L. Nevoid basal cell carcinoma syndrome (Gorlin syndrome). Orphanet. J. Rare Dis. 3, 32 (2008).
Article Google Scholar
Feng, W., Choi, I., Clouthier, D. E., Niswander, L. & Williams, T. The Ptch1(DL) mouse: a new model to study lambdoid craniosynostosis and basal cell nevus syndrome-associated skeletal defects. Genesis 51, 677–689 (2013).
Article CAS Google Scholar
Gripp, K. W. et al. Diamond-Blackfan anemia with mandibulofacial dystostosis is heterogeneous, including the novel DBA genes TSR2 and RPS28. Am. J. Med. Genet. A 164A, 2240–2249 (2014).
Article Google Scholar
Parry, D. A. et al. SAMS, a syndrome of short stature, auditory-canal atresia, mandibular hypoplasia, and skeletal abnormalities is a unique neurocristopathy caused by mutations in Goosecoid. Am. J. Hum. Genet. 93, 1135–1142 (2013).
Article CAS Google Scholar
Rio Frio, T. et al. DICER1 mutations in familial multinodular goiter with and without ovarian Sertoli-Leydig cell tumors. JAMA 305, 68–77 (2011).
Article CAS Google Scholar
Barritt, L. C. et al. Conditional deletion of the human ortholog gene Dicer1 in Pax2-Cre expression domain impairs orofacial development. Indian J. Hum. Genet. 18, 310–319 (2012).
Article CAS Google Scholar
Juriloff, D. M., Harris, M. J., McMahon, A. P., Carroll, T. J. & Lidral, A. C. Wnt9b is the mutated gene involved in multifactorial nonsyndromic cleft lip with or without cleft palate in A/WySn mice, as confirmed by a genetic complementation test. Birth Defects Res. A Clin. Mol. Teratol. 76, 574–579 (2006).
Article CAS Google Scholar
Naidu, S., Friedrich, J. K., Russell, J. & Zomerdijk, J. C. TAF1B is a TFIIB-like component of the basal transcription machinery for RNA polymerase I. Science 333, 1640–1642 (2011).
Article ADS CAS Google Scholar
Sarai, N. et al. Biochemical analysis of the N-terminal domain of human RAD54B. Nucleic Acids Res. 36, 5441–5450 (2008).
Article CAS Google Scholar
Fortier, A. M., Asselin, E. & Cadrin, M. Keratin 8 and 18 loss in epithelial cancer cells increases collective cell migration and cisplatin sensitivity through claudin1 up-regulation. J. Biol. Chem. 288, 11555–11571 (2013).
Article CAS Google Scholar
Kramer, S., Okabe, M., Hacohen, N., Krasnow, M. A. & Hiromi, Y. Sprouty: a common antagonist of FGF and EGF signaling pathways in Drosophila. Development 126, 2515–2525 (1999).
CAS PubMed Google Scholar
Metzis, V. et al. Patched1 is required in neural crest cells for the prevention of orofacial clefts. Hum. Mol. Genet. 22, 5026–5035 (2013).
Article CAS Google Scholar
Jin, Y. R., Han, X. H., Taketo, M. M. & Yoon, J. K. Wnt9b-dependent FGF signaling is crucial for outgrowth of the nasal and maxillary processes during upper jaw and lip development. Development 139, 1821–1830 (2012).
Article CAS Google Scholar
Riley, B. M. et al. Impaired FGF signaling contributes to cleft lip and palate. Proc. Natl Acad. Sci. USA 104, 4512–4517 (2007).
Article ADS CAS Google Scholar
Pauws, E. & Stanier, P. FGF signalling and SUMO modification: new players in the aetiology of cleft lip and/or palate. Trends Genet. 23, 631–640 (2007).
Article CAS Google Scholar
Narla, A. & Ebert, B. L. Ribosomopathies: human disorders of ribosome dysfunction. Blood 115, 3196–3205 (2010).
Article CAS Google Scholar
Cui, D. et al. The ribosomal protein S26 regulates p53 activity in response to DNA damage. Oncogene 33, 2225–2235 (2014).
Article CAS Google Scholar
Nagai, Y. et al. High RAD54B expression: an independent predictor of postoperative distant recurrence in colorectal cancer patients. Oncotarget 6, 21064–21073 (2015).
Article Google Scholar
Murray, T. et al. Examining markers in 8q24 to explain differences in evidence for association with cleft lip with/without cleft palate between Asians and Europeans. Genet. Epidemiol. 36, 392–399 (2012).
Article Google Scholar
Uslu, V. V. et al. Long-range enhancers regulating Myc expression are required for normal facial morphogenesis. Nat. Genet. 46, 753–758 (2014).
Article CAS Google Scholar
Watkins, S. E., Meyer, R. E., Strauss, R. P. & Aylsworth, A. S. Classification, epidemiology, and genetics of orofacial clefts. Clin. Plast. Surg. 41, 149–163 (2014).
Article Google Scholar
Meng, L., Bian, Z., Torensma, R. & Von den Hoff, J. W. Biological mechanisms in palatogenesis and cleft palate. J. Dent. Res. 88, 22–33 (2009).
Article CAS Google Scholar
Reefhuis, J. & Honein, M. A. Maternal age and non-chromosomal birth defects, Atlanta--1968-2000: teenager or thirty-something, who is at risk? Birth Defects Res. A Clin. Mol. Teratol. 70, 572–579 (2004).
Article CAS Google Scholar
Bille, C. et al. Parent’s age and the risk of oral clefts. Epidemiology 16, 311–316 (2005).
Article Google Scholar
Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).
Article CAS Google Scholar
Han, J. W. et al. Genome-wide association study in a Chinese Han population identifies nine new susceptibility loci for systemic lupus erythematosus. Nat. Genet. 41, 1234–1237 (2009).
Article CAS Google Scholar
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Article CAS Google Scholar
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
Article CAS Google Scholar
Liu, H. et al. Discovery of six new susceptibility loci and analysis of pleiotropic effects in leprosy. Nat. Genet. 47, 267–271 (2015).
Article CAS Google Scholar
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Article CAS Google Scholar
Cornelis, M. C. et al. The Gene, Environment Association Studies consortium (GENEVA): maximizing the knowledge obtained from GWAS by collaboration across studies of multiple conditions. Genet. Epidemiol. 34, 364–372 (2010).
Article Google Scholar
Consortium, E. P. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Article ADS Google Scholar
Bernstein, B. E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).
Article CAS Google Scholar
Barabasi, A. L., Gulbahce, N. & Loscalzo, J. Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12, 56–68 (2011).
Article CAS Google Scholar
Lee, I., Blom, U. M., Wang, P. I., Shim, J. E. & Marcotte, E. M. Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Res. 21, 1109–1121 (2011).
Article CAS Google Scholar
Szklarczyk, D. et al. The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 39, D561–D568 (2011).
Article CAS Google Scholar

Download references

Acknowledgements

We thank all the individuals for participating in this project and we acknowledge all oral surgeons at relevant hospitals for their help in the recruitment of subjects, including Hospital of Stomatology of Wuhan University, The Second Charity Hospital of Henan Province, Stomatological Hospital of Nanyang and Stomatological Hospital of Xiangyang, as well as the laboratory staff who contributed to making this work possible. This work was funded by Pre-National Basic Research Program of China (973 Plan; 2012CB722404 to L.D.S. and Z.B.), Top-Notch Young Talents Program of China and Recovery Medical Science Fundation to L.D.S., National Natural Science Foundation of China (81120108010, 81470727, 81300870 and 81571438 to Z.B., M.H. and L.Y.M.), National Key Research and Development Program (2016YFC1000505) and Hubei Province’s Outstanding Medical Academic Leader Program to Z.B., and Applied Basic Research Program of Wuhan, China (2014062801011267) to L.Y.M.

Author information

Yanqin Yu, Xianbo Zuo, Miao He and Jinping Gao: These authors contributed equally to this work.
Liangdan Sun and Zhuan Bian: These authors jointly supervised this work.

Authors and Affiliations

The State Key Laboratory Breeding Base of Basic Science of Stomatology (Hubei-MOST) and Key Laboratory of Oral Biomedicine Ministry of Education, School and Hospital of Stomatology, Wuhan University, Wuhan, 430079, Hubei, China
Yanqin Yu, Miao He, Chuanqi Qin, Liuyan Meng, Yaling Song, Yong Cheng, Xinhuan Wang & Zhuan Bian
Institute of Dermatology and Department of Dermatology at No. 1 Hospital, Anhui Medical University, Hefei, 230032, Anhui, China
Xianbo Zuo, Jinping Gao, Wenjun Wang, Fusheng Zhou, Gang Chen, Xiaodong Zheng, Bo Liang, Zhengwei Zhu, Yujun Sheng, Jianjun Liu, Xuejun Zhang & Liangdan Sun
State Key Lab Incubation of Dermatology, Ministry of Science and Technology, Hefei, China
Xianbo Zuo, Jinping Gao, Wenjun Wang, Fusheng Zhou, Gang Chen, Xiaodong Zheng, Bo Liang, Zhengwei Zhu, Yujun Sheng, Jianjun Liu, Xuejun Zhang & Liangdan Sun
Key Lab of Dermatology, Ministry of Education, Heifei, China
Xianbo Zuo, Jinping Gao, Wenjun Wang, Fusheng Zhou, Gang Chen, Xiaodong Zheng, Bo Liang, Zhengwei Zhu, Yujun Sheng, Jianjun Liu, Xuejun Zhang & Liangdan Sun
Key Lab of Gene Resources Utilization for Severe Inherited Disorders, Anhui, 230032, China
Xianbo Zuo, Jinping Gao, Wenjun Wang, Fusheng Zhou, Gang Chen, Xiaodong Zheng, Bo Liang, Zhengwei Zhu, Yujun Sheng, Jianjun Liu, Xuejun Zhang & Liangdan Sun
Collaborative Innovation Center of Complex and Severe skin Disease, Anhui Medical University, Hefei, 230032, Anhui, China
Xianbo Zuo, Jinping Gao, Wenjun Wang, Fusheng Zhou, Gang Chen, Xiaodong Zheng, Bo Liang, Zhengwei Zhu, Yujun Sheng, Jianjun Liu, Xuejun Zhang & Liangdan Sun
Department of Pediatric Dentistry, School and Hospital of Stomatology, Wuhan University, Wuhan, 430079, Hubei, China
Miao He
Department of Oral and Maxillofacial Surgery, School and Hospital of Stomatology, Wuhan University, Wuhan, 430079, Hubei, China
Yuchuan Fu & Chuanqi Qin
Department of Genetics and Centre for Developmental Biology, College of Life Science, Wuhan University, Wuhan, 430072, Hubei, China
Xiazhou Fu
The Second Charity Hospital of Henan Province, Jiaozuo, 454000, Henan, China
Jiebing Hao
Stomatological Hospital of Nanyang, Nanyang, 473013, Henan, China
Zhongyin Liu
Stomatological Hospital of Xiangyang, Xiangyang, 441011, Hubei, China
Hansong Yan
Institute of Human Genetics, Life and Brain Center, University of Bonn, Bonn, 53127, Germany
Elisabeth Mangold & Kerstin U. Ludwig
Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, 21205, Maryland, USA
Ingo Ruczinski
Department of Oral Biology and Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, 15219, Pennsylvania, USA
Mary L. Marazita
Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, 15261, Pennsylvania, USA
Mary L. Marazita
Department of Psychiatry, Clinical and Translational Science, School of Medicine, University of Pittsburgh, Pittsburgh, 15213, Pennsylvania, USA
Mary L. Marazita
Department of Genomics, Life and Brain Center, University of Bonn, Bonn, 53127, Germany
Kerstin U. Ludwig
Department of Epidemiology, School of Public Health, Johns Hopkins University, Baltimore, 21205, Maryland, USA
Terri H. Beaty
Department of Dermatology at No. 2 Hospital, Anhui Medical University, Hefei, 230022, Anhui, China
Xuejun Zhang
Institute of Dermatology and Department of Dermatology, Huashan Hospital of Fudan University, Shanghai, 200040, China
Xuejun Zhang
The Key Laboratory of Major Autoimmune Diseases, Anhui Province, Anhui, 230032, China
Liangdan Sun

Authors

Yanqin Yu
View author publications
You can also search for this author in PubMed Google Scholar
Xianbo Zuo
View author publications
You can also search for this author in PubMed Google Scholar
Miao He
View author publications
You can also search for this author in PubMed Google Scholar
Jinping Gao
View author publications
You can also search for this author in PubMed Google Scholar
Yuchuan Fu
View author publications
You can also search for this author in PubMed Google Scholar
Chuanqi Qin
View author publications
You can also search for this author in PubMed Google Scholar
Liuyan Meng
View author publications
You can also search for this author in PubMed Google Scholar
Wenjun Wang
View author publications
You can also search for this author in PubMed Google Scholar
Yaling Song
View author publications
You can also search for this author in PubMed Google Scholar
Yong Cheng
View author publications
You can also search for this author in PubMed Google Scholar
Fusheng Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Gang Chen
View author publications
You can also search for this author in PubMed Google Scholar
Xiaodong Zheng
View author publications
You can also search for this author in PubMed Google Scholar
Xinhuan Wang
View author publications
You can also search for this author in PubMed Google Scholar
Bo Liang
View author publications
You can also search for this author in PubMed Google Scholar
Zhengwei Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Xiazhou Fu
View author publications
You can also search for this author in PubMed Google Scholar
Yujun Sheng
View author publications
You can also search for this author in PubMed Google Scholar
Jiebing Hao
View author publications
You can also search for this author in PubMed Google Scholar
Zhongyin Liu
View author publications
You can also search for this author in PubMed Google Scholar
Hansong Yan
View author publications
You can also search for this author in PubMed Google Scholar
Elisabeth Mangold
View author publications
You can also search for this author in PubMed Google Scholar
Ingo Ruczinski
View author publications
You can also search for this author in PubMed Google Scholar
Jianjun Liu
View author publications
You can also search for this author in PubMed Google Scholar
Mary L. Marazita
View author publications
You can also search for this author in PubMed Google Scholar
Kerstin U. Ludwig
View author publications
You can also search for this author in PubMed Google Scholar
Terri H. Beaty
View author publications
You can also search for this author in PubMed Google Scholar
Xuejun Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Liangdan Sun
View author publications
You can also search for this author in PubMed Google Scholar
Zhuan Bian
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Z.B. and L.D.S. conceived this study. L.D.S., Z.B., M.H. and L.Y.M. provided financial support. Z.B., L.D.S., X.B.Z. and X.J.Z. designed the study. Y.C.F., Y.L.S., Y.C., C.Q.Q., X.Z.F., J.B.H., Z.Y.L., H.S.Y., Z.W.Z., W.J.W., B.L. and Y.Q.Y. conducted sample selection and clinical data management. The following authors from the various collaborating groups undertook assembly of case/control series in their respective regions and collected data and samples: Y.C.F., Y.L.S., Y.C., C.Q.Q., X.Z.F., H.S.Y. and Y.Q.Y. in Hubei province; J.B.H. and Z.Y.L. in Henan province; W.J.W., Z.W.Z. and B.L. in Anhui province. F.S.Z. and G.C. performed genotyping and sequencing. X.B.Z., X.D.Z., L.D.S., Z.B., M.H., J.P.G., W.J.W. and Y.Q.Y. undertook data manipulation, statistical analysis and bioinformatic interrogations, and data checking. L.Y.M. and X.H.W. conducted animal experiment. T.H.B. and I.R. contributed data from the Baltimore study. E.M. and K.U.L. contributed data from the Bonn-II study. Z.B., L.D.S., M.H., Y.J.S., W.J.W. and Y.Q.Y. contributed to manuscript writing. Z.B., L.D.S., J.J.L., T.H.B., E.M., M.L.M. and K.U.L. helped to revise the manuscript. All authors contributed to the final paper, with Z.B., L.D.S., X.J.Z., Y.Q.Y., X.B.Z., M.H. and J.P.G. playing key roles.

Corresponding authors

Correspondence to Liangdan Sun or Zhuan Bian.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Figures and Supplementary Tables (PDF 3848 kb)

Supplementary Dataset 1

ENCODE annotations of 135 SNPs. (XLSX 94 kb)

Supplementary Dataset 2

Reported associations of other diseases/traits with index SNPs of NSCLP risk loci. (XLSX 26 kb)

Peer Review File (PDF 680 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Yu, Y., Zuo, X., He, M. et al. Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity. Nat Commun 8, 14364 (2017). https://doi.org/10.1038/ncomms14364

Download citation

Received: 30 May 2016
Accepted: 20 December 2016
Published: 24 February 2017
DOI: https://doi.org/10.1038/ncomms14364
Springer Nature Limited

Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity

Abstract

Similar content being viewed by others

Introduction

Results

Identification of 26 NSCLP-associated loci

Replications of the 26 loci in sub-phenotype groups of NSOFC

Replications of 16 NSCLP loci in multi-ethnic groups

Biological implications analyses for the 26 NSCLP loci

Discussion

Methods

Samples

DNA extraction

Genotyping and quality controls in GWAS

SNP selection for replication studies

Genotyping and quality control in replication studies

Statistical analyses

Stratification analyses

Locus annotation and candidate gene prioritization

Network analysis

GWAS catalogue reviews

Expression studies in the mouse

Data availability

Additional information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

Search

Navigation