mTADA: a framework for analyzing de novo mutations in multiple traits
Hoang T. Nguyen, Amanda Dobbyn, Joseph Buxbaum, Dalila Pinto, Shaun M Purcell, Patrick F Sullivan, Xin He, Eli A. Stahl
Received Date: 2nd September 2018
Joint analysis of multiple traits can identify associations that are not found when each trait is analyzed in isolation. In addition, approaches that consider multiple traits can aid in the characterization of shared genetic etiology among those traits. In recent years, parent-offspring trio studies have reported an enrichment of de novo mutations (DNMs) in neuropsychiatric disorders. The analysis of DNM data in the context of neuropsychiatric disorders has implicated multiple putatively causal genes, and a number of reported genes are shared across disorders. However, a joint analysis method designed to integrate DNM data from multiple studies has yet to be implemented. Here we introduce multiple-trait TADA (mTADA), which jointly analyzes two traits using DNMs from non-overlapping family samples. mTADA uses two single-trait analysis data sets to estimate the proportion of overlapping risk genes, and reports genes shared between and specific to the relevant disorders. We applied mTADA to >13,000 trios for six disorders: schizophrenia (SCZ), autism spectrum disorder (ASD), developmental disorders (DD), intellectual disability (ID), epilepsy (EPI), and congenital heart disease (CHD). We report the proportion of overlapping risk genes and the specific risk genes shared for each pair of disorders. A total of 153 genes were found to be shared in at least one pair of disorders. The largest percentages of shared risk genes were observed for pairs of DD, ID, ASD, and CHD (>20%), whereas SCZ, CHD, and EPI did not show strong overlap in their risk gene sets. Furthermore, mTADA identified additional SCZ, EPI, and CHD risk genes through integration with DD DNM data. For CHD, using DD information, 31 risk genes with posterior probabilities > 0.8 were identified, and 20 of these 31 genes were not in the list of known CHD genes.
We find evidence that the most significant CHD risk genes are strongly expressed in prenatal stages of human development. Finally, we validated our findings for CHD and EPI in independent cohorts comprising 1241 CHD trios, 226 CHD singletons, and 197 EPI trios. Multiple novel risk genes identified by mTADA also had de novo mutations in these independent data sets. The joint analysis method introduced here, mTADA, is able to identify risk genes shared by two traits as well as additional risk genes not found through single-trait analysis alone. A number of risk genes reported by mTADA are identified only through joint analysis, specifically when ASD, DD, or ID is one of the two traits examined. This suggests that novel genes for the other trait, or for a new trait, might converge on a core gene list shared by these three traits.
Read in full at bioRxiv.
This is an abstract of a preprint hosted on an independent third party site. It has not been peer reviewed but is currently under consideration at Nature Communications.