A mixed-effects two-part model for twin-data and an application on identifying important factors associated with extremely preterm children’s health disorders

Zou B, Santos HP, Xenakis JG, O’Shea MM, Fry RC, Zou F.

PLoS One. 2022 Jun 13;17(6):e0269630. doi: 10.1371/journal.pone.0269630. PMID: 35696398; PMCID: PMC9191696.


Our recent studies identifying factors significantly associated with the positive child health index (PCHI) in a mixed cohort of preterm-born singletons, twins, and triplets posed some analytic and modeling challenges. The PCHI transforms the total number of health disorders experienced (of the eleven ascertained) to a scale from 0 to 100%. While some of the children had none of the eleven health disorders (i.e., PCHI = 1), others experienced a subset or all of them (i.e., 0 ≤ PCHI < 1). This indicates the existence of two distinct data processes, one for the healthy children and another for those with at least one health disorder, necessitating a two-part model to accommodate both. Further, the scores for twins and triplets are potentially correlated, since these children share similar genetics and early environments. The existing approach for analyzing PCHI data dichotomizes the data (i.e., the number of health disorders) and uses a mixed-effects logistic or multiple logistic regression to model the binary feature of the PCHI (1 vs. < 1). To provide an alternative analytic framework, in this study we jointly model the two data processes under a mixed-effects two-part model framework that accounts for the sample correlations between and within the two data processes. The proposed method increases power to detect factors associated with disorders. Extensive numerical studies demonstrate that the proposed joint-test procedure consistently outperforms the existing method when the type I error is controlled at the same level. Our numerical studies also show that the proposed method is robust to model misspecifications and is applicable to a set of correlated semi-continuous data.
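To make the two data processes concrete, the following minimal Python sketch shows how a set of PCHI scores could be split into the binary part (PCHI = 1 vs. < 1) and the continuous part (scores below 1). It rests on an assumption not stated in the abstract: that the PCHI is the linear mapping 1 − k/11 of the number of disorders k. The function names and the example counts are hypothetical, and the sketch illustrates only the data split, not the mixed-effects model or the joint test.

```python
from fractions import Fraction

N_DISORDERS = 11  # eleven disorders are ascertained, per the abstract


def pchi(n_experienced: int) -> Fraction:
    """ASSUMED linear mapping: 0 disorders -> 1, all eleven -> 0."""
    if not 0 <= n_experienced <= N_DISORDERS:
        raise ValueError("count must be between 0 and 11")
    return 1 - Fraction(n_experienced, N_DISORDERS)


def two_part_split(scores):
    """Split scores into the two parts a two-part model would use.

    Returns (is_healthy, continuous):
      is_healthy -- 1 if PCHI == 1, else 0 (binary part)
      continuous -- the scores with PCHI < 1 (continuous part)
    """
    is_healthy = [int(s == 1) for s in scores]
    continuous = [s for s in scores if s < 1]
    return is_healthy, continuous


# Hypothetical disorder counts for three children (illustrative only).
counts = [0, 3, 11]
scores = [pchi(k) for k in counts]
is_healthy, continuous = two_part_split(scores)
```

In a full analysis, each part would additionally carry family-level random effects so that scores of twins and triplets can be correlated within and between the two parts; that step is omitted here.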