- Research
- Open access
- Published:

# Functional mapping imprinted quantitative trait loci underlying developmental characteristics

*Theoretical Biology and Medical Modelling*
**volume 5**, Article number: 6 (2008)

## Abstract

### Background

Genomic imprinting, a phenomenon referring to nonequivalent expression of alleles depending on their parental origins, has been widely observed in nature. It has been shown recently that the epigenetic modification of an imprinted gene can be detected through a genetic mapping approach. Such an approach is developed based on traditional quantitative trait loci (QTL) mapping focusing on single trait analysis. Recent studies have shown that most imprinted genes in mammals play an important role in controlling embryonic growth and post-natal development. For a developmental character such as growth, current approach is less efficient in dissecting the dynamic genetic effect of imprinted genes during individual ontology.

### Results

Functional mapping has been emerging as a powerful framework for mapping quantitative trait loci underlying complex traits showing developmental characteristics. To understand the genetic architecture of dynamic imprinted traits, we propose a mapping strategy by integrating the functional mapping approach with genomic imprinting. We demonstrate the approach through mapping imprinted QTL controlling growth trajectories in an inbred F_{2} population. The statistical behavior of the approach is shown through simulation studies, in which the parameters can be estimated with reasonable precision under different simulation scenarios. The utility of the approach is illustrated through real data analysis in an F_{2} family derived from LG/J and SM/J mouse stains. Three maternally imprinted QTLs are identified as regulating the growth trajectory of mouse body weight.

### Conclusion

The functional iQTL mapping approach developed here provides a quantitative and testable framework for assessing the interplay between imprinted genes and a developmental process, and will have important implications for elucidating the genetic architecture of imprinted traits.

## Background

Hunting for genes underlying mendelian disorders or quantitative traits has been a long-term effort in genetical research. Most current statistical approaches to gene mapping assume that the maternally and paternally derived copies of a gene in diploid organisms have a comparable level of expression. This, however, is not necessarily true as revealed by recent studies, in which some genes show asymmetric expression, and their expression in the offspring depends on the parental origin of their alleles [1–3]. This phenomenon, termed genomic imprinting, results from the modification of DNA structure rather than changes in the underlying DNA sequences. As one type of epigenetic phenomenon, genomic imprinting has greatly shaped modern research in genetics since its discovery. Some previously puzzling genetic phenomena can now be explained by imprinting theory. However, little is known about the size, location and functional mechanism of imprinted genes in development.

The selective control of gene imprinting is unique to placental mammals and flowering plants. There is increasing evidence that many economically important traits and human diseases are influenced by genomic imprinting [3–6]. More recent studies have shown that genomic imprinting might be even more common than previously thought [7]. Despite its importance, the study of genomic imprinting is still in its early infancy. The biological function of genomic imprinting in shaping an organism's development is still unclear. Recent publications have shown that the majority of imprinted genes in mammals play an important role in controlling embryonic growth and development [8, 9], and some involve in post-natal development, affecting suckling and metabolism [9, 10]. The malfunction of imprinted genes at any developmental stage could lead to substantially abnormal characters such as cancers or other genetic disorders. It is therefore of paramount importance to identify imprinted genes and to understand at which developmental stage they function, to help us explore opportunities to prevent, control and treat diseases therapeutically. With the development of new biotechnology coupled with computationally efficient statistical tools, it is now possible to map imprinted genes and understand their roles in disease susceptibility.

Several studies have shown that the effects of imprinted quantitative trait loci (iQTL) can be estimated and tested in controlled crosses of inbred or outbred lines [6, 11–15]. These approaches are designed on the traditional QTL mapping framework where a phenotypic trait is measured at certain developmental stage for a mapping subject, ignoring the dynamic features of gene expression. As a highly complex process, genomic imprinting involves a number of growth axes operating coordinately at different development stages [16]. Changes in gene expression at different developmental stages reflect the dynamic changes of gene function over time. They also reflect the response of an organism to either internal or external stimuli, so it can redirect its developmental trajectory to adapt better to environmental conditions, and thereby to increase its fitness [17]. For this reason, incorporating such information into genetic mapping should provide more information about the genetic architecture of a dynamic developmental trait.

When a developmental feature of an imprinted trait is considered, traditional iQTL mapping approaches that only consider the phenotypic trait measured at a particular time point will be inappropriate for such an analysis. In fact, for a quantitative trait of developmental behavior, the genetic effect at time *t* (denoted as *G*_{
t
}) is composed of the genetic effect at time *t* - 1 (denoted as *G*_{t-1}) and the extra genetic effect from time *t* - 1 to *t* (denoted as *G*_{Δt}) [18]. Therefore, the phenotypic trait measured at time *t* reflects the cumulative gene effects from initial time to *t*, and is highly correlated with the trait measured at time *t* - 1. The correlations among traits measured at different time periods (i.e., different developmental stages) thus provide correlation information about gene expressions, and hence tell us how genes mediate to respond to internal and external stimuli. Current imprinting QTL (iQTL) mapping approaches, by ignoring the correlations among traits measured at different developmental stages, could therefore potentially overestimate the number and the effective size of iQTLs, and lead to wrong inferences.

Although conditional QTL analysis can reduce bias and increase detecting power by partitioning the genetic effect in a conditional manner [18], analysis of traits at each measurement time point is still less powerful and less attractive than analysis by considering measurements at different developmental stages jointly [19]. The recent development of functional mapping brings challenges as well as opportunities for mapping genes responsible for dynamic features of a quantitative trait [17, 19, 20]. Functional mapping is the integration between genetic mapping and biological principles through mathematical equations. The relative merits of functional mapping in biology lie in the strong biological relevance of QTL detection, and its statistical advantages are that it reduces data dimensions and increases the power and stability of QTL detection. By incorporating various mathematical functions into the mapping framework, functional mapping has great flexibility for mapping genes that underlie complex dynamic/longitudinal traits. It provides a quantitative framework for assessing the interplay between genetic function and developmental pattern and form.

In this article, we extend our previous work of interval iQTL mapping to functional iQTL mapping by incorporating biologically meaningful mathematical functions into a QTL mapping framework. We illustrate the idea through an inbred line F_{2} design, although it can be easily extended to other genetic designs. To distinguish the genetic differences between the two reciprocal heterozygous forms derived from an F_{2} population, information about sex-specific differences in the recombination fraction is used. Monte Carlo simulations are performed to evaluate the model performance under different scenarios considering the effect of sample size, heritability and imprinting mechanism. A real example is illustrated in which three iQTLs affecting the growth trajectory of body weight in an F_{2} family derived from two different mouse strains are identified through a genome-wide linkage scan.

## Methods

### Functional QTL Mapping

Statistical methods for mapping QTL underlying developmental characteristics such as growth or HIV dynamics have been developed previously [19, 20]. The so called functional mapping approach has been recently applied to mapping QTL underlying programmed cell death [21, 22]. Functional mapping is derived under the finite mixture model-based likelihood framework. In the mixture model, each observation *y* is modelled as a mixture of *J* (known and finite) components. The distribution for each component corresponds to the genotype category depending on the underlying genetic design. For an F_{2} design, there are three mixture components (*J* = 3). The density function for each genotype component is assumed to follow a parametric distribution (*f*) such as Gaussian, which can be expressed as:

where \overrightarrow{\pi} = (*π*_{1}, *π*_{2}, *π*_{3}) is a vector of mixture proportions which are constrained to be non-negative and sum to unity; \overrightarrow{\varphi} = (*ϕ*_{1}, *ϕ*_{2}, *ϕ*_{3}) is a vector for the component specific parameters, with *ϕ*_{
j
}being specific to component *j*; and *η* contains parameters (i.e., residual variance) that are common to all components.

For an F_{2} design initiated with two contrasting homozygous inbred lines, there are three genotypes at each locus. Suppose there is a putative segregating QTL with alleles *Q* and *q* that affects a developmental trait such as growth. In a QTL mapping study, the QTL genotype is generally considered as missing, but can be inferred from the two flanking markers. The missing QTL genotype probability *π*_{
j
}can be calculated as the conditional probability of the QTL genotype given the observed flanking marker genotypes. For a population with structured pedigree like an F_{2} population, it can be expressed in terms of the recombination fractions, whereas for a natural population, it can be expressed as a function of linkage disequilibria. The derivations of the conditional probabilities of QTL genotypes can be found in the general QTL mapping literature [23].

In functional mapping, the parameters \overrightarrow{\varphi} = (*ϕ*_{1}, *ϕ*_{2}, *ϕ*_{3}) specify the underlying developmental mean function (**m**). For an F_{2} design, there are three sets of mean functions corresponding to three QTL genotypes. To reduce the number of parameters and enhance the interpretability of functional mapping, the mean process is modelled by certain biologically meaningful mathematical functions, either parametrically or nonparametrically. Suppose that the phenotypic traits are acquired from *n* individuals, and that *t* measurements are made on each individual *i*. Let the response of individual *i* at time *t* be denoted by *y*_{
i
}(*t*), *i* = 1, ⋯, *n*; *t* = 1, ⋯, *τ*. Then the response can be modelled as

where *f*(*t*) is a linear or nonlinear function evaluated at time *t*, depending on the underlying developmental pattern; *e*_{
i
}(*t*) is the residual error, which is assumed to be normal with mean zero and variance *σ*^{2}(*t*). The intra-individual correlation is specified as *ρ*, which leads to the covariance for individual *i* at two different time points, *t*_{1} and *t*_{2}, expressed as cov(*y*_{
i
}(*t*_{1}), *y*_{
i
}(*t*_{2})) = \rho {\sigma}_{{t}_{1}}{\sigma}_{{t}_{2}}. Assuming multivariate normal distribution, the density function for each progeny *i* who carries genotype *j* can be expressed as

where **m**_{
j
}= [*m*_{
j
}(1), ⋯, *m*_{
j
}(*τ*)] is the mean vector common for all individuals with genotype *j*, which can be evaluated through function *f* in Model (2). The unknown parameters that specify the position of QTL within a marker interval are arrayed in **Ω**_{
r
}. The parameters that define the mean and the covariance functions are arrayed in **Ω**_{
q
}.

Since we do not observe the QTL genotype, the distribution of *y* is modelled through a finite mixture model given in Model (1). At a particular time point (say *t*), the genetic effect can be obtained by solving the following equations

where *a*(*t*) and *d*(*t*) are the additive and dominant effects at time *t*, respectively.

### Functional iQTL Mapping

#### Modelling the imprinted mean function

In an F_{2} population, three QTL genotypes are segregated. The three QTL genotypes may have different expressions which result in three different mean trajectories. Considering the imprinting property of an iQTL, we introduce the notation for the parental origin of alleles inherited from both parents. Let *Q*_{
M
}and *q*_{
M
}be two alleles inherited from the maternal parent, and *Q*_{
P
}and *q*_{
P
}be two alleles derived from the paternal parent. The subscripts *M* and *P* refer to maternal and paternal origin, respectively. These four parentally specific alleles form four distinct genotypes expressed as *Q*_{
M
}*Q*_{
P
}, *Q*_{
M
}*q*_{
P
}, *q*_{
M
}*Q*_{
P
}, and *q*_{
M
}*q*_{
P
}. In contrast, in a regular QTL mapping study without distinguishing the allelic parental origin, the two reciprocal heterozygotes, *Q*_{
M
}*q*_{
P
}and *q*_{
M
}*Q*_{
P
}, are collapsed to one heterozygote. When a QTL is imprinted, the four QTL genotypes show different gene expressions, which result in different developmental growth trajectories. For a maternally (or paternally) imprinted QTL, the allele inherited from the maternal (or paternal) parent is not expressed. Thus, two growth trajectories would be expected. By testing the differences of the four growth trajectories, one can test whether there is a QTL, and whether the QTL is imprinted.

For simplicity, we use numerical notation to denote the four parent-of-origin-specific genotypes, i.e., *Q*_{
M
}*Q*_{
P
}= 1, *Q*_{
M
}*q*_{
P
}= 2, *q*_{
M
}*Q*_{
P
}= 3, and *q*_{
M
}*q*_{
P
}= 4. The mean functions of these genotypes are denoted as **m**_{
j
}, (*j* = 1, ⋯, 4). We know that for an imprinted gene, the expression of an allele depends on its parental origin. On a developmental scale, the two reciprocal heterozygotes, *Q*_{
M
}*q*_{
P
}and *q*_{
M
}*Q*_{
P
}, may present different mean trajectories. The degree of imprinting of an iQTL can thus be assessed by the genotype-specific parameters. Through testing the difference between the mean functions of the two reciprocal heterozygotes, we can assess the imprinting property of a QTL. An overlap of the two trajectories for the two reciprocal heterozygotes indicates no sign of imprinting.

For a developmental characteristic such as growth, it is well known that the underlying trajectory can be described by a universal growth law, which follows a logistic growth function [24]. At a developmental stage, say time *t*, the mean value of an individual carrying QTL genotype j can be expressed by

where the growth parameters (*α*_{
j
}, *β*_{
j
}, *γ*_{
j
}) describe asymptotic growth, initial growth and relative growth rate, respectively [25]. With estimated growth parameters, we can easily retrieve the genotypic means at every time point by simply plugging *t* into Equation (5). This modelling approach can significantly reduce the number of unknown parameters to be estimated, especially when the number of measurement points is large [19].

At a particular time point (say *t*), the mean expression of an individual carrying QTL genotype *j* can be evaluated through the three growth parameters (*α*_{
j
}, *β*_{
j
}, *γ*_{
j
}). On the basis of the univariate imprinting model given in [12], we can partition the genetic effects at time *t* as the allele-specific effects, i.e.

where *a*_{
M
}and *a*_{
P
}refer to the additive effects of alleles inherited from mother and father, respectively; *d* refers to the allele dominant effect.

To illustrate the idea, we use the growth trait to demonstrate the mapping principle. The idea can be easily extended to other developmental characteristics. For developmental characteristics other than growth, different mathematical functions should be developed. Some flexible choices include nonparametric regressions based on smoothing splines or orthogonal polynomials [21].

#### Modelling the covariance structure

To understand how QTL mediate growth, it is essential to take correlations among repeated measures into account [19]. The repeated measures provide correlation information on gene expression. Hence, dissection of the intra-individual correlation will help us to understand better how genes function over time. One commonly used model for covariance structure modelling is the first-order autoregressive (AR(1)) model [26], expressed as

for the variance, and

for the covariance between any two time points *t*_{k} and *t*_{k'}, where 0 <*ρ* < 1 is the proportion parameter with which the correlation decays with time lag.

For a developmental characteristic such as growth, the inter-individual variation generally increases as time increases, which leads to a nonstantionary variance function. Since the AR(1) covariance model assumes stationary variance, it can not be applied directly. To stabilize the variance at different measurement time points, we apply a multivariate Box-Cox transformation to stabilize the variance [27], which has the form

The Box-Cox transformation ensures the homoscedasticity and normality of the response *y*. For repeated measures or longitudinal studies, a reasonable constraint is to set *λ*(*t*) = *λ* for all *t*. Then the optimal choice of *λ* can be estimated from the data. To preserve the interpretability of the estimated mean parameters, Carroll and Ruppert [28] proposed a transform-both-sides (TBS) model in which the same transformation form is applied to both sides of Model (2). For a log-transformation, this results in log*y*_{
i
}(*t*) = log*f*(*t*) + *e*_{
i
}(*t*). Wu et al. [29] later showed the favorable property of this approach in functional mapping. For the modelling purpose of stabilizing variances, we simply adopt the log-transformation in the current setting.

Alternatively, one can model the covariance structure nonstationarily without transforming the original data. Among a pool of choices, the structured antedependence (SAD) model [30] displays a number of favorable merits. The SAD model of order p for modelling the error term in Eq. (2) is given by

where *ε*_{
i
}(*t*) is the "innovation" term assumed to be independent and distributed as \mathcal{N}(0,{\sigma}_{t}^{2}). Therefore, the variance-covariance matrix can be expressed as

where **Σ**_{
ε
}is a diagonal matrix with diagonal elements being the innovation variance; **A** is a lower triangular matrix which contains the antedependence coefficient *φ*_{
r
}. The SAD order (*p*) can be selected through an information criterion [31]. The SAD(*r*) model has been previously applied in functional mapping of programmed cell death [21].

### Parameter Estimation

Assuming inter-individual independence, the joint likelihood function is given by

where **z**_{
i
}= [*z*_{
i
}(1), ⋯, *z*_{
i
}(*τ*)] is the observed log-transformed trait vector for individual *i* (*i* = 1, ⋯, *n*) over *τ* time points; *f*_{
j
}is the multivariate normal density function with log-transformed mean for QTL genotype *j*; *π*_{j|i}(*j* = 1, ⋯, 4) is the mixture proportion for individual *i* with genotype *j*, which is derived assuming a sex-specific difference in recombination rate and can be found in [12]. The unknown parameters in **Ω** comprise three sets, one defining the co-segregation between the QTL and markers and thereby the location of the QTL relative to the markers, denoted by **Ω**_{
r
}, and the other defining the distribution of a growth trait for each QTL genotype, denoted by **Ω**_{
q
}= (**Ω**_{
m
}, **Ω**_{
v
}), where {\Omega}_{m}=({\Omega}_{{m}_{1}},{\Omega}_{{m}_{2}},{\Omega}_{{m}_{3}},{\Omega}_{{m}_{4}}) defines the mean vector for different genotypes and **Ω**_{
v
}defines the covariance parameters.

We implement the EM algorithm to obtain the maximum likelihood estimates (MLEs) of the unknown parameters. The first derivative of the log-likelihood function, with respect to specific parameter *ϕ* contained in **Ω**, is given by

where we define

The MLEs of the parameters contained in (**Ω**_{
m
}, **Ω**_{
v
}) are obtained by solving

Direct estimation is unavailable since there is no closed form for the MLEs of parameters. The EM algorithm is applied to solve these unknowns iteratively.

**E-step**: Given initial values for (**Ω**_{
m
}, **Ω**_{
v
}), calculate the posterior probability matrix **Π** = {Π_{j|i}} in Eq. (8).

**M-step**: With the updated posterior probability **Π**, we can update the parameters contained in **Ω**_{
q
}. The maximization can be implemented through an iteration procedure or through the Newton-Raphson or other algorithm such as simplex algorithm [32].

The above procedures are iteratively repeated between (8) and (9), until a certain convergence criterion is met. For details of the EM algorithm, one can refer to [19]. The converged values are the MLEs of the parameters. The initial values under the alternative hypothesis are generally set as the estimated values under the null. Note also that in the above algorithm, we do not directly estimate the QTL-segregating parameters (**Ω**_{
r
}). In general, we use a grid search approach to estimate the QTL location by searching for a putative QTL at every 1 or 2 cM on a map interval bracketed by two markers throughout the entire linkage map. The log-likelihood ratio test statistic for a QTL at a testing position is displayed graphically to generate a log-likelihood ratio plot called LR profile plot. The genomic position corresponding to a peak of the profile is the MLE of the QTL location.

We have found that the algorithm is sensitive to initial values, particularly the mean values of the two reciprocal heterozygotes. To make sure the parameters are converged to the "correct" ones, we normally give different initial values for the two reciprocal heterozygotes and check which one produces the highest likelihood value. The ones which produce higher likelihood value are considered as the MLEs.

### Hypothesis Testing

#### Global QTL test

Testing whether there is a QTL affecting the developmental trajectory is the first step toward understanding of genetic architecture of an imprinted trait. Once the MLEs of the parameters are obtained, the existence of a QTL affecting the growth curve can be tested by formulating the following hypotheses

where H_{0} corresponds to the reduced model, in which the data can be fit by a single curve, and H_{1} corresponds to the full model, in which there exist different curves to fit the data. The above test is equivalent to test

The statistic for testing the hypotheses is calculated as the log-likelihood (LR) ratio of the reduced to the full model

where \tilde{\Omega} and \stackrel{\_}{\Omega} denote the MLEs of the unknown parameters under H_{0} and H_{1}, respectively. An empirical approach to determining the critical threshold is based on permutation tests [33].

#### Imprinting test

Rejection of the null hypothesis in Test (10) at a particular genomic position indicates evidence of a QTL at that locus. Next, we would like to know the imprinting property of a detected QTL. To test if a detected QTL is imprinted or not, we develop the following hypothesis

The null hypothesis states that the two reciprocal QTL genotypes have the same mean curve and hence have the same gene expression, i.e., the expressions of genotypes *Q*_{
M
}*q*_{
P
}and *q*_{
M
}*Q*_{
P
}are independent of allelic origin. Rejection of the null hypothesis indicates evidence of genomic imprinting.

Following Test (11), if the null is rejected, further tests can be done to test whether an iQTL is maternally imprinted or paternally imprinted. The following hypothesis tests can be formulated

for testing paternally imprinted QTL and

for testing maternally imprinted QTL.

The null hypothesis in Test (12) states that the two QTL genotypes *Q*_{
M
}*Q*_{
P
}and *Q*_{
M
}*q*_{
P
}have the same mean curves and hence same expressions (i.e., allele inherited from the paternal parent does not express).

The iQTL identified can then be claimed as a paternally imprinted QTL. Similarly, if one fails to reject the null in Test (13), the conclusion that there is maternal imprinting can be reached.

Note that the imprinting test (11) is only conducted at the position where a significant QTL is declared on the basis of Test (10). So Test (11) is a point test. Tests (12) and (13) are only conducted when the null in Test (11) is rejected. We can either use the likelihood ratio test or a nonparametric test based on the area under the curve (AUC). The idea of the AUC test is that if two genotypes have the same expression, the area under the developmental curve would be the same. The AUC for QTL genotype j is defined as

Similarly, Tests (11)–(13) can be defined accordingly based on the AUC. For example, to test (12), the hypothesis would be simplified to

The significance of Tests (11)–(13) can be evaluated on the basis of permutations. In our simulation study, we found that the test based on the AUC is more sensitive and powerful than the one based on the likelihood ratio test.

#### Regional test

Even though a mean curve can be modelled throughout a continuous function, genes may not function across all the observed stages. For imprinted genes, loss of imprinting (LOI) is reported in the literature [34]. The question of how a QTL exerts its effects on an interval across a growth trajectory (say [*t*_{1}, *t*_{2}]) can be tested using a regional test approach based on the AUC. The AUC for genotype *j* at a given time interval is calculated as

If the AUCs of the four genotypes for a testing period [*t*_{1}, *t*_{2}] are the same, we claim there is no QTL effect at that time interval. The hypothesis test for the genetic effect over a period of growth can be formulated as

This test can detect if a QTL exerts an early gene effect or triggers a late effect.

## Results

### Monte Carlo Simulation

Monte Carlo simulations are performed to evaluate the statistical behavior of the developed approach. Consider an F_{2} population initiated with two contrasting inbred lines, with which a 100 *cM* long linkage group composed of 6 equidistant markers is constructed. A putative QTL that affects the imprinted growth process is located at 46 cM from the first marker on the linkage group. The marker genotypes in the F_{2} family are simulated by mimicking sex-specific recombination fractions in mice, i.e., *r*_{
M
}= 1.25*r*_{
P
}. The Haldane map function is used to convert the map distance into the recombination fraction. Data are simulated with different specifications, namely different heritability levels (H^{2} = 0.1 vs 0.4) and different sample sizes (*n* = 200 vs 500). For each F_{2} progeny, its phenotype is simulated with 10 equally spaced time points. The covariance structure is simulated assuming the first-order AR(1) model. Note that the variance parameter (*σ*^{2}) is calculated on the basis of the log-transformed data.

Several data sets are simulated assuming no imprinting, partial imprinting, complete maternal and paternal imprinting. The simulation results are summarized in Tables 1, 2, 3, 4. As we expected, the precision of parameter estimates is increasing with the increase of the sample size and heritability under different imprinting scenarios. For example, when a QTL is not imprinted (Table 4), the RMSE of the parameter *a* for genotype *Q*_{
M
}*Q*_{
P
}decreases from 0.397 to 0.327, an 18% increase in precision when the sample size increases from 200 to 500 with fixed heritability level (0.1). For the same parameter, when a QTL is completely maternally imprinted, a reduction in RMSE from 0.478 to 0.305 is observed (Table 1). When we fix the sample size and increase the heritability level, the reduction in RMSE is even more noteworthy. For example, under fixed sample size (*n* = 200), the RMSE of the parameter a for QTL genotype *Q*_{
M
}*Q*_{
P
}is reduced from 0.397 to 0.137, a 65% increase in precision compared to an 18% increase when sample size increases from 200 to 500 with fixed heritability (Table 4). Large heritability infers high genetic variability and low environmental variation [35]. Therefore, to increase the precision of parameter estimation, well managed experiments in which environmental variation is reduced is more important than just simply increasing sample sizes.

Under different simulation scenarios, another general trend is that the estimation for the genetic parameters of the two homozygotes performs better than that for the two reciprocal heterozygotes. For example, the RMSE of the growth parameter *a* for *Q*_{
M
}*q*_{
P
}is 0.765, while it decreases to 0.397 for genotype *Q*_{
M
}*Q*_{
P
}with fixed sample size 200 and heritability level 0.1 (Table 4). This is what we expected since partitioning the heterozygote into two parts may cause information loss. As the sample size or the heritability level increases, the RMSEs are greatly reduced for the two reciprocal heterozygous genotypes. For example, the RMSE (for parameter *a*) is reduced from 0.765 to 0.304 when the heritability level increases from 0.1 to 0.4 under fixed sample size 200 (Table 4). Overall, the QTL position estimation is reasonably good under different simulation scenarios, even though the precision is reduced a little with completely imprinted models (Tables 1 and 2), compared with the non-imprinting and partial imprinting models (Tables 3 and 4).

Table 1 also summarizes the results of comparison between the imprinting and non-imprinting models, in which the regular non-imprinting functional mapping model is indicated by "NI". Data are simulated assuming complete maternal imprinting, and are then subject to analysis using the imprinting (four QTL genotypes) and non-imprinting (three QTL genotypes) models. It can be seen that the non-imprinting model produces poorer estimation than the imprinting model. The RMSE is generally large when data are analyzed with the non-imprinting model, especially the mean parameters for the two reciprocal heterozygotes. We observed similar results under other imprinting mechanisms (e.g., partial or complete paternal imprinting) and the results are omitted. When data are simulated assuming no imprinting, the non-imprinting model, however, outperforms the imprinting model, in which the standard errors of the mean parameters fitted with the imprinting model are slightly higher than those fitted with the non-imprinting model (data not shown). Similar results were also obtained in our previous univariate imprinting analysis [22]. Therefore, caution is needed about the interpretation of the results. One should try both imprinting and non-imprinting models and report the union of QTLs that are shown in both analyses.

### A Case Study

We apply the developed model to a published data set [36] to show the utility of the approach. The data contain 502 F_{2} mice derived from two inbred strains differing greatly in body weight, the Large (LG/J) and Small (SM/J). Each F_{2} progeny was measured for its body mass at 10 equally spaced weeks starting at day 7 after birth. Ninety-six codominant markers were obtained with an average length of ~23 cM spanning the 19 autosomes. For more information about the data, readers are referred to the original paper [36].

The sex-specific recombination rate is reconstructed on the basis of the marker data according to the average 1.25:1 recombination rate between female and male chromosomes [37]. The data were analyzed previously with a univariate imprinting model [12], in which three iQTL and one Mendelian QTL were detected. Considering the limitation of the univariate analysis as introduced in the background section, we apply the data to the newly developed functional mapping model. As clearly shown by the genomewide LR profile plot in Fig 1, the model detects four major QTLs. One QTL passes the genomewide threshold and is located on chromosome 6. The other three QTLs, located on chromosomes 7, 10 and 16, respectively, are only significant at the chromosome-wide level. In the plot, the solid curve corresponds to the likelihood ratio statistics at every testing position. The 5% significance threshold value for claiming the existence of QTLs at the genome-wide level is marked with the horizonal solid line based on permutation tests. The dashed line represents the 5% chromosome-wide threshold, which is obtained by only scanning the targeted chromosome.

Table 5 tabulates the estimated QTL positions, the marker intervals of the QTL belonging to each, the MLEs of curve parameters that specify the developmental pattern, and the asymptotic standard errors of the estimators (in parentheses). It can be seen that all parameters can be reasonably estimated with small sampling errors. The four QTLs detected are located between markers Nds5 and Mit15 on chromosome 6, between markers Nds1 and Mit148 on chromosome 7, between markers Mit133 and Mit14 on chromosome 10, and between markers Mit2 and Mit5 on chromosome 16. The data are also analyzed with the regular functional mapping approach and four QTLs are detected, among which only two QTLs, located on chromosomes 6 and 7, agree in both models. The other two, located on chromosomes 11 and 15, do not show significance with the imprinting analysis. The miss-detection of these two QTLs by the imprinting model may partially be due to the limitation of the imprinting approach as revealed by the simulation study.

To dissect the imprinting property of the detected QTLs, we further conduct hypothesis tests based on Tests (11)–(13). Among the four QTLs detected, three show signs of imprinting, i.e., the ones located on chromosomes 6, 10 and 16. The one located on chromosome 7 shows no sign of imprinting. The developmental trajectories of the identified QTLs are plotted and are shown in Fig. 2 with the growth trajectories for all individuals indicated as gray in the background. The imprinting property of the four QTLs can also be inferred from the plot. For example, for the iQTL detected on chromosome 6, the solid and the dashed blue curves for QTL genotypes *Q*_{
M
}*Q*_{
P
}and *q*_{
M
}*Q*_{
P
}are almost merged together and the solid and dashed red curves for QTL genotypes *Q*_{
M
}*q*_{
P
}and *q*_{
M
}*q*_{
P
}are almost merged together. This implies that only the allele inherited from the paternal parent is expressed and the QTL is maternally imprinted. A similar result is obtained for this QTL with our previous univariate imprinting model [12]. A similar pattern can be seen for other two imprinted QTLs, while for the QTL detected on chromosome 7, four separate developmental curves can be clearly seen, which indicates that the QTL is not imprinted. The trajectory plots confirm the testing results based on Tests (12) and (13).

## Discussion

We all know that the diversity of offspring genes inherited from parents is due to the perturbation and reshuffling of parental genetic information during meiosis. Maintaining a functional expression balance between the paternal and maternal alleles is crucial for an organism's normal development. Breakage of the balance (e.g., the same allele expresses differently depending on its parental origin) often results in a phenomenon called genomic imprinting. In extreme cases, when one allele is completely silent, deletion of the functional allele may have serious consequences. For example, a deletion of the paternal copies of the imprinted SNRPN gene and necdin gene on chromosome 15 located in the region 15q11–13 may result in a genetic disorder called Prader-Willis syndrome [38].

Genomic imprinting has been ubiquitously observed in nature. The cause of imprinting has been thought to be related to DNA methylation in which certain parts of the DNA sequence are methylated and hence are silent in expression [39]. Although the functional mechanism of genomic imprinting is not totally clear, scientists thought that genomic imprinting has the one-generation dependence property. At the early stage of meiosis, the imprinting mark is erased and is subsequently reset at the end of meiosis, and imprinting is passed from one generation to another [40]. Offspring expression of an imprinted gene only depends on the allelic origin and is independent of the offspring sex. Therefore, any population in which the allelic parental origin can be traced can be used for mapping purposes theoretically.

Experimental line crossing has been widely used in QTL mapping studies. Several studies have been reported based on inbred line crosses for iQTL mapping [11–13]. These approaches are developed for univariate QTL analysis. Considering the functional dynamics of a gene, a powerful approach for understanding the genetic architecture of a dynamic trait would be to incorporate the dynamic feature of gene function into a mapping model. Statistical functional mapping, as revealed by empirical studies [20, 22], shows its unique merits in mapping QTL underlying developmental character or reaction norm. To understand the genetic architecture of a dynamic imprinted trait, we have extended the functional mapping approach to map iQTL responsible for a dynamic imprinted trait. The model is a natural extension of our previous single trait imprinting model [12]. Simulation studies have shown the small sample properties of the approach under different simulation scenarios, considering the effect of sample size, heritability levels and different imprinting mechanisms.

As revealed by the real data analysis, the current functional iQTL mapping approach and our previous single trait iQTL mapping approach [12] agree on most significant QTLs, i.e., the ones from chromosomes 6,7 and 10. The imprinting property of these QTLs also agrees with the previous results, i.e., the QTLs on chromosome 6 and 10 are maternally imprinted and the QTL on chromosome 7 is not imprinted. However, the new model identifies one new iQTL located on chromosome 16. This iQTL is maternally imprinted. The one located on chromosome 15 detected by our previous single trait iQTL model does not show significance in the current analysis. Since the single trait analysis is based on the marginal distribution of the data at certain developmental stage, it may amplify the QTL effect by not adjusting the cumulative genetic effects up to time *t* [18]. Further experimental evidence is needed to confirm the results.

It is interesting to note that the detected iQTL for this data set are all maternally imprinted. The results, however, are not unusual, as explained by the parental genetic conflict theory [41]. Based on the theory, paternally derived alleles always trigger a favorable effect on the offspring's growth, whereas maternally derived alleles tend to trigger the opposite effect. As a result, imprinted genes that are preferentially expressed from the paternal alleles are enriched in genes that promote offspring growth. Therefore, for the growth trait in the current study, it is not surprising to see many maternally imprinted QTLs.

As a first attempt of its kind, we construct our functional iQTL mapping idea on a tractable one-QTL interval mapping framework. A one-QTL model does not consider the effects of background markers and is very limited to precisely elucidate the complex genetic architecture of a dynamic imprinted trait. The incorporation of ideas from more advanced mapping approaches such as composite interval mapping [42] and multiple interval mapping [43] can greatly enhance the utility of the developed method. More recently, Yang et al. [44] developed a composite functional mapping approach, which adopted a similar idea as the composite interval mapping [42] and shows improved features against the one-QTL functional mapping model. To make our work more useful in practice, modelling of multiple QTLs by composite or multiple interval mapping will be considered in future work.

The developed functional iQTL mapping is illustrated using an inbred F_{2} design. Since the allele parental origin for the two reciprocal heterozygotes is not distinguishable, we incorporate the sex specific recombination rate information to discern the difference in distribution of the two reciprocal heterozygotes. We assume an average 1.25:1 female-to-male recombination rate in mouse to illustrate our method. Additional simulations are conducted by varying the recombination ratio and similar results are obtained (data not shown). The sex specific recombination rates are commonly observed in nature. For example, averaged over the entire genome, the female-to-male recombination rate is 1.6:1 in human [45], 1.4:1 in dog [46], 1.4:1 in pig [47]. However, we also expect local variation in sex specific recombination rate. For example, Marklund et al. [47] reported that the chromosome specific female-to-male recombination rate varies from 2.2:1 (Pig chromosome 5) to 1:1 (Pig chromosome 1), an approximately two-fold difference in female-to-male recombination rates. This kind of local variation in sex specific recombination rate may affect the analysis results, and concomitantly the inference of genomic imprinting. The use of a sex-specific linkage map would greatly improve the inference of a local imprinting property of a QTL. With the availability of sex specific linkage information in experimental species, we expect the method be more robust and to provide more informative results. Meanwhile, owning to the limited information about the sex specific recombination rate in other species, this may limit the application of the developed approach to other populations. An alternative genetic design that can trace the allelic parental origin using experimental line crosses is the backcross design. Statistical approaches focusing on the backcross design have been developed for iQTL mapping in univariate trait analysis [11, 13]. The idea can be extended to functional iQTL mapping with little modification.

Functional mapping and its extension like the one presented in this article provide a stimulating way to map complex biological processes by incorporating curve fitting into a mapping framework. Regardless of the limitations mentioned above, the integration of imprinting information into the functional mapping framework provides a testable quantitative platform for understanding the genetic basis of imprinted genes accounting for quantitative variation of a dynamic trait. The incorporation of more flexible mean function modelling approaches, such as non-parametric regression [48], would greatly enhance the flexibility of the current approach.

## References

Barlow DP: Gametic imprinting in mammals. Science. 1995, 270: 1610-1613. 10.1126/science.270.5242.1610.

Reik W, Walter J: Genomic imprinting: parental influence on the genome. Nat Rev Genet. 2001, 2: 21-32. 10.1038/35047554.

Reik W, Walter J: Evolution of imprinting mechanisms: the battle of the sexes begins in the zygote. Nat Genet. 2001, 27: 255-256. 10.1038/85804.

Dilkes BP, Dante RA, Coelho C, Larkins BA: Genetic analysis of endoreduplication in Zea mays endosperm: evidence of sporophytic and zygotic maternal control. Genetics. 2002, 160: 1163-1177.

Nezer C, Moreau L, Brouwers B, Coppieters W, Detilleux J, Hanset R, Karim L, Kvasz A, Leroy P, Georges M: An imprinted QTL with major effect on muscle mass and fat deposition maps to the IGF2 locus in pigs. Nat Genet. 1999, 21: 155-156. 10.1038/5935.

de Koning D-J, Rattink AP, Harlizius B, van Arendonk JAM, Brascamp EW, Groenen MA: Genome-wide scan for body composition in pigs reveals important role of imprinting. Proc Natl Acad Sci USA. 2000, 97: 7947-7950. 10.1073/pnas.140216397.

Gimelbrant A, Hucchinson JN, Thompson R, Chess A: Widespread monoallelic expression on human authosomes. Nature. 2007, 318: 1136-1140.

Isles AR, Holland AJ: Imprinted genes and mother-offspring interactions. Early Hum Dev. 2005, 81 (1): 73-77. 10.1016/j.earlhumdev.2004.10.006.

Tycko B, Morison IM: Physiological functions of imprinted genes. J Cell Physiol. 2002, 192 (3): 245-258. 10.1002/jcp.10129.

Constancia M, Kelsey G, Reik W: Resourceful imprinting. Nature. 2004, 432 (7013): 53-57. 10.1038/432053a.

Cui YH: A Statistical Framework for Genome-wide Scanning and Testing Imprinted Quantitative Trait Loci. J Theo Biol. 2007, 244: 115-126. 10.1016/j.jtbi.2006.07.009.

Cui YH, Lu Q, Cheverud JM, Littel RL, Wu RL: Model for mapping imprinted quantitative trait loci in an inbred F

_{2}design. Genomics. 2006, 87: 543-551. 10.1016/j.ygeno.2005.11.021.Cui YH, Cheverud JM, Wu RL: A Statistical Model for Dissecting Genomic Imprinting through Genetic Mapping. Genetica. 2007, 130: 227-239. 10.1007/s10709-006-9101-x.

Knott SA, Marklund L, Haley CS, Andersson K, Davies W, Ellegren H, Fredholm M, Hansson I, Hoyheim B, Lundstrom K, Moller M, Andersson L: Multiple marker mapping of quantitative trait loci in a cross between outbred wild boar and Large White pigs. Genetics. 1998, 149: 1069-1080.

de Koning D-J, Bovenhuis H, van Arendonk JAM: On the detection of imprinted quantitative trait loci in experimental crosses of outbred species. Genetics. 2002, 161: 931-938.

Bartolomei MS, Tilghman SM: Genomic imprinting in mammals. Annu Rev Genet. 1997, 31: 493-525. 10.1146/annurev.genet.31.1.493.

Wu RL, Ma C-X, Lin M, Casella G: A general framework for analyzing the genetic architecture of developmental characteristics. Genetics. 2004, 166: 1541-1551. 10.1534/genetics.166.3.1541.

Zhu J: Analysis of conditional effects and variance components in developmental genetics. Genetics. 1995, 141: 1633-1639.

Ma C-X, Casella G, Wu RL: Functional mapping of quantitative trait loci underlying the character process: A theoretical framework. Genetics. 2002, 161: 1751-1762.

Wu RL, Lin M: Functional mapping – How to map and study the genetic architecture of dynamic complex traits. Nat Rev Genet. 2006, 7: 229-237. 10.1038/nrg1804.

Cui Y, Wu R, Casella G, Zhu J: Non-parametric functional mapping of quantitative trait loci underlying programmed cell death. Stat Appl Genet Mol Biol. 2008,

Cui Y, Zhu J, Wu R: Functional Mapping for Genetic Control of Programmed Cell Death. Physiol Genomics. 2006, 25: 458-469. 10.1152/physiolgenomics.00181.2005.

Wu RL, Casella G, Ma C-X: Statistical Genetics of Quantitative Traits: Linkage, Maps and Qtl. 2007, New York: Springer

West GB, Brown JH, Enquist BJ: A general model for ontogenetic growth. Nature. 2001, 413: 628-631. 10.1038/35098076.

Niklas KL: Plant Allometry: The Scaling of Form and Process. 1994, Chicago: University of Chicago

Diggle PJ, Heagerty P, Liang KY, Zeger SL: Analysis of Longitudinal Data. 2002, Oxford, UK: Oxford University Press

Box GEP, Cox DR: An analysis of transformations. Journal of the Royal Statistical Society Series B. 1964, 26: 211-252.

Carroll RJ, Ruppert D: Power-transformations when fitting theoretical models to data. J Am Stat Assoc. 1984, 79: 321-328. 10.2307/2288271.

Wu RL, Ma CX, Lin M, Wang ZH, Casella G: Functional mapping of growth quantitative trait loci using a transform-both-sides logistic model. Biometrics. 2004, 60: 729-38. 10.1111/j.0006-341X.2004.00223.x.

Jaffrézic F, Thompson R, Hill WG: Structured antedependence models for genetic analysis of repeated measures on multiple quantitative traits. Genet Res. 2003, 82: 55-65. 10.1017/S0016672303006281.

Zhao W, Chen YQ, Casella G, Cheverud JM, Wu RL: A non-stationary model for functional mapping of complex traits. Bioinformatics. 2005, 21: 2469-2477. 10.1093/bioinformatics/bti382.

Nelder JA, Mead R: A simplex method for function minimization. Computer J. 1965, 7: 308-313.

Churchill GA, Doerge RW: Empirical threshold values for quantitative trait mapping. Genetics. 1994, 138: 963-971.

Feinberg AP: Genomic imprinting and cancer, in The Metabolic and Molecular Bases of Inherited Disease. Edited by: Scriver CR, Beaudet AL, Sly WS, Valle D. 2001, New York: McGraw-Hill, 525-537.

Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. 1998, Sunderland, MA.: Sinauer

Vaughn TT, Pletscher LS, Peripato A, King-Ellison K, Adams E, Erikson C, Cheverud JM: Mapping quantitative trait loci for murine growth: A closer look at genetic architecture. Genet Res. 1999, 74: 313-322. 10.1017/S0016672399004103.

Dietrich WF, Miller J, Steen R, Merchant MA, Damron-Boles D, Husain Z, Dredge R, Daly MJ, Ingalls KA, O'Connor TJ: A comprehensive genetic map of the mouse genome. Nature. 1996, 380: 149-152. 10.1038/380149a0.

Cassidy SB, Lai LW, Erickson RP, Magnuson L, Thomas E, Gendron R, Herrmann J: Trisomy 15 with loss of the paternal 15 as a cause of Prader-Willi syndrome due to maternal disomy. Am J Hum Genet. 1992, 51: 701-708.

Gold JD, Pedersen RA: Mechanisms of Genomic Imprinting in Mammals. Current Topics in Developmental Biology. 1994, 29: 227-247.

Walter J, Paulsen M: Imprinting and disease. Seminars in Cell & Dev Biol. 2003, 14: 101-110. 10.1016/S1084-9521(02)00142-8.

Wilkins JF, Haig D: What good is genomic imprinting: the function of parent specific gene expression. Nat Rev Genet. 2003, 4: 359-368. 10.1038/nrg1062.

Zeng Z-B: Precision mapping of quantitative trait loci. Genetics. 1994, 136: 1457-1468.

Kao CH, Zeng Z-B, Teasdale RD: Multiple interval mapping for quantitative trait loci. Genetics. 1999, 152: 1203-1216.

Yang R, Gao H, Wang X, Zhang J, Zeng Z-B, Wu RL: A Semiparametric Approach for Composite Functional Mapping of Dynamic Quantitative Traits. Genetics. 2007, 177: 1859-1870. 10.1534/genetics.107.077321.

Dib C, Faure S, Fizames C, Samson S, Drouot N, Vignal A, Millasseau P, Marc S, Hazan J, Seboun E, Lathrop M, Gyapay G, Morissette J, Weissenbach J: A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature. 1996, 380: 152-154. 10.1038/380152a0.

Neff MW, Broman KW, Mellersh CS, Ray K, Acland GM, Aguirre GD, Ziegle JS, Ostrander EA, Rine J: A second-generation genetic linkage map of the domestic dog, Canis familiaris. Genetics. 1999, 151: 803-820.

Marklund L, Moller MJ, Hoyheim B, Davies W, Fredholm M, Juneja RK, Mariani P, Coppieters W, Ellegren H, Andersson L: A comprehensive linkage map of the pig based on a wild pig-Large White intercross. Anim Genet. 1996, 27: 255-269.

Boularan J, Ferre L, Vieu P: Growth curves: a two-stage nonparametric approach. J Stat Plan Infer. 1994, 38: 327-350. 10.1016/0378-3758(94)90014-0.

## Acknowledgements

The authors wish to thank the executive editor Paul Agutter and the three anonymous referees for their valuable comments. The authors are also grateful to James Cheverud for providing the mouse data. This work was supported by research grant DMS-0707031 from the National Science Foundation, NIH grant RR015116 and an intramural research grant from Michigan State University.

## Author information

### Authors and Affiliations

### Corresponding author

## Additional information

### Competing interests

The author(s) declare that they have no competing interests.

### Authors' contributions

YC conceived the idea, developed the model, performed the analysis, and drafted the manuscript. SL and GL assisted in the analysis and preparation of the manuscript.

## Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

## Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Cui, Y., LI, S. & LI, G. Functional mapping imprinted quantitative trait loci underlying developmental characteristics.
*Theor Biol Med Model* **5**, 6 (2008). https://doi.org/10.1186/1742-4682-5-6

Received:

Accepted:

Published:

DOI: https://doi.org/10.1186/1742-4682-5-6