Class | Analysis and data manipulation command |
Name | fpm |
Arguments | <quantitative trait>
[>|>=|<|<=|==|^=
<threshold>]|<binary trait> [nqtl <number simulated QTLs>] [link logit|probit|mft|ln] [likelihood_family gaussian|binomial|poisson| weibull [censor <censoring trait>]] [p|fre] [a|va] [d|vd] [g|vg|h2] [c|vc] [s|vs] [fix a|c|d|e|g|mu|s|var] [aval|cval|dval|eval|gval|mu|pval|sval| var|AA|AB|BB|SD <value>] [cov <x1> [+ <x2>...]] [print] [save <target_variable>] |
Uses a Monte Carlo Markov Chain (Metropolis) algorithm to perform Generalized Linear Mixed Model analysis (when nqtl set to zero), complex segregation analysis (nqtl set to one), "genetic" mixed model analysis (nqtl set to one, g estimated), or a finite polygenic model analysis for a quantitative or binary trait using all phenotyped individuals. The a, d, g, c and s options respectively include additive QTL effects, dominance QTL effects, additive Gaussian polygenic random effects, pedigree environmental effects and maternally derived sibship effects in the model. The fix option allows that variable to be held fixed. The aval (dval etc) keyword allows the starting value of that parameter to be set. Either a Gaussian, Binomial, Weibull or Poisson likelihood can be used. In addition to the canonical links for these three types, one can also fit the multifactorial threshold model to binary data. Because the identity link function for a binomial likelihood is of genetic interest, notably in the case of the single major locus model, special checks to reject models where predictions lie outside 0-1 are made, so that this model can be successfully fitted.
The likelihood contribution from the ith individual to the Metropolis criterion for these models is (see for example, Guo and Thompson [1993]):
LL = F*Σ(log(P(G_{j}))) + F*log(f(a|V_{A})) + (1-F)*log(f(a|a_{FA}, a_{MO})) + log(c|V_{A}) + I*log(f(y|G_{1},...G_{j}, a, c, V_{E}))
where,
P(x) denotes the probability of x,
f(x) denotes the density of x,
y is the trait value,
a is the breeding value,
c is the pedigree-specific intercept,
G_{j} is the genotype at the jth QTL,
V_{A} is the additive polygenic variance,
V_{C} is the familial environmental variance,
V_{E} is the error variance,
F=1 when a founder, 0 when a nonfounder
I=1 when phenotype observed, 0 when unobserved.
The conditional density for the breeding values of offspring includes the correction for inbreeding (the segregation variance being 1-0.5*(F_{FA}+F_{MO}). The random effects are modelled as zero-mean gaussian.
The realizations of the parameters are summarized as means, and approximate standard errors produced by batching (default B=iter^{1/2} [Jones et al 2005]). The interbatch lag-1 serial correlation is calculated as a diagnostic for the appropriate number of values to simulate [Ripley 1987].
The resulting BLUPs or the last set of simulated genotypes can be saved to an appropriate variable in the dataset, using the save option, or printed, if the print option is chosen. By increasing the usual print level, simulated values can be written to the standard output or a file. These can be used to construct confidence intervals etc after being read into other program eg Coda.
The implementation of the generalized linear mixed models is quite straightforward in the chosen Metropolis paradigm (it would be more work to produce a Gibbs sampler, I believe), but for the "standard" sampler, it is important to check that the proposal acceptance rates are in the optimum range (usually stated as 0.2-0.6, Ripley 1987). This is less critical for the slice sampler, where the tabulated acceptance rates are actually the ratio of accepted proposals to the number of function evaluations (and so are just a measure of algorithm efficiency).
Increasing the number of random effects chains is realized by duplicating the families the appropriate number of times and correcting the likelihood and standard errors. One is essentially averaging over multiple estimates of the random effects for each individual, as global parameters such the fixed effects regression coefficients and overall variances are the same over the replicate chains at that iteration.
Example:
>> include boricex.in >> set chains 4 >> set burn 1000 >> set iter 1000 >> fpm nonviable nqtl 0 c s cov dose ------------------------------------------------ Finite Polygenic Model analysis for "nonviable" ------------------------------------------------ Number of families = 4 Number of sibships = 107 Number of observations = 1297 Burn-in MCMC iterations = 1000 Evaluated MCMC iterations = 10000 Number of MCMC chains = 4 Metropolis sampler = Sliced Model type = Binomial Link type = Logit Fixed Effects = nonviable ~ mu + dose Random Effects = VC VS VE Global trait mean = 0.101773 Global trait variance = 0.091416 Number of simulated QTLs = 0 Intercept = -3.175311 Family environmental var = 0.184530 ( 0.0%) Maternal effect variance = 1.037317 ( 0.0%) Environmental variance = 0.038910 (100.0%) Modal model loglikelihood = -512.842 Mean model loglikelihood = -509.542 C.V. Loglikelihood = 1.26% Summarized run as 99 batches of size 101 : Parameter Mean Mode SD Z-value MC-SE Fixed ---------------- ---------- ---------- ---------- ---------- -------- ----- mu -3.1753 -3.1380 0.3362 9.4435 0.0229 VC 0.1845 0.1565 0.2678 0.6890 0.0102 VS 1.0373 1.0368 0.3354 3.0932 0.0159 VE 0.0389 0.0385 0.0118 3.3074 0.0008 sdC 0.4051 0.3791 0.2859 1.4171 0.0109 sdS 1.0152 1.0316 0.1645 6.1707 0.0079 sdE 0.1967 0.2006 0.0300 6.5471 0.0020 dose 2.9324 2.6670 1.6517 1.7754 0.1085 Realized VC 0.1348 0.1418 0.1415 0.9532 0.0051 Realized VS 1.0319 1.0306 0.3015 3.4228 0.0150 Realized VE 0.0764 0.0765 0.0012 63.5607 0.0000 VE(F+R)/VE(F) 0.8461 0.8435 0.0234 36.1427 0.0010 Proposal N Accepted ------------ ------------- -------- Genotype 13798452 0.593 (M) mu 11355 0.191 (S) VC 10910 0.200 (S) VS 11184 0.194 (S) dose 11474 0.192 (S)
<< (blup) | Up to index | >> (assoc) |