Class | Analysis and data manipulation command |
Name | assoc |
Arguments | <binary trait> | <quantitative trait> [(>|>=|<|<=|==|^=) <threshold> | categorical] [founders] [covariate <covariate>] [genotypic] [ibd <ibd_marker>] [snp | freq | maf | risk] |
For a binary or dichotomised quantitative trait, prints chi-square association statistics for every marker locus versus the trait: affected versus unaffected if the trait is binary, or above versus below the threshold if the trait is quantitative.
If the categorical modifier is present, then each unique value of a quantitative trait is taken to represent a category of a multinomial trait.
For dichotomous traits, a second table in the output shows the results from the FBAT or reconstruction-combined TDT within informative sibships.
The snp modifier leads to simplified output: for a binary trait, the allelic odds ratio and 95% confidence interval, followed by the P-value; it also prevents the FBAT statistic from being calculated. The odds ratio is for the increasing allele if two alleles are present; otherwise it is for the base (i.e. first) allele. For a quantitative trait, the beta and SE for the second allele are given; the output indicates this allele and its frequency.
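As an illustration of the kind of quantity the snp modifier reports for a binary trait, the following Python sketch computes an allelic odds ratio with a 95% confidence interval from a 2x2 table of allele counts. It uses the standard log (Woolf) interval, which is an assumption: this page does not state which interval the program uses. The function name and the allele counts are hypothetical.

```python
import math

def allelic_odds_ratio(case_counts, control_counts, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval.

    case_counts and control_counts are (allele, other allele) counts.
    All four counts must be non-zero.
    """
    a, b = case_counts
    c, d = control_counts
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, not taken from any data set on this page.
odds_ratio, lower, upper = allelic_odds_ratio((30, 70), (20, 80))
print(f"OR = {odds_ratio:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```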
The freq, maf, and risk modifiers also lead to simplified output aimed at SNP analysis: respectively, they print case and control allele frequencies at either the first allele (in the collation sequence), the minor allele (minor, that is, in controls), or the risk allele.
Also prints F statistics, assuming the trait is an indicator of membership of different subpopulations (see fstats).
For a quantitative trait, prints the model and residual sums-of-squares and allelic means with naive standard errors from an additive allelic ANOVA model.
Monte-Carlo empirical P-values are produced for either analysis via gene-dropping, which can be either unconditional or completely linked to that of another (or the same) marker (ibd keyword). Analysis may be restricted to founders by adding the founders modifier. Genotypic rather than allelic analyses can also be specified, using the genotypic modifier. Covariates can be added to some analyses.
For a binary trait genotype analysis, if the trait prevalence has been specified (using set prevalence), then the penetrances, sibling recurrence risk ratio and the attributable risk are output. The attributable risk using each genotype as the base genotype is calculated, and summarized as that using the lowest penetrance homozygous genotype as the base genotype:
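The "Equalled or exceeded by = 1/201 simulated values (0.0050)" lines in the examples suggest how the empirical P-value is tallied. The Python sketch below shows only that counting step. It assumes that the observed statistic is counted as one of the replicates, which is inferred from the "1/201" format rather than stated on this page, and it takes a ready-made list of simulated statistics instead of performing gene-dropping. The function name and the simulated values are hypothetical.

```python
def empirical_p(observed, simulated):
    """Count statistics that equal or exceed the observed value.

    Assumption: the observed statistic itself is included as one
    replicate, so the smallest possible result is 1/(n + 1).
    """
    hits, total = 1, 1
    for value in simulated:
        total += 1
        if value >= observed:
            hits += 1
    return hits, total, hits / total

# Hypothetical null replicates: 200 simulated chi-squares, all below
# the observed value of 18.7.
hits, total, p = empirical_p(18.7, [3.9] * 200)
print(f"Equalled or exceeded by = {hits}/{total} simulated values ({p:.4f})")
```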
AR = (Prevalence-Penetrance[lowest homozygote])/Prevalence
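As a check on the formula, this Python sketch recomputes the AR column of the "ass ad gen" example from the printed penetrances and the prevalence of 0.05. The results agree with the printed column only to about three decimal places, because the printed penetrances are themselves rounded. The function and variable names are illustrative.

```python
def attributable_risk(prevalence, base_penetrance):
    """AR = (Prevalence - Penetrance[base genotype]) / Prevalence."""
    return (prevalence - base_penetrance) / prevalence

prevalence = 0.05
# Penetrances as printed in the example output (2/2 has no observations).
penetrance = {"2/3": 0.0206, "3/3": 0.0174, "2/4": 0.0285,
              "3/4": 0.0816, "4/4": 0.1187}

# Attributable risk taking each genotype in turn as the base genotype.
ar = {g: attributable_risk(prevalence, p) for g, p in penetrance.items()}

# Summary value: the lowest-penetrance homozygote is the base genotype.
homozygotes = [g for g in penetrance if g.split("/")[0] == g.split("/")[1]]
base = min(homozygotes, key=penetrance.get)

for genotype, value in ar.items():
    print(f"{genotype}  {value:8.4f}")
print(f"Attributable risk = {ar[base]:.4f} (base genotype {base})")
```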
The sibling recurrence risk ratio is calculated using the same algorithm as that used by sml, except that in this case it allows multiple alleles.
The binary or categorical trait association test statistic is the Pearson contingency chi-square based test for equality of allele or genotype frequencies at a marker locus in individuals expressing different trait values (RxC table).
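The Pearson statistic can be checked by hand against the "ass ad gen" example. The Python sketch below applies the usual RxC formula to the genotype counts from that output and gives a chi-square of about 18.7 on 4 degrees of freedom. Dropping the empty 2/2 column is an assumption, made because it matches the reported degrees of freedom. The tail probability uses a closed form that holds only for even degrees of freedom. The function names are illustrative.

```python
import math

def pearson_chi_square(table):
    """Pearson chi-square for an RxC table, ignoring empty rows/columns."""
    rows = [r for r in table if sum(r) > 0]
    cols = [c for c in zip(*rows) if sum(c) > 0]
    row_totals = [sum(c[i] for c in cols) for i in range(len(rows))]
    n = sum(row_totals)
    chi2 = 0.0
    for col in cols:
        col_total = sum(col)
        for i, observed in enumerate(col):
            expected = row_totals[i] * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(cols) - 1)
    return chi2, df

def chi2_sf_even_df(x, df):
    """Upper tail probability of chi-square; valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Genotype counts from the example: 2/2, 2/3, 3/3, 2/4, 3/4, 4/4.
affected = [0, 1, 7, 1, 23, 11]
unaffected = [0, 7, 58, 5, 38, 12]
chi2, df = pearson_chi_square([affected, unaffected])
p_value = chi2_sf_even_df(chi2, df)
print(f"chi-sq = {chi2:.1f}, df = {df}, P = {p_value:.4f}")
```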
Example:
>> ass AD
--------------------------------------------------
Allelic association testing for trait "AD"
--------------------------------------------------
Marker Typed Allels Chi-square Asy P Emp P Iters
---------- ------ ------ ---------- ------ ------ ------
D14S52 21 7 4.7 0.5890 0.2674 187 AssX2-HWE .
D14S52 6 7 0.5 1.0000 0.7692 65 RC-TDT .
D14S43 21 7 12.6 0.0498 0.0060 501 AssX2-HWE *
D14S43 2 7 3.3 0.4070 0.2024 247 RC-TDT .
D14S53 20 6 11.0 0.0516 0.0040 501 AssX2-HWE *
D14S53 3 6 1.0 1.0000 0.4032 124 RC-TDT .
>> ass LDL
--------------------------------------------------
Allelic association testing for trait "LDL"
--------------------------------------------------
Marker Typed Allels Chi-square Asy P Emp P Iters
---------- ------ ------ ---------- ------ ------ ------
ADA 184 2 0.0 0.8873 1.0000 20 ANOVA-HWE .
ADA 96 2 0.8 0.3854 0.5000 40 ANOVA-CPG .
C3 187 3 45.1 0.0000 0.0050 201 ANOVA-HWE ***
C3 98 3 24.9 0.0000 0.0149 201 ANOVA-CPG ***
>> set plevel 1
>> ass LDL genotypic
--------------------------------------------------
Allelic association testing for trait "LDL"
--------------------------------------------------
NOTE: Genotypic rather than allelic association test.
------ QTL Association with "C3 " -----
Genotype Gtypic Mean Stand Error Count
----------------------------------------------
1/1 194.0000 32.4496 3
1/2 117.0000 8.8867 40
2/2 120.3821 5.0678 123
1/3 228.0000 39.7424 2
2/3 211.5263 12.8941 19
3/3 0.0000 0.0000 0
----------------------------------------------
Total 131.2513 63.5233 187
No. trait(+) marker(-) = 3
No. trait(+) marker(+) = 187
Model Mean Square = 679411.6445 (df= 5)
Mean Square Error = 3158.9219 (df= 182)
Genetic Variance = 939.1733
Likelihood ratio test = 49.8482
Nominal P-value = 0.0000
Equalled or exceeded by = 1/ 201 simulated values (0.0050)
Mean (SD) simulated MSE = 3977.5971 ( 103.8145)
------ QTL Association with "C3 "-------
------ Conditioned on Parental Genotype -------
Allele Allelic Mean Stand Error Count
-----------------------------------------------
1 60.5876 14.7357 18
2 55.9478 3.6638 165
3 151.3107 17.1900 13
----------------------------------------------
Total 125.3980 68.3354 196
No. trait(+) marker(-) = 0
No. trait(+) marker(+) = 98
Model Mean Square = 547548.0744 (df= 3)
Mean Square Error = 3698.2608 (df= 95)
Likelihood ratio test = 24.8991
Nominal P-value = 0.0000
Equalled or exceeded by = 1/ 201 simulated values (0.0050)
Mean (SD) simulated MSE = 4347.1922 ( 228.2397)
>> set prev 0.05
Binary trait model prevalence = 0.050000
>> ass ad gen
-> ass ad gen
--------------------------------------------------
Allelic association testing for trait "ad"
--------------------------------------------------
NOTE: Genotypic rather than allelic association test.
---- Association Analysis for "apoe " -----
Genotype Affected Unaffected Total Dev Penetrance AR
------------------------------------------------------------------
2/2 0 (.000) 0 (.000) 0 0.0 -
2/3 1 (.023) 7 (.058) 8 -4.2 0.0206 0.5890
3/3 7 (.163) 58 (.483) 65 -11.6 0.0174 0.6516
2/4 1 (.023) 5 (.042) 6 -3.5 0.0285 0.4293
3/4 23 (.535) 38 (.317) 61 -6.4 0.0816 -.6328
4/4 11 (.256) 12 (.100) 23 -2.9 0.1187 ******
-------------------------------------------------
Total 43 120 163
No. trait(+) marker(-) = 168
No. trait(+) marker(+) = 163
Assumed trait prevalence = 0.0500
Genetic variances VA, VD = 0.0012 0.0001
Sib recurrence risk ratio = 1.2557
Attributable risk = 0.6516 (non-3)
Contingency Pearson chi-sq = 18.7
Nominal degrees of freedom = 4
Nominal P-value = 0.0009
Equalled or exceeded by = 1/201 simulated values (0.0050)
Mean (Var) simulated chi-sqs = 3.9 ( 7.2)
>> keep 1007_s_at 1 -- 50
>> rec $m nuc
>> blo 1007_s_at
>> ass 1007_s_at snp
--------------------------------------------------
Allelic association testing for trait "1007_s_at"
--------------------------------------------------
NOTE: Showing allelic betas for diallelic markers.
Marker Allele AF Typed Beta ASE Z P-value
-------------------- ------ ---- ------ ---------- ---------- ---------- ---------
rs6576700 G A 0.55 20 -.2507 0.2234 1.122 0.1309 SNP
rs7730126 G A 0.68 22 0.2500 0.2919 0.8564 0.1959 SNP
rs10834942 G A 0.17 15 0.4635 0.5683 0.8156 0.2074 SNP
rs7995987 T G 0.64 22 0.6581E-02 0.2446 0.2690E-01 0.4893 SNP
[...]
See also:
set prevalence | fix population binary trait prevalence |
sml | recurrence risks |