Sib-pair Command: assoc


Class: Analysis and data manipulation command
Name: assoc
Arguments: <binary trait> | <quantitative trait> [(>|>=|<|<=|==|^=) <threshold> | categorical] [founders] [covariate <covariate>] [genotypic] [ibd <ibd_marker>] [snp | freq | maf | risk]

For a binary or dichotomised quantitative trait, prints chi-square statistics for association between each marker locus and the trait, comparing affected with unaffected individuals if the trait is binary, or individuals above and below the threshold if the trait is quantitative.

If the categorical modifier is present, then each unique value of a quantitative trait is taken to represent a category of a multinomial trait.

For dichotomous traits, a second table in the output shows the results from the FBAT or reconstruction-combined TDT within informative sibships.

The snp modifier leads to simplified output. For a binary trait, it prints the allelic odds ratio and 95% confidence interval, followed by the P-value; it also prevents the FBAT statistic from being calculated. The odds ratio is for the increasing allele if two alleles are present; otherwise it is for the base (i.e. first) allele. For a quantitative trait, the beta and SE for the second allele are given; the output indicates this allele and its frequency.
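The quantitative-trait columns of the snp output can be checked by hand. The following Python sketch is not Sib-pair code. It assumes, based only on the snp example output further down this page, that Z is the absolute value of Beta/ASE and that the P-value is the one-sided upper tail of the standard normal distribution. The function name z_and_p is hypothetical.

```python
import math

def z_and_p(beta, ase):
    """Return (|beta / ase|, one-sided upper-tail normal P-value)."""
    z = abs(beta / ase)
    # Upper tail of the standard normal: 1 - Phi(z) = erfc(z / sqrt(2)) / 2
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p

# Beta and ASE for two markers, copied from the snp example output.
for name, beta, ase in [("rs6576700", -0.2507, 0.2234),
                        ("rs7730126", 0.2500, 0.2919)]:
    z, p = z_and_p(beta, ase)
    print(f"{name}: Z = {z:.4g}, P = {p:.4f}")
```

Under these assumptions the results agree with the printed Z and P-value columns (1.122 and 0.1309; 0.8564 and 0.1959) to about three decimal places.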

The freq, maf, and risk modifiers also lead to simplified output aimed at SNP analysis: respectively, they print case and control allele frequencies at either the first allele (in the collation sequence), the minor allele (minor, that is, in controls), or the risk allele.
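As an illustration of the quantities these modifiers report, the Python sketch below computes case and control allele frequencies for a diallelic marker, picks the first allele and the minor allele in controls, and forms an allelic odds ratio. The counts are hypothetical, the function names are invented for this example, and the Woolf (logit) confidence interval is an assumption: this page does not say which interval Sib-pair uses, nor how the risk allele is chosen.

```python
import math

def allele_frequencies(counts):
    """Convert {allele: copies} into {allele: frequency}."""
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def first_and_minor_allele(control_counts):
    """First allele in collation order, and the allele that is rarer in controls."""
    first = min(control_counts)
    minor = min(control_counts, key=lambda a: control_counts[a])
    return first, minor

def allelic_odds_ratio(case_counts, control_counts, allele, z=1.96):
    """Odds ratio for `allele` versus the other allele, with a Woolf interval (assumption)."""
    (other,) = [a for a in case_counts if a != allele]  # requires exactly two alleles
    a, b = case_counts[allele], case_counts[other]
    c, d = control_counts[allele], control_counts[other]
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts (allele copies, not individuals).
cases = {"A": 60, "G": 40}
controls = {"A": 40, "G": 60}

print(allele_frequencies(cases), allele_frequencies(controls))
print(first_and_minor_allele(controls))
print(allelic_odds_ratio(cases, controls, "A"))
```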

The command also prints F statistics, treating the trait as an indicator of membership of different subpopulations (see fstats).

For a quantitative trait, prints the model and residual sums-of-squares and allelic means with naive standard errors from an additive allelic ANOVA model.
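The Python sketch below shows one way such a table could be computed; it is an illustration, not the Sib-pair implementation. It assumes that each individual contributes one observation per allele copy and that the naive standard error of a group mean is sqrt(MSE / count), using the pooled residual mean square. The second assumption is consistent with the genotypic example output on this page, where sqrt(3158.9219 / 40) is approximately 8.8867 for genotype 1/2. All function names are hypothetical.

```python
import math
from collections import defaultdict

def allele_observations(people):
    """people: iterable of ((allele1, allele2), trait_value).
    Each person contributes one observation per allele copy (assumption)."""
    for (a1, a2), value in people:
        yield a1, value
        yield a2, value

def one_way_anova(observations):
    """observations: iterable of (group_label, trait_value).
    Needs more observations than groups so the residual df is positive."""
    groups = defaultdict(list)
    for label, value in observations:
        groups[label].append(value)
    n_total = sum(len(v) for v in groups.values())
    grand_mean = sum(sum(v) for v in groups.values()) / n_total
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    model_ss = sum(len(v) * (means[g] - grand_mean) ** 2 for g, v in groups.items())
    resid_ss = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    mse = resid_ss / (n_total - len(groups))
    # Naive SE: ignores relatedness and the pairing of alleles within a person.
    std_errs = {g: math.sqrt(mse / len(v)) for g, v in groups.items()}
    return means, std_errs, model_ss, resid_ss

# Hypothetical data: genotype and trait value for four individuals.
people = [((1, 1), 100.0), ((1, 2), 120.0), ((2, 2), 150.0), ((1, 2), 130.0)]
print(one_way_anova(allele_observations(people)))
```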

Monte-Carlo empirical P-values are produced for either analysis via gene-dropping, which can be either unconditional or completely linked to that of another (or the same) marker (ibd keyword). Analysis may be restricted to founders by adding the founders modifier. Genotypic rather than allelic analyses can also be requested with the genotypic modifier. Covariates can be added to some analyses (covariate keyword).
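To make the idea of unconditional gene-dropping concrete, here is a minimal Python sketch; it is not the Sib-pair algorithm. It assumes founders receive two alleles drawn independently from the population allele frequencies and each non-founder receives one randomly chosen allele from each parent. The empirical P-value is taken as the proportion of simulated statistics that equal or exceed the observed one, mirroring the "Equalled or exceeded by" line in the output. The pedigree layout and function names are hypothetical.

```python
import random

def gene_drop(pedigree, allele_freqs, rng):
    """pedigree: list of (person, father, mother), parents listed before children;
    founders have father and mother set to None."""
    alleles = list(allele_freqs)
    weights = [allele_freqs[a] for a in alleles]
    genotypes = {}
    for person, father, mother in pedigree:
        if father is None and mother is None:
            # Founder: sample both alleles from the population frequencies.
            genotypes[person] = tuple(rng.choices(alleles, weights=weights, k=2))
        else:
            # Non-founder: Mendelian transmission of one allele from each parent.
            genotypes[person] = (rng.choice(genotypes[father]),
                                 rng.choice(genotypes[mother]))
    return genotypes

def empirical_p(observed, simulated):
    """Proportion of simulated statistics that equal or exceed the observed one."""
    hits = sum(1 for s in simulated if s >= observed)
    return hits / len(simulated)

# A nuclear family with two children, and a diallelic marker.
family = [("dad", None, None), ("mum", None, None),
          ("kid1", "dad", "mum"), ("kid2", "dad", "mum")]
print(gene_drop(family, {1: 0.3, 2: 0.7}, random.Random(1)))
```

In a full analysis, the association statistic would be recomputed on each simulated set of genotypes and the resulting values passed to empirical_p.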

For a binary trait genotypic analysis, if the trait prevalence has been specified (using set prevalence), then the penetrances, sibling recurrence risk ratio and attributable risk are output. The attributable risk is calculated using each genotype in turn as the base genotype, and is summarized as the value obtained using the lowest-penetrance homozygous genotype as the base genotype:

AR = (Prevalence-Penetrance[lowest homozygote])/Prevalence
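The formula can be checked against the apoe example output on this page with a short Python sketch. The AR line follows the formula above. The penetrance calculation is an assumption: it applies Bayes' rule to the genotype frequencies in affected and unaffected individuals and the stated prevalence, which reproduces the printed Penetrance column but is not confirmed here as the program's method. The function name is hypothetical.

```python
def penetrances_and_ar(affected, unaffected, prevalence):
    """Return {genotype: (penetrance, attributable_risk)} for observed genotypes."""
    n_aff = sum(affected.values())
    n_unaff = sum(unaffected.values())
    result = {}
    for genotype in affected:
        f_aff = affected[genotype] / n_aff
        f_unaff = unaffected[genotype] / n_unaff
        # Population genotype frequency as a prevalence-weighted mixture (assumption).
        pop_freq = prevalence * f_aff + (1 - prevalence) * f_unaff
        if pop_freq == 0:
            continue  # genotype not observed, e.g. 2/2 in the example
        penetrance = prevalence * f_aff / pop_freq
        # AR with this genotype as the base genotype, as in the formula above.
        ar = (prevalence - penetrance) / prevalence
        result[genotype] = (penetrance, ar)
    return result

# Genotype counts copied from the apoe example output.
affected = {"2/2": 0, "2/3": 1, "3/3": 7, "2/4": 1, "3/4": 23, "4/4": 11}
unaffected = {"2/2": 0, "2/3": 7, "3/3": 58, "2/4": 5, "3/4": 38, "4/4": 12}

for genotype, (pen, ar) in penetrances_and_ar(affected, unaffected, 0.05).items():
    print(f"{genotype}  penetrance = {pen:.4f}  AR = {ar:.4f}")
```

For the 3/3 genotype, the lowest-penetrance homozygote observed, this gives a penetrance of about 0.0174 and an AR of about 0.6516, in line with the summary attributable risk in the example.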

The sibling recurrence risk ratio is calculated using the same algorithm as that used by sml, except that in this case it allows multiple alleles.
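For orientation, the standard variance-components expression for the sibling recurrence risk ratio is lambda_s = 1 + (VA/2 + VD/4) / K^2, where K is the prevalence. The Python sketch below evaluates it; whether sml uses exactly this expression is an assumption, and the function name is hypothetical.

```python
def sib_recurrence_risk_ratio(va, vd, prevalence):
    """lambda_s = 1 + (VA/2 + VD/4) / K**2 (standard formula; assumed here)."""
    return 1.0 + (0.5 * va + 0.25 * vd) / prevalence ** 2

# VA, VD and prevalence as printed (rounded) in the apoe example output.
print(sib_recurrence_risk_ratio(0.0012, 0.0001, 0.05))
```

With the rounded variances shown in the apoe example this gives about 1.25, close to the printed 1.2557; the difference is within what rounding of VA and VD to four decimal places allows.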

The binary or categorical trait association test statistic is the Pearson contingency chi-square based test for equality of allele or genotype frequencies at a marker locus in individuals expressing different trait values (RxC table).
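A minimal Python sketch of the Pearson contingency chi-square for an RxC table follows, applied to the genotype counts from the apoe example output on this page. It is an illustration rather than Sib-pair's own code; dropping empty rows and columns before counting degrees of freedom is an assumption that matches the 4 degrees of freedom reported in that example.

```python
def pearson_chi_square(table):
    """table: list of rows of counts. Returns (chi_square, degrees_of_freedom)."""
    # Drop all-zero rows and columns (e.g. the unobserved 2/2 genotype).
    rows = [row for row in table if sum(row) > 0]
    keep = [j for j in range(len(rows[0])) if sum(row[j] for row in rows) > 0]
    rows = [[row[j] for j in keep] for row in rows]

    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df

# Genotype counts (2/2, 2/3, 3/3, 2/4, 3/4, 4/4) from the apoe example.
affected = [0, 1, 7, 1, 23, 11]
unaffected = [0, 7, 58, 5, 38, 12]
chi2, df = pearson_chi_square([affected, unaffected])
print(f"chi-square = {chi2:.1f}, df = {df}")
```

This reproduces the contingency Pearson chi-square of 18.7 on 4 nominal degrees of freedom shown in the example.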

Example:

>> ass AD
--------------------------------------------------
Allelic association testing for trait "AD"
--------------------------------------------------
Marker     Typed  Allels Chi-square Asy P  Emp P  Iters
---------- ------ ------ ---------- ------ ------ ------
D14S52         21      7        4.7 0.5890 0.2674    187 AssX2-HWE .  
D14S52          6      7        0.5 1.0000 0.7692     65 RC-TDT    .  
D14S43         21      7       12.6 0.0498 0.0060    501 AssX2-HWE *  
D14S43          2      7        3.3 0.4070 0.2024    247 RC-TDT    .  
D14S53         20      6       11.0 0.0516 0.0040    501 AssX2-HWE *  
D14S53          3      6        1.0 1.0000 0.4032    124 RC-TDT    .  

>> ass LDL
--------------------------------------------------
Allelic association testing for trait "LDL"
--------------------------------------------------
Marker     Typed  Allels Chi-square Asy P  Emp P  Iters
---------- ------ ------ ---------- ------ ------ ------
ADA           184      2        0.0 0.8873 1.0000     20 ANOVA-HWE .  
ADA            96      2        0.8 0.3854 0.5000     40 ANOVA-CPG .  
C3            187      3       45.1 0.0000 0.0050    201 ANOVA-HWE ***
C3             98      3       24.9 0.0000 0.0149    201 ANOVA-CPG ***

>> set plevel 1
>> ass LDL genotypic

--------------------------------------------------
Allelic association testing for trait "LDL"
--------------------------------------------------
NOTE:  Genotypic rather than allelic association test.

  ------ QTL Association with "C3        " -----
  Genotype   Gtypic Mean     Stand Error   Count
  ----------------------------------------------
   1/1           194.0000        32.4496       3
   1/2           117.0000         8.8867      40
   2/2           120.3821         5.0678     123
   1/3           228.0000        39.7424       2
   2/3           211.5263        12.8941      19
   3/3             0.0000         0.0000       0
  ----------------------------------------------
  Total          131.2513        63.5233     187

 No. trait(+) marker(-)  =      3
 No. trait(+) marker(+)  =    187
 Model Mean Square       = 679411.6445 (df=   5)
 Mean Square Error       =   3158.9219 (df= 182)
 Genetic Variance        =    939.1733
 Likelihood ratio test   =     49.8482
 Nominal P-value         =      0.0000
 Equalled or exceeded by =   1/  201 simulated values (0.0050)
 Mean (SD) simulated MSE =   3977.5971 (    103.8145)

  ------ QTL Association with "C3        "-------
  ------ Conditioned on Parental Genotype -------
    Allele   Allelic Mean    Stand Error   Count
  -----------------------------------------------
       1          60.5876        14.7357      18
       2          55.9478         3.6638     165
       3         151.3107        17.1900      13
  ----------------------------------------------
  Total          125.3980        68.3354     196

 No. trait(+) marker(-)  =      0
 No. trait(+) marker(+)  =     98
 Model Mean Square       = 547548.0744 (df=   3)
 Mean Square Error       =   3698.2608 (df=  95)
 Likelihood ratio test   =     24.8991
 Nominal P-value         =      0.0000
 Equalled or exceeded by =   1/  201 simulated values (0.0050)
 Mean (SD) simulated MSE =   4347.1922 (    228.2397)

>>  set prev 0.05

Binary trait model prevalence = 0.050000

>>  ass ad gen

->  ass ad gen

--------------------------------------------------
Allelic association testing for trait "ad"
--------------------------------------------------
NOTE:  Genotypic rather than allelic association test.


  ---- Association Analysis for "apoe      " -----
  Genotype  Affected    Unaffected    Total    Dev   Penetrance   AR
  ------------------------------------------------------------------
   2/2        0 (.000)      0 (.000)       0    0.0      -
   2/3        1 (.023)      7 (.058)       8   -4.2     0.0206 0.5890
   3/3        7 (.163)     58 (.483)      65  -11.6     0.0174 0.6516
   2/4        1 (.023)      5 (.042)       6   -3.5     0.0285 0.4293
   3/4       23 (.535)     38 (.317)      61   -6.4     0.0816 -.6328
   4/4       11 (.256)     12 (.100)      23   -2.9     0.1187 ******
  -------------------------------------------------
   Total     43           120            163

       No. trait(+) marker(-) =   168
       No. trait(+) marker(+) =   163

     Assumed trait prevalence =   0.0500
     Genetic variances VA, VD =   0.0012   0.0001
    Sib recurrence risk ratio =   1.2557
            Attributable risk =   0.6516 (non-3)

   Contingency Pearson chi-sq =  18.7
   Nominal degrees of freedom =   4
              Nominal P-value =   0.0009
      Equalled or exceeded by = 1/201 simulated values (0.0050)
 Mean (Var) simulated chi-sqs =   3.9 (   7.2)

>> keep 1007_s_at 1 -- 50
>> rec $m nuc
>> blo 1007_s_at
>> ass 1007_s_at snp

--------------------------------------------------
Allelic association testing for trait "1007_s_at"
--------------------------------------------------
NOTE:  Showing allelic betas for diallelic markers.


Marker               Allele AF   Typed   Beta        ASE        Z        P-value
-------------------- ------ ---- ------ ---------- ----------  ---------- ---------
rs6576700               G A 0.55     20 -.2507     0.2234      1.122     0.1309 SNP
rs7730126               G A 0.68     22 0.2500     0.2919     0.8564     0.1959 SNP
rs10834942              G A 0.17     15 0.4635     0.5683     0.8156     0.2074 SNP
rs7995987               T G 0.64     22 0.6581E-02 0.2446     0.2690E-01 0.4893 SNP
[...]

See also:

set prevalence: fix population binary trait prevalence
sml: recurrence risks

