Background

Many common chronic diseases are highly polygenic: potentially hundreds or thousands of common risk variants, each with a small effect, contribute to disease risk. A small number of these variants have now been identified through genome-wide association studies (GWAS), but the majority remain to be mapped. Several approaches are expected to increase the power to identify variants with a modest but reproducible effect on disease risk: larger GWAS, the analysis of more refined phenotypes, multivariate association analysis of related phenotypes, gene-based association analyses, and association analyses restricted to functional variants, such as those that regulate gene expression levels.

We reasoned that if the expression of a gene is causally related to disease status, and that expression is determined by multiple independent expression-associated single nucleotide polymorphisms (eSNPs), then a gene-based test that captures the aggregate effect of these eSNPs should have greater power than testing each eSNP individually.

We collated results from published GWAS of gene expression to identify a set of independent eSNPs for each gene. We then developed a gene-based approach to test the association between a trait or disease and all independent eSNPs of a gene. This approach requires only summary association statistics and uses simulations based on observed genotype data to correct for the testing of multiple genes.
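The general idea of aggregating eSNP evidence for one gene can be sketched as follows. This is a minimal illustration, not the EUGENE implementation: it assumes the gene-level statistic is the sum of 1-df chi-square values derived from each eSNP's GWAS P-value, and it replaces the genotype-based simulations described above with independent standard-normal draws, which is only reasonable if the eSNPs are truly uncorrelated. The function names (`p_to_chisq`, `gene_statistic`, `empirical_gene_pvalue`) are hypothetical.

```python
import random
from statistics import NormalDist


def p_to_chisq(p):
    """Convert a two-sided P-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile; its sign is irrelevant after squaring
    return z * z


def gene_statistic(esnp_pvalues):
    """Assumed gene-level statistic: sum of per-eSNP chi-square values."""
    return sum(p_to_chisq(p) for p in esnp_pvalues)


def empirical_gene_pvalue(esnp_pvalues, n_sim=10000, seed=1):
    """Empirical P-value for the gene statistic under a simplified null.

    Assumption: eSNPs are independent, so each null statistic is the sum of
    squared standard-normal draws. A faithful version would instead simulate
    from observed genotype data to preserve residual LD between eSNPs.
    """
    rng = random.Random(seed)
    observed = gene_statistic(esnp_pvalues)
    k = len(esnp_pvalues)
    exceed = 0
    for _ in range(n_sim):
        null_stat = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        if null_stat >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Hypothetical GWAS P-values for three independent eSNPs of one gene.
    print(empirical_gene_pvalue([0.01, 0.03, 0.20]))
```

Because only P-values enter the calculation, a sketch like this can be run on the summary statistics of a GWAS meta-analysis without access to individual-level genotypes.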

The advantage of this approach over previously developed gene-based tests (e.g. VEGAS) is that only SNPs linked to gene expression are included in the analysis, irrespective of how far they are from the gene (both cis-acting and trans-acting eSNPs are included). Compared with approaches that infer gene expression levels from SNP data (e.g. PrediXcan), the main advantage of our gene-based approach is that it does not require individual-level genetic data, so it can be used to analyse summary statistics from GWAS meta-analyses.

5 Oct 2018:

Updated cis eQTL database released (version 2018-06-27). In this version, eQTL are identified from published eQTL studies using a slightly more conservative P-value threshold than in previous versions: P < 8.9 x 10^-10, which corrects for 55,765 transcripts in Gencode v19, each tested for association with 1,000 SNPs, as suggested by others (Davis et al. AJHG 2016, PMID 26749306). Sentinel eQTL in low LD with each other are also identified using a more conservative pairwise r^2 threshold (<0.05 instead of <0.10). These new thresholds reduce the number of sentinel eQTL per gene, which facilitates interpretation of results but could decrease power relative to more liberal thresholds.
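The P-value threshold quoted above can be reproduced with a short calculation. This is a sketch that assumes a Bonferroni correction at a family-wise significance level of 0.05; the level is not stated in the entry but yields the quoted value. The function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(n_transcripts, snps_per_transcript, alpha=0.05):
    """Per-test P-value threshold after correcting for all transcript-SNP tests.

    alpha=0.05 is an assumption; the source only gives the resulting threshold.
    """
    n_tests = n_transcripts * snps_per_transcript
    return alpha / n_tests


if __name__ == "__main__":
    # 55,765 Gencode v19 transcripts, each tested against 1,000 SNPs.
    threshold = bonferroni_threshold(55_765, 1_000)
    print(f"{threshold:.2e}")
```

The result is approximately 8.97 x 10^-10, which matches the quoted 8.9 x 10^-10 when truncated to two significant figures.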

EUGENE-online is no longer supported, as most users have opted to run EUGENE locally.

2 May 2017: In addition to the standalone version (see 'Download & usage' on the left), you can now request a EUGENE analysis online (click on 'EUGENE online'). All you need to do is upload your GWAS results (SNP and P-value). EUGENE-online is currently in beta testing.

22 July 2016: A manuscript that describes this approach and applies it to a GWAS of asthma is in press at the Journal of Allergy and Clinical Immunology (PMID 27554816).