Prioritizer WGA


Whole Genome Association Prioritizer (WGA). This new software tool and method combines basic statistical functionality for preprocessing, quality control and single-marker association analysis of raw genotype files from Illumina and Affymetrix WGA chips with a comprehensive genome viewer for the joint exploration of called genotypes and raw data, linkage disequilibrium patterns, and the genes underlying strong hits. It also includes new functionality that helps improve the reliability of detecting real disease SNPs by utilizing our functional human gene network.

A Bayesian approach was used to generate the gene network, based on data from Gene Ontology (GO), KEGG, BIND, HPRD and Reactome; a dataset of approximately 70,000 predicted protein-protein interactions (Lehner and Fraser, 2004); 3,000 predicted human protein-protein interactions (Stelzl et al., 2005); and co-expression data derived from approximately 10,000 human microarray experiments stored in the Gene Expression Omnibus and the Stanford Microarray Database.
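As a rough illustration of what Bayesian integration of several evidence sources can look like, the sketch below combines per-source likelihood ratios into a posterior probability that two genes are functionally linked. It assumes a naive Bayes model (evidence sources treated as conditionally independent), which may differ from the model actually used by the tool. The source names, the likelihood ratios and the prior are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical likelihood ratios per evidence source:
# P(evidence | genes functionally linked) / P(evidence | genes not linked).
# These numbers are invented for illustration, not taken from the tool.
LIKELIHOOD_RATIOS = {
    "shared_go_term": 8.0,
    "predicted_ppi": 20.0,
    "coexpression": 4.0,
}


def posterior_link_probability(evidence, prior=0.001, ratios=LIKELIHOOD_RATIOS):
    """Naive Bayes combination: add the log likelihood ratios to the prior log odds."""
    log_odds = math.log(prior / (1.0 - prior))
    for source in evidence:
        log_odds += math.log(ratios[source])
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Posterior for a gene pair supported by no, one, or all evidence sources.
p_none = posterior_link_probability([])
p_one = posterior_link_probability(["coexpression"])
p_all = posterior_link_probability(["shared_go_term", "predicted_ppi", "coexpression"])
```

Under this assumption each additional supporting source multiplies the odds of a link, so gene pairs backed by several independent sources receive the strongest edges in the resulting network.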

Basic principle of the positional candidate gene prioritization method using gene networks. The figure depicts three different gene-gene interaction data sources that are integrated using a Bayesian approach. After integration of the data sources, the actual gene network is constructed. As an example, all genes are assigned an initial score of 0, and three different susceptibility loci are analyzed, each containing one disease gene (P, Q or R) and two non-disease genes. Per locus, the three positional candidate genes increase the scores of genes that are functionally nearby within the gene network, using a kernel function that models the relationship between gene-gene distance and score effect. Once all loci have been processed, shuffling the three susceptibility loci 10,000 times across the genome yields an empirical p-value per gene, which is then used to rank the positional candidate genes per locus. Genes P, Q and R should end up as the top-ranked genes, as they have the most significant p-values.
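The procedure described above can be sketched on a toy example. Everything in this sketch is an assumption made for illustration: the gene names, the 30-gene "genome", the network edges, the exponential kernel, the use of shortest-path length as gene-gene distance, and the choice to score each candidate only from genes in the other loci. It also uses 1,000 permutations rather than 10,000 to keep the run short. It is not the tool's implementation.

```python
import math
import random
from collections import deque


def bfs_distances(network, start):
    """Shortest-path length (in edges) from start to every reachable gene."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        gene = queue.popleft()
        for neighbour in network.get(gene, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[gene] + 1
                queue.append(neighbour)
    return dist


def kernel(distance, decay=1.0):
    """Assumed kernel: the score effect decays exponentially with network distance."""
    return math.exp(-decay * distance)


def score_gene(gene, other_loci, distances):
    """Sum the kernel contributions that genes in the other loci make to `gene`."""
    total = 0.0
    for locus in other_loci:
        for other in locus:
            d = distances[gene].get(other)
            if other != gene and d is not None:
                total += kernel(d)
    return total


def prioritize(genome, network, loci, n_permutations=1000, seed=0):
    """Rank candidates per locus by an empirical p-value from shuffled loci."""
    rng = random.Random(seed)
    owner = {g: i for i, locus in enumerate(loci) for g in locus}
    distances = {g: bfs_distances(network, g) for g in owner}

    def others(all_loci, index):
        return [locus for j, locus in enumerate(all_loci) if j != index]

    observed = {g: score_gene(g, others(loci, i), distances) for g, i in owner.items()}
    exceed = {g: 0 for g in owner}
    for _ in range(n_permutations):
        # Place each locus as a contiguous window at a random genome position.
        shuffled = []
        for locus in loci:
            start = rng.randrange(len(genome) - len(locus) + 1)
            shuffled.append(genome[start:start + len(locus)])
        for g, i in owner.items():
            if score_gene(g, others(shuffled, i), distances) >= observed[g]:
                exceed[g] += 1
    # Add-one correction so that an empirical p-value is never exactly zero.
    p_values = {g: (exceed[g] + 1) / (n_permutations + 1) for g in owner}
    ranked = [sorted(locus, key=lambda g: p_values[g]) for locus in loci]
    return ranked, p_values


# Toy data: G1, G11 and G21 play the role of disease genes P, Q and R.
genome = ["G%d" % i for i in range(30)]
loci = [genome[0:3], genome[10:13], genome[20:23]]
edges = [("G1", "G11"), ("G11", "G21"), ("G1", "G21"), ("G5", "G6"), ("G15", "G25")]
network = {}
for a, b in edges:
    network.setdefault(a, set()).add(b)
    network.setdefault(b, set()).add(a)

ranked, p_values = prioritize(genome, network, loci)
```

In this toy network the three "disease" genes are directly connected to each other, so each receives a score from the other two loci that random placements of those loci rarely match. The remaining candidates have no network neighbours in the other loci, score 0, and therefore receive an empirical p-value of 1.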

 

 
