ISCB-Asia/SCCG 2012 Proceedings Talk

Two-way AIC: Detection of Differentially Expressed Genes from Large Scale Microarray Meta-Dataset

Koki Tsuyuzaki1, Daisuke Tominaga2, Yeondae Kwon1 & Satoru Mizaki1
1Department of Medical and Life Science, Tokyo University of Science
2Computational Biology Research Center, AIST, Japan

Abstract

Background: Detecting differentially expressed genes (DEGs) in DNA microarray datasets is a routine task in biomedical research, and numerous methods have been proposed for it. Larger amounts of data are well known to make the results of statistical analyses more reliable. However, the sample size available for each gene in a microarray dataset is often limited by the cost of experiments. By contrast, datasets in public databases are accumulating rapidly. One idea is therefore to integrate multiple datasets into large-scale meta-datasets to increase the reliability of DEG detection. Such a meta-dataset may allow us to detect, in a single analysis, both DEGs shared across experimental conditions and DEGs specific to particular conditions. Herein we propose "two-way AIC", which detects DEGs from meta-datasets on the basis of outlier detection. The method detects genes that are differentially expressed in specific experimental conditions, identifying both the genes and the conditions.

Results: In a case study of Pseudomonas aeruginosa, we evaluate whether the two-way AIC method can detect operon genes as DEGs. In operon detection, the specificity of two-way AIC is superior to that of widely used conventional methods such as the F-test/t-rank or RankProducts methods. Because our method is intended for very large-scale data, we also estimate its computation time to show that it is computationally feasible. The estimated cost is 8.30 × 10^-6 × x^2.47 for data size x.
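To make the reported cost model concrete, the following minimal Python sketch evaluates the power law and shows one common way such an exponent can be estimated, namely a least-squares fit on log-log axes. The coefficient and exponent are taken from the abstract; the time unit is not stated there and is left unspecified. The function names, the example data sizes, and the log-log fitting procedure are illustrative assumptions, not a description of how the authors obtained their estimate.

```python
import math

# Coefficient and exponent as reported in the abstract.
# The time unit is not stated there, so values are left unitless.
COEFF = 8.30e-6
EXPONENT = 2.47


def estimated_cost(x: float) -> float:
    """Evaluate the reported power-law cost model for data size x."""
    return COEFF * x ** EXPONENT


def fit_power_law(sizes, costs):
    """Fit cost = a * size**b by least squares on log-log axes.

    Returns (a, b). This is an assumed, generic fitting procedure.
    """
    log_x = [math.log(s) for s in sizes]
    log_y = [math.log(c) for c in costs]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                       # slope = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept = log of coefficient
    return a, b


# Hypothetical data sizes, with costs generated from the model itself,
# so the fit should recover the reported parameters.
sizes = [100, 1000, 10000]
costs = [estimated_cost(s) for s in sizes]
a, b = fit_power_law(sizes, costs)

# Doubling the data size multiplies the modeled cost by 2**2.47.
ratio = estimated_cost(2000) / estimated_cost(1000)
print(f"fitted a={a:.3g}, b={b:.3f}, cost ratio on doubling={ratio:.2f}")
```

Under this model, doubling the data size multiplies the cost by 2^2.47, roughly 5.5, which is steeper than quadratic growth but below cubic.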

Conclusions: Two-way AIC achieves high specificity for operon gene detection on the microarray meta-dataset, with a computational cost of O(x^2.47), where x is the number of genes or experiments.