ISCB-Asia/SCCG 2012 PC Invite Presentation Dongxiao Zhu RNA-seq to capture transcriptome landscape
The first problem is ab initio reconstruction of the transcriptome sequence from RNA-seq reads, which can be viewed as randomly sampled from the transcriptome. This reverse-engineering problem is complicated by the sheer volume of reads (hundreds of millions) and by the highly nonlinear structure of the transcriptome. We design a novel divide-and-conquer strategy that localizes reads to annotated regions of the reference genome, and we develop a new algorithm to infer the nonlinear structure within each region. In simulation studies, this approach reconstructs the transcriptome with high accuracy.
The second problem is to quantify the transcripts identified in the first problem. Because isoform transcripts overlap in sequence, the expression signal observed at a given position can be attributed to several isoforms. We develop a novel deconvolution algorithm with shrinkage that infers the relative abundance of the isoform transcripts from the base-pair expression signal of RNA-seq experiments. As with reconstruction, we demonstrate high accuracy in transcriptome quantification using simulation and real-world studies. Finally, I briefly introduce innovative algorithms for reconstructing signaling pathway topologies.
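
To make the quantification problem concrete, the sketch below shows one generic way to deconvolve a base-pair expression signal into isoform abundances. It is not the algorithm presented in the talk. It assumes a hypothetical 0/1 membership matrix (which bases belong to which isoform), uses a ridge penalty as a stand-in for the shrinkage step, and solves the non-negative problem by projected gradient descent. The names `deconvolve_isoforms` and `relative_abundance` are illustrative only.

```python
def deconvolve_isoforms(membership, coverage, shrinkage=0.0, steps=5000):
    """Estimate non-negative isoform abundances from per-base coverage.

    membership[i][j] is 1 if base i belongs to isoform j, else 0 (assumed input).
    Minimises ||A x - y||^2 + shrinkage * ||x||^2 subject to x >= 0
    using projected gradient descent (an illustrative choice of solver).
    """
    n_iso = len(membership[0])
    # trace(A^T A) bounds the largest eigenvalue of A^T A, which gives a safe step size.
    trace = sum(a * a for row in membership for a in row)
    step = 1.0 / (2.0 * (trace + shrinkage))
    x = [0.0] * n_iso
    for _ in range(steps):
        # Residual between predicted and observed coverage at every base.
        residual = [
            sum(a * xj for a, xj in zip(row, x)) - y
            for row, y in zip(membership, coverage)
        ]
        # Gradient of the penalised least-squares objective.
        grad = [
            2.0 * sum(row[j] * r for row, r in zip(membership, residual))
            + 2.0 * shrinkage * x[j]
            for j in range(n_iso)
        ]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, xj - step * g) for xj, g in zip(x, grad)]
    return x


def relative_abundance(abundances):
    """Normalise abundances so they sum to 1 (all zeros if the total is zero)."""
    total = sum(abundances)
    if total <= 0:
        return [0.0] * len(abundances)
    return [a / total for a in abundances]


if __name__ == "__main__":
    # Toy gene: isoform 0 covers all six bases, isoform 1 skips the middle two.
    membership = [[1, 1], [1, 1], [1, 0], [1, 0], [1, 1], [1, 1]]
    coverage = [8, 8, 3, 3, 8, 8]  # made-up per-base signal for illustration
    estimate = deconvolve_isoforms(membership, coverage, shrinkage=0.0)
    print(estimate, relative_abundance(estimate))
```

In this toy setting the bases unique to one isoform are what make the overlapping signal separable. Raising `shrinkage` pulls the estimates toward zero, trading some bias for stability when isoforms share most of their sequence.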