- Linking transcriptomes of silver carp and bighead carp to their growth and feeding habits,
Guoqing Lu, Michael Wachholtz, Xiaolin Liao, James Lamer, Chenghui Wang
Transcriptomes of two important aquaculture fish, silver carp and bighead carp, were analyzed, resulting in around 40,000 transcripts with known functions. A total of 518 genes were significantly differentially expressed between the two carps. Our study revealed critical genes and networks related to their growth and feeding habits.
- Transcriptomics of common carp: insight into gene expression differentiation of body colors,
Chenghui Wang, Guoqing Lu
Oujiang color common carp is unique in its body color; however, the underlying molecular mechanism remains unclear. Here we conducted transcriptome profiling of the skin of two carp strains. Our preliminary study revealed significant expression differences in genes that are potentially related to the body color patterns.
- Comparison of Cancer Cell Line Genomic Profiling Data Reported by Sanger and CCLE,
Xiaoqiao Liu, Zongfu Cao, Yue Wang
We present a detailed comparison of cancer cell line genomic profiling data reported by the Sanger Cancer Cell Lines Project and the Broad-Novartis Cancer Cell Line Encyclopedia (CCLE) project. Considerably high consistency is observed for mRNA expression and copy number data, yet the mutation data from the two projects show more discordance.
- Bayesian data analysis of genomic microarrays for detecting Streptococcus pneumoniae multiple serotype carriage,
Richard Newton, Jason Hinds, Lorenz Wernisch
We describe Bayesian methods for the accurate analysis of data from genomic microarrays designed for Streptococcus pneumoniae molecular serotyping. This approach enabled the development of a flexible and expandable statistical model that produces a robust and highly accurate analysis of the data and is particularly successful in cases of multiple serotype carriage.
- Provenance of unaligned reads in ChIP-Seq studies,
Zachary W. Ouma, Katherine Mejia-Guerra, Erich Grotewold
Not all sequence reads align to their reference genomes in ChIP-Seq data analysis. This study determined the source of such unaligned reads. Our analyses reveal a significant level of contamination of unaligned reads with sequences of bacterial and metazoan origin in most sequence data sets analysed.
- Assembling large sets of similar transcripts from short reads: cyclotide identification from the Oldenlandia affinis transcriptome,
Quentin Kaas, Husen Jia, Joshua S. Mylne, David J. Craik
Cyclotides are a very large family of plant peptides known for their intriguing cyclic structure. Plants typically produce a hundred cyclotide precursors, which contain very similar and at times internally repeating domain sequences. Our work tackled the challenging task of assembling cyclotide-containing transcriptomes using NGS short reads.
- Identification of electron transport proteins in different complexes of cellular respiration system,
Yu-Yen Ou
We propose a method based on radial basis function networks, using Position Specific Scoring Matrix (PSSM) profiles and amino acid biochemical properties, to identify electron transport proteins in different complexes of the cellular respiration system.
- A comparative study of copy number variation detection methods for next-generation sequencing technologies,
Yu-Ping Wang, Junbo Duan
The emergence of NGS technologies enables high-resolution detection of CNVs. We compared and evaluated five recently published CNV detection methods (CNV-seq, FREEC, readDepth, CNVnator and SegSeq) under different experimental settings, which will help investigators make informed decisions when choosing a suitable method for their specific needs.
- Finding Molecular Mechanisms behind IgA Nephropathy Utilizing Microarray and qPCR Technology,
Peidi Liu, Kerstin Ebefors, Jenny Nyström, Börje Haraldsson
We investigate the kidney disease IgA nephropathy using microarray and qPCR technologies, applying a statistical analysis approach to uncover the molecular mechanisms behind the disease.
- Nitrated proteome in human embryonic stem cells,
Ji Hwan Park, Hyobin Jeong, Daehee Hwang, Sojeong Yun
Nitrated proteomes were profiled from human embryonic stem cells (ESCs) and their differentiated cells by isolating nitrated peptides using fluorine affinity purification, followed by LC-MS/MS analysis. Comparison with known phosphorylation sites and integration with transcriptomics provide a basis for understanding the potential roles of protein tyrosine nitration in the self-renewal and differentiation of ESCs.
- Systematic study on microRNA regulation network from next generation sequencing,
Ping Chen, Anna-Maria Lahesmaa-Korpinen, Alejandra Cervera, Vladimir Rogojin, Sampsa Hautaniemi
We present an approach for the integrative study of next-generation sequencing data in human cancers. The pipeline includes general sequencing analysis modules for miRNA-seq, RNA-seq and exome-seq, together with functional study modules, all implemented in the data analysis and integration framework Anduril.
- Accurate spliced alignments of mid-to-long RNA-seq reads,
Zeng Chao, Hiroaki Iwata, Natsuhiro Ichinose, Tetsushi Yada, Osamu Gotoh
Spaln is a space-efficient and fast method for mapping and aligning cDNA/EST sequences onto genomic sequence. In this work, we extended Spaln to enable mapping of mid-to-long RNA-seq reads against both exonic regions and splice junctions with high sensitivity and accuracy.
- Computational Prediction of miRNA Regulations in Mouse Whole Brain Using in situ Hybridization Data,
Risa Kawaguchi, Hisanori Kiryu
We sought to predict miRNA regulation by examining the relationship between mRNA expression and the presence of seed sequences in the 3’UTR, using 3D expression data. We adopted "wordscore" as an index of regulation; it is approximately a Z-score of the logarithm of seed frequency in shuffled gene sequences and is useful for identifying significant seeds.
- A Systems Approach to Rheumatoid Arthritis,
Jieon Lee, Sungyong You, Daehee Hwang, Wan-Uk Kim
Rheumatoid Arthritis (RA) is an autoimmune disease that attacks the joints. To identify novel molecular targets for RA, a systems approach was applied. This method 1) identified RAGs, 2) reconstructed an RA-perturbed network, and 3) selected potential targets. The model can be used to identify targets for other diseases as well.
- Finding novel malarial proteases using systems approaches,
Yufeng Wang, Hong Cai, Timothy Lilburn, Jianying Gu
Proteases are attractive antimalarial targets due to their essential roles in the parasite life cycle. In this study, combining comparative genomics, machine learning, and network analysis, we identified and characterized the protease complement of malaria parasites, providing a catalog of targets for functional characterization and rational inhibitor design.
- SASeq: A Selective and Adaptive Shrinkage Approach to Detect and Quantify Active Transcripts,
Tin Nguyen, Nan Deng, Dongxiao Zhu
We developed a selective and adaptive shrinkage method to detect and quantify condition-specific transcripts using RNA-seq. Initial efforts on mathematical or statistical modeling of read counts or per-base exonic expression signal have been successful but may face an increasing risk of model misspecification and overfitting.
- Genetic diversity and phylogenetic analysis of the genus Alstroemeria based on nuclear ribosomal DNA ITS region sequence,
Byeong-Rok Yoo, Dong Wang, Koo-Yeon Lee, In-Lee Choi, Kyoung-Soo Lee, Jin-Sung Son, Ho-Min Kang, Dong-Jin Lee, Kyung Choi, Kwang-Woo Park, Soon-Kwan Hong
The results showed that there are some divergences in the ITS region sequence among the cultivars. In particular, PinkFloyd showed the highest dissimilarity in the ITS region sequence relative to the other 11 Alstroemeria cultivars. This outcome furthers our understanding of molecular diversification in Alstroemeria.
- Genetic variation of Oplopanax elatus Nakai populations estimated using nrDNA ITS region sequence,
Dong Wang, Byeong-Rok Yoo, Young-Seol Kim, Soon-Kwan Hong, Yong-Chul Park, Wan-Geun Park
We apply molecular methods based on DNA cloning and sequencing to the plant species Oplopanax elatus Nakai. Our work places Oplopanax elatus Nakai in the phylogenetic tree of the genus and will lead to further understanding and clearer classification of this species.
- Genetic variation of the unripe hot pepper Paprika estimated using nrDNA ITS region sequence,
Dong Wang, Byeong-Rok Yoo, In-Lee Choi, Yong-Beom Lee, Ki-Young Choi, Soon-Kwan Hong, Ho-Min Kang
A few studies of paprika, such as morphological studies, have been performed. However, molecular methods based on DNA cloning and sequencing are a useful alternative to traditional identification methods. In this work, the internal transcribed spacer of the species was cloned and sequenced.
- Total cholesterol content, healthy unsaturated fatty acid and carcass characteristics in M. longissimus dorsi of Hanwoo (Korean native cattle) steers fed an alcohol fermented feed supplemented with soy lecithin,
X. Z. Li, C. G. Choi, J. S. Ahn, Y. J. Kim, C. S. Choi, J. S. Jeoun, J. S. Shin
A feeding experiment was conducted with twenty-four Hanwoo (Korean native cattle) steers (445.80 ± 37.3 kg) to investigate the effect of soy lecithin on total cholesterol content, healthy unsaturated fatty acid profile and carcass characteristics in M. longissimus dorsi.
- Effect of dietary linseed oil and propionate precursors on ruminal bacterial community, composition and diversity in Yanbian Yellow steers,
Z.G. Liu, H. Zhou, W. Zhou, F. H. Xu, H. Wang, W.M. Wang, X. Z. Li, J.S. Shin, C. G. Yan
The rumen microbial ecosystem is a complex system and rumen fermentation processes include interactions among microorganisms. There are important relationships between dietary variety and ruminal bacterial composition.
- Effect of dietary linseed oil and malate on conjugated linoleic acid in rumen fluid, plasma and milk fat, and lipogenetic enzymes in mammary gland and milk somatic cells in lactating goats,
H. Wang, W. Zhou, F.H. Xu, H. Zhou, Z.G. Liu, W.M. Wang, C.G. Yan, J.S. Shin, X.Z. Li
Plant oil in the diet is known to enhance milk fat composition and alter milk fatty acid profile owing to changes in the supply of fatty acid precursors and/or activity of lipogenic enzymes both in the mammary gland and in somatic cells in lactating goats.
- In-silico prediction of drug repositioning candidates using an integrated network approach,
Haeseung Lee, Hanna Ryu, Sanghyuk Lee, Wankyu Kim
We propose an integrative computational approach to identify drug repositioning (DR) candidates and apply our method to three types of cancer (glioblastoma multiforme, lung and ovarian cancer).
- Integration of multi 'omics'! where are we?,
MNV Prasad Gajula, Anil Rai
A complex biological system requires individual components such as genes, proteins, metabolites and others to work collaboratively to perform its specified function. The challenge is to find the correspondence between those different components. We propose to develop an integrated ‘omics’ platform to address the issue by using bioinformatics and IT.
- LocTree2: a web server for predicting localization in all domains of life,
Tatyana Goldberg, Syeda Tanzeem H. Charu, Tobias Hamp, Burkhard Rost
LocTree2 web server predicts sub-cellular localization for all proteins in all domains of life. It predicts 3 classes for Archaea, 6 for Bacteria and 18 classes for Eukaryota. LocTree2 accurately distinguishes between membrane and non-membrane proteins. It compares favorably to other localization predictors and is stable even for protein fragments.
- Estimation of the Optimal Stage Division on Gene Expression Time Series,
Daisuke Tominaga
In cellular-level phenomena, changes in the state or condition of cells are observed as stage progression. Given quantitative gene expression time series data, the optimal model for dividing the time period into stages is defined for stage progression detection. We applied the method to nematode development.
- Virtual Screening of Multi-target Inhibitors by Combinatorial Support Vector Machines,
Chu Qin, Xiao Hua, Zhe Shi, Yu Zong Chen
We evaluated the performance of combinatorial support vector machines (C-SVM) in searching for dual inhibitors of 29 target pairs. Compared with other tools, C-SVM produced comparable dual-inhibitor yields and significantly lower false-hit rates when screening large chemical databases, regardless of the similarity level. It showed promising capability in searching for multi-target inhibitors.
- Web-based system for analysis and management of next-generation genomic sequence data,
Byungwook Lee, Minkyung Sung
We present NGSpass for analyzing and managing NGS genomic sequence data. Our system accepts a FASTQ-formatted sequencing file as input and then executes back-end analysis pipelines already constructed by users. Users can build analysis pipelines simply by adding or deleting programs and adjusting the parameters of each program.
- An Approximate Bayesian Approach to Mapping Paired-End DNA Reads to a Reference Genome,
Anish Man Singh Shrestha, Martin C. Frith
We present a new probabilistic framework for mapping paired-end DNA reads to a reference genome. Tests with simulated and real data show that our method provides a good combination of sensitivity, error rate, and computation time, especially in challenging cases such as a partially assembled or relatively divergent reference.
- A Comparative Bioinformatic Analysis of the Protein-Protein Interaction Networks of the Archaea,
Cathy K. Derow, David Fell
Archaea, while prokaryotic, are more closely related to eukaryotes. They are posited as a Third Kingdom of Life. Determining network properties of archaeal protein-protein interactions and comparing them to those of eukaryotes and bacteria will aid understanding of Archaea and help elucidate universal network properties of protein interactions and those unique to each Kingdom.
- DEGASeq: Graph-based identification and visualization of differentially expressed alternative splicing events from RNA-Seq,
Charny Park, Byungwook Lee, Wankyu Kim, Sanghyuk Lee
We present the DEGASeq (Differential Expression Graph for Alternative Splicing) software. AS events are analyzed using the exon graph structure. Each event is tested for differential expression using only the junction reads corresponding to that event. A novel visualization was devised to illustrate condition-specific events on the exon graph, which can be customized interactively by the user.
- Open chromatin regions in primate genomes catch recent exogenous DNA insertions,
Junko Tsuji, Paul Horton
It is known that mitochondrial DNAs (mtDNAs) insert into the nuclear genome via the repair of DNA double-strand breaks. In this study, we found that recent mtDNA insertions strongly correlate with species-specific open chromatin regions. We also show a correlation between open chromatin regions and retrotransposons.
- ValidNESs: a database of validated leucine-rich nuclear export signals,
Szu-Chin Fu, Hsuan-Cheng Huang, Paul Horton, Hsueh-Fen Juan
ValidNESs provides both updated data and an upgraded interface for convenient access to experimentally validated leucine-rich nuclear export signals (NESs) and NES-containing proteins. We also integrate a state-of-the-art NES predictor into ValidNESs, enabling valuable hints to be gained by in silico prediction.
- An investigation of structural profiles around target sites of RNA binding proteins,
Tsukasa Fukunaga, Hisanori Kiryu
We developed algorithms for computing the structural profiles exactly using a dynamic programming method, and implemented them in software called 'CapR'. We investigated structural profiles of target sites of RBPs determined by RIP-Chip and CLIP-seq.
- Protein disorder in virus-host interactions,
Yu-An Dong, Burkhard Rost
Eukaryotes possess significantly more protein disorder than do bacteria and archaea. Surprisingly, many viruses also have high disorder content. We investigated the pattern of disorder in virus-host interactions.
- MEGADOCK: a High-speed Protein-protein Interaction Prediction System by All-to-all Physical Docking,
Masahito Ohue, Yuri Matsuzaki, Nobuyuki Uchikoga, Takashi Ishida, Yutaka Akiyama
The elucidation of protein-protein interaction (PPI) networks is important for understanding cellular systems and for structure-based drug design. However, developing an effective method for exhaustive PPI screening is a computational challenge. We developed a high-speed PPI prediction system called "MEGADOCK" using a Fourier-transform-based protein docking technique.
- Utilizing probabilistic dependencies in low-complexity regions when detecting homologous regions of biological sequences,
Thomas M. Poulsen, Martin C. Frith, Paul Horton
When detecting homologs, alignment of low-complexity regions and tandem repeats can produce false detections. To address these issues, Hidden Markov models are extended to detect repetitive regions and capture higher-order nucleotide distributions. These models exploit additional information compared with standard methods that typically neglect nucleotide dependencies and mask repetitive regions.
- Application of exhaustive protein-protein interaction prediction system by using protein docking to signal transduction pathways,
Yuri Matsuzaki, Masahito Ohue, Nobuyuki Uchikoga, Takashi Ishida, Yutaka Akiyama
We have developed a high-throughput protein-protein interaction (PPI) prediction system, "MEGADOCK", which is based on rigid-body docking. MEGADOCK is implemented using hybrid parallelization with MPI and OpenMP. It showed sufficient scalability on high-performance computers. We evaluated our system by attempting to reconstruct known signal transduction systems.
- Modeling the interaction of acetylcholinesterase inhibitor,
Ayupov Rustam Khasanovich, Natalia I. Akberova, D.S. Tarasov
Acetylcholinesterase (AChE) inhibitors currently used in medicine lack sufficient specificity. In silico methods are used to develop new inhibitors. In this paper we investigate the interaction between AChE and a pyridoxine derivative, a promising AChE inhibitor, using molecular dynamics.
- MoiraiSP: a novel mitochondrial localization signal predictor,
Yoshinori Fukasawa, Kenichiro Imai, Szu-Chin Fu, Junko Tsuji, Paul Horton
In this work, we developed a novel predictor for the mitochondrial targeting signal and its cleavage site, trained on recent data. To predict the signal, we performed a motif search in the signal region and generated profiles for cleavage sites. Making use of these novel features, our predictor outperforms existing prediction tools.
- Distribution and Structure of Polyketide Synthases in Aspergillus,
Shu-Hsi Lin, Ping-Chiang Lyu, Chuan-Yi Tang, Miwa Yoshimoto, Masanori Arita
This work compares over 400 polyketide synthases from fungal genomes and discusses their evolutionary relationship and the distribution and structure of polyketide synthases in eight Aspergillus species.
- GRiG: A PPV-sensitive method for predicting somatic SNVs from cancer-normal paired sequencing data with greedy rule induction algorithm,
Shaoping Ling, Lili Dong, Lihua Cao, Caiyan Jia, Xuemei Lu, Chung-I Wu
We present a Greedy Rule Induction alGorithm (GRiG) for predicting somatic SNVs in cancer-normal paired sequencing data, which integrates feature selection and rule inference into a machine learning framework. GRiG consistently achieved a high positive predictive value (PPV) in both exome capture sequencing (ECS) and whole genome sequencing (WGS) datasets.
- CASmap: Splitting Short Reads Alignment with FPGA-based Streamline Optimization,
Shaoping Ling, Jiahan Liu, Lingtong Hao, Longhui Yin, Lili Dong, Lihua Cao, Fen Xiao, Junsuo Zhao, Chung-I Wu, Xuemei Lu
We created a new alignment system (CASmap) that implements a BWT-based alignment algorithm on a customized desktop reconfigurable computer built on an FPGA platform. With FPGA-based streamline optimization, it is ~30X and ~2X faster than BWA (one thread) and SOAP3, respectively, at searching for read location candidates in the suffix array.
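
For readers unfamiliar with the BWT-based candidate search mentioned in the CASmap abstract above, the following is a minimal software sketch of exact-match backward search over a Burrows-Wheeler transform and suffix array. It is a toy illustration only, not the CASmap, BWA or SOAP3 implementation: the function names (`build_fm_index`, `backward_search`) are hypothetical, the suffix array is built by naive sorting, and mismatches, gaps and FPGA pipelining are all ignored.

```python
def build_fm_index(reference):
    """Build a toy FM-index (suffix array, C table, Occ table) for a short reference."""
    text = reference + "$"  # '$' is a sentinel that sorts before A, C, G, T
    # Naive suffix array: fine for a toy example, far too slow for real genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (text[-1] is '$' when i == 0).
    bwt = "".join(text[i - 1] for i in sa)

    # c_table[ch]: number of characters in text strictly smaller than ch.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def backward_search(read, sa, c_table, occ):
    """Return sorted reference positions where `read` occurs exactly."""
    if not read:
        return []
    lo, hi = 0, len(sa)  # current half-open interval in the suffix array
    for ch in reversed(read):  # extend the match one character to the left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:  # interval became empty: no occurrence
            return []
    return sorted(sa[lo:hi])


sa, c_table, occ = build_fm_index("ACGTACGT")
print(backward_search("ACG", sa, c_table, occ))
```

Each loop iteration narrows a suffix-array interval using only two table lookups per character, which is the kind of regular, table-driven step that lends itself to hardware pipelining.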