Paul Horton, Readings in Computational Biology 2024
Course Overview
Papers Presented
The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets
Takaya Saito & Marc Rehmsmeier
PLoS ONE, 10(3), e0118432, 2015.
BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
Kelvin C.K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy
arXiv:2012.02181, 2021.
Is artificial data useful for biomedical natural language processing algorithms?
Zixu Wang, Julia Ive, Sumithra Velupillai & Lucia Specia
Proceedings of the 18th BioNLP Workshop and Shared Task, 2019.
A linear space algorithm for computing maximal common subsequences
Hirschberg, D. S.
Communications of the ACM, 18(6), 341–343, 1975.
Computing Longest Common Subsequence for Multiple Sequences
Chinmay Bepery, Sk. Abdullah-Al-Mamun & M Sohel Rahman
2nd International Conference on EICT, 2015.
Basic local alignment search tool
Stephen F. Altschul, Warren Gish, Webb Miller, Eugene W. Myers & David Lipman
Journal of Molecular Biology, 215(3):403-10, 1990.
A Fast UPGMA Algorithm With Multiple Graphics Processing Units Using NCCL
Guan-Jie Hua, Che-Lun Hung, Chun-Yuan Lin, Fu-Che Wu, Yu-Wei Chan & Chuan Yi Tang
Evol Bioinform Online 13:1176934317734220, 2017.
nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation
Fabian Isensee, Paul F. Jaeger, Simon A. A. Kohl, Jens Petersen & Klaus H. Maier-Hein
Nature Methods, 18, 203–211, 2021.
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
Ben Langmead, Cole Trapnell, Mihai Pop & Steven L Salzberg
Genome Biology, 10, R25, 2009.
Automatic Task-Level Thinking Steps Help Large Language Models for Challenging Classification Tasks
Chunhui Du, Jidong Tian, Haoran Liao, Jindou Chen, Hao He & Yaohui Jin
Conference on Empirical Methods in Natural Language Processing, 2023.
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning
Richard J. Chen, Chengkuan Chen, Yicong Li, Tiffany Y. Chen, Andrew D. Trister, Rahul G. Krishnan, Faisal Mahmood
arXiv:2206.02647, 2022.
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals & Quoc V. Le
Proceedings of the 27th International Conference on Neural Information Processing Systems, 2:3104–3112, 2014.
Language Models are Unsupervised Multitask Learners
Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei & Ilya Sutskever
Preprint, 2019.
Deep Learning
LeCun, Y., Bengio, Y. & Hinton, G.
Nature, 521:436–444, 2015.
A Block-sorting Lossless Data Compression Algorithm
Michael Burrows & David Wheeler
Technical Report 124, Digital Equipment Corporation, SRC-RR-124, 1994.
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Mikel Artetxe & Holger Schwenk
Transactions of the Association for Computational Linguistics, 7: 597–610, 2019.
The Paradoxical Success of Fuzzy Logic
Charles Elkan
Proc. AAAI'93, 1993.
Efficient Acceleration of the Pair-HMMs Forward Algorithm for GATK HaplotypeCaller on Graphics Processing Units
Shanshan Ren, Koen Bertels & Zaid Al-Ars
Evol Bioinform Online, 2018.
Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho & Yoshua Bengio
Presented at ICLR 2015, arXiv:1409.0473, 2015.
GPU Accelerated API for Alignment of Genomes Sequencing Data
Nauman Ahmed; Hamid Mushtaq; Koen Bertels; Zaid Al-Ars
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2017.
Scaling accurate genetic variant discovery to tens of thousands of samples
Ryan Poplin, Valentin Ruano-Rubio, Mark A. DePristo, ..., Eric Banks
bioRxiv, doi:10.1101/201178, 2017.
Not All Demonstration Examples are Equally Beneficial
Zhe Yang, Damai Dai, Peiyi Wang, and Zhifang Sui
Findings of the Association for Computational Linguistics: EMNLP, 13209–13221, 2023.
Memorizing Normality to Detect Anomaly: Memory-augmented Deep Autoencoder for Unsupervised Anomaly Detection
Dong Gong, ..., Anton van den Hengel
Presented at ICCV 2019, arXiv:1904.02639v2, 2019.
BioREx: Improving biomedical relation extraction by leveraging heterogeneous datasets
Po-Ting Lai, Chih-Hsuan Wei, Ling Luo, Qingyu Chen & Zhiyong Lu
arXiv:2306.11189v1, 2023.
LOTUS: A single- and multitask machine-learning algorithm for the prediction of cancer driver genes
Olivier Collier, Véronique Stoven, Jean-Philippe Vert
PLoS Computational Biology, 15(9):e1007381, 2019.
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, ..., Illia Polosukhin
Advances in Neural Information Processing Systems (NIPS), 30, 2017.
An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation
Shiquan Yang, Rui Zhang, Sarah M. Erfani, Jey Han Lau
Association for Computational Linguistics (ACL) (1):4918-4935, 2022.
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, ..., Veselin Stoyanov
arXiv:1907.11692, 2019.
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
Advances in Neural Information Processing Systems 25 (NIPS), 2012.
AI and Memory Wall
Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer
IEEE Micro Journal, arXiv:2403.14123, 2024.
Improved GPU Implementation of the Pair-HMM Forward Algorithm
Enliang Li, Subho S. Banerjee, Sitao Huang, Ravishankar K. Iyer, Deming Chen
IEEE 39th International Conference on Computer Design (ICCD), 2021.
Chaos Engineering
Ali Basiri, Niosha Behnam, Ruud de Rooij, Lorin Hochstein, Luke Kosewski, Justin Reynolds & Casey Rosenthal
IEEE Software, 33(3), 2016.
Distributed Cache Reduction Approach to DNA sequencing using a greedy algorithm for the shortest common superstring
Ali Khalid, Anthony Enem & Eduardo Colmenares
International Conference on Computational Science and Computational Intelligence (CSCI), 2018.
PLINK: A tool set for whole-genome association and population-based linkage analyses
Shaun Purcell, Benjamin Neale, Kathe Todd-Brown, ..., Pak C. Sham
American Journal of Human Genetics, 81(3), 559–575, 2007.
One Cannot Stand for Everyone! Leveraging Multiple User Simulators
Yajiao Liu, Xin Jiang, Yichun Yin, Yasheng Wang, Fei Mi, Qun Liu, Xiang Wan, Benyou Wang
61st Annual Meeting of the Association for Computational Linguistics, 1, 2023.
Applying hidden Markov models to keystroke pattern analysis for password verification
W. Chen; W. Chang
IEEE International Conference on Information Reuse and Integration, 2004.
MitoFates: Improved Prediction of Mitochondrial Targeting Sequences and Their Cleavage Sites
Yoshinori Fukasawa, Junko Tsuji, Szu-Chin Fu, Kentaro Tomii, Paul Horton, Kenichiro Imai
Mol Cell Proteomics, 14(4):1113-26, 2015.
A Hybrid Ensemble Learning Model for Short-Term Solar Irradiance Forecasting Using Historical Observations and Sky Images
Zhongju Wang, Long Wang, Chao Huang & Xiong Luo
IEEE/IAS Industrial and Commercial Power System Asia (I&CPS Asia), 2021.
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
arXiv:2310.11511, 2023.
Translating Hanja Historical Documents to Contemporary Korean and English
Juhee Son, Jiho Jin, Haneul Yoo, JinYeong Bak, Kyunghyun Cho & Alice Oh
Findings of the Association for Computational Linguistics: EMNLP, 1260–1272, 2022.
Defending Jailbreak Prompts via In-Context Adversarial Game
Yujun Zhou, Yufei Han, Haomin Zhuang, Taicheng Guo, Kehan Guo, Zhenwen Liang, Hongyan Bao & Xiangliang Zhang
arXiv:2402.13148, 2024.
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren & Jian Sun
arXiv:1512.03385, 2015.
Earliest Eligible Virtual Deadline First: A Flexible and Accurate Mechanism for Proportional Share Resource Allocation
Ion Stoica, Hussein Abdel-Wahab
Technical Report, 1995.
Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion
Martin Christen Frølund Thomsen & Morten Nielsen
Nucleic Acids Res. 40(Web Server issue):W281–287, 2012.
Parallel algorithm for indexing large DNA sequences using MapReduce on Hadoop
F Kaniwa, O Dinakenyane & VM Kuthadi
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2017.
UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy
Tom Smith, Andreas Heger & Ian Sudbery
Genome Research, 27(3):491-499, 2017.
Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction
Ashish Sharma, Kevin Rushton, Inna Lin, David Wadden, Khendra Lucas, Adam Miner, Theresa Nguyen, Tim Althoff
61st Annual Meeting of the Association for Computational Linguistics, 1, 2023.
Accelerating Genotyping: Improving the performance of Masked PanGenie
Hartmut Häntze & Paul Horton
Bioinformatics, 39(Suppl 1):i213–221, 2023.
MedSAM: Segment anything in medical images
Ma, J., He, Y., Li, F. et al.
Nature Communications, 15:654, 2024.
How to apply de Bruijn graphs to genome assembly
Compeau, P., Pevzner, P. & Tesler, G.
Nat Biotechnol 29, 987–991, 2011.
PROTLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training
Le Zhuo, ..., Wentao Zhang
Proceedings of the ACL, 2024.
Accurate brain tumor detection using deep convolutional neural network
Md. Saikat Islam Khan, ..., Iman Dehzangi
Comput Struct Biotechnol J. 20:4733–4745, 2022.
Seq2Image: Sequence Analysis using Visualization and Deep Convolutional Neural Network
Neda Tavakoli
Annual International Computer Software and Applications Conference, 2020.
Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs
Varun Gulshan, ..., Dale R Webster
JAMA, 316(22):2402-2410, 2016.