ISCB-Asia/SCCG 2012 Proceedings Talk

Multiclass Relevance Units Machine: Benchmark Evaluation and Application to Small ncRNA Discovery

Mark Menor, Kyungim Baek & Guylaine Poisson
University of Hawaii at Manoa, Information and Computer Sciences

Abstract

Background: Classification is the problem of assigning each input object to one of a finite number of classes. This problem has been extensively studied in machine learning and statistics, and there are numerous applications to bioinformatics as well as many other fields.

In this work, we extend a recently introduced probabilistic kernel-based learning algorithm, the Classification Relevance Units Machine (CRUM), to the multiclass setting to broaden its applicability. The extension is achieved under the error-correcting output codes framework. The probabilistic outputs of the binary CRUM are preserved using a proposed linear-time decoding algorithm, an alternative to the generalized Bradley-Terry algorithm, whose computational complexity makes it impractical for large-scale prediction settings.

Experiments on a variety of real small-scale datasets and one larger bioinformatics dataset for small ncRNA classification show that the Multiclass Relevance Units Machine (McRUM) achieves accuracy comparable to or slightly higher than that reported in previous analyses of these datasets. Thus, the results suggest CRUM's potential in solving multiclass problems in bioinformatics and other fields of study.