Repository landing page

We are not able to resolve this OAI Identifier to the repository landing page. If you are the repository manager for this record, please head to the Dashboard and adjust the settings.

AdaSampling for positive-unlabeled and label noise learning with bioinformatics applications

Abstract

© 2018 IEEE. Class labels are required for supervised learning but may be corrupted or missing in various applications. In binary classification, for example, when only a subset of positive instances is labeled whereas the remaining are unlabeled, positive-unlabeled (PU) learning is required to model from both positive and unlabeled data. Similarly, when class labels are corrupted by mislabeled instances, methods are needed for learning in the presence of class label noise (LN). Here we propose adaptive sampling (AdaSampling), a framework for both PU learning and learning with class LN. By iteratively estimating the class mislabeling probability with an adaptive sampling procedure, the proposed method progressively reduces the risk of selecting mislabeled instances for model training and subsequently constructs highly generalizable models even when a large proportion of mislabeled instances is present in the data. We demonstrate the utilities of proposed methods using simulation and benchmark data, and compare them to alternative approaches that are commonly used for PU learning and/or learning with LN. We then introduce two novel bioinformatics applications where AdaSampling is used to: 1) identify kinase-substrates from mass spectrometry-based phosphoproteomics data and 2) predict transcription factor target genes by integrating various next-generation sequencing data

Similar works

Full text

thumbnail-image

OPUS - University of Technology Sydney

redirect
Last time updated on 18/10/2019

This paper was published in OPUS - University of Technology Sydney.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.