Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set

Johnson David; Wren Jonathan D; Gruenwald Le

oai:doaj.org/article:e7c20c68a5a04afb9c9fd5ba4adc43e0

Publication date: 1 July 2005
Publisher: Springer Science and Business Media LLC
Abstract

Each genome encodes an enormous amount of information, enough to create living, responsive and adaptive organisms. Yet raw sequence data alone is not enough to understand function, mechanisms or interactions. A change in a single base pair can lead to disease, such as sickle-cell anemia, while some large megabase deletions have no apparent phenotypic effect. Genomic features vary in their data types, and annotation of these features is spread across multiple databases. Herein, we develop a method to automate the exploration of genomes by iteratively searching sequence data for correlations and building upon them. First, to integrate and compare different annotation sources, a sequence matrix (SM) is developed to contain position-dependent information. Second, a classification tree is developed for matrix row types, specifying how each data type is to be treated with respect to other data types for analysis purposes. Third, correlative analyses are developed to analyze the features of each matrix row in terms of the other rows, guided by the classification tree as to which analyses are appropriate. A prototype was developed and successfully detected coinciding genomic features among genes, exons, repetitive elements and CpG islands.
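The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matrix layout, the classification tree, or the correlation measures, so everything below is an assumption made for illustration. Rows are modelled as per-position 0/1 lists built from half-open, 0-based intervals; the row-type table has a single hypothetical type, `"binary"`; and the Jaccard index stands in for a correlative analysis. The names `build_sequence_matrix`, `jaccard`, `analyze`, and `ROW_TYPES` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: annotations are half-open, 0-based intervals [start, end).
Interval = Tuple[int, int]


def build_sequence_matrix(
    length: int, annotations: Dict[str, List[Interval]]
) -> Dict[str, List[int]]:
    """Build one row per annotation source; each cell marks one sequence position."""
    matrix: Dict[str, List[int]] = {}
    for name, intervals in annotations.items():
        row = [0] * length
        for start, end in intervals:
            # Clip intervals to the sequence bounds before marking positions.
            for pos in range(max(start, 0), min(end, length)):
                row[pos] = 1
        matrix[name] = row
    return matrix


def jaccard(row_a: List[int], row_b: List[int]) -> float:
    """Fraction of positions covered by both rows among positions covered by either."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return both / either if either else 0.0


# Hypothetical stand-in for the classification tree: a flat row-type table.
ROW_TYPES = {"exon": "binary", "cpg_island": "binary"}


def analyze(
    matrix: Dict[str, List[int]], row_types: Dict[str, str], a: str, b: str
) -> float:
    """Choose an analysis based on the types of the two rows being compared."""
    if (row_types[a], row_types[b]) == ("binary", "binary"):
        return jaccard(matrix[a], matrix[b])
    raise ValueError(f"no analysis defined for {row_types[a]} vs {row_types[b]}")


if __name__ == "__main__":
    sm = build_sequence_matrix(8, {"exon": [(2, 5)], "cpg_island": [(0, 3)]})
    print(sm["exon"])        # positions 2, 3 and 4 are marked
    print(sm["cpg_island"])  # positions 0, 1 and 2 are marked
    print(analyze(sm, ROW_TYPES, "exon", "cpg_island"))
```

A per-position row costs memory proportional to sequence length, so a genome-scale version would more plausibly store intervals or a compressed representation; the dense form is used here only because it makes the position-dependent alignment of rows easy to see.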

Last time updated on 17/12/2014

This paper was published in Directory of Open Access Journals.
