Repository landing page

We are not able to resolve this OAI Identifier to the repository landing page. If you are the repository manager for this record, please head to the Dashboard and adjust the settings.

Suffix Structures and Circular Pattern Problems

Abstract

The suffix tree is a data structure used to represent all the suffixes in a string. However, a major problem with the suffix tree is its practical space requirement. In this dissertation, we propose an efficient data structure -- the virtual suffix tree (VST) -- which requires less space than other recently proposed data structures for suffix trees and suffix arrays. On average, the space requirement (including that for suffix arrays and suffix links) is 13.8n bytes for the regular VST, and 12.05n bytes in its compact form, where n is the length of the sequence.;Markov models are very popular for modeling complex sequences. In this dissertation, we present the probabilistic suffix array (PSA), a space-efficient alternative to the probabilistic suffix tree (PST) used to represent Markov models. The PSA provides all the capabilities of the PST, such as learning and prediction, and maintains the same linear time construction (linearity with respect to sequence length). The PSA, however, has a significantly smaller memory requirement than the PST, for both the construction stage, and at the time of usage.;Using the proposed suffix data structures, we study the circular pattern matching (CPM) problem. We provide a linear time, linear space algorithm to solve the exact circular pattern matching problem. We then present four algorithms to address the approximate circular pattern matching (ACPM) problem. Our bidirectional ACPM algorithm provides the best time complexity when compared with other algorithms proposed in the literature. Further, we define the circular pattern discovery (CPD) problem and present algorithms to solve this problem. Using the proposed circular pattern matching algorithms, we perform experiments on computational analysis and function prediction for multidomain proteins

Similar works

Full text

thumbnail-image

The Research Repository @ WVU (West Virginia University)

redirect
Last time updated on 17/10/2019

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.