Efficient seeding techniques for protein similarity search

Roytberg, Mihkail; Gambin, Anna; Noé, Laurent; Lasota, Slawomir; Furletova, Eugenia; Szczurek, Ewa; Kucherov, Gregory

Authors: Mihkail Roytberg, Anna Gambin, Laurent Noé, Slawomir Lasota, Eugenia Furletova, Ewa Szczurek, Gregory Kucherov
Publication date: 1 July 2008
Publisher: Springer Berlin Heidelberg

Abstract

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal sensitivity/selectivity trade-offs. We propose several different design methods and use them to construct several alphabets. We then perform an analysis of seeds built over those alphabets and compare them with the standard Blastp seeding method [2,3], as well as with the family of vector seeds proposed in [4]. While the formalism of subset seeds is less expressive (but less costly to implement) than the accumulative principle used in Blastp and vector seeds, our seeds show a similar or even better performance than Blastp on Bernoulli models of proteins compatible with the common BLOSUM62 matrix.
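
To make the idea of a subset seed concrete, here is a minimal Python sketch under stated assumptions. It assumes each seed letter can be modelled as a partition of the 20 amino acids into groups, so that two residues match under a letter when they fall in the same group. The groupings in `COARSE`, the helper names (`make_letter`, `seed_key`, `find_hits`) and the example sequences are all hypothetical illustrations; they are not the alphabets or the implementation from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def make_letter(groups):
    """Build a seed letter: a map from residue to the index of its group.

    Residues not listed in `groups` each form their own singleton group.
    """
    mapping = {}
    for idx, group in enumerate(groups):
        for aa in group:
            mapping[aa] = idx
    next_idx = len(groups)
    for aa in AMINO_ACIDS:
        if aa not in mapping:
            mapping[aa] = next_idx
            next_idx += 1
    return mapping

# Hypothetical seed letters, from most to least selective.
EXACT = make_letter([])                    # only identical residues match
COARSE = make_letter(["ILMV", "FWY", "KR", "DE", "NQ", "ST"])  # illustrative groups
WILDCARD = make_letter([AMINO_ACIDS])      # any pair of residues matches

def seed_key(seed, word):
    """Project a word onto the seed: one group index per seed position."""
    return tuple(letter[aa] for letter, aa in zip(seed, word))

def find_hits(seed, query, subject):
    """Return all (i, j) where query[i:] and subject[j:] match under the seed."""
    span = len(seed)
    # Index every subject window by its key, then look up query windows.
    index = {}
    for j in range(len(subject) - span + 1):
        index.setdefault(seed_key(seed, subject[j:j + span]), []).append(j)
    hits = []
    for i in range(len(query) - span + 1):
        for j in index.get(seed_key(seed, query[i:i + span]), []):
            hits.append((i, j))
    return hits

seed = [EXACT, COARSE, WILDCARD, EXACT]
print(find_hits(seed, "AKLG", "ARVG"))
print(find_hits([EXACT] * 4, "AKLG", "ARVG"))
```

Under this partition assumption, a hit is found by comparing keys for equality, so windows can be hashed and looked up directly. An accumulative scheme instead has to sum substitution scores and compare them with a threshold. The sketch handles only the 20 standard residues, so other symbols such as `X` would need their own handling.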

INRIA a CCSD electronic archive server

oai:HAL:inria-00335564v1

Last time updated on 09/11/2016

This paper was published in INRIA a CCSD electronic archive server.
