Identification of repetitive units in protein structures with ReUPred

Layla Hirsh, Damiano Piovesan, Lisanna Paladin, Silvio C.E. Tosatto

Research output: Contribution to journal › Article › peer-review

Abstract

Over the last decade, numerous studies have demonstrated the fundamental importance of tandem repeat (TR) proteins in many biological processes, and a plethora of new repeat structures have been solved. The recently published RepeatsDB provides information on TR proteins. However, a detailed structural characterization of repetitive elements is largely missing, because repeat unit annotation is manually curated and currently covers only 3% of the bona fide TR proteins. Repeat Protein Unit Predictor (ReUPred) is a novel method for the fast automatic prediction of repeat units and repeat classification, using an extensive Structure Repeat Unit Library (SRUL) derived from RepeatsDB. ReUPred uses an iterative structural search against the SRUL to find repetitive units. On a test set of solenoid proteins, ReUPred correctly detects 92% of the proteins. Unlike previous methods, it also correctly classifies solenoid repeats in 89% of cases, and it outperforms two recent state-of-the-art methods on the repeat unit identification problem. The accurate prediction of repeat units increases the number of annotated repeat units by an order of magnitude compared to the sequence-based Pfam classification. ReUPred is implemented in Python for Linux and is freely available at http://protein.bio.unipd.it/reupred/.

Original language: English
Pages (from-to): 1391-1400
Number of pages: 10
Journal: Amino Acids
Volume: 48
Issue number: 6
DOIs
State: Published - 1 Jun 2016

Keywords

  • Protein classification
  • Repeat protein
  • Structure prediction
