TY - JOUR
T1 - Daisy
T2 - An integrated repeat protein curation service
AU - Bezerra-Brandao, Manuel
AU - Tunque Cahui, Ronaldo Romario
AU - Hirsh, Layla
N1 - Publisher Copyright: © 2023 Elsevier Inc.
PY - 2023/12
Y1 - 2023/12
AB - Identifying, classifying and curating tandem repeats in proteins is a complex process that requires manual work by experts, computing power and time. Recent advances in applying machine learning to protein structure prediction and repeat classification are useful for this process. However, no existing service brings together the databases and software needed to support research on repeat proteins. In this publication we present Daisy, an integrated repeat protein curation web service. The service can process Protein Data Bank (PDB) and AlphaFold Database entries to identify tandem repeats. In addition, it searches a sequence against a library of Pfam hidden Markov models (HMMs). Repeat classifications are associated with the identified families through RepeatsDB. This prediction is used to improve the execution of the ReUPred algorithm and to speed up the identification of repeat units. The service can also process every PDB and AlphaFold structure associated with a UniProt proteome entry. Availability: The Daisy web service is freely accessible at daisy.bioinformatica.org.
UR - http://www.scopus.com/inward/record.url?scp=85173548524&partnerID=8YFLogxK
U2 - 10.1016/j.jsb.2023.108033
DO - 10.1016/j.jsb.2023.108033
M3 - Article
C2 - 37797915
AN - SCOPUS:85173548524
SN - 1047-8477
VL - 215
JO - Journal of Structural Biology
JF - Journal of Structural Biology
IS - 4
M1 - 108033
ER -