TY - GEN
T1 - Base 5 vs. Karp-Rabin as optimizations in the BLAST heuristic for the alignment of DNA sequences
AU - Cruz-Gamero, Franklin L.A.
AU - Beltran-Castanon, Cesar A.
AU - Gutierrez-Caceres, Juan C.
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/8
Y1 - 2019/8
N2 - In bioinformatics, databases of biological sequences are growing at a dizzying rate, and alignment algorithms are used to compare sequences, determine genetic distances, generate phylogenetic trees, etc. This work compares the Karp-Rabin and Base 5 algorithms as possible optimizations for generating seed indexes in the BLAST alignment algorithm, aligning multiple query sequences against the DNA sequence of the human genome as the reference sequence. The tests were run both sequentially and on GPU on the MANATI supercomputer of the High Performance Computational Center of the Peruvian Amazon of the IIAP. For long sequences (genomes) with short keys, generating hash keys with the Base 5 algorithm performed better as a possible BLAST optimization, producing maximum dispersion. For short sequences or longer keys, however, Karp-Rabin is advisable, as it reduces this dispersion.
AB - In bioinformatics, databases of biological sequences are growing at a dizzying rate, and alignment algorithms are used to compare sequences, determine genetic distances, generate phylogenetic trees, etc. This work compares the Karp-Rabin and Base 5 algorithms as possible optimizations for generating seed indexes in the BLAST alignment algorithm, aligning multiple query sequences against the DNA sequence of the human genome as the reference sequence. The tests were run both sequentially and on GPU on the MANATI supercomputer of the High Performance Computational Center of the Peruvian Amazon of the IIAP. For long sequences (genomes) with short keys, generating hash keys with the Base 5 algorithm performed better as a possible BLAST optimization, producing maximum dispersion. For short sequences or longer keys, however, Karp-Rabin is advisable, as it reduces this dispersion.
KW - Alignment
KW - BLAST
KW - DNA
KW - GPU
KW - Optimization
UR - http://www.scopus.com/inward/record.url?scp=85073555937&partnerID=8YFLogxK
U2 - 10.1109/INTERCON.2019.8853584
DO - 10.1109/INTERCON.2019.8853584
M3 - Conference contribution
AN - SCOPUS:85073555937
T3 - Proceedings of the 2019 IEEE 26th International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2019
BT - Proceedings of the 2019 IEEE 26th International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 26th IEEE International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2019
Y2 - 12 August 2019 through 14 August 2019
ER -