Abstract
The dominance of Convolutional Neural Networks (CNNs) in image and video analysis tasks calls for a careful re-evaluation of the software libraries used to compute them on large-scale image and video databases. We focus on methods that can be applied to large image databases or to videos with large image sizes. We develop a method that maximizes throughput by using vector-based memory I/O and optimized 2D FFT libraries that run on all available physical cores. We also show how to decompose arbitrarily large images into smaller, optimal blocks that can be processed effectively using overlap-and-add. Our approach outperforms TensorFlow for 5 × 5 kernels and significantly outperforms TensorFlow for 11 × 11 kernels.
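As a rough illustration of the overlap-and-add idea mentioned in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. The function names `fft_conv2d_full` and `overlap_add_conv2d`, the fixed `block_size` parameter, and the use of `numpy.fft` are assumptions made for this example; the paper's optimal block selection, vectorized memory I/O, and multi-core FFT libraries are not reproduced here. The sketch also computes a true convolution, whereas CNN frameworks typically compute cross-correlation (the same operation with a flipped kernel).

```python
import numpy as np


def fft_conv2d_full(block, kernel):
    """Full linear 2D convolution of one block, computed with zero-padded FFTs."""
    out_shape = (block.shape[0] + kernel.shape[0] - 1,
                 block.shape[1] + kernel.shape[1] - 1)
    # Zero-padding both inputs to the full output size makes the circular
    # convolution implied by the FFT equal to the linear convolution.
    spectrum = np.fft.rfft2(block, s=out_shape) * np.fft.rfft2(kernel, s=out_shape)
    return np.fft.irfft2(spectrum, s=out_shape)


def overlap_add_conv2d(image, kernel, block_size=64):
    """Convolve a large image block by block and sum the overlapping borders.

    block_size is a hypothetical tuning parameter for this sketch.
    """
    kh, kw = kernel.shape
    height, width = image.shape
    out = np.zeros((height + kh - 1, width + kw - 1))
    for r in range(0, height, block_size):
        for c in range(0, width, block_size):
            block = image[r:r + block_size, c:c + block_size]
            partial = fft_conv2d_full(block, kernel)
            # Each partial result extends kh-1 rows and kw-1 columns past its
            # block, so neighbouring results overlap and are accumulated.
            out[r:r + partial.shape[0], c:c + partial.shape[1]] += partial
    return out
```

Because convolution is linear, summing the per-block results reproduces the convolution of the whole image, which is why the image can be split into blocks small enough for an efficient FFT size.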
Original language | Spanish |
---|---|
Title of host publication | Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation |
Pages | 70-73 |
Number of pages | 4 |
Volume | 2020-March |
Status | Published - 1 Mar 2020 |
Published externally | Yes |