Abstract
The manuscript describes fast and scalable architectures and associated algorithms for computing convolutions and cross-correlations. The basic idea is to map 2D convolutions and cross-correlations to a collection of 1D convolutions and cross-correlations in the transform domain. This is accomplished through the use of the discrete periodic Radon transform (DPRT) for general kernels and the use of singular value decomposition (SVD)-LU decompositions for low-rank kernels. The approach uses scalable architectures that can be fitted into modern FPGA and Zynq-SOC devices. Depending on the types of available resources, 2D convolutions and cross-correlations of P × P blocks can be computed in anywhere from O(P) to O(P²) clock cycles. Thus, there is a trade-off between performance and the number and types of resources required. We provide implementations of the proposed architectures using modern programmable devices (Virtex-7 and Zynq-SOC). Based on the amounts and types of required resources, we show that the proposed approaches significantly outperform current methods.
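As an illustration of the low-rank idea only, the following minimal Python sketch shows how a 2D convolution with a kernel written as a sum of outer products can be computed with 1D convolutions alone. It is not the architecture from the paper: it assumes the rank-1 factors are already available (for example from an SVD of the kernel), it omits the LU step and the DPRT path entirely, and all function names are hypothetical.

```python
def conv1d(x, h):
    """Full 1D linear convolution of two sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def kernel_from_terms(terms):
    """Build the 2D kernel sum_k outer(col_k, row_k) from its rank-1 factors."""
    n_rows, n_cols = len(terms[0][0]), len(terms[0][1])
    ker = [[0] * n_cols for _ in range(n_rows)]
    for col, row in terms:
        for a in range(n_rows):
            for b in range(n_cols):
                ker[a][b] += col[a] * row[b]
    return ker


def conv2d_direct(img, ker):
    """Full 2D linear convolution computed directly, used as a reference."""
    out = [[0] * (len(img[0]) + len(ker[0]) - 1)
           for _ in range(len(img) + len(ker) - 1)]
    for i, img_row in enumerate(img):
        for j, pixel in enumerate(img_row):
            for a, ker_row in enumerate(ker):
                for b, weight in enumerate(ker_row):
                    out[i + a][j + b] += pixel * weight
    return out


def conv2d_low_rank(img, terms):
    """Same result as conv2d_direct, but using only 1D convolutions.

    `terms` is a list of (col_factor, row_factor) pairs; all pairs are
    assumed to have the same lengths (an assumption of this sketch).
    """
    n_rows = len(img) + len(terms[0][0]) - 1
    n_cols = len(img[0]) + len(terms[0][1]) - 1
    out = [[0] * n_cols for _ in range(n_rows)]
    for col, row in terms:
        # Pass 1: convolve every image row with the row factor.
        rows_done = [conv1d(r, row) for r in img]
        # Pass 2: convolve every resulting column with the column factor.
        for c in range(n_cols):
            column = conv1d([r[c] for r in rows_done], col)
            for i in range(n_rows):
                out[i][c] += column[i]
    return out


# Example: a rank-2 kernel given as two (column, row) factor pairs.
terms = [([1, 2, 1], [1, 0, -1]), ([1, 0, -1], [1, 2, 1])]
img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
same = conv2d_low_rank(img, terms) == conv2d_direct(img, kernel_from_terms(terms))
```

For a rank-r kernel this replaces one 2D convolution with 2r passes of 1D convolutions, which is the kind of decomposition that makes the row and column operations easy to run in parallel.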
Original language | Spanish |
---|---|
Pages (from-to) | 2230-2245 |
Number of pages | 16 |
Journal | IEEE Transactions on Image Processing |
Volume | 26 |
State | Published - 1 May 2017 |