TY - GEN
T1 - Efficient GPU-based implementation of the median filter based on a multi-pixel-per-thread framework
AU - Salvador, Gabriel
AU - Chau, Juan M.
AU - Quesada, Jorge
AU - Carranza, Cesar
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/9/21
Y1 - 2018/9/21
N2 - Median filtering has become a ubiquitous smoothing tool for image denoising tasks, with its complexity generally determined by the median algorithm used (usually on the order of O(n log(n)) when computing the median of n elements). Most algorithms were formulated for scalar single-processor computers, with few of them successfully adapted and implemented for computers with a parallel architecture. However, the redundancy in processing neighboring pixels has not yet been fully exploited in parallel implementations. Additionally, most implementations are suitable only for fixed-point images, not for floating-point ones. In this paper we propose an efficient parallel implementation of the 2D median filter, based on a multiple-pixel-per-thread framework, and test it on a CUDA-capable GPU for both fixed-point and floating-point data. Our computational results show that our proposed method outperforms state-of-the-art implementations, with the difference increasing significantly as the filter size grows.
AB - Median filtering has become a ubiquitous smoothing tool for image denoising tasks, with its complexity generally determined by the median algorithm used (usually on the order of O(n log(n)) when computing the median of n elements). Most algorithms were formulated for scalar single-processor computers, with few of them successfully adapted and implemented for computers with a parallel architecture. However, the redundancy in processing neighboring pixels has not yet been fully exploited in parallel implementations. Additionally, most implementations are suitable only for fixed-point images, not for floating-point ones. In this paper we propose an efficient parallel implementation of the 2D median filter, based on a multiple-pixel-per-thread framework, and test it on a CUDA-capable GPU for both fixed-point and floating-point data. Our computational results show that our proposed method outperforms state-of-the-art implementations, with the difference increasing significantly as the filter size grows.
KW - CUDA
KW - GPU
KW - Image Processing
KW - Median Filter
KW - Parallel Processing
UR - http://www.scopus.com/inward/record.url?scp=85055536183&partnerID=8YFLogxK
U2 - 10.1109/SSIAI.2018.8470318
DO - 10.1109/SSIAI.2018.8470318
M3 - Conference contribution
AN - SCOPUS:85055536183
T3 - Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation
SP - 121
EP - 124
BT - 2018 IEEE Southwest Symposium on Image Analysis and Interpretation, SSIAI 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE Southwest Symposium on Image Analysis and Interpretation, SSIAI 2018
Y2 - 8 April 2018 through 10 April 2018
ER -