TY - GEN
T1 - On the architecture of a big data classification tool based on a map reduce approach for hyperspectral image analysis
AU - Ayma, V. A.
AU - Ferreira, R. S.
AU - Happ, P. N.
AU - Oliveira, D. A.B.
AU - Costa, G. A.O.P.
AU - Feitosa, R. Q.
AU - Plaza, A.
AU - Gamba, P.
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/10
Y1 - 2015/11/10
N2 - Advances in remote sensors are providing exceptional quantities of large-scale data with increasing spatial, spectral and temporal resolutions, raising new challenges in their analysis, e.g. those present in classification processes. This work presents the architecture of the InterIMAGE Cloud Platform (ICP): Data Mining Package, a tool able to perform supervised classification procedures on huge amounts of data on a distributed infrastructure. The architecture is implemented on top of the MapReduce framework. The tool implements four classification algorithms taken from WEKA's machine learning library, namely Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines. The SVM classifier was applied to datasets of different sizes (2 GB, 4 GB and 10 GB) on different cluster configurations (5, 10, 20 and 50 nodes). The results show the tool to be a potential approach for parallelizing classification processes on big data.
AB - Advances in remote sensors are providing exceptional quantities of large-scale data with increasing spatial, spectral and temporal resolutions, raising new challenges in their analysis, e.g. those present in classification processes. This work presents the architecture of the InterIMAGE Cloud Platform (ICP): Data Mining Package, a tool able to perform supervised classification procedures on huge amounts of data on a distributed infrastructure. The architecture is implemented on top of the MapReduce framework. The tool implements four classification algorithms taken from WEKA's machine learning library, namely Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines. The SVM classifier was applied to datasets of different sizes (2 GB, 4 GB and 10 GB) on different cluster configurations (5, 10, 20 and 50 nodes). The results show the tool to be a potential approach for parallelizing classification processes on big data.
KW - Big Data
KW - Classification Algorithms
KW - Cloud Computing
KW - MapReduce
UR - http://www.scopus.com/inward/record.url?scp=84962592030&partnerID=8YFLogxK
U2 - 10.1109/IGARSS.2015.7326066
DO - 10.1109/IGARSS.2015.7326066
M3 - Conference contribution
AN - SCOPUS:84962592030
T3 - International Geoscience and Remote Sensing Symposium (IGARSS)
SP - 1508
EP - 1511
BT - 2015 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2015 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2015
Y2 - 26 July 2015 through 31 July 2015
ER -