Quality data extraction methodology based on the labeling of coffee leaves with nutritional deficiencies

Adolfo Jungbluth, Jon Li Yeng, Luis Vives

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Detecting nutritional deficiencies in coffee leaves is a task often undertaken manually by field experts known as agronomists. Their process relies on observing the different characteristics of the coffee leaves and on their own experience. Visual fatigue and human error in this empirical approach cause leaves to be labeled incorrectly, which degrades the quality of the data obtained. In this context, different crowdsourcing approaches can be applied to enhance the quality of the extracted data. These approaches separately propose the use of voting systems, association rule filters, and evolutive learning. In this paper, we extend the use of association rule filters and the evolutive approach by combining them in a methodology that enhances data quality while guiding users through the main stages of data extraction tasks. Moreover, our methodology proposes a reward component to engage users and keep them motivated during the crowdsourcing tasks. Applying the proposed methodology in a case study on Peruvian coffee leaves yielded a dataset with 93.33% accuracy, with 30 instances collected by 8 experts and evaluated by 2 agronomic engineers with a background in coffee leaves. This accuracy was higher than that obtained by independently implementing the evolutive feedback strategy and an empirical approach, which resulted in 86.67% and 70% accuracy, respectively, under the same conditions.
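The reported percentages can be related to the 30-instance dataset with a little arithmetic. The sketch below assumes that accuracy means correctly labeled instances divided by total instances, which the abstract does not define explicitly. The correct-label counts (28, 26, 21) are inferred from the reported percentages and are not stated in the source, and the function name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return round(100 * correct / total, 2)


TOTAL_INSTANCES = 30  # stated in the abstract

# ASSUMPTION: these counts are inferred from the reported percentages;
# the abstract gives only the percentages, not the counts.
inferred_correct = {
    "proposed methodology": 28,
    "evolutive feedback strategy alone": 26,
    "empirical approach": 21,
}

results = {
    name: accuracy_percent(count, TOTAL_INSTANCES)
    for name, count in inferred_correct.items()
}

for name, pct in results.items():
    print(f"{name}: {pct}%")
```

Under these assumptions, the gap between the proposed methodology and the evolutive feedback strategy alone corresponds to two instances out of 30, which is worth keeping in mind when weighing the size of the improvement.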

Original language: English
Title of host publication: ICISDM 2018 - 2nd International Conference on Information System and Data Mining
Publisher: Association for Computing Machinery
Pages: 59-64
Number of pages: 6
ISBN (Electronic): 9781450363549
State: Published - 9 Apr 2018
Externally published: Yes
Event: 2nd International Conference on Information System and Data Mining, ICISDM 2018 - Lakeland, United States
Duration: 9 Apr 2018 to 11 Apr 2018

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2nd International Conference on Information System and Data Mining, ICISDM 2018
Country/Territory: United States
City: Lakeland
Period: 9/04/18 to 11/04/18

Keywords

  • Data extraction
  • Data quality assessment
  • Quality data extraction methodology
