TY - GEN
T1 - Geolocated Data Generation and Protection Using Generative Adversarial Networks
AU - Alatrista-Salas, Hugo
AU - Montalvo-Garcia, Peter
AU - Nunez-del-Prado, Miguel
AU - Salas, Julián
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Data mining techniques allow us to discover patterns in large datasets. Nonetheless, data may contain sensitive information. This is especially true when data is georeferenced. Thus, an adversary could learn about individual whereabouts, points of interest, political affiliation, and even sexual habits. At the same time, human mobility is a rich source of information to analyze traffic jams, health care accessibility, food deserts, and even pandemic dynamics. Therefore, to enhance privacy, we study the use of Deep Learning techniques such as Generative Adversarial Network (GAN) and GAN with Differential Privacy (DP-GAN) to generate synthetic data with formal privacy guarantees. Our experiments demonstrate that we can generate synthetic data to maintain individuals’ privacy and data quality depending on privacy parameters. Accordingly, based on the privacy settings, we generated data differing a few meters and a few kilometers from the original trajectories. After generating fine-grain mobility trajectories at the GPS level through an adversarial neural network approach and using GAN to sanitize the original trajectories together with differential privacy, we analyze the privacy provided from the perspective of anonymization literature. We show that such ϵ-differentially private data may still have a risk of re-identification.
AB - Data mining techniques allow us to discover patterns in large datasets. Nonetheless, data may contain sensitive information. This is especially true when data is georeferenced. Thus, an adversary could learn about individual whereabouts, points of interest, political affiliation, and even sexual habits. At the same time, human mobility is a rich source of information to analyze traffic jams, health care accessibility, food deserts, and even pandemic dynamics. Therefore, to enhance privacy, we study the use of Deep Learning techniques such as Generative Adversarial Network (GAN) and GAN with Differential Privacy (DP-GAN) to generate synthetic data with formal privacy guarantees. Our experiments demonstrate that we can generate synthetic data to maintain individuals’ privacy and data quality depending on privacy parameters. Accordingly, based on the privacy settings, we generated data differing a few meters and a few kilometers from the original trajectories. After generating fine-grain mobility trajectories at the GPS level through an adversarial neural network approach and using GAN to sanitize the original trajectories together with differential privacy, we analyze the privacy provided from the perspective of anonymization literature. We show that such ϵ-differentially private data may still have a risk of re-identification.
KW - Differential privacy
KW - Disclosure risk
KW - Generative Adversarial Networks
KW - Information loss
KW - Privacy
KW - Synthetic trajectories
UR - http://www.scopus.com/inward/record.url?scp=85137099547&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-13448-7_7
DO - 10.1007/978-3-031-13448-7_7
M3 - Conference contribution
AN - SCOPUS:85137099547
SN - 9783031134470
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 80
EP - 91
BT - Modeling Decisions for Artificial Intelligence - 19th International Conference, MDAI 2022, Proceedings
A2 - Torra, Vicenç
A2 - Narukawa, Yasuo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 19th International Conference on Modeling Decisions for Artificial Intelligence, MDAI 2022
Y2 - 30 August 2022 through 2 September 2022
ER -