TY - JOUR
T1 - A Graph-Based Differentially Private Algorithm for Mining Frequent Sequential Patterns
AU - Nunez-del-Prado, Miguel
AU - Maehara-Aliaga, Yoshitomi
AU - Salas, Julián
AU - Alatrista-Salas, Hugo
AU - Megías, David
N1 - Publisher Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2022/2/1
Y1 - 2022/2/1
N2 - Currently, individuals leave a digital trace of their activities when they use their smartphones, social media, mobile apps, credit card payments, Internet surfing profile, etc. These digital activities hide intrinsic usage patterns, which can be extracted using sequential pattern algorithms. Sequential pattern mining is a promising approach for discovering temporal regularities in huge and heterogeneous databases. These sequences represent individuals’ common behavior and could contain sensitive information. Thus, sequential patterns should be sanitized to preserve individuals’ privacy. Hence, many algorithms have been proposed to accomplish this task. However, these techniques add noise to the candidate support before the candidates are validated as frequent, and thus they cannot be applied without access to all the users’ sequence data. In this paper, we propose a differential privacy graph-based technique for publishing frequent sequential patterns. It is applied at the post-processing stage; hence it may be used to protect frequent sequential patterns after they have been extracted, without the need to access all the users’ sequences. To validate our proposal, we performed a detailed assessment of its utility as a pattern mining algorithm and calculated the impact of the sanitization mechanism on a recommender system. We further evaluated its information loss and disclosure risk and performed a comparison with the DP-FSM algorithm.
AB - Currently, individuals leave a digital trace of their activities when they use their smartphones, social media, mobile apps, credit card payments, Internet surfing profile, etc. These digital activities hide intrinsic usage patterns, which can be extracted using sequential pattern algorithms. Sequential pattern mining is a promising approach for discovering temporal regularities in huge and heterogeneous databases. These sequences represent individuals’ common behavior and could contain sensitive information. Thus, sequential patterns should be sanitized to preserve individuals’ privacy. Hence, many algorithms have been proposed to accomplish this task. However, these techniques add noise to the candidate support before the candidates are validated as frequent, and thus they cannot be applied without access to all the users’ sequence data. In this paper, we propose a differential privacy graph-based technique for publishing frequent sequential patterns. It is applied at the post-processing stage; hence it may be used to protect frequent sequential patterns after they have been extracted, without the need to access all the users’ sequences. To validate our proposal, we performed a detailed assessment of its utility as a pattern mining algorithm and calculated the impact of the sanitization mechanism on a recommender system. We further evaluated its information loss and disclosure risk and performed a comparison with the DP-FSM algorithm.
KW - Anonymization of big data
KW - Differential privacy
KW - Edge differential privacy
KW - Frequent pattern mining
KW - Graph differential privacy
KW - Sequential pattern mining
UR - http://www.scopus.com/inward/record.url?scp=85124976681&partnerID=8YFLogxK
U2 - 10.3390/app12042131
DO - 10.3390/app12042131
M3 - Article
AN - SCOPUS:85124976681
SN - 2076-3417
VL - 12
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 4
M1 - 2131
ER -