TY - GEN
T1 - Towards an Efficient Log Data Protection in Software Systems through Data Minimization and Anonymization
AU - Portillo-Dominguez, A. Omar
AU - Ayala-Rivera, Vanessa
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - IT infrastructures of companies generate large amounts of log data every day. These logs are typically analyzed by software engineers to gain insights into activities occurring within a company (e.g., to debug issues exhibited by production systems). To facilitate this process, log data management is often outsourced to cloud providers. However, logs may contain information that is sensitive by nature and considered personally identifiable under most of the new privacy protection laws, such as the European General Data Protection Regulation (GDPR). To remain compliant with these regulations, companies must adopt appropriate data protection measures in their software systems. Such privacy protection laws also promote the use of anonymization techniques as possible mechanisms to operationalize data protection. However, companies struggle to put anonymization into practice due to the lack of integrated, intuitive, and easy-to-use tools that fit effectively with their log management systems. In this paper, we propose an automatic approach (SafeLog) to filter out information and anonymize log streams in order to safeguard the confidentiality of sensitive data and prevent its exposure to, and misuse by, third parties. Our results show that atomic anonymization operations can be effectively applied to log streams to preserve the confidentiality of information while still allowing different types of analysis tasks, such as user behavior analysis and anomaly detection. Our approach also reduces the amount of data sent to cloud vendors, hence decreasing the financial costs and the risk of overexposing information.
KW - Anonymization
KW - Privacy
KW - Security
KW - Software Engineering
UR - https://www.scopus.com/pages/publications/85088565091
U2 - 10.1109/CONISOFT.2019.00024
DO - 10.1109/CONISOFT.2019.00024
M3 - Conference contribution
AN - SCOPUS:85088565091
T3 - Proceedings - 2019 7th International Conference in Software Engineering Research and Innovation, CONISOFT 2019
SP - 107
EP - 115
BT - Proceedings - 2019 7th International Conference in Software Engineering Research and Innovation, CONISOFT 2019
A2 - Juarez-Ramirez, Reyes
A2 - Fernandez y Fernandez, Carlos Alberto
A2 - Jimenez Calleros, Samantha Paulina
A2 - Ramirez-Noriega, Alan David
A2 - Perez-Gonzalez, Hector G.
A2 - Sandoval, Guillermo Licea
A2 - Guerra-Garcia, Cesar Arturo
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference in Software Engineering Research and Innovation, CONISOFT 2019
Y2 - 23 October 2019 through 25 October 2019
ER -