Improving the utility of anonymized datasets through dynamic evaluation of generalization hierarchies

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The dissemination of textual personal information has become a key driver for innovation and value creation. However, because this data may contain sensitive information, it must be anonymized, which can reduce its usefulness for secondary uses. One of the most widely used anonymization techniques is generalization. However, its effectiveness can be hampered by the Value Generalization Hierarchies (VGHs) used to dictate the anonymization of data, as poorly specified VGHs can reduce the usefulness of the resulting data. To tackle this problem, we propose a metric for evaluating the quality of textual VGHs used in anonymization. Our evaluation approach considers the semantic properties of VGHs and exploits information from the input datasets to predict, with higher accuracy than existing approaches, the potential effectiveness of VGHs for anonymizing data. As a consequence, the utility of the resulting datasets is improved without sacrificing the privacy goal. We also introduce a novel rating scale that classifies the quality of VGHs into categories, making our quality metric easier for practitioners to interpret.
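For readers unfamiliar with the technique, the following is a minimal sketch of what generalization with a VGH looks like in practice. It is not the metric proposed in the paper: the hierarchy, the attribute values, and the `generalize` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy VGH for a textual "occupation" attribute.
# Each key maps a value to its more general parent; the root has no entry.
VGH = {
    "nurse": "medical staff",
    "surgeon": "medical staff",
    "teacher": "education staff",
    "lecturer": "education staff",
    "medical staff": "any occupation",
    "education staff": "any occupation",
}


def generalize(value, levels, vgh=VGH):
    """Climb up to `levels` steps in the hierarchy, stopping at the root."""
    for _ in range(levels):
        if value not in vgh:
            break  # reached the root (or an unknown value): nothing more general
        value = vgh[value]
    return value


records = ["nurse", "surgeon", "teacher"]

# One level up: values become less specific but remain informative.
level1 = [generalize(v, 1) for v in records]

# Two levels up: every value collapses to the root, so utility is lost.
level2 = [generalize(v, 2) for v in records]
```

In this sketch, a hierarchy whose intermediate levels group values sensibly lets records become indistinguishable with little loss of meaning, whereas a hierarchy that forces values up to the root discards almost all information. This is the kind of difference in VGH quality that the abstract refers to.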

Original language: English
Title of host publication: Proceedings - 2016 IEEE 17th International Conference on Information Reuse and Integration, IRI 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 30-39
Number of pages: 10
ISBN (Electronic): 9781509032075
DOIs
Publication status: Published - 2016
Externally published: Yes
Event: 17th IEEE International Conference on Information Reuse and Integration, IRI 2016 - Pittsburgh, United States
Duration: 28 Jul 2016 to 30 Jul 2016

Publication series

Name: Proceedings - 2016 IEEE 17th International Conference on Information Reuse and Integration, IRI 2016

Conference

Conference: 17th IEEE International Conference on Information Reuse and Integration, IRI 2016
Country/Territory: United States
City: Pittsburgh
Period: 28/07/16 to 30/07/16

Keywords

  • Anonymization
  • Data Publishing
  • Data Quality
  • Data Semantics
  • Generalization Hierarchies
  • Privacy
