COCOA: A synthetic data generator for testing anonymization techniques

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Extensive testing of anonymization techniques is critical to assess their robustness and to identify the scenarios where they are most suitable. However, access to real microdata is highly restricted, and the microdata that is publicly available is usually anonymized or aggregated, which reduces its value for testing purposes. In this paper, we present a framework (COCOA) for generating realistic synthetic microdata that allows users to define multi-attribute relationships in order to preserve the functional dependencies of the data. We show how COCOA strengthens the testing of anonymization techniques by broadening the number and diversity of the test scenarios. Results also show that COCOA is practical for generating large datasets.

Original language: English
Title of host publication: Privacy in Statistical Databases - UNESCO Chair in Data Privacy International Conference, PSD 2016, Proceedings
Editors: Josep Domingo-Ferrer, Mirjana Pejić-Bach
Publisher: Springer Verlag
Pages: 163-177
Number of pages: 15
ISBN (Print): 9783319453804
DOIs
Publication status: Published - 2016
Externally published: Yes
Event: International Conference on Privacy in Statistical Databases, PSD 2016 - Dubrovnik, Croatia
Duration: 14 Sep 2016 - 16 Sep 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9867 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Privacy in Statistical Databases, PSD 2016
Country/Territory: Croatia
City: Dubrovnik
Period: 14/09/16 - 16/09/16
