Name:
Synthetic data privacy evaluation
Description:
Privacy evaluation framework for synthetic data publishing
Professor — Lab:
Carmela Troncoso, Security and Privacy Engineering Laboratory
Contact:
Theresa Stadler

Layman description:
The framework implemented in this library allows data holders to evaluate how much publishing a synthetic dataset in place of a sensitive raw dataset reduces the privacy risk for the individuals whose records are included in the raw data. The results of the evaluation help inform decisions about whether to publish the data and about which generative model provides the best trade-off between utility and privacy gain.
Technical description:
The framework implemented in this library measures the privacy gain of publishing a synthetic dataset in place of the raw data with respect to a specific privacy concern. Each concern is modelled as a privacy adversary that targets an individual record and aims to infer a secret about this record. The library includes implementations of two new privacy attacks on the output of a generative model. To evaluate privacy gain, the framework is instantiated under the chosen threat model and outputs an estimate of how much publishing the synthetic data instead of the raw data reduces the privacy loss of a chosen target record under that threat model.
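
As a rough illustration of this evaluation workflow, the sketch below (plain Python with NumPy, not the library's actual API) estimates the advantage of a toy membership-inference adversary against a chosen target record, once when the raw data is published and once when a simple synthetic generator is published in its place; the difference between the two advantages serves as the privacy gain estimate. The names IndependentMarginals, distance_attack and adversary_advantage are hypothetical stand-ins for the generative models and attacks the library implements.

"""Illustrative sketch only; it does not use the library's actual API."""
import numpy as np

rng = np.random.default_rng(0)

def sample_population(n, d=5):
    # Toy sensitive data: n records with d numeric attributes.
    return rng.normal(size=(n, d))

class IndependentMarginals:
    """Hypothetical generative model: fits an independent Gaussian per column."""
    def fit(self, data):
        self.mu = data.mean(axis=0)
        self.sigma = data.std(axis=0) + 1e-9
        return self

    def sample(self, n):
        return rng.normal(self.mu, self.sigma, size=(n, len(self.mu)))

def distance_attack(published, target, threshold):
    # Adversary guesses "target was in the training data" if any published
    # record lies within `threshold` of the target record.
    dists = np.linalg.norm(published - target, axis=1)
    return dists.min() < threshold

def adversary_advantage(publish_fn, target, population,
                        n_train=200, n_trials=200, threshold=1.0):
    # Advantage = P(guess "in" | target in raw data) - P(guess "in" | target out).
    hits_in = hits_out = 0
    for _ in range(n_trials):
        train = population[rng.choice(len(population), n_train, replace=False)]
        # World 1: the target record is part of the raw data behind the release.
        hits_in += distance_attack(publish_fn(np.vstack([train, target])), target, threshold)
        # World 0: the target record is not part of the raw data.
        hits_out += distance_attack(publish_fn(train), target, threshold)
    return (hits_in - hits_out) / n_trials

population = sample_population(5000)
target = population[0] + 3.0           # an outlying target record

publish_raw = lambda data: data        # baseline release: the raw data itself
publish_synth = lambda data: IndependentMarginals().fit(data).sample(len(data))

adv_raw = adversary_advantage(publish_raw, target, population)
adv_synth = adversary_advantage(publish_synth, target, population)
print(f"advantage on raw data:       {adv_raw:.2f}")
print(f"advantage on synthetic data: {adv_synth:.2f}")
print(f"privacy gain:                {adv_raw - adv_synth:.2f}")

The gain is computed as the difference between the adversary's advantage on the raw release and on the synthetic release, mirroring the per-record, per-threat-model framing described above.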
Papers:
Project status:
inactive — entered showcase: 2021-02-08 — entry updated: 2022-07-07

Source code:
Lab Github - last commit: 2022-05-13
Code quality:
This project has not yet been evaluated by the C4DT Factory team. We will be happy to evaluate it upon request.
Project type:
Library
Programming language:
Python
License:
BSD-3-Clause