A Bayesian Nonparametric Approach to Differentially Private Data

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

A Bayesian Nonparametric Approach to Differentially Private Data. / Battiston, Marco; Ayed, Fadhel; Di Benedetto, Giuseppe.
Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings. ed. / Josep Domingo-Ferrer; Krishnamurty Muralidhar. Springer, 2020. p. 32-48 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12276 LNCS).

Harvard

Battiston, M, Ayed, F & Di Benedetto, G 2020, A Bayesian Nonparametric Approach to Differentially Private Data. in J Domingo-Ferrer & K Muralidhar (eds), Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12276 LNCS, Springer, pp. 32-48, International Conference on Privacy in Statistical Databases, PSD 2020, Tarragona, Spain, 23/09/20. https://doi.org/10.1007/978-3-030-57521-2_3

APA

Battiston, M., Ayed, F., & Di Benedetto, G. (2020). A Bayesian Nonparametric Approach to Differentially Private Data. In J. Domingo-Ferrer, & K. Muralidhar (Eds.), Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings (pp. 32-48). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12276 LNCS). Springer. https://doi.org/10.1007/978-3-030-57521-2_3

Vancouver

Battiston M, Ayed F, Di Benedetto G. A Bayesian Nonparametric Approach to Differentially Private Data. In Domingo-Ferrer J, Muralidhar K, editors, Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings. Springer. 2020. p. 32-48. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-57521-2_3

Author

Battiston, Marco ; Ayed, Fadhel ; Di Benedetto, Giuseppe. / A Bayesian Nonparametric Approach to Differentially Private Data. Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Proceedings. editor / Josep Domingo-Ferrer ; Krishnamurty Muralidhar. Springer, 2020. pp. 32-48 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

Bibtex

@inproceedings{e164a193635e43adb6d801d6ea2ddafb,
title = "A Bayesian Nonparametric Approach to Differentially Private Data",
abstract = "The protection of private and sensitive data is an important problem of increasing interest due to the vast amount of personal data collected. Differential Privacy is arguably the most dominant approach to address privacy protection, and is currently implemented in both industry and government. In a decentralized paradigm, the sensitive information belonging to each individual will be locally transformed by a known privacy-maintaining mechanism Q. The objective of differential privacy is to allow an analyst to recover the distribution of the raw data, or some functionals of it, while only having access to the transformed data. In this work, we propose a Bayesian nonparametric methodology to perform inference on the distribution of the sensitive data, reformulating the differentially private estimation problem as a latent variable Dirichlet Process mixture model. This methodology has the advantage that it can be applied to any mechanism Q and works as a “black box” procedure, being able to estimate the distribution and functionals thereof using the same MCMC draws and with very little tuning. Also, being a fully nonparametric procedure, it requires very few assumptions on the distribution of the raw data. For the most popular mechanisms Q, like Laplace and Gaussian, we describe efficient specialized MCMC algorithms and provide theoretical guarantees. Experiments on both synthetic and real datasets show a good performance of the proposed method.",
keywords = "Bayesian Nonparametrics, Differential Privacy, Dirichlet Process mixture model, Exponential mechanism, Laplace noise, Latent variables",
author = "Marco Battiston and Fadhel Ayed and {Di Benedetto}, Giuseppe",
year = "2020",
month = sep,
day = "16",
doi = "10.1007/978-3-030-57521-2_3",
language = "English",
isbn = "9783030575205",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
volume = "12276",
publisher = "Springer",
pages = "32--48",
editor = "Josep Domingo-Ferrer and Krishnamurty Muralidhar",
booktitle = "Privacy in Statistical Databases",
note = "International Conference on Privacy in Statistical Databases, PSD 2020 ; Conference date: 23-09-2020 Through 25-09-2020",
}

RIS

TY - GEN

T1 - A Bayesian Nonparametric Approach to Differentially Private Data

AU - Battiston, Marco

AU - Ayed, Fadhel

AU - Di Benedetto, Giuseppe

PY - 2020/9/16

Y1 - 2020/9/16

N2 - The protection of private and sensitive data is an important problem of increasing interest due to the vast amount of personal data collected. Differential Privacy is arguably the most dominant approach to address privacy protection, and is currently implemented in both industry and government. In a decentralized paradigm, the sensitive information belonging to each individual will be locally transformed by a known privacy-maintaining mechanism Q. The objective of differential privacy is to allow an analyst to recover the distribution of the raw data, or some functionals of it, while only having access to the transformed data. In this work, we propose a Bayesian nonparametric methodology to perform inference on the distribution of the sensitive data, reformulating the differentially private estimation problem as a latent variable Dirichlet Process mixture model. This methodology has the advantage that it can be applied to any mechanism Q and works as a “black box” procedure, being able to estimate the distribution and functionals thereof using the same MCMC draws and with very little tuning. Also, being a fully nonparametric procedure, it requires very few assumptions on the distribution of the raw data. For the most popular mechanisms Q, like Laplace and Gaussian, we describe efficient specialized MCMC algorithms and provide theoretical guarantees. Experiments on both synthetic and real datasets show a good performance of the proposed method.

KW - Bayesian Nonparametrics

KW - Differential Privacy

KW - Dirichlet Process mixture model

KW - Exponential mechanism

KW - Laplace noise

KW - Latent variables

U2 - 10.1007/978-3-030-57521-2_3

DO - 10.1007/978-3-030-57521-2_3

M3 - Conference contribution/Paper

AN - SCOPUS:85092077839

SN - 9783030575205

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 32

EP - 48

BT - Privacy in Statistical Databases

A2 - Domingo-Ferrer, Josep

A2 - Muralidhar, Krishnamurty

PB - Springer

T2 - International Conference on Privacy in Statistical Databases, PSD 2020

Y2 - 23 September 2020 through 25 September 2020

ER -