Standard
Using saturated models for data synthesis. /
Jackson, James; Francis, Brian; Mitra, Robin et al.
Proceedings of the 36th International Workshop on Statistical Modelling: July 18-22, 2022 - Trieste, Italy. ed. / Nicola Torelli; Ruggero Bellio; Vito Muggeo. EUT Edizioni Università di Trieste, Trieste 2022, 2022. p. 205-210 34.
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper
Harvard
Jackson, J, Francis, B, Mitra, R & Dove, I 2022,
Using saturated models for data synthesis. in N Torelli, R Bellio & V Muggeo (eds),
Proceedings of the 36th International Workshop on Statistical Modelling: July 18-22, 2022 - Trieste, Italy., 34, EUT Edizioni Università di Trieste, Trieste 2022, pp. 205-210, 36th International Workshop on Statistical Modelling, Trieste, Italy,
18/07/22.
APA
Jackson, J., Francis, B., Mitra, R., & Dove, I. (2022).
Using saturated models for data synthesis. In N. Torelli, R. Bellio, & V. Muggeo (Eds.),
Proceedings of the 36th International Workshop on Statistical Modelling: July 18-22, 2022 - Trieste, Italy (pp. 205-210). Article 34. EUT Edizioni Università di Trieste, Trieste 2022.
Vancouver
Jackson J, Francis B, Mitra R, Dove I.
Using saturated models for data synthesis. In Torelli N, Bellio R, Muggeo V, editors, Proceedings of the 36th International Workshop on Statistical Modelling: July 18-22, 2022 - Trieste, Italy. EUT Edizioni Università di Trieste, Trieste 2022. 2022. p. 205-210. 34
Bibtex
@inproceedings{1080db38d5534d348c495f3f8f80b50e,
title = "Using saturated models for data synthesis",
abstract = "The use of synthetic data sets is becoming ever more prevalent, as regulations such as the General Data Protection Regulation (GDPR), which place greater demands on the protection of individuals{\textquoteright} personal data, are coupled with the conflicting demand to make more data available to researchers. This paper discusses the approach of synthesizing categorical data at the aggregated (contingency table) level using a saturated count model, which adds noise - and hence protection - to cell counts. The paper also discusses how distributional properties of synthesis models are intrinsic to generating synthetic data with suitable risk and utility profiles.",
keywords = "Synthetic data, Data privacy, Count models",
author = "James Jackson and Brian Francis and Robin Mitra and Iain Dove",
year = "2022",
month = jul,
day = "18",
language = "English",
pages = "205--210",
editor = "Nicola Torelli and Ruggero Bellio and Vito Muggeo",
booktitle = "Proceedings of the 36th International Workshop on Statistical Modelling",
publisher = "EUT Edizioni Universit{\`a} di Trieste, Trieste 2022",
note = "36th International Workshop on Statistical Modelling : July 18-22, 2022 - Trieste, Italy, IWSM ; Conference date: 18-07-2022 Through 22-07-2022",
url = "https://www.iwsm2022.com/",
}
RIS
TY - GEN
T1 - Using saturated models for data synthesis
AU - Jackson, James
AU - Francis, Brian
AU - Mitra, Robin
AU - Dove, Iain
N1 - Conference code: 36
PY - 2022/7/18
Y1 - 2022/7/18
N2 - The use of synthetic data sets is becoming ever more prevalent, as regulations such as the General Data Protection Regulation (GDPR), which place greater demands on the protection of individuals’ personal data, are coupled with the conflicting demand to make more data available to researchers. This paper discusses the approach of synthesizing categorical data at the aggregated (contingency table) level using a saturated count model, which adds noise - and hence protection - to cell counts. The paper also discusses how distributional properties of synthesis models are intrinsic to generating synthetic data with suitable risk and utility profiles.
AB - The use of synthetic data sets is becoming ever more prevalent, as regulations such as the General Data Protection Regulation (GDPR), which place greater demands on the protection of individuals’ personal data, are coupled with the conflicting demand to make more data available to researchers. This paper discusses the approach of synthesizing categorical data at the aggregated (contingency table) level using a saturated count model, which adds noise - and hence protection - to cell counts. The paper also discusses how distributional properties of synthesis models are intrinsic to generating synthetic data with suitable risk and utility profiles.
KW - Synthetic data
KW - Data privacy
KW - Count models
M3 - Conference contribution/Paper
SP - 205
EP - 210
BT - Proceedings of the 36th International Workshop on Statistical Modelling
A2 - Torelli, Nicola
A2 - Bellio, Ruggero
A2 - Muggeo, Vito
PB - EUT Edizioni Università di Trieste, Trieste 2022
T2 - 36th International Workshop on Statistical Modelling
Y2 - 18 July 2022 through 22 July 2022
ER -