Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Confidentiality challenges in releasing longitudinally linked data
AU - Mitra, R.
AU - Blanchard, S.
AU - Dove, I.
AU - Tudor, C.
AU - Spicer, K.
PY - 2020
Y1 - 2020
N2 - Longitudinally linked household data allows researchers to analyse trends over time as well as on a cross-sectional level. Such analysis requires households to be linked across waves, but this increases the possibility of disclosure risks. We focus on an inter-wave disclosure risk specific to such data sets where intruders can make use of intimate knowledge gained about the household in one wave to learn new sensitive information about the household in future waves. We consider a specific way this risk could occur when households split in one wave, so an individual has left the household, and illustrate this risk using the Wealth and Assets survey. We also show that simply removing the links between waves may be insufficient to adequately protect confidentiality. To mitigate this risk we investigate two statistical disclosure control methods, perturbation and synthesis, that alter sensitive information on these households in the current wave. In this way no new sensitive information will be disclosed to these individuals, while utility should be largely preserved provided the SDC measures are applied appropriately. © 2020, University of Skovde. All rights reserved.
AB - Longitudinally linked household data allows researchers to analyse trends over time as well as on a cross-sectional level. Such analysis requires households to be linked across waves, but this increases the possibility of disclosure risks. We focus on an inter-wave disclosure risk specific to such data sets where intruders can make use of intimate knowledge gained about the household in one wave to learn new sensitive information about the household in future waves. We consider a specific way this risk could occur when households split in one wave, so an individual has left the household, and illustrate this risk using the Wealth and Assets survey. We also show that simply removing the links between waves may be insufficient to adequately protect confidentiality. To mitigate this risk we investigate two statistical disclosure control methods, perturbation and synthesis, that alter sensitive information on these households in the current wave. In this way no new sensitive information will be disclosed to these individuals, while utility should be largely preserved provided the SDC measures are applied appropriately. © 2020, University of Skovde. All rights reserved.
KW - Data confidentiality
KW - Disclosure risk
KW - Matching
KW - Perturbation
KW - Propensity score
KW - Synthetic data
KW - Linked data
KW - Perturbation techniques
KW - Current waves
KW - Household datum
KW - Sensitive information
KW - Statistical disclosure control
KW - Trends over time
KW - Risk assessment
M3 - Journal article
VL - 13
SP - 151
EP - 170
JO - Transactions on Data Privacy
JF - Transactions on Data Privacy
IS - 2
ER -