A reconstruction attack is any method for partially reconstructing a private dataset from public aggregate information. Typically, the dataset contains sensitive information about individuals, whose privacy needs to be protected. The attacker has no access, or only partial access, to the dataset, but has access to public aggregate statistics about the dataset, which could be exact or distorted, for example by adding noise. If the public statistics are not sufficiently distorted, the attacker is able to accurately reconstruct a large portion of the original private data. Reconstruction attacks are relevant to the analysis of private data, as they show that, in order to preserve even a very weak notion of individual privacy, any published statistics need to be sufficiently distorted. This phenomenon was called the Fundamental Law of Information Recovery by Dwork and Roth, who formulated it as "overly accurate answers to too many questions will destroy privacy in a spectacular way."[1]
In 2003, Irit Dinur and Kobbi Nissim proposed a reconstruction attack based on noisy answers to multiple statistical queries.[2] Their work was recognized by the 2013 ACM PODS Alberto O. Mendelzon Test-of-Time Award in part for being the seed for the development of differential privacy.[3]
Dinur and Nissim model a private database as a sequence of bits $D=(d_1,\ldots,d_n)$, where each bit $d_i$ holds sensitive information about one individual. The database can be queried statistically: a query is specified by a subset $S\subseteq\{1,\ldots,n\}$, and its exact answer is the subset sum $q_S(D)=\sum_{i\in S}d_i$. The attacker receives answers $a_1,\ldots,a_m$ to queries specified by sets $S_1,\ldots,S_m$ such that $|a_i-q_{S_i}(D)|\le\mathcal{E}$ for all $i\in\{1,\ldots,m\}$, where $\mathcal{E}$ is a bound on the error (noise) added to each answer.
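For concreteness, the query model can be simulated in a few lines. The toy sketch below draws a random database and releases one noisy subset-sum answer; the uniform noise distribution and the choice of query set are illustrative assumptions, since the model only requires each released answer to lie within $\mathcal{E}$ of the true value.

```python
# Toy simulation of the query model: a hidden bit vector D is queried by a
# subset sum, and the released answer is perturbed by noise of magnitude at
# most E. (Illustrative sketch; noise distribution is an assumption.)
import random

random.seed(0)
n, E = 10, 1

D = [random.randint(0, 1) for _ in range(n)]          # private database d_1, ..., d_n
S = [i for i in range(n) if random.random() < 0.5]    # query set S (0-indexed here)

exact = sum(D[i] for i in S)                          # true answer q_S(D)
released = exact + random.randint(-E, E)              # noisy answer a with |a - q_S(D)| <= E

print("S =", S, " q_S(D) =", exact, " released =", released)
```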
Dinur and Nissim showed that if the error bound $\mathcal{E}$ is small enough and the number of queries $m$ is large enough, the attacker can reconstruct almost all of $D$. Their first attack queries all $2^n$ subsets and runs in time exponential in $n$: any candidate database that is consistent with the noisy answers differs from $D$ in at most $4\mathcal{E}$ positions, so the reconstruction is accurate on all but a vanishing fraction of the bits whenever $\mathcal{E}$ is $o(n)$. Their second attack is efficient: using only $m = O(n\log^2 n)$ randomly chosen queries and running in time polynomial in $n$, it reconstructs all but $o(n)$ bits of $D$ whenever the noise bound $\mathcal{E}$ is $o(\sqrt{n})$.
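A reconstruction in the spirit of the polynomial-time attack can be sketched with a linear program: find any fractional database in $[0,1]^n$ that is consistent with all noisy answers, then round it to bits. The snippet below is a minimal sketch under assumed parameters (the database size, number of random queries, noise distribution, and the use of SciPy's linprog are illustrative choices, not the authors' original code):

```python
# Sketch of an LP-based reconstruction from noisy subset-sum answers.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

n, m = 100, 800      # database size and number of random queries (illustrative)
E = 2                # per-query error bound; the attack needs E well below sqrt(n)

D = rng.integers(0, 2, size=n)                       # hidden private database
Q = rng.integers(0, 2, size=(m, n))                  # row j is the indicator vector of S_j
a = Q @ D + rng.integers(-E, E + 1, size=m)          # noisy answers, |a_j - q_{S_j}(D)| <= E

# Feasibility LP: find x in [0,1]^n with  a_j - E <= (Q x)_j <= a_j + E  for every j.
A_ub = np.vstack([Q, -Q])
b_ub = np.concatenate([a + E, -(a - E)])
res = linprog(c=np.zeros(n), A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")

reconstruction = np.rint(res.x).astype(int)          # round the fractional solution to bits
print("bits recovered incorrectly:", int((reconstruction != D).sum()), "of", n)
```

In this toy setting, rounding the LP solution typically recovers nearly every bit, reflecting the condition above that the noise must stay well below $\sqrt{n}$ for reconstruction to succeed.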