City Research Online

Efficient private record linkage of very large datasets

Schnell, R. (2013). Efficient private record linkage of very large datasets. Paper presented at the 59th World Statistics Congress of the International Statistical Institute, 25-30 Aug 2013, Hong Kong.


Increasingly, administrative data is being used for statistical purposes, for example registry-based census taking. In practice, this usually requires linking separate files containing information on the same unit, without revealing the identity of the unit. If the linkage has to be done without a unique identification number, the comparison of keys derived from unit identifiers and assumed to be similar is necessary. When dealing with large files like census data or population registries, comparing each possible pair of keys of two files is impossible. Therefore, special algorithms (blocking methods) have to be used to reduce the number of comparisons needed. The presentation will discuss the most widely used blocking algorithms for encrypted data, describe a recently introduced algorithm and compare the performance of the blocking methods currently thought to be the most effective for very large files.
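To illustrate why blocking reduces the number of comparisons, the following is a minimal sketch of standard blocking on toy data. It is not the algorithm presented in the paper, and it does not operate on encrypted keys. The record layout, the choice of birth year as blocking key, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: (record_id, birth_year, encoded_key).
# The encoded_key values are placeholders, not real encrypted identifiers.
file_a = [("a1", 1970, "k1"), ("a2", 1985, "k2"), ("a3", 1970, "k3")]
file_b = [("b1", 1970, "k1"), ("b2", 1985, "k2"),
          ("b3", 1990, "k4"), ("b4", 1970, "k5")]


def full_comparison_count(left, right):
    """Number of pairs when every record is compared with every other."""
    return len(left) * len(right)


def blocked_candidate_pairs(left, right, block_key):
    """Return only the pairs whose records share the same blocking key."""
    # Index the right-hand file by blocking key.
    blocks = defaultdict(list)
    for rec in right:
        blocks[block_key(rec)].append(rec)

    # Each left-hand record is paired only with records in its own block.
    pairs = []
    for rec_left in left:
        for rec_right in blocks.get(block_key(rec_left), []):
            pairs.append((rec_left[0], rec_right[0]))
    return pairs


def birth_year(rec):
    """Assumed blocking key: the second field of the record."""
    return rec[1]


if __name__ == "__main__":
    print(full_comparison_count(file_a, file_b))
    print(blocked_candidate_pairs(file_a, file_b, birth_year))
```

Without blocking, the number of comparisons grows with the product of the two file sizes. With blocking, it grows with the sum of the products of the block sizes, at the cost of missing true matches whose blocking key differs between the files.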

Publication Type: Conference or Workshop Item (Paper)
Additional Information: Copyright 2013, the authors.
Publisher Keywords: administrative data, blocking methods, bloom filter, census data, indexing methods, linking codes
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: School of Science & Technology > Computer Science


