
Two Hospitals Matched Patient Records Without Sharing a Single Name
Hospital A has 50,000 patient records. Hospital B has 40,000. Some patients visit both. Nobody knows which ones. They need to find the overlap — for care coordination, billing reconciliation, research. But HIPAA says neither hospital can share raw patient data with the other. No names. No SSNs. No dates of birth. Nothing identifiable. So how do you match records you're not allowed to see? Bloom Filters A bloom filter is a bit array. You take a name like "John Smith," break it into character pairs ("jo", "oh", "hn", "n ", " s", "sm", "mi", "it", "th"), hash each pair into positions in the bit array, and flip those bits to 1. The result is an encrypted fingerprint. You can't reverse it back to "John Smith." But "Jon Smith" produces a bloom filter with most of the same bits flipped — because most of the character pairs overlap. Two similar names → overlapping bits → measurable similarity. Without ever decrypting. The Command pip install goldenmatch goldenmatch pprl link \ --file-a hospita
Continue reading on Dev.to Python
Opens in a new tab



