No account? Create an account

## More about Bloom filters

#### « previous entry | next entry » 28th Jan 2008 | 12:36

While writing up my notes on Bloom filters, I wanted a better idea of the relationship between the false positive rate and the size multiplier.

 sm = size / pop pz = exp(−nh ∗ pop / size) = exp(−nh / sm) fpr = (1 − pz) nh

Given a fixed number of hash functions, what size multiplier do we need to make enough space in the Bloom filter to produce the desired false positive rate? How does the size multiplier change as nh and fpr change? I wrote a program to solve this numerically, and the output is below. The entries for higher fprs show quite nicely that there's an optimal nh for a given fpr.

size/pop nh=1 nh=2 nh=3 nh=4 nh=5 nh=6 nh=7 nh=8 nh=9 nh=10 nh=11 nh=12 nh=13 nh=14 nh=15 nh=16
fpr=2-1 1.4 1.6 1.9 2.2 2.4 2.7 3.0 3.2 3.5 3.7 3.9 4.2 4.4 4.6 4.8 5.1
fpr=2-2 3.5 2.9 3.0 3.3 3.5 3.8 4.1 4.4 4.6 4.9 5.2 5.4 5.7 5.9 6.2 6.4
fpr=2-3 7.5 4.6 4.3 4.4 4.6 4.9 5.2 5.4 5.7 6.0 6.3 6.5 6.8 7.1 7.3 7.6
fpr=2-4 15.5 7.0 5.9 5.8 5.9 6.0 6.3 6.5 6.8 7.1 7.3 7.6 7.9 8.2 8.4 8.7
fpr=2-5 31.5 10.3 7.9 7.3 7.2 7.3 7.4 7.7 7.9 8.1 8.4 8.7 9.0 9.2 9.5 9.8
fpr=2-6 63.5 15.0 10.4 9.2 8.8 8.7 8.7 8.9 9.1 9.3 9.5 9.8 10.0 10.3 10.6 10.9
fpr=2-7 127.5 21.6 13.6 11.3 10.5 10.2 10.1 10.2 10.3 10.5 10.7 10.9 11.1 11.4 11.7 11.9
fpr=2-8 255.5 31.0 17.5 13.9 12.5 11.9 11.6 11.5 11.6 11.7 11.9 12.1 12.3 12.5 12.8 13.0
fpr=2-9 511.5 44.2 22.5 16.9 14.8 13.8 13.3 13.0 13.0 13.0 13.1 13.3 13.5 13.7 13.9 14.2
fpr=2-10 1023.5 63.0 28.7 20.6 17.4 15.9 15.1 14.7 14.5 14.4 14.5 14.6 14.7 14.9 15.1 15.3
fpr=2-11 2047.5 89.5 36.6 24.9 20.4 18.2 17.1 16.4 16.1 15.9 15.9 15.9 16.0 16.1 16.3 16.5
fpr=2-12 4095.5 127.0 46.5 30.0 23.8 20.9 19.3 18.3 17.8 17.5 17.4 17.3 17.3 17.4 17.6 17.7
fpr=2-13 8191.5 180.0 59.0 36.0 27.7 23.8 21.7 20.4 19.7 19.2 18.9 18.8 18.8 18.8 18.9 19.0
fpr=2-14 16383.5 255.0 74.7 43.2 32.3 27.1 24.3 22.7 21.6 21.0 20.6 20.4 20.2 20.2 20.2 20.3
fpr=2-15 32767.5 361.0 94.5 51.8 37.4 30.8 27.3 25.1 23.8 22.9 22.4 22.0 21.8 21.7 21.6 21.7
fpr=2-16 65535.5 511.0 119.4 62.0 43.4 35.0 30.5 27.8 26.1 25.0 24.2 23.7 23.4 23.2 23.1 23.1
fpr=2-17 131071.5 723.1 150.9 74.1 50.2 39.7 34.1 30.7 28.6 27.2 26.2 25.6 25.1 24.8 24.6 24.6
fpr=2-18 262143.5 1023.0 190.5 88.5 58.1 44.9 38.0 33.9 31.3 29.5 28.3 27.5 26.9 26.5 26.3 26.1
fpr=2-19 524287.5 1447.2 240.4 105.6 67.1 50.8 42.3 37.4 34.2 32.1 30.6 29.6 28.8 28.3 27.9 27.7
fpr=2-20 1048575.5 2047.0 303.3 126.0 77.5 57.4 47.1 41.1 37.3 34.8 33.0 31.7 30.8 30.1 29.7 29.3