Skip to content

Bloom filters

A probabilistic data structure used for membership tests, particularly useful in big data scenarios where false positives are acceptable.

Bloom filters trade accuracy for speed and memory size.

Some use cases for Bloom filters include:

  1. Caching - quickly check if item is in cache before performing more expensive database lookup
  2. Set Membership test - check if an item is a member of a group. A bloom filter would return:
    • definitely no
    • probably yes

Bloom filters use a fixed-size array and a small number of hash functions.

Advantages of bloom filters include:

  • fast. normally only requiring computing a few hash functions.
  • less memory usage