Package jadex.commons.collection
Class BloomFilter
- java.lang.Object
-
- jadex.commons.collection.BloomFilter
-
public class BloomFilter extends java.lang.Object
A bloom filter is a probabilistic data structure for checking if a value is contained in a set. It has a false positive rate, i.e. it can tell that a value is contained in the filter although it is not. This implementation does not support remove() because clearing a bit that is used by more than one element might lead to false negatives. (If remove is required a counting filter needs to be used).
-
-
Constructor Summary
Constructors Constructor Description BloomFilter()
Create a new BloomFilter with default p = 0.01 and n = 1000.BloomFilter(double p, int n)
Create a new BloomFilter.BloomFilter(int m, int k, int n)
Create a new BloomFilter.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description boolean
add(byte[] value)
Add a value to the filter.void
clear()
Clear the filter content.static int
computeAcceptableN(long k, long m)
Compute acceptable amount of elements with given number of hashes and size of filter in bits.static int
computeOptimalK(long n, long m)
Compute optimal number of hash functions using expected number of elements and size of filter in bits.static int
computeOptimalM(long n, double p)
Compute optimal size of bloom filter in bits by expected number of elements and acceptable false positive rate.static double
computeP(long k, long m, double insertedElements)
Compute best-case (uniform hash function) false positive probability.static int[]
hashK(byte[] value, int m, int k)
Compute k hashes based on two hashes using (hash1+i*hash2)%m;static void
main(java.lang.String[] args)
Main for testing.boolean
mightContain(byte[] value)
Test if a value is contained in the filter.static int
murmur3(byte[] data, int seed)
Compute the murmur hash for a byte array.java.lang.String
toString()
Get the string representation.
-
-
-
Constructor Detail
-
BloomFilter
public BloomFilter()
Create a new BloomFilter with default p = 0.01 and n = 1000.
-
BloomFilter
public BloomFilter(double p, int n)
Create a new BloomFilter.- Parameters:
p
- The acceptable false positive rate.n
- The expected number of entries.
-
BloomFilter
public BloomFilter(int m, int k, int n)
Create a new BloomFilter.- Parameters:
m
- The number of bits in the filter.k
- The number of hashes used.n
- The expected number of values in the filter.
-
-
Method Detail
-
add
public boolean add(byte[] value)
Add a value to the filter.- Parameters:
value
- The value.
-
clear
public void clear()
Clear the filter content.
-
mightContain
public boolean mightContain(byte[] value)
Test if a value is contained in the filter.- Parameters:
value
- The value.- Returns:
- True, if value might be contained.
-
computeOptimalM
public static int computeOptimalM(long n, double p)
Compute optimal size of bloom filter in bits by expected number of elements and acceptable false positive rate.- Parameters:
n
- Expected number of elements.p
- Acceptable false positive rate.- Returns:
- The optimal size of the bloom filter in bits.
-
computeOptimalK
public static int computeOptimalK(long n, long m)
Compute optimal number of hash functions using expected number of elements and size of filter in bits.- Parameters:
n
- Expected number of elements inserted in the bloom filterm
- The size of the bloom filter in bits.- Returns:
- The optimal number of hash functions.
-
computeAcceptableN
public static int computeAcceptableN(long k, long m)
Compute acceptable amount of elements with given number of hashes and size of filter in bits.- Parameters:
k
- Number of hashes.m
- The size of the bloom filter in bits.- Returns:
- Acceptable number elements that can be inserted.
-
computeP
public static double computeP(long k, long m, double insertedElements)
Compute best-case (uniform hash function) false positive probability.- Parameters:
k
- Number of hashes.m
- The size of the bloom filter in bits.n
- Number of elements inserted in the filter.- Returns:
- The expected false positive probability.
-
hashK
public static int[] hashK(byte[] value, int m, int k)
Compute k hashes based on two hashes using (hash1+i*hash2)%m;- Parameters:
value
- The byte array.m
- The maximum number of values.k
- The number of hash values to compute.- Returns:
- K hashes.
-
murmur3
public static int murmur3(byte[] data, int seed)
Compute the murmur hash for a byte array.
-
toString
public java.lang.String toString()
Get the string representation.- Overrides:
toString
in classjava.lang.Object
-
main
public static void main(java.lang.String[] args)
Main for testing.
-
-