Package jadex.commons.collection
Class BloomFilter
- java.lang.Object
-
- jadex.commons.collection.BloomFilter
-
public class BloomFilter extends java.lang.ObjectA bloom filter is a probabilistic data structure for checking if a value is contained in a set. It has a false positive rate, i.e. it can tell that a value is contained in the filter although it is not. This implementation does not support remove() because clearing a bit that is used by more than one element might lead to false negatives. (If remove is required a counting filter needs to be used).
-
-
Constructor Summary
Constructors Constructor Description BloomFilter()Create a new BloomFilter with default p = 0.01 and n = 1000.BloomFilter(double p, int n)Create a new BloomFilter.BloomFilter(int m, int k, int n)Create a new BloomFilter.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description booleanadd(byte[] value)Add a value to the filter.voidclear()Clear the filter content.static intcomputeAcceptableN(long k, long m)Compute acceptable amount of elements with given number of hashes and size of filter in bits.static intcomputeOptimalK(long n, long m)Compute optimal number of hash functions using expected number of elements and size of filter in bits.static intcomputeOptimalM(long n, double p)Compute optimal size of bloom filter in bits by expected number of elements and acceptable false positive rate.static doublecomputeP(long k, long m, double insertedElements)Compute best-case (uniform hash function) false positive probability.static int[]hashK(byte[] value, int m, int k)Compute k hashes based on two hashes using (hash1+i*hash2)%m;static voidmain(java.lang.String[] args)Main for testing.booleanmightContain(byte[] value)Test if a value is contained in the filter.static intmurmur3(byte[] data, int seed)Compute the murmur hash for a byte array.java.lang.StringtoString()Get the string representation.
-
-
-
Constructor Detail
-
BloomFilter
public BloomFilter()
Create a new BloomFilter with default p = 0.01 and n = 1000.
-
BloomFilter
public BloomFilter(double p, int n)Create a new BloomFilter.- Parameters:
p- The acceptable false positive rate.n- The expected number of entries.
-
BloomFilter
public BloomFilter(int m, int k, int n)Create a new BloomFilter.- Parameters:
m- The number of bits in the filter.k- The number of hashes used.n- The expected number of values in the filter.
-
-
Method Detail
-
add
public boolean add(byte[] value)
Add a value to the filter.- Parameters:
value- The value.
-
clear
public void clear()
Clear the filter content.
-
mightContain
public boolean mightContain(byte[] value)
Test if a value is contained in the filter.- Parameters:
value- The value.- Returns:
- True, if value might be contained.
-
computeOptimalM
public static int computeOptimalM(long n, double p)Compute optimal size of bloom filter in bits by expected number of elements and acceptable false positive rate.- Parameters:
n- Expected number of elements.p- Acceptable false positive rate.- Returns:
- The optimal size of the bloom filter in bits.
-
computeOptimalK
public static int computeOptimalK(long n, long m)Compute optimal number of hash functions using expected number of elements and size of filter in bits.- Parameters:
n- Expected number of elements inserted in the bloom filterm- The size of the bloom filter in bits.- Returns:
- The optimal number of hash functions.
-
computeAcceptableN
public static int computeAcceptableN(long k, long m)Compute acceptable amount of elements with given number of hashes and size of filter in bits.- Parameters:
k- Number of hashes.m- The size of the bloom filter in bits.- Returns:
- Acceptable number elements that can be inserted.
-
computeP
public static double computeP(long k, long m, double insertedElements)Compute best-case (uniform hash function) false positive probability.- Parameters:
k- Number of hashes.m- The size of the bloom filter in bits.n- Number of elements inserted in the filter.- Returns:
- The expected false positive probability.
-
hashK
public static int[] hashK(byte[] value, int m, int k)Compute k hashes based on two hashes using (hash1+i*hash2)%m;- Parameters:
value- The byte array.m- The maximum number of values.k- The number of hash values to compute.- Returns:
- K hashes.
-
murmur3
public static int murmur3(byte[] data, int seed)Compute the murmur hash for a byte array.
-
toString
public java.lang.String toString()
Get the string representation.- Overrides:
toStringin classjava.lang.Object
-
main
public static void main(java.lang.String[] args)
Main for testing.
-
-