Implementing a Bloom Filter in Python - CFANZ编程社区

import hashlib

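class BloomFilter:
    """A simple Bloom filter backed by a list of booleans."""

    def __init__(self, size, hash_functions):
        self.size = size                      # number of bits in the filter
        self.bit_array = [False] * size       # all bits start unset
        self.hash_functions = hash_functions  # callables mapping an item to an int

    def add(self, item):
        """Set the bit selected by each hash function."""
        for func in self.hash_functions:
            index = func(item) % self.size
            self.bit_array[index] = True

    def might_contain(self, item):
        """Return False if item was never added, True if it may have been."""
        for func in self.hash_functions:
            index = func(item) % self.size
            if not self.bit_array[index]:
                return False
        return True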

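# Three example hash functions. Each hashes the UTF-8 bytes of a string
# and returns the digest as an integer.
def hash_function1(item):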
    m = hashlib.md5()
    m.update(item.encode())
    return int(m.hexdigest(), 16)

def hash_function2(item):
    m = hashlib.sha1()
    m.update(item.encode())
    return int(m.hexdigest(), 16)

def hash_function3(item):
    m = hashlib.sha256()
    m.update(item.encode())
    return int(m.hexdigest(), 16)

# Test the Bloom filter
bf = BloomFilter(1000, [hash_function1, hash_function2, hash_function3])
bf.add("apple")
bf.add("banana")
bf.add("orange")

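print(bf.might_contain("apple"))  # True: an added item is always reported as present
print(bf.might_contain("grape"))  # most likely False, though false positives are possible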

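In this implementation, the BloomFilter class represents the Bloom filter. The constructor takes the filter's size (the number of bits) and a list of hash functions. The add method inserts an element by applying each hash function to it and setting the bit at the resulting index, taken modulo the size. The might_contain method checks whether an element may be in the filter by testing whether the bit for every hash function is set. If any of those bits is unset, the element was definitely never added. If all of them are set, the element is probably present, but the result can be a false positive.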
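To make the bit-setting step concrete, the following is a minimal standalone sketch that computes the three bit positions for an item and then replays what add and might_contain do. The helper name bit_indices and the constant SIZE are illustrative assumptions, not part of the original code. The hash functions and the size of 1000 mirror the test above.

```python
import hashlib

SIZE = 1000  # assumed: same filter size as in the test above


def bit_indices(item, size=SIZE):
    """Return the bit positions that MD5, SHA-1 and SHA-256 map an item to."""
    data = item.encode()
    return [
        int(make_hash(data).hexdigest(), 16) % size
        for make_hash in (hashlib.md5, hashlib.sha1, hashlib.sha256)
    ]


# Replay add(): set every bit selected for each inserted word.
bits = [False] * SIZE
for word in ("apple", "banana", "orange"):
    for index in bit_indices(word):
        bits[index] = True


def might_contain(item):
    """Replay might_contain(): every selected bit must be set."""
    return all(bits[i] for i in bit_indices(item))


print(bit_indices("apple"))    # three positions in range(SIZE)
print(might_contain("apple"))  # True, because its own bits were just set
print(sum(bits))               # at most 9 bits set (3 items x 3 hash functions)
```

Because an inserted item always finds its own bits set, the filter never produces a false negative. A false positive happens only when all positions for an unseen item were already set by other items.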

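The three hash functions shown here are simple examples. In a real application, you may need stronger hash functions, or you may want to use an existing hashing library to supply a larger number of distinct hash functions.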
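One possible way to get an arbitrary number of hash functions from a single library is to salt one algorithm differently for each function. The sketch below does this with hashlib.blake2b from the standard library. The factory name make_hash_functions is an illustrative assumption, and this is only one option among several, not the approach taken by the original code.

```python
import hashlib


def make_hash_functions(k):
    """Return k hash functions, each a BLAKE2b hash with a distinct salt."""

    def make(i):
        # A distinct salt per function; BLAKE2b accepts salts up to 16 bytes.
        salt = i.to_bytes(8, "little")

        def hash_function(item):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=salt
            ).digest()
            return int.from_bytes(digest, "big")

        return hash_function

    return [make(i) for i in range(k)]


funcs = make_hash_functions(5)
indices = [f("apple") % 1000 for f in funcs]
print(indices)  # five bit positions in range(1000)
```

Each returned function takes a string and returns an integer, so the list can be passed directly as the hash_functions argument, for example BloomFilter(1000, make_hash_functions(5)).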