A Bloom filter is a space-efficient probabilistic data structure for testing whether an element is a member of a set. False positives are possible, but false negatives are not: a lookup answers either "possibly in the set" or "definitely not in the set". In this example, we'll implement a generic Bloom filter in Swift.
import Foundation
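// A generic Bloom filter parameterized over the element type T.
struct BloomFilter<T> {
    private var bitArray: [Bool]
    private let size: Int
    private let hashFunctions: [(T) -> Int]

    init(size: Int, hashFunctions: [(T) -> Int]) {
        precondition(size > 0, "size must be positive")
        self.size = size
        self.bitArray = Array(repeating: false, count: size)
        self.hashFunctions = hashFunctions
    }

    // Maps a hash value, which may be negative, to a valid index in 0..<size.
    private func index(for hash: Int) -> Int {
        let remainder = hash % size
        return remainder >= 0 ? remainder : remainder + size
    }

    // Sets the bit selected by each hash function.
    mutating func insert(_ item: T) {
        for hashFunction in hashFunctions {
            bitArray[index(for: hashFunction(item))] = true
        }
    }

    // Returns false if the item was definitely never inserted;
    // true means it may have been inserted.
    func contains(_ item: T) -> Bool {
        for hashFunction in hashFunctions {
            if !bitArray[index(for: hashFunction(item))] {
                return false
            }
        }
        return true
    }
}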
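// Example hash functions
// Note: hashValue is seeded per process, so results differ between runs.
func hash1(_ input: String) -> Int {
    return input.hashValue
}

func hash2(_ input: String) -> Int {
    // &* wraps on overflow; plain * would trap for large hash values.
    return (input.hashValue &* 2) % 100
}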
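The two example functions above both derive from the same `hashValue`, so the bits they select are correlated. One possible alternative, sketched here as an assumption and not as part of the original example, is to build several hash functions by mixing a different salt into the standard library's `Hasher`. The helper name `makeSaltedHashFunctions` is hypothetical.

```swift
// Hypothetical helper: returns `count` hash functions, each mixing a
// distinct integer salt into Swift's Hasher before hashing the item.
func makeSaltedHashFunctions<T: Hashable>(count: Int) -> [(T) -> Int] {
    return (0..<count).map { (salt: Int) -> (T) -> Int in
        return { (item: T) -> Int in
            var hasher = Hasher()
            hasher.combine(salt)
            hasher.combine(item)
            // The result may be negative; callers must map it to a valid index.
            return hasher.finalize()
        }
    }
}

// Three salted hash functions for String elements.
let stringHashes: [(String) -> Int] = makeSaltedHashFunctions(count: 3)
```

Like `hashValue`, `Hasher` is randomly seeded per process, so a filter built this way is only meaningful within a single run and should not be persisted.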
// Using the Bloom Filter
var bloomFilter = BloomFilter(size: 100, hashFunctions: [hash1, hash2])
bloomFilter.insert("Hello")
print(bloomFilter.contains("Hello")) // true
print(bloomFilter.contains("World")) // most likely false, but a false positive is possible
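To get a feel for how likely a false positive is, the standard approximation p ≈ (1 − e^(−kn/m))^k can be evaluated directly, where m is the number of bits, k the number of hash functions, and n the number of inserted items. The sketch below assumes independent, uniformly distributed hash functions, which the `hash1`/`hash2` pair above does not strictly satisfy, so treat the result as a rough estimate. The function names are illustrative only.

```swift
import Foundation

// Approximate false-positive probability: (1 - e^(-k * n / m))^k.
// Assumes independent, uniformly distributed hash functions.
func estimatedFalsePositiveRate(bitCount m: Int, hashCount k: Int, itemCount n: Int) -> Double {
    let exponent = -Double(k) * Double(n) / Double(m)
    return pow(1.0 - exp(exponent), Double(k))
}

// Number of hash functions that minimizes the estimate: (m / n) * ln 2,
// rounded to the nearest integer and clamped to at least 1.
func suggestedHashCount(bitCount m: Int, itemCount n: Int) -> Int {
    let k = (Double(m) / Double(n)) * log(2.0)
    return max(1, Int(k.rounded()))
}

// The filter from the example: 100 bits, 2 hash functions, 1 inserted item.
let exampleRate = estimatedFalsePositiveRate(bitCount: 100, hashCount: 2, itemCount: 1)
print(exampleRate)
```

For the 100-bit filter above with a single inserted item, the estimate is well below one percent, but it grows quickly as more items are inserted.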