md.bloomfilter 1.0.1

dotnet add package md.bloomfilter --version 1.0.1                
NuGet\Install-Package md.bloomfilter -Version 1.0.1                
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="md.bloomfilter" Version="1.0.1" />                
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add md.bloomfilter --version 1.0.1                
#r "nuget: md.bloomfilter, 1.0.1"                
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
// Install md.bloomfilter as a Cake Addin
#addin nuget:?package=md.bloomfilter&version=1.0.1

// Install md.bloomfilter as a Cake Tool
#tool nuget:?package=md.bloomfilter&version=1.0.1                

Bloom Filter

PM> NuGet\Install-Package md.bloomfilter

A Bloom Filter is a space efficient, probabilistic data structure.
Storing a value as only several bits - after running it though several hashing algorithms.

Bloom Filter Data Structure

Add items to a Bloom Filter, then query the filter later to see if the item was (potentially) previously added.
Once created the filter cannot be resized, once an item is added it cannot be removed.
The filter very memory efficient, it only uses a few bytes to store an object; and those bytes are shared.
The filter can tell you with 99% accuracy if an item was previously added (i.e. it has a 1% false positive rate).

A Bloom Filter can tell you if an item is definitely not in the filter, and with 99% certainty, tell you if it has seen the item before.
It can't be 100% certain - as hash functions have collisions (i.e. two different values with the same hash).
Use a filter over an expensive resource to check if it's likey there or not.
For example as a blacklist, on a match you go to a database to make sure the item is actually blacklisted.

Type Implementations

This project contains two explicit types of Bloom Filters - integer, and string.
To create a specific type of Bloom Filter, inherent from AbstractBloomFilter<T>, and implement the Hash function.

Note a good hash function for a Bloom Filter should be fast as possible, with each value of T uniformly distributed across the int32 range.
Cryptographic hashes are a poor choice, as the size of the filter grows; the speed to compute hashes grows too.
A good example of a non-cryptographic hash function would be MurmurHash3.

If you don't want to roll your own type's hash, then find a way to represent your type as an int, or string, and use the provided filters.
The values used as input to the hash function should not change over time, otherwise future matches will be incorrect.

IntFilter

// The source of items we want to add to the filter: numbers 97 - 122 (inclusive).
var items = Enumerable.Range(97, 26).ToArray();

// The integer bloom filter.
var filter = new IntegerFilter(items);

// These items where added to the filter.
Assert.True(filter.Contains(97));
Assert.True(filter.Contains(122));

// These items where not added (outside of the range).
Assert.False(filter.Contains(96));
Assert.False(filter.Contains(123));

StringFilter

// The source of items we want to add to the filter: characters a - z (inclusive).
var items = Enumerable.Range(97, 26).Select(c => Convert.ToChar(c).ToString()).ToArray();

// The string bloom filter.
var filter = new StringFilter(items);

// These items where added to the filter.
Assert.True(filter.Contains("a"));
Assert.True(filter.Contains("z"));

// These items where not added (outside of the range).
Assert.False(filter.Contains("`"));
Assert.False(filter.Contains("{"));

IReadonlyFilter<T>

Any Bloom Filter can be changed to a readonly collection by calling the base method Readonly().
This prevents users of this instance from adding any new items to the filter.

// The filter's source.
var items = Enumerable.Range(97, 26).ToArray();

// The bloom filter.
var filter = new IntegerFilter(items);

// The readonly instance of the bloom filter.
var readonlyFilter = filter.Readonly();

Credits

Changelog

1.0.0

  • Added Bloom Filter implementations for integer, and string types (i.e. their hash functions).

1.0.1

  • Moved exceptions into their own class for performance.
  • Added some missing XML comments on the types.
Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 was computed.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 was computed.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 was computed. 
.NET Framework net461 was computed.  net462 was computed.  net463 was computed.  net47 was computed.  net471 was computed.  net472 was computed.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.
  • .NETStandard 2.0

    • No dependencies.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last updated
1.0.1 120 6/1/2024
1.0.0 119 5/19/2024

Moved exceptions into their own class for performance.
Added some missing XML comments on the types.