NipahTokenizer 3.1.1
dotnet add package NipahTokenizer --version 3.1.1
NuGet\Install-Package NipahTokenizer -Version 3.1.1
<PackageReference Include="NipahTokenizer" Version="3.1.1" />
paket add NipahTokenizer --version 3.1.1
#r "nuget: NipahTokenizer, 3.1.1"
// Install NipahTokenizer as a Cake Addin
#addin nuget:?package=NipahTokenizer&version=3.1.1
// Install NipahTokenizer as a Cake Tool
#tool nuget:?package=NipahTokenizer&version=3.1.1
Nipah Tokenizer
Installing
dotnet add package NipahTokenizer
Using
First, set up the tokenizer options as needed; you can use the defaults as follows:
var tokenizerOptions = TokenizerOptions.Default;
Then create a new Tokenizer instance and call "Tokenize", passing the tokenizer options as an argument:
var tokenizer = new Tokenizer();
var tokens = tokenizer.Tokenize("any-text-here", tokenizerOptions);
You can then iterate over the resulting tokens as needed.
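Putting the steps above together, a minimal end-to-end sketch might look like this (the input string and the use of the token's default string representation are illustrative assumptions; inspect the library's Token type for its exact members):

```csharp
using System;
using NipahTokenizer;

class Program
{
    static void Main()
    {
        // Use the default options, as shown above
        var tokenizerOptions = TokenizerOptions.Default;

        // Create the tokenizer and tokenize some sample text
        var tokenizer = new Tokenizer();
        var tokens = tokenizer.Tokenize("var x = 42;", tokenizerOptions);

        // Iterate over the resulting tokens
        foreach (var token in tokens)
            Console.WriteLine(token);
    }
}
```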
Product | Compatible and additional computed target framework versions |
---|---|
.NET | net6.0 is compatible. net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, and net8.0-windows were computed. |
Dependencies
net6.0: No dependencies.
NuGet packages (1)
Showing the top 1 NuGet packages that depend on NipahTokenizer:
Package | Downloads |
---|---|
Nipah.Markdown (a library for simple parsing of Markdown) |
GitHub repositories
This package is not used by any popular GitHub repositories.
Release notes
- Refactored code again.
- More optimizations and bug fixes.
- Added an option to tokenize larger texts in parallel (up to +50% speed); tokenization becomes slightly less accurate but remains good.
- Switched to a struct split item (less heap memory usage).
- Switched to using chars directly instead of Regex (+80% speed).
- Fast fix.