15 packages returned for Tags:"tokenization"

Lexepars.Grammars.Yaml
by: Dimigur
- .NET 5.0 .NET Core 2.0 .NET Standard 2.0 .NET Framework 4.6.1
- 69,905 total downloads
- last updated 9/11/2019
- Latest version: 0.1.0
- yaml parsing lexer tokenization grammar validation
YAML grammar based on Lexepars parser lib.
Lexepars.Grammars.Json
by: Dimigur
- .NET 5.0 .NET Core 2.0 .NET Standard 2.0 .NET Framework 4.6.1
- 33,132 total downloads
- last updated 7/27/2019
- Latest version: 1.0.0
- json parsing lexer tokenization
JSON grammar based on Lexepars parser lib.
Lexepars.TestFixtures
by: Dimigur
- .NET 5.0 .NET Core 3.1
- 69,713 total downloads
- last updated 12/29/2021
- Latest version: 1.3.0
- monadic parser combinator parsing lexer tokenization parsec fparsec
CyberSource.Flex.Server
by: CyberSource
- Deprecated
- .NET Framework
- 37,596 total downloads
- last updated 1/15/2019
- Latest version: 0.2.2
- CyberSource tokenization tokenisation tokens token
SDK for server-side integration with CyberSource Flex API
Stanford.NLP.Segmenter
by: sergey_tihon
- Deprecated
- .NET 5.0 .NET Core 3.1 .NET Framework 4.6.1
- 23,357 total downloads
- last updated 8/25/2022
- Latest version: 4.2.0.2
- nlp stanford segmenter tokenization splitting IKVM
Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes like possessives. Other languages...
Microsoft.KernelMemory.AI.TikToken
by: Microsoft MicrosoftOCTO
- Deprecated
- .NET 8.0
- 94,511 total downloads
- last updated 7/9/2024
- Latest version: 0.66.240709.1
- TikToken Tokenization BPE GPT4 GPT Memory RAG Kernel Semantic Episodic
Provide TikToken tokenizers in Kernel Memory
Lexepars
by: Dimigur
- .NET 5.0 .NET Core 3.0 .NET Standard 2.1
- 18,513 total downloads
- last updated 12/29/2021
- Latest version: 1.3.0
- monadic parser combinator parsing lexer tokenization parsec fparsec
Concise monadic parser combinator library with separate lexer/parser phases and big-size input support.
McBits.Tokenization
by: McBits
- .NET 5.0 .NET Core 2.0 .NET Standard 2.0 .NET Framework 4.6.1
- 6,026 total downloads
- last updated 12/22/2018
- Latest version: 2.0.0
- mcbits tokenization
Extract tokens from a string of text for use with NLP tools or statistical analysis.
Alumis.Text.Tokenization
by: alumis
- .NET 5.0 .NET Core 2.0 .NET Standard 2.0 .NET Framework 4.6.1
- 6,119 total downloads
- last updated 4/24/2019
- Latest version: 1.0.9
- text tokenization unicode
Text tokenization based on Unicode grapheme clustering, and the XID_Start and XID_Continue binary properties.
TextMatch
by: cris-almodovar
- .NET Framework
- 11,697 total downloads
- last updated 11/22/2015
- Latest version: 0.6.5
- text search match fuzzy wildcard tokenization stemming
TextMatch is a library for searching inside texts using Lucene query expressions. Supports all types of Lucene query expressions - boolean, wildcard, fuzzy. Options are available for tweaking tokenization, such...
MetaParser
by: dsisco11
- 4,231 total downloads
- last updated 2/23/2023
- Latest version: 1.0.7
- parsing tokenization tokenizer text parser
rosette_api
by: basistech
- .NET Framework 4.5
- 144,988 total downloads
- last updated 7/10/2024
- Latest version: 1.30.0
- RosetteAPI address similarity analyze language babel street categories coreference entity
.NET (C#) binding for the Rosette API
Universal.Common.Language
by: ong.andrew
- .NET 5.0 .NET Core 2.0 .NET Standard 2.0 .NET Framework 4.6.1
- 14,725 total downloads
- last updated 8/3/2022
- Latest version: 1.2.1
- Universal Common Language Sentence Component Extensions Bag Of Words Tokenization
Class library for working with general language constructs.
Cynic-Magnit.Tokenization
by: catapart
- .NET 6.0
- 1,124 total downloads
- last updated 10/20/2022
- Latest version: 1.0.1
- string parse parsing token tokenizer tokenization
Tokenize strings into custom tokens using ordered regex operations.
UralicNLP
by: mikahama
- .NET 8.0
- 170 total downloads
- last updated 2/21/2024
- Latest version: 1.0.0
- nlp tokenization lemmatization English Finnish Spanish French Italian German Russian
An NLP library for many languages.