Hash2Vec 1.0.3-alpha

This is a prerelease version of Hash2Vec.

Package Manager:
Install-Package Hash2Vec -Version 1.0.3-alpha

.NET CLI:
dotnet add package Hash2Vec --version 1.0.3-alpha

PackageReference (for projects that support PackageReference, copy this XML node into the project file):
<PackageReference Include="Hash2Vec" Version="1.0.3-alpha" />

Paket CLI:
paket add Hash2Vec --version 1.0.3-alpha

Script & Interactive (the #r directive works in F# Interactive, C# scripting, and .NET Interactive; copy it into the interactive tool or the script's source code):
#r "nuget: Hash2Vec, 1.0.3-alpha"

Cake:
// Install Hash2Vec as a Cake Addin
#addin nuget:?package=Hash2Vec&version=1.0.3-alpha&prerelease

// Install Hash2Vec as a Cake Tool
#tool nuget:?package=Hash2Vec&version=1.0.3-alpha&prerelease

The NuGet Team does not provide support for the Paket and Cake clients. Please contact their maintainers for support.

Quick start in Hash2Vec

Hash2Vec is a tool for converting text into numerical vectors.

The algorithm derives a vector from the morphological structure of a word: it takes the base of the word and encodes that base as a numerical vector.
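The general idea of mapping a word's structure to a fixed-length vector can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm Hash2Vec uses: the function name `HashWord`, the use of character trigrams, and the FNV-1a hash are all hypothetical choices. Only the vector length of 75 comes from the examples in this document.

```csharp
using System;

// Hypothetical helper, NOT part of the Hash2Vec API.
// Hashes every character trigram of a word into one of `size` buckets,
// then scales the bucket counts to unit (L2) length.
static double[] HashWord(string word, int size = 75)
{
    var vector = new double[size];
    var padded = "^" + word.ToLowerInvariant() + "$"; // mark word boundaries

    for (int i = 0; i + 3 <= padded.Length; i++)
    {
        // FNV-1a over the trigram: deterministic across runs,
        // unlike string.GetHashCode().
        uint hash = 2166136261;
        for (int j = i; j < i + 3; j++)
        {
            unchecked
            {
                hash ^= padded[j];
                hash *= 16777619;
            }
        }
        vector[hash % (uint)size] += 1.0;
    }

    // L2-normalize so that vectors of different words are comparable.
    double norm = 0;
    foreach (var x in vector) norm += x * x;
    norm = Math.Sqrt(norm);
    if (norm > 0)
        for (int i = 0; i < size; i++) vector[i] /= norm;

    return vector;
}
```

Words that share most of their trigrams (for example "milk" and "milks") fall into mostly the same buckets, so their vectors end up close to each other. A morphology-aware scheme like the one described above would presumably work on the word's base rather than on raw trigrams.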

Hash2Vec has two operating modes: a fuzzy search mode and a machine learning mode for solving classification problems.

Hash2Vec supports several languages for vectorizing text:

  • Russian
  • English
  • French
  • Russian & English

Hash2Vec Documentation:

Example: vectorizing text

var hash2VecBuild = new Hash2VecBuild(inputFile, outputFile) { WithBinary = binary };
hash2VecBuild.BuildNormalizationVector(new Hash2VecToRussian()); // Vectorize Russian text
hash2VecBuild.BuildNormalizationVector(new Hash2VecToEnglish()); // Vectorize English text
hash2VecBuild.BuildNormalizationVector(new Hash2VecToFrench());  // Vectorize French text
hash2VecBuild.BuildNormalizationVector(new Core.Hash2Vec(size: 75)); // Vectorize Russian & English text (default length 75)
// Build() does not apply vector normalization by default
hash2VecBuild.Build(size: 75); // Vectorize Russian & English text (default length 75)

Example: normalizing vectors

var normalizationVectors = new NormalizationVectors();
var vectors = normalizationVectors.LoadVectors("InputFile"); // Load the vectors to normalize
normalizationVectors.NormalizationVector(ref vectors); // Normalize the vectors
normalizationVectors.CheckHashVector(ref vectors);     // Check the vector hashes

Example: testing vector distance

// Requires: using System; using System.Linq;
var vocabulary = new Hash2VecBinaryReader().Read("inputFile"); // File produced by the vectorization step
var distanceList = vocabulary.Distance("milks", 50, 0).ToList();
distanceList.ForEach(dis =>
    Console.WriteLine("{0}\t\t ||{1,10:F6}", dis.Representation.WordOrNull, dis.DistanceValue));

Example: testing fuzzy search in Hash2Vec

var input = "молоко домик в деревне";
var name = "молоко в деревне";
var distHash2Vec = input.Hash2VecDistanceCorrect(name);
if (distHash2Vec > 0.55) // A score above 0.55 counts as a good match
    Console.WriteLine("\t{0:###,###.00000} against {1}", distHash2Vec, name);
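To make the thresholding step concrete, here is a minimal, self-contained sketch of a fuzzy phrase comparison. It does not reproduce `Hash2VecDistanceCorrect`, whose scoring method is not documented here. It substitutes a plain Dice coefficient over character trigrams, and the names `Trigrams`, `DiceSimilarity`, and `Threshold` are hypothetical. Only the two phrases and the 0.55 cut-off are taken from the example above, and the two scores are not necessarily on the same scale.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the set of character trigrams in a lower-cased phrase.
static HashSet<string> Trigrams(string text)
{
    var grams = new HashSet<string>();
    var s = text.ToLowerInvariant();
    for (int i = 0; i + 3 <= s.Length; i++)
        grams.Add(s.Substring(i, 3));
    return grams;
}

// Dice coefficient in [0, 1]: 2 * |A ∩ B| / (|A| + |B|).
// This is a stand-in metric, not the one Hash2Vec uses.
static double DiceSimilarity(string left, string right)
{
    var a = Trigrams(left);
    var b = Trigrams(right);
    if (a.Count + b.Count == 0) return 0.0;

    int shared = 0;
    foreach (var gram in a)
        if (b.Contains(gram)) shared++;

    return 2.0 * shared / (a.Count + b.Count);
}

// Same cut-off as the example above (an assumption for this metric).
const double Threshold = 0.55;

var score = DiceSimilarity("молоко домик в деревне", "молоко в деревне");
if (score > Threshold)
    Console.WriteLine("{0:F5} counts as a match", score);
```

The structure mirrors the Hash2Vec example: compute one similarity score for the pair of phrases, then accept the pair only when the score clears a fixed threshold.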

Version       Downloads   Last updated
1.0.3-alpha   51          4/15/2021
1.0.2-alpha   124         11/23/2020
1.0.1-alpha   183         10/17/2020

Preview version Hash2Vec