Html2Xhtml is a .NET 4.0 library for converting HTML to XHTML, licensed under GPLv2 or later.
I tested Html2Xhtml on a local reconstruction of a large online database of the European Union. Tidy/Tidy.NET failed to produce valid output most of the time, and Chilkat's HTML-to-XML was somewhat slow and produced strange results (misplaced, missing, or inexplicable elements). I created this library in an attempt to find a free, fast, and reliable conversion tool. It converts 2-4x faster than the other libraries I tested.
Html2Xhtml, combined with the power of LINQ to XML, is an excellent tool for all large-scale data extraction and web crawling scenarios.
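As a rough illustration of the LINQ to XML side of that workflow, the sketch below pulls every link target out of a document that has already been converted to well-formed XHTML. The conversion call itself is deliberately left out, because this page does not document the Html2Xhtml API. The `XhtmlLinks` class and `ExtractLinks` method are hypothetical names used only for this example, and the code assumes the root element declares the standard XHTML namespace and that the input has no DOCTYPE or named entities.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of Html2Xhtml.
public static class XhtmlLinks
{
    // Elements of an XHTML document live in this namespace
    // when the root element declares it as the default.
    private static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // Assumes 'xhtml' is already well-formed XHTML, for example the
    // output of an HTML-to-XHTML converter.
    public static string[] ExtractLinks(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml);

        return doc.Descendants(Xhtml + "a")                  // every <a> element
                  .Select(a => (string)a.Attribute("href"))   // null if href is absent
                  .Where(href => !string.IsNullOrEmpty(href)) // skip anchors without a target
                  .ToArray();
    }
}
```

Because the query runs against a parsed `XDocument`, the same pattern extends to tables, headings, or any other element by changing the element name and the projection.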
Install-Package Html2Xhtml -Version 184.108.40.206
dotnet add package Html2Xhtml --version 220.127.116.11
<PackageReference Include="Html2Xhtml" Version="18.104.22.168" />
paket add Html2Xhtml --version 22.214.171.124
#r "nuget: Html2Xhtml, 126.96.36.199"
// Install Html2Xhtml as a Cake Addin
#addin nuget:?package=Html2Xhtml&version=188.8.131.52
// Install Html2Xhtml as a Cake Tool
#tool nuget:?package=Html2Xhtml&version=184.108.40.206
This package has no dependencies.