87 packages returned for Tags:"crawler"

This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world"...
Deprecated, as there is a new maintainer for the original HAP project. Please check the new repo at https://github.com/zzzprojects/html-agility-pack. This is a port of the HtmlAgilityPack library, created by Simon Mourier and Jeff Klawiter, for the .NET Core platform. This NuGet package can be used with...
Abot is an open source C# web crawler built for speed and flexibility. It takes care of the low-level plumbing (multithreading, HTTP requests, scheduling, link parsing, etc.). You just register for events to process the page data. You can also plug in your own implementations of core interfaces to...
Crawler-Lib NHunspell is a spell check, hyphenation, word stemming and thesaurus library based on the Open Office spell check library Hunspell. NHunspell can use the vast amount of OpenOffice dictionaries. It is an alternative to NetSpell, GNU Aspell, ISpell, PSpell and Enchant. It wraps the native...
A powerful C# web crawler that makes advanced crawling features easy to use. AbotX builds upon the open source Abot C# Web Crawler by providing a powerful set of wrappers and extensions.
DotnetSpider is a .NET Standard web crawling library. It is a lightweight, efficient, and fast high-level web crawling and scraping framework.
This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world"...
  • 5,785 total downloads
  • last updated 3/25/2020
  • Latest version: 3.0.3
  • web crawler
Spidey is a library designed to help with crawling and parsing web content.
  • 8,271 total downloads
  • last updated 11/21/2016
  • Latest version: 0.1.13-beta
  • crawler robot spider
.NET Core port of sjdirect/abot. Abot is an open source C# web crawler built for speed and flexibility. It takes care of the low-level plumbing (multithreading, HTTP requests, scheduling, link parsing, etc.). You just register for events to process the page data. You can also plug in your own...
dcsoup is a .NET library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. This library is basically a port of jsoup, a Java HTML parser library. See also: http://jsoup.org/ API reference is...
Aspose.HTML is a cross-platform class library that enables you to perform a wide range of HTML manipulation tasks directly within your .NET applications. Aspose.HTML supports parsing of HTML5, CSS3, SVG and HTML Canvas to construct a Document Object Model (DOM) based on the official W3C...
  • 339 total downloads
  • last updated 1/13/2018
  • Latest version: 0.1.0
  • crawler
A web crawler written with .NET Standard.
This is a port of the HtmlAgilityPack library, created by Simon Mourier and Jeff Klawiter, for the .NET Core platform. This NuGet package can be used with Universal Windows Platform, ASP.NET 5 (using .NET Core), and the full .NET Framework 4.6. Original description: This is an agile HTML parser...