The Crawler-Lib Engine is a general-purpose, workflow-enabled task processor. It evolved from a web crawler by way of data mining and information retrieval. It is optimized for throughput and can perform thousands of tasks per second on standard hardware. Its workflow capabilities let you structure and parallelize even complex kinds of work. Please visit the project page for a complete overview of the Crawler-Lib Engine.
A license for the Anonymous Edition is included in the package. A license for the more powerful free Community Edition can be generated on the project page. An unrestricted license is also available.
Install-Package CrawlerLib.Engine -Version 2.3.5544.21265
dotnet add package CrawlerLib.Engine --version 2.3.5544.21265
<PackageReference Include="CrawlerLib.Engine" Version="2.3.5544.21265" />
paket add CrawlerLib.Engine --version 2.3.5544.21265
HttpResponse.CharacterSet now has a setter, so the encoding can be changed after the request is analyzed.
TaskResultBase has a new method Process(), which can be overridden to implement result processing.
The engine can call this method either directly or with a TPL Task. This is controlled by the property TaskResultProcessing,
which defaults to TaskResultProcessingEnum.EnqueueFinishedTasks.
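The difference between calling a processing method directly and running it as a TPL Task can be sketched as follows. This is only an illustration of the general pattern: `ResultBase`, `PageResult`, `ProcessingMode`, and `Dispatcher` are hypothetical stand-ins written for this example and are not the Crawler-Lib types, whose actual members and enum values may differ.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins for illustration only; NOT the Crawler-Lib API.
enum ProcessingMode { Direct, AsTplTask }

abstract class ResultBase
{
    // Derived classes override this to implement their result processing.
    public virtual void Process() { }
}

class PageResult : ResultBase
{
    public string Url = "";
    public bool Processed;

    public override void Process() => Processed = true;
}

static class Dispatcher
{
    // Runs Process() inline on the caller's thread, or hands it to the thread pool.
    public static Task Dispatch(ResultBase result, ProcessingMode mode)
    {
        if (mode == ProcessingMode.Direct)
        {
            result.Process();
            return Task.CompletedTask;
        }
        return Task.Run(() => result.Process());
    }
}

static class Program
{
    static async Task Main()
    {
        var a = new PageResult { Url = "http://example.com/a" };
        var b = new PageResult { Url = "http://example.com/b" };

        await Dispatcher.Dispatch(a, ProcessingMode.Direct);
        await Dispatcher.Dispatch(b, ProcessingMode.AsTplTask);

        Console.WriteLine($"{a.Url}: {a.Processed}, {b.Url}: {b.Processed}");
    }
}
```

Direct calls avoid scheduling overhead but occupy the calling thread for as long as processing takes; handing the work to a Task keeps the caller free at the cost of a thread-pool hop.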
The Retry workflow element supports SetRetryWork() and ResetRetryWork() within any handler of the child objects.
Fixed: Wrong parenting in Retry workflow element
Introduced some header properties in HttpRequest
Fixed: Crawler blocks after 600 tasks in the community edition
Breaking: AddLimiter() now takes the name from LimiterConfig.
Minor fixes: Calculate workflow element, Quota throughput limiter
* New workflow element: Calculate - assembles a result from multiple parallel parts (like a Group, but with a result)
* Fixed: Workflow parents that don't start children blocked the task
* Refactoring: Removed DNS, TickTimestamp, TickTimeSpan
* Fixed and extended 'Work' workflow element
* Breaking: Replaced the licensing system; old licenses no longer work. Generate new ones on the Crawler-Lib homepage.
* Fixed ClickOnce installer problems
* Fix parenting error in Retry element
* Unrestricted License available
* Added additional constructors to several workflow elements, so you can construct and use them without specifying a complete configuration object for the element.
* Added an AwaitProcessingEnum awaitProcessing parameter to several workflow element constructors, so you can specify that the continuation is called on failure and check the Success property to decide what to do.
* The workflow elements are awaitable since this release.
* New workflow elements for limits and operation cost calculation have been added.
* A large number of small extensions and refactorings
Version 2.00 - 2.01 - 2.02:
* First public releases
NuGet packages (1)
Showing the top 1 NuGet packages that depend on CrawlerLib.Engine:
The Crawler-Lib Engine Test Helper simplifies testing tasks. It can be used to develop unit tests and integration tests for tasks.