14 packages returned for Tags:"Tokenizer"

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks...
WHtmlParser is a fast HTML code parser that is not subject to the limitations of parsers that follow the W3C specification. If you want to parse and modify HTML code in an object-oriented way, I recommend using the WQuery library, which wraps the parser in an object that shares user-friendly...
WQuery lets you parse and then edit HTML code through a fluent interface, much like the jQuery library. WQuery is part of the Wojdav Bootstrap Mvc package. Its HTML parsing is based on the WHtmlParser library. For now, WHtmlParser contains some...