Eliassen.Apache.Tika
0.1.85
dotnet add package Eliassen.Apache.Tika --version 0.1.85
NuGet\Install-Package Eliassen.Apache.Tika -Version 0.1.85
<PackageReference Include="Eliassen.Apache.Tika" Version="0.1.85" />
paket add Eliassen.Apache.Tika --version 0.1.85
#r "nuget: Eliassen.Apache.Tika, 0.1.85"
// Install Eliassen.Apache.Tika as a Cake Addin
#addin nuget:?package=Eliassen.Apache.Tika&version=0.1.85
// Install Eliassen.Apache.Tika as a Cake Tool
#tool nuget:?package=Eliassen.Apache.Tika&version=0.1.85
Summary
The Eliassen.Apache.Tika library provides functionality for content type detection and document conversion using Apache Tika. It offers a set of classes and methods for integrating Apache Tika services with .NET applications.
Installation
To use this library in your .NET project, add a reference to the Eliassen.Apache.Tika NuGet package.
Usage
Content Type Detection
The TikaContentTypeDetector class provides methods for asynchronously detecting the content type of a stream using Apache Tika.
using Eliassen.Apache.Tika;
// Detect content type asynchronously
string contentType = await TikaContentTypeDetector.DetectContentTypeAsync(stream);
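Detection reads from the stream, so a caller that goes on to convert the same stream may need to rewind it first. The following is a minimal sketch of that pattern and is not part of the library: `ContentTypeDetection`, `DetectAndRewindAsync`, and the `detect` delegate are hypothetical names. The delegate stands in for the `DetectContentTypeAsync` call shown above, so the helper compiles without the package. Whether the detector restores the stream position itself is not stated in this README; the sketch assumes it does not.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ContentTypeDetection
{
    // Hypothetical helper (not part of Eliassen.Apache.Tika).
    // `detect` stands in for the library call, for example:
    //   s => TikaContentTypeDetector.DetectContentTypeAsync(s)
    public static async Task<string> DetectAndRewindAsync(
        Stream stream, Func<Stream, Task<string>> detect)
    {
        if (stream is null) throw new ArgumentNullException(nameof(stream));
        if (detect is null) throw new ArgumentNullException(nameof(detect));
        if (!stream.CanSeek)
            throw new NotSupportedException("A seekable stream is required to rewind after detection.");

        long start = stream.Position;
        try
        {
            return await detect(stream);
        }
        finally
        {
            // Restore the original position so the stream can be read again
            // from the same point, even if detection throws.
            stream.Position = start;
        }
    }
}
```

For a non-seekable source such as a network stream, one option is to copy it into a MemoryStream before calling the helper.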
Document Conversion
The library includes several conversion handlers for converting documents to HTML format using Apache Tika. Each handler supports specific document formats.
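Because each handler covers one source format, calling code needs some way to pick a handler from a detected content type. The sketch below illustrates one such lookup. The handler class names are the ones documented in this section. The MIME-type keys are the standard media types for each format and are an assumption about what each handler accepts. `TikaHandlerLookup` and `TryGetHandlerName` are hypothetical names. The lookup returns names rather than instances, so it compiles without the package.

```csharp
using System;
using System.Collections.Generic;

public static class TikaHandlerLookup
{
    // Keys: standard media types (assumed to match what each handler accepts).
    // Values: handler class names taken from this README.
    private static readonly Dictionary<string, string> HandlerBySourceType =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["application/vnd.openxmlformats-officedocument.wordprocessingml.document"] = "TikaDocxToHtmlConversionHandler",
            ["application/pdf"] = "TikaPdfToHtmlConversionHandler",
            ["application/vnd.oasis.opendocument.text"] = "TikaOdtToHtmlConversionHandler",
            ["application/rtf"] = "TikaRtfToHtmlConversionHandler",
        };

    // Returns true and the handler name when the source type is known;
    // otherwise returns false and an empty name.
    public static bool TryGetHandlerName(string sourceContentType, out string handlerName)
    {
        // Strip any parameters such as "; charset=utf-8" before the lookup.
        string mediaType = (sourceContentType ?? string.Empty).Split(';')[0].Trim();

        if (HandlerBySourceType.TryGetValue(mediaType, out var found))
        {
            handlerName = found;
            return true;
        }

        handlerName = string.Empty;
        return false;
    }
}
```

In a real application the dictionary values would more likely be handler instances or factory delegates. Names are used here only to keep the example independent of the handlers' constructors.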
Word Documents (DOCX)
using Eliassen.Apache.Tika;
// Convert DOCX document to HTML
TikaDocxToHtmlConversionHandler handler = new TikaDocxToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
PDF Documents
using Eliassen.Apache.Tika;
// Convert PDF document to HTML
TikaPdfToHtmlConversionHandler handler = new TikaPdfToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
OpenDocument Text (ODT) Documents
using Eliassen.Apache.Tika;
// Convert ODT document to HTML
TikaOdtToHtmlConversionHandler handler = new TikaOdtToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
Rich Text Format (RTF) Documents
using Eliassen.Apache.Tika;
// Convert RTF document to HTML
TikaRtfToHtmlConversionHandler handler = new TikaRtfToHtmlConversionHandler();
await handler.ConvertAsync(sourceStream, sourceContentType, destinationStream, destinationContentType);
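All four handlers are called with the same four arguments, so the stream handling around them can be shared. The sketch below shows one way to collect the converted HTML as a string. It is not part of the library: `HtmlConversion` and `ConvertToHtmlStringAsync` are hypothetical names, and the `convert` delegate stands in for a handler's `ConvertAsync` call so the example compiles without the package. It assumes that `ConvertAsync` returns a `Task` and that the handler writes UTF-8 encoded HTML to the destination stream. Neither is stated in this README.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class HtmlConversion
{
    // Hypothetical helper (not part of Eliassen.Apache.Tika).
    // `convert` mirrors the argument order of the README's ConvertAsync call:
    // (sourceStream, sourceContentType, destinationStream, destinationContentType).
    public static async Task<string> ConvertToHtmlStringAsync(
        Stream source,
        string sourceContentType,
        Func<Stream, string, Stream, string, Task> convert)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (convert is null) throw new ArgumentNullException(nameof(convert));

        // Collect the converted output in memory.
        using var destination = new MemoryStream();
        await convert(source, sourceContentType, destination, "text/html");

        // Assumption: the handler writes UTF-8 encoded HTML to the destination.
        return Encoding.UTF8.GetString(destination.ToArray());
    }
}
```

Under those assumptions, a handler would be plugged in with a lambda such as `(src, srcType, dst, dstType) => handler.ConvertAsync(src, srcType, dst, dstType)`. Buffering the whole result in memory is convenient for small documents. For large ones, writing directly to a file or response stream is likely the better choice.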
Extensibility
Developers can extend the functionality by inheriting from the TikaToHtmlConversionBaseHandler or TikaConversionHandlerBase classes for custom document conversion requirements.
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
- net8.0
  - Eliassen.Documents.Abstractions (>= 0.1.85)
  - Microsoft.Extensions.Configuration.Abstractions (>= 8.0.0)
  - Microsoft.Extensions.DependencyInjection.Abstractions (>= 8.0.1)
  - Microsoft.Extensions.Diagnostics.HealthChecks (>= 8.0.8)
  - Microsoft.Extensions.Http (>= 8.0.0)
  - Microsoft.Extensions.Logging.Abstractions (>= 8.0.1)
  - Microsoft.Extensions.Options.ConfigurationExtensions (>= 8.0.0)
NuGet packages (1)
One NuGet package depends on Eliassen.Apache.Tika:
- Eliassen.Common.Extensions
GitHub repositories
This package is not used by any popular GitHub repositories.