GroupDocs.Parser-Cloud 23.7.0

.NET CLI
dotnet add package GroupDocs.Parser-Cloud --version 23.7.0

Package Manager
NuGet\Install-Package GroupDocs.Parser-Cloud -Version 23.7.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference
<PackageReference Include="GroupDocs.Parser-Cloud" Version="23.7.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI
paket add GroupDocs.Parser-Cloud --version 23.7.0

Script & Interactive
#r "nuget: GroupDocs.Parser-Cloud, 23.7.0"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or the source code of the script to reference the package.

Cake
// Install GroupDocs.Parser-Cloud as a Cake Addin
#addin nuget:?package=GroupDocs.Parser-Cloud&version=23.7.0

// Install GroupDocs.Parser-Cloud as a Cake Tool
#tool nuget:?package=GroupDocs.Parser-Cloud&version=23.7.0

Document Parsing & Data Extraction API for .NET Cloud


Docs Swagger Examples Blog Support Release Notes Dashboard


GroupDocs.Parser Cloud is a robust REST API designed to streamline document parsing and data extraction in your cloud-based .NET applications. Whether you need to extract text, images, metadata, or structured data using custom templates, this API offers high accuracy, fast processing, and scalability for enterprise-level operations. It supports a wide range of document formats, integrates seamlessly with multiple programming languages, and ensures secure API access with JWT authentication. With features like batch processing, Docker support, and comprehensive SDKs, GroupDocs.Parser Cloud is ideal for any document processing or data extraction task.

General Features

Text Extraction

Extract text from a wide range of document formats.

Document Info Extraction

Extract metadata and other document information such as title, author, and subject.

Image Extraction

Extract images embedded within documents.

Container Items Info Extraction

Extract information from container file formats like ZIP, PST, and OST.

Parse by Template

Parse documents by using custom templates for structured data extraction.

Document Processing Features

Metadata Extraction

Extracts metadata such as author, creation date, etc., from supported file formats.

Template-Based Parsing

Define templates for structured data extraction, ideal for processing forms, invoices, and other structured documents.

Batch Processing

Process multiple documents in a single request, making it efficient for large-scale operations.

Integration Features

RESTful API

Access the parser features via a REST API for easy integration into any platform.

SDK Availability

SDKs available for multiple programming languages including .NET, Java, Python, PHP, and more.

Platform Agnostic

Can be used across various platforms such as Windows, macOS, and Linux.

Security and Authentication

JWT Authentication

Ensures secure API access through JSON Web Token (JWT) authentication.

Client ID and Secret

Use Client ID and Secret for making secure API calls.

Data Encryption

Supports secure and encrypted communication between the client and the API.
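
The JWT flow with a Client ID and Secret described above can be sketched as a plain OAuth 2.0 client-credentials request. This is only an illustration: the token URL below is an assumption about the hosted service, and the class and method names (TokenRequestSketch, Build, BearerHeader) are hypothetical helpers, not part of the SDK. The sketch builds the request without sending it.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

public static class TokenRequestSketch
{
    // Assumption: token endpoint of the hosted API. Verify it against the official docs.
    public const string TokenUrl = "https://api.groupdocs.cloud/connect/token";

    // Builds (but does not send) a client-credentials token request.
    public static HttpRequestMessage Build(string clientId, string clientSecret)
    {
        var form = new Dictionary<string, string>
        {
            ["grant_type"] = "client_credentials",
            ["client_id"] = clientId,
            ["client_secret"] = clientSecret
        };

        // FormUrlEncodedContent URL-encodes the pairs and sets the
        // application/x-www-form-urlencoded content type.
        return new HttpRequestMessage(HttpMethod.Post, TokenUrl)
        {
            Content = new FormUrlEncodedContent(form)
        };
    }

    // Formats the Authorization header value used on later API calls.
    public static string BearerHeader(string accessToken)
    {
        return "Bearer " + accessToken;
    }
}
```

To use it, pass the request to HttpClient.SendAsync, read the token from the JSON response, and send the BearerHeader value in the Authorization header. The SDK normally handles this exchange itself, so the sketch is mainly useful for calling the REST API directly.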

Performance Features

High Accuracy

Provides accurate text extraction using advanced algorithms.

Fast Processing

Optimized for quick data extraction, suitable for high-performance applications.

Scalability

Can handle large volumes of documents efficiently, supporting enterprise-level operations.

Usability Features

Comprehensive Documentation

Extensive documentation and code samples available to help developers get started quickly.

API Explorer

Built-in API explorer for testing and exploring the API functionalities directly in the browser.

Multi-Platform Support

Compatible with various operating systems including Windows, Linux, and macOS.

Deployment and Hosting

Docker Support

Can be deployed in a Docker container for private cloud or on-premises hosting.

Self-Hosting

Allows running the API on your infrastructure with full control over the environment.

Automatic Scaling

Automatically scales to meet varying workloads, ensuring high availability.

Supported Document Formats

The following table indicates the file formats from which GroupDocs.Parser Cloud can extract data.

Table columns: Document Type, File Format, Parse Document by Template, Extract Text, Extract Document Info, Extract Images, Extract Container Items Info. The file formats are grouped below by document type.

Word Processing
    DOC - Microsoft Word Document
    DOT - Microsoft Word Document Template
    DOCX - Office Open XML Document
    DOCM - Office Open XML Macro-Enabled Document
    DOTX - Office Open XML Document Template
    DOTM - Office Open XML Document Macro-Enabled Template
    TXT - Plain Text
    ODT - Open Document Text
    OTT - Open Document Text Template
    RTF - Rich Text Format

PDF
    PDF - Portable Document Format File

Markup
    HTML - Hypertext Markup Language File
    XHTML - Extensible Hypertext Markup Language File
    MHTML - MIME HTML File
    MD - Markdown
    XML - XML File

Ebooks
    CHM - Compiled HTML Help File
    EPUB - Digital E-Book File Format
    FB2 - FictionBook 2.0 File

Spreadsheet
    XLS - Microsoft Excel Spreadsheet
    XLT - Microsoft Excel Template
    XLSX - Office Open XML Spreadsheet
    XLSM - Office Open XML Macro-Enabled Spreadsheet
    XLSB - Office Open XML Binary Spreadsheet
    XLTX - Office Open XML Spreadsheet Template
    XLTM - Office Open XML Macro-Enabled Spreadsheet Template
    ODS - Open Document Spreadsheet
    OTS - Open Document Spreadsheet Template
    CSV - Comma Separated Values
    XLA - Excel Add-In File
    XLAM - Excel Open XML Macro-Enabled Add-In
    NUMBERS - Apple iWork Numbers

Presentations
    PPT - PowerPoint Presentation
    PPS - PowerPoint Slideshow
    POT - PowerPoint Template
    PPTX - Office Open XML Presentation
    PPTM - Office Open XML Macro-Enabled Presentation
    POTX - Office Open XML Presentation Template
    POTM - Office Open XML Macro-Enabled Presentation Template
    PPSX - Office Open XML Presentation Slideshow
    PPSM - Office Open XML Macro-Enabled Presentation Slideshow
    ODP - Open Document Presentation
    OTP - Open Document Presentation Template

Emails
    PST - Outlook Personal Information Store File
    OST - Outlook Offline Data File
    EML - E-Mail Message
    EMLX - Apple Mail Message
    MSG - Outlook Mail Message

Notes
    ONE - OneNote Document

Archives
    ZIP - Zipped File
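
Before uploading a file, a client may want to check whether its extension appears in the format list above. The sketch below is one way to do that locally. The SupportedFormats class and IsListed method are hypothetical helpers, not part of the SDK. The check only tells you that an extension is listed, not which operations the service supports for it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SupportedFormats
{
    // Extensions copied from the format list above; compared case-insensitively.
    private static readonly HashSet<string> Extensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "doc", "dot", "docx", "docm", "dotx", "dotm", "txt", "odt", "ott", "rtf",
            "pdf",
            "html", "xhtml", "mhtml", "md", "xml",
            "chm", "epub", "fb2",
            "xls", "xlt", "xlsx", "xlsm", "xlsb", "xltx", "xltm", "ods", "ots",
            "csv", "xla", "xlam", "numbers",
            "ppt", "pps", "pot", "pptx", "pptm", "potx", "potm", "ppsx", "ppsm",
            "odp", "otp",
            "pst", "ost", "eml", "emlx", "msg",
            "one",
            "zip"
        };

    // Returns true when the file name's extension appears in the list.
    public static bool IsListed(string fileName)
    {
        // GetExtension returns the extension with its leading dot,
        // or an empty string when there is none.
        var extension = Path.GetExtension(fileName);
        return !string.IsNullOrEmpty(extension)
            && Extensions.Contains(extension.TrimStart('.'));
    }
}
```

For example, SupportedFormats.IsListed("invoice.PDF") returns true, while a name such as "photo.png" or one without an extension returns false.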

Get Started

You do not need to install anything beyond the SDK package to get started with GroupDocs.Parser Cloud SDK for .NET. Just create an account at GroupDocs for Cloud and get your application information.

Run Install-Package GroupDocs.Parser-Cloud from the Package Manager Console in Visual Studio to fetch and reference the GroupDocs.Parser Cloud assembly in your project. If you already have the GroupDocs.Parser Cloud SDK for .NET and want to upgrade it, run Update-Package GroupDocs.Parser-Cloud to get the latest version.

See the GitHub repository for common usage scenarios.

GroupDocs.Parser Cloud API Code Samples

These code samples demonstrate various parsing capabilities of GroupDocs.Parser Cloud, including extracting text, extracting images, and parsing documents by template.

Extracting Text from a Document

Learn how to extract text from a document using the GroupDocs.Parser Cloud API. This example demonstrates the text extraction process in C#.

using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Model;
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Extract_Text_From_Document
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document to parse
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "document.docx" };

                // Create a text extraction request
                var request = new ExtractTextRequest(fileInfo);

                // Extract text from the document
                var response = apiInstance.ExtractText(request);

                // Output the extracted text to the console
                Console.WriteLine("Extracted Text: " + response.Text);
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.ExtractText: " + e.Message);
            }
        }
    }
}
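
The sample above hard-codes placeholder credentials. One alternative is to read them from environment variables so that real values stay out of source control. This is a minimal sketch: the variable names GROUPDOCS_APP_SID and GROUPDOCS_APP_KEY and the CredentialsSketch helper are hypothetical and are not defined by the SDK.

```csharp
using System;

public static class CredentialsSketch
{
    // Hypothetical variable names; choose whatever fits your deployment.
    public const string SidVariable = "GROUPDOCS_APP_SID";
    public const string KeyVariable = "GROUPDOCS_APP_KEY";

    // Returns the variable's value, or throws with a clear message when it
    // is missing or blank, so a misconfiguration fails before any API call.
    public static string Require(string variableName)
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                "Environment variable '" + variableName + "' is not set.");
        }
        return value;
    }
}
```

In the sample's Configuration initializer, the placeholders would then become AppSid = CredentialsSketch.Require(CredentialsSketch.SidVariable) and AppKey = CredentialsSketch.Require(CredentialsSketch.KeyVariable).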

Extracting Images from a Document

Learn how to extract images embedded within a document using the GroupDocs.Parser Cloud API. This example illustrates the process in C#.

using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Model;
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Extract_Images_From_Document
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document to parse
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "document.pdf" };

                // Create an image extraction request
                var request = new ExtractImagesRequest(fileInfo);

                // Extract images from the document
                var response = apiInstance.ExtractImages(request);

                // Loop through and output each extracted image's info
                foreach (var image in response.Images)
                {
                    Console.WriteLine("Image Format: " + image.Format + ", Image Path: " + image.Path);
                }
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.ExtractImages: " + e.Message);
            }
        }
    }
}

Parsing Document by Template

Learn how to parse a document by using a custom template for structured data extraction with the GroupDocs.Parser Cloud API. This example shows the template-based parsing in C#.

using System;
using GroupDocs.Parser.Cloud.Sdk.Api;
using GroupDocs.Parser.Cloud.Sdk.Model.Requests;
using GroupDocs.Parser.Cloud.Sdk.Model;

namespace GroupDocs.Parser.Cloud.Sdk.Examples
{
    class Parse_Document_By_Template
    {
        public static void Run()
        {
            // Get your AppSID and AppKey from https://dashboard.groupdocs.cloud/ (free registration required)
            var configuration = new Configuration
            {
                AppSid = "YOUR_APP_SID",
                AppKey = "YOUR_APP_KEY"
            };

            // Initialize the Parser API instance
            var apiInstance = new ParserApi(configuration);

            try
            {
                // Define the document and template file
                var fileInfo = new FileInfo { Folder = "path/to/folder", Name = "invoice.pdf" };
                var templatePath = "path/to/template.json";

                // Create a template-based parsing request
                var request = new ParseRequest(fileInfo, templatePath);

                // Parse the document using the template
                var response = apiInstance.Parse(request);

                // Output the parsed data to the console
                foreach (var field in response.Fields)
                {
                    Console.WriteLine("Field Name: " + field.Name + ", Field Value: " + field.Value);
                }
            }
            catch (Exception e)
            {
                // Handle any exceptions that occur during the API call
                Console.WriteLine("Exception when calling ParserApi.Parse: " + e.Message);
            }
        }
    }
}

Tags

Document Data Extraction | REST API | GroupDocs.Parser | Text Extraction | Image Extraction | Template Parsing | Markdown Extraction | HTML Extraction | Container Files | Data Parsing | Document Information | File Management | Cloud Storage | SDKs | Cross Platform | Storage API | File Operations | Folder Operations | Security and Authentication | Document Parsing | API Integration | Data Extraction | ZIP Files | PDF | PST/OST Files | Extract Images | Document Processing | Data Extraction API | GroupDocs SDK | API Explorer | Metadata Extraction

Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 was computed.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 was computed.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 was computed. 
.NET Framework net20 is compatible.  net35 was computed.  net40 was computed.  net403 was computed.  net45 was computed.  net451 was computed.  net452 was computed.  net46 was computed.  net461 was computed.  net462 was computed.  net463 was computed.  net47 was computed.  net471 was computed.  net472 was computed.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 

Version Downloads Last updated
23.7.0 638 7/11/2023
22.12.0 361 12/19/2022
22.3.0 485 3/14/2022
20.6.0 1,263 6/16/2020
19.11.0 522 12/12/2019