GHSoftware.WordDocTextExtractor 1.0.1


GHSoftware.WordDocTextExtractor

GHSoftware.WordDocTextExtractor is a .NET library for extracting text from legacy Microsoft Word .doc files (Word 97-2003, Word 95 and Word 6.0).

This project is based on and extends the functionality of the original b2xtranslator project, with a focus on robust plain text extraction. It handles complex document structures and edge cases, producing clean, readable text output from binary .doc files.


Key Features

  • DOC to Plain Text Conversion: Converts .doc files (Word 97-2003, Word 95, and Word 6.0) to plain text, preserving document flow as far as possible.

  • Enhanced Compatibility: Handles tables, headers/footers, embedded objects, and other Word-specific structures.

  • Active Maintenance: Actively maintained and modernized for .NET 6+.


Installation

dotnet add package GHSoftware.WordDocTextExtractor

Usage Example

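using System;
using b2xtranslator.txt;

// Path to a legacy binary Word document
string docPath = "legacy-document.doc";

// Extract the document's text as a single plain string
string extractedText = DocTextExtractor.ExtractTextFromFile(docPath);

Console.WriteLine(extractedText);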
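
The single-file call above extends naturally to a folder of documents. The following is a minimal sketch, not part of the package: `BatchDocExtraction` and `ExtractAll` are hypothetical names, and the extractor is passed in as a delegate so the helper does not assume anything about the library beyond the `string -> string` shape suggested by the usage example. How the library reports corrupt or unsupported files is not documented here, so the sketch simply catches any exception and skips the file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; not shipped with the package.
public static class BatchDocExtraction
{
    // Runs `extract` on every .doc file directly inside `directory`
    // and returns a map from file path to extracted text.
    public static Dictionary<string, string> ExtractAll(
        string directory,
        Func<string, string> extract)
    {
        var results = new Dictionary<string, string>();

        foreach (string filePath in Directory.EnumerateFiles(directory))
        {
            // Compare the extension explicitly so that .docx files are
            // never picked up by accident.
            string extension = Path.GetExtension(filePath);
            if (!extension.Equals(".doc", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            try
            {
                results[filePath] = extract(filePath);
            }
            catch (Exception ex)
            {
                // Assumption: a failed extraction surfaces as an exception.
                // Report it and continue with the remaining files.
                Console.Error.WriteLine($"Skipped {filePath}: {ex.Message}");
            }
        }

        return results;
    }
}
```

Assuming `ExtractTextFromFile` takes a path and returns a string, as the example above suggests, it could be passed directly: `BatchDocExtraction.ExtractAll("archive", DocTextExtractor.ExtractTextFromFile)`, where `"archive"` is a placeholder folder name.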

More Information

For advanced usage, CLI tools, or additional formats, see the main b2xtranslator project.


License

This project is open source and distributed under the same license as the original b2xtranslator. See LICENSE.


Credits

Originally developed by DIaLOGIKa (2008-2009) and Evolution Recruitment Solutions (2017). Maintained and extended by Gustavo Hennig (2025).

Compatible and additional computed target framework versions (.NET):

  • net6.0 is compatible and is the target framework included in the package.

  • Computed as compatible: net7.0, net8.0, net9.0, net10.0.

  • Also computed: the android, ios, maccatalyst, macos, tvos, and windows variants of net6.0 through net10.0, and the browser variants of net8.0 through net10.0.

  • net6.0: No dependencies.


Version   Downloads   Last Updated
1.0.1     21          7/19/2025
1.0.0     18          7/19/2025