SILVERCODERS DocToText is a utility that converts documents in many formats to plain text. It includes a console application and a C/C++ library for embedding text extraction into other applications. It supports MS Office binary formats (MS Word (DOC), MS Excel (XLS), and MS PowerPoint (PPT)), Rich Text Format (RTF), OpenDocument formats (text documents (ODT), spreadsheets (ODS), and presentations (ODP)), Office Open XML formats (MS Word (DOCX), MS Excel (XLSX), and MS PowerPoint (PPTX)), and HyperText Markup Language (HTML). DocToText extracts text not only from the document body but also from annotations (comments) embedded in ODT, DOC, DOCX, or RTF files, and it reads metadata such as author, last modification date, and number of pages. It can be used as a fast console viewer, and it can convert corrupted OpenDocument and Office Open XML documents, recovering text even when other recovery methods have failed.
|Tags||Utilities, Text Processing, Archiving, Office/Business, Recovery Tools|
|Operating Systems||Unix, POSIX, Linux, Windows, Mac OS X|
Release Notes: Support for the HyperText Markup Language (HTML) format was introduced in this version. The ability to retrieve metadata such as document author, last modification date, and number of pages was added. An important new feature is the extraction of text from annotations (comments) embedded in ODT, DOC, DOCX, or RTF files. Several bugs were also fixed.
Release Notes: This is the first version available for Mac OS X and also the first version available as a C/C++ library in addition to the console application. MS PowerPoint binary format (PPT) support has been added. Headers, footers, and embedded XLS workbooks in DOC files are now supported. Text extraction from OpenDocument and OOXML formats has been significantly optimized. Many bugs have been fixed.
Release Notes: In addition to bug fixes and optimizations, MS Excel binary format (XLS) support was added in this version.
Release Notes: In addition to bug fixes and optimizations, this version adds the ability to convert corrupted OpenDocument and Office Open XML documents.
Release Notes: In addition to bug fixes and optimizations, Office Open XML (ISO/IEC 29500, also called OOXML, OpenXML, or MSOOXML) documents are now supported.
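The recovery of text from damaged OpenDocument and Office Open XML files mentioned above can be illustrated with a minimal sketch. This is not DocToText's code or API; the function name extract_text and the overall approach are assumptions made for illustration. It relies only on the fact that ODT and DOCX files are ZIP packages whose main text lives in content.xml and word/document.xml respectively, and it strips tags with regular expressions instead of a strict XML parser so that malformed or truncated markup does not stop extraction. It does not handle a damaged ZIP container.

```python
import html
import io
import re
import zipfile

# Main content part of each package type: ODT uses content.xml,
# DOCX uses word/document.xml.
CONTENT_PARTS = ("content.xml", "word/document.xml")


def extract_text(data: bytes) -> str:
    """Hypothetical helper: return plain text from ODT or DOCX bytes.

    Tags are removed with regular expressions rather than an XML parser,
    so unclosed or truncated markup is tolerated.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        part = next((n for n in CONTENT_PARTS if n in names), None)
        if part is None:
            raise ValueError("no known content part found in package")
        xml = archive.read(part).decode("utf-8", errors="replace")

    # Turn paragraph and heading ends into line breaks.
    xml = re.sub(r"</(?:text:p|text:h|w:p)>", "\n", xml)
    # Drop every remaining tag, including one cut off at the end of the data.
    text = re.sub(r"<[^>]*>?", "", xml)
    # Decode entities such as &amp; and trim surrounding whitespace.
    return html.unescape(text).strip()
```

A file on disk could be passed in as extract_text(open(path, "rb").read()). A strict XML parser would reject the truncated input that this sketch accepts, which is the trade-off being shown: less structural fidelity in exchange for recovering whatever text is still readable.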