WordArc - reading all MSWord formats in TC

alexanderwdark
Senior Member
Posts: 270
Joined: 2008-04-14, 07:20 UTC
Location: Russia

Post by *alexanderwdark »

WordArcTC - a plugin for working with MS Word files as archives


Read DOC, WordPerfect, DOCX, XML, HTML, WRI, etc. in Lister, and convert any Word file to txt, rtf, html, etc. simply with TC! Work with documents as if they were archives or folders. Converting (reading) DOC to rtf/txt requires neither MS Office nor MS Word!


Unpacking formats:
------------------

direct.txt - get text from document using direct mode, no need for installed Word *

direct_oem.txt - direct, but in OEM encoding *

msfilter.rtf - get RTF text using Microsoft TextConv library *

msfilter_richedit.rtf - get RTF text using Microsoft TextConv library and filter it by MS RichEdit *

word.txt - get text from document using MSWord

word_lines.txt - get text with line breaks from document using MSWord

word.rtf - get RTF text from document using MSWord

word_richedit.rtf - get RTF text from document using MSWord and filter it by MS RichEdit

word.html - get HTML text from document using MSWord

word_dos.txt - get ASCII DOS text from document using MSWord

word_dos_lines.txt - get ASCII DOS text with line breaks from document using MSWord

word_unicode.txt - get Unicode text from document using MSWord

* Works even if MS Word or MS Office is not installed.
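
To illustrate the difference between direct.txt and direct_oem.txt: the same text is stored as different bytes in a Windows ANSI code page and in the matching OEM (DOS) code page. The Python sketch below assumes a Russian system, where ANSI is cp1251 and OEM is cp866. The helper ansi_to_oem is hypothetical and is not taken from the plugin sources.

```python
# Sketch only: assumes ANSI = cp1251 and OEM = cp866 (typical for a Russian
# Windows system). ansi_to_oem is a hypothetical helper, not plugin code.

def ansi_to_oem(data: bytes, ansi: str = "cp1251", oem: str = "cp866") -> bytes:
    """Re-encode ANSI bytes as OEM bytes; unmappable characters become '?'."""
    return data.decode(ansi).encode(oem, errors="replace")

text = "Привет"
ansi_bytes = text.encode("cp1251")   # what an ANSI text file would contain
oem_bytes = ansi_to_oem(ansi_bytes)  # what an OEM (DOS) text file would contain

print(ansi_bytes.hex(" "))
print(oem_bytes.hex(" "))
```

The two hex dumps differ although both decode to the same word, which is why a viewer has to know which of the two encodings a file uses.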


Usage
------

Use Ctrl + PgDown to enter a Word document as an archive.

For a description of the console tool wa.exe, see its help screen (run wa.exe without any parameters).

Ini files
-----

The ini file is located in the Total Commander directory for plugin ini files. If you want to reset the plugin's settings
(i.e. your filter for extension assignment), just delete the wordarc_exts.ini file.

download here...


You can get the latest Office Converter Pack here for free:

http://www.microsoft.com/office/orkarchive/2003ddl.htm

No MS Office needed for non-MSWord functions.

This is an open-source sample plugin. If you want, you may use its sources to make your own plugin.
Michael Diegelmann
Junior Member
Posts: 36
Joined: 2006-02-18, 17:25 UTC
Location: Rosenheim (Germany)

Suggestion: Register WordArcTC plugin within Open With List

Post by *Michael Diegelmann »

Treating Word DOC files etc. as an archive containing all kinds of text file formats sounds really interesting! But when I was asked to associate the .DOC file extension with WordArc.wcx (thus destroying the existing association with Word.exe), I immediately stopped the installation of this otherwise certainly very helpful plugin. May I suggest simply adding your plugin to the OpenWithList for DOC files instead?

P.S. Thank you for the source code. It might eventually come in handy when writing some content plugin myself.

Re: Suggestion: Register WordArcTC plugin within Open With L

Post by *alexanderwdark »

Michael Diegelmann wrote: Treating Word DOC files etc. as an archive containing all kinds of text file formats sounds really interesting! But when I was asked to associate the .DOC file extension with WordArc.wcx (thus destroying the existing association with Word.exe), I immediately stopped the installation of this otherwise certainly very helpful plugin. May I suggest simply adding your plugin to the OpenWithList for DOC files instead?

P.S. Thank you for the source code. It might eventually come in handy when writing some content plugin myself.
No, this wcx archive plugin does not destroy the existing association with Word.exe. You can enter the archive only with Ctrl+PgDown (the Enter key or a mouse click still runs MS Word).

Post by *alexanderwdark »

How to enable support for DOCX files?


Setting PluginOverrideZip=1 in the [Packer] section of wincmd.ini allows a packer plugin to override the internal ZIP packer.

Please add or change this option to allow WordArcTC to handle DOCX files.

...and then you can view any DOCX file as Unicode, ANSI, or OEM text by using the MS converter filters, MS Word, etc.
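
For those who prefer to script the change, here is a minimal Python sketch that adds or updates the option in ini text. It works on a string, so it can be tried safely before touching the real wincmd.ini. The function set_ini_option and the sample ini content (ExampleKey) are made up for illustration; only the [Packer] section and the PluginOverrideZip key come from the post above.

```python
# Sketch only: set_ini_option is a hypothetical helper and the sample text is
# placeholder content, not a real wincmd.ini.

def set_ini_option(text: str, section: str, key: str, value: str) -> str:
    """Return ini text with key=value set in [section], adding both if missing."""
    lines = text.splitlines()
    header = f"[{section}]".lower()
    in_section = False
    insert_at = None  # position for a new key inside the section, if found
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_section = stripped.lower() == header
            if in_section:
                insert_at = i + 1
            continue
        if not in_section:
            continue
        name, sep, _ = stripped.partition("=")
        if sep and name.strip().lower() == key.lower():
            lines[i] = f"{key}={value}"  # key exists: overwrite its value
            return "\n".join(lines) + "\n"
        if stripped:
            insert_at = i + 1  # remember the last non-empty line of the section
    if insert_at is None:
        lines += [f"[{section}]", f"{key}={value}"]  # section missing: append it
    else:
        lines.insert(insert_at, f"{key}={value}")
    return "\n".join(lines) + "\n"

sample = "[Packer]\nExampleKey=abc\nPluginOverrideZip=0\n"
print(set_ini_option(sample, "Packer", "PluginOverrideZip", "1"))
```

To apply it to the real file, read wincmd.ini into a string, pass it through the function, and write the result back. Keeping a backup copy first is a sensible precaution.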

Post by *alexanderwdark »

24.02.2013 The plugin was updated. There are new careful modes, which help to get a more precise result, without garbage. A new plugin, RedDOC, was also uploaded; it supports reading binary DOC files without MS Word installed. It is like ListDOC, but works with both 32-bit and 64-bit TC.
damjang
Senior Member
Posts: 215
Joined: 2003-10-09, 15:58 UTC

Post by *damjang »

Thank you for the RedDOC plugin. I only want to ask what the difference from ListDoc is, other than x64 support, because ListDoc is 47k and RedDoc is >2M. Is there also a difference in format support? I ask because I sometimes find docs that ListDoc doesn't read correctly (I don't have any at hand now).

Post by *alexanderwdark »

damjang wrote: Thank you for the RedDOC plugin. I only want to ask what the difference from ListDoc is, other than x64 support, because ListDoc is 47k and RedDoc is >2M. Is there also a difference in format support? I ask because I sometimes find docs that ListDoc doesn't read correctly (I don't have any at hand now).
ListDOC is not my plugin; RedDOC was written from scratch, so the two have more differences than similarities. ListDOC, I think, is written in pure WinAPI, whereas RedDOC uses the VCL library, so it is larger. In other words, the size in this case comes from the codebase and the compiler. As for format support: the structure of the document is analyzed, and both Unicode and non-Unicode blocks are supported. But the project is still very fresh, and not everything is finished. Non-printing control characters may be displayed; this may be revised. The format is too complex to implement every wish at once, though. As for test documents, I check on some Russian and Unicode files. In any case, the content is displayed as Unicode text.
LonerD
Senior Member
Posts: 381
Joined: 2010-06-19, 20:18 UTC
Location: Makeyevka/Makiivka

Post by *LonerD »

alexanderwdark,
Thanks for the updates.
Could you also add rtf and docx support to the RedDOC plugin?
"I used to feel guilty in Cambridge that I spent all day playing games, while I was supposed to be doing mathematics. Then, when I discovered surreal numbers, I realized that playing games IS math." John Horton Conway

Post by *alexanderwdark »

LonerD wrote: alexanderwdark,
Thanks for the updates.
Could you also add rtf and docx support to the RedDOC plugin?
Because the document is analyzed directly, that would be difficult to do. There are some minor updates now; see the plugin page.
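
One likely reason direct analysis does not carry over easily is that the three formats use unrelated containers: a binary DOC is an OLE2 compound file, a DOCX is a ZIP archive of XML parts, and RTF is plain text that starts with {\rtf. The Python sketch below only tells them apart by their leading bytes. The helper sniff_container is hypothetical and says nothing about how RedDOC itself works.

```python
# Sketch only: sniff_container is a hypothetical helper, not RedDOC code.
# The signatures themselves are the well-known leading bytes of each container.

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file (binary .doc)
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP local file header (.docx)
RTF_MAGIC = b"{\\rtf"                            # RTF text starts with {\rtf

def sniff_container(head: bytes) -> str:
    """Guess the container type from the first bytes of a file."""
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"

for sample in (OLE2_MAGIC + b"\x00" * 8, b"PK\x03\x04\x14\x00", b"{\\rtf1\\ansi"):
    print(sniff_container(sample))
```

In practice one would read the first eight bytes of the file and pass them in. Each of the three results would then need its own parser, which fits the author's remark that adding rtf and docx would be difficult.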
ghisler(Author)
Site Admin
Posts: 50479
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

RTF is already handled by Lister, even with full formatting. DOCX is handled by the plugin OpenOffice/DOCX/FB2 Viewer, also with formatting:
http://www.totalcmd.net/plugring/OOoHtmlViewer.html

So this plugin really fills a gap in 64-bit!
Author of Total Commander
https://www.ghisler.com

Post by *alexanderwdark »

2013-02-27 The WordArc plugin was updated (new "careful mode" kernel). The RedDoc plugin was also updated: there are some kernel fixes, and there are now options for reading foreign ANSI documents. So you can read single-byte ANSI German files on a Russian system and vice versa. This is not possible in ListDoc, which only uses the system codepage.
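
A small Python sketch of why the codepage option matters: the same single-byte data reads correctly only when decoded with the code page it was written in. Here cp1252 stands in for "German ANSI" and cp1251 for "Russian ANSI"; both choices, and the helper decode_ansi, are assumptions for illustration and are not taken from RedDoc.

```python
# Sketch only: cp1252 stands in for "German ANSI" and cp1251 for "Russian ANSI".
# decode_ansi is a hypothetical helper, not part of RedDoc or ListDoc.

def decode_ansi(data: bytes, codepage: str) -> str:
    """Decode single-byte ANSI text with an explicitly chosen code page."""
    return data.decode(codepage, errors="replace")

german_bytes = "Größe".encode("cp1252")  # bytes as stored in a German ANSI file

wrong = decode_ansi(german_bytes, "cp1251")  # system code page on a Russian PC
right = decode_ansi(german_bytes, "cp1252")  # code page chosen by the user

print(wrong)  # Cyrillic letters appear in place of ö and ß
print(right)  # Größe
```

A viewer that always uses the system code page ends up with the first result on a Russian system; letting the user choose the code page gives the second.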
EricB
Senior Member
Posts: 357
Joined: 2008-03-25, 22:21 UTC
Location: The Netherlands

Post by *EricB »

Hi Alexander,

Good work! It is very handy to have an MS Word doc viewer that does not rely on the presence of MS Office. Regarding LonerD's question: some time ago Fenix_Productions made a lister plugin for Docx (see http://www.ghisler.ch/board/viewtopic.php?t=21559). Maybe he can advise a bit?

Regards, EricB