Encoding of custom column language files and pluginst.inf - Unicode?

Dalai
Power Member
Posts: 7101
Joined: 2005-01-28, 22:17 UTC
Location: Meiningen (Südthüringen)

Post by *Dalai »

Ah, now it makes sense.

Regards
Dalai
#101164 Personal licence
Ryzen 5 2600, 16 GiB RAM, ASUS Prime X370-A, Win7 x64

Plugins: Services2, Startups

Usher
Power Member
Posts: 856
Joined: 2011-03-11, 10:11 UTC

Post by *Usher »

2ghisler(Author)
I'm afraid it's a total mess with Unicode support:
  • LNG files may be in ANSI or UTF-8 with BOM, UTF-16 is not supported;
  • MNU files may be in ANSI or UTF-8 without BOM only, UTF-16 is not supported;
  • INI files may be in ANSI or UTF-16, UTF-8 support is broken…
  • What about INF, INC, BAR and other files?
Why can't we have a common solution for all text config files?
Regards from Poland
Andrzej P. Wozniak

ghisler(Author)
Site Admin
Posts: 39712
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

mnu and plugin ini files need to use the same encoding as the main lng file.
Author of Total Commander
http://www.ghisler.com

MVV
Power Member
Posts: 8488
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Post by *MVV »

It would still be much better if TC could respect pure-Unicode INI files (which are 100% valid in Windows) and read them using the pure-Unicode API...

petermad
Power Member
Posts: 9657
Joined: 2003-02-05, 20:24 UTC
Location: Valsted, Denmark

Post by *petermad »

2ghisler(Author)
Does that apply to pluginst.inf too?
License #524
Danish Total Commander Translator
TC 9.51 32+64bit on Win XP 32bit, Win 7, 8.1 & 10 (1909) 64bit, 'Everything' 1.4.1.965 (x64)
TC 3.0 on Android 6.0
Get:
Extended Total Commander Menus | TC Languagebar | TC Dark Help | PHSM-Calendar

ghisler(Author)
Site Admin
Posts: 39712
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

MVV wrote: "It would still be much better if TC could respect pure-Unicode INI files (which are 100% valid in Windows) and read them using pure-Unicode API..."
While I could add this as a feature, the problem is that thousands of people will continue to use older versions for many years. Such language files wouldn't work for those users.
Author of Total Commander
http://www.ghisler.com

MVV
Power Member
Posts: 8488
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Post by *MVV »

Well, the wincmd.ini encoding may be safely changed, but for plugin LNG files you're right: this would require the newest TC. Perhaps you could use a ulng extension for UTF-16 files, or just leave it to the users (update TC or edit the LNG files). But this is a feature, so we need it...

The following parameters could be used as an indicator:
+ codepage=1200 for TC LNG files (UTF-16 codepage number)
+ codepage=1200 for wincmd.ini in [Configuration] section
+ some kind of similar parameter for plugin LNG files (e.g. codepage=1200 before all sections for manual parsing or in some section like [encoding] or [lng] for reading via API)

So all old LNG files would work as expected but new ones could use UTF-16.
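
As a rough illustration of the "codepage=1200 for manual parsing" idea, here is a minimal Python sketch. It is not how TC works: the function name `declared_codec`, the mapping table and the `cp1252` default are all assumptions made for this example. It relies on the fact that ASCII text such as `codepage=1200` becomes visible in both 8-bit and UTF-16 files once the 0x00 bytes are dropped.

```python
import re

# Windows codepage numbers mapped to Python codec names.
# Only a few entries are listed; the selection is an assumption for this sketch.
CODEPAGE_TO_CODEC = {
    1200: "utf-16-le",   # UTF-16 little endian
    65001: "utf-8",
    1250: "cp1250",
    1251: "cp1251",
    1252: "cp1252",
}

_DECLARATION = re.compile(rb"codepage=(\d+)", re.IGNORECASE)


def declared_codec(data: bytes, default: str = "cp1252") -> str:
    """Return the codec named by a 'codepage=N' line, or the default."""
    # Dropping NUL bytes makes ASCII characters readable no matter whether
    # the file is stored as 8-bit text or as UTF-16.
    ascii_view = data.replace(b"\x00", b"")
    for line in ascii_view.splitlines():
        # Strip whitespace and possible BOM bytes around the line.
        line = line.strip(b" \t\xff\xfe\xef\xbb\xbf")
        match = _DECLARATION.fullmatch(line)
        if match:
            return CODEPAGE_TO_CODEC.get(int(match.group(1)), default)
    return default
```

A file without any declaration falls back to the default, which matches the goal that old LNG files keep working as before.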

Usher
Power Member
Posts: 856
Joined: 2011-03-11, 10:11 UTC

Post by *Usher »

2MVV
A codepage declaration is needed mostly for 8-bit charsets, including UTF-8, as a BOM is not required in this case.
You don't need a codepage declaration in UTF-16 files: they all must use a BOM. The problem is that TC in some situations doesn't recognize the BOM and ignores files with embedded 0x00 bytes. Such files are unreadable to it, so a codepage declaration inside them is ignored as well.
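
To illustrate the kind of BOM check meant here, a minimal Python sketch follows. It is not TC's actual code: the names `sniff_encoding` and `decode_config` and the `cp1252` fallback are assumptions made for the example, and UTF-32 BOMs are deliberately not handled.

```python
import codecs

# BOM signatures checked by this sketch, paired with the codec used to decode.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Pick a codec from the BOM; without a BOM, assume an 8-bit codepage."""
    for bom, codec in _BOMS:
        if data.startswith(bom):
            return codec
    return fallback


def decode_config(data: bytes, fallback: str = "cp1252") -> str:
    """Decode the raw bytes of a config file to text, dropping any BOM."""
    codec = sniff_encoding(data, fallback)
    if codec.startswith("utf-16"):
        # The endian-specific UTF-16 codecs keep the BOM, so skip its 2 bytes.
        return data[2:].decode(codec)
    # "utf-8-sig" removes the BOM itself; the fallback has no BOM to remove.
    return data.decode(codec)
```

With a check like this, embedded 0x00 bytes are no obstacle, because the BOM is inspected before any text is interpreted.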

IMHO it's high time to start working on a fully Unicode Total Commander X, where X stands for number 10 and XP+ compatibility.
Regards from Poland
Andrzej P. Wozniak

MVV
Power Member
Posts: 8488
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Post by *MVV »

Usher,
An explicit UTF-16 indication is needed so that TC knows this INI (or LNG) file must be read using the Unicode API and not the ANSI API. Windows allows transparent reading of both UTF-16 and ANSI INI files with both the Unicode and the ANSI API (with automatic conversion according to the system codepage if needed). TC currently reads all INI files as ANSI and converts the codepage manually, but for a UTF-16 INI file this manual conversion doesn't allow mixing languages/codepages in the same file.
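
The mixing problem can be shown with a small Python sketch. This only demonstrates the general limitation of passing text through a single 8-bit codepage; it says nothing about TC's internals. The helper `fits_codepage` and the sample strings are made up for the example.

```python
def fits_codepage(text: str, codepage: str) -> bool:
    """Return True if every character of text exists in the given codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


polish = "Żółw"        # fits Central European cp1250
russian = "Черепаха"   # fits Cyrillic cp1251
mixed = polish + " / " + russian

# Each language fits its own codepage...
print(fits_codepage(polish, "cp1250"), fits_codepage(russian, "cp1251"))
# ...but the mixed string fits neither of them.
print(fits_codepage(mixed, "cp1250"), fits_codepage(mixed, "cp1251"))
# UTF-16 carries the mixed string without loss.
print(mixed.encode("utf-16").decode("utf-16") == mixed)
```

So as long as the text goes through one ANSI codepage on the way, a file that mixes languages loses characters even though the UTF-16 file itself stores them correctly.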

Dalai
Power Member
Posts: 7101
Joined: 2005-01-28, 22:17 UTC
Location: Meiningen (Südthüringen)

Post by *Dalai »

Well, in the case of the suggestions to use a second file (*W.inf and *W.lng, or *.ulng), it's already indicated to TC that it should use the Unicode API to read it. Don't you think such an indicator is sufficient?

Regards
Dalai
#101164 Personal licence
Ryzen 5 2600, 16 GiB RAM, ASUS Prime X370-A, Win7 x64

Plugins: Services2, Startups

MVV
Power Member
Posts: 8488
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Post by *MVV »

Yes, a suffixed name (or a prefixed/suffixed extension) is sufficient to indicate that the file must be read as pure Unicode, of course.
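
A name-based selection along these lines could look like the following Python sketch. The naming patterns (*.ulng, *W.lng) are the ones suggested in this thread, not an existing TC feature; the function `pick_language_file`, its `exists` parameter and the returned labels are hypothetical.

```python
from pathlib import Path
from typing import Callable, Tuple


def pick_language_file(
    legacy: Path,
    exists: Callable[[Path], bool] = Path.is_file,
) -> Tuple[Path, str]:
    """Prefer a Unicode variant of the legacy file if one is present."""
    candidates = [
        # plugin.lng -> plugin.ulng (prefixed extension)
        legacy.with_suffix(".u" + legacy.suffix[1:]),
        # plugin.lng -> pluginW.lng (suffixed name)
        legacy.with_name(legacy.stem + "W" + legacy.suffix),
    ]
    for candidate in candidates:
        if exists(candidate):
            return candidate, "utf-16"
    # No Unicode variant found: read the old file the old way.
    return legacy, "ansi"
```

Older versions would simply never look for the new names, so they would keep reading the legacy file unchanged.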
