Total Commander

Posted: **2024-02-03, 16:22 UTC**

All was good while history.txt was plain ASCII text, but now the file contains funny characters, and the correct encoding is unknown.
Therefore, it would be beneficial to switch to UTF-8 and add the corresponding BOM.
Other Unicode encodings, such as UTF-16, would considerably increase the file size.
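
To get a rough sense of the size difference, here is a minimal Python sketch. The sample line is made up for illustration (it is not taken from history.txt), and `encoded_sizes` is a hypothetical helper name.

```python
import codecs

def encoded_sizes(text):
    """Return the byte length of `text` in several encodings, BOM included where noted."""
    return {
        "cp1252": len(text.encode("cp1252")),
        # "utf-8-sig" prepends the 3-byte UTF-8 BOM when encoding.
        "utf-8 + BOM": len(text.encode("utf-8-sig")),
        # "utf-16-le" writes no BOM by itself, so the 2-byte BOM is added by hand.
        "utf-16 LE + BOM": len(codecs.BOM_UTF16_LE + text.encode("utf-16-le")),
    }

sample = "Fixed: the word ändern is now found\r\n"  # illustrative text only
for name, size in encoded_sizes(sample).items():
    print(f"{name}: {size} bytes")
```

For a mostly ASCII change log, UTF-8 costs a byte or two extra per non-ASCII character plus three bytes for the BOM, while UTF-16 roughly doubles the size.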

Posted: **2024-02-03, 16:37 UTC**

It's not ASCII text, it's ANSI, which displays fine in the internal viewer,
in many other popular text file viewers like CudaLister or AkelPad,
and in the very useful TC Changes Viewer.

Posted: **2024-02-03, 16:39 UTC**

What sort of ANSI? What code page? Do you know that ANSI for Asian languages is double byte?

Posted: **2024-02-03, 16:52 UTC**

To be honest, yes, it doesn't matter at all what the encoding is or what its correct name is,
but this file really should have been saved long ago either in UTF-8 + BOM or in UTF-16 LE + BOM (as INI)
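
The point of the BOM is that a reader no longer has to guess. A minimal sketch of such a check in Python follows; `sniff_bom` is a hypothetical name, and how Lister or any particular editor actually detects encodings is not something this thread states.

```python
import codecs

def sniff_bom(data):
    """Return a Python codec name implied by a leading BOM, or None if there is none.

    Only UTF-8 and UTF-16 are checked; UTF-32 is deliberately ignored in this sketch.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # this codec strips the BOM while decoding
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"      # this codec reads the BOM to pick the byte order
    return None              # no BOM: the code page has to be guessed or known

print(sniff_bom("ändern".encode("utf-8-sig")))
print(sniff_bom("ändern".encode("cp1252")))
```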

Posted: **2024-02-04, 03:09 UTC**

What sort of ANSI? What code page?

Code page 1252 Western Latin1

Note that in the history.txt file from TC 11.03rc1 and TC 11.03rc3 a few characters (äö‹›«»Äß) were not saved correctly.

but now the file contains funny characters

Some of these characters from the extended ANSI charset have been there since 2008, and all of them since 2016.

Posted: **2024-02-04, 09:17 UTC**

Code page 1252 Western Latin1

It depends on regional settings. For me it's 1251

in all editors...
That's why I strongly vote for saving the file either in UTF-8 + BOM or in UTF-16 LE + BOM.

Posted: **2024-02-04, 10:28 UTC**

The line to look at was:

29.01.24 Fixed: Regular expressions in file names: Support Unicode accents in constructs like \bфndern\b which will find the whole word "фndern" (change) in a file name (32/64)

Posted: **2024-02-04, 10:58 UTC**

browny wrote: ↑2024-02-04, 10:28 UTC The line to look at was:
29.01.24 Fixed: Regular expressions in file names: Support Unicode accents in constructs like \bфndern\b which will find the whole word "фndern" (change) in a file name (32/64)

Looks fine

29.01.24 Fixed: Regular expressions in file names: Support Unicode accents in constructs like \bändern\b which will find the whole word "ändern" (change) in a file name (32/64)

Posted: **2024-02-04, 11:15 UTC**

Looks fine

Again, it depends on regional settings! With RU settings, for example, this file will be treated as Cyrillic Windows-1251
by default, both in Lister and in all editors available here.
That leads to the view shown above: "\bдndern\b". BUT if I use the Encodings menu in Lister
and choose "ASCII\DOS (local codepage) 1", it will look like this: "\bфndern\b".
AND ONLY if I choose 1252 will I see what you see. BUT for that I MUST KNOW this fact BEFOREHAND!
Therefore, I repeat: to prevent misunderstandings about which line/word/character in this file
we are talking about, this file really should have been saved long ago either in UTF-8 + BOM or in UTF-16 LE + BOM (as INI)
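
The three views described above can be reproduced with a short Python sketch. It assumes the file stores "ä" as the Windows-1252 byte 0xE4, and that the "ASCII\DOS (local codepage)" on a Russian system is code page 866; both are assumptions made for this illustration.

```python
# Encode the pattern the way a Windows-1252 file would store it ("ä" becomes byte 0xE4).
raw = "\\bändern\\b".encode("cp1252")

# Decode the very same bytes under three different code pages.
views = {codepage: raw.decode(codepage) for codepage in ("cp1252", "cp1251", "cp866")}

for codepage, text in views.items():
    print(codepage, text)
```

Under those assumptions the same byte shows up as "ä" with 1252, "д" with 1251 and "ф" with 866, which matches the three variants quoted in this thread.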

Posted: **2024-02-04, 11:59 UTC**

2AntonyD
I agree with you, but why should one use "ASCII\DOS (local codepage) 1" for Windows programs?

Posted: **2024-02-04, 12:25 UTC**

Character graphics can still be encountered, and not only in old files.

Posted: **2024-02-04, 14:48 UTC**

2AntonyD

AND ONLY if I will choose 1252 - I will see what you see.

I also see "ändern" with codepage 1250, 1254, 1257 and 1258 when using the default fixedsys (western) font in Lister.
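
A quick way to check this is to decode the byte 0xE4 with each of those code pages. The sketch below uses Python's built-in codec tables only; it says nothing about which glyphs a particular font provides.

```python
codepages = ("cp1250", "cp1252", "cp1254", "cp1257", "cp1258")

# Decode the single-byte form of "ändern" with each Windows code page.
decoded = {cp: b"\xe4ndern".decode(cp) for cp in codepages}

for cp, text in decoded.items():
    print(cp, text)
```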

Posted: **2024-02-05, 08:21 UTC**

I think that it's a good idea (UTF-8 with byte order mark), especially since the new Windows Notepad app is such a crappy implementation that it doesn't even correctly recognize ANSI text, although it's very easy to find invalid UTF-8 sequences.
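
As a rough illustration of how easy that check is, here is a minimal Python sketch: try strict UTF-8 first and fall back to an ANSI code page only if decoding fails. `decode_text` is a hypothetical helper, and the fallback code page (1252 here) is an assumption that still has to be known or configured.

```python
def decode_text(data, fallback="cp1252"):
    """Decode bytes as UTF-8 when they are valid UTF-8, else with an assumed ANSI code page.

    Returns a (text, encoding_used) tuple.
    """
    try:
        # Strict decoding raises on any invalid sequence; "utf-8-sig" also drops a BOM.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Bytes undefined in the fallback code page are replaced rather than raising.
        return data.decode(fallback, errors="replace"), fallback

print(decode_text(b"\xe4ndern"))                # ANSI bytes: 0xE4 followed by "n" is not valid UTF-8
print(decode_text("ändern".encode("utf-8")))    # valid UTF-8
```

Text containing accented characters in a single-byte code page almost never happens to be valid UTF-8, which is why this simple test works well in practice; pure ASCII input passes as UTF-8 either way.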

Posted: **2024-02-06, 15:40 UTC**

Moderator message from: white » 2024-02-06, 15:39 UTC

15 posts split to new thread "Use UTF-8 with byte-order mark when creating a new .txt file"

Posted: **2024-02-08, 20:15 UTC**

Not only Notepad. The issue was present in Lister too, if the default was a non-Latin code page.
Thanks, fixed in the RC5 History.txt.

UTF-8 for history.txt
