UTF-8 for history.txt
All was well while history.txt was plain ASCII text, but now the file contains funny characters, and the correct encoding is unknown.
Therefore, it would be beneficial to switch to UTF-8 and add the corresponding BOM.
Other Unicode encodings would considerably increase the file size.
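To illustrate the size argument, here is a minimal Python sketch. The sample line is a shortened, hypothetical stand-in for a history.txt entry, and the variable names are made up for this example; it only compares the byte counts of the same text in three encodings.

```python
import codecs

# Hypothetical sample line (not the full history.txt entry).
sample = "Support Unicode accents in constructs like \\bändern\\b\r\n"

# ANSI (code page 1252): one byte per character, no BOM.
ansi_1252 = sample.encode("cp1252")

# UTF-8 with BOM: "utf-8-sig" prepends EF BB BF; ASCII stays one byte,
# "ä" takes two bytes.
utf8_with_bom = sample.encode("utf-8-sig")

# UTF-16 LE with BOM: FF FE, then two bytes per character for this text.
utf16_with_bom = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")

for name, data in (("cp1252", ansi_1252),
                   ("UTF-8 + BOM", utf8_with_bom),
                   ("UTF-16 LE + BOM", utf16_with_bom)):
    print(f"{name:16} {len(data)} bytes")
```

For mostly-ASCII text like this, UTF-8 with BOM costs only a few extra bytes, while UTF-16 roughly doubles the size.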
Re: UTF-8 for history.txt
It's not ASCII text, it's ANSI, which displays fine in the internal viewer,
in many other popular text file viewers like CudaLister or AkelPad,
and in the very useful TC Changes Viewer.
Windows 11 Home, Version 24H2 (OS Build 26100.3915)
TC 11.51 x64 / x86
Everything 1.5.0.1391a (x64), Everything Toolbar 1.5.2.0, Listary Pro 6.3.2.88
QAP 11.6.4.2.1 x64
Re: UTF-8 for history.txt
What sort of ANSI? What code page? Do you know that ANSI for Asian languages is double byte?
Re: UTF-8 for history.txt
To be honest, yes, it doesn't matter at all what the encoding is or what its correct name is,
but this file really should have been saved long ago either in UTF-8 + BOM or in UTF-16 LE + BOM (as for INI files).
#146217 personal license
Re: UTF-8 for history.txt
"What sort of ANSI? What code page?"
Code page 1252 Western Latin1.
"but now the file contains funny characters"
Note that in the history.txt file from TC 11.03rc1 and TC 11.03rc3 a few characters (äö‹›«»Äß) were not saved correctly.
Some of these characters from the extended ANSI charset have been there since 2008, and all of them since 2016.
License #524 (1994)
Danish Total Commander Translator
TC 11.51 32+64bit on Win XP 32bit & Win 7, 8.1 & 10 (22H2) 64bit, 'Everything' 1.5.0.1391a
TC 3.60b4 on Android 6, 13, 14
TC Extended Menus | TC Languagebar | TC Dark Help | PHSM-Calendar
Re: UTF-8 for history.txt
"Code page 1252 Western Latin1"
That depends on regional settings. For me it's 1251.

That's why I strongly vote for saving the file in either UTF-8 + BOM or UTF-16 LE + BOM.
Re: UTF-8 for history.txt
The line to look at was:
29.01.24 Fixed: Regular expressions in file names: Support Unicode accents in constructs like \bфndern\b which will find the whole word "фndern" (change) in a file name (32/64)
Re: UTF-8 for history.txt
browny wrote: 2024-02-04, 10:28 UTC
The line to look at was:
29.01.24 Fixed: Regular expressions in file names: Support Unicode accents in constructs like \bфndern\b which will find the whole word "фndern" (change) in a file name (32/64)
Looks fine:
29.01.24 Fixed: Regular expressions in file names: Support Unicode accents in constructs like \bändern\b which will find the whole word "ändern" (change) in a file name (32/64)
Re: UTF-8 for history.txt
"Looks fine"
Again, that depends on regional settings! With RU settings, for example, this file is treated as Cyrillic Windows-1251
by default, both in Lister and in all the editors available here.
That leads to a view similar to the one shown above: "\bдndern\b". BUT if I use the Encodings menu in Lister
and choose "ASCII\DOS (local codepage) 1", it looks like this: "\bфndern\b".
AND ONLY if I choose 1252 will I see what you see. BUT for that I MUST KNOW this fact BEFOREHAND!
Therefore, I repeat: to prevent misunderstandings about which line/word/character in this file
we are talking about, this file really should have been saved long ago either in UTF-8 + BOM or in UTF-16 LE + BOM (as for INI files).
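The three views described above can be reproduced with a short Python sketch. It assumes the file stores the word in code page 1252 and that the "ASCII\DOS (local codepage)" option on a Russian system corresponds to code page 866; both are assumptions made for illustration, not statements about how Lister works internally.

```python
# Bytes as they would be stored in an ANSI (code page 1252) file.
# The "ä" becomes the single byte 0xE4.
raw = r"\bändern\b".encode("cp1252")

# Decode the very same bytes under three different code pages.
views = {cp: raw.decode(cp) for cp in ("cp1252", "cp1251", "cp866")}

for cp, text in views.items():
    print(f"{cp}: {text}")
# cp1252: \bändern\b
# cp1251: \bдndern\b
# cp866: \bфndern\b
```

The bytes never change; only the reader's guess about the code page does, which is exactly the ambiguity a BOM-marked Unicode file removes.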
Re: UTF-8 for history.txt
2AntonyD
I agree with you, but why should one use "ASCII\DOS (local codepage) 1" for Windows programs?
Re: UTF-8 for history.txt
Character graphics can still be seen, and not only in old files.
Re: UTF-8 for history.txt
2AntonyD
"AND ONLY if I will choose 1252 - I will see what you see."
I also see "ändern" with code pages 1250, 1254, 1257 and 1258 when using the default Fixedsys (Western) font in Lister.
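The observation about code pages 1250, 1254, 1257 and 1258 can be checked at the code-page level with a small Python sketch. It only looks at the byte-to-character mapping; font effects (such as the Fixedsys font mentioned above) are not modelled, and the variable names are hypothetical.

```python
# The word as stored in a code page 1252 file ("ä" is byte 0xE4).
raw = r"\bändern\b".encode("cp1252")

# Windows code pages named in the post, plus 1252 as the reference.
codepages = ("cp1250", "cp1252", "cp1254", "cp1257", "cp1258")

decoded = {cp: raw.decode(cp) for cp in codepages}

# True if every listed code page maps the bytes to the same text.
agree = all(text == r"\bändern\b" for text in decoded.values())

for cp, text in decoded.items():
    print(cp, text)
print("all agree:", agree)
```

These code pages happen to place "ä" at the same byte position, so this particular word survives; other characters in the file may not.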
- ghisler(Author)
- Site Admin
- Posts: 50386
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
Re: UTF-8 for history.txt
I think that it's a good idea (UTF-8 with byte order mark), especially since the new Windows Notepad app is such a crappy implementation that it doesn't even correctly recognize ANSI text, although it's very easy to find invalid UTF-8 codes.
Author of Total Commander
https://www.ghisler.com
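As a rough illustration of why invalid UTF-8 is easy to spot, here is a minimal Python sketch. It is not how Total Commander or Notepad detect encodings; the function name `decode_history` and the cp1252 fallback are assumptions chosen for this example.

```python
import codecs
from typing import Tuple


def decode_history(data: bytes, ansi_codepage: str = "cp1252") -> Tuple[str, str]:
    """Return (text, encoding_used) for the given file bytes.

    Order of checks: UTF-8 BOM, then strict UTF-8, then the ANSI fallback.
    """
    # A UTF-8 BOM settles the question immediately.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    try:
        # Strict decoding fails on any byte sequence that is not valid UTF-8,
        # e.g. a lone 0xE4 followed by an ASCII letter.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Fallback: assume the caller's ANSI code page (a guess without a BOM).
        return data.decode(ansi_codepage, errors="replace"), ansi_codepage


print(decode_history(r"\bändern\b".encode("cp1252")))
print(decode_history(r"\bändern\b".encode("utf-8-sig")))
```

Without a BOM, the fallback code page remains a guess, which is the point made earlier in the thread about regional settings.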
Re: UTF-8 for history.txt
Moderator message from: white » 2024-02-06, 15:39 UTC
15 posts split to new thread: Use UTF-8 with byte-order mark when creating a new .txt file
Re: UTF-8 for history.txt
Not only Notepad. The issue was present in Lister too, if the default was non-Latin CP.
Thanks, fixed in RC5 History.txt.