Lister and UTF-8 option


menet
Member
Posts: 199
Joined: 2005-04-21, 12:27 UTC
Location: Paris, France

Lister and UTF-8 option

Post by *menet »

Hi,

TC does not automatically switch the Lister view to UTF-8 if the TXT file does not contain the UTF-8 signature (at the beginning of the file). :?

Today, to switch the view to UTF-8 for a file in Lister, we have to use Options / UTF-8 or press the 7 shortcut.

Could this shortcut be changed to 8 to make it more mnemonic? :?: Or could a second shortcut be added: 8 for the UTF-8 view? :twisted:

Another wish, though not so easy to implement: could we have a new option in Lister to use UTF-8 as the default for text files?

Best regards. :wink:
#22273 Personal licence
ghisler(Author)
Site Admin
Posts: 48005
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

How should TC know that a file is in UTF-8 format? If it contains only English text, it looks the same as ANSI text...
Author of Total Commander
https://www.ghisler.com
menet
Member
Posts: 199
Joined: 2005-04-21, 12:27 UTC
Location: Paris, France

Post by *menet »

ghisler(Author) wrote: How should TC know that a file is in UTF-8 format? If it contains only English text, it looks the same as ANSI text...
Hi Christian,
If the file has no UTF-8 signature, TC can't know that it should read it as UTF-8.

But I would like to have a special option to open text files (files that TC opens in "text only" format by default) in UTF-8 format by default. 8)
I am French and I use UTF-8 by default for all my text files (but without the UTF-8 signature) with the free text editor PSPad ( http://www.pspad.com/ ).
This makes no difference for English text, but it does for French text... :roll:
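
To illustrate the difference described here, a minimal Python sketch follows. It assumes that "ANSI" means the Windows-1252 code page (cp1252), which the thread does not state, and the sample strings are invented for illustration.

```python
# Assumption: "ANSI" here means Windows-1252 (cp1252).
english = "Hello world"
french = "Déjà vu, café"

# ASCII-only text produces exactly the same bytes in both encodings.
same_for_english = english.encode("utf-8") == english.encode("cp1252")

# Accented letters take two bytes in UTF-8 but one byte in cp1252,
# so the byte streams differ.
same_for_french = french.encode("utf-8") == french.encode("cp1252")

# Reading UTF-8 bytes with the ANSI code page gives the garbled view.
misread = "é".encode("utf-8").decode("cp1252")

print(same_for_english, same_for_french, misread)
```

So a viewer that defaults to ANSI shows English files correctly either way, while accented text in a signature-less UTF-8 file is displayed as two-character garbage per accented letter.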

What about also adding 8 as a shortcut for the UTF-8 format in Lister?

Regards :wink:
menet
Member
Posts: 199
Joined: 2005-04-21, 12:27 UTC
Location: Paris, France

Post by *menet »

Hi Christian,

It is possible that I have not understood your reply well. :?

Is my request for a special new option to read text files in UTF-8 format by default a stupid one? :?:
Would it cause a problem in some other view?

Best Regards :wink:
gigaman
Member
Posts: 131
Joined: 2003-02-14, 11:28 UTC

Post by *gigaman »

ghisler(Author) wrote: How should TC know that a file is in UTF-8 format? If it contains only English text, it looks the same as ANSI text...
For English text, it looks the same, so it also doesn't matter whether the lister is switched to ANSI or UTF-8 ;)
For non-English text, however, it should be possible to "guess" the format even when there's no signature in the file (for example, verify that all bytes >= 0x80 fall into valid UTF-8 sequences... maybe it would be good enough?).
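
The check suggested above could be sketched roughly as follows in Python. This only illustrates the idea and says nothing about how Lister works internally; the function name `looks_like_utf8` is hypothetical, and Python's strict UTF-8 decoder stands in for a hand-written check of the byte sequences.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Guess whether signature-less data is UTF-8 rather than ANSI text."""
    # Pure ASCII reads the same in both modes, so there is nothing to decide.
    if data.isascii():
        return False
    try:
        # The strict decoder raises on any byte >= 0x80 that is not part of
        # a well-formed UTF-8 sequence (stray continuation bytes, lead bytes
        # without continuations, overlong forms).
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, `looks_like_utf8("café".encode("utf-8"))` is true, while the same word encoded in cp1252 fails the check because the single byte 0xE9 is a lead byte with no continuation bytes after it.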
ghisler(Author)
Site Admin
Posts: 48005
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Currently lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.
menet
Member
Posts: 199
Joined: 2005-04-21, 12:27 UTC
Location: Paris, France

Post by *menet »

Hi Christian, you have not replied to my request for a special option to use UTF-8 as the default format for text files, without scanning the file. :roll:

Best Regards :wink:
gigaman
Member
Posts: 131
Joined: 2003-02-14, 11:28 UTC

Post by *gigaman »

ghisler(Author) wrote: Currently lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.
Right, scanning the whole file is not a good idea if the file is really big, but I think that a smaller block (32 kB?) can give quite a reliable result (if the number of 0x80+ bytes exceeds a certain limit, of course; the format of UTF-8 sequences is quite distinctive).
Maybe this "text format auto-detection" could be an optional feature (enabled/disabled in the Lister options). Detecting Unicode files (without a BOM signature) should be possible in a similar way.
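
A rough Python sketch of this block-based guess is shown below, under stated assumptions: the 32 kB block size comes from the post above, but the threshold of 4 high bytes and all names (`SAMPLE_SIZE`, `MIN_HIGH_BYTES`, `guess_utf8_from_sample`) are made up for illustration. An incremental decoder is used so that a multi-byte sequence cut off at the end of the block does not count as an error.

```python
import codecs

SAMPLE_SIZE = 32 * 1024  # block size floated in the post above
MIN_HIGH_BYTES = 4       # hypothetical threshold, not taken from the thread


def guess_utf8_from_sample(data: bytes,
                           sample_size: int = SAMPLE_SIZE,
                           min_high_bytes: int = MIN_HIGH_BYTES) -> bool:
    """Guess UTF-8 from the first block only, never from the whole file."""
    sample = data[:sample_size]

    # Too few bytes >= 0x80 means too little evidence for a guess.
    high_bytes = sum(1 for b in sample if b >= 0x80)
    if high_bytes < min_high_bytes:
        return False

    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False buffers an incomplete trailing sequence instead of
        # raising, so a character split at the block boundary is tolerated.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

For a file on disk, the sample could be read with `open(path, "rb").read(SAMPLE_SIZE)`, so only the first block is touched regardless of the file size. The guess can still be wrong if the first block happens to be pure ASCII, which is the risk raised in the quoted reply.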
now
Member
Posts: 181
Joined: 2006-11-01, 08:34 UTC

Post by *now »

ghisler(Author) wrote: Currently lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.
Well, one could argue that not scanning at all never leads to the correct result. I think people would rather the lister at least tried to make an educated guess, based on a small part of the file, than do nothing.

I can send you a program I wrote to determine the encoding of files at work (based on an algorithm found in the Unix utility "file"). It's written in Ruby, but it should be easy enough to follow even if you're not familiar with the language.
tommy0910
Junior Member
Posts: 37
Joined: 2004-07-08, 09:22 UTC

Re: Lister and UTF-8 option

Post by *tommy0910 »

Is there any news on this?

I don't need any autodetection. I'd just like lister to always start in mode "7" instead of mode "1".
Horst.Epp
Power Member
Posts: 6429
Joined: 2003-02-06, 17:36 UTC
Location: Germany

Re: Lister and UTF-8 option

Post by *Horst.Epp »

tommy0910 wrote: 2020-06-23, 21:15 UTC Is there any news on this?

I don't need any autodetection. I'd just like lister to always start in mode "7" instead of mode "1".
Install the CudaLister plugin.
You can set UTF-8 as the default for opening files, and it also has many advantages over the plain Lister.
The options are reached via the context menu of any open file.
https://totalcmd.net/plugring/CudaLister.html
Windows 11 Home x64 Version 23H2 (OS Build 22631.3296)
TC 11.03 x64 / x86
Everything 1.5.0.1371a (x64), Everything Toolbar 1.3.2, Listary Pro 6.3.0.69
QAP 11.6.3.2 x64
Hacker
Moderator
Posts: 13040
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Lister and UTF-8 option

Post by *Hacker »

tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7
HTH
Roman
Suppose you press Ctrl+F, select the FTP connection (with a saved password), but instead of clicking Connect, you drop dead.
tommy0910
Junior Member
Posts: 37
Joined: 2004-07-08, 09:22 UTC

Re: Lister and UTF-8 option

Post by *tommy0910 »

Thx :) Great!
amesh
New Member
Posts: 1
Joined: 2021-09-01, 09:47 UTC

Re: Lister and UTF-8 option

Post by *amesh »

Hacker wrote: 2020-06-24, 09:07 UTC tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7
HTH
Roman
Where is "Default" in the External Viewer settings? I cannot find it...
Image: https://diogenesfest.com/temp/TC-External-Viewer-settings.png
Stefan2
Power Member
Posts: 4124
Joined: 2007-09-13, 22:20 UTC
Location: Europa

Re: Lister and UTF-8 option

Post by *Stefan2 »

amesh wrote: 2021-09-01, 09:55 UTC
Hacker wrote: 2020-06-24, 09:07 UTC tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7
HTH
Roman
Where is "Default" in the External Viewer settings? I cannot find it...
Image: https://diogenesfest.com/temp/TC-External-Viewer-settings.png


It is the text box (edit control) next to "Default:".






 