Lister and UTF-8 option

menet · Post by *menet » 2006-11-08, 06:50 UTC

Hi,

TC does not automatically switch the Lister view to UTF-8 if the TXT file does not contain the UTF-8 signature (at the beginning of the file).
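For reference, the UTF-8 signature is the three-byte sequence EF BB BF at the start of the file. A minimal Python sketch of such a check follows; the function name `has_utf8_signature` is hypothetical and this is not a description of how TC implements it.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 signature (EF BB BF)."""
    # codecs.BOM_UTF8 is the constant b"\xef\xbb\xbf" from the standard library.
    return data.startswith(codecs.BOM_UTF8)
```

Only the first three bytes of the file are needed, so the check costs nothing even for very large files.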

Currently, to switch the view to UTF-8 for a file in Lister, we have to use Options / UTF-8 or the 7 shortcut.

Could this shortcut be changed to 8, to make it more mnemonic?

Or could a second shortcut be added: 8 for the UTF-8 view?

Another wish, though not as easy to implement: could Lister get a new option to use UTF-8 as the default for text files?

Best regards.

Post by *ghisler(Author) » 2006-11-08, 16:58 UTC

How should TC know that a file is in UTF8 format? If it contains only English text, it looks the same as ANSI text...
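The point can be illustrated with a short Python sketch. It assumes that "ANSI" means the Windows-1252 code page, which is an assumption for illustration (the actual ANSI code page depends on the system locale); the sample strings are made up.

```python
english = "Hello, world"
french = "école"

# Pure ASCII text encodes to the same bytes in UTF-8 and in Windows-1252,
# so nothing in the file reveals which of the two was intended.
english_same = english.encode("utf-8") == english.encode("cp1252")

# An accented letter is one byte in Windows-1252 but two bytes in UTF-8.
french_same = french.encode("utf-8") == french.encode("cp1252")

# What an ANSI view shows when it is given the UTF-8 bytes of the French text.
misread = french.encode("utf-8").decode("cp1252")

print(english_same, french_same, misread)
```

Only text with characters outside the ASCII range gives a viewer anything to base a guess on.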

menet · Post by *menet » 2006-11-08, 19:11 UTC

ghisler(Author) wrote: How should TC know that a file is in UTF8 format? If it contains only English text, it looks the same as ANSI text...

Hi Christian,
If the file has no UTF-8 signature, TC can't know that it should read it as UTF-8.

But I would like to have a special option to open text files (files that TC opens in "text only" format by default) as UTF-8 by default.

I am French and I use UTF-8 by default for all my text files (but without the UTF-8 signature) with the free PSPad text editor ( http://www.pspad.com/ ).
This makes no difference for English text, but it does for French text...

What about also adding 8 as a shortcut for the UTF-8 format in Lister?

Regards

menet · Post by *menet » 2006-11-25, 08:40 UTC

Hi Christian,

It is possible that I did not understand your reply well.

Is my request for a special new option to read text files as UTF-8 by default a stupid one?

Would it cause a problem in some other view?

Best Regards

gigaman · Post by *gigaman » 2006-11-25, 23:51 UTC

ghisler(Author) wrote: How should TC know that a file is in UTF8 format? If it contains only English text, it looks the same as ANSI text...

For English text, it looks the same, so it also doesn't matter whether Lister is switched to ANSI or UTF-8.

For non-English text, however, it should be possible to "guess" the format even when there's no signature in the file (for example, verify that all bytes >= 0x80 fall into valid UTF-8 sequences... maybe it would be good enough?).
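A minimal Python sketch of the check suggested here, assuming the whole content is available as bytes. The function name `looks_like_utf8` is hypothetical, and Python's strict UTF-8 decoder stands in for a hand-written byte-sequence validator.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data has bytes >= 0x80 and all of them form valid UTF-8."""
    if data.isascii():
        # No byte >= 0x80 at all: UTF-8 and ANSI are indistinguishable here.
        return False
    try:
        # Strict decoding raises on any byte that does not fit a valid sequence.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

ANSI text with accented letters almost always fails this test, because a lone byte such as 0xE9 must be followed by continuation bytes (0x80 to 0xBF) to be valid UTF-8.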

Post by *ghisler(Author) » 2006-11-26, 13:49 UTC

Currently Lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.

menet · Post by *menet » 2006-11-26, 20:46 UTC

Hi Christian, you have not replied to my request for a special option to use UTF-8 as the default for text files, without doing any scan of the file.

Best Regards

gigaman · Post by *gigaman » 2006-11-26, 22:48 UTC

ghisler(Author) wrote: Currently Lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.

Right, scanning the whole file is not a good idea if the file is really big, but I think that a smaller block (32 kB?) can give quite a reliable result (provided the number of 0x80+ bytes exceeds a certain limit, of course; the format of UTF-8 sequences is quite distinctive).
Maybe this "text format auto-detection" could be an optional feature (enabled/disabled in the Lister options). Detecting Unicode files (without a BOM signature) should be possible in a similar way.
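The block-based idea could be sketched in Python as follows. The names `sample_is_utf8` and `file_is_utf8` and the threshold `MIN_HIGH_BYTES = 8` are assumptions made for illustration; only the 32 kB block size comes from the post.

```python
import codecs
from typing import Optional

SAMPLE_SIZE = 32 * 1024  # block size suggested in the post
MIN_HIGH_BYTES = 8       # hypothetical threshold, not from the thread


def sample_is_utf8(sample: bytes, is_whole_file: bool,
                   min_high_bytes: int = MIN_HIGH_BYTES) -> Optional[bool]:
    """Return True or False for a guess, or None if there is too little evidence."""
    high_bytes = sum(1 for b in sample if b >= 0x80)
    if high_bytes < min_high_bytes:
        return None
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # With final=False, a multi-byte sequence cut off at the block edge
        # is buffered instead of being reported as an error.
        decoder.decode(sample, final=is_whole_file)
    except UnicodeDecodeError:
        return False
    return True


def file_is_utf8(path: str, sample_size: int = SAMPLE_SIZE) -> Optional[bool]:
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as f:
        # Reading one extra byte tells us whether the file continues past the block.
        sample = f.read(sample_size + 1)
    return sample_is_utf8(sample[:sample_size], len(sample) <= sample_size)
```

Returning `None` for "not enough evidence" lets the caller fall back to the current default view instead of forcing a guess from one or two stray bytes.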

now · Post by *now » 2006-11-27, 07:45 UTC

ghisler(Author) wrote: Currently Lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.

Well, one could argue that not scanning at all never leads to the correct result. I think people would rather Lister at least tried to make an educated guess, based on a small part of the file, than that it did nothing.

I can send you a program I wrote to determine the encoding of files at work (based on an algorithm found in the Unix utility "file"). It's written in Ruby, but it should be easy enough to follow even if you're not familiar with the language.

tommy0910 · Post by *tommy0910 » 2020-06-23, 21:15 UTC

Is there any news on this?

I don't need any autodetection. I'd just like lister to always start in mode "7" instead of mode "1".

Horst.Epp · Post by *Horst.Epp » 2020-06-24, 07:57 UTC

tommy0910 wrote: 2020-06-23, 21:15 UTC Is there any news on this?

I don't need any autodetection. I'd just like lister to always start in mode "7" instead of mode "1".

Install the CudaLister plugin.
You can set UTF-8 as default for opening files and it also has many advantages compared to pure Lister.
The options are reached by the context menu in any open file.
https://totalcmd.net/plugring/CudaLister.html

Post by *Hacker » 2020-06-24, 09:07 UTC

tommy0910,

I'd just like lister to always start in mode "7" instead of mode "1".

Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7

HTH
Roman

tommy0910 · Post by *tommy0910 » 2020-06-24, 19:15 UTC

Thx

Great!

amesh · Post by *amesh » 2021-09-01, 09:55 UTC

Hacker wrote: 2020-06-24, 09:07 UTC tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:
Code: Select all
%COMMANDER_EXE% /S=L:T7
HTH
Roman

Where is "Default" in the External Viewer settings? I can't find it...
Image: https://diogenesfest.com/temp/TC-External-Viewer-settings.png

Stefan2 · Post by *Stefan2 » 2021-09-01, 12:13 UTC

amesh wrote: 2021-09-01, 09:55 UTC
Hacker wrote: 2020-06-24, 09:07 UTC tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:
Code: Select all
%COMMANDER_EXE% /S=L:T7
HTH
Roman
Where is "Default" in the External Viewer settings? I can't find it...
Image: https://diogenesfest.com/temp/TC-External-Viewer-settings.png

It is the text box (edit control) after the "Default:" label.
