LOADLIST can't load from utf8 with BOM

English support forum

Moderators: Hacker, petermad, Stefan2, white

brucmao
Junior Member
Posts: 10
Joined: 2024-10-21, 02:58 UTC

LOADLIST can't load from utf8 with BOM

Post by *brucmao »

This is my custom command:

Code:

[em_LoadMyList]
cmd=LOADLIST %COMMANDER_PATH%\User\MyList.txt
MyList.txt (UTF-8 with BOM)

Image: https://s3.bmp.ovh/imgs/2024/12/16/cd57d56935f678ce.png
When MyList.txt contains Chinese characters and is encoded as UTF-8 with BOM, running the em_LoadMyList custom command fails with this error:

< Error! >
File not found! (2x)

Image: https://s3.bmp.ovh/imgs/2024/12/17/aca4e328d4e31ae2.png
The strange thing is that only some paths containing Chinese characters will cause an error, such as the following paths:
c:\work\色标库\
c:\work\柯坪县\
Last edited by brucmao on 2024-12-17, 01:47 UTC, edited 2 times in total.
white
Power Member
Posts: 5747
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Re: LOADLIST can't load from utf8 with BOM

Post by *white »

brucmao wrote: 2024-12-16, 15:37 UTC The strange thing is that only some paths containing Chinese characters will cause an error, such as the following paths:
c:\work\色标库\
c:\work\柯坪县\
Not confirmed. I tried these folders and your em_ command.
Which TC version are you using? Do you have access to the folders? Perhaps there are some hidden characters?
brucmao
Junior Member
Posts: 10
Joined: 2024-10-21, 02:58 UTC

Re: LOADLIST can't load from utf8 with BOM

Post by *brucmao »

white wrote: 2024-12-16, 16:37 UTC
brucmao wrote: 2024-12-16, 15:37 UTC The strange thing is that only some paths containing Chinese characters will cause an error, such as the following paths:
c:\work\色标库\
c:\work\柯坪县\
Not confirmed. I tried these folders and your em_ command.
Which TC version are you using? Do you have access to the folders? Perhaps there are some hidden characters?
Total Commander Version 11.03 64 bit (2024-02-21)
If I convert the encoding of the MyList.txt file to GB2312, it works well.
brucmao
Junior Member
Posts: 10
Joined: 2024-10-21, 02:58 UTC

Re: LOADLIST can't load from utf8 with BOM

Post by *brucmao »

It works fine after I make the following change:

Go to Windows Settings > Time & language > Language & region > Administrative language settings > Change system locale, and check "Beta: Use Unicode UTF-8 for worldwide language support".
beb
Power Member
Posts: 579
Joined: 2009-09-20, 08:03 UTC
Location: Odesa, Ukraine

Re: LOADLIST can't load from utf8 with BOM

Post by *beb »

brucmao wrote: 2024-12-17, 02:57 UTC Windows Settings > Time & language > Language & region > Administrative language settings > Change system locale, and check Beta: Use Unicode UTF-8 for worldwide language support.
This may affect your other applications.
What if you convert your MyList.txt file to Unicode: UTF-16 LE (1200)?
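A minimal sketch of such a conversion, in Python rather than any tool mentioned in this thread. The function name `to_utf16le_with_bom` and the sample data are hypothetical; the only assumption about the input is that it is UTF-8, with or without a BOM.

```python
import codecs


def to_utf16le_with_bom(data: bytes, source_encoding: str = "utf-8-sig") -> bytes:
    """Re-encode list-file bytes as UTF-16 LE with a leading FF FE BOM."""
    # "utf-8-sig" decodes UTF-8 and drops a leading EF BB BF if one is present.
    text = data.decode(source_encoding)
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


# Hypothetical sample: one list entry stored as UTF-8 with BOM.
sample = codecs.BOM_UTF8 + "c:\\work\\色标库\\\r\n".encode("utf-8")
converted = to_utf16le_with_bom(sample)

# Show the first two bytes of the result in hex.
print(converted[:2].hex(" "))
```

To convert a real file, read it with `Path.read_bytes()`, pass the bytes through the function, and write the result back with `Path.write_bytes()`.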
#278521 User License
Total Commander [always the latest version, including betas] x86/x64 on Win10 x64/Android 10/15
brucmao
Junior Member
Posts: 10
Joined: 2024-10-21, 02:58 UTC

Re: LOADLIST can't load from utf8 with BOM

Post by *brucmao »

beb wrote: 2024-12-17, 03:28 UTC
brucmao wrote: 2024-12-17, 02:57 UTC Windows Settings > Time & language > Language & region > Administrative language settings > Change system locale, and check Beta: Use Unicode UTF-8 for worldwide language support.
This may affect your other applications.
What if you convert your MyList.txt file to Unicode: UTF-16 LE (1200)?
I suspect that the LOADLIST command uses the system’s default encoding when executed. I use AutoHotkey to dynamically append the selected file path to MyList.txt (or delete paths from MyList.txt that match the selected file), but LOADLIST cannot specify the encoding when reading.
My current solution is to explicitly specify the encoding as CP0 when reading and writing with AutoHotkey.
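The append-or-delete logic described above could be sketched as follows. This is Python, not the AutoHotkey script actually in use (which was not posted), and every name in it is hypothetical. `cp936` is assumed here as a stand-in for what CP0 resolves to on a Simplified Chinese system, and entries are compared by exact string match, which ignores the case-insensitivity of Windows paths.

```python
import tempfile
from pathlib import Path

# Assumption: CP0 (system ANSI code page) resolves to code page 936 on this system.
ANSI_ENCODING = "cp936"


def toggle_path(list_file: Path, path: str, encoding: str = ANSI_ENCODING) -> list[str]:
    """Remove `path` from the list if present, otherwise append it.

    The file is read and written with the same explicit encoding and
    the resulting list of entries is returned.
    """
    entries: list[str] = []
    if list_file.exists():
        raw = list_file.read_bytes().decode(encoding)
        entries = [line for line in raw.splitlines() if line]
    if path in entries:
        entries = [entry for entry in entries if entry != path]
    else:
        entries.append(path)
    list_file.write_bytes("".join(entry + "\r\n" for entry in entries).encode(encoding))
    return entries


# Demo in a temporary directory: the first call adds the path, the second removes it.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "MyList.txt"
    after_add = toggle_path(demo, "c:\\work\\色标库\\")
    after_remove = toggle_path(demo, "c:\\work\\色标库\\")

print(after_add, after_remove)
```

Passing the encoding explicitly on both the read and the write side is the point of the sketch: the file never changes encoding between toggles.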
ghisler(Author)
Site Admin
Posts: 50390
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland
Contact:

Re: LOADLIST can't load from utf8 with BOM

Post by *ghisler(Author) »

I have just checked the source code: LOADLIST supports
- UTF-8 with BOM: EF BB BF
- UTF-16 with BOM
- ANSI using current encoding

It does NOT support UTF-8 without BOM as created by newer Notepad versions because they can be ambiguous.
Just save as UTF-8 with BOM.

I have tested it on Windows 11 with Western encoding and UTF-8 with BOM works also with Chinese.
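The three cases listed above can be illustrated with a small Python sketch. This is not Total Commander's source code, only a reading of the rules as stated in this post. The function name, the `cp1252` default for "current encoding", and the handling of a big-endian UTF-16 BOM are all assumptions.

```python
import codecs


def decode_list_file(data: bytes, ansi_encoding: str = "cp1252") -> str:
    """Decode list-file bytes by BOM, falling back to an ANSI code page."""
    if data.startswith(codecs.BOM_UTF8):  # EF BB BF
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF (assumed to be accepted too)
        return data[2:].decode("utf-16-be")
    # No BOM: the bytes are taken as ANSI, so BOM-less UTF-8 is misread.
    return data.decode(ansi_encoding, errors="replace")


entry = "c:\\work\\柯坪县\\"
with_bom = decode_list_file(codecs.BOM_UTF8 + entry.encode("utf-8"))
without_bom = decode_list_file(entry.encode("utf-8"))

print(with_bom == entry)     # the BOM marks the file as UTF-8
print(without_bom == entry)  # without it, the same bytes are read as ANSI
```

The second call shows why a BOM-less UTF-8 file ends up with paths that do not exist on disk: the non-ASCII bytes are decoded with the wrong code page.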
Author of Total Commander
https://www.ghisler.com
white
Power Member
Posts: 5747
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Re: LOADLIST can't load from utf8 with BOM

Post by *white »

brucmao wrote: 2024-12-17, 05:13 UTC I suspect that the LOADLIST command uses the system’s default encoding when executed. I use AutoHotkey to dynamically append the selected file path to MyList.txt (or delete paths from MyList.txt that match the selected file), but LOADLIST cannot specify the encoding when reading.
My current solution is to explicitly specify the encoding as CP0 when reading and writing with AutoHotkey.
It should work with utf-8 with BOM. Perhaps your file did not contain the BOM. Note that when appending to an existing file without BOM, the BOM won't be added.
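A hedged sketch of an append routine that guards against this, written in Python rather than AutoHotkey. The helper name `append_with_bom` is hypothetical, and the sketch assumes that any existing BOM-less content is already valid UTF-8.

```python
import codecs
import tempfile
from pathlib import Path


def append_with_bom(list_file: Path, path: str) -> None:
    """Append `path` to a UTF-8 list file, adding the BOM if it is missing."""
    existing = list_file.read_bytes() if list_file.exists() else b""
    if not existing.startswith(codecs.BOM_UTF8):
        # Assumption: BOM-less existing content is UTF-8, so prefixing is enough.
        existing = codecs.BOM_UTF8 + existing
    # Make sure the previous entry is terminated before adding a new one.
    if len(existing) > len(codecs.BOM_UTF8) and not existing.endswith(b"\n"):
        existing += b"\r\n"
    list_file.write_bytes(existing + path.encode("utf-8") + b"\r\n")


# Demo: start from a BOM-less UTF-8 file, then append one entry.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "MyList.txt"
    demo.write_bytes("c:\\work\\北边.png\r\n".encode("utf-8"))
    append_with_bom(demo, "c:\\work\\柯坪县\\")
    result = demo.read_bytes()

print(result[:3].hex(" "))
```

Rewriting the whole file on each append is a simplification that is reasonable for a short list; a plain append-mode write would leave a BOM-less file BOM-less.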
brucmao
Junior Member
Posts: 10
Joined: 2024-10-21, 02:58 UTC

Re: LOADLIST can't load from utf8 with BOM

Post by *brucmao »

ghisler(Author) wrote: 2024-12-17, 08:34 UTC I have just checked the source code: LOADLIST supports
- UTF-8 with BOM: EF BB BF
- UTF-16 with BOM
- ANSI using current encoding

It does NOT support UTF-8 without BOM as created by newer Notepad versions because they can be ambiguous.
Just save as UTF-8 with BOM.

I have tested it on Windows 11 with Western encoding and UTF-8 with BOM works also with Chinese.
I created a TXT file using Sublime Text with UTF-8 with BOM encoding, and I tested the following three paths:

C:\work\色标库
C:\work\柯坪县
c:\work\北边.png

1. When the current system active code page is 936, the first two paths fail, but the last one loads successfully.
2. When I change the system active code page to 65001 (by selecting “Use Unicode UTF-8 for worldwide language support”), all three paths load successfully.

I had other colleagues who use Total Commander test it, and they got the same result.
Usher
Power Member
Posts: 1726
Joined: 2011-03-11, 10:11 UTC

Re: LOADLIST can't load from utf8 with BOM

Post by *Usher »

Can you test lowercase c: in all paths rather than uppercase C:?
Andrzej P. Wozniak
Polish subforum moderator
brucmao
Junior Member
Posts: 10
Joined: 2024-10-21, 02:58 UTC

Re: LOADLIST can't load from utf8 with BOM

Post by *brucmao »

2 Usher
Same result with lowercase c: in all paths.
ghisler(Author)
Site Admin
Posts: 50390
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland
Contact:

Re: LOADLIST can't load from utf8 with BOM

Post by *ghisler(Author) »

I could reproduce and fix it, the problem was code page 936. It worked with single byte code pages like Western or Cyrillic, but not with multi-byte code pages.
Author of Total Commander
https://www.ghisler.com
beb
Power Member
Posts: 579
Joined: 2009-09-20, 08:03 UTC
Location: Odesa, Ukraine

Re: LOADLIST can't load from utf8 with BOM

Post by *beb »

2brucmao

Code:

C:\work\色标库
C:\work\柯坪县
c:\work\北边.png
1. I took that list and created those folders and the file on my PC.
2. I made LOADLIST.txt with those contents in two encodings:
2.1. LOADLIST.txt UTF-8 BOM (begins with the bytes "EF BB BF", the BOM for that encoding, as seen when viewing the file in Lister as hex)
2.2. LOADLIST.txt UTF-16LE BOM (begins with the bytes "FF FE", the BOM for that encoding, as seen when viewing the file in Lister as hex)
3. I reproduced your command in my TC:

Code:

[em_LoadMyList]
cmd=LOADLIST %COMMANDER_PATH%\User\MyList.txt
My "Language for non-Unicode programs" is Ukrainian [nothing to do with the Chinese], "Use Unicode UTF-8 for worldwide language support" option is inactive [unchecked].

4. I ran your command on my PC and it works as intended, both with LOADLIST.txt UTF-8 BOM and with LOADLIST.txt UTF-16LE BOM.

Please check whether your LOADLIST.txt file actually starts with a BOM.
You can also try my files:
https://workupload.com/archive/y5DSfYdubj
LOADLIST_UTF8_BOM.zip (174 bytes)
LOADLIST_UTF16LE_BOM.zip (176 bytes)
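For anyone who prefers to build equivalent test files locally, here is a rough Python sketch. The `.txt` file names are hypothetical (modeled on the archive names above), and the files are written to a temporary directory just to show the leading bytes.

```python
import codecs
import tempfile
from pathlib import Path

# The three test paths from this thread, one per line with CRLF endings.
PATHS = ["C:\\work\\色标库", "C:\\work\\柯坪县", "c:\\work\\北边.png"]
text = "".join(path + "\r\n" for path in PATHS)

# Same text in both encodings, each with its BOM written explicitly.
variants = {
    "LOADLIST_UTF8_BOM.txt": codecs.BOM_UTF8 + text.encode("utf-8"),
    "LOADLIST_UTF16LE_BOM.txt": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
}

leading = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, payload in variants.items():
        target = Path(tmp) / name
        target.write_bytes(payload)
        # Read the file back and keep its first three bytes as upper-case hex.
        leading[name] = target.read_bytes()[:3].hex(" ").upper()

for name, first_bytes in leading.items():
    print(name, first_bytes)
```

The hex values printed here correspond to what Lister shows at the start of each file in hex view.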

Edit.
ghisler(Author) wrote: 2024-12-18, 08:55 UTC I could reproduce and fix it, the problem was code page 936. It worked with single byte code pages like Western or Cyrillic, but not with multi-byte code pages.
Oh. Regarding this, my message becomes useless. Sorry.
#278521 User License
Total Commander [always the latest version, including betas] x86/x64 on Win10 x64/Android 10/15