
[9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2018-10-23, 17:13 UTC
by thomasmolover
When I use TC to browse the files in a compressed package whose filenames are stored in Unicode (for example, a package created on Linux), TC displays garbled characters. In any other compression app, the names are displayed correctly.

Here is the package:

https://drive.google.com/open?id=1Q--TliqT8H1QcgqkGd7FjFOlnYwEBDCu

Left is the correct display in 7z; right is the wrong display in TC:

https://imgur.com/a/XY0MaMf

Re: [9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2018-10-25, 09:33 UTC
by ghisler(Author)
Strange, I get the same names as 7zip here. Which language do you use for non-Unicode programs on your system?
Control panel -> Regional and language options - last tab - language for non-Unicode programs

Re: [9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2018-10-25, 11:32 UTC
by MVV
Since it is a ZIP archive, the problem is that there is no single standard for Unicode names...

However, in my TC and 7-Zip the names look like those in the left screenshot. I have Russian as the non-Unicode language. TC 9.21a 32 bit, 7-Zip 16.04.

Re: [9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2018-10-25, 13:12 UTC
by thomasmolover
ghisler(Author) wrote: 2018-10-25, 09:33 UTC Strange, I get the same names as 7zip here. Which language do you use for non-Unicode programs on your system?
Control panel -> Regional and language options - last tab - language for non-Unicode programs
I have it set to Simplified Chinese. All my friends who use Chinese have the same problem.

One of my friends told me he guesses that Unix stores the filenames in the archive as UTF-8 without BOM, while TC can only display names as ANSI (local code page) or UTF-16LE,
so it displays the UTF-8 (no BOM) names as ANSI.
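
To illustrate the guess above, here is a minimal Python sketch of what happens when UTF-8 name bytes are interpreted with a Chinese ANSI code page. The helper name show_as_ansi and the choice of "gbk" as the local code page are assumptions for illustration only; this is not how TC is implemented.

```python
def show_as_ansi(name: str, ansi_codec: str = "gbk") -> str:
    """Encode a name as UTF-8 (as a Unix tool might), then decode it
    with a local ANSI code page, the way a non-Unicode reader would."""
    raw = name.encode("utf-8")  # bytes stored in the archive
    # errors="replace" keeps the sketch from raising on byte pairs
    # that are not assigned in the ANSI code page.
    return raw.decode(ansi_codec, errors="replace")


original = "中文.txt"
garbled = show_as_ansi(original)
print(original, "->", garbled)
```

The ASCII part of the name survives, but the Chinese characters come out as unrelated characters, which matches the kind of garbling shown in the screenshot.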

Re: [9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2018-10-29, 15:46 UTC
by ghisler(Author)
ZIP has a special flag for Unicode names in its standard. This shouldn't happen if the ZIP file follows the ZIP standard. I will have to analyze the file in detail to find out what's wrong.
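
For anyone who wants to check an archive themselves: the flag in question is bit 11 (0x0800) of the general purpose bit flag in the ZIP headers. A small sketch using Python's standard zipfile module follows; the helper name utf8_flag_report and the in-memory sample archive are made up for this example.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x0800  # general purpose bit 11: names are UTF-8


def utf8_flag_report(zip_bytes: bytes) -> dict:
    """Map each entry name to True if its UTF-8 name flag is set."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_NAME_FLAG)
            for info in zf.infolist()
        }


# Build a tiny sample archive in memory; Python's zipfile sets the
# flag by itself when a name cannot be encoded as ASCII.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("plain.txt", "a")
    zf.writestr("中文.txt", "b")

for name, flagged in utf8_flag_report(buf.getvalue()).items():
    print(name, "UTF-8 flag set" if flagged else "UTF-8 flag NOT set")
```

To inspect a file on disk, pass its contents, for example utf8_flag_report(open(path, "rb").read()). An archive that stores UTF-8 names without this flag gives a reader no reliable hint about the name encoding.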

Re: [9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2018-10-29, 17:57 UTC
by Usher
2thomasmolover
It seems that you use different fonts in 7zip (what version?) and TC. Change fonts in TC, restart Windows and stop digging in fonts when testing software, please.

You can also read https://winaero.com/blog/rebuild-font-cache-windows-10/ or find similar hints for older Windows versions.

Re: [9.2x] TC display garbled characters Unicode filename in compressed package

Posted: 2019-02-27, 16:22 UTC
by ghisler(Author)
I have tested this archive: The UTF-8 flag is NOT set in the headers. The problem is that in Chinese, many byte sequences are valid in both the UTF-8 encoding and the Chinese ANSI encoding. Does anyone know how to reliably determine that it's UTF-8 and not local encoding? I know how to detect valid UTF-8, but this will also falsely detect many non-UTF-8 names as UTF-8.
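
A rough sketch of the "valid UTF-8" check and its weakness, in Python. This is only one possible heuristic, not TC's code and not a reliable solution; the function name decode_zip_name and the "gbk" fallback code page are assumptions for the example.

```python
def decode_zip_name(raw: bytes, utf8_flag: bool, local_codec: str = "gbk") -> str:
    """Decode a ZIP entry name.

    If the UTF-8 flag is set, trust it. Otherwise guess: try strict
    UTF-8 first and fall back to the local ANSI code page.
    """
    if utf8_flag:
        return raw.decode("utf-8", errors="replace")
    try:
        return raw.decode("utf-8")  # strict: raises on invalid UTF-8
    except UnicodeDecodeError:
        return raw.decode(local_codec, errors="replace")


# Works when the bytes are valid in only one of the two encodings:
print(decode_zip_name("中文".encode("utf-8"), utf8_flag=False))
print(decode_zip_name("中文".encode("gbk"), utf8_flag=False))

# The ambiguous case: these two bytes are valid UTF-8 (the copyright
# sign) and also a valid GBK character, so the guess may be wrong.
ambiguous = b"\xc2\xa9"
print(repr(ambiguous.decode("utf-8")), repr(ambiguous.decode("gbk")))
```

The last line shows the false-positive problem: without the flag, a name stored in the local code page can pass the UTF-8 validity check and be decoded incorrectly.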