File names inside .zip archives under Chinese locale on Windows 7 RC
Windows 7 RC, English version.
TC 7.50 Beta 3, English language only.
After installation everything works fine.
I then changed the locale for non-Unicode programs to Chinese (Control Panel - Region & Language - Administrative - Change system locale) and restarted the computer as requested.
Problem: non-English file names inside some .zip files are shown as Chinese characters. In addition, TC and WinRAR show different names for the same files inside the archive.
Even worse: some file names are merged with their extensions!
I tried changing the font and the font script (TC - Configuration - Options - Font), without success.
Here is a screenshot (300 KB) of how it should look (I used the Cyrillic font script):
http://img32.imageshack.us/img32/1936/totalcmdrussianlocalezi.png
Here is a screenshot (320 KB) showing the problem:
http://img32.imageshack.us/img32/6571/totalcmdchineselocalezi.png
One more strange thing: when I open this .zip file with Windows Explorer's built-in zip support I see 33 files, but TC and WinRAR both show 37. Maybe it's a bug in Windows 7 itself?
PS. I don't want to blame the author of the plugin shown in the screenshot. I just wanted an example that is easy for everybody to check.
- ghisler(Author)
- Site Admin
- Posts: 48107
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
This isn't a bug - the ZIP format simply doesn't store the information in which locale the names were created! The Explorer will have similar problems.
If you want to send a zip file with (e.g. European) accents to someone with Chinese Windows, you should check the following option when packing:
Configuration - Options - ZIP - Store all names with non-English characters in extra field.
Total Commander 7.5 and WinZip can handle such extra fields.
Author of Total Commander
https://www.ghisler.com
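To illustrate the point that the ZIP format does not record which locale a name was created in, here is a minimal Python sketch. It assumes the name was written in code page 1251 (Cyrillic) and read back as GBK (code page 936). The file name is only an example, and the archive discussed in this thread may have used a different code page.

```python
# The bytes a packer might store for a Cyrillic name (ASSUMED code page: cp1251).
original = "Русский.lng"
raw = original.encode("cp1251")

# A reader on a Russian-locale system guesses the same code page and recovers the name.
as_russian = raw.decode("cp1251")

# A reader on a Chinese-locale system guesses GBK (code page 936) for the same bytes.
as_chinese = raw.decode("gbk", errors="replace")

print(as_russian == original)  # True
print(as_chinese == original)  # False
```

The bytes are identical in both cases; only the reader's guess about the encoding differs, which is why storing the names a second time in a Unicode form removes the ambiguity.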
Thank you for the information. Warning about file names noted.
But the problem with extensions remains!
As you can see, for files with Latin characters the extension was not changed.
For files with non-English characters, some (?) extensions are merged with the file names.
The reason, surely, is that the dot before the extension was changed to something else, but only for non-English file names!
WinRAR changed it to "?", while TC changed it to "_".
PS. In this case "Configuration - Options - ZIP - Store all names with non-English characters in extra field" should be the default option, shouldn't it?
- ghisler(Author)
That's normal: TC stores characters that cannot be represented in the local encoding as '?'. TC shows and unpacks such characters as '_' because file names cannot contain '?'.
As for whether "Configuration - Options - ZIP - Store all names with non-English characters in extra field" should be the default option: no, because it takes more space to store the names twice, and the problem doesn't affect people who do not send zips to people in countries with a different encoding.
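The '?' and '_' substitution described above can be sketched in Python as follows. The helper names `store_name` and `display_name` are hypothetical and only mirror the behaviour described in the post; this is not Total Commander's actual code, and cp1251 is an assumed local encoding.

```python
def store_name(name: str, encoding: str) -> bytes:
    """Encode a name for the archive; unencodable characters become b'?'."""
    return name.encode(encoding, errors="replace")


def display_name(raw: bytes, encoding: str) -> str:
    """Decode a stored name and show '?' as '_', since '?' is not allowed in file names."""
    return raw.decode(encoding).replace("?", "_")


stored = store_name("Español.lng", "cp1251")  # 'ñ' does not exist in cp1251
shown = display_name(stored, "cp1251")
print(stored)  # b'Espa?ol.lng'
print(shown)   # Espa_ol.lng
```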
I understand now why the characters appear differently in TC and WinRAR.
But, again: in some cases TC changed the dot separating name and extension to "_", and in others it did not. I think this is incorrect.
1. The dot character exists in every font and encoding table (at least it should).
2. Extensions contain only English letters.
3. They are shown correctly in other file names.
4. Last argument: if the extension is missing, how will Windows determine the file type?
So here is my opinion: unreadable characters may be replaced with something readable in the current locale, but dots should not be changed.
- ghisler(Author)
I haven't seen such a problem yet. Can you post the names of these files here (as text, not image) so I can create them via Shift+F4 (copy+paste the name) and try to reproduce the problem? Thanks!
Sure!
Here they are, all from the screenshot, sorted by extension:
仴嚆後_lng
侁犩岐_lng
愩後_lng
拋槈_lng
pluginst.inf
Brazilian Portuguese.lng
Cesky.lng
Chinese.lng
Croatian.lng
Dansk.lng
Deutsch.lng
English.lng
Espa醥l.lng
Francais.lng
Hellenic.lng
Hrvatski.lng
Italiano.lng
Korean.lng
Magyar.lng
Nederlands.lng
Norsk.lng
Polski.lng
Portuguese (Portugal).lng
Romanian.lng
Serbian.lng
Slovenscina.lng
Slovensky.lng
Svenska.lng
Taiwanese.lng
Turkish.lng
悌狅.lng
摢酄醐犰獱.lng
Changes.txt
Languages.txt
license.txt
readme.txt
CADView.wlx
- ghisler(Author)
Are these the correct names, with an underscore _ instead of the dot before the extension in some of the Chinese names? Or is this the list as Total Commander shows it?
This is the list as Total Commander shows it. I got the names from TC with:
Ctrl-A -> Alt-M -> Copy selected names to clipboard -> pasted into the forum
The correct list can be seen in the first screenshot linked in my first post. That's how it should be:
The two file names containing Chinese characters, which are absent from the font in use, are shown with readable replacement characters in the current locale, but the extensions are left unchanged.
P.S. Or you can download the file (cadview.zip, 1.16 MB, a plugin for TC) and check for yourself to be 100% sure of the names.
ghisler(Author) wrote: No, because it takes more space to store the names twice, and the problem doesn't affect people who do not send zips to people in countries with different encoding.
Let me disagree with you. Here are my arguments:
1. Space for storing names, are you kidding? 50 more bytes? Less? Who cares about bytes nowadays? Megabytes matter (well, I can agree on hundreds of kilobytes), but bytes?
2. Now seriously. Since TC is declared as having full Unicode support, all Unicode options should be on by default. Anybody who thinks otherwise can choose not to use Unicode.
3. The option is called "Store all names with non-English characters in extra field", which means that for English names there will be no extra field. As far as I understand, this option affects exactly those who need it: people who use non-English names. There are lots of them all over the world! And they'll get full Unicode support out of the box, which will surely make them happy.
- ghisler(Author)
I see. Changing the TC pack options wouldn't have any influence on third-party archives anyway; who knows what was used to create the archive. The only solution would be to let the user choose the encoding of an archive. It's on my to-do list, but I don't currently know where to put the necessary button. The user interface is already more than full...
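As a rough illustration of what "letting the user choose the encoding" could look like, here is a small Python sketch. It assumes a reader that decoded the stored name as cp437 (which is what Python's standard `zipfile` module does for names without the UTF-8 flag) and then re-decodes the same bytes with an encoding picked by the user. The helper name `redecode` and the cp866 source encoding are assumptions for the example, not something taken from Total Commander.

```python
def redecode(name_from_cp437: str, chosen_encoding: str) -> str:
    """Undo a cp437 decode and apply the encoding the user picked instead."""
    raw = name_from_cp437.encode("cp437")  # recover the original stored bytes
    return raw.decode(chosen_encoding, errors="replace")


# Simulate what a cp437-based reader would show for a name packed under cp866.
mangled = "Русский.lng".encode("cp866").decode("cp437")
print(redecode(mangled, "cp866") == "Русский.lng")  # True
```

This works because cp437 maps every byte value to a character, so the original bytes can always be recovered before trying another encoding.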
- ghisler(Author)
I have checked it now: TC handles it correctly. In the Chinese locale, the character before the dot is the first byte of a 2-byte character, so the dot is treated as its second byte. The only solution here is for the plugin author to change the encoding of the file names (UTF-8 in extra field), or for the user to be able to choose the encoding somehow.
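The effect described above, where a lead byte swallows the dot, can be simulated with a short Python sketch. The loop below is a hand-written approximation that always pairs a GBK lead byte with the byte after it, as the post describes. It is not Total Commander's or Windows' actual code, and cp1251 is only an assumed source encoding for the example names.

```python
def is_lead_byte(b: int) -> bool:
    # GBK (code page 936) lead bytes lie in the range 0x81..0xFE.
    return 0x81 <= b <= 0xFE


def decode_pairing(raw: bytes) -> str:
    """Decode raw bytes, always pairing a lead byte with the byte after it."""
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:
            out.append(chr(b))  # plain ASCII, e.g. the dot
            i += 1
        elif is_lead_byte(b) and i + 1 < len(raw):
            try:
                out.append(raw[i:i + 2].decode("gbk"))
            except UnicodeDecodeError:
                out.append("?")  # invalid pair: both bytes are lost
            i += 2
        else:
            out.append("?")  # stray high byte
            i += 1
    # Show '?' as '_' because '?' is not allowed in file names.
    return "".join(out).replace("?", "_")


odd = decode_pairing("Русский.lng".encode("cp1251"))      # 7 high bytes before the dot
even = decode_pairing("Українська.lng".encode("cp1251"))  # 10 high bytes before the dot
print(odd.endswith("_lng"), "." in odd)  # True False
print(even.endswith(".lng"))             # True
```

With an odd number of high bytes before the dot, the last one pairs with the dot and the extension separator disappears; with an even number, the dot survives. This matches the pattern in the posted list, where only some names end in "_lng".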