[BUG] Internal ZIP packer corrupts special chars in names

English support forum

Moderators: white, Hacker, petermad, Stefan2

jb
Senior Member
Posts: 412
Joined: 2003-02-09, 22:56 UTC
Location: Switzerland

Post by *jb »

The internal ZIP packer of TC 5.50 corrupts special characters in file names (e.g. ®, ´, ³). Such characters sometimes occur in the names of web browser bookmarks.
WinRAR V3.11 has the same flaw, but WinZip V8.0 does it right.
ghisler(Author)
Site Admin
Posts: 48088
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Total Commander follows the pkzip standard, which uses the OEM (DOS) charset to store file names. It uses CharToOem for the conversion, which loses some special characters. WinZip simply ignores the Zip standard and stores the file names using the Windows charset...
Author of Total Commander
https://www.ghisler.com
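
The character loss described above can be illustrated with a small Python sketch. It is only an approximation: it assumes code page 437 as the OEM charset and code page 1252 as the Windows (ANSI) charset, whereas the real code pages depend on the system locale, and CharToOem may substitute a similar-looking character instead of failing. The helper `lost_chars` and the sample file name are hypothetical.

```python
def lost_chars(name, codepage):
    """Return the characters of `name` that `codepage` cannot represent."""
    lost = []
    for ch in name:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


# Hypothetical bookmark name containing the characters from the report.
name = "Caf\u00e9 \u00ae \u00b4 \u00b3.url"  # "Café ® ´ ³.url"

# Assumption: cp437 stands in for the OEM (DOS) charset.
print("not representable in cp437: ", lost_chars(name, "cp437"))
# Assumption: cp1252 stands in for the Windows (ANSI) charset.
print("not representable in cp1252:", lost_chars(name, "cp1252"))
```

Under these assumptions, ®, ´ and ³ have no slot in the OEM code page, while the ANSI code page holds all of them, which matches the behaviour reported for TC versus WinZip.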
jb
Senior Member
Posts: 412
Joined: 2003-02-09, 22:56 UTC
Location: Switzerland

Post by *jb »

ghisler(Author) wrote: Total Commander follows the pkzip standard, which uses the OEM (DOS) charset to store file names. It uses CharToOem for the conversion, which loses some special characters. WinZip simply ignores the Zip standard and stores the file names using the Windows charset...
What's the point in sticking to the ancient OEM (DOS) charset? DOS is over. It's time for Unicode.
By 'Windows charset', do you mean the ANSI character set?
It seems that the WinZip way leads to corruption only when unzipping on a non-Windows platform, whereas the pkzip-standard way always leads to corruption when special characters (>127) occur in file names, right?
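
The corruption on the reading side can be sketched the same way: a name stored as ANSI bytes but decoded by a tool that expects OEM bytes comes out garbled. This is a rough illustration only, assuming cp1252 for ANSI and cp437 for OEM; the function name `read_ansi_as_oem` is hypothetical.

```python
def read_ansi_as_oem(name, ansi="cp1252", oem="cp437"):
    """Encode `name` as ANSI bytes, then decode them as if they were OEM bytes."""
    stored = name.encode(ansi)   # what a WinZip-style packer would store
    return stored.decode(oem)    # what a standard-following tool would show


original = "\u00ae \u00b4 \u00b3"  # "® ´ ³"
shown = read_ansi_as_oem(original)
print(repr(original), "->", repr(shown))

# The bytes themselves are intact, so decoding with the right code page
# recovers the original name.
recovered = shown.encode("cp437").decode("cp1252")
print("recovered:", repr(recovered))
```

In this case the name is only misinterpreted, not destroyed: the stored bytes still carry the full information as long as the reader knows which charset was used.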
ghisler(Author)
Site Admin
Posts: 48088
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

The problem is that some programs expect the zip file to follow the standard! WinZip just ignores the standard and tries to set its own, but I don't think that this is good practice. It means that files created with WinZip cannot be unpacked correctly with most DOS tools (and even some Windows tools).

Btw, there is no Unicode support for zip files (yet?).
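
For readers coming to this thread later: revisions of the ZIP specification published after this discussion added a flag (general purpose bit 11, value 0x800) that marks a file name as UTF-8. The sketch below uses Python's standard `zipfile` module, which sets that flag when a name is not plain ASCII; it says nothing about how Total Commander handles this. The helper `roundtrip_name` and the sample names are hypothetical.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: file name is UTF-8


def roundtrip_name(name):
    """Write `name` into an in-memory zip, read it back, report the UTF-8 flag."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"payload")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_NAME_FLAG)


print(roundtrip_name("Bookmark \u00ae \u00b3.url"))  # non-ASCII name
print(roundtrip_name("plain.txt"))                   # ASCII-only name
```

Tools that predate the flag ignore it, so such archives have the same compatibility caveat as the WinZip approach discussed above.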
jb
Senior Member
Posts: 412
Joined: 2003-02-09, 22:56 UTC
Location: Switzerland

Post by *jb »

What about adding an option that allows the user to choose the character set?
I guess 95% of my archives are for personal use only (backups etc.), not for exchange with other people. In these cases I would be glad not to lose a single character. Many users probably share this need.
ghisler(Author)
Site Admin
Posts: 48088
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Yes, this sounds like a good idea! I will add it to my wish list.
IGL
Member
Posts: 179
Joined: 2004-02-26, 10:47 UTC
Location: Poland

Post by *IGL »

It seems the problem still exists. Normally this is not an issue, but sometimes characters do get lost.
Furthermore, when you copy a file with a strange name (same name, different content) into an already existing ZIP a second time, the entry is added again under a duplicate name instead of replacing the old one. When you unpack, files may then get overwritten. The conversion to OEM may cause additional problems, so there could be an option to disable it.
I second this wish-list item, and this post is a little reminder :) although I agree it is not urgent.
:-)
Zodarr
Junior Member
Posts: 13
Joined: 2006-03-16, 22:28 UTC
Location: Budapest, Hungary

Other character (letter)

Post by *Zodarr »

Hi!
I think this is the right topic to post in...
My problem is with another special character: it looks like "ae" joined into a single letter...
I hope it's clear what I mean...
Sorry, I cannot reproduce the character here; it's in a file with a Japanese name...

Oh, and another thing! Why can't TC display Japanese (and, I think, other Asian-language) file or directory names? Is there a solution, or will there be one?
Maybe Unicode could solve these problems... I don't know.

THX
Paradimethyldiaminoarsenobensolmonomethansulfinacidic natrium, or
Überseedecksgeschwindigkeitkontrollerkonsoleneinheit?