Problem with non-US characters in zip files

LarsSandberg
Junior Member
Posts: 62
Joined: 2003-04-11, 07:46 UTC
Location: Denmark

Post by *LarsSandberg »

Hi, I'm using TC 6.02 and have encountered the following problem:
When I pack files whose names include the Danish "ø" character, that character becomes "o" inside the zip archive.

The OS is Windows 2000 Pro (UK version), with the regional settings set to Danish.

I'm using the WCMZIP32.DLL file in the TC install dir to enable the option for password-protected zip files.
WCMZIP32.DLL is 32,768 bytes and has a file date of 29-09-01. Maybe my problem is here? Is there a newer version of that DLL file?
Best regards
Lars
Clo
Moderator
Posts: 5731
Joined: 2003-12-02, 19:01 UTC
Location: Bordeaux, France

Works @ home---

Post by *Clo »

2LarsSandberg
:) Hello!
• That issue (or a close one) has already been discussed HERE
- You even posted a reply in that thread

- I just ran a test again: no problem under Win 98 SE (French), it works fine. The "ø" is all right in the ZIP name.
- I'm using the same ZIP DLL as you. I run the newest TC, 6.03a.
- I don't have Win 2000, but I'll test under Win XP Pro shortly.
- I recommend you upgrade TC to 6.03a...
EDIT:
¤ Works fine under XP Pro (French) too, with the same ZIP DLL
- All codepages are "850" (French)

Kind regards,
#31505 Traducteur Français de TC French translator Aide en Français Tutoriels Français English Tutorials

Post by *LarsSandberg »

Hi, I tried updating to TC 6.03a, but it didn't help. In archives, "ø" still changes to "o" :(
I tried with both the original DLL and the one that enables password protection.
Best regards
Lars
SanskritFritz
Power Member
Posts: 3693
Joined: 2003-07-24, 09:25 UTC
Location: Budapest, Hungary

Post by *SanskritFritz »

2LarsSandberg
Are you sure you are not using Unicode in the filenames? If you are, zip doesn't handle Unicode, so it converts the characters. Use normal 8-bit characters in the filename; then, if the regional settings are Danish, it should work OK.
I switched to Linux, bye and thanks for all the fish!

Post by *LarsSandberg »

How could it be Unicode when the filenames are typed in?
I was forced to use WinZip (which I hate) to make the zip files. WinZip handled the letter ø correctly. TC 5.51, 6.02 and 6.03a failed :-(
Best regards
Lars

A small test---

Post by *Clo »

2LarsSandberg
:) Hi !
• Typing the file name doesn't mean that it isn't Unicode!
* Just a test:
- Copy the file name with the ø into an editor that supports Unicode,
- save that file under any name,
- display it in Lister,
- look in the "Options" menu to see which entry is ticked,
- let us know…

KR
Claude

Post by *LarsSandberg »

The Danish "ø" is contained within the extended 8-bit ASCII character set. Pasting the filename into an editor (UltraEdit) shows that the "ø" character is F8 (hex), equal to a decimal value of 248. In other words: typing ø in a filename produces a character from the 8-bit set, which can then be converted to UTF-8 or UTF-16 (Unicode). UltraEdit reports the saved file as having "DOS"-based (ASCII) content.
If you are in doubt, just look at a standard ASCII table; there you will find "ø" as F8h or 248 dec.
Best regards
Lars
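
The byte values mentioned above can be checked with a short Python sketch. It encodes "ø" in a few common encodings; the list of code pages (Windows-1252, Latin-1, DOS 850, UTF-8) is an assumption about which ones are relevant, since the thread does not say which code pages Lars's machine uses. Note that F8 is the value in the Windows (ANSI) code page, while strict 7-bit ASCII stops at 7F.

```python
# Encode the Danish "ø" in several encodings and print the resulting bytes.
# The list of code pages is illustrative, not taken from the thread.
CODE_PAGES = ["cp1252", "latin-1", "cp850", "utf-8"]

def byte_values(char, encodings=CODE_PAGES):
    """Return {encoding: hex string} for a single character."""
    return {enc: char.encode(enc).hex() for enc in encodings}

values = byte_values("ø")
for enc, hexval in values.items():
    print(f"{enc:8} -> {hexval}")
```

The same character has a different byte value in the Windows and DOS code pages, which is why a conversion step is involved at all.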

Charset...

Post by *Clo »

2LarsSandberg
:) I've known what a character table is for a while, and I type Alt+0248 to get the ø character when renaming a file (and also a directory, in the old test quoted above) in order to run these tests.
- French versions of Windows support extended ASCII characters, since we have a lot of them: é è ç à ù ü ê î œ æ… and the corresponding capital letters. Danish versions must support them too...
{ Aside: is Petermad not around? }
- Please, could you test this under Win 98 SE or even under XP Pro (e.g. at a friend's or a next-door neighbour's)? As I told you, it works fine at home under both of these OSs, with ZIP or RAR, for a file name, archive name or dir name. I guess that TC isn't at fault, since it works at home with the same DLLs as yours; it could be a Win 2000 problem...

KR
Claude
ghisler(Author)
Site Admin
Posts: 50505
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

OK, I can shed some light on this problem:

- by default, ZIP stores file names in DOS charset (ASCII)
- the Windows function AnsiToAscii converts the ø to o, although the ø seems to exist also in the ASCII charset

Therefore the ø is lost when packing with ZIP...
Author of Total Commander
https://www.ghisler.com
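
One way to picture the conversion step described above is the following minimal Python sketch, assuming the DOS charset in play is either code page 437 or code page 850. The function `to_dos_name` and the `BEST_FIT` table are hypothetical stand-ins, not the Windows routine that TC calls, and the thread does not say which DOS code page Lars's system uses. Code page 850 contains "ø", while code page 437 does not, so a converter targeting 437 has to substitute something.

```python
# Hypothetical best-fit table: what a converter might substitute when the
# target DOS code page lacks a character. Only "ø"/"Ø" are covered here.
BEST_FIT = {"ø": "o", "Ø": "O"}

def to_dos_name(name, codepage):
    """Encode a file name for a DOS code page, falling back to BEST_FIT."""
    out = bytearray()
    for ch in name:
        try:
            out += ch.encode(codepage)
        except UnicodeEncodeError:
            # Character missing from the code page: use a look-alike, else "_".
            out += BEST_FIT.get(ch, "_").encode(codepage)
    return bytes(out)

name = "rødgrød.txt"              # made-up example file name
as_850 = to_dos_name(name, "cp850")   # "ø" is kept, stored as byte 0x9B
as_437 = to_dos_name(name, "cp437")   # "ø" is replaced by "o"
print(as_850.decode("cp850"), as_437.decode("cp437"))
```

If the DOS code page differed between the two systems, that would be consistent with the ø surviving on one machine and not on the other, but this is only a guess; nothing in the thread confirms it.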

Post by *LarsSandberg »

Clo: Yep, "é è ç à ù ü ê î œ æ & ø" are all within the 8-bit ASCII table, which by the way is the same in France as in Denmark, Germany... (but not China :o)

Ghisler: I can't see that Windows 2000 creates the problem, since WinZip (it was almost over my dead body to use this tool :-) ) on the same PC creates correct zip archives in which the "ø" character is preserved.
Best regards
Lars

Post by *ghisler(Author) »

WinZip may be storing the file names with ANSI (Windows) characters, which is against the PKZIP specification but will be accepted by many unpackers (including TC) if the operating system is marked as "Windows".
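
To see why ANSI names can go wrong with an unpacker that expects the DOS charset, here is a small Python sketch. It assumes Windows-1252 as the ANSI code page and 850 as the DOS code page; both are illustrative choices, not facts from the thread, and the file name is made up.

```python
# A name stored as ANSI (Windows-1252) bytes, then read back two ways.
stored = "rødgrød.txt".encode("cp1252")   # "ø" becomes byte 0xF8

as_ansi = stored.decode("cp1252")  # unpacker that treats the name as ANSI
as_dos = stored.decode("cp850")    # unpacker that assumes the DOS charset

print(as_ansi)
print(as_dos)
```

The bytes are identical in both cases; only the unpacker's assumption about the charset decides whether the ø comes back intact.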

Why is it right in French Win ?

Post by *Clo »

2ghisler(Author)
:) Good evening again,
Quote: "Therefore the ø is lost when packing with ZIP..."
… but it is NOT lost at home with French versions of Windows! Same TC, same ZIP DLL…
* Weird, odd, strange…
EDIT:
A screenshot where a ZIP has a ø in its name, and a file has a ø in its name too. The unpacked files have the right characters…

M.f.G.
Claude

Post by *LarsSandberg »

Clo:
Quote: "That issue (or close) has been already discussed HERE - You even posted a reply at this thread"
Heh, yes, that could almost be embarrassing for me :) My situation is that I've just started at a new company, and the users didn't have the discipline to stay away from the Danish local characters (æøå) in filenames. So I landed in a bit of a mess :) and I noticed that TC's zip packer, in some strange combination, also has problems with that. As you say, it is weird, since it works for you.
The filenames are referenced elsewhere, so we can't rename them. I would guess that Mr. Ghisler is digging to find answers, since it works for you.
Best regards
Lars

Problem in network---

Post by *Clo »

2LarsSandberg
:) Hi Lars!
• Actually, it works fine at home, as you can see in the new picture I added above…
* The one problem I had with extended characters was over the network; in that case, SOME extended characters were not copied but replaced by an underscore (you can see one in the picture; I used the same test files), while others were copied all right. The Danish æ ø å are kept, but € œ … are changed.
* I didn't get a satisfactory explanation for that issue.
* So avoid such operations over the local network, if one is installed!
Kind regards,
Claude

Post by *LarsSandberg »

Hi Mr. Ghisler. I noticed a thread from Feb 2003 where, on request, you mentioned that you would consider an option/switch to allow non-US characters in zip archives created by TC. Here is the URL of that thread: http://ghisler.ch/board/viewtopic.php?t=79&highlight=character+zip
I think that many users miss that option.
Best regards
Lars
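
For comparison, later revisions of the ZIP format define a general-purpose flag (bit 11, value 0x800) that marks a file name as UTF-8, which sidesteps the code page question. The sketch below uses Python's standard `zipfile` module, not TC, and the file names are made up for illustration. It writes an archive in memory and checks the flag on read-back.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: file name is stored as UTF-8

def roundtrip(name):
    """Write one entry to an in-memory zip and return (name, utf8_flag_set)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"test data")
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_FLAG)

danish = roundtrip("rødgrød.txt")  # non-ASCII name: flag is set
plain = roundtrip("plain.txt")     # pure ASCII name: flag is not needed
print(danish, plain)
```

Whether an unpacker honours the flag depends on the tool and its version, so this is a sketch of the mechanism rather than a guarantee of interoperability.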