Problems showing UTF-8 in Lister

This forum contains bug reports from previous beta tests - the issue has remained unresolved, either because it couldn't be reproduced, or couldn't be prevented/fixed

Moderators: white, Hacker, petermad, Stefan2

bb
Junior Member
Posts: 18
Joined: 2004-01-09, 16:57 UTC
Location: Leipzig / Germany

Problems showing UTF-8 in Lister

Post by *bb »

I found some problems viewing UTF-8 text files in the Lister.

One thing I could isolate to a small example is the following:
Two files, both with a BOM and both containing the same Cyrillic letters; one also contains a combining (non-spacing) accent, the other does not. One is shown as Cyrillic, the other one as black boxes.

This is the content in hex with the accent (Lister shows the correct letters, but the accent does not step back; it should sit over the а, as it does in Notepad, for example):
EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 CC 81 D0 BD D0 B4 D1 80

The content should read "Алекса́ндр"

And this is the content without the accent (shown as boxes in Lister, but shown correctly in Notepad, for example):
EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 D0 BD D0 B4 D1 80

The content should read "Александр"

To make sure we are looking at the same files, here are the hex sequences again (first without the accent, then with it):
EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 D0 BD D0 B4 D1 80

EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 CC 81 D0 BD D0 B4 D1 80

Interestingly, when pasting from Notepad into Firefox, the accent moved from а to н, so apparently Notepad wants the combining accent to be postfixed, while Firefox interprets it as a prefix. I think that if the bytes CC 81 (U+0301, the combining acute accent) follow a letter, the accent should be placed over that letter.
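
For reference, here is a minimal Python sketch (not part of the original report) that decodes the two hex dumps above and lists the code points they contain. It only assumes the dumps are UTF-8 with a BOM, as stated in the post; the helper names `decode_dump` and `describe` are made up for illustration.

```python
import unicodedata

# Hex dumps copied from the post above (UTF-8 with BOM).
WITH_ACCENT = "EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 CC 81 D0 BD D0 B4 D1 80"
WITHOUT_ACCENT = "EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 D0 BD D0 B4 D1 80"


def decode_dump(hex_dump):
    """Decode a space-separated hex dump as UTF-8, dropping a leading BOM."""
    return bytes.fromhex(hex_dump).decode("utf-8-sig")


def describe(text):
    """Return (code point, combining class, Unicode name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.combining(ch), unicodedata.name(ch))
        for ch in text
    ]


accented = decode_dump(WITH_ACCENT)
plain = decode_dump(WITHOUT_ACCENT)

# A non-zero combining class marks a character that attaches to the
# base character that precedes it in the text.
for row in describe(accented):
    print(row)
```

In the accented file, the only character with a non-zero combining class is U+0301, and it comes directly after U+0430 (а). Unicode defines combining marks as applying to the preceding base character, which matches the expectation that the accent belongs over the а.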

I was testing this using TC 7.02a on German WinXP.
mfg, bb

Post by *bb »

An addition:
Saved as Unicode (both big-endian and little-endian), both files are displayed correctly (with the restriction that the accent appears after the а rather than over it).
mfg, bb
ghisler(Author)
Site Admin
Posts: 48007
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

2bb
Could you please send me a zip or rar archive containing a text file with these names? TC converts all UTF-8 text to UTF-16 before displaying it, so maybe there is a problem with the conversion function...
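
As an illustration of the UTF-8 to UTF-16 step mentioned above, here is a minimal Python sketch. It is not TC's actual conversion function, which is not shown in this thread; it uses Python's built-in codecs, and the helper name `utf8_to_utf16_units` is hypothetical. It shows which UTF-16 code units a correct conversion of the two reported files would be expected to produce.

```python
import struct

# Hex dumps from the first post in this thread (UTF-8 with BOM).
PLAIN_UTF8 = bytes.fromhex(
    "EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 D0 BD D0 B4 D1 80"
)
ACCENTED_UTF8 = bytes.fromhex(
    "EF BB BF D0 90 D0 BB D0 B5 D0 BA D1 81 D0 B0 CC 81 D0 BD D0 B4 D1 80"
)

COMBINING_ACUTE = 0x0301  # code unit for U+0301 COMBINING ACUTE ACCENT


def utf8_to_utf16_units(raw):
    """Hypothetical helper: decode UTF-8 (BOM dropped) into UTF-16 code units."""
    text = raw.decode("utf-8-sig")
    encoded = text.encode("utf-16-le")
    return list(struct.unpack(f"<{len(encoded) // 2}H", encoded))


plain_units = utf8_to_utf16_units(PLAIN_UTF8)
accented_units = utf8_to_utf16_units(ACCENTED_UTF8)

# A correct conversion should yield the same units for both files,
# apart from the single extra combining accent in the accented one.
without_accent = [u for u in accented_units if u != COMBINING_ACUTE]

print(" ".join(f"{u:04X}" for u in plain_units))
print(" ".join(f"{u:04X}" for u in accented_units))
print(without_accent == plain_units)
```

If a converter returns anything else for these inputs, the conversion is the likely culprit; if it returns exactly these units, the problem is more likely in font selection or text rendering.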
Author of Total Commander
https://www.ghisler.com

Post by *bb »

I sent a mail with a .rar file attached on 2007-12-03, 21:48 with subject "UTF8 to Unicode conversion in Lister" to support at ghisler dot com.

Did you receive it?
If yes, any news?
mfg, bb

Post by *ghisler(Author) »

Unfortunately, I haven't received anything so far. Can you please try to re-send it? Please make sure to limit the size to 500 KB at most.
Author of Total Commander
https://www.ghisler.com

Post by *bb »

OK, I just re-sent it. The size is about 2 KB.
mfg, bb

Post by *ghisler(Author) »

Thanks, got it now! Your e-mail was marked as "filtered" in MailWasher, but I spotted it and was able to keep it from being deleted. :)
Author of Total Commander
https://www.ghisler.com