Media Player and encoding in audio tags

The behaviour described in the bug report is either by design, or would be far too complex/time-consuming to be changed

Usher
Power Member
Posts: 1675
Joined: 2011-03-11, 10:11 UTC

Post by Usher

Many old audio files (mostly MP3) contain tags saved in ANSI encoding. However, the internal TC media player simply ignores that fact. Many other programs (e.g. VLC, Media Player Classic, foobar2000) display such tags in the local ANSI encoding or in UTF-8.

Why doesn't TC try to use the existing Lister rules for encoding detection?
Andrzej P. Wozniak
Polish subforum moderator
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by ghisler(Author)

Can you send me the first few kbytes (including the entire tags) of a file which isn't shown correctly?
TC does support ANSI tags, and uses the current Windows encoding to convert them to Unicode.

Usher wrote: Why doesn't TC try to use the existing Lister rules for encoding detection?

Lister's encoding detection is meant for text display.
Author of Total Commander
https://www.ghisler.com

Post by Usher

ghisler(Author) wrote: 2022-06-24, 07:10 UTC
Can you send me the first few kbytes (including the entire tags) of a file which isn't shown correctly?

Here you are: audio_tags.7z
16 MB in total, only MP3 files with ID3v1 and ID3v2.x tags: one in Russian (Windows-1251), all the rest in ASCII/Polish (Windows-1250), sometimes mixed, as described in the filenames.

ghisler(Author) wrote: 2022-06-24, 07:10 UTC
Usher wrote: Why doesn't TC try to use the existing Lister rules for encoding detection?
This is meant for text display.

But we are talking about TEXT tags, right?

I think you should make the display of ANSI tags configurable, just in case of any problems.

Post by ghisler(Author)

Thanks for the files. These are using ANSI in their local encodings. I do not have any plans to make tag display depend on the text encoding detection, sorry. That detection is meant for text, not for titles. It would be annoying to have to change the encoding at every track change.

Btw, the tags containing Polish etc. are using a wrong encoding. The ID3v2 standard describes that
text encoding = 0 -> ISO-8859-1 (ASCII).
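
To illustrate the point about the encoding byte, here is a minimal Python sketch (not Total Commander's code). It assumes the frame body has already been extracted from the tag; `decode_text_frame` and its `legacy_codepage` parameter are hypothetical names. The byte values are those defined by ID3v2.3/2.4; ISO-8859-1 is Latin-1, a superset of ASCII. The optional override models what players do when they treat encoding 0 as "local ANSI codepage".

```python
from typing import Optional

# Text-encoding byte at the start of an ID3v2 text frame body.
ID3V2_TEXT_ENCODINGS = {
    0: "iso-8859-1",  # Latin-1 according to the standard
    1: "utf-16",      # UTF-16 with byte order mark
    2: "utf-16-be",   # UTF-16BE without BOM (ID3v2.4 only)
    3: "utf-8",       # UTF-8 (ID3v2.4 only)
}


def decode_text_frame(body: bytes, legacy_codepage: Optional[str] = None) -> str:
    """Decode the body of a text frame: one encoding byte, then the text.

    legacy_codepage (hypothetical option): if given, it replaces
    ISO-8859-1 for frames marked with encoding byte 0.
    """
    if not body:
        return ""
    encoding_byte, raw = body[0], body[1:]
    if encoding_byte not in ID3V2_TEXT_ENCODINGS:
        raise ValueError(f"unknown ID3v2 text encoding byte: {encoding_byte}")
    codec = ID3V2_TEXT_ENCODINGS[encoding_byte]
    if encoding_byte == 0 and legacy_codepage is not None:
        codec = legacy_codepage
    # Strip the optional terminating NUL character(s).
    return raw.decode(codec, errors="replace").rstrip("\x00")
```

Read strictly per the standard, a Polish title written as Windows-1250 with encoding byte 0 comes out as mojibake; with `legacy_codepage="cp1250"` the same bytes decode as intended.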

I do have a special detection function for Cyrillic, which converts from Cyrillic codepage to Unicode when either the file name contains Cyrillic, or the tags contain valid Cyrillic.
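
The exact rule TC uses is not described in this thread. Purely as an illustration of the two conditions mentioned above (Cyrillic in the file name, or tag bytes that look like Cyrillic), a heuristic could be sketched in Python as follows. All function names, the 0.8 threshold, and the `cp1252` fallback are assumptions made for this example.

```python
def contains_cyrillic(name: str) -> bool:
    """True if the (Unicode) file name has a character in the Cyrillic block."""
    return any("\u0400" <= ch <= "\u04FF" for ch in name)


def looks_like_cp1251(raw: bytes, threshold: float = 0.8) -> bool:
    """Guess whether the non-ASCII bytes are Windows-1251 Cyrillic letters.

    In Windows-1251 the bytes 0xC0-0xFF are the letters А-я, and
    0xA8 / 0xB8 are Ё / ё. The threshold is an arbitrary assumption.
    """
    high = [b for b in raw if b >= 0x80]
    if not high:
        return False  # pure ASCII, nothing to decide
    letters = sum(1 for b in high if b >= 0xC0 or b in (0xA8, 0xB8))
    return letters / len(high) >= threshold


def decode_legacy_tag(raw: bytes, file_name: str, fallback: str = "cp1252") -> str:
    """Decode an ANSI tag, preferring Windows-1251 when Cyrillic is likely."""
    if contains_cyrillic(file_name) or looks_like_cp1251(raw):
        return raw.decode("cp1251", errors="replace")
    return raw.decode(fallback, errors="replace")
```

A byte-range check like this cannot tell Cyrillic from Western European text made mostly of accented letters (bytes 0xC0-0xFF in Windows-1252 as well), which is one reason such detection stays a heuristic.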

Moderator message

Moved to "will not be changed"

Post by Usher

ghisler(Author) wrote: 2022-06-27, 09:28 UTC
Btw, the tags containing Polish etc. are using a wrong encoding. The ID3v2 standard describes that
text encoding = 0 -> ISO-8859-1 (ASCII).

It's not about the standard, it's about REAL USE. Once again: in common use, ID3 tags were saved in the local ANSI codepage (some Windows-nnnn, not ISO-8859-xx) by almost all programs before Windows XP, and many programs still save them as ANSI. See the Wikipedia article: https://en.wikipedia.org/wiki/ID3#ID3v2

Wikipedia about ID3 wrote: However, mojibake is still common when using local encodings instead of Unicode.

BTW, ISO-8859-1 isn't even supposed to support the Polish language; the proper ISO encoding for Polish is ISO-8859-2.
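
The difference between these codepages is easy to check. The following Python sketch uses only the standard codecs; the helper names `can_encode` and `decode_candidates` are made up for this example. It shows that Polish text cannot be stored in ISO-8859-1 at all, and that the same bytes read differently as ISO-8859-2 and as Windows-1250.

```python
def can_encode(text: str, codec: str) -> bool:
    """True if every character of text exists in the given codepage."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def decode_candidates(raw: bytes,
                      codecs=("iso-8859-1", "iso-8859-2", "cp1250")) -> dict:
    """Decode the same bytes with each candidate codepage for comparison."""
    return {codec: raw.decode(codec, errors="replace") for codec in codecs}


if __name__ == "__main__":
    word = "łódź"
    for codec in ("iso-8859-1", "iso-8859-2", "cp1250"):
        print(codec, can_encode(word, codec))
    # Bytes written by a Windows program using the Polish ANSI codepage:
    for codec, text in decode_candidates(word.encode("cp1250")).items():
        print(codec, repr(text))
```

Only the `cp1250` reading returns the original word; letters such as ą, ś and ź sit at different byte values in ISO-8859-2 and Windows-1250.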

ghisler(Author) wrote: 2022-06-27, 09:28 UTC
I do have a special detection function for Cyrillic, which converts from Cyrillic codepage to Unicode when either the file name contains Cyrillic, or the tags contain valid Cyrillic.

Which encoding counts as "valid Cyrillic"? Anything other than Windows-1251? Russian coders are among the most conservative people in this regard; some of them still use KOI8-R in text files.

If you want to blame anyone, blame the developers of other popular software who only used Western European languages and didn't care about the rest of the world. People just used the buggy software as it was. And you still haven't explained why other players CAN DO IT BETTER. They are still popular and in common use: VLC, Media Player Classic, foobar2000.

Post by ghisler(Author)

Usher wrote: BTW. ISO-8859-1 isn't even supposed to support Polish language, the proper encoding for Polish is ISO-8859-2.

Exactly. Therefore these tags are wrong and should use Unicode.

Post by Usher

Once again: developers use 0 for the local ANSI encoding, and you have to live with that. In real use, ZERO means UNDEFINED, LEGACY, system-dependent. Just make it optional if you don't like it.