Preliminary information about Unicode support (TC7.5)
Moderators: Hacker, petermad, Stefan2, white
- ghisler(Author)
- Site Admin
- Posts: 50390
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Yes, that's about how I have implemented it now, with two differences:
1. If the string contains accents etc. from the current code page only, it will be stored as ANSI for full backwards compatibility.
2. If the string contains characters from a different codepage, it will be stored as UTF-8, but with the UTF-8 three byte prefix. This way all strings from the current codepage will still work with older TC versions too, but strings from a different codepage will show up as garbage and not work. But such paths wouldn't have worked with the old version anyway...
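The two rules above can be sketched in a few lines. This is only an illustration, not TC's actual code: the names `encode_value` and `decode_value`, the default code page `cp1252`, and the handling of an ANSI value that happens to begin with the bytes EF BB BF are all assumptions made for the example.

```python
import codecs

UTF8_PREFIX = codecs.BOM_UTF8  # the three bytes EF BB BF


def encode_value(text: str, codepage: str = "cp1252") -> bytes:
    """Return the bytes to store for one ini value (hypothetical helper)."""
    try:
        raw = text.encode(codepage)
    except UnicodeEncodeError:
        raw = None  # at least one character is outside the current code page
    # Assumption: an ANSI value that happens to start with EF BB BF is also
    # stored as UTF-8, so the reader never mistakes it for a prefixed value.
    if raw is not None and not raw.startswith(UTF8_PREFIX):
        return raw  # rule 1: plain ANSI, readable by older versions
    return UTF8_PREFIX + text.encode("utf-8")  # rule 2: prefix + UTF-8


def decode_value(raw: bytes, codepage: str = "cp1252") -> str:
    """Reverse of encode_value: the prefix decides how to decode."""
    if raw.startswith(UTF8_PREFIX):
        return raw[len(UTF8_PREFIX):].decode("utf-8")
    return raw.decode(codepage)
```

A value such as "café" stays a plain ANSI byte string, while a path containing Japanese characters gets the prefix and is stored as UTF-8.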
Author of Total Commander
https://www.ghisler.com
- ghisler(Author)
- Site Admin
- Posts: 50390
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Only when you enter them - TC 7.0x cannot show Unicode paths at all yet, but TC 7.5 will.
ghisler(Author) wrote: Only when you enter them
Yes, of course only when I enter them. So it's clear that opening a questionable folder in TC 7.5 and later opening it in TC < 7.5 will not work. Why not save the 8+3 filename (just like in TC 7.0) if available and check for the full Unicode file name while reading the entry?
If there is no 8+3 filename, save it as UTF-8. The directory couldn't be opened in < TC 7.5 anyway - but only in this case.
2VadiMGP
I understood it like this: He is not writing a BOM at the beginning of the file. He writes a BOM in front of each UTF-8 encoded value to know that he has to decode it when reading it later.
There is another thing that could be discussed. Many (path) settings are not available in the options dialog. It could become quite difficult to edit the ini file considering these changes.
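To illustrate the per-value BOM as understood above, here is a rough sketch of a reader that handles a file in which each value carries its own encoding. The function name `read_ini_values`, the `cp1252` default, and the flattening of all sections into one dictionary are assumptions for the example, not a description of how TC reads its ini files.

```python
import codecs

BOM = codecs.BOM_UTF8  # EF BB BF, expected in front of a value, not the file


def read_ini_values(data: bytes, codepage: str = "cp1252") -> dict:
    """Parse key=value lines, decoding each value on its own (hypothetical)."""
    values = {}
    for line in data.splitlines():
        if b"=" not in line or line.startswith(b"["):
            continue  # skip section headers and lines without a value
        key, _, raw = line.partition(b"=")
        if raw.startswith(BOM):
            text = raw[len(BOM):].decode("utf-8")  # prefixed: UTF-8 value
        else:
            text = raw.decode(codepage)  # no prefix: plain ANSI value
        values[key.decode(codepage)] = text
    return values


# One ANSI value and one prefixed UTF-8 value in the same file.
sample = (
    b"[Left]\r\n"
    b"path=C:\\caf\xe9\r\n"
    b"other=" + BOM + "D:\\日本".encode("utf-8") + b"\r\n"
)
```

Because the file mixes two encodings line by line, a text editor that assumes a single encoding for the whole file will show one kind of value as garbage, which is the editing difficulty mentioned above.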
Lefteous wrote: He is not writing a BOM at the beginning of the file. He writes a BOM in front of each UTF-8 encoded value to know that he has to decode it when reading it later.
I understood this in exactly the same way. But the user can click on the menu "Change Settings Files Directly". And what will happen after that?
Lefteous wrote: I guess a hex editor could be a better tool
No, thanks. I prefer to use text editors. Anyway, I have used UTF-16 for years and I will continue to use it. I just wanted to express my opinion: the effort of supporting UTF-8 (including all the efforts of plugin authors and the probable collision when a user saves the file as UTF-8 in an editor) doesn't seem attractive to me.
I haven't checked another issue yet: using UTF-8 in entry names. Some entry names are based on user input (e.g. in the [Search] section).
Lefteous wrote: Why not save the 8+3 filename (just like in TC 7.0) if available and check for the full unicode file name while reading the entry?
If there is no 8+3 filename save it as UTF-8.
Even an 8+3 name doesn't have to have an ANSI representation.
I don't know why that is, but I've seen a (Japanese XP) system with Japanese characters in the account name. Short filename generation was enabled and the account name was short (3 characters), but it couldn't be transformed into an ordinary 8+3 filename; the GetShortPathName() call succeeded, but it just kept the user's folder (C:\Documents and Settings\XXX) in its original form, i.e. containing those Japanese characters.
So, for an ANSI program running under another (non-Japanese) language, this user's folder was inaccessible (which unfortunately includes the TEMP folder, etc.).
What I'm trying to say is just that even 8+3 names (returned from GetShortPathName(), for example) should be checked to see whether they can be stored correctly without Unicode/UTF.
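The check suggested here could look roughly like the sketch below. It is an assumption-laden illustration: `fits_codepage` and `pick_path_to_store` are hypothetical helpers, the code page defaults to `cp1252`, and the short path is passed in as a parameter (on Windows it would come from GetShortPathName()) so the logic can be tried on any system.

```python
def fits_codepage(path: str, codepage: str) -> bool:
    """True if every character of the path exists in the given code page."""
    try:
        path.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def pick_path_to_store(long_path: str, short_path: str,
                       codepage: str = "cp1252") -> tuple:
    """Return (path, needs_utf8) for an ini entry (hypothetical helper).

    Prefer a form that survives as ANSI; the short name is only trusted
    after it has been checked, because it may still contain characters
    from another code page.
    """
    for candidate in (long_path, short_path):
        if fits_codepage(candidate, codepage):
            return candidate, False
    return long_path, True  # neither form is ANSI-safe: store as UTF-8
```

In the Japanese account-name case described above, the short path equals the long path, both fail the check, and the entry has to be stored as UTF-8.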
Hello Lefteous,
since you are the pioneer in developing Unicode-supporting (content) plugins for the upcoming TC 7.5, maybe you can share some programming hints besides the one in http://ghisler.ch/board/viewtopic.php?t=17135. Your help would be much appreciated.
Regards
tbeu
[mod]The next two posts moved here from DirSizeCalc 2.10 (content plugin).
Hacker (Moderator)[/mod]
TC plugins: Autodesk 3ds Max / Inventor / Revit Preview, FileInDir, ImageMetaData (JPG Comment/EXIF/IPTC/XMP), MATLAB MAT-file Viewer, Mover, SetFolderDate, Solid Edge Preview, Zip2Zero and more
- fenix_productions
- Power Member
- Posts: 1979
- Joined: 2005-08-07, 13:23 UTC
- Location: Poland
- Contact:
Support for this request 
Honestly: I've read the whole linked thread and the hlp file for the new content plugins and... I don't get it.
The most important thing for me is to know how to deal with ft_string or ft_stringw, but there are no examples. Furthermore, no info is provided about dealing with ft_fulltext in the Unicode case.
Well, I will also post in the proper thread.
"When we created the poke, we thought it would be cool to have a feature without any specific purpose." Facebook...