Known issues with Unicode characters in filenames

misvin
Member
Posts: 112
Joined: 2010-08-14, 11:25 UTC

Post by *misvin »

Hi All,

I want to use Unicode characters in file and directory names, including the following symbols:

1. Unicode character "Left Double Quotation Mark" (U+201C) as a replacement for the illegal character " in filenames.

2. Unicode character "OCR Double Backslash" (U+244A) as a replacement for the illegal character \ in filenames.

3. Unicode character "Black Question Mark Ornament" (U+2753) as a replacement for the illegal character ? in filenames.

4. Unicode character "Fullwidth Colon" (U+FF1A) as a replacement for the illegal character : in filenames.

5. Unicode control characters: LRM (Left to Right Mark), RLM (Right to Left Mark), LRE (Left to Right Embedding), RLE (Right to Left Embedding), LRO (Left to Right Override), RLO (Right to Left Override).
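For illustration, the substitutions in items 1 to 4 could be scripted roughly like this. It is only a sketch in Python; the names REPLACEMENTS, to_safe_name and from_safe_name are made up for this example and are not part of TC or Windows.

```python
# Reserved character -> Unicode stand-in, taken from items 1-4 above.
REPLACEMENTS = {
    '"': "\u201C",   # LEFT DOUBLE QUOTATION MARK
    "\\": "\u244A",  # OCR DOUBLE BACKSLASH
    "?": "\u2753",   # BLACK QUESTION MARK ORNAMENT
    ":": "\uFF1A",   # FULLWIDTH COLON
}

# Translation tables for both directions (hypothetical helper names).
_TO_SAFE = str.maketrans(REPLACEMENTS)
_FROM_SAFE = str.maketrans({v: k for k, v in REPLACEMENTS.items()})


def to_safe_name(name: str) -> str:
    """Swap the four reserved characters for their Unicode stand-ins."""
    return name.translate(_TO_SAFE)


def from_safe_name(name: str) -> str:
    """Reverse the substitution to recover the original title."""
    return name.translate(_FROM_SAFE)
```

Note that it only covers the four characters listed above. The other reserved characters (< > | * /) would need their own stand-ins.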

My question:

Do you know of any issues, limitations, or restrictions with using Unicode characters in general, and the characters listed above in particular, when a user wants to:

1. Execute different operations in Total Commander 9.10 (Windows 10), including copy/move, search files, synchronize directories, archive/unpack, multi-rename, split/combine.

2. Execute different operations in File Explorer, Command Prompt/Windows PowerShell.

3. Send, receive, and open files with Unicode characters in their names by mail (GMail, Yahoo, Hotmail, Exchange Server).

4. Upload and synchronize files with Unicode characters on OneDrive, Google Drive, Dropbox cloud storage.

Thanks
milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 »

First point:
TC has some (IMO overcautious) filter concerning RTL/LTR Unicode markers, see
http://www.ghisler.ch/board/viewtopic.php?t=46465
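To see in advance which names contain such markers, something like the following Python sketch could be used. It is not TC's filter logic (I don't know exactly what TC checks); it only looks for the six control characters named in the first post, and find_bidi_controls is a made-up helper name.

```python
# The six directional control characters listed in the first post.
BIDI_CONTROLS = {
    "\u200E": "LRM",
    "\u200F": "RLM",
    "\u202A": "LRE",
    "\u202B": "RLE",
    "\u202D": "LRO",
    "\u202E": "RLO",
}


def find_bidi_controls(name: str) -> list[tuple[int, str]]:
    """Return (index, abbreviation) for every bidi control found in name."""
    return [(i, BIDI_CONTROLS[ch]) for i, ch in enumerate(name) if ch in BIDI_CONTROLS]
```

An empty list means the name contains none of these six characters.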

Second:
I know the Win32 API pretty well, and I have never heard of any restrictions when it comes to Unicode characters.
The API is pretty much agnostic when it comes to the semantics of characters (except of course the well-known Windows forbidden characters). So Explorer shouldn't care about these characters in file names. In the command prompt you may have issues displaying special characters, though.

For the third and fourth points I can't say much about those platforms; you should probably ask about restrictions in their dedicated forums. But one important thing such platforms should handle is Unicode normalization. I wrote a plug-in which might help when synchronizing file names that are in different normalization forms.
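To show what the normalization problem looks like, here is a small Python sketch using the standard unicodedata module. It is not how the plug-in works, just an illustration; same_name is a made-up helper, and the choice of NFC as the comparison form is an assumption.

```python
import unicodedata


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two file names after bringing both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The same visible name, once with a precomposed e-acute (U+00E9) and once
# with a plain "e" followed by COMBINING ACUTE ACCENT (U+0301).
composed = "caf\u00E9.txt"
decomposed = "cafe\u0301.txt"
```

Here composed == decomposed is False, although both look identical on screen, while same_name(composed, decomposed) is True. A sync tool that compares the raw strings would treat them as two different files.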
TC plugins: PCREsearch and RegXtract
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

TC currently doesn't support Unicode in path names to its settings files, e.g. to wincmd.ini, wcx_ftp.ini or the button bar files. Why? The functions used to write ANSI ini files also accept only ANSI file names.
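A rough way to check whether a settings path would be affected is to test if it can be encoded in the ANSI code page at all. The Python sketch below is not TC code. The helper name fits_code_page is made up, and cp1252 is only an assumed example, since the actual ANSI code page depends on the Windows system locale.

```python
def fits_code_page(path: str, code_page: str = "cp1252") -> bool:
    """Return True if every character of path exists in the given code page.

    cp1252 is an assumption for this example; pass the code page that
    matches the system locale.
    """
    try:
        path.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True
```

For example, fits_code_page(r"C:\totalcmd\wincmd.ini") is True, while a path containing the Fullwidth Colon (U+FF1A) from the first post gives False under cp1252.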
Author of Total Commander
https://www.ghisler.com
misvin
Member
Posts: 112
Joined: 2010-08-14, 11:25 UTC

Post by *misvin »

ghisler(Author) wrote: TC currently doesn't support Unicode in path names to its settings files, e.g. to wincmd.ini, wcx_ftp.ini or the button bar files. Why? The functions used to write ANSI ini files also accept only ANSI file names.
If this is the only problem with Unicode support in TC, then I can live with that :-)