Preliminary information about Unicode support (TC7.5)
- ghisler(Author)
As some of you already know, I'm currently adding full Unicode support to Total Commander, for the next big release 7.5.
Unicode support in plugins will work like this:
1. All existing functions remain unchanged
2. Where Unicode file names are possible as parameters, there will be an additional function ending with "W". This function will take the Unicode name.
3. If the function is present, TC will call it, but only on NT-based systems (Windows NT/2000/XP/Vista).
4. If the function isn't present, or on Win9x/ME systems, TC will call the already existing ANSI functions. Unicode parts of the file name will be converted to the 8.3 (DOS) form first. The plugin will not be called if 8.3 names are disabled, or if there isn't a valid 8.3 name.
Example: The Lister plugin function ListLoad:
Currently defined as:
HWND __stdcall ListLoad(HWND ParentWin,char* FileToLoad,int ShowFlags);
An additional function ListLoadW will be added:
HWND __stdcall ListLoadW(HWND ParentWin,WCHAR* FileToLoad,int ShowFlags);
This way, all existing plugins will continue to work, even in Unicode subdirectories and for files with Unicode names, and Lister shows the Unicode file name in its title. Plugin writers can add full Unicode support relatively easily.
The ANSI functions will still have to be implemented, for all the cases where the Unicode functions cannot be called:
- Windows 9x/ME
- older versions of Total Commander
- third party programs
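The call rules above could be sketched roughly as follows. This is only an illustration of the decision logic, not TC's actual code: the `ListerPlugin` struct, the `Entry` enum, `chooseEntry` and its parameters are hypothetical names, and the entry points are simplified to a single file name argument.

```cpp
// Hypothetical, simplified stand-ins for a plugin's exported entry points.
// A null pointer means the plugin does not export that function.
struct ListerPlugin {
    int (*listLoad)(const char* fileToLoad);      // existing ANSI function
    int (*listLoadW)(const wchar_t* fileToLoad);  // optional Unicode function
};

enum class Entry { None, Ansi, Wide };

// Decide which entry point a host would call for one file.
//   isNtBased          - running on an NT-based Windows
//   nameFitsAnsi       - the file name has no Unicode-only parts
//   shortNameAvailable - a valid 8.3 name exists for the file
Entry chooseEntry(const ListerPlugin& plugin, bool isNtBased,
                  bool nameFitsAnsi, bool shortNameAvailable) {
    // Rule 3: the W function is used only if present and on NT-based systems.
    if (plugin.listLoadW != nullptr && isNtBased) return Entry::Wide;
    if (plugin.listLoad == nullptr) return Entry::None;
    // Rule 4: fall back to ANSI, using the 8.3 name where needed.
    if (nameFitsAnsi || shortNameAvailable) return Entry::Ansi;
    // No valid 8.3 name: the plugin is not called at all.
    return Entry::None;
}
```

With this sketch, a plugin exporting only the ANSI function still gets called for a Unicode-named file as long as a short name exists, which matches the compatibility goal described above.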
What do you think?
Last edited by ghisler(Author) on 2007-09-27, 19:36 UTC, edited 1 time in total.
Author of Total Commander
https://www.ghisler.com
ghisler(Author) wrote: "What do you think?"
This is the solution I suggested, so I couldn't be happier.

ghisler(Author) wrote: "Unicode parts of the file name will be converted to the 8.3 (DOS) form first. The plugin will not be called if the 8.3 names are disabled, or there isn't a valid 8.3 name."
This is what already happens in the current version of TC, right?
Will you also introduce a wide-character return type (maybe ft_widestring) in the content plug-in interface?
[mod]Some [OT] posts were split to "Unicode to ANSI conversion when launching an application".
Hacker (Moderator)[/mod]
Suppose you press Ctrl+F, select the FTP connection (with a saved password), but instead of clicking Connect you drop dead.
I have updated my tiny content plug-in "Attributes" to support Unicode.
This is how I plan to change my plug-ins. The following points are important especially if you haven't worked with Unicode before:
- The string type char* has been changed to wchar_t* for all variables and constants in the plug-in.
- The strings in the array "fieldNames" are declared with L"String" instead of "String".
- The source file contains a pair of functions to be exported for each operation. The Unicode functions have a W postfix.
- The Unicode functions do all the work and contain almost the same code that was previously used in the ANSI functions. The only difference here is the call to GetFileAttributesW, which is the Unicode version of this function. You have to rename every single API call.
- The ANSI version of ContentGetSupportedField first calls the Unicode function and then converts the returned Unicode string values to ANSI.
- The ANSI version of ContentGetValue first converts the delivered ANSI string to Unicode and then calls the Unicode function.
- This encapsulation of functions results in smaller code and avoids the redundancy of implementing each function twice.
Code:
// Attributes.h
#include <windows.h>
#include "contentplug.h"

// Number of fields supported by this plug-in.
const DWORD FIELD_COUNT = 14;

// An array of size FIELD_COUNT containing the names of all file attributes covered
// by GetFileAttributes system operation.
// The elements point to string literals, so they are declared const.
const wchar_t* fieldNames [FIELD_COUNT] = {L"Read Only", L"Hidden", L"System",
    L"Directory", L"Archive", L"Device", L"Normal",
    L"Temporary", L"Sparse File", L"Reparse Point", L"Compressed",
    L"Offline", L"Not Content Indexed", L"Encrypted"};

// An array of size FIELD_COUNT containing the numeric constants of all file attributes
// covered by GetFileAttributes system operation.
// The array index is used as field index.
DWORD attributeConstants [FIELD_COUNT] = {FILE_ATTRIBUTE_READONLY, FILE_ATTRIBUTE_HIDDEN,
    FILE_ATTRIBUTE_SYSTEM, FILE_ATTRIBUTE_DIRECTORY, FILE_ATTRIBUTE_ARCHIVE,
    FILE_ATTRIBUTE_DEVICE, FILE_ATTRIBUTE_NORMAL, FILE_ATTRIBUTE_TEMPORARY,
    FILE_ATTRIBUTE_SPARSE_FILE, FILE_ATTRIBUTE_REPARSE_POINT, FILE_ATTRIBUTE_COMPRESSED,
    FILE_ATTRIBUTE_OFFLINE, FILE_ATTRIBUTE_NOT_CONTENT_INDEXED, FILE_ATTRIBUTE_ENCRYPTED};
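Code:
// Attributes.cpp
#include "Attributes.h"
#include <strsafe.h>

// Forward declarations so the ANSI wrappers can call the Unicode functions
// (only needed if contentplug.h does not already declare them).
int __stdcall ContentGetSupportedFieldW (int FieldIndex, wchar_t* FieldName, wchar_t* Units, int maxlen);
int __stdcall ContentGetValueW (wchar_t* FileName, int FieldIndex, int, void* FieldValue, int, int);

BOOL APIENTRY DllMain(HANDLE, DWORD, LPVOID)
{
    return TRUE;
}

// ANSI wrapper: calls the Unicode function, then converts the results to ANSI.
int __stdcall ContentGetSupportedField(int FieldIndex, char* FieldName, char* Units, int maxlen)
{
    wchar_t wideFieldName [MAX_PATH] = {0};
    wchar_t wideUnits [MAX_PATH] = {0};
    // Pass the size of the local wide buffers, not the caller's maxlen.
    int result = ContentGetSupportedFieldW (FieldIndex, wideFieldName, wideUnits, MAX_PATH);
    // A source length of -1 converts up to and including the terminating null.
    WideCharToMultiByte (CP_ACP, 0, wideFieldName, -1, FieldName, maxlen, NULL, NULL);
    WideCharToMultiByte (CP_ACP, 0, wideUnits, -1, Units, maxlen, NULL, NULL);
    return result;
}

int __stdcall ContentGetSupportedFieldW (int FieldIndex, wchar_t* FieldName, wchar_t* Units, int maxlen)
{
    if (FieldIndex >= FIELD_COUNT)
    {
        return ft_nomorefields;
    }
    Units[0] = 0;
    StringCchCopyW (FieldName, maxlen, fieldNames[FieldIndex]);
    return ft_boolean;
}

// ANSI wrapper: converts the file name to Unicode, then calls the Unicode function.
int __stdcall ContentGetValue (char* FileName, int FieldIndex, int, void* FieldValue, int maxlen, int)
{
    wchar_t wideFileName [MAX_PATH] = {0};
    // The destination size is that of wideFileName; maxlen describes FieldValue.
    MultiByteToWideChar (CP_ACP, MB_PRECOMPOSED, FileName, -1, wideFileName, MAX_PATH);
    // Argument order: file name, field index, unit index, value, maxlen, flags.
    return ContentGetValueW (wideFileName, FieldIndex, 0, FieldValue, maxlen, 0);
}

int __stdcall ContentGetValueW (wchar_t* FileName, int FieldIndex, int, void* FieldValue, int, int)
{
    DWORD attr = GetFileAttributesW (FileName);
    if (attr == INVALID_FILE_ATTRIBUTES)
    {
        return ft_fileerror;
    }
    // Normalize the masked bit to TRUE/FALSE.
    *(BOOL*)FieldValue = (attr & attributeConstants[FieldIndex]) != 0;
    return ft_boolean;
}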
Last edited by Lefteous on 2007-09-28, 16:06 UTC, edited 1 time in total.
- XPEHOPE3KA
Lefteous wrote: If you follow these rules, converting your plug-in to Unicode is straightforward and the plug-in file size won't increase too much.
- The string type char* has been changed to wchar_t* for all variables and constants in the plug-in.
- The strings in the array "fieldNames" are declared with L"String" instead of "String".
- The source file contains a pair of functions to be exported. The Unicode functions have a W postfix.
- The Unicode functions do all the work and contain almost the same code that was previously used in the ANSI functions. The only difference here is the call to GetFileAttributesW, which is the Unicode version of this function. You have to rename every single API call.
- The ANSI functions first convert the delivered ANSI string to Unicode and then call the Unicode function. This results in smaller code and zero redundancy compared to implementing each function twice.
I think it's possible to write an automatic converter doing these jobs.
F6, Enter, Tab, F6, Enter, Tab, F6, Enter, Tab... - I like to move IT, move IT!..
2XPEHOPE3KA
XPEHOPE3KA wrote: "I think it's possible to write an automatic converter doing these jobs."
Sure, this is possible, but please consider that refactoring is not just mass search & replace. Such a converter really has to understand the code.
There are also special cases where only a Unicode version of a Windows API function exists, and the previous ANSI-only functions had to convert the returned string to ANSI, or convert their input to Unicode if the API function required it. It's really not that easy.
And of course there are also other programming languages, which work a bit differently.
2XPEHOPE3KA
Unfortunately, the real headache starts with code like this, although it looks safe enough:
Code:
...
TCHAR buffer[MAX_BUFFER_LEN];
...
_sntprintf(buffer, sizeof(buffer), TEXT("%d"), iVal);
...
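The catch in the snippet above is that `_sntprintf` expects the buffer size as a number of characters, while `sizeof(buffer)` is a number of bytes. The two only agree while TCHAR is char; once TCHAR becomes wchar_t, the call overstates the capacity. A minimal portable sketch of the safer pattern, using the standard `swprintf` instead of `_sntprintf` (the helper names `elementCount` and `formatInt` are made up for this example):

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Number of elements in a fixed-size array, independent of the element type.
template <typename T, std::size_t N>
constexpr std::size_t elementCount(const T (&)[N]) { return N; }

// Format an integer into a wide string via a fixed wchar_t buffer.
std::wstring formatInt(int value) {
    wchar_t buffer[16];
    // Pass the capacity in characters. sizeof(buffer) would be
    // 16 * sizeof(wchar_t) bytes and overstate the capacity.
    int written = std::swprintf(buffer, elementCount(buffer), L"%d", value);
    if (written < 0) {
        return std::wstring();  // formatting failed or did not fit
    }
    return std::wstring(buffer, static_cast<std::size_t>(written));
}
```

Deriving the count from the array itself keeps the call correct whichever character type the buffer uses.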
2Lefteous
Code:
function ContentGetValueW(FileName: PWChar; ...);
var
attr: DWORD;
begin
if Win32Platform = VER_PLATFORM_WIN32_NT then //!!!
attr := GetFileAttributesW(FileName)
else //!!!
attr := GetFileAttributesA( PChar(AnsiString(WideString(FileName))) );
if attr = INVALID_FILE_ATTRIBUTES then
Result := ft_fileerror
else
begin
FieldValue := attr and attributeConstants[FieldIndex];
Result := ft_boolean;
end;
end;