Page 1 of 1
[Bug] Incorrect comparison of two unicode strings
Posted: 2019-12-13, 12:24 UTC
by remittor
TotalCmd 9.22a uses this function absolutely everywhere:
Code: Select all
int __usercall wcsicmp@<eax>(WCHAR * wstr1@<eax>, WCHAR * wstr2@<edx>)
{
int result;
CHAR tmp2[260];
CHAR tmp1[260];
WCHAR wtmp2[1024];
WCHAR wtmp1[1024];
if ( win_dwPlatformId == 2 ) // Windows NT
{
wcsncpyEx(wtmp1, 1023, wstr1);
wcsncpyEx(wtmp2, 1023, wstr2);
wstr_lower(wtmp1);
wstr_lower(wtmp2);
result = wcscmp(wtmp1, wtmp2);
}
else
{
WideCharToMultiByte(0, 0, wstr1, -1, tmp1, 259, 0, 0);
WideCharToMultiByte(0, 0, wstr2, -1, tmp2, 259, 0, 0);
str_lower(tmp1);
str_lower(tmp2);
result = strcmp(tmp1, tmp2);
}
return result;
}
Critical Bug:
If the length of wstr1 or wstr2 is will be more than 259, then WideCharToMultiByte not insert null terminator!
After function strcmp will compare trash!
Recomendation: All this code needs to be replaced with a call
shlwapi.dll@StrCmpIW
Re: [Bug] Incorrect comparison of two unicode strings
Posted: 2019-12-16, 11:30 UTC
by ghisler(Author)
This shouldn't be a problem because the part with WideCharToMultiByte is only used on Windows 9x/ME, which doesn't support such long paths. However, I can make tmp1 and tmp2 larger an add a terminating 0 myself just in case.
Re: [Bug] Incorrect comparison of two unicode strings
Posted: 2019-12-17, 12:49 UTC
by remittor
ghisler(Author) wrote: 2019-12-16, 11:30 UTC
This shouldn't be a problem because the part with WideCharToMultiByte is only used on Windows 9x/ME
It seems so to you. But in fact, long strings come into function
wcsicmp.
For example, the function
wcsicmp is called where the display of the internal directory of the archive file is formed. And this function is called in many places.
Code: Select all
wcsicmp(&elem[40], L"plugininst.inf") /* elem[40] = headerData.FileName */
wcsicmp(locFileName, ArcCurDir) /* WCHAR locFileName[1024] , ArcCurDir = current archive directory */
wcsicmp(ArcCurDir, filename) /* filename = locFileName */
ghisler(Author) wrote: 2019-12-16, 11:30 UTC
However, I can make tmp1 and tmp2 larger an add a terminating 0 myself just in case.
This is terrible! Do not do that! Just call the function
shlwapi.dll@StrCmpIW (since Win98 with IE 4.0).
The function
wcsicmp you implemented is slow. This function also introduces its delay when fetching files to display the insides of archives.
Re: [Bug] Incorrect comparison of two unicode strings
Posted: 2019-12-18, 10:27 UTC
by ghisler(Author)
Sorry, I will not change to a so far unknown function just for making TC on legacy Windows 9x/ME a little bit faster. I will add the terminating zeroes myself, although MSDN describes that they are actually added:
https://docs.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-widechartomultibyte
cchWideChar
...
If this parameter is -1, the function processes the entire input string, including the terminating null character. Therefore, the resulting character string has a terminating null character, and the length returned by the function includes this character.
My function does pass -1 as cchWideChar.
Re: [Bug] Incorrect comparison of two unicode strings
Posted: 2019-12-18, 13:48 UTC
by remittor
ghisler(Author) wrote: 2019-12-18, 10:27 UTC... although MSDN describes that they are actually added...
This is true only when output string is less than 259 characters long.
Personally test and understand what it is about.
ghisler(Author) wrote: 2019-12-18, 10:27 UTCSorry, I will not change to a so far unknown function
You’ve said the same thing about these functions:
StrCmpLogicalW,
SHStrDupW. But now use them too.
Re: [Bug] Incorrect comparison of two unicode strings
Posted: 2020-01-24, 09:39 UTC
by remittor
Regarding the use of function
StrCmpIW, I already wrote here:
https://www.ghisler.ch/board/viewtopic.php?p=367822#p367822
ghisler(Author) wrote: 2020-01-13, 10:44 UTC
I'm reluctant to use StrCmpIW because I don't know what it does internally.
I created a special test program that compares the function
StrCmpIW and
CharLowerW.
Code: Select all
#define _CRT_SECURE_NO_WARNINGS
#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
#include <shlwapi.h>
const int str_len = 2;
const WCHAR filename[] = L"test_result.txt";
const WCHAR bom = L'\xFEFF';
int main(int argc, WCHAR* argv[])
{
DWORD dw;
WCHAR str[str_len + 16];
WCHAR lwr[str_len + 16];
DeleteFileW(filename);
DWORD dwShareMode = FILE_SHARE_READ | FILE_SHARE_WRITE;
HANDLE hFile = CreateFileW(filename, GENERIC_WRITE, dwShareMode, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
for (size_t i = 0; i < 8; i++) {
str[i] = 0x20;
}
str[0] = bom;
str[7] = L'\n';
str[8] = 0;
WriteFile(hFile, str, (DWORD)wcslen(str) * sizeof(WCHAR), &dw, NULL);
memset(str, 0, sizeof(str));
str[0] = 0x20;
str[1] = 0x20;
printf("start \n");
for (size_t t = 0xD800 - 1; t <= 0xDFFF; t++) {
str[0] = (WCHAR)t;
if (t == 0xD800 - 1) {
str[0] = 0x20;
}
for (size_t x = 0; x <= 0xFFFF; x++) {
if (str[0] == 0x20) {
if (x == 0) continue; /* ignore null */
}
str[1] = (WCHAR)x;
memcpy(lwr, str, str_len * sizeof(WCHAR));
lwr[2] = 0;
CharLowerW(lwr);
int ret = StrCmpIW(lwr, str);
if (ret) {
WCHAR buf[str_len * 2 + 64];
wcscpy(buf, str);
wcscat(buf, L" / ");
wcscat(buf, lwr);
wcscat(buf, L"\n");
WriteFile(hFile, buf, (DWORD)wcslen(buf) * sizeof(WCHAR), &dw, NULL);
}
}
}
wcscpy(str, L"<<<EOF>>>");
wprintf(str);
WriteFile(hFile, str, (DWORD)wcslen(str) * sizeof(WCHAR), &dw, NULL);
return 0;
}
According to the test results, it turned out that only "Ʀ"/"ʀ" is the problem symbol.
Unicode:
Re: [Bug] Incorrect comparison of two unicode strings
Posted: 2020-01-24, 17:52 UTC
by ghisler(Author)
In my tests on Windows 10, StrCmpIW didn't cause any measurable speedup of the archive read functions (at least of the initial read).