[Old] Copy full info from Command Browser: wrong encoding

Flint
Power Member
Posts: 3249
Joined: 2003-10-27, 09:25 UTC
Location: Moscow, Russia

Post by *Flint » 2019-10-29, 23:07 UTC

I've only just found this problem, but when I re-checked with 9.22a, it was there too, so it's not 9.50-specific. Still, it is a bug, and it is present in 9.50 too.

Prerequisites (I'm not sure if all of them are really required, but that's how I reproduced it):
OS: Windows 7 x64, Windows 10 x64.
System locale set to Russian, language for non-Unicode programs is also Russian.
The actual Windows interface language doesn't seem to matter (I reproduced it in Win7 Russian, and Win7 English, but both of them had Russian locale).
Installed keyboard layouts: English (default), Russian.
TC is configured to use Russian interface language.

The experiment steps:
1. Start TC, run cm_CommandBrowser.
2. Select any command with a Russian description. Press Ctrl+Shift+C to copy the full line.
3. Switch to some text editor, paste the clipboard contents and check the result.
4. Switch back to TC, and perform Shift+double click on the command, or select a command and click OK while holding Shift (this will also copy the full line).
5. Again, check what was copied to clipboard.

Expected results: every time, the full line should be copied as it was shown in the dialog.

The actual results, however, vary depending on several factors. Suppose I'm copying the command:

Code:

cm_SrcShort	301	Активная: Краткий режим

The results are:

a) TC 64-bit:
a.1) Ctrl+Shift+C always copies the text correctly:

Code:

cm_SrcShort	301	Активная: Краткий режим

a.2) Shift+OK / Shift+double-click always replaces the Russian letters with question marks and cuts off some of them:

Code:

cm_SrcShort	301	????????: ??????? ???

b) TC 32-bit: all three methods of copying behave identically, but the result depends on which keyboard layout is active when you perform the copy operation:
b.1) If the English layout is active, the Russian letters are replaced with accented Latin characters from code page 1252 (that is, each Russian letter is replaced by the 1252 character that has the same code number as the letter had in its "native" code page 1251). The result looks like this:

Code:

cm_SrcShort	301	Àêòèâíàÿ: Êðàòêèé ðåæèì

b.2) If the Russian layout is active, the text is copied correctly:

Code:

cm_SrcShort	301	Активная: Краткий режим

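
The two failure modes above can be reproduced outside TC with plain codec conversions. This is a minimal Python sketch, not TC's actual code: which conversions TC performs internally is an assumption here, and the truncation seen in a.2 is not reproduced.

```python
# Illustration only: reproduces the symptoms from a.2 and b.1 with standard
# codecs. The conversions TC really performs are an assumption.

DESCRIPTION = "Активная: Краткий режим"

# b.1) Bytes written in code page 1251 (Cyrillic) but interpreted as code
# page 1252 (Western European): each byte keeps its value and maps to a
# different glyph.
ansi_bytes = DESCRIPTION.encode("cp1251")
mojibake = ansi_bytes.decode("cp1252")

# b.2) The same bytes interpreted with the code page they were written in.
correct = ansi_bytes.decode("cp1251")

# a.2) Unicode text forced into a code page that has no Cyrillic letters:
# every unmappable character is replaced by "?".
question_marks = DESCRIPTION.encode("cp1252", errors="replace").decode("cp1252")
```

Here `mojibake` comes out as the string shown in b.1, `correct` matches the original description, and `question_marks` contains one "?" per Russian letter.
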
Flint's Homepage: Full TC Russification Package, VirtualDisk, NTFS Links, NoClose Replacer, and other stuff!
 
Using TC 9.22a / Win7 x64 SP1, Win10 x64

Usher
Power Member
Posts: 637
Joined: 2011-03-11, 10:11 UTC

Post by *Usher » 2019-10-30, 02:27 UTC

2Flint
What encoding/codepage is used for WCMD_RUS.INC and other lang files?
Regards from Poland
Andrzej P. Wozniak

browny
Member
Posts: 191
Joined: 2007-09-10, 13:19 UTC

Post by *browny » 2019-10-30, 07:40 UTC

Flint wrote (2019-10-29, 23:07 UTC):
"b) TC 32-bit: all three methods of copying behave identically, but the result depends on which keyboard layout is active when you perform the copy operation:"

This is the standard behaviour of old Delphi versions (approximately before XE7).
The simplest trick was to convert the ANSI string to a widestring when putting it on the clipboard, and to set the clipboard contents as CF_UNICODETEXT.
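
As a rough illustration of that trick (in Python rather than Delphi, and not TC's actual code), the sketch below converts an ANSI byte string into the null-terminated UTF-16LE buffer that the Windows CF_UNICODETEXT clipboard format holds. The helper name `ansi_to_cf_unicodetext` and the default code page 1251 are assumptions made for the example. The Win32 clipboard calls themselves are left out so that the sketch runs on any platform.

```python
# Sketch of "convert the ANSI string to a wide string before it goes on the
# clipboard". Hypothetical helper; the Win32 clipboard calls are omitted.

CF_UNICODETEXT = 13  # Win32 clipboard format id for UTF-16 text


def ansi_to_cf_unicodetext(ansi: bytes, codepage: str = "cp1251") -> bytes:
    """Decode ANSI bytes with an explicitly chosen code page and return the
    null-terminated UTF-16LE buffer used for CF_UNICODETEXT data."""
    text = ansi.decode(codepage)  # the code page is chosen by the program
    return (text + "\0").encode("utf-16-le")


line = "cm_SrcShort\t301\tАктивная: Краткий режим"
buffer = ansi_to_cf_unicodetext(line.encode("cp1251"))
```

A Unicode-aware editor that pastes CF_UNICODETEXT then receives the text without any further code page conversion.
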

Post by *Flint » 2019-10-30, 08:26 UTC

2Usher
The default one, 1251 (they are in the TC distribution).

2browny
"This is the standard behaviour of old Delphi versions (approximately before XE7)."
Not of old Delphi, but of all non-Unicode programs. It's an extremely old problem in Windows: when you copy something to the clipboard from a non-Unicode program, Windows converts the text to Unicode not from the system locale's ANSI encoding, but from the encoding of the currently active keyboard layout. (I've never understood that logic, as if the encoding of the existing text could suddenly change when I just switch my input layout…)

Anyway, when TC was converted to Unicode, it either started calling the appropriate Unicode API functions or began converting the text itself, and the result was correct clipboard processing. But in this particular function the correction is obviously missing.

Post by *browny » 2019-10-30, 11:59 UTC

Flint wrote (2019-10-30, 08:26 UTC):
"I've never understood that logic, as if the encoding of the existing text could suddenly change when I just switch my input layout…"

The logic is quite simple, though unintuitive.
Your text as a Unicode string can contain a mix of international characters, and that would require different code pages for ANSI/OEM.
ANSI text can have only one code page, and neither the file system nor the clipboard cares which code page it was.
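
The point about mixed text can be checked with a small Python sketch. The sample string is invented for the example. It shows that none of the code pages 1250, 1251, and 1252 can hold the whole string, while a Unicode encoding can.

```python
# A Unicode string may mix scripts; an ANSI string is tied to one code page.
# SAMPLE is an invented example, not text taken from TC.

SAMPLE = "Привет, Zażółć, café"  # Russian + Polish + French


def encodable_in(text: str, codepage: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


results = {
    cp: encodable_in(SAMPLE, cp)
    for cp in ("cp1250", "cp1251", "cp1252", "utf-16-le")
}
```

Only the `utf-16-le` entry in `results` is `True`; each ANSI code page is missing at least one of the scripts.
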

Post by *Usher » 2019-10-30, 12:42 UTC

Flint wrote (2019-10-30, 08:26 UTC):
"2Usher
The default one, 1251 (they are in the TC distribution)."

Could you test files in UTF-8 and UTF-16 encoding, please?

Post by *Flint » 2019-10-30, 13:08 UTC

browny wrote (2019-10-30, 11:59 UTC):
"The logic is quite simple, though unintuitive.
Your text as a Unicode string can contain a mix of international characters, and that would require different code pages for ANSI/OEM.
ANSI text can have only one code page, and neither the file system nor the clipboard cares which code page it was."

I know all about it. But I fail to see why on Earth the conversion procedure must depend on the currently selected input language. I understand when, e.g., MS Word assigns the layout's language to the text I'm typing; that's logical. But I don't understand why, without any input from my side, it converts already existing text (typed by a completely different person) differently depending on which input language I currently have turned on.

But this is completely off-topic. It works like that, and there is nothing we can do about it. The topic was about the TC behavior, which can be fixed.


2Usher
"Could you test files in UTF-8 and UTF-16 encoding, please?"
UTF-8: All the same.
UTF-16: TC does not see the language file and does not allow selecting it. If I just modify LanguageFile in wincmd.ini, it shows all Russian symbols in the GUI as garbage (looks like undecoded UTF-8). There is no use checking how the Command Browser would work in such a broken scenario.

ghisler(Author)
Site Admin
Posts: 38438
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) » 2019-10-30, 17:25 UTC

Confirmed, I was simply copying plain text. This worked when only copying the command, but not for the description...
Author of Total Commander
http://www.ghisler.com

Post by *Flint » 2019-11-06, 20:45 UTC

I forgot to re-check this in β3, where the history lists Ctrl+Shift+C as fixed. Indeed, I confirm that Ctrl+Shift+C now copies the correct text.

However, Shift+OK / Shift+double-click still works incorrectly (the 64-bit version produces question marks; the 32-bit version with the English keyboard layout produces accented Latin letters). This was re-tested in 9.50β4, Win7 x64.

Post by *ghisler(Author) » 2019-11-07, 09:12 UTC

Ah, sorry, I didn't see that you had also reported other bugs.

Post by *Flint » 2019-11-13, 18:17 UTC

Confirmed fixed in 9.50β5.
