TC7rc4 - no longer finds text in UTF8 files

Bug reports will be moved here when the described bug has been fixed

marekjed
Junior Member
Posts: 25
Joined: 2003-10-16, 18:09 UTC

Post by *marekjed »

(Thank you so much for the quick response to my previous report!)

config: TC7 rc 4, WinXP SP2, Polish locale

Steps to reproduce:
Use "Find files" and search for text in files known to be UTF-8 encoded, with the UTF8 checkbox checked. The search expression must contain "extended" characters (i.e. non-ASCII characters, which take up more than 1 byte in UTF-8 encoding).

Result:
TC7 does not find any matches if the search expression contains "extended" characters.

By comparison, TC 6.56 finds the same text in the same set of files. I only work with UTF-8 files, so I have not been able to check whether the bug also affects other Unicode encodings (the "Unicode" checkbox).

Notes:
1. The bug does not affect Lister.
2. The files I am working with do not use a BOM signature.
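For readers unfamiliar with the "extended" characters mentioned in the steps, here is a minimal Python sketch of what a byte-level search in UTF-8 data has to match. It is only an illustration and says nothing about how TC implements its search; the function names `utf8_byte_length` and `contains_text` are made up for this example.

```python
import codecs


def utf8_byte_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def contains_text(data: bytes, needle: str) -> bool:
    """Byte-level, case-sensitive search for `needle` in UTF-8 data.

    A leading BOM only adds three bytes at the start of the data,
    so it does not change whether the encoded needle is found.
    """
    return needle.encode("utf-8") in data


# Hypothetical sample content with Polish characters.
sample = "zażółć gęślą jaźń"
without_bom = sample.encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

print(utf8_byte_length("a"), utf8_byte_length("ł"))
print(contains_text(without_bom, "gęślą"), contains_text(with_bom, "gęślą"))
```

ASCII characters encode to a single byte, while Polish letters such as "ł" encode to two, which is why only search expressions containing such characters exercise the multi-byte path.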
No ads, no nags freeware: http://www.tranglos.com
(KeyNote, PhoneDeck, KookieJar, Oubliette)
Flint
Power Member
Posts: 3487
Joined: 2003-10-27, 09:25 UTC
Location: Antalya, Turkey

Post by *Flint »

Hm... I confirm this. It happens whether or not a BOM signature is present.

I have just checked: this problem first appears in version 7.0 private beta 2.5. I suppose it has something to do with this change:
13.12.06 Added: Case-insensitive search for UTF8 text in the search function and in lister (currently only the forward direction works with non-English characters)
I also tested the same search with the "Case-sensitive" option turned on: from beta 2.5 onwards it worked fine until private beta 4.5, where it stopped working completely for both case-sensitive and case-insensitive searches. The relevant change here seems to be:
27.02.07 Fixed: Case-insensitive search in UTF-8 not always working properly
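To illustrate the difference between the two search modes discussed above, here is a rough Python sketch. It is not TC's code and makes no claim about what the actual fix was; the function name `find_in_utf8` is hypothetical. It decodes the data (accepting an optional BOM) and, for the case-insensitive mode, folds case on both sides before comparing.

```python
def find_in_utf8(data: bytes, needle: str, case_sensitive: bool = False) -> bool:
    """Return True if `needle` occurs in UTF-8 encoded `data`.

    The "utf-8-sig" codec drops a leading BOM if one is present,
    so files with and without a BOM are handled the same way.
    """
    text = data.decode("utf-8-sig")
    if not case_sensitive:
        # casefold() also handles non-English letters such as "Ż" / "ż".
        text = text.casefold()
        needle = needle.casefold()
    return needle in text


# Hypothetical sample content with Polish characters.
data = "Zażółć gęślą jaźń".encode("utf-8")

print(find_in_utf8(data, "ŻÓŁĆ"))
print(find_in_utf8(data, "ŻÓŁĆ", case_sensitive=True))
```

With case folding applied to both the text and the search expression, the upper-case expression matches the lower-case text; with the case-sensitive mode it does not.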
Flint's Homepage: Full TC Russification Package, VirtualDisk, NTFS Links, NoClose Replacer, and other stuff!
 
Using TC 10.52 / Win10 x64
ghisler(Author)
Site Admin
Posts: 48083
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

I could reproduce it and fix it, thanks!
Author of Total Commander
https://www.ghisler.com
Lefteous
Power Member
Posts: 9535
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

UTF-8 encoded words containing characters higher than 0x7F can be found using TC 7 RC 5. I have tested only files containing the UTF-8 BOM though.
Flint

Post by *Flint »

Yes, works fine in TC7rc5 without BOM as well.
ghisler(Author)

Post by *ghisler(Author) »

Great, thanks!