Plain text search fails but RegEx succeeds

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-11, 08:02 UTC

I discovered this bug while searching Windows error logs...

This plain text search:
[X] Find text: nvac.inf_amd64_d79d2e834862ae12\nvac.inf
[_] RegEx (2)
fails to find the string.

But this RegEx search:
[X] Find text: nvac\.inf_amd64_d79d2e834862ae12\\nvac\.inf
[X] RegEx (2)
correctly returns 9 log file names.

Version 9.21a 64bit
Hi Christian! Delighted customer since 1999. License #37627

ghisler(Author)
Site Admin
Posts: 37495
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Plain text search fails but RegEx succeeds

Post by ghisler(Author) » 2019-01-11, 11:24 UTC

Not a bug: \n is translated to a line break in plain text search, and \t to a tab. Try searching for:
nvac.inf_amd64_d79d2e834862ae12\\nvac.inf
Author of Total Commander
http://www.ghisler.com
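
A minimal Python sketch of the translation described in the post above: \n becomes a line break, \t a tab, and \\ a single backslash. The helper name translate_plain_text is hypothetical, and leaving any other backslash untouched is an assumption, not documented Total Commander behavior.

```python
def translate_plain_text(query: str) -> str:
    """Expand \\n, \\t and \\\\ in a plain-text search string (hypothetical helper)."""
    escapes = {"n": "\n", "t": "\t", "\\": "\\"}
    out = []
    i = 0
    while i < len(query):
        ch = query[i]
        if ch == "\\" and i + 1 < len(query) and query[i + 1] in escapes:
            # Known escape: emit the translated character, skip both input chars.
            out.append(escapes[query[i + 1]])
            i += 2
        else:
            # Assumption: any other character, including a lone backslash, is literal.
            out.append(ch)
            i += 1
    return "".join(out)


# The string from the first post: the "\n" in "\nvac.inf" turns into a line break.
original = r"nvac.inf_amd64_d79d2e834862ae12\nvac.inf"
# The suggested fix: doubling the backslash yields the intended literal text.
escaped = r"nvac.inf_amd64_d79d2e834862ae12\\nvac.inf"

print(repr(translate_plain_text(original)))
print(repr(translate_plain_text(escaped)))
```

Under these assumptions the first query ends up containing a line break where the path separator was meant, which would explain why it finds nothing, while the doubled backslash yields the literal path.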

Usher
Senior Member
Posts: 408
Joined: 2011-03-11, 10:11 UTC

Re: Plain text search fails but RegEx succeeds

Post by Usher » 2019-01-11, 21:18 UTC

Dots should also be escaped in a regex. The real issue, though, is that the standard search behaves like a regex search when it should NOT…
Regards from Poland
Andrzej P. Wozniak

Hacker
Moderator
Posts: 11246
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Plain text search fails but RegEx succeeds

Post by Hacker » 2019-01-11, 21:51 UTC

Usher,
?

Roman
Suppose you press Ctrl+F, select the FTP connection (with a saved password), but instead of clicking Connect, you drop dead.

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-12, 01:20 UTC

ghisler(Author) wrote:
2019-01-11, 11:24 UTC
Not a bug: \n is translated to a line break in plain text search, and \t to a tab. Try searching for:
nvac.inf_amd64_d79d2e834862ae12\\nvac.inf

Well, apparently there's a 3rd 'special' in plain-text search: \\ is translated as \

Does this really make sense? To me, it seems like mixed up regex.

So if I wanted to do a plain text search for '\\' I'd input '\\\\'? Oh, brother. Plain text search should be plain text.

Proposal for handling line breaks.
Suppose I wanted to search for "This is interesting food" and one of the candidates looked like this:

This is
interesting food.

TC should be smart enough to search across the line break and to handle extra spaces as superfluous. TC could handle that (internally) by taking the user's target, "This is interesting food", and, instead, searching for this:

This( +|\n|\t)is( +|\n|\t)interesting( +|\n|\t)food

and reporting all resulting hits. Call it "smart plain-text search".
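
A rough Python sketch of the "smart plain-text search" idea above: split the user's target on whitespace, escape each word, and join the words with the same ( +|\n|\t) gap. The function name smart_pattern is hypothetical, and Python's re module stands in for whatever engine TC would use.

```python
import re


def smart_pattern(target: str) -> re.Pattern:
    """Build a regex that tolerates spaces, a tab, or a line break between words."""
    words = target.split()
    # Same alternatives as the proposal, written as a non-capturing group.
    gap = r"(?: +|\n|\t)"
    # re.escape keeps characters such as '.' or '\' in the words literal.
    return re.compile(gap.join(re.escape(word) for word in words))


pattern = smart_pattern("This is interesting food")
candidate = "This is\ninteresting food."

print(pattern.pattern)
print(pattern.search(candidate) is not None)
```

Replacing the gap with \s+ would also cover Windows-style \r\n line endings, which the three alternatives above do not match.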

Hacker
Moderator
Posts: 11246
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Plain text search fails but RegEx succeeds

Post by Hacker » 2019-01-12, 01:53 UTC

MarkFilipak,

MarkFilipak wrote:
apparently there's a 3rd 'special' in plain-text search: \\ is translated as \

Um, exactly as it's documented in Help (press F1 or the Help button in the search dialog).

Roman

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-12, 01:55 UTC

Hacker wrote:
2019-01-12, 01:53 UTC
MarkFilipak,
apparently there's a 3rd 'special' in plain-text search: \\ is translated as \
Um, exactly as it's documented in Help (press F1 or the Help button in the search dialog).

Roman

Hi Roman,

Got any comment on This( +|\n|\t)is( +|\n|\t)interesting( +|\n|\t)food?

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-12, 02:06 UTC

Hacker wrote:
2019-01-12, 01:53 UTC
MarkFilipak,
apparently there's a 3rd 'special' in plain-text search: \\ is translated as \
Um, exactly as it's documented in Help (press F1 or the Help button in the search dialog).

Roman

Actually, I never read the help for plain-text search because... why would anyone ever need to read help for plain-text search?

So, what happens if I search for "\a or \b or \c" ...?
In regex, that's the same as searching for "a or b or c".

Oh, wait. I'll perform the experiment. ...Stay tuned.

...I'm back.

I give up. "This( +|\n|\t)is( +|\n|\t)interesting( +|\n|\t)food" as a RegEx search target didn't work if the candidate actually spans lines (but does work if the candidate includes tabs). So it appears that TC RegEx can't search across lines. Good grief.

PS: The Proof

Candidate:
This is
interesting food

Plain text search:
This is\ninteresting food
succeeds.

RegEx search:
This is\ninteresting food
fails!?!?!?!?!
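
One possible explanation, sketched in Python: if a search hands the regex engine one line at a time with the line ending already removed, a pattern containing \n (or \x0a) can never match. That the TC RegEx search works this way is an assumption here; the function names are hypothetical, and Python's re stands in for TC's regex library.

```python
import re

text = "This is\ninteresting food\n"
pattern = re.compile(r"This is\ninteresting food")


def search_per_line(pattern: re.Pattern, text: str) -> bool:
    # Assumption: the engine sees each line separately, without its line ending.
    return any(pattern.search(line) for line in text.splitlines())


def search_whole_text(pattern: re.Pattern, text: str) -> bool:
    # The engine sees the entire text, line breaks included.
    return pattern.search(text) is not None


print(search_per_line(pattern, text))    # the line break is never visible
print(search_whole_text(pattern, text))  # the line break is part of the input
```

In this sketch the per-line search misses the candidate and the whole-text search finds it, which mirrors the RegEx and plain-text results reported above.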

Hacker
Moderator
Posts: 11246
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Plain text search fails but RegEx succeeds

Post by Hacker » 2019-01-12, 13:23 UTC

MarkFilipak,
Yes, again, that is also described in the Help.

Roman

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-12, 19:03 UTC

Hacker wrote:
2019-01-12, 13:23 UTC
MarkFilipak,
Yes, again, that is also described in the Help.

Roman

Technically, you are correct.
The "Find files: General" page does cite the regex-style behavior of a plain text search.
And the "Regular expressions" page does not list end-of-line as something that can be found.

I still find those quirks strange and surprising.
But ...enough said!

Happy 2019 to you, Roman!

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-12, 20:00 UTC

Hmmm...

RegEx searches for '\x0a' and '\x0d' also fail (as does '\x0a\x0d|\x0d\x0a' of course). Since searches for hex-chars are listed in TC Help, this behavior is surely a bug.

Hacker
Moderator
Posts: 11246
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Plain text search fails but RegEx succeeds

Post by Hacker » 2019-01-12, 20:32 UTC

MarkFilipak,

MarkFilipak wrote:
The "Find files: General" page does cite the regex-style behavior of a plain text search.

I was rather referring to the help page for the Lister search (open Lister, press F7, then F1).

MarkFilipak wrote:
the "Regular expressions" page does not list end-of-line as something that can be found.

While not really prominent, it explicitly states:
"The other modificators are not relevant for Total Commander, because the program only supports searching within one line."

MarkFilipak wrote:
RegEx searches for '\x0a' and '\x0d' also fail

I'd guess the RegEx library simply cuts the line endings off?

Roman

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-13, 01:12 UTC

It appears that Andrey V. Sorokin has revised TRegExpr, to wit:

https://regexpstudio.com/en/regexp_syntax.html
Metacharacters ...
$ is at the end of a input string, and, if modifier /m is On, also immediately preceding any occurrence of \x0D\x0A or \x0A or \x0D (if You are using Unicode version of TRegExpr, then also \x2028 or \x2029 or \x0B or \x0C or \x85). Note that there is no empty line within the sequence \x0D\x0A.

https://regexpstudio.com/en/regexp_syntax.html#modifier_m
Modifiers ...
m
Treat string as multiple lines. That is, change ^ and $ from matching at only the very start or end of the string to the start or end of any line anywhere within the string, see also Line separators.

Can I look forward to TC implementing this new version?

PS: If modifier m is implemented, can it be switched on permanently? The reason is that searching across lines is simply a superset of searching within a line, so previously saved searches would be unaffected by modifier m. - M.
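
To illustrate what a multi-line modifier changes, here is a small sketch using Python's re.MULTILINE flag as a stand-in for modifier m. This is Python's engine, not TRegExpr: in Python only \n counts as a line separator for ^ and $, whereas the quoted TRegExpr documentation lists several more.

```python
import re

text = "first line\nsecond line"

# Without the flag, ^ matches only at the very start of the string.
without_m = re.findall(r"^\w+", text)

# With the flag, ^ also matches right after each line break.
with_m = re.findall(r"^\w+", text, flags=re.MULTILINE)

# Likewise, $ matches before each line break as well as at the end of the string.
ends_with_m = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(without_m)
print(with_m)
print(ends_with_m)
```

Note that the flag only changes where ^ and $ may match. Whether a pattern can match a line break itself still depends on the engine being given text that contains line breaks.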

Hacker
Moderator
Posts: 11246
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Plain text search fails but RegEx succeeds

Post by Hacker » 2019-01-13, 13:45 UTC

MarkFilipak,

MarkFilipak wrote:
Can I look forward to TC implementing this new version?

That'd be up to Christian to answer.

Roman

MarkFilipak
Junior Member
Posts: 90
Joined: 2008-09-28, 01:00 UTC
Location: Mansfield, Ohio

Re: Plain text search fails but RegEx succeeds

Post by MarkFilipak » 2019-01-13, 21:08 UTC

As a topic for general discussion... (I think poking brains is fun)

In the world of GUI, why are we still dragging around '\n' & '\t'? Why can't we simply feed text -- any text -- into a text-box and click "Find"? By "any text" I include new-lines and tabs and control chars and... anything. The current search input methods are CLI relics that can be abandoned.

So, what would submit the search string? Not '\n' -- that's so 'CLI'. What would submit the search string would be [ Find ].
