
Plain text search fails but RegEx succeeds

Posted: 2019-01-11, 08:02 UTC
by MarkFilipak
I discovered this bug while searching Windows error logs...

This plain text search:
[X] Find text: nvac.inf_amd64_d79d2e834862ae12\nvac.inf
[_] RegEx (2)
fails to find the string.

But this RegEx search:
[X] Find text: nvac\.inf_amd64_d79d2e834862ae12\\nvac\.inf
[X] RegEx (2)
correctly returns 9 log file names.

Version 9.21a 64bit
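For comparison, the escaping done by hand in the RegEx search above can be reproduced with Python's `re.escape`. This is only an illustration in Python's regex dialect, not Total Commander's engine, and the `log_line` string is a made-up example.

```python
import re

# The literal text being searched for (one real backslash in the middle).
literal = r"nvac.inf_amd64_d79d2e834862ae12\nvac.inf"

# re.escape puts a backslash in front of every regex metacharacter,
# here the two dots and the backslash itself.
pattern = re.escape(literal)
print(pattern)

# Hypothetical log line, invented for this example.
log_line = "driver " + literal + " loaded"
found = re.search(pattern, log_line) is not None

# An unescaped dot matches any character, so it can over-match.
over_match = re.search("nvac.inf", "nvacXinf") is not None
print(found, over_match)
```

The escaped pattern is the same string that was typed into the RegEx search box above.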

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-11, 11:24 UTC
by ghisler(Author)
Not a bug: \n is translated to a line break in plain text search, and \t to a tab. Try searching for:
nvac.inf_amd64_d79d2e834862ae12\\nvac.inf
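A minimal Python sketch of the translation described above, to show why the first search string cannot match. The function name is hypothetical, and two details are assumptions: that `\n` becomes a single line feed, and that any other backslash sequence is left untouched.

```python
def translate_plain_search(text):
    # Translate the escapes mentioned in this thread: \n, \t and \\.
    mapping = {"n": "\n", "t": "\t", "\\": "\\"}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in mapping:
            out.append(mapping[text[i + 1]])
            i += 2
        else:
            # Assumption: anything else is kept literally.
            out.append(ch)
            i += 1
    return "".join(out)

typed = r"nvac.inf_amd64_d79d2e834862ae12\nvac.inf"
escaped = r"nvac.inf_amd64_d79d2e834862ae12\\nvac.inf"

# The first form ends up containing a line break instead of "\n".
print(repr(translate_plain_search(typed)))
# The second form yields the single backslash found in the log files.
print(repr(translate_plain_search(escaped)))
```

Under these assumptions, a literal `\\` in the file would have to be typed as `\\\\`.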

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-11, 21:18 UTC
by Usher
Dots should also be escaped in a regex, but the point is that the standard search behaves like a regex search, though it should NOT…

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-11, 21:51 UTC
by Hacker
Usher,
:?:

Roman

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 01:20 UTC
by MarkFilipak
ghisler(Author) wrote:
2019-01-11, 11:24 UTC
Not a bug: \n is translated to a line break in plain text search, and \t to a tab. Try searching for:
nvac.inf_amd64_d79d2e834862ae12\\nvac.inf
Well, apparently there's a 3rd 'special' in plain-text search: \\ is translated as \

Does this really make sense? To me, it seems like mixed up regex.

So if I wanted to do a plain text search for '\\' I'd input '\\\\'? Oh, brother. Plain text search should be plain text.

Proposal for handling line breaks.
Suppose I wanted to search for "This is interesting food" and one of the candidates looked like this:

This is
interesting food.

TC should be smart enough to search across the line break and to handle extra spaces as superfluous. TC could handle that (internally) by taking the user's target, "This is interesting food", and, instead, searching for this:

This( +|\n|\t)is( +|\n|\t)interesting( +|\n|\t)food

and reporting all resulting hits. Call it "smart plain-text search".
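The proposal can be sketched in Python as follows. The function name `smart_pattern` is hypothetical, and this uses Python's `re` module rather than TC's regex library, so it only shows the idea: split the target on whitespace, escape each word, and join the words with the gap expression from above.

```python
import re

def smart_pattern(target):
    # Split the user's target into words and escape each one,
    # so that dots, backslashes etc. are matched literally.
    words = [re.escape(w) for w in target.split()]
    # Gap between words: one or more spaces, one newline, or one tab.
    gap = r"(?: +|\n|\t)"
    return re.compile(gap.join(words))

rx = smart_pattern("This is interesting food")

across_lines = rx.search("This is\ninteresting food.") is not None
extra_spaces = rx.search("This  is   interesting food") is not None
# A Windows line ending (CR LF) is two characters and is NOT covered.
crlf = rx.search("This is\r\ninteresting food") is not None
print(across_lines, extra_spaces, crlf)
```

As written, the gap does not cover a CR LF line ending or a space followed by a line break. Replacing the gap with `\s+` would handle both, at the price of also matching other whitespace characters.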

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 01:53 UTC
by Hacker
MarkFilipak,
apparently there's a 3rd 'special' in plain-text search: \\ is translated as \
Um, exactly as it's documented in Help (press F1 or the Help button in the search dialog).

Roman

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 01:55 UTC
by MarkFilipak
Hacker wrote:
2019-01-12, 01:53 UTC
MarkFilipak,
apparently there's a 3rd 'special' in plain-text search: \\ is translated as \
Um, exactly as it's documented in Help (press F1 or the Help button in the search dialog).

Roman
Hi Roman,

Got any comment on This( +|\n|\t)is( +|\n|\t)interesting( +|\n|\t)food?

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 02:06 UTC
by MarkFilipak
Hacker wrote:
2019-01-12, 01:53 UTC
MarkFilipak,
apparently there's a 3rd 'special' in plain-text search: \\ is translated as \
Um, exactly as it's documented in Help (press F1 or the Help button in the search dialog).

Roman
Actually, I never read the help for plain-text search because ...why would anyone ever need to read help for plain-text search?

So, what happens if I search for "\a or \b or \c" ...?
In regex, that's the same as searching for "a or b or c".

Oh, wait. I'll perform the experiment. ...Stay tuned.

...I'm back.

I give up. "This( +|\n|\t)is( +|\n|\t)interesting( +|\n|\t)food" as a RegEx search target didn't work if the candidate actually spans lines (but does work if the candidate includes tabs). So it appears that TC RegEx can't search across lines. Good grief.

PS: The Proof

Candidate:
This is
interesting food

Plain text search:
This is\ninteresting food
succeeds.

RegEx search:
This is\ninteresting food
fails!?!?!?!?!
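One way to get exactly this behavior is a search that hands the file to the regex engine one line at a time, with the line endings already removed. Whether TC works this way is an assumption here; the Python sketch below (hypothetical function names) only shows that such a design reproduces the results above.

```python
import re

def search_per_line(pattern, text):
    # Assumed model: each line is matched separately, without its line ending.
    rx = re.compile(pattern)
    return any(rx.search(line) for line in text.splitlines())

def search_whole(pattern, text):
    # Alternative model: the whole text is matched as one string.
    return re.search(pattern, text) is not None

candidate = "This is\ninteresting food"
pattern = r"This is\ninteresting food"

per_line = search_per_line(pattern, candidate)
whole = search_whole(pattern, candidate)
# A tab stays inside a single line, so it is still found per line.
tab_case = search_per_line(r"This( +|\n|\t)is", "This\tis")
print(per_line, whole, tab_case)
```

In the per-line model the engine never sees a line break, so no pattern containing `\n` can match, while patterns with tabs still work.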

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 13:23 UTC
by Hacker
MarkFilipak,
Yes, again, that is also described in the Help.

Roman

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 19:03 UTC
by MarkFilipak
Hacker wrote:
2019-01-12, 13:23 UTC
MarkFilipak,
Yes, again, that is also described in the Help.

Roman
Technically, you are correct.
The "Find files: General" page does cite the regex-style behavior of a plain text search.
And the "Regular expressions" page does not list end-of-line as something that can be found.

I still find those quirks strange and surprising.
But ...enough said!

Happy 2019 to you, Roman!

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 20:00 UTC
by MarkFilipak
Hmmm...

RegEx searches for '\x0a' and '\x0d' also fail (as does '\x0a\x0d|\x0d\x0a' of course). Since searches for hex-chars are listed in TC Help, this behavior is surely a bug.

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-12, 20:32 UTC
by Hacker
MarkFilipak,
The "Find files: General" page does cite the regex-style behavior of a plain text search.
I was rather referring to the help page for the Lister search (when you open Lister and press F7, F1).
the "Regular expressions" page does not list end-of-line as something that can be found.
While not really prominent, it explicitly states:
"The other modificators are not relevant for Total Commander, because the program only supports searching within one line."
RegEx searches for '\x0a' and '\x0d' also fail
I'd guess the RegEx library simply cuts the line endings off?

Roman

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-13, 01:12 UTC
by MarkFilipak
It appears that Andrey V. Sorokin has revised TRegExpr, to wit:

https://regexpstudio.com/en/regexp_syntax.html
Metacharacters ...
$ is at the end of an input string, and, if modifier /m is On, also immediately preceding any occurrence of \x0D\x0A or \x0A or \x0D (if you are using the Unicode version of TRegExpr, then also \x2028 or \x2029 or \x0B or \x0C or \x85). Note that there is no empty line within the sequence \x0D\x0A.
https://regexpstudio.com/en/regexp_syntax.html#modifier_m
Modifiers ...
m
Treat string as multiple lines. That is, change ^ and $ from matching at only the very start or end of the string to the start or end of any line anywhere within the string, see also Line separators.
Can I look forward to TC implementing this new version?

PS: If modifier-m is implemented, can it be set on permanently? The reason is that searching within a line is simply a special case of searching across lines, thus, previous searches that are saved are unaffected by modifier-m. - M.
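To illustrate what the quoted /m modifier does, here is the analogous flag in Python's `re` module. This is not TRegExpr, and Python's multiline mode treats only \n as a line separator, so it is a sketch of the concept rather than of TC's behavior.

```python
import re

text = "first line\nsecond line"

# Without the flag, ^ and $ anchor only at the very start and end of the string.
without_m = re.findall(r"^\w+ line$", text)

# With the flag, ^ and $ also match at the start and end of every line.
with_m = re.findall(r"^\w+ line$", text, flags=re.MULTILINE)

# The flag affects only the anchors. A pattern containing \n matches
# either way, as long as the engine is given the whole text at once.
spans_lines = re.search(r"line\nsecond", text) is not None
print(without_m, with_m, spans_lines)
```

In this dialect the flag does change what ^ and $ match, so a saved search that uses those anchors could return different results with it switched on.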

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-13, 13:45 UTC
by Hacker
MarkFilipak,
Can I look forward to TC implementing this new version?
That'd be up to Christian to answer.

Roman

Re: Plain text search fails but RegEx succeeds

Posted: 2019-01-13, 21:08 UTC
by MarkFilipak
As a topic for general discussion... (I think poking brains is fun)

In the world of GUI, why are we still dragging around '\n' & '\t'? Why can't we simply feed text -- any text -- into a text-box and click "Find"? By "any text" I include new-lines and tabs and control chars and... anything. The current search input methods are CLI relics that can be abandoned.

So, what would submit the search string? Not '\n' -- that's so 'CLI'. What would submit the search string would be [ Find ].