Find text:"alpha" AND "bravo" in arch.

Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

Multi-line searches don't work in TC, period.
The file I tested with is 912 bytes, and TC refuses to find a multi-line regex match in it, whereas a dinky 132 KB application, grep.exe, does so happily.

AFAIK the whole file doesn't need to be in memory to search it for a multi-line match: you flag partial matches and keep scanning the file until a partial match either becomes a complete match or fails.
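
A minimal Python sketch of that idea, for the simpler case in the thread title (both keywords anywhere in the file, regardless of line breaks). It is not a streaming regex engine and says nothing about how TC or grep.exe work internally; it uses plain substring search over chunks and carries a short tail across chunk boundaries so that a keyword split between two reads is still found. The name `contains_all` and the `chunk_size` parameter are made up for illustration.

```python
import io


def contains_all(stream, keywords, chunk_size=64 * 1024):
    """Return True if every keyword (bytes) occurs somewhere in the binary stream.

    Only one chunk plus a short carried-over tail is held in memory at a time.
    """
    remaining = set(keywords)
    if not remaining:
        return True
    # A keyword split across reads can have at most len(keyword) - 1 bytes
    # in the data already seen, so that is all we need to carry over.
    overlap = max(len(k) for k in remaining) - 1
    tail = b""
    while remaining:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file with some keywords still missing
        window = tail + chunk
        # Drop every keyword that has now been seen.
        remaining = {k for k in remaining if k not in window}
        tail = window[-overlap:] if overlap > 0 else b""
    return not remaining


if __name__ == "__main__":
    data = b"first line with alpha\nsecond line\nthird line with bravo\n"
    # A tiny chunk size forces keywords to straddle chunk boundaries.
    print(contains_all(io.BytesIO(data), [b"alpha", b"bravo"], chunk_size=4))
```

For a real file, the stream would be opened in binary mode, e.g. `open(path, "rb")`, with the keywords encoded to the file's encoding first.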
Postkutscher
Power Member
Posts: 556
Joined: 2006-04-01, 00:11 UTC

Post by *Postkutscher »

white wrote:
Balderstrom wrote: Why wouldn't a regex search of:
.*(Alpha.*Bravo|Bravo.*Alpha).* work?
Because Total Commander does a line-by-line search, and "Alpha" and "Bravo" may be on different lines.
Fortunately this was not a problem in my case, but I have UTF-8 encoded files, so regex could not be used anyway. I tried to manually translate my keywords into the ANSI code page 1251 and then search for them, but it doesn't work. Besides that, the search takes about 6 hours for a 100 MB directory, with one of the two processor cores at 100% load.
white
Power Member
Posts: 4676
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Post by *white »

Postkutscher wrote:
white wrote:
Balderstrom wrote: Why wouldn't a regex search of:
.*(Alpha.*Bravo|Bravo.*Alpha).* work?
Because Total Commander does a line-by-line search, and "Alpha" and "Bravo" may be on different lines.
Fortunately this was not a problem in my case, but I have UTF-8 encoded files, so regex could not be used anyway. I tried to manually translate my keywords into the ANSI code page 1251 and then search for them, but it doesn't work. Besides that, the search takes about 6 hours for a 100 MB directory, with one of the two processor cores at 100% load.
Try searching for the binary (UTF-8) byte codes of your keywords, and remove the leading and trailing ".*", because they make the search very inefficient.

A regex search of:
\x41\x6c\x70\x68\x61.*\x42\x72\x61\x76\x6F|\x42\x72\x61\x76\x6F.*\x41\x6c\x70\x68\x61

is the same as a regex search of:
Alpha.*Bravo|Bravo.*Alpha
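
Writing those escapes by hand gets tedious for non-ASCII keywords, so here is a small Python sketch that generates them. The helper names `to_hex_escapes` and `both_orders` are made up for illustration. The check at the end uses Python's own `re` module on bytes; whether TC's regex engine treats `\xNN` the same way for bytes above 0x7F is an assumption based on the tip above, not something verified here.

```python
import re


def to_hex_escapes(text, encoding="utf-8"):
    """Encode text and write every resulting byte as a \\xNN regex escape."""
    return "".join(f"\\x{b:02x}" for b in text.encode(encoding))


def both_orders(a, b, encoding="utf-8"):
    """Build 'A.*B|B.*A' from the escaped bytes of both keywords."""
    ea = to_hex_escapes(a, encoding)
    eb = to_hex_escapes(b, encoding)
    return f"{ea}.*{eb}|{eb}.*{ea}"


if __name__ == "__main__":
    print(to_hex_escapes("Alpha"))  # \x41\x6c\x70\x68\x61
    # Pattern to paste into the search field for two Cyrillic keywords:
    pattern = both_orders("Альфа", "Браво")
    print(pattern)
    # Sanity check against UTF-8 bytes with Python's regex engine.
    line = "тест Браво и Альфа".encode("utf-8")
    print(re.search(pattern.encode("ascii"), line) is not None)
```

Passing `encoding="cp1251"` instead would produce the escapes for files stored in the 1251 code page.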
Postkutscher
Power Member
Posts: 556
Joined: 2006-04-01, 00:11 UTC

Post by *Postkutscher »

2white
Thank you for the tip, I will try it tomorrow.

But I still think it is not very handy to write such regexps for the simplest search. This should be improved.

Moderators, could you please move this thread to suggestions?
Hacker
Moderator
Posts: 13102
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Post by *Hacker »

Moved to the Suggestions forum.

Hacker (Moderator)
Suppose you press Ctrl+F, select the FTP connection (with a saved password), but instead of clicking Connect, you drop dead.
Postkutscher
Power Member
Posts: 556
Joined: 2006-04-01, 00:11 UTC

Post by *Postkutscher »

2Hacker, thanks.

2white
Your tip speeds up the search a lot, thank you.

But as far as I can see, it is more efficient to unpack my files manually and then run the regular search twice. Very inconvenient :(
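
For what it's worth, the unpack-and-search-twice routine can be scripted outside TC. A rough Python sketch follows, assuming the archives are ZIP files (the thread does not say which format is used) and that the packed files are UTF-8 text small enough to read into memory one at a time. The function name `members_with_all` is made up for illustration.

```python
import io
import zipfile


def members_with_all(archive, keywords, encoding="utf-8"):
    """Return the names of archive members that contain every keyword.

    `archive` is a path or a binary file-like object holding a ZIP file.
    """
    needles = [k.encode(encoding) for k in keywords]
    hits = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            data = zf.read(info)  # reads the whole member into memory
            # Byte-level search, so line breaks between keywords don't matter.
            if all(n in data for n in needles):
                hits.append(info.filename)
    return hits


if __name__ == "__main__":
    # Build a small in-memory archive to try it out.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("both.txt", "Альфа\nБраво\n".encode("utf-8"))
        zf.writestr("one.txt", "Альфа only\n".encode("utf-8"))
    buf.seek(0)
    print(members_with_all(buf, ["Альфа", "Браво"]))  # ['both.txt']
```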
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

AFAICT grep and the other tools from UnxUtils aren't any help in these particular cases either, unless there is some setting I am missing. Every time I try to use a Linux-based tool with anything remotely close to Unicode, UTF-8 or the like, it fails miserably (even the ones specifically compiled for Windows).

On Slashdot there's a common meme about when it will be the "year of the Linux desktop". To me, that sure as hell isn't going to happen until Linux starts supporting more than basic ANSI.
*BLINK* TC9 Added WM_COPYDATA and WM_USER queries for scripting.