rename regex bug?

Bug reports will be moved here when the described bug has been fixed

dindog
Senior Member
Posts: 316
Joined: 2010-10-18, 07:41 UTC

Post by *dindog »

Suppose there is a file named [123][abc]xyz.doc.
In Perl, \[.*\] is greedy, so it matches [123][abc].
To get a lazy match, a "?" needs to be added after the .* ; in other words, \[.*?\] matches [123].

Nevertheless, in TC's batch rename the regex seems to be greedy with or without the trailing "?". Is this a bug?
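
For comparison, the greedy/lazy distinction described above can be reproduced with Python's `re` module, which follows the same convention as Perl for `.*` and `.*?`. This is only a sketch of what a Perl-style engine does with this file name; it says nothing about how TC's own regex library behaves.

```python
import re

name = "[123][abc]xyz.doc"

# Greedy: .* takes as much as possible, so the match runs to the LAST "]".
greedy = re.search(r"\[.*\]", name).group()

# Lazy: .*? takes as little as possible, so the match stops at the FIRST "]".
lazy = re.search(r"\[.*?\]", name).group()

print(greedy)  # [123][abc]
print(lazy)    # [123]
```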
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

You don't appear to be able to turn off greedy mode; even the syntax recommended in TC's help file doesn't work.
e.g.
Search: (?-g)\[.*\]
---> Matches: [123][abc]

Whereas if I do this in EmEditor Search/Replace:
Find: \[.*?\]
---> Matches: [123]

Note that the same behaviour is seen in TC 7.56a as well.

This could explain the various problems I've had with TC's multi-rename tool in the past.
ghisler(Author)
Site Admin
Posts: 50479
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

I think there is some misunderstanding about how regular expressions are used in the multi-rename tool: they are applied not just once, but multiple times.

Example 1 (greedy):
Search name: [123][abc]xyz.doc
Search for: \[.*\]
Replace by: a

One match: [123][abc]
Resulting name: axyz.doc

Example 2 (non-greedy):
Search name: [123][abc]xyz.doc
Search for: \[.*?\]
Replace by: a

First match: [123]
Second match: [abc]
Resulting name: aaxyz.doc
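
The two results above can be checked with any engine that replaces all matches. A minimal sketch in Python, using the lazy form `\[.*?\]` for the non-greedy case; `re.sub` stands in for the multi-rename tool here, which is an assumption about equivalent behaviour and not a description of TC's internals:

```python
import re

name = "[123][abc]xyz.doc"

# re.sub replaces every non-overlapping match unless a count is given.
greedy_result = re.sub(r"\[.*\]", "a", name)   # one match: "[123][abc]"
lazy_result = re.sub(r"\[.*?\]", "a", name)    # two matches: "[123]" and "[abc]"

print(greedy_result)  # axyz.doc
print(lazy_result)    # aaxyz.doc
```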
Author of Total Commander
https://www.ghisler.com
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

Wouldn't it be better for that to be an option (a checkbox beside Subst)? Applying a regex multiple times makes it significantly harder to write a regex string that does what you want or expect.
ghisler(Author)
Site Admin
Posts: 50479
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Well, that's how search+replace works - if you don't use RegEx, you get the same thing, e.g. searching for "[" will replace all the "[", not just the first.
fleggy
Junior Member
Posts: 97
Joined: 2011-10-20, 07:00 UTC

Post by *fleggy »

I think the current behaviour (multiple replaces) is OK. Perhaps an option like Replace All/Replace Once (only for regexp, of course) could sometimes be useful.
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

@dindog,

This seems to work in both my editor with regex and TC's MRT:
Search: ^\[(.*?)\](.*)
Replace: $1__$2
It behaves the same whether [x] Subst is checked or not.
Looks like you need to add the trailing (.*) to prevent the MRT's recursion.

I guess it's fine as is; it just makes it slightly incompatible with other tools that use regex, since they allow a single replace as well as replace all (recursion).
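
A rough Python equivalent of the search/replace pair above, for anyone who wants to check it outside TC. Python writes backreferences as `\1` and `\2` where the post uses `$1` and `$2`; treating `re.sub` as a stand-in for the MRT is an assumption.

```python
import re

name = "[123][abc]xyz.doc"

pattern = r"^\[(.*?)\](.*)"
result = re.sub(pattern, r"\1__\2", name)

# The pattern is anchored with ^ and the trailing (.*) consumes the rest of
# the name, so a replace-all pass finds exactly one match.
matches = re.findall(pattern, name)

print(result)   # 123__[abc]xyz.doc
print(matches)  # [('123', '[abc]xyz.doc')]
```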
*BLINK* TC9 Added WM_COPYDATA and WM_USER queries for scripting.
fleggy
Junior Member
Posts: 97
Joined: 2011-10-20, 07:00 UTC

Post by *fleggy »

@Balderstrom
just to be precise - Replace All is not recursion. Replace All only finds all occurrences in the source string, one by one. I don't want to be a nitpicker, but regexps are my favourite topic. Sorry for a little OT :)
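
To illustrate the difference, here is a small sketch of what a replace-all pass amounts to: one left-to-right scan over the original string, with the replacement text never being scanned again. The helper name `replace_all` is made up for this example.

```python
import re

def replace_all(pattern, repl, text):
    """Replace every non-overlapping match, scanning left to right once."""
    out, pos = [], 0
    for m in re.finditer(pattern, text):
        out.append(text[pos:m.start()])  # untouched text before the match
        out.append(repl)                 # literal replacement, never rescanned
        pos = m.end()
    out.append(text[pos:])               # untouched tail after the last match
    return "".join(out)

name = "[123][abc]xyz.doc"
print(replace_all(r"\[.*?\]", "[x]", name))  # [x][x]xyz.doc
```

The replacement "[x]" itself matches the pattern, so a truly recursive scheme would never terminate; the single scan stops after two matches.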
white
Power Member
Posts: 5789
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Post by *white »

As a workaround, it is fairly easy to adjust the search and replace strings so that the replacement is done only once: make sure the search string matches the whole name. Generally you can do this by adding "^.*?" at the beginning (the "^" is there for higher efficiency) and ".*" at the end.
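
A minimal sketch of this workaround in Python. The capture groups around the added prefix and suffix, and the `\1`/`\2` in the replacement, are my assumption about how the surrounding text would be put back; the post itself only names the `^.*?` and `.*` additions.

```python
import re

name = "[123][abc]xyz.doc"

# Replace-all with the bare pattern touches both bracket groups.
all_matches = re.sub(r"\[.*?\]", "a", name)

# Wrapped so the pattern matches the whole name: a lazy prefix, the target,
# and a greedy suffix. The groups put the prefix and suffix back unchanged.
once = re.sub(r"^(.*?)\[.*?\](.*)", r"\1a\2", name)

print(all_matches)  # aaxyz.doc
print(once)         # a[abc]xyz.doc
```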
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

@fleggy, I must have tested wrong at some point.

This, ^\[(.*?)\], works the same in TC's MRT and in my editor with a proper regex implementation.

Note: It's actually not a small task to find an editor that implements regex properly. When I was searching for a new text editor a little over two years ago, there were dozens that either 1) didn't support regex at all, 2) supported only a subset of regex, or 3) claimed full regex support but had major parsing bugs. For example, Notepad2 has a really poor regex implementation.
ghisler(Author)
Site Admin
Posts: 50479
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

In beta 8, there will be a new checkbox "1x" which allows replacing just the first match. This will work with and without regular expressions.
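
For readers more familiar with scripting languages, the closest analogue to such a checkbox is a replacement count of 1. A sketch in Python, covering both the regex and the plain-text case; this only illustrates the idea and is not TC's implementation:

```python
import re

name = "[123][abc]xyz.doc"

# Regex search: count=1 stops after the first match.
regex_all = re.sub(r"\[.*?\]", "a", name)
regex_once = re.sub(r"\[.*?\]", "a", name, count=1)

# Plain-text search: the third argument of str.replace limits the count.
plain_all = name.replace("[", "(")
plain_once = name.replace("[", "(", 1)

print(regex_all, regex_once)  # aaxyz.doc a[abc]xyz.doc
print(plain_all, plain_once)  # (123](abc]xyz.doc (123][abc]xyz.doc
```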
white
Power Member
Posts: 5789
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Post by *white »

Tested OK using TC 8.0 beta 8 32-bit.
ghisler(Author)
Site Admin
Posts: 50479
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Thanks! Can anyone else try it too, please?
Samuel
Power Member
Posts: 1930
Joined: 2003-08-29, 15:44 UTC
Location: Germany, Brandenburg an der Havel

Post by *Samuel »

I tested it too and it worked like a charm. With and without RegEx.
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

Seems to work well, given the following file input:

default - Copy.bar
default - Copy.br2
default - Copy - Copy.bar
default - Copy - Copy.br2
tcignore - Copy.txt
tcignore - Copy - Copy.txt
wincmd - Copy.ini
wincmd - Copy - Copy.ini
=====================
Search: (.*?)Copy
Replace: ($1)FOO
Options Enabled: 1x, Regex
=====================
Only the first "Copy" is replaced. Without 1x, both are matched and replaced.

I would recommend adding an example of 1x to the help file that advises using (.*?) with globs, because without the question mark 1x has no effect at all. Further, if you use two globs, you need two question marks; otherwise 1x has no effect and only one 'Copy' is replaced (which one depends on the pattern), e.g.

Search: (.*)Copy(.*)
Replace: ($1)FOO[$2]

→ The second (or sole) Copy is replaced. [1x] has no effect.


Search: (.*?)Copy(.*)

→ The first (or sole) Copy is replaced. [1x] has no effect.


Search: (.*?)Copy(.*?)

→ Both 'Copy' strings are replaced.
→ 1x: The first 'Copy' is replaced.
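
The four patterns above can be compared side by side with a short Python sketch. `re.sub` with `count=1` is used as a stand-in for the checkbox, and the replace string for the last two patterns is assumed to be the same `($1)FOO[$2]` as in the second one (the post does not state it). The helper name `rename` is made up for this example, and TC's engine may differ in details.

```python
import re

name = "default - Copy - Copy.bar"

def rename(search, replace, once=False):
    # count=0 means "replace all" in re.sub; count=1 mimics a replace-once option.
    return re.sub(search, replace, name, count=1 if once else 0)

cases = [
    (r"(.*?)Copy",      r"(\1)FOO"),
    (r"(.*)Copy(.*)",   r"(\1)FOO[\2]"),
    (r"(.*?)Copy(.*)",  r"(\1)FOO[\2]"),   # replace string assumed
    (r"(.*?)Copy(.*?)", r"(\1)FOO[\2]"),   # replace string assumed
]

for search, replace in cases:
    print(search)
    print("  all: ", rename(search, replace))
    print("  once:", rename(search, replace, once=True))
```

Only the first and last patterns give different results for "all" and "once"; in the other two, the greedy trailing (.*) swallows the rest of the name, so there is never a second match to suppress.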