Content plugins and invalid filenames in MRT

Lefteous
Power Member
Posts: 9536
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

Content plugin values can be used for various purposes. They can simply be displayed, or used as input for renaming files.
In my xPDFSearch plugin there is a field that reads the first ~1000 characters of text in a PDF file. I use this field as input for renaming PDF files. Unfortunately that doesn't always work. TC often displays a message that certain files could not be renamed in MRT. When I look at the filenames displayed in bold, I cannot see any invalid characters. I haven't investigated this in detail (by debugging xPDFSearch), but I guess it might be unprintable characters.

Of course I could filter them out, but these characters are not a problem, and are sometimes even useful, unless the field is used for renaming. So what do you think is the best solution?

milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 »

Why debug?
Just create a custom column for such special files, use
cm_CopyFileDetailsToClip
and paste the result into some decent text editor, e.g. Notepad2-mod (http://xhmikosr.github.io/notepad2-mod/),
where control characters are clearly highlighted with glyphs.
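
If no such editor is at hand, a few lines of Python can make control characters visible in a copied field value. This is only a sketch: `reveal_controls` and `control_positions` are hypothetical helper names, and mapping each control character to its Unicode "control picture" (U+2400 block) is my own choice, not what TC or Notepad2-mod does internally.

```python
def reveal_controls(text: str) -> str:
    """Replace C0 control characters and DEL with visible Unicode control pictures."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20:
            out.append(chr(0x2400 + code))  # e.g. CR (0x0D) -> U+240D
        elif code == 0x7F:
            out.append("\u2421")            # DEL -> U+2421
        else:
            out.append(ch)
    return "".join(out)


def control_positions(text: str) -> list:
    """Return (index, code point) pairs for every control character found."""
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if ord(ch) < 0x20 or ord(ch) == 0x7F]
```

Pasting the clipboard content into `control_positions` shows at which offsets the suspicious characters sit.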

I would add a configurable plugin option to xPDFSearch that lets the user decide whether to pre-filter forbidden characters.
Alternatively, maybe create a second plugin field "First line safe" which has that pre-filter applied, but only shows up if the option is set.
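
Such an option could look roughly like the sketch below. It is written in Python for brevity, not against the content plugin interface; `field_value`, `FORBIDDEN` and the `safe` flag are made-up names, and dropping the offending characters (rather than substituting them) is just one possible choice.

```python
# Characters that Windows does not allow in file names.
FORBIDDEN = set('\\/:*?"<>|')


def field_value(text: str, safe: bool = False) -> str:
    """Return the raw field text, or a rename-safe variant when `safe` is set."""
    if not safe:
        return text  # unfiltered: fine for display, risky for renaming
    # Drop forbidden characters and all control characters (code < 32).
    return "".join(ch for ch in text if ch not in FORBIDDEN and ord(ch) >= 32)
```

The unfiltered value stays available for display, while the "safe" variant is the one intended for renaming.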

I also pre-filter the characters forbidden on Windows in PCREsearch's random fields, but I'm thinking of allowing them in the next version, again depending on whether the user sets or unsets a plugin option.
TC plugins: PCREsearch and RegXtract

tbeu
Power Member
Posts: 1354
Joined: 2003-07-04, 07:52 UTC
Location: Germany

Post by *tbeu »

I noticed that some characters (invalid in file names) are substituted, e.g. '?' is replaced by '_' in MRT.
TC plugins: Autodesk 3ds Max / Inventor / Revit Preview, FileInDir, ImageMetaData (JPG Comment/EXIF/IPTC/XMP), MATLAB MAT-file Viewer, Mover, SetFolderDate, Solid Edge Preview, Zip2Zero and more

milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 »

tbeu wrote: I noticed that some characters (invalid in file names) are substituted, e.g. '?' is replaced by '_' in MRT.
Indeed, it seems that:

\  ->  stays
/  ->  _
:  ->  .
*  ->  _
?  ->  _
"  ->  '
<  ->  _
>  ->  _
|  ->  _
Besides that, every newline (CR, LF or a combination of both) is replaced with a space, just like the normal tab character
(but this is already done right after an ft_string(w) field is returned).
Not sure about the Unicode whitespace characters though.
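
The substitution table above can be written down as a small translation map. This Python sketch only models the substitutions observed in this thread; it is not TC's actual code, and `MRT_SUBSTITUTIONS` and `apply_observed_substitutions` are names invented here.

```python
# Observed replacements for characters that are invalid in Windows file names.
MRT_SUBSTITUTIONS = str.maketrans({
    "/": "_",
    ":": ".",
    "*": "_",
    "?": "_",
    '"': "'",
    "<": "_",
    ">": "_",
    "|": "_",
    # The backslash is deliberately absent: it stays as it is.
})


def apply_observed_substitutions(value: str) -> str:
    """Apply the character replacements listed in the table."""
    return value.translate(MRT_SUBSTITUTIONS)
```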

Despite these filters, most of the remaining ASCII control characters (0x01-0x1F) seem to pass through unfiltered, with one catch:
they are replaced with the corresponding CP 437
characters, whether in MRT, in custom columns or elsewhere for display,
and when you copy them with
cm_CopyFileDetailsToClip
you get e.g. Unicode glyphs in Scintilla-based editors.
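
For reference, the glyphs that code page 437 conventionally assigns to the code points 0x01-0x1F can be listed as in the Python sketch below. The table is spelled out by hand because Python's own `cp437` codec decodes these bytes to plain control characters. Whether TC uses exactly this mapping is an assumption based on the observation above, and `as_cp437_glyphs` is a hypothetical name.

```python
# Conventional CP 437 graphics for code points 0x01..0x1F (31 glyphs),
# written as Unicode escapes.
CP437_CONTROL_GLYPHS = (
    "\u263a\u263b\u2665\u2666\u2663\u2660\u2022\u25d8"  # 0x01-0x08
    "\u25cb\u25d9\u2642\u2640\u266a\u266b\u263c\u25ba"  # 0x09-0x10
    "\u25c4\u2195\u203c\u00b6\u00a7\u25ac\u21a8\u2191"  # 0x11-0x18
    "\u2193\u2192\u2190\u221f\u2194\u25b2\u25bc"        # 0x19-0x1F
)


def as_cp437_glyphs(text: str) -> str:
    """Show control characters 0x01-0x1F as their CP 437 glyphs."""
    return "".join(
        CP437_CONTROL_GLYPHS[ord(ch) - 1] if 0x01 <= ord(ch) <= 0x1F else ch
        for ch in text
    )
```

With this mapping a CR shows up as a musical note (U+266A) and an LF as an inverse circle (U+25D9).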

Did Ghisler ever explain that in detail?
It should be documented somewhere IMO.

Lefteous
Power Member
Posts: 9536
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

milo1012 wrote: Did Ghisler ever explain that in detail?
It should be documented somewhere IMO.
Yes, I was hoping that Christian could give us a clue here.

ghisler(Author)
Site Admin
Posts: 50390
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

TC expects user-readable text fields, as used for custom columns etc. You cannot use binary fields here; it just doesn't make any sense. TC doesn't show fields of type ft_fulltext in the rename tool or for custom columns exactly for this reason.
Author of Total Commander
https://www.ghisler.com

Lefteous
Power Member
Posts: 9536
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

2ghisler(Author)
I'm already using a function to filter out invalid characters, but as I don't know which characters are not allowed, it's a bit difficult to catch them all by example. In my case \r and \n were the problem. On the one hand I could filter out all characters not allowed in filenames, but I think that would be too much. So which ones should I filter out, and which are handled by MRT?
ghisler(Author) wrote: TC doesn't show fields of type ft_fulltext in the rename tool or for custom columns exactly for this reason.
I'm talking about all the other fields in xPDFSearch, mainly ft_string.
Last edited by Lefteous on 2015-02-09, 08:42 UTC, edited 1 time in total.

milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 »

Lefteous wrote: In my case \r and \n had been the problem.
It should be clarified:

CR/LF combinations are replaced with one space character, and a tab with two space characters,
but only for display of the ft_string field (and therefore visible in cm_CopyFileDetailsToClip).

As soon as you try to use the field in MRT, CR and LF are still there and are of course problematic.
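
The difference can be sketched as follows. The Python function models only the display-side behaviour described here (a CR/LF combination becomes one space, a tab two spaces); treating a lone CR or LF the same way as the CR+LF pair is an assumption, and `display_form` is a made-up name.

```python
import re


def display_form(value: str) -> str:
    """Model what is shown in a custom column for an ft_string value."""
    value = re.sub(r"\r\n|\r|\n", " ", value)  # each line break -> one space
    return value.replace("\t", "  ")           # tab -> two spaces


raw = "Title\r\nSubtitle\tpage 1"
shown = display_form(raw)  # what the custom column would show
# `raw` is what the rename tool would still receive, line break included.
```

This is why the bold filenames look harmless: the displayed form no longer contains the characters that make the rename fail.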


IMO TC should replace them with a space in MRT too, like the other chars listed above (should be a trivial task).

ghisler(Author)
Site Admin
Posts: 50390
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

milo1012 wrote: IMO TC should replace them with space in MRT too, like the other chars listed above (should be a trivial task).
OK, no problem, I can add that.

Lefteous
Power Member
Posts: 9536
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

2ghisler(Author)
Thanks!

tbeu
Power Member
Posts: 1354
Joined: 2003-07-04, 07:52 UTC
Location: Germany

Post by *tbeu »

If this is supposed to be fixed by
13.02.15 Fixed: Multi-rename tool: Replace characters (received from plugins) with code <32 (e.g. tab, line break) by spaces (32/64)
in TC 8.52b1, I wonder why it was not fixed for the characters listed above.
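
Read literally, that changelog entry amounts to something like the following Python sketch. It illustrates the described behaviour only and is not TC's implementation; `replace_low_codes` is a hypothetical name.

```python
def replace_low_codes(value: str) -> str:
    """Replace every character with a code below 32 by a space."""
    return "".join(" " if ord(ch) < 32 else ch for ch in value)
```

A rule like this touches tab, CR and LF, but leaves printable characters such as '?' or ':' alone.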

Lefteous
Power Member
Posts: 9536
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

2tbeu
tbeu wrote: in TC8.52b1 I wonder why it was not fixed for the listed characters above.
CR and LF are both below 32. Or which are the 'listed characters above'?

tbeu
Power Member
Posts: 1354
Joined: 2003-07-04, 07:52 UTC
Location: Germany

Post by *tbeu »

I know, but why not fix all invalid chars?

ghisler(Author)
Site Admin
Posts: 50390
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Which ones do you mean? The only one which isn't replaced now is the backslash, but this is intentional: the tool now allows moving files to subfolders.

tbeu
Power Member
Posts: 1354
Joined: 2003-07-04, 07:52 UTC
Location: Germany

Post by *tbeu »

All the ones that milo1012 listed above.