Content plugins and invalid filenames in MRT
Content plugin values can be used for various purposes. They can be just displayed or used as an input for renaming files.
In my xPDFSearch plugin there is a field that reads the first ~1000 characters of text in a PDF file. I use this field as input for renaming PDF files. Unfortunately that doesn't always work. TC often displays a message that certain files could not be renamed in MRT. When I look at the file names displayed in bold, I cannot see any invalid characters. I haven't investigated this in detail (by debugging xPDFSearch), but I guess it might be unprintable characters.
Of course I could filter them out, but these characters are not a problem, and are sometimes even useful, unless the value is used for renaming. So what do you think is the best solution?
Why debug?
Just create a custom column for such special files, use cm_CopyFileDetailsToClip and paste the result into some decent text editor, e.g. Notepad2-mod (http://xhmikosr.github.io/notepad2-mod/), where control characters are clearly highlighted with glyphs.
I would use a configurable plugin option for xPDFSearch, to let the user decide if he wants to pre-filter forbidden characters.
Maybe alternatively create a 2nd plugin field "First line safe" which has that prefilter applied, but only shows up if the option is set.
I also apply a pre-filter with the Windows forbidden characters in PCREsearch's random fields, but I'm thinking of allowing them in the next version when the user sets/unsets a plugin option too.
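To illustrate the option idea, here is a minimal sketch in Python. It is only an assumption of how such a pre-filter could look: the names `WINDOWS_FORBIDDEN` and `first_line_field` are made up, and neither xPDFSearch nor PCREsearch is implemented this way as far as this thread shows. Dropping the characters (instead of replacing them) is also just one possible choice.

```python
# Hypothetical sketch: none of these names come from xPDFSearch or PCREsearch.
WINDOWS_FORBIDDEN = set('\\/:*?"<>|')


def first_line_field(text: str, prefilter: bool) -> str:
    """Return the field value, optionally made safe for use in a file name."""
    if not prefilter:
        # Option unset: hand back the raw value, control characters included.
        return text
    # Option set: drop control characters (code < 32) and the characters
    # that Windows does not allow in file names.
    return "".join(
        ch for ch in text
        if ord(ch) >= 32 and ch not in WINDOWS_FORBIDDEN
    )
```

With the option unset the field keeps everything, so it stays useful for display or searching; with the option set the same field can be fed into a rename mask.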
TC plugins: PCREsearch and RegXtract
I noticed that some characters (invalid in file names) are substituted, e.g. '?' is replaced by '_' in MRT.
TC plugins: Autodesk 3ds Max / Inventor / Revit Preview, FileInDir, ImageMetaData (JPG Comment/EXIF/IPTC/XMP), MATLAB MAT-file Viewer, Mover, SetFolderDate, Solid Edge Preview, Zip2Zero and more
tbeu wrote: I noticed that some characters (invalid in file names) are substituted, e.g. '?' is replaced by '_' in MRT.
Indeed, it seems that:
\ -> stays
/ -> _
: -> .
* -> _
? -> _
" -> '
< -> _
> -> _
| -> _
(but this is already done right after a ft_string(w) field is returned)
Not sure about the Unicode whitespace characters though.
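For reference, the substitutions listed above can be written down as a small translation table. This is only a Python sketch of the observed behaviour, not TC's actual code; the names `MRT_SUBSTITUTIONS` and `apply_observed_substitutions` are made up.

```python
# Observed substitutions from the list above; not taken from TC's source.
MRT_SUBSTITUTIONS = str.maketrans({
    "/": "_",
    ":": ".",
    "*": "_",
    "?": "_",
    '"': "'",
    "<": "_",
    ">": "_",
    "|": "_",
})  # The backslash is deliberately absent: it stays as it is.


def apply_observed_substitutions(value: str) -> str:
    """Apply the character substitutions observed for plugin fields."""
    return value.translate(MRT_SUBSTITUTIONS)
```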
Despite these filters, most of the remaining ASCII control characters (0x01-0x1F) seem to stay unaltered. However, they are displayed as the corresponding CP 437 characters, no matter whether in MRT, in custom columns or elsewhere, and when you copy them with cm_CopyFileDetailsToClip you get e.g. Unicode glyphs in Scintilla-based editors.
Did Ghisler ever explain that in detail?
It should be documented somewhere IMO.
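To make the CP 437 remark concrete: in the classic CP 437 character set the codes 0x01-0x1F have graphical glyphs, so a control character ends up looking like a symbol. The following Python sketch mimics that display using the standard CP 437 glyph layout. The names are made up and this is not how TC does it internally, it only shows why e.g. a stray CR may appear as a music note.

```python
# Glyphs that CP 437 assigns to the codes 0x01..0x1F, in order.
CP437_CONTROL_GLYPHS = "☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"


def show_control_chars(value: str) -> str:
    """Replace each control character (0x01-0x1F) with its CP 437 glyph."""
    return "".join(
        CP437_CONTROL_GLYPHS[ord(ch) - 1] if 1 <= ord(ch) <= 0x1F else ch
        for ch in value
    )
```

A helper like this can also be used on the plugin side to spot which control characters a field really contains, without debugging.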
TC plugins: PCREsearch and RegXtract
- ghisler(Author)
- Site Admin
- Posts: 50390
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
TC is expecting user-readable text fields, as used for custom columns etc. You cannot use binary fields here, it just doesn't make any sense. TC doesn't show fields of type ft_fulltext in the rename tool or for custom columns exactly for this reason.
Author of Total Commander
https://www.ghisler.com
2ghisler(Author)
I'm already using a function to filter out invalid characters, but as I don't know which characters are not allowed, it's a bit difficult to catch them all by example. In my case \r and \n were the problem. I could of course filter out all characters that are not allowed in file names, but I think that would be too much. So which ones should I filter out, and which are handled by MRT?
ghisler(Author) wrote: TC doesn't show fields of type ft_fulltext in the rename tool or for custom columns exactly for this reason.
I'm talking about all the other fields in xPDFSearch, mainly ft_string.
Last edited by Lefteous on 2015-02-09, 08:42 UTC, edited 1 time in total.
Lefteous wrote: In my case \r and \n had been the problem.
It should be clarified:
CR/LF combinations are replaced with one space character, a tab with two space characters,
but only for display of the ft_string field (and therefore visible in cm_CopyFileDetailsToClip).
As soon as you try to use it in MRT, CR+LF are still there and are of course problematic.
IMO TC should replace them with space in MRT too, like the other chars listed above (should be a trivial task).
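A minimal Python sketch of the display behaviour described above, just to make the difference visible. The function name `display_form` is made up, and treating a lone CR or a lone LF like a CR+LF pair is my assumption; I only checked the combination.

```python
import re


def display_form(value: str) -> str:
    """Mimic the observed display of a ft_string field."""
    # A CR/LF combination becomes one space. Handling a lone CR or LF
    # the same way is an assumption, not something verified in TC.
    value = re.sub(r"\r\n|\r|\n", " ", value)
    # A tab becomes two spaces.
    return value.replace("\t", "  ")
```

So `display_form("a\r\nb\tc")` gives `"a b  c"`, while the value handed to MRT would still be the original string with CR+LF in it.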
TC plugins: PCREsearch and RegXtract
- ghisler(Author)
milo1012 wrote: IMO TC should replace them with space in MRT too, like the other chars listed above (should be a trivial task).
OK, no problem, I can add that.
Author of Total Commander
https://www.ghisler.com
If this should be fixed by
13.02.15 Fixed: Multi-rename tool: Replace characters (received from plugins) with code <32 (e.g. tab, line break) by spaces (32/64)
in TC8.52b1, I wonder why it was not fixed for the characters listed above.
TC plugins: Autodesk 3ds Max / Inventor / Revit Preview, FileInDir, ImageMetaData (JPG Comment/EXIF/IPTC/XMP), MATLAB MAT-file Viewer, Mover, SetFolderDate, Solid Edge Preview, Zip2Zero and more
I know, but why not fix all invalid chars?
TC plugins: Autodesk 3ds Max / Inventor / Revit Preview, FileInDir, ImageMetaData (JPG Comment/EXIF/IPTC/XMP), MATLAB MAT-file Viewer, Mover, SetFolderDate, Solid Edge Preview, Zip2Zero and more
- ghisler(Author)
Which ones do you mean? The only one which isn't replaced now is the backslash, but this is intentional: the tool now allows moving files to subfolders.
Author of Total Commander
https://www.ghisler.com
All the ones that milo1012 listed above.
TC plugins: Autodesk 3ds Max / Inventor / Revit Preview, FileInDir, ImageMetaData (JPG Comment/EXIF/IPTC/XMP), MATLAB MAT-file Viewer, Mover, SetFolderDate, Solid Edge Preview, Zip2Zero and more