Preliminary information about Unicode support (TC7.5)

milo1012
Power Member
Posts: 1104
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 » 2016-01-19, 15:02 UTC

Lefteous wrote:It somehow must be 'combined' with 'Office xml'.
Erm, but there is no rule that the wdx full text search applies to "office style" files only.
It was intended to filter ANY file type. For example, I use the full text search in my APK-wdx plug-in to search the app string pool.
So it wouldn't make any sense if you could only choose a plug-in field once you had "activated" the Office/XML search.

What I meant is:
Showing all wdx full text fields (dropdown list etc.) in any case, so that users don't need to navigate to the plugin tab and enter a term there.
TC can use a raw content search, either by looking for ANSI/OEM text, UTF-8 text, UTF-16 text, or by gracefully unpacking XML-style office files and searching in them.
Offering a plug-in search in that same dialog location greatly advertises an alternative search engine, no matter the file type.
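To make the raw content search modes mentioned above concrete, here is a minimal sketch (not TC's actual implementation): the search term is encoded once per candidate encoding and looked up in the raw file bytes. The code pages cp1252 (for "ANSI") and cp437 (for "OEM") are assumptions, as the real ones depend on the system locale; the function name find_text_raw is made up for this illustration.

```python
def find_text_raw(data: bytes, term: str,
                  encodings=("cp1252", "cp437", "utf-8", "utf-16-le")):
    """Return the encodings under which `term` occurs in the raw bytes `data`.

    cp1252/cp437 stand in for "ANSI"/"OEM"; the real code pages are
    locale-dependent (assumption for this sketch).
    """
    hits = []
    for enc in encodings:
        try:
            needle = term.encode(enc)
        except UnicodeEncodeError:
            # The term cannot be represented in this code page, so skip it.
            continue
        if needle in data:
            hits.append(enc)
    return hits


if __name__ == "__main__":
    sample = "Dateigröße: 42".encode("utf-16-le")
    print(find_text_raw(sample, "größe"))
```

A term that cannot be encoded in a code page (e.g. Cyrillic text in cp1252) is simply skipped for that encoding, which is one reason a Unicode-aware search path matters.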
Lefteous wrote:...you have to define if you want to use the internal mechanism or for example using the "TextSearch" plugin
Exactly.
That's why we'd need the usual checkbox, so that you can combine all search mechanisms as you like.
The point was that the plug-in search needs to be independent from the internal Office/XML search.
TC plugins: PCREsearch and RegXtract

Lefteous
Power Member
Posts: 9457
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous » 2016-01-19, 16:04 UTC

2milo1012
For me, using content plugins for fulltext search is more of a general settings topic than a per-search settings topic. This means that once configured, you just use it in search. It is debatable whether there should be an overall option to use it or not.

What I imagine is a dialog where users can add filetypes, as in other parts of TC (very much as in sync. dirs). By default there is only the option to search inside Office documents. Other rules may override it. A rule would be an assignment of a filetype to exactly one fulltext search plugin field. The rules would work like this (first come, first served):

*.docx;*.ods --> textsearch.text
*.pdf --> xpdfsearch.text
*.xyz --> xyz.text
Office documents --> use internal search (this static rule cannot be edited or deleted, but moved up or down)
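The "first come, first served" evaluation of the rule list above could be sketched as follows. This is only an illustration of the proposal, not anything TC implements: the names RULES, INTERNAL and resolve_field are made up, and the wildcard mask standing in for "Office documents" is a guess, since the post does not say which extensions the internal search covers.

```python
from fnmatch import fnmatch

# Marker for the static rule "use internal search" (hypothetical name).
INTERNAL = "<internal office search>"

# Ordered rule list from the post; earlier rules win.
RULES = [
    ("*.docx;*.ods", "textsearch.text"),
    ("*.pdf", "xpdfsearch.text"),
    ("*.xyz", "xyz.text"),
    # Assumed mask for "Office documents"; the real set is not given.
    ("*.docx;*.xlsx;*.odt;*.ods", INTERNAL),
]


def resolve_field(filename, rules=RULES):
    """Return the field of the first rule whose mask matches, else None."""
    name = filename.lower()
    for masks, field in rules:
        if any(fnmatch(name, mask) for mask in masks.split(";")):
            return field
    return None


if __name__ == "__main__":
    for f in ("report.docx", "Manual.PDF", "sheet.xlsx", "notes.txt"):
        print(f, "->", resolve_field(f))
```

Because *.docx appears in the first rule, the plugin field overrides the internal search for that type, while *.xlsx falls through to the static rule.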

Some links:
http://ghisler.ch/board/viewtopic.php?t=41926
http://ghisler.ch/wiki/index.php/Integrate_fulltext_content_plugin_fields_into_find_text

Post by *milo1012 » 2016-01-19, 16:29 UTC

Lefteous wrote:For me using content plugins for fulltext search is more a general settings topic, not a per-search settings topic.
I disagree.
My daily search routines just don't work that way.
I have e.g. certain dirs with office/business documents I want to search, so I want and need a "per-search setting", simple as that.
What's the point in offering plugins for fulltext search if you don't use them how they were intended?
We can have the most powerful search/text filters behind it, but we don't offer them to the user in a directly visible way? It's quite obscure.
I'm really tempted to start a poll about this.

My whole point was that the uncomfortable "go to Plugins tab and enter text" method is really not how I want to use plug-ins with (maybe) better search capabilities.
Additionally offering them in the main search dialog tab wouldn't be that much of a problematic change, would it?
Lefteous wrote:What I imagine is a dialog where users can add filetypes...
That may be an additional good solution of course.

Post by *Lefteous » 2016-01-20, 08:28 UTC

2milo1012
milo1012 wrote:I have e.g. certain dirs with office/business documents I want to search, so I want and need a "per-search setting", simple as that.
I really don't get your point. Do you mean that you want to set up your fulltext search plugins every time you want to search office documents? Why do you want to do that?
milo1012 wrote:What's the point in offering plugins for fulltext search if you don't use them how they were intended?
We can have the most powerful search/text filters behind it, but we don't offer them to the user in a directly visible way? It's quite obscure.
I'm really tempted to start a poll about this.
I really don't understand why you keep asking me this, as I suggested all this many years ago. You didn't click on these links, did you?
milo1012 wrote:My whole point was that the uncomfortable "go to Plugins tab and enter text" method is really not how I want to use plug-ins with (maybe) better search capabilities.
Additionally offering them in the main search dialog tab wouldn't be that much of a problematic change, would it?
Well, I think just having the same solution as on the plugins tab is still too cumbersome. That's why I suggested a much easier way to use them.

So here is another mockup that should illustrate what I mean.
http://fs5.directupload.net/images/160120/tqggdt99.png

So it works like this:
1. Check 'Support for additional filetypes' (by default this would only turn on the internal office search; later it would also control the plugin-extended text search).
2. When the user clicks the 'Configure' button, he can add filetype<-->content plugin fulltext search field associations.
3. Once configured, they just work.

What could be easier?

Post by *milo1012 » 2016-01-20, 15:03 UTC

Lefteous wrote:I really don't understand why you keep asking me this, as I suggested all this many years ago. You didn't click on these links, did you?
Because you asked me to look at your suggestion, but like I said, configuring "per filetype" is not what I intended, nor is it how the fulltext fields work in my concept,
or to put it simply: it's an overcomplicated solution for a simple problem - that's why I said "directly visible".
Why would I bring in another filter, when I could use a search mask in the dialog anyway?
I don't see why you'd think that the fulltext fields *NEED* a filetype filter - it's just redundant.
The average user *MAY* configure it once, but forgets about it a week later, wondering why certain terms are not found.
Another point: the plugin fields rely on the wdx detection mechanism in the first place.
I could configure some plug-in to use a certain file type, but which doesn't work with it at all, leaving the user possibly confused.
Lefteous wrote:I really don't get your point. Do you mean that you want to set up your fulltext search plugins every time you want to search office documents? Why do you want to do that?
Not setup, but simply choose the plug-in field (from a dropdown list). And TC remembers the last used one.

Lefteous wrote:So here is another mockup that should illustrate what I mean.
...
it works like this
...
What could be easier?
Yes, I understood your suggestion in the first place.
Like I said above: another filter - even though we can use a search mask above, another point of error, another overlay of configuration you have to click through.
I don't see this to be easier than a simple wdx field dropdown list in the place you'd have your "Configure" button.
What if I maybe want to deliberately try another plug-in for a certain filetype? For your suggestion I'd have to alter the filter over and over, where I could quickly choose another one when having a simple list.

And you can still save/load search parameters (presets) that way, by using the mentioned search mask to filter filetypes, and TC saving the chosen fulltext field.
We'd end up with e.g. an "office text document" preset like "*.doc *.odt *.rtf ..." and "Textsearch.text".
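As a rough illustration of the preset idea above, a saved search could be modelled as a search mask plus one chosen fulltext field. This is a sketch only and says nothing about how TC actually stores its search presets; the class SearchPreset and its attribute names are invented for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class SearchPreset:
    """Hypothetical model of a saved search: file mask plus one wdx field."""
    name: str
    mask: str            # space-separated wildcards, as written in the post
    fulltext_field: str  # the single plug-in field chosen in the dropdown

    def matches(self, filename: str) -> bool:
        # The search mask alone decides which files are searched at all.
        lowered = filename.lower()
        return any(fnmatch(lowered, m.lower()) for m in self.mask.split())


office = SearchPreset("office text document", "*.doc *.odt *.rtf",
                      "Textsearch.text")

if __name__ == "__main__":
    print(office.matches("Letter.ODT"), office.fulltext_field)
```

In this model the filetype filtering is done entirely by the search mask, so no separate per-filetype rule list is involved.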

Post by *Lefteous » 2016-01-20, 16:03 UTC

2milo1012
milo1012 wrote:Because you asked me to look at your suggestion, but like I said, configuring "per filetype" is not what I intended, or to put it simply: it's an overcomplicated solution for a simple problem.
So the answer to my question of whether you want to select the content plugins for fulltext search every time is yes - a really really cumbersome solution :-(

Please consider that using a certain plugin field already implies that you know which content plugin fulltext field will search files with a certain filetype. So there is no way around a per filetype configuration. It's always a per filetype configuration.

milo1012 wrote:Why would I bring in another filter, when I could use a search mask in the dialog anyway?
These are completely different things. I might search for all files but use different plugins for different filetypes (as in my example).
milo1012 wrote:I don't see why you'd think that the fulltext fields *NEED* a filetype filter - it's just "redundant".
No, absolutely not redundant, as there can be conflicts between multiple fields capable of handling the same filetype. TC solves this problem in the same way everywhere else in the program.
milo1012 wrote:Another point: the plugin fields rely on the wdx detection mechanism in the first place.
Yes, that's a good point. It would be great if they could help. But detection strings and filetype definitions are completely different.
One idea could be to have an ordered list of fields only, without filetypes, and then let the detection strings do the rest. The downside is that it might be more difficult for the user to see which field is used for which filetype. Detection strings can be quite complex.
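The "ordered list of fields plus detection strings" idea can be sketched like this. It is a strong simplification: real detect strings are expressions, whereas here each plugin's detection is reduced to a plain set of extensions, and the claim that textsearch also accepts PDF is purely hypothetical, chosen to show a conflict. The names DETECT, ORDERED_FIELDS and pick_field are made up.

```python
import os

# Hypothetical stand-ins for plugin detect strings: plain extension sets.
DETECT = {
    "textsearch": {"docx", "odt", "ods", "pdf"},  # pdf only to show a conflict
    "xpdfsearch": {"pdf"},
}

# User-ordered list of fulltext fields; the first field whose plugin
# accepts the file is used.
ORDERED_FIELDS = ["xpdfsearch.text", "textsearch.text"]


def pick_field(filename, ordered_fields=ORDERED_FIELDS, detect=DETECT):
    """Return the first field whose plugin detection accepts the file."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    for field in ordered_fields:
        plugin = field.split(".", 1)[0]
        if ext in detect.get(plugin, set()):
            return field
    return None


if __name__ == "__main__":
    print(pick_field("a.pdf"), pick_field("a.odt"), pick_field("a.txt"))
```

With two plugins claiming the same type, only the order of the list decides which one runs, which is the conflict case discussed in this post.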
milo1012 wrote:I could configure some plug-in to use a certain file type, but which doesn't work with it at all, leaving the user possibly confused.
Yes, that's exactly the same problem as I mentioned above. It's always there (applies to your suggestion as well).
milo1012 wrote:The average user *MAY* configure it once, but forgets about it a week later, wondering why certain terms are not found.
I agree that users will forget about the configuration. So the solution would be to support the user in not entering nonsense. It's something that has to be solved in any case (applies to your suggestion as well).
milo1012 wrote:Not setup, but simply choose the plug-in field (from a dropdown list). And TC remembers the last used one.
Just one - are you kidding?
milo1012 wrote:What if I maybe want to deliberately try another plug-in for a certain filetype? For your suggestion I'd have to alter the filter over and over, where I could quickly choose another one when having a simple list.
For this (quite rare) use case you could add the field there in top position and you are done.

Post by *milo1012 » 2016-01-20, 16:37 UTC

Lefteous wrote:So the answer to my question of whether you want to select the content plugins for fulltext search every time is yes...
Not necessarily, as TC should remember the last used one of course, plus I mentioned the usual search presets in my last post,
for which you can basically define the SAME file type based settings as with your additional filtering.
Lefteous wrote:a really really cumbersome solution
What's cumbersome is in the eye of the beholder.
You have your line of workflow, I have mine, another user may choose either one, you know?
Lefteous wrote:Please consider that using a certain plugin field already implies that you know which content plugin fulltext field will search files with a certain filetype. So there is no way around a per filetype configuration. It's always a per filetype configuration.
What kind of argument is that?
I already told you about the wdx detection mechanism.
You are the one that wants an ADDITIONAL filter on top of that, not me.
We are going in circles here.
Lefteous wrote:These are completely different things. I might search for all files but use different plugins for different filetypes (as in my example).
And these are completely different solutions we are talking about. You want multiple filters for multiple types, I was considering a simple static filter.
I already said that your solution might be good on top of moving the fulltext fields to the main tab.
But as we can already see in the "Sync dirs" function: it is cumbersome to alter such a filter set.
I often try multiple plug-in solutions, where I quickly need different rules.
I'd vote for your solution only if one could save and load presets within this dialog, for quickly changing a set of rules.
Lefteous wrote:Just one - are you kidding?
As I explained above: I was considering a simple static filter.
So this implies my other points: using a search mask and presets, which you can start via internal commands, button bar, etc.


We don't know if Christian considers even one suggestion.
So I won't continue arguing about hypothetical scenarios and getting offtopic in this thread.
But to be honest: it bothers me that you are reasoning with a complex solution, where a simple one is completely enough for many purposes.

Post by *Lefteous » 2016-01-20, 21:44 UTC

2milo1012
milo1012 wrote:Not necessarily, as TC should remember the last used one of course, plus I mentioned the usual search presets in my last post,
for which you can basically define the SAME file type based settings as with your additional filtering.
How do you solve the conflicting fields problem? Are all defined fields used, and only the detect string skips defined plugins?
Defining the fields right in the dialog overloads the dialog even more - without any additional benefit.
milo1012 wrote:What's cumbersome is in the eye of the beholder.
You have your line of workflow, I have mine, another user may choose either one, you know?
The number of clicks can be counted. In the end there will be just one solution implemented in the program.
milo1012 wrote:I already told you about the wdx detection mechanism.
Yes, and I explained that it doesn't help to decide which plugin field is used in case of conflict, as in my example: if I want to use Textsearch for LibreOffice documents and xpdfsearch for PDF, how does the detect string help here?
milo1012 wrote:it bothers me that you are reasoning with a complex solution, where a simple one is completely enough for many purposes.
I'm always searching for easier solutions compared to my own, but what you suggest is just overloading the dialog.

Post by *milo1012 » 2016-01-20, 23:42 UTC

This will be my last post in this matter.
To clarify my thought: http://fs5.directupload.net/images/160121/fiz4xxf5.png
Lefteous wrote:How do you solve the conflicting fields problem? Are all defined fields used and only the detect string skips defined plugins?
Defining the fields right in the dialog overloads the dialog even more - without any additional benefit.
I was talking about the search parameters ("presets") in the search dialog's "Load/Save" tab, which you can start via internal commands, button bar, etc.
Is it that hard to understand?
Just preselect your mask ("*.doc *.otf ...") and content search with your desired plug-in field.
Lefteous wrote:The number of clicks can be counted.
And this number depends on your task at hand.
I want to switch plug-ins depending on my search location and parameters, and for that your dialog would be a hindrance.
Lefteous wrote:In the end there will be just one solution implemented in the program.
Obviously. But it seems you have already made up your mind against any solution other than your own.
Lefteous wrote:I'm always searching for easier solutions compared to my own but what you suggest is just overloading the dialog.
A dropdown list is overloading the dialog?

Post by *Lefteous » 2016-01-21, 08:10 UTC

2milo1012
milo1012 wrote:I was talking about the search parameters ("presets") in the search dialog's "Load/Save" tab, which you can start via internal commands, button bar, etc.
Is it that hard to understand?
Just preselect your mask ("*.doc *.otf ...") and content search with your desired plug-in field.
Well, if the user can really add just a single field for text search, then you won't have a hierarchy problem.
milo1012 wrote:And this number depends on your task at hand.
I can tell you what I think is the task most users will perform: searching for text in files of as many file types as possible. If you have just one field, the task cannot be performed. It's as simple as that. What you suggest is to have a fraction of the current functionality on the first page.
So how many clicks would my solution need when performing a search? The answer is zero if the checkbox is already checked, and one if you need to click it.
I mean - have you used a desktop search or a search engine like Google? All steps in this direction help in my opinion, and that includes not having to think about internal stuff when starting a search.
milo1012 wrote:I want to switch plug-ins depending on my search location and parameters, and for that your dialog would be a hindrance.
I'm still asking myself what the downside is of having all fields considered in the search...
milo1012 wrote:Obviously. But it seems you have already made up your mind against any solution other than your own.
I would consider my solution a baseline, and there is always room for optimization, but your solution feels a bit halfhearted compared to the current implementation.
milo1012 wrote:A dropdown list is overloading the dialog?
I have to admit it's not as bad as I imagined. As you just list the fields in question, the user doesn't have to add the fields.
But the dialog is already a monster, so there is not much room for complicated stuff like content plugin fields. Having them right on the dialog is not required and adds complexity, but of course it's really unusable to have just a single field, as mentioned above.


Just one more hint: as I have developed quite a few content plugins, including some that feature text search, I understand that there are other needs for testing purposes, where I use a different configuration. Besides that, I don't see a need for two configurations.

ghisler(Author)
Site Admin
Posts: 38170
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) » 2016-01-21, 11:06 UTC

Does this mean the WDX API version was incremented?
Yes, to 2.11.

ft_fulltextw expects UTF-16 Unicode just as ft_stringw.
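What "UTF-16 Unicode" means for a buffer handed back by a plugin can be illustrated with a small sketch. Real content plugins are native DLLs (C, Delphi, etc.); Python is used here only to show the byte layout. Delivering the text in several fixed-size, NUL-terminated pieces is an assumption of this sketch, as are the helper name utf16_chunks and the buffer sizes; the post itself only states that UTF-16 is expected.

```python
def utf16_chunks(text: str, max_bytes: int):
    """Yield NUL-terminated UTF-16LE buffers of at most `max_bytes` bytes.

    Chunked delivery and the buffer size are assumptions of this sketch.
    """
    if max_bytes < 6:
        raise ValueError("need room for a surrogate pair plus terminator")
    max_units = max_bytes // 2 - 1  # 16-bit code units of payload per buffer
    raw = text.encode("utf-16-le")
    pos = 0
    while pos < len(raw):
        end = min(pos + max_units * 2, len(raw))
        if end < len(raw):
            last = int.from_bytes(raw[end - 2:end], "little")
            # Never cut between a high surrogate and its low surrogate.
            if 0xD800 <= last <= 0xDBFF:
                end -= 2
        yield raw[pos:end] + b"\x00\x00"
        pos = end


if __name__ == "__main__":
    for chunk in utf16_chunks("Привет", 8):
        print(chunk)
```

Cyrillic text such as the sample above takes two bytes per character in UTF-16, so it survives intact, unlike in an ANSI code page that lacks those characters.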
Author of Total Commander
http://www.ghisler.com

Ovg
Power Member
Posts: 608
Joined: 2014-01-06, 16:26 UTC
Location: MOW

Post by *Ovg » 2016-06-10, 16:44 UTC

2ghisler(Author)

Does ft_fulltextw exist in TC 9 β1? Searching with the xpdfsearch plugin for Cyrillic text in PDF files still doesn't work in TC 9 β1 x32/x64, or what am I missing?
It's impossible to lead us astray for we don't care even to choose the way.
#259941, TC 9.22a x64, Windows 7 SP1 x64

Post by *Lefteous » 2016-06-10, 16:58 UTC

2Ovg
Yes it's there but plugins must be updated. Stay tuned...

Post by *Ovg » 2016-06-10, 17:03 UTC

2Lefteous

Thank you for the reply! I will stay tuned!

Post by *milo1012 » 2016-06-10, 17:30 UTC

Lefteous wrote:Yes it's there but plugins must be updated.
Did you already test it to be working?
I'm still waiting for the new Content-Plugin Guide, though it's probably obvious how ft_fulltextw should be implemented (just switching to UTF-16 strings).