.gitignore-aware search

Posted: 2018-07-17, 09:35 UTC
by Yegor
Hi! Long-time Total Commander fan here (:

Directories with source files for programming projects often contain a lot of "useless" stuff that can slow down a search enormously, such as unoptimized binaries, libraries or packages, and temporary files; 100 kB of payload per 100 MB of garbage is not uncommon. It's especially bad when the search has to go through file contents. Ideally, those files wouldn't be there, but in reality you often have no choice but to keep them. I know I can skip directories during a search by using the status bar of the Find Files box, but that's strictly manual, so I can't just leave the search running and check the results later. Searching with an IDE isn't always an option either.

So why not add a feature to skip files based on readily available patterns from a VCS, such as the .gitignore file in Git? It doesn't have to be aware of the VCS's internal objects and metadata.

As I understand it, this may not be trivial to do, but it's gonna be a great time-saver for us programming folks.

Thanks for your consideration!

Posted: 2018-07-17, 14:07 UTC
by Dalai
You can already ignore directories in the search. Example:

Code:

*.* | .git\
This searches for everything but skips .git directories. You can find more on that in the TC help file (press F1 while the search window is open).

Regards
Dalai

Posted: 2018-07-17, 18:17 UTC
by MVV
Dalai,
I believe he wants to ignore the files that are listed in Git ignore files. Such files may be anywhere within the repository, so it is currently not so easy to ignore them.

The only way to solve such task that I see is to find some GIT plugin for TC that will provide a field like IsIgnored... But I don't know if such a plugin exists.

Posted: 2018-07-17, 20:27 UTC
by Hacker
No idea what GIT ignore files look like, but perhaps they could be easily converted into TC's ignore list?

Roman

Posted: 2018-07-17, 20:52 UTC
by MVV
Hacker,
These files contain recursive patterns or patterns with relative paths, one per line, e.g.:

Code:

Thumbs.db
*.o
*.so
[Dd]ebug/
[Rr]elease/
Such a file may be in the repository root (the main one) and in any folder within the repository (where it extends the main one).
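To give a rough idea of what matching against such patterns involves, here is a minimal Python sketch. It is not how Git or any TC plugin does it; the function names `parse_patterns` and `is_ignored` are made up for illustration, and the sketch deliberately covers only the simple pattern kinds shown above (plain names, `*` wildcards, `[..]` character classes, and a trailing `/` for directories). Real .gitignore files also allow negation with `!`, `**`, and patterns anchored with `/`, and a .gitignore in a subfolder applies relative to that folder; none of that is handled here.

```python
import fnmatch
from pathlib import PurePosixPath


def parse_patterns(text):
    """Return the non-empty, non-comment lines of a .gitignore-style text."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path, patterns, is_dir=False):
    """Check a repository-relative path (with "/" separators) against patterns.

    Assumptions of this sketch: no "!" negation, no "**", and no patterns
    that contain a "/" other than a trailing one.
    """
    parts = PurePosixPath(rel_path).parts
    for pattern in patterns:
        dir_only = pattern.endswith("/")
        name_glob = pattern.rstrip("/")
        for index, part in enumerate(parts):
            is_last = index == len(parts) - 1
            # Every parent component is a directory; the last one only if is_dir.
            part_is_dir = (not is_last) or is_dir
            if dir_only and not part_is_dir:
                continue
            # A match on any component ignores everything below it as well.
            if fnmatch.fnmatchcase(part, name_glob):
                return True
    return False


patterns = parse_patterns("Thumbs.db\n*.o\n*.so\n[Dd]ebug/\n[Rr]elease/\n")
print(is_ignored("src/main.o", patterns))     # matched by *.o
print(is_ignored("Debug/app.exe", patterns))  # parent folder matches [Dd]ebug/
print(is_ignored("src/main.c", patterns))     # no pattern applies
```

Matching each path component separately is what makes a pattern without a slash apply at any depth, which is why a plain list of excluded names only goes so far.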

Posted: 2018-07-18, 09:23 UTC
by Yegor
MVV wrote: The only way to solve such task that I see is to find some GIT plugin for TC that will provide a field like IsIgnored... But I don't know if such a plugin exists.
But it is possible to make one, then? That's one problem less, which is great. I was worried TC's API didn't allow messing with the search, but I completely forgot about customizable fields. So as long as this field can be provided in an efficient way (gotta check the API to find out), it should be the natural solution. Thank you!

Posted: 2018-07-18, 10:29 UTC
by MVV
Please check this plugin; it provides a FileStatus field that has the value "ignored" for ignored files. You can add the required rule on the Plugins page of the Search dialog and then save a search template for quick access, e.g. via a button bar button. However, it is 32-bit only (though it is open source, so it should be possible to port it), which is a problem if you use 64-bit TC (I mostly use the 32-bit one because I have no real need for the 64-bit one).
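Independently of that plugin (this is not a description of how it works internally), Git itself can be asked whether a path is ignored, via the `git check-ignore` command. Below is a minimal Python sketch of such a check. The function name `is_ignored_by_git` and the injectable `run` parameter are assumptions made for illustration and testing; only the `git check-ignore -q` call and its exit codes come from Git.

```python
import subprocess


def is_ignored_by_git(path, repo_dir, run=subprocess.run):
    """Ask Git whether `path` is ignored in the repository at `repo_dir`.

    `git check-ignore -q` prints nothing and exits with 0 if the path is
    ignored, 1 if it is not, and 128 on a fatal error.
    """
    result = run(["git", "check-ignore", "-q", path], cwd=repo_dir)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(f"git check-ignore failed with code {result.returncode}")
```

Starting one process per file would likely be slow on a large tree; `git check-ignore` also has a `--stdin` mode that accepts many paths at once, which would probably be the better fit for a real field provider.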

Posted: 2018-07-18, 12:56 UTC
by Yegor
Thanks, it works, but I was a little too optimistic about the performance. The problem is that TC still probes everything, and that takes a lot of time. Some sort of "DirectoryStatus" (as opposed to the FileStatus field) could help in theory, but it turns out that there's a generic TC performance issue.

For example, even in a simpler case without this plug-in, if I want to ignore all the "foo" folders in my project, I can use the built-in filter "tc: path !contains foo". That obviously excludes all the entries with "foo" in their path. If it were smart, TC would skip those directories entirely and immediately, but it doesn't seem to do that. Instead, it still attempts to check everything it sees under the "foo" directories.

Here's a video: https://www.youtube.com/watch?v=bm0Skn3uRWo The search context is "D:/projects/tc_test" with a single "node_modules" folder in it. I choose to ignore the paths containing "node_modules" on the plugins tab. And yet, you can see that TC is scanning inside that folder. It doesn't have to do that. I'd even go as far as to say it shouldn't.
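To illustrate the difference between filtering results and not entering a folder at all, here is a minimal Python sketch. It has nothing to do with TC's actual implementation; `walk_pruned` and its parameters are hypothetical names. It relies on the documented behavior of `os.walk` in top-down mode, where removing a name from the `dirnames` list stops the walk from descending into that folder.

```python
import os


def walk_pruned(root, skip_dir_names):
    """Yield file paths under `root`, never descending into skipped folders."""
    skip = set(skip_dir_names)
    for current, dirnames, filenames in os.walk(root, topdown=True):
        # Editing dirnames in place tells os.walk not to enter those folders,
        # so their contents are never enumerated at all.
        dirnames[:] = [name for name in dirnames if name not in skip]
        for name in filenames:
            yield os.path.join(current, name)


# Hypothetical usage with the folder from the video:
# for path in walk_pruned("D:/projects/tc_test", ["node_modules"]):
#     print(path)
```

With pruning, the cost of the search no longer depends on how much is stored under the skipped folders, whereas a filter applied to each enumerated file still pays for listing all of them.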

Don't you guys think it's a bug (or at least an actual performance issue)?

Posted: 2018-07-18, 15:33 UTC
by MVV
I agree that it is a problem that TC can't exclude entire folders from enumeration. Unfortunately, it is by design: TC doesn't pass folders to plugins during a search.