Searching archives recursively


ado
Senior Member
Posts: 445
Joined: 2003-02-18, 13:22 UTC
Location: Slovakia, Pezinok

Post by *ado »

Balderstrom wrote: GZIP and BZIP are single file compressors, and are next to useless without TAR.
Let me give you an example:
I have a process that generates a log file. This log file is rolled over every 12 or 24 hours and is quite big (let's say ~1 GB). I want to keep these logs for, let's say, one week, and then I can delete them.
So I keep the current log and the previous one as generated, each taking up to 1 GB, and I keep the rest of them, up to those 7 days, gzipped. Each time I roll over the log, I gzip the previous one, rename the current one to previous, and delete the oldest gz file... I simply have one version of the log file per day, gzipped or not.

ado
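
[Editor's sketch] The rollover scheme described above could look roughly like this. It is written in Python for brevity, not in the Java ado mentions later, and the file names (`app.log`, `app.log.prev`), the `stamp` label and the `keep_gz` count are assumptions made up for illustration, not details from the post.

```python
import gzip
import shutil
from pathlib import Path


def rotate_logs(log_dir, stamp, keep_gz=5):
    """Gzip the previous log, make the current log the previous one,
    then delete the oldest .gz files beyond keep_gz.

    stamp is a sortable label for this rollover, e.g. an ISO date.
    """
    log_dir = Path(log_dir)
    current = log_dir / "app.log"        # hypothetical file names
    previous = log_dir / "app.log.prev"

    if previous.exists():
        target = log_dir / f"app.{stamp}.log.gz"
        # Stream the copy so a ~1 GB log never has to fit in memory.
        with open(previous, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        previous.unlink()

    if current.exists():
        current.rename(previous)

    # ISO-style stamps sort chronologically, so the oldest come first.
    archives = sorted(log_dir.glob("app.*.log.gz"))
    for old in archives[: max(0, len(archives) - keep_gz)]:
        old.unlink()
```

Each .gz file here holds exactly one log, which is the single-file use of gzip the post is arguing for.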
ado
Senior Member
Posts: 445
Joined: 2003-02-18, 13:22 UTC
Location: Slovakia, Pezinok

Post by *ado »

DrShark wrote:
Balderstrom wrote: I don't believe it's possible without unpacking.
Of course it requires unpacking. And makes a search much longer in time. Look at antivirus scanners, they all scan archive formats in that way...
True, but that is not the point. The point is: if I need to do it, I have to do it manually. When you have a simple compressed file and you are looking for a file with some name, the search is easy, because TC can search just the file structure inside that compressed file; it does not have to decompress the file.
However, if you are looking for a file that contains some text, TC has to "unzip" the whole compressed file, file by file, and search each one for that string. And this is existing functionality. So all I want TC to do is: when it traverses files inside a compressed file and hits another compressed file, just "unzip" it and keep going (thinking about the implementation, it is most probably not as easy as it sounds, right Christian? ;-) ).
If TC cannot do it and I need that functionality, I have to "simulate" it manually, and I am glad that TC helps me with that: manually I can dive into at least 3-4 levels of zip in zip in zip... and even if I modify a file at the last level, TC will correctly re-zip the files when I go back (Ctrl-PgDn/PgUp are the magic keys to dive even into self-extracting files).

ado
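
[Editor's sketch] The "unzip it and keep going" idea from the post above, as a minimal Python illustration. It handles only ZIP, works entirely in memory, and says nothing about how TC actually implements its search; `search_text` and its parameters are hypothetical names.

```python
import io
import zipfile


def search_text(zip_bytes, needle, prefix=""):
    """Yield the paths of members whose contents contain needle (bytes),
    descending into ZIP archives nested inside the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # A content search has to unpack every member anyway.
            data = zf.read(info)
            path = prefix + info.filename
            if zipfile.is_zipfile(io.BytesIO(data)):
                # The member is itself an archive: unpack it and keep going.
                yield from search_text(data, needle, path + "/")
            elif needle in data:
                yield path
```

Since a text search already unpacks every member, the extra work for a nested archive is one more level of the same loop. Holding whole members in memory is a simplification that would not suit very large archives.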
katzco
Junior Member
Posts: 6
Joined: 2009-06-30, 13:33 UTC

Post by *katzco »

ado wrote:
Balderstrom wrote: GZIP and BZIP are single file compressors, and are next to useless without TAR.
Let me give you an example:
I have a process that generates a log file. This log file is rolled over every 12 or 24 hours and is quite big (let's say ~1 GB). I want to keep these logs for, let's say, one week, and then I can delete them.
So I keep the current log and the previous one as generated, each taking up to 1 GB, and I keep the rest of them, up to those 7 days, gzipped. Each time I roll over the log, I gzip the previous one, rename the current one to previous, and delete the oldest gz file... I simply have one version of the log file per day, gzipped or not.

ado
This is actually exactly the case I was dealing with when I first started this thread.
Balderstrom wrote:
katzco wrote:
All I say is that if file navigation through archives looks and feels like virtual directories, and TC can drill in and out of them seamlessly,
It's actually not seamless, TC is able to open the first tier of archives (zip/rar/cab/etc) and display the file lists as if it was a directory. Any further archives within an archive are actually unpacked to the Temp folder to be able to view the contents.
I meant seamless from the user's point of view. I guess TC has many other difficult-to-implement features as well :)
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

ado wrote:
Balderstrom wrote: GZIP and BZIP are single file compressors, and are next to useless without TAR.
Let me give you an example:
I have a process that generates a log file. This log file is rolled over every 12 or 24 hours and is quite big (let's say ~1 GB). I want to keep these logs for, let's say, one week, and then I can delete them.
So I keep the current log and the previous one as generated, each taking up to 1 GB, and I keep the rest of them, up to those 7 days, gzipped. Each time I roll over the log, I gzip the previous one, rename the current one to previous, and delete the oldest gz file... I simply have one version of the log file per day, gzipped or not.

ado
Sure, but those individual .gz logs (btw, bzip2 is better for text, up to 100%+ better) could all be kept in a Rar, for example - there is so much similarity from one file to the next that a Rar of all of them would not take up much more space than a .gz of a single log.

Rar would also allow you to, IIRC, set how many Versions of backups you want to keep in the rar.

But if you are just telling me that bz/gz sometimes do have uses for a single file, then yes... I agree - that's why I said (mostly|next to) useless without Tar instead of '"completely" useless without Tar' :-)


In general I agree with this thread though: searching within what TC considers archives is suboptimal at best.

For example, say I want to find "msmq.cpl" in a bunch of MS Hotfixes. I have already found it, and I know for a fact that it is in this one: Windows2000-KB891861-v2-x86-ENU.EXE

If I do a search for msmq.cpl and tick "[x] search archives", TC does not find the file.
If I do a search for Text: msmq.cpl, TC does find the "file".

The problem with the latter is that it will also find matches of msmq.cpl in files inside the archives, such as in the .exe/.dll files etc. inside the hotfixes.

This is most readily apparent if you search for User32.dll or the like: it is actually in only one or two of them, but that "string" is in almost all of the .exe and .dll files inside the Hotfixes. So TC reports that almost all the Hotfixes contain the User32.dll "text", and if I search for the file User32.dll, TC reports that NONE of the Hotfixes contain it.
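
[Editor's sketch] A small Python illustration of why a text search is a poor substitute for a file-name search. It uses an in-memory ZIP as a stand-in for the hotfix packages (whose real format is not stated in the thread), and `name_hits` / `text_hits` are hypothetical helper names.

```python
import io
import zipfile


def name_hits(zip_bytes, filename):
    """Members whose base name equals filename, ignoring case."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [n for n in zf.namelist()
                if n.rsplit("/", 1)[-1].lower() == filename.lower()]


def text_hits(zip_bytes, needle):
    """Members whose unpacked contents contain needle as raw bytes."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [n for n in zf.namelist() if needle in zf.read(n)]
```

An executable that merely imports User32.dll carries that string in its contents, so `text_hits` reports it even though no member has that name. Only `name_hits` answers the question "which package ships this file?".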
ado
Senior Member
Posts: 445
Joined: 2003-02-18, 13:22 UTC
Location: Slovakia, Pezinok

Post by *ado »

Balderstrom wrote: Sure, but those individual .gz logs (btw, bzip2 is better for text, up to 100%+ better) could all be kept in a Rar, for example - there is so much similarity from one file to the next that a Rar of all of them would not take up much more space than a .gz of a single log.
Thx, I'll try to find a Java bz2 library; I doubt there is such a library for rar. I cannot use any native libraries: we develop on Windows PCs, but the applications run on Solaris/Linux. Java has a built-in API for creating gzip/zip files, so implementing these two is a matter of a few lines of code.
Balderstrom wrote: Rar would also allow you to, IIRC, set how many Versions of backups you want to keep in the rar.
Wow, I didn't know that. I thought that only the old UC knew that trick.
Balderstrom wrote: The problem with the latter is it will also find matches of msmq.cpl in files inside the archives, such as in the .exe/.dlls etc inside the hotfixes.
Yeah, those exe/dll files could be tricky. It depends on how TC knows that a file is an archive. If it is based on the extension, you are out of luck. If TC also analyzes the contents, you can at least have hope ;-)

ado
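
[Editor's sketch] "Analyzing the contents" usually means checking the leading magic bytes of a file instead of trusting its extension. A minimal Python illustration follows; it covers only a handful of common signatures, makes no claim about how TC detects archives, and `sniff_archive` is a hypothetical name.

```python
# Leading "magic" bytes of a few common archive/compression formats.
SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"Rar!\x1a\x07": "rar",
    b"MSCF": "cab",
}


def sniff_archive(data):
    """Return the archive type suggested by the first bytes of data,
    or None if no known signature matches."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None
```

A self-extracting .exe starts with the executable header ("MZ"), so a prefix check like this returns None for it. Finding the embedded archive means scanning further into the file, which is slower and can produce false matches.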
katzco
Junior Member
Posts: 6
Joined: 2009-06-30, 13:33 UTC

Post by *katzco »

I am curious what the developer thinks about the issue. In any case, the issue here is not how to archive the files in the most efficient way or how to avoid recursive archives. My point is that recursive archives are sometimes a given (let's say somebody else is to blame, but the recursive archive is the file in front of me). IMHO TC should extend the archive = "virtual directory" experience to search, including recursive search.
ado
Senior Member
Posts: 445
Joined: 2003-02-18, 13:22 UTC
Location: Slovakia, Pezinok

Post by *ado »

right, and not only in search, but also in "Synchronize dirs"

ado
ghisler(Author)
Site Admin
Posts: 48005
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Currently it's not implemented because it would be very slow. TC would have to unpack all the nested archives even when searching only for file names. Writing the code isn't a problem, but it would certainly upset many people if searching would become much slower.
Author of Total Commander
https://www.ghisler.com
Clo
Moderator
Posts: 5731
Joined: 2003-12-02, 19:01 UTC
Location: Bordeaux, France

Option & Info ?

Post by *Clo »

2ghisler(Author)

:) Good evening,
…Writing the code isn't a problem, but it would certainly upset many people if searching would become much slower.
• Maybe in a future version as an option (OFF by default), like:
[ ] Search in nested archives (Very slow !).
:?:

:mrgreen: VG
Claude
Clo
katzco
Junior Member
Posts: 6
Joined: 2009-06-30, 13:33 UTC

Post by *katzco »

ghisler(Author) wrote:Currently it's not implemented because it would be very slow. TC would have to unpack all the nested archives even when searching only for file names. Writing the code isn't a problem, but it would certainly upset many people if searching would become much slower.
Thanks for the response. I assumed recursive search would be slower than regular search, but in the cases that are supported right now (one-level archives or no archives at all) there would be no degradation from the current behavior, so users should not be upset... and in any case, the deep search could be enabled/disabled by the user.
In any case, thanks for considering.
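
[Editor's sketch] The opt-in behaviour discussed above, as a rough Python illustration: a file-name search that reads only the file list by default and unpacks nested archives only when a depth limit is given. ZIP only; `find_names`, `max_depth` and `stats` are hypothetical names, and this is not a description of TC's code.

```python
import io
import zipfile


def find_names(zip_bytes, filename, max_depth=0, prefix="", stats=None):
    """Return paths of members named filename (case-insensitive).

    With max_depth=0 only the archive's file list is read. A positive
    max_depth also unpacks nested .zip members, and stats (an optional
    dict) records how many bytes had to be unpacked to do so.
    """
    hits = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            path = prefix + info.filename
            base = info.filename.rsplit("/", 1)[-1].lower()
            if base == filename.lower():
                hits.append(path)
            # Extra work happens only when the option is on and the
            # member looks like an archive.
            if max_depth > 0 and base.endswith(".zip"):
                data = zf.read(info)
                if stats is not None:
                    stats["unpacked_bytes"] = (
                        stats.get("unpacked_bytes", 0) + len(data))
                if zipfile.is_zipfile(io.BytesIO(data)):
                    hits += find_names(data, filename, max_depth - 1,
                                       path + "/", stats)
    return hits
```

With the default `max_depth=0`, or when an archive contains no nested archives, nothing is unpacked, which matches the point that existing searches would not get slower. The unpacking cost the author describes appears only in the nested case, where `stats` makes it visible.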
ado
Senior Member
Posts: 445
Joined: 2003-02-18, 13:22 UTC
Location: Slovakia, Pezinok

Post by *ado »

2ghisler(Author)
I see, and I am glad that you are still obsessed with speed. That's why most people love TC (me too). But sometimes I have the feeling that, in the name of speed, you are sacrificing useful functionality. As I tried to explain earlier, when I need to search/compare nested archives, I have to do it manually. So if you implemented it as a function that can be turned off, you'd kill two birds with one stone. Most people can keep it turned off, and some (who knows, maybe most) will like it, and it will save them tons of time.

ado
jjk
Member
Posts: 181
Joined: 2003-07-03, 10:41 UTC

Post by *jjk »

Support ++
with option clearly displayed, so nobody would be deceived.
wangxi
Junior Member
Posts: 3
Joined: 2010-04-04, 17:26 UTC

Post by *wangxi »

Yeah, this feature is quite neat. I want this kind of search, as most deliverables are normally zip in zip... Please consider adding it.
miguy2k
Junior Member
Posts: 3
Joined: 2006-09-29, 14:12 UTC

Post by *miguy2k »

Support++

Recursive archives are all over the place here in the Java world! :(
nxs
New Member
Posts: 1
Joined: 2015-03-10, 16:22 UTC

Post by *nxs »

TrueZip seems to be able to search and update jars within wars within ears in moments - I created an ant task for it on our project.

But TC is the best thing in town, so it must be able to do the same, only faster!