Synchronize directories with uncompressed/compressed files

Posted: 2021-10-04, 06:25 UTC
by juno12
We produce lots of data, most of which is archived in compressed form as .gz (our software accepts the compressed files as input as well as the uncompressed ones).
So we have directories with uncompressed files like B0_210821_0901.dat and other directories with
compressed files like B0_210821_0901.dat.gz, each containing just one file (B0_210821_0901.dat).
For us it would be very helpful to synchronize directories independent of the compression status of the files.

Re: Synchronize directories with uncompressed/compressed files

Posted: 2021-10-04, 11:36 UTC
by Sergey6257
As far as I can see, it doesn't have any restrictions with *.zip archives. Does it still have them with *.gz? Maybe you need an archiver?

Re: Synchronize directories with uncompressed/compressed files

Posted: 2021-10-04, 12:19 UTC
by juno12
Thanks for the hint, but as far as I can see, with .zip it only works when several files are zipped into one zip file.
But our files are quite big, therefore we pack just one file into each gz file to reduce the hard disk space needed.

Re: Synchronize directories with uncompressed/compressed files

Posted: 2021-10-04, 14:20 UTC
by Sergey6257
juno12 wrote: 2021-10-04, 12:19 UTC
Yes, in this case you need to open the archive like a folder. But a single file inside an archive and the unarchived file are not the same file.
Most likely this is a violation of the concept.

Can you describe why you want to silently unpack the files before comparing?

Re: Synchronize directories with uncompressed/compressed files

Posted: 2021-10-05, 05:35 UTC
by juno12
During a measuring campaign, copies of the data are created in several places (by several people). It can happen that different versions of a file exist (same name, but different size and time), and the biggest/newest version needs to be archived.
At the end of the campaign the data is transferred to one place and compressed for archiving.
Once the first dataset (each single file compressed with .gz) is in the archive location, we need to synchronize this compressed dataset with the directories containing the other (uncompressed) copies of the data, to check that all files are archived in the correct version.
Compressing each file individually with gz is needed for the analysis software, which accepts both data compressed this way and uncompressed data as input.

Hopefully this clarifies the needs.
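
To make the requested comparison concrete, here is a rough sketch in Python of what "synchronize independent of compression status" could mean outside the file manager. It is only an illustration under assumptions: the function names `digest` and `compare_dirs` are made up for this example, every archive is assumed to be named like the original file plus `.gz`, and the check compares uncompressed size and a SHA-256 checksum rather than timestamps (the modification time of a .gz file usually differs from that of the original). It only reports differences and does not copy anything.

```python
import gzip
import hashlib
from pathlib import Path


def digest(path, compressed=False, chunk=1 << 20):
    """Return (size, sha256 hex digest) of the uncompressed content."""
    opener = gzip.open if compressed else open
    h = hashlib.sha256()
    size = 0
    # Read in chunks so that large data files are never held in memory.
    with opener(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
            size += len(block)
    return size, h.hexdigest()


def compare_dirs(plain_dir, archive_dir, suffix=".gz"):
    """Map each file name in plain_dir to 'same', 'different' or 'missing'."""
    result = {}
    for plain in sorted(Path(plain_dir).iterdir()):
        if not plain.is_file():
            continue
        # Hypothetical naming rule: B0_x.dat is archived as B0_x.dat.gz
        packed = Path(archive_dir) / (plain.name + suffix)
        if not packed.is_file():
            result[plain.name] = "missing"
        elif digest(plain) == digest(packed, compressed=True):
            result[plain.name] = "same"
        else:
            result[plain.name] = "different"
    return result
```

Calling `compare_dirs("copy_of_data", "archive")` (both directory names are placeholders) would return a dictionary such as `{"B0_210821_0901.dat": "same"}`. Files reported as "different" or "missing" are the ones where the archived version would need to be checked by hand.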