Sync Directories - compare by file size and content


Alan
Junior Member
Posts: 3
Joined: 2021-09-27, 10:07 UTC

Sync Directories - compare by file size and content

Post by *Alan »

I often use Synchronize Directories with the [by content] and [ignore date] options selected for performing the comparisons. Works well in most cases but not when the file names in one or both of the two directories have been altered. Such alteration could have arisen through deliberate tweaking over a long period and I am now comparing an old backup with the current version of the folder. It could also arise when I am comparing a copy of a camera card (new or old) with files that were renamed by Lightroom or Downloader Pro while importing a copy of the image files onto my computer.

Whatever the cause of the name differences, files can only be an exact match if they have exactly the same size. So if we had an [ignore name] option alongside the existing [ignore date] option in Synchronize Directories, then TC could compare each file on the left with all files of equal size on the right (and vice versa) and then indicate any matches or mismatches.
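
To make the idea concrete, here is a minimal Python sketch of a size-first, name-ignoring comparison. It is only an illustration of the request, not how TC works internally; the function names `files_by_size` and `match_ignoring_names` are hypothetical.

```python
import filecmp
import os
from collections import defaultdict


def files_by_size(root):
    """Map file size -> list of paths found under root (recursive)."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            groups[os.path.getsize(path)].append(path)
    return groups


def match_ignoring_names(left_root, right_root):
    """Return {left_path: [right paths with identical content]}.

    Only files of equal size are compared byte by byte, so files that
    cannot possibly match are never opened.
    """
    right_groups = files_by_size(right_root)
    matches = {}
    for size, left_paths in files_by_size(left_root).items():
        candidates = right_groups.get(size, [])
        for left in left_paths:
            # shallow=False forces a real content comparison.
            matches[left] = [
                right for right in candidates
                if filecmp.cmp(left, right, shallow=False)
            ]
    return matches
```

A left-side file mapped to an empty list has no match on the right; a list with several entries is the "one file matches many" case.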

There could be multiple files of the same size even if none have been renamed, so any file on either side could exactly match the content of zero, one or more files on the other side. It would be very unlikely for many files on each side to match many on the other side. The results list could show which files match, which could require multiple lines for some files. Or, it could be less specific and show a substitute "fake" entry on the other side such as "<<<multiple files>>>". I prefer a multiple-line approach. However, either way will be better than having to use File Compare many times per folder, especially when a camera card can hold hundreds of image files.

A possible alternative would be for TC to calculate md5 checksums for the files and display an additional column for the md5 so that we can sort and compare by the checksums without losing sight of the filenames. If it were to work this way then it would need a [by md5] option that overrides the other selection options. A downside of this alternative is the time taken to calculate md5s for files that cannot possibly match because they have different sizes.
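
A rough Python sketch of the checksum-column alternative follows. To sidestep the downside mentioned above, it only hashes files whose size also occurs on the other side. This is an assumption-laden illustration: `md5_of`, `list_sizes` and `md5_rows` are made-up names, and the returned rows merely stand in for a sortable MD5 column.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks so large images need not fit in RAM."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_sizes(root):
    """Return [(path, size)] for every file under root (recursive)."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            entries.append((path, os.path.getsize(path)))
    return entries


def md5_rows(left_root, right_root):
    """Return rows (md5, side, path) sorted by checksum.

    Files whose size does not occur on the other side get an empty
    checksum, because they cannot match anything there.
    """
    left = list_sizes(left_root)
    right = list_sizes(right_root)
    shared = {size for _, size in left} & {size for _, size in right}
    rows = []
    for side, entries in (("L", left), ("R", right)):
        for path, size in entries:
            checksum = md5_of(path) if size in shared else ""
            rows.append((checksum, side, path))
    rows.sort()
    return rows
```

Sorting by the first field puts identical files next to each other regardless of name, while the path stays visible in the same row.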
petermad
Power Member
Posts: 14739
Joined: 2003-02-05, 20:24 UTC
Location: Denmark

Re: Sync Directories - compare by file size and content

Post by *petermad »

2Alan

There is a discussion about this issue here: https://ghisler.ch/board/viewtopic.php?p=425346#p425346
License #524 (1994)
Danish Total Commander Translator
TC 11.03 32+64bit on Win XP 32bit & Win 7, 8.1 & 10 (22H2) 64bit, 'Everything' 1.5.0.1371a
TC 3.50b4 on Android 6 & 13
Try: TC Extended Menus | TC Languagebar | TC Dark Help | PHSM-Calendar
georgeb
Senior Member
Posts: 250
Joined: 2021-04-30, 13:25 UTC

Re: Sync Directories - compare by file size and content

Post by *georgeb »

Hello @alan,

Thankfully, @petermad has already pointed to our recent discussion of this very problem: "SynchronizeDirs" does not recognize moved or renamed files, which can leave us with duplicates, possibly several per file, that currently go undetected.

There I've made a proposal for an enhanced concept/version of SyncDirs for future implementation which would be capable of detecting and remedying such cases.

The thread is quite long to follow, but if you are interested in concepts for handling such cases, my proposal might be of use for your situation as well. Please feel free to chime in on that discussion. Perhaps you'd like to come up with alternative concepts or new ideas of your own, which might lead to a feasible solution in the future, supported by a broader base of users.
algol
Senior Member
Posts: 448
Joined: 2007-07-31, 14:45 UTC

Re: Sync Directories - compare by file size and content

Post by *algol »

Support ++
mmm
Member
Posts: 120
Joined: 2020-08-10, 12:32 UTC

Re: Sync Directories - compare by file size and content

Post by *mmm »

I requested the same Sync Dirs enhancement here:
viewtopic.php?p=417266&hilit=mmm+ghisler#p417266
georgeb
Senior Member
Posts: 250
Joined: 2021-04-30, 13:25 UTC

Re: Sync Directories - compare by file size and content

Post by *georgeb »

mmm wrote: 2023-01-09, 08:24 UTC I requested the same Sync Dirs enhancement here:
viewtopic.php?p=417266&hilit=mmm+ghisler#p417266
Excellent! I wasn't aware of your request so far. So I would also be interested in your opinion on my recent proposal of how to remedy this problem. As you can see by following the link cited above by @petermad, the core of my proposal is to perform a duplicate-by-content search (ignoring filenames) between the current groups "Unique Left" and "Unique Right", which currently (and wrongly) contain moved/renamed duplicates in (perhaps multiple) different locations. All of that would happen from within "Sync Dirs" in a second, optional pass.

After that - and depending on the duplicates found - those (so far pseudo-)unique groups would be further split into "Truly Unique Left/Right" and only "Locally Unique Left/Right" (with duplicates somewhere else). Those newly introduced groups of "Locally Unique Left/Right" would then be represented by a different color and also made separately selectable by split Left/Right-buttons.
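
The split described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the inputs are the path lists that a first, name-based pass would report as "Unique Left" and "Unique Right", the names `content_key` and `split_unique` are hypothetical, and SHA-256 stands in for whatever content comparison would really be used.

```python
import hashlib
import os


def content_key(path):
    """(size, SHA-256) pair identifying the content, not the name.

    Reads the whole file at once for brevity; a real tool would hash
    in chunks and could skip hashing for sizes that occur only once.
    """
    with open(path, "rb") as handle:
        return os.path.getsize(path), hashlib.sha256(handle.read()).hexdigest()


def split_unique(unique_left, unique_right):
    """Second pass: split name-unique files by whether their content
    also exists among the name-unique files of the other side."""
    left_keys = {path: content_key(path) for path in unique_left}
    right_keys = {path: content_key(path) for path in unique_right}
    left_contents = set(left_keys.values())
    right_contents = set(right_keys.values())
    return {
        "truly_unique_left": [p for p, k in left_keys.items() if k not in right_contents],
        "locally_unique_left": [p for p, k in left_keys.items() if k in right_contents],
        "truly_unique_right": [p for p, k in right_keys.items() if k not in left_contents],
        "locally_unique_right": [p for p, k in right_keys.items() if k in left_contents],
    }
```

The two "locally unique" lists correspond to the groups that would get their own color and selection buttons.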

Finally, when the cursor is on any file in those duplicate groups, a new right-click option would need to be implemented that shows only the identical binary duplicates of the currently selected file.

With that concept any perceived necessity of a 1:1 relationship between the duplicates displayed would be rendered obsolete.

So what do you all think of my solution? Feel free to comment on my linked proposal in the TC-English section.
mmm
Member
Posts: 120
Joined: 2020-08-10, 12:32 UTC

Re: Sync Directories - compare by file size and content

Post by *mmm »

Georgeb,
Much as I appreciate your effort I am afraid I have to disappoint you.

Here are the reasons why:
1. Christian Ghisler already indicated he was not going to enhance the Sync Dirs in this regard.
2. I do not think he will change his mind unless this topic gets higher visibility - and that's not going to happen. The audience here is not as broad as you may think.

TC does not seem to be suitable for the job in question. You need to grab something else.



You wanted my opinion and you got it. Sorry.

Best,
mmm
Last edited by mmm on 2023-01-09, 13:50 UTC, edited 1 time in total.
georgeb
Senior Member
Posts: 250
Joined: 2021-04-30, 13:25 UTC

Re: Sync Directories - compare by file size and content

Post by *georgeb »

mmm wrote: 2023-01-09, 11:59 UTC Georgeb,
Much as I appreciate your effort I am afraid I have to disappoint you.

Here are the reasons why:
1. Christian Ghisler already indicated he was not going to enhance the Sync Dirs in this regard.
2. I do not think he will change his mind unless this topic gets higher visibility - and that's not going to happen. The audience here is not as broad as you may think.

TC does not seem to be suitable for the job in question. You need to grab something else.
No offense taken. But as far as I'm aware, Mr. Ghisler's reservations about the topic mainly concern two points: an algorithm or program feature cannot decide which duplicate is intentional and which one is garbage, and the 1:1 relationship between the duplicates displayed in "Sync Dirs" would be lost.

My proposed solution would take care of both problems. First, the decision-making can only be done interactively by the user, and "Sync Dirs" offers an excellent overview for that, together with all the flexibility needed to proceed with each file individually or with groups of files together. Second, the 1:1 relationship for displaying the duplicates would no longer be necessary: you would instead see all the identical dupes, each listed below its parent folder, together on a single, separate screen as they are distributed among the Left vs. Right data structure, with full capability of selecting them for further processing as the user sees fit.
HalbschuhTouri
Junior Member
Posts: 61
Joined: 2023-01-20, 09:33 UTC

Re: Sync Directories - compare by file size and content

Post by *HalbschuhTouri »

Support +++