Max time difference when comparing directories

The behaviour described in the bug report is either by design, or would be far too complex/time-consuming to be changed


MarcinW
Power Member
Posts: 852
Joined: 2012-01-23, 15:58 UTC
Location: Poland

Post by *MarcinW »

When comparing panels with Shift+F2, two files with the same name are recognized as different only if their times differ by at least 4 seconds. For example, "test.bin" with time 15:00:12 (in the left panel) is treated as the same as "test.bin" with time 15:00:13, 15:00:14 or 15:00:15 (in the right panel).
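
A rough model of the behaviour reported above, for illustration only. This is a sketch of what the comparison appears to do from the outside, not TC's actual code; the names `looks_same_to_tc` and `hms`, and the use of whole-second integer timestamps, are assumptions.

```python
def looks_same_to_tc(t1: int, t2: int) -> bool:
    """Model of the observed behaviour (assumption): whole-second
    timestamps that differ by less than 4 seconds count as the same."""
    return abs(t1 - t2) < 4


def hms(hour: int, minute: int, second: int) -> int:
    """Seconds since midnight, used here only to build example values."""
    return hour * 3600 + minute * 60 + second


left = hms(15, 0, 12)
for sec in (13, 14, 15, 16):
    # Per the report, 13..15 compare as "same"; 16 is the first "different".
    print(sec, looks_same_to_tc(left, hms(15, 0, sec)))
```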

Since FAT filesystems store the time with 2-second accuracy, a non-FAT time value with an odd number of seconds should be rounded up by 1 second before comparison, so that TC can compare files between FAT and non-FAT filesystems. This rounding should be done only when TC has a FAT filesystem in one panel and a non-FAT filesystem in the other.

But no other rounding should be done. The current requirement of a 4-second difference to recognize files as different does not seem to be intentional.

Regards
milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Re: Max time difference when comparing directories

Post by *milo1012 »

MarcinW wrote:When comparing panels with Shift+F2, there must be a time difference of >= 4 seconds to recognize two files with same names as different.
This is known to be "by design" due to some Windows function glitch, see for example
http://www.ghisler.ch/board/viewtopic.php?p=73905#73905
MarcinW wrote:a non-FAT time value, which has odd number of seconds, should be rounded up by 1 second before comparison
No, it's actually the other way around. As you can see with e.g. FileTimeToDosDateTime, those functions squeeze the seconds into 5 bits, so you end up with:

hour << 11 | minute << 5 | second >> 1
So "12" has the same bit value as "13", "14" the same value as "15", and so on.
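
The bit packing above can be tried out directly. The following is a minimal sketch of the 16-bit MS-DOS/FAT time layout (hour in bits 11-15, minute in bits 5-10, seconds divided by two in bits 0-4); the function names `pack_dos_time` and `unpack_dos_time` are made up for this example and are not part of any Windows API.

```python
def pack_dos_time(hour: int, minute: int, second: int) -> int:
    """Pack a time of day into the 16-bit DOS/FAT time word.
    The right shift drops the lowest bit of the seconds."""
    return hour << 11 | minute << 5 | second >> 1


def unpack_dos_time(value: int) -> tuple:
    """Inverse of pack_dos_time; seconds always come back even."""
    hour = value >> 11
    minute = (value >> 5) & 0x3F
    second = (value & 0x1F) * 2
    return hour, minute, second


# 15:00:12 and 15:00:13 produce the same packed value.
print(pack_dos_time(15, 0, 12) == pack_dos_time(15, 0, 13))
print(unpack_dos_time(pack_dos_time(15, 0, 13)))
```

Packing followed by unpacking always yields an even seconds value, which is the "rounding down" effect described here.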



But sure: it would be nice if TC were "more precise" when there are NTFS file systems on both sides.


Edit: corrected second right shift from two to one
Last edited by milo1012 on 2016-07-14, 22:36 UTC, edited 1 time in total.
TC plugins: PCREsearch and RegXtract
MarcinW
Power Member
Posts: 852
Joined: 2012-01-23, 15:58 UTC
Location: Poland

Re: Max time difference when comparing directories

Post by *MarcinW »

milo1012 wrote:No, it's actually the other way around. As you can see with e.g. FileTimeToDosDateTime, those functions squeeze the seconds in 5 bits, so you end up with:

hour << 11 | minute << 5 | second >> 1
So "12" has the same bit value as "13", "14" the same value as "15", and so on.
Well, as I understand it, we both agree that this gives 2-second accuracy for FAT filesystems.

Maybe I wasn't clear enough.

The only case in which identical files (= the original file and its copy in some other place) may have different timestamps is when all of these conditions are met:
1) the original file is on a non-FAT filesystem,
2) its timestamp has an odd number of seconds, for example 15,
3) it has been copied to a FAT filesystem.

When copying this file to the FAT filesystem, the operating system modifies the timestamp by changing 15 seconds to 16 seconds (rounding up by 1), because 15 seconds can't be stored there. This can easily be tested with cmd.exe (which uses only OS functionality, so nothing else can touch the timestamp).

So, to reliably compare the original file (non-FAT filesystem, 15 seconds) with its copy (FAT filesystem, 16 seconds), TC should round the odd value 15 up by 1 second, to 16, before comparison.

milo1012 wrote:But sure: it would be nice if TC would be "more precise" when there are NTFS file systems on both sides.
More precisely, the rounding described above should be performed only when a non-FAT filesystem in one panel is compared with a FAT filesystem in the other. If there are FAT filesystems in both panels, there is no need to round. The same applies to NTFS filesystems in both panels, and to an NTFS filesystem in one panel and CDFS in the other.
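
The rule proposed at this point in the thread could be sketched as follows. This is only an illustration under stated assumptions: timestamps are whole-second integers whose parity matches the seconds field, the caller already knows whether each side is FAT, and the names `round_up_to_even` and `same_time` are hypothetical.

```python
def round_up_to_even(t: int) -> int:
    """Round an odd whole-second timestamp up by one second;
    even values are returned unchanged."""
    return t + (t & 1)


def same_time(t1: int, fat1: bool, t2: int, fat2: bool) -> bool:
    """Compare two timestamps, rounding only the non-FAT side,
    and only when exactly one of the two sides is FAT."""
    if fat1 != fat2:
        if fat1:
            t2 = round_up_to_even(t2)  # side 2 is the non-FAT one
        else:
            t1 = round_up_to_even(t1)  # side 1 is the non-FAT one
    return t1 == t2


# Original on NTFS with 15 seconds, copy on FAT with 16 seconds.
print(same_time(15, False, 16, True))
# Same values on two NTFS volumes: no rounding, so they differ.
print(same_time(15, False, 16, False))
```

Note that this sketch assumes the copy is always rounded up, which is exactly the point questioned in the replies.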


Regards
milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Re: Max time difference when comparing directories

Post by *milo1012 »

MarcinW wrote:Well, as I understand, we both agree, that this gives a 2-second accuracy for FAT filesystems.
I just wanted to show that the internal format that TC probably uses relies on rounding down when using API functions.
You can see this e.g. with the ReactOS implementation:
https://doxygen.reactos.org/d5/dac/dll_2win32_2kernel32_2client_2time_8c.html#af1ec56f183a6cde58d424442270cfe87

And BTW, this is the same time format that TC expects from WCX plug-ins.

MarcinW wrote:When copying this file to the FAT filesystem, operating system changes the timestamp value by changing 15 seconds to 16 seconds (rounding up by 1). It may be easily tested with cmd.exe (which uses only OS functionality, so nothing else can touch the timestamp).
I'm not so sure that this is guaranteed.
I remember some situations where the timestamp was in fact rounded down. It probably depends on the API functions used for copying, or on whether you set the timestamp manually afterwards (SetFileTime).
This is probably what Christian meant by "the seconds are rounded up in one case, and down in the other", and it would explain the four (three) second margin.
MarcinW
Power Member
Posts: 852
Joined: 2012-01-23, 15:58 UTC
Location: Poland

Post by *MarcinW »

milo1012 wrote:internal format that TC probably uses relies on rounding down when using API functions.
OK, I think I finally understand what you meant: since some API functions cut off the last bit of the seconds field (for FAT only), there is a danger that some program, by using its own sequence of API functions, may cause the seconds field to be rounded down - regardless of the OS's own sequence of API functions, which currently rounds the seconds field up. Rounding down is potentially possible, in particular when time subtraction operations are involved.
milo1012 wrote:I remember some situations where the timestamp was in fact rounded down
milo1012 wrote:This is probably what Christian meant with "the seconds are rounded up in one case, and down in the other"
We also can't be sure what operating systems other than Windows may do when copying between FAT and non-FAT filesystems.


Now some observations:

a) Only odd seconds values can be rounded (up or down); even seconds values will never be modified.

b) The worst case is when an original non-FAT file was copied to place FIRST (on a FAT filesystem) with the timestamp rounded up, and also to place SECOND (on a FAT filesystem) with the timestamp rounded down; in this case, TC should recognize the FIRST and SECOND timestamps as the same.

c) The FIRST and SECOND files (or only one of them) can then be moved to a non-FAT filesystem, like NTFS, and TC should still recognize the timestamps as the same; so the timestamp comparison algorithm cannot depend on whether the panels show FAT or non-FAT filesystems, as I initially suggested.

d) If a timestamp has an odd seconds value, it is an original timestamp that has never been modified by copying the file to a FAT filesystem.


So:

1) If a file has an odd seconds value (like 15), it's enough to check whether the other file has a seconds value in the SEC-1..SEC+1 range (like 14..16) - because 15 is an original timestamp, and the timestamp of the other file could have been rounded down to 14 or up to 16.

2) If a file has an even seconds value (like 14), it's enough to check whether the other file has a seconds value in the SEC-2..SEC+2 range (like 12..16) - because A) 14 may be a timestamp 15 rounded down, and for the other file it may be 15 rounded up to 16, but also B) 14 may be a timestamp 13 rounded up, and for the other file it may be 13 rounded down to 12.
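
The two ranges above can be cross-checked against a small brute-force model of observations a) to d). This is a sketch under the assumptions stated in those observations (even values never change, odd values may be stored as is, rounded down or rounded up); timestamps are taken to be whole-second integers, and all function names are invented for this example.

```python
def possible_stored(original: int) -> set:
    """Values an original timestamp may end up with after copying."""
    if original % 2 == 0:
        return {original}
    return {original - 1, original, original + 1}


def could_be_copies(a: int, b: int) -> bool:
    """True if some original timestamp could have produced both a and b."""
    return any(
        a in possible_stored(o) and b in possible_stored(o)
        for o in range(min(a, b) - 1, max(a, b) + 2)
    )


def allowed_range(sec: int) -> range:
    """Range from the text: SEC-1..SEC+1 if odd, SEC-2..SEC+2 if even."""
    width = 1 if sec % 2 else 2
    return range(sec - width, sec + width + 1)


# Worst case from observation b): 15 rounded down to 14 and up to 16.
print(could_be_copies(14, 16), 16 in allowed_range(14))
# 15 is an unmodified original, so 13 cannot be a copy of it.
print(could_be_copies(15, 13), 13 in allowed_range(15))
```

Under this model, the set of values that can belong to the same original coincides with the ranges given in 1) and 2).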



So the optimal algorithm is:

1) is TimeStamp1 equal to TimeStamp2?
YES => timestamps are same / exit

2) is TimeStamp1 odd?
YES => if TimeStamp2 in [TimeStamp1-1..TimeStamp1+1], timestamps are same, else timestamps are not same / exit

3) is TimeStamp2 odd?
YES => if TimeStamp1 in [TimeStamp2-1..TimeStamp2+1], timestamps are same, else timestamps are not same / exit

4) if TimeStamp1 in [TimeStamp2-2..TimeStamp2+2], timestamps are same, else timestamps are not same / exit

This algorithm works properly regardless of the file order - i.e. it gives the same results if we swap TimeStamp1 and TimeStamp2 at the start.
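
A direct transcription of the four steps above, as a minimal sketch. It assumes timestamps are whole-second integers whose parity matches the seconds field (true for seconds counted from an even-aligned epoch); the function name `timestamps_match` is hypothetical.

```python
def timestamps_match(ts1: int, ts2: int) -> bool:
    """Return True if ts1 and ts2 may belong to copies of the same file."""
    # Step 1: identical timestamps always match.
    if ts1 == ts2:
        return True
    # Step 2: an odd ts1 is an original value; ts2 may be off by at most 1.
    if ts1 % 2 == 1:
        return abs(ts2 - ts1) <= 1
    # Step 3: same check with the roles swapped.
    if ts2 % 2 == 1:
        return abs(ts1 - ts2) <= 1
    # Step 4: both even; they may come from one odd original rounded
    # in opposite directions, so allow a difference of up to 2.
    return abs(ts1 - ts2) <= 2


# The symmetry claim can be checked exhaustively over a small range.
symmetric = all(
    timestamps_match(a, b) == timestamps_match(b, a)
    for a in range(0, 120)
    for b in range(0, 120)
)
print(symmetric)
print(timestamps_match(12, 15))  # a 3-second gap with one odd value
```

Compared with a fixed 4-second window, a pair such as 12 and 15 is no longer treated as the same.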



As we can see, this is quite a short algorithm, and it would be nice to have it implemented in TC instead of the hardcoded 4-second difference; it gives fewer false positives.

Regards
MarcinW
Power Member
Posts: 852
Joined: 2012-01-23, 15:58 UTC
Location: Poland

Post by *MarcinW »

@ghisler(Author): Wouldn't now be a good moment to introduce this small improvement?

Regards
ghisler(Author)
Site Admin
Posts: 50550
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Sorry, now is the worst possible moment (release candidate, any new feature would delay the release even further).
Author of Total Commander
https://www.ghisler.com