Verify after copy

StatusQuo
Power Member
Posts: 1524
Joined: 2007-01-17, 21:36 UTC
Location: Germany

Post by *StatusQuo »

HolgerK wrote:I want reliable hardware.
So do I, but sometimes you may not have a choice at a customer's PC or at work.
HolgerK wrote:If i can't trust the storage medium
It's not always the storage medium:
I have met some systems (running Windows 2000 or XP) that had serious trouble with various USB storage devices (stick, hard disk, card reader). The cause was, for example, defective RAM or other HW defects.
Result: A bunch of flipped single bits in the target files, as I found out using TC's Sync Tool (and binary file compare on the different files).

When rescuing important files from such (or unknown) systems, verifying the copy results is a good idea IMO.
(Changing the defective HW might be even better, but not always possible or wanted, especially on customer PCs.)
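
The kind of binary compare described above, which turns up single flipped bits between a source file and its copy, can be sketched roughly as follows. This is only an illustration in Python, not how TC's Sync Tool works internally; the function name `flipped_bits` and the chunk size are assumptions made for the example.

```python
from pathlib import Path


def flipped_bits(source: Path, target: Path, chunk_size: int = 1 << 20):
    """Yield (offset, source_byte, target_byte, differing_bits) for each mismatch."""
    with open(source, "rb") as src, open(target, "rb") as dst:
        offset = 0
        while True:
            a = src.read(chunk_size)
            b = dst.read(chunk_size)
            if not a and not b:
                break  # both files fully read
            if a != b:
                # Only walk the chunk byte by byte when it actually differs.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        # XOR leaves a 1 bit wherever the two bytes disagree.
                        yield offset + i, x, y, bin(x ^ y).count("1")
            if len(a) != len(b):
                raise ValueError("files differ in length")
            offset += len(a)
```

A result made up mostly of entries with a differing-bit count of 1 would match the "flipped single bits" pattern, which points at RAM or bus trouble rather than a truncated write.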

Related suggestion: Options for auto-generating MD5 comparison after copying
Who the hell is General Failure, and why is he reading my disk?
-- TC starter menu: Fast yet descriptive command access!
ado
Senior Member
Posts: 445
Joined: 2003-02-18, 13:22 UTC
Location: Slovakia, Pezinok

Post by *ado »

Well, the problem with defective HW is that you cannot even be sure whether you got an error because the original copy failed or because the verify failed. HW problems can be pretty nasty and can make you feel like a fool.
I remember it happened to me a long time ago, when I was using the MS C++ compiler on my PC with 4MB of memory. It worked fine, just a little slowly. When I added another 4MB, the compiler ran fast, except that it started to display weird errors from time to time. Sometimes I ran it twice and got two different errors without touching the code. So I said, OK, bad memory. I put it into another PC, ran a memory test, and guess what: the memory was OK. In the end it was a problem with the onboard memory cache (turning off caching got rid of the compiler errors, but made the PC unusably slow).

So my advice is to stay away from such shaky HW, and if you cannot, you had better not transfer GBs of data. If you have to transfer any data via such HW, then Sync dirs is a pretty good solution and definitely worth those few extra keystrokes/mouse clicks.

ado
Fuzbolero
Junior Member
Posts: 20
Joined: 2007-06-08, 12:42 UTC
Location: Europe

Post by *Fuzbolero »

Such a function would be practical and would give me peace of mind. With so many different people using this tool, we have different situations, preferences, etc., so I think that we will not agree on a single option-free approach. It would be good to provide this for those who want to use it.

Out of curiosity: why is it "not possible with the new copy method" to make checksums during a copy operation?

Ref. this related post:

"Options for auto-generating MD5 comparison after copying"
http://ghisler.ch/board/viewtopic.php?t=23146
Twitter.com/@FuzboleroXV
sehlat
Junior Member
Posts: 7
Joined: 2009-01-08, 02:09 UTC

Byte-by-byte compare only of copied/overwritten?

Post by *sehlat »

I use TC a lot to synchronize large 8-9GB file collections across hardware as a backup. Synchronize/Compare by Content takes a LONG time to find differences and it does full content compare both before and after.

I'd like to see an option when doing "Asymmetric" "Subdirectories" (I'm starting in root here.) but NOT "By Content" to do a "By Content" only on the files that get copied/overwritten. That would speed up synchronizations from master to backup without eating tons of time and processing.
Balderstrom
Power Member
Posts: 2148
Joined: 2005-10-11, 10:10 UTC

Post by *Balderstrom »

The initial compare would need to be by date then? If you have [x] by Content checked, then everything would need to be checked that exists on both sides.

Unless you mean something like:
[x] By Date
[x] By Content if != Date

Where it would only check content on files if their date mismatched.
(If size is different, then they are obviously not equal and don't need to be content compared).
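
The decision order described here (size first, then date, and content only when the dates mismatch) might look like the sketch below. It is not TC's implementation; `files_equal` is a hypothetical helper, and comparing timestamps at whole-second resolution is an assumption made for the example.

```python
import filecmp
import os


def files_equal(left: str, right: str) -> bool:
    """Compare by size, then by date, and by content only if the dates mismatch."""
    a, b = os.stat(left), os.stat(right)
    if a.st_size != b.st_size:
        return False  # different size: cannot be equal, no need to read anything
    if int(a.st_mtime) == int(b.st_mtime):
        return True  # same size and date: trusted without reading the content
    # Same size but different date: fall back to a full byte-by-byte compare.
    return filecmp.cmp(left, right, shallow=False)
```

Note the trade-off: two files with identical size and date but different content are reported as equal, which is exactly the case a verify after copy is meant to catch.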
sehlat
Junior Member
Posts: 7
Joined: 2009-01-08, 02:09 UTC

Post by *sehlat »

What I have in mind is something like a

[x]Compare copied files

while the synchronize process is going on. Once you do the copy,
the dates will match, so comparing by date cannot tell you afterwards what got copied
and needs to be compared back to make sure the copy worked.
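
A rough sketch of the requested behaviour: an asymmetric sync that picks files by size and date, and runs the content compare only on the files it has just copied. The function name `sync_and_verify` and its return values are assumptions for illustration, and nothing here reflects how TC's synchronize tool is built.

```python
import filecmp
import os
import shutil
from typing import List, Tuple


def sync_and_verify(master: str, backup: str) -> Tuple[List[str], List[str]]:
    """Copy new or changed files from master to backup, verifying only those copies.

    Returns (copied, failed) as lists of paths relative to master.
    """
    copied, failed = [], []
    for root, _dirs, files in os.walk(master):
        rel_dir = os.path.relpath(root, master)
        dest_dir = os.path.normpath(os.path.join(backup, rel_dir))
        os.makedirs(dest_dir, exist_ok=True)
        for name in sorted(files):
            src = os.path.join(root, name)
            dst = os.path.join(dest_dir, name)
            if os.path.exists(dst):
                s, d = os.stat(src), os.stat(dst)
                if s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime):
                    continue  # looks unchanged: skipped, and never read
            shutil.copy2(src, dst)  # copy2 also carries the timestamp over
            rel = os.path.normpath(os.path.join(rel_dir, name))
            copied.append(rel)
            # Content compare restricted to the file that was just written.
            if not filecmp.cmp(src, dst, shallow=False):
                failed.append(rel)
    return copied, failed
```

Because unchanged files are never opened, the cost of the verify scales with the amount copied rather than with the size of the whole collection.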
Fuzbolero
Junior Member
Posts: 20
Joined: 2007-06-08, 12:42 UTC
Location: Europe

Post by *Fuzbolero »

The compare obviously needs to be done after the copying has finished. But any checksum calculation should only need to be done on the target after the copy, as the initial reading of the file during the copy should ideally provide the comparison on the source without having to read it again for the verification. (Is this impossible due to some technical limitations in TC?)

As it is now, first a copy is done (source read), then a manual checksum must be initiated on the target, then another manual checksum must be done on the source, and finally a manual comparison between the checksums is needed.

This whole process should be automated by having a "verify" tick mark on the copy dialog box.
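
As a sketch of the automated process described above: the source checksum is calculated from the same chunks that are being written, so only the target has to be read again. `copy_with_verify` is a hypothetical name, and MD5 is used only because the related suggestion mentions it; any `hashlib` algorithm would do.

```python
import hashlib
import os


def copy_with_verify(source: str, target: str, chunk_size: int = 1 << 20) -> bool:
    """Copy source to target; return True if the re-read target matches."""
    src_hash = hashlib.md5()
    with open(source, "rb") as src, open(target, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            src_hash.update(chunk)  # checksum taken during the one source read
            dst.write(chunk)
        dst.flush()
        os.fsync(dst.fileno())  # ask the OS to push the data to the device

    dst_hash = hashlib.md5()
    with open(target, "rb") as check:  # second pass reads the target only
        for chunk in iter(lambda: check.read(chunk_size), b""):
            dst_hash.update(chunk)
    return src_hash.digest() == dst_hash.digest()
```

This saves one full read of the source compared with the manual procedure, but it can only detect damage that happens between the in-memory chunk and the re-read target.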

Further:
That is an example of only one large file, though.
What about when copying a whole bunch of folder hierarchies?

Wouldn't it be practical if TC could calculate and verify a checksum on the whole copy as one entity?
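
One way such a "whole copy as one entity" checksum could work is to feed every relative path, file size and file content into a single hash in a fixed order. The sketch below is an assumption about how this might be done, not an existing TC feature; `tree_digest` and the exact framing of the fields are invented for the example.

```python
import hashlib
import os


def tree_digest(root: str, chunk_size: int = 1 << 20) -> str:
    """Return one MD5 hex digest covering all file names and contents under root."""
    digest = hashlib.md5()
    for current, dirs, files in os.walk(root):
        dirs.sort()  # sorting in place makes os.walk visit folders in a fixed order
        for name in sorted(files):
            path = os.path.join(current, name)
            # Use "/" as separator so the digest does not depend on the platform.
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            size = os.path.getsize(path)
            # Path and size act as a frame so file boundaries cannot blur together.
            digest.update(rel.encode("utf-8") + b"\0" + str(size).encode() + b"\0")
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(chunk_size), b""):
                    digest.update(chunk)
    return digest.hexdigest()
```

Comparing `tree_digest(source_folder)` with `tree_digest(target_folder)` gives a single yes/no answer for the whole hierarchy, at the price of not telling you which file is damaged.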
HolgerK
Power Member
Posts: 5406
Joined: 2006-01-26, 22:15 UTC
Location: Europe, Aachen

Post by *HolgerK »

What about file system caching?

Is it in any way guaranteed that reading a file directly after writing it will read the bytes from the storage medium again, or does the local file system cache (or the file system cache of a remote server) deliver the bytes from memory?
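
To make the caching question concrete, here is a small sketch of what a program can do from its side. It flushes its own buffers, asks the OS to write the data out, and, where the platform offers `os.posix_fadvise` (Linux, not Windows), hints that cached pages may be dropped before reading back. None of this is a guarantee that the read comes from the medium: drive caches and remote servers can still answer from memory. The helper names are assumptions for the example.

```python
import os


def write_through(path: str, data: bytes) -> None:
    """Write data and ask the OS to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()  # empty Python's own buffer
        os.fsync(f.fileno())  # request that the OS write its cache to the device


def read_back(path: str) -> bytes:
    """Read a file back, hinting (where supported) that cached pages be dropped."""
    with open(path, "rb") as f:
        if hasattr(os, "posix_fadvise"):
            try:
                # Only a hint: the kernel is free to ignore it.
                os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # not supported for this file or file system
        return f.read()
```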

Regards
Holger
knnknn
Junior Member
Posts: 60
Joined: 2007-07-20, 08:04 UTC

Post by *knnknn »

Fuzbolero wrote:The compare obviously needs to be done after the copying has finished. But any checksum calculation should only need to be done on the target after the copy, as the initial reading of the file during the copy should ideally provide the comparison on the source without having to read it again for the verification.
I disagree here slightly. I also want a byte-by-byte comparison, not only CRC.

CRC just checks for write errors, not for read errors: If you copy some files over a shaky network onto your harddisk, then you will hardly ever notice a CRC difference, although the file transfers may have been corrupted.

Thus ideally TC should offer a CRC verify and a byte-by-byte verify (= re-reading the source).
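
The difference between the two verify modes can be shown with a small simulation. Here the "shaky network" is a hypothetical `flaky_read` function that flips one bit while the source is read once; the CRC taken during that read matches the target, while a byte-by-byte compare against the source exposes the damage. The sketch assumes that the second read of the source is clean, which a real faulty link would not promise.

```python
import zlib


def copy_with_inflight_crc(read_source, write_target) -> int:
    """Read the source once, write it out, and return the CRC of what was read."""
    data = read_source()
    write_target(data)
    return zlib.crc32(data)


source = b"important customer data"
target = bytearray()


def flaky_read() -> bytes:
    corrupted = bytearray(source)
    corrupted[3] ^= 0x01  # simulate a single bit flipped on the way in
    return bytes(corrupted)


def write(data: bytes) -> None:
    target[:] = data


crc_from_read = copy_with_inflight_crc(flaky_read, write)

# CRC verify: compares the damaged data with a faithful copy of the damaged data.
crc_ok = crc_from_read == zlib.crc32(bytes(target))
# Byte-by-byte verify: compares the target with the source itself.
bytes_ok = bytes(target) == source

print(crc_ok, bytes_ok)
```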
knnknn
Junior Member
Posts: 60
Joined: 2007-07-20, 08:04 UTC

Post by *knnknn »

Let's also not forget that "Move+Verify" offers another positive aspect:

Sometimes you move files over the network and the network computer crashes just after a file has been moved. Since the network computer reported a file as "transferred" (but didn't flush it yet to the harddisk) your local PC already deleted the file.

In other words: You lose the file since 1) it's locally deleted and 2) not completely saved yet on the network harddisk.

Happened to me already several times. I then noticed months later that some files had the correct size+date but were filled with zeroes at the end.

With move+verify you could make sure that files are transferred correctly.
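
A minimal sketch of the move+verify idea: the local file is deleted only after the copy has been read back and found identical. `move_verified` is a hypothetical helper, not a TC function. The read-back may be served from a cache instead of the remote disk, so this narrows the window for losing a file without closing it completely.

```python
import filecmp
import os
import shutil


def move_verified(source: str, target: str) -> bool:
    """Move source to target, deleting the source only after a successful verify."""
    shutil.copy2(source, target)
    # Byte-by-byte compare; shallow=False forces the contents to be read.
    if not filecmp.cmp(source, target, shallow=False):
        return False  # verification failed: the source is left untouched
    os.remove(source)
    return True
```

A file whose tail was never written, such as the zero-filled files described above, would fail the compare and the original would stay in place.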
Postkutscher
Power Member
Posts: 556
Joined: 2006-04-01, 00:11 UTC

Post by *Postkutscher »

knnknn wrote:CRC just checks for write errors, not for read errors
Seriously ?
StatusQuo
Power Member
Posts: 1524
Joined: 2007-01-17, 21:36 UTC
Location: Germany

Post by *StatusQuo »

Postkutscher wrote:
knnknn wrote:CRC just checks for write errors, not for read errors
Seriously ?
Makes sense. When you read the data only once, you have nothing to compare against.
An even faster version: calculate source CRC while reading and target CRC while writing. Then just compare the CRCs - extremely fast, but probably always reporting a worthless "OK". :D
HolgerK
Power Member
Posts: 5406
Joined: 2006-01-26, 22:15 UTC
Location: Europe, Aachen

Post by *HolgerK »

knnknn wrote:Happened to me already several times.
Never happened to me in the last ten years. I would refuse to use such a server.
knnknn wrote:I then noticed months later that some files had the correct size+date but were filled with zeroes at the end.
Sounds like the server's file cache or HD internal write cache wasn't flushed.
Are you sure that reading directly after writing will flush the server's cache?

BTW: using compatibility mode for network drives (default copy method in TC7.50) can increase the data security in combination with some faulty network devices.

Kind regards
Holger
knnknn
Junior Member
Posts: 60
Joined: 2007-07-20, 08:04 UTC

Post by *knnknn »

HolgerK wrote:
knnknn wrote:Happened to me already several times.
Never happened to me in the last ten years. I would refuse to use such a server.
You would refuse to use a server that can crash because of a power failure?
HolgerK wrote:Are you sure that reading directly after writing will flush the server's cache?
Well, definitely better than without verifying. Are you claiming that it's a good thing that TC doesn't have a "verify after copy" function?
HolgerK
Power Member
Posts: 5406
Joined: 2006-01-26, 22:15 UTC
Location: Europe, Aachen

Post by *HolgerK »

knnknn wrote:You would refuse to use a server that can crash because of a power failure?
Definitely yes, if this happens more than twice.
You know what a UPS is? (No, not the parcel service: an uninterruptible power supply.)
knnknn wrote:
HolgerK wrote:Are you sure that reading directly after writing will flush the server's cache?
Well, definitely better than without verifying.
Are you claiming that it's a good thing that TC doesn't have a "verify after copy" function?
A solid system has several mechanisms to guarantee data integrity:
- Memory with parity.
- CRC checks on physical layers.
- Checksums on logical layers.
- RAID 1/5/6 hard disk mirroring/parity
- Backups
- ...

Seriously, you are talking about a feature that is, imho, unnecessary in most cases and that covers a single program, while the rest of the programs (including the OS) running on such scrappy hardware are creating one erroneous file after another. :?

Creating, copying and verifying checksums on demand to check your storage or network connection is one thing. But detecting a possible source of data loss and then just going on without any consequence such as replacing the faulty hardware, doing the same again and again, watching happily as your bytes disappear, doesn't make much sense.

Regards
Holger