Synchronize directories - async compare with google drive

Here you can propose new features, make suggestions etc.

Moderators: white, Hacker, petermad, Stefan2

Ankha
Junior Member
Posts: 8
Joined: 2012-07-05, 08:02 UTC

Synchronize directories - async compare with google drive

Post by *Ankha »

Hello,

Currently, the "synchronize directories" functionality performs a file-by-file comparison. If I'm comparing the content of a local drive with a "cloud" drive such as Google Drive, the files are accessed remotely one by one, which causes them to be downloaded one by one. This slows the comparison down by a factor of 10 or more when there are many small files.

It would be better to compare multiple files at the same time in that case. I'm aware that it may be slower when synchronizing two local directories, but that case could be detected by checking whether there is a difference in throughput between the two directories.

Another solution would be to read the first 1KB of each file in parallel to trigger any download, then proceed sequentially.
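
To illustrate the second idea, here is a minimal Python sketch (not Total Commander code). It assumes that reading the first bytes of a file on the streaming drive is enough to make the drive start downloading it, which is how I understand the behaviour but have not verified. The function names and the worker count of 8 are made up for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def touch_first_kb(path, size=1024):
    """Read the first `size` bytes of a file and return how many were read.

    Assumption: on a streaming drive this read triggers the download.
    """
    with open(path, "rb") as f:
        return len(f.read(size))


def prefetch(paths, max_workers=8):
    """Touch all files in parallel so their downloads can overlap."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(touch_first_kb, paths))


def same_content(a, b, chunk=64 * 1024):
    """Plain sequential byte-by-byte comparison of two files."""
    if os.path.getsize(a) != os.path.getsize(b):
        return False
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            block_a = fa.read(chunk)
            block_b = fb.read(chunk)
            if block_a != block_b:
                return False
            if not block_a:  # both files reached EOF at the same point
                return True


def compare_pairs(pairs, max_workers=8):
    """pairs: list of (local_path, remote_path) tuples.

    First trigger all remote downloads in parallel, then compare
    sequentially, as in the current behaviour.
    """
    prefetch([remote for _, remote in pairs], max_workers)
    return [same_content(local, remote) for local, remote in pairs]
```

The comparison itself stays sequential, so the only change compared with today is the parallel "touch" pass in front of it.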
ghisler(Author)
Site Admin
Posts: 48088
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Synchronize directories - async compare with google drive

Post by *ghisler(Author) »

Currently synchronizing by content is not supported with the cloud plugin. There are functions to compare via hash, but they are currently only used in the SFTP plugin. I plan to support them also in the cloud plugin, but not all cloud services seem to support hashes.

Google Drive: no hash provided in file properties (Alt+Enter)
Microsoft OneDrive: SHA1 hash provided
Microsoft Azure: no hash
Dropbox: SHA256(?) hash provided
Box: SHA1
Yandex: SHA256 and MD5
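
As a rough illustration of comparing via hash (a Python sketch, not the plugin's actual code): the local file is hashed in chunks and the result is compared with the hex string reported by the service. This assumes the service reports a plain hex digest of the whole file content; a service that uses its own hashing scheme would need different handling. `file_digest` and `matches_remote` are hypothetical names.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk=1024 * 1024):
    """Hash a local file in chunks so large files do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def matches_remote(path, remote_hex, algorithm="sha1"):
    """Compare the local digest with the hex digest reported by the service.

    Assumption: `remote_hex` is a plain hex digest of the full file content.
    """
    return file_digest(path, algorithm) == remote_hex.strip().lower()
```

With this approach no file content has to be downloaded for the comparison, only the hash from the file properties.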
Author of Total Commander
https://www.ghisler.com
Ankha
Junior Member
Posts: 8
Joined: 2012-07-05, 08:02 UTC

Re: Synchronize directories - async compare with google drive

Post by *Ankha »

Sorry, I'm not talking about cloud with special APIs. I'm not even using a plugin in TC to access it. I meant Google Drive File Stream rather than Google Drive.

Google Drive File Stream mimics a Windows drive. When a file is accessed, it is downloaded and the file is "opened" once it's fully downloaded.
By opening multiple files at the same time, you start that process for all of those files. If your bandwidth permits it, you gain time, because the maximum download speed may be limited per file, and because smaller files become accessible earlier and can be compared while the rest are still being downloaded.

Currently, TC tries to access files one after the other, which results in only one file being downloaded at a time.

Basically, if accessing a file takes more than a few seconds, then it's probably worth trying to access other files at the same time.