Hello,
Currently, the "synchronize directories" function compares files one by one. When I compare a local drive with a "cloud" drive such as Google Drive, the remote files are accessed, and therefore downloaded, one at a time, which slows the comparison down by a factor of 10 or more when there are many small files.
It would be better to compare multiple files at the same time in that case. I'm aware that this may be slower when synchronizing two local directories, but that case could be detected by checking whether throughput differs between the two directories.
Another solution would be to read the first 1KB of each file in parallel to trigger any download, then proceed sequentially.
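To illustrate the second idea, here is a minimal Python sketch (not TC code, just an illustration). It assumes that reading the first bytes of a file on the virtual drive is enough to trigger the download; the names touch_file, prefetch and list_files, and the choice of 8 worker threads, are made up for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

PREFETCH_BYTES = 1024  # the "first 1KB" from the suggestion above


def touch_file(path):
    """Read the first PREFETCH_BYTES of a file and return how many bytes were read."""
    try:
        with open(path, "rb") as f:
            return len(f.read(PREFETCH_BYTES))
    except OSError:
        return -1  # unreadable: leave it to the sequential pass to report


def prefetch(paths, max_workers=8):
    """Touch all files in parallel (assumption: this starts their downloads)."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(touch_file, paths)))


def list_files(root):
    """Collect every file path below root."""
    return [os.path.join(d, name) for d, _, names in os.walk(root) for name in names]
```

After prefetch(list_files(some_dir)) returns, the existing sequential comparison could run unchanged, ideally finding the files already downloaded.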
Synchronize directories - async compare with google drive
- ghisler(Author)
Re: Synchronize directories - async compare with google drive
Currently synchronizing by content is not supported with the cloud plugin. There are functions to compare via hash, but they are currently only used in the SFTP plugin. I plan to support them also in the cloud plugin, but not all cloud services seem to support hashes.
Google drive: No hash provided in file properties (Alt+Enter)
Microsoft OneDrive: SHA1 hash provided
Microsoft Azure: no hash
Dropbox: SHA256(?) hash provided
Box: sha1
Yandex: sha256 and md5
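As an illustration of what a hash-based comparison could look like, here is a minimal Python sketch. It is not the plugin's actual code. It assumes the service reports a plain hex digest of the whole file; some services may compute their hash in a service-specific way, in which case a plain digest would not match. The names file_digest and same_content are made up for this example.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a local file in chunks so large files are not loaded into memory at once."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(local_path, remote_hash, algorithm="sha1"):
    """Compare the local digest with a hex digest reported by the remote service."""
    return file_digest(local_path, algorithm) == remote_hash.strip().lower()
```

With this approach only the local file has to be read; the remote file does not need to be downloaded at all.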
Author of Total Commander
https://www.ghisler.com
Re: Synchronize directories - async compare with google drive
Sorry, I'm not talking about cloud with special APIs. I'm not even using a plugin in TC to access it. I meant Google Drive File Stream rather than Google Drive.
Google Drive File Stream mimics a Windows drive. When a file is accessed, it is downloaded and the file is "opened" once it's fully downloaded.
By opening multiple files at the same time, you start the download for all of those files. If your bandwidth permits, you gain time, because the maximum download speed may be limited per file, and smaller files become accessible earlier and can be compared while the rest are still being downloaded.
Currently, TC accesses files one after the other, which results in only one file being downloaded at a time.
Basically, if accessing a file takes more than a few seconds, it's probably worth trying to access other files at the same time.
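The heuristic in the last sentence could be sketched roughly like this in Python (an illustration only, not how TC works). The 2-second threshold stands in for "a few seconds", and read_file, read_all and the 4 worker threads are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SLOW_OPEN_SECONDS = 2.0  # assumed stand-in for "a few seconds"


def read_file(path):
    """Read a whole file; on a streamed drive this blocks until the download finishes."""
    with open(path, "rb") as f:
        return f.read()


def read_all(paths, reader=read_file, slow_after=SLOW_OPEN_SECONDS, max_workers=4):
    """Read files one by one; after the first slow read, read the rest concurrently."""
    paths = list(paths)
    results = {}
    for i, path in enumerate(paths):
        start = time.monotonic()
        results[path] = reader(path)
        if time.monotonic() - start > slow_after:
            # Access looks slow (probably remote), so switch to parallel access.
            rest = paths[i + 1:]
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                results.update(zip(rest, pool.map(reader, rest)))
            break
    return results
```

On two fast local directories the threshold is never exceeded, so access stays sequential and nothing changes there.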
Google Drive File Stream mimics a Windows drive. When a file is accessed, it is downloaded and the file is "opened" once it's fully downloaded.
By opening multiple files at the same time, you launch the process for all that files. If your bandwidth permits it, you gain time as the maximum download speed may be limited per file, or smaller files are accessible earlier and can be compared with the rest is still being downloaded.
Currently, TC tries to access files one after the other, which results in only 1 file being downloaded at at time.
Basically, if accessing a file takes more than a few secondes, then probably it's worth trying accessing other files at the same time.