MD5 (SFV) per file multithreading
Posted: 2010-03-14, 12:56 UTC
by Hacker
Christian,
I would like to suggest using multithreading when creating / verifying MD5 / SFV files, with one thread per file. Would that be possible to implement?
TIA
Roman
Posted: 2010-03-14, 13:05 UTC
by MVV
I don't think it is a very good suggestion, because calculating or checking MD5 requires intensive HDD access, so multithreading will not improve performance.
Posted: 2010-03-14, 17:20 UTC
by Sob
Take my system for example. HDD read speed ~100MB/s, TC CPU usage ~10% on a quad core (i5), meaning one core at 40%. The disk would need to deliver more than 250MB/s before more threads would help. But if Hacker has a RAID made of SSDs or something like that, it'd make sense. :)
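The estimate above can be written out as a small calculation. This is only a sketch of the arithmetic in this post; the function names `per_core_load` and `saturation_throughput` are made up for illustration, and it assumes hashing cost scales linearly with throughput and runs on a single core.

```python
def per_core_load(total_cpu_fraction, logical_cores):
    """Load on one core if a single thread causes the whole reported usage."""
    return total_cpu_fraction * logical_cores


def saturation_throughput(observed_mb_s, core_load):
    """Throughput at which that one core would reach 100% (linear scaling assumed)."""
    return observed_mb_s / core_load


if __name__ == "__main__":
    load = per_core_load(0.10, 4)            # ~10% of a quad core
    limit = saturation_throughput(100, load)  # ~100 MB/s disk read speed
    print(f"one core at {load:.0%}, CPU-bound above roughly {limit:.0f} MB/s")
```

With ~10% on four cores and ~100MB/s, this gives the 40% and roughly 250MB/s figures quoted above.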
Posted: 2010-03-14, 17:54 UTC
by HolgerK
Sob wrote: "TC CPU usage ~10% on quad core (i5)"
I guess it will be 50% on a dual-core Atom.
Holger
Posted: 2010-03-14, 18:44 UTC
by Hacker
Well, I have ~45% CPU usage on a Core 2 Duo, thus around 90% of one core.
Roman
Posted: 2010-03-14, 19:21 UTC
by HolgerK
The CPU load percentage alone is not an adequate criterion for deciding whether it's useful or not.
For example, we don't know whether Sob's quad-core i5 supports hyper-threading. If it does, there are 8 logical cores, and 10% overall would mean 80% load on a single one.
Here (SHA1, 3 GByte files) I'm getting a throughput of about 30 to 35 MByte/s (measured with Process Explorer) on an AMD X2 dual core, while the overall CPU load is also about 45% (90% of a single core).
And of course the local HDD can deliver more than that (about 80-90 MByte/s).
And it would be even more if the file system cache provides the data.
Regards
Holger
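To check whether the CPU or the disk is the limit on a given machine, one can take the disk out of the picture (as in the cached case above) and hash data that is already in memory. The following is a rough sketch, not how TC measures anything; `hash_throughput_mb_s` and `measure_all` are hypothetical helper names, and CRC32 stands in for SFV since SFV files store CRC32 checksums.

```python
import hashlib
import time
import zlib


def hash_throughput_mb_s(update, size_mb=64):
    """Feed size_mb blocks of 1 MiB to update() and return MiB per second."""
    block = bytes(1024 * 1024)  # in-memory data, so no disk access is involved
    start = time.perf_counter()
    for _ in range(size_mb):
        update(block)
    elapsed = time.perf_counter() - start
    return size_mb / elapsed


def measure_all(size_mb=64):
    """Single-threaded throughput for the checksum types discussed here."""
    crc = 0

    def crc_update(block):
        nonlocal crc
        crc = zlib.crc32(block, crc)  # running CRC32, as used by SFV

    return {
        "crc32 (sfv)": hash_throughput_mb_s(crc_update, size_mb),
        "md5": hash_throughput_mb_s(hashlib.md5().update, size_mb),
        "sha1": hash_throughput_mb_s(hashlib.sha1().update, size_mb),
    }


if __name__ == "__main__":
    for name, rate in measure_all().items():
        print(f"{name}: {rate:.0f} MiB/s on one core")
```

If the number for an algorithm is well below what the disk can deliver, a single hashing thread is the bottleneck for that algorithm on that machine.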
Posted: 2010-03-14, 20:15 UTC
by Sob
Ok, I stand corrected, it is not a bad idea after all. :) Previously I was talking about SFV, but other algorithms need more CPU power. So in my case one core can do ~250MB/s for SFV, ~100MB/s for MD5 and only ~60MB/s for SHA (all are very rough numbers). So in some cases the CPU can be the bottleneck instead of the HDD, and multithreading would actually help.
But it would require clever use of large buffers, because reading multiple files from a normal disk at the same time can totally kill performance because of seeking delays.
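One possible way to combine both points, sketched under assumptions: one thread per file does the hashing, but a shared lock lets only one thread read from the disk at a time, and each read fetches a large block so the head does not jump between files constantly. The names `md5_of_file`, `md5_many`, the 16 MiB `CHUNK_SIZE` and the worker count are illustrative choices, not anything TC does. It relies on Python's `hashlib` releasing the GIL while hashing large buffers, so the hashing itself can run in parallel.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 16 * 1024 * 1024   # assumed buffer size; would need tuning per disk
_disk_lock = threading.Lock()   # only one thread touches the disk at a time


def md5_of_file(path, chunk_size=CHUNK_SIZE):
    """Hash one file, reading it in large blocks under the shared disk lock."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            with _disk_lock:
                chunk = f.read(chunk_size)  # serialized, large sequential read
            if not chunk:
                break
            digest.update(chunk)  # runs outside the lock, in parallel with other files
    return digest.hexdigest()


def md5_many(paths, workers=4):
    """Hash several files, one worker thread per file, and return {path: md5}."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(md5_of_file, paths)))
```

On an SSD or RAID the lock could probably be dropped; on a single normal disk it trades some parallelism for fewer seeks.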