MD5 (SFV) per file multithreading

Here you can propose new features, make suggestions etc.

Moderators: Hacker, petermad, Stefan2, white

Hacker
Moderator
Posts: 13142
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Post by *Hacker »

Christian,
I would like to suggest using multithreading when creating or verifying MD5 / SFV files, with one thread per file. Would that be possible to implement?

TIA
Roman
Suppose you press Ctrl+F, select the FTP connection (with a saved password), but instead of clicking Connect, you drop dead.
MVV
Power Member
Posts: 8711
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Post by *MVV »

I don't think this is a very good suggestion, because calculating or checking MD5 requires intensive HDD access, so multithreading will not give any performance gain.
Sob
Power Member
Posts: 945
Joined: 2005-01-19, 17:33 UTC

Post by *Sob »

Take my system, for example: HDD read speed is ~100 MB/s and TC's CPU usage is ~10% on a quad-core i5, i.e. one core at about 40%. The disk would need to deliver more than 250 MB/s before extra threads brought any benefit. But if Hacker has a RAID made of SSDs or something like that, it would make sense. :)
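
The arithmetic behind those figures can be sketched as follows. This is only a rough model: it assumes the hashing runs on a single thread and that CPU cost scales linearly with throughput. The function names are made up for illustration.

```python
def single_core_load_pct(total_cpu_pct, logical_cores):
    """Load on the one busy core, given the overall CPU percentage.

    Assumes all of the reported load comes from a single hashing thread.
    """
    return total_cpu_pct * logical_cores


def cpu_bound_ceiling_mb_s(throughput_mb_s, total_cpu_pct, logical_cores):
    """Throughput at which one core would reach 100% (linear scaling assumed)."""
    return throughput_mb_s * 100 / (total_cpu_pct * logical_cores)


# Figures from the post: ~100 MB/s disk, ~10% overall load, quad core.
print(single_core_load_pct(10, 4))         # 40
print(cpu_bound_ceiling_mb_s(100, 10, 4))  # 250.0
```

Below that ceiling the disk is the limit and extra threads have nothing to speed up.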
HolgerK
Power Member
Posts: 5409
Joined: 2006-01-26, 22:15 UTC
Location: Europe, Aachen

Post by *HolgerK »

Sob wrote: "TC CPU usage ~10% on quad core (i5)"
I guess it will be 50% on a dual-core Atom.

Holger
Hacker
Moderator
Posts: 13142
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Post by *Hacker »

Well, I have ~45% CPU usage on a Core 2 Duo, thus around 90% of one core.

Roman
HolgerK
Power Member
Posts: 5409
Joined: 2006-01-26, 22:15 UTC
Location: Europe, Aachen

Post by *HolgerK »

The overall CPU-load percentage alone is not an adequate criterion for deciding whether this is useful.
For example, we don't know whether Sob's quad-core i5 supports Hyper-Threading. If it does, the system reports eight logical cores, and 10% overall would mean 80% load on a single core.

Here (SHA-1, 3 GB files) I'm getting a throughput of about 30 to 35 MB/s (according to Process Explorer) on an AMD X2 dual core, while the overall CPU load is also about 45% (90% of a single core).
And of course the local HDD can deliver more than that (about 80-90 MB/s).
It would be even more if the file system cache provided the data.
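
To make the comparison concrete, here is a minimal sketch of which side limits throughput for a given number of hashing threads. It assumes hashing scales perfectly across threads and that the disk rate stays constant, neither of which is guaranteed in practice. The function name and the rounded input figures (35 MB/s per core, 80 MB/s disk) are illustrative.

```python
def estimated_throughput(disk_mb_s, per_core_hash_mb_s, threads):
    """Return (throughput in MB/s, bottleneck) for `threads` hashing workers.

    Simplified model: total hashing capacity is per-core rate times threads,
    and the slower of disk and CPU sets the overall rate.
    """
    cpu_mb_s = per_core_hash_mb_s * threads
    if cpu_mb_s < disk_mb_s:
        return cpu_mb_s, "cpu"
    return disk_mb_s, "disk"


# Roughly the figures above: SHA-1 at ~35 MB/s per core, HDD at ~80 MB/s.
for n in (1, 2, 3):
    print(n, estimated_throughput(80, 35, n))
```

Under this model a second thread on the dual core would still leave the CPU as the limit, so both cores could be kept busy.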

Regards
Holger
Sob
Power Member
Posts: 945
Joined: 2005-01-19, 17:33 UTC

Post by *Sob »

OK, I stand corrected, it is not a bad idea after all. :) Previously I was talking about SFV, but other algorithms need more CPU power. In my case one core can do ~250 MB/s for SFV, ~100 MB/s for MD5 and only ~60 MB/s for SHA (all very rough numbers). So in some cases the CPU can be the bottleneck instead of the HDD, and multithreading would actually help.
But it would require clever use of large buffers, because reading multiple files from a normal disk at the same time can totally kill performance due to seek delays.
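
One possible way to combine one thread per file with large buffers is sketched below. This is not how TC works or would work; it is only an illustration of the idea. It assumes an 8 MiB buffer (an arbitrary choice) and uses a lock so that only one thread reads from the disk at a time, while the hashing itself runs in parallel. The names `md5_of_file`, `md5_of_files` and `CHUNK_SIZE` are made up for this example.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 8 * 1024 * 1024    # assumed buffer size, not a tuned value
_disk_lock = threading.Lock()   # lets only one thread read at a time


def md5_of_file(path, chunk_size=CHUNK_SIZE):
    """Hash one file, reading it in large chunks under the shared disk lock."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            # Serialize disk access so each read is one long sequential
            # request instead of many small reads interleaved across files.
            with _disk_lock:
                chunk = f.read(chunk_size)
            if not chunk:
                break
            # Hashing happens outside the lock, so other threads can read
            # or hash in the meantime.
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_files(paths, workers=4):
    """Hash several files, one task per file, and return {path: hex digest}."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(md5_of_file, paths)))
```

Calling `md5_of_files(["a.iso", "b.iso"])` with real paths returns a dictionary of hex digests. Threads are enough here because CPython's `hashlib` releases the global interpreter lock while hashing large buffers.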