Add BLAKE2 to checksum methods

ghisler(Author)
Site Admin
Posts: 50383
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Add BLAKE2 to checksum methods

Post by *ghisler(Author) »

I'm using the C code from their GitHub repository. It's well optimized, with support for SSE, AVX2, etc. Although there seems to be a way to create DLLs with Rust, their code doesn't support it, so I would have to learn Rust first and then modify their code to get it working. There is no guarantee that it would be faster; the disk reading speed is probably the limiting factor.
Author of Total Commander
https://www.ghisler.com
lelik007
Member
Posts: 173
Joined: 2021-04-20, 06:37 UTC

Post by *lelik007 »

Thank you for the answer. I know it's difficult to compile, because the multithreaded code exists only in Rust and there is no official multithreaded C code.
But it's currently the fastest hash function in TC, with all SIMD optimizations applied, and I can't ask for more.
ghisler(Author) wrote: "There is no guarantee that it would be faster, the disk reading speed is probably the limiting factor."
There will be a difference when reading from the disk cache or a RAM disk. But the thing is, I've never come across compiled Rust code in any software I've seen.
So there are no better BLAKE3 implementations yet.

Post by *lelik007 »

2ghisler(Author)
Christian, BTW, the BLAKE3 team seems to have added multi-threaded BLAKE3 C code in v1.7.0:
https://github.com/BLAKE3-team/BLAKE3/releases/tag/1.7.0
The release notes say: "The C implementation has gained multithreading support, based on Intel's oneTBB library. This works similarly to the Rayon-based multithreading used in the Rust implementation. See c/README.md for details."
We could check it out during the next beta test.

Post by *ghisler(Author) »

(removed due to measurement error)

Post by *lelik007 »

2ghisler(Author)
ghisler(Author) wrote: "6.5GB file on a 4 lane PCIe4.0 SSD (12.5s to 3.5s)."
Something is weird here. 6.5 GB in 3.5 s is what I would expect even with a single thread; that's normal. 12.5 s for 6.5 GB is very slow.

Also, the new DLL is the same size, but for multithreading the BLAKE3 C code uses Intel's oneTBB library.
The BLAKE3 C code didn't become parallel all by itself; oneTBB is more like an optional dependency they included:
https://github.com/BLAKE3-team/BLAKE3/blob/master/c/README.md#multithreading
oneTBB has to be compiled separately or bundled somehow.
https://github.com/BLAKE3-team/BLAKE3/blob/master/c/README.md#cmake
The README says: "BLAKE3_USE_TBB: Enable oneTBB parallelism (Requires a C++20 capable compiler)"
But if it requires C++20, I think it's for modern systems only.
ZoSTeR
Power Member
Posts: 1049
Joined: 2004-07-29, 11:00 UTC

Post by *ZoSTeR »

I didn't see any difference with the new blakex64 DLL.

System:
AMD Ryzen 7 9800X3D 8-Core Processor
32 GB DDR5-6400 (3200 MHz)
Samsung SSD 990 Pro 2TB PCIe Gen4 x4
Windows 11 Pro 24H2
TC 11.51 x64

Both took ca. 42 sec for a 125 GB compressed backup file.
CPU Load: ~12%
SSD Load: ~64%

125 GB/42 s = 2.98 GB/s
6.5 GB/12.5 s = 0.52 GB/s
6.5 GB/3.5 s = 1.86 GB/s

@ghisler: as lelik007 suggested, there seems to be something strange going on with your system or measurements...
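
The throughput figures above are just file size divided by elapsed time. A minimal sketch that reproduces them, assuming decimal gigabytes as quoted in the posts (the function name `throughput_gb_per_s` is made up for this illustration):

```python
def throughput_gb_per_s(size_gb: float, seconds: float) -> float:
    """Average throughput in GB/s for size_gb gigabytes processed in seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_gb / seconds


# (size in GB, elapsed seconds) pairs quoted in the thread
measurements = {
    "125 GB in 42 s": (125, 42),
    "6.5 GB in 12.5 s": (6.5, 12.5),
    "6.5 GB in 3.5 s": (6.5, 3.5),
}

for label, (size_gb, seconds) in measurements.items():
    print(f"{label}: {throughput_gb_per_s(size_gb, seconds):.2f} GB/s")
```

This prints 2.98, 0.52 and 1.86 GB/s for the three cases.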
Last edited by ZoSTeR on 2025-03-31, 20:10 UTC, edited 1 time in total.

Post by *lelik007 »

2ZoSTeR
There's nothing strange: what Christian provided is just a more modern single-threaded version. And yes, 6.5 GB/12.5 s = 0.52 GB/s is really weird; it's as if the SIMD optimizations were disabled.

If you'd like, you can download the compiled multi-threaded Rust build and measure this 125 GB file again:
https://github.com/BLAKE3-team/BLAKE3/releases/download/1.8.0/b3sum_windows_x64_bin.exe

2ghisler(Author)
Christian, you have to read this carefully to understand how the devs intend the C code to be multi-threaded:
https://github.com/BLAKE3-team/BLAKE3/blob/master/c/README.md
They say to do it yourself with the help of oneTBB:
https://uxlfoundation.github.io/oneTBB/
https://github.com/uxlfoundation/oneTBB
and I suppose the oneTBB DLL would also have to be shipped as a runtime.
Last edited by lelik007 on 2025-03-31, 20:49 UTC, edited 1 time in total.

Post by *ZoSTeR »

With the linked b3sum_windows_x64_bin.exe:

Duration: 34s
CPU Load: ~60% (all cores used with 100% spikes)
SSD Load: ~100%

125 GB/34 s = 3.68 GB/s

Besides the speed increase, the exe uses up all available RAM; not sure if this is a good or bad thing... the TC DLL uses barely any.
Last edited by ZoSTeR on 2025-03-31, 20:57 UTC, edited 1 time in total.

Post by *lelik007 »

2ZoSTeR
Thank you for the measurements. What can I say: BLAKE3 performs very well on your PC, but I like the results of the single-threaded variant better, considering the loads and the fact that the multi-threaded variant doesn't give, say, a 50% boost.
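
For reference, the gain can be worked out from the two runtimes ZoSTeR reported for the 125 GB file (42 s with the TC DLL, 34 s with b3sum 1.8.0). A small sketch; the helper name `speedup_percent` is invented for this example:

```python
def speedup_percent(baseline_s: float, new_s: float) -> float:
    """Throughput gain in percent when the runtime drops from baseline_s to new_s."""
    return (baseline_s / new_s - 1.0) * 100.0


single_threaded_s = 42.0  # TC DLL, 125 GB file
multi_threaded_s = 34.0   # b3sum 1.8.0, same file

gain = speedup_percent(single_threaded_s, multi_threaded_s)
print(f"throughput gain: {gain:.1f}%")

# Runtime that would have been needed for a 50% throughput gain
needed_s = single_threaded_s / 1.5
print(f"runtime needed for +50%: {needed_s:.0f} s")
```

That works out to a gain of roughly 23.5%, while a 50% boost would have required finishing in 28 s.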

Post by *ZoSTeR »

Agreed, since this is clearly bandwidth limited and not many users will have PCIe Gen5 SSDs, it might not be worth the effort (yet).

Post by *ghisler(Author) »

Sorry, it looks like I tested the old dll with a file on my SATA SSD, which would explain why I got 500MB/sec.

It looks like I have to map the entire file into memory (or in large blocks of, say, 1 GB) and then pass that to the hasher to use multi-threading. I don't know whether it's worth the hassle to do this.
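
A rough sketch of the two ways of feeding a file to a hasher: small sequential reads versus a memory mapping handed over in large windows. Assumptions: Python's standard library has no BLAKE3, so `hashlib.blake2b` stands in for it; the function names and the 1 MiB / 1 GiB sizes are made up for illustration; and nothing here is multithreaded. It only shows that the hasher gets to see a large contiguous buffer in the mapped case, which is what a parallel implementation would need in order to split the work.

```python
import hashlib
import mmap
import os

BLOCK_SIZE = 1 << 20  # 1 MiB read buffer; arbitrary choice for this sketch


def hash_streaming(path: str) -> str:
    """Feed the file to the hasher in small sequential reads."""
    h = hashlib.blake2b()  # stand-in for BLAKE3
    with open(path, "rb") as f:
        while chunk := f.read(BLOCK_SIZE):
            h.update(chunk)
    return h.hexdigest()


def hash_mapped(path: str, window: int = 1 << 30) -> str:
    """Map the file into memory and pass it to the hasher in large windows."""
    h = hashlib.blake2b()  # stand-in for BLAKE3
    size = os.path.getsize(path)
    if size == 0:
        return h.hexdigest()  # an empty file cannot be mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m, memoryview(m) as view:
            for start in range(0, size, window):
                # Slicing a memoryview does not copy the mapped bytes
                h.update(view[start:start + window])
    return h.hexdigest()
```

Both functions return the same digest for the same file; only the way the data reaches the hasher differs.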

Post by *lelik007 »

2ghisler(Author)
ghisler(Author) wrote: "It looks like I have to map the entire file into memory (or in large blocks of, say, 1 GB) and then pass that to the hasher to use multi-threading."
It looks like it. This is mmap: https://en.wikipedia.org/wiki/Mmap
b3sum_windows_x64_bin.exe has the switch --no-mmap with the description:
"--no-mmap: Disable memory mapping. Currently this also disables multithreading."

The Rust code isn't multithreaded by itself either; it relies on Rayon, a Rust library for data parallelism:
"Rayon is a data-parallelism library that makes it easy to convert sequential computations into parallel."
https://docs.rs/rayon/latest/rayon/
And that is similar to what OpenMP and oneTBB do, as I understand these things.
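
To illustrate what "converting a sequential computation into a parallel one" means for hashing, here is a toy sketch of the data-parallel pattern: split the input into chunks, hash the chunks independently on a thread pool, then combine the chunk digests. Loud caveats: this is NOT BLAKE3's tree mode and its result is not compatible with any standard digest; `hashlib.blake2b` is only a stand-in, the chunk size and worker count are arbitrary, and all function names are invented for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # 1 MiB per work item; arbitrary for this sketch


def leaf_digest(chunk: bytes) -> bytes:
    """Hash one chunk independently of all others."""
    return hashlib.blake2b(chunk).digest()


def combine(leaves: list[bytes]) -> str:
    """Hash the concatenated chunk digests into a single root value."""
    return hashlib.blake2b(b"".join(leaves)).hexdigest()


def split(data: bytes) -> list[bytes]:
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]


def sequential_root(data: bytes) -> str:
    """Same computation, one chunk after another."""
    return combine([leaf_digest(c) for c in split(data)])


def parallel_root(data: bytes, workers: int = 4) -> str:
    """Same computation, with the chunk hashes spread over a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        leaves = list(pool.map(leaf_digest, split(data)))  # map keeps input order
    return combine(leaves)
```

The point of the pattern is that the parallel version needs many chunks available at once, which is why the whole buffer (or a large part of it) has to be handed to the hasher in one call.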

But I don't know how these two points relate to each other.
ghisler(Author) wrote: "I don't know whether it's worth the hassle to do this."
Neither do I. To get precise numbers, you can compare TC's x64 speed using the provided v1.7.0 DLL against the reference utility, rebooting the PC after the first measurement:
https://github.com/BLAKE3-team/BLAKE3/releases/download/1.8.0/b3sum_windows_x64_bin.exe
Use your NVMe drive, of course, and the biggest file you have, to see what we could gain from multithreading.

Also, oneTBB lists Windows 10/11 as a requirement. IDK if that's acceptable for TC.
https://github.com/uxlfoundation/oneTBB/blob/master/SYSTEM_REQUIREMENTS.md#supported-operating-systems
Last edited by lelik007 on 2025-04-01, 17:18 UTC, edited 3 times in total.

Post by *lelik007 »

2ZoSTeR
If you have some time and it's not too much trouble, please measure your 125 GB backup file again with this version:
https://github.com/BLAKE3-team/BLAKE3/releases/download/1.5.5/b3sum_windows_x64_bin.exe
but right after you turn the PC on or reboot. There might be a different result.

Post by *ZoSTeR »

Average runtime for 10 measurements per version for a 125 GB compressed file:

b3sum v1.5.5
39.49 s

b3sum v1.8.0
38.51 s

There were fluctuations of ~2 s, and this was not a clean lab setup.

Post by *lelik007 »

2ZoSTeR
Thank you again for the testing and patience.

The loads you mentioned on your system for a task as simple as hashing do not make me very happy:
ZoSTeR wrote: "Besides the speed increase, the exe uses up all available RAM; not sure if this is a good or bad thing... the TC DLL uses barely any."
It seems the multi-threaded BLAKE3 places pieces of the 125 GB file in your RAM via the mmap function and works on each piece with 8 or 16 threads (IDK if it detects physical or logical cores). And a 60% load on a CPU as powerful as yours for such an insignificant speed boost... definitely not a good thing.

But this is the Rust code in b3sum_windows_x64_bin.exe, and I'm not sure whether the C code with oneTBB can do better.