| author | Eric Biggers <ebiggers@kernel.org> | 2025-10-25 22:50:29 -0700 |
|---|---|---|
| committer | Eric Biggers <ebiggers@kernel.org> | 2025-11-05 20:30:41 -0800 |
| commit | 862445d3b9e74f58360a7a89787da4dca783e6dd (patch) | |
| tree | b1c7d54983fffc71177adc5de221b7d920856307 /drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | |
| parent | 0354d3c1f1b8628e60eceb304b6d2ef75eea6f41 (diff) | |
lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions

Some z/Architecture processors can compute a SHA-3 digest in a single
instruction. arch/s390/crypto/ already uses this capability to optimize
the SHA-3 crypto_shash algorithms.
Use this capability to implement the sha3_224(), sha3_256(), sha3_384(),
and sha3_512() library functions too.
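As a userspace analogy only, the difference between a one-shot digest call and the incremental init/update/final flow can be sketched with Python's hashlib. This is not the kernel API this commit touches, and the helper names one_shot_sha3_256 and incremental_sha3_256 are hypothetical, introduced purely for illustration. The point is that both paths yield the same digest; a one-shot entry point simply lets an implementation process the whole message in a single step.

```python
import hashlib

def one_shot_sha3_256(data: bytes) -> bytes:
    """Hash the whole message in one call (hypothetical helper, userspace analogy)."""
    return hashlib.sha3_256(data).digest()

def incremental_sha3_256(chunks) -> bytes:
    """Hash the same message piece by piece, init/update/final style."""
    ctx = hashlib.sha3_256()      # init
    for chunk in chunks:
        ctx.update(chunk)         # update
    return ctx.digest()           # final

message = b"abc" * 100
# Split the message into 7-byte pieces to mimic a caller feeding data in parts.
pieces = [message[i:i + 7] for i in range(0, len(message), 7)]
same = one_shot_sha3_256(message) == incremental_sha3_256(pieces)
```

Here `same` is true for any way of splitting the message, which is why a caller that already holds the full buffer can use the one-shot form without changing the result.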
SHA3-256 benchmark results provided by Harald Freudenberger
(https://lore.kernel.org/r/4188d18bfcc8a64941c5ebd8de10ede2@linux.ibm.com/)
on a z/Architecture machine with "facility 86" (MSA level 12):
    Length (bytes)    Before (MB/s)    After (MB/s)
    ==============    =============    ============
                16              212             225
                64              820             915
               256             1850            3350
              1024             5400            8300
              4096            11200           11300
Note: the original data from Harald was given in the form of a graph for
each length, showing the distribution of throughputs from 500 runs. I
guesstimated the peak of each one.
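To make the relative gains easier to read, the speedup per length can be computed directly from the table above. This is a small sketch that uses only the estimated figures from that table; the names results, speedup, and best are hypothetical, and the ratios inherit the roughness of the estimates.

```python
# (length in bytes, before MB/s, after MB/s), copied from the table above.
results = [
    (16, 212, 225),
    (64, 820, 915),
    (256, 1850, 3350),
    (1024, 5400, 8300),
    (4096, 11200, 11300),
]

def speedup(before: float, after: float) -> float:
    """Ratio of throughput after the change to throughput before it."""
    return after / before

# Print one line per message length with its speedup factor.
for length, before, after in results:
    print(f"{length:>5} bytes: {speedup(before, after):.2f}x")

# The length with the largest relative gain.
best = max(results, key=lambda row: speedup(row[1], row[2]))
```

The largest ratios fall on the mid-sized inputs, while the smallest and largest lengths change only slightly.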
Harald also reported that the generic SHA-3 code was at most 259 MB/s
(https://lore.kernel.org/r/c39f6b6c110def0095e5da5becc12085@linux.ibm.com/).
So as expected, the earlier commit that optimized sha3_absorb_blocks()
and sha3_keccakf() is the more important one; it optimized the Keccak
permutation, which is the most performance-critical part of SHA-3.
Still, this additional commit does notably improve performance further
on some lengths, most visibly 256 and 1024 bytes in the table above.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20251026055032.1413733-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
