| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
"Fix a big endian specific issue in the PPC64-optimized AES code"
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
I finally got a big endian PPC64 kernel to boot in QEMU. The PPC64 VSX
optimized AES library code does work in that case, with the exception of
rndkey_from_vsx() which doesn't take into account that the order in
which the VSX code stores the round key words depends on the endianness.
So fix rndkey_from_vsx() to do the right thing on big endian CPUs.
Fixes: 7cf2082e74ce ("lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library")
Link: https://lore.kernel.org/r/20260216022104.332991-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core & protocols:
- A significant effort all around the stack to guide the compiler to
make the right choice when inlining code, to avoid unneeded calls
for small helper and stack canary overhead in the fast-path.
This generates better and faster code with very small or no text
size increases, as in many cases the call generated more code than
the actual inlined helper.
- Extend AccECN implementation so that is now functionally complete,
also allow the user-space enabling it on a per network namespace
basis.
- Add support for memory providers with large (above 4K) rx buffer.
Paired with hw-gro, larger rx buffer sizes reduce the number of
buffers traversing the stack, dincreasing single stream CPU usage
by up to ~30%.
- Do not add HBH header to Big TCP GSO packets. This simplifies the
RX path, the TX path and the NIC drivers, and is possible because
user-space taps can now interpret correctly such packets without
the HBH hint.
- Allow IPv6 routes to be configured with a gateway address that is
resolved out of a different interface than the one specified,
aligning IPv6 to IPv4 behavior.
- Multi-queue aware sch_cake. This makes it possible to scale the
rate shaper of sch_cake across multiple CPUs, while still enforcing
a single global rate on the interface.
- Add support for the nbcon (new buffer console) infrastructure to
netconsole, enabling lock-free, priority-based console operations
that are safer in crash scenarios.
- Improve the TCP ipv6 output path to cache the flow information,
saving cpu cycles, reducing cache line misses and stack use.
- Improve netfilter packet tracker to resolve clashes for most
protocols, avoiding unneeded drops on rare occasions.
- Add IP6IP6 tunneling acceleration to the flowtable infrastructure.
- Reduce tcp socket size by one cache line.
- Notify neighbour changes atomically, avoiding inconsistencies
between the notification sequence and the actual states sequence.
- Add vsock namespace support, allowing complete isolation of vsocks
across different network namespaces.
- Improve xsk generic performances with cache-alignment-oriented
optimizations.
- Support netconsole automatic target recovery, allowing netconsole
to reestablish targets when underlying low-level interface comes
back online.
Driver API:
- Support for switching the working mode (automatic vs manual) of a
DPLL device via netlink.
- Introduce PHY ports representation to expose multiple front-facing
media ports over a single MAC.
- Introduce "rx-polarity" and "tx-polarity" device tree properties,
to generalize polarity inversion requirements for differential
signaling.
- Add helper to create, prepare and enable managed clocks.
Device drivers:
- Add Huawei hinic3 PF etherner driver.
- Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet
controller.
- Add ethernet driver for MaxLinear MxL862xx switches
- Remove parallel-port Ethernet driver.
- Convert existing driver timestamp configuration reporting to
hwtstamp_get and remove legacy ioctl().
- Convert existing drivers to .get_rx_ring_count(), simplifing the RX
ring count retrieval. Also remove the legacy fallback path.
- Ethernet high-speed NICs:
- Broadcom (bnxt, bng):
- bnxt: add FW interface update to support FEC stats histogram
and NVRAM defragmentation
- bng: add TSO and H/W GRO support
- nVidia/Mellanox (mlx5):
- improve latency of channel restart operations, reducing the
used H/W resources
- add TSO support for UDP over GRE over VLAN
- add flow counters support for hardware steering (HWS) rules
- use a static memory area to store headers for H/W GRO,
leading to 12% RX tput improvement
- Intel (100G, ice, idpf):
- ice: reorganizes layout of Tx and Rx rings for cacheline
locality and utilizes __cacheline_group* macros on the new
layouts
- ice: introduces Synchronous Ethernet (SyncE) support
- Meta (fbnic):
- adds debugfs for firmware mailbox and tx/rx rings vectors
- Ethernet virtual:
- geneve: introduce GRO/GSO support for double UDP encapsulation
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- some code refactoring and cleanups
- RealTek (r8169):
- add support for RTL8127ATF (10G Fiber SFP)
- add dash and LTR support
- Airoha:
- AN8811HB 2.5 Gbps phy support
- Freescale (fec):
- add XDP zero-copy support
- Thunderbolt:
- add get link setting support to allow bonding
- Renesas:
- add support for RZ/G3L GBETH SoC
- Ethernet switches:
- Maxlinear:
- support R(G)MII slow rate configuration
- add support for Intel GSW150
- Motorcomm (yt921x):
- add DCB/QoS support
- TI:
- icssm-prueth: support bridging (STP/RSTP) via the switchdev
framework
- Ethernet PHYs:
- Realtek:
- enable SGMII and 2500Base-X in-band auto-negotiation
- simplify and reunify C22/C45 drivers
- Micrel: convert bindings to DT schema
- CAN:
- move skb headroom content into skb extensions, making CAN
metadata access more robust
- CAN drivers:
- rcar_canfd:
- add support for FD-only mode
- add support for the RZ/T2H SoC
- sja1000: cleanup the CAN state handling
- WiFi:
- implement EPPKE/802.1X over auth frames support
- split up drop reasons better, removing generic RX_DROP
- additional FTM capabilities: 6 GHz support, supported number of
spatial streams and supported number of LTF repetitions
- better mac80211 iterators to enumerate resources
- initial UHR (Wi-Fi 8) support for cfg80211/mac80211
- WiFi drivers:
- Qualcomm/Atheros:
- ath11k: support for Channel Frequency Response measurement
- ath12k: a significant driver refactor to support multi-wiphy
devices and and pave the way for future device support in the
same driver (rather than splitting to ath13k)
- ath12k: support for the QCC2072 chipset
- Intel:
- iwlwifi: partial Neighbor Awareness Networking (NAN) support
- iwlwifi: initial support for U-NII-9 and IEEE 802.11bn
- RealTek (rtw89):
- preparations for RTL8922DE support
- Bluetooth:
- implement setsockopt(BT_PHY) to set the connection packet type/PHY
- set link_policy on incoming ACL connections
- Bluetooth drivers:
- btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE
- btqca: add WCN6855 firmware priority selection feature"
* tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits)
bnge/bng_re: Add a new HSI
net: macb: Fix tx/rx malfunction after phy link down and up
af_unix: Fix memleak of newsk in unix_stream_connect().
net: ti: icssg-prueth: Add optional dependency on HSR
net: dsa: add basic initial driver for MxL862xx switches
net: mdio: add unlocked mdiodev C45 bus accessors
net: dsa: add tag format for MxL862xx switches
dt-bindings: net: dsa: add MaxLinear MxL862xx
selftests: drivers: net: hw: Modify toeplitz.c to poll for packets
octeontx2-pf: Unregister devlink on probe failure
net: renesas: rswitch: fix forwarding offload statemachine
ionic: Rate limit unknown xcvr type messages
tcp: inet6_csk_xmit() optimization
tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock()
tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect()
ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6
ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update()
ipv6: use np->final in inet6_sk_rebuild_header()
ipv6: add daddr/final storage in struct ipv6_pinfo
net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup()
...
|
|
mldsa_verify() implements ML-DSA.Verify with ctx='', so document this
more explicitly. Remove the one-liner comment above mldsa_verify()
which was somewhat misleading.
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20260202221552.174341-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Now that there are no users of the low-level SHA-1 interface, remove it.
Specifically:
- Remove SHA1_DIGEST_WORDS (no longer used)
- Remove sha1_init_raw() (no longer used)
- Rename sha1_transform() to sha1_block_generic() and make it static
- Move SHA1_WORKSPACE_WORDS into lib/crypto/sha1.c
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260123051656.396371-3-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The volatile keyword is no longer necessary or useful on aes_sbox and
aes_inv_sbox, since the table prefetching is now done using a helper
function that casts to volatile itself and also includes an optimization
barrier. Since it prevents some compiler optimizations, remove it.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-36-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Now that all callers of the aes_encrypt() and aes_decrypt() type-generic
macros are using the new types, remove the old functions.
Then, replace the macro with direct calls to the new functions, dropping
the "_new" suffix from them.
This completes the change in the type of the key struct that is passed
to aes_encrypt() and aes_decrypt().
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-35-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey). This
eliminates the unnecessary computation and caching of the decryption
round keys. The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.
Note that in addition to the change in the key preparation function and
the key struct type itself, the change in the type of the key struct
results in aes_encrypt() (which is temporarily a type-generic macro)
calling the new encryption function rather than the old one.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-34-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey). This
eliminates the unnecessary computation and caching of the decryption
round keys. The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.
Note that in addition to the change in the key preparation function and
the key struct type itself, the change in the type of the key struct
results in aes_encrypt() (which is temporarily a type-generic macro)
calling the new encryption function rather than the old one.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-33-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Optimize the AES library with x86 AES-NI instructions.
The relevant existing assembly functions, aesni_set_key(), aesni_enc(),
and aesni_dec(), are a bit difficult to extract into the library:
- They're coupled to the code for the AES modes.
- They operate on struct crypto_aes_ctx. The AES library now uses
different structs.
- They assume the key is 16-byte aligned. The AES library only
*prefers* 16-byte alignment; it doesn't require it.
Moreover, they're not all that great in the first place:
- They use unrolled loops, which isn't a great choice on x86.
- They use the 'aeskeygenassist' instruction, which is unnecessary, is
slow on Intel CPUs, and forces the loop to be unrolled.
- They have special code for AES-192 key expansion, despite that being
kind of useless. AES-128 and AES-256 are the ones used in practice.
These are small functions anyway.
Therefore, I opted to just write replacements of these functions for the
library. They address all the above issues.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-18-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the SPARC64 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the "aes-sparc64" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs use the
SPARC64 AES opcodes, whereas previously only crypto_cipher did (and it
wasn't enabled by default, which this commit fixes as well).
Note that some of the functions in the SPARC64 AES assembly code are
still used by the AES mode implementations in
arch/sparc/crypto/aes_glue.c. For now, just export these functions.
These exports will go away once the AES mode implementations are
migrated to the library as well. (Trying to split up the assembly file
seemed like much more trouble than it would be worth.)
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-17-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Implement aes_preparekey_arch(), aes_encrypt_arch(), and
aes_decrypt_arch() using the CPACF AES instructions.
Then, remove the superseded "aes-s390" crypto_cipher.
The result is that both the AES library and crypto_cipher APIs use the
CPACF AES instructions, whereas previously only crypto_cipher did (and
it wasn't enabled by default, which this commit fixes as well).
Note that this preserves the optimization where the AES key is stored in
raw form rather than expanded form. CPACF just takes the raw key.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20260112192035.10427-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the aes_encrypt_zvkned() and aes_decrypt_zvkned() assembly
functions into lib/crypto/, wire them up to the AES library API, and
remove the "aes-riscv64-zvkned" crypto_cipher algorithm.
To make this possible, change the prototypes of these functions to
take (rndkeys, key_len) instead of a pointer to crypto_aes_ctx, and
change the RISC-V AES-XTS code to implement tweak encryption using the
AES library instead of directly calling aes_encrypt_zvkned().
The result is that both the AES library and crypto_cipher APIs use
RISC-V's AES instructions, whereas previously only crypto_cipher did
(and it wasn't enabled by default, which this commit fixes as well).
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the POWER8 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "p8_aes" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs are now
optimized for POWER8, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).
Note that many of the functions in the POWER8 assembly code are still
used by the AES mode implementations in arch/powerpc/crypto/. For now,
just export these functions. These exports will go away once the AES
modes are migrated to the library as well. (Trying to split up the
assembly file seemed like much more trouble than it would be worth.)
Another challenge with this code is that the POWER8 assembly code uses a
custom format for the expanded AES key. Since that code is imported
from OpenSSL and is also targeted to POWER8 (rather than POWER9 which
has better data movement and byteswap instructions), that is not easily
changed. For now I've just kept the custom format. To maintain full
correctness, this requires executing some slow fallback code in the case
where the usability of VSX changes between key expansion and use. This
should be tolerable, as this case shouldn't happen in practice.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the PowerPC SPE AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "aes-ppc-spe" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs are now
optimized with SPE, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).
Note that many of the functions in the PowerPC SPE assembly code are
still used by the AES mode implementations in arch/powerpc/crypto/. For
now, just export these functions. These exports will go away once the
AES modes are migrated to the library as well. (Trying to split up the
assembly files seemed like much more trouble than it would be worth.)
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the ARM64 optimized AES key expansion and single-block AES
en/decryption code into lib/crypto/, wire it up to the AES library API,
and remove the superseded crypto_cipher algorithms.
The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM64, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).
Note: to see the diff from arch/arm64/crypto/aes-ce-glue.c to
lib/crypto/arm64/aes.h, view this commit with 'git show -M10'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the ARM optimized single-block AES en/decryption code into
lib/crypto/, wire it up to the AES library API, and remove the
superseded "aes-arm" crypto_cipher algorithm.
The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
The kernel's AES library currently has the following issues:
- It doesn't take advantage of the architecture-optimized AES code,
including the implementations using AES instructions.
- It's much slower than even the other software AES implementations: 2-4
times slower than "aes-generic", "aes-arm", and "aes-arm64".
- It requires that both the encryption and decryption round keys be
computed and cached. This is wasteful for users that need only the
forward (encryption) direction of the cipher: the key struct is 484
bytes when only 244 are actually needed. This missed optimization is
very common, as many AES modes (e.g. GCM, CFB, CTR, CMAC, and even the
tweak key in XTS) use the cipher only in the forward (encryption)
direction even when doing decryption.
- It doesn't provide the flexibility to customize the prepared key
format. The API is defined to do key expansion, and several callers
in drivers/crypto/ use it specifically to expand the key. This is an
issue when integrating the existing powerpc, s390, and sparc code,
which is necessary to provide full parity with the traditional API.
To resolve these issues, I'm proposing the following changes:
1. New structs 'aes_key' and 'aes_enckey' are introduced, with
corresponding functions aes_preparekey() and aes_prepareenckey().
Generally these structs will include the encryption+decryption round
keys and the encryption round keys, respectively. However, the exact
format will be under control of the architecture-specific AES code.
(The verb "prepare" is chosen over "expand" since key expansion isn't
necessarily done. It's also consistent with hmac*_preparekey().)
2. aes_encrypt() and aes_decrypt() will be changed to operate on the new
structs instead of struct crypto_aes_ctx.
3. aes_encrypt() and aes_decrypt() will use architecture-optimized code
when available, or else fall back to a new generic AES implementation
that unifies the existing two fragmented generic AES implementations.
The new generic AES implementation uses tables for both SubBytes and
MixColumns, making it almost as fast as "aes-generic". However,
instead of aes-generic's huge 8192-byte tables per direction, it uses
only 1024 bytes for encryption and 1280 bytes for decryption (similar
to "aes-arm"). The cost is just some extra rotations.
The new generic AES implementation also includes table prefetching,
making it have some "constant-time hardening". That's an improvement
from aes-generic which has no constant-time hardening.
It does slightly regress in constant-time hardening vs. the old
lib/crypto/aes.c which had smaller tables, and from aes-fixed-time
which disabled IRQs on top of that. But I think this is tolerable.
The real solutions for constant-time AES are AES instructions or
bit-slicing. The table-based code remains a best-effort fallback for
the increasingly-rare case where a real solution is unavailable.
4. crypto_aes_ctx and aes_expandkey() will remain for now, but only for
callers that are using them specifically for the AES key expansion
(as opposed to en/decrypting data with the AES library).
This commit begins the migration process by introducing the new structs
and functions, backed by the new generic AES implementation.
To allow callers to be incrementally converted, aes_encrypt() and
aes_decrypt() are temporarily changed into macros that use a _Generic
expression to call either the old functions (which take crypto_aes_ctx)
or the new functions (which take the new types). Once all callers have
been updated, these macros will go away, the old functions will be
removed, and the "_new" suffix will be dropped from the new functions.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Since ML-DSA is FIPS-approved, add the boot-time self-test which is
apparently required.
Just add a test vector manually for now, borrowed from
lib/crypto/tests/mldsa-testvecs.h (where in turn it's borrowed from
leancrypto). The SHA-* FIPS test vectors are generated by
scripts/crypto/gen-fips-testvecs.py instead, but the common Python
libraries don't support ML-DSA yet.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20260107044215.109930-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Since the architecture-specific implementations of NH initialize memory
in assembly code, they aren't compatible with KMSAN as-is.
Fixes: 382de740759a ("lib/crypto: nh: Add NH library")
Link: https://lore.kernel.org/r/20260105053652.1708299-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
For the bitwise left rotation in MD5STEP, use rol32() from
<linux/bitops.h> instead of open-coding it.
Signed-off-by: Rusydi H. Makarim <rusydi.makarim@kriptograf.id>
Link: https://lore.kernel.org/r/20251214-rol32_in_md5-v1-1-20f5f11a92b2@kriptograf.id
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Migrate the x86_64 implementations of NH into lib/crypto/. This makes
the nh() function be optimized on x86_64 kernels.
Note: this temporarily makes the adiantum template not utilize the
x86_64 optimized NH code. This is resolved in a later commit that
converts the adiantum template to use nh() instead of "nhpoly1305".
Link: https://lore.kernel.org/r/20251211011846.8179-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Migrate the arm64 NEON implementation of NH into lib/crypto/. This
makes the nh() function be optimized on arm64 kernels.
Note: this temporarily makes the adiantum template not utilize the arm64
optimized NH code. This is resolved in a later commit that converts the
adiantum template to use nh() instead of "nhpoly1305".
Link: https://lore.kernel.org/r/20251211011846.8179-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Migrate the arm32 NEON implementation of NH into lib/crypto/. This
makes the nh() function be optimized on arm32 kernels.
Note: this temporarily makes the adiantum template not utilize the arm32
optimized NH code. This is resolved in a later commit that converts the
adiantum template to use nh() instead of "nhpoly1305".
Link: https://lore.kernel.org/r/20251211011846.8179-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add some simple KUnit tests for the nh() function.
These replace the test coverage which will be lost by removing the
nhpoly1305 crypto_shash.
Note that the NH code also continues to be tested indirectly as well,
via the tests for the "adiantum(xchacha12,aes)" crypto_skcipher.
Link: https://lore.kernel.org/r/20251211011846.8179-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add support for the NH "almost-universal hash function" to lib/crypto/,
specifically the variant of NH used in Adiantum.
This will replace the need for the "nhpoly1305" crypto_shash algorithm.
All the implementations of "nhpoly1305" use architecture-optimized code
only for the NH stage; they just use the generic C Poly1305 code for the
Poly1305 stage. We can achieve the same result in a simpler way using
an (architecture-optimized) nh() function combined with code in
crypto/adiantum.c that passes the results to the Poly1305 library.
This commit begins this cleanup by adding the nh() function. The code
is derived from crypto/nhpoly1305.c and include/crypto/nhpoly1305.h.
Link: https://lore.kernel.org/r/20251211011846.8179-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add a KUnit test suite for ML-DSA verification, including the following
for each ML-DSA parameter set (ML-DSA-44, ML-DSA-65, and ML-DSA-87):
- Positive test (valid signature), using vector imported from leancrypto
- Various negative tests:
- Wrong length for signature, message, or public key
- Out-of-range coefficients in z vector
- Invalid encoded hint vector
- Any bit flipped in signature, message, or public key
- Unit test for the internal function use_hint()
- A benchmark
ML-DSA inputs and outputs are very large. To keep the size of the tests
down, use just one valid test vector per parameter set, and generate the
negative tests at runtime by mutating the valid test vector.
I also considered importing the test vectors from Wycheproof. I've
tested that mldsa_verify() indeed passes all of Wycheproof's ML-DSA test
vectors that use an empty context string. However, importing these
permanently would add over 6 MB of source. That's too much to be a
reasonable addition to the Linux kernel tree for one algorithm. It also
wouldn't actually provide much better test coverage than this commit.
Another potential issue is that Wycheproof uses the Apache license.
Similarly, this also differs from the earlier proposal to import a long
list of test vectors from leancrypto. I retained only one valid
signature for each algorithm, and I also added (runtime-generated)
negative tests which were missing. I think this is a better tradeoff.
Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20251214181712.29132-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add support for verifying ML-DSA signatures.
ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is specified
in FIPS 204 and is the standard version of Dilithium. Unlike RSA and
elliptic-curve cryptography, ML-DSA is believed to be secure even
against adversaries in possession of a large-scale quantum computer.
Compared to the earlier patch
(https://lore.kernel.org/r/20251117145606.2155773-3-dhowells@redhat.com/)
that was based on "leancrypto", this implementation:
- Is about 700 lines of source code instead of 4800.
- Generates about 4 KB of object code instead of 28 KB.
- Uses 9-13 KB of memory to verify a signature instead of 31-84 KB.
- Is at least about the same speed, with a microbenchmark showing 3-5%
improvements on one x86_64 CPU and -1% to 1% changes on another.
When memory is a bottleneck, it's likely much faster.
- Correctly implements the RejNTTPoly step of the algorithm.
The API just consists of a single function mldsa_verify(), supporting
pure ML-DSA with any standard parameter set (ML-DSA-44, ML-DSA-65, or
ML-DSA-87) as selected by an enum. That's all that's actually needed.
The following four potential features are unneeded and aren't included.
However, any that ever become needed could fairly easily be added later,
as they only affect how the message representative mu is calculated:
- Nonempty context strings
- Incremental message hashing
- HashML-DSA
- External mu
Signing support would, of course, be a larger and more complex addition.
However, the kernel doesn't, and shouldn't, need ML-DSA signing support.
Note that mldsa_verify() allocates memory, so it can sleep and can fail
with ENOMEM. Unfortunately we don't have much choice about that, since
ML-DSA needs a lot of memory. At least callers have to check for errors
anyway, since the signature could be invalid.
Note that verification doesn't require constant-time code, and in fact
some steps are inherently variable-time. I've used constant-time
patterns in some places anyway, but technically they're not needed.
Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20251214181712.29132-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
__cacheline_aligned puts the data in the ".data..cacheline_aligned"
section, which isn't marked read-only i.e. it doesn't receive MMU
protection. Replace it with ____cacheline_aligned which does the right
thing and just aligns the data while keeping it in ".rodata".
Fixes: b5e0b032b6c3 ("crypto: aes - add generic time invariant AES cipher")
Cc: stable@vger.kernel.org
Reported-by: Qingfang Deng <dqfext@gmail.com>
Closes: https://lore.kernel.org/r/20260105074712.498-1-dqfext@gmail.com/
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260107052023.174620-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
On my development machine the generic, memcpy()-only implementation of
polyval_preparekey() is too fast for the IRQ workers to actually fire.
The test fails.
Increase the iterations to make the test more robust.
The test will run for a maximum of one second in any case.
[EB: This failure was already fixed by commit c31f4aa8fed0 ("kunit:
Enforce task execution in {soft,hard}irq contexts"). I'm still applying
this patch too, since the iteration count in this test made its running
time much shorter than the other similar ones.]
Fixes: b3aed551b3fc ("lib/crypto: tests: Add KUnit tests for POLYVAL")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260102-kunit-polyval-fix-v1-1-5313b5a65f35@linutronix.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
poly1305-core.S is an auto-generated file, so it should be ignored.
Fixes: bef9c7559869 ("lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Link: https://lore.kernel.org/r/20251212184717.133701-1-cmirabil@redhat.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
As we're doing in the BLAKE2b code, use unrolled_full to make the
compiler handle the loop unrolling. This simplifies the code slightly.
The generated object code is nearly the same with both gcc and clang.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251205051155.25274-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
BLAKE2b has a state of 16 64-bit words. Add the message data in and
there are 32 64-bit words. With the current code where all the rounds
are unrolled to enable constant-folding of the blake2b_sigma values,
this results in a very large code size on 32-bit kernels, including a
recurring issue where gcc uses a large amount of stack.
There's just not much benefit to this unrolling when the code is already
so large. Let's roll up the rounds when !CONFIG_64BIT.
To avoid having to duplicate the code, just write the code once using a
loop, and conditionally use 'unrolled_full' from <linux/unroll.h>.
Then, fold the now-unneeded ROUND() macro into the loop. Finally, also
remove the now-unneeded override of the stack frame size warning.
Code size improvements for blake2b_compress_generic():
Size before (bytes) Size after (bytes)
------------------- ------------------
i386, gcc 27584 3632
i386, clang 18208 3248
arm32, gcc 19912 2860
arm32, clang 21336 3344
Running the BLAKE2b benchmark on a !CONFIG_64BIT kernel on an x86_64
processor shows a 16384B throughput change of 351 => 340 MB/s (gcc) or
442 MB/s => 375 MB/s (clang). So clearly not much of a slowdown either.
But also that microbenchmark also effectively disregards cache usage,
which is important in practice and is far better in the smaller code.
Note: If we rolled up the loop on x86_64 too, the change would be
7024 bytes => 1584 bytes and 1960 MB/s => 1396 MB/s (gcc), or
6848 bytes => 1696 bytes and 1920 MB/s => 1263 MB/s (clang).
Maybe still worth it, though not quite as clearly beneficial.
Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251205050330.89704-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Replace the RISCV_ISA_V dependency of the RISC-V crypto code with
RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as
well as vector unaligned accesses being efficient.
This is necessary because this code assumes that vector unaligned
accesses are supported and are efficient. (It does so to avoid having
to use lots of extra vsetvli instructions to switch the element width
back and forth between 8 and either 32 or 64.)
This was omitted from the code originally just because the RISC-V kernel
support for detecting this feature didn't exist yet. Support has now
been added, but it's fragmented into per-CPU runtime detection, a
command-line parameter, and a kconfig option. The kconfig option is the
only reasonable way to do it, though, so let's just rely on that.
Fixes: eb24af5d7a05 ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}")
Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Fixes: 600a3853dfa0 ("crypto: riscv - add vector crypto accelerated GHASH")
Fixes: 8c8e40470ffe ("crypto: riscv - add vector crypto accelerated SHA-{256,224}")
Fixes: b3415925a08b ("crypto: riscv - add vector crypto accelerated SHA-{512,384}")
Fixes: 563a5255afa2 ("crypto: riscv - add vector crypto accelerated SM3")
Fixes: b8d06352bbf3 ("crypto: riscv - add vector crypto accelerated SM4")
Cc: stable@vger.kernel.org
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>
Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/
Reviewed-by: Jerry Shih <jerry.shih@sifive.com>
Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
In chacha_zvkb, avoid using the s0 register, which is the frame pointer,
by reallocating KEY0 to t5. This makes stack traces available if e.g. a
crash happens in chacha_zvkb.
No frame pointer maintenance is otherwise required since this is a leaf
function.
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@iscas.ac.cn
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Rewrite memcpy_sglist from scratch
- Add on-stack AEAD request allocation
- Fix partial block processing in ahash
Algorithms:
- Remove ansi_cprng
- Remove tcrypt tests for poly1305
- Fix EINPROGRESS processing in authenc
- Fix double-free in zstd
Drivers:
- Use drbg ctr helper when reseeding xilinx-trng
- Add support for PCI device 0x115A to ccp
- Add support of paes in caam
- Add support for aes-xts in dthev2
Others:
- Use likely in rhashtable lookup
- Fix lockdep false-positive in padata by removing a helper"
* tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
crypto: zstd - fix double-free in per-CPU stream cleanup
crypto: ahash - Zero positive err value in ahash_update_finish
crypto: ahash - Fix crypto_ahash_import with partial block data
crypto: lib/mpi - use min() instead of min_t()
crypto: ccp - use min() instead of min_t()
hwrng: core - use min3() instead of nested min_t()
crypto: aesni - ctr_crypt() use min() instead of min_t()
crypto: drbg - Delete unused ctx from struct sdesc
crypto: testmgr - Add missing DES weak and semi-weak key tests
Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist"
crypto: scatterwalk - Fix memcpy_sglist() to always succeed
crypto: iaa - Request to add Kanchana P Sridhar to Maintainers.
crypto: tcrypt - Remove unused poly1305 support
crypto: ansi_cprng - Remove unused ansi_cprng algorithm
crypto: asymmetric_keys - fix uninitialized pointers with free attribute
KEYS: Avoid -Wflex-array-member-not-at-end warning
crypto: ccree - Correctly handle return of sg_nents_for_len
crypto: starfive - Correctly handle return of sg_nents_for_len
crypto: iaa - Fix incorrect return value in save_iaa_wq()
crypto: zstd - Remove unnecessary size_t cast
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers:
"This is a core arm64 change. However, I was asked to take this because
most uses of kernel-mode FPSIMD are in crypto or CRC code.
In v6.8, the size of task_struct on arm64 increased by 528 bytes due
to the new 'kernel_fpsimd_state' field. This field was added to allow
kernel-mode FPSIMD code to be preempted.
Unfortunately, 528 bytes is kind of a lot for task_struct. This
regression in the task_struct size was noticed and reported.
Recover that space by making this state be allocated on the stack at
the beginning of each kernel-mode FPSIMD section.
To make it easier for all the users of kernel-mode FPSIMD to do that
correctly, introduce and use a 'scoped_ksimd' abstraction"
* tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits)
lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
lib/crypto: arm/blake2b: Move to scoped ksimd API
arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
arm64/fpu: Enforce task-context only for generic kernel mode FPU
net/mlx5: Switch to more abstract scoped ksimd guard API on arm64
arm64/xorblocks: Switch to 'ksimd' scoped guard API
crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
raid6: Move to more abstract 'ksimd' guard API
crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull 'at_least' array size update from Eric Biggers:
"C supports lower bounds on the sizes of array parameters, using the
static keyword as follows: 'void f(int a[static 32]);'. This allows
the compiler to warn about a too-small array being passed.
As discussed, this reuse of the 'static' keyword, while standard, is a
bit obscure. Therefore, add an alias 'at_least' to compiler_types.h.
Then, add this 'at_least' annotation to the array parameters of
various crypto library functions"
* tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: sha2: Add at_least decoration to fixed-size array params
lib/crypto: sha1: Add at_least decoration to fixed-size array params
lib/crypto: poly1305: Add at_least decoration to fixed-size array params
lib/crypto: md5: Add at_least decoration to fixed-size array params
lib/crypto: curve25519: Add at_least decoration to fixed-size array params
lib/crypto: chacha: Add at_least decoration to fixed-size array params
lib/crypto: chacha20poly1305: Statically check fixed array lengths
compiler_types: introduce at_least parameter decoration pseudo keyword
wifi: iwlwifi: trans: rename at_least variable to min_mode
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:
- Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the
algorithms that have new crypto library interfaces this cycle.
- Remove the crypto_shash POLYVAL tests. They're no longer needed
because POLYVAL support was removed from crypto_shash. Better POLYVAL
test coverage is now provided via the KUnit test suite.
* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
crypto: testmgr - Remove polyval tests
lib/crypto: tests: Add KUnit tests for POLYVAL
lib/crypto: tests: Add additional SHAKE tests
lib/crypto: tests: Add SHA3 kunit tests
lib/crypto: tests: Add KUnit tests for BLAKE2b
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
"This is the main crypto library pull request for 6.19. It includes:
- Add SHA-3 support to lib/crypto/, including support for both the
hash functions and the extendable-output functions. Reimplement the
existing SHA-3 crypto_shash support on top of the library.
This is motivated mainly by the upcoming support for the ML-DSA
signature algorithm, which needs the SHAKE128 and SHAKE256
functions. But even on its own it's a useful cleanup.
This also fixes the longstanding issue where the
architecture-optimized SHA-3 code was disabled by default.
- Add BLAKE2b support to lib/crypto/, and reimplement the existing
BLAKE2b crypto_shash support on top of the library.
This is motivated mainly by btrfs, which supports BLAKE2b
checksums. With this change, all btrfs checksum algorithms now have
library APIs. btrfs is planned to start just using the library
directly.
This refactor also improves consistency between the BLAKE2b code
and BLAKE2s code. And as usual, it also fixes the issue where the
architecture-optimized BLAKE2b code was disabled by default.
- Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
support in crypto_shash. Reimplement HCTR2 on top of the library.
This simplifies the code and improves HCTR2 performance. As usual,
it also makes the architecture-optimized code be enabled by
default. The generic implementation of POLYVAL is greatly improved
as well.
- Clean up the BLAKE2s code
- Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
crypto: polyval - Remove the polyval crypto_shash
crypto: hctr2 - Convert to use POLYVAL library
lib/crypto: x86/polyval: Migrate optimized code into library
lib/crypto: arm64/polyval: Migrate optimized code into library
lib/crypto: polyval: Add POLYVAL library
crypto: polyval - Rename conflicting functions
lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
lib/crypto: x86/blake2s: Improve readability
lib/crypto: x86/blake2s: Use local labels for data
lib/crypto: x86/blake2s: Drop check for nblocks == 0
lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
lib/crypto: arm, arm64: Drop filenames from file comments
lib/crypto: arm/blake2s: Fix some comments
crypto: s390/sha3 - Remove superseded SHA-3 code
crypto: sha3 - Reimplement using library API
crypto: jitterentropy - Use default sha3 implementation
lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
lib/crypto: sha3: Support arch overrides of one-shot digest functions
...
|
|
min_t(unsigned int, a, b) casts an 'unsigned long' to 'unsigned int'.
Use min(a, b) instead as it promotes any 'unsigned int' to 'unsigned long'
and so cannot discard significant bits.
In this case the 'unsigned long' value is small enough that the result
is ok.
Detected by an extra check added to min_t().
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Several parameters of the chacha20poly1305 functions require arrays of
an exact length. Use the new at_least keyword to instruct gcc and
clang to statically check that the caller is passing an object of at
least that length.
Here it is in action, with this faulty patch to wireguard's cookie.h:
struct cookie_checker {
u8 secret[NOISE_HASH_LEN];
- u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN];
+ u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN - 1];
u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN];
If I try compiling this code, I get this helpful warning:
CC drivers/net/wireguard/cookie.o
drivers/net/wireguard/cookie.c: In function ‘wg_cookie_message_create’:
drivers/net/wireguard/cookie.c:193:9: warning: ‘xchacha20poly1305_encrypt’ reading 32 bytes from a region of size 31 [-Wstringop-overread]
193 | xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
194 | macs->mac1, COOKIE_LEN, dst->nonce,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
195 | checker->cookie_encryption_key);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireguard/cookie.c:193:9: note: referencing argument 7 of type ‘const u8 *’ {aka ‘const unsigned char *’}
In file included from drivers/net/wireguard/messages.h:10,
from drivers/net/wireguard/cookie.h:9,
from drivers/net/wireguard/cookie.c:6:
include/crypto/chacha20poly1305.h:28:6: note: in a call to function ‘xchacha20poly1305_encrypt’
28 | void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251123054819.2371989-4-Jason@zx2c4.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Fully initialize *ctx, including the buf field which sha256_init()
doesn't initialize, to avoid a KMSAN warning when comparing *ctx to
orig_ctx. This KMSAN warning slipped in while KMSAN was not working
reliably due to a stackdepot bug, which has now been fixed.
Fixes: 6733968be7cb ("lib/crypto: tests: Add tests and benchmark for sha256_finup_2x()")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251121033431.34406-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move the arm64 implementations of SHA-3 and POLYVAL to the newly
introduced scoped ksimd API, which replaces kernel_neon_begin() and
kernel_neon_end(). On arm64, this is needed because the latter API
will change in an incompatible manner.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Even though ARM's versions of kernel_neon_begin()/_end() are not being
changed, update the newly migrated ARM blake2b to the scoped ksimd API
so that all ARM and arm64 in lib/crypto remains consistent in this
manner.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Pull scoped ksimd API for ARM and arm64 from Ard Biesheuvel:
"Introduce a more strict replacement API for
kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and
replace occurrences of the latter pair appearing in lib/crypto"
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Before modifying the prototypes of kernel_neon_begin() and
kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers
allocated on the stack, move arm64 to the new 'ksimd' scoped guard API,
which encapsulates the calls to those functions.
For symmetry, do the same for 32-bit ARM too.
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Add a test suite for the POLYVAL library, including:
- All the standard tests and the benchmark from hash-test-template.h
- Comparison with a test vector from the RFC
- Test with key and message containing all one bits
- Additional tests related to the key struct
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|