user/sven/linux.git - Linux Kernel

Age	Commit message (Collapse)	Author
6 days	Merge tag 'libcrypto-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fix from Eric Biggers: "Fix a big endian specific issue in the PPC64-optimized AES code" * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs
7 days	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument	Linus Torvalds
	This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/\(alloc_objs(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 days	treewide: Replace kmalloc with kmalloc_obj for non-scalar types	Kees Cook
	This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>
10 days	lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs	Eric Biggers
	I finally got a big endian PPC64 kernel to boot in QEMU. The PPC64 VSX optimized AES library code does work in that case, with the exception of rndkey_from_vsx() which doesn't take into account that the order in which the VSX code stores the round key words depends on the endianness. So fix rndkey_from_vsx() to do the right thing on big endian CPUs. Fixes: 7cf2082e74ce ("lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library") Link: https://lore.kernel.org/r/20260216022104.332991-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-02-11	Merge tag 'net-next-7.0' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core & protocols: - A significant effort all around the stack to guide the compiler to make the right choice when inlining code, to avoid unneeded calls for small helper and stack canary overhead in the fast-path. This generates better and faster code with very small or no text size increases, as in many cases the call generated more code than the actual inlined helper. - Extend AccECN implementation so that is now functionally complete, also allow the user-space enabling it on a per network namespace basis. - Add support for memory providers with large (above 4K) rx buffer. Paired with hw-gro, larger rx buffer sizes reduce the number of buffers traversing the stack, dincreasing single stream CPU usage by up to ~30%. - Do not add HBH header to Big TCP GSO packets. This simplifies the RX path, the TX path and the NIC drivers, and is possible because user-space taps can now interpret correctly such packets without the HBH hint. - Allow IPv6 routes to be configured with a gateway address that is resolved out of a different interface than the one specified, aligning IPv6 to IPv4 behavior. - Multi-queue aware sch_cake. This makes it possible to scale the rate shaper of sch_cake across multiple CPUs, while still enforcing a single global rate on the interface. - Add support for the nbcon (new buffer console) infrastructure to netconsole, enabling lock-free, priority-based console operations that are safer in crash scenarios. - Improve the TCP ipv6 output path to cache the flow information, saving cpu cycles, reducing cache line misses and stack use. - Improve netfilter packet tracker to resolve clashes for most protocols, avoiding unneeded drops on rare occasions. - Add IP6IP6 tunneling acceleration to the flowtable infrastructure. - Reduce tcp socket size by one cache line. - Notify neighbour changes atomically, avoiding inconsistencies between the notification sequence and the actual states sequence. - Add vsock namespace support, allowing complete isolation of vsocks across different network namespaces. - Improve xsk generic performances with cache-alignment-oriented optimizations. - Support netconsole automatic target recovery, allowing netconsole to reestablish targets when underlying low-level interface comes back online. Driver API: - Support for switching the working mode (automatic vs manual) of a DPLL device via netlink. - Introduce PHY ports representation to expose multiple front-facing media ports over a single MAC. - Introduce "rx-polarity" and "tx-polarity" device tree properties, to generalize polarity inversion requirements for differential signaling. - Add helper to create, prepare and enable managed clocks. Device drivers: - Add Huawei hinic3 PF etherner driver. - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet controller. - Add ethernet driver for MaxLinear MxL862xx switches - Remove parallel-port Ethernet driver. - Convert existing driver timestamp configuration reporting to hwtstamp_get and remove legacy ioctl(). - Convert existing drivers to .get_rx_ring_count(), simplifing the RX ring count retrieval. Also remove the legacy fallback path. - Ethernet high-speed NICs: - Broadcom (bnxt, bng): - bnxt: add FW interface update to support FEC stats histogram and NVRAM defragmentation - bng: add TSO and H/W GRO support - nVidia/Mellanox (mlx5): - improve latency of channel restart operations, reducing the used H/W resources - add TSO support for UDP over GRE over VLAN - add flow counters support for hardware steering (HWS) rules - use a static memory area to store headers for H/W GRO, leading to 12% RX tput improvement - Intel (100G, ice, idpf): - ice: reorganizes layout of Tx and Rx rings for cacheline locality and utilizes __cacheline_group* macros on the new layouts - ice: introduces Synchronous Ethernet (SyncE) support - Meta (fbnic): - adds debugfs for firmware mailbox and tx/rx rings vectors - Ethernet virtual: - geneve: introduce GRO/GSO support for double UDP encapsulation - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - some code refactoring and cleanups - RealTek (r8169): - add support for RTL8127ATF (10G Fiber SFP) - add dash and LTR support - Airoha: - AN8811HB 2.5 Gbps phy support - Freescale (fec): - add XDP zero-copy support - Thunderbolt: - add get link setting support to allow bonding - Renesas: - add support for RZ/G3L GBETH SoC - Ethernet switches: - Maxlinear: - support R(G)MII slow rate configuration - add support for Intel GSW150 - Motorcomm (yt921x): - add DCB/QoS support - TI: - icssm-prueth: support bridging (STP/RSTP) via the switchdev framework - Ethernet PHYs: - Realtek: - enable SGMII and 2500Base-X in-band auto-negotiation - simplify and reunify C22/C45 drivers - Micrel: convert bindings to DT schema - CAN: - move skb headroom content into skb extensions, making CAN metadata access more robust - CAN drivers: - rcar_canfd: - add support for FD-only mode - add support for the RZ/T2H SoC - sja1000: cleanup the CAN state handling - WiFi: - implement EPPKE/802.1X over auth frames support - split up drop reasons better, removing generic RX_DROP - additional FTM capabilities: 6 GHz support, supported number of spatial streams and supported number of LTF repetitions - better mac80211 iterators to enumerate resources - initial UHR (Wi-Fi 8) support for cfg80211/mac80211 - WiFi drivers: - Qualcomm/Atheros: - ath11k: support for Channel Frequency Response measurement - ath12k: a significant driver refactor to support multi-wiphy devices and and pave the way for future device support in the same driver (rather than splitting to ath13k) - ath12k: support for the QCC2072 chipset - Intel: - iwlwifi: partial Neighbor Awareness Networking (NAN) support - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn - RealTek (rtw89): - preparations for RTL8922DE support - Bluetooth: - implement setsockopt(BT_PHY) to set the connection packet type/PHY - set link_policy on incoming ACL connections - Bluetooth drivers: - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE - btqca: add WCN6855 firmware priority selection feature" * tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits) bnge/bng_re: Add a new HSI net: macb: Fix tx/rx malfunction after phy link down and up af_unix: Fix memleak of newsk in unix_stream_connect(). net: ti: icssg-prueth: Add optional dependency on HSR net: dsa: add basic initial driver for MxL862xx switches net: mdio: add unlocked mdiodev C45 bus accessors net: dsa: add tag format for MxL862xx switches dt-bindings: net: dsa: add MaxLinear MxL862xx selftests: drivers: net: hw: Modify toeplitz.c to poll for packets octeontx2-pf: Unregister devlink on probe failure net: renesas: rswitch: fix forwarding offload statemachine ionic: Rate limit unknown xcvr type messages tcp: inet6_csk_xmit() optimization tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock() tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect() ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6 ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update() ipv6: use np->final in inet6_sk_rebuild_header() ipv6: add daddr/final storage in struct ipv6_pinfo net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup() ...
2026-02-03	lib/crypto: mldsa: Clarify the documentation for mldsa_verify() slightly	Eric Biggers
	mldsa_verify() implements ML-DSA.Verify with ctx='', so document this more explicitly. Remove the one-liner comment above mldsa_verify() which was somewhat misleading. Reviewed-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20260202221552.174341-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-27	lib/crypto: sha1: Remove low-level functions from API	Eric Biggers
	Now that there are no users of the low-level SHA-1 interface, remove it. Specifically: - Remove SHA1_DIGEST_WORDS (no longer used) - Remove sha1_init_raw() (no longer used) - Rename sha1_transform() to sha1_block_generic() and make it static - Move SHA1_WORKSPACE_WORDS into lib/crypto/sha1.c Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20260123051656.396371-3-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15	lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox	Eric Biggers
	The volatile keyword is no longer necessary or useful on aes_sbox and aes_inv_sbox, since the table prefetching is now done using a helper function that casts to volatile itself and also includes an optimization barrier. Since it prevents some compiler optimizations, remove it. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-36-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15	lib/crypto: aes: Remove old AES en/decryption functions	Eric Biggers
	Now that all callers of the aes_encrypt() and aes_decrypt() type-generic macros are using the new types, remove the old functions. Then, replace the macro with direct calls to the new functions, dropping the "_new" suffix from them. This completes the change in the type of the key struct that is passed to aes_encrypt() and aes_decrypt(). Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-35-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15	lib/crypto: aesgcm: Use new AES library API	Eric Biggers
	Switch from the old AES library functions (which use struct crypto_aes_ctx) to the new ones (which use struct aes_enckey). This eliminates the unnecessary computation and caching of the decryption round keys. The new AES en/decryption functions are also much faster and use AES instructions when supported by the CPU. Note that in addition to the change in the key preparation function and the key struct type itself, the change in the type of the key struct results in aes_encrypt() (which is temporarily a type-generic macro) calling the new encryption function rather than the old one. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-34-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15	lib/crypto: aescfb: Use new AES library API	Eric Biggers
	Switch from the old AES library functions (which use struct crypto_aes_ctx) to the new ones (which use struct aes_enckey). This eliminates the unnecessary computation and caching of the decryption round keys. The new AES en/decryption functions are also much faster and use AES instructions when supported by the CPU. Note that in addition to the change in the key preparation function and the key struct type itself, the change in the type of the key struct results in aes_encrypt() (which is temporarily a type-generic macro) calling the new encryption function rather than the old one. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-33-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15	lib/crypto: x86/aes: Add AES-NI optimization	Eric Biggers
	Optimize the AES library with x86 AES-NI instructions. The relevant existing assembly functions, aesni_set_key(), aesni_enc(), and aesni_dec(), are a bit difficult to extract into the library: - They're coupled to the code for the AES modes. - They operate on struct crypto_aes_ctx. The AES library now uses different structs. - They assume the key is 16-byte aligned. The AES library only prefers 16-byte alignment; it doesn't require it. Moreover, they're not all that great in the first place: - They use unrolled loops, which isn't a great choice on x86. - They use the 'aeskeygenassist' instruction, which is unnecessary, is slow on Intel CPUs, and forces the loop to be unrolled. - They have special code for AES-192 key expansion, despite that being kind of useless. AES-128 and AES-256 are the ones used in practice. These are small functions anyway. Therefore, I opted to just write replacements of these functions for the library. They address all the above issues. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-18-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15	lib/crypto: sparc/aes: Migrate optimized code into library	Eric Biggers
	Move the SPARC64 AES assembly code into lib/crypto/, wire the key expansion and single-block en/decryption functions up to the AES library API, and remove the "aes-sparc64" crypto_cipher algorithm. The result is that both the AES library and crypto_cipher APIs use the SPARC64 AES opcodes, whereas previously only crypto_cipher did (and it wasn't enabled by default, which this commit fixes as well). Note that some of the functions in the SPARC64 AES assembly code are still used by the AES mode implementations in arch/sparc/crypto/aes_glue.c. For now, just export these functions. These exports will go away once the AES mode implementations are migrated to the library as well. (Trying to split up the assembly file seemed like much more trouble than it would be worth.) Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-17-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15	lib/crypto: s390/aes: Migrate optimized code into library	Eric Biggers
	Implement aes_preparekey_arch(), aes_encrypt_arch(), and aes_decrypt_arch() using the CPACF AES instructions. Then, remove the superseded "aes-s390" crypto_cipher. The result is that both the AES library and crypto_cipher APIs use the CPACF AES instructions, whereas previously only crypto_cipher did (and it wasn't enabled by default, which this commit fixes as well). Note that this preserves the optimization where the AES key is stored in raw form rather than expanded form. CPACF just takes the raw key. Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20260112192035.10427-16-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: riscv/aes: Migrate optimized code into library	Eric Biggers
	Move the aes_encrypt_zvkned() and aes_decrypt_zvkned() assembly functions into lib/crypto/, wire them up to the AES library API, and remove the "aes-riscv64-zvkned" crypto_cipher algorithm. To make this possible, change the prototypes of these functions to take (rndkeys, key_len) instead of a pointer to crypto_aes_ctx, and change the RISC-V AES-XTS code to implement tweak encryption using the AES library instead of directly calling aes_encrypt_zvkned(). The result is that both the AES library and crypto_cipher APIs use RISC-V's AES instructions, whereas previously only crypto_cipher did (and it wasn't enabled by default, which this commit fixes as well). Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-15-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library	Eric Biggers
	Move the POWER8 AES assembly code into lib/crypto/, wire the key expansion and single-block en/decryption functions up to the AES library API, and remove the superseded "p8_aes" crypto_cipher algorithm. The result is that both the AES library and crypto_cipher APIs are now optimized for POWER8, whereas previously only crypto_cipher was (and optimizations weren't enabled by default, which this commit fixes too). Note that many of the functions in the POWER8 assembly code are still used by the AES mode implementations in arch/powerpc/crypto/. For now, just export these functions. These exports will go away once the AES modes are migrated to the library as well. (Trying to split up the assembly file seemed like much more trouble than it would be worth.) Another challenge with this code is that the POWER8 assembly code uses a custom format for the expanded AES key. Since that code is imported from OpenSSL and is also targeted to POWER8 (rather than POWER9 which has better data movement and byteswap instructions), that is not easily changed. For now I've just kept the custom format. To maintain full correctness, this requires executing some slow fallback code in the case where the usability of VSX changes between key expansion and use. This should be tolerable, as this case shouldn't happen in practice. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-14-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: powerpc/aes: Migrate SPE optimized code into library	Eric Biggers
	Move the PowerPC SPE AES assembly code into lib/crypto/, wire the key expansion and single-block en/decryption functions up to the AES library API, and remove the superseded "aes-ppc-spe" crypto_cipher algorithm. The result is that both the AES library and crypto_cipher APIs are now optimized with SPE, whereas previously only crypto_cipher was (and optimizations weren't enabled by default, which this commit fixes too). Note that many of the functions in the PowerPC SPE assembly code are still used by the AES mode implementations in arch/powerpc/crypto/. For now, just export these functions. These exports will go away once the AES modes are migrated to the library as well. (Trying to split up the assembly files seemed like much more trouble than it would be worth.) Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-13-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: arm64/aes: Migrate optimized code into library	Eric Biggers
	Move the ARM64 optimized AES key expansion and single-block AES en/decryption code into lib/crypto/, wire it up to the AES library API, and remove the superseded crypto_cipher algorithms. The result is that both the AES library and crypto_cipher APIs are now optimized for ARM64, whereas previously only crypto_cipher was (and the optimizations weren't enabled by default, which this fixes as well). Note: to see the diff from arch/arm64/crypto/aes-ce-glue.c to lib/crypto/arm64/aes.h, view this commit with 'git show -M10'. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-12-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: arm/aes: Migrate optimized code into library	Eric Biggers
	Move the ARM optimized single-block AES en/decryption code into lib/crypto/, wire it up to the AES library API, and remove the superseded "aes-arm" crypto_cipher algorithm. The result is that both the AES library and crypto_cipher APIs are now optimized for ARM, whereas previously only crypto_cipher was (and the optimizations weren't enabled by default, which this fixes as well). Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-11-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: aes: Introduce improved AES library	Eric Biggers
	The kernel's AES library currently has the following issues: - It doesn't take advantage of the architecture-optimized AES code, including the implementations using AES instructions. - It's much slower than even the other software AES implementations: 2-4 times slower than "aes-generic", "aes-arm", and "aes-arm64". - It requires that both the encryption and decryption round keys be computed and cached. This is wasteful for users that need only the forward (encryption) direction of the cipher: the key struct is 484 bytes when only 244 are actually needed. This missed optimization is very common, as many AES modes (e.g. GCM, CFB, CTR, CMAC, and even the tweak key in XTS) use the cipher only in the forward (encryption) direction even when doing decryption. - It doesn't provide the flexibility to customize the prepared key format. The API is defined to do key expansion, and several callers in drivers/crypto/ use it specifically to expand the key. This is an issue when integrating the existing powerpc, s390, and sparc code, which is necessary to provide full parity with the traditional API. To resolve these issues, I'm proposing the following changes: 1. New structs 'aes_key' and 'aes_enckey' are introduced, with corresponding functions aes_preparekey() and aes_prepareenckey(). Generally these structs will include the encryption+decryption round keys and the encryption round keys, respectively. However, the exact format will be under control of the architecture-specific AES code. (The verb "prepare" is chosen over "expand" since key expansion isn't necessarily done. It's also consistent with hmac*_preparekey().) 2. aes_encrypt() and aes_decrypt() will be changed to operate on the new structs instead of struct crypto_aes_ctx. 3. aes_encrypt() and aes_decrypt() will use architecture-optimized code when available, or else fall back to a new generic AES implementation that unifies the existing two fragmented generic AES implementations. The new generic AES implementation uses tables for both SubBytes and MixColumns, making it almost as fast as "aes-generic". However, instead of aes-generic's huge 8192-byte tables per direction, it uses only 1024 bytes for encryption and 1280 bytes for decryption (similar to "aes-arm"). The cost is just some extra rotations. The new generic AES implementation also includes table prefetching, making it have some "constant-time hardening". That's an improvement from aes-generic which has no constant-time hardening. It does slightly regress in constant-time hardening vs. the old lib/crypto/aes.c which had smaller tables, and from aes-fixed-time which disabled IRQs on top of that. But I think this is tolerable. The real solutions for constant-time AES are AES instructions or bit-slicing. The table-based code remains a best-effort fallback for the increasingly-rare case where a real solution is unavailable. 4. crypto_aes_ctx and aes_expandkey() will remain for now, but only for callers that are using them specifically for the AES key expansion (as opposed to en/decrypting data with the AES library). This commit begins the migration process by introducing the new structs and functions, backed by the new generic AES implementation. To allow callers to be incrementally converted, aes_encrypt() and aes_decrypt() are temporarily changed into macros that use a _Generic expression to call either the old functions (which take crypto_aes_ctx) or the new functions (which take the new types). Once all callers have been updated, these macros will go away, the old functions will be removed, and the "_new" suffix will be dropped from the new functions. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260112192035.10427-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: mldsa: Add FIPS cryptographic algorithm self-test	Eric Biggers
	Since ML-DSA is FIPS-approved, add the boot-time self-test which is apparently required. Just add a test vector manually for now, borrowed from lib/crypto/tests/mldsa-testvecs.h (where in turn it's borrowed from leancrypto). The SHA-* FIPS test vectors are generated by scripts/crypto/gen-fips-testvecs.py instead, but the common Python libraries don't support ML-DSA yet. Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20260107044215.109930-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: nh: Restore dependency of arch code on !KMSAN	Eric Biggers
	Since the architecture-specific implementations of NH initialize memory in assembly code, they aren't compatible with KMSAN as-is. Fixes: 382de740759a ("lib/crypto: nh: Add NH library") Link: https://lore.kernel.org/r/20260105053652.1708299-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: md5: Use rol32() instead of open-coding it	Rusydi H. Makarim
	For the bitwise left rotation in MD5STEP, use rol32() from <linux/bitops.h> instead of open-coding it. Signed-off-by: Rusydi H. Makarim <rusydi.makarim@kriptograf.id> Link: https://lore.kernel.org/r/20251214-rol32_in_md5-v1-1-20f5f11a92b2@kriptograf.id Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: x86/nh: Migrate optimized code into library	Eric Biggers
	Migrate the x86_64 implementations of NH into lib/crypto/. This makes the nh() function be optimized on x86_64 kernels. Note: this temporarily makes the adiantum template not utilize the x86_64 optimized NH code. This is resolved in a later commit that converts the adiantum template to use nh() instead of "nhpoly1305". Link: https://lore.kernel.org/r/20251211011846.8179-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: arm64/nh: Migrate optimized code into library	Eric Biggers
	Migrate the arm64 NEON implementation of NH into lib/crypto/. This makes the nh() function be optimized on arm64 kernels. Note: this temporarily makes the adiantum template not utilize the arm64 optimized NH code. This is resolved in a later commit that converts the adiantum template to use nh() instead of "nhpoly1305". Link: https://lore.kernel.org/r/20251211011846.8179-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: arm/nh: Migrate optimized code into library	Eric Biggers
	Migrate the arm32 NEON implementation of NH into lib/crypto/. This makes the nh() function be optimized on arm32 kernels. Note: this temporarily makes the adiantum template not utilize the arm32 optimized NH code. This is resolved in a later commit that converts the adiantum template to use nh() instead of "nhpoly1305". Link: https://lore.kernel.org/r/20251211011846.8179-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: tests: Add KUnit tests for NH	Eric Biggers
	Add some simple KUnit tests for the nh() function. These replace the test coverage which will be lost by removing the nhpoly1305 crypto_shash. Note that the NH code also continues to be tested indirectly as well, via the tests for the "adiantum(xchacha12,aes)" crypto_skcipher. Link: https://lore.kernel.org/r/20251211011846.8179-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: nh: Add NH library	Eric Biggers
	Add support for the NH "almost-universal hash function" to lib/crypto/, specifically the variant of NH used in Adiantum. This will replace the need for the "nhpoly1305" crypto_shash algorithm. All the implementations of "nhpoly1305" use architecture-optimized code only for the NH stage; they just use the generic C Poly1305 code for the Poly1305 stage. We can achieve the same result in a simpler way using an (architecture-optimized) nh() function combined with code in crypto/adiantum.c that passes the results to the Poly1305 library. This commit begins this cleanup by adding the nh() function. The code is derived from crypto/nhpoly1305.c and include/crypto/nhpoly1305.h. Link: https://lore.kernel.org/r/20251211011846.8179-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: tests: Add KUnit tests for ML-DSA verification	Eric Biggers
	Add a KUnit test suite for ML-DSA verification, including the following for each ML-DSA parameter set (ML-DSA-44, ML-DSA-65, and ML-DSA-87): - Positive test (valid signature), using vector imported from leancrypto - Various negative tests: - Wrong length for signature, message, or public key - Out-of-range coefficients in z vector - Invalid encoded hint vector - Any bit flipped in signature, message, or public key - Unit test for the internal function use_hint() - A benchmark ML-DSA inputs and outputs are very large. To keep the size of the tests down, use just one valid test vector per parameter set, and generate the negative tests at runtime by mutating the valid test vector. I also considered importing the test vectors from Wycheproof. I've tested that mldsa_verify() indeed passes all of Wycheproof's ML-DSA test vectors that use an empty context string. However, importing these permanently would add over 6 MB of source. That's too much to be a reasonable addition to the Linux kernel tree for one algorithm. It also wouldn't actually provide much better test coverage than this commit. Another potential issue is that Wycheproof uses the Apache license. Similarly, this also differs from the earlier proposal to import a long list of test vectors from leancrypto. I retained only one valid signature for each algorithm, and I also added (runtime-generated) negative tests which were missing. I think this is a better tradeoff. Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20251214181712.29132-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12	lib/crypto: Add ML-DSA verification support	Eric Biggers
	Add support for verifying ML-DSA signatures. ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is specified in FIPS 204 and is the standard version of Dilithium. Unlike RSA and elliptic-curve cryptography, ML-DSA is believed to be secure even against adversaries in possession of a large-scale quantum computer. Compared to the earlier patch (https://lore.kernel.org/r/20251117145606.2155773-3-dhowells@redhat.com/) that was based on "leancrypto", this implementation: - Is about 700 lines of source code instead of 4800. - Generates about 4 KB of object code instead of 28 KB. - Uses 9-13 KB of memory to verify a signature instead of 31-84 KB. - Is at least about the same speed, with a microbenchmark showing 3-5% improvements on one x86_64 CPU and -1% to 1% changes on another. When memory is a bottleneck, it's likely much faster. - Correctly implements the RejNTTPoly step of the algorithm. The API just consists of a single function mldsa_verify(), supporting pure ML-DSA with any standard parameter set (ML-DSA-44, ML-DSA-65, or ML-DSA-87) as selected by an enum. That's all that's actually needed. The following four potential features are unneeded and aren't included. However, any that ever become needed could fairly easily be added later, as they only affect how the message representative mu is calculated: - Nonempty context strings - Incremental message hashing - HashML-DSA - External mu Signing support would, of course, be a larger and more complex addition. However, the kernel doesn't, and shouldn't, need ML-DSA signing support. Note that mldsa_verify() allocates memory, so it can sleep and can fail with ENOMEM. Unfortunately we don't have much choice about that, since ML-DSA needs a lot of memory. At least callers have to check for errors anyway, since the signature could be invalid. Note that verification doesn't require constant-time code, and in fact some steps are inherently variable-time. I've used constant-time patterns in some places anyway, but technically they're not needed. Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20251214181712.29132-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-08	lib/crypto: aes: Fix missing MMU protection for AES S-box	Eric Biggers
	__cacheline_aligned puts the data in the ".data..cacheline_aligned" section, which isn't marked read-only i.e. it doesn't receive MMU protection. Replace it with ____cacheline_aligned which does the right thing and just aligns the data while keeping it in ".rodata". Fixes: b5e0b032b6c3 ("crypto: aes - add generic time invariant AES cipher") Cc: stable@vger.kernel.org Reported-by: Qingfang Deng <dqfext@gmail.com> Closes: https://lore.kernel.org/r/20260105074712.498-1-dqfext@gmail.com/ Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20260107052023.174620-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-08	lib/crypto: tests: polyval_kunit: Increase iterations for preparekey in IRQs	Thomas Weißschuh
	On my development machine the generic, memcpy()-only implementation of polyval_preparekey() is too fast for the IRQ workers to actually fire. The test fails. Increase the iterations to make the test more robust. The test will run for a maximum of one second in any case. [EB: This failure was already fixed by commit c31f4aa8fed0 ("kunit: Enforce task execution in {soft,hard}irq contexts"). I'm still applying this patch too, since the iteration count in this test made its running time much shorter than the other similar ones.] Fixes: b3aed551b3fc ("lib/crypto: tests: Add KUnit tests for POLYVAL") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20260102-kunit-polyval-fix-v1-1-5313b5a65f35@linutronix.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-14	lib/crypto: riscv: Add poly1305-core.S to .gitignore	Charles Mirabile
	poly1305-core.S is an auto-generated file, so it should be ignored. Fixes: bef9c7559869 ("lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation") Cc: stable@vger.kernel.org Signed-off-by: Charles Mirabile <cmirabil@redhat.com> Link: https://lore.kernel.org/r/20251212184717.133701-1-cmirabil@redhat.com Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09	lib/crypto: blake2s: Replace manual unrolling with unrolled_full	Eric Biggers
	As we're doing in the BLAKE2b code, use unrolled_full to make the compiler handle the loop unrolling. This simplifies the code slightly. The generated object code is nearly the same with both gcc and clang. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251205051155.25274-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09	lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit	Eric Biggers
	BLAKE2b has a state of 16 64-bit words. Add the message data in and there are 32 64-bit words. With the current code where all the rounds are unrolled to enable constant-folding of the blake2b_sigma values, this results in a very large code size on 32-bit kernels, including a recurring issue where gcc uses a large amount of stack. There's just not much benefit to this unrolling when the code is already so large. Let's roll up the rounds when !CONFIG_64BIT. To avoid having to duplicate the code, just write the code once using a loop, and conditionally use 'unrolled_full' from <linux/unroll.h>. Then, fold the now-unneeded ROUND() macro into the loop. Finally, also remove the now-unneeded override of the stack frame size warning. Code size improvements for blake2b_compress_generic(): Size before (bytes) Size after (bytes) ------------------- ------------------ i386, gcc 27584 3632 i386, clang 18208 3248 arm32, gcc 19912 2860 arm32, clang 21336 3344 Running the BLAKE2b benchmark on a !CONFIG_64BIT kernel on an x86_64 processor shows a 16384B throughput change of 351 => 340 MB/s (gcc) or 442 MB/s => 375 MB/s (clang). So clearly not much of a slowdown either. But also that microbenchmark also effectively disregards cache usage, which is important in practice and is far better in the smaller code. Note: If we rolled up the loop on x86_64 too, the change would be 7024 bytes => 1584 bytes and 1960 MB/s => 1396 MB/s (gcc), or 6848 bytes => 1696 bytes and 1920 MB/s => 1263 MB/s (clang). Maybe still worth it, though not quite as clearly beneficial. Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation") Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251205050330.89704-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09	lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS	Eric Biggers
	Replace the RISCV_ISA_V dependency of the RISC-V crypto code with RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as well as vector unaligned accesses being efficient. This is necessary because this code assumes that vector unaligned accesses are supported and are efficient. (It does so to avoid having to use lots of extra vsetvli instructions to switch the element width back and forth between 8 and either 32 or 64.) This was omitted from the code originally just because the RISC-V kernel support for detecting this feature didn't exist yet. Support has now been added, but it's fragmented into per-CPU runtime detection, a command-line parameter, and a kconfig option. The kconfig option is the only reasonable way to do it, though, so let's just rely on that. Fixes: eb24af5d7a05 ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}") Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20") Fixes: 600a3853dfa0 ("crypto: riscv - add vector crypto accelerated GHASH") Fixes: 8c8e40470ffe ("crypto: riscv - add vector crypto accelerated SHA-{256,224}") Fixes: b3415925a08b ("crypto: riscv - add vector crypto accelerated SHA-{512,384}") Fixes: 563a5255afa2 ("crypto: riscv - add vector crypto accelerated SM3") Fixes: b8d06352bbf3 ("crypto: riscv - add vector crypto accelerated SM4") Cc: stable@vger.kernel.org Reported-by: Vivian Wang <wangruikang@iscas.ac.cn> Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/ Reviewed-by: Jerry Shih <jerry.shih@sifive.com> Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09	lib/crypto: riscv/chacha: Avoid s0/fp register	Vivian Wang
	In chacha_zvkb, avoid using the s0 register, which is the frame pointer, by reallocating KEY0 to t5. This makes stack traces available if e.g. a crash happens in chacha_zvkb. No frame pointer maintenance is otherwise required since this is a leaf function. Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Fixes: bb54668837a0 ("crypto: riscv - add vector crypto accelerated ChaCha20") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@iscas.ac.cn Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-03	Merge tag 'v6.19-p1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Rewrite memcpy_sglist from scratch - Add on-stack AEAD request allocation - Fix partial block processing in ahash Algorithms: - Remove ansi_cprng - Remove tcrypt tests for poly1305 - Fix EINPROGRESS processing in authenc - Fix double-free in zstd Drivers: - Use drbg ctr helper when reseeding xilinx-trng - Add support for PCI device 0x115A to ccp - Add support of paes in caam - Add support for aes-xts in dthev2 Others: - Use likely in rhashtable lookup - Fix lockdep false-positive in padata by removing a helper" * tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits) crypto: zstd - fix double-free in per-CPU stream cleanup crypto: ahash - Zero positive err value in ahash_update_finish crypto: ahash - Fix crypto_ahash_import with partial block data crypto: lib/mpi - use min() instead of min_t() crypto: ccp - use min() instead of min_t() hwrng: core - use min3() instead of nested min_t() crypto: aesni - ctr_crypt() use min() instead of min_t() crypto: drbg - Delete unused ctx from struct sdesc crypto: testmgr - Add missing DES weak and semi-weak key tests Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist" crypto: scatterwalk - Fix memcpy_sglist() to always succeed crypto: iaa - Request to add Kanchana P Sridhar to Maintainers. crypto: tcrypt - Remove unused poly1305 support crypto: ansi_cprng - Remove unused ansi_cprng algorithm crypto: asymmetric_keys - fix uninitialized pointers with free attribute KEYS: Avoid -Wflex-array-member-not-at-end warning crypto: ccree - Correctly handle return of sg_nents_for_len crypto: starfive - Correctly handle return of sg_nents_for_len crypto: iaa - Fix incorrect return value in save_iaa_wq() crypto: zstd - Remove unnecessary size_t cast ...
2025-12-02	Merge tag 'fpsimd-on-stack-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers: "This is a core arm64 change. However, I was asked to take this because most uses of kernel-mode FPSIMD are in crypto or CRC code. In v6.8, the size of task_struct on arm64 increased by 528 bytes due to the new 'kernel_fpsimd_state' field. This field was added to allow kernel-mode FPSIMD code to be preempted. Unfortunately, 528 bytes is kind of a lot for task_struct. This regression in the task_struct size was noticed and reported. Recover that space by making this state be allocated on the stack at the beginning of each kernel-mode FPSIMD section. To make it easier for all the users of kernel-mode FPSIMD to do that correctly, introduce and use a 'scoped_ksimd' abstraction" * tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits) lib/crypto: arm64: Move remaining algorithms to scoped ksimd API lib/crypto: arm/blake2b: Move to scoped ksimd API arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack arm64/fpu: Enforce task-context only for generic kernel mode FPU net/mlx5: Switch to more abstract scoped ksimd guard API on arm64 arm64/xorblocks: Switch to 'ksimd' scoped guard API crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API crypto/arm64: polyval - Switch to 'ksimd' scoped guard API crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API raid6: Move to more abstract 'ksimd' guard API crypto: aegis128-neon - Move to more abstract 'ksimd' guard API crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API ...
2025-12-02	Merge tag 'libcrypto-at-least-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull 'at_least' array size update from Eric Biggers: "C supports lower bounds on the sizes of array parameters, using the static keyword as follows: 'void f(int a[static 32]);'. This allows the compiler to warn about a too-small array being passed. As discussed, this reuse of the 'static' keyword, while standard, is a bit obscure. Therefore, add an alias 'at_least' to compiler_types.h. Then, add this 'at_least' annotation to the array parameters of various crypto library functions" * tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: sha2: Add at_least decoration to fixed-size array params lib/crypto: sha1: Add at_least decoration to fixed-size array params lib/crypto: poly1305: Add at_least decoration to fixed-size array params lib/crypto: md5: Add at_least decoration to fixed-size array params lib/crypto: curve25519: Add at_least decoration to fixed-size array params lib/crypto: chacha: Add at_least decoration to fixed-size array params lib/crypto: chacha20poly1305: Statically check fixed array lengths compiler_types: introduce at_least parameter decoration pseudo keyword wifi: iwlwifi: trans: rename at_least variable to min_mode
2025-12-02	Merge tag 'libcrypto-tests-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library test updates from Eric Biggers: - Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the algorithms that have new crypto library interfaces this cycle. - Remove the crypto_shash POLYVAL tests. They're no longer needed because POLYVAL support was removed from crypto_shash. Better POLYVAL test coverage is now provided via the KUnit test suite. * tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: crypto: testmgr - Remove polyval tests lib/crypto: tests: Add KUnit tests for POLYVAL lib/crypto: tests: Add additional SHAKE tests lib/crypto: tests: Add SHA3 kunit tests lib/crypto: tests: Add KUnit tests for BLAKE2b
2025-12-02	Merge tag 'libcrypto-updates-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library updates from Eric Biggers: "This is the main crypto library pull request for 6.19. It includes: - Add SHA-3 support to lib/crypto/, including support for both the hash functions and the extendable-output functions. Reimplement the existing SHA-3 crypto_shash support on top of the library. This is motivated mainly by the upcoming support for the ML-DSA signature algorithm, which needs the SHAKE128 and SHAKE256 functions. But even on its own it's a useful cleanup. This also fixes the longstanding issue where the architecture-optimized SHA-3 code was disabled by default. - Add BLAKE2b support to lib/crypto/, and reimplement the existing BLAKE2b crypto_shash support on top of the library. This is motivated mainly by btrfs, which supports BLAKE2b checksums. With this change, all btrfs checksum algorithms now have library APIs. btrfs is planned to start just using the library directly. This refactor also improves consistency between the BLAKE2b code and BLAKE2s code. And as usual, it also fixes the issue where the architecture-optimized BLAKE2b code was disabled by default. - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL support in crypto_shash. Reimplement HCTR2 on top of the library. This simplifies the code and improves HCTR2 performance. As usual, it also makes the architecture-optimized code be enabled by default. The generic implementation of POLYVAL is greatly improved as well. - Clean up the BLAKE2s code - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3" * tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits) fscrypt: Drop obsolete recommendation to enable optimized POLYVAL crypto: polyval - Remove the polyval crypto_shash crypto: hctr2 - Convert to use POLYVAL library lib/crypto: x86/polyval: Migrate optimized code into library lib/crypto: arm64/polyval: Migrate optimized code into library lib/crypto: polyval: Add POLYVAL library crypto: polyval - Rename conflicting functions lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value lib/crypto: x86/blake2s: Improve readability lib/crypto: x86/blake2s: Use local labels for data lib/crypto: x86/blake2s: Drop check for nblocks == 0 lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit lib/crypto: arm, arm64: Drop filenames from file comments lib/crypto: arm/blake2s: Fix some comments crypto: s390/sha3 - Remove superseded SHA-3 code crypto: sha3 - Reimplement using library API crypto: jitterentropy - Use default sha3 implementation lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions lib/crypto: sha3: Support arch overrides of one-shot digest functions ...
2025-11-24	crypto: lib/mpi - use min() instead of min_t()	David Laight
	min_t(unsigned int, a, b) casts an 'unsigned long' to 'unsigned int'. Use min(a, b) instead as it promotes any 'unsigned int' to 'unsigned long' and so cannot discard significant bits. In this case the 'unsigned long' value is small enough that the result is ok. Detected by an extra check added to min_t(). Signed-off-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-11-23	lib/crypto: chacha20poly1305: Statically check fixed array lengths	Jason A. Donenfeld
	Several parameters of the chacha20poly1305 functions require arrays of an exact length. Use the new at_least keyword to instruct gcc and clang to statically check that the caller is passing an object of at least that length. Here it is in action, with this faulty patch to wireguard's cookie.h: struct cookie_checker { u8 secret[NOISE_HASH_LEN]; - u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN]; + u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN - 1]; u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN]; If I try compiling this code, I get this helpful warning: CC drivers/net/wireguard/cookie.o drivers/net/wireguard/cookie.c: In function ‘wg_cookie_message_create’: drivers/net/wireguard/cookie.c:193:9: warning: ‘xchacha20poly1305_encrypt’ reading 32 bytes from a region of size 31 [-Wstringop-overread] 193 \| xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 194 \| macs->mac1, COOKIE_LEN, dst->nonce, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 195 \| checker->cookie_encryption_key); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/wireguard/cookie.c:193:9: note: referencing argument 7 of type ‘const u8 ’ {aka ‘const unsigned char ’} In file included from drivers/net/wireguard/messages.h:10, from drivers/net/wireguard/cookie.h:9, from drivers/net/wireguard/cookie.c:6: include/crypto/chacha20poly1305.h:28:6: note: in a call to function ‘xchacha20poly1305_encrypt’ 28 \| void xchacha20poly1305_encrypt(u8 dst, const u8 src, const size_t src_len, Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20251123054819.2371989-4-Jason@zx2c4.com Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-21	lib/crypto: tests: Fix KMSAN warning in test_sha256_finup_2x()	Eric Biggers
	Fully initialize ctx, including the buf field which sha256_init() doesn't initialize, to avoid a KMSAN warning when comparing ctx to orig_ctx. This KMSAN warning slipped in while KMSAN was not working reliably due to a stackdepot bug, which has now been fixed. Fixes: 6733968be7cb ("lib/crypto: tests: Add tests and benchmark for sha256_finup_2x()") Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251121033431.34406-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12	lib/crypto: arm64: Move remaining algorithms to scoped ksimd API	Ard Biesheuvel
	Move the arm64 implementations of SHA-3 and POLYVAL to the newly introduced scoped ksimd API, which replaces kernel_neon_begin() and kernel_neon_end(). On arm64, this is needed because the latter API will change in an incompatible manner. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12	lib/crypto: arm/blake2b: Move to scoped ksimd API	Ard Biesheuvel
	Even though ARM's versions of kernel_neon_begin()/_end() are not being changed, update the newly migrated ARM blake2b to the scoped ksimd API so that all ARM and arm64 in lib/crypto remains consistent in this manner. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12	Merge tag 'scoped-ksimd-for-arm-arm64' into libcrypto-fpsimd-on-stack	Eric Biggers
	Pull scoped ksimd API for ARM and arm64 from Ard Biesheuvel: "Introduce a more strict replacement API for kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and replace occurrences of the latter pair appearing in lib/crypto" Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12	lib/crypto: Switch ARM and arm64 to 'ksimd' scoped guard API	Ard Biesheuvel
	Before modifying the prototypes of kernel_neon_begin() and kernel_neon_end() to accommodate kernel mode FP/SIMD state buffers allocated on the stack, move arm64 to the new 'ksimd' scoped guard API, which encapsulates the calls to those functions. For symmetry, do the same for 32-bit ARM too. Reviewed-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-11-11	lib/crypto: tests: Add KUnit tests for POLYVAL	Eric Biggers
	Add a test suite for the POLYVAL library, including: - All the standard tests and the benchmark from hash-test-template.h - Comparison with a test vector from the RFC - Test with key and message containing all one bits - Additional tests related to the key struct Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20251109234726.638437-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>