summaryrefslogtreecommitdiff
path: root/arch/riscv/include
AgeCommit message (Collapse)Author
2026-02-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Loongarch: - Add more CPUCFG mask bits - Improve feature detection - Add lazy load support for FPU and binary translation (LBT) register state - Fix return value for memory reads from and writes to in-kernel devices - Add support for detecting preemption from within a guest - Add KVM steal time test case to tools/selftests ARM: - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers - More pKVM fixes for features that are not supposed to be exposed to guests - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest - Preliminary work for guest GICv5 support - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures - A small set of FPSIMD cleanups - Selftest fixes addressing the incorrect alignment of page allocation - Other assorted low-impact fixes and spelling fixes RISC-V: - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Transparent huge page support for hypervisor page tables - Adjust the number of available guest irq files based on MMIO register sizes found in the device tree or the ACPI tables - Add RISC-V specific paging modes to KVM selftests - Detect paging mode at runtime for selftests s390: - Performance improvement for vSIE (aka nested virtualization) - Completely new memory management. s390 was a special snowflake that enlisted help from the architecture's page table management to build hypervisor page tables, in particular enabling sharing the last level of page tables. This however was a lot of code (~3K lines) in order to support KVM, and also blocked several features. The biggest advantages is that the page size of userspace is completely independent of the page size used by the guest: userspace can mix normal pages, THPs and hugetlbfs as it sees fit, and in fact transparent hugepages were not possible before. It's also now possible to have nested guests and guests with huge pages running on the same host - Maintainership change for s390 vfio-pci - Small quality of life improvement for protected guests x86: - Add support for giving the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allowing direct access to data MSRs and PMCs (restricted by the vPMU model). KVM still intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. This is more accurate, since it has no risk of contention and thus dropped events, and also has significantly less overhead. For more information, see the commit message for merge commit bf2c3138ae36 ("Merge tag 'kvm-x86-pmu-6.20' ...") - Disallow changing the virtual CPU model if L2 is active, for all the same reasons KVM disallows change the model after the first KVM_RUN - Fix a bug where KVM would incorrectly reject host accesses to PV MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled, even if those were advertised as supported to userspace, - Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs, where KVM would attempt to read CR3 configuring an async #PF entry - Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM (for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL. Only a few exports that are intended for external usage, and those are allowed explicitly - When checking nested events after a vCPU is unblocked, ignore -EBUSY instead of WARNing. Userspace can sometimes put the vCPU into what should be an impossible state, and spurious exit to userspace on -EBUSY does not really do anything to solve the issue - Also throw in the towel and drop the WARN on INIT/SIPI being blocked when vCPU is in Wait-For-SIPI, which also resulted in playing whack-a-mole with syzkaller stuffing architecturally impossible states into KVM - Add support for new Intel instructions that don't require anything beyond enumerating feature flags to userspace - Grab SRCU when reading PDPTRs in KVM_GET_SREGS2 - Add WARNs to guard against modifying KVM's CPU caps outside of the intended setup flow, as nested VMX in particular is sensitive to unexpected changes in KVM's golden configuration - Add a quirk to allow userspace to opt-in to actually suppress EOI broadcasts when the suppression feature is enabled by the guest (currently limited to split IRQCHIP, i.e. userspace I/O APIC). Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an option as some userspaces have come to rely on KVM's buggy behavior (KVM advertises Supress EOI Broadcast irrespective of whether or not userspace I/O APIC supports Directed EOIs) - Clean up KVM's handling of marking mapped vCPU pages dirty - Drop a pile of *ancient* sanity checks hidden behind in KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n - Bury all of ioapic.h, i8254.h and related ioctls (including KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y - Add a regression test for recent APICv update fixes - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information - Drop a dead function declaration - Minor cleanups x86 (Intel): - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans - Fix an SGX bug where KVM would incorrectly try to handle EPCM page faults, and instead always reflect them into the guest. Since KVM doesn't shadow EPCM entries, EPCM violations cannot be due to KVM interference and can't be resolved by KVM - Fix a bug where KVM would register its posted interrupt wakeup handler even if loading kvm-intel.ko ultimately failed - Disallow access to vmcb12 fields that aren't fully supported, mostly to avoid weirdness and complexity for FRED and other features, where KVM wants enable VMCS shadowing for fields that conditionally exist - Print out the "bad" offsets and values if kvm-intel.ko refuses to load (or refuses to online a CPU) due to a VMCS config mismatch x86 (AMD): - Drop a user-triggerable WARN on nested_svm_load_cr3() failure - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) *must* flush the RAP - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0 - Treat exit_code as an unsigned 64-bit value through all of KVM - Add support for fetching SNP certificates from userspace - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2 - Misc fixes and cleanups x86 selftests: - Add a regression test for TPR<=>CR8 synchronization and IRQ masking - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests) - Extend several nested VMX tests to also cover nested SVM - Add a selftest for nested VMLOAD/VMSAVE - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain guest_memfd: - Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage handling. SEV/SNP was the only user of the tracking and it can do it via the RMP - Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to avoid non-trivial complexity for something that no known VMM seems to be doing and to avoid an API special case for in-place conversion, which simply can't support unaligned sources - When populating guest_memfd memory, GUP the source page in common code and pass the refcounted page to the vendor callback, instead of letting vendor code do the heavy lifting. Doing so avoids a looming deadlock bug with in-place due an AB-BA conflict betwee mmap_lock and guest_memfd's filemap invalidate lock Generic: - Fix a bug where KVM would ignore the vCPU's selected address space when creating a vCPU-specific mapping of guest memory. Actually this bug could not be hit even on x86, the only architecture with multiple address spaces, but it's a bug nevertheless" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (267 commits) KVM: s390: Increase permitted SE header size to 1 MiB MAINTAINERS: Replace backup for s390 vfio-pci KVM: s390: vsie: Fix race in acquire_gmap_shadow() KVM: s390: vsie: Fix race in walk_guest_tables() KVM: s390: Use guest address to mark guest page dirty irqchip/riscv-imsic: Adjust the number of available guest irq files RISC-V: KVM: Transparent huge page support RISC-V: KVM: selftests: Add Zalasr extensions to get-reg-list test RISC-V: KVM: Allow Zalasr extensions for Guest/VM KVM: riscv: selftests: Add riscv vm satp modes KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr() RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr() RISC-V: KVM: Remove unnecessary 'ret' assignment KVM: s390: Add explicit padding to struct kvm_s390_keyop KVM: LoongArch: selftests: Add steal time test case LoongArch: KVM: Add paravirt vcpu_is_preempted() support in guest side LoongArch: KVM: Add paravirt preempt feature in hypervisor side ...
2026-02-12Merge tag 'riscv-for-linus-7.0-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Add support for control flow integrity for userspace processes. This is based on the standard RISC-V ISA extensions Zicfiss and Zicfilp - Improve ptrace behavior regarding vector registers, and add some selftests - Optimize our strlen() assembly - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for EFI volume mounting - Clean up some code slightly, including defining copy_user_page() as copy_page() rather than memcpy(), aligning us with other architectures; and using max3() to slightly simplify an expression in riscv_iommu_init_check() * tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) riscv: lib: optimize strlen loop efficiency selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function selftests: riscv: verify ptrace accepts valid vector csr values selftests: riscv: verify ptrace rejects invalid vector csr inputs selftests: riscv: verify syscalls discard vector context selftests: riscv: verify initial vector state with ptrace selftests: riscv: test ptrace vector interface riscv: ptrace: validate input vector csr registers riscv: csr: define vtype register elements riscv: vector: init vector context with proper vlenb riscv: ptrace: return ENODATA for inactive vector extension kselftest/riscv: add kselftest for user mode CFI riscv: add documentation for shadow stack riscv: add documentation for landing pad / indirect branch tracking riscv: create a Kconfig fragment for shadow stack and landing pad support arch/riscv: add dual vdso creation logic and select vdso based on hw arch/riscv: compile vdso with landing pad and shadow stack note riscv: enable kernel access to shadow stack memory via the FWFT SBI call riscv: add kernel command line option to opt out of user CFI riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe ...
2026-02-12Merge tag 'mm-stable-2026-02-11-19-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...
2026-02-11Merge tag 'kvm-riscv-6.20-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 6.20 - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Add riscv vm satp modes in KVM selftests - Transparent huge page support for G-stage - Adjust the number of available guest irq files based on MMIO register sizes in DeviceTree or ACPI
2026-02-10Merge tag 'x86_paravirt_for_v7.0_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...
2026-02-09riscv: csr: define vtype register elementsSergey Matyukevich
Define masks and shifts for vtype CSR according to the vector specs: - v0.7.1 used in early T-Head cores, known as xtheadvector in the kernel - v1.0 Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-4-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-02-06RISC-V: KVM: Allow Zalasr extensions for Guest/VMXu Lu
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zalasr extensions for Guest/VM. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251020042457.30915-5-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VMPincheng Wang
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zilsd and Zclsd extensions for Guest/VM. Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250826162939.1494021-5-pincheng.plct@isrc.iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-01-29arch/riscv: add dual vdso creation logic and select vdso based on hwDeepak Gupta
Shadow stack instructions are taken from the Zimop ISA extension, which is mandated on RVA23. Any userspace with shadow stack instructions in it will fault on hardware that doesn't have support for Zimop. Thus, a shadow stack-enabled userspace can't be run on hardware that doesn't support Zimop. It's not known how Linux userspace providers will respond to this kind of binary fragmentation. In order to keep kernel portable across different hardware, 'arch/riscv/kernel/vdso_cfi' is created which has Makefile logic to compile 'arch/riscv/kernel/vdso' sources with CFI flags, and 'arch/riscv/kernel/vdso.c' is modified to select the appropriate vdso depending on whether the underlying CPU implements the Zimop extension. Since the offset of vdso symbols will change due to having two different vdso binaries, there is added logic to include a new generated vdso offset header and dynamically select the offset (like for rt_sigreturn). Signed-off-by: Deepak Gupta <debug@rivosinc.com> Acked-by: Charles Mirabile <cmirabil@redhat.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-24-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29arch/riscv: compile vdso with landing pad and shadow stack noteJim Shu
User mode tasks compiled with Zicfilp may call indirectly into the vdso (like hwprobe indirect calls). Add support for compiling landing pads into the vdso. Landing pad instructions in the vdso will be no-ops for tasks which have not enabled landing pads. Furthermore, add support for the C sources of the vdso to be compiled with shadow stack and landing pads enabled as well. Landing pad and shadow stack instructions are emitted only when the VDSO_CFI cflags option is defined during compile. Signed-off-by: Jim Shu <jim.shu@sifive.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-23-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description, issues reported by checkpatch] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv: add kernel command line option to opt out of user CFIDeepak Gupta
Add a kernel command line option to disable part or all of user CFI. User backward CFI and forward CFI can be controlled independently. The kernel command line parameter "riscv_nousercfi" can take the following values: - "all" : Disable forward and backward cfi both - "bcfi" : Disable backward cfi - "fcfi" : Disable forward cfi Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-21-b55691eacf4f@rivosinc.com [pjw@kernel.org: fixed warnings from checkpatch; cleaned up patch description, doc, printk text] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobeDeepak Gupta
Add enumeration of the zicfilp and zicfiss extensions in the hwprobe syscall. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-20-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; extend into RISCV_HWPROBE_KEY_IMA_EXT_1; clean patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv: hwprobe: add support for RISCV_HWPROBE_KEY_IMA_EXT_1Paul Walmsley
We've run out of bits to describe RISC-V ISA extensions in our initial hwprobe key, RISCV_HWPROBE_KEY_IMA_EXT_0. So, let's add RISCV_HWPROBE_KEY_IMA_EXT_1, along with the framework to set the appropriate hwprobe tuple, and add testing for it. Based on a suggestion from Andrew Jones <andrew.jones@oss.qualcomm.com>, also fix the documentation for RISCV_HWPROBE_KEY_IMA_EXT_0. Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv/ptrace: expose riscv CFI status and state via ptrace and in core filesDeepak Gupta
Expose a new register type NT_RISCV_USER_CFI for risc-v CFI status and state. Intentionally, both landing pad and shadow stack status and state are rolled into the CFI state. Creating two different NT_RISCV_USER_XXX would not be useful and would waste a note type. Enabling, disabling and locking the CFI feature is not allowed via ptrace set interface. However, setting 'elp' state or setting shadow stack pointer are allowed via the ptrace set interface. It is expected that 'gdb' might need to fixup 'elp' state or 'shadow stack' pointer. Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-19-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; cleaned patch description and comments; addressed checkpatch issues] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv/signal: save and restore the shadow stack on a signalDeepak Gupta
Save the shadow stack pointer in the sigcontext structure when delivering a signal. Restore the shadow stack pointer from sigcontext on sigreturn. As part of the save operation, the kernel uses the 'ssamoswap' instruction to save a snapshot of the current shadow stack on the shadow stack itself (this can be called a "save token"). During restore on sigreturn, the kernel retrieves the save token from the top of the shadow stack and validates it. This ensures that user mode can't arbitrarily pivot to any shadow stack address without having a token and thus provides a strong security assurance during the window between signal delivery and sigreturn. Use an ABI-compatible way of saving/restoring the shadow stack pointer into the signal stack. This follows the vector extension, where extra registers are placed in a form of extension header + extension body in the stack. The extension header indicates the size of the extra architectural states plus the size of header itself, and a magic identifier for the extension. Then, the extension body contains the new architectural states in the form defined by uapi. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-17-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned patch description, code comments; resolved checkpatch warning] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv/traps: Introduce software check exception and uprobe handlingDeepak Gupta
The Zicfiss and Zicfilp extensions introduce a new exception, the 'software check exception', in the privileged ISA, with cause code = 18. This patch implements support for software check exceptions. Additionally, the patch implements a CFI violation handler which checks the code in the xtval register. If xtval=2, the software check exception happened because of an indirect branch that didn't land on a 4 byte aligned PC or on a 'lpad' instruction, or the label value embedded in 'lpad' didn't match the label value set in the x7 register. If xtval=3, the software check exception happened due to a mismatch between the link register (x1 or x5) and the top of shadow stack (on execution of `sspopchk`). In case of a CFI violation, SIGSEGV is raised with code=SEGV_CPERR. SEGV_CPERR was introduced by the x86 shadow stack patches. To keep uprobes working, handle the uprobe event first before reporting the CFI violation in the software check exception handler. This is because, when the landing pad is activated, if the uprobe point is set at the lpad instruction at the beginning of a function, the system triggers a software check exception instead of an ebreak exception due to the exception priority. This would prevent uprobe from working. Reviewed-by: Zong Li <zong.li@sifive.com> Co-developed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-15-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up the patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv: Implement indirect branch tracking prctlsDeepak Gupta
This patch adds a RISC-V implementation of the following prctls: PR_SET_INDIR_BR_LP_STATUS, PR_GET_INDIR_BR_LP_STATUS and PR_LOCK_INDIR_BR_LP_STATUS. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-14-b55691eacf4f@rivosinc.com [pjw@kernel.org: clean up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv: Implement arch-agnostic shadow stack prctlsDeepak Gupta
Implement an architecture-agnostic prctl() interface for setting and getting shadow stack status. The prctls implemented are PR_GET_SHADOW_STACK_STATUS, PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS. As part of PR_SET_SHADOW_STACK_STATUS/PR_GET_SHADOW_STACK_STATUS, only PR_SHADOW_STACK_ENABLE is implemented because RISCV allows each mode to write to their own shadow stack using 'sspush' or 'ssamoswap'. PR_LOCK_SHADOW_STACK_STATUS locks the current shadow stack enablement configuration. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-12-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv/shstk: If needed allocate a new shadow stack on cloneDeepak Gupta
Userspace specifies CLONE_VM to share address space and spawn new thread. 'clone' allows userspace to specify a new stack for a new thread. However there is no way to specify a new shadow stack base address without changing the API. This patch allocates a new shadow stack whenever CLONE_VM is given. In case of CLONE_VFORK, the parent is suspended until the child finishes; thus the child can use the parent's shadow stack. In case of !CLONE_VM, COW kicks in because entire address space is copied from parent to child. 'clone3' is extensible and can provide mechanisms for specifying the shadow stack as an input parameter. This is not settled yet and is being extensively discussed on the mailing list. Once that's settled, this code should be adapted. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-11-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv: compat: fix COMPAT_UTS_MACHINE definitionHan Gao
The COMPAT_UTS_MACHINE for riscv was incorrectly defined as "riscv". Change it to "riscv32" to reflect the correct 32-bit compat name. Fixes: 06d0e3723647 ("riscv: compat: Add basic compat data type implementation") Cc: stable@vger.kernel.org Signed-off-by: Han Gao <gaohan@iscas.ac.cn> Reviewed-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org> Link: https://patch.msgid.link/20260127190711.2264664-1-gaohan@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26mm: provide address parameter to p{te,md,ud}_user_accessible_page()Rohan McLure
On several powerpc platforms, a page table entry may not imply whether the relevant mapping is for userspace or kernelspace. Instead, such platforms infer this by the address which is being accessed. Add an additional address argument to each of these routines in order to provide support for page table check on powerpc. [ajd@linux.ibm.com: rebase on arm64 changes] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-9-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pte_clear() This reverts commit aa232204c468 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pte_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. [ajd@linux.ibm.com: rebase, fix additional occurrence and loop handling] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-8-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pmd_clear() This reverts commit 1831414cd729 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pmd_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. [ajd@linux.ibm.com: rebase on arm64 changes] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-7-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pud_clear() This reverts commit 931c38e16499 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pud_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. [ajd@linux.ibm.com: rebase on arm64 changes] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-6-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: provide addr parameter to page_table_check_ptes_set()Rohan McLure
To provide support for powerpc platforms, provide an addr parameter to the __page_table_check_ptes_set() and page_table_check_ptes_set() routines. This parameter is needed on some powerpc platforms which do not encode whether a mapping is for user or kernel in the pte. On such platforms, this can be inferred from the addr parameter. [ajd@linux.ibm.com: rebase on arm64 + riscv changes, update commit message] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-5-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pmd[s]_set() This reverts commit a3b837130b58 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pmd_set"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. Apply this to __page_table_check_pmds_set(), page_table_check_pmd_set(), and the page_table_check_pmd_set() wrapper macro. [ajd@linux.ibm.com: rebase on arm64 + riscv changes, update commit message] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-4-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pud[s]_set() This reverts commit 6d144436d954 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pud_set"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. Apply this to __page_table_check_puds_set(), page_table_check_puds_set() and the page_table_check_pud_set() wrapper macro. [ajd@linux.ibm.com: rebase on riscv + arm64 changes, update commit message] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-3-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-25riscv/mm: update write protect to work on shadow stacksDeepak Gupta
'fork' implements copy-on-write (COW) by making pages readonly in both child and parent. ptep_set_wrprotect() and pte_wrprotect() clear _PAGE_WRITE in PTE. The assumption is that the page is readable and, on a fault, copy-on-write happens. To implement COW on shadow stack pages, clearing the W bit makes them XWR = 000. This will result in the wrong PTE setting, which allows no permissions, but with V=1 and the PFN field pointing to the final page. Instead, the desired behavior is to turn it into a readable page, take an access (load/store) fault on sspush/sspop (shadow stack) and then perform COW on such pages. This way regular reads would still be allowed and not lead to COW maintaining current behavior of COW on non-shadow stack but writeable memory. On the other hand, this doesn't interfere with existing COW for read-write memory. The assumption is always that _PAGE_READ must have been set, and thus, setting _PAGE_READ is harmless. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-9-b55691eacf4f@rivosinc.com [pjw@kernel.org: clarify patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv/mm: teach pte_mkwrite to manufacture shadow stack PTEsDeepak Gupta
pte_mkwrite() creates PTEs with WRITE encodings for the underlying architecture. The underlying architecture can have two types of writeable mappings: one that can be written using regular store instructions, and another one that can only be written using specialized store instructions (like shadow stack stores). pte_mkwrite can select write PTE encoding based on VMA range (i.e. VM_SHADOW_STACK) Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-8-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv/mm: manufacture shadow stack ptesDeepak Gupta
This patch implements the creation of a shadow stack pte on riscv. Creating shadow stack PTE on riscv means clearing RWX and then setting W=1. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-7-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv/mm: ensure PROT_WRITE leads to VM_READ | VM_WRITEDeepak Gupta
'arch_calc_vm_prot_bits' is implemented on risc-v to return VM_READ | VM_WRITE if PROT_WRITE is specified. Similarly 'riscv_sys_mmap' is updated to convert all incoming PROT_WRITE to (PROT_WRITE | PROT_READ). This is to make sure that any existing apps using PROT_WRITE still work. Earlier 'protection_map[VM_WRITE]' used to pick read-write PTE encodings. Now 'protection_map[VM_WRITE]' will always pick PAGE_SHADOWSTACK PTE encodings for shadow stack. The above changes ensure that existing apps continue to work because underneath, the kernel will be picking 'protection_map[VM_WRITE|VM_READ]' PTE encodings. Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-6-b55691eacf4f@rivosinc.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv: Add usercfi state for task and save/restore of CSR_SSP on trap entry/exitDeepak Gupta
Carve out space in the RISC-V architecture-specific thread struct for cfi status and shadow stack in usermode. This patch: - defines a new structure cfi_status with status bit for cfi feature - defines shadow stack pointer, base and size in cfi_status structure - defines offsets to new member fields in thread in asm-offsets.c - saves and restores shadow stack pointer on trap entry (U --> S) and exit (S --> U) Shadow stack save/restore is gated on feature availability and is implemented using alternatives. CSR_SSP can be context-switched in 'switch_to' as well, but as soon as kernel shadow stack support gets rolled in, the shadow stack pointer will need to be switched at trap entry/exit point (much like 'sp'). It can be argued that a kernel using a shadow stack deployment scenario may not be as prevalent as user mode using this feature. But even if there is some minimal deployment of kernel shadow stack, that means that it needs to be supported. Thus save/restore of shadow stack pointer is implemented in entry.S instead of in 'switch_to.h'. Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-5-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv: add Zicfiss / Zicfilp extension CSR and bit definitionsDeepak Gupta
The Zicfiss and Zicfilp extensions are enabled via b3 and b2 in *envcfg CSRs. menvcfg controls enabling for S/HS mode. henvcfg controls enabling for VS. senvcfg controls enabling for U/VU mode. The Zicfilp extension extends *status CSRs to hold an 'expected landing pad' bit. A trap or interrupt can occur between an indirect jmp/call and target instruction. The 'expected landing pad' bit from the CPU is recorded into the xstatus CSR so that when the supervisor performs xret, the 'expected landing pad' state of the CPU can be restored. Zicfiss adds one new CSR, CSR_SSP, which contains the current shadow stack pointer. Signed-off-by: Deepak Gupta <debug@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-4-b55691eacf4f@rivosinc.com [pjw@kernel.org: grouped CSR_SSP macro with the other CSR macros; clarified patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv: zicfiss / zicfilp enumerationDeepak Gupta
This patch adds support for detecting the RISC-V ISA extensions Zicfiss and Zicfilp. Zicfiss and Zicfilp stand for the unprivileged integer spec extensions for shadow stack and indirect branch tracking, respectively. This patch looks for Zicfiss and Zicfilp in the device tree and accordingly lights up the corresponding bits in the cpu feature bitmap. Furthermore this patch adds detection utility functions to return whether shadow stack or landing pads are supported by the cpu. Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-3-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv: mm: define copy_user_page() as copy_page()Florian Schmaus
Currently, the implementation of copy_user_page() is identical to copy_page(). Align riscv with other architectures (alpha, arc, arm64, hexagon, longarch, m68k, openrisc, s390, um, xtensa) and map copy_user_page() to copy_page() given that their implementation is identical. In addition to following a common pattern, this centralizes the implementation. Any changes to the underlying page copy logic (e.g., for CHERI) will now automatically propagate to copy_user_page(). Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com> Link: https://patch.msgid.link/20260113134025.905627-1-florian.schmaus@codasip.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25riscv: fix minor typo in syscall.h commentAustin Kim
Some developers may be confused because RISC-V does not have a register named r0. Also, orig_r0 is not available in pt_regs structure, which is specific to riscv. So we had better fix this minor typo. Signed-off-by: Austin Kim <austin.kim@lge.com> Link: https://patch.msgid.link/aW3Z4zTBvGJpk7a7@adminpc-PowerEdge-R7525 Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-22riscv: Add intermediate cast to 'unsigned long' in __get_user_asmNathan Chancellor
After commit bdce162f2e57 ("riscv: Use 64-bit variable for output in __get_user_asm"), there is a warning when building for 32-bit RISC-V: In file included from include/linux/uaccess.h:13, from include/linux/sched/task.h:13, from include/linux/sched/signal.h:9, from include/linux/rcuwait.h:6, from include/linux/mm.h:36, from include/linux/migrate.h:5, from mm/migrate.c:16: mm/migrate.c: In function 'do_pages_move': arch/riscv/include/asm/uaccess.h:115:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] 115 | (x) = (__typeof__(x))__tmp; \ | ^ arch/riscv/include/asm/uaccess.h:198:17: note: in expansion of macro '__get_user_asm' 198 | __get_user_asm("lb", (x), __gu_ptr, label); \ | ^~~~~~~~~~~~~~ arch/riscv/include/asm/uaccess.h:218:9: note: in expansion of macro '__get_user_nocheck' 218 | __get_user_nocheck(x, ptr, __gu_failed); \ | ^~~~~~~~~~~~~~~~~~ arch/riscv/include/asm/uaccess.h:255:9: note: in expansion of macro '__get_user_error' 255 | __get_user_error(__gu_val, __gu_ptr, __gu_err); \ | ^~~~~~~~~~~~~~~~ arch/riscv/include/asm/uaccess.h:285:17: note: in expansion of macro '__get_user' 285 | __get_user((x), __p) : \ | ^~~~~~~~~~ mm/migrate.c:2358:29: note: in expansion of macro 'get_user' 2358 | if (get_user(p, pages + i)) | ^~~~~~~~ Add an intermediate cast to 'unsigned long', which is guaranteed to be the same width as a pointer, before the cast to the type of the output variable to clear up the warning. Fixes: bdce162f2e57 ("riscv: Use 64-bit variable for output in __get_user_asm") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202601210526.OT45dlOZ-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20260121-riscv-fix-int-to-pointer-cast-v1-1-b83eebe57c76@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-20treewide: provide a generic clear_user_page() variantDavid Hildenbrand
Patch series "mm: folio_zero_user: clear page ranges", v11. This series adds clearing of contiguous page ranges for hugepages. The series improves on the current discontiguous clearing approach in two ways: - clear pages in a contiguous fashion. - use batched clearing via clear_pages() wherever exposed. The first is useful because it allows us to make much better use of hardware prefetchers. The second, enables advertising the real extent to the processor. Where specific instructions support it (ex. string instructions on x86; "mops" on arm64 etc), a processor can optimize based on this because, instead of seeing a sequence of 8-byte stores, or a sequence of 4KB pages, it sees a larger unit being operated on. For instance, AMD Zen uarchs (for extents larger than LLC-size) switch to a mode where they start eliding cacheline allocation. This is helpful not just because it results in higher bandwidth, but also because now the cache is not evicting useful cachelines and replacing them with zeroes. Demand faulting a 64GB region shows performance improvement: $ perf bench mem mmap -p $pg-sz -f demand -s 64GB -l 5 baseline +series (GBps +- %stdev) (GBps +- %stdev) pg-sz=2MB 11.76 +- 1.10% 25.34 +- 1.18% [*] +115.47% preempt=* pg-sz=1GB 24.85 +- 2.41% 39.22 +- 2.32% + 57.82% preempt=none|voluntary pg-sz=1GB (similar) 52.73 +- 0.20% [#] +112.19% preempt=full|lazy [*] This improvement is because switching to sequential clearing allows the hardware prefetchers to do a much better job. [#] For pg-sz=1GB a large part of the improvement is because of the cacheline elision mentioned above. preempt=full|lazy improves upon that because, not needing explicit invocations of cond_resched() to ensure reasonable preemption latency, it can clear the full extent as a single unit. In comparison the maximum extent used for preempt=none|voluntary is PROCESS_PAGES_NON_PREEMPT_BATCH (32MB). When provided the full extent the processor forgoes allocating cachelines on this path almost entirely. (The hope is that eventually, in the fullness of time, the lazy preemption model will be able to do the same job that none or voluntary models are used for, allowing us to do away with cond_resched().) Raghavendra also tested previous version of the series on AMD Genoa and sees similar improvement [1] with preempt=lazy. $ perf bench mem map -p $page-size -f populate -s 64GB -l 10 base patched change pg-sz=2MB 12.731939 GB/sec 26.304263 GB/sec 106.6% pg-sz=1GB 26.232423 GB/sec 61.174836 GB/sec 133.2% This patch (of 8): Let's drop all variants that effectively map to clear_page() and provide it in a generic variant instead. We'll use the macro clear_user_page to indicate whether an architecture provides it's own variant. Also, clear_user_page() is only called from the generic variant of clear_user_highpage(), so define it only if the architecture does not provide a clear_user_highpage(). And, for simplicity define it in linux/highmem.h. Note that for parisc, clear_page() and clear_user_page() map to clear_page_asm(), so we can just get rid of the custom clear_user_page() implementation. There is a clear_user_page_asm() function on parisc, that seems to be unused. Not sure what's up with that. Link: https://lkml.kernel.org/r/20260107072009.1615991-1-ankur.a.arora@oracle.com Link: https://lkml.kernel.org/r/20260107072009.1615991-2-ankur.a.arora@oracle.com Signed-off-by: David Hildenbrand <david@redhat.com> Co-developed-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ankur Arora <ankur.a.arora@oracle.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Hildenbrand <david@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konrad Rzessutek Wilk <konrad.wilk@oracle.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Raghavendra K T <raghavendra.kt@amd.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-16riscv: Use 64-bit variable for output in __get_user_asmNathan Chancellor
After commit f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for get_user()"), which was the first commit that started using asm goto with outputs on RISC-V, builds of clang built with assertions enabled start crashing in certain files that use get_user() with: clang: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:12743: Register FollowCopyChain(MachineRegisterInfo &, Register): Assertion `MI->getOpcode() == TargetOpcode::COPY && "start of copy chain MUST be COPY"' failed. Internally, LLVM generates an addiw instruction when the output of the inline asm (which may be any scalar type) needs to be sign extended for ABI reasons, such as a later function call, so that basic block does not have to do it. Use a temporary 64-bit variable as the output of the inline assembly in __get_user_asm() and explicitly cast it to truncate it if necessary, avoiding the addiw that triggers the assertion. Link: https://github.com/ClangBuiltLinux/linux/issues/2092 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://patch.msgid.link/20260116-riscv-wa-llvm-asm-goto-outputs-assertion-failure-v3-1-55b5775f989b@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-12riscv/paravirt: Use common code for paravirt_steal_clock()Juergen Gross
Remove the arch specific variant of paravirt_steal_clock() and use the common one instead. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-11-jgross@suse.com
2026-01-12sched: Move clock related paravirt code to kernel/schedJuergen Gross
Paravirt clock related functions are available in multiple archs. In order to share the common parts, move the common static keys to kernel/sched/ and remove them from the arch specific files. Make a common paravirt_steal_clock() implementation available in kernel/sched/cputime.c, guarding it with a new config option CONFIG_HAVE_PV_STEAL_CLOCK_GEN, which can be selected by an arch in case it wants to use that common variant. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-7-jgross@suse.com
2026-01-12paravirt: Remove asm/paravirt_api_clock.hJuergen Gross
All architectures supporting CONFIG_PARAVIRT share the same contents of asm/paravirt_api_clock.h: #include <asm/paravirt.h> So remove all incarnations of asm/paravirt_api_clock.h and remove the only place where it is included, as there asm/paravirt.h is included anyway. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> # powerpc, scheduler bits Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-6-jgross@suse.com
2026-01-07riscv: remove irqflags.h inclusion in asm/bitops.hYunhui Cui
The arch/riscv/include/asm/bitops.h does not functionally require including /linux/irqflags.h. Additionally, adding arch/riscv/include/asm/percpu.h causes a circular inclusion: kernel/bounds.c ->include/linux/log2.h ->include/linux/bitops.h ->arch/riscv/include/asm/bitops.h ->include/linux/irqflags.h ->include/linux/find.h ->return val ? __ffs(val) : size; ->arch/riscv/include/asm/bitops.h The compilation log is as follows: CC kernel/bounds.s In file included from ./include/linux/bitmap.h:11, from ./include/linux/cpumask.h:12, from ./arch/riscv/include/asm/processor.h:55, from ./arch/riscv/include/asm/thread_info.h:42, from ./include/linux/thread_info.h:60, from ./include/asm-generic/preempt.h:5, from ./arch/riscv/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:79, from ./arch/riscv/include/asm/percpu.h:8, from ./include/linux/irqflags.h:19, from ./arch/riscv/include/asm/bitops.h:14, from ./include/linux/bitops.h:68, from ./include/linux/log2.h:12, from kernel/bounds.c:13: ./include/linux/find.h: In function 'find_next_bit': ./include/linux/find.h:66:30: error: implicit declaration of function '__ffs' [-Wimplicit-function-declaration] 66 | return val ? __ffs(val) : size; | ^~~~~ Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Acked-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> Link: https://patch.msgid.link/20251216014721.42262-2-cuiyunhui@bytedance.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-05riscv: pgtable: Cleanup useless VA_USER_XXX definitionsGuo Ren (Alibaba DAMO Academy)
These marcos are not used after commit b5b4287accd7 ("riscv: mm: Use hint address in mmap if available"). Cleanup VA_USER_XXX definitions in asm/pgtable.h. Fixes: b5b4287accd7 ("riscv: mm: Use hint address in mmap if available") Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19riscv: Add SBI debug trigger extension and function idsHimanshu Chauhan
Debug trigger extension is an SBI extension to support native debugging in S-mode and VS-mode. This patch adds the extension and the function IDs defined by the extension. Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com> Link: https://patch.msgid.link/20250710125231.653967-2-hchauhan@ventanamicro.com [pjw@kernel.org: updated to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19riscv/atomic.h: use RISCV_FULL_BARRIER in _arch_atomic* function.Zongmin Zhou
Replace the same code with the pre-defined macro RISCV_FULL_BARRIER to simplify the code. Signed-off-by: Zongmin Zhou <zhouzongmin@kylinos.cn> Link: https://patch.msgid.link/20251120095831.64211-1-min_halo@163.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19riscv: hwprobe: export Zilsd and Zclsd ISA extensionsPincheng Wang
Export Zilsd and Zclsd ISA extensions through hwprobe. Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Link: https://patch.msgid.link/20250826162939.1494021-4-pincheng.plct@isrc.iscas.ac.cn [pjw@kernel.org: fixed whitespace; updated to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19riscv: add ISA extension parsing for Zilsd and ZclsdPincheng Wang
Add parsing for Zilsd and Zclsd ISA extensions which were ratified in commit f88abf1 ("Integrating load/store pair for RV32 with the main manual") of the riscv-isa-manual. Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Link: https://patch.msgid.link/20250826162939.1494021-3-pincheng.plct@isrc.iscas.ac.cn [pjw@kernel.org: cleaned up checkpatch issues, whitespace; updated to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19riscv: mm: use xchg() on non-atomic_long_t variables, not atomic_long_xchg()Paul Walmsley
Let's not call atomic_long_xchg() on something that's not an atomic_long_t, and just use xchg() instead. Continues the cleanup from commit 546e42c8c6d94 ("riscv: Use an atomic xchg in pudp_huge_get_and_clear()"), Cc: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19riscv: mm: ptep_get_and_clear(): avoid atomic ops when !CONFIG_SMPPaul Walmsley
When !CONFIG_SMP, there's no need for atomic operations in ptep_get_and_clear(), so, similar to x86, let's not use atomics in this case. Cc: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Paul Walmsley <pjw@kernel.org>