| Age | Commit message (Collapse) | Author |
|
Fedora QA reported the following panic:
BUG: unable to handle page fault for address: 0000000040003e54
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20251119-3.fc43 11/19/2025
RIP: 0010:vmware_hypercall4.constprop.0+0x52/0x90
..
Call Trace:
vmmouse_report_events+0x13e/0x1b0
psmouse_handle_byte+0x15/0x60
ps2_interrupt+0x8a/0xd0
...
because the QEMU VMware mouse emulation is buggy, and clears the top 32
bits of %rdi that the kernel kept a pointer in.
The QEMU vmmouse driver saves and restores the register state in a
"uint32_t data[6];" and as a result restores the state with the high
bits all cleared.
RDI originally contained the value of a valid kernel stack address
(0xff5eeb3240003e54). After the vmware hypercall it now contains
0x40003e54, and we get a page fault as a result when it is dereferenced.
The proper fix would be in QEMU, but this works around the issue in the
kernel to keep old setups working, when old kernels had not happened to
keep any state in %rdi over the hypercall.
In theory this same issue exists for all the hypercalls in the vmmouse
driver; in practice it has only been seen with vmware_hypercall3() and
vmware_hypercall4(). For now, just mark RDI/RSI as clobbered for those
two calls. This should have a minimal effect on code generation overall
as it should be rare for the compiler to want to make RDI/RSI live
across hypercalls.
Reported-by: Justin Forbes <jforbes@fedoraproject.org>
Link: https://lore.kernel.org/all/99a9c69a-fc1a-43b7-8d1e-c42d6493b41f@broadcom.com/
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull ARM fix from Russell King:
"Just one fix for memset64() on big endian 32-bit ARM systems"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9468/1: fix memset64() on big-endian
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Five hotfixes. Two are cc:stable, two are for MM.
All are singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2026-02-04-15-55' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
Documentation: document liveupdate cmdline parameter
mm, shmem: prevent infinite loop on truncate race
mailmap: update Alexander Mikhalitsyn's emails
liveupdate: luo_file: do not clear serialized_data on unfreeze
x86/kfence: fix booting on 32bit non-PAE systems
|
|
Pull KVM fixes from Paolo Bonzini:
- Fix a bug where AVIC is incorrectly inhibited when running with
x2AVIC disabled via module param (or on a system without x2AVIC)
- Fix a dangling device posted IRQs bug by explicitly checking if the
irqfd is still active (on the list) when handling an eventfd signal,
instead of zeroing the irqfd's routing information when the irqfd is
deassigned.
Zeroing the irqfd's routing info causes arm64 and x86's to not
disable posting for the IRQ (kvm_arch_irq_bypass_del_producer() looks
for an MSI), incorrectly leaving the IRQ in posted mode (and leading
to use-after-free and memory leaks on AMD in particular).
This is both the most pressing and scariest, but it's been in -next
for a while.
- Disable FORTIFY_SOURCE for KVM selftests to prevent the compiler from
generating calls to the checked versions of memset() and friends,
which leads to unexpected page faults in guest code due e.g.
__memset_chk@plt not being resolved.
- Explicitly configure the supported XSS capabilities from within
{svm,vmx}_set_cpu_caps() to fix a bug where VMX will compute the
reference VMCS configuration with SHSTK and IBT enabled, but then
compute each CPUs local config with SHSTK and IBT disabled if not all
CET xfeatures are enabled, e.g. if the kernel is built with
X86_KERNEL_IBT=n.
The mismatch in features results in differing nVMX setting, and
ultimately causes kvm-intel.ko to refuse to load with nested=1.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Explicitly configure supported XSS from {svm,vmx}_set_cpu_caps()
KVM: selftests: Add -U_FORTIFY_SOURCE to avoid some unpredictable test failures
KVM: x86: Assert that non-MSI doesn't have bypass vCPU when deleting producer
KVM: Don't clobber irqfd routing type when deassigning irqfd
KVM: SVM: Check vCPU ID against max x2AVIC ID if and only if x2AVIC is enabled
|
|
Final KVM fixes for 6.19:
- Fix a bug where AVIC is incorrectly inhibited when running with x2AVIC
disabled via module param (or on a system without x2AVIC).
- Fix a dangling device posted IRQs bug by explicitly checking if the irqfd is
still active (on the list) when handling an eventfd signal, instead of
zeroing the irqfd's routing information when the irqfd is deassigned.
Zeroing the irqfd's routing info causes arm64 and x86's to not disable
posting for the IRQ (kvm_arch_irq_bypass_del_producer() looks for an MSI),
incorrectly leaving the IRQ in posted mode (and leading to use-after-free
and memory leaks on AMD in particular).
This is both the most pressing and scariest, but it's been in -next for
a while.
- Disable FORTIFY_SOURCE for KVM selftests to prevent the compiler from
generating calls to the checked versions of memset() and friends, which
leads to unexpected page faults in guest code due e.g. __memset_chk@plt
not being resolved.
- Explicitly configure the support XSS from within {svm,vmx}_set_cpu_caps() to
fix a bug where VMX will compute the reference VMCS configuration with SHSTK
and IBT enabled, but then compute each CPUs local config with SHSTK and IBT
disabled if not all CET xfeatures are enabled, e.g. if the kernel is built
with X86_KERNEL_IBT=n. The mismatch in features results in differing nVMX
setting, and ultimately causes kvm-intel.ko to refuse to load with nested=1.
|
|
The original patch inverted the PTE unconditionally to avoid
L1TF-vulnerable PTEs, but Linux doesn't make this adjustment in 2-level
paging.
Adjust the logic to use the flip_protnone_guard() helper, which is a nop
on 2-level paging but inverts the address bits in all other paging modes.
This doesn't matter for the Xen aspect of the original change. Linux no
longer supports running 32bit PV under Xen, and Xen doesn't support
running any 32bit PV guests without using PAE paging.
Link: https://lkml.kernel.org/r/20260126211046.2096622-1-andrew.cooper3@citrix.com
Fixes: b505f1944535 ("x86/kfence: avoid writing L1TF-vulnerable PTEs")
Reported-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Closes: https://lore.kernel.org/lkml/CAKFNMokwjw68ubYQM9WkzOuH51wLznHpEOMSqtMoV1Rn9JV_gw@mail.gmail.com/
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
On big-endian systems the 32-bit low and high halves need to be swapped
for the underlying assembly implementation to work correctly.
Fixes: fd1d362600e2 ("ARM: implement memset32 & memset64")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- Correct the RISC-V compat.h COMPAT_UTS_MACHINE architecture name
- Avoid printing a false warning message on kernels with the SiFive and
MIPS errata compiled in
- Address a few warnings generated by sparse in the signal handling
code
- Fix a comment typo
* tag 'riscv-for-linus-6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: compat: fix COMPAT_UTS_MACHINE definition
errata/sifive: remove unreliable warn_miss_errata
riscv: fix minor typo in syscall.h comment
riscv: signal: fix some warnings reported by sparse
|
|
Explicitly configure KVM's supported XSS as part of each vendor's setup
flow to fix a bug where clearing SHSTK and IBT in kvm_cpu_caps, e.g. due
to lack of CET XFEATURE support, makes kvm-intel.ko unloadable when nested
VMX is enabled, i.e. when nested=1. The late clearing results in
nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
ultimately leading to a mismatched VMCS config due to the reference config
having the CET bits set, but every CPU's "local" config having the bits
cleared.
Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
kvm_x86_vendor_init(), before calling into vendor code, and not referenced
between ops->hardware_setup() and their current/old location.
Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
Cc: stable@vger.kernel.org
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: John Allen <john.allen@amd.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Link: https://patch.msgid.link/20260128014310.3255561-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes. 9 are cc:stable, 12 are for MM.
There's a patch series from Pratyush Yadav which fixes a few things in
the new-in-6.19 LUO memfd code.
Plus the usual shower of singletons - please see the changelogs for
details"
* tag 'mm-hotfixes-stable-2026-01-29-09-41' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
vmcoreinfo: make hwerr_data visible for debugging
mm/zone_device: reinitialize large zone device private folios
mm/mm_init: don't cond_resched() in deferred_init_memmap_chunk() if called from deferred_grow_zone()
mm/kfence: randomize the freelist on initialization
kho: kho_preserve_vmalloc(): don't return 0 when ENOMEM
kho: init alloc tags when restoring pages from reserved memory
mm: memfd_luo: restore and free memfd_luo_ser on failure
mm: memfd_luo: use memfd_alloc_file() instead of shmem_file_setup()
memfd: export alloc_file()
flex_proportions: make fprop_new_period() hardirq safe
mailmap: add entry for Viacheslav Bocharov
mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
mm, swap: restore swap_space attr aviod kernel panic
mm/kasan: fix KASAN poisoning in vrealloc()
mm/shmem, swap: fix race of truncate and swap entry split
|
|
The COMPAT_UTS_MACHINE for riscv was incorrectly defined as "riscv".
Change it to "riscv32" to reflect the correct 32-bit compat name.
Fixes: 06d0e3723647 ("riscv: compat: Add basic compat data type implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
Reviewed-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org>
Link: https://patch.msgid.link/20260127190711.2264664-1-gaohan@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Mark the Meson GPIO controller as sleeping to avoid a
context splat
- Fix up the I2S2 and SWR TX group settings in the
Qualcomm SM8350 LPASS pin controller, and implement the
proper .get_direction() callback
- Fix a pin typo in the TG1520 pin controller
- Fix a group name in the Marvell armada 3710 XB pin
controller that got mangled in a DT schema rewrite
* tag 'pinctrl-v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
dt-bindings: pinctrl: marvell,armada3710-xb-pinctrl: fix 'usb32_drvvbus0' group name
pinctrl: lpass-lpi: implement .get_direction() for the GPIO driver
pinctrl: th1520: Fix typo
pinctrl: qcom: sm8350-lpass-lpi: Merge with SC7280 to fix I2S2 and SWR TX pins
pinctrl: meson: mark the GPIO controller as sleeping
|
|
Reinitialize metadata for large zone device private folios in
zone_device_page_init prior to creating a higher-order zone device private
folio. This step is necessary when the folio's order changes dynamically
between zone_device_page_init calls to avoid building a corrupt folio. As
part of the metadata reinitialization, the dev_pagemap must be passed in
from the caller because the pgmap stored in the folio page may have been
overwritten with a compound head.
Without this fix, individual pages could have invalid pgmap fields and
flags (with PG_locked being notably problematic) due to prior different
order allocations, which can, and will, result in kernel crashes.
Link: https://lkml.kernel.org/r/20260116111325.1736137-2-francois.dugast@intel.com
Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When both the SiFive and MIPS errata are enabled then
sifive_errata_patch_func emits a wrong and misleading warning claiming
that the SiFive errata haven't been applied. This happens because
sifive_errata_patch_func is being called twice, once for the kernel image
and once for the vdso image. The vdso image has alternative entries
for the MIPS errata, but none for the SiFive errata.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Link: https://patch.msgid.link/mvmv7i8q8gg.fsf@suse.de
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Some developers may be confused because RISC-V does not have
a register named r0. Also, orig_r0 is not available in pt_regs structure,
which is specific to riscv. So we had better fix this minor typo.
Signed-off-by: Austin Kim <austin.kim@lge.com>
Link: https://patch.msgid.link/aW3Z4zTBvGJpk7a7@adminpc-PowerEdge-R7525
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Clean up a few warnings reported by sparse in
arch/riscv/kernel/signal.c. These come from code that was added
recently; they were missed when I initially reviewed the patch.
Fixes: 818d78ba1b3f ("riscv: signal: abstract header saving for setup_sigcontext")
Cc: Andy Chiu <andybnac@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601171848.ydLTJYrz-lkp@intel.com/
[pjw@kernel.org: updated to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"The notable changes here are the three RISC-V timer compare register
update sequence patches. These only apply to RV32 systems and are
related to the 64-bit timer compare value being split across two
separate 32-bit registers.
We weren't using the appropriate three-write sequence, documented in
the RISC-V ISA specifications, to avoid spurious timer interrupts
during the update sequence; so, these patches now use the recommended
sequence.
This doesn't affect 64-bit RISC-V systems, since the timer compare
value fits inside a single register and can be updated with a single
write.
- Fix the RISC-V timer compare register update sequence on RV32
systems to use the recommended sequence in the RISC-V ISA manual
This avoids spurious interrupts during updates
- Add a dependence on the new CONFIG_CACHEMAINT_FOR_DMA Kconfig
symbol for Renesas and StarFive RISC-V SoCs
- Add a temporary workaround for a Clang compiler bug caused by using
asm_goto_output for get_user()
- Clarify our documentation to specifically state a particular ISA
specification version for a chapter number reference"
* tag 'riscv-for-linus-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Add intermediate cast to 'unsigned long' in __get_user_asm
riscv: Use 64-bit variable for output in __get_user_asm
soc: renesas: Fix missing dependency on new CONFIG_CACHEMAINT_FOR_DMA
riscv: ERRATA_STARFIVE_JH7100: Fix missing dependency on new CONFIG_CACHEMAINT_FOR_DMA
riscv: suspend: Fix stimecmp update hazard on RV32
riscv: kvm: Fix vstimecmp update hazard on RV32
riscv: clocksource: Fix stimecmp update hazard on RV32
Documentation: riscv: uabi: Clarify ISA spec version for canonical order
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events fixes from Ingo Molnar:
- Fix mmap_count warning & bug when creating a group member event
with the PERF_FLAG_FD_OUTPUT flag
- Disable the sample period == 1 branch events BTS optimization
on guests, because BTS is not virtualized
* tag 'perf-urgent-2026-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Do not enable BTS for guests
perf: Fix refcount warning on event->mmap_count increment
|
|
Pull arm64 kvm fixes from Paolo Bonzini:
- Ensure early return semantics are preserved for pKVM fault handlers
- Fix case where the kernel runs with the guest's PAN value when
CONFIG_ARM64_PAN is not set
- Make stage-1 walks to set the access flag respect the access
permission of the underlying stage-2, when enabled
- Propagate computed FGT values to the pKVM view of the vCPU at
vcpu_load()
- Correctly program PXN and UXN privilege bits for hVHE's stage-1 page
tables
- Check that the VM is actually using VGICv3 before accessing the GICv3
CPU interface
- Delete some unused code
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers
KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit
KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF
KVM: arm64: Remove unused vcpu_{clear,set}_wfx_traps()
KVM: arm64: Remove unused parameter in synchronize_vcpu_pstate()
KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp()
KVM: arm64: Inject UNDEF for a register trap without accessor
KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load
KVM: arm64: Fix EL2 S1 XN handling for hVHE setups
KVM: arm64: gic: Check for vGICv3 when clearing TWI
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.19
- Ensure early return semantics are preserved for pKVM fault handlers
- Fix case where the kernel runs with the guest's PAN value when
CONFIG_ARM64_PAN is not set
- Make stage-1 walks to set the access flag respect the access
permission of the underlying stage-2, when enabled
- Propagate computed FGT values to the pKVM view of the vCPU at
vcpu_load()
- Correctly program PXN and UXN privilege bits for hVHE's stage-1 page
tables
- Check that the VM is actually using VGICv3 before accessing the GICv3
CPU interface
- Delete some unused code
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Add $(DISABLE_KSTACK_ERASE) to vdso compile flags to fix compile
errors with old gcc versions
- Fix path to s390 chacha implementation in vdso selftests, after
vdso64 has been renamed to vdso
- Fix off-by-one bug in APQN limit calculation
- Discard .modinfo section from decompressor image to fix SecureBoot
* tag 's390-6.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer
s390/ap: Fix wrong APQN fill calculation
selftests: vDSO: getrandom: Fix path to s390 chacha implementation
s390/vdso: Disable kstack erase
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- A set of fixes for FPSIMD/SVE/SME state management (around signal
handling and ptrace) where a task can be placed in an invalid state
- __nocfi added to swsusp_arch_resume() to avoid a data abort on
resuming from hibernate
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Set __nocfi on swsusp_arch_resume()
arm64/fpsimd: signal: Fix restoration of SVE context
arm64/fpsimd: signal: Allocate SSVE storage when restoring ZA
arm64/fpsimd: ptrace: Fix SVE writes on !SME systems
|
|
A DABT is reported[1] on an android based system when resume from hiberate.
This happens because swsusp_arch_suspend_exit() is marked with SYM_CODE_*()
and does not have a CFI hash, but swsusp_arch_resume() will attempt to
verify the CFI hash when calling a copy of swsusp_arch_suspend_exit().
Given that there's an existing requirement that the entrypoint to
swsusp_arch_suspend_exit() is the first byte of the .hibernate_exit.text
section, we cannot fix this by marking swsusp_arch_suspend_exit() with
SYM_FUNC_*(). The simplest fix for now is to disable the CFI check in
swsusp_arch_resume().
Mark swsusp_arch_resume() as __nocfi to disable the CFI check.
[1]
[ 22.991934][ T1] Unable to handle kernel paging request at virtual address 0000000109170ffc
[ 22.991934][ T1] Mem abort info:
[ 22.991934][ T1] ESR = 0x0000000096000007
[ 22.991934][ T1] EC = 0x25: DABT (current EL), IL = 32 bits
[ 22.991934][ T1] SET = 0, FnV = 0
[ 22.991934][ T1] EA = 0, S1PTW = 0
[ 22.991934][ T1] FSC = 0x07: level 3 translation fault
[ 22.991934][ T1] Data abort info:
[ 22.991934][ T1] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[ 22.991934][ T1] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 22.991934][ T1] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 22.991934][ T1] [0000000109170ffc] user address but active_mm is swapper
[ 22.991934][ T1] Internal error: Oops: 0000000096000007 [#1] PREEMPT SMP
[ 22.991934][ T1] Dumping ftrace buffer:
[ 22.991934][ T1] (ftrace buffer empty)
[ 22.991934][ T1] Modules linked in:
[ 22.991934][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.98-android15-8-g0b1d2aee7fc3-dirty-4k #1 688c7060a825a3ac418fe53881730b355915a419
[ 22.991934][ T1] Hardware name: Unisoc UMS9360-base Board (DT)
[ 22.991934][ T1] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 22.991934][ T1] pc : swsusp_arch_resume+0x2ac/0x344
[ 22.991934][ T1] lr : swsusp_arch_resume+0x294/0x344
[ 22.991934][ T1] sp : ffffffc08006b960
[ 22.991934][ T1] x29: ffffffc08006b9c0 x28: 0000000000000000 x27: 0000000000000000
[ 22.991934][ T1] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000820
[ 22.991934][ T1] x23: ffffffd0817e3000 x22: ffffffd0817e3000 x21: 0000000000000000
[ 22.991934][ T1] x20: ffffff8089171000 x19: ffffffd08252c8c8 x18: ffffffc080061058
[ 22.991934][ T1] x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 0000000000000004
[ 22.991934][ T1] x14: ffffff8178c88000 x13: 0000000000000006 x12: 0000000000000000
[ 22.991934][ T1] x11: 0000000000000015 x10: 0000000000000001 x9 : ffffffd082533000
[ 22.991934][ T1] x8 : 0000000109171000 x7 : 205b5d3433393139 x6 : 392e32322020205b
[ 22.991934][ T1] x5 : 000000010916f000 x4 : 000000008164b000 x3 : ffffff808a4e0530
[ 22.991934][ T1] x2 : ffffffd08058e784 x1 : 0000000082326000 x0 : 000000010a283000
[ 22.991934][ T1] Call trace:
[ 22.991934][ T1] swsusp_arch_resume+0x2ac/0x344
[ 22.991934][ T1] hibernation_restore+0x158/0x18c
[ 22.991934][ T1] load_image_and_restore+0xb0/0xec
[ 22.991934][ T1] software_resume+0xf4/0x19c
[ 22.991934][ T1] software_resume_initcall+0x34/0x78
[ 22.991934][ T1] do_one_initcall+0xe8/0x370
[ 22.991934][ T1] do_initcall_level+0xc8/0x19c
[ 22.991934][ T1] do_initcalls+0x70/0xc0
[ 22.991934][ T1] do_basic_setup+0x1c/0x28
[ 22.991934][ T1] kernel_init_freeable+0xe0/0x148
[ 22.991934][ T1] kernel_init+0x20/0x1a8
[ 22.991934][ T1] ret_from_fork+0x10/0x20
[ 22.991934][ T1] Code: a9400a61 f94013e0 f9438923 f9400a64 (b85fc110)
Co-developed-by: Jeson Gao <jeson.gao@unisoc.com>
Signed-off-by: Jeson Gao <jeson.gao@unisoc.com>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
[catalin.marinas@arm.com: commit log updated by Mark Rutland]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
After commit bdce162f2e57 ("riscv: Use 64-bit variable for output in
__get_user_asm"), there is a warning when building for 32-bit RISC-V:
In file included from include/linux/uaccess.h:13,
from include/linux/sched/task.h:13,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/mm.h:36,
from include/linux/migrate.h:5,
from mm/migrate.c:16:
mm/migrate.c: In function 'do_pages_move':
arch/riscv/include/asm/uaccess.h:115:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
115 | (x) = (__typeof__(x))__tmp; \
| ^
arch/riscv/include/asm/uaccess.h:198:17: note: in expansion of macro '__get_user_asm'
198 | __get_user_asm("lb", (x), __gu_ptr, label); \
| ^~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:218:9: note: in expansion of macro '__get_user_nocheck'
218 | __get_user_nocheck(x, ptr, __gu_failed); \
| ^~~~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:255:9: note: in expansion of macro '__get_user_error'
255 | __get_user_error(__gu_val, __gu_ptr, __gu_err); \
| ^~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:285:17: note: in expansion of macro '__get_user'
285 | __get_user((x), __p) : \
| ^~~~~~~~~~
mm/migrate.c:2358:29: note: in expansion of macro 'get_user'
2358 | if (get_user(p, pages + i))
| ^~~~~~~~
Add an intermediate cast to 'unsigned long', which is guaranteed to be the same
width as a pointer, before the cast to the type of the output variable to clear
up the warning.
Fixes: bdce162f2e57 ("riscv: Use 64-bit variable for output in __get_user_asm")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601210526.OT45dlOZ-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260121-riscv-fix-int-to-pointer-cast-v1-1-b83eebe57c76@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
There's a big comment in the x86 do_page_fault() about our interrupt
disabling code:
* User address page fault handling might have reenabled
* interrupts. Fixing up all potential exit points of
* do_user_addr_fault() and its leaf functions is just not
* doable w/o creating an unholy mess or turning the code
* upside down.
but it turns out that comment is subtly wrong, and the code as a result
is also wrong.
Because it's certainly true that we may have re-enabled interrupts when
handling user page faults. And it's most certainly true that we don't
want to bother fixing up all the cases.
But what isn't true is that it's limited to user address page faults.
The confusion stems from the fact that we have logic here that depends
on the address range of the access, but other code then depends on the
_context_ the access was done in. The two are not related, even though
both of them are about user-vs-kernel.
In other words, both user and kernel addresses can cause interrupts to
have been enabled (eg when __bad_area_nosemaphore() gets called for user
accesses to kernel addresses). As a result we should make sure to
disable interrupts again regardless of the address range before
returning to the low-level fault handling code.
The __bad_area_nosemaphore() code actually did disable interrupts again
after enabling them, just not consistently. Ironically, as noted in the
original comment, fixing up all the cases is just not worth it, when the
simple solution is to just do it unconditionally in one single place.
So remove the incomplete case that unsuccessfully tried to do what the
comment said was "not doable" in commit ca4c6a9858c2 ("x86/traps: Make
interrupt enable/disable symmetric in C code"), and just make it do the
simple and straightforward thing.
Signed-off-by: Cedric Xing <cedric.xing@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Fixes: ca4c6a9858c2 ("x86/traps: Make interrupt enable/disable symmetric in C code")
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since commit 3e86e4d74c04 ("kbuild: keep .modinfo section in
vmlinux.unstripped") the .modinfo section which has SHF_ALLOC ends up
in bzImage after the SecureBoot trailer. This breaks SecureBoot because
the bootloader can no longer find the SecureBoot trailer with kernel's
signature at the expected location in bzImage. To fix the bug,
move discarded sections before the ELF_DETAILS macro and discard
the .modinfo section which is not needed by the decompressor.
Fixes: 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped")
Cc: stable@vger.kernel.org
Suggested-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
When SME is supported, Restoring SVE signal context can go wrong in a
few ways, including placing the task into an invalid state where the
kernel may read from out-of-bounds memory (and may potentially take a
fatal fault) and/or may kill the task with a SIGKILL.
(1) Restoring a context with SVE_SIG_FLAG_SM set can place the task into
an invalid state where SVCR.SM is set (and sve_state is non-NULL)
but TIF_SME is clear, consequently resuting in out-of-bounds memory
reads and/or killing the task with SIGKILL.
This can only occur in unusual (but legitimate) cases where the SVE
signal context has either been modified by userspace or was saved in
the context of another task (e.g. as with CRIU), as otherwise the
presence of an SVE signal context with SVE_SIG_FLAG_SM implies that
TIF_SME is already set.
While in this state, task_fpsimd_load() will NOT configure SMCR_ELx
(leaving some arbitrary value configured in hardware) before
restoring SVCR and attempting to restore the streaming mode SVE
registers from memory via sve_load_state(). As the value of
SMCR_ELx.LEN may be larger than the task's streaming SVE vector
length, this may read memory outside of the task's allocated
sve_state, reading unrelated data and/or triggering a fault.
While this can result in secrets being loaded into streaming SVE
registers, these values are never exposed. As TIF_SME is clear,
fpsimd_bind_task_to_cpu() will configure CPACR_ELx.SMEN to trap EL0
accesses to streaming mode SVE registers, so these cannot be
accessed directly at EL0. As fpsimd_save_user_state() verifies the
live vector length before saving (S)SVE state to memory, no secret
values can be saved back to memory (and hence cannot be observed via
ptrace, signals, etc).
When the live vector length doesn't match the expected vector length
for the task, fpsimd_save_user_state() will send a fatal SIGKILL
signal to the task. Hence the task may be killed after executing
userspace for some period of time.
(2) Restoring a context with SVE_SIG_FLAG_SM clear does not clear the
task's SVCR.SM. If SVCR.SM was set prior to restoring the context,
then the task will be left in streaming mode unexpectedly, and some
register state will be combined inconsistently, though the task will
be left in legitimate state from the kernel's PoV.
This can only occur in unusual (but legitimate) cases where ptrace
has been used to set SVCR.SM after entry to the sigreturn syscall,
as syscall entry clears SVCR.SM.
In these cases, the the provided SVE register data will be loaded
into the task's sve_state using the non-streaming SVE vector length
and the FPSIMD registers will be merged into this using the
streaming SVE vector length.
Fix (1) by setting TIF_SME when setting SVCR.SM. This also requires
ensuring that the task's sme_state has been allocated, but as this could
contain live ZA state, it should not be zeroed. Fix (2) by clearing
SVCR.SM when restoring a SVE signal context with SVE_SIG_FLAG_SM clear.
For consistency, I've pulled the manipulation of SVCR, TIF_SVE, TIF_SME,
and fp_type earlier, immediately after the allocation of
sve_state/sme_state, before the restore of the actual register state.
This makes it easier to ensure that these are always modified
consistently, even if a fault is taken while reading the register data
from the signal context. I do not expect any software to depend on the
exact state restored when a fault is taken while reading the context.
Fixes: 85ed24dad290 ("arm64/sme: Implement streaming SVE signal handling")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The code to restore a ZA context doesn't attempt to allocate the task's
sve_state before setting TIF_SME. Consequently, restoring a ZA context
can place a task into an invalid state where TIF_SME is set but the
task's sve_state is NULL.
In legitimate but uncommon cases where the ZA signal context was NOT
created by the kernel in the context of the same task (e.g. if the task
is saved/restored with something like CRIU), we have no guarantee that
sve_state had been allocated previously. In these cases, userspace can
enter streaming mode without trapping while sve_state is NULL, causing a
later NULL pointer dereference when the kernel attempts to store the
register state:
| # ./sigreturn-za
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
| Mem abort info:
| ESR = 0x0000000096000046
| EC = 0x25: DABT (current EL), IL = 32 bits
| SET = 0, FnV = 0
| EA = 0, S1PTW = 0
| FSC = 0x06: level 2 translation fault
| Data abort info:
| ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
| CM = 0, WnR = 1, TnD = 0, TagAccess = 0
| GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
| user pgtable: 4k pages, 52-bit VAs, pgdp=0000000101f47c00
| [0000000000000000] pgd=08000001021d8403, p4d=0800000102274403, pud=0800000102275403, pmd=0000000000000000
| Internal error: Oops: 0000000096000046 [#1] SMP
| Modules linked in:
| CPU: 0 UID: 0 PID: 153 Comm: sigreturn-za Not tainted 6.19.0-rc1 #1 PREEMPT
| Hardware name: linux,dummy-virt (DT)
| pstate: 214000c9 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
| pc : sve_save_state+0x4/0xf0
| lr : fpsimd_save_user_state+0xb0/0x1c0
| sp : ffff80008070bcc0
| x29: ffff80008070bcc0 x28: fff00000c1ca4c40 x27: 63cfa172fb5cf658
| x26: fff00000c1ca5228 x25: 0000000000000000 x24: 0000000000000000
| x23: 0000000000000000 x22: fff00000c1ca4c40 x21: fff00000c1ca4c40
| x20: 0000000000000020 x19: fff00000ff6900f0 x18: 0000000000000000
| x17: fff05e8e0311f000 x16: 0000000000000000 x15: 028fca8f3bdaf21c
| x14: 0000000000000212 x13: fff00000c0209f10 x12: 0000000000000020
| x11: 0000000000200b20 x10: 0000000000000000 x9 : fff00000ff69dcc0
| x8 : 00000000000003f2 x7 : 0000000000000001 x6 : fff00000c1ca5b48
| x5 : fff05e8e0311f000 x4 : 0000000008000000 x3 : 0000000000000000
| x2 : 0000000000000001 x1 : fff00000c1ca5970 x0 : 0000000000000440
| Call trace:
| sve_save_state+0x4/0xf0 (P)
| fpsimd_thread_switch+0x48/0x198
| __switch_to+0x20/0x1c0
| __schedule+0x36c/0xce0
| schedule+0x34/0x11c
| exit_to_user_mode_loop+0x124/0x188
| el0_interrupt+0xc8/0xd8
| __el0_irq_handler_common+0x18/0x24
| el0t_64_irq_handler+0x10/0x1c
| el0t_64_irq+0x198/0x19c
| Code: 54000040 d51b4408 d65f03c0 d503245f (e5bb5800)
| ---[ end trace 0000000000000000 ]---
Fix this by having restore_za_context() ensure that the task's sve_state
is allocated, matching what we do when taking an SME trap. Any live
SVE/SSVE state (which is restored earlier from a separate signal
context) must be preserved, and hence this is not zeroed.
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
When SVE is supported but SME is not supported, a ptrace write to the
NT_ARM_SVE regset can place the tracee into an invalid state where
(non-streaming) SVE register data is stored in FP_STATE_SVE format but
TIF_SVE is clear. This can result in a later warning from
fpsimd_restore_current_state(), e.g.
WARNING: CPU: 0 PID: 7214 at arch/arm64/kernel/fpsimd.c:383 fpsimd_restore_current_state+0x50c/0x748
When this happens, fpsimd_restore_current_state() will set TIF_SVE,
placing the task into the correct state. This occurs before any other
check of TIF_SVE can possibly occur, as other checks of TIF_SVE only
happen while the FPSIMD/SVE/SME state is live. Thus, aside from the
warning, there is no functional issue.
This bug was introduced during rework to error handling in commit:
9f8bf718f2923 ("arm64/fpsimd: ptrace: Gracefully handle errors")
... where the setting of TIF_SVE was moved into a block which is only
executed when system_supports_sme() is true.
Fix this by removing the system_supports_sme() check. This ensures that
TIF_SVE is set for (SVE-formatted) writes to NT_ARM_SVE, at the cost of
unconditionally manipulating the tracee's saved svcr value. The
manipulation of svcr is benign and inexpensive, and we already do
similar elsewhere (e.g. during signal handling), so I don't think it's
worth guarding this with system_supports_sme() checks.
Aside from the above, there is no functional change. The 'type' argument
to sve_set_common() is only set to ARM64_VEC_SME (in ssve_set())) when
system_supports_sme(), so the ARM64_VEC_SME case in the switch statement
is still unreachable when !system_supports_sme(). When
CONFIG_ARM64_SME=n, the only caller of sve_set_common() is sve_set(),
and the compiler can constant-fold for the case where type is
ARM64_VEC_SVE, removing the logic for other cases.
Reported-by: syzbot+d4ab35af21e99d07ce67@syzkaller.appspotmail.com
Fixes: 9f8bf718f292 ("arm64/fpsimd: ptrace: Gracefully handle errors")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The main changes are devicetree updates for qualcomm and rockchips
arm64 platforms, fixing minor mistakes in SoC and board specific
settings:
- GPIO settings for Pinephone Pro buttons
- Register ranges for rk3576 GPU
- Power domains on sc8280xp
- Clocks on qcom talos
- dtc warnings for extraneous properties, nonstandard node names and
undocument identifiers
The Tegra210 platform gets a single revert for a devicetree change
that caused a 6.19 regression.
On 32-bit Arm, we have trivial fixes for Microchip SAMA7 devicetree
files and NPCM Kconfig, as well as Andrew Jeffery being officially
listed as MAINTAINER for NPCM.
A single driver fix is for Qualcomm RPMHD power domains, bringing the
driver up to date with a devicetree change that added additional power
domains to be enabled"
* tag 'soc-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (27 commits)
MAINTAINERS: Add Andrew as M: to ARM/NUVOTON NPCM ARCHITECTURE
MAINTAINERS: update email address for Yixun Lan
Revert "arm64: tegra: Add interconnect properties for Tegra210"
arm64: dts: rockchip: Drop unsupported properties
arm64: dts: rockchip: Fix gpio pinctrl node names
arm64: dts: rockchip: Fix pinctrl property typo on rk3326-odroid-go3
arm64: dts: rockchip: Drop "sitronix,st7789v" fallback compatible from rk3568-wolfvision
ARM: dts: microchip: sama7d65: fix size-cells property for i2c3
ARM: dts: microchip: sama7d65: fix the ranges property for flx9
arm: npcm: drop unused Kconfig ERRATA symbol
arm64: dts: rockchip: Fix wrong register range of rk3576 gpu
arm64: dts: rockchip: Configure MCLK for analog sound on NanoPi M5
arm64: dts: rockchip: Fix headphones widget name on NanoPi M5
ARM: dts: microchip: lan966x: Fix the access to the PHYs for pcb8290
arm64: dts: rockchip: remove redundant max-link-speed from nanopi-r4s
arm64: dts: rockchip: remove dangerous max-link-speed from helios64
arm64: dts: rockchip: fix unit-address for RK3588 NPU's core1 and core2's IOMMU
arm64: dts: rockchip: Fix wifi interrupts flag on Sakura Pi RK3308B
arm64: dts: qcom: sm8650: Fix compile warnings in USB controller node
arm64: dts: qcom: sm8550: Fix compile warnings in USB controller node
...
|
|
By default when users program perf to sample branch instructions
(PERF_COUNT_HW_BRANCH_INSTRUCTIONS) with a sample period of 1, perf
interprets this as a special case and enables BTS (Branch Trace Store)
as an optimization to avoid taking an interrupt on every branch.
Since BTS doesn't virtualize, this optimization doesn't make sense when
the request originates from a guest. Add an additional check that
prevents this optimization for virtualized events (exclude_host).
Reported-by: Jan H. Schönherr <jschoenh@amazon.de>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20251211183604.868641-1-sieberf@amazon.com
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 DeviceTree fixes for v6.19
Add missing power-domains to the SC8280XP RPM power-domain and ensure
these are voted for from the remoteproc instances while powering them
up.
Clear a couple of DeviceTree validation warnings in SM8550 and SM8650
USB controller nodes.
Specify the correct display panel on the OnePlus 6.
Correct the UFS clock mapping on Talos, to ensure UFS is properly
clocked.
Add Abel's old emails address to .mailmap.
* tag 'qcom-arm64-fixes-for-6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: sm8650: Fix compile warnings in USB controller node
arm64: dts: qcom: sm8550: Fix compile warnings in USB controller node
arm64: dts: qcom: sc8280xp: Add missing VDD_MXC links
pmdomain: qcom: rpmhpd: Add MXC to SC8280XP
dt-bindings: power: qcom,rpmpd: Add SC8280XP_MXC_AO
arm64: dts qcom: sdm845-oneplus-enchilada: Specify panel name within the compatible
mailmap: Update email address for Abel Vesa
arm64: dts: qcom: talos: Correct UFS clocks ordering
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bmc/linux into arm/fixes
Nuvoton NPCM Arm fixes for v6.19
Just the one change from Randy dropping an unused Kconfig symbol.
* tag 'nuvoton-arm-6.19-fixes-0' of https://git.kernel.org/pub/scm/linux/kernel/git/bmc/linux:
arm: npcm: drop unused Kconfig ERRATA symbol
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
For native, the choice of PTE is fine. There's real memory backing the
non-present PTE. However, for XenPV, Xen complains:
(XEN) d1 L1TF-vulnerable L1e 8010000018200066 - Shadowing
To explain, some background on XenPV pagetables:
Xen PV guests are control their own pagetables; they choose the new
PTE value, and use hypercalls to make changes so Xen can audit for
safety.
In addition to a regular reference count, Xen also maintains a type
reference count. e.g. SegDesc (referenced by vGDT/vLDT), Writable
(referenced with _PAGE_RW) or L{1..4} (referenced by vCR3 or a lower
pagetable level). This is in order to prevent e.g. a page being
inserted into the pagetables for which the guest has a writable mapping.
For non-present mappings, all other bits become software accessible,
and typically contain metadata rather a real frame address. There is
nothing that a reference count could sensibly be tied to. As such, even
if Xen could recognise the address as currently safe, nothing would
prevent that frame from changing owner to another VM in the future.
When Xen detects a PV guest writing a L1TF-PTE, it responds by
activating shadow paging. This is normally only used for the live phase
of migration, and comes with a reasonable overhead.
KFENCE only cares about getting #PF to catch wild accesses; it doesn't
care about the value for non-present mappings. Use a fully inverted PTE,
to avoid hitting the slow path when running under Xen.
While adjusting the logic, take the opportunity to skip all actions if the
PTE is already in the right state, half the number PVOps callouts, and
skip TLB maintenance on a !P -> P transition which benefits non-Xen cases
too.
Link: https://lkml.kernel.org/r/20260106180426.710013-1-andrew.cooper3@citrix.com
Fixes: 1dc0da6e9ec0 ("x86, kfence: enable KFENCE for x86")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Qualcomm SC7280 and SM8350 SoCs have slightly different LPASS audio
blocks (v9.4.5 and v9.2), however the LPASS LPI pin controllers are
exactly the same. The driver for SM8350 has two issues, which can be
fixed by simply moving over to SC7280 driver which has them correct:
1. "i2s2_data_groups" listed twice GPIO12, but should have both GPIO12
and GPIO13,
2. "swr_tx_data_groups" contained GPIO5 for "swr_tx_data2" function, but
that function is also available on GPIO14, thus listing it twice is
not necessary. OTOH, GPIO5 has also "swr_rx_data1", so selecting
swr_rx_data function should not block the TX one.
Fixes: be9f6d56381d ("pinctrl: qcom: sm8350-lpass-lpi: add SM8350 LPASS TLMM")
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix resctrl initialization on Hygon CPUs
- Fix resctrl memory bandwidth counters on Hygon CPUs
- Fix x86 self-tests build bug
* tag 'x86-urgent-2026-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86: Add selftests include path for kselftest.h after centralization
x86/resctrl: Fix memory bandwidth counter width for Hygon
x86/resctrl: Add missing resctrl initialization for Hygon
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Remove redundant code in head.S, fix PMU counter allocation for mixed-
type event groups, fix a lot of dts build warnings, and fix kvm_device
memory leaks"
* tag 'loongarch-fixes-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Fix kvm_device leak in kvm_pch_pic_destroy()
LoongArch: KVM: Fix kvm_device leak in kvm_eiointc_destroy()
LoongArch: KVM: Fix kvm_device leak in kvm_ipi_destroy()
LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names
LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells
LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells
LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells
LoongArch: dts: Describe PCI sideband IRQ through interrupt-extended
LoongArch: Fix PMU counter allocation for mixed-type event groups
LoongArch: Remove redundant code in head.S
|
|
For some reason gcc 8, 9, 10, and 11 generate a dynamic relocation in
vdso.so.dbg if CONFIG_KSTACK_ERASE is enabled:
>> arch/s390/kernel/vdso/vdso.so.dbg: dynamic relocations are not supported
make[3]: *** [arch/s390/kernel/vdso/Makefile:54: arch/s390/kernel/vdso/vdso.so.dbg] Error 1
$ readelf -rW arch/s390/kernel/vdso/vdso.so.dbg
Relocation section '.rela.dyn' at offset 0x15c0 contains 1 entry:
Offset Info Type Symbol's Value Symbol's Name + Addend
00000000000015f0 000000010000000b R_390_JMP_SLOT 0000000000000000 __sanitizer_cov_stack_depth + 0
Add $(DISABLE_KSTACK_ERASE) to vdso compile flags to fix this.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202601070505.xQcLr5KV-lkp@intel.com/
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_pch_pic_destroy() is not currently doing this, that
would lead to a memory leak.
So, fix it.
Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_eiointc_destroy() is not currently doing this, that
would lead to a memory leak.
So, fix it.
Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
In kvm_ioctl_create_device(), kvm_device has allocated memory,
kvm_device->destroy() seems to be supposed to free its kvm_device
struct, but kvm_ipi_destroy() is not currently doing this, that
would lead to a memory leak.
So, fix it.
Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The binding wants the node to be named "i2c-number", but those are named
"i2c-gpio-number" instead.
Thus rename those to i2c-0, i2c-1 to adhere to the binding and suppress
dtbs_check warnings.
Cc: stable@vger.kernel.org
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add missing address-cells 0 to the Local I/O, Extend I/O and PCH-PIC
Interrupt Controller node to silence W=1 warning:
loongson-2k2000.dtsi:364.5-49: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
Missing property '#address-cells' in node /bus@10000000/interrupt-controller@10000000, using 0 as fallback
Value '0' is correct because:
1. The LIO/EIO/PCH interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add missing address-cells 0 to the Local I/O interrupt controller node
to silence W=1 warning:
loongson-2k1000.dtsi:498.5-55: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe01440, using 0 as fallback
Value '0' is correct because:
1. The Local I/O interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add missing address-cells 0 to the Local I/O and Extend I/O interrupt
controller node to silence W=1 warning:
loongson-2k0500.dtsi:513.5-51: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@0,0:interrupt-map:
Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe11600, using 0 as fallback
Value '0' is correct because:
1. The Local I/O & Extend I/O interrupt controller do not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
SoC integrated peripherals on LS2K1000 and LS2K2000 could be discovered
as PCI devices, but require sideband interrupts to function, which are
previously described by interrupts and interrupt-parent properties.
However, pci/pci-device.yaml allows interrupts property to only specify
PCI INTx interrupts, not sideband ones. Convert these devices to use
interrupt-extended property, which describes sideband interrupts used by
PCI devices since dt-schema commit e6ea659d2baa ("schemas: pci-device:
Allow interrupts-extended for sideband interrupts"), eliminating
dtbs_check warnings.
Cc: stable@vger.kernel.org
Fixes: 30a5532a3206 ("LoongArch: dts: DeviceTree for Loongson-2K1000")
Signed-off-by: Yao Zi <me@ziyao.cc>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
When validating a perf event group, validate_group() unconditionally
attempts to allocate hardware PMU counters for the leader, sibling
events and the new event being added.
This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event
is part of the group, the current code still tries to allocate a hardware
PMU counter for it, which can wrongly consume hardware PMU resources and
cause spurious allocation failures.
Fix this by only allocating PMU counters for hardware events during group
validation, and skipping software events.
A trimmed down reproducer is as simple as this:
#include <stdio.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>
int main (int argc, char *argv[])
{
struct perf_event_attr attr = { 0 };
int fds[5];
attr.disabled = 1;
attr.exclude_kernel = 1;
attr.exclude_hv = 1;
attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
PERF_FORMAT_TOTAL_TIME_RUNNING | PERF_FORMAT_ID | PERF_FORMAT_GROUP;
attr.size = sizeof (attr);
attr.type = PERF_TYPE_SOFTWARE;
attr.config = PERF_COUNT_SW_DUMMY;
fds[0] = syscall (SYS_perf_event_open, &attr, 0, -1, -1, 0);
assert (fds[0] >= 0);
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_CPU_CYCLES;
fds[1] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
assert (fds[1] >= 0);
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_INSTRUCTIONS;
fds[2] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
assert (fds[2] >= 0);
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_BRANCH_MISSES;
fds[3] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
assert (fds[3] >= 0);
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_CACHE_REFERENCES;
fds[4] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
assert (fds[4] >= 0);
printf ("PASSED\n");
return 0;
}
Cc: stable@vger.kernel.org
Fixes: b37042b2bb7c ("LoongArch: Add perf events support")
Signed-off-by: Lisa Robinson <lisa@bytefly.space>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
After commit f6bff7827a48 ("riscv: uaccess: use 'asm_goto_output' for
get_user()"), which was the first commit that started using asm goto
with outputs on RISC-V, builds of clang built with assertions enabled
start crashing in certain files that use get_user() with:
clang: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:12743: Register FollowCopyChain(MachineRegisterInfo &, Register): Assertion `MI->getOpcode() == TargetOpcode::COPY && "start of copy chain MUST be COPY"' failed.
Internally, LLVM generates an addiw instruction when the output of the
inline asm (which may be any scalar type) needs to be sign extended for
ABI reasons, such as a later function call, so that basic block does not
have to do it.
Use a temporary 64-bit variable as the output of the inline assembly in
__get_user_asm() and explicitly cast it to truncate it if necessary,
avoiding the addiw that triggers the assertion.
Link: https://github.com/ClangBuiltLinux/linux/issues/2092
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260116-riscv-wa-llvm-asm-goto-outputs-assertion-failure-v3-1-55b5775f989b@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) fixes from Dave Jiang:
- Recognize all ZONE_DEVICE users as physaddr consumers
- Fix format string for extended_linear_cache_size_show()
- Fix target list setup for multiple decoders sharing the same
downstream port
- Restore HBIW check before derefernce platform data
- Fix potential infinite loop in __cxl_dpa_reserve()
- Check for invalid addresses returned from translation functions on
error
* tag 'cxl-fixes-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl: Check for invalid addresses returned from translation functions on errors
cxl/hdm: Fix potential infinite loop in __cxl_dpa_reserve()
cxl/acpi: Restore HBIW check before dereferencing platform_data
cxl/port: Fix target list setup for multiple decoders sharing the same dport
cxl/region: fix format string for resource_size_t
x86/kaslr: Recognize all ZONE_DEVICE users as physaddr consumers
|
|
CONFIG_CACHEMAINT_FOR_DMA
The Kconfig menu entry was converted to a menuconfig to allow it to be
hidden for !CONFIG_RISCV. The drivers under this new option were selected
by some other Kconfig symbols and so an extra select CACHEMAINT_FOR_DMA is
needed.
Fixes: 4d1608d0ab33 ("cache: Make top level Kconfig menu a boolean dependent on RISCV")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512100509.g6llkMMr-lkp@intel.com/
Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251210160047.201379-2-Jonathan.Cameron@huawei.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|