user/sven/linux.git/kernel/irq, branch v5.15.87

genirq/irqdesc: Don't try to remove non-existing sysfs files

2022-12-31T12:14:03Z

[ Upstream commit 9049e1ca41983ab773d7ea244bee86d7835ec9f5 ] Fault injection tests trigger warnings like this: kernfs: can not remove 'chip_name', no directory WARNING: CPU: 0 PID: 253 at fs/kernfs/dir.c:1616 kernfs_remove_by_name_ns+0xce/0xe0 RIP: 0010:kernfs_remove_by_name_ns+0xce/0xe0 Call Trace: remove_files.isra.1+0x3f/0xb0 sysfs_remove_group+0x68/0xe0 sysfs_remove_groups+0x41/0x70 __kobject_del+0x45/0xc0 kobject_del+0x29/0x40 free_desc+0x42/0x70 irq_free_descs+0x5e/0x90 The reason is that the interrupt descriptor sysfs handling does not roll back on a failing kobject_add() during allocation. If the descriptor is freed later on, kobject_del() is invoked with a not added kobject resulting in the above warnings. A proper rollback in case of a kobject_add() failure would be the straight forward solution. But this is not possible due to the way how interrupt descriptor sysfs handling works. Interrupt descriptors are allocated before sysfs becomes available. So the sysfs files for the early allocated descriptors are added later in the boot process. At this point there can be nothing useful done about a failing kobject_add(). For consistency the interrupt descriptor allocation always treats kobject_add() failures as non-critical and just emits a warning. To solve this problem, keep track in the interrupt descriptor whether kobject_add() was successful or not and make the invocation of kobject_del() conditional on that. [ tglx: Massage changelog, comments and use a state bit. ] Fixes: ecb3f394c5db ("genirq: Expose interrupt information through sysfs") Signed-off-by: Yang Yingliang Signed-off-by: Thomas Gleixner Reviewed-by: Greg Kroah-Hartman Link: https://lore.kernel.org/r/20221128151612.1786122-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin

genirq: Take the proposed affinity at face value if force==true

2022-12-02T16:41:12Z

From: Marc Zyngier commit c48c8b829d2b966a6649827426bcdba082ccf922 upstream. Although setting the affinity of an interrupt to a set of CPUs that doesn't have any online CPU is generally frowned apon, there are a few limited cases where such affinity is set from a CPUHP notifier, setting the affinity to a CPU that isn't online yet. The saving grace is that this is always done using the 'force' attribute, which gives a hint that the affinity setting can be outside of the online CPU mask and the callsite set this flag with the knowledge that the underlying interrupt controller knows to handle it. This restores the expected behaviour on Marek's system. Fixes: 33de0aa4bae9 ("genirq: Always limit the affinity to online CPUs") Reported-by: Marek Szyprowski Signed-off-by: Marc Zyngier Signed-off-by: Thomas Gleixner Tested-by: Marek Szyprowski Link: https://lore.kernel.org/r/4b7fc13c-887b-a664-26e8-45aed13f048a@samsung.com Link: https://lore.kernel.org/r/20220414140011.541725-1-maz@kernel.org Signed-off-by: Luiz Capitulino Signed-off-by: Greg Kroah-Hartman

genirq: Always limit the affinity to online CPUs

2022-12-02T16:41:11Z

From: Marc Zyngier commit 33de0aa4bae982ed6f7c777f86b5af3e627ac937 upstream. [ Fixed small conflicts due to the HK_FLAG_MANAGED_IRQ flag been renamed on upstream ] When booting with maxcpus= (or even loading a driver while most CPUs are offline), it is pretty easy to observe managed affinities containing a mix of online and offline CPUs being passed to the irqchip driver. This means that the irqchip cannot trust the affinity passed down from the core code, which is a bit annoying and requires (at least in theory) all drivers to implement some sort of affinity narrowing. In order to address this, always limit the cpumask to the set of online CPUs. Signed-off-by: Marc Zyngier Signed-off-by: Thomas Gleixner Link: https://lore.kernel.org/r/20220405185040.206297-3-maz@kernel.org Signed-off-by: Luiz Capitulino Signed-off-by: Greg Kroah-Hartman

genirq/msi: Shutdown managed interrupts with unsatifiable affinities

2022-12-02T16:41:11Z

From: Marc Zyngier commit d802057c7c553ad426520a053da9f9fe08e2c35a upstream. [ This commit is almost a rewrite because it conflicts with Thomas Gleixner's refactoring of this code in v5.17-rc1. I wasn't sure if I should drop all the s-o-bs (including Mark's), but decided to keep as the original commit ] When booting with maxcpus=, interrupt controllers such as the GICv3 ITS may not be able to satisfy the affinity of some managed interrupts, as some of the HW resources are simply not available. The same thing happens when loading a driver using managed interrupts while CPUs are offline. In order to deal with this, do not try to activate such interrupt if there is no online CPU capable of handling it. Instead, place it in shutdown state. Once a capable CPU shows up, it will be activated. Reported-by: John Garry Reported-by: David Decotigny Signed-off-by: Marc Zyngier Signed-off-by: Thomas Gleixner Tested-by: John Garry Link: https://lore.kernel.org/r/20220405185040.206297-2-maz@kernel.org Signed-off-by: Luiz Capitulino Signed-off-by: Greg Kroah-Hartman

irqdomain: Report irq number for NOMAP domains

2022-08-17T12:23:14Z

[ Upstream commit 6f194c99f466147148cc08452718b46664112548 ] When using a NOMAP domain, __irq_resolve_mapping() doesn't store the Linux IRQ number at the address optionally provided by the caller. While this isn't a huge deal (the returned value is guaranteed to the hwirq that was passed as a parameter), let's honour the letter of the API by writing the expected value. Fixes: d22558dd0a6c (“irqdomain: Introduce irq_resolve_mapping()”) Signed-off-by: Xu Qiang [maz: commit message] Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20220719063641.56541-2-xuqiang36@huawei.com Signed-off-by: Sasha Levin

genirq: GENERIC_IRQ_IPI depends on SMP

2022-08-17T12:23:01Z

[ Upstream commit 0f5209fee90b4544c58b4278d944425292789967 ] The generic IPI code depends on the IRQ affinity mask being allocated and initialized. This will not be the case if SMP is disabled. Fix up the remaining driver that selected GENERIC_IRQ_IPI in a non-SMP config. Reported-by: kernel test robot Signed-off-by: Samuel Holland Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20220701200056.46555-3-samuel@sholland.org Signed-off-by: Sasha Levin

genirq: Don't return error on missing optional irq_request_resources()

2022-08-17T12:23:00Z

[ Upstream commit 95001b756467ecc9f5973eb5e74e97699d9bbdf1 ] Function irq_chip::irq_request_resources() is reported as optional in the declaration of struct irq_chip. If the parent irq_chip does not implement it, we should ignore it and return. Don't return error if the functions is missing. Signed-off-by: Antonio Borneo Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20220512160544.13561-1-antonio.borneo@foss.st.com Signed-off-by: Sasha Levin

random: remove unused irq_flags argument from add_interrupt_randomness()

2022-05-30T07:29:00Z

commit 703f7066f40599c290babdb79dd61319264987e9 upstream. Since commit ee3e00e9e7101 ("random: use registers from interrupted code for CPU's w/o a cycle counter") the irq_flags argument is no longer used. Remove unused irq_flags. Cc: Borislav Petkov Cc: Dave Hansen Cc: Dexuan Cui Cc: H. Peter Anvin Cc: Haiyang Zhang Cc: Ingo Molnar Cc: K. Y. Srinivasan Cc: Stephen Hemminger Cc: Thomas Gleixner Cc: Wei Liu Cc: linux-hyperv@vger.kernel.org Cc: x86@kernel.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Wei Liu Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman

genirq: Synchronize interrupt thread startup

2022-05-12T10:30:06Z

commit 8707898e22fd665bc1d7b18b809be4b56ce25bdd upstream. A kernel hang can be observed when running setserial in a loop on a kernel with force threaded interrupts. The sequence of events is: setserial open("/dev/ttyXXX") request_irq() do_stuff() -> serial interrupt -> wake(irq_thread) desc->threads_active++; close() free_irq() kthread_stop(irq_thread) synchronize_irq() <- hangs because desc->threads_active != 0 The thread is created in request_irq() and woken up, but does not get on a CPU to reach the actual thread function, which would handle the pending wake-up. kthread_stop() sets the should stop condition which makes the thread immediately exit, which in turn leaves the stale threads_active count around. This problem was introduced with commit 519cc8652b3a, which addressed a interrupt sharing issue in the PCIe code. Before that commit free_irq() invoked synchronize_irq(), which waits for the hard interrupt handler and also for associated threads to complete. To address the PCIe issue synchronize_irq() was replaced with __synchronize_hardirq(), which only waits for the hard interrupt handler to complete, but not for threaded handlers. This was done under the assumption, that the interrupt thread already reached the thread function and waits for a wake-up, which is guaranteed to be handled before acting on the stop condition. The problematic case, that the thread would not reach the thread function, was obviously overlooked. Make sure that the interrupt thread is really started and reaches thread_fn() before returning from __setup_irq(). This utilizes the existing wait queue in the interrupt descriptor. The wait queue is unused for non-shared interrupts. For shared interrupts the usage might cause a spurious wake-up of a waiter in synchronize_irq() or the completion of a threaded handler might cause a spurious wake-up of the waiter for the ready flag. Both are harmless and have no functional impact. [ tglx: Amended changelog ] Fixes: 519cc8652b3a ("genirq: Synchronize only with single thread on free_irq()") Signed-off-by: Thomas Pfaff Signed-off-by: Thomas Gleixner Reviewed-by: Marc Zyngier Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/552fe7b4-9224-b183-bb87-a8f36d335690@pcs.com Signed-off-by: Greg Kroah-Hartman

genirq/affinity: Consider that CPUs on nodes can be unbalanced

2022-04-20T07:34:20Z

commit 08d835dff916bfe8f45acc7b92c7af6c4081c8a7 upstream. If CPUs on a node are offline at boot time, the number of nodes is different when building affinity masks for present cpus and when building affinity masks for possible cpus. This causes the following problem: In the case that the number of vectors is less than the number of nodes there are cases where bits of masks for present cpus are overwritten when building masks for possible cpus. Fix this by excluding CPUs, which are not part of the current build mask (present/possible). [ tglx: Massaged changelog and added comment ] Fixes: b82592199032 ("genirq/affinity: Spread IRQs to all available NUMA nodes") Signed-off-by: Rei Yamamoto Signed-off-by: Thomas Gleixner Reviewed-by: Ming Lei Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220331003309.10891-1-yamamoto.rei@jp.fujitsu.com Signed-off-by: Greg Kroah-Hartman