summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-07-02Linux 4.9.321v4.9.321Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20220630133231.200642128@linuxfoundation.org Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02swiotlb: skip swiotlb_bounce when orig_addr is zeroLiu Shixin
After patch ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE"), swiotlb_bounce will be called in swiotlb_tbl_map_single unconditionally. This requires that the physical address must be valid, which is not always true on stable-4.19 or earlier version. On stable-4.19, swiotlb_alloc_buffer will call swiotlb_tbl_map_single with orig_addr equal to zero, which cause such a panic: Unable to handle kernel paging request at virtual address ffffb77a40000000 ... pc : __memcpy+0x100/0x180 lr : swiotlb_bounce+0x74/0x88 ... Call trace: __memcpy+0x100/0x180 swiotlb_tbl_map_single+0x2c8/0x338 swiotlb_alloc+0xb4/0x198 __dma_alloc+0x84/0x1d8 ... On stable-4.9 and stable-4.14, swiotlb_alloc_coherent wille call map_single with orig_addr equal to zero, which can cause same panic. Fix this by skipping swiotlb_bounce when orig_addr is zero. Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]Naveen N. Rao
commit 3e35142ef99fe6b4fe5d834ad43ee13cca10a2dc upstream. Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols") [1], binutils (v2.36+) started dropping section symbols that it thought were unused. This isn't an issue in general, but with kexec_file.c, gcc is placing kexec_arch_apply_relocations[_add] into a separate .text.unlikely section and the section symbol ".text.unlikely" is being dropped. Due to this, recordmcount is unable to find a non-weak symbol in .text.unlikely to generate a relocation record against. Address this by dropping the weak attribute from these functions. Instead, follow the existing pattern of having architectures #define the name of the function they want to override in their headers. [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1 [akpm@linux-foundation.org: arch/s390/include/asm/kexec.h needs linux/module.h] Link: https://lkml.kernel.org/r/20220519091237.676736-1-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02fdt: Update CRC check for rng-seedHsin-Yi Wang
commit dd753d961c4844a39f947be115b3d81e10376ee5 upstream. Commit 428826f5358c ("fdt: add support for rng-seed") moves of_fdt_crc32 from early_init_dt_verify() to early_init_dt_scan() since early_init_dt_scan_chosen() may modify fdt to erase rng-seed. However, arm and some other arch won't call early_init_dt_scan(), they call early_init_dt_verify() then early_init_dt_scan_nodes(). Restore of_fdt_crc32 to early_init_dt_verify() then update it in early_init_dt_scan_chosen() if fdt if updated. Fixes: 428826f5358c ("fdt: add support for rng-seed") Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02xen: unexport __init-annotated xen_xlate_map_ballooned_pages()Masahiro Yamada
commit dbac14a5a05ff8e1ce7c0da0e1f520ce39ec62ea upstream. EXPORT_SYMBOL and __init is a bad combination because the .init.text section is freed up after the initialization. Hence, modules cannot use symbols annotated __init. The access to a freed symbol may end up with kernel panic. modpost used to detect it, but it has been broken for a decade. Recently, I fixed modpost so it started to warn it again, then this showed up in linux-next builds. There are two ways to fix it: - Remove __init - Remove EXPORT_SYMBOL I chose the latter for this case because none of the in-tree call-sites (arch/arm/xen/enlighten.c, arch/x86/xen/grant-table.c) is compiled as modular. Fixes: 243848fc018c ("xen/grant-table: Move xlated_setup_gnttab_pages to common place") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20220606045920.4161881-1-masahiroy@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02drm: remove drm_fb_helper_modinitChristoph Hellwig
commit bf22c9ec39da90ce866d5f625d616f28bc733dc1 upstream. drm_fb_helper_modinit has a lot of boilerplate for what is not very simple functionality. Just open code it in the only caller using IS_ENABLED and IS_MODULE, and skip the find_module check as a request_module is harmless if the module is already loaded (and not other caller has this find_module check either). Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02powerpc/pseries: wire up rng during setup_arch()Jason A. Donenfeld
commit e561e472a3d441753bd012333b057f48fef1045b upstream. The platform's RNG must be available before random_init() in order to be useful for initial seeding, which in turn means that it needs to be called from setup_arch(), rather than from an init call. Fortunately, each platform already has a setup_arch function pointer, which means it's easy to wire this up. This commit also removes some noisy log messages that don't add much. Fixes: a489043f4626 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220611151015.548325-4-Jason@zx2c4.com Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02modpost: fix section mismatch check for exported init/exit sectionsMasahiro Yamada
commit 28438794aba47a27e922857d27b31b74e8559143 upstream. Since commit f02e8a6596b7 ("module: Sort exported symbols"), EXPORT_SYMBOL* is placed in the individual section ___ksymtab(_gpl)+<sym> (3 leading underscores instead of 2). Since then, modpost cannot detect the bad combination of EXPORT_SYMBOL and __init/__exit. Fix the .fromsec field. Fixes: f02e8a6596b7 ("module: Sort exported symbols") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02ARM: cns3xxx: Fix refcount leak in cns3xxx_initMiaoqian Lin
commit 1ba904b6b16e08de5aed7c1349838d9cd0d178c5 upstream. of_find_compatible_node() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. Add missing of_node_put() to avoid refcount leak. Fixes: 415f59142d9d ("ARM: cns3xxx: initial DT support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Acked-by: Krzysztof Halasa <khalasa@piap.pl> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02ARM: Fix refcount leak in axxia_boot_secondaryMiaoqian Lin
commit 7c7ff68daa93d8c4cdea482da4f2429c0398fcde upstream. of_find_compatible_node() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. Add missing of_node_put() to avoid refcount leak. Fixes: 1d22924e1c4e ("ARM: Add platform support for LSI AXM55xx SoC") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220601090548.47616-1-linmq006@gmail.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02ARM: exynos: Fix refcount leak in exynos_map_pmuMiaoqian Lin
commit c4c79525042a4a7df96b73477feaf232fe44ae81 upstream. of_find_matching_node() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. of_node_put() checks null pointer. Fixes: fce9e5bb2526 ("ARM: EXYNOS: Add support for mapping PMU base address via DT") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220523145513.12341-1-linmq006@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02ARM: dts: imx6qdl: correct PU regulator ramp delayLucas Stach
commit 93a8ba2a619816d631bd69e9ce2172b4d7a481b8 upstream. Contrary to what was believed at the time, the ramp delay of 150us is not plenty for the PU LDO with the default step time of 512 pulses of the 24MHz clock. Measurements have shown that after enabling the LDO the voltage on VDDPU_CAP jumps to ~750mV in the first step and after that the regulator executes the normal ramp up as defined by the step size control. This means it takes the regulator between 360us and 370us to ramp up to the nominal 1.15V voltage for this power domain. With the old setting of the ramp delay the power up of the PU GPC domain would happen in the middle of the regulator ramp with the voltage being at around 900mV. Apparently this was enough for most units to properly power up the peripherals in the domain and execute the reset. Some units however, fail to power up properly, especially when the chip is at a low temperature. In that case any access to the GPU registers would yield an incorrect result with no way to recover from this situation. Change the ramp delay to 380us to cover the measured ramp up time with a bit of additional slack. Fixes: 40130d327f72 ("ARM: dts: imx6qdl: Allow disabling the PU regulator, add a enable ramp delay") Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02powerpc: Enable execve syscall exit tracepointNaveen N. Rao
commit ec6d0dde71d760aa60316f8d1c9a1b0d99213529 upstream. On execve[at], we are zero'ing out most of the thread register state including gpr[0], which contains the syscall number. Due to this, we fail to trigger the syscall exit tracepoint properly. Fix this by retaining gpr[0] in the thread register state. Before this patch: # tail /sys/kernel/debug/tracing/trace cat-123 [000] ..... 61.449351: sys_execve(filename: 7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8) cat-124 [000] ..... 62.428481: sys_execve(filename: 7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8) echo-125 [000] ..... 65.813702: sys_execve(filename: 7fffa6b23378, argv: 7fffa6b233a0, envp: 7fffa6b233b0) echo-125 [000] ..... 65.822214: sys_execveat(fd: 0, filename: 1009ac48, argv: 7ffff65d0c98, envp: 7ffff65d0ca8, flags: 0) After this patch: # tail /sys/kernel/debug/tracing/trace cat-127 [000] ..... 100.416262: sys_execve(filename: 7fffa41b3448, argv: 7fffa41b33e0, envp: 7fffa41b33f8) cat-127 [000] ..... 100.418203: sys_execve -> 0x0 echo-128 [000] ..... 103.873968: sys_execve(filename: 7fffa41b3378, argv: 7fffa41b33a0, envp: 7fffa41b33b0) echo-128 [000] ..... 103.875102: sys_execve -> 0x0 echo-128 [000] ..... 103.882097: sys_execveat(fd: 0, filename: 1009ac48, argv: 7fffd10d2148, envp: 7fffd10d2158, flags: 0) echo-128 [000] ..... 103.883225: sys_execveat -> 0x0 Cc: stable@vger.kernel.org Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Sumit Dubey2 <Sumit.Dubey2@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220609103328.41306-1-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02xtensa: Fix refcount leak bug in time.cLiang He
commit a0117dc956429f2ede17b323046e1968d1849150 upstream. In calibrate_ccount(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Cc: stable@vger.kernel.org Signed-off-by: Liang He <windhl@126.com> Message-Id: <20220617124432.4049006-1-windhl@126.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02xtensa: xtfpga: Fix refcount leak bug in setupLiang He
commit 173940b3ae40114d4179c251a98ee039dc9cd5b3 upstream. In machine_setup(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Cc: stable@vger.kernel.org Signed-off-by: Liang He <windhl@126.com> Message-Id: <20220617115323.4046905-1-windhl@126.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02iio: trigger: sysfs: fix use-after-free on removeVincent Whitchurch
commit 78601726d4a59a291acc5a52da1d3a0a6831e4e8 upstream. Ensure that the irq_work has completed before the trigger is freed. ================================================================== BUG: KASAN: use-after-free in irq_work_run_list Read of size 8 at addr 0000000064702248 by task python3/25 Call Trace: irq_work_run_list irq_work_tick update_process_times tick_sched_handle tick_sched_timer __hrtimer_run_queues hrtimer_interrupt Allocated by task 25: kmem_cache_alloc_trace iio_sysfs_trig_add dev_attr_store sysfs_kf_write kernfs_fop_write_iter new_sync_write vfs_write ksys_write sys_write Freed by task 25: kfree iio_sysfs_trig_remove dev_attr_store sysfs_kf_write kernfs_fop_write_iter new_sync_write vfs_write ksys_write sys_write ================================================================== Fixes: f38bc926d022 ("staging:iio:sysfs-trigger: Use irq_work to properly active trigger") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Link: https://lore.kernel.org/r/20220519091925.1053897-1-vincent.whitchurch@axis.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02iio: accel: mma8452: ignore the return value of reset operationHaibo Chen
commit bf745142cc0a3e1723f9207fb0c073c88464b7b4 upstream. On fxls8471, after set the reset bit, the device will reset immediately, will not give ACK. So ignore the return value of this reset operation, let the following code logic to check whether the reset operation works. Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Fixes: ecabae713196 ("iio: mma8452: Initialise before activating") Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/1655292718-14287-1-git-send-email-haibo.chen@nxp.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02iio:accel:bma180: rearrange iio trigger get and registerDmitry Rokosov
commit e5f3205b04d7f95a2ef43bce4b454a7f264d6923 upstream. IIO trigger interface function iio_trigger_get() should be called after iio_trigger_register() (or its devm analogue) strictly, because of iio_trigger_get() acquires module refcnt based on the trigger->owner pointer, which is initialized inside iio_trigger_register() to THIS_MODULE. If this call order is wrong, the next iio_trigger_put() (from sysfs callback or "delete module" path) will dereference "default" module refcnt, which is incorrect behaviour. Fixes: 0668a4e4d297 ("iio: accel: bma180: Fix indio_dev->trig assignment") Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220524181150.9240-2-ddrokosov@sberdevices.ru Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02usb: chipidea: udc: check request status before setting device addressXu Yang
commit b24346a240b36cfc4df194d145463874985aa29b upstream. The complete() function may be called even though request is not completed. In this case, it's necessary to check request status so as not to set device address wrongly. Fixes: 10775eb17bee ("usb: chipidea: udc: update gadget states according to ch9") cc: <stable@vger.kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20220623030242.41796-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02iio: adc: vf610: fix conversion mode sysfs node nameBaruch Siach
[ Upstream commit f1a633b15cd5371a2a83f02c513984e51132dd68 ] The documentation missed the "in_" prefix for this IIO_SHARED_BY_DIR entry. Fixes: bf04c1a367e3 ("iio: adc: vf610: implement configurable conversion modes") Signed-off-by: Baruch Siach <baruch@tkos.co.il> Acked-by: Haibo Chen <haibo.chen@nxp.com> Link: https://lore.kernel.org/r/560dc93fafe5ef7e9a409885fd20b6beac3973d8.1653900626.git.baruch@tkos.co.il Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-02igb: Make DMA faster when CPU is active on the PCIe linkKai-Heng Feng
[ Upstream commit 4e0effd9007ea0be31f7488611eb3824b4541554 ] Intel I210 on some Intel Alder Lake platforms can only achieve ~750Mbps Tx speed via iperf. The RR2DCDELAY shows around 0x2xxx DMA delay, which will be significantly lower when 1) ASPM is disabled or 2) SoC package c-state stays above PC3. When the RR2DCDELAY is around 0x1xxx the Tx speed can reach to ~950Mbps. According to the I210 datasheet "8.26.1 PCIe Misc. Register - PCIEMISC", "DMA Idle Indication" doesn't seem to tie to DMA coalesce anymore, so set it to 1b for "DMA is considered idle when there is no Rx or Tx AND when there are no TLPs indicating that CPU is active detected on the PCIe link (such as the host executes CSR or Configuration register read or write operation)" and performing Tx should also fall under "active CPU on PCIe link" case. In addition to that, commit b6e0c419f040 ("igb: Move DMA Coalescing init code to separate function.") seems to wrongly changed from enabling E1000_PCIEMISC_LX_DECISION to disabling it, also fix that. Fixes: b6e0c419f040 ("igb: Move DMA Coalescing init code to separate function.") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220621221056.604304-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-02MIPS: Remove repetitive increase irq_err_counthuhai
[ Upstream commit c81aba8fde2aee4f5778ebab3a1d51bd2ef48e4c ] commit 979934da9e7a ("[PATCH] mips: update IRQ handling for vr41xx") added a function irq_dispatch, and it'll increase irq_err_count when the get_irq callback returns a negative value, but increase irq_err_count in get_irq was not removed. And also, modpost complains once gpio-vr41xx drivers become modules. ERROR: modpost: "irq_err_count" [drivers/gpio/gpio-vr41xx.ko] undefined! So it would be a good idea to remove repetitive increase irq_err_count in get_irq callback. Fixes: 27fdd325dace ("MIPS: Update VR41xx GPIO driver to use gpiolib") Fixes: 979934da9e7a ("[PATCH] mips: update IRQ handling for vr41xx") Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: huhai <huhai@kylinos.cn> Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-02x86/xen: Remove undefined behavior in setup_features()Julien Grall
[ Upstream commit ecb6237fa397b7b810d798ad19322eca466dbab1 ] 1 << 31 is undefined. So switch to 1U << 31. Fixes: 5ead97c84fa7 ("xen: Core Xen implementation") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220617103037.57828-1-julien@xen.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-02bonding: ARP monitor spams NETDEV_NOTIFY_PEERS notifiersJay Vosburgh
[ Upstream commit 7a9214f3d88cfdb099f3896e102a306b316d8707 ] The bonding ARP monitor fails to decrement send_peer_notif, the number of peer notifications (gratuitous ARP or ND) to be sent. This results in a continuous series of notifications. Correct this by decrementing the counter for each notification. Reported-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Fixes: b0929915e035 ("bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitor") Link: https://lore.kernel.org/netdev/b2fd4147-8f50-bebd-963a-1a3e8d1d9715@redhat.com/ Tested-by: Jonathan Toppins <jtoppins@redhat.com> Reviewed-by: Jonathan Toppins <jtoppins@redhat.com> Link: https://lore.kernel.org/r/9400.1655407960@famine Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-02USB: serial: option: add Telit LE910Cx 0x1250 compositionCarlo Lobrano
commit 342fc0c3b345525da21112bd0478a0dc741598ea upstream. Add support for the following Telit LE910Cx composition: 0x1250: rmnet, tty, tty, tty, tty Reviewed-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Carlo Lobrano <c.lobrano@gmail.com> Link: https://lore.kernel.org/r/20220614075623.2392607-1-c.lobrano@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02random: quiet urandom warning ratelimit suppression messageJason A. Donenfeld
commit c01d4d0a82b71857be7449380338bc53dde2da92 upstream. random.c ratelimits how much it warns about uninitialized urandom reads using __ratelimit(). When the RNG is finally initialized, it prints the number of missed messages due to ratelimiting. It has been this way since that functionality was introduced back in 2018. Recently, cc1e127bfa95 ("random: remove ratelimiting for in-kernel unseeded randomness") put a bit more stress on the urandom ratelimiting, which teased out a bug in the implementation. Specifically, when under pressure, __ratelimit() will print its own message and reset the count back to 0, making the final message at the end less useful. Secondly, it does so as a pr_warn(), which apparently is undesirable for people's CI. Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly for this purpose, so we set the flag. Fixes: 4e00b339e264 ("random: rate limit unseeded randomness warnings") Cc: stable@vger.kernel.org Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Ron Economos <re@w6rz.net> Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02dm era: commit metadata in postsuspend after worker stopsNikos Tsironis
commit 9ae6e8b1c9bbf6874163d1243e393137313762b7 upstream. During postsuspend dm-era does the following: 1. Archives the current era 2. Commits the metadata, as part of the RPC call for archiving the current era 3. Stops the worker Until the worker stops, it might write to the metadata again. Moreover, these writes are not flushed to disk immediately, but are cached by the dm-bufio client, which writes them back asynchronously. As a result, the committed metadata of a suspended dm-era device might not be consistent with the in-core metadata. In some cases, this can result in the corruption of the on-disk metadata. Suppose the following sequence of events: 1. Load a new table, e.g. a snapshot-origin table, to a device with a dm-era table 2. Suspend the device 3. dm-era commits its metadata, but the worker does a few more metadata writes until it stops, as part of digesting an archived writeset 4. These writes are cached by the dm-bufio client 5. Load the dm-era table to another device. 6. The new instance of the dm-era target loads the committed, on-disk metadata, which don't include the extra writes done by the worker after the metadata commit. 7. Resume the new device 8. The new dm-era target instance starts using the metadata 9. Resume the original device 10. The destructor of the old dm-era target instance is called and destroys the dm-bufio client, which results in flushing the cached writes to disk 11. These writes might overwrite the writes done by the new dm-era instance, hence corrupting its metadata. Fix this by committing the metadata after the worker stops running. stop_worker uses flush_workqueue to flush the current work. However, the work item may re-queue itself and flush_workqueue doesn't wait for re-queued works to finish. This could result in the worker changing the metadata after they have been committed, or writing to the metadata concurrently with the commit in the postsuspend thread. Use drain_workqueue instead, which waits until the work and all re-queued works finish. Fixes: eec40579d8487 ("dm: add era target") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02ata: libata: add qc->flags in ata_qc_complete_template tracepointEdward Wu
commit 540a92bfe6dab7310b9df2e488ba247d784d0163 upstream. Add flags value to check the result of ata completion Fixes: 255c03d15a29 ("libata: Add tracepoints") Cc: stable@vger.kernel.org Signed-off-by: Edward Wu <edwardwu@realtek.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02random: schedule mix_interrupt_randomness() less oftenJason A. Donenfeld
commit 534d2eaf1970274150596fdd2bf552721e65d6b2 upstream. It used to be that mix_interrupt_randomness() would credit 1 bit each time it ran, and so add_interrupt_randomness() would schedule mix() to run every 64 interrupts, a fairly arbitrary number, but nonetheless considered to be a decent enough conservative estimate. Since e3e33fc2ea7f ("random: do not use input pool from hard IRQs"), mix() is now able to credit multiple bits, depending on the number of calls to add(). This was done for reasons separate from this commit, but it has the nice side effect of enabling this patch to schedule mix() less often. Currently the rules are: a) Credit 1 bit for every 64 calls to add(). b) Schedule mix() once a second that add() is called. c) Schedule mix() once every 64 calls to add(). Rules (a) and (c) no longer need to be coupled. It's still important to have _some_ value in (c), so that we don't "over-saturate" the fast pool, but the once per second we get from rule (b) is a plenty enough baseline. So, by increasing the 64 in rule (c) to something larger, we avoid calling queue_work_on() as frequently during irq storms. This commit changes that 64 in rule (c) to be 1024, which means we schedule mix() 16 times less often. And it does *not* need to change the 64 in rule (a). Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Cc: stable@vger.kernel.org Cc: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02vt: drop old FONT ioctlsJiri Slaby
commit ff2047fb755d4415ec3c70ac799889371151796d upstream. Drop support for these ioctls: * PIO_FONT, PIO_FONTX * GIO_FONT, GIO_FONTX * PIO_FONTRESET As was demonstrated by commit 90bfdeef83f1 (tty: make FONTX ioctl use the tty pointer they were actually passed), these ioctls are not used from userspace, as: 1) they used to be broken (set up font on current console, not the open one) and racy (before the commit above) 2) KDFONTOP ioctl is used for years instead Note that PIO_FONTRESET is defunct on most systems as VGA_CONSOLE is set on them for ages. That turns on BROKEN_GRAPHICS_PROGRAMS which makes PIO_FONTRESET just return an error. We are removing KD_FONT_FLAG_OLD here as it was used only by these removed ioctls. kd.h header exists both in kernel and uapi headers, so we can remove the kernel one completely. Everyone includeing kd.h will now automatically get the uapi one. There are now unused definitions of the ioctl numbers and "struct consolefontdesc" in kd.h, but as it is a uapi header, I am not touching these. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20210105120239.28031-8-jslaby@suse.cz Cc: guodaxing <guodaxing@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25Linux 4.9.320v4.9.320Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20220623164344.053938039@linuxfoundation.org Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Pavel Machek (CIP) <pavel@denx.de> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: drop the hash_32() part from the index calculationWilly Tarreau
commit e8161345ddbb66e449abde10d2fdce93f867eba9 upstream. In commit 190cc82489f4 ("tcp: change source port randomizarion at connect() time"), the table_perturb[] array was introduced and an index was taken from the port_offset via hash_32(). But it turns out that hash_32() performs a multiplication while the input here comes from the output of SipHash in secure_seq, that is well distributed enough to avoid the need for yet another hash. Suggested-by: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: increase source port perturb table to 2^16Willy Tarreau
commit 4c2c8f03a5ab7cb04ec64724d7d176d00bcc91e5 upstream. Moshe Kol, Amit Klein, and Yossi Gilad reported being able to accurately identify a client by forcing it to emit only 40 times more connections than there are entries in the table_perturb[] table. The previous two improvements consisting in resalting the secret every 10s and adding randomness to each port selection only slightly improved the situation, and the current value of 2^8 was too small as it's not very difficult to make a client emit 10k connections in less than 10 seconds. Thus we're increasing the perturb table from 2^8 to 2^16 so that the same precision now requires 2.6M connections, which is more difficult in this time frame and harder to hide as a background activity. The impact is that the table now uses 256 kB instead of 1 kB, which could mostly affect devices making frequent outgoing connections. However such components usually target a small set of destinations (load balancers, database clients, perf assessment tools), and in practice only a few entries will be visited, like before. A live test at 1 million connections per second showed no performance difference from the previous value. Reported-by: Moshe Kol <moshe.kol@mail.huji.ac.il> Reported-by: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Reported-by: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: dynamically allocate the perturb table used by source portsWilly Tarreau
commit e9261476184be1abd486c9434164b2acbe0ed6c2 upstream. We'll need to further increase the size of this table and it's likely that at some point its size will not be suitable anymore for a static table. Let's allocate it on boot from inet_hashinfo2_init(), which is called from tcp_init(). Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> [bwh: Backported to 4.9: - There is no inet_hashinfo2_init(), so allocate the table in inet_hashinfo_init() when called by TCP - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: add small random increments to the source portWilly Tarreau
commit ca7af0402550f9a0b3316d5f1c30904e42ed257d upstream. Here we're randomly adding between 0 and 7 random increments to the selected source port in order to add some noise in the source port selection that will make the next port less predictable. With the default port range of 32768-60999 this means a worst case reuse scenario of 14116/8=1764 connections between two consecutive uses of the same port, with an average of 14116/4.5=3137. This code was stressed at more than 800000 connections per second to a fixed target with all connections closed by the client using RSTs (worst condition) and only 2 connections failed among 13 billion, despite the hash being reseeded every 10 seconds, indicating a perfectly safe situation. Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: use different parts of the port_offset for index and offsetWilly Tarreau
commit 9e9b70ae923baf2b5e8a0ea4fd0c8451801ac526 upstream. Amit Klein suggests that we use different parts of port_offset for the table's index and the port offset so that there is no direct relation between them. Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25secure_seq: use the 64 bits of the siphash for port offset calculationWilly Tarreau
commit b2d057560b8107c633b39aabe517ff9d93f285e3 upstream. SipHash replaced MD5 in secure_ipv{4,6}_port_ephemeral() via commit 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5"), but the output remained truncated to 32-bit only. In order to exploit more bits from the hash, let's make the functions return the full 64-bit of siphash_3u32(). We also make sure the port offset calculation in __inet_hash_connect() remains done on 32-bit to avoid the need for div_u64_rem() and an extra cost on 32-bit systems. Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: add some entropy in __inet_hash_connect()Eric Dumazet
commit c579bd1b4021c42ae247108f1e6f73dd3f08600c upstream. Even when implementing RFC 6056 3.3.4 (Algorithm 4: Double-Hash Port Selection Algorithm), a patient attacker could still be able to collect enough state from an otherwise idle host. Idea of this patch is to inject some noise, in the cases __inet_hash_connect() found a candidate in the first attempt. This noise should not significantly reduce the collision avoidance, and should be zero if connection table is already well used. Note that this is not implementing RFC 6056 3.3.5 because we think Algorithm 5 could hurt typical workloads. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Dworken <ddworken@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25tcp: change source port randomizarion at connect() timeEric Dumazet
commit 190cc82489f46f9d88e73c81a47e14f80a791e1a upstream. RFC 6056 (Recommendations for Transport-Protocol Port Randomization) provides good summary of why source selection needs extra care. David Dworken reminded us that linux implements Algorithm 3 as described in RFC 6056 3.3.3 Quoting David : In the context of the web, this creates an interesting info leak where websites can count how many TCP connections a user's computer is establishing over time. For example, this allows a website to count exactly how many subresources a third party website loaded. This also allows: - Distinguishing between different users behind a VPN based on distinct source port ranges. - Tracking users over time across multiple networks. - Covert communication channels between different browsers/browser profiles running on the same computer - Tracking what applications are running on a computer based on the pattern of how fast source ports are getting incremented. Section 3.3.4 describes an enhancement, that reduces attackers ability to use the basic information currently stored into the shared 'u32 hint'. This change also decreases collision rate when multiple applications need to connect() to different destinations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: David Dworken <ddworken@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25fuse: fix pipe buffer lifetime for direct_ioMiklos Szeredi
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream. In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then imports the write buffer with fuse_get_user_pages(), which uses iov_iter_get_pages() to grab references to userspace pages instead of actually copying memory. On the filesystem device side, these pages can then either be read to userspace (via fuse_dev_read()), or splice()d over into a pipe using fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops. This is wrong because after fuse_dev_do_read() unlocks the FUSE request, the userspace filesystem can mark the request as completed, causing write() to return. At that point, the userspace filesystem should no longer have access to the pipe buffer. Fix by copying pages coming from the user address space to new pipe buffers. Reported-by: Jann Horn <jannh@google.com> Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""Linus Torvalds
commit 901c7280ca0d5e2b4a8929fbe0bfb007ac2a6544 upstream. Halil Pasic points out [1] that the full revert of that commit (revert in bddac7c1e02b), and that a partial revert that only reverts the problematic case, but still keeps some of the cleanups is probably better.  And that partial revert [2] had already been verified by Oleksandr Natalenko to also fix the issue, I had just missed that in the long discussion. So let's reinstate the cleanups from commit aa6f8dcbab47 ("swiotlb: rework "fix info leak with DMA_FROM_DEVICE""), and effectively only revert the part that caused problems. Link: https://lore.kernel.org/all/20220328013731.017ae3e3.pasic@linux.ibm.com/ [1] Link: https://lore.kernel.org/all/20220324055732.GB12078@lst.de/ [2] Link: https://lore.kernel.org/all/4386660.LvFx2qVVIh@natalenko.name/ [3] Suggested-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [OP: backport to 4.14: apply swiotlb_tbl_map_single() changes in lib/swiotlb.c] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25swiotlb: fix info leak with DMA_FROM_DEVICEHalil Pasic
commit ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e upstream. The problem I'm addressing was discovered by the LTP test covering cve-2018-1000204. A short description of what happens follows: 1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV and a corresponding dxferp. The peculiar thing about this is that TUR is not reading from the device. 2) In sg_start_req() the invocation of blk_rq_map_user() effectively bounces the user-space buffer. As if the device was to transfer into it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()") we make sure this first bounce buffer is allocated with GFP_ZERO. 3) For the rest of the story we keep ignoring that we have a TUR, so the device won't touch the buffer we prepare as if the we had a DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device and the buffer allocated by SG is mapped by the function virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here scatter-gather and not scsi generics). This mapping involves bouncing via the swiotlb (we need swiotlb to do virtio in protected guest like s390 Secure Execution, or AMD SEV). 4) When the SCSI TUR is done, we first copy back the content of the second (that is swiotlb) bounce buffer (which most likely contains some previous IO data), to the first bounce buffer, which contains all zeros. Then we copy back the content of the first bounce buffer to the user-space buffer. 5) The test case detects that the buffer, which it zero-initialized, ain't all zeros and fails. One can argue that this is an swiotlb problem, because without swiotlb we leak all zeros, and the swiotlb should be transparent in a sense that it does not affect the outcome (if all other participants are well behaved). Copying the content of the original buffer into the swiotlb buffer is the only way I can think of to make swiotlb transparent in such scenarios. So let's do just that if in doubt, but allow the driver to tell us that the whole mapped buffer is going to be overwritten, in which case we can preserve the old behavior and avoid the performance impact of the extra bounce. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [OP: backport to 4.14: apply swiotlb_tbl_map_single() changes in lib/swiotlb.c] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25xprtrdma: fix incorrect header size calculationsColin Ian King
commit 912288442cb2f431bf3c8cb097a5de83bc6dbac1 upstream. Currently the header size calculations are using an assignment operator instead of a += operator when accumulating the header size leading to incorrect sizes. Fix this by using the correct operator. Addresses-Coverity: ("Unused value") Fixes: 302d3deb2068 ("xprtrdma: Prevent inline overflow") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25s390/mm: use non-quiescing sske for KVM switch to keyed guestChristian Borntraeger
commit 3ae11dbcfac906a8c3a480e98660a823130dc16a upstream. The switch to a keyed guest does not require a classic sske as the other guest CPUs are not accessing the key before the switch is complete. By using the NQ SSKE things are faster especially with multiple guests. Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Suggested-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220530092706.11637-3-borntraeger@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25l2tp: fix race in pppol2tp_release with session object destroyJames Chapman
commit d02ba2a6110c530a32926af8ad441111774d2893 upstream. pppol2tp_release uses call_rcu to put the final ref on its socket. But the session object doesn't hold a ref on the session socket so may be freed while the pppol2tp_put_sk RCU callback is scheduled. Fix this by having the session hold a ref on its socket until the session is destroyed. It is this ref that is dropped via call_rcu. Sessions are also deleted via l2tp_tunnel_closeall. This must now also put the final ref via call_rcu. So move the call_rcu call site into pppol2tp_session_close so that this happens in both destroy paths. A common destroy path should really be implemented, perhaps with l2tp_tunnel_closeall calling l2tp_session_delete like pppol2tp_release does, but this will be looked at later. ODEBUG: activate active (active state 1) object type: rcu_head hint: (null) WARNING: CPU: 3 PID: 13407 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 Modules linked in: CPU: 3 PID: 13407 Comm: syzbot_19c09769 Not tainted 4.16.0-rc2+ #38 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:debug_print_object+0x166/0x220 RSP: 0018:ffff880013647a00 EFLAGS: 00010082 RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff814d3333 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88001a59f6d0 RBP: ffff880013647a40 R08: 0000000000000000 R09: 0000000000000001 R10: ffff8800136479a8 R11: 0000000000000000 R12: 0000000000000001 R13: ffffffff86161420 R14: ffffffff85648b60 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88001a580000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020e77000 CR3: 0000000006022000 CR4: 00000000000006e0 Call Trace: debug_object_activate+0x38b/0x530 ? debug_object_assert_init+0x3b0/0x3b0 ? __mutex_unlock_slowpath+0x85/0x8b0 ? pppol2tp_session_destruct+0x110/0x110 __call_rcu.constprop.66+0x39/0x890 ? __call_rcu.constprop.66+0x39/0x890 call_rcu_sched+0x17/0x20 pppol2tp_release+0x2c7/0x440 ? fcntl_setlk+0xca0/0xca0 ? sock_alloc_file+0x340/0x340 sock_release+0x92/0x1e0 sock_close+0x1b/0x20 __fput+0x296/0x6e0 ____fput+0x1a/0x20 task_work_run+0x127/0x1a0 do_exit+0x7f9/0x2ce0 ? SYSC_connect+0x212/0x310 ? mm_update_next_owner+0x690/0x690 ? up_read+0x1f/0x40 ? __do_page_fault+0x3c8/0xca0 do_group_exit+0x10d/0x330 ? do_group_exit+0x330/0x330 SyS_exit_group+0x22/0x30 do_syscall_64+0x1e0/0x730 ? trace_hardirqs_off_thunk+0x1a/0x1c entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x7f362e471259 RSP: 002b:00007ffe389abe08 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f362e471259 RDX: 00007f362e471259 RSI: 000000000000002e RDI: 0000000000000000 RBP: 00007ffe389abe30 R08: 0000000000000000 R09: 00007f362e944270 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400b60 R13: 00007ffe389abf50 R14: 0000000000000000 R15: 0000000000000000 Code: 8d 3c dd a0 8f 64 85 48 89 fa 48 c1 ea 03 80 3c 02 00 75 7b 48 8b 14 dd a0 8f 64 85 4c 89 f6 48 c7 c7 20 85 64 85 e 8 2a 55 14 ff <0f> 0b 83 05 ad 2a 68 04 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 Fixes: ee40fb2e1eb5b ("l2tp: protect sock pointer of struct pppol2tp_session with RCU") Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25l2tp: don't use inet_shutdown on ppp session destroyJames Chapman
commit 225eb26489d05c679a4c4197ffcb81c81e9dcaf4 upstream. Previously, if a ppp session was closed, we called inet_shutdown to mark the socket as unconnected such that userspace would get errors and then close the socket. This could race with userspace closing the socket. Instead, leave userspace to close the socket in its own time (our session will be detached anyway). BUG: KASAN: use-after-free in inet_shutdown+0x5d/0x1c0 Read of size 4 at addr ffff880010ea3ac0 by task syzbot_347bd5ac/8296 CPU: 3 PID: 8296 Comm: syzbot_347bd5ac Not tainted 4.16.0-rc1+ #91 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: dump_stack+0x101/0x157 ? inet_shutdown+0x5d/0x1c0 print_address_description+0x78/0x260 ? inet_shutdown+0x5d/0x1c0 kasan_report+0x240/0x360 __asan_load4+0x78/0x80 inet_shutdown+0x5d/0x1c0 ? pppol2tp_show+0x80/0x80 pppol2tp_session_close+0x68/0xb0 l2tp_tunnel_closeall+0x199/0x210 ? udp_v6_flush_pending_frames+0x90/0x90 l2tp_udp_encap_destroy+0x6b/0xc0 ? l2tp_tunnel_del_work+0x2e0/0x2e0 udpv6_destroy_sock+0x8c/0x90 sk_common_release+0x47/0x190 udp_lib_close+0x15/0x20 inet_release+0x85/0xd0 inet6_release+0x43/0x60 sock_release+0x53/0x100 ? sock_alloc_file+0x260/0x260 sock_close+0x1b/0x20 __fput+0x19f/0x380 ____fput+0x1a/0x20 task_work_run+0xd2/0x110 exit_to_usermode_loop+0x18d/0x190 do_syscall_64+0x389/0x3b0 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x7fe240a45259 RSP: 002b:00007fe241132df8 EFLAGS: 00000297 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fe240a45259 RDX: 00007fe240a45259 RSI: 0000000000000000 RDI: 00000000000000a5 RBP: 00007fe241132e20 R08: 00007fe241133700 R09: 0000000000000000 R10: 00007fe241133700 R11: 0000000000000297 R12: 0000000000000000 R13: 00007ffc49aff84f R14: 0000000000000000 R15: 00007fe241141040 Allocated by task 8331: save_stack+0x43/0xd0 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc+0x144/0x3e0 sock_alloc_inode+0x22/0x130 alloc_inode+0x3d/0xf0 new_inode_pseudo+0x1c/0x90 sock_alloc+0x30/0x110 __sock_create+0xaa/0x4c0 SyS_socket+0xbe/0x130 do_syscall_64+0x128/0x3b0 entry_SYSCALL_64_after_hwframe+0x26/0x9b Freed by task 8314: save_stack+0x43/0xd0 __kasan_slab_free+0x11a/0x170 kasan_slab_free+0xe/0x10 kmem_cache_free+0x88/0x2b0 sock_destroy_inode+0x49/0x50 destroy_inode+0x77/0xb0 evict+0x285/0x340 iput+0x429/0x530 dentry_unlink_inode+0x28c/0x2c0 __dentry_kill+0x1e3/0x2f0 dput.part.21+0x500/0x560 dput+0x24/0x30 __fput+0x2aa/0x380 ____fput+0x1a/0x20 task_work_run+0xd2/0x110 exit_to_usermode_loop+0x18d/0x190 do_syscall_64+0x389/0x3b0 entry_SYSCALL_64_after_hwframe+0x26/0x9b Fixes: fd558d186df2c ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25ext4: add reserved GDT blocks checkZhang Yi
commit b55c3cd102a6f48b90e61c44f7f3dda8c290c694 upstream. We capture a NULL pointer issue when resizing a corrupt ext4 image which is freshly clear resize_inode feature (not run e2fsck). It could be simply reproduced by following steps. The problem is because of the resize_inode feature was cleared, and it will convert the filesystem to meta_bg mode in ext4_resize_fs(), but the es->s_reserved_gdt_blocks was not reduced to zero, so could we mistakenly call reserve_backup_gdb() and passing an uninitialized resize_inode to it when adding new group descriptors. mkfs.ext4 /dev/sda 3G tune2fs -O ^resize_inode /dev/sda #forget to run requested e2fsck mount /dev/sda /mnt resize2fs /dev/sda 8G ======== BUG: kernel NULL pointer dereference, address: 0000000000000028 CPU: 19 PID: 3243 Comm: resize2fs Not tainted 5.18.0-rc7-00001-gfde086c5ebfd #748 ... RIP: 0010:ext4_flex_group_add+0xe08/0x2570 ... Call Trace: <TASK> ext4_resize_fs+0xbec/0x1660 __ext4_ioctl+0x1749/0x24e0 ext4_ioctl+0x12/0x20 __x64_sys_ioctl+0xa6/0x110 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f2dd739617b ======== The fix is simple, add a check in ext4_resize_begin() to make sure that the es->s_reserved_gdt_blocks is zero when the resize_inode feature is disabled. Cc: stable@kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220601092717.763694-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25ext4: make variable "count" signedDing Xiang
commit bc75a6eb856cb1507fa907bf6c1eda91b3fef52f upstream. Since dx_make_map() may return -EFSCORRUPTED now, so change "count" to be a signed integer so we can correctly check for an error code returned by dx_make_map(). Fixes: 46c116b920eb ("ext4: verify dir block before splitting it") Cc: stable@kernel.org Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20220530100047.537598-1-dingxiang@cmss.chinamobile.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25ext4: fix bug_on ext4_mb_use_inode_paBaokun Li
commit a08f789d2ab5242c07e716baf9a835725046be89 upstream. Hulk Robot reported a BUG_ON: ================================================================== kernel BUG at fs/ext4/mballoc.c:3211! [...] RIP: 0010:ext4_mb_mark_diskspace_used.cold+0x85/0x136f [...] Call Trace: ext4_mb_new_blocks+0x9df/0x5d30 ext4_ext_map_blocks+0x1803/0x4d80 ext4_map_blocks+0x3a4/0x1a10 ext4_writepages+0x126d/0x2c30 do_writepages+0x7f/0x1b0 __filemap_fdatawrite_range+0x285/0x3b0 file_write_and_wait_range+0xb1/0x140 ext4_sync_file+0x1aa/0xca0 vfs_fsync_range+0xfb/0x260 do_fsync+0x48/0xa0 [...] ================================================================== Above issue may happen as follows: ------------------------------------- do_fsync vfs_fsync_range ext4_sync_file file_write_and_wait_range __filemap_fdatawrite_range do_writepages ext4_writepages mpage_map_and_submit_extent mpage_map_one_extent ext4_map_blocks ext4_mb_new_blocks ext4_mb_normalize_request >>> start + size <= ac->ac_o_ex.fe_logical ext4_mb_regular_allocator ext4_mb_simple_scan_group ext4_mb_use_best_found ext4_mb_new_preallocation ext4_mb_new_inode_pa ext4_mb_use_inode_pa >>> set ac->ac_b_ex.fe_len <= 0 ext4_mb_mark_diskspace_used >>> BUG_ON(ac->ac_b_ex.fe_len <= 0); we can easily reproduce this problem with the following commands: `fallocate -l100M disk` `mkfs.ext4 -b 1024 -g 256 disk` `mount disk /mnt` `fsstress -d /mnt -l 0 -n 1000 -p 1` The size must be smaller than or equal to EXT4_BLOCKS_PER_GROUP. Therefore, "start + size <= ac->ac_o_ex.fe_logical" may occur when the size is truncated. So start should be the start position of the group where ac_o_ex.fe_logical is located after alignment. In addition, when the value of fe_logical or EXT4_BLOCKS_PER_GROUP is very large, the value calculated by start_off is more accurate. Cc: stable@kernel.org Fixes: cd648b8a8fd5 ("ext4: trim allocation requests to group size") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20220528110017.354175-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25serial: 8250: Store to lsr_save_flags after lsr readIlpo Järvinen
commit be03b0651ffd8bab69dfd574c6818b446c0753ce upstream. Not all LSR register flags are preserved across reads. Therefore, LSR readers must store the non-preserved bits into lsr_save_flags. This fix was initially mixed into feature commit f6f586102add ("serial: 8250: Handle UART without interrupt on TEMT using em485"). However, that feature change had a flaw and it was reverted to make room for simpler approach providing the same feature. The embedded fix got reverted with the feature change. Re-add the lsr_save_flags fix and properly mark it's a fix. Link: https://lore.kernel.org/all/1d6c31d-d194-9e6a-ddf9-5f29af829f3@linux.intel.com/T/#m1737eef986bd20cf19593e344cebd7b0244945fc Fixes: e490c9144cfa ("tty: Add software emulated RS485 support for 8250") Cc: stable <stable@kernel.org> Acked-by: Uwe Kleine-König <u.kleine-koenig@penugtronix.de> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/f4d774be-1437-a550-8334-19d8722ab98c@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>