summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2023-09-13memfd: replace ratcheting feature from vm.memfd_noexec with hierarchyAleksa Sarai
[ Upstream commit 9876cfe8ec1cb3c88de31f4d58d57b0e7e22bcc4 ] This sysctl has the very unusual behaviour of not allowing any user (even CAP_SYS_ADMIN) to reduce the restriction setting, meaning that if you were to set this sysctl to a more restrictive option in the host pidns you would need to reboot your machine in order to reset it. The justification given in [1] is that this is a security feature and thus it should not be possible to disable. Aside from the fact that we have plenty of security-related sysctls that can be disabled after being enabled (fs.protected_symlinks for instance), the protection provided by the sysctl is to stop users from being able to create a binary and then execute it. A user with CAP_SYS_ADMIN can trivially do this without memfd_create(2): % cat mount-memfd.c #include <fcntl.h> #include <string.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <linux/mount.h> #define SHELLCODE "#!/bin/echo this file was executed from this totally private tmpfs:" int main(void) { int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC); assert(fsfd >= 0); assert(!fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 2)); int dfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0); assert(dfd >= 0); int execfd = openat(dfd, "exe", O_CREAT | O_RDWR | O_CLOEXEC, 0782); assert(execfd >= 0); assert(write(execfd, SHELLCODE, strlen(SHELLCODE)) == strlen(SHELLCODE)); assert(!close(execfd)); char *execpath = NULL; char *argv[] = { "bad-exe", NULL }, *envp[] = { NULL }; execfd = openat(dfd, "exe", O_PATH | O_CLOEXEC); assert(execfd >= 0); assert(asprintf(&execpath, "/proc/self/fd/%d", execfd) > 0); assert(!execve(execpath, argv, envp)); } % ./mount-memfd this file was executed from this totally private tmpfs: /proc/self/fd/5 % Given that it is possible for CAP_SYS_ADMIN users to create executable binaries without memfd_create(2) and without touching the host filesystem (not to mention the many other things a CAP_SYS_ADMIN process would be able to do that would be equivalent or worse), it seems strange to cause a fair amount of headache to admins when there doesn't appear to be an actual security benefit to blocking this. There appear to be concerns about confused-deputy-esque attacks[2] but a confused deputy that can write to arbitrary sysctls is a bigger security issue than executable memfds. /* New API */ The primary requirement from the original author appears to be more based on the need to be able to restrict an entire system in a hierarchical manner[3], such that child namespaces cannot re-enable executable memfds. So, implement that behaviour explicitly -- the vm.memfd_noexec scope is evaluated up the pidns tree to &init_pid_ns and you have the most restrictive value applied to you. The new lower limit you can set vm.memfd_noexec is whatever limit applies to your parent. Note that a pidns will inherit a copy of the parent pidns's effective vm.memfd_noexec setting at unshare() time. This matches the existing behaviour, and it also ensures that a pidns will never have its vm.memfd_noexec setting *lowered* behind its back (but it will be raised if the parent raises theirs). /* Backwards Compatibility */ As the previous version of the sysctl didn't allow you to lower the setting at all, there are no backwards compatibility issues with this aspect of the change. However it should be noted that now that the setting is completely hierarchical. Previously, a cloned pidns would just copy the current pidns setting, meaning that if the parent's vm.memfd_noexec was changed it wouldn't propoagate to existing pid namespaces. Now, the restriction applies recursively. This is a uAPI change, however: * The sysctl is very new, having been merged in 6.3. * Several aspects of the sysctl were broken up until this patchset and the other patchset by Jeff Xu last month. And thus it seems incredibly unlikely that any real users would run into this issue. In the worst case, if this causes userspace isues we could make it so that modifying the setting follows the hierarchical rules but the restriction checking uses the cached copy. [1]: https://lore.kernel.org/CABi2SkWnAgHK1i6iqSqPMYuNEhtHBkO8jUuCvmG3RmUB5TKHJw@mail.gmail.com/ [2]: https://lore.kernel.org/CALmYWFs_dNCzw_pW1yRAo4bGCPEtykroEQaowNULp7svwMLjOg@mail.gmail.com/ [3]: https://lore.kernel.org/CALmYWFuahdUF7cT4cm7_TGLqPanuHXJ-hVSfZt7vpTnc18DPrw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-4-7ff9e3e10ba6@cyphar.com Fixes: 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Jeff Xu <jeffxu@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13memfd: do not -EACCES old memfd_create() users with vm.memfd_noexec=2Aleksa Sarai
[ Upstream commit 202e14222fadb246dfdf182e67de1518e86a1e20 ] Given the difficulty of auditing all of userspace to figure out whether every memfd_create() user has switched to passing MFD_EXEC and MFD_NOEXEC_SEAL flags, it seems far less distruptive to make it possible for older programs that don't make use of executable memfds to run under vm.memfd_noexec=2. Otherwise, a small dependency change can result in spurious errors. For programs that don't use executable memfds, passing MFD_NOEXEC_SEAL is functionally a no-op and thus having the same In addition, every failure under vm.memfd_noexec=2 needs to print to the kernel log so that userspace can figure out where the error came from. The concerns about pr_warn_ratelimited() spam that caused the switch to pr_warn_once()[1,2] do not apply to the vm.memfd_noexec=2 case. This is a user-visible API change, but as it allows programs to do something that would be blocked before, and the sysctl itself was broken and recently released, it seems unlikely this will cause any issues. [1]: https://lore.kernel.org/Y5yS8wCnuYGLHMj4@x1n/ [2]: https://lore.kernel.org/202212161233.85C9783FB@keescook/ Link: https://lkml.kernel.org/r/20230814-memfd-vm-noexec-uapi-fixes-v2-2-7ff9e3e10ba6@cyphar.com Fixes: 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Jeff Xu <jeffxu@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODEDLuiz Augusto von Dentz
[ Upstream commit 253f3399f4c09ce6f4e67350f839be0361b4d5ff ] This introduces HCI_QUIRK_BROKEN_LE_CODED which is used to indicate that LE Coded PHY shall not be used, it is then set for some Intel models that claim to support it but when used causes many problems. Cc: stable@vger.kernel.org # 6.4.y+ Link: https://github.com/bluez/bluez/issues/577 Link: https://github.com/bluez/bluez/issues/582 Link: https://lore.kernel.org/linux-bluetooth/CABBYNZKco-v7wkjHHexxQbgwwSz-S=GZ=dZKbRE1qxT1h4fFbQ@mail.gmail.com/T/# Fixes: 288c90224eec ("Bluetooth: Enable all supported LE PHY by default") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: msft: Extended monitor tracking by address filterHilda Wu
[ Upstream commit 9e14606d8f38ea52a38c27692a9c1513c987a5da ] Since limited tracking device per condition, this feature is to support tracking multiple devices concurrently. When a pattern monitor detects the device, this feature issues an address monitor for tracking that device. Let pattern monitor can keep monitor new devices. This feature adds an address filter when receiving a LE monitor device event which monitor handle is for a pattern, and the controller started monitoring the device. And this feature also has cancelled the monitor advertisement from address filters when receiving a LE monitor device event when the controller stopped monitoring the device specified by an address and monitor handle. Below is an example to know the feature adds the address filter. //Add MSFT pattern monitor < HCI Command: Vendor (0x3f|0x00f0) plen 14 #142 [hci0] 55.552420 03 b8 a4 03 ff 01 01 06 09 05 5f 52 45 46 .........._REF > HCI Event: Command Complete (0x0e) plen 6 #143 [hci0] 55.653960 Vendor (0x3f|0x00f0) ncmd 2 Status: Success (0x00) 03 00 //Got event from the pattern monitor > HCI Event: Vendor (0xff) plen 18 #148 [hci0] 58.384953 23 79 54 33 77 88 97 68 02 00 fb c1 29 eb 27 b8 #yT3w..h....).'. 00 01 .. //Add MSFT address monitor (Sample address: B8:27:EB:29:C1:FB) < HCI Command: Vendor (0x3f|0x00f0) plen 13 #149 [hci0] 58.385067 03 b8 a4 03 ff 04 00 fb c1 29 eb 27 b8 .........).'. //Report to userspace about found device (ADV Monitor Device Found) @ MGMT Event: Unknown (0x002f) plen 38 {0x0003} [hci0] 58.680042 01 00 fb c1 29 eb 27 b8 01 ce 00 00 00 00 16 00 ....).'......... 0a 09 4b 45 59 42 44 5f 52 45 46 02 01 06 03 19 ..KEYBD_REF..... c1 03 03 03 12 18 ...... //Got event from address monitor > HCI Event: Vendor (0xff) plen 18 #152 [hci0] 58.672956 23 79 54 33 77 88 97 68 02 00 fb c1 29 eb 27 b8 #yT3w..h....).'. 01 01 Signed-off-by: Alex Lu <alex_lu@realsil.com.cn> Signed-off-by: Hilda Wu <hildawu@realtek.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 253f3399f4c0 ("Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODED") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13RISC-V: Add ptrace support for vectorsAndy Chiu
commit 9300f00439743c4a34d735e1a27118eb68a1504e upstream. This patch add back the ptrace support with the following fix: - Define NT_RISCV_CSR and re-number NT_RISCV_VECTOR to prevent conflicting with gdb's NT_RISCV_CSR. - Use struct __riscv_v_regset_state to handle ptrace requests Since gdb does not directly include the note description header in Linux and has already defined NT_RISCV_CSR as 0x900, we decide to sync with gdb and renumber NT_RISCV_VECTOR to solve and prevent future conflicts. Fixes: 0c59922c769a ("riscv: Add ptrace vector support") Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20230825050248.32681-1-andy.chiu@sifive.com [Palmer: Drop the unused "size" variable in riscv_vr_set().] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13arm64: sdei: abort running SDEI handlers during crashD Scott Phillips
commit 5cd474e57368f0957c343bb21e309cf82826b1ef upstream. Interrupts are blocked in SDEI context, per the SDEI spec: "The client interrupts cannot preempt the event handler." If we crashed in the SDEI handler-running context (as with ACPI's AGDI) then we need to clean up the SDEI state before proceeding to the crash kernel so that the crash kernel can have working interrupts. Track the active SDEI handler per-cpu so that we can COMPLETE_AND_RESUME the handler, discarding the interrupted context. Fixes: f5df26961853 ("arm64: kernel: Add arch-specific SDEI entry code and CPU masking") Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> Cc: stable@vger.kernel.org Reviewed-by: James Morse <james.morse@arm.com> Tested-by: Mihai Carabas <mihai.carabas@oracle.com> Link: https://lore.kernel.org/r/20230627002939.2758-1-scott@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13net: handle ARPHRD_PPP in dev_is_mac_header_xmit()Nicolas Dichtel
commit a4f39c9f14a634e4cd35fcd338c239d11fcc73fc upstream. The goal is to support a bpf_redirect() from an ethernet device (ingress) to a ppp device (egress). The l2 header is added automatically by the ppp driver, thus the ethernet header should be removed. CC: stable@vger.kernel.org Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Siwar Zitouni <siwar.zitouni@6wind.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13mfd: rz-mtu3: Link time dependenciesArnd Bergmann
[ Upstream commit 10d3340441bd0db857fc7fcb1733a800acf47a3d ] The new set of drivers for RZ/G2L MTU3a tries to enable compile-testing the individual client drivers even when the MFD portion is disabled but gets it wrong, causing a link failure when the core is in a loadable module but the other drivers are built-in: x86_64-linux-ld: drivers/pwm/pwm-rz-mtu3.o: in function `rz_mtu3_pwm_apply': pwm-rz-mtu3.c:(.text+0x4bf): undefined reference to `rz_mtu3_8bit_ch_write' x86_64-linux-ld: pwm-rz-mtu3.c:(.text+0x509): undefined reference to `rz_mtu3_disable' arm-linux-gnueabi-ld: drivers/counter/rz-mtu3-cnt.o: in function `rz_mtu3_cascade_counts_enable_get': rz-mtu3-cnt.c:(.text+0xbec): undefined reference to `rz_mtu3_shared_reg_read' It seems better not to add the extra complexity here but instead just use a normal hard dependency, so remove the #else portion in the header along with the "|| COMPILE_TEST". This could also be fixed by having slightly more elaborate Kconfig dependencies or using the cursed 'IS_REACHABLE()' helper, but in practice it's already possible to compile-test all these drivers by enabling the mtd portion. Fixes: 254d3a727421c ("pwm: Add Renesas RZ/G2L MTU3a PWM driver") Fixes: 0be8907359df4 ("counter: Add Renesas RZ/G2L MTU3a counter driver") Fixes: 654c293e1687b ("mfd: Add Renesas RZ/G2L MTU3a core driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thierry Reding <thierry.reding@gmail.com> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/20230719090430.1925182-1-arnd@kernel.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13nvmem: core: Return NULL when no nvmem layout is foundMiquel Raynal
[ Upstream commit 81e1d9a39569d315f747c2af19ce502cd08645ed ] Currently, of_nvmem_layout_get_container() returns NULL on error, or an error pointer if either CONFIG_NVMEM or CONFIG_OF is turned off. We should likely avoid this kind of mix for two reasons: to clarify the intend and anyway fix the !CONFIG_OF which will likely always if we use this helper somewhere else. Let's just return NULL when no layout is found, we don't need an error value here. Link: https://staticthinking.wordpress.com/2022/08/01/mixing-error-pointers-and-null/ Fixes: 266570f496b9 ("nvmem: core: introduce NVMEM layouts") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202308030002.DnSFOrMB-lkp@intel.com/ Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Michael Walle <michael@walle.cc> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20230823132744.350618-21-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13scsi: core: Use 32-bit hostnum in scsi_host_lookup()Tony Battersby
[ Upstream commit 62ec2092095b678ff89ce4ba51c2938cd1e8e630 ] Change scsi_host_lookup() hostnum argument type from unsigned short to unsigned int to match the type used everywhere else. Fixes: 6d49f63b415c ("[SCSI] Make host_no an unsigned int") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://lore.kernel.org/r/a02497e7-c12b-ef15-47fc-3f0a0b00ffce@cybernetics.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13HID: input: Support devices sending Eraser without InvertIllia Ostapyshyn
[ Upstream commit 276e14e6c3993317257e1787e93b7166fbc30905 ] Some digitizers (notably XP-Pen Artist 24) do not report the Invert usage when erasing. This causes the device to be permanently stuck with the BTN_TOOL_RUBBER tool after sending Eraser, as Invert is the only usage that can release the tool. In this state, Touch and Inrange are no longer reported to userspace, rendering the pen unusable. Prior to commit 87562fcd1342 ("HID: input: remove the need for HID_QUIRK_INVERT"), BTN_TOOL_RUBBER was never set and Eraser events were simply translated into BTN_TOUCH without causing an inconsistent state. Introduce HID_QUIRK_NOINVERT for such digitizers and detect them during hidinput_configure_usage(). This quirk causes the tool to be released as soon as Eraser is reported as not set. Set BTN_TOOL_RUBBER in input->keybit when mapping Eraser. Fixes: 87562fcd1342 ("HID: input: remove the need for HID_QUIRK_INVERT") Co-developed-by: Nils Fuhler <nils@nilsfuhler.de> Signed-off-by: Nils Fuhler <nils@nilsfuhler.de> Signed-off-by: Illia Ostapyshyn <ostapyshyn@sra.uni-hannover.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13media: cec: core: add adap_unconfigured() callbackHans Verkuil
[ Upstream commit 948a77aaecf202f722cf2264025f9987e5bd5c26 ] The adap_configured() callback was called with the adap->lock mutex held if the 'configured' argument was false, and without the adap->lock mutex held if that argument was true. That was very confusing, and so split this up in a adap_unconfigured() callback and a high-level configured() callback. This also makes it easier to understand when the mutex is held: all low-level adap_* callbacks are called with the mutex held. All other callbacks are called without that mutex held. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: f1b57164305d ("media: cec: add optional adap_configured callback") Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13media: cec: core: add adap_nb_transmit_canceled() callbackHans Verkuil
[ Upstream commit da53c36ddd3f118a525a04faa8c47ca471e6c467 ] A potential deadlock was found by Zheng Zhang with a local syzkaller instance. The problem is that when a non-blocking CEC transmit is canceled by calling cec_data_cancel, that in turn can call the high-level received() driver callback, which can call cec_transmit_msg() to transmit a new message. The cec_data_cancel() function is called with the adap->lock mutex held, and cec_transmit_msg() tries to take that same lock. The root cause is that the received() callback can either be used to pass on a received message (and then adap->lock is not held), or to report a canceled transmit (and then adap->lock is held). This is confusing, so create a new low-level adap_nb_transmit_canceled callback that reports back that a non-blocking transmit was canceled. And the received() callback is only called when a message is received, as was the case before commit f9d0ecbf56f4 ("media: cec: correctly pass on reply results") complicated matters. Reported-by: Zheng Zhang <zheng.zhang@email.ucr.edu> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: f9d0ecbf56f4 ("media: cec: correctly pass on reply results") Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13kernfs: add stub helper for kernfs_generic_poll()Arnd Bergmann
[ Upstream commit 79038a99445f69c5d28494dd4f8c6f0509f65b2e ] In some randconfig builds, kernfs ends up being disabled, so there is no prototype for kernfs_generic_poll() In file included from kernel/sched/build_utility.c:97: kernel/sched/psi.c:1479:3: error: implicit declaration of function 'kernfs_generic_poll' is invalid in C99 [-Werror,-Wimplicit-function-declaration] kernfs_generic_poll(t->of, wait); ^ Add a stub helper for it, as we have it for other kernfs functions. Fixes: aff037078ecae ("sched/psi: use kernfs polling functions for PSI trigger polling") Fixes: 147e1a97c4a0b ("fs: kernfs: add poll file operation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Link: https://lore.kernel.org/r/20230724121823.1357562-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13dma-buf/sync_file: Fix docs syntaxRob Clark
[ Upstream commit 05d56d8079d510a2994039470f65bea85f0075ee ] Fixes the warning: include/uapi/linux/sync_file.h:77: warning: Function parameter or member 'num_fences' not described in 'sync_file_info' Fixes: 2d75c88fefb2 ("staging/android: refactor SYNC IOCTLs") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230724145000.125880-1-robdclark@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13scsi: ufs: Fix residual handlingBart Van Assche
[ Upstream commit 2903265e27bfc6dea915dd9e17a1b2587f621f73 ] Only call scsi_set_resid() in case of an underflow. Do not call scsi_set_resid() in case of an overflow. Cc: Avri Altman <avri.altman@wdc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Fixes: cb38845d90fc ("scsi: ufs: core: Set the residual byte count") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230724200843.3376570-2-bvanassche@acm.org Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: Add locking to RMW PCI Express Capability Register accessorsIlpo Järvinen
[ Upstream commit 5e70d0acf0825f439079736080350371f8d6699a ] Many places in the kernel write the Link Control and Root Control PCI Express Capability Registers without proper concurrency control and this could result in losing the changes one of the writers intended to make. Add pcie_cap_lock spinlock into the struct pci_dev and use it to protect bit changes made in the RMW capability accessors. Protect only a selected set of registers by differentiating the RMW accessor internally to locked/unlocked variants using a wrapper which has the same signature as pcie_capability_clear_and_set_word(). As the Capability Register (pos) given to the wrapper is always a constant, the compiler should be able to simplify all the dead-code away. So far only the Link Control Register (ASPM, hotplug, link retraining, various drivers) and the Root Control Register (AER & PME) seem to require RMW locking. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: c7f486567c1d ("PCI PM: PCIe PME root port service driver") Fixes: f12eb72a268b ("PCI/ASPM: Use PCI Express Capability accessors") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Fixes: affa48de8417 ("staging/rdma/hfi1: Add support for enabling/disabling PCIe ASPM") Fixes: 849a9366cba9 ("misc: rtsx: Add support new chip rts5228 mmc: rtsx: Add support MMC_CAP2_NO_MMC") Fixes: 3d1e7aa80d1c ("misc: rtsx: Use pcie_capability_clear_and_set_word() for PCI_EXP_LNKCTL") Fixes: c0e5f4e73a71 ("misc: rtsx: Add support for RTS5261") Fixes: 3df4fce739e2 ("misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG") Fixes: 121e9c6b5c4c ("misc: rtsx: modify and fix init_hw function") Fixes: 19f3bd548f27 ("mfd: rtsx: Remove LCTLR defination") Fixes: 773ccdfd9cc6 ("mfd: rtsx: Read vendor setting from config space") Fixes: 8275b77a1513 ("mfd: rts5249: Add support for RTS5250S power saving") Fixes: 5da4e04ae480 ("misc: rtsx: Add support for RTS5260") Fixes: 0f49bfbd0f2e ("tg3: Use PCI Express Capability accessors") Fixes: 5e7dfd0fb94a ("tg3: Prevent corruption at 10 / 100Mbps w CLKREQ") Fixes: b726e493e8dc ("r8169: sync existing 8168 device hardware start sequences with vendor driver") Fixes: e6de30d63eb1 ("r8169: more 8168dp support.") Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards") Fixes: 6f461f6c7c96 ("e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata") Fixes: 1eae4eb2a1c7 ("e1000e: Disable L1 ASPM power savings for 82573 mobile variants") Fixes: 8060e169e02f ("ath9k: Enable extended synch for AR9485 to fix L0s recovery issue") Fixes: 69ce674bfa69 ("ath9k: do btcoex ASPM disabling at initialization time") Fixes: f37f05503575 ("mt76: mt76x2e: disable pcie_aspm by default") Link: https://lore.kernel.org/r/20230717120503.15276-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13dt-bindings: clock: qcom,gcc-sc8280xp: Add missing GDSCsKonrad Dybcio
[ Upstream commit 9eba4db02a88e7a810aabd70f7a6960f184f391f ] There are 10 more GDSCs that we've not been caring about, and by extension (and perhaps even more importantly), not putting to sleep. Add them. Fixes: a66a82f2a55e ("dt-bindings: clock: Add Qualcomm SC8280XP GCC bindings") Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20230620-topic-sc8280_gccgdsc-v2-2-562c1428c10d@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Stable-dep-of: 4712eb7ff85b ("clk: qcom: gcc-sc8280xp: Add missing GDSCs") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13ALSA: ump: Don't create unused substreams for static blocksTakashi Iwai
[ Upstream commit b2bcbd031d34d1ba1f491b9152474cf9f6d4d51b ] When the UMP Endpoint is declared as "static", that is, no dynamic reassignment of UMP Groups, it makes little sense to expose always all 16 groups with 16 substreams. Many of those substreams are disabled groups, hence they are useless, but applications don't know it and try to open / access all those substreams unnecessarily. This patch limits the number of UMP legacy rawmidi substreams only to the active groups. The behavior is changed only for the static endpoint (i.e. devices without UMP v1.1 feature implemented or with the static block flag is set). Fixes: 0b5288f5fe63 ("ALSA: ump: Add legacy raw MIDI support") Link: https://lore.kernel.org/r/20230824075108.29958-4-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13block: uapi: Fix compilation errors using ioprio.h with C++Damien Le Moal
[ Upstream commit c7b4b23b36edf32239e7fc3b922797ff1d32b072 ] The use of the "class" argument name in the ioprio_value() inline function in include/uapi/linux/ioprio.h confuses C++ compilers resulting in compilation errors such as: /usr/include/linux/ioprio.h:110:43: error: expected primary-expression before ‘int’ 110 | static __always_inline __u16 ioprio_value(int class, int level, int hint) | ^~~ for user C++ programs including linux/ioprio.h. Avoid these errors by renaming the arguments of the ioprio_value() function to prioclass, priolevel and priohint. For consistency, the arguments of the IOPRIO_PRIO_VALUE() and IOPRIO_PRIO_VALUE_HINT() macros are also renamed in the same manner. Reported-by: Igor Pylypiv <ipylypiv@google.com> Fixes: 01584c1e2337 ("scsi: block: Improve ioprio value validity checks") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20230814215833.259286-1-dlemoal@kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13block: don't allow enabling a cache on devices that don't support itChristoph Hellwig
[ Upstream commit 43c9835b144c7ce29efe142d662529662a9eb376 ] Currently the write_cache attribute allows enabling the QUEUE_FLAG_WC flag on devices that never claimed the capability. Fix that by adding a QUEUE_FLAG_HW_WC flag that is set by blk_queue_write_cache and guards re-enabling the cache through sysfs. Note that any rescan that calls blk_queue_write_cache will still re-enable the write cache as in the current code. Fixes: 93e9d8e836cb ("block: add ability to flag write back caching on a device") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230707094239.107968-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: ISO: Notify user space about failed bis connectionsIulia Tanasescu
[ Upstream commit f777d88278170410b06a1f6633f3b9375a4ddd6b ] Some use cases require the user to be informed if BIG synchronization fails. This commit makes it so that even if the BIG sync established event arrives with error status, a new hconn is added for each BIS, and the iso layer is notified about the failed connections. Unsuccesful bis connections will be marked using the HCI_CONN_BIG_SYNC_FAILED flag. From the iso layer, the POLLERR event is triggered on the newly allocated bis sockets, before adding them to the accept list of the parent socket. From user space, a new fd for each failed bis connection will be obtained by calling accept. The user should check for the POLLERR event on the new socket, to determine if the connection was successful or not. The HCI_CONN_BIG_SYNC flag has been added to mark whether the BIG sync has been successfully established. This flag is checked at bis cleanup, so the HCI LE BIG Terminate Sync command is only issued if needed. The BT_SK_BIG_SYNC flag indicates if BIG create sync has been called for a listening socket, to avoid issuing the command everytime a BIGInfo advertising report is received. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 94d9ba9f9888 ("Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: hci_conn: Consolidate code for aborting connectionsLuiz Augusto von Dentz
[ Upstream commit a13f316e90fdb1fb6df6582e845aa9b3270f3581 ] This consolidates code for aborting connections using hci_cmd_sync_queue so it is synchronized with other threads, but because of the fact that some commands may block the cmd_sync_queue while waiting specific events this attempt to cancel those requests by using hci_cmd_sync_cancel. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 94d9ba9f9888 ("Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13mac80211: make ieee80211_tx_info padding explicitArnd Bergmann
[ Upstream commit a7a2ef0c4b3efbd7d6f3fabd87dbbc0b3f2de5af ] While looking at a bug, I got rather confused by the layout of the 'status' field in ieee80211_tx_info. Apparently, the intention is that status_driver_data[] is used for driver specific data, and fills up the size of the union to 40 bytes, just like the other ones. This is indeed what actually happens, but only because of the combination of two mistakes: - "void *status_driver_data[18 / sizeof(void *)];" is intended to be 18 bytes long but is actually two bytes shorter because of rounding-down in the division, to a multiple of the pointer size (4 bytes or 8 bytes). - The other fields combined are intended to be 22 bytes long, but are actually 24 bytes because of padding in front of the unaligned tx_time member, and in front of the pointer array. The two mistakes cancel out. so the size ends up fine, but it seems more helpful to make this explicit, by having a multiple of 8 bytes in the size calculation and explicitly describing the padding. Fixes: ea5907db2a9cc ("mac80211: fix struct ieee80211_tx_info size") Fixes: 02219b3abca59 ("mac80211: add WMM admission control support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230623152443.2296825-2-arnd@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13nmi_backtrace: allow excluding an arbitrary CPUDouglas Anderson
[ Upstream commit 8d539b84f1e3478436f978ceaf55a0b6cab497b5 ] The APIs that allow backtracing across CPUs have always had a way to exclude the current CPU. This convenience means callers didn't need to find a place to allocate a CPU mask just to handle the common case. Let's extend the API to take a CPU ID to exclude instead of just a boolean. This isn't any more complex for the API to handle and allows the hardlockup detector to exclude a different CPU (the one it already did a trace for) without needing to find space for a CPU mask. Arguably, this new API also encourages safer behavior. Specifically if the caller wants to avoid tracing the current CPU (maybe because they already traced the current CPU) this makes it more obvious to the caller that they need to make sure that the current CPU ID can't change. [akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub] Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 1f38c86bb29f ("watchdog/hardlockup: avoid large stack frames in watchdog_hardlockup_check()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13usb: typec: bus: verify partner exists in typec_altmode_attentionRD Babiera
commit f23643306430f86e2f413ee2b986e0773e79da31 upstream. Some usb hubs will negotiate DisplayPort Alt mode with the device but will then negotiate a data role swap after entering the alt mode. The data role swap causes the device to unregister all alt modes, however the usb hub will still send Attention messages even after failing to reregister the Alt Mode. type_altmode_attention currently does not verify whether or not a device's altmode partner exists, which results in a NULL pointer error when dereferencing the typec_altmode and typec_altmode_ops belonging to the altmode partner. Verify the presence of a device's altmode partner before sending the Attention message to the Alt Mode driver. Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230814180559.923475-1-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13lwt: Check LWTUNNEL_XMIT_CONTINUE strictlyYan Zhai
[ Upstream commit a171fbec88a2c730b108c7147ac5e7b2f5a02b47 ] LWTUNNEL_XMIT_CONTINUE is implicitly assumed in ip(6)_finish_output2, such that any positive return value from a xmit hook could cause unexpected continue behavior, despite that related skb may have been freed. This could be error-prone for future xmit hook ops. One of the possible errors is to return statuses of dst_output directly. To make the code safer, redefine LWTUNNEL_XMIT_CONTINUE value to distinguish from dst_output statuses and check the continue condition explicitly. Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure") Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/96b939b85eda00e8df4f7c080f770970a4c5f698.1692326837.git.yan@cloudflare.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13net-memcg: Fix scope of sockmem pressure indicatorsAbel Wu
[ Upstream commit ac8a52962164a50e693fa021d3564d7745b83a7f ] Now there are two indicators of socket memory pressure sit inside struct mem_cgroup, socket_pressure and tcpmem_pressure, indicating memory reclaim pressure in memcg->memory and ->tcpmem respectively. When in legacy mode (cgroupv1), the socket memory is charged into ->tcpmem which is independent of ->memory, so socket_pressure has nothing to do with socket's pressure at all. Things could be worse by taking socket_pressure into consideration in legacy mode, as a pressure in ->memory can lead to premature reclamation/throttling in socket. While for the default mode (cgroupv2), the socket memory is charged into ->memory, and ->tcpmem/->tcpmem_pressure are simply not used. So {socket,tcpmem}_pressure are only used in default/legacy mode respectively for indicating socket memory pressure. This patch fixes the pieces of code that make mixed use of both. Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure") Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> Acked-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: hci_conn: Always allocate unique handlesLuiz Augusto von Dentz
[ Upstream commit 9f78191cc9f1b34c2e2afd7b554a83bf034092dd ] This attempts to always allocate a unique handle for connections so they can be properly aborted by the likes of hci_abort_conn, so this uses the invalid range as a pool of unset handles that way if userspace is trying to create multiple connections at once each will be given a unique handle which will be considered unset. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 66dee21524d9 ("Bluetooth: hci_event: drop only unbound CIS if Set CIG Parameters fails") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: ISO: do not emit new LE Create CIS if previous is pendingPauli Virtanen
[ Upstream commit 7f74563e6140e42b4ffae62adbef7a65967a3f98 ] LE Create CIS command shall not be sent before all CIS Established events from its previous invocation have been processed. Currently it is sent via hci_sync but that only waits for the first event, but there can be multiple. Make it wait for all events, and simplify the CIS creation as follows: Add new flag HCI_CONN_CREATE_CIS, which is set if Create CIS has been sent for the connection but it is not yet completed. Make BT_CONNECT state to mean the connection wants Create CIS. On events after which new Create CIS may need to be sent, send it if possible and some connections need it. These events are: hci_connect_cis, iso_connect_cfm, hci_cs_le_create_cis, hci_le_cis_estabilished_evt. The Create CIS status/completion events shall queue new Create CIS only if at least one of the connections transitions away from BT_CONNECT, so that we don't loop if controller is sending bogus events. This fixes sending multiple CIS Create for the same CIS in the "ISO AC 6(i) - Success" BlueZ test case: < HCI Command: LE Create Co.. (0x08|0x0064) plen 9 #129 [hci0] Number of CIS: 2 CIS Handle: 257 ACL Handle: 42 CIS Handle: 258 ACL Handle: 42 > HCI Event: Command Status (0x0f) plen 4 #130 [hci0] LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 29 #131 [hci0] LE Connected Isochronous Stream Established (0x19) Status: Success (0x00) Connection Handle: 257 ... < HCI Command: LE Setup Is.. (0x08|0x006e) plen 13 #132 [hci0] ... > HCI Event: Command Complete (0x0e) plen 6 #133 [hci0] LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 ... < HCI Command: LE Create Co.. (0x08|0x0064) plen 5 #134 [hci0] Number of CIS: 1 CIS Handle: 258 ACL Handle: 42 > HCI Event: Command Status (0x0f) plen 4 #135 [hci0] LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1 Status: ACL Connection Already Exists (0x0b) > HCI Event: LE Meta Event (0x3e) plen 29 #136 [hci0] LE Connected Isochronous Stream Established (0x19) Status: Success (0x00) Connection Handle: 258 ... Fixes: c09b80be6ffc ("Bluetooth: hci_conn: Fix not waiting for HCI_EVT_LE_CIS_ESTABLISHED") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Bluetooth: ISO: Add support for connecting multiple BISesIulia Tanasescu
[ Upstream commit a0bfde167b506423111ddb8cd71930497a40fc54 ] It is required for some configurations to have multiple BISes as part of the same BIG. Similar to the flow implemented for unicast, DEFER_SETUP will also be used to bind multiple BISes for the same BIG, before starting Periodic Advertising and creating the BIG. The user will have to open a new socket for each BIS. By setting the BT_DEFER_SETUP socket option and calling connect, a new connection will be added for the BIG and advertising handle set by the socket QoS parameters. Since all BISes will be bound for the same BIG and advertising handle, the socket QoS options and base parameters should match for all connections. By calling connect on a socket that does not have the BT_DEFER_SETUP option set, periodic advertising will be started and the BIG will be created, with a BIS for each previously bound connection. Since a BIG cannot be reconfigured with additional BISes after creation, no more connections can be bound for the BIG after the start periodic advertising and create BIG commands have been queued. The bis_cleanup function has also been updated, so that the advertising set and the BIG will not be terminated unless there are no more bound or connected BISes. The HCI_CONN_BIG_CREATED connection flag has been added to indicate that the BIG has been successfully created. This flag is checked at bis_cleanup, so that the BIG is only terminated if the HCI_LE_Create_BIG_Complete has been received. This implementation has been tested on hardware, using the "isotest" tool with an additional command line option, to specify the number of BISes to create as part of the desired BIG: tools/isotest -i hci0 -s 00:00:00:00:00:00 -N 2 -G 1 -T 1 The btmon log shows that a BIG containing 2 BISes has been created: < HCI Command: LE Create Broadcast Isochronous Group (0x08|0x0068) plen 31 Handle: 0x01 Advertising Handle: 0x01 Number of BIS: 2 SDU Interval: 10000 us (0x002710) Maximum SDU size: 40 Maximum Latency: 10 ms (0x000a) RTN: 0x02 PHY: LE 2M (0x02) Packing: Sequential (0x00) Framing: Unframed (0x00) Encryption: 0x00 Broadcast Code: 00000000000000000000000000000000 > HCI Event: Command Status (0x0f) plen 4 LE Create Broadcast Isochronous Group (0x08|0x0068) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 23 LE Broadcast Isochronous Group Complete (0x1b) Status: Success (0x00) Handle: 0x01 BIG Synchronization Delay: 1974 us (0x0007b6) Transport Latency: 1974 us (0x0007b6) PHY: LE 2M (0x02) NSE: 3 BN: 1 PTO: 1 IRC: 3 Maximum PDU: 40 ISO Interval: 10.00 msec (0x0008) Connection Handle #0: 10 Connection Handle #1: 11 < HCI Command: LE Setup Isochronous Data Path (0x08|0x006e) plen 13 Handle: 10 Data Path Direction: Input (Host to Controller) (0x00) Data Path: HCI (0x00) Coding Format: Transparent (0x03) Company Codec ID: Ericsson Technology Licensing (0) Vendor Codec ID: 0 Controller Delay: 0 us (0x000000) Codec Configuration Length: 0 Codec Configuration: > HCI Event: Command Complete (0x0e) plen 6 LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 Status: Success (0x00) Handle: 10 < HCI Command: LE Setup Isochronous Data Path (0x08|0x006e) plen 13 Handle: 11 Data Path Direction: Input (Host to Controller) (0x00) Data Path: HCI (0x00) Coding Format: Transparent (0x03) Company Codec ID: Ericsson Technology Licensing (0) Vendor Codec ID: 0 Controller Delay: 0 us (0x000000) Codec Configuration Length: 0 Codec Configuration: > HCI Event: Command Complete (0x0e) plen 6 LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 Status: Success (0x00) Handle: 11 < ISO Data TX: Handle 10 flags 0x02 dlen 44 < ISO Data TX: Handle 11 flags 0x02 dlen 44 > HCI Event: Number of Completed Packets (0x13) plen 5 Num handles: 1 Handle: 10 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 Num handles: 1 Handle: 11 Count: 1 Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Stable-dep-of: 7f74563e6140 ("Bluetooth: ISO: do not emit new LE Create CIS if previous is pending") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13crypto: api - Use work queue in crypto_destroy_instanceHerbert Xu
[ Upstream commit 9ae4577bc077a7e32c3c7d442c95bc76865c0f17 ] The function crypto_drop_spawn expects to be called in process context. However, when an instance is unregistered while it still has active users, the last user may cause the instance to be freed in atomic context. Fix this by delaying the freeing to a work queue. Fixes: 6bfd48096ff8 ("[CRYPTO] api: Added spawns") Reported-by: Florent Revest <revest@chromium.org> Reported-by: syzbot+d769eed29cc42d75e2a3@syzkaller.appspotmail.com Reported-by: syzbot+610ec0671f51e838436e@syzkaller.appspotmail.com Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13tcp: tcp_enter_quickack_mode() should be staticEric Dumazet
[ Upstream commit 03b123debcbc8db987bda17ed8412cc011064c22 ] After commit d2ccd7bc8acd ("tcp: avoid resetting ACK timer in DCTCP"), tcp_enter_quickack_mode() is only used from net/ipv4/tcp_input.c. Fixes: d2ccd7bc8acd ("tcp: avoid resetting ACK timer in DCTCP") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20230718162049.1444938-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13bpf: Clear the probe_addr for uprobeYafang Shao
[ Upstream commit 5125e757e62f6c1d5478db4c2b61a744060ddf3f ] To avoid returning uninitialized or random values when querying the file descriptor (fd) and accessing probe_addr, it is necessary to clear the variable prior to its use. Fixes: 41bdc4b40ed6 ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY") Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230709025630.3735-6-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13vfs, security: Fix automount superblock LSM init problem, preventing NFS sb ↵David Howells
sharing [ Upstream commit d80a8f1b58c2bc8d7c6bfb65401ea4f7ec8cddc2 ] When NFS superblocks are created by automounting, their LSM parameters aren't set in the fs_context struct prior to sget_fc() being called, leading to failure to match existing superblocks. This bug leads to messages like the following appearing in dmesg when fscache is enabled: NFS: Cache volume key already in use (nfs,4.2,2,108,106a8c0,1,,,,100000,100000,2ee,3a98,1d4c,3a98,1) Fix this by adding a new LSM hook to load fc->security for submount creation. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/165962680944.3334508.6610023900349142034.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165962729225.3357250.14350728846471527137.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/165970659095.2812394.6868894171102318796.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/166133579016.3678898.6283195019480567275.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/217595.1662033775@warthog.procyon.org.uk/ # v5 Fixes: 9bc61ab18b1d ("vfs: Introduce fs_context, switch vfs_kern_mount() to it.") Fixes: 779df6a5480f ("NFS: Ensure security label is set for root inode") Tested-by: Jeff Layton <jlayton@kernel.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: "Christian Brauner (Microsoft)" <brauner@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230808-master-v9-1-e0ecde888221@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-06usb: typec: tcpci: clear the fault status bitMarco Felsch
commit 23e60c8daf5ec2ab1b731310761b668745fcf6ed upstream. According the "USB Type-C Port Controller Interface Specification v2.0" the TCPC sets the fault status register bit-7 (AllRegistersResetToDefault) once the registers have been reset to their default values. This triggers an alert(-irq) on PTN5110 devices albeit we do mask the fault-irq, which may cause a kernel hang. Fix this generically by writing a one to the corresponding bit-7. Cc: stable@vger.kernel.org Fixes: 74e656d6b055 ("staging: typec: Type-C Port Controller Interface driver (tcpci)") Reported-by: "Angus Ainslie (Purism)" <angus@akkea.ca> Closes: https://lore.kernel.org/all/20190508002749.14816-2-angus@akkea.ca/ Reported-by: Christian Bach <christian.bach@scs.ch> Closes: https://lore.kernel.org/regressions/ZR0P278MB07737E5F1D48632897D51AC3EB329@ZR0P278MB0773.CHEP278.PROD.OUTLOOK.COM/t/ Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Signed-off-by: Fabio Estevam <festevam@denx.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230816172502.1155079-1-festevam@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-02ipv6: remove hard coded limitation on ipv6_pinfoEric Dumazet
commit f5f80e32de12fad2813d37270e8364a03e6d3ef0 upstream. IPv6 inet sockets are supposed to have a "struct ipv6_pinfo" field at the end of their definition, so that inet6_sk_generic() can derive from socket size the offset of the "struct ipv6_pinfo". This is very fragile, and prevents adding bigger alignment in sockets, because inet6_sk_generic() does not work if the compiler adds padding after the ipv6_pinfo component. We are currently working on a patch series to reorganize TCP structures for better data locality and found issues similar to the one fixed in commit f5d547676ca0 ("tcp: fix tcp_inet6_sk() for 32bit kernels") Alternative would be to force an alignment on "struct ipv6_pinfo", greater or equal to __alignof__(any ipv6 sock) to ensure there is no padding. This does not look great. v2: fix typo in mptcp_proto_v6_init() (Paolo) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Chao Wu <wwchao@google.com> Cc: Wei Wang <weiwan@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-02module: Expose module_init_layout_section()James Morse
commit 2abcc4b5a64a65a2d2287ba0be5c2871c1552416 upstream. module_init_layout_section() choses whether the core module loader considers a section as init or not. This affects the placement of the exit section when module unloading is disabled. This code will never run, so it can be free()d once the module has been initialised. arm and arm64 need to count the number of PLTs they need before applying relocations based on the section name. The init PLTs are stored separately so they can be free()d. arm and arm64 both use within_module_init() to decide which list of PLTs to use when applying the relocation. Because within_module_init()'s behaviour changes when module unloading is disabled, both architecture would need to take this into account when counting the PLTs. Today neither architecture does this, meaning when module unloading is disabled there are insufficient PLTs in the init section to load some modules, resulting in warnings: | WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc | Modules linked in: crct10dif_common | CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : module_emit_plt_entry+0x184/0x1cc | lr : module_emit_plt_entry+0x94/0x1cc | sp : ffffffc0803bba60 [...] | Call trace: | module_emit_plt_entry+0x184/0x1cc | apply_relocate_add+0x2bc/0x8e4 | load_module+0xe34/0x1bd4 | init_module_from_file+0x84/0xc0 | __arm64_sys_finit_module+0x1b8/0x27c | invoke_syscall.constprop.0+0x5c/0x104 | do_el0_svc+0x58/0x160 | el0_svc+0x38/0x110 | el0t_64_sync_handler+0xc0/0xc4 | el0t_64_sync+0x190/0x194 Instead of duplicating module_init_layout_section()s logic, expose it. Reported-by: Adam Johnston <adam.johnston@arm.com> Fixes: 055f23b74b20 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: stable@vger.kernel.org Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-27Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small driver fixes and one larger unused function set removal in the raid class (so no external impact)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: snic: Fix double free in snic_tgt_create() scsi: core: raid_class: Remove raid_component_add() scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW major version > 5 scsi: ufs: mcq: Fix the search/wrap around logic
2023-08-25Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "One clk driver fix and two clk framework fixes: - Fix an OOB access when devm_get_clk_from_child() is used and devm_clk_release() casts the void pointer to the wrong type - Move clk_rate_exclusive_{get,put}() within the correct ifdefs in clk.h so that the stubs are used when CONFIG_COMMON_CLK=n - Register the proper clk provider function depending on the value of #clock-cells in the TI keystone driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: Fix slab-out-of-bounds error in devm_clk_release() clk: Fix undefined reference to `clk_rate_exclusive_{get,put}' clk: keystone: syscon-clk: Fix audio refclk
2023-08-25Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4 issues or aren't considered suitable for a -stable backport" * tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: shmem: fix smaps BUG sleeping while atomic selftests: cachestat: catch failing fsync test on tmpfs selftests: cachestat: test for cachestat availability maple_tree: disable mas_wr_append() when other readers are possible madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check mm: multi-gen LRU: don't spin during memcg release mm: memory-failure: fix unexpected return value in soft_offline_page() radix tree: remove unused variable mm: add a call to flush_cache_vmap() in vmap_pfn() selftests/mm: FOLL_LONGTERM need to be updated to 0x100 nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers() mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast selftests: cgroup: fix test_kmem_basic less than error mm: enable page walking API to lock vmas during the walk smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd() mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
2023-08-25Merge tag 'riscv-for-linus-6.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "This is obviously not ideal, particularly for something this late in the cycle. Unfortunately we found some uABI issues in the vector support while reviewing the GDB port, which has triggered a revert -- probably a good sign we should have reviewed GDB before merging this, I guess I just dropped the ball because I was so worried about the context extension and libc suff I forgot. Hence the late revert. There's some risk here as we're still exposing the vector context for signal handlers, but changing that would have meant reverting all of the vector support. The issues we've found so far have been fixed already and they weren't absolute showstoppers, so we're essentially just playing it safe by holding ptrace support for another release (or until we get through a proper userspace code review). Summary: - The vector ucontext extension has been extended with vlenb - The vector registers ELF core dump note type has been changed to avoid aliasing with the CSR type used in embedded systems - Support for accessing vector registers via ptrace() has been reverted - Another build fix for the ISA spec changes around Zifencei/Zicsr that manifests on some systems built with binutils-2.37 and gcc-11.2" * tag 'riscv-for-linus-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix build errors using binutils2.37 toolchains RISC-V: vector: export VLENB csr in __sc_riscv_v_state RISC-V: Remove ptrace support for vectors
2023-08-25Merge tag 'drm-fixes-2023-08-25' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "A bit bigger than I'd care for, but it's mostly a single vmwgfx fix and a fix for an i915 hotplug probing. Otherwise misc i915, bridge, panfrost and dma-buf fixes. core: - add a HPD poll helper i915: - fix regression in i915 polling - fix docs build warning - fix DG2 idle power consumption bridge: - samsung-dsim: init fix panfrost: - fix speed binning issue dma-buf: - fix recursive lock in fence signal vmwgfx: - fix shader stage validation - fix NULL ptr derefs in gem put" * tag 'drm-fixes-2023-08-25' of git://anongit.freedesktop.org/drm/drm: drm/i915: Fix HPD polling, reenabling the output poll work as needed drm: Add an HPD poll helper to reschedule the poll work drm/vmwgfx: Fix possible invalid drm gem put calls drm/vmwgfx: Fix shader stage validation dma-buf/sw_sync: Avoid recursive lock during fence signal drm/i915: fix Sphinx indentation warning drm/i915/dgfx: Enable d3cold at s2idle drm/display/dp: Fix the DP DSC Receiver cap size drm/panfrost: Skip speed binning on EOPNOTSUPP drm: bridge: samsung-dsim: Fix init during host transfer
2023-08-24Merge tag 'trace-v6.5-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix ring buffer being permanently disabled due to missed record_disabled() Changing the trace cpu mask will disable the ring buffers for the CPUs no longer in the mask. But it fails to update the snapshot buffer. If a snapshot takes place, the accounting for the ring buffer being disabled is corrupted and this can lead to the ring buffer being permanently disabled. - Add test case for snapshot and cpu mask working together - Fix memleak by the function graph tracer not getting closed properly. The iterator is used to read the ring buffer. When it opens, it calls the open function of a tracer, and when it is closed, it calls the close iteration. While a trace is being read, it is still possible to change the tracer. If this happens between the function graph tracer and the wakeup tracer (which uses function graph tracing), the tracers are not closed properly during when the iterator sees the switch, and the wakeup function did not initialize its private pointer to NULL, which is used to know if the function graph tracer was the last tracer. It could be fooled in thinking it is, but then on exit it does not call the close function of the function graph tracer to clean up its data. - Fix synthetic events on big endian machines, by introducing a union that does the conversions properly. - Fix synthetic events from printing out the number of elements in the stacktrace when it shouldn't. - Fix synthetic events stacktrace to not print a bogus value at the end. - Introduce a pipe_cpumask that prevents the trace_pipe files from being opened by more than one task (file descriptor). There was a race found where if splice is called, the iter->ent could become stale and events could be missed. There's no point reading a producer/consumer file by more than one task as they will corrupt each other anyway. Add a cpumask that keeps track of the per_cpu trace_pipe files as well as the global trace_pipe file that prevents more than one open of a trace_pipe file that represents the same ring buffer. This prevents the race from happening. - Fix ftrace samples for arm64 to work with older compilers. * tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: samples: ftrace: Replace bti assembly with hint for older compiler tracing: Introduce pipe_cpumask to avoid race on trace_pipes tracing: Fix memleak due to race between current_tracer and trace tracing/synthetic: Allocate one additional element for size tracing/synthetic: Skip first entry for stack traces tracing/synthetic: Use union instead of casts selftests/ftrace: Add a basic testcase for snapshot tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
2023-08-24scsi: core: raid_class: Remove raid_component_add()Zhu Wang
The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230822015254.184270-1-wangzhu9@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Fixes: 04b5b5cb0136 ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-08-25Merge tag 'drm-intel-fixes-2023-08-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix power consumption at s2idle on DG2 (Anshuman) - Fix documentation build warning (Jani) - Fix Display HPD (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZOdPRFSJpo0ErPX/@intel.com
2023-08-25Merge tag 'drm-misc-fixes-2023-08-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A samsung-dsim initialization fix, a devfreq fix for panfrost, a DP DSC define fix, a recursive lock fix for dma-buf, a shader validation fix and a reference counting fix for vmwgfx Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/amy26vu5xbeeikswpx7nt6rddwfocdidshrtt2qovipihx5poj@y45p3dtzrloc
2023-08-24Merge tag 'net-6.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from wifi, can and netfilter. Fixes to fixes: - nf_tables: - GC transaction race with abort path - defer gc run if previous batch is still pending Previous releases - regressions: - ipv4: fix data-races around inet->inet_id - phy: fix deadlocking in phy_error() invocation - mdio: fix C45 read/write protocol - ipvlan: fix a reference count leak warning in ipvlan_ns_exit() - ice: fix NULL pointer deref during VF reset - i40e: fix potential NULL pointer dereferencing of pf->vf in i40e_sync_vsi_filters() - tg3: use slab_build_skb() when needed - mtk_eth_soc: fix NULL pointer on hw reset Previous releases - always broken: - core: validate veth and vxcan peer ifindexes - sched: fix a qdisc modification with ambiguous command request - devlink: add missing unregister linecard notification - wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning - batman: - do not get eth header before batadv_check_management_packet - fix batadv_v_ogm_aggr_send memory leak - bonding: fix macvlan over alb bond support - mlxsw: set time stamp fields also when its type is MIRROR_UTC" * tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) selftests: bonding: add macvlan over bond testing selftest: bond: add new topo bond_topo_2d1c.sh bonding: fix macvlan over alb bond support rtnetlink: Reject negative ifindexes in RTM_NEWLINK netfilter: nf_tables: defer gc run if previous batch is still pending netfilter: nf_tables: fix out of memory error handling netfilter: nf_tables: use correct lock to protect gc_list netfilter: nf_tables: GC transaction race with abort path netfilter: nf_tables: flush pending destroy work before netlink notifier netfilter: nf_tables: validate all pending tables ibmveth: Use dcbf rather than dcbfl i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters() net/sched: fix a qdisc modification with ambiguous command request igc: Fix the typo in the PTM Control macro batman-adv: Hold rtnl lock during MTU update via netlink igb: Avoid starting unnecessary workqueues can: raw: add missing refcount for memory leak fix can: isotp: fix support for transmission of SF without flow control bnx2x: new flag for track HW resource allocation sfc: allocate a big enough SKB for loopback selftest packet ...
2023-08-24Merge tag 'nf-23-08-23' of ↵Paolo Abeni
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net This PR contains nf_tables updates for your *net* tree. First patch fixes table validation, I broke this in 6.4 when tracking validation state per table, reported by Pablo, fixup from myself. Second patch makes sure objects waiting for memory release have been released, this was broken in 6.1, patch from Pablo Neira Ayuso. Patch three is a fix-for-fix from previous PR: In case a transaction gets aborted, gc sequence counter needs to be incremented so pending gc requests are invalidated, from Pablo. Same for patch 4: gc list needs to use gc list lock, not destroy lock, also from Pablo. Patch 5 fixes a UaF in a set backend, but this should only occur when failslab is enabled for GFP_KERNEL allocations, broken since feature was added in 5.6, from myself. Patch 6 fixes a double-free bug that was also added via previous PR: We must not schedule gc work if the previous batch is still queued. netfilter pull request 2023-08-23 * tag 'nf-23-08-23' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: defer gc run if previous batch is still pending netfilter: nf_tables: fix out of memory error handling netfilter: nf_tables: use correct lock to protect gc_list netfilter: nf_tables: GC transaction race with abort path netfilter: nf_tables: flush pending destroy work before netlink notifier netfilter: nf_tables: validate all pending tables ==================== Link: https://lore.kernel.org/r/20230823152711.15279-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-24bonding: fix macvlan over alb bond supportHangbin Liu
The commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") aims to enable the use of macvlans on top of rlb bond mode. However, the current rlb bond mode only handles ARP packets to update remote neighbor entries. This causes an issue when a macvlan is on top of the bond, and remote devices send packets to the macvlan using the bond's MAC address as the destination. After delivering the packets to the macvlan, the macvlan will rejects them as the MAC address is incorrect. Consequently, this commit makes macvlan over bond non-functional. To address this problem, one potential solution is to check for the presence of a macvlan port on the bond device using netif_is_macvlan_port(bond->dev) and return NULL in the rlb_arp_xmit() function. However, this approach doesn't fully resolve the situation when a VLAN exists between the bond and macvlan. So let's just do a partial revert for commit 14af9963ba1e in rlb_arp_xmit(). As the comment said, Don't modify or load balance ARPs that do not originate locally. Fixes: 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") Reported-by: susan.zheng@veritas.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816 Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>