summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2023-08-11mtd: spi-nor: avoid holes in struct spi_mem_opArnd Bergmann
[ Upstream commit 71c8f9cf2623d0db79665f876b95afcdd8214aec ] gcc gets confused when -ftrivial-auto-var-init=pattern is used on sparse bit fields such as 'struct spi_mem_op', which caused the previous false positive warning about an uninitialized variable: drivers/mtd/spi-nor/spansion.c: error: 'op' is used uninitialized [-Werror=uninitialized] In fact, the variable is fully initialized and gcc does not see it being used, so the warning is entirely bogus. The problem appears to be a misoptimization in the initialization of single bit fields when the rest of the bytes are not initialized. A previous workaround added another initialization, which ended up shutting up the warning in spansion.c, though it apparently still happens in other files as reported by Peter Foley in the gcc bugzilla. The workaround of adding a fake initialization seems particularly bad because it would set values that can never be correct but prevent the compiler from warning about actually missing initializations. Revert the broken workaround and instead pad the structure to only have bitfields that add up to full bytes, which should avoid this behavior in all drivers. I also filed a new bug against gcc with what I found, so this can hopefully be addressed in future gcc releases. At the moment, only gcc-12 and gcc-13 are affected. Cc: Peter Foley <pefoley2@pefoley.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110743 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108402 Link: https://godbolt.org/z/efMMsG1Kx Fixes: 420c4495b5e56 ("mtd: spi-nor: spansion: make sure local struct does not contain garbage") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Tudor Ambarus <tudor.ambarus@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230719190045.4007391-1-arnd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11f2fs: fix to do sanity check on direct node in truncate_dnode()Chao Yu
commit a6ec83786ab9f13f25fb18166dee908845713a95 upstream. syzbot reports below bug: BUG: KASAN: slab-use-after-free in f2fs_truncate_data_blocks_range+0x122a/0x14c0 fs/f2fs/file.c:574 Read of size 4 at addr ffff88802a25c000 by task syz-executor148/5000 CPU: 1 PID: 5000 Comm: syz-executor148 Not tainted 6.4.0-rc7-syzkaller-00041-ge660abd551f1 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351 print_report mm/kasan/report.c:462 [inline] kasan_report+0x11c/0x130 mm/kasan/report.c:572 f2fs_truncate_data_blocks_range+0x122a/0x14c0 fs/f2fs/file.c:574 truncate_dnode+0x229/0x2e0 fs/f2fs/node.c:944 f2fs_truncate_inode_blocks+0x64b/0xde0 fs/f2fs/node.c:1154 f2fs_do_truncate_blocks+0x4ac/0xf30 fs/f2fs/file.c:721 f2fs_truncate_blocks+0x7b/0x300 fs/f2fs/file.c:749 f2fs_truncate.part.0+0x4a5/0x630 fs/f2fs/file.c:799 f2fs_truncate include/linux/fs.h:825 [inline] f2fs_setattr+0x1738/0x2090 fs/f2fs/file.c:1006 notify_change+0xb2c/0x1180 fs/attr.c:483 do_truncate+0x143/0x200 fs/open.c:66 handle_truncate fs/namei.c:3295 [inline] do_open fs/namei.c:3640 [inline] path_openat+0x2083/0x2750 fs/namei.c:3791 do_filp_open+0x1ba/0x410 fs/namei.c:3818 do_sys_openat2+0x16d/0x4c0 fs/open.c:1356 do_sys_open fs/open.c:1372 [inline] __do_sys_creat fs/open.c:1448 [inline] __se_sys_creat fs/open.c:1442 [inline] __x64_sys_creat+0xcd/0x120 fs/open.c:1442 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The root cause is, inodeA references inodeB via inodeB's ino, once inodeA is truncated, it calls truncate_dnode() to truncate data blocks in inodeB's node page, it traverse mapping data from node->i.i_addr[0] to node->i.i_addr[ADDRS_PER_BLOCK() - 1], result in out-of-boundary access. This patch fixes to add sanity check on dnode page in truncate_dnode(), so that, it can help to avoid triggering such issue, and once it encounters such issue, it will record newly introduced ERROR_INVALID_NODE_REFERENCE error into superblock, later fsck can detect such issue and try repairing. Also, it removes f2fs_truncate_data_blocks() for cleanup due to the function has only one caller, and uses f2fs_truncate_data_blocks_range() instead. Reported-and-tested-by: syzbot+12cb4425b22169b52036@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000f3038a05fef867f8@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11vxlan: Fix nexthop hash sizeBenjamin Poirier
[ Upstream commit 0756384fb1bd38adb2ebcfd1307422f433a1d772 ] The nexthop code expects a 31 bit hash, such as what is returned by fib_multipath_hash() and rt6_multipath_hash(). Passing the 32 bit hash returned by skb_get_hash() can lead to problems related to the fact that 'int hash' is a negative number when the MSB is set. In the case of hash threshold nexthop groups, nexthop_select_path_hthr() will disproportionately select the first nexthop group entry. In the case of resilient nexthop groups, nexthop_select_path_res() may do an out of bounds access in nh_buckets[], for example: hash = -912054133 num_nh_buckets = 2 bucket_index = 65535 which leads to the following panic: BUG: unable to handle page fault for address: ffffc900025910c8 PGD 100000067 P4D 100000067 PUD 10026b067 PMD 0 Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI CPU: 4 PID: 856 Comm: kworker/4:3 Not tainted 6.5.0-rc2+ #34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:nexthop_select_path+0x197/0xbf0 Code: c1 e4 05 be 08 00 00 00 4c 8b 35 a4 14 7e 01 4e 8d 6c 25 00 4a 8d 7c 25 08 48 01 dd e8 c2 25 15 ff 49 8d 7d 08 e8 39 13 15 ff <4d> 89 75 08 48 89 ef e8 7d 12 15 ff 48 8b 5d 00 e8 14 55 2f 00 85 RSP: 0018:ffff88810c36f260 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000002000c0 RCX: ffffffffaf02dd77 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffffc900025910c8 RBP: ffffc900025910c0 R08: 0000000000000001 R09: fffff520004b2219 R10: ffffc900025910cf R11: 31392d2068736168 R12: 00000000002000c0 R13: ffffc900025910c0 R14: 00000000fffef608 R15: ffff88811840e900 FS: 0000000000000000(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc900025910c8 CR3: 0000000129d00000 CR4: 0000000000750ee0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x23/0x70 ? page_fault_oops+0x1ee/0x5c0 ? __pfx_is_prefetch.constprop.0+0x10/0x10 ? __pfx_page_fault_oops+0x10/0x10 ? search_bpf_extables+0xfe/0x1c0 ? fixup_exception+0x3b/0x470 ? exc_page_fault+0xf6/0x110 ? asm_exc_page_fault+0x26/0x30 ? nexthop_select_path+0x197/0xbf0 ? nexthop_select_path+0x197/0xbf0 ? lock_is_held_type+0xe7/0x140 vxlan_xmit+0x5b2/0x2340 ? __lock_acquire+0x92b/0x3370 ? __pfx_vxlan_xmit+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_register_lock_class+0x10/0x10 ? skb_network_protocol+0xce/0x2d0 ? dev_hard_start_xmit+0xca/0x350 ? __pfx_vxlan_xmit+0x10/0x10 dev_hard_start_xmit+0xca/0x350 __dev_queue_xmit+0x513/0x1e20 ? __pfx___dev_queue_xmit+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? mark_held_locks+0x44/0x90 ? skb_push+0x4c/0x80 ? eth_header+0x81/0xe0 ? __pfx_eth_header+0x10/0x10 ? neigh_resolve_output+0x215/0x310 ? ip6_finish_output2+0x2ba/0xc90 ip6_finish_output2+0x2ba/0xc90 ? lock_release+0x236/0x3e0 ? ip6_mtu+0xbb/0x240 ? __pfx_ip6_finish_output2+0x10/0x10 ? find_held_lock+0x83/0xa0 ? lock_is_held_type+0xe7/0x140 ip6_finish_output+0x1ee/0x780 ip6_output+0x138/0x460 ? __pfx_ip6_output+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_ip6_finish_output+0x10/0x10 NF_HOOK.constprop.0+0xc0/0x420 ? __pfx_NF_HOOK.constprop.0+0x10/0x10 ? ndisc_send_skb+0x2c0/0x960 ? __pfx_lock_release+0x10/0x10 ? __local_bh_enable_ip+0x93/0x110 ? lock_is_held_type+0xe7/0x140 ndisc_send_skb+0x4be/0x960 ? __pfx_ndisc_send_skb+0x10/0x10 ? mark_held_locks+0x65/0x90 ? find_held_lock+0x83/0xa0 ndisc_send_ns+0xb0/0x110 ? __pfx_ndisc_send_ns+0x10/0x10 addrconf_dad_work+0x631/0x8e0 ? lock_acquire+0x180/0x3f0 ? __pfx_addrconf_dad_work+0x10/0x10 ? mark_held_locks+0x24/0x90 process_one_work+0x582/0x9c0 ? __pfx_process_one_work+0x10/0x10 ? __pfx_do_raw_spin_lock+0x10/0x10 ? mark_held_locks+0x24/0x90 worker_thread+0x93/0x630 ? __kthread_parkme+0xdc/0x100 ? __pfx_worker_thread+0x10/0x10 kthread+0x1a5/0x1e0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 RIP: 0000:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Modules linked in: CR2: ffffc900025910c8 ---[ end trace 0000000000000000 ]--- RIP: 0010:nexthop_select_path+0x197/0xbf0 Code: c1 e4 05 be 08 00 00 00 4c 8b 35 a4 14 7e 01 4e 8d 6c 25 00 4a 8d 7c 25 08 48 01 dd e8 c2 25 15 ff 49 8d 7d 08 e8 39 13 15 ff <4d> 89 75 08 48 89 ef e8 7d 12 15 ff 48 8b 5d 00 e8 14 55 2f 00 85 RSP: 0018:ffff88810c36f260 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000002000c0 RCX: ffffffffaf02dd77 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffffc900025910c8 RBP: ffffc900025910c0 R08: 0000000000000001 R09: fffff520004b2219 R10: ffffc900025910cf R11: 31392d2068736168 R12: 00000000002000c0 R13: ffffc900025910c0 R14: 00000000fffef608 R15: ffff88811840e900 FS: 0000000000000000(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000129d00000 CR4: 0000000000750ee0 PKRU: 55555554 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x2ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fix this problem by ensuring the MSB of hash is 0 using a right shift - the same approach used in fib_multipath_hash() and rt6_multipath_hash(). Fixes: 1274e1cc4226 ("vxlan: ecmp support for mac fdb entries") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11net: annotate data-races around sk->sk_markEric Dumazet
[ Upstream commit 3c5b4d69c358a9275a8de98f87caf6eda644b086 ] sk->sk_mark is often read while another thread could change the value. Fixes: 4a19ec5800fc ("[NET]: Introducing socket mark socket option.") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11net: gro: fix misuse of CB in udp socket lookupRichard Gobert
[ Upstream commit 7938cd15436873f649f31cb867bac2d88ca564d0 ] This patch fixes a misuse of IP{6}CB(skb) in GRO, while calling to `udp6_lib_lookup2` when handling udp tunnels. `udp6_lib_lookup2` fetch the device from CB. The fix changes it to fetch the device from `skb->dev`. l3mdev case requires special attention since it has a master and a slave device. Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket") Reported-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11net: move gso declarations and functions to their own filesEric Dumazet
[ Upstream commit d457a0e329b0bfd3a1450e0b1a18cd2b47a25a08 ] Move declarations into include/net/gso.h and code into net/core/gso.c Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stanislav Fomichev <sdf@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230608191738.3947077-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 7938cd154368 ("net: gro: fix misuse of CB in udp socket lookup") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11word-at-a-time: use the same return type for has_zero regardless of endiannessndesaulniers@google.com
[ Upstream commit 79e8328e5acbe691bbde029a52c89d70dcbc22f3 ] Compiling big-endian targets with Clang produces the diagnostic: fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical] } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants))); ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || fs/namei.c:2173:13: note: cast one or both operands to int to silence this warning It appears that when has_zero was introduced, two definitions were produced with different signatures (in particular different return types). Looking at the usage in hash_name() in fs/namei.c, I suspect that has_zero() is meant to be invoked twice per while loop iteration; using logical-or would not update `bdata` when `a` did not have zeros. So I think it's preferred to always return an unsigned long rather than a bool than update the while loop in hash_name() to use a logical-or rather than bitwise-or. [ Also changed powerpc version to do the same - Linus ] Link: https://github.com/ClangBuiltLinux/linux/issues/1832 Link: https://lore.kernel.org/lkml/20230801-bitwise-v1-1-799bec468dc4@google.com/ Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic") Debugged-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-08x86/srso: Add a Speculative RAS Overflow mitigationBorislav Petkov (AMD)
Upstream commit: fb3bd914b3ec28f5fb697ac55c4846ac2d542855 Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors. The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence. To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias() which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to Retbleed one: srso_untrain_ret() and srso_safe_ret(). Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08init: Remove check_bugs() leftoversThomas Gleixner
commit 61235b24b9cb37c13fcad5b9596d59a1afdcec30 upstream Everything is converted over to arch_cpu_finalize_init(). Remove the check_bugs() leftovers including the empty stubs in asm-generic, alpha, parisc, powerpc and xtensa. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20230613224545.553215951@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08init: Provide arch_cpu_finalize_init()Thomas Gleixner
commit 7725acaa4f0c04fbefb0e0d342635b967bb7d414 upstream check_bugs() has become a dumping ground for all sorts of activities to finalize the CPU initialization before running the rest of the init code. Most are empty, a few do actual bug checks, some do alternative patching and some cobble a CPU advertisement string together.... Aside of that the current implementation requires duplicated function declaration and mostly empty header files for them. Provide a new function arch_cpu_finalize_init(). Provide a generic declaration if CONFIG_ARCH_HAS_CPU_FINALIZE_INIT is selected and a stub inline otherwise. This requires a temporary #ifdef in start_kernel() which will be removed along with check_bugs() once the architectures are converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224544.957805717@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03dma-buf: keep the signaling time of merged fences v3Christian König
commit f781f661e8c99b0cb34129f2e374234d61864e77 upstream. Some Android CTS is testing if the signaling time keeps consistent during merges. v2: use the current time if the fence is still in the signaling path and the timestamp not yet available. v3: improve comment, fix one more case to use the correct timestamp Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christian.koenig@amd.com Cc: Jindong Yue <jindong.yue@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03mm: fix memory ordering for mm_lock_seq and vm_lock_seqJann Horn
commit b1f02b95758d05b799731d939e76a0bd6da312db upstream. mm->mm_lock_seq effectively functions as a read/write lock; therefore it must be used with acquire/release semantics. A specific example is the interaction between userfaultfd_register() and lock_vma_under_rcu(). userfaultfd_register() does the following from the point where it changes a VMA's flags to the point where concurrent readers are permitted again (in a simple scenario where only a single private VMA is accessed and no merging/splitting is involved): userfaultfd_register userfaultfd_set_vm_flags vm_flags_reset vma_start_write down_write(&vma->vm_lock->lock) vma->vm_lock_seq = mm_lock_seq [marks VMA as busy] up_write(&vma->vm_lock->lock) vm_flags_init [sets VM_UFFD_* in __vm_flags] vma->vm_userfaultfd_ctx.ctx = ctx mmap_write_unlock vma_end_write_all WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1) [unlocks VMA] There are no memory barriers in between the __vm_flags update and the mm->mm_lock_seq update that unlocks the VMA, so the unlock can be reordered to above the `vm_flags_init()` call, which means from the perspective of a concurrent reader, a VMA can be marked as a userfaultfd VMA while it is not VMA-locked. That's bad, we definitely need a store-release for the unlock operation. The non-atomic write to vma->vm_lock_seq in vma_start_write() is mostly fine because all accesses to vma->vm_lock_seq that matter are always protected by the VMA lock. There is a racy read in vma_start_read() though that can tolerate false-positives, so we should be using WRITE_ONCE() to keep things tidy and data-race-free (including for KCSAN). On the other side, lock_vma_under_rcu() works as follows in the relevant region for locking and userfaultfd check: lock_vma_under_rcu vma_start_read vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [early bailout] down_read_trylock(&vma->vm_lock->lock) vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [main check] userfaultfd_armed checks vma->vm_flags & __VM_UFFD_FLAGS Here, the interesting aspect is how far down the mm->mm_lock_seq read can be reordered - if this read is reordered down below the vma->vm_flags access, this could cause lock_vma_under_rcu() to partly operate on information that was read while the VMA was supposed to be locked. To prevent this kind of downwards bleeding of the mm->mm_lock_seq read, we need to read it with a load-acquire. Some of the comment wording is based on suggestions by Suren. BACKPORT WARNING: One of the functions changed by this patch (which I've written against Linus' tree) is vma_try_start_write(), but this function no longer exists in mm/mm-everything. I don't know whether the merged version of this patch will be ordered before or after the patch that removes vma_try_start_write(). If you're backporting this patch to a tree with vma_try_start_write(), make sure this patch changes that function. Link: https://lkml.kernel.org/r/20230721225107.942336-1-jannh@google.com Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03block: Fix a source code comment in include/uapi/linux/blkzoned.hBart Van Assche
[ Upstream commit e0933b526fbfd937c4a8f4e35fcdd49f0e22d411 ] Fix the symbolic names for zone conditions in the blkzoned.h header file. Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <dlemoal@kernel.org> Fixes: 6a0cb1bc106f ("block: Implement support for zoned block devices") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20230706201422.3987341-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03tcp: Reduce chance of collisions in inet6_hashfn().Stewart Smith
[ Upstream commit d11b0df7ddf1831f3e170972f43186dad520bfcc ] For both IPv4 and IPv6 incoming TCP connections are tracked in a hash table with a hash over the source & destination addresses and ports. However, the IPv6 hash is insufficient and can lead to a high rate of collisions. The IPv6 hash used an XOR to fit everything into the 96 bits for the fast jenkins hash, meaning it is possible for an external entity to ensure the hash collides, thus falling back to a linear search in the bucket, which is slow. We take the approach of hash the full length of IPv6 address in __ipv6_addr_jhash() so that all users can benefit from a more secure version. While this may look like it adds overhead, the reality of modern CPUs means that this is unmeasurable in real world scenarios. In simulating with llvm-mca, the increase in cycles for the hashing code was ~16 cycles on Skylake (from a base of ~155), and an extra ~9 on Nehalem (base of ~173). In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash") netfilter switched from a jenkins hash to a siphash, but even the faster hsiphash is a more significant overhead (~20-30%) in some preliminary testing. So, in this patch, we keep to the more conservative approach to ensure we don't add much overhead per SYN. In testing, this results in a consistently even spread across the connection buckets. In both testing and real-world scenarios, we have not found any measurable performance impact. Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp") Signed-off-by: Stewart Smith <trawets@amazon.com> Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230721222410.17914-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03vxlan: calculate correct header length for GPEJiri Benc
[ Upstream commit 94d166c5318c6edd1e079df8552233443e909c33 ] VXLAN-GPE does not add an extra inner Ethernet header. Take that into account when calculating header length. This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is cached. In the collect_md mode (which is the only mode that VXLAN-GPE supports), there's no magic auto-setting of the tunnel interface MTU. It can't be, since the destination and thus the underlying interface may be different for each packet. So, the administrator is responsible for setting the correct tunnel interface MTU. Apparently, the administrators are capable enough to calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36). They set the tunnel interface MTU to 1464. If you run a TCP stream over such interface, it's then segmented according to the MTU 1464, i.e. producing 1514 bytes frames. Which is okay, this still fits the lower MTU. However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as the header size and thus incorrectly calculates the frame size to be 1528. This leads to ICMP too big message being generated (locally), PMTU of 1450 to be cached and the TCP stream to be resegmented. The fix is to use the correct actual header size, especially for skb_tunnel_check_pmtu calculation. Fixes: e1e5314de08ba ("vxlan: implement GPE") Signed-off-by: Jiri Benc <jbenc@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27tcp: annotate data-races around fastopenq.max_qlenEric Dumazet
[ Upstream commit 70f360dd7042cb843635ece9d28335a4addff9eb ] This field can be read locklessly. Fixes: 1536e2857bd3 ("tcp: Add a TCP_FASTOPEN socket option to get a max backlog on its listner") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-12-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27tcp: annotate data-races around tp->notsent_lowatEric Dumazet
[ Upstream commit 1aeb87bc1440c5447a7fa2d6e3c2cca52cbd206b ] tp->notsent_lowat can be read locklessly from do_tcp_getsockopt() and tcp_poll(). Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-10-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27tcp: annotate data-races around tp->keepalive_probesEric Dumazet
[ Upstream commit 6e5e1de616bf5f3df1769abc9292191dfad9110a ] do_tcp_getsockopt() reads tp->keepalive_probes while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27tcp: annotate data-races around tp->keepalive_intvlEric Dumazet
[ Upstream commit 5ecf9d4f52ff2f1d4d44c9b68bc75688e82f13b4 ] do_tcp_getsockopt() reads tp->keepalive_intvl while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27tcp: annotate data-races around tp->keepalive_timeEric Dumazet
[ Upstream commit 4164245c76ff906c9086758e1c3f87082a7f5ef5 ] do_tcp_getsockopt() reads tp->keepalive_time while another cpu might change its value. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27Bluetooth: use RCU for hci_conn_params and iterate safely in hci_syncPauli Virtanen
[ Upstream commit 195ef75e19287b4bc413da3e3e3722b030ac881e ] hci_update_accept_list_sync iterates over hdev->pend_le_conns and hdev->pend_le_reports, and waits for controller events in the loop body, without holding hdev lock. Meanwhile, these lists and the items may be modified e.g. by le_scan_cleanup. This can invalidate the list cursor or any other item in the list, resulting to invalid behavior (eg use-after-free). Use RCU for the hci_conn_params action lists. Since the loop bodies in hci_sync block and we cannot use RCU or hdev->lock for the whole loop, copy list items first and then iterate on the copy. Only the flags field is written from elsewhere, so READ_ONCE/WRITE_ONCE should guarantee we read valid values. Free params everywhere with hci_conn_params_free so the cleanup is guaranteed to be done properly. This fixes the following, which can be triggered e.g. by BlueZ new mgmt-tester case "Add + Remove Device Nowait - Success", or by changing hci_le_set_cig_params to always return false, and running iso-tester: ================================================================== BUG: KASAN: slab-use-after-free in hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) Read of size 8 at addr ffff888001265018 by task kworker/u3:0/32 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 Workqueue: hci0 hci_cmd_sync_work Call Trace: <TASK> dump_stack_lvl (./arch/x86/include/asm/irqflags.h:134 lib/dump_stack.c:107) print_report (mm/kasan/report.c:320 mm/kasan/report.c:430) ? __virt_addr_valid (./include/linux/mmzone.h:1915 ./include/linux/mmzone.h:2011 arch/x86/mm/physaddr.c:65) ? hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) kasan_report (mm/kasan/report.c:538) ? hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841) ? __pfx_hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2780) ? mutex_lock (kernel/locking/mutex.c:282) ? __pfx_mutex_lock (kernel/locking/mutex.c:282) ? __pfx_mutex_unlock (kernel/locking/mutex.c:538) ? __pfx_update_passive_scan_sync (net/bluetooth/hci_sync.c:2861) hci_cmd_sync_work (net/bluetooth/hci_sync.c:306) process_one_work (./arch/x86/include/asm/preempt.h:27 kernel/workqueue.c:2399) worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2538) ? __pfx_worker_thread (kernel/workqueue.c:2480) kthread (kernel/kthread.c:376) ? __pfx_kthread (kernel/kthread.c:331) ret_from_fork (arch/x86/entry/entry_64.S:314) </TASK> Allocated by task 31: kasan_save_stack (mm/kasan/common.c:46) kasan_set_track (mm/kasan/common.c:52) __kasan_kmalloc (mm/kasan/common.c:374 mm/kasan/common.c:383) hci_conn_params_add (./include/linux/slab.h:580 ./include/linux/slab.h:720 net/bluetooth/hci_core.c:2277) hci_connect_le_scan (net/bluetooth/hci_conn.c:1419 net/bluetooth/hci_conn.c:1589) hci_connect_cis (net/bluetooth/hci_conn.c:2266) iso_connect_cis (net/bluetooth/iso.c:390) iso_sock_connect (net/bluetooth/iso.c:899) __sys_connect (net/socket.c:2003 net/socket.c:2020) __x64_sys_connect (net/socket.c:2027) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) Freed by task 15: kasan_save_stack (mm/kasan/common.c:46) kasan_set_track (mm/kasan/common.c:52) kasan_save_free_info (mm/kasan/generic.c:523) __kasan_slab_free (mm/kasan/common.c:238 mm/kasan/common.c:200 mm/kasan/common.c:244) __kmem_cache_free (mm/slub.c:1807 mm/slub.c:3787 mm/slub.c:3800) hci_conn_params_del (net/bluetooth/hci_core.c:2323) le_scan_cleanup (net/bluetooth/hci_conn.c:202) process_one_work (./arch/x86/include/asm/preempt.h:27 kernel/workqueue.c:2399) worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2538) kthread (kernel/kthread.c:376) ret_from_fork (arch/x86/entry/entry_64.S:314) ================================================================== Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECVAntoine Tenart
[ Upstream commit c0a8966e2bc7d31f77a7246947ebc09c1ff06066 ] When using IPv4/TCP, skb->hash comes from sk->sk_txhash except in TIME_WAIT and SYN_RECV where it's not set in the reply skb from ip_send_unicast_reply. Those packets will have a mismatched hash with others from the same flow as their hashes will be 0. IPv6 does not have the same issue as the hash is set from the socket txhash in those cases. This commits sets the hash in the reply skb from ip_send_unicast_reply, which makes the IPv4 code behaving like IPv6. Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Stable-dep-of: 5e5265522a9a ("tcp: annotate data-races around tcp_rsk(req)->txhash") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27sched/psi: use kernfs polling functions for PSI trigger pollingSuren Baghdasaryan
[ Upstream commit aff037078ecaecf34a7c2afab1341815f90fba5e ] Destroying psi trigger in cgroup_file_release causes UAF issues when a cgroup is removed from under a polling process. This is happening because cgroup removal causes a call to cgroup_file_release while the actual file is still alive. Destroying the trigger at this point would also destroy its waitqueue head and if there is still a polling process on that file accessing the waitqueue, it will step on the freed pointer: do_select vfs_poll do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy wake_up_pollfree(&t->event_wait) // vfs_poll is unblocked synchronize_rcu kfree(t) poll_freewait -> UAF access to the trigger's waitqueue head Patch [1] fixed this issue for epoll() case using wake_up_pollfree(), however the same issue exists for synchronous poll() case. The root cause of this issue is that the lifecycles of the psi trigger's waitqueue and of the file associated with the trigger are different. Fix this by using kernfs_generic_poll function when polling on cgroup-specific psi triggers. It internally uses kernfs_open_node->poll waitqueue head with its lifecycle tied to the file's lifecycle. This also renders the fix in [1] obsolete, so revert it. [1] commit c2dbe32d5db5 ("sched/psi: Fix use-after-free in ep_remove_wait_queue()") Fixes: 0e94682b73bf ("psi: introduce psi monitor") Closes: https://lore.kernel.org/all/20230613062306.101831-1-lujialin4@huawei.com/ Reported-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230630005612.1014540-1-surenb@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27posix-timers: Ensure timer ID search-loop limit is validThomas Gleixner
[ Upstream commit 8ce8849dd1e78dadcee0ec9acbd259d239b7069f ] posix_timer_add() tries to allocate a posix timer ID by starting from the cached ID which was stored by the last successful allocation. This is done in a loop searching the ID space for a free slot one by one. The loop has to terminate when the search wrapped around to the starting point. But that's racy vs. establishing the starting point. That is read out lockless, which leads to the following problem: CPU0 CPU1 posix_timer_add() start = sig->posix_timer_id; lock(hash_lock); ... posix_timer_add() if (++sig->posix_timer_id < 0) start = sig->posix_timer_id; sig->posix_timer_id = 0; So CPU1 can observe a negative start value, i.e. -1, and the loop break never happens because the condition can never be true: if (sig->posix_timer_id == start) break; While this is unlikely to ever turn into an endless loop as the ID space is huge (INT_MAX), the racy read of the start value caught the attention of KCSAN and Dmitry unearthed that incorrectness. Rewrite it so that all id operations are under the hash lock. Reported-by: syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/87bkhzdn6g.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemptionMarc Zyngier
commit b321c31c9b7b309dcde5e8854b741c8e6a9a05f0 upstream. Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when running a preemptible kernel, as it is possible that a vCPU is blocked without requesting a doorbell interrupt. The issue is that any preemption that occurs between vgic_v4_put() and schedule() on the block path will mark the vPE as nonresident and *not* request a doorbell irq. This occurs because when the vcpu thread is resumed on its way to block, vcpu_load() will make the vPE resident again. Once the vcpu actually blocks, we don't request a doorbell anymore, and the vcpu won't be woken up on interrupt delivery. Fix it by tracking that we're entering WFI, and key the doorbell request on that flag. This allows us not to make the vPE resident when going through a preempt/schedule cycle, meaning we don't lose any state. Cc: stable@vger.kernel.org Fixes: 8e01d9a396e6 ("KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put") Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Suggested-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Xiang Chen <chenxiang66@hisilicon.com> Co-developed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20230713070657.3873244-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-27fuse: add feature flag for expire-onlyMiklos Szeredi
commit 5cadfbd5a11e5495cac217534c5f788168b1afd7 upstream. Add an init flag idicating whether the FUSE_EXPIRE_ONLY flag of FUSE_NOTIFY_INVAL_ENTRY is effective. This is needed for backports of this feature, otherwise the server could just check the protocol version. Fixes: 4f8d37020e1f ("fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY") Cc: <stable@vger.kernel.org> # v6.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23fprobe: Ensure running fprobe_exit_handler() finished before calling ↵Masami Hiramatsu (Google)
rethook_free() commit 195b9cb5b288fec1c871ef89f78cc9a7461aad3a upstream. Ensure running fprobe_exit_handler() has finished before calling rethook_free() in the unregister_fprobe() so that caller can free the fprobe right after unregister_fprobe(). unregister_fprobe() ensured that all running fprobe_entry/exit_handler() have finished by calling unregister_ftrace_function() which synchronizes RCU. But commit 5f81018753df ("fprobe: Release rethook after the ftrace_ops is unregistered") changed to call rethook_free() after unregister_ftrace_function(). So call rethook_stop() to make rethook disabled before unregister_ftrace_function() and ensure it again. Here is the possible code flow that can call the exit handler after unregister_fprobe(). ------ CPU1 CPU2 call unregister_fprobe(fp) ... __fprobe_handler() rethook_hook() on probed function unregister_ftrace_function() return from probed function rethook hooks find rh->handler == fprobe_exit_handler call fprobe_exit_handler() rethook_free(): set rh->handler = NULL; return from unreigster_fprobe; call fp->exit_handler() <- (*) ------ (*) At this point, the exit handler is called after returning from unregister_fprobe(). This fixes it as following; ------ CPU1 CPU2 call unregister_fprobe() ... rethook_stop(): set rh->handler = NULL; __fprobe_handler() rethook_hook() on probed function unregister_ftrace_function() return from probed function rethook hooks find rh->handler == NULL return from rethook rethook_free() return from unreigster_fprobe; ------ Link: https://lore.kernel.org/all/168873859949.156157.13039240432299335849.stgit@devnote2/ Fixes: 5f81018753df ("fprobe: Release rethook after the ftrace_ops is unregistered") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23Revert "8250: add support for ASIX devices with a FIFO bug"Jiaqing Zhao
commit a82d62f708545d22859584e0e0620da8e3759bbc upstream. This reverts commit eb26dfe8aa7eeb5a5aa0b7574550125f8aa4c3b3. Commit eb26dfe8aa7e ("8250: add support for ASIX devices with a FIFO bug") merged on Jul 13, 2012 adds a quirk for PCI_VENDOR_ID_ASIX (0x9710). But that ID is the same as PCI_VENDOR_ID_NETMOS defined in 1f8b061050c7 ("[PATCH] Netmos parallel/serial/combo support") merged on Mar 28, 2005. In pci_serial_quirks array, the NetMos entry always takes precedence over the ASIX entry even since it was initially merged, code in that commit is always unreachable. In my tests, adding the FIFO workaround to pci_netmos_init() makes no difference, and the vendor driver also does not have such workaround. Given that the code was never used for over a decade, it's safe to revert it. Also, the real PCI_VENDOR_ID_ASIX should be 0x125b, which is used on their newer AX99100 PCIe serial controllers released on 2016. The FIFO workaround should not be intended for these newer controllers, and it was never implemented in vendor driver. Fixes: eb26dfe8aa7e ("8250: add support for ASIX devices with a FIFO bug") Cc: stable <stable@kernel.org> Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230619155743.827859-1-jiaqing.zhao@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23drm/dp_mst: Clear MSG_RDY flag before sending new messageWayne Lin
commit 72f1de49ffb90b29748284f27f1d6b829ab1de95 upstream. [Why] The sequence for collecting down_reply from source perspective should be: Request_n->repeat (get partial reply of Request_n->clear message ready flag to ack DPRX that the message is received) till all partial replies for Request_n are received->new Request_n+1. Now there is chance that drm_dp_mst_hpd_irq() will fire new down request in the tx queue when the down reply is incomplete. Source is restricted to generate interveleaved message transactions so we should avoid it. Also, while assembling partial reply packets, reading out DPCD DOWN_REP Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be wrapped up as a complete operation for reading out a reply packet. Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might be risky. e.g. If the reply of the new request has overwritten the DPRX DOWN_REP Sideband MSG buffer before source writing one to clear DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply for the new request. Should handle the up request in the same way. [How] Separete drm_dp_mst_hpd_irq() into 2 steps. After acking the MST IRQ event, driver calls drm_dp_mst_hpd_irq_send_new_request() and might trigger drm_dp_mst_kick_tx() only when there is no on going message transaction. Changes since v1: * Reworked on review comments received -> Adjust the fix to let driver explicitly kick off new down request when mst irq event is handled and acked -> Adjust the commit message Changes since v2: * Adjust the commit message * Adjust the naming of the divided 2 functions and add a new input parameter "ack". * Adjust code flow as per review comments. Changes since v3: * Update the function description of drm_dp_mst_hpd_irq_handle_event Changes since v4: * Change ack of drm_dp_mst_hpd_irq_handle_event() to be an array align the size of esi[] Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23kasan: use internal prototypes matching gcc-13 builtinsArnd Bergmann
commit bb6e04a173f06e51819a4bb512e127dfbc50dcfa upstream. gcc-13 warns about function definitions for builtin interfaces that have a different prototype, e.g.: In file included from kasan_test.c:31: kasan.h:574:6: error: conflicting types for built-in function '__asan_register_globals'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch] 574 | void __asan_register_globals(struct kasan_global *globals, size_t size); kasan.h:577:6: error: conflicting types for built-in function '__asan_alloca_poison'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch] 577 | void __asan_alloca_poison(unsigned long addr, size_t size); kasan.h:580:6: error: conflicting types for built-in function '__asan_load1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch] 580 | void __asan_load1(unsigned long addr); kasan.h:581:6: error: conflicting types for built-in function '__asan_store1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch] 581 | void __asan_store1(unsigned long addr); kasan.h:643:6: error: conflicting types for built-in function '__hwasan_tag_memory'; expected 'void(void *, unsigned char, long int)' [-Werror=builtin-declaration-mismatch] 643 | void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size); The two problems are: - Addresses are passes as 'unsigned long' in the kernel, but gcc-13 expects a 'void *'. - sizes meant to use a signed ssize_t rather than size_t. Change all the prototypes to match these. Using 'void *' consistently for addresses gets rid of a couple of type casts, so push that down to the leaf functions where possible. This now passes all randconfig builds on arm, arm64 and x86, but I have not tested it on the other architectures that support kasan, since they tend to fail randconfig builds in other ways. This might fail if any of the 32-bit architectures expect a 'long' instead of 'int' for the size argument. The __asan_allocas_unpoison() function prototype is somewhat weird, since it uses a pointer for 'stack_top' and an size_t for 'stack_bottom'. This looks like it is meant to be 'addr' and 'size' like the others, but the implementation clearly treats them as 'top' and 'bottom'. Link: https://lkml.kernel.org/r/20230509145735.9263-2-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23net/sched: make psched_mtu() RTNL-less safePedro Tammela
[ Upstream commit 150e33e62c1fa4af5aaab02776b6c3812711d478 ] Eric Dumazet says[1]: ------- Speaking of psched_mtu(), I see that net/sched/sch_pie.c is using it without holding RTNL, so dev->mtu can be changed underneath. KCSAN could issue a warning. ------- Annotate dev->mtu with READ_ONCE() so KCSAN don't issue a warning. [1] https://lore.kernel.org/all/CANn89iJoJO5VtaJ-2=_d2aOQhb0Xw8iBT_Cxqp2HyuS-zj6azw@mail.gmail.com/ v1 -> v2: Fix commit message Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230711021634.561598-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23nvme: fix the NVME_ID_NS_NVM_STS_MASK definitionAnkit Kumar
[ Upstream commit b938e6603660652dc3db66d3c915fbfed3bce21d ] As per NVMe command set specification 1.0c Storage tag size is 7 bits. Fixes: 4020aad85c67 ("nvme: add support for enhanced metadata") Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enumVladimir Oltean
[ Upstream commit 2d800bc500fb3fb07a0fb42e2d0a1356fb9e1e8f ] Inspired from struct flow_cls_offload :: cmd, in order for taprio to be able to report statistics (which is future work), it seems that we need to drill one step further with the ndo_setup_tc(TC_SETUP_QDISC_TAPRIO) multiplexing, and pass the command as part of the common portion of the muxed structure. Since we already have an "enable" variable in tc_taprio_qopt_offload, refactor all drivers to check for "cmd" instead of "enable", and reject every other command except "replace" and "destroy" - to be future proof. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> # for lan966x Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 8046063df887 ("igc: Rename qbv_enable to taprio_offload_enable") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23s390/ism: Fix and simplify add()/remove() callback handlingNiklas Schnelle
[ Upstream commit 76631ffa2fd2d45bae5ad717eef716b94144e0e7 ] Previously the clients_lock was protecting the clients array against concurrent addition/removal of clients but was also accessed from IRQ context. This meant that it had to be a spinlock and that the add() and remove() callbacks in which clients need to do allocation and take mutexes can't be called under the clients_lock. To work around this these callbacks were moved to workqueues. This not only introduced significant complexity but is also subtly broken in at least one way. In ism_dev_init() and ism_dev_exit() clients[i]->tgt_ism is used to communicate the added/removed ISM device to the work function. While write access to client[i]->tgt_ism is protected by the clients_lock and the code waits that there is no pending add/remove work before and after setting clients[i]->tgt_ism this is not enough. The problem is that the wait happens based on per ISM device counters. Thus a concurrent ism_dev_init()/ism_dev_exit() for a different ISM device may overwrite a clients[i]->tgt_ism between unlocking the clients_lock and the subsequent wait for the work to finnish. Thankfully with the clients_lock no longer held in IRQ context it can be turned into a mutex which can be held during the calls to add()/remove() completely removing the need for the workqueues and the associated broken housekeeping including the per ISM device counters and the clients[i]->tgt_ism. Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23s390/ism: Fix locking for forwarding of IRQs and events to clientsNiklas Schnelle
[ Upstream commit 6b5c13b591d753c6022fbd12f8c0c0a9a07fc065 ] The clients array references all registered clients and is protected by the clients_lock. Besides its use as general list of clients the clients array is accessed in ism_handle_irq() to forward ISM device events to clients. While the clients_lock is taken in the IRQ handler when calling handle_event() it is however incorrectly not held during the client->handle_irq() call and for the preceding clients[] access leaving it unprotected against concurrent client (un-)registration. Furthermore the accesses to ism->sba_client_arr[] in ism_register_dmb() and ism_unregister_dmb() are not protected by any lock. This is especially problematic as the client ID from the ism->sba_client_arr[] is not checked against NO_CLIENT and neither is the client pointer checked. Instead of expanding the use of the clients_lock further add a separate array in struct ism_dev which references clients subscribed to the device's events and IRQs. This array is protected by ism->lock which is already taken in ism_handle_irq() and can be taken outside the IRQ handler when adding/removing subscribers or the accessing ism->sba_client_arr[]. This also means that the clients_lock is no longer taken in IRQ context. Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23net: dsa: felix: make vsc9959_tas_guard_bands_update() visible to ocelot->opsVladimir Oltean
[ Upstream commit c60819149b637d0f9f7f66e110d2a0d90a3993ea ] In a future change we will need to make ocelot_port_update_active_preemptible_tcs() call vsc9959_tas_guard_bands_update(), but that is currently not possible, since the ocelot switch lib does not have access to functions private to the DSA wrapper. Move the pointer to vsc9959_tas_guard_bands_update() from felix->info (which is private to the DSA driver) to ocelot->ops (which is also visible to the ocelot switch lib). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Message-ID: <20230705104422.49025-3-vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: c6efb4ae387c ("net: mscc: ocelot: fix oversize frame dropping for preemptible TCs") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23blk-crypto: use dynamic lock class for blk_crypto_profile::lockEric Biggers
[ Upstream commit 2fb48d88e77f29bf9d278f25bcfe82cf59a0e09b ] When a device-mapper device is passing through the inline encryption support of an underlying device, calls to blk_crypto_evict_key() take the blk_crypto_profile::lock of the device-mapper device, then take the blk_crypto_profile::lock of the underlying device (nested). This isn't a real deadlock, but it causes a lockdep report because there is only one lock class for all instances of this lock. Lockdep subclasses don't really work here because the hierarchy of block devices is dynamic and could have more than 2 levels. Instead, register a dynamic lock class for each blk_crypto_profile, and associate that with the lock. This avoids false-positive lockdep reports like the following: ============================================ WARNING: possible recursive locking detected 6.4.0-rc5 #2 Not tainted -------------------------------------------- fscryptctl/1421 is trying to acquire lock: ffffff80829ca418 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0x44/0x1c0 but task is already holding lock: ffffff8086b68ca8 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0xc8/0x1c0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&profile->lock); lock(&profile->lock); *** DEADLOCK *** May be due to missing lock nesting notation Fixes: 1b2628397058 ("block: Keyslot Manager for Inline Encryption") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230610061139.212085-1-ebiggers@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23netfilter: conntrack: don't fold port numbers into addresses before hashingFlorian Westphal
[ Upstream commit eaf9e7192ec9af2fbf1b6eb2299dd0feca6c5f7e ] Originally this used jhash2() over tuple and folded the zone id, the pernet hash value, destination port and l4 protocol number into the 32bit seed value. When the switch to siphash was done, I used an on-stack temporary buffer to build a suitable key to be hashed via siphash(). But this showed up as performance regression, so I got rid of the temporary copy and collected to-be-hashed data in 4 u64 variables. This makes it easy to build tuples that produce the same hash, which isn't desirable even though chain lengths are limited. Switch back to plain siphash, but just like with jhash2(), take advantage of the fact that most of to-be-hashed data is already in a suitable order. Use an empty struct as annotation in 'struct nf_conntrack_tuple' to mark last member that can be used as hash input. The only remaining data that isn't present in the tuple structure are the zone identifier and the pernet hash: fold those into the key. Fixes: d2c806abcf0b ("netfilter: conntrack: use siphash_4u64") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23netfilter: nf_tables: report use refcount overflowPablo Neira Ayuso
[ Upstream commit 1689f25924ada8fe14a4a82c38925d04994c7142 ] Overflow use refcount checks are not complete. Add helper function to deal with object reference counter tracking. Report -EMFILE in case UINT_MAX is reached. nft_use_dec() splats in case that reference counter underflows, which should not ever happen. Add nft_use_inc_restore() and nft_use_dec_restore() which are used to restore reference counter from error and abort paths. Use u32 in nft_flowtable and nft_object since helper functions cannot work on bitfields. Remove the few early incomplete checks now that the helper functions are in place and used to check for refcount overflow. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19blktrace: use inline function for blk_trace_remove() while blktrace is disabledYu Kuai
commit cbe7cff4a76bc749dd70264ca5cf924e2adf9296 upstream. If config is disabled, call blk_trace_remove() directly will trigger build warning, hence use inline function instead, prepare to fix blktrace debugfs entries leakage. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230610022003.2557284-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-19shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfsRoberto Sassu
commit 36ce9d76b0a93bae799e27e4f5ac35478c676592 upstream. As the ramfs-based tmpfs uses ramfs_init_fs_context() for the init_fs_context method, which allocates fc->s_fs_info, use ramfs_kill_sb() to free it and avoid a memory leak. Link: https://lkml.kernel.org/r/20230607161523.2876433-1-roberto.sassu@huaweicloud.com Fixes: c3b1b1cbf002 ("ramfs: add support for "mode=" mount option") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-19autofs: use flexible array in ioctl structureArnd Bergmann
commit e910c8e3aa02dc456e2f4c32cb479523c326b534 upstream. Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") introduced a warning for the autofs_dev_ioctl structure: In function 'check_name', inlined from 'validate_dev_ioctl' at fs/autofs/dev-ioctl.c:131:9, inlined from '_autofs_dev_ioctl' at fs/autofs/dev-ioctl.c:624:8: fs/autofs/dev-ioctl.c:33:14: error: 'strchr' reading 1 or more bytes from a region of size 0 [-Werror=stringop-overread] 33 | if (!strchr(name, '/')) | ^~~~~~~~~~~~~~~~~ In file included from include/linux/auto_dev-ioctl.h:10, from fs/autofs/autofs_i.h:10, from fs/autofs/dev-ioctl.c:14: include/uapi/linux/auto_dev-ioctl.h: In function '_autofs_dev_ioctl': include/uapi/linux/auto_dev-ioctl.h:112:14: note: source object 'path' of size 0 112 | char path[0]; | ^~~~ This is easily fixed by changing the gnu 0-length array into a c99 flexible array. Since this is a uapi structure, we have to be careful about possible regressions but this one should be fine as they are equivalent here. While it would break building with ancient gcc versions that predate c99, it helps building with --std=c99 and -Wpedantic builds in user space, as well as non-gnu compilers. This means we probably also want it fixed in stable kernels. Cc: stable@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230523081944.581710-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-19watch_queue: prevent dangling pipe pointerSiddh Raman Pant
commit 943211c87427f25bd22e0e63849fb486bb5f87fa upstream. NULL the dangling pipe reference while clearing watch_queue. If not done, a reference to a freed pipe remains in the watch_queue, as this function is called before freeing a pipe in free_pipe_info() (see line 834 of fs/pipe.c). The sole use of wqueue->defunct is for checking if the watch queue has been cleared, but wqueue->pipe is also NULLed while clearing. Thus, wqueue->defunct is superfluous, as wqueue->pipe can be checked for NULL. Hence, the former can be removed. Tested with keyutils testsuite. Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Siddh Raman Pant <code@siddh.me> Acked-by: David Howells <dhowells@redhat.com> Message-Id: <20230605143616.640517-1-code@siddh.me> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-19net: dsa: sja1105: always enable the send_meta optionsVladimir Oltean
[ Upstream commit a372d66af48506d9f7aaae2a474cd18f14d98cb8 ] incl_srcpt has the limitation, mentioned in commit b4638af8885a ("net: dsa: sja1105: always enable the INCL_SRCPT option"), that frames with a MAC DA of 01:80:c2:xx:yy:zz will be received as 01:80:c2:00:00:zz unless PTP RX timestamping is enabled. The incl_srcpt option was initially unconditionally enabled, then that changed with commit 42824463d38d ("net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode"), then again with b4638af8885a ("net: dsa: sja1105: always enable the INCL_SRCPT option"). Bottom line is that it now needs to be always enabled, otherwise the driver does not have a reliable source of information regarding source_port and switch_id for link-local traffic (tag_8021q VLANs may be imprecise since now they identify an entire bridging domain when ports are not standalone). If we accept that PTP RX timestamping (and therefore, meta frame generation) is always enabled in hardware, then that limitation could be avoided and packets with any MAC DA can be properly received, because meta frames do contain the original bytes from the MAC DA of their associated link-local packet. This change enables meta frame generation unconditionally, which also has the nice side effects of simplifying the switch control path (a switch reset is no longer required on hwtstamping settings change) and the tagger data path (it no longer needs to be informed whether to expect meta frames or not - it always does). Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19net: fix net_dev_start_xmit trace event vs skb_transport_offset()Eric Dumazet
[ Upstream commit f88fcb1d7d961b4b402d675109726f94db87571c ] After blamed commit, we must be more careful about using skb_transport_offset(), as reminded us by syzbot: WARNING: CPU: 0 PID: 10 at include/linux/skbuff.h:2868 skb_transport_offset include/linux/skbuff.h:2977 [inline] WARNING: CPU: 0 PID: 10 at include/linux/skbuff.h:2868 perf_trace_net_dev_start_xmit+0x89a/0xce0 include/trace/events/net.h:14 Modules linked in: CPU: 0 PID: 10 Comm: kworker/u4:1 Not tainted 6.1.30-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet RIP: 0010:skb_transport_header include/linux/skbuff.h:2868 [inline] RIP: 0010:skb_transport_offset include/linux/skbuff.h:2977 [inline] RIP: 0010:perf_trace_net_dev_start_xmit+0x89a/0xce0 include/trace/events/net.h:14 Code: 8b 04 25 28 00 00 00 48 3b 84 24 c0 00 00 00 0f 85 4e 04 00 00 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc e8 56 22 01 fd <0f> 0b e9 f6 fc ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 86 f9 ff RSP: 0018:ffffc900002bf700 EFLAGS: 00010293 RAX: ffffffff8485d8ca RBX: 000000000000ffff RCX: ffff888100914280 RDX: 0000000000000000 RSI: 000000000000ffff RDI: 000000000000ffff RBP: ffffc900002bf818 R08: ffffffff8485d5b6 R09: fffffbfff0f8fb5e R10: 0000000000000000 R11: dffffc0000000001 R12: 1ffff110217d8f67 R13: ffff88810bec7b3a R14: dffffc0000000000 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f96cf6d52f0 CR3: 000000012224c000 CR4: 0000000000350ef0 Call Trace: <TASK> [<ffffffff84715e35>] trace_net_dev_start_xmit include/trace/events/net.h:14 [inline] [<ffffffff84715e35>] xmit_one net/core/dev.c:3643 [inline] [<ffffffff84715e35>] dev_hard_start_xmit+0x705/0x980 net/core/dev.c:3660 [<ffffffff8471a232>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff85416493>] dev_queue_xmit include/linux/netdevice.h:3030 [inline] [<ffffffff85416493>] batadv_send_skb_packet+0x3f3/0x680 net/batman-adv/send.c:108 [<ffffffff85416744>] batadv_send_broadcast_skb+0x24/0x30 net/batman-adv/send.c:127 [<ffffffff853bc52a>] batadv_iv_ogm_send_to_if net/batman-adv/bat_iv_ogm.c:393 [inline] [<ffffffff853bc52a>] batadv_iv_ogm_emit net/batman-adv/bat_iv_ogm.c:421 [inline] [<ffffffff853bc52a>] batadv_iv_send_outstanding_bat_ogm_packet+0x69a/0x840 net/batman-adv/bat_iv_ogm.c:1701 [<ffffffff8151023c>] process_one_work+0x8ac/0x1170 kernel/workqueue.c:2289 [<ffffffff81511938>] worker_thread+0xaa8/0x12d0 kernel/workqueue.c:2436 Fixes: 66e4c8d95008 ("net: warn if transport header was not set") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19Bluetooth: MGMT: Fix marking SCAN_RSP as not connectableLuiz Augusto von Dentz
[ Upstream commit 73f55453ea5236a586a7f1b3d5e2ee051d655351 ] When receiving a scan response there is no way to know if the remote device is connectable or not, so when it cannot be merged don't make any assumption and instead just mark it with a new flag defined as MGMT_DEV_FOUND_SCAN_RSP so userspace can tell it is a standalone SCAN_RSP. Link: https://lore.kernel.org/linux-bluetooth/CABBYNZ+CYMsDSPTxBn09Js3BcdC-x7vZFfyLJ3ppZGGwJKmUTw@mail.gmail.com/ Fixes: c70a7e4cc8d2 ("Bluetooth: Add support for Not Connectable flag for Device Found events") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19net: mscc: ocelot: don't keep PTP configuration of all ports in single structureVladimir Oltean
[ Upstream commit 45d0fcb5bc9558d0bf3d2fa7fabc5d8a88d35439 ] In a future change, the driver will need to determine whether PTP RX timestamping is enabled on a port (including whether traps were set up on that port in particular) and that is currently not possible. The driver supports different RX filters (L2, L4) and kinds of TX timestamping (one-step, two-step) on its ports, but it saves all configuration in a single struct hwtstamp_config that is global to the switch. So, the latest timestamping configuration on one port (including a request to disable timestamping) affects what gets reported for all ports, even though the configuration itself is still individual to each port. The port timestamping configurations are only coupled because of the common structure, so replace the hwtstamp_config with a mask of trapped protocols saved per port. We also have the ptp_cmd to distinguish between one-step and two-step PTP timestamping, so with those 2 bits of information we can fully reconstruct a descriptive struct hwtstamp_config for each port, during the SIOCGHWTSTAMP ioctl. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19lib/bitmap: drop optimization of bitmap_{from,to}_arr64Yury Norov
[ Upstream commit c1d2ba10f594046831d14b03f194e8d05e78abad ] bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE architectures when it's wired to bitmap_copy_clear_tail(). bitmap_copy_clear_tail() takes care of unused bits in the bitmap up to the next word boundary. But on 32-bit machines when copying bits from bitmap to array of 64-bit words, it's expected that the unused part of a recipient array must be cleared up to 64-bit boundary, so the last 4 bytes may stay untouched when nbits % 64 <= 32. While the copying part of the optimization works correct, that clear-tail trick makes corresponding tests reasonably fail: test_bitmap: bitmap_to_arr64(nbits == 1): tail is not safely cleared: 0xa5a5a5a500000001 (must be 0x0000000000000001) Fix it by removing bitmap_{from,to}_arr64() optimization for 32-bit LE arches. Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/lkml/20230225184702.GA3587246@roeck-us.net/ Fixes: 0a97953fd221 ("lib: add bitmap_{from,to}_arr64") Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19Revert "usb: common: usb-conn-gpio: Set last role to unknown before initial ↵Greg Kroah-Hartman
detection" [ Upstream commit df49f2a0ac4a34c0cb4b5c233fcfa0add644c43c ] This reverts commit edd60d24bd858cef165274e4cd6cab43bdc58d15. Heikki reports that this should not be a global flag just to work around one broken driver and should be fixed differently, so revert it. Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Fixes: edd60d24bd85 ("usb: common: usb-conn-gpio: Set last role to unknown before initial detection") Link: https://lore.kernel.org/r/ZImE4L3YgABnCIsP@kuha.fi.intel.com Cc: Prashanth K <quic_prashk@quicinc.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19usb: common: usb-conn-gpio: Set last role to unknown before initial detectionPrashanth K
[ Upstream commit edd60d24bd858cef165274e4cd6cab43bdc58d15 ] Currently if we bootup a device without cable connected, then usb-conn-gpio won't call set_role() since last_role is same as current role. This happens because during probe last_role gets initialised to zero. To avoid this, added a new constant in enum usb_role, last_role is set to USB_ROLE_UNKNOWN before performing initial detection. While at it, also handle default case for the usb_role switch in cdns3, intel-xhci-usb-role-switch & musb/jz4740 to avoid build warnings. Fixes: 4602f3bff266 ("usb: common: add USB GPIO based connection detection driver") Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Message-ID: <1685544074-17337-1-git-send-email-quic_prashk@quicinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>