summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2023-08-30clk: Fix undefined reference to `clk_rate_exclusive_{get,put}'Biju Das
[ Upstream commit 2746f13f6f1df7999001d6595b16f789ecc28ad1 ] The COMMON_CLK config is not enabled in some of the architectures. This causes below build issues: pwm-rz-mtu3.c:(.text+0x114): undefined reference to `clk_rate_exclusive_put' pwm-rz-mtu3.c:(.text+0x32c): undefined reference to `clk_rate_exclusive_get' Fix these issues by moving clk_rate_exclusive_{get,put} inside COMMON_CLK code block, as clk.c is enabled by COMMON_CLK. Fixes: 55e9b8b7b806 ("clk: add clk_rate_exclusive api") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/all/202307251752.vLfmmhYm-lkp@intel.com/ Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/20230725175140.361479-1-biju.das.jz@bp.renesas.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-30scsi: core: raid_class: Remove raid_component_add()Zhu Wang
commit 60c5fd2e8f3c42a5abc565ba9876ead1da5ad2b7 upstream. The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230822015254.184270-1-wangzhu9@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Fixes: 04b5b5cb0136 ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30cgroup/cpuset: Free DL BW in case can_attach() failsDietmar Eggemann
commit 2ef269ef1ac006acf974793d975539244d77b28f upstream. cpuset_can_attach() can fail. Postpone DL BW allocation until all tasks have been checked. DL BW is not allocated per-task but as a sum over all DL tasks migrating. If multiple controllers are attached to the cgroup next to the cpuset controller a non-cpuset can_attach() can fail. In this case free DL BW in cpuset_cancel_attach(). Finally, update cpuset DL task count (nr_deadline_tasks) only in cpuset_attach(). Suggested-by: Waiman Long <longman@redhat.com> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> [ Fix conflicts in kernel/cgroup/cpuset.c due to new code being applied that is not applicable on this branch. Reject new code. ] Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30sched/deadline: Create DL BW alloc, free & check overflow interfaceDietmar Eggemann
commit 85989106feb734437e2d598b639991b9185a43a6 upstream. While moving a set of tasks between exclusive cpusets, cpuset_can_attach() -> task_can_attach() calls dl_cpu_busy(..., p) for DL BW overflow checking and per-task DL BW allocation on the destination root_domain for the DL tasks in this set. This approach has the issue of not freeing already allocated DL BW in the following error cases: (1) The set of tasks includes multiple DL tasks and DL BW overflow checking fails for one of the subsequent DL tasks. (2) Another controller next to the cpuset controller which is attached to the same cgroup fails in its can_attach(). To address this problem rework dl_cpu_busy(): (1) Split it into dl_bw_check_overflow() & dl_bw_alloc() and add a dedicated dl_bw_free(). (2) dl_bw_alloc() & dl_bw_free() take a `u64 dl_bw` parameter instead of a `struct task_struct *p` used in dl_cpu_busy(). This allows to allocate DL BW for a set of tasks too rather than only for a single task. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30sched/cpuset: Keep track of SCHED_DEADLINE task in cpusetsJuri Lelli
commit 6c24849f5515e4966d94fa5279bdff4acf2e9489 upstream. Qais reported that iterating over all tasks when rebuilding root domains for finding out which ones are DEADLINE and need their bandwidth correctly restored on such root domains can be a costly operation (10+ ms delays on suspend-resume). To fix the problem keep track of the number of DEADLINE tasks belonging to each cpuset and then use this information (followup patch) to only perform the above iteration if DEADLINE tasks are actually present in the cpuset for which a corresponding root domain is being rebuilt. Reported-by: Qais Yousef (Google) <qyousef@layalina.io> Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> [ Fix conflicts in kernel/cgroup/cpuset.c and kernel/sched/deadline.c due to pulling new fields and functions. Remove new code and match the patch diff. ] Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30sched/cpuset: Bring back cpuset_mutexJuri Lelli
commit 111cd11bbc54850f24191c52ff217da88a5e639b upstream. Turns out percpu_cpuset_rwsem - commit 1243dc518c9d ("cgroup/cpuset: Convert cpuset_mutex to percpu_rwsem") - wasn't such a brilliant idea, as it has been reported to cause slowdowns in workloads that need to change cpuset configuration frequently and it is also not implementing priority inheritance (which causes troubles with realtime workloads). Convert percpu_cpuset_rwsem back to regular cpuset_mutex. Also grab it only for SCHED_DEADLINE tasks (other policies don't care about stable cpusets anyway). Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> [ Fix conflict in kernel/cgroup/cpuset.c due to pulling new functions or comment that don't exist on 5.10 or the usage of different cpu hotplug lock whenever replacing the rwsem with mutex. Remove BUG_ON() for rwsem that doesn't exist on mainline. ] Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-30libceph, rbd: ignore addr->type while comparing in some casesIlya Dryomov
[ Upstream commit 313771e80fd253d4b5472e61a2d12b03c5293aa9 ] For libceph, this ensures that libceph instance sharing (share option) continues to work. For rbd, this avoids blocklisting alive lock owners (locker addr is always LEGACY, while watcher addr is ANY in nautilus). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Stable-dep-of: 588159009d5b ("rbd: retrieve and check lock owner twice before blocklisting") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26x86/ibt: Add ANNOTATE_NOENDBRPeter Zijlstra
[ Upstream commit c8c301abeae58ec756b8fcb2178a632bd3c9e284 ] In order to have objtool warn about code references to !ENDBR instruction, we need an annotation to allow this for non-control-flow instances -- consider text range checks, text patching, or return trampolines etc. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154317.578968224@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26objtool: Add frame-pointer-specific function ignoreJosh Poimboeuf
[ Upstream commit e028c4f7ac7ca8c96126fe46c54ab3d56ffe6a66 ] Add a CONFIG_FRAME_POINTER-specific version of STACK_FRAME_NON_STANDARD() for the case where a function is intentionally missing frame pointer setup, but otherwise needs objtool/ORC coverage when frame pointers are disabled. Link: https://lkml.kernel.org/r/163163047364.489837.17377799909553689661.stgit@devnote2 Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Stable-dep-of: c8c301abeae5 ("x86/ibt: Add ANNOTATE_NOENDBR") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26net: do not allow gso_size to be set to GSO_BY_FRAGSEric Dumazet
[ Upstream commit b616be6b97688f2f2bd7c4a47ab32f27f94fb2a9 ] One missing check in virtio_net_hdr_to_skb() allowed syzbot to crash kernels again [1] Do not allow gso_size to be set to GSO_BY_FRAGS (0xffff), because this magic value is used by the kernel. [1] general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 0 PID: 5039 Comm: syz-executor401 Not tainted 6.5.0-rc5-next-20230809-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 RIP: 0010:skb_segment+0x1a52/0x3ef0 net/core/skbuff.c:4500 Code: 00 00 00 e9 ab eb ff ff e8 6b 96 5d f9 48 8b 84 24 00 01 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e ea 21 00 00 48 8b 84 24 00 01 RSP: 0018:ffffc90003d3f1c8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 000000000001fffe RCX: 0000000000000000 RDX: 000000000000000e RSI: ffffffff882a3115 RDI: 0000000000000070 RBP: ffffc90003d3f378 R08: 0000000000000005 R09: 000000000000ffff R10: 000000000000ffff R11: 5ee4a93e456187d6 R12: 000000000001ffc6 R13: dffffc0000000000 R14: 0000000000000008 R15: 000000000000ffff FS: 00005555563f2380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020020000 CR3: 000000001626d000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> udp6_ufo_fragment+0x9d2/0xd50 net/ipv6/udp_offload.c:109 ipv6_gso_segment+0x5c4/0x17b0 net/ipv6/ip6_offload.c:120 skb_mac_gso_segment+0x292/0x610 net/core/gso.c:53 __skb_gso_segment+0x339/0x710 net/core/gso.c:124 skb_gso_segment include/net/gso.h:83 [inline] validate_xmit_skb+0x3a5/0xf10 net/core/dev.c:3625 __dev_queue_xmit+0x8f0/0x3d60 net/core/dev.c:4329 dev_queue_xmit include/linux/netdevice.h:3082 [inline] packet_xmit+0x257/0x380 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3087 [inline] packet_sendmsg+0x24c7/0x5570 net/packet/af_packet.c:3119 sock_sendmsg_nosec net/socket.c:727 [inline] sock_sendmsg+0xd9/0x180 net/socket.c:750 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2496 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2550 __sys_sendmsg+0x117/0x1e0 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7ff27cdb34d9 Fixes: 3953c46c3ac7 ("sk_buff: allow segmenting based on frag sizes") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20230816142158.1779798-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26mmc: core: add devm_mmc_alloc_hostHeiner Kallweit
[ Upstream commit 80df83c2c57e75cb482ccf0c639ce84703ab41a2 ] Add a device-managed version of mmc_alloc_host(). The argument order is reversed compared to mmc_alloc_host() because device-managed functions typically have the device argument first. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/6d8f9fdc-7c9e-8e4f-e6ef-5470b971c74e@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: b8ada54fa1b8 ("mmc: meson-gx: fix deferred probing") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26usb: gadget: udc: core: Introduce check_config to verify USB configurationWesley Cheng
[ Upstream commit ce7d0008c2356626f69f37ef1afce8fbc83fe142 ] Some UDCs may have constraints on how many high bandwidth endpoints it can support in a certain configuration. This API allows for the composite driver to pass down the total number of endpoints to the UDC so it can verify it has the required resources to support the configuration. Signed-off-by: Wesley Cheng <wcheng@codeaurora.org> Link: https://lore.kernel.org/r/1625908395-5498-2-git-send-email-wcheng@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: dbe678f6192f ("usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26bus: mhi: Add MMIO region length to controller structureBhaumik Bhatt
[ Upstream commit baa7a08569358d9d16e71ce36f287c39a665d776 ] Make controller driver specify the MMIO register region length for range checking of BHI or BHIe space. This can help validate that offsets are in acceptable memory region or not and avoid any boot-up issues due to BHI or BHIe memory accesses. Link: https://lore.kernel.org/r/1620330705-40192-4-git-send-email-bbhatt@codeaurora.org Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20210802051255.5771-6-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 6a0c637bfee6 ("bus: mhi: host: Range check CHDBOFF and ERDBOFF") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26iopoll: Call cpu_relax() in busy loopsGeert Uytterhoeven
[ Upstream commit b407460ee99033503993ac7437d593451fcdfe44 ] It is considered good practice to call cpu_relax() in busy loops, see Documentation/process/volatile-considered-harmful.rst. This can not only lower CPU power consumption or yield to a hyperthreaded twin processor, but also allows an architecture to mitigate hardware issues (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the architecture-specific cpu_relax() implementation. In addition, cpu_relax() is also a compiler barrier. It is not immediately obvious that the @op argument "function" will result in an actual function call (e.g. in case of inlining). Where a function call is a C sequence point, this is lost on inlining. Therefore, with agressive enough optimization it might be possible for the compiler to hoist the: (val) = op(args); "load" out of the loop because it doesn't see the value changing. The addition of cpu_relax() would inhibit this. As the iopoll helpers lack calls to cpu_relax(), people are sometimes reluctant to use them, and may fall back to open-coded polling loops (including cpu_relax() calls) instead. Fix this by adding calls to cpu_relax() to the iopoll helpers: - For the non-atomic case, it is sufficient to call cpu_relax() in case of a zero sleep-between-reads value, as a call to usleep_range() is a safe barrier otherwise. However, it doesn't hurt to add the call regardless, for simplicity, and for similarity with the atomic case below. - For the atomic case, cpu_relax() must be called regardless of the sleep-between-reads value, as there is no guarantee all architecture-specific implementations of udelay() handle this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/45c87bec3397fdd704376807f0eec5cc71be440f.1685692810.git.geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-26net/mlx5: Move all internal timer metadata into a dedicated structEran Ben Elisha
[ Upstream commit d6f3dc8f509ce6288e2537eb4b0614ef444fd84a ] Internal timer mode (SW clock) requires some PTP clock related metadata structs. Real time mode (HW clock) will not need these metadata structs. This separation emphasize the different interfaces for HW clock and SW clock. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Stable-dep-of: d00620762565 ("net/mlx5: Skip clock update work when device is in error state") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-16x86/speculation: Add cpu_show_gds() prototypeArnd Bergmann
commit a57c27c7ad85c420b7de44c6ee56692d51709dda upstream. The newly added function has two definitions but no prototypes: drivers/base/cpu.c:605:16: error: no previous prototype for 'cpu_show_gds' [-Werror=missing-prototypes] Add a declaration next to the other ones for this file to avoid the warning. Fixes: 8974eb588283b ("x86/speculation: Add Gather Data Sampling mitigation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: stable@kernel.org Link: https://lore.kernel.org/all/20230809130530.1913368-1-arnd%40kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11PM / wakeirq: support enabling wake-up irq after runtime_suspend calledChunfeng Yun
[ Upstream commit 259714100d98b50bf04d36a21bf50ca8b829fc11 ] When the dedicated wake IRQ is level trigger, and it uses the device's low-power status as the wakeup source, that means if the device is not in low-power state, the wake IRQ will be triggered if enabled; For this case, need enable the wake IRQ after running the device's ->runtime_suspend() which make it enter low-power state. e.g. Assume the wake IRQ is a low level trigger type, and the wakeup signal comes from the low-power status of the device. The wakeup signal is low level at running time (0), and becomes high level when the device enters low-power state (runtime_suspend (1) is called), a wakeup event at (2) make the device exit low-power state, then the wakeup signal also becomes low level. ------------------ | ^ ^| ---------------- | | -------------- |<---(0)--->|<--(1)--| (3) (2) (4) if enable the wake IRQ before running runtime_suspend during (0), a wake IRQ will arise, it causes resume immediately; it works if enable wake IRQ ( e.g. at (3) or (4)) after running ->runtime_suspend(). This patch introduces a new status WAKE_IRQ_DEDICATED_REVERSE to optionally support enabling wake IRQ after running ->runtime_suspend(). Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Stable-dep-of: 8527beb12087 ("PM: sleep: wakeirq: fix wake irq arming") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11tracing: Show real address for trace event argumentsMasami Hiramatsu
[ Upstream commit efbbdaa22bb78761bff8dfdde027ad04bedd47ce ] To help debugging kernel, show real address for trace event arguments in tracefs/trace{,pipe} instead of hashed pointer value. Since ftrace human-readable format uses vsprintf(), all %p are translated to hash values instead of pointer address. However, when debugging the kernel, raw address value gives a hint when comparing with the memory mapping in the kernel. (Those are sometimes used with crash log, which is not hashed too) So converting %p with %px when calling trace_seq_printf(). Moreover, this is not improving the security because the tracefs can be used only by root user and the raw address values are readable from tracefs/percpu/cpu*/trace_pipe_raw file. Link: https://lkml.kernel.org/r/160277370703.29307.5134475491761971203.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Stable-dep-of: d5a821896360 ("tracing: Fix memory leak of iter->temp when reading trace_pipe") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-08x86/srso: Add a Speculative RAS Overflow mitigationBorislav Petkov (AMD)
Upstream commit: fb3bd914b3ec28f5fb697ac55c4846ac2d542855 Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors. The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence. To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias() which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to Retbleed one: srso_untrain_ret() and srso_safe_ret(). Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08mm: Move mm_cachep initialization to mm_init()Peter Zijlstra
commit af80602799681c78f14fbe20b6185a56020dedee upstream. In order to allow using mm_alloc() much earlier, move initializing mm_cachep into mm_init(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221025201057.751153381@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08x86/mm: Use mm_alloc() in poking_init()Peter Zijlstra
commit 3f4c8211d982099be693be9aa7d6fc4607dff290 upstream. Instead of duplicating init_mm, allocate a fresh mm. The advantage is that mm_alloc() has much simpler dependencies. Additionally it makes more conceptual sense, init_mm has no (and must not have) user state to duplicate. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221025201057.816175235@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08init: Provide arch_cpu_finalize_init()Thomas Gleixner
commit 7725acaa4f0c04fbefb0e0d342635b967bb7d414 upstream check_bugs() has become a dumping ground for all sorts of activities to finalize the CPU initialization before running the rest of the init code. Most are empty, a few do actual bug checks, some do alternative patching and some cobble a CPU advertisement string together.... Aside of that the current implementation requires duplicated function declaration and mostly empty header files for them. Provide a new function arch_cpu_finalize_init(). Provide a generic declaration if CONFIG_ARCH_HAS_CPU_FINALIZE_INIT is selected and a stub inline otherwise. This requires a temporary #ifdef in start_kernel() which will be removed along with check_bugs() once the architectures are converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224544.957805717@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-27tcp: annotate data-races around fastopenq.max_qlenEric Dumazet
[ Upstream commit 70f360dd7042cb843635ece9d28335a4addff9eb ] This field can be read locklessly. Fixes: 1536e2857bd3 ("tcp: Add a TCP_FASTOPEN socket option to get a max backlog on its listner") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230719212857.3943972-12-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27posix-timers: Ensure timer ID search-loop limit is validThomas Gleixner
[ Upstream commit 8ce8849dd1e78dadcee0ec9acbd259d239b7069f ] posix_timer_add() tries to allocate a posix timer ID by starting from the cached ID which was stored by the last successful allocation. This is done in a loop searching the ID space for a free slot one by one. The loop has to terminate when the search wrapped around to the starting point. But that's racy vs. establishing the starting point. That is read out lockless, which leads to the following problem: CPU0 CPU1 posix_timer_add() start = sig->posix_timer_id; lock(hash_lock); ... posix_timer_add() if (++sig->posix_timer_id < 0) start = sig->posix_timer_id; sig->posix_timer_id = 0; So CPU1 can observe a negative start value, i.e. -1, and the loop break never happens because the condition can never be true: if (sig->posix_timer_id == start) break; While this is unlikely to ever turn into an endless loop as the ID space is huge (INT_MAX), the racy read of the start value caught the attention of KCSAN and Dmitry unearthed that incorrectness. Rewrite it so that all id operations are under the hash lock. Reported-by: syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/87bkhzdn6g.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27Revert "8250: add support for ASIX devices with a FIFO bug"Jiaqing Zhao
commit a82d62f708545d22859584e0e0620da8e3759bbc upstream. This reverts commit eb26dfe8aa7eeb5a5aa0b7574550125f8aa4c3b3. Commit eb26dfe8aa7e ("8250: add support for ASIX devices with a FIFO bug") merged on Jul 13, 2012 adds a quirk for PCI_VENDOR_ID_ASIX (0x9710). But that ID is the same as PCI_VENDOR_ID_NETMOS defined in 1f8b061050c7 ("[PATCH] Netmos parallel/serial/combo support") merged on Mar 28, 2005. In pci_serial_quirks array, the NetMos entry always takes precedence over the ASIX entry even since it was initially merged, code in that commit is always unreachable. In my tests, adding the FIFO workaround to pci_netmos_init() makes no difference, and the vendor driver also does not have such workaround. Given that the code was never used for over a decade, it's safe to revert it. Also, the real PCI_VENDOR_ID_ASIX should be 0x125b, which is used on their newer AX99100 PCIe serial controllers released on 2016. The FIFO workaround should not be intended for these newer controllers, and it was never implemented in vendor driver. Fixes: eb26dfe8aa7e ("8250: add support for ASIX devices with a FIFO bug") Cc: stable <stable@kernel.org> Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230619155743.827859-1-jiaqing.zhao@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-27workqueue: clean up WORK_* constant types, clarify maskingLinus Torvalds
commit afa4bb778e48d79e4a642ed41e3b4e0de7489a6c upstream. Dave Airlie reports that gcc-13.1.1 has started complaining about some of the workqueue code in 32-bit arm builds: kernel/workqueue.c: In function ‘get_work_pwq’: kernel/workqueue.c:713:24: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 713 | return (void *)(data & WORK_STRUCT_WQ_DATA_MASK); | ^ [ ... a couple of other cases ... ] and while it's not immediately clear exactly why gcc started complaining about it now, I suspect it's some C23-induced enum type handlign fixup in gcc-13 is the cause. Whatever the reason for starting to complain, the code and data types are indeed disgusting enough that the complaint is warranted. The wq code ends up creating various "helper constants" (like that WORK_STRUCT_WQ_DATA_MASK) using an enum type, which is all kinds of confused. The mask needs to be 'unsigned long', not some unspecified enum type. To make matters worse, the actual "mask and cast to a pointer" is repeated a couple of times, and the cast isn't even always done to the right pointer, but - as the error case above - to a 'void *' with then the compiler finishing the job. That's now how we roll in the kernel. So create the masks using the proper types rather than some ambiguous enumeration, and use a nice helper that actually does the type conversion in one well-defined place. Incidentally, this magically makes clang generate better code. That, admittedly, is really just a sign of clang having been seriously confused before, and cleaning up the typing unconfuses the compiler too. Reported-by: Dave Airlie <airlied@gmail.com> Link: https://lore.kernel.org/lkml/CAPM=9twNnV4zMCvrPkw3H-ajZOH-01JVh_kDrxdPYQErz8ZTdA@mail.gmail.com/ Cc: Arnd Bergmann <arnd@arndb.de> Cc: Tejun Heo <tj@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-27shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfsRoberto Sassu
commit 36ce9d76b0a93bae799e27e4f5ac35478c676592 upstream. As the ramfs-based tmpfs uses ramfs_init_fs_context() for the init_fs_context method, which allocates fc->s_fs_info, use ramfs_kill_sb() to free it and avoid a memory leak. Link: https://lkml.kernel.org/r/20230607161523.2876433-1-roberto.sassu@huaweicloud.com Fixes: c3b1b1cbf002 ("ramfs: add support for "mode=" mount option") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-27Revert "usb: common: usb-conn-gpio: Set last role to unknown before initial ↵Greg Kroah-Hartman
detection" [ Upstream commit df49f2a0ac4a34c0cb4b5c233fcfa0add644c43c ] This reverts commit edd60d24bd858cef165274e4cd6cab43bdc58d15. Heikki reports that this should not be a global flag just to work around one broken driver and should be fixed differently, so revert it. Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Fixes: edd60d24bd85 ("usb: common: usb-conn-gpio: Set last role to unknown before initial detection") Link: https://lore.kernel.org/r/ZImE4L3YgABnCIsP@kuha.fi.intel.com Cc: Prashanth K <quic_prashk@quicinc.com> Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27usb: common: usb-conn-gpio: Set last role to unknown before initial detectionPrashanth K
[ Upstream commit edd60d24bd858cef165274e4cd6cab43bdc58d15 ] Currently if we bootup a device without cable connected, then usb-conn-gpio won't call set_role() since last_role is same as current role. This happens because during probe last_role gets initialised to zero. To avoid this, added a new constant in enum usb_role, last_role is set to USB_ROLE_UNKNOWN before performing initial detection. While at it, also handle default case for the usb_role switch in cdns3, intel-xhci-usb-role-switch & musb/jz4740 to avoid build warnings. Fixes: 4602f3bff266 ("usb: common: add USB GPIO based connection detection driver") Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Message-ID: <1685544074-17337-1-git-send-email-quic_prashk@quicinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *Aneesh Kumar K.V
[ Upstream commit dc4875f0e791de554bdc45aa1dbd6e45e107e50f ] No functional change in this patch. [aneesh.kumar@linux.ibm.com: m68k build error reported by kernel robot] Link: https://lkml.kernel.org/r/87tulxnb2v.fsf@linux.ibm.com Link: https://lkml.kernel.org/r/20210615110859.320299-2-aneesh.kumar@linux.ibm.com Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Stable-dep-of: 0da90af431ab ("powerpc/book3s64/mm: Fix DirectMap stats in /proc/meminfo") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *Aneesh Kumar K.V
[ Upstream commit 9cf6fa2458443118b84090aa1bf7a3630b5940e8 ] No functional change in this patch. [aneesh.kumar@linux.ibm.com: fix] Link: https://lkml.kernel.org/r/87wnqtnb60.fsf@linux.ibm.com [sfr@canb.auug.org.au: another fix] Link: https://lkml.kernel.org/r/20210619134410.89559-1-aneesh.kumar@linux.ibm.com Link: https://lkml.kernel.org/r/20210615110859.320299-1-aneesh.kumar@linux.ibm.com Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Stable-dep-of: 0da90af431ab ("powerpc/book3s64/mm: Fix DirectMap stats in /proc/meminfo") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27PCI: Add pci_clear_master() stub for non-CONFIG_PCISui Jingfeng
[ Upstream commit 2aa5ac633259843f656eb6ecff4cf01e8e810c5e ] Add a pci_clear_master() stub when CONFIG_PCI is not set so drivers that support both PCI and platform devices don't need #ifdefs or extra Kconfig symbols for the PCI parts. [bhelgaas: commit log] Fixes: 6a479079c072 ("PCI: Add pci_clear_master() as opposite of pci_set_master()") Link: https://lore.kernel.org/r/20230531102744.2354313-1-suijingfeng@loongson.cn Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27watchdog/perf: define dummy watchdog_update_hrtimer_threshold() on correct ↵Douglas Anderson
config [ Upstream commit 5e008df11c55228a86a1bae692cc2002503572c9 ] Patch series "watchdog/hardlockup: Add the buddy hardlockup detector", v5. This patch series adds the "buddy" hardlockup detector. In brief, the buddy hardlockup detector can detect hardlockups without arch-level support by having CPUs checkup on a "buddy" CPU periodically. Given the new design of this patch series, testing all combinations is fairly difficult. I've attempted to make sure that all combinations of CONFIG_ options are good, but it wouldn't surprise me if I missed something. I apologize in advance and I'll do my best to fix any problems that are found. This patch (of 18): The real watchdog_update_hrtimer_threshold() is defined in kernel/watchdog_hld.c. That file is included if CONFIG_HARDLOCKUP_DETECTOR_PERF and the function is defined in that file if CONFIG_HARDLOCKUP_CHECK_TIMESTAMP. The dummy version of the function in "nmi.h" didn't get that quite right. While this doesn't appear to be a huge deal, it's nice to make it consistent. It doesn't break builds because CHECK_TIMESTAMP is only defined by x86 so others don't get a double definition, and x86 uses perf lockup detector, so it gets the out of line version. Link: https://lkml.kernel.org/r/20230519101840.v5.18.Ia44852044cdcb074f387e80df6b45e892965d4a1@changeid Link: https://lkml.kernel.org/r/20230519101840.v5.1.I8cbb2f4fa740528fcfade4f5439b6cdcdd059251@changeid Fixes: 7edaeb6841df ("kernel/watchdog: Prevent false positives with turbo modes") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guenter Roeck <groeck@chromium.org> Cc: Ian Rogers <irogers@google.com> Cc: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masayoshi Mizuma <msys.mizuma@gmail.com> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Sumit Garg <sumit.garg@linaro.org> Cc: Tzung-Bi Shih <tzungbi@chromium.org> Cc: Will Deacon <will@kernel.org> Cc: Colin Cross <ccross@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27net: create netdev->dev_addr assignment helpersJakub Kicinski
[ Upstream commit 48eab831ae8b9f7002a533fa4235eed63ea1f1a3 ] Recent work on converting address list to a tree made it obvious we need an abstraction around writing netdev->dev_addr. Without such abstraction updating the main device address is invisible to the core. Introduce a number of helpers which for now just wrap memcpy() but in the future can make necessary changes to the address tree. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 391af06a02e7 ("wifi: wl3501_cs: Fix an error handling path in wl3501_probe()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVEStanislav Fomichev
[ Upstream commit 9cacf81f8161111db25f98e78a7a0e32ae142b3f ] Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE. We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom call in do_tcp_getsockopt using the on-stack data. This removes 3% overhead for locking/unlocking the socket. Without this patch: 3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt | --3.30%--__cgroup_bpf_run_filter_getsockopt | --0.81%--__kmalloc With the patch applied: 0.52% 0.12% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt_kern Note, exporting uapi/tcp.h requires removing netinet/tcp.h from test_progs.h because those headers have confliciting definitions. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210115163501.805133-2-sdf@google.com Stable-dep-of: 2598619e012c ("sctp: add bpf_bypass_getsockopt proto callback") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27fs: pipe: reveal missing function protoypesArnd Bergmann
[ Upstream commit 247c8d2f9837a3e29e3b6b7a4aa9c36c37659dd4 ] A couple of functions from fs/pipe.c are used both internally and for the watch queue code, but the declaration is only visible when the latter is enabled: fs/pipe.c:1254:5: error: no previous prototype for 'pipe_resize_ring' fs/pipe.c:758:15: error: no previous prototype for 'account_pipe_buffers' fs/pipe.c:764:6: error: no previous prototype for 'too_many_pipe_buffers_soft' fs/pipe.c:771:6: error: no previous prototype for 'too_many_pipe_buffers_hard' fs/pipe.c:777:6: error: no previous prototype for 'pipe_is_unprivileged_user' Make the visible unconditionally to avoid these warnings. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Message-Id: <20230516195629.551602-1-arnd@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28gpio: Allow per-parent interrupt dataMarc Zyngier
[ Upstream commit cfe6807d82e97e81c3209dca9448f091e1448a57 ] The core gpiolib code is able to deal with multiple interrupt parents for a single gpio irqchip. It however only allows a single piece of data to be conveyed to all flow handlers (either the gpio_chip or some other, driver-specific data). This means that drivers have to go through some interesting dance to find the correct context, something that isn't great in interrupt context (see aebdc8abc9db86e2bd33070fc2f961012fff74b4 for a prime example). Instead, offer an optional way for a pinctrl/gpio driver to provide an array of pointers which gets used to provide the correct context to the flow handler. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20211026175815.52703-2-joey.gouly@arm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Stable-dep-of: 8c00914e5438 ("gpiolib: Fix GPIO chip IRQ initialization restriction") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28sysctl: move some boundary constants from sysctl.c to sysctl_valsXiaoming Ni
[ Upstream commit 78e36f3b0dae586f623c4a37ec5eb5496f5abbe1 ] sysctl has helpers which let us specify boundary values for a min or max int value. Since these are used for a boundary check only they don't change, so move these variables to sysctl_vals to avoid adding duplicate variables. This will help with our cleanup of kernel/sysctl.c. [akpm@linux-foundation.org: update it for "mm/pagealloc: sysctl: change watermark_scale_factor max limit to 30%"] [mcgrof@kernel.org: major rebase] Link: https://lkml.kernel.org/r/20211123202347.818157-3-mcgrof@kernel.org Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Qing Wang <wangqing@vivo.com> Cc: Sebastian Reichel <sre@kernel.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Stephen Kitt <steve@sk2.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Antti Palosaari <crope@iki.fi> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Julia Lawall <julia.lawall@inria.fr> Cc: Lukas Middendorf <kernel@tuxforce.de> Cc: Mark Fasheh <mark@fasheh.com> Cc: Phillip Potter <phil@philpotter.co.uk> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: James E.J. Bottomley <jejb@linux.ibm.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Stable-dep-of: 935d44acf621 ("memfd: check for non-NULL file_seals in memfd_create() syscall") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28regulator: pca9450: Fix LDO3OUT and LDO4OUT MASKTeresa Remmet
[ Upstream commit 7257d930aadcd62d1c7971ab14f3b1126356abdc ] L3_OUT and L4_OUT Bit fields range from Bit 0:4 and thus the mask should be 0x1F instead of 0x0F. Fixes: 0935ff5f1f0a ("regulator: pca9450: add pca9450 pmic driver") Signed-off-by: Teresa Remmet <t.remmet@phytec.de> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Link: https://lore.kernel.org/r/20230614125240.3946519-1-t.remmet@phytec.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21Remove DECnet support from kernelStephen Hemminger
commit 1202cdd665315c525b5237e96e0bedc76d7e754f upstream. DECnet is an obsolete network protocol that receives more attention from kernel janitors than users. It belongs in computer protocol history museum not in Linux kernel. It has been "Orphaned" in kernel since 2010. The iproute2 support for DECnet was dropped in 5.0 release. The documentation link on Sourceforge says it is abandoned there as well. Leave the UAPI alone to keep userspace programs compiling. This means that there is still an empty neighbour table for AF_DECNET. The table of /proc/sys/net entries was updated to match current directories and reformatted to be alphabetical. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Ahern <dsahern@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-21kernel.h: split out kstrtox() and simple_strtox() to a separate headerAndy Shevchenko
[ Upstream commit 4c52729377eab025b238caeed48994a39c3b73f2 ] kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out kstrtox() and simple_strtox() helpers. At the same time convert users in header and lib folders to use new header. Though for time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [andy.shevchenko@gmail.com: fix documentation references] Link: https://lkml.kernel.org/r/20210615220003.377901-1-andy.shevchenko@gmail.com Link: https://lkml.kernel.org/r/20210611185815.44103-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Francis Laniel <laniel_francis@privacyrequired.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Kars Mulder <kerneldev@karsmulder.nl> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Stable-dep-of: 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-14usb: usbfs: Enforce page requirements for mmapRuihan Li
commit 0143d148d1e882fb1538dc9974c94d63961719b9 upstream. The current implementation of usbdev_mmap uses usb_alloc_coherent to allocate memory pages that will later be mapped into the user space. Meanwhile, usb_alloc_coherent employs three different methods to allocate memory, as outlined below: * If hcd->localmem_pool is non-null, it uses gen_pool_dma_alloc to allocate memory; * If DMA is not available, it uses kmalloc to allocate memory; * Otherwise, it uses dma_alloc_coherent. However, it should be noted that gen_pool_dma_alloc does not guarantee that the resulting memory will be page-aligned. Furthermore, trying to map slab pages (i.e., memory allocated by kmalloc) into the user space is not resonable and can lead to problems, such as a type confusion bug when PAGE_TABLE_CHECK=y [1]. To address these issues, this patch introduces hcd_alloc_coherent_pages, which addresses the above two problems. Specifically, hcd_alloc_coherent_pages uses gen_pool_dma_alloc_align instead of gen_pool_dma_alloc to ensure that the memory is page-aligned. To replace kmalloc, hcd_alloc_coherent_pages directly allocates pages by calling __get_free_pages. Reported-by: syzbot+fcf1a817ceb50935ce99@syzkaller.appspotmail.comm Closes: https://lore.kernel.org/lkml/000000000000258e5e05fae79fc1@google.com/ [1] Fixes: f7d34b445abc ("USB: Add support for usbfs zerocopy.") Fixes: ff2437befd8f ("usb: host: Fix excessive alignment restriction for local memory allocations") Cc: stable@vger.kernel.org Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20230515130958.32471-2-lrh2000@pku.edu.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-14rfs: annotate lockless accesses to RFS sock flow tableEric Dumazet
[ Upstream commit 5c3b74a92aa285a3df722bf6329ba7ccf70346d6 ] Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table. This also prevents a (smart ?) compiler to remove the condition in: if (table->ents[index] != newval) table->ents[index] = newval; We need the condition to avoid dirtying a shared cache line. Fixes: fec5e652e58f ("rfs: Receive Flow Steering") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05regulator: Add regmap helper for ramp-delay settingMatti Vaittinen
[ Upstream commit fb8fee9efdcf084d9e31ba14cc4734d97e5dd972 ] Quite a few regulator ICs do support setting ramp-delay by writing a value matching the delay to a ramp-delay register. Provide a simple helper for table-based delay setting. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/f101f1db564cf32cb58719c77af0b00d7236bb89.1617020713.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Mark Brown <broonie@kernel.org> Stable-dep-of: d67dada3e252 ("regulator: pca9450: Fix BUCK2 enable_mask") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05power: supply: core: Refactor ↵Hans de Goede
power_supply_set_input_current_limit_from_supplier() [ Upstream commit 2220af8ca61ae67de4ec3deec1c6395a2f65b9fd ] Some (USB) charger ICs have variants with USB D+ and D- pins to do their own builtin charger-type detection, like e.g. the bq24190 and bq25890 and also variants which lack this functionality, e.g. the bq24192 and bq25892. In case the charger-type; and thus the input-current-limit detection is done outside the charger IC then we need some way to communicate this to the charger IC. In the past extcon was used for this, but if the external detection does e.g. full USB PD negotiation then the extcon cable-types do not convey enough information. For these setups it was decided to model the external charging "brick" and the parameters negotiated with it as a power_supply class-device itself; and power_supply_set_input_current_limit_from_supplier() was introduced to allow drivers to get the input-current-limit this way. But in some cases psy drivers may want to know other properties, e.g. the bq25892 can do "quick-charge" negotiation by pulsing its current draw, but this should only be done if the usb_type psy-property of its supplier is set to DCP (and device-properties indicate the board allows higher voltages). Instead of adding extra helper functions for each property which a psy-driver wants to query from its supplier, refactor power_supply_set_input_current_limit_from_supplier() into a more generic power_supply_get_property_from_supplier() function. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Stable-dep-of: 77c2a3097d70 ("power: supply: bq24190: Call power_supply_changed() after updating input current") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-05power: supply: bq27xxx: Ensure power_supply_changed() is called on current ↵Hans de Goede
sign changes [ Upstream commit 939a116142012926e25de0ea6b7e2f8d86a5f1b6 ] On gauges where the current register is signed, there is no charging flag in the flags register. So only checking flags will not result in power_supply_changed() getting called when e.g. a charger is plugged in and the current sign changes from negative (discharging) to positive (charging). This causes userspace's notion of the status to lag until userspace does a poll. And when a power_supply_leds.c LED trigger is used to indicate charging status with a LED, this LED will lag until the capacity percentage changes, which may take many minutes (because the LED trigger only is updated on power_supply_changed() calls). Fix this by calling bq27xxx_battery_current_and_status() on gauges with a signed current register and checking if the status has changed. Fixes: 297a533b3e62 ("bq27x00: Cache battery registers") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-30fs: fix undefined behavior in bit shift for SB_NOUSERHao Ge
commit f15afbd34d8fadbd375f1212e97837e32bc170cc upstream. Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. It was spotted by UBSAN. So let's just fix this by using the BIT() helper for all SB_* flags. Fixes: e462ec50cb5f ("VFS: Differentiate mount flags (MS_*) from internal superblock flags") Signed-off-by: Hao Ge <gehao@kylinos.cn> Message-Id: <20230424051835.374204-1-gehao@kylinos.cn> [brauner@kernel.org: use BIT() for all SB_* flags] Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30power: supply: bq27xxx: Fix poll_interval handling and races on removeHans de Goede
commit c00bc80462afc7963f449d7f21d896d2f629cacc upstream. Before this patch bq27xxx_battery_teardown() was setting poll_interval = 0 to avoid bq27xxx_battery_update() requeuing the delayed_work item. There are 2 problems with this: 1. If the driver is unbound through sysfs, rather then the module being rmmod-ed, this changes poll_interval unexpectedly 2. This is racy, after it being set poll_interval could be changed before bq27xxx_battery_update() checks it through /sys/module/bq27xxx_battery/parameters/poll_interval Fix this by added a removed attribute to struct bq27xxx_device_info and using that instead of setting poll_interval to 0. There also is another poll_interval related race on remove(), writing /sys/module/bq27xxx_battery/parameters/poll_interval will requeue the delayed_work item for all devices on the bq27xxx_battery_devices list and the device being removed was only removed from that list after cancelling the delayed_work item. Fix this by moving the removal from the bq27xxx_battery_devices list to before cancelling the delayed_work item. Fixes: 8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30USB: core: Add routines for endpoint checks in old driversAlan Stern
commit 13890626501ffda22b18213ddaf7930473da5792 upstream. Many of the older USB drivers in the Linux USB stack were written based simply on a vendor's device specification. They use the endpoint information in the spec and assume these endpoints will always be present, with the properties listed, in any device matching the given vendor and product IDs. While that may have been true back then, with spoofing and fuzzing it is not true any more. More and more we are finding that those old drivers need to perform at least a minimum of checking before they try to use any endpoint other than ep0. To make this checking as simple as possible, we now add a couple of utility routines to the USB core. usb_check_bulk_endpoints() and usb_check_int_endpoints() take an interface pointer together with a list of endpoint addresses (numbers and directions). They check that the interface's current alternate setting includes endpoints with those addresses and that each of these endpoints has the right type: bulk or interrupt, respectively. Although we already have usb_find_common_endpoints() and related routines meant for a similar purpose, they are not well suited for this kind of checking. Those routines find endpoints of various kinds, but only one (either the first or the last) of each kind, and they don't verify that the endpoints' addresses agree with what the caller expects. In theory the new routines could be more general: They could take a particular altsetting as their argument instead of always using the interface's current altsetting. In practice I think this won't matter too much; multiple altsettings tend to be used for transferring media (audio or visual) over isochronous endpoints, not bulk or interrupt. Drivers for such devices will generally require more sophisticated checking than these simplistic routines provide. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/dd2c8e8c-2c87-44ea-ba17-c64b97e201c9@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30net: fix stack overflow when LRO is disabled for virtual interfacesTaehee Yoo
commit ae9b15fbe63447bc1d3bba3769f409d17ca6fdf6 upstream. When the virtual interface's feature is updated, it synchronizes the updated feature for its own lower interface. This propagation logic should be worked as the iteration, not recursively. But it works recursively due to the netdev notification unexpectedly. This problem occurs when it disables LRO only for the team and bonding interface type. team0 | +------+------+-----+-----+ | | | | | team1 team2 team3 ... team200 If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE event to its own lower interfaces(team1 ~ team200). It is worked by netdev_sync_lower_features(). So, the NETDEV_FEAT_CHANGE notification logic of each lower interface work iteratively. But generated NETDEV_FEAT_CHANGE event is also sent to the upper interface too. upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own lower interfaces again. lower and upper interfaces receive this event and generate this event again and again. So, the stack overflow occurs. But it is not the infinite loop issue. Because the netdev_sync_lower_features() updates features before generating the NETDEV_FEAT_CHANGE event. Already synchronized lower interfaces skip notification logic. So, it is just the problem that iteration logic is changed to the recursive unexpectedly due to the notification mechanism. Reproducer: ip link add team0 type team ethtool -K team0 lro on for i in {1..200} do ip link add team$i master team0 type team ethtool -K team$i lro on done ethtool -K team0 lro off In order to fix it, the notifier_ctx member of bonding/team is introduced. Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230517143010.3596250-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>