<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v6.1.53</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.53</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.53'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2023-09-13T07:43:05Z</updated>
<entry>
<title>tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY</title>
<updated>2023-09-13T07:43:05Z</updated>
<author>
<name>Brian Foster</name>
<email>bfoster@redhat.com</email>
</author>
<published>2023-08-31T12:55:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1dd387668d5bd911df0b21716fff6f5180b11e42'/>
<id>urn:sha1:1dd387668d5bd911df0b21716fff6f5180b11e42</id>
<content type='text'>
commit 3d07fa1dd19035eb0b13ae6697efd5caa9033e74 upstream.

The pipe cpumask used to serialize opens between the main and percpu
trace pipes is not zeroed or initialized. This can result in
spurious -EBUSY returns if underlying memory is not fully zeroed.
This has been observed by immediate failure to read the main
trace_pipe file on an otherwise newly booted and idle system:

 # cat /sys/kernel/debug/tracing/trace_pipe
 cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy

Zero the allocation of pipe_cpumask to avoid the problem.

Link: https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com

Cc: stable@vger.kernel.org
Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
Reviewed-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix issue in verifying allow_ptr_leaks</title>
<updated>2023-09-13T07:43:02Z</updated>
<author>
<name>Yafang Shao</name>
<email>laoar.shao@gmail.com</email>
</author>
<published>2023-08-23T02:07:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c96c67991aac6401b4c6996093bccb704bb2ea4b'/>
<id>urn:sha1:c96c67991aac6401b4c6996093bccb704bb2ea4b</id>
<content type='text'>
commit d75e30dddf73449bc2d10bb8e2f1a2c446bc67a2 upstream.

After we converted the capabilities of our networking-bpf program from
cap_sys_admin to cap_net_admin+cap_bpf, our networking-bpf program
failed to start. Because it failed the bpf verifier, and the error log
is "R3 pointer comparison prohibited".

A simple reproducer as follows,

SEC("cls-ingress")
int ingress(struct __sk_buff *skb)
{
	struct iphdr *iph = (void *)(long)skb-&gt;data + sizeof(struct ethhdr);

	if ((long)(iph + 1) &gt; (long)skb-&gt;data_end)
		return TC_ACT_STOLEN;
	return TC_ACT_OK;
}

Per discussion with Yonghong and Alexei [1], comparison of two packet
pointers is not a pointer leak. This patch fixes it.

Our local kernel is 6.1.y and we expect this fix to be backported to
6.1.y, so stable is CCed.

[1]. https://lore.kernel.org/bpf/CAADnVQ+Nmspr7Si+pxWn8zkE7hX-7s93ugwC+94aXSy4uQ9vBg@mail.gmail.com/

Suggested-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Suggested-by: Alexei Starovoitov &lt;alexei.starovoitov@gmail.com&gt;
Signed-off-by: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230823020703.3790-2-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cpu/hotplug: Prevent self deadlock on CPU hot-unplug</title>
<updated>2023-09-13T07:43:00Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2023-08-23T08:47:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fea9dd8653ff39ce383c54e747bde4c39289b4ad'/>
<id>urn:sha1:fea9dd8653ff39ce383c54e747bde4c39289b4ad</id>
<content type='text'>
commit 2b8272ff4a70b866106ae13c36be7ecbef5d5da2 upstream.

Xiongfeng reported and debugged a self deadlock of the task which initiates
and controls a CPU hot-unplug operation vs. the CFS bandwidth timer.

    CPU1      			                 	 CPU2

T1 sets cfs_quota
   starts hrtimer cfs_bandwidth 'period_timer'
T1 is migrated to CPU2
						T1 initiates offlining of CPU1
Hotplug operation starts
  ...
'period_timer' expires and is re-enqueued on CPU1
  ...
take_cpu_down()
  CPU1 shuts down and does not handle timers
  anymore. They have to be migrated in the
  post dead hotplug steps by the control task.

						T1 runs the post dead offline operation
					      	T1 is scheduled out
						T1 waits for 'period_timer' to expire

T1 waits there forever if it is scheduled out before it can execute the hrtimer
offline callback hrtimers_dead_cpu().

Cure this by delegating the hotplug control operation to a worker thread on
an online CPU. This takes the initiating user space task, which might be
affected by the bandwidth timer, completely out of the picture.

Reported-by: Xiongfeng Wang &lt;wangxiongfeng2@huawei.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Yu Liao &lt;liaoyu15@huawei.com&gt;
Acked-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/lkml/8e785777-03aa-99e1-d20e-e956f5685be6@huawei.com
Link: https://lore.kernel.org/r/87h6oqdq0i.ffs@tglx
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>printk: ringbuffer: Fix truncating buffer size min_t cast</title>
<updated>2023-09-13T07:43:00Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-08-11T05:45:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2344b13976512fab5f7d2dcc6018ebdea215c98e'/>
<id>urn:sha1:2344b13976512fab5f7d2dcc6018ebdea215c98e</id>
<content type='text'>
commit 53e9e33ede37a247d926db5e4a9e56b55204e66c upstream.

If an output buffer size exceeded U16_MAX, the min_t(u16, ...) cast in
copy_data() was causing writes to truncate. This manifested as output
bytes being skipped, seen as %NUL bytes in pstore dumps when the available
record size was larger than 65536. Fix the cast to no longer truncate
the calculation.

Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: John Ogness &lt;john.ogness@linutronix.de&gt;
Reported-by: Vijay Balakrishna &lt;vijayb@linux.microsoft.com&gt;
Link: https://lore.kernel.org/lkml/d8bb1ec7-a4c5-43a2-9de0-9643a70b899f@linux.microsoft.com/
Fixes: b6cf8b3f3312 ("printk: add lockless ringbuffer")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Vijay Balakrishna &lt;vijayb@linux.microsoft.com&gt;
Tested-by: Guilherme G. Piccoli &lt;gpiccoli@igalia.com&gt; # Steam Deck
Reviewed-by: Tyler Hicks (Microsoft) &lt;code@tyhicks.com&gt;
Tested-by: Tyler Hicks (Microsoft) &lt;code@tyhicks.com&gt;
Reviewed-by: John Ogness &lt;john.ogness@linutronix.de&gt;
Reviewed-by: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Reviewed-by: Petr Mladek &lt;pmladek@suse.com&gt;
Signed-off-by: Petr Mladek &lt;pmladek@suse.com&gt;
Link: https://lore.kernel.org/r/20230811054528.never.165-kees@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tracing: Fix race issue between cpu buffer write and swap</title>
<updated>2023-09-13T07:42:57Z</updated>
<author>
<name>Zheng Yejian</name>
<email>zhengyejian1@huawei.com</email>
</author>
<published>2023-08-31T13:27:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=74c85396bd73eca80b96510b4edf93b9a3aff75f'/>
<id>urn:sha1:74c85396bd73eca80b96510b4edf93b9a3aff75f</id>
<content type='text'>
[ Upstream commit 3163f635b20e9e1fb4659e74f47918c9dddfe64e ]

Warning happened in rb_end_commit() at code:
	if (RB_WARN_ON(cpu_buffer, !local_read(&amp;cpu_buffer-&gt;committing)))

  WARNING: CPU: 0 PID: 139 at kernel/trace/ring_buffer.c:3142
	rb_commit+0x402/0x4a0
  Call Trace:
   ring_buffer_unlock_commit+0x42/0x250
   trace_buffer_unlock_commit_regs+0x3b/0x250
   trace_event_buffer_commit+0xe5/0x440
   trace_event_buffer_reserve+0x11c/0x150
   trace_event_raw_event_sched_switch+0x23c/0x2c0
   __traceiter_sched_switch+0x59/0x80
   __schedule+0x72b/0x1580
   schedule+0x92/0x120
   worker_thread+0xa0/0x6f0

It is because the race between writing event into cpu buffer and swapping
cpu buffer through file per_cpu/cpu0/snapshot:

  Write on CPU 0             Swap buffer by per_cpu/cpu0/snapshot on CPU 1
  --------                   --------
                             tracing_snapshot_write()
                               [...]

  ring_buffer_lock_reserve()
    cpu_buffer = buffer-&gt;buffers[cpu]; // 1. Suppose find 'cpu_buffer_a';
    [...]
    rb_reserve_next_event()
      [...]

                               ring_buffer_swap_cpu()
                                 if (local_read(&amp;cpu_buffer_a-&gt;committing))
                                     goto out_dec;
                                 if (local_read(&amp;cpu_buffer_b-&gt;committing))
                                     goto out_dec;
                                 buffer_a-&gt;buffers[cpu] = cpu_buffer_b;
                                 buffer_b-&gt;buffers[cpu] = cpu_buffer_a;
                                 // 2. cpu_buffer has swapped here.

      rb_start_commit(cpu_buffer);
      if (unlikely(READ_ONCE(cpu_buffer-&gt;buffer)
          != buffer)) { // 3. This check passed due to 'cpu_buffer-&gt;buffer'
        [...]           //    has not changed here.
        return NULL;
      }
                                 cpu_buffer_b-&gt;buffer = buffer_a;
                                 cpu_buffer_a-&gt;buffer = buffer_b;
                                 [...]

      // 4. Reserve event from 'cpu_buffer_a'.

  ring_buffer_unlock_commit()
    [...]
    cpu_buffer = buffer-&gt;buffers[cpu]; // 5. Now find 'cpu_buffer_b' !!!
    rb_commit(cpu_buffer)
      rb_end_commit()  // 6. WARN for the wrong 'committing' state !!!

Based on above analysis, we can easily reproduce by following testcase:
  ``` bash
  #!/bin/bash

  dmesg -n 7
  sysctl -w kernel.panic_on_warn=1
  TR=/sys/kernel/tracing
  echo 7 &gt; ${TR}/buffer_size_kb
  echo "sched:sched_switch" &gt; ${TR}/set_event
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  while [ true ]; do
          echo 1 &gt; ${TR}/per_cpu/cpu0/snapshot
  done &amp;
  ```

To fix it, IIUC, we can use smp_call_function_single() to do the swap on
the target cpu where the buffer is located, so that above race would be
avoided.

Link: https://lore.kernel.org/linux-trace-kernel/20230831132739.4070878-1-zhengyejian1@huawei.com

Cc: &lt;mhiramat@kernel.org&gt;
Fixes: f1affcaaa861 ("tracing: Add snapshot in the per_cpu trace directories")
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tracing: Remove extra space at the end of hwlat_detector/mode</title>
<updated>2023-09-13T07:42:57Z</updated>
<author>
<name>Mikhail Kobuk</name>
<email>m.kobuk@ispras.ru</email>
</author>
<published>2023-08-25T10:34:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb34716c9ee6700bc6f2b0fdc2a46a4c534b0c93'/>
<id>urn:sha1:fb34716c9ee6700bc6f2b0fdc2a46a4c534b0c93</id>
<content type='text'>
[ Upstream commit 2cf0dee989a8b2501929eaab29473b6b1fa11057 ]

Space is printed after each mode value including the last one:
$ echo \"$(sudo cat /sys/kernel/tracing/hwlat_detector/mode)\"
"none [round-robin] per-cpu "

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lore.kernel.org/linux-trace-kernel/20230825103432.7750-1-m.kobuk@ispras.ru

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 8fa826b7344d ("trace/hwlat: Implement the mode config option")
Signed-off-by: Mikhail Kobuk &lt;m.kobuk@ispras.ru&gt;
Reviewed-by: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Acked-by: Daniel Bristot de Oliveira &lt;bristot@kernel.org&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tick/rcu: Fix false positive "softirq work is pending" messages</title>
<updated>2023-09-13T07:42:57Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2023-08-18T20:07:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55a448e8d86342a09e207c58675b65213509db2f'/>
<id>urn:sha1:55a448e8d86342a09e207c58675b65213509db2f</id>
<content type='text'>
[ Upstream commit 96c1fa04f089a7e977a44e4e8fdc92e81be20bef ]

In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle") the
new function report_idle_softirq() was created by breaking code out of the
existing can_stop_idle_tick() for kernels v5.18 and newer.

In doing so, the code essentially went from a one conditional:

	if (a &amp;&amp; b &amp;&amp; c)
		warn();

to a three conditional:

	if (!a)
		return;
	if (!b)
		return;
	if (!c)
		return;
	warn();

But that conversion got the condition for the RT specific
local_bh_blocked() wrong. The original condition was:

   	!local_bh_blocked()

but the conversion failed to negate it so it ended up as:

        if (!local_bh_blocked())
		return false;

This issue lay dormant until another fixup for the same commit was added
in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
This commit realized the ratelimit was essentially set to zero instead
of ten, and hence *no* softirq pending messages would ever be issued.

Once this commit was backported via linux-stable, both the v6.1 and v6.4
preempt-rt kernels started printing out 10 instances of this at boot:

  NOHZ tick-stop error: local softirq work is pending, handler #80!!!

Remove the negation and return when local_bh_blocked() evaluates to true to
bring the correct behaviour back.

Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Ahmad Fatoum &lt;a.fatoum@pengutronix.de&gt;
Reviewed-by: Wen Yang &lt;wenyang.linux@foxmail.com&gt;
Acked-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Link: https://lore.kernel.org/r/20230818200757.1808398-1-paul.gortmaker@windriver.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup:namespace: Remove unused cgroup_namespaces_init()</title>
<updated>2023-09-13T07:42:56Z</updated>
<author>
<name>Lu Jialin</name>
<email>lujialin4@huawei.com</email>
</author>
<published>2023-08-10T11:25:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=848cd6f24aa72fc4f7cc2dbca6edb41edf059de5'/>
<id>urn:sha1:848cd6f24aa72fc4f7cc2dbca6edb41edf059de5</id>
<content type='text'>
[ Upstream commit 82b90b6c5b38e457c7081d50dff11ecbafc1e61a ]

cgroup_namspace_init() just return 0. Therefore, there is no need to
call it during start_kernel. Just remove it.

Fixes: a79a908fd2b0 ("cgroup: introduce cgroup namespaces")
Signed-off-by: Lu Jialin &lt;lujialin4@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup/cpuset: Inherit parent's load balance state in v2</title>
<updated>2023-09-13T07:42:49Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2023-06-27T14:35:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8199a46af2ea7b65a0ad24db3fdee1b7034005c1'/>
<id>urn:sha1:8199a46af2ea7b65a0ad24db3fdee1b7034005c1</id>
<content type='text'>
[ Upstream commit c8c926200c55454101f072a4b16c9ff5b8c9e56f ]

Since commit f28e22441f35 ("cgroup/cpuset: Add a new isolated
cpus.partition type"), the CS_SCHED_LOAD_BALANCE bit of a v2 cpuset
can be on or off. The child cpusets of a partition root must have the
same setting as its parent or it may screw up the rebuilding of sched
domains. Fix this problem by making sure the a child v2 cpuset will
follows its parent cpuset load balance state unless the child cpuset
is a new partition root itself.

Fixes: f28e22441f35 ("cgroup/cpuset: Add a new isolated cpus.partition type")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: Allow drivers to request exclusive config regions</title>
<updated>2023-09-13T07:42:46Z</updated>
<author>
<name>Ira Weiny</name>
<email>ira.weiny@intel.com</email>
</author>
<published>2022-09-26T21:57:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3108f7c7888401c8afdae2605705fcb3f81ff631'/>
<id>urn:sha1:3108f7c7888401c8afdae2605705fcb3f81ff631</id>
<content type='text'>
[ Upstream commit 278294798ac9118412c9624a801d3f20f2279363 ]

PCI config space access from user space has traditionally been
unrestricted with writes being an understood risk for device operation.

Unfortunately, device breakage or odd behavior from config writes lacks
indicators that can leave driver writers confused when evaluating
failures.  This is especially true with the new PCIe Data Object
Exchange (DOE) mailbox protocol where backdoor shenanigans from user
space through things such as vendor defined protocols may affect device
operation without complete breakage.

A prior proposal restricted read and writes completely.[1]  Greg and
Bjorn pointed out that proposal is flawed for a couple of reasons.
First, lspci should always be allowed and should not interfere with any
device operation.  Second, setpci is a valuable tool that is sometimes
necessary and it should not be completely restricted.[2]  Finally
methods exist for full lock of device access if required.

Even though access should not be restricted it would be nice for driver
writers to be able to flag critical parts of the config space such that
interference from user space can be detected.

Introduce pci_request_config_region_exclusive() to mark exclusive config
regions.  Such regions trigger a warning and kernel taint if accessed
via user space.

Create pci_warn_once() to restrict the user from spamming the log.

[1] https://lore.kernel.org/all/161663543465.1867664.5674061943008380442.stgit@dwillia2-desk3.amr.corp.intel.com/
[2] https://lore.kernel.org/all/YF8NGeGv9vYcMfTV@kroah.com/

Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Suggested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Link: https://lore.kernel.org/r/20220926215711.2893286-2-ira.weiny@intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Stable-dep-of: 5e70d0acf082 ("PCI: Add locking to RMW PCI Express Capability Register accessors")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
