<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v5.15.8</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.8</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.8'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-12-14T09:57:15Z</updated>
<entry>
<title>wait: add wake_up_pollfree()</title>
<updated>2021-12-14T09:57:15Z</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2021-12-09T01:04:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1ebb6cd8c754bfe1a5f9539027980756bce7cb08'/>
<id>urn:sha1:1ebb6cd8c754bfe1a5f9539027980756bce7cb08</id>
<content type='text'>
commit 42288cb44c4b5fff7653bc392b583a2b8bd6a8c0 upstream.

Several -&gt;poll() implementations are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case.  This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution.  This solution is for the queue to be cleared
before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.

However, that has a bug: wake_up_poll() calls __wake_up() with
nr_exclusive=1.  Therefore, if there are multiple "exclusive" waiters,
and the wakeup function for the first one returns a positive value, only
that one will be called.  That's *not* what's needed for POLLFREE;
POLLFREE is special in that it really needs to wake up everyone.

Considering the three non-blocking poll systems:

- io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.

- aio poll is unaffected, since it doesn't support exclusive waits.
  However, that's fragile, as someone could add this feature later.

- epoll doesn't appear to be broken by this, since its wakeup function
  returns 0 when it sees POLLFREE.  But this is fragile.

Although there is a workaround (see epoll), it's better to define a
function which always sends POLLFREE to all waiters.  Add such a
function.  Also make it verify that the queue really becomes empty after
all waiters have been woken up.
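
For reference, a minimal sketch of what such a function could look
like, built on the existing __wake_up() and waitqueue_active() helpers
(the actual implementation may differ in detail):

	void __wake_up_pollfree(struct wait_queue_head *wq_head)
	{
		/* nr_exclusive == 0: wake *all* waiters, even exclusive ones */
		__wake_up(wq_head, TASK_NORMAL, 0,
			  poll_to_key(EPOLLHUP | POLLFREE));
		/* POLLFREE must have removed every waiter from the queue */
		WARN_ON_ONCE(waitqueue_active(wq_head));
	}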

Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>timers: implement usleep_idle_range()</title>
<updated>2021-12-14T09:57:11Z</updated>
<author>
<name>SeongJae Park</name>
<email>sj@kernel.org</email>
</author>
<published>2021-12-10T22:46:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5a960e533c2a0499f5f7ae6b6962fff847dacf74'/>
<id>urn:sha1:5a960e533c2a0499f5f7ae6b6962fff847dacf74</id>
<content type='text'>
commit e4779015fd5d2fb8390c258268addff24d6077c7 upstream.

Patch series "mm/damon: Fix fake /proc/loadavg reports", v3.

This patchset fixes DAMON's fake load report issue.  The first patch
adds yet another variant of usleep_range() for this fix, and the second
patch fixes the issue in DAMON by making it use the newly introduced
function.

This patch (of 2):

Some kernel threads such as DAMON may need to repeatedly sleep for
intervals on the order of microseconds.  However, because usleep_range()
sleeps in uninterruptible state, such threads make /proc/loadavg report
a fake load.

To help such cases, this commit implements a variant of usleep_range()
called usleep_idle_range().  It is the same as usleep_range(), but it
sets the state of the current task to TASK_IDLE while sleeping.
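
A minimal sketch of such a variant, assuming a usleep_range_state()
helper that takes the task state as a parameter (the exact factoring
in the final patch may differ):

	/* sleep for [min, max] microseconds in the given task state */
	void usleep_range_state(unsigned long min, unsigned long max,
				unsigned int state)
	{
		ktime_t exp = ktime_add_us(ktime_get(), min);
		u64 delta = (u64)(max - min) * NSEC_PER_USEC;

		for (;;) {
			__set_current_state(state);
			/* do not return before the requested time elapsed */
			if (!schedule_hrtimeout_range(&amp;exp, delta,
						      HRTIMER_MODE_ABS))
				break;
		}
	}

	/* same as usleep_range(), but does not count towards loadavg */
	static inline void usleep_idle_range(unsigned long min,
					     unsigned long max)
	{
		usleep_range_state(min, max, TASK_IDLE);
	}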

Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org
Signed-off-by: SeongJae Park &lt;sj@kernel.org&gt;
Suggested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bpf: Fix the off-by-two error in range markings</title>
<updated>2021-12-14T09:57:09Z</updated>
<author>
<name>Maxim Mikityanskiy</name>
<email>maximmi@nvidia.com</email>
</author>
<published>2021-11-30T18:16:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b4fb67fd1adf87be9e61486da8bddcc22e9c717c'/>
<id>urn:sha1:b4fb67fd1adf87be9e61486da8bddcc22e9c717c</id>
<content type='text'>
commit 2fa7d94afc1afbb4d702760c058dc2d7ed30f226 upstream.

The first commit cited below attempts to fix the off-by-one error that
appeared in some comparisons with an open range. Due to this error,
arithmetically equivalent pieces of code could get different verdicts
from the verifier, for example (pseudocode):

  // 1. Passes the verifier:
  if (data + 8 &gt; data_end)
      return early
  read *(u64 *)data, i.e. [data; data+7]

  // 2. Rejected by the verifier (should still pass):
  if (data + 7 &gt;= data_end)
      return early
  read *(u64 *)data, i.e. [data; data+7]

The attempted fix, however, shifts the range by one in the wrong
direction, so the bug not only remains, but such pieces of code also
start failing in the verifier:

  // 3. Rejected by the verifier, but the check is stricter than in #1.
  if (data + 8 &gt;= data_end)
      return early
  read *(u64 *)data, i.e. [data; data+7]

The change performed by that fix converted an off-by-one bug into an
off-by-two one. The second commit cited below added BPF selftests
written to ensure that code chunks like #3 are rejected; however, they
should be accepted.

This commit fixes the off-by-two error by adjusting new_range in the
right direction and fixes the tests by changing the range into the
one that should actually fail.
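
As a concrete illustration, pattern #2 above typically appears in an
XDP program roughly as follows (a hedged sketch; the function and
section names are illustrative):

	#include &lt;linux/bpf.h&gt;
	#include &lt;bpf/bpf_helpers.h&gt;

	SEC("xdp")
	int check_range(struct xdp_md *ctx)
	{
		void *data = (void *)(long)ctx-&gt;data;
		void *data_end = (void *)(long)ctx-&gt;data_end;

		/* pattern #2: same bound as 'if (data + 8 &gt; data_end)' */
		if (data + 7 &gt;= data_end)
			return XDP_DROP;

		/* 8-byte read of [data, data+7]; accepted once the
		 * range markings are fixed */
		return *(__u64 *)data ? XDP_PASS : XDP_DROP;
	}

	char _license[] SEC("license") = "GPL";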

Fixes: fb2a311a31d3 ("bpf: fix off by one for range markings with L{T, E} patterns")
Fixes: b37242c773b2 ("bpf: add test cases to bpf selftests to cover all access tests")
Signed-off-by: Maxim Mikityanskiy &lt;maximmi@nvidia.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20211130181607.593149-1-maximmi@nvidia.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Fix rq-&gt;uclamp_max not set on first enqueue</title>
<updated>2021-12-08T08:04:54Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2021-12-02T11:20:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a246d92dda9fa3ca3a79bd49d8b982b7c176b042'/>
<id>urn:sha1:a246d92dda9fa3ca3a79bd49d8b982b7c176b042</id>
<content type='text'>
[ Upstream commit 315c4f884800c45cb6bd8c90422fad554a8b9588 ]

Commit d81ae8aac85c ("sched/uclamp: Fix initialization of struct
uclamp_rq") introduced a bug where uclamp_max of the rq is not reset to
match the woken up task's uclamp_max when the rq is idle.

The code was relying on rq-&gt;uclamp_max being initialized to zero, so on
first enqueue

	static inline void uclamp_rq_inc_id(struct rq *rq, struct task_struct *p,
					    enum uclamp_id clamp_id)
	{
		...

		if (uc_se-&gt;value &gt; READ_ONCE(uc_rq-&gt;value))
			WRITE_ONCE(uc_rq-&gt;value, uc_se-&gt;value);
	}

was actually resetting it. But since commit d81ae8aac85c changed the
default to 1024, this no longer works. And since rq-&gt;uclamp_flags is
also initialized to 0, neither the above code path nor uclamp_idle_reset()
updates rq-&gt;uclamp_max on the first wake-up from idle.

This is only visible from the first wake-up(s) until the first dequeue
to idle after enabling the static key. And it only matters if the
uclamp_max of this task is &lt; 1024, since only then will its uclamp_max
be effectively ignored.

Fix it by properly initializing rq-&gt;uclamp_flags = UCLAMP_FLAG_IDLE to
ensure uclamp_idle_reset() is called, which will then update the rq's
uclamp_max value as expected.
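
A hedged sketch of that initialization (the body of init_uclamp_rq()
shown here is illustrative):

	static void init_uclamp_rq(struct rq *rq)
	{
		enum uclamp_id clamp_id;
		struct uclamp_rq *uc_rq = rq-&gt;uclamp;

		for_each_clamp_id(clamp_id) {
			uc_rq[clamp_id] = (struct uclamp_rq) {
				.value = uclamp_none(clamp_id)
			};
		}

		/* was 0; UCLAMP_FLAG_IDLE makes the first enqueue go
		 * through uclamp_idle_reset() */
		rq-&gt;uclamp_flags = UCLAMP_FLAG_IDLE;
	}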

Fixes: d81ae8aac85c ("sched/uclamp: Fix initialization of struct uclamp_rq")
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;Valentin.Schneider@arm.com&gt;
Tested-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Link: https://lkml.kernel.org/r/20211202112033.1705279-1-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>preempt/dynamic: Fix setup_preempt_mode() return value</title>
<updated>2021-12-08T08:04:54Z</updated>
<author>
<name>Andrew Halaney</name>
<email>ahalaney@redhat.com</email>
</author>
<published>2021-12-03T23:32:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fcf714776066865b2969c12793a12f92d8ded342'/>
<id>urn:sha1:fcf714776066865b2969c12793a12f92d8ded342</id>
<content type='text'>
[ Upstream commit 9ed20bafc85806ca6c97c9128cec46c3ef80ae86 ]

__setup() callbacks expect 1 for success and 0 for failure. Correct the
usage here to reflect that.
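
A hedged sketch of the corrected handler (the error handling shown is
illustrative):

	static int __init setup_preempt_mode(char *str)
	{
		int mode = sched_dynamic_mode(str);

		if (mode &lt; 0) {
			pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
			return 0;	/* failure: unrecognized value */
		}

		sched_dynamic_update(mode);
		return 1;	/* success */
	}
	__setup("preempt=", setup_preempt_mode);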

Fixes: 826bfeb37bb4 ("preempt/dynamic: Support dynamic preempt with preempt= boot option")
Reported-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Andrew Halaney &lt;ahalaney@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20211203233203.133581-1-ahalaney@redhat.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tracing/histograms: String compares should not care about signed values</title>
<updated>2021-12-08T08:04:44Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-11-30T17:31:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7b4f571d580caa47d8261e3ef999dfb401a7b72'/>
<id>urn:sha1:f7b4f571d580caa47d8261e3ef999dfb401a7b72</id>
<content type='text'>
commit 450fec13d9170127678f991698ac1a5b05c02e2f upstream.

When comparing two strings for the "onmatch" histogram trigger, fields
that are strings use string comparisons, which do not care about being
signed or not.

Do not fail to match two string fields if one is an unsigned char array
and the other a signed char array.
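
One way to express that (a hypothetical helper, not necessarily the
shape of the actual fix) is to strip a leading "unsigned " from both
type strings before comparing them:

	/* hypothetical: "char[16]" and "unsigned char[16]" match */
	static bool string_types_match(const char *t1, const char *t2)
	{
		if (!strncmp(t1, "unsigned ", 9))
			t1 += 9;
		if (!strncmp(t2, "unsigned ", 9))
			t2 += 9;

		return strcmp(t1, t2) == 0;
	}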

Link: https://lore.kernel.org/all/20211129123043.5cfd687a@gandalf.local.home/

Cc: stable@vger.kernel.org
Cc: Tom Zanussi &lt;zanussi@kernel.org&gt;
Cc: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Fixes: b05e89ae7cf3b ("tracing: Accept different type for synthetic event fields")
Reviewed-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Reported-by: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kprobes: Limit max data_size of the kretprobe instances</title>
<updated>2021-12-08T08:04:41Z</updated>
<author>
<name>Masami Hiramatsu</name>
<email>mhiramat@kernel.org</email>
</author>
<published>2021-12-01T14:45:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=16ccd481e3d8773065240b2b4e6e21853fe86b1d'/>
<id>urn:sha1:16ccd481e3d8773065240b2b4e6e21853fe86b1d</id>
<content type='text'>
commit 6bbfa44116689469267f1a6e3d233b52114139d2 upstream.

The 'kretprobe::data_size' is unsigned, thus it cannot be negative.  But
if a user sets it to a big enough number (e.g. (size_t)-8), the result of
'data_size + sizeof(struct kretprobe_instance)' wraps around and becomes
smaller than sizeof(struct kretprobe_instance), or even zero.  As a
result, the kretprobe_instance is allocated without enough memory, and
the kretprobe accesses memory outside of the allocation.

To avoid this issue, introduce a maximum limit on
kretprobe::data_size.  4KB per instance should be OK.
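
A hedged sketch of such a limit check in register_kretprobe() (the
constant name is illustrative):

	#define KRETPROBE_MAX_DATA_SIZE	4096

	/* in register_kretprobe(), before sizing the instance allocation */
	if (rp-&gt;data_size &gt; KRETPROBE_MAX_DATA_SIZE)
		return -E2BIG;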

Link: https://lkml.kernel.org/r/163836995040.432120.10322772773821182925.stgit@devnote2

Cc: stable@vger.kernel.org
Fixes: f47cd9b553aa ("kprobes: kretprobe user entry-handler")
Reported-by: zhangyue &lt;zhangyue1@kylinos.cn&gt;
Signed-off-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tracing: Don't use out-of-sync va_list in event printing</title>
<updated>2021-12-08T08:04:41Z</updated>
<author>
<name>Nikita Yushchenko</name>
<email>nikita.yushchenko@virtuozzo.com</email>
</author>
<published>2021-11-18T14:55:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=859ea5a20ee718093ceb28b8a39916c908be7d64'/>
<id>urn:sha1:859ea5a20ee718093ceb28b8a39916c908be7d64</id>
<content type='text'>
[ Upstream commit 2ef75e9bd2c998f1c6f6f23a3744136105ddefd5 ]

If trace_seq becomes full, trace_seq_vprintf() no longer consumes
arguments from va_list, making va_list out of sync with format
processing by trace_check_vprintf().

This causes va_arg() in trace_check_vprintf() to return the wrong
positional argument, which results in a WARN_ON_ONCE() hit.

ftrace_stress_test from LTP triggers this situation.

Fix it by explicitly avoiding further use of the va_list at the point
when its consistency can no longer be guaranteed.
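
A hedged sketch of that bail-out inside trace_check_vprintf(), right
after the trace_seq_vprintf() call that may have filled the buffer:

	trace_seq_vprintf(&amp;iter-&gt;seq, iter-&gt;fmt, ap);

	/*
	 * If iter-&gt;seq is full, the above call no longer guarantees
	 * that ap is in sync with fmt processing, and further va_arg()
	 * calls could return the wrong positional argument.  Ensure ap
	 * is not used again.
	 */
	if (iter-&gt;seq.full) {
		p = "";
		break;
	}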

Link: https://lkml.kernel.org/r/20211118145516.13219-1-nikita.yushchenko@virtuozzo.com

Signed-off-by: Nikita Yushchenko &lt;nikita.yushchenko@virtuozzo.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tracing: Check pid filtering when creating events</title>
<updated>2021-12-01T08:04:55Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-11-26T18:35:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c9c8c054a01ca6259cd380641cd21bfce791d124'/>
<id>urn:sha1:c9c8c054a01ca6259cd380641cd21bfce791d124</id>
<content type='text'>
commit 6cb206508b621a9a0a2c35b60540e399225c8243 upstream.

When pid filtering is activated in an instance, all of the event trace
files for that instance have the PID_FILTER flag set. This flag
determines whether or not pid filtering needs to be done on the event;
otherwise, the event is executed as normal.

If pid filtering is enabled when an event is created (via a dynamic
event or a module), its flag is not updated to reflect the current
state, and the events are not filtered properly.
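
A hedged sketch of the corresponding check in trace_create_new_event()
(field and helper names as found in the 5.15 tracing code, to the best
of our knowledge):

	struct trace_pid_list *pid_list, *no_pid_list;

	pid_list = rcu_dereference_protected(tr-&gt;filtered_pids,
					lockdep_is_held(&amp;event_mutex));
	no_pid_list = rcu_dereference_protected(tr-&gt;filtered_no_pids,
					lockdep_is_held(&amp;event_mutex));

	/* the instance is already filtering: new events must too */
	if (pid_list || no_pid_list)
		file-&gt;flags |= EVENT_FILE_FL_PID_FILTER;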

Cc: stable@vger.kernel.org
Fixes: 3fdaf80f4a836 ("tracing: Implement event pid filtering")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sched/scs: Reset task stack state in bringup_cpu()</title>
<updated>2021-12-01T08:04:54Z</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2021-11-23T11:40:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=229c555260cb9c1ccdab861e16f0410f1718f302'/>
<id>urn:sha1:229c555260cb9c1ccdab861e16f0410f1718f302</id>
<content type='text'>
[ Upstream commit dce1ca0525bfdc8a69a9343bc714fbc19a2f04b3 ]

To hot unplug a CPU, the idle task on that CPU calls a few layers of C
code before finally leaving the kernel. When KASAN is in use, poisoned
shadow is left around for each of the active stack frames, and when
shadow call stacks (SCS) are in use, the task's saved SCS SP is left
pointing at an arbitrary point within the task's shadow call stack.

When a CPU is offlined then onlined back into the kernel, this stale
state can adversely affect execution. Stale KASAN shadow can alias new
stackframes and result in bogus KASAN warnings. A stale SCS SP is
effectively a memory leak, and prevents a portion of the shadow call
stack being used. Across a number of hotplug cycles the idle task's
entire shadow call stack can become unusable.

We previously fixed the KASAN issue in commit:

  e1b77c92981a5222 ("sched/kasan: remove stale KASAN poison after hotplug")

... by removing any stale KASAN stack poison immediately prior to
onlining a CPU.

Subsequently in commit:

  f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled")

... the refactoring left the KASAN and SCS cleanup in one-time idle
thread initialization code rather than something invoked prior to each
CPU being onlined, breaking both as above.

We fixed SCS (but not KASAN) in commit:

  63acd42c0d4942f7 ("sched/scs: Reset the shadow stack when idle_task_exit")

... but as this runs in the context of the idle task being offlined it's
potentially fragile.

To fix these consistently and more robustly, reset the SCS SP and KASAN
shadow of a CPU's idle task immediately before we online that CPU in
bringup_cpu(). This ensures the idle task always has a consistent state
when it is running, and removes the need to do so when exiting an idle
task.
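
A hedged sketch of that reset in bringup_cpu() (kernel/cpu.c); the
scs_task_reset() and kasan_unpoison_task_stack() helpers already exist:

	static int bringup_cpu(unsigned int cpu)
	{
		struct task_struct *idle = idle_thread_get(cpu);

		/*
		 * Reset stale stack state from the last time this CPU
		 * was online.
		 */
		scs_task_reset(idle);
		kasan_unpoison_task_stack(idle);

		...
	}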

Whenever any thread is created, dup_task_struct() will give the task a
stack which is free of KASAN shadow, and initialize the task's SCS SP,
so there's no need to specially initialize either for the idle thread
within init_idle(), as this was only necessary to handle hotplug cycles.

I've tested this on arm64 with:

* gcc 11.1.0, defconfig +KASAN_INLINE, KASAN_STACK
* clang 12.0.0, defconfig +KASAN_INLINE, KASAN_STACK, SHADOW_CALL_STACK

... offlining and onlining CPUS with:

| while true; do
|   for C in /sys/devices/system/cpu/cpu*/online; do
|     echo 0 &gt; $C;
|     echo 1 &gt; $C;
|   done
| done

Fixes: f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Qian Cai &lt;quic_qiancai@quicinc.com&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Tested-by: Qian Cai &lt;quic_qiancai@quicinc.com&gt;
Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
