<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v5.10.83</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.83</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.83'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-12-01T08:19:09Z</updated>
<entry>
<title>tracing: Check pid filtering when creating events</title>
<updated>2021-12-01T08:19:09Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-11-26T18:35:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=406f2d5fe368d440fd4e262188a6640f92804c5d'/>
<id>urn:sha1:406f2d5fe368d440fd4e262188a6640f92804c5d</id>
<content type='text'>
commit 6cb206508b621a9a0a2c35b60540e399225c8243 upstream.

When pid filtering is activated in an instance, all of the events trace
files for that instance has the PID_FILTER flag set. This determines
whether or not pid filtering needs to be done on the event, otherwise the
event is executed as normal.

If pid filtering is enabled when an event is created (via a dynamic event
or modules), its flag is not updated to reflect the current state, and the
events are not filtered properly.

Cc: stable@vger.kernel.org
Fixes: 3fdaf80f4a836 ("tracing: Implement event pid filtering")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sched/scs: Reset task stack state in bringup_cpu()</title>
<updated>2021-12-01T08:19:08Z</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2021-11-23T11:40:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e6ee7abd6bfe559ad9989004b34c320fd638c526'/>
<id>urn:sha1:e6ee7abd6bfe559ad9989004b34c320fd638c526</id>
<content type='text'>
[ Upstream commit dce1ca0525bfdc8a69a9343bc714fbc19a2f04b3 ]

To hot unplug a CPU, the idle task on that CPU calls a few layers of C
code before finally leaving the kernel. When KASAN is in use, poisoned
shadow is left around for each of the active stack frames, and when
shadow call stacks are in use. When shadow call stacks (SCS) are in use
the task's saved SCS SP is left pointing at an arbitrary point within
the task's shadow call stack.

When a CPU is offlined than onlined back into the kernel, this stale
state can adversely affect execution. Stale KASAN shadow can alias new
stackframes and result in bogus KASAN warnings. A stale SCS SP is
effectively a memory leak, and prevents a portion of the shadow call
stack being used. Across a number of hotplug cycles the idle task's
entire shadow call stack can become unusable.

We previously fixed the KASAN issue in commit:

  e1b77c92981a5222 ("sched/kasan: remove stale KASAN poison after hotplug")

... by removing any stale KASAN stack poison immediately prior to
onlining a CPU.

Subsequently in commit:

  f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled")

... the refactoring left the KASAN and SCS cleanup in one-time idle
thread initialization code rather than something invoked prior to each
CPU being onlined, breaking both as above.

We fixed SCS (but not KASAN) in commit:

  63acd42c0d4942f7 ("sched/scs: Reset the shadow stack when idle_task_exit")

... but as this runs in the context of the idle task being offlined it's
potentially fragile.

To fix these consistently and more robustly, reset the SCS SP and KASAN
shadow of a CPU's idle task immediately before we online that CPU in
bringup_cpu(). This ensures the idle task always has a consistent state
when it is running, and removes the need to so so when exiting an idle
task.

Whenever any thread is created, dup_task_struct() will give the task a
stack which is free of KASAN shadow, and initialize the task's SCS SP,
so there's no need to specially initialize either for idle thread within
init_idle(), as this was only necessary to handle hotplug cycles.

I've tested this on arm64 with:

* gcc 11.1.0, defconfig +KASAN_INLINE, KASAN_STACK
* clang 12.0.0, defconfig +KASAN_INLINE, KASAN_STACK, SHADOW_CALL_STACK

... offlining and onlining CPUS with:

| while true; do
|   for C in /sys/devices/system/cpu/cpu*/online; do
|     echo 0 &gt; $C;
|     echo 1 &gt; $C;
|   done
| done

Fixes: f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Qian Cai &lt;quic_qiancai@quicinc.com&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Tested-by: Qian Cai &lt;quic_qiancai@quicinc.com&gt;
Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@arm.com/
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PM: hibernate: use correct mode for swsusp_close()</title>
<updated>2021-12-01T08:19:06Z</updated>
<author>
<name>Thomas Zeitlhofer</name>
<email>thomas.zeitlhofer+lkml@ze-it.at</email>
</author>
<published>2021-11-23T19:18:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f634c755a0ee16232a450406cbd6266f0f500d2d'/>
<id>urn:sha1:f634c755a0ee16232a450406cbd6266f0f500d2d</id>
<content type='text'>
[ Upstream commit cefcf24b4d351daf70ecd945324e200d3736821e ]

Commit 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in
swsusp_check()") changed the opening mode of the block device to
(FMODE_READ | FMODE_EXCL).

In the corresponding calls to swsusp_close(), the mode is still just
FMODE_READ which triggers the warning in blkdev_flush_mapping() on
resume from hibernate.

So, use the mode (FMODE_READ | FMODE_EXCL) also when closing the
device.

Fixes: 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
Signed-off-by: Thomas Zeitlhofer &lt;thomas.zeitlhofer+lkml@ze-it.at&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tracing: Fix pid filtering when triggers are attached</title>
<updated>2021-12-01T08:19:01Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-11-26T22:34:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=092a58f0d9ef5013ad3c197a9a5ecf2d7f797a88'/>
<id>urn:sha1:092a58f0d9ef5013ad3c197a9a5ecf2d7f797a88</id>
<content type='text'>
commit a55f224ff5f238013de8762c4287117e47b86e22 upstream.

If a event is filtered by pid and a trigger that requires processing of
the event to happen is a attached to the event, the discard portion does
not take the pid filtering into account, and the event will then be
recorded when it should not have been.

Cc: stable@vger.kernel.org
Fixes: 3fdaf80f4a836 ("tracing: Implement event pid filtering")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tracing/uprobe: Fix uprobe_perf_open probes iteration</title>
<updated>2021-12-01T08:19:01Z</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2021-11-23T14:28:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=68fa6bf7f1798edd07fbfc9865a7e9b92dfbef4c'/>
<id>urn:sha1:68fa6bf7f1798edd07fbfc9865a7e9b92dfbef4c</id>
<content type='text'>
commit 1880ed71ce863318c1ce93bf324876fb5f92854f upstream.

Add missing 'tu' variable initialization in the probes loop,
otherwise the head 'tu' is used instead of added probes.

Link: https://lkml.kernel.org/r/20211123142801.182530-1-jolsa@kernel.org

Cc: stable@vger.kernel.org
Fixes: 99c9a923e97a ("tracing/uprobe: Fix double perf_event linking on multiprobe uprobe")
Acked-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix toctou on read-only map's constant scalar tracking</title>
<updated>2021-12-01T08:18:58Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2021-11-09T18:48:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33fe044f6a9e8977686a6a09f0bf33e5cc75257e'/>
<id>urn:sha1:33fe044f6a9e8977686a6a09f0bf33e5cc75257e</id>
<content type='text'>
commit 353050be4c19e102178ccc05988101887c25ae53 upstream.

Commit a23740ec43ba ("bpf: Track contents of read-only maps as scalars") is
checking whether maps are read-only both from BPF program side and user space
side, and then, given their content is constant, reading out their data via
map-&gt;ops-&gt;map_direct_value_addr() which is then subsequently used as known
scalar value for the register, that is, it is marked as __mark_reg_known()
with the read value at verification time. Before a23740ec43ba, the register
content was marked as an unknown scalar so the verifier could not make any
assumptions about the map content.

The current implementation however is prone to a TOCTOU race, meaning, the
value read as known scalar for the register is not guaranteed to be exactly
the same at a later point when the program is executed, and as such, the
prior made assumptions of the verifier with regards to the program will be
invalid which can cause issues such as OOB access, etc.

While the BPF_F_RDONLY_PROG map flag is always fixed and required to be
specified at map creation time, the map-&gt;frozen property is initially set to
false for the map given the map value needs to be populated, e.g. for global
data sections. Once complete, the loader "freezes" the map from user space
such that no subsequent updates/deletes are possible anymore. For the rest
of the lifetime of the map, this freeze one-time trigger cannot be undone
anymore after a successful BPF_MAP_FREEZE cmd return. Meaning, any new BPF_*
cmd calls which would update/delete map entries will be rejected with -EPERM
since map_get_sys_perms() removes the FMODE_CAN_WRITE permission. This also
means that pending update/delete map entries must still complete before this
guarantee is given. This corner case is not an issue for loaders since they
create and prepare such program private map in successive steps.

However, a malicious user is able to trigger this TOCTOU race in two different
ways: i) via userfaultfd, and ii) via batched updates. For i) userfaultfd is
used to expand the competition interval, so that map_update_elem() can modify
the contents of the map after map_freeze() and bpf_prog_load() were executed.
This works, because userfaultfd halts the parallel thread which triggered a
map_update_elem() at the time where we copy key/value from the user buffer and
this already passed the FMODE_CAN_WRITE capability test given at that time the
map was not "frozen". Then, the main thread performs the map_freeze() and
bpf_prog_load(), and once that had completed successfully, the other thread
is woken up to complete the pending map_update_elem() which then changes the
map content. For ii) the idea of the batched update is similar, meaning, when
there are a large number of updates to be processed, it can increase the
competition interval between the two. It is therefore possible in practice to
modify the contents of the map after executing map_freeze() and bpf_prog_load().

One way to fix both i) and ii) at the same time is to expand the use of the
map's map-&gt;writecnt. The latter was introduced in fc9702273e2e ("bpf: Add mmap()
support for BPF_MAP_TYPE_ARRAY") and further refined in 1f6cb19be2e2 ("bpf:
Prevent re-mmap()'ing BPF map as writable for initially r/o mapping") with
the rationale to make a writable mmap()'ing of a map mutually exclusive with
read-only freezing. The counter indicates writable mmap() mappings and then
prevents/fails the freeze operation. Its semantics can be expanded beyond
just mmap() by generally indicating ongoing write phases. This would essentially
span any parallel regular and batched flavor of update/delete operation and
then also have map_freeze() fail with -EBUSY. For the check_mem_access() in
the verifier we expand upon the bpf_map_is_rdonly() check ensuring that all
last pending writes have completed via bpf_map_write_active() test. Once the
map-&gt;frozen is set and bpf_map_write_active() indicates a map-&gt;writecnt of 0
only then we are really guaranteed to use the map's data as known constants.
For map-&gt;frozen being set and pending writes in process of still being completed
we fall back to marking that register as unknown scalar so we don't end up
making assumptions about it. With this, both TOCTOU reproducers from i) and
ii) are fixed.

Note that the map-&gt;writecnt has been converted into a atomic64 in the fix in
order to avoid a double freeze_mutex mutex_{un,}lock() pair when updating
map-&gt;writecnt in the various map update/delete BPF_* cmd flavors. Spanning
the freeze_mutex over entire map update/delete operations in syscall side
would not be possible due to then causing everything to be serialized.
Similarly, something like synchronize_rcu() after setting map-&gt;frozen to wait
for update/deletes to complete is not possible either since it would also
have to span the user copy which can sleep. On the libbpf side, this won't
break d66562fba1ce ("libbpf: Add BPF object skeleton support") as the
anonymous mmap()-ed "map initialization image" is remapped as a BPF map-backed
mmap()-ed memory where for .rodata it's non-writable.

Fixes: a23740ec43ba ("bpf: Track contents of read-only maps as scalars")
Reported-by: w1tcher.bupt@gmail.com
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
[fix conflict to call bpf_map_write_active_dec() in err_put block.
fix conflict to insert new functions after find_and_alloc_map().]
Reference: CVE-2021-4001
Signed-off-by: Masami Ichikawa(CIP) &lt;masami.ichikawa@cybertrust.co.jp&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "perf: Rework perf_event_exit_event()"</title>
<updated>2021-11-26T09:39:22Z</updated>
<author>
<name>Sasha Levin</name>
<email>sashal@kernel.org</email>
</author>
<published>2021-11-25T00:18:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d35250ec5a23771187c85a46e6812d5943b5c13e'/>
<id>urn:sha1:d35250ec5a23771187c85a46e6812d5943b5c13e</id>
<content type='text'>
This reverts commit 94902ee2996a7f71471138093495df452dab87b6 which is
upstream commit ef54c1a476aef7eef26fe13ea10dc090952c00f8.

Reverting for now due to issues that need to get fixed upstream.

Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tracing: Add length protection to histogram string copies</title>
<updated>2021-11-26T09:39:14Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-11-14T18:28:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3984876f91a3da423fa27c3f48b568a58f4a65d8'/>
<id>urn:sha1:3984876f91a3da423fa27c3f48b568a58f4a65d8</id>
<content type='text'>
[ Upstream commit 938aa33f14657c9ed9deea348b7d6f14b6d69cb7 ]

The string copies to the histogram storage has a max size of 256 bytes
(defined by MAX_FILTER_STR_VAL). Only the string size of the event field
needs to be copied to the event storage, but no more than what is in the
event storage. Although nothing should be bigger than 256 bytes, there's
no protection against overwriting of the storage if one day there is.

Copy no more than the destination size, and enforce it.

Also had to turn MAX_FILTER_STR_VAL into an unsigned int, to keep the
min() comparison of the string sizes of comparable types.

Link: https://lore.kernel.org/all/CAHk-=wjREUihCGrtRBwfX47y_KrLCGjiq3t6QtoNJpmVrAEb1w@mail.gmail.com/
Link: https://lkml.kernel.org/r/20211114132834.183429a4@rorschach.local.home

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Tom Zanussi &lt;zanussi@kernel.org&gt;
Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reviewed-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Fixes: 63f84ae6b82b ("tracing/histogram: Do not copy the fixed-size char array field over the field size")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tracing/histogram: Do not copy the fixed-size char array field over the field size</title>
<updated>2021-11-26T09:39:13Z</updated>
<author>
<name>Masami Hiramatsu</name>
<email>mhiramat@kernel.org</email>
</author>
<published>2021-11-12T16:02:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5b2f2cbbc925ac5ce72547a163a69fa90a67e1d0'/>
<id>urn:sha1:5b2f2cbbc925ac5ce72547a163a69fa90a67e1d0</id>
<content type='text'>
[ Upstream commit 63f84ae6b82bb4dff672f76f30c6fd7b9d3766bc ]

Do not copy the fixed-size char array field of the events over
the field size. The histogram treats char array as a string and
there are 2 types of char array in the event, fixed-size and
dynamic string. The dynamic string (__data_loc) field must be
null terminated, but the fixed-size char array field may not
be null terminated (not a string, but just a data).
In that case, histogram can copy the data after the field.
This uses the original field size for fixed-size char array
field to restrict the histogram not to access over the original
field size.

Link: https://lkml.kernel.org/r/163673292822.195747.3696966210526410250.stgit@devnote2

Fixes: 02205a6752f2 (tracing: Add support for 'field variables')
Signed-off-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()</title>
<updated>2021-11-26T09:39:13Z</updated>
<author>
<name>Vincent Donnefort</name>
<email>vincent.donnefort@arm.com</email>
</author>
<published>2021-11-04T17:51:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=68fcb52b610c1853cadf3d135415555a128355f9'/>
<id>urn:sha1:68fcb52b610c1853cadf3d135415555a128355f9</id>
<content type='text'>
[ Upstream commit 42dc938a590c96eeb429e1830123fef2366d9c80 ]

Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:

              CPU1                            CPU2
  ==================================================================

  per_cpu(sd_llc_id, CPUX) =&gt; 0
                                    partition_sched_domains_locked()
      				      detach_destroy_domains()
  cpus_share_cache(CPUX, CPUX)          update_top_cache_domain(CPUX)
    per_cpu(sd_llc_id, CPUX) =&gt; 0
                                          per_cpu(sd_llc_id, CPUX) = CPUX
    per_cpu(sd_llc_id, CPUX) =&gt; CPUX
    return false

ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
is a warning triggered from ttwu_queue_wakelist().

Avoid a such race in cpus_share_cache() by always returning true when
this_cpu == that_cpu.

Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
Reported-by: Jing-Ting Wu &lt;jing-ting.wu@mediatek.com&gt;
Signed-off-by: Vincent Donnefort &lt;vincent.donnefort@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Reviewed-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Link: https://lore.kernel.org/r/20211104175120.857087-1-vincent.donnefort@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
