<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v5.18.19</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.18.19</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.18.19'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-08-21T13:18:56Z</updated>
<entry>
<title>kexec, KEYS: make the code in bzImage64_verify_sig generic</title>
<updated>2022-08-21T13:18:56Z</updated>
<author>
<name>Coiby Xu</name>
<email>coxu@redhat.com</email>
</author>
<published>2022-07-14T13:40:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8fd872cddf2a46f6d60e616832b7686d3f5fe40a'/>
<id>urn:sha1:8fd872cddf2a46f6d60e616832b7686d3f5fe40a</id>
<content type='text'>
commit c903dae8941deb55043ee46ded29e84e97cd84bb upstream.

commit 278311e417be ("kexec, KEYS: Make use of platform keyring for
signature verify") adds platform keyring support on x86 kexec but not
arm64.

The code in bzImage64_verify_sig uses the keys on the
.builtin_trusted_keys, .machine, if configured and enabled,
.secondary_trusted_keys, also if configured, and .platform keyrings
to verify the signed kernel image as PE file.

Cc: kexec@lists.infradead.org
Cc: keyrings@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Reviewed-by: Michal Suchanek &lt;msuchanek@suse.de&gt;
Signed-off-by: Coiby Xu &lt;coxu@redhat.com&gt;
Signed-off-by: Mimi Zohar &lt;zohar@linux.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bpf: Suppress 'passing zero to PTR_ERR' warning</title>
<updated>2022-08-17T12:42:35Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2022-05-21T13:26:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d93dbef95c08516d39d83c1b9e5b12944da60253'/>
<id>urn:sha1:d93dbef95c08516d39d83c1b9e5b12944da60253</id>
<content type='text'>
commit 1ec5ee8c8a5a65ea377f8bea64bf4d5b743f6f79 upstream.

Kernel Test Robot complains about passing zero to PTR_ERR for the said
line, suppress it by using PTR_ERR_OR_ZERO.

Fixes: c0a5a21c25f3 ("bpf: Allow storing referenced kptr in map")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220521132620.1976921-1-memxor@gmail.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: serialize all debugfs operations using q-&gt;debugfs_mutex</title>
<updated>2022-08-17T12:42:24Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-06-14T07:48:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2c1eebbef74b40145c7af587b16a4f7a5c9c93a9'/>
<id>urn:sha1:2c1eebbef74b40145c7af587b16a4f7a5c9c93a9</id>
<content type='text'>
[ Upstream commit 5cf9c91ba927119fc6606b938b1895bb2459d3bc ]

Various places like I/O schedulers or the QOS infrastructure try to
register debugfs files on demans, which can race with creating and
removing the main queue debugfs directory.  Use the existing
debugfs_mutex to serialize all debugfs operations that rely on
q-&gt;debugfs_dir or the directories hanging off it.

To make the teardown code a little simpler declare all debugfs dentry
pointers and not just the main one uncoditionally in blkdev.h.

Move debugfs_mutex next to the dentries that it protects and document
what it is used for.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20220614074827.458955-3-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/csd_lock: Change csdlock_debug from early_param to __setup</title>
<updated>2022-08-17T12:42:24Z</updated>
<author>
<name>Chen Zhongjin</name>
<email>chenzhongjin@huawei.com</email>
</author>
<published>2022-05-10T09:46:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=05de9e2e33b1625c71aee69e353fe906dd2be88a'/>
<id>urn:sha1:05de9e2e33b1625c71aee69e353fe906dd2be88a</id>
<content type='text'>
[ Upstream commit 9c9b26b0df270d4f9246e483a44686fca951a29c ]

The csdlock_debug kernel-boot parameter is parsed by the
early_param() function csdlock_debug().  If set, csdlock_debug()
invokes static_branch_enable() to enable csd_lock_wait feature, which
triggers a panic on arm64 for kernels built with CONFIG_SPARSEMEM=y and
CONFIG_SPARSEMEM_VMEMMAP=n.

With CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section is called in
static_key_enable() and returns NULL, resulting in a NULL dereference
because mem_section is initialized only later in sparse_init().

This is also a problem for powerpc because early_param() functions
are invoked earlier than jump_label_init(), also resulting in
static_key_enable() failures.  These failures cause the warning "static
key 'xxx' used before call to jump_label_init()".

Thus, early_param is too early for csd_lock_wait to run
static_branch_enable(), so changes it to __setup to fix these.

Fixes: 8d0968cc6b8f ("locking/csd_lock: Add boot parameter for controlling CSD lock debugging")
Cc: stable@vger.kernel.org
Reported-by: Chen jingwen &lt;chenjingwen6@huawei.com&gt;
Signed-off-by: Chen Zhongjin &lt;chenzhongjin@huawei.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>timekeeping: contribute wall clock to rng on time change</title>
<updated>2022-08-17T12:42:24Z</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-07-17T21:53:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2d74ca7d3663d185377535c6257f14bce7d041a3'/>
<id>urn:sha1:2d74ca7d3663d185377535c6257f14bce7d041a3</id>
<content type='text'>
[ Upstream commit b8ac29b40183a6038919768b5d189c9bd91ce9b4 ]

The rng's random_init() function contributes the real time to the rng at
boot time, so that events can at least start in relation to something
particular in the real world. But this clock might not yet be set that
point in boot, so nothing is contributed. In addition, the relation
between minor clock changes from, say, NTP, and the cycle counter is
potentially useful entropic data.

This commit addresses this by mixing in a time stamp on calls to
settimeofday and adjtimex. No entropy is credited in doing so, so it
doesn't make initialization faster, but it is still useful input to
have.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>kexec: clean up arch_kexec_kernel_verify_sig</title>
<updated>2022-08-17T12:42:23Z</updated>
<author>
<name>Coiby Xu</name>
<email>coxu@redhat.com</email>
</author>
<published>2022-07-14T13:40:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df036b16058a8f3620fbb793091e37a6284ec00c'/>
<id>urn:sha1:df036b16058a8f3620fbb793091e37a6284ec00c</id>
<content type='text'>
[ Upstream commit 689a71493bd2f31c024f8c0395f85a1fd4b2138e ]

Before commit 105e10e2cf1c ("kexec_file: drop weak attribute from
functions"), there was already no arch-specific implementation
of arch_kexec_kernel_verify_sig. With weak attribute dropped by that
commit, arch_kexec_kernel_verify_sig is completely useless. So clean it
up.

Note later patches are dependent on this patch so it should be backported
to the stable tree as well.

Cc: stable@vger.kernel.org
Suggested-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Reviewed-by: Michal Suchanek &lt;msuchanek@suse.de&gt;
Acked-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Coiby Xu &lt;coxu@redhat.com&gt;
[zohar@linux.ibm.com: reworded patch description "Note"]
Link: https://lore.kernel.org/linux-integrity/20220714134027.394370-1-coxu@redhat.com/
Signed-off-by: Mimi Zohar &lt;zohar@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>kexec_file: drop weak attribute from functions</title>
<updated>2022-08-17T12:42:23Z</updated>
<author>
<name>Naveen N. Rao</name>
<email>naveen.n.rao@linux.vnet.ibm.com</email>
</author>
<published>2022-07-01T07:34:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=90341045b5095d136b6ea5809687d3506bb7b84a'/>
<id>urn:sha1:90341045b5095d136b6ea5809687d3506bb7b84a</id>
<content type='text'>
[ Upstream commit 65d9a9a60fd71be964effb2e94747a6acb6e7015 ]

As requested
(http://lkml.kernel.org/r/87ee0q7b92.fsf@email.froward.int.ebiederm.org),
this series converts weak functions in kexec to use the #ifdef approach.

Quoting the 3e35142ef99fe ("kexec_file: drop weak attribute from
arch_kexec_apply_relocations[_add]") changelog:

: Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols")
: [1], binutils (v2.36+) started dropping section symbols that it thought
: were unused.  This isn't an issue in general, but with kexec_file.c, gcc
: is placing kexec_arch_apply_relocations[_add] into a separate
: .text.unlikely section and the section symbol ".text.unlikely" is being
: dropped.  Due to this, recordmcount is unable to find a non-weak symbol in
: .text.unlikely to generate a relocation record against.

This patch (of 2);

Drop __weak attribute from functions in kexec_file.c:
- arch_kexec_kernel_image_probe()
- arch_kimage_file_post_load_cleanup()
- arch_kexec_kernel_image_load()
- arch_kexec_locate_mem_hole()
- arch_kexec_kernel_verify_sig()

arch_kexec_kernel_image_load() calls into kexec_image_load_default(), so
drop the static attribute for the latter.

arch_kexec_kernel_verify_sig() is not overridden by any architecture, so
drop the __weak attribute.

Link: https://lkml.kernel.org/r/cover.1656659357.git.naveen.n.rao@linux.vnet.ibm.com
Link: https://lkml.kernel.org/r/2cd7ca1fe4d6bb6ca38e3283c717878388ed6788.1656659357.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Suggested-by: Eric Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Mimi Zohar &lt;zohar@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Do not requeue task on CPU excluded from cpus_mask</title>
<updated>2022-08-17T12:42:16Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2022-08-04T09:21:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fde45283f4c8a91c367ea5f20f87036468755121'/>
<id>urn:sha1:fde45283f4c8a91c367ea5f20f87036468755121</id>
<content type='text'>
[ Upstream commit 751d4cbc43879229dbc124afefe240b70fd29a85 ]

The following warning was triggered on a large machine early in boot on
a distribution kernel but the same problem should also affect mainline.

   WARNING: CPU: 439 PID: 10 at ../kernel/workqueue.c:2231 process_one_work+0x4d/0x440
   Call Trace:
    &lt;TASK&gt;
    rescuer_thread+0x1f6/0x360
    kthread+0x156/0x180
    ret_from_fork+0x22/0x30
    &lt;/TASK&gt;

Commit c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p-&gt;on_cpu")
optimises ttwu by queueing a task that is descheduling on the wakelist,
but does not check if the task descheduling is still allowed to run on that CPU.

In this warning, the problematic task is a workqueue rescue thread which
checks if the rescue is for a per-cpu workqueue and running on the wrong CPU.
While this is early in boot and it should be possible to create workers,
the rescue thread may still used if the MAYDAY_INITIAL_TIMEOUT is reached
or MAYDAY_INTERVAL and on a sufficiently large machine, the rescue
thread is being used frequently.

Tracing confirmed that the task should have migrated properly using the
stopper thread to handle the migration. However, a parallel wakeup from udev
running on another CPU that does not share CPU cache observes p-&gt;on_cpu and
uses task_cpu(p), queues the task on the old CPU and triggers the warning.

Check that the wakee task that is descheduling is still allowed to run
on its current CPU and if not, wait for the descheduling to complete
and select an allowed CPU.

Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p-&gt;on_cpu")
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20220804092119.20137-1-mgorman@techsingularity.net
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle</title>
<updated>2022-08-17T12:42:16Z</updated>
<author>
<name>Tianchen Ding</name>
<email>dtcccc@linux.alibaba.com</email>
</author>
<published>2022-06-08T23:34:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0a969f46e86a6ef84244751f0b3929cbaaeecade'/>
<id>urn:sha1:0a969f46e86a6ef84244751f0b3929cbaaeecade</id>
<content type='text'>
[ Upstream commit f3dd3f674555bd9455c5ae7fafce0696bd9931b3 ]

Wakelist can help avoid cache bouncing and offload the overhead of waker
cpu. So far, using wakelist within the same llc only happens on
WF_ON_CPU, and this limitation could be removed to further improve
wakeup performance.

The commit 518cd6234178 ("sched: Only queue remote wakeups when
crossing cache boundaries") disabled queuing tasks on wakelist when
the cpus share llc. This is because, at that time, the scheduler must
send IPIs to do ttwu_queue_wakelist. Nowadays, ttwu_queue_wakelist also
supports TIF_POLLING, so this is not a problem now when the wakee cpu is
in idle polling.

Benefits:
  Queuing the task on idle cpu can help improving performance on waker cpu
  and utilization on wakee cpu, and further improve locality because
  the wakee cpu can handle its own rq. This patch helps improving rt on
  our real java workloads where wakeup happens frequently.

  Consider the normal condition (CPU0 and CPU1 share same llc)
  Before this patch:

         CPU0                                       CPU1

    select_task_rq()                                idle
    rq_lock(CPU1-&gt;rq)
    enqueue_task(CPU1-&gt;rq)
    notify CPU1 (by sending IPI or CPU1 polling)

                                                    resched()

  After this patch:

         CPU0                                       CPU1

    select_task_rq()                                idle
    add to wakelist of CPU1
    notify CPU1 (by sending IPI or CPU1 polling)

                                                    rq_lock(CPU1-&gt;rq)
                                                    enqueue_task(CPU1-&gt;rq)
                                                    resched()

  We see CPU0 can finish its work earlier. It only needs to put task to
  wakelist and return.
  While CPU1 is idle, so let itself handle its own runqueue data.

This patch brings no difference about IPI.
  This patch only takes effect when the wakee cpu is:
  1) idle polling
  2) idle not polling

  For 1), there will be no IPI with or without this patch.

  For 2), there will always be an IPI before or after this patch.
  Before this patch: waker cpu will enqueue task and check preempt. Since
  "idle" will be sure to be preempted, waker cpu must send a resched IPI.
  After this patch: waker cpu will put the task to the wakelist of wakee
  cpu, and send an IPI.

Benchmark:
We've tested schbench, unixbench, and hachbench on both x86 and arm64.

On x86 (Intel Xeon Platinum 8269CY):
  schbench -m 2 -t 8

    Latency percentiles (usec)              before        after
        50.0000th:                             8            6
        75.0000th:                            10            7
        90.0000th:                            11            8
        95.0000th:                            12            8
        *99.0000th:                           13           10
        99.5000th:                            15           11
        99.9000th:                            18           14

  Unixbench with full threads (104)
                                            before        after
    Dhrystone 2 using register variables  3011862938    3009935994  -0.06%
    Double-Precision Whetstone              617119.3      617298.5   0.03%
    Execl Throughput                         27667.3       27627.3  -0.14%
    File Copy 1024 bufsize 2000 maxblocks   785871.4      784906.2  -0.12%
    File Copy 256 bufsize 500 maxblocks     210113.6      212635.4   1.20%
    File Copy 4096 bufsize 8000 maxblocks  2328862.2     2320529.1  -0.36%
    Pipe Throughput                      145535622.8   145323033.2  -0.15%
    Pipe-based Context Switching           3221686.4     3583975.4  11.25%
    Process Creation                        101347.1      103345.4   1.97%
    Shell Scripts (1 concurrent)            120193.5      123977.8   3.15%
    Shell Scripts (8 concurrent)             17233.4       17138.4  -0.55%
    System Call Overhead                   5300604.8     5312213.6   0.22%

  hackbench -g 1 -l 100000
                                            before        after
    Time                                     3.246        2.251

On arm64 (Ampere Altra):
  schbench -m 2 -t 8

    Latency percentiles (usec)              before        after
        50.0000th:                            14           10
        75.0000th:                            19           14
        90.0000th:                            22           16
        95.0000th:                            23           16
        *99.0000th:                           24           17
        99.5000th:                            24           17
        99.9000th:                            28           25

  Unixbench with full threads (80)
                                            before        after
    Dhrystone 2 using register variables  3536194249    3537019613   0.02%
    Double-Precision Whetstone              629383.6      629431.6   0.01%
    Execl Throughput                         65920.5       65846.2  -0.11%
    File Copy 1024 bufsize 2000 maxblocks  1063722.8     1064026.8   0.03%
    File Copy 256 bufsize 500 maxblocks     322684.5      318724.5  -1.23%
    File Copy 4096 bufsize 8000 maxblocks  2348285.3     2328804.8  -0.83%
    Pipe Throughput                      133542875.3   131619389.8  -1.44%
    Pipe-based Context Switching           3215356.1     3576945.1  11.25%
    Process Creation                        108520.5      120184.6  10.75%
    Shell Scripts (1 concurrent)            122636.3        121888  -0.61%
    Shell Scripts (8 concurrent)             17462.1       17381.4  -0.46%
    System Call Overhead                   4429998.9     4435006.7   0.11%

  hackbench -g 1 -l 100000
                                            before        after
    Time                                     4.217        2.916

Our patch has improvement on schbench, hackbench
and Pipe-based Context Switching of unixbench
when there exists idle cpus,
and no obvious regression on other tests of unixbench.
This can help improve rt in scenes where wakeup happens frequently.

Signed-off-by: Tianchen Ding &lt;dtcccc@linux.alibaba.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Link: https://lore.kernel.org/r/20220608233412.327341-3-dtcccc@linux.alibaba.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Fix the check of nr_running at queue wakelist</title>
<updated>2022-08-17T12:42:16Z</updated>
<author>
<name>Tianchen Ding</name>
<email>dtcccc@linux.alibaba.com</email>
</author>
<published>2022-06-08T23:34:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e858facbcfaee40d61cfaedb17993a4ae43c9b18'/>
<id>urn:sha1:e858facbcfaee40d61cfaedb17993a4ae43c9b18</id>
<content type='text'>
[ Upstream commit 28156108fecb1f808b21d216e8ea8f0d205a530c ]

The commit 2ebb17717550 ("sched/core: Offload wakee task activation if it
the wakee is descheduling") checked rq-&gt;nr_running &lt;= 1 to avoid task
stacking when WF_ON_CPU.

Per the ordering of writes to p-&gt;on_rq and p-&gt;on_cpu, observing p-&gt;on_cpu
(WF_ON_CPU) in ttwu_queue_cond() implies !p-&gt;on_rq, IOW p has gone through
the deactivate_task() in __schedule(), thus p has been accounted out of
rq-&gt;nr_running. As such, the task being the only runnable task on the rq
implies reading rq-&gt;nr_running == 0 at that point.

The benchmark result is in [1].

[1] https://lore.kernel.org/all/e34de686-4e85-bde1-9f3c-9bbc86b38627@linux.alibaba.com/

Suggested-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Signed-off-by: Tianchen Ding &lt;dtcccc@linux.alibaba.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Link: https://lore.kernel.org/r/20220608233412.327341-2-dtcccc@linux.alibaba.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
