<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux, branch v6.13.6</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.13.6</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.13.6'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2025-03-07T17:27:12Z</updated>
<entry>
<title>rcuref: Plug slowpath race in rcuref_put()</title>
<updated>2025-03-07T17:27:12Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2025-01-18T23:55:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d82176973caab7daee9297f68a05aa5330b74f12'/>
<id>urn:sha1:d82176973caab7daee9297f68a05aa5330b74f12</id>
<content type='text'>
commit b9a49520679e98700d3d89689cc91c08a1c88c1d upstream.

Kernel test robot reported an "imbalanced put" in the rcuref_put() slow
path, which turned out to be a false positive. Consider the following race:

            ref  = 0 (via rcuref_init(ref, 1))
 T1                                      T2
 rcuref_put(ref)
 -&gt; atomic_add_negative_release(-1, ref)                                         # ref -&gt; 0xffffffff
 -&gt; rcuref_put_slowpath(ref)
                                         rcuref_get(ref)
                                         -&gt; atomic_add_negative_relaxed(1, &amp;ref-&gt;refcnt)
                                           -&gt; return true;                       # ref -&gt; 0

                                         rcuref_put(ref)
                                         -&gt; atomic_add_negative_release(-1, ref) # ref -&gt; 0xffffffff
                                         -&gt; rcuref_put_slowpath()

    -&gt; cnt = atomic_read(&amp;ref-&gt;refcnt);                                          # cnt -&gt; 0xffffffff / RCUREF_NOREF
    -&gt; atomic_try_cmpxchg_release(&amp;ref-&gt;refcnt, &amp;cnt, RCUREF_DEAD)               # ref -&gt; 0xe0000000 / RCUREF_DEAD
       -&gt; return true
                                           -&gt; cnt = atomic_read(&amp;ref-&gt;refcnt);   # cnt -&gt; 0xe0000000 / RCUREF_DEAD
                                           -&gt; if (cnt &gt; RCUREF_RELEASED)         # 0xe0000000 &gt; 0xc0000000
                                             -&gt; WARN_ONCE(cnt &gt;= RCUREF_RELEASED, "rcuref - imbalanced put()")

The problem is the additional read in the slow path (after it
decremented to RCUREF_NOREF), which can happen after the counter has
been marked RCUREF_DEAD.

Prevent this by reusing the return value of the decrement. Now every "final"
put uses RCUREF_NOREF in the slow path and attempts the final cmpxchg() to
RCUREF_DEAD.
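
A minimal sketch of that scheme, assuming a simplified __rcuref_put()
with the lockdep checks elided; the slow path signature is inferred
from the changelog:

  static inline bool __rcuref_put(rcuref_t *ref)
  {
      /*
       * Capture the decrement result once and hand it to the slow
       * path, so the slow path never re-reads the counter after
       * another CPU may have marked it RCUREF_DEAD.
       */
      int cnt = atomic_sub_return_release(1, &amp;ref-&gt;refcnt);

      if (likely(cnt &gt;= 0))
          return false;   /* not the final put */

      return rcuref_put_slowpath(ref, (unsigned int)cnt);
  }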

[ bigeasy: Add changelog ]

Fixes: ee1ee6db07795 ("atomics: Provide rcuref - scalable reference counting")
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Debugged-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/oe-lkp/202412311453.9d7636a2-lkp@intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: Remove zone write plugs when handling native zone append writes</title>
<updated>2025-03-07T17:27:07Z</updated>
<author>
<name>Damien Le Moal</name>
<email>dlemoal@kernel.org</email>
</author>
<published>2025-02-14T04:14:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=abc63d1944a63983c2459c0eba6033f55566f7da'/>
<id>urn:sha1:abc63d1944a63983c2459c0eba6033f55566f7da</id>
<content type='text'>
commit a6aa36e957a1bfb5341986dec32d013d23228fe1 upstream.

For devices that natively support zone append operations,
REQ_OP_ZONE_APPEND BIOs are not processed through zone write plugging
and are immediately issued to the zoned device. This means that there is
no write pointer offset tracking done for these operations and that a
zone write plug is not necessary.

However, when receiving a zone append BIO, we may already have a zone
write plug for the target zone if that zone was previously partially
written using regular write operations. In such a case, since the write
pointer offset of the zone write plug is not incremented by the number
of sectors appended to the zone, two issues arise:
1) We risk leaving the plug in the disk hash table if the zone is fully
   written using zone append or regular write operations, because the
   write pointer offset will never reach the "zone full" state.
2) Regular write operations that are issued after zone append operations
   will always be rejected by blk_zone_wplug_prepare_bio(), as the write
   pointer alignment check will fail, even if the user correctly
   accounted for the zone append operations and issued the regular
   writes with a correct sector.

Avoid these issues by immediately removing the zone write plug of zones
that are the target of zone append operations when blk_zone_plug_bio()
is called. The new function blk_zone_wplug_handle_native_zone_append()
implements this for devices that natively support zone append. The
removal of the zone write plug using disk_remove_zone_wplug() requires
aborting all plugged regular writes using disk_zone_wplug_abort(), as
otherwise the plugged write BIOs would never be executed (with the plug
removed, the completion path will never see the zone write plug again,
since disk_get_zone_wplug() will return NULL). Rate-limited warnings are added
to blk_zone_wplug_handle_native_zone_append() and to
disk_zone_wplug_abort() to signal this.

Since blk_zone_wplug_handle_native_zone_append() is called in the hot
path for operations that will not be plugged, disk_get_zone_wplug() is
optimized under the assumption that a user issuing zone append
operations is not at the same time issuing regular writes and that there
are no hashed zone write plugs. The struct gendisk atomic counter
nr_zone_wplugs is added to check this, with this counter incremented in
disk_insert_zone_wplug() and decremented in disk_remove_zone_wplug().
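
A hedged sketch of what such a handler looks like, with locking and
plug reference counting elided; the shape is inferred from this
description, not copied from the kernel:

  static void blk_zone_wplug_handle_native_zone_append(struct bio *bio)
  {
      struct gendisk *disk = bio-&gt;bi_bdev-&gt;bd_disk;
      struct blk_zone_wplug *zwplug;

      /*
       * Fast path: no hashed zone write plugs means nothing to
       * remove, so pure zone-append workloads pay only a single
       * atomic read here.
       */
      if (!atomic_read(&amp;disk-&gt;nr_zone_wplugs))
          return;

      zwplug = disk_get_zone_wplug(disk, bio-&gt;bi_iter.bi_sector);
      if (!zwplug)
          return;

      /*
       * Abort plugged regular writes first: once the plug is
       * removed, the completion path can no longer find it.
       */
      disk_zone_wplug_abort(zwplug);
      disk_remove_zone_wplug(disk, zwplug);
  }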

To be consistent with this fix, there is no need to fill the zone write
plug hash table with plugs for partially written zones on a device that
supports native zone append operations. So modify
blk_revalidate_seq_zone() to return early, avoiding the allocation and
insertion of a zone write plug for partially written sequential zones
when the device natively supports zone append.

Reported-by: Jorgen Hansen &lt;Jorgen.Hansen@wdc.com&gt;
Fixes: 9b1ce7f0c6f8 ("block: Implement zone append emulation")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Tested-by: Jorgen Hansen &lt;Jorgen.Hansen@wdc.com&gt;
Link: https://lore.kernel.org/r/20250214041434.82564-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>objtool: Fix C jump table annotations for Clang</title>
<updated>2025-03-07T17:27:03Z</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ardb@kernel.org</email>
</author>
<published>2025-02-21T13:57:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=03f56d000a7e828fe33f4c55c8b28e0df7c427f9'/>
<id>urn:sha1:03f56d000a7e828fe33f4c55c8b28e0df7c427f9</id>
<content type='text'>
[ Upstream commit 73cfc53cc3b6380eccf013049574485f64cb83ca ]

A C jump table (such as the one used by the BPF interpreter) is a const
global array of absolute code addresses, and this means that the actual
values in the table may not be known until the kernel is booted (e.g.,
when using KASLR or when the kernel VA space is sized dynamically).

When using PIE codegen, the compiler will default to placing such const
global objects in .data.rel.ro (which is annotated as writable), rather
than .rodata (which is annotated as read-only). As C jump tables are
explicitly emitted into .rodata, this used to result in warnings for
LoongArch builds (which use PIE codegen for the entire kernel) like

  Warning: setting incorrect section attributes for .rodata..c_jump_table

because the explicitly specified .rodata section inherited the
read-write annotation that the compiler uses for such objects when
using PIE codegen.

This warning was suppressed by explicitly adding the read-only
annotation to the __attribute__((section(""))) string, by commit

  c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table")

Unfortunately, this hack does not work with Clang's integrated assembler,
which happily interprets the appended section type and permission
specifiers as part of the section name. The resulting name no longer
matches the hard-coded pattern '.rodata..c_jump_table' that objtool
expects, causing it to emit a warning

  kernel/bpf/core.o: warning: objtool: ___bpf_prog_run+0x20: sibling call from callable instruction with modified stack frame

Work around this by emitting C jump tables into .data.rel.ro instead,
which is treated as .rodata by the linker script for all builds, not
just PIE based ones.
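
In compiler.h terms, the workaround boils down to something like the
following sketch; the macro name and exact section strings are
assumptions:

  /* Before: type/permission specifiers smuggled into the name string,
   * which Clang's integrated assembler takes literally: */
  #define __annotate_jump_table \
      __section(".rodata..c_jump_table,\"a\",@progbits #")

  /* After: a plain section that the linker script maps into the
   * read-only image for PIE and non-PIE builds alike: */
  #define __annotate_jump_table __section(".data.rel.ro.c_jump_table")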

Fixes: c5b1184decc8 ("compiler.h: specify correct attribute for .rodata..c_jump_table")
Tested-by: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt; # on LoongArch
Signed-off-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Link: https://lore.kernel.org/r/20250221135704.431269-6-ardb+git@google.com
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>objtool: Remove annotate_{,un}reachable()</title>
<updated>2025-03-07T17:27:03Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-11-28T09:39:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c304ca39e10c1b0c834d9b9ca0c7f33fd69cf8c5'/>
<id>urn:sha1:c304ca39e10c1b0c834d9b9ca0c7f33fd69cf8c5</id>
<content type='text'>
[ Upstream commit 06e24745985c8dd0da18337503afcf2f2fdbdff1 ]

There are no users of annotate_reachable() left.

And the annotate_unreachable() usage in unreachable() is plain wrong;
it will hide dangerous fall-through code-gen.

Remove both.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Link: https://lore.kernel.org/r/20241128094312.235637588@infradead.org
Stable-dep-of: 73cfc53cc3b6 ("objtool: Fix C jump table annotations for Clang")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>unreachable: Unify</title>
<updated>2025-03-07T17:27:03Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-11-28T09:39:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9d924a1d45ee487f1d0f98bdddc03ba79e1de893'/>
<id>urn:sha1:9d924a1d45ee487f1d0f98bdddc03ba79e1de893</id>
<content type='text'>
[ Upstream commit c837de3810982cd41cd70e5170da1931439f025c ]

Since barrier_before_unreachable() is empty for !GCC it is trivial to
unify the two definitions. Less is more.
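
With annotate_unreachable() gone (see the previous entry), the unified
definition plausibly reads as follows; this is a sketch, not the
verbatim header:

  /* Single definition for all compilers: the barrier expands to
   * nothing where it is not needed. */
  #define unreachable() do {                  \
      barrier_before_unreachable();           \
      __builtin_unreachable();                \
  } while (0)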

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Link: https://lore.kernel.org/r/20241128094311.924381359@infradead.org
Stable-dep-of: 73cfc53cc3b6 ("objtool: Fix C jump table annotations for Clang")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tcp: devmem: don't write truncated dmabuf CMSGs to userspace</title>
<updated>2025-03-07T17:27:00Z</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@fomichev.me</email>
</author>
<published>2025-02-24T17:44:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0af4760881666b0569bae1b2e845eafd44292e40'/>
<id>urn:sha1:0af4760881666b0569bae1b2e845eafd44292e40</id>
<content type='text'>
[ Upstream commit 18912c520674ec4d920fe3826e7e4fefeecdf5ae ]

Currently, we report -ETOOSMALL (err) only on the first iteration
(!sent). When we get a put_cmsg error after a bunch of successful
put_cmsg calls, we don't signal the error at all. This might be
confusing to userspace, which will see truncated CMSGs
but no MSG_CTRUNC signal.

Consider the following case:
- sizeof(struct cmsghdr) = 16
- sizeof(struct dmabuf_cmsg) = 24
- total cmsg size (CMSG_LEN) = 40 (16+24)

When calling recvmsg with msg_controllen=60, userspace
will receive two(!) dmabuf_cmsg(s): the first one will
be valid and the second one will be silently truncated. There is no
easy way to discover the truncation besides doing something like
"cm-&gt;cmsg_len != CMSG_LEN(sizeof(dmabuf_cmsg))".

Introduce a new put_devmem_cmsg wrapper that reports an error instead
of doing the truncation. Mina suggests that this is the intended way
this API should work.
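
A hedged sketch of such a wrapper; the name comes from this changelog,
while the signature and the control-buffer check are assumptions:

  static int put_devmem_cmsg(struct msghdr *msg, int level, int type,
                             int len, void *data)
  {
      /* Refuse to emit a truncated dmabuf_cmsg: flag the
       * truncation and report an error instead. */
      if (msg-&gt;msg_controllen &lt; CMSG_LEN(len)) {
          msg-&gt;msg_flags |= MSG_CTRUNC;
          return -ETOOSMALL;
      }
      return put_cmsg(msg, level, type, len, data);
  }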

Note that we might now report MSG_CTRUNC when users (incorrectly)
call us with msg_control == NULL.

Fixes: 8f0b3cc9a4c1 ("tcp: RX path for devmem TCP")
Reviewed-by: Mina Almasry &lt;almasrymina@google.com&gt;
Signed-off-by: Stanislav Fomichev &lt;sdf@fomichev.me&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20250224174401.3582695-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>SUNRPC: Prevent looping due to rpc_signal_task() races</title>
<updated>2025-03-07T17:26:55Z</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2025-02-01T20:00:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e837ef178276ead04f59ed70a23ce4c0b03596f4'/>
<id>urn:sha1:e837ef178276ead04f59ed70a23ce4c0b03596f4</id>
<content type='text'>
[ Upstream commit 5bbd6e863b15a85221e49b9bdb2d5d8f0bb91f3d ]

If rpc_signal_task() is called while a task is in an rpc_call_done()
callback function, and the latter calls rpc_restart_call(), the task can
end up looping due to the RPC_TASK_SIGNALLED flag being set without the
tk_rpc_status being set.
Removing the redundant mechanism for signalling the task fixes the
looping behaviour.
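
Schematically, the reported pattern looks like this (a sketch, not
verbatim kernel code):

  /*
   *   T1: rpc_signal_task(task)      T2: rpc_call_done(task)
   *       sets RPC_TASK_SIGNALLED        calls rpc_restart_call(task)
   *       (tk_rpc_status not set)
   *
   * The restarted task re-enters the RPC state machine with the
   * SIGNALLED flag raised but no tk_rpc_status error to exit on,
   * so it keeps looping. With the redundant flag-based mechanism
   * removed, termination is driven by task-&gt;tk_rpc_status alone.
   */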

Reported-by: Li Lingfeng &lt;lilingfeng3@huawei.com&gt;
Fixes: 39494194f93b ("SUNRPC: Fix races with rpc_killall_tasks()")
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Compact RSEQ concurrency IDs with reduced threads and affinity</title>
<updated>2025-02-27T12:34:18Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2025-02-10T15:32:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c2fa2877a55cc51d37d03c68838c5dd6aeccd6aa'/>
<id>urn:sha1:c2fa2877a55cc51d37d03c68838c5dd6aeccd6aa</id>
<content type='text'>
[ Upstream commit 02d954c0fdf91845169cdacc7405b120f90afe01 ]

When a process reduces its number of threads or clears bits in its CPU
affinity mask, the mm_cid allocation should eventually converge towards
smaller values.

However, the change introduced by:

commit 7e019dcc470f ("sched: Improve cache locality of RSEQ concurrency
IDs for intermittent workloads")

adds a per-mm/CPU recent_cid which is never unset unless a thread
migrates.

This is a tradeoff between:

A) Preserving cache locality after a transition from many threads to few
   threads, or after reducing the Hamming weight of the allowed CPU mask.

B) Making the mm_cid upper bounds wrt nr threads and allowed CPU mask
   easy to document and understand.

C) Allowing applications to eventually react to mm_cid compaction after
   reduction of the nr threads or allowed CPU mask; shrinking the IDs
   back towards 0 makes that compaction easier to track.

D) Making sure applications that periodically reduce and then increase
   again the nr threads or allowed CPU mask still benefit from good
   cache locality with mm_cid.

Introduce the following changes:

* After shrinking the number of threads or reducing the number of
  allowed CPUs, reduce the value of max_nr_cid so expansion of CID
  allocation will preserve cache locality if the number of threads or
  allowed CPUs increases again.

* Only re-use a recent_cid if it is within the max_nr_cid upper bound,
  else find the first available CID.
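
A hedged sketch of the second change; recent_cid and max_nr_cid are
named in this changelog, the surrounding helpers are hypothetical:

  static int mm_get_cid_sketch(struct mm_struct *mm,
                               struct mm_cid *pcpu_cid)
  {
      int cid = READ_ONCE(pcpu_cid-&gt;recent_cid);

      /* Reuse a recent cid only while it is under the current
       * upper bound, so allocation compacts back towards 0 after
       * the thread count or affinity mask shrinks. */
      if (mm_cid_is_valid(cid) &amp;&amp; cid &lt; READ_ONCE(mm-&gt;max_nr_cid))
          return cid;                      /* cache-local reuse */

      return find_first_available_cid(mm); /* hypothetical helper */
  }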

Fixes: 7e019dcc470f ("sched: Improve cache locality of RSEQ concurrency IDs for intermittent workloads")
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Gabriele Monaco &lt;gmonaco@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Gabriele Monaco &lt;gmonaco@redhat.com&gt;
Link: https://lkml.kernel.org/r/20250210153253.460471-2-gmonaco@redhat.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: Add non-RCU dev_getbyhwaddr() helper</title>
<updated>2025-02-27T12:34:15Z</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2025-02-18T13:49:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f799ab3c10192a0a535a8b017ab3e427660dd35a'/>
<id>urn:sha1:f799ab3c10192a0a535a8b017ab3e427660dd35a</id>
<content type='text'>
[ Upstream commit 4b5a28b38c4a0106c64416a1b2042405166b26ce ]

Add a dedicated helper for finding devices by hardware address when
holding rtnl_lock, similar to the existing dev_getbyhwaddr_rcu(). This
prevents PROVE_LOCKING warnings when rtnl_lock is held but the RCU read
lock is not.

Extract common address comparison logic into dev_addr_cmp().
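
A sketch of what such a helper looks like, assuming it mirrors
dev_getbyhwaddr_rcu() minus the RCU requirement:

  struct net_device *dev_getbyhwaddr(struct net *net, unsigned short type,
                                     const char *ha)
  {
      struct net_device *dev;

      ASSERT_RTNL();  /* caller holds rtnl_lock, no RCU needed */

      for_each_netdev(net, dev)
          if (dev_addr_cmp(dev, type, ha))
              return dev;

      return NULL;
  }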

Context for this change can be found in the following
discussion:

Link: https://lore.kernel.org/all/20250206-scarlet-ermine-of-improvement-1fcac5@leitao/

Cc: kuniyu@amazon.com
Cc: ushankar@purestorage.com
Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250218-arm_fix_selftest-v5-1-d3d6892db9e1@debian.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Stable-dep-of: 4eae0ee0f1e6 ("arp: switch to dev_getbyhwaddr() in arp_req_set_public()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: pse-pd: Use power limit at driver side instead of current limit</title>
<updated>2025-02-27T12:34:14Z</updated>
<author>
<name>Kory Maincent</name>
<email>kory.maincent@bootlin.com</email>
</author>
<published>2025-01-10T09:40:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ae9ca06295e89cb45c39a60854dbe0b8e4a844c8'/>
<id>urn:sha1:ae9ca06295e89cb45c39a60854dbe0b8e4a844c8</id>
<content type='text'>
[ Upstream commit e0a5e2bba38aa61a900934b45d6e846e0a6d7524 ]

The regulator framework uses current limits, but the PSE standard and
known PSE controllers rely on power limits. Instead of converting
current to power within each driver, perform the conversion in the PSE
core. This avoids redundancy in driver implementation and aligns better
with the standard, simplifying driver development.
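
The conversion itself is a one-liner done once in the core; a hedged
sketch with a hypothetical helper name (regulator limits in uA/uV, PSE
limits in mW):

  static inline u32 pse_uA_uV_to_mW(u64 uA, u64 uV)
  {
      /* P[mW] = I[uA] * V[uV] / 10^9; 64-bit math avoids overflow */
      return (u32)div64_u64(uA * uV, 1000000000ULL);
  }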

At the same time, remove the _pse_ethtool_get_status() function, which
is no longer needed.

Acked-by: Oleksij Rempel &lt;o.rempel@pengutronix.de&gt;
Signed-off-by: Kory Maincent &lt;kory.maincent@bootlin.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Stable-dep-of: f6093c5ec74d ("net: pse-pd: pd692x0: Fix power limit retrieval")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
