<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/bpf, branch v6.18.13</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.18.13</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.18.13'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2026-01-02T11:56:37Z</updated>
<entry>
<title>bpf: Fix truncated dmabuf iterator reads</title>
<updated>2026-01-02T11:56:37Z</updated>
<author>
<name>T.J. Mercier</name>
<email>tjmercier@google.com</email>
</author>
<published>2025-12-04T00:03:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b7a2957ace3de743ff1bcf1295e662576b1a02ec'/>
<id>urn:sha1:b7a2957ace3de743ff1bcf1295e662576b1a02ec</id>
<content type='text'>
[ Upstream commit 234483565dbb2b264fdd165927c89fbf3ecf4733 ]

If there is a large number (hundreds) of dmabufs allocated, the text
output generated from dmabuf_iter_seq_show can exceed common user buffer
sizes (e.g. PAGE_SIZE) necessitating multiple start/stop cycles to
iterate through all dmabufs. However, the dmabuf iterator currently
returns NULL in dmabuf_iter_seq_start for all non-zero pos values, which
results in the truncation of the output before all dmabufs are handled.

After dma_buf_iter_begin / dma_buf_iter_next, the refcount of the buffer
is elevated so that the BPF iterator program can run without holding any
locks. When a stop occurs, instead of immediately dropping the reference
on the buffer, stash a pointer to the buffer in seq-&gt;priv until
either start is called or the iterator is released. This also enables
the resumption of iteration without first walking through the list of
dmabufs based on the pos value.
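
As a rough illustration of the stop/start handling described above (the
seq_file private struct and its field names below are assumptions, not
the upstream code):

static void dmabuf_iter_seq_stop(struct seq_file *seq, void *v)
{
        struct dmabuf_iter_priv *p = seq-&gt;private;  /* assumed layout */

        /* Keep the elevated refcount; stash the buffer for a later start. */
        if (v)
                p-&gt;dmabuf = v;
}

static void *dmabuf_iter_seq_start(struct seq_file *seq, loff_t *pos)
{
        struct dmabuf_iter_priv *p = seq-&gt;private;
        struct dma_buf *dmabuf = p-&gt;dmabuf;

        p-&gt;dmabuf = NULL;
        /* Resume from the stashed buffer instead of rewalking the list. */
        return dmabuf ? dmabuf : dma_buf_iter_begin();
}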

Fixes: 76ea95534995 ("bpf: Add dmabuf iterator")
Signed-off-by: T.J. Mercier &lt;tjmercier@google.com&gt;
Link: https://lore.kernel.org/r/20251204000348.1413593-1-tjmercier@google.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>rqspinlock: Use trylock fallback when per-CPU rqnode is busy</title>
<updated>2025-12-18T13:03:28Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2025-11-28T23:27:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=14980281ebff84f6b6ec9262ba6951c73031680c'/>
<id>urn:sha1:14980281ebff84f6b6ec9262ba6951c73031680c</id>
<content type='text'>
[ Upstream commit 81d5a6a438595e46be191d602e5c2d6d73992fdc ]

Defer to the trylock fallback in NMIs only when an rqspinlock waiter is
already queued on the current CPU. This is detected by noticing a
non-zero node index. This allows an NMI waiter to join the waiter queue
when it isn't interrupting an existing rqspinlock waiter, increasing its
chances of fairly obtaining the lock, performing deadlock detection as
the head, and not being starved while attempting the trylock.
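
Schematically, the decision looks roughly like this (a sketch only; the
per-CPU node array and the trylock helper are placeholder names, not the
upstream identifiers):

        /* Only defer an NMI to the trylock fallback when this CPU already
         * has a queued rqspinlock waiter (non-zero per-CPU node index). */
        if (in_nmi() &amp;&amp; this_cpu_read(rqnodes[0].mcs.count))
                return rqspinlock_trylock_fallback(lock);  /* placeholder */
        /* Otherwise, even in NMI, queue as a normal waiter. */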

The trylock path in particular is unlikely to succeed under contention,
as it relies on the lock word becoming 0, which indicates no contention.
This means that the most likely result for NMIs attempting a trylock is
a timeout under contention if they don't hit an AA or ABBA case.

The core problem addressed by the fixed commit was removing the
dependency edge between an NMI queue waiter and the queue waiter it is
interrupting. Whenever a circular dependency forms, and with no way to
break it (as non-head waiters don't poll for deadlocks or timeouts), we
would end up in a deadlock. A trylock breaks such an edge either by
probing for deadlocks, or by eventually terminating the waiting loop
using a timeout.

By excluding queueing for NMIs on CPUs where the node index is
non-zero, this sort of dependency is broken. The CPU enters the trylock
path in those cases, and falls back to deadlock checks and timeouts.
However, in the other case, where the NMI doesn't interrupt the CPU
while it is queued on the lock in the slow path, it can join the queue
as a normal waiter, and avoid trylock-associated starvation and
subsequent timeouts.

There are a few remaining cases here that matter: the NMI can still
preempt the owner in its critical section, and if it queues as a
non-head waiter, it can end up impeding the progress of the owner. While
this won't deadlock, since the head waiter will eventually signal the
NMI waiter to stop (due to a timeout), it can still lead to long
timeouts. These gaps will be addressed in subsequent commits.

Note that while the node count detection approach is less conservative
than simply deferring NMIs to trylock, it will still return errors when
an attempt to lock B in NMI happens while a waiter for lock A sits in a
lower context on the same CPU. However, this only occurs when the lower
context is queued in the slow path, and the NMI attempt can proceed
without failure in all other cases. To continue to prevent AA deadlocks
(or ABBA in a similar NMI-interrupting-lower-context pattern), we'd need
a more fleshed out algorithm that unlinks NMI waiters after they queue
and detects such cases. However, all that complexity isn't yet appealing
just to reduce the failure rate in the small window inside the slow
path.

It is important to note that reentrancy in the slow path can also happen
through trace_contention_{begin,end}, but in those cases, unlike an NMI,
the forward progress of the head waiter (or the predecessor in general)
is not being blocked.

Fixes: 0d80e7f951be ("rqspinlock: Choose trylock fallback for NMI waiters")
Reported-by: Ritesh Oedayrajsingh Varma &lt;ritesh@superluminal.eu&gt;
Suggested-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20251128232802.1031906-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>rqspinlock: Enclose lock/unlock within lock entry acquisitions</title>
<updated>2025-12-18T13:03:28Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2025-11-28T23:27:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=43b6c312b8ff013258e21c2b4eca4887171ebbc3'/>
<id>urn:sha1:43b6c312b8ff013258e21c2b4eca4887171ebbc3</id>
<content type='text'>
[ Upstream commit beb7021a6003d9c6a463fffca0d6311efb8e0e66 ]

Ritesh reported in [0] that rqspinlock frequently timed out even on
reentrant acquisition of the same lock on the same CPU, a case that
should be caught by AA deadlock detection. This patch closes one of the
races leading to this behavior, and reduces the frequency of timeouts.

We currently have a tiny window between the fast-path cmpxchg and the
grabbing of the lock entry where an NMI could land, attempt the same
lock that was just acquired, and end up timing out. This is not ideal.
Instead, move the lock entry acquisition from the fast path to before
the cmpxchg, and remove the grabbing of the lock entry in the slow path,
assuming it was already taken by the fast path. The TAS fallback is
invoked directly without being preceded by the typical fast path,
therefore we must continue to grab the deadlock detection entry in that
case.

Case on lock leading to missed AA:

cmpxchg lock A
&lt;NMI&gt;
... rqspinlock acquisition of A
... timeout
&lt;/NMI&gt;
grab_held_lock_entry(A)

There is a similar case when unlocking the lock. If the NMI lands
between the WRITE_ONCE and smp_store_release, it is possible that we end
up in a situation where the NMI fails to diagnose the AA condition,
leading to a timeout.

Case on unlock leading to missed AA:

WRITE_ONCE(rqh-&gt;locks[rqh-&gt;cnt - 1], NULL)
&lt;NMI&gt;
... rqspinlock acquisition of A
... timeout
&lt;/NMI&gt;
smp_store_release(A-&gt;locked, 0)

The patch changes the order on unlock to smp_store_release() followed
by WRITE_ONCE() of NULL. This avoids the missed AA detection described
above, but may lead to a false positive if the NMI lands between these
two statements, which is acceptable (and preferred over a timeout).
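
In code terms, the new unlock order described above is roughly (sketched
from the cases shown in this message, with the surrounding context
omitted):

        /* Release the lock first, then clear the held-locks table entry.
         * An NMI landing in between now still sees the entry and may report
         * a false-positive AA, which is preferred over a 250 ms timeout. */
        smp_store_release(&amp;lock-&gt;locked, 0);
        WRITE_ONCE(rqh-&gt;locks[rqh-&gt;cnt - 1], NULL);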

The original intention of the reverse order on unlock was to prevent the
following possible misdiagnosis of an ABBA scenario:

grab entry A
lock A
grab entry B
lock B
unlock B
   smp_store_release(B-&gt;locked, 0)
							grab entry B
							lock B
							grab entry A
							lock A
							! &lt;detect ABBA&gt;
   WRITE_ONCE(rqh-&gt;locks[rqh-&gt;cnt - 1], NULL)

If the store release were after the WRITE_ONCE, the other CPU would
not observe B in the table of the CPU unlocking the lock B.  However,
since the threads are obviously participating in an ABBA deadlock, it
is no longer appealing to use the order above since it may lead to a
250 ms timeout due to missed AA detection.

  [0]: https://lore.kernel.org/bpf/CAH6OuBTjG+N=+GGwcpOUbeDN563oz4iVcU3rbse68egp9wj9_A@mail.gmail.com

Fixes: 0d80e7f951be ("rqspinlock: Choose trylock fallback for NMI waiters")
Reported-by: Ritesh Oedayrajsingh Varma &lt;ritesh@superluminal.eu&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20251128232802.1031906-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix exclusive map memory leak</title>
<updated>2025-12-18T13:03:22Z</updated>
<author>
<name>Edward Adam Davis</name>
<email>eadavis@qq.com</email>
</author>
<published>2025-11-16T14:58:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f0022551745d72fc0e7bc8601234d690dee2178d'/>
<id>urn:sha1:f0022551745d72fc0e7bc8601234d690dee2178d</id>
<content type='text'>
[ Upstream commit 688b745401ab16e2e1a3b504863f0a45fd345638 ]

When excl_prog_hash is 0 and excl_prog_hash_size is non-zero, the map also
needs to be freed. Otherwise, the map memory will not be reclaimed, just
like the memory leak problem reported by syzbot [1].

syzbot reported:
BUG: memory leak
  backtrace (crc 7b9fb9b4):
    map_create+0x322/0x11e0 kernel/bpf/syscall.c:1512
    __sys_bpf+0x3556/0x3610 kernel/bpf/syscall.c:6131
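
A minimal sketch of the intended handling in map_create() (the attribute
field names follow this message; the error label is an assumption, not
the exact diff):

        if (!attr-&gt;excl_prog_hash &amp;&amp; attr-&gt;excl_prog_hash_size) {
                err = -EINVAL;
                /* Free the already-created map instead of leaking it. */
                goto free_map;
        }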

Fixes: baefdbdf6812 ("bpf: Implement exclusive map creation")
Reported-by: syzbot+cf08c551fecea9fd1320@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cf08c551fecea9fd1320
Tested-by: syzbot+cf08c551fecea9fd1320@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis &lt;eadavis@qq.com&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/tencent_3F226F882CE56DCC94ACE90EED1ECCFC780A@qq.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: properly verify tail call behavior</title>
<updated>2025-12-18T13:03:11Z</updated>
<author>
<name>Martin Teichmann</name>
<email>martin.teichmann@xfel.eu</email>
</author>
<published>2025-11-19T16:03:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a6c22c91b6284e4713c0cf5b20ab3c9c71b77ae8'/>
<id>urn:sha1:a6c22c91b6284e4713c0cf5b20ab3c9c71b77ae8</id>
<content type='text'>
[ Upstream commit e3245f8990431950d20631c72236d4e8cb2dcde8 ]

A successful ebpf tail call does not return to the caller, but to the
caller-of-the-caller, often just finishing the ebpf program altogether.

Any restrictions that the verifier needs to take into account - notably
the fact that the tail call might have modified packet pointers - are to
be checked on the caller-of-the-caller. Checking it on the caller made
the verifier refuse perfectly fine programs that would use the packet
pointers after a tail call, which is no problem as this code is only
executed if the tail call was unsuccessful, i.e. nothing happened.
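
For illustration, a previously rejected but valid pattern might look
like the following (a hedged example, not taken from the patch; the map,
section and function names are arbitrary):

#include &lt;linux/bpf.h&gt;
#include &lt;bpf/bpf_helpers.h&gt;

struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u32);
} jmp_table SEC(".maps");

SEC("tc")
int use_pkt_after_tail_call(struct __sk_buff *skb)
{
        void *data = (void *)(long)skb-&gt;data;
        void *data_end = (void *)(long)skb-&gt;data_end;

        /* Validate packet pointers before the tail call. */
        if (data + 14 &gt; data_end)
                return 0;

        bpf_tail_call(skb, &amp;jmp_table, 0);

        /* Only reached when the tail call failed, so the packet pointers
         * validated above are still fine to use here. */
        return *(__u8 *)data;
}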

This patch simulates the behavior of a tail call in the verifier. A
conditional jump to the code after the tail call is added for the case
of an unsuccessful tail call, and a return to the caller is simulated for
a successful tail call.

For the successful case we assume that the tail call returns an int,
as tail calls are currently only allowed in functions that return an
int. We always assume that the tail call modified the packet pointers,
as we do not know what the tail call did.

For the unsuccessful case we know nothing happened, so we do not need to
add new constraints.

This approach also allows checking other problems that may occur with
tail calls, namely we are now able to check that precision is properly
propagated into subprograms using tail calls, as well as checking the
live slots in such a subprogram.

Fixes: 1a4607ffba35 ("bpf: consider that tail calls invalidate packet pointers")
Link: https://lore.kernel.org/bpf/20251029105828.1488347-1-martin.teichmann@xfel.eu/
Signed-off-by: Martin Teichmann &lt;martin.teichmann@xfel.eu&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20251119160355.1160932-2-martin.teichmann@xfel.eu
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix invalid prog-&gt;stats access when update_effective_progs fails</title>
<updated>2025-12-18T13:03:04Z</updated>
<author>
<name>Pu Lehui</name>
<email>pulehui@huawei.com</email>
</author>
<published>2025-11-15T10:23:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2579c356ccd35d06238b176e4b460978186d804b'/>
<id>urn:sha1:2579c356ccd35d06238b176e4b460978186d804b</id>
<content type='text'>
[ Upstream commit 7dc211c1159d991db609bdf4b0fb9033c04adcbc ]

Syzkaller triggers an invalid memory access issue following fault
injection in update_effective_progs. The issue can be described as
follows:

__cgroup_bpf_detach
  update_effective_progs
    compute_effective_progs
      bpf_prog_array_alloc &lt;-- fault inject
  purge_effective_progs
    /* change to dummy_bpf_prog */
    array-&gt;items[index] = &amp;dummy_bpf_prog.prog

---softirq start---
__do_softirq
  ...
    __cgroup_bpf_run_filter_skb
      __bpf_prog_run_save_cb
        bpf_prog_run
          stats = this_cpu_ptr(prog-&gt;stats)
          /* invalid memory access */
          flags = u64_stats_update_begin_irqsave(&amp;stats-&gt;syncp)
---softirq end---

  static_branch_dec(&amp;cgroup_bpf_enabled_key[atype])

The reason is that fault injection caused update_effective_progs to fail
and then changed the original prog into dummy_bpf_prog.prog in
purge_effective_progs. Then a softirq came, and accessing the members of
dummy_bpf_prog.prog in the softirq triggers invalid mem access.

To fix it, skip updating stats when stats is NULL.
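
The idea, sketched against the stats-update path in bpf_prog_run() (an
abbreviated sketch, not the exact upstream hunk):

        /* dummy_bpf_prog has no per-CPU stats allocated, so skip the
         * update when prog-&gt;stats is NULL. */
        if (static_branch_unlikely(&amp;bpf_stats_enabled_key) &amp;&amp; prog-&gt;stats) {
                struct bpf_prog_stats *stats = this_cpu_ptr(prog-&gt;stats);
                unsigned long flags;

                flags = u64_stats_update_begin_irqsave(&amp;stats-&gt;syncp);
                u64_stats_inc(&amp;stats-&gt;cnt);
                u64_stats_update_end_irqrestore(&amp;stats-&gt;syncp, flags);
        }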

Fixes: 492ecee892c2 ("bpf: enable program stats")
Signed-off-by: Pu Lehui &lt;pulehui@huawei.com&gt;
Link: https://lore.kernel.org/r/20251115102343.2200727-1-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Handle return value of ftrace_set_filter_ip in register_fentry</title>
<updated>2025-12-18T13:03:01Z</updated>
<author>
<name>Menglong Dong</name>
<email>menglong8.dong@gmail.com</email>
</author>
<published>2025-11-10T12:07:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7547576c6209f510550ddcd342b6c52c85e1a612'/>
<id>urn:sha1:7547576c6209f510550ddcd342b6c52c85e1a612</id>
<content type='text'>
[ Upstream commit fea3f5e83c5cd80a76d97343023a2f2e6bd862bf ]

The error returned by ftrace_set_filter_ip() in register_fentry() is
not handled properly. Just fix it.
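
A sketch of the kind of handling implied, with the surrounding
register_fentry() context and flag values assumed rather than quoted:

        ret = ftrace_set_filter_ip(tr-&gt;fops, (unsigned long)ip, 0, 1);
        if (ret)
                return ret;   /* no longer silently ignored */
        ret = register_ftrace_direct(tr-&gt;fops, (unsigned long)new_addr);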

Fixes: 00963a2e75a8 ("bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)")
Signed-off-by: Menglong Dong &lt;dongml2@chinatelecom.cn&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20251110120705.1553694-1-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Prevent nesting overflow in bpf_try_get_buffers</title>
<updated>2025-12-18T13:03:01Z</updated>
<author>
<name>Sahil Chandna</name>
<email>chandna.sahil@gmail.com</email>
</author>
<published>2025-11-14T06:49:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4a9b0581757a97ed70805bd01b3c254890020be6'/>
<id>urn:sha1:4a9b0581757a97ed70805bd01b3c254890020be6</id>
<content type='text'>
[ Upstream commit c1da3df7191f1b4df9256bcd30d78f78201e1d17 ]

bpf_try_get_buffers() returns one of multiple per-CPU buffers based on a
per-CPU nesting counter. This mechanism expects that buffers are not
endlessly acquired before being returned. migrate_disable() ensures that a
task remains on the same CPU, but it does not prevent the task from being
preempted by another task on that CPU.

Without preemption disabled, a task may be preempted while holding a
buffer, allowing another task to run on the same CPU and acquire an
additional buffer. Several such preemptions can cause the per-CPU
nest counter to exceed MAX_BPRINTF_NEST_LEVEL and trigger the warning in
bpf_try_get_buffers(). Adding preempt_disable()/preempt_enable() around
buffer acquisition and release prevents this task preemption and
preserves the intended bounded nesting behavior.
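
Schematically (a sketch; the counter and limit identifiers follow
existing names in kernel/bpf/helpers.c, but the exact placement is
assumed):

        int nest_level;

        /* Keep preemption disabled while a per-CPU buffer is held so the
         * nesting depth stays bounded by MAX_BPRINTF_NEST_LEVEL. */
        preempt_disable();
        nest_level = this_cpu_inc_return(bpf_bprintf_nest_level);
        if (nest_level &gt; MAX_BPRINTF_NEST_LEVEL) {
                this_cpu_dec(bpf_bprintf_nest_level);
                preempt_enable();
                return -EBUSY;
        }
        /* ... hand out buffer nest_level - 1 ... */

        /* On release: */
        this_cpu_dec(bpf_bprintf_nest_level);
        preempt_enable();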

Reported-by: syzbot+b0cff308140f79a9c4cb@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68f6a4c8.050a0220.1be48.0011.GAE@google.com/
Fixes: 4223bf833c849 ("bpf: Remove preempt_disable in bpf_try_get_buffers")
Suggested-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Reviewed-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Sahil Chandna &lt;chandna.sahil@gmail.com&gt;
Link: https://lore.kernel.org/r/20251114064922.11650-1-chandna.sahil@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Free special fields when update [lru_,]percpu_hash maps</title>
<updated>2025-12-18T13:02:59Z</updated>
<author>
<name>Leon Hwang</name>
<email>leon.hwang@linux.dev</email>
</author>
<published>2025-11-05T15:14:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4a03d69cece145e4fb527464be29c3806aa3221e'/>
<id>urn:sha1:4a03d69cece145e4fb527464be29c3806aa3221e</id>
<content type='text'>
[ Upstream commit 6af6e49a76c9af7d42eb923703e7648cb2bf401a ]

As [lru_,]percpu_hash maps support BPF_KPTR_{REF,PERCPU}, missing
calls to 'bpf_obj_free_fields()' in 'pcpu_copy_value()' could cause the
memory referenced by BPF_KPTR_{REF,PERCPU} fields to be held until the
map gets freed.

Fix this by calling 'bpf_obj_free_fields()' after
'copy_map_value[,_long]()' in 'pcpu_copy_value()'.
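
As described, the change amounts to roughly the following inside
pcpu_copy_value(), with the per-CPU pointer handling around it
abbreviated:

        copy_map_value(&amp;htab-&gt;map, this_cpu_ptr(pptr), value);
        /* Release kptr/special fields so the referenced objects are not
         * held until the map itself is freed. */
        bpf_obj_free_fields(htab-&gt;map.record, this_cpu_ptr(pptr));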

Fixes: 65334e64a493 ("bpf: Support kptrs in percpu hashmap and percpu LRU hashmap")
Signed-off-by: Leon Hwang &lt;leon.hwang@linux.dev&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20251105151407.12723-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix stackmap overflow check in __bpf_get_stackid()</title>
<updated>2025-12-18T13:02:42Z</updated>
<author>
<name>Arnaud Lecomte</name>
<email>contact@arnaud-lcm.com</email>
</author>
<published>2025-10-25T19:29:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4669a8db976c8cbd5427fe9945f12c5fa5168ff3'/>
<id>urn:sha1:4669a8db976c8cbd5427fe9945f12c5fa5168ff3</id>
<content type='text'>
[ Upstream commit 23f852daa4bab4d579110e034e4d513f7d490846 ]

Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid()
when copying stack trace data. The issue occurs when the perf trace
contains more stack entries than the stack map bucket can hold, leading
to an out-of-bounds write in the bucket's data array.
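
The general shape of the fix is to bound the copy by the bucket
capacity, roughly as below (a sketch only; the identifiers are
assumptions, not the upstream diff):

        u32 trace_nr, max_depth;

        /* Never copy more entries than the stack map bucket can hold. */
        max_depth = smap-&gt;map.value_size / stack_map_data_size(&amp;smap-&gt;map);
        trace_nr = min_t(u32, trace-&gt;nr - skip, max_depth);
        memcpy(new_bucket-&gt;data, trace-&gt;ip + skip, trace_nr * sizeof(u64));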

Fixes: ee2a098851bf ("bpf: Adjust BPF stack helper functions to accommodate skip &gt; 0")
Reported-by: syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com
Signed-off-by: Arnaud Lecomte &lt;contact@arnaud-lcm.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20251025192941.1500-1-contact@arnaud-lcm.com
Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
