<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/bpf_verifier.h, branch v6.14.8</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.14.8</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.14.8'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-12-16T16:53:59Z</updated>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf</title>
<updated>2024-12-16T16:53:59Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2024-12-16T16:52:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=06103dccbbd29408255a409f6f98f7f02387dc93'/>
<id>urn:sha1:06103dccbbd29408255a409f6f98f7f02387dc93</id>
<content type='text'>
Cross-merge bpf fixes after downstream PR.

No conflicts.

Adjacent changes in:
Auto-merging include/linux/bpf.h
Auto-merging include/linux/bpf_verifier.h
Auto-merging kernel/bpf/btf.c
Auto-merging kernel/bpf/verifier.c
Auto-merging kernel/trace/bpf_trace.c
Auto-merging tools/testing/selftests/bpf/progs/test_tp_btf_nullable.c

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: track changes_pkt_data property for global functions</title>
<updated>2024-12-10T18:24:57Z</updated>
<author>
<name>Eduard Zingerman</name>
<email>eddyz87@gmail.com</email>
</author>
<published>2024-12-10T04:10:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=51081a3f25c742da5a659d7fc6fd77ebfdd555be'/>
<id>urn:sha1:51081a3f25c742da5a659d7fc6fd77ebfdd555be</id>
<content type='text'>
When processing calls to certain helpers, verifier invalidates all
packet pointers in a current state. For example, consider the
following program:

    __attribute__((__noinline__))
    long skb_pull_data(struct __sk_buff *sk, __u32 len)
    {
        return bpf_skb_pull_data(sk, len);
    }

    SEC("tc")
    int test_invalidate_checks(struct __sk_buff *sk)
    {
        int *p = (void *)(long)sk-&gt;data;
        if ((void *)(p + 1) &gt; (void *)(long)sk-&gt;data_end) return TCX_DROP;
        skb_pull_data(sk, 0);
        *p = 42;
        return TCX_PASS;
    }

After a call to bpf_skb_pull_data() the pointer 'p' can't be used
safely. See function filter.c:bpf_helper_changes_pkt_data() for a list
of such helpers.

At the moment verifier invalidates packet pointers when processing
helper function calls, and does not traverse global sub-programs when
processing calls to global sub-programs. This means that calls to
helpers done from global sub-programs do not invalidate pointers in
the caller state. E.g. the program above is unsafe, but is not
rejected by verifier.

This commit fixes the omission by computing field
bpf_subprog_info-&gt;changes_pkt_data for each sub-program before main
verification pass.
changes_pkt_data should be set if:
- subprogram calls helper for which bpf_helper_changes_pkt_data
  returns true;
- subprogram calls a global function,
  for which bpf_subprog_info-&gt;changes_pkt_data should be set.

The verifier.c:check_cfg() pass is modified to compute this
information. The commit relies on depth first instruction traversal
done by check_cfg() and absence of recursive function calls:
- check_cfg() would eventually visit every call to subprogram S in a
  state when S is fully explored;
- when S is fully explored:
  - every direct helper call within S is explored
    (and thus changes_pkt_data is set if needed);
  - every call to subprogram S1 called by S was visited with S1 fully
    explored (and thus S inherits changes_pkt_data from S1).

The downside of such approach is that dead code elimination is not
taken into account: if a helper call inside global function is dead
because of current configuration, verifier would conservatively assume
that the call occurs for the purpose of the changes_pkt_data
computation.

Reported-by: Nick Zavaritsky &lt;mejedi@gmail.com&gt;
Closes: https://lore.kernel.org/bpf/0498CA22-5779-4767-9C0C-A9515CEA711F@gmail.com/
Signed-off-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20241210041100.1898468-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Introduce support for bpf_local_irq_{save,restore}</title>
<updated>2024-12-04T16:38:29Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2024-12-04T03:03:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c8e2ee1f3df05dc4caa746c062c6b5791c745c79'/>
<id>urn:sha1:c8e2ee1f3df05dc4caa746c062c6b5791c745c79</id>
<content type='text'>
Teach the verifier about IRQ-disabled sections through the introduction
of two new kfuncs, bpf_local_irq_save, to save IRQ state and disable
them, and bpf_local_irq_restore, to restore IRQ state and enable them
back again.

For the purposes of tracking the saved IRQ state, the verifier is taught
about a new special object on the stack of type STACK_IRQ_FLAG. This is
a 8 byte value which saves the IRQ flags which are to be passed back to
the IRQ restore kfunc.

Renumber the enums for REF_TYPE_* to simplify the check in
find_lock_state, filtering out non-lock types as they grow will become
cumbersome and is unecessary.

To track a dynamic number of IRQ-disabled regions and their associated
saved states, a new resource type RES_TYPE_IRQ is introduced, which its
state management functions: acquire_irq_state and release_irq_state,
taking advantage of the refactoring and clean ups made in earlier
commits.

One notable requirement of the kernel's IRQ save and restore API is that
they cannot happen out of order. For this purpose, when releasing reference
we keep track of the prev_id we saw with REF_TYPE_IRQ. Since reference
states are inserted in increasing order of the index, this is used to
remember the ordering of acquisitions of IRQ saved states, so that we
maintain a logical stack in acquisition order of resource identities,
and can enforce LIFO ordering when restoring IRQ state. The top of the
stack is maintained using bpf_verifier_state's active_irq_id.

To maintain the stack property when releasing reference states, we need
to modify release_reference_state to instead shift the remaining array
left using memmove instead of swapping deleted element with last that
might break the ordering. A selftest to test this subtle behavior is
added in late patches.

The logic to detect initialized and unitialized irq flag slots, marking
and unmarking is similar to how it's done for iterators. No additional
checks are needed in refsafe for REF_TYPE_IRQ, apart from the usual
check_id satisfiability check on the ref[i].id. We have to perform the
same check_ids check on state-&gt;active_irq_id as well.

To ensure we don't get assigned REF_TYPE_PTR by default after
acquire_reference_state, if someone forgets to assign the type, let's
also renumber the enum ref_state_type. This way any unassigned types
get caught by refsafe's default switch statement, don't assume
REF_TYPE_PTR by default.

The kfuncs themselves are plain wrappers over local_irq_save and
local_irq_restore macros.

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20241204030400.208005-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Consolidate locks and reference state in verifier state</title>
<updated>2024-12-04T16:38:29Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2024-12-04T03:03:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1995edc5f9089ecb8b77a34f21e4abd8f887b856'/>
<id>urn:sha1:1995edc5f9089ecb8b77a34f21e4abd8f887b856</id>
<content type='text'>
Currently, state for RCU read locks and preemption is in
bpf_verifier_state, while locks and pointer reference state remains in
bpf_func_state. There is no particular reason to keep the latter in
bpf_func_state. Additionally, it is copied into a new frame's state and
copied back to the caller frame's state everytime the verifier processes
a pseudo call instruction. This is a bit wasteful, given this state is
global for a given verification state / path.

Move all resource and reference related state in bpf_verifier_state
structure in this patch, in preparation for introducing new reference
state types in the future.

Since we switch print_verifier_state and friends to print using vstate,
we now need to explicitly pass in the verifier state from the caller
along with the bpf_func_state, so modify the prototype and callers to do
so. To ensure func state matches the verifier state when we're printing
data, take in frame number instead of bpf_func_state pointer instead and
avoid inconsistencies induced by the caller.

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20241204030400.208005-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: use common instruction history across all states</title>
<updated>2024-11-15T18:20:47Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2024-11-15T00:13:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=96a30e469ca1d2b8cc7811b40911f8614b558241'/>
<id>urn:sha1:96a30e469ca1d2b8cc7811b40911f8614b558241</id>
<content type='text'>
Instead of allocating and copying instruction history each time we
enqueue child verifier state, switch to a model where we use one common
dynamically sized array of instruction history entries across all states.

The key observation for proving this is correct is that instruction
history is only relevant while state is active, which means it either is
a current state (and thus we are actively modifying instruction history
and no other state can interfere with us) or we are checkpointed state
with some children still active (either enqueued or being current).

In the latter case our portion of instruction history is finalized and
won't change or grow, so as long as we keep it immutable until the state
is finalized, we are good.

Now, when state is finalized and is put into state hash for potentially
future pruning lookups, instruction history is not used anymore. This is
because instruction history is only used by precision marking logic, and
we never modify precision markings for finalized states.

So, instead of each state having its own small instruction history, we
keep a global dynamically-sized instruction history, where each state in
current DFS path from root to active state remembers its portion of
instruction history. Current state can append to this history, but
cannot modify any of its parent histories.

Async callback state enqueueing, while logically detached from parent
state, still is part of verification backtracking tree, so has to follow
the same schema as normal state checkpoints.

Because the insn_hist array can be grown through realloc, states don't
keep pointers, they instead maintain two indices, [start, end), into
global instruction history array. End is exclusive index, so
`start == end` means there is no relevant instruction history.

This eliminates a lot of allocations and minimizes overall memory usage.

For instance, running a worst-case test from [0] (but without the
heuristics-based fix [1]), it took 12.5 minutes until we get -ENOMEM.
With the changes in this patch the whole test succeeds in 10 minutes
(very slow, so heuristics from [1] is important, of course).

To further validate correctness, veristat-based comparison was performed for
Meta production BPF objects and BPF selftests objects. In both cases there
were no differences *at all* in terms of verdict or instruction and state
counts, providing a good confidence in the change.

Having this low-memory-overhead solution of keeping dynamic
per-instruction history cheaply opens up some new possibilities, like
keeping extra information for literally every single validated
instruction. This will be used for simplifying precision backpropagation
logic in follow up patches.

  [0] https://lore.kernel.org/bpf/20241029172641.1042523-2-eddyz87@gmail.com/
  [1] https://lore.kernel.org/bpf/20241029172641.1042523-1-eddyz87@gmail.com/

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20241115001303.277272-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Support private stack for struct_ops progs</title>
<updated>2024-11-13T00:26:25Z</updated>
<author>
<name>Yonghong Song</name>
<email>yonghong.song@linux.dev</email>
</author>
<published>2024-11-12T16:39:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5bd36da1e37e7a78e8b38efd287de6e1394b7d6e'/>
<id>urn:sha1:5bd36da1e37e7a78e8b38efd287de6e1394b7d6e</id>
<content type='text'>
For struct_ops progs, whether a particular prog uses private stack
depends on prog-&gt;aux-&gt;priv_stack_requested setting before actual
insn-level verification for that prog. One particular implementation
is to piggyback on struct_ops-&gt;check_member(). The next patch has
an example for this. The struct_ops-&gt;check_member() sets
prog-&gt;aux-&gt;priv_stack_requested to be true which enables private stack
usage.

The struct_ops prog follows the same rule as kprobe/tracing progs after
function bpf_enable_priv_stack(). For example, even a struct_ops prog
requests private stack, it could still use normal kernel stack if
the stack size is small (&lt; 64 bytes).

Similar to tracing progs, nested same cpu same prog run will be skipped.
A field (recursion_detected()) is added to bpf_prog_aux structure.
If bpf_prog-&gt;aux-&gt;recursion_detected is implemented by the struct_ops
subsystem and nested same cpu/prog happens, the function will be
triggered to report an error, collect related info, etc.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20241112163933.2224962-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Find eligible subprogs for private stack support</title>
<updated>2024-11-13T00:26:24Z</updated>
<author>
<name>Yonghong Song</name>
<email>yonghong.song@linux.dev</email>
</author>
<published>2024-11-12T16:39:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a76ab5731e32d50ff5b1ae97e9dc4b23f41c23f5'/>
<id>urn:sha1:a76ab5731e32d50ff5b1ae97e9dc4b23f41c23f5</id>
<content type='text'>
Private stack will be allocated with percpu allocator in jit time.
To avoid complexity at runtime, only one copy of private stack is
available per cpu per prog. So runtime recursion check is necessary
to avoid stack corruption.

Current private stack only supports kprobe/perf_event/tp/raw_tp
which has recursion check in the kernel, and prog types that use
bpf trampoline recursion check. For trampoline related prog types,
currently only tracing progs have recursion checking.

To avoid complexity, all async_cb subprogs use normal kernel stack
including those subprogs used by both main prog subtree and async_cb
subtree. Any prog having tail call also uses kernel stack.

To avoid jit penalty with private stack support, a subprog stack
size threshold is set such that only if the stack size is no less
than the threshold, private stack is supported. The current threshold
is 64 bytes. This avoids jit penality if the stack usage is small.

A useless 'continue' is also removed from a loop in func
check_max_stack_depth().

Signed-off-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20241112163907.2223839-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Drop special callback reference handling</title>
<updated>2024-11-11T16:18:55Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2024-11-09T23:14:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ae6e3a273f590a2b64f14a9fab3546c3a8f44ed4'/>
<id>urn:sha1:ae6e3a273f590a2b64f14a9fab3546c3a8f44ed4</id>
<content type='text'>
Logic to prevent callbacks from acquiring new references for the program
(i.e. leaving acquired references), and releasing caller references
(i.e. those acquired in parent frames) was introduced in commit
9d9d00ac29d0 ("bpf: Fix reference state management for synchronous callbacks").

This was necessary because back then, the verifier simulated each
callback once (that could potentially be executed N times, where N can
be zero). This meant that callbacks that left lingering resources or
cleared caller resources could do it more than once, operating on
undefined state or leaking memory.

With the fixes to callback verification in commit
ab5cfac139ab ("bpf: verify callbacks as if they are called unknown number of times"),
all of this extra logic is no longer necessary. Hence, drop it as part
of this commit.

Cc: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20241109231430.2475236-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Refactor active lock management</title>
<updated>2024-11-11T16:18:51Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2024-11-09T23:14:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f6b9a69a9e56b2083aca8a925fc1a28eb698e3ed'/>
<id>urn:sha1:f6b9a69a9e56b2083aca8a925fc1a28eb698e3ed</id>
<content type='text'>
When bpf_spin_lock was introduced originally, there was deliberation on
whether to use an array of lock IDs, but since bpf_spin_lock is limited
to holding a single lock at any given time, we've been using a single ID
to identify the held lock.

In preparation for introducing spin locks that can be taken multiple
times, introduce support for acquiring multiple lock IDs. For this
purpose, reuse the acquired_refs array and store both lock and pointer
references. We tag the entry with REF_TYPE_PTR or REF_TYPE_LOCK to
disambiguate and find the relevant entry. The ptr field is used to track
the map_ptr or btf (for bpf_obj_new allocations) to ensure locks can be
matched with protected fields within the same "allocation", i.e.
bpf_obj_new object or map value.

The struct active_lock is changed to an int as the state is part of the
acquired_refs array, and we only need active_lock as a cheap way of
detecting lock presence.

Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20241109231430.2475236-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: use type_may_be_null() helper for nullable-param check</title>
<updated>2024-09-05T20:29:06Z</updated>
<author>
<name>Shung-Hsi Yu</name>
<email>shung-hsi.yu@suse.com</email>
</author>
<published>2024-09-05T05:52:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1ae497c78f01855f3695b58481311ffdd429b028'/>
<id>urn:sha1:1ae497c78f01855f3695b58481311ffdd429b028</id>
<content type='text'>
Commit 980ca8ceeae6 ("bpf: check bpf_dummy_struct_ops program params for
test runs") does bitwise AND between reg_type and PTR_MAYBE_NULL, which
is correct, but due to type difference the compiler complains:

  net/bpf/bpf_dummy_struct_ops.c:118:31: warning: bitwise operation between different enumeration types ('const enum bpf_reg_type' and 'enum bpf_type_flag') [-Wenum-enum-conversion]
    118 |                 if (info &amp;&amp; (info-&gt;reg_type &amp; PTR_MAYBE_NULL))
        |                              ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~

Workaround the warning by moving the type_may_be_null() helper from
verifier.c into bpf_verifier.h, and reuse it here to check whether param
is nullable.

Fixes: 980ca8ceeae6 ("bpf: check bpf_dummy_struct_ops program params for test runs")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202404241956.HEiRYwWq-lkp@intel.com/
Signed-off-by: Shung-Hsi Yu &lt;shung-hsi.yu@suse.com&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20240905055233.70203-1-shung-hsi.yu@suse.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
</feed>
