<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/bpf/log.c, branch v6.9.2</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.9.2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.9.2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-03-11T22:37:24Z</updated>
<entry>
<title>bpf: Recognize addr_space_cast instruction in the verifier.</title>
<updated>2024-03-11T22:37:24Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2024-03-08T01:08:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6082b6c328b5486da2b356eae94b8b83c98b5565'/>
<id>urn:sha1:6082b6c328b5486da2b356eae94b8b83c98b5565</id>
<content type='text'>
rY = addr_space_cast(rX, 0, 1) tells the verifier that rY-&gt;type = PTR_TO_ARENA.
Any further operations on PTR_TO_ARENA register have to be in 32-bit domain.

The verifier will mark load/store through PTR_TO_ARENA with PROBE_MEM32.
JIT will generate them as kern_vm_start + 32bit_addr memory accesses.

rY = addr_space_cast(rX, 1, 0) tells the verifier that rY-&gt;type = unknown scalar.
If arena-&gt;map_flags has BPF_F_NO_USER_CONV set then convert cast_user to mov32 as well.
Otherwise JIT will convert it to:
  rY = (u32)rX;
  if (rY)
     rY |= arena-&gt;user_vm_start &amp; ~(u64)~0U;

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20240308010812.89848-6-alexei.starovoitov@gmail.com
</content>
</entry>
<entry>
<title>bpf: improve duplicate source code line detection</title>
<updated>2024-02-15T21:00:48Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2024-02-14T17:41:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=57354f5fdee8017783b5cc2e53b263641b6862e9'/>
<id>urn:sha1:57354f5fdee8017783b5cc2e53b263641b6862e9</id>
<content type='text'>
Verifier log avoids printing the same source code line multiple times
when a consecutive block of BPF assembly instructions are covered by the
same original (C) source code line. This greatly improves verifier log
legibility.

Unfortunately, this check is imperfect and in production applications it
quite often happens that verifier log will have multiple duplicated
source lines emitted, for no apparently good reason. E.g., this is
excerpt from a real-world BPF application (with register states omitted
for clarity):

BEFORE
======
; for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5369: (07) r8 += 2                    ;
5370: (07) r7 += 16                   ;
; for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5371: (07) r9 += 1                    ;
5372: (79) r4 = *(u64 *)(r10 -32)     ;
; for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5373: (55) if r9 != 0xf goto pc+2
; if (i &gt;= map-&gt;cnt) @ strobemeta_probe.bpf.c:396
5376: (79) r1 = *(u64 *)(r10 -40)     ;
5377: (79) r1 = *(u64 *)(r1 +8)       ;
; if (i &gt;= map-&gt;cnt) @ strobemeta_probe.bpf.c:396
5378: (dd) if r1 s&lt;= r9 goto pc-5     ;
; descr-&gt;key_lens[i] = 0; @ strobemeta_probe.bpf.c:398
5379: (b4) w1 = 0                     ;
5380: (6b) *(u16 *)(r8 -30) = r1      ;
; task, data, off, STROBE_MAX_STR_LEN, map-&gt;entries[i].key); @ strobemeta_probe.bpf.c:400
5381: (79) r3 = *(u64 *)(r7 -8)       ;
5382: (7b) *(u64 *)(r10 -24) = r6     ;
; task, data, off, STROBE_MAX_STR_LEN, map-&gt;entries[i].key); @ strobemeta_probe.bpf.c:400
5383: (bc) w6 = w6                    ;
; barrier_var(payload_off); @ strobemeta_probe.bpf.c:280
5384: (bf) r2 = r6                    ;
5385: (bf) r1 = r4                    ;

As can be seen, line 394 is emitted thrice, 396 is emitted twice, and
line 400 is duplicated as well. Note that there are no intermingling
other lines of source code in between these duplicates, so the issue is
not compiler reordering assembly instruction such that multiple original
source code lines are in effect.

It becomes more obvious what's going on if we look at *full* original line info
information (using btfdump for this, [0]):

  #2764: line: insn #5363 --&gt; 394:3 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) {
  #2765: line: insn #5373 --&gt; 394:21 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) {
  #2766: line: insn #5375 --&gt; 394:47 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) {
  #2767: line: insn #5377 --&gt; 394:3 @ ./././strobemeta_probe.bpf.c
            for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) {
  #2768: line: insn #5378 --&gt; 414:10 @ ./././strobemeta_probe.bpf.c
            return off;

We can see that there are four line info records covering
instructions #5363 through #5377 (instruction indices are shifted due to
subprog instruction being appended to main program), all of them are
pointing to the same C source code line #394. But each of them points to
a different part of that line, which is denoted by differing column
numbers (3, 21, 47, 3).

But verifier log doesn't distinguish between parts of the same source code line
and doesn't emit this column number information, so for end user it's just a
repetitive visual noise. So let's improve the detection of repeated source code
line and avoid this.

With the changes in this patch, we get this output for the same piece of BPF
program log:

AFTER
=====
; for (int i = 0; i &lt; STROBE_MAX_MAP_ENTRIES; ++i) { @ strobemeta_probe.bpf.c:394
5369: (07) r8 += 2                    ;
5370: (07) r7 += 16                   ;
5371: (07) r9 += 1                    ;
5372: (79) r4 = *(u64 *)(r10 -32)     ;
5373: (55) if r9 != 0xf goto pc+2
; if (i &gt;= map-&gt;cnt) @ strobemeta_probe.bpf.c:396
5376: (79) r1 = *(u64 *)(r10 -40)     ;
5377: (79) r1 = *(u64 *)(r1 +8)       ;
5378: (dd) if r1 s&lt;= r9 goto pc-5     ;
; descr-&gt;key_lens[i] = 0; @ strobemeta_probe.bpf.c:398
5379: (b4) w1 = 0                     ;
5380: (6b) *(u16 *)(r8 -30) = r1      ;
; task, data, off, STROBE_MAX_STR_LEN, map-&gt;entries[i].key); @ strobemeta_probe.bpf.c:400
5381: (79) r3 = *(u64 *)(r7 -8)       ;
5382: (7b) *(u64 *)(r10 -24) = r6     ;
5383: (bc) w6 = w6                    ;
; barrier_var(payload_off); @ strobemeta_probe.bpf.c:280
5384: (bf) r2 = r6                    ;
5385: (bf) r1 = r4                    ;

All the duplication is gone and the log is cleaner and less distracting.

  [0] https://github.com/anakryiko/btfdump

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20240214174100.2847419-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Use O(log(N)) binary search to find line info record</title>
<updated>2024-02-14T22:53:42Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2024-02-14T00:23:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a4561f5afef8a8ff25a2cfd46d587f65869494f2'/>
<id>urn:sha1:a4561f5afef8a8ff25a2cfd46d587f65869494f2</id>
<content type='text'>
Real-world BPF applications keep growing in size. Medium-sized production
application can easily have 50K+ verified instructions, and its line
info section in .BTF.ext has more than 3K entries.

When verifier emits log with log_level&gt;=1, it annotates assembly code
with matched original C source code. Currently it uses linear search
over line info records to find a match. As complexity of BPF
applications grows, this O(K * N) approach scales poorly.

So, let's instead of linear O(N) search for line info record use faster
equivalent O(log(N)) binary search algorithm. It's not a plain binary
search, as we don't look for exact match. It's an upper bound search
variant, looking for rightmost line info record that starts at or before
given insn_off.

Some unscientific measurements were done before and after this change.
They were done in VM and fluctuate a bit, but overall the speed up is
undeniable.

BASELINE
========
File                              Program           Duration (us)   Insns
--------------------------------  ----------------  -------------  ------
katran.bpf.o                      balancer_ingress        2497130  343552
pyperf600.bpf.linked3.o           on_event               12389611  627288
strobelight_pyperf_libbpf.o       on_py_event              387399   52445
--------------------------------  ----------------  -------------  ------

BINARY SEARCH
=============

File                              Program           Duration (us)   Insns
--------------------------------  ----------------  -------------  ------
katran.bpf.o                      balancer_ingress        2339312  343552
pyperf600.bpf.linked3.o           on_event                5602203  627288
strobelight_pyperf_libbpf.o       on_py_event              294761   52445
--------------------------------  ----------------  -------------  ------

While Katran's speed up is pretty modest (about 105ms, or 6%), for
production pyperf BPF program (on_py_event) it's much greater already,
going from 387ms down to 295ms (23% improvement).

Looking at BPF selftests's biggest pyperf example, we can see even more
dramatic improvement, shaving more than 50% of time, going from 12.3s
down to 5.6s.

Different amount of improvement is the function of overall amount of BPF
assembly instructions in .bpf.o files (which contributes to how much
line info records there will be and thus, on average, how much time linear
search will take), among other things:

$ llvm-objdump -d katran.bpf.o | wc -l
3863
$ llvm-objdump -d strobelight_pyperf_libbpf.o | wc -l
6997
$ llvm-objdump -d pyperf600.bpf.linked3.o | wc -l
87854

Granted, this only applies to debugging cases (e.g., using veristat, or
failing verification in production), but seems worth doing to improve
overall developer experience anyways.

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20240214002311.2197116-1-andrii@kernel.org
</content>
</entry>
<entry>
<title>bpf: emit source code file name and line number in verifier log</title>
<updated>2024-02-14T02:51:32Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2024-02-12T23:59:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7cc13adbd057f1905564ec2a254883d7fd407deb'/>
<id>urn:sha1:7cc13adbd057f1905564ec2a254883d7fd407deb</id>
<content type='text'>
As BPF applications grow in size and complexity and are separated into
multiple .bpf.c files that are statically linked together, it becomes
harder and harder to match verifier's BPF assembly level output to
original C code. While often annotated C source code is unique enough to
be able to identify the file it belongs to, quite often this is actually
problematic as parts of source code can be quite generic.

Long story short, it is very useful to see source code file name and
line number information along with the original C code. Verifier already
knows this information, we just need to output it.

This patch extends verifier log with file name and line number
information, emitted next to original (presumably C) source code,
annotating BPF assembly output, like so:

  ; &lt;original C code&gt; @ &lt;filename&gt;.bpf.c:&lt;line&gt;

If file name has directory names in it, they are stripped away. This
should be fine in practice as file names tend to be pretty unique with
C code anyways, and keeping log size smaller is always good.

In practice this might look something like below, where some code is
coming from application files, while others are from libbpf's usdt.bpf.h
header file:

  ; if (STROBEMETA_READ( @ strobemeta_probe.bpf.c:534
  5592: (79) r1 = *(u64 *)(r10 -56)     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0
  5593: (7b) *(u64 *)(r10 -56) = r1     ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0
  5594: (79) r3 = *(u64 *)(r10 -8)      ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm

  ...

  170: (71) r1 = *(u8 *)(r8 +15)        ; frame1: R1_w=scalar(...) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208)
  171: (67) r1 &lt;&lt;= 56                   ; frame1: R1_w=scalar(...)
  172: (c7) r1 s&gt;&gt;= 56                  ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127)
  ; val &lt;&lt;= arg_spec-&gt;arg_bitshift; @ usdt.bpf.h:183
  173: (67) r1 &lt;&lt;= 32                   ; frame1: R1_w=scalar(...)
  174: (77) r1 &gt;&gt;= 32                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  175: (79) r2 = *(u64 *)(r10 -8)       ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm
  176: (6f) r2 &lt;&lt;= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar()
  177: (7b) *(u64 *)(r10 -8) = r2       ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61)
  ; if (arg_spec-&gt;arg_signed) @ usdt.bpf.h:184
  178: (bf) r3 = r2                     ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61)
  179: (7f) r3 &gt;&gt;= r1                   ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar()
  ; if (arg_spec-&gt;arg_signed) @ usdt.bpf.h:184
  180: (71) r4 = *(u8 *)(r8 +14)
  181: safe

log_fixup tests needed a minor adjustment as verifier log output
increased a bit and that test is quite sensitive to such changes.

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20240212235944.2816107-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: emit more dynptr information in verifier log</title>
<updated>2023-12-12T03:21:22Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2023-12-04T23:39:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=22b769bb4f87060774bfdd6facbab438ed3b8453'/>
<id>urn:sha1:22b769bb4f87060774bfdd6facbab438ed3b8453</id>
<content type='text'>
Emit dynptr type for CONST_PTR_TO_DYNPTR register. Also emit id,
ref_obj_id, and dynptr_id fields for STACK_DYNPTR stack slots.

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20231204233931.49758-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: log PTR_TO_MEM memory size in verifier log</title>
<updated>2023-12-12T03:21:22Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2023-12-04T23:39:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1e68485d8299860e68c4e1d29589ff0d20db0287'/>
<id>urn:sha1:1e68485d8299860e68c4e1d29589ff0d20db0287</id>
<content type='text'>
Emit valid memory size addressable through PTR_TO_MEM register.

Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Link: https://lore.kernel.org/r/20231204233931.49758-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: simplify tnum output if a fully known constant</title>
<updated>2023-12-02T19:36:51Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2023-12-02T17:57:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=81eff2e36481c5cf4a2ac906ae56c3fbd3e6f305'/>
<id>urn:sha1:81eff2e36481c5cf4a2ac906ae56c3fbd3e6f305</id>
<content type='text'>
Emit tnum representation as just a constant if all bits are known.
Use decimal-vs-hex logic to determine exact format of emitted
constant value, just like it's done for register range values.
For that move tnum_strn() to kernel/bpf/log.c to reuse decimal-vs-hex
determination logic and constants.

Acked-by: Shung-Hsi Yu &lt;shung-hsi.yu@suse.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20231202175705.885270-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: emit frameno for PTR_TO_STACK regs if it differs from current one</title>
<updated>2023-11-18T19:39:59Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2023-11-18T03:46:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=46862ee854b4f5a315d63b677ca3af14a89aefeb'/>
<id>urn:sha1:46862ee854b4f5a315d63b677ca3af14a89aefeb</id>
<content type='text'>
It's possible to pass a pointer to parent's stack to child subprogs. In
such case verifier state output is ambiguous not showing whether
register container a pointer to "current" stack, belonging to current
subprog (frame), or it's actually a pointer to one of parent frames.

So emit this information if frame number differs between the state which
register is part of. E.g., if current state is in frame 2 and it has
a register pointing to stack in grand parent state (frame #0), we'll see
something like 'R1=fp[0]-16', while "local stack pointer" will be just
'R2=fp-16'.

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20231118034623.3320920-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: smarter verifier log number printing logic</title>
<updated>2023-11-18T19:39:59Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2023-11-18T03:46:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0f8dbdbc641b45a5fa31d497f9fc83ffe1174fa3'/>
<id>urn:sha1:0f8dbdbc641b45a5fa31d497f9fc83ffe1174fa3</id>
<content type='text'>
Instead of always printing numbers as either decimals (and in some
cases, like for "imm=%llx", in hexadecimals), decide the form based on
actual values. For numbers in a reasonably small range (currently,
[0, U16_MAX] for unsigned values, and [S16_MIN, S16_MAX] for signed ones),
emit them as decimals. In all other cases, even for signed values,
emit them in hexadecimals.

For large values hex form is often times way more useful: it's easier to
see an exact difference between 0xffffffff80000000 and 0xffffffff7fffffff,
than between 18446744071562067966 and 18446744071562067967, as one
particular example.

Small values representing small pointer offsets or application
constants, on the other hand, are way more useful to be represented in
decimal notation.

Adjust reg_bounds register state parsing logic to take into account this
change.

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20231118034623.3320920-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: omit default off=0 and imm=0 in register state log</title>
<updated>2023-11-18T19:39:59Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andrii@kernel.org</email>
</author>
<published>2023-11-18T03:46:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1db747d75b1dbe17bf4283ed87bd3b7a92010f34'/>
<id>urn:sha1:1db747d75b1dbe17bf4283ed87bd3b7a92010f34</id>
<content type='text'>
Simplify BPF verifier log further by omitting default (and frequently
irrelevant) off=0 and imm=0 parts for non-SCALAR_VALUE registers. As can
be seen from fixed tests, this is often a visual noise for PTR_TO_CTX
register and even for PTR_TO_PACKET registers.

Omitting default values follows the rest of register state logic: we
omit default values to keep verifier log succinct and to highlight
interesting state that deviates from default one. E.g., we do the same
for var_off, when it's unknown, which gives no additional information.

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/r/20231118034623.3320920-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
</feed>
