<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/tools/include/uapi/linux/bpf.h, branch v5.7</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.7</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.7'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-04-25T00:01:26Z</updated>
<entry>
<title>bpf: Fix reStructuredText markup</title>
<updated>2020-04-25T00:01:26Z</updated>
<author>
<name>Jakub Wilk</name>
<email>jwilk@jwilk.net</email>
</author>
<published>2020-04-22T08:23:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a33d3147945543f9ded67a052f358a75595f1ecb'/>
<id>urn:sha1:a33d3147945543f9ded67a052f358a75595f1ecb</id>
<content type='text'>
The patch fixes:
$ scripts/bpf_helpers_doc.py &gt; bpf-helpers.rst
$ rst2man bpf-helpers.rst &gt; bpf-helpers.7
bpf-helpers.rst:1105: (WARNING/2) Inline strong start-string without end-string.

Signed-off-by: Jakub Wilk &lt;jwilk@jwilk.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Reviewed-by: Quentin Monnet &lt;quentin@isovalent.com&gt;
Link: https://lore.kernel.org/bpf/20200422082324.2030-1-jwilk@jwilk.net
</content>
</entry>
<entry>
<title>libbpf: Add support for bpf_link-based cgroup attachment</title>
<updated>2020-03-31T00:36:41Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andriin@fb.com</email>
</author>
<published>2020-03-30T03:00:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cc4f864bb118e0ae7bf9f4e3418eaeb083aa34f2'/>
<id>urn:sha1:cc4f864bb118e0ae7bf9f4e3418eaeb083aa34f2</id>
<content type='text'>
Add bpf_program__attach_cgroup(), which uses BPF_LINK_CREATE subcommand to
create an FD-based kernel bpf_link. Also add low-level bpf_link_create() API.

If expected_attach_type is not specified explicitly with
bpf_program__set_expected_attach_type(), libbpf will try to determine proper
attach type from BPF program's section definition.

Also add support for bpf_link's underlying BPF program replacement:
  - unconditional through high-level bpf_link__update_program() API;
  - cmpxchg-like with specifying expected current BPF program through
    low-level bpf_link_update() API.

Signed-off-by: Andrii Nakryiko &lt;andriin@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20200330030001.2312810-4-andriin@fb.com
</content>
</entry>
<entry>
<title>bpf: Implement bpf_link-based cgroup BPF program attachment</title>
<updated>2020-03-31T00:35:59Z</updated>
<author>
<name>Andrii Nakryiko</name>
<email>andriin@fb.com</email>
</author>
<published>2020-03-30T02:59:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=af6eea57437a830293eab56246b6025cc7d46ee7'/>
<id>urn:sha1:af6eea57437a830293eab56246b6025cc7d46ee7</id>
<content type='text'>
Implement new sub-command to attach cgroup BPF programs and return FD-based
bpf_link back on success. bpf_link, once attached to cgroup, cannot be
replaced, except by owner having its FD. Cgroup bpf_link supports only
BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI
attachments can be freely intermixed.

To prevent bpf_cgroup_link from keeping cgroup alive past the point when no
BPF program can be executed, implement auto-detachment of link. When
cgroup_bpf_release() is called, all attached bpf_links are forced to release
cgroup refcounts, but they leave bpf_link otherwise active and allocated, as
well as still owning underlying bpf_prog. This is because user-space might
still have FDs open and active, so bpf_link as a user-referenced object can't
be freed yet. Once last active FD is closed, bpf_link will be freed and
underlying bpf_prog refcount will be dropped. But cgroup refcount won't be
touched, because cgroup is released already.

The inherent race between bpf_cgroup_link release (from closing last FD) and
cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So
the only additional check required is when bpf_cgroup_link attempts to detach
itself from cgroup. At that time we need to check whether there is still
cgroup associated with that link. And if not, exit with success, because
bpf_cgroup_link was already successfully detached.

Signed-off-by: Andrii Nakryiko &lt;andriin@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com
</content>
</entry>
<entry>
<title>bpf: Add socket assign support</title>
<updated>2020-03-30T20:45:04Z</updated>
<author>
<name>Joe Stringer</name>
<email>joe@wand.net.nz</email>
</author>
<published>2020-03-29T22:53:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cf7fbe660f2dbd738ab58aea8e9b0ca6ad232449'/>
<id>urn:sha1:cf7fbe660f2dbd738ab58aea8e9b0ca6ad232449</id>
<content type='text'>
Add support for TPROXY via a new bpf helper, bpf_sk_assign().

This helper requires the BPF program to discover the socket via a call
to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
helper takes its own reference to the socket in addition to any existing
reference that may or may not currently be obtained for the duration of
BPF processing. For the destination socket to receive the traffic, the
traffic must be routed towards that socket via local route. The
simplest example route is below, but in practice you may want to route
traffic more narrowly (eg by CIDR):

  $ ip route add local default dev lo

This patch avoids trying to introduce an extra bit into the skb-&gt;sk, as
that would require more invasive changes to all code interacting with
the socket to ensure that the bit is handled correctly, such as all
error-handling cases along the path from the helper in BPF through to
the orphan path in the input. Instead, we opt to use the destructor
variable to switch on the prefetch of the socket.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz
</content>
</entry>
<entry>
<title>bpf: Introduce BPF_PROG_TYPE_LSM</title>
<updated>2020-03-29T23:34:00Z</updated>
<author>
<name>KP Singh</name>
<email>kpsingh@google.com</email>
</author>
<published>2020-03-29T00:43:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fc611f47f2188ade2b48ff6902d5cce8baac0c58'/>
<id>urn:sha1:fc611f47f2188ade2b48ff6902d5cce8baac0c58</id>
<content type='text'>
Introduce types and configs for bpf programs that can be attached to
LSM hooks. The programs can be enabled by the config option
CONFIG_BPF_LSM.

Signed-off-by: KP Singh &lt;kpsingh@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Brendan Jackman &lt;jackmanb@google.com&gt;
Reviewed-by: Florent Revest &lt;revest@google.com&gt;
Reviewed-by: Thomas Garnier &lt;thgarnie@google.com&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Acked-by: Andrii Nakryiko &lt;andriin@fb.com&gt;
Acked-by: James Morris &lt;jamorris@linux.microsoft.com&gt;
Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
</content>
</entry>
<entry>
<title>bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id</title>
<updated>2020-03-28T02:40:39Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2020-03-27T15:58:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0f09abd105da6c37713d2b253730a86cb45e127a'/>
<id>urn:sha1:0f09abd105da6c37713d2b253730a86cb45e127a</id>
<content type='text'>
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(),
recvmsg() and bind-related hooks in order to retrieve the cgroup v2
context which can then be used as part of the key for BPF map lookups,
for example. Given these hooks operate in process context 'current' is
always valid and pointing to the app that is performing mentioned
syscalls if it's subject to a v2 cgroup. Also with same motivation of
commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper")
enable retrieval of ancestor from current so the cgroup id can be used
for policy lookups which can then forbid connect() / bind(), for example.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
</content>
</entry>
<entry>
<title>bpf: Add netns cookie and enable it for bpf cgroup hooks</title>
<updated>2020-03-28T02:40:38Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2020-03-27T15:58:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f318903c0bf42448b4c884732df2bbb0ef7a2284'/>
<id>urn:sha1:f318903c0bf42448b4c884732df2bbb0ef7a2284</id>
<content type='text'>
In Cilium we're mainly using BPF cgroup hooks today in order to implement
kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*),
ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic
between Cilium managed nodes. While this works in its current shape and avoids
packet-level NAT for inter Cilium managed node traffic, there is one major
limitation we're facing today, that is, lack of netns awareness.

In Kubernetes, the concept of Pods (which hold one or multiple containers)
has been built around network namespaces, so while we can use the global scope
of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing
NodePort ports on loopback addresses), we also have the need to differentiate
between initial network namespaces and non-initial one. For example, ExternalIP
services mandate that non-local service IPs are not to be translated from the
host (initial) network namespace as one example. Right now, we have an ugly
work-around in place where non-local service IPs for ExternalIP services are
not xlated from connect() and friends BPF hooks but instead via less efficient
packet-level NAT on the veth tc ingress hook for Pod traffic.

On top of determining whether we're in initial or non-initial network namespace
we also have a need for a socket-cookie like mechanism for network namespaces
scope. Socket cookies have the nice property that they can be combined as part
of the key structure e.g. for BPF LRU maps without having to worry that the
cookie could be recycled. We are planning to use this for our sessionAffinity
implementation for services. Therefore, add a new bpf_get_netns_cookie() helper
which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would
provide the cookie for the initial network namespace while passing the context
instead of NULL would provide the cookie from the application's network namespace.
We're using a hole, so no size increase; the assignment happens only once.
Therefore this allows for a comparison on initial namespace as well as regular
cookie usage as we have today with socket cookies. We could later on enable
this helper for other program types as well as we would see need.

  (*) Both externalTrafficPolicy={Local|Cluster} types
  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
</content>
</entry>
<entry>
<title>bpf: Add bpf_xdp_output() helper</title>
<updated>2020-03-13T00:47:38Z</updated>
<author>
<name>Eelco Chaudron</name>
<email>echaudro@redhat.com</email>
</author>
<published>2020-03-06T08:59:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d831ee84bfc9173eecf30dbbc2553ae81b996c60'/>
<id>urn:sha1:d831ee84bfc9173eecf30dbbc2553ae81b996c60</id>
<content type='text'>
Introduce new helper that reuses existing xdp perf_event output
implementation, but can be called from raw_tracepoint programs
that receive 'struct xdp_buff *' as a tracepoint argument.

Signed-off-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://lore.kernel.org/bpf/158348514556.2239.11050972434793741444.stgit@xdp-tutorial
</content>
</entry>
<entry>
<title>bpf: Added new helper bpf_get_ns_current_pid_tgid</title>
<updated>2020-03-13T00:33:11Z</updated>
<author>
<name>Carlos Neira</name>
<email>cneirabustos@gmail.com</email>
</author>
<published>2020-03-04T20:41:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b4490c5c4e023f09b7d27c9a9d3e7ad7d09ea6bf'/>
<id>urn:sha1:b4490c5c4e023f09b7d27c9a9d3e7ad7d09ea6bf</id>
<content type='text'>
New bpf helper bpf_get_ns_current_pid_tgid,
This helper will return pid and tgid from current task
which namespace matches dev_t and inode number provided,
this will allows us to instrument a process inside a container.

Signed-off-by: Carlos Neira &lt;cneirabustos@gmail.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/bpf/20200304204157.58695-3-cneirabustos@gmail.com
</content>
</entry>
<entry>
<title>bpf: Introduce BPF_MODIFY_RETURN</title>
<updated>2020-03-04T21:41:05Z</updated>
<author>
<name>KP Singh</name>
<email>kpsingh@google.com</email>
</author>
<published>2020-03-04T19:18:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ae24082331d9bbaae283aafbe930a8f0eb85605a'/>
<id>urn:sha1:ae24082331d9bbaae283aafbe930a8f0eb85605a</id>
<content type='text'>
When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0 is not currently configurable but can be made so
where user-space can specify it at load time.

For example:

int func_to_be_attached(int a, int b)
{  &lt;--- do_fentry

do_fmod_ret:
   &lt;update ret by calling fmod_ret&gt;
   if (ret != 0)
        goto do_fexit;

original_function:

    &lt;side_effects_happen_here&gt;

}  &lt;--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
        // This will skip the original function logic.
        return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh &lt;kpsingh@google.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andriin@fb.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org
</content>
</entry>
</feed>
