<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/bpf/arraymap.c, branch v4.12</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-05-25T17:44:28Z</updated>
<entry>
<title>bpf: fix wrong exposure of map_flags into fdinfo for lpm</title>
<updated>2017-05-25T17:44:28Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-05-24T23:05:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a316338cb71a3260201490e615f2f6d5c0d8fb2c'/>
<id>urn:sha1:a316338cb71a3260201490e615f2f6d5c0d8fb2c</id>
<content type='text'>
trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via
attr-&gt;map_flags, since it does not support preallocation yet. We
check the flag, but we never copy the flag into trie-&gt;map.map_flags,
which is later on exposed into fdinfo and used by loaders such as
iproute2. Latter uses this in bpf_map_selfcheck_pinned() to test
whether a pinned map has the same spec as the one from the BPF obj
file and if not, bails out, which is currently the case for lpm
since it exposes always 0 as flags.

Also copy over flags in array_map_alloc() and stack_map_alloc().
They always have to be 0 right now, but we should make sure to not
miss to copy them over at a later point in time when we add actual
flags for them to use.

Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation")
Reported-by: Jarno Rajahalme &lt;jarno@covalent.io&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: map_get_next_key to return first key on NULL</title>
<updated>2017-04-25T15:57:45Z</updated>
<author>
<name>Teng Qin</name>
<email>qinteng@fb.com</email>
</author>
<published>2017-04-25T02:00:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8fe45924387be6b5c1be59a7eb330790c61d5d10'/>
<id>urn:sha1:8fe45924387be6b5c1be59a7eb330790c61d5d10</id>
<content type='text'>
When iterating through a map, we need to find a key that does not exist
in the map so map_get_next_key will give us the first key of the map.
This often requires a lot of guessing in production systems.

This patch makes map_get_next_key return the first key when the key
pointer in the parameter is NULL.

Signed-off-by: Teng Qin &lt;qinteng@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: remove struct bpf_map_type_list</title>
<updated>2017-04-11T18:38:43Z</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2017-04-11T13:34:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=40077e0cf62206ac3c315b6991d8dcddb3703286'/>
<id>urn:sha1:40077e0cf62206ac3c315b6991d8dcddb3703286</id>
<content type='text'>
There's no need to have struct bpf_map_type_list since
it just contains a list_head, the type, and the ops
pointer. Since the types are densely packed and not
actually dynamically registered, it's much easier and
smaller to have an array of type-&gt;ops pointer. Also
initialize this array statically to remove code needed
to initialize it.

In order to save duplicating the list, move it to the
types header file added by the previous patch and
include it in the same fashion.

Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: Add array of maps support</title>
<updated>2017-03-22T22:45:45Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2017-03-22T17:00:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=56f668dfe00dcf086734f1c42ea999398fad6572'/>
<id>urn:sha1:56f668dfe00dcf086734f1c42ea999398fad6572</id>
<content type='text'>
This patch adds a few helper funcs to enable map-in-map
support (i.e. outer_map-&gt;inner_map).  The first outer_map type
BPF_MAP_TYPE_ARRAY_OF_MAPS is also added in this patch.
The next patch will introduce a hash of maps type.

Any bpf map type can be acted as an inner_map.  The exception
is BPF_MAP_TYPE_PROG_ARRAY because the extra level of
indirection makes it harder to verify the owner_prog_type
and owner_jited.

Multi-level map-in-map is not supported (i.e. map-&gt;map is ok
but not map-&gt;map-&gt;map).

When adding an inner_map to an outer_map, it currently checks the
map_type, key_size, value_size, map_flags, max_entries and ops.
The verifier also uses those map's properties to do static analysis.
map_flags is needed because we need to ensure BPF_PROG_TYPE_PERF_EVENT
is using a preallocated hashtab for the inner_hash also.  ops and
max_entries are needed to generate inlined map-lookup instructions.
For simplicity reason, a simple '==' test is used for both map_flags
and max_entries.  The equality of ops is implied by the equality of
map_type.

During outer_map creation time, an inner_map_fd is needed to create an
outer_map.  However, the inner_map_fd's life time does not depend on the
outer_map.  The inner_map_fd is merely used to initialize
the inner_map_meta of the outer_map.

Also, for the outer_map:

* It allows element update and delete from syscall
* It allows element lookup from bpf_prog

The above is similar to the current fd_array pattern.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: Fix and simplifications on inline map lookup</title>
<updated>2017-03-22T22:45:45Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2017-03-22T17:00:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fad73a1a35ea61f13607a391aca669caad8c04ca'/>
<id>urn:sha1:fad73a1a35ea61f13607a391aca669caad8c04ca</id>
<content type='text'>
Fix in verifier:
For the same bpf_map_lookup_elem() instruction (i.e. "call 1"),
a broken case is "a different type of map could be used for the
same lookup instruction". For example, an array in one case and a
hashmap in another.  We have to resort to the old dynamic call behavior
in this case.  The fix is to check for collision on insn_aux-&gt;map_ptr.
If there is collision, don't inline the map lookup.

Please see the "do_reg_lookup()" in test_map_in_map_kern.c in the later
patch for how-to trigger the above case.

Simplifications on array_map_gen_lookup():
1. Calculate elem_size from map-&gt;value_size.  It removes the
   need for 'struct bpf_array' which makes the later map-in-map
   implementation easier.
2. Remove the 'elem_size == 1' test

Fixes: 81ed18ab3098 ("bpf: add helper inlining infra and optimize map_array lookup")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: add helper inlining infra and optimize map_array lookup</title>
<updated>2017-03-17T03:44:11Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2017-03-16T01:26:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=81ed18ab3098b6519274545e80a29caacb77d160'/>
<id>urn:sha1:81ed18ab3098b6519274545e80a29caacb77d160</id>
<content type='text'>
Optimize bpf_call -&gt; bpf_map_lookup_elem() -&gt; array_map_lookup_elem()
into a sequence of bpf instructions.
When JIT is on the sequence of bpf instructions is the sequence
of native cpu instructions with significantly faster performance
than indirect call and two function's prologue/epilogue.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: mark all registered map/prog types as __ro_after_init</title>
<updated>2017-02-17T18:40:04Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-02-16T21:24:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c78f8bdfa11fcceb9723c61212e4bd8f76c87f9e'/>
<id>urn:sha1:c78f8bdfa11fcceb9723c61212e4bd8f76c87f9e</id>
<content type='text'>
All map types and prog types are registered to the BPF core through
bpf_register_map_type() and bpf_register_prog_type() during init and
remain unchanged thereafter. As by design we don't (and never will)
have any pluggable code that can register to that at any later point
in time, lets mark all the existing bpf_{map,prog}_type_list objects
in the tree as __ro_after_init, so they can be moved to read-only
section from then onwards.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: don't trigger OOM killer under pressure with map alloc</title>
<updated>2017-01-18T22:12:26Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-01-18T14:14:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d407bd25a204bd66b7346dde24bd3d37ef0e0b05'/>
<id>urn:sha1:d407bd25a204bd66b7346dde24bd3d37ef0e0b05</id>
<content type='text'>
This patch adds two helpers, bpf_map_area_alloc() and bpf_map_area_free(),
that are to be used for map allocations. Using kmalloc() for very large
allocations can cause excessive work within the page allocator, so i) fall
back earlier to vmalloc() when the attempt is considered costly anyway,
and even more importantly ii) don't trigger OOM killer with any of the
allocators.

Since this is based on a user space request, for example, when creating
maps with element pre-allocation, we really want such requests to fail
instead of killing other user space processes.

Also, don't spam the kernel log with warnings should any of the allocations
fail under pressure. Given that, we can make backend selection in
bpf_map_area_alloc() generic, and convert all maps over to use this API
for spots with potentially large allocation requests.

Note, replacing the one kmalloc_array() is fine as overflow checks happen
earlier in htab_map_alloc(), since it must also protect the multiplication
for vmalloc() should kmalloc_array() fail.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: do not use KMALLOC_SHIFT_MAX</title>
<updated>2017-01-11T02:31:54Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-01-11T00:57:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7984c27c2c5cd3298de8afdba3e1bd46f884e934'/>
<id>urn:sha1:7984c27c2c5cd3298de8afdba3e1bd46f884e934</id>
<content type='text'>
Commit 01b3f52157ff ("bpf: fix allocation warnings in bpf maps and
integer overflow") has added checks for the maximum allocateable size.
It (ab)used KMALLOC_SHIFT_MAX for that purpose.

While this is not incorrect it is not very clean because we already have
KMALLOC_MAX_SIZE for this very reason so let's change both checks to use
KMALLOC_MAX_SIZE instead.

The original motivation for using KMALLOC_SHIFT_MAX was to work around
an incorrect KMALLOC_MAX_SIZE which could lead to allocation warnings
but it is no longer needed since "slab: make sure that KMALLOC_MAX_SIZE
will fit into MAX_ORDER".

Link: http://lkml.kernel.org/r/20161220130659.16461-3-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bpf: Add bpf_current_task_under_cgroup helper</title>
<updated>2016-08-13T04:49:41Z</updated>
<author>
<name>Sargun Dhillon</name>
<email>sargun@sargun.me</email>
</author>
<published>2016-08-12T15:56:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=60d20f9195b260bdf0ac10c275ae9f6016f9c069'/>
<id>urn:sha1:60d20f9195b260bdf0ac10c275ae9f6016f9c069</id>
<content type='text'>
This adds a bpf helper that's similar to the skb_in_cgroup helper to check
whether the probe is currently executing in the context of a specific
subset of the cgroupsv2 hierarchy. It does this based on membership test
for a cgroup arraymap. It is invalid to call this in an interrupt, and
it'll return an error. The helper is primarily to be used in debugging
activities for containers, where you may have multiple programs running in
a given top-level "container".

Signed-off-by: Sargun Dhillon &lt;sargun@sargun.me&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
