<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/bpf/arraymap.c, branch v4.7</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.7</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.7'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-03-08T20:28:31Z</updated>
<entry>
<title>bpf: check for reserved flag bits in array and stack maps</title>
<updated>2016-03-08T20:28:31Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-03-08T05:57:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=823707b68d6e6c4b1be619b039c7045fef1740e6'/>
<id>urn:sha1:823707b68d6e6c4b1be619b039c7045fef1740e6</id>
<content type='text'>
Suggested-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: add lookup/update support for per-cpu hash and array maps</title>
<updated>2016-02-06T08:34:36Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-02-02T06:39:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=15a07b33814d14ca817887dbea8530728dc0fbe4'/>
<id>urn:sha1:15a07b33814d14ca817887dbea8530728dc0fbe4</id>
<content type='text'>
The functions bpf_map_lookup_elem(map, key, value) and
bpf_map_update_elem(map, key, value, flags) need to get/set
values from all-cpus for per-cpu hash and array maps,
so that user space can aggregate/update them as necessary.

Example of single counter aggregation in user space:
  unsigned int nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
  long values[nr_cpus];
  long value = 0;

  bpf_lookup_elem(fd, key, values);
  for (i = 0; i &lt; nr_cpus; i++)
    value += values[i];

The user space must provide round_up(value_size, 8) * nr_cpus
array to get/set values, since kernel will use 'long' copy
of per-cpu values to try to copy good counters atomically.
It's a best-effort, since bpf programs and user space are racing
to access the same memory.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map</title>
<updated>2016-02-06T08:34:36Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-02-02T06:39:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a10423b87a7eae75da79ce80a8d9475047a674ee'/>
<id>urn:sha1:a10423b87a7eae75da79ce80a8d9475047a674ee</id>
<content type='text'>
Primary use case is a histogram array of latency
where bpf program computes the latency of block requests or other
events and stores histogram of latency into array of 64 elements.
All cpus are constantly running, so normal increment is not accurate,
bpf_xadd causes cache ping-pong and this per-cpu approach allows
fastest collision-free counters.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>perf/bpf: Convert perf_event_array to use struct file</title>
<updated>2016-01-29T07:35:25Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>alexei.starovoitov@gmail.com</email>
</author>
<published>2016-01-26T04:59:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e03e7ee34fdd1c3ef494949a75cb8c61c7265fa9'/>
<id>urn:sha1:e03e7ee34fdd1c3ef494949a75cb8c61c7265fa9</id>
<content type='text'>
Robustify refcounting.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: Wang Nan &lt;wangnan0@huawei.com&gt;
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160126045947.GA40151@ast-mbp.thefacebook.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: fix allocation warnings in bpf maps and integer overflow</title>
<updated>2015-12-03T04:36:00Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2015-11-30T00:59:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=01b3f52157ff5a47d6d8d796f396a4b34a53c61d'/>
<id>urn:sha1:01b3f52157ff5a47d6d8d796f396a4b34a53c61d</id>
<content type='text'>
For large map-&gt;value_size the user space can trigger memory allocation warnings like:
WARNING: CPU: 2 PID: 11122 at mm/page_alloc.c:2989
__alloc_pages_nodemask+0x695/0x14e0()
Call Trace:
 [&lt;     inline     &gt;] __dump_stack lib/dump_stack.c:15
 [&lt;ffffffff82743b56&gt;] dump_stack+0x68/0x92 lib/dump_stack.c:50
 [&lt;ffffffff81244ec9&gt;] warn_slowpath_common+0xd9/0x140 kernel/panic.c:460
 [&lt;ffffffff812450f9&gt;] warn_slowpath_null+0x29/0x30 kernel/panic.c:493
 [&lt;     inline     &gt;] __alloc_pages_slowpath mm/page_alloc.c:2989
 [&lt;ffffffff81554e95&gt;] __alloc_pages_nodemask+0x695/0x14e0 mm/page_alloc.c:3235
 [&lt;ffffffff816188fe&gt;] alloc_pages_current+0xee/0x340 mm/mempolicy.c:2055
 [&lt;     inline     &gt;] alloc_pages include/linux/gfp.h:451
 [&lt;ffffffff81550706&gt;] alloc_kmem_pages+0x16/0xf0 mm/page_alloc.c:3414
 [&lt;ffffffff815a1c89&gt;] kmalloc_order+0x19/0x60 mm/slab_common.c:1007
 [&lt;ffffffff815a1cef&gt;] kmalloc_order_trace+0x1f/0xa0 mm/slab_common.c:1018
 [&lt;     inline     &gt;] kmalloc_large include/linux/slab.h:390
 [&lt;ffffffff81627784&gt;] __kmalloc+0x234/0x250 mm/slub.c:3525
 [&lt;     inline     &gt;] kmalloc include/linux/slab.h:463
 [&lt;     inline     &gt;] map_update_elem kernel/bpf/syscall.c:288
 [&lt;     inline     &gt;] SYSC_bpf kernel/bpf/syscall.c:744

To avoid never succeeding kmalloc with order &gt;= MAX_ORDER check that
elem-&gt;value_size and computed elem_size are within limits for both hash and
array type maps.
Also add __GFP_NOWARN to kmalloc(value_size | elem_size) to avoid OOM warnings.
Note kmalloc(key_size) is highly unlikely to trigger OOM, since key_size &lt;= 512,
so keep those kmalloc-s as-is.

Large value_size can cause integer overflows in elem_size and map.pages
formulas, so check for that as well.

Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf, array: fix heap out-of-bounds access when updating elements</title>
<updated>2015-12-02T02:56:17Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-11-30T12:02:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fbca9d2d35c6ef1b323fae75cc9545005ba25097'/>
<id>urn:sha1:fbca9d2d35c6ef1b323fae75cc9545005ba25097</id>
<content type='text'>
During own review but also reported by Dmitry's syzkaller [1] it has been
noticed that we trigger a heap out-of-bounds access on eBPF array maps
when updating elements. This happens with each map whose map-&gt;value_size
(specified during map creation time) is not multiple of 8 bytes.

In array_map_alloc(), elem_size is round_up(attr-&gt;value_size, 8) and
used to align array map slots for faster access. However, in function
array_map_update_elem(), we update the element as ...

memcpy(array-&gt;value + array-&gt;elem_size * index, value, array-&gt;elem_size);

... where we access 'value' out-of-bounds, since it was allocated from
map_update_elem() from syscall side as kmalloc(map-&gt;value_size, GFP_USER)
and later on copied through copy_from_user(value, uvalue, map-&gt;value_size).
Thus, up to 7 bytes, we can access out-of-bounds.

Same could happen from within an eBPF program, where in worst case we
access beyond an eBPF program's designated stack.

Since 1be7f75d1668 ("bpf: enable non-root eBPF programs") didn't hit an
official release yet, it only affects priviledged users.

In case of array_map_lookup_elem(), the verifier prevents eBPF programs
from accessing beyond map-&gt;value_size through check_map_access(). Also
from syscall side map_lookup_elem() only copies map-&gt;value_size back to
user, so nothing could leak.

  [1] http://github.com/google/syzkaller

Fixes: 28fbcfa08d8e ("bpf: add array type of eBPF maps")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix bpf_perf_event_read() helper</title>
<updated>2015-10-27T04:49:26Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-10-23T00:10:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=62544ce8e01c1879d420ba309f7f319d24c0f4e6'/>
<id>urn:sha1:62544ce8e01c1879d420ba309f7f319d24c0f4e6</id>
<content type='text'>
Fix safety checks for bpf_perf_event_read():
- only non-inherited events can be added to perf_event_array map
  (do this check statically at map insertion time)
- dynamically check that event is local and !pmu-&gt;count
Otherwise buggy bpf program can cause kernel splat.

Also fix error path after perf_event_attrs()
and remove redundant 'extern'.

Fixes: 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Tested-by: Wang Nan &lt;wangnan0@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: introduce bpf_perf_event_output() helper</title>
<updated>2015-10-22T13:42:15Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-10-21T03:02:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a43eec304259a6c637f4014a6d4767159b6a3aa3'/>
<id>urn:sha1:a43eec304259a6c637f4014a6d4767159b6a3aa3</id>
<content type='text'>
This helper is used to send raw data from eBPF program into
special PERF_TYPE_SOFTWARE/PERF_COUNT_SW_BPF_OUTPUT perf_event.
User space needs to perf_event_open() it (either for one or all cpus) and
store FD into perf_event_array (similar to bpf_perf_event_read() helper)
before eBPF program can send data into it.

Today the programs triggered by kprobe collect the data and either store
it into the maps or print it via bpf_trace_printk() where latter is the debug
facility and not suitable to stream the data. This new helper replaces
such bpf_trace_printk() usage and allows programs to have dedicated
channel into user space for post-processing of the raw data collected.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: charge user for creation of BPF maps and programs</title>
<updated>2015-10-13T02:13:36Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-10-08T05:23:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aaac3ba95e4c8b496d22f68bd1bc01cfbf525eca'/>
<id>urn:sha1:aaac3ba95e4c8b496d22f68bd1bc01cfbf525eca</id>
<content type='text'>
since eBPF programs and maps use kernel memory consider it 'locked' memory
from user accounting point of view and charge it against RLIMIT_MEMLOCK limit.
This limit is typically set to 64Kbytes by distros, so almost all
bpf+tracing programs would need to increase it, since they use maps,
but kernel charges maximum map size upfront.
For example the hash map of 1024 elements will be charged as 64Kbyte.
It's inconvenient for current users and changes current behavior for root,
but probably worth doing to be consistent root vs non-root.

Similar accounting logic is done by mmap of perf_event.

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ebpf: include perf_event only where really needed</title>
<updated>2015-10-05T14:04:08Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-10-02T16:42:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0cdf5640e4f6940bdbbefee4bb0adb7dffb185ec'/>
<id>urn:sha1:0cdf5640e4f6940bdbbefee4bb0adb7dffb185ec</id>
<content type='text'>
Commit ea317b267e9d ("bpf: Add new bpf map type to store the pointer
to struct perf_event") added perf_event.h to the main eBPF header, so
it gets included for all users. perf_event.h is actually only needed
from array map side, so lets sanitize this a bit.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Kaixu Xia &lt;xiakaixu@huawei.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
