<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux, branch v3.12.61</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.61</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.61'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-05-13T13:09:25Z</updated>
<entry>
<title>mm/balloon_compaction: redesign ballooned pages management</title>
<updated>2016-05-13T13:09:25Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>k.khlebnikov@samsung.com</email>
</author>
<published>2014-10-09T22:29:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33904d89e5114be414d2b1ddd1887954c7bf8c81'/>
<id>urn:sha1:33904d89e5114be414d2b1ddd1887954c7bf8c81</id>
<content type='text'>
commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2 upstream.

Sasha Levin reported KASAN splash inside isolate_migratepages_range().
Problem is in the function __is_movable_balloon_page() which tests
AS_BALLOON_MAP in page-&gt;mapping-&gt;flags.  This function has no protection
against anonymous pages.  As a result, it tried to check address space flags
inside struct anon_vma.

Further investigation shows more problems in current implementation:

* Special branch in __unmap_and_move() never works:
  balloon_page_movable() checks page flags and page_count.  In
  __unmap_and_move() page is locked, reference counter is elevated, thus
  balloon_page_movable() always fails.  As a result execution goes to the
  normal migration path.  virtballoon_migratepage() returns
  MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
  move_to_new_page() thinks this is an error code and assigns
  newpage-&gt;mapping to NULL.  The newly migrated page loses its connection
  to the balloon and all ability for further migration.

* lru_lock is erroneously required in isolate_migratepages_range() for
  isolating a ballooned page.  This function releases lru_lock periodically,
  which makes migration mostly impossible for some pages.

* balloon_page_dequeue has a tight race with balloon_page_isolate:
  balloon_page_isolate could be executed in parallel with dequeue between
  picking a page from the list and locking page_lock.  The race is rare
  because they use trylock_page() for locking.

This patch fixes all of them.

Instead of fake mapping with special flag this patch uses special state of
page-&gt;_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
directly in struct page makes everything safer and easier.
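
As a rough illustration only (Python rather than kernel C, not the
patch itself), the idea of a sentinel stored directly in the page can
be sketched as below.  The two constants come from the text above;
classify() and its return strings are hypothetical names.

```python
# Sentinel values quoted in the commit message.
PAGE_BUDDY_MAPCOUNT_VALUE = -128
PAGE_BALLOON_MAPCOUNT_VALUE = -256


def classify(mapcount):
    """Classify a page purely from its own _mapcount value.

    Hypothetical helper: no pointer is followed, so the check is safe
    no matter what the page's mapping field points at.
    """
    if mapcount == PAGE_BALLOON_MAPCOUNT_VALUE:
        return "balloon"
    if mapcount == PAGE_BUDDY_MAPCOUNT_VALUE:
        return "buddy"
    return "other"


print(classify(-256))
print(classify(-128))
print(classify(0))
```

The point of the sketch is that the test reads only a field of the page
itself, unlike the old check that dereferenced the mapping pointer.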

PagePrivate is used to mark pages present in page list (i.e.  not
isolated, like PageLRU for normal pages).  It replaces special rules for
reference counter and makes balloon migration similar to migration of
normal pages.  This flag is protected by page_lock together with link to
the balloon device.

[js] backport to 3.12. MIGRATEPAGE_BALLOON_SUCCESS had to be removed
     from one more place. VM_BUG_ON_PAGE does not exist in 3.12 yet,
     use plain VM_BUG_ON.

Signed-off-by: Konstantin Khlebnikov &lt;k.khlebnikov@samsung.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
Cc: Rafael Aquini &lt;aquini@redhat.com&gt;
Cc: Andrey Ryabinin &lt;ryabinin.a.a@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Gavin Guo &lt;gavin.guo@canonical.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>x86: LLVMLinux: Fix "incomplete type const struct x86cpu_device_id"</title>
<updated>2016-05-11T09:37:32Z</updated>
<author>
<name>Behan Webster</name>
<email>behanw@converseincode.com</email>
</author>
<published>2014-02-13T20:21:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=376539fa006eb2cba95c4c534ae5357c3f3c6d22'/>
<id>urn:sha1:376539fa006eb2cba95c4c534ae5357c3f3c6d22</id>
<content type='text'>
commit c4586256f0c440bc2bdb29d2cbb915f0ca785d26 upstream.

Similar to the fix in 40413dcb7b273bda681dca38e6ff0bbb3728ef11

MODULE_DEVICE_TABLE(x86cpu, ...) expects the struct to be called struct
x86cpu_device_id, and not struct x86_cpu_id which is what is used in the rest
of the kernel code.  Although gcc seems to ignore this error, clang fails
without this define to fix the name.

Code from drivers/thermal/x86_pkg_temp_thermal.c
static const struct x86_cpu_id __initconst pkg_temp_thermal_ids[] = { ... };
MODULE_DEVICE_TABLE(x86cpu, pkg_temp_thermal_ids);

Error from clang:
drivers/thermal/x86_pkg_temp_thermal.c:577:1: error: variable has
      incomplete type 'const struct x86cpu_device_id'
MODULE_DEVICE_TABLE(x86cpu, pkg_temp_thermal_ids);
^
include/linux/module.h:145:3: note: expanded from macro
      'MODULE_DEVICE_TABLE'
  MODULE_GENERIC_TABLE(type##_device, name)
  ^
include/linux/module.h:87:32: note: expanded from macro
      'MODULE_GENERIC_TABLE'
extern const struct gtype##_id __mod_##gtype##_table            \
                               ^
&lt;scratch space&gt;:143:1: note: expanded from here
__mod_x86cpu_device_table
^
drivers/thermal/x86_pkg_temp_thermal.c:577:1: note: forward declaration of
      'struct x86cpu_device_id'
include/linux/module.h:145:3: note: expanded from macro
      'MODULE_DEVICE_TABLE'
  MODULE_GENERIC_TABLE(type##_device, name)
  ^
include/linux/module.h:87:21: note: expanded from macro
      'MODULE_GENERIC_TABLE'
extern const struct gtype##_id __mod_##gtype##_table            \
                    ^
&lt;scratch space&gt;:141:1: note: expanded from here
x86cpu_device_id
^
1 error generated.

Signed-off-by: Behan Webster &lt;behanw@converseincode.com&gt;
Signed-off-by: Jan-Simon Möller &lt;dl9pf@gmx.de&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
[added vmbus, mei, and rapidio #defines, needed for 3.14 - gregkh]
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>compiler-gcc: disable -ftracer for __noclone functions</title>
<updated>2016-05-11T09:37:32Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2016-03-31T07:38:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0a0ff4ebd29deed338d7b6989b93333df75b462a'/>
<id>urn:sha1:0a0ff4ebd29deed338d7b6989b93333df75b462a</id>
<content type='text'>
commit 95272c29378ee7dc15f43fa2758cb28a5913a06d upstream.

-ftracer can duplicate asm blocks causing compilation to fail in
noclone functions.  For example, KVM declares a global variable
in an asm like

    asm("2: ... \n
         .pushsection data \n
         .global vmx_return \n
         vmx_return: .long 2b");

and -ftracer causes a double declaration.

Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Marek &lt;mmarek@suse.cz&gt;
Cc: stable@vger.kernel.org
Cc: kvm@vger.kernel.org
Reported-by: Linda Walsh &lt;lkml@tlinx.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cpuset: Fix potential deadlock w/ set_mems_allowed</title>
<updated>2016-05-03T15:42:14Z</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2013-10-07T22:52:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d9aa9f580cb2c2e79ea9573447c807607b195123'/>
<id>urn:sha1:d9aa9f580cb2c2e79ea9573447c807607b195123</id>
<content type='text'>
commit db751fe3ea6880ff5ac5abe60cb7b80deb5a4140 upstream.

After adding lockdep support to seqlock/seqcount structures,
I started seeing the following warning:

[    1.070907] ======================================================
[    1.072015] [ INFO: SOFTIRQ-safe -&gt; SOFTIRQ-unsafe lock order detected ]
[    1.073181] 3.11.0+ #67 Not tainted
[    1.073801] ------------------------------------------------------
[    1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[    1.076088]  (&amp;p-&gt;mems_allowed_seq){+.+...}, at: [&lt;ffffffff81187d7f&gt;] new_slab+0x5f/0x280
[    1.077572]
[    1.077572] and this task is already holding:
[    1.078593]  (&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock){..-...}, at: [&lt;ffffffff81339f03&gt;] blk_execute_rq_nowait+0x53/0xf0
[    1.080042] which would create a new lock dependency:
[    1.080042]  (&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock){..-...} -&gt; (&amp;p-&gt;mems_allowed_seq){+.+...}
[    1.080042]
[    1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
[    1.080042]  (&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock){..-...}
[    1.080042] ... which became SOFTIRQ-irq-safe at:
[    1.080042]   [&lt;ffffffff810ec179&gt;] __lock_acquire+0x5b9/0x1db0
[    1.080042]   [&lt;ffffffff810edfe5&gt;] lock_acquire+0x95/0x130
[    1.080042]   [&lt;ffffffff818968a1&gt;] _raw_spin_lock+0x41/0x80
[    1.080042]   [&lt;ffffffff81560c9e&gt;] scsi_device_unbusy+0x7e/0xd0
[    1.080042]   [&lt;ffffffff8155a612&gt;] scsi_finish_command+0x32/0xf0
[    1.080042]   [&lt;ffffffff81560e91&gt;] scsi_softirq_done+0xa1/0x130
[    1.080042]   [&lt;ffffffff8133b0f3&gt;] blk_done_softirq+0x73/0x90
[    1.080042]   [&lt;ffffffff81095dc0&gt;] __do_softirq+0x110/0x2f0
[    1.080042]   [&lt;ffffffff81095fcd&gt;] run_ksoftirqd+0x2d/0x60
[    1.080042]   [&lt;ffffffff810bc506&gt;] smpboot_thread_fn+0x156/0x1e0
[    1.080042]   [&lt;ffffffff810b3916&gt;] kthread+0xd6/0xe0
[    1.080042]   [&lt;ffffffff818980ac&gt;] ret_from_fork+0x7c/0xb0
[    1.080042]
[    1.080042] to a SOFTIRQ-irq-unsafe lock:
[    1.080042]  (&amp;p-&gt;mems_allowed_seq){+.+...}
[    1.080042] ... which became SOFTIRQ-irq-unsafe at:
[    1.080042] ...  [&lt;ffffffff810ec1d3&gt;] __lock_acquire+0x613/0x1db0
[    1.080042]   [&lt;ffffffff810edfe5&gt;] lock_acquire+0x95/0x130
[    1.080042]   [&lt;ffffffff810b3df2&gt;] kthreadd+0x82/0x180
[    1.080042]   [&lt;ffffffff818980ac&gt;] ret_from_fork+0x7c/0xb0
[    1.080042]
[    1.080042] other info that might help us debug this:
[    1.080042]
[    1.080042]  Possible interrupt unsafe locking scenario:
[    1.080042]
[    1.080042]        CPU0                    CPU1
[    1.080042]        ----                    ----
[    1.080042]   lock(&amp;p-&gt;mems_allowed_seq);
[    1.080042]                                local_irq_disable();
[    1.080042]                                lock(&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock);
[    1.080042]                                lock(&amp;p-&gt;mems_allowed_seq);
[    1.080042]   &lt;Interrupt&gt;
[    1.080042]     lock(&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock);
[    1.080042]
[    1.080042]  *** DEADLOCK ***

The issue stems from the kthreadd() function calling set_mems_allowed
with irqs enabled. While it's possibly unlikely for the actual deadlock
to trigger, a fix is fairly simple: disable irqs before taking the
mems_allowed_seq lock.

Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>include/linux/poison.h: fix LIST_POISON{1,2} offset</title>
<updated>2016-05-03T05:45:08Z</updated>
<author>
<name>Vasily Kulikov</name>
<email>segoon@openwall.com</email>
</author>
<published>2015-09-09T22:36:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c7ecfa39d66c62ee662ae6906a2eec3d28a96e6a'/>
<id>urn:sha1:c7ecfa39d66c62ee662ae6906a2eec3d28a96e6a</id>
<content type='text'>
commit 8a5e5e02fc83aaf67053ab53b359af08c6c49aaf upstream.

Poison pointer values should be small enough to find room in
non-mmap'able/hardly-mmap'able space.  E.g.  on x86 "poison pointer space"
is located starting from 0x0.  Given unprivileged users cannot mmap
anything below mmap_min_addr, it should be safe to use poison pointers
lower than mmap_min_addr.

The current poison pointer values of LIST_POISON{1,2} might be too big for
mmap_min_addr values equal to or less than 1 MB (common case, e.g.  Ubuntu
uses only 0x10000).  There is little point in using such a big value given
the "poison pointer space" below 1 MB is not yet exhausted.  Changing it
to a smaller value solves the problem for small mmap_min_addr setups.
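
The argument can be checked with a small sketch (Python, illustrative
only).  The old and new constants below are recalled from
include/linux/poison.h and are not stated in this message, so treat
them as assumptions; POISON_POINTER_DELTA is assumed to be 0, and
below_mmap_min_addr() is a hypothetical helper.  Only the 0x10000
value for mmap_min_addr comes from the text above.

```python
# Assumed constants (not quoted in this commit message).
OLD_LIST_POISON = {"LIST_POISON1": 0x00100100, "LIST_POISON2": 0x00200200}
NEW_LIST_POISON = {"LIST_POISON1": 0x100, "LIST_POISON2": 0x200}

# mmap_min_addr value mentioned in the text for Ubuntu.
MMAP_MIN_ADDR = 0x10000


def below_mmap_min_addr(poison, mmap_min_addr):
    """True if an unprivileged mmap() cannot cover the poison address."""
    return mmap_min_addr > poison


for name in sorted(OLD_LIST_POISON):
    old_ok = below_mmap_min_addr(OLD_LIST_POISON[name], MMAP_MIN_ADDR)
    new_ok = below_mmap_min_addr(NEW_LIST_POISON[name], MMAP_MIN_ADDR)
    print(name, "old protected:", old_ok, "new protected:", new_ok)
```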

The values are suggested by Solar Designer:
http://www.openwall.com/lists/oss-security/2015/05/02/6

Signed-off-by: Vasily Kulikov &lt;segoon@openwall.com&gt;
Cc: Solar Designer &lt;solar@openwall.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>pipe: limit the per-user amount of pages allocated in pipes</title>
<updated>2016-04-21T11:11:54Z</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2016-01-18T15:36:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2a032e307d35402306c6464537b8bc6a0a3ac91d'/>
<id>urn:sha1:2a032e307d35402306c6464537b8bc6a0a3ac91d</id>
<content type='text'>
commit 759c01142a5d0f364a462346168a56de28a80f52 upstream.

On not-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limits are controlled by two new sysctls: pipe-user-pages-soft and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (e.g. for splicing).
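
The arithmetic above can be reproduced with a short sketch (Python,
illustrative only).  All figures are taken from this message; the
variable names are made up for the example.

```python
KB = 1024
MB = 1024 * KB

fds_per_process = 1024        # default number of FDs per process
default_pipe_size = 64 * KB   # default pipe size
page_size = 4 * KB            # pipes above the soft limit get one page
processes = 256

# Memory used before the soft limit starts shrinking new pipes.
soft_limit_bytes = fds_per_process * default_pipe_size

# Every remaining pipe is limited to a single page.
total_pipes = processes * fds_per_process
single_page_pipes = total_pipes - fds_per_process
total_bytes = soft_limit_bytes + single_page_pipes * page_size

print(soft_limit_bytes // MB, "MB at the default pipe size")
print(total_bytes // MB, "MB in total for the user")
```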

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>tracing: Fix trace_printk() to print when not using bprintk()</title>
<updated>2016-04-11T14:44:23Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-03-22T21:30:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a20513a2aff547c52db7b32238b1feeb1ba11319'/>
<id>urn:sha1:a20513a2aff547c52db7b32238b1feeb1ba11319</id>
<content type='text'>
commit 3debb0a9ddb16526de8b456491b7db60114f7b5e upstream.

The trace_printk() code will allocate extra buffers if the compiler detects
that a trace_printk() is used. To do this, the format of the trace_printk()
is saved to the __trace_printk_fmt section, and if that section is bigger
than zero, the buffers are allocated (along with a message that this has
happened).

If trace_printk() uses a format that is not a constant, and thus something
not guaranteed to be around when the print happens, the compiler optimizes
the fmt out, as it is not used, and the __trace_printk_fmt section is not
filled. This means the kernel will not allocate the special buffers needed
for the trace_printk() and the trace_printk() will not write anything to the
tracing buffer.

Adding a "__used" to the variable in the __trace_printk_fmt section will
keep it around, even though it is set to NULL. This will keep the string
from being printed in the debugfs/tracing/printk_formats section as it is
not needed.

Reported-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()"
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>fs/coredump: prevent fsuid=0 dumps into user-controlled directories</title>
<updated>2016-04-11T14:44:20Z</updated>
<author>
<name>Jann Horn</name>
<email>jann@thejh.net</email>
</author>
<published>2016-03-22T21:25:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fd797512816d954c1b82e405f8fb7022c0cbeb81'/>
<id>urn:sha1:fd797512816d954c1b82e405f8fb7022c0cbeb81</id>
<content type='text'>
commit 378c6520e7d29280f400ef2ceaf155c86f05a71a upstream.

This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:

 - The fs.suid_dumpable sysctl is set to 2.
 - The kernel.core_pattern sysctl's value starts with "/". (Systems
   where kernel.core_pattern starts with "|/" are not affected.)
 - Unprivileged user namespace creation is permitted. (This is
   true on Linux &gt;=3.8, but some distributions disallow it by
   default using a distro patch.)

Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.

To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.

Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>PCI: Disable IO/MEM decoding for devices with non-compliant BARs</title>
<updated>2016-04-11T14:44:00Z</updated>
<author>
<name>Bjorn Helgaas</name>
<email>bhelgaas@google.com</email>
</author>
<published>2016-02-25T20:35:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b82df119eed7e6d5815c04af734ae2f527b46af0'/>
<id>urn:sha1:b82df119eed7e6d5815c04af734ae2f527b46af0</id>
<content type='text'>
commit b84106b4e2290c081cdab521fa832596cdfea246 upstream.

The PCI config header (first 64 bytes of each device's config space) is
defined by the PCI spec so generic software can identify the device and
manage its usage of I/O, memory, and IRQ resources.

Some non-spec-compliant devices put registers other than BARs where the
BARs should be.  When the PCI core sizes these "BARs", the reads and writes
it does may have unwanted side effects, and the "BAR" may appear to
describe nonsensical address space.

Add a flag bit to mark non-compliant devices so we don't touch their BARs.
Turn off IO/MEM decoding to prevent the devices from consuming address
space, since we can't read the BARs to find out what that address space
would be.

Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Tested-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>mld, igmp: Fix reserved tailroom calculation</title>
<updated>2016-04-11T14:43:57Z</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@suse.com</email>
</author>
<published>2016-02-29T23:03:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6af792bb57d0ba6695b5390a19f8eedcf51b602c'/>
<id>urn:sha1:6af792bb57d0ba6695b5390a19f8eedcf51b602c</id>
<content type='text'>
commit 1837b2e2bcd23137766555a63867e649c0b637f0 upstream.

The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

The min() in the third line can be expanded into:
if mtu &lt; skb_tailroom - tlen:
	reserved_tailroom = skb_tailroom - mtu
else:
	reserved_tailroom = tlen
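
A small sketch (Python, illustrative only, not the kernel code) can
compare the old expression with the corrected one.  The function names
and the sample sizes (skb_end_offset 2048, hlen 64, tlen 32) are
hypothetical and chosen only to make both failure modes visible.

```python
def reserved_tailroom_old(skb_end_offset, mtu):
    """Current expression: ignores hlen and tlen."""
    return skb_end_offset - min(mtu, skb_end_offset)


def reserved_tailroom_new(skb_tailroom, mtu, tlen):
    """Corrected expression, evaluated after skb_reserve(hlen)."""
    return skb_tailroom - min(mtu, skb_tailroom - tlen)


def reserved_tailroom_branch(skb_tailroom, mtu, tlen):
    """Same result as reserved_tailroom_new with the min() expanded."""
    if skb_tailroom - tlen > mtu:
        return skb_tailroom - mtu
    return tlen


skb_end_offset, hlen, tlen = 2048, 64, 32   # hypothetical sizes
skb_tailroom = skb_end_offset - hlen        # after skb_reserve(hlen)

for mtu in (1500, 9000):
    old = reserved_tailroom_old(skb_end_offset, mtu)
    new = reserved_tailroom_new(skb_tailroom, mtu, tlen)
    print("mtu", mtu, "old", old, "new", new)
```

With these sample sizes the old expression reserves more than needed
for the small MTU (so less payload fits per skb) and reserves less than
tlen for the large MTU, which matches the two symptoms described next.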

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev-&gt;needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier &lt;bpoirier@suse.com&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
</feed>
