<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include, branch v4.9.51</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.51</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.51'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-09-20T06:20:01Z</updated>
<entry>
<title>xfs: evict all inodes involved with log redo item</title>
<updated>2017-09-20T06:20:01Z</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2017-09-17T21:06:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=63d184d2955bab0584acc10b502e415ce23394b1'/>
<id>urn:sha1:63d184d2955bab0584acc10b502e415ce23394b1</id>
<content type='text'>
commit 799ea9e9c59949008770aab4e1da87f10e99dbe4 upstream.

When we introduced the bmap redo log items, we set MS_ACTIVE on the
mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
from being truncated prematurely during log recovery.  This also had the
effect of putting linked inodes on the lru instead of evicting them.

Unfortunately, we neglected to find all those unreferenced lru inodes
and evict them after finishing log recovery, which means that we leak
them if anything goes wrong in the rest of xfs_mountfs, because the lru
is only cleaned out on unmount.

Therefore, evict unreferenced inodes in the lru list immediately
after clearing MS_ACTIVE.

Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Cc: viro@ZenIV.linux.org.uk
Reviewed-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "net: fix percpu memory leaks"</title>
<updated>2017-09-20T06:19:55Z</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2017-09-01T09:26:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1bcf18718ec63ad5fb025b75a5d2439e1dcf1213'/>
<id>urn:sha1:1bcf18718ec63ad5fb025b75a5d2439e1dcf1213</id>
<content type='text'>
[ Upstream commit 5a63643e583b6a9789d7a225ae076fb4e603991c ]

This reverts commit 1d6119baf0610f813eb9d9580eb4fd16de5b4ceb.

After reverting commit 6d7b857d541e ("net: use lib/percpu_counter API
for fragmentation mem accounting") there is no need for this
fix-up patch.  As percpu_counter is no longer used, it can no
longer leak memory.

Fixes: 6d7b857d541e ("net: use lib/percpu_counter API for fragmentation mem accounting")
Fixes: 1d6119baf061 ("net: fix percpu memory leaks")
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "net: use lib/percpu_counter API for fragmentation mem accounting"</title>
<updated>2017-09-20T06:19:55Z</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2017-09-01T09:26:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5a7a40bad254d2571d93059ba4b3963dc448cdb0'/>
<id>urn:sha1:5a7a40bad254d2571d93059ba4b3963dc448cdb0</id>
<content type='text'>
[ Upstream commit fb452a1aa3fd4034d7999e309c5466ff2d7005aa ]

This reverts commit 6d7b857d541ecd1d9bd997c97242d4ef94b19de2.

There is a bug in the fragmentation code's use of the percpu_counter
API that can cause issues on systems with many CPUs.

The frag_mem_limit() just reads the global counter (fbc-&gt;count),
without considering that other CPUs can have up to batch size (130K)
that hasn't been subtracted yet.  Due to the 3MBytes lower thresh limit,
this becomes dangerous at &gt;=24 CPUs (3*1024*1024/130000=24).
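
As a rough check of the arithmetic above, the following Python sketch
recomputes the CPU count at which unflushed per-CPU batches can hide
roughly the whole lower threshold.  The constant and function names are
illustrative only and do not come from the kernel source; the values are
the ones quoted in this message.

```python
# Values quoted in the commit message; names are hypothetical.
LOW_THRESH_BYTES = 3 * 1024 * 1024   # 3 MBytes lower thresh limit
BATCH_BYTES = 130000                 # per-CPU batch size ("130K")

# Number of CPUs whose pending batches together approach the threshold.
cpus_to_hide_thresh = LOW_THRESH_BYTES // BATCH_BYTES


def worst_case_hidden(ncpus):
    """Upper bound on bytes not yet folded into the global counter."""
    return ncpus * BATCH_BYTES


print(cpus_to_hide_thresh)       # 24
print(LOW_THRESH_BYTES)          # 3145728
print(worst_case_hidden(24))     # 3120000
print(worst_case_hidden(25))     # 3250000
```

With 24 CPUs the worst-case hidden amount is already within about 25
KBytes of the threshold, and with 25 CPUs it exceeds it.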

The correct API usage would be to use __percpu_counter_compare() which
does the right thing, and takes into account the number of (online)
CPUs and batch size, to account for this and call __percpu_counter_sum()
when needed.

We choose to revert the use of the lib/percpu_counter API for frag
memory accounting for several reasons:

1) On systems with CPUs &gt; 24, the heavier fully locked
   __percpu_counter_sum() is always invoked, which will be more
   expensive than the atomic_t that is reverted to.

Given systems with more than 24 CPUs are becoming common this doesn't
seem like a good option.  To mitigate this, the batch size could be
decreased and thresh be increased.

2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX
   CPU, before SKBs are pushed into sockets on remote CPUs.  Given
   NICs can only hash on L2 part of the IP-header, the NIC-RXq's will
   likely be limited.  Thus, there is a fair chance that the atomic
   add+dec happen on the same CPU.

Revert note: commit 1d6119baf061 ("net: fix percpu memory leaks")
removed init_frag_mem_limit() and instead uses inet_frags_init_net().
After this revert, inet_frags_uninit_net() becomes empty.

Fixes: 6d7b857d541e ("net: use lib/percpu_counter API for fragmentation mem accounting")
Fixes: 1d6119baf061 ("net: fix percpu memory leaks")
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Acked-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mlxsw: spectrum: Forbid linking to devices that have uppers</title>
<updated>2017-09-20T06:19:55Z</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2017-09-01T08:52:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=73ee5a73e75f3c0e5d4ca0c5a362424e93413bb0'/>
<id>urn:sha1:73ee5a73e75f3c0e5d4ca0c5a362424e93413bb0</id>
<content type='text'>
[ Upstream commit 25cc72a33835ed8a6f53180a822cadab855852ac ]

The mlxsw driver relies on NETDEV_CHANGEUPPER events to configure the
device in case a port is enslaved to a master netdev such as bridge or
bond.

Since the driver ignores events unrelated to its ports and their
uppers, it's possible to engineer situations in which the device's data
path differs from the kernel's.

One example of such a situation is when a port is enslaved to a bond
that is already enslaved to a bridge. When the bond was enslaved the
driver ignored the event - as the bond wasn't one of its uppers - and
therefore a bridge port instance isn't created in the device.

Until such configurations are supported, forbid them by checking that
the upper device doesn't have uppers of its own.

Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reported-by: Nogah Frankel &lt;nogahf@mellanox.com&gt;
Tested-by: Nogah Frankel &lt;nogahf@mellanox.com&gt;
Signed-off-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv6: fix sparse warning on rt6i_node</title>
<updated>2017-09-20T06:19:53Z</updated>
<author>
<name>Wei Wang</name>
<email>weiwan@google.com</email>
</author>
<published>2017-08-25T22:03:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=43c792a8488087668f7e1052201e2eeb32150141'/>
<id>urn:sha1:43c792a8488087668f7e1052201e2eeb32150141</id>
<content type='text'>
[ Upstream commit 4e587ea71bf924f7dac621f1351653bd41e446cb ]

Commit c5cff8561d2d adds rcu grace period before freeing fib6_node. This
generates a new sparse warning on rt-&gt;rt6i_node related code:
  net/ipv6/route.c:1394:30: error: incompatible types in comparison
  expression (different address spaces)
  ./include/net/ip6_fib.h:187:14: error: incompatible types in comparison
  expression (different address spaces)

This commit adds "__rcu" tag for rt6i_node and makes sure corresponding
rcu API is used for it.
After this fix, sparse no longer generates the above warning.

Fixes: c5cff8561d2d ("ipv6: add rcu grace period before freeing fib6_node")
Signed-off-by: Wei Wang &lt;weiwan@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv6: add rcu grace period before freeing fib6_node</title>
<updated>2017-09-20T06:19:53Z</updated>
<author>
<name>Wei Wang</name>
<email>weiwan@google.com</email>
</author>
<published>2017-08-21T16:47:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7f8f23fc8026a7a4f29f49c18a2ebbb529ee3916'/>
<id>urn:sha1:7f8f23fc8026a7a4f29f49c18a2ebbb529ee3916</id>
<content type='text'>
[ Upstream commit c5cff8561d2d0006e972bd114afd51f082fee77c ]

We currently keep rt-&gt;rt6i_node pointing to the fib6_node for the route.
And some functions make use of this pointer to dereference the fib6_node
from rt structure, e.g. rt6_check(). However, as there is neither
refcount nor rcu taken when dereferencing rt-&gt;rt6i_node, it could
potentially cause crashes as rt-&gt;rt6i_node could be set to NULL by other
CPUs when doing a route deletion.
This patch introduces an rcu grace period before freeing fib6_node and
makes sure the functions that dereference it take rcu_read_lock().

Note: there is no "Fixes" tag because this bug was there in a very
early stage.

Signed-off-by: Wei Wang &lt;weiwan@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cs5536: add support for IDE controller variant</title>
<updated>2017-09-09T15:39:41Z</updated>
<author>
<name>Andrey Korolyov</name>
<email>andrey@xdel.ru</email>
</author>
<published>2017-08-10T10:21:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5b9c6a54c629e38271acf833f75154f2f8333365'/>
<id>urn:sha1:5b9c6a54c629e38271acf833f75154f2f8333365</id>
<content type='text'>
commit 591b6bb605785c12a21e8b07a08a277065b655a5 upstream.

Several legacy devices, such as Geode-based Cisco ASA appliances
and the DB800 development board, have a CS5536 IDE controller
with a different PCI ID than the existing one. Using pata_generic is
not always feasible, as at least the DB800 requires the MSR quirk from
pata_cs5536 to be used with vendor firmware.

Signed-off-by: Andrey Korolyov &lt;andrey@xdel.ru&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>workqueue: Fix flag collision</title>
<updated>2017-09-09T15:39:41Z</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2017-09-03T00:18:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ec552ece1f25e8ecadf0afafe87ce314d13dc3e7'/>
<id>urn:sha1:ec552ece1f25e8ecadf0afafe87ce314d13dc3e7</id>
<content type='text'>
commit fbf1c41fc0f4d3574ac2377245efd666c1fa3075 upstream.

Commit 0a94efb5acbb ("workqueue: implicit ordered attribute should be
overridable") introduced a __WQ_ORDERED_EXPLICIT flag but gave it the
same value as __WQ_LEGACY.  I don't believe these were intended to
mean the same thing, so renumber __WQ_ORDERED_EXPLICIT.

Fixes: 0a94efb5acbb ("workqueue: implicit ordered attribute should be ...")
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs</title>
<updated>2017-09-07T06:35:40Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-08-28T21:51:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=da16ed52c36aa200e60230de54271a8556dc8674'/>
<id>urn:sha1:da16ed52c36aa200e60230de54271a8556dc8674</id>
<content type='text'>
commit b339752d054fb32863418452dff350a1086885b1 upstream.

When !NUMA, cpumask_of_node(@node) equals cpu_online_mask regardless of
@node.  The assumption seems to be that if !NUMA, there shouldn't be more than
one node and thus reporting cpu_online_mask regardless of @node is
correct.  However, that assumption was broken years ago to support
DISCONTIGMEM and whether a system has multiple nodes or not is
separately controlled by NEED_MULTIPLE_NODES.

This means that, on a system with !NUMA &amp;&amp; NEED_MULTIPLE_NODES,
cpumask_of_node() will report cpu_online_mask for all possible nodes,
indicating that the CPUs are associated with multiple nodes which is an
impossible configuration.

This bug has been around forever but doesn't look like it has caused any
noticeable symptoms.  However, it triggers a WARN recently added to
workqueue to verify NUMA affinity configuration.

Fix it by reporting empty cpumask on non-zero nodes if !NUMA.
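
A minimal Python model of the intended behaviour, not the kernel macro
itself: with NUMA disabled, node 0 reports the online CPUs and every
other node reports an empty mask.  The function name mirrors
cpumask_of_node(), but the set-based signature is an assumption made
for illustration.

```python
def cpumask_of_node(node, cpu_online_mask):
    """Model of the !NUMA case: only node 0 owns the online CPUs."""
    if node == 0:
        return frozenset(cpu_online_mask)
    # Non-zero nodes have no CPUs associated with them when !NUMA.
    return frozenset()


online = {0, 1, 2, 3}
print(sorted(cpumask_of_node(0, online)))   # [0, 1, 2, 3]
print(sorted(cpumask_of_node(1, online)))   # []
```

In this model no CPU is ever reported as belonging to more than one
node, which is the property the workqueue check relies on.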

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-and-tested-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Clarify (and fix) MAX_LFS_FILESIZE macros</title>
<updated>2017-08-30T08:21:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-08-27T19:12:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b8fce38266dbf3fdc1e100a0c2ffd555b336f03d'/>
<id>urn:sha1:b8fce38266dbf3fdc1e100a0c2ffd555b336f03d</id>
<content type='text'>
commit 0cc3b0ec23ce4c69e1e890ed2b8d2fa932b14aad upstream.

We have a MAX_LFS_FILESIZE macro that is meant to be filled in by
filesystems (and other IO targets) that know they are 64-bit clean and
don't have any 32-bit limits in their IO path.

It turns out that our 32-bit value for that limit was bogus.  On 32-bit,
the VM layer is limited by the page cache to only 32-bit index values,
but our logic for that was confusing and actually wrong.  We used to
define that value to

	(((loff_t)PAGE_SIZE &lt;&lt; (BITS_PER_LONG-1))-1)

which is actually odd in several ways: it limits the index to 31 bits,
and then it limits files so that they can't have data in that last byte
of a page that has the highest 31-bit index (ie page index 0x7fffffff).

Neither of those limitations makes sense.  The index is actually the full
32 bit unsigned value, and we can use that whole full page.  So the
maximum size of the file would logically be "PAGE_SIZE &lt;&lt; BITS_PER_LONG".

However, we do want to avoid the maximum index, because we have code
that iterates over the page indexes, and we don't want that code to
overflow.  So the maximum size of a file on a 32-bit host should
actually be one page less than the full 32-bit index.

So the actual limit is ULONG_MAX &lt;&lt; PAGE_SHIFT.  That means that we will
not actually be using the page of that last index (ULONG_MAX), but we
can grow a file up to that limit.

The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug
Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5
volume.  It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one
byte less), but the actual true VM limit is one page less than 16TiB.
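
The two limits can be checked with a short Python calculation, assuming
4 KiB pages (a PAGE_SHIFT of 12) and a 32-bit long; the page size is an
assumption, since the text above does not state it.  Shifts are written
as multiplication by powers of two.

```python
# Assumed 32-bit configuration: 4 KiB pages, 32-bit unsigned long.
PAGE_SHIFT = 12
PAGE_SIZE = 2 ** PAGE_SHIFT
BITS_PER_LONG = 32
ULONG_MAX = 2 ** BITS_PER_LONG - 1
TIB = 2 ** 40

# Old macro: PAGE_SIZE shifted left by (BITS_PER_LONG - 1), minus one.
old_limit = PAGE_SIZE * 2 ** (BITS_PER_LONG - 1) - 1

# New limit: ULONG_MAX shifted left by PAGE_SHIFT.
new_limit = ULONG_MAX * 2 ** PAGE_SHIFT

print(old_limit == 8 * TIB - 1)             # True: 8TiB less one byte
print(new_limit == 16 * TIB - PAGE_SIZE)    # True: 16TiB less one page
```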

This was invisible until commit c2a9737f45e2 ("vfs,mm: fix a dead loop
in truncate_inode_pages_range()"), which started applying that
MAX_LFS_FILESIZE limit to block devices too.

NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is
actually just the offset type itself (loff_t), which is signed.  But for
clarity, on 64-bit, just use the maximum signed value, and don't make
people have to count the number of 'f' characters in the hex constant.

So just use LLONG_MAX for the 64-bit case.  That was what the value had
been before too, just written out as a hex constant.

Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
Reported-and-tested-by: Doug Nazar &lt;nazard@nazar.ca&gt;
Cc: Andreas Dilger &lt;adilger@dilger.ca&gt;
Cc: Mark Fasheh &lt;mfasheh@versity.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Cc: Dave Kleikamp &lt;shaggy@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
