<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/mm.h, branch v4.1.37</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.1.37</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.1.37'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-10-23T23:37:51Z</updated>
<entry>
<title>mm: remove gup_flags FOLL_WRITE games from __get_user_pages()</title>
<updated>2016-10-23T23:37:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-13T20:07:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c865f98df72112a3997b219bf711bc46c1e90706'/>
<id>urn:sha1:c865f98df72112a3997b219bf711bc46c1e90706</id>
<content type='text'>
[ Upstream commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 ]

This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").

In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better).  The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9.  Earlier kernels will
have to look at the page state itself.

Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.

To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.

Reported-and-tested-by: Phil "not Paul" Oester &lt;kernel@linuxace.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Nick Piggin &lt;npiggin@gmail.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
</entry>
<entry>
<title>mm: use 'unsigned int' for page order</title>
<updated>2016-04-18T12:51:09Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-11-07T00:29:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4686552323f938fc04e4e6101a271810a1880f73'/>
<id>urn:sha1:4686552323f938fc04e4e6101a271810a1880f73</id>
<content type='text'>
[ Upstream commit d00181b96eb86c914cb327d1de974a1b71366e1b ]

Let's try to be consistent about data type of page order.

[sfr@canb.auug.org.au: fix build (type of pageblock_order)]
[hughd@google.com: some configs end up with MAX_ORDER and pageblock_order having different types]
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>mm: make page pfmemalloc check more robust</title>
<updated>2015-09-29T17:26:06Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2015-08-21T21:11:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c85ea6919f1cf092ad6418323e49faac6ac7cf3c'/>
<id>urn:sha1:c85ea6919f1cf092ad6418323e49faac6ac7cf3c</id>
<content type='text'>
commit 2f064f3485cd29633ad1b3cfb00cc519509a3d72 upstream.

Commit c48a11c7ad26 ("netvm: propagate page-&gt;pfmemalloc to skb") added
checks for page-&gt;pfmemalloc to __skb_fill_page_desc():

        if (page-&gt;pfmemalloc &amp;&amp; !page-&gt;mapping)
                skb-&gt;pfmemalloc = true;

It assumes page-&gt;mapping == NULL implies that page-&gt;pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page-&gt;mapping
to NULL and leave page-&gt;index value alone.  Due to being in union, a
non-zero page-&gt;index will be interpreted as true page-&gt;pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page-&gt;pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page-&gt;index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad26 ("netvm: propagate page-&gt;pfmemalloc to skb")
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Debugged-by: Vlastimil Babka &lt;vbabka@suse.com&gt;
Debugged-by: Jiri Bohac &lt;jbohac@suse.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>Revert "mm: avoid tail page refcounting on non-THP compound pages"</title>
<updated>2015-04-22T16:44:36Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-04-22T16:44:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f3ca10dde49043fbbb055854278b26954b549b36'/>
<id>urn:sha1:f3ca10dde49043fbbb055854278b26954b549b36</id>
<content type='text'>
This reverts commit 8d63d99a5dfbdb997d12dd3c07b2070ca723db3b.

It causes in VM mapping refcount errors:

  page:ffffea0010a15040 count:0 mapcount:1 mapping:          (null) index:0x0
  flags: 0x8000000000008014(referenced|dirty|tail)
  page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) != 0)
  ------------[ cut here ]------------
  kernel BUG at mm/swap.c:134!

as reported by Borislav Petkov

Reported-and-tested-by: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: linux-mm@kvack.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: new pfn_mkwrite same as page_mkwrite for VM_PFNMAP</title>
<updated>2015-04-15T23:35:20Z</updated>
<author>
<name>Boaz Harrosh</name>
<email>boaz@plexistor.com</email>
</author>
<published>2015-04-15T23:15:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dd9061846a3ba01b0fa45423aaa087e4a69187fa'/>
<id>urn:sha1:dd9061846a3ba01b0fa45423aaa087e4a69187fa</id>
<content type='text'>
This will allow FS that uses VM_PFNMAP | VM_MIXEDMAP (no page structs) to
get notified when access is a write to a read-only PFN.

This can happen if we mmap() a file then first mmap-read from it to
page-in a read-only PFN, than we mmap-write to the same page.

We need this functionality to fix a DAX bug, where in the scenario above
we fail to set ctime/mtime though we modified the file.  An xfstest is
attached to this patchset that shows the failure and the fix.  (A DAX
patch will follow)

This functionality is extra important for us, because upon dirtying of a
pmem page we also want to RDMA the page to a remote cluster node.

We define a new pfn_mkwrite and do not reuse page_mkwrite because
  1 - The name ;-)
  2 - But mainly because it would take a very long and tedious
      audit of all page_mkwrite functions of VM_MIXEDMAP/VM_PFNMAP
      users. To make sure they do not now CRASH. For example current
      DAX code (which this is for) would crash.
      If we would want to reuse page_mkwrite, We will need to first
      patch all users, so to not-crash-on-no-page. Then enable this
      patch. But even if I did that I would not sleep so well at night.
      Adding a new vector is the safest thing to do, and is not that
      expensive. an extra pointer at a static function vector per driver.
      Also the new vector is better for performance, because else we
      Will call all current Kernel vectors, so to:
        check-ha-no-page-do-nothing and return.

No need to call it from do_shared_fault because do_wp_page is called to
change pte permissions anyway.

Signed-off-by: Yigal Korman &lt;yigal@plexistor.com&gt;
Signed-off-by: Boaz Harrosh &lt;boaz@plexistor.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;matthew.r.wilcox@intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: uninline and cleanup page-mapping related helpers</title>
<updated>2015-04-15T23:35:19Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-04-15T23:14:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e39155ea11eac6da056b04669d7c9fc612e2065a'/>
<id>urn:sha1:e39155ea11eac6da056b04669d7c9fc612e2065a</id>
<content type='text'>
Most-used page-&gt;mapping helper -- page_mapping() -- has already uninlined.
 Let's uninline also page_rmapping() and page_anon_vma().  It saves us
depending on configuration around 400 bytes in text:

   text	   data	    bss	    dec	    hex	filename
 660318	  99254	 410000	1169572	 11d8a4	mm/built-in.o-before
 659854	  99254	 410000	1169108	 11d6d4	mm/built-in.o

I also tried to make code a bit more clean.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>include/linux/mm.h: simplify flag check</title>
<updated>2015-04-15T23:35:19Z</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2015-04-15T23:14:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cdd7875e0c4db5c41e28b645d3bf7d41bd2cbb45'/>
<id>urn:sha1:cdd7875e0c4db5c41e28b645d3bf7d41bd2cbb45</id>
<content type='text'>
Flip the flag test so that it is the simplest.  No functional change, just
a small readability improvement:

No code changed:

  # arch/x86/kernel/sys_x86_64.o:

   text    data     bss     dec     hex filename
   1551      24       0    1575     627 sys_x86_64.o.before
   1551      24       0    1575     627 sys_x86_64.o.after

md5:
   70708d1b1ad35cc891118a69dc1a63f9  sys_x86_64.o.before.asm
   70708d1b1ad35cc891118a69dc1a63f9  sys_x86_64.o.after.asm

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: avoid tail page refcounting on non-THP compound pages</title>
<updated>2015-04-15T23:35:17Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-04-15T23:13:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8d63d99a5dfbdb997d12dd3c07b2070ca723db3b'/>
<id>urn:sha1:8d63d99a5dfbdb997d12dd3c07b2070ca723db3b</id>
<content type='text'>
THP uses tail page refcounting to be able to split huge pages at any time.
 Tail page refcounting is not needed for other users of compound pages and
it's harmful because of overhead.

We try to exclude non-THP pages from tail page refcounting using
__compound_tail_refcounted() check.  It excludes most common non-THP
compound pages: SL*B and hugetlb, but it doesn't catch rest of __GFP_COMP
users -- drivers.

And it's not only about overhead.

Drivers might want to use compound pages to get refcounting semantics
suitable for mapping high-order pages to userspace.  But tail page
refcounting breaks it.

Tail page refcounting uses -&gt;_mapcount in tail pages to store GUP pins on
them.  It means GUP pins would affect page_mapcount() for tail pages.
It's not a problem for THP, because it never maps tail pages.  But unlike
THP, drivers map parts of compound pages with PTEs and it makes
page_mapcount() be called for tail pages.

In particular, GUP pins would shift PSS up and affect /proc/kpagecount for
such pages.  But, I'm not aware about anything which can lead to crash or
other serious misbehaviour.

Since currently all THP pages are anonymous and all drivers pages are not,
we can fix the __compound_tail_refcounted() check by requiring PageAnon()
to enable tail page refcounting.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: consolidate all page-flags helpers in &lt;linux/page-flags.h&gt;</title>
<updated>2015-04-15T23:35:17Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-04-15T23:13:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e8c6158fef15a1532bd5242a0cd88565eedabe61'/>
<id>urn:sha1:e8c6158fef15a1532bd5242a0cd88565eedabe61</id>
<content type='text'>
Currently we take a naive approach to page flags on compound pages - we
set the flag on the page without consideration if the flag makes sense
for tail page or for compound page in general.  This patchset try to
sort this out by defining per-flag policy on what need to be done if
page-flag helper operate on compound page.

The last patch in the patchset also sanitizes usege of page-&gt;mapping for
tail pages.  We don't define the meaning of page-&gt;mapping for tail
pages.  Currently it's always NULL, which can be inconsistent with head
page and potentially lead to problems.

For now I caught one case of illegal usage of page flags or -&gt;mapping:
sound subsystem allocates pages with __GFP_COMP and maps them with PTEs.
It leads to setting dirty bit on tail pages and access to tail_page's
-&gt;mapping.  I don't see any bad behaviour caused by this, but worth
fixing anyway.

This patchset makes more sense if you take my THP refcounting into
account: we will see more compound pages mapped with PTEs and we need to
define behaviour of flags on compound pages to avoid bugs.

This patch (of 16):

We have page-flags helper function declarations/definitions spread over
several header files.  Let's consolidate them in &lt;linux/page-flags.h&gt;.

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Steve Capper &lt;steve.capper@linaro.org&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: completely remove dumping per-cpu lists from show_mem()</title>
<updated>2015-04-14T23:49:01Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>koct9i@gmail.com</email>
</author>
<published>2015-04-14T22:45:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=761b06771adeeb734e9eebc6f70f916cb9e2f643'/>
<id>urn:sha1:761b06771adeeb734e9eebc6f70f916cb9e2f643</id>
<content type='text'>
It seems nobody needs this.

Signed-off-by: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
