<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/gfp.h, branch v5.10.154</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.154</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.154'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-10-15T21:43:29Z</updated>
<entry>
<title>Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping</title>
<updated>2020-10-15T21:43:29Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-10-15T21:43:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5a32c3413d3340f90c82c84b375ad4b335a59f28'/>
<id>urn:sha1:5a32c3413d3340f90c82c84b375ad4b335a59f28</id>
<content type='text'>
Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of &lt;linux/dma-mapping.h&gt;

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge &lt;linux/dma-noncoherent.h&gt; into &lt;linux/dma-map-ops.h&gt;
  dma-mapping: move large parts of &lt;linux/dma-direct.h&gt; to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove &lt;asm/dma-contiguous.h&gt;
  dma-mapping: merge &lt;linux/dma-contiguous.h&gt; into &lt;linux/dma-map-ops.h&gt;
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split &lt;linux/dma-mapping.h&gt;
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement -&gt;alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
</content>
</entry>
<entry>
<title>mm: remove unused alloc_page_vma_node()</title>
<updated>2020-10-14T01:38:34Z</updated>
<author>
<name>Wei Yang</name>
<email>richard.weiyang@linux.alibaba.com</email>
</author>
<published>2020-10-13T23:57:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f8fd52535c7326d72645c9878d7897aaf44db51c'/>
<id>urn:sha1:f8fd52535c7326d72645c9878d7897aaf44db51c</id>
<content type='text'>
No one use this macro anymore.

Also fix code style of policy_node().

Signed-off-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lkml.kernel.org/r/20200921021401.84508-1-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>include/linux/gfp.h: clarify usage of GFP_ATOMIC in !preemptible contexts</title>
<updated>2020-10-14T01:38:33Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2020-10-13T23:56:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ab00db216c9c78cc0a68bc4e27889c1ee374598d'/>
<id>urn:sha1:ab00db216c9c78cc0a68bc4e27889c1ee374598d</id>
<content type='text'>
There is a general understanding that GFP_ATOMIC/GFP_NOWAIT are to be used
from atomic contexts.  E.g.  from within a spin lock or from the IRQ
context.  This is correct but there are some atomic contexts where the
above doesn't hold.  One of them would be an NMI context.  Page allocator
has never supported that and the general fear of this context didn't let
anybody to actually even try to use the allocator there.  Good, but let's
be more specific about that.

Another such a context, and that is where people seem to be more daring,
is raw_spin_lock.  Mostly because it simply resembles regular spin lock
which is supported by the allocator and there is not any implementation
difference with !RT kernels in the first place.  Be explicit that such a
context is not supported by the allocator.  The underlying reason is that
zone-&gt;lock would have to become raw_spin_lock as well and that has turned
out to be a problem for RT
(http://lkml.kernel.org/r/87mu305c1w.fsf@nanos.tec.linutronix.de).

Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Uladzislau Rezki &lt;urezki@gmail.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Link: https://lkml.kernel.org/r/20200929123010.5137-1-mhocko@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: turn alloc_pages into an inline function</title>
<updated>2020-09-25T04:20:40Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-08-17T17:17:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=43ee5b6daa6c45246098493dab2c229d196c9cf6'/>
<id>urn:sha1:43ee5b6daa6c45246098493dab2c229d196c9cf6</id>
<content type='text'>
To prevent a compiler error when a method call alloc_pages is
added (which I plan to for the dma_map_ops).

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention</title>
<updated>2020-06-04T03:09:45Z</updated>
<author>
<name>Wei Yang</name>
<email>richard.weiyang@gmail.com</email>
</author>
<published>2020-06-03T22:59:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=01c0bfe061f309b848d51619f20495ee2acd7727'/>
<id>urn:sha1:01c0bfe061f309b848d51619f20495ee2acd7727</id>
<content type='text'>
Pageblock migrate type is encoded in GFP flags, just as zone_type and
zonelist.

Currently we use gfp_zone() and gfp_zonelist() to extract related
information, it would be proper to use the same naming convention for
migrate type.

Signed-off-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Link: http://lkml.kernel.org/r/20200329080823.7735-1-richard.weiyang@gmail.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: clarify __GFP_MEMALLOC usage</title>
<updated>2020-06-04T03:09:43Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2020-06-03T22:56:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=574c1ae66c12410a08aeef8474936baa50e0371d'/>
<id>urn:sha1:574c1ae66c12410a08aeef8474936baa50e0371d</id>
<content type='text'>
It seems that the existing documentation is not explicit about the
expected usage and potential risks enough.  While it is calls out that
users have to free memory when using this flag it is not really apparent
that users have to careful to not deplete memory reserves and that they
should implement some sort of throttling wrt.  freeing process.

This is partly based on Neil's explanation [1].

Let's also call out that a pre allocated pool allocator should be
considered.

[1] http://lkml.kernel.org/r/877dz0yxoa.fsf@notabene.neil.brown.name

[akpm@linux-foundation.org: coding style fixes]
[mhocko@kernel.org: update]
  Link: http://lkml.kernel.org/r/20200406070137.GC19426@dhcp22.suse.cz
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Link: http://lkml.kernel.org/r/20200403083543.11552-2-mhocko@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: make it clear that gfp reclaim modifiers are valid only for sleepable allocations</title>
<updated>2020-04-07T17:43:38Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2020-04-07T03:04:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=29fd1897070125ab49634524a20f146fb4240a51'/>
<id>urn:sha1:29fd1897070125ab49634524a20f146fb4240a51</id>
<content type='text'>
While it might be really clear to MM developers that gfp reclaim modifiers
are applicable only to sleepable allocations (those with
__GFP_DIRECT_RECLAIM) it seems that actual users of the API are not always
sure.  Make it explicit that they are not applicable for GFP_NOWAIT or
GFP_ATOMIC allocations which are the most commonly used non-sleepable
allocation masks.

Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Link: http://lkml.kernel.org/r/20200403083543.11552-3-mhocko@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/gup/writeback: add callbacks for inaccessible pages</title>
<updated>2020-04-02T16:35:27Z</updated>
<author>
<name>Claudio Imbrenda</name>
<email>imbrenda@linux.ibm.com</email>
</author>
<published>2020-04-02T04:05:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f28d43636d6f940e60abef4f0131119836c8ebd4'/>
<id>urn:sha1:f28d43636d6f940e60abef4f0131119836c8ebd4</id>
<content type='text'>
With the introduction of protected KVM guests on s390 there is now a
concept of inaccessible pages.  These pages need to be made accessible
before the host can access them.

While cpu accesses will trigger a fault that can be resolved, I/O accesses
will just fail.  We need to add a callback into architecture code for
places that will do I/O, namely when writeback is started or when a page
reference is taken.

This is not only to enable paging, file backing etc, it is also necessary
to protect the host against a malicious user space.  For example a bad
QEMU could simply start direct I/O on such protected memory.  We do not
want userspace to be able to trigger I/O errors and thus the logic is
"whenever somebody accesses that page (gup) or does I/O, make sure that
this page can be accessed".  When the guest tries to access that page we
will wait in the page fault handler for writeback to have finished and for
the page_ref to be the expected value.

On s390x the function is not supposed to fail, so it is ok to use a
WARN_ON on failure.  If we ever need some more finegrained handling we can
tackle this when we know the details.

Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Reviewed-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jérôme Glisse &lt;jglisse@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Link: http://lkml.kernel.org/r/20200306132537.783769-3-imbrenda@linux.ibm.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc: add alloc_contig_pages()</title>
<updated>2019-12-01T20:59:06Z</updated>
<author>
<name>Anshuman Khandual</name>
<email>anshuman.khandual@arm.com</email>
</author>
<published>2019-12-01T01:55:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5e27a2df03b8933aa7c1579816ecb6a071bb0e0d'/>
<id>urn:sha1:5e27a2df03b8933aa7c1579816ecb6a071bb0e0d</id>
<content type='text'>
HugeTLB helper alloc_gigantic_page() implements fairly generic
allocation method where it scans over various zones looking for a large
contiguous pfn range before trying to allocate it with
alloc_contig_range().

Other than deriving the requested order from 'struct hstate', there is
nothing HugeTLB specific in there.  This can be made available for
general use to allocate contiguous memory which could not have been
allocated through the buddy allocator.

alloc_gigantic_page() has been split carving out actual allocation
method which is then made available via new alloc_contig_pages() helper
wrapped under CONFIG_CONTIG_ALLOC.  All references to 'gigantic' have
been replaced with more generic term 'contig'.  Allocated pages here
should be freed with free_contig_range() or by calling __free_page() on
each allocated page.

Link: http://lkml.kernel.org/r/1571300646-32240-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Pavel Tatashin &lt;pavel.tatashin@microsoft.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>net: fix sk_page_frag() recursion from memory reclaim</title>
<updated>2019-10-28T23:17:31Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-10-24T20:50:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=20eb4f29b60286e0d6dc01d9c260b4bd383c58fb'/>
<id>urn:sha1:20eb4f29b60286e0d6dc01d9c260b4bd383c58fb</id>
<content type='text'>
sk_page_frag() optimizes skb_frag allocations by using per-task
skb_frag cache when it knows it's the only user.  The condition is
determined by seeing whether the socket allocation mask allows
blocking - if the allocation may block, it obviously owns the task's
context and ergo exclusively owns current-&gt;task_frag.

Unfortunately, this misses recursion through memory reclaim path.
Please take a look at the following backtrace.

 [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10
     ...
     tcp_sendmsg+0x27/0x40
     sock_sendmsg+0x30/0x40
     sock_xmit.isra.24+0xa1/0x170 [nbd]
     nbd_send_cmd+0x1d2/0x690 [nbd]
     nbd_queue_rq+0x1b5/0x3b0 [nbd]
     __blk_mq_try_issue_directly+0x108/0x1b0
     blk_mq_request_issue_directly+0xbd/0xe0
     blk_mq_try_issue_list_directly+0x41/0xb0
     blk_mq_sched_insert_requests+0xa2/0xe0
     blk_mq_flush_plug_list+0x205/0x2a0
     blk_flush_plug_list+0xc3/0xf0
 [1] blk_finish_plug+0x21/0x2e
     _xfs_buf_ioapply+0x313/0x460
     __xfs_buf_submit+0x67/0x220
     xfs_buf_read_map+0x113/0x1a0
     xfs_trans_read_buf_map+0xbf/0x330
     xfs_btree_read_buf_block.constprop.42+0x95/0xd0
     xfs_btree_lookup_get_block+0x95/0x170
     xfs_btree_lookup+0xcc/0x470
     xfs_bmap_del_extent_real+0x254/0x9a0
     __xfs_bunmapi+0x45c/0xab0
     xfs_bunmapi+0x15/0x30
     xfs_itruncate_extents_flags+0xca/0x250
     xfs_free_eofblocks+0x181/0x1e0
     xfs_fs_destroy_inode+0xa8/0x1b0
     destroy_inode+0x38/0x70
     dispose_list+0x35/0x50
     prune_icache_sb+0x52/0x70
     super_cache_scan+0x120/0x1a0
     do_shrink_slab+0x120/0x290
     shrink_slab+0x216/0x2b0
     shrink_node+0x1b6/0x4a0
     do_try_to_free_pages+0xc6/0x370
     try_to_free_mem_cgroup_pages+0xe3/0x1e0
     try_charge+0x29e/0x790
     mem_cgroup_charge_skmem+0x6a/0x100
     __sk_mem_raise_allocated+0x18e/0x390
     __sk_mem_schedule+0x2a/0x40
 [0] tcp_sendmsg_locked+0x8eb/0xe10
     tcp_sendmsg+0x27/0x40
     sock_sendmsg+0x30/0x40
     ___sys_sendmsg+0x26d/0x2b0
     __sys_sendmsg+0x57/0xa0
     do_syscall_64+0x42/0x100
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

In [0], tcp_send_msg_locked() was using current-&gt;page_frag when it
called sk_wmem_schedule().  It already calculated how many bytes can
be fit into current-&gt;page_frag.  Due to memory pressure,
sk_wmem_schedule() called into memory reclaim path which called into
xfs and then IO issue path.  Because the filesystem in question is
backed by nbd, the control goes back into the tcp layer - back into
tcp_sendmsg_locked().

nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes
sense - it's in the process of freeing memory and wants to be able to,
e.g., drop clean pages to make forward progress.  However, this
confused sk_page_frag() called from [2].  Because it only tests
whether the allocation allows blocking which it does, it now thinks
current-&gt;page_frag can be used again although it already was being
used in [0].

After [2] used current-&gt;page_frag, the offset would be increased by
the used amount.  When the control returns to [0],
current-&gt;page_frag's offset is increased and the previously calculated
number of bytes now may overrun the end of allocated memory leading to
silent memory corruptions.

Fix it by adding gfpflags_normal_context() which tests sleepable &amp;&amp;
!reclaim and use it to determine whether to use current-&gt;task_frag.

v2: Eric didn't like gfp flags being tested twice.  Introduce a new
    helper gfpflags_normal_context() and combine the two tests.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
