<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/mm, branch v3.4.92</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.92</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.92'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-06-07T23:02:03Z</updated>
<entry>
<title>percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree()</title>
<updated>2014-06-07T23:02:03Z</updated>
<author>
<name>Jianyu Zhan</name>
<email>nasa4836@gmail.com</email>
</author>
<published>2014-04-14T05:47:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6baddd03b577bfdbf103570c0b3257679c99e267'/>
<id>urn:sha1:6baddd03b577bfdbf103570c0b3257679c99e267</id>
<content type='text'>
commit 5a838c3b60e3a36ade764cf7751b8f17d7c9c2da upstream.

pcpu_chunk_struct_size = sizeof(struct pcpu_chunk) +
	BITS_TO_LONGS(pcpu_unit_pages) * sizeof(unsigned long)

It hardly could be ever bigger than PAGE_SIZE even for large-scale machine,
but for consistency with its couterpart pcpu_mem_zalloc(),
use pcpu_mem_free() instead.

Commit b4916cb17c26 ("percpu: make pcpu_free_chunk() use
pcpu_mem_free() instead of kfree()") addressed this problem, but
missed this one.

tj: commit message updated

Signed-off-by: Jianyu Zhan &lt;nasa4836@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 099a19d91ca4 ("percpu: allow limited allocation before slab is online)
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage</title>
<updated>2014-06-07T23:02:01Z</updated>
<author>
<name>Chen Yucong</name>
<email>slaoub@gmail.com</email>
</author>
<published>2014-05-22T18:54:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1ebe3d11087a74fea05620b13e33b0a30f6af2d0'/>
<id>urn:sha1:1ebe3d11087a74fea05620b13e33b0a30f6af2d0</id>
<content type='text'>
commit b985194c8c0a130ed155b71662e39f7eaea4876f upstream.

For handling a free hugepage in memory failure, the race will happen if
another thread hwpoisoned this hugepage concurrently.  So we need to
check PageHWPoison instead of !PageHWPoison.

If hwpoison_filter(p) returns true or a race happens, then we need to
unlock_page(hpage).

Signed-off-by: Chen Yucong &lt;slaoub@gmail.com&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Tested-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Reviewed-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: make fixup_user_fault() check the vma access rights too</title>
<updated>2014-06-07T23:02:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-04-22T20:49:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3be2bb4956e5d9d849329d5d1b92f6510ac12315'/>
<id>urn:sha1:3be2bb4956e5d9d849329d5d1b92f6510ac12315</id>
<content type='text'>
commit 1b17844b29ae042576bea588164f2f1e9590a8bc upstream.

fixup_user_fault() is used by the futex code when the direct user access
fails, and the futex code wants it to either map in the page in a usable
form or return an error.  It relied on handle_mm_fault() to map the
page, and correctly checked the error return from that, but while that
does map the page, it doesn't actually guarantee that the page will be
mapped with sufficient permissions to be then accessed.

So do the appropriate tests of the vma access rights by hand.

[ Side note: arguably handle_mm_fault() could just do that itself, but
  we have traditionally done it in the caller, because some callers -
  notably get_user_pages() - have been able to access pages even when
  they are mapped with PROT_NONE.  Maybe we should re-visit that design
  decision, but in the meantime this is the minimal patch. ]

Found by Dave Jones running his trinity tool.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()</title>
<updated>2014-06-07T23:01:57Z</updated>
<author>
<name>Mizuma, Masayoshi</name>
<email>m.mizuma@jp.fujitsu.com</email>
</author>
<published>2014-04-18T22:07:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e4e351a30ae3928b889cf23e5153075591938926'/>
<id>urn:sha1:e4e351a30ae3928b889cf23e5153075591938926</id>
<content type='text'>
commit 7848a4bf51b34f41fcc9bd77e837126d99ae84e3 upstream.

soft lockup in freeing gigantic hugepage fixed in commit 55f67141a892 "mm:
hugetlb: fix softlockup when a large number of hugepages are freed." can
happen in return_unused_surplus_pages(), so let's fix it.

Signed-off-by: Masayoshi Mizuma &lt;m.mizuma@jp.fujitsu.com&gt;
Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: hugetlb: fix softlockup when a large number of hugepages are freed.</title>
<updated>2014-05-06T14:51:45Z</updated>
<author>
<name>Mizuma, Masayoshi</name>
<email>m.mizuma@jp.fujitsu.com</email>
</author>
<published>2014-04-07T22:37:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=af4acfaf3ca2f1da6e48d67e50f27f5e22fa3308'/>
<id>urn:sha1:af4acfaf3ca2f1da6e48d67e50f27f5e22fa3308</id>
<content type='text'>
commit 55f67141a8927b2be3e51840da37b8a2320143ed upstream.

When I decrease the value of nr_hugepage in procfs a lot, softlockup
happens.  It is because there is no chance of context switch during this
process.

On the other hand, when I allocate a large number of hugepages, there is
some chance of context switch.  Hence softlockup doesn't happen during
this process.  So it's necessary to add the context switch in the
freeing process as same as allocating process to avoid softlockup.

When I freed 12 TB hugapages with kernel-2.6.32-358.el6, the freeing
process occupied a CPU over 150 seconds and following softlockup message
appeared twice or more.

$ echo 6000000 &gt; /proc/sys/vm/nr_hugepages
$ cat /proc/sys/vm/nr_hugepages
6000000
$ grep ^Huge /proc/meminfo
HugePages_Total:   6000000
HugePages_Free:    6000000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
$ echo 0 &gt; /proc/sys/vm/nr_hugepages

BUG: soft lockup - CPU#16 stuck for 67s! [sh:12883] ...
Pid: 12883, comm: sh Not tainted 2.6.32-358.el6.x86_64 #1
Call Trace:
  free_pool_huge_page+0xb8/0xd0
  set_max_huge_pages+0x128/0x190
  hugetlb_sysctl_handler_common+0x113/0x140
  hugetlb_sysctl_handler+0x1e/0x20
  proc_sys_call_handler+0x97/0xd0
  proc_sys_write+0x14/0x20
  vfs_write+0xb8/0x1a0
  sys_write+0x51/0x90
  __audit_syscall_exit+0x265/0x290
  system_call_fastpath+0x16/0x1b

I have not confirmed this problem with upstream kernels because I am not
able to prepare the machine equipped with 12TB memory now.  However I
confirmed that the amount of decreasing hugepages was directly
proportional to the amount of required time.

I measured required times on a smaller machine.  It showed 130-145
hugepages decreased in a millisecond.

  Amount of decreasing     Required time      Decreasing rate
  hugepages                     (msec)         (pages/msec)
  ------------------------------------------------------------
  10,000 pages == 20GB         70 -  74          135-142
  30,000 pages == 60GB        208 - 229          131-144

It means decrement of 6TB hugepages will trigger softlockup with the
default threshold 20sec, in this decreasing rate.

Signed-off-by: Masayoshi Mizuma &lt;m.mizuma@jp.fujitsu.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/hotplug: correctly add new zone to all other nodes' zone lists</title>
<updated>2014-03-11T23:10:04Z</updated>
<author>
<name>Jiang Liu</name>
<email>jiang.liu@huawei.com</email>
</author>
<published>2012-07-31T23:43:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=446327d6f1763bbe7f9793cc8dae04fca82c9d48'/>
<id>urn:sha1:446327d6f1763bbe7f9793cc8dae04fca82c9d48</id>
<content type='text'>
commit 08dff7b7d629807dbb1f398c68dd9cd58dd657a1 upstream.

When online_pages() is called to add new memory to an empty zone, it
rebuilds all zone lists by calling build_all_zonelists().  But there's a
bug which prevents the new zone to be added to other nodes' zone lists.

online_pages() {
	build_all_zonelists()
	.....
	node_set_state(zone_to_nid(zone), N_HIGH_MEMORY)
}

Here the node of the zone is put into N_HIGH_MEMORY state after calling
build_all_zonelists(), but build_all_zonelists() only adds zones from
nodes in N_HIGH_MEMORY state to the fallback zone lists.
build_all_zonelists()

    -&gt;__build_all_zonelists()
	-&gt;build_zonelists()
	    -&gt;find_next_best_node()
		-&gt;for_each_node_state(n, N_HIGH_MEMORY)

So memory in the new zone will never be used by other nodes, and it may
cause strange behavor when system is under memory pressure.  So put node
into N_HIGH_MEMORY state before calling build_all_zonelists().

Signed-off-by: Jianguo Wu &lt;wujianguo@huawei.com&gt;
Signed-off-by: Jiang Liu &lt;liuj97@gmail.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Keping Chen &lt;chenkeping@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Qiang Huang &lt;h.huangqiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: vmscan: fix endless loop in kswapd balancing</title>
<updated>2014-03-11T23:10:02Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2012-11-29T21:54:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f47929fd5093c4b5c134ff2b2811ae327102bafd'/>
<id>urn:sha1:f47929fd5093c4b5c134ff2b2811ae327102bafd</id>
<content type='text'>
commit 60cefed485a02bd99b6299dad70666fe49245da7 upstream.

Kswapd does not in all places have the same criteria for a balanced
zone.  Zones are only being reclaimed when their high watermark is
breached, but compaction checks loop over the zonelist again when the
zone does not meet the low watermark plus two times the size of the
allocation.  This gets kswapd stuck in an endless loop over a small
zone, like the DMA zone, where the high watermark is smaller than the
compaction requirement.

Add a function, zone_balanced(), that checks the watermark, and, for
higher order allocations, if compaction has enough free memory.  Then
use it uniformly to check for balanced zones.

This makes sure that when the compaction watermark is not met, at least
reclaim happens and progress is made - or the zone is declared
unreclaimable at some point and skipped entirely.

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: George Spelvin &lt;linux@horizon.com&gt;
Reported-by: Johannes Hirte &lt;johannes.hirte@fem.tu-ilmenau.de&gt;
Reported-by: Tomas Racek &lt;tracek@redhat.com&gt;
Tested-by: Johannes Hirte &lt;johannes.hirte@fem.tu-ilmenau.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[hq: Backported to 3.4: adjust context]
Signed-off-by: Qiang Huang &lt;h.huangqiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>mm: setup pageblock_order before it's used by sparsemem</title>
<updated>2014-02-20T18:45:32Z</updated>
<author>
<name>Xishi Qiu</name>
<email>qiuxishi@huawei.com</email>
</author>
<published>2012-07-31T23:43:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3fea8b0a9f978cce6d0685464d4e8acb9bbd1acc'/>
<id>urn:sha1:3fea8b0a9f978cce6d0685464d4e8acb9bbd1acc</id>
<content type='text'>
commit ca57df79d4f64e1a4886606af4289d40636189c5 upstream.

On architectures with CONFIG_HUGETLB_PAGE_SIZE_VARIABLE set, such as
Itanium, pageblock_order is a variable with default value of 0.  It's set
to the right value by set_pageblock_order() in function
free_area_init_core().

But pageblock_order may be used by sparse_init() before free_area_init_core()
is called along path:
sparse_init()
    -&gt;sparse_early_usemaps_alloc_node()
	-&gt;usemap_size()
	    -&gt;SECTION_BLOCKFLAGS_BITS
		-&gt;((1UL &lt;&lt; (PFN_SECTION_SHIFT - pageblock_order)) *
NR_PAGEBLOCK_BITS)

The uninitialized pageblock_size will cause memory wasting because
usemap_size() returns a much bigger value then it's really needed.

For example, on an Itanium platform,
sparse_init() pageblock_order=0 usemap_size=24576
free_area_init_core() before pageblock_order=0, usemap_size=24576
free_area_init_core() after pageblock_order=12, usemap_size=8

That means 24K memory has been wasted for each section, so fix it by calling
set_pageblock_order() from sparse_init().

Signed-off-by: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Signed-off-by: Jiang Liu &lt;liuj97@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Keping Chen &lt;chenkeping@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc.c: remove pageblock_default_order()</title>
<updated>2014-02-20T18:45:32Z</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2012-05-29T22:06:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=237597d8f73155bba781acf3f4f558f6ed08a8ee'/>
<id>urn:sha1:237597d8f73155bba781acf3f4f558f6ed08a8ee</id>
<content type='text'>
commit 955c1cd7401565671b064e499115344ec8067dfd upstream.

This has always been broken: one version takes an unsigned int and the
other version takes no arguments.  This bug was hidden because one
version of set_pageblock_order() was a macro which doesn't evaluate its
argument.

Simplify it all and remove pageblock_default_order() altogether.

Reported-by: rajman mekaco &lt;rajman.mekaco@gmail.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq()</title>
<updated>2014-02-20T18:45:32Z</updated>
<author>
<name>KOSAKI Motohiro</name>
<email>kosaki.motohiro@jp.fujitsu.com</email>
</author>
<published>2014-02-06T20:04:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d4bed8141f7883dfbed44749653f769367b3e01'/>
<id>urn:sha1:4d4bed8141f7883dfbed44749653f769367b3e01</id>
<content type='text'>
commit a85d9df1ea1d23682a0ed1e100e6965006595d06 upstream.

During aio stress test, we observed the following lockdep warning.  This
mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but
__set_page_dirty_nobuffers unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave() instead of
spin_lock_irq() because they don't know caller at all.

   other info that might help us debug this:
    Possible unsafe locking scenario:

          CPU0
          ----
     lock(&amp;(&amp;ctx-&gt;completion_lock)-&gt;rlock);
     &lt;Interrupt&gt;
       lock(&amp;(&amp;ctx-&gt;completion_lock)-&gt;rlock);

    *** DEADLOCK ***

      dump_stack+0x19/0x1b
      print_usage_bug+0x1f7/0x208
      mark_lock+0x21d/0x2a0
      mark_held_locks+0xb9/0x140
      trace_hardirqs_on_caller+0x105/0x1d0
      trace_hardirqs_on+0xd/0x10
      _raw_spin_unlock_irq+0x2c/0x50
      __set_page_dirty_nobuffers+0x8c/0xf0
      migrate_page_copy+0x434/0x540
      aio_migratepage+0xb1/0x140
      move_to_new_page+0x7d/0x230
      migrate_pages+0x5e5/0x700
      migrate_misplaced_page+0xbc/0xf0
      do_numa_page+0x102/0x190
      handle_pte_fault+0x241/0x970
      handle_mm_fault+0x265/0x370
      __do_page_fault+0x172/0x5a0
      do_page_fault+0x1a/0x70
      page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
