<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/mm/page_alloc.c, branch v3.4.93</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.93</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.93'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-02-20T18:45:32Z</updated>
<entry>
<title>mm: setup pageblock_order before it's used by sparsemem</title>
<updated>2014-02-20T18:45:32Z</updated>
<author>
<name>Xishi Qiu</name>
<email>qiuxishi@huawei.com</email>
</author>
<published>2012-07-31T23:43:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3fea8b0a9f978cce6d0685464d4e8acb9bbd1acc'/>
<id>urn:sha1:3fea8b0a9f978cce6d0685464d4e8acb9bbd1acc</id>
<content type='text'>
commit ca57df79d4f64e1a4886606af4289d40636189c5 upstream.

On architectures with CONFIG_HUGETLB_PAGE_SIZE_VARIABLE set, such as
Itanium, pageblock_order is a variable with default value of 0.  It's set
to the right value by set_pageblock_order() in function
free_area_init_core().

But pageblock_order may be used by sparse_init() before free_area_init_core()
is called along path:
sparse_init()
    -&gt;sparse_early_usemaps_alloc_node()
	-&gt;usemap_size()
	    -&gt;SECTION_BLOCKFLAGS_BITS
		-&gt;((1UL &lt;&lt; (PFN_SECTION_SHIFT - pageblock_order)) *
NR_PAGEBLOCK_BITS)

The uninitialized pageblock_size will cause memory wasting because
usemap_size() returns a much bigger value then it's really needed.

For example, on an Itanium platform,
sparse_init() pageblock_order=0 usemap_size=24576
free_area_init_core() before pageblock_order=0, usemap_size=24576
free_area_init_core() after pageblock_order=12, usemap_size=8

That means 24K memory has been wasted for each section, so fix it by calling
set_pageblock_order() from sparse_init().

Signed-off-by: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Signed-off-by: Jiang Liu &lt;liuj97@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Keping Chen &lt;chenkeping@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc.c: remove pageblock_default_order()</title>
<updated>2014-02-20T18:45:32Z</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2012-05-29T22:06:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=237597d8f73155bba781acf3f4f558f6ed08a8ee'/>
<id>urn:sha1:237597d8f73155bba781acf3f4f558f6ed08a8ee</id>
<content type='text'>
commit 955c1cd7401565671b064e499115344ec8067dfd upstream.

This has always been broken: one version takes an unsigned int and the
other version takes no arguments.  This bug was hidden because one
version of set_pageblock_order() was a macro which doesn't evaluate its
argument.

Simplify it all and remove pageblock_default_order() altogether.

Reported-by: rajman mekaco &lt;rajman.mekaco@gmail.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm, show_mem: suppress page counts in non-blockable contexts</title>
<updated>2013-10-13T22:42:49Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2013-04-29T22:06:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=022a41db8aa1bc0b4ff4c013f889292324a1c465'/>
<id>urn:sha1:022a41db8aa1bc0b4ff4c013f889292324a1c465</id>
<content type='text'>
commit 4b59e6c4730978679b414a8da61514a2518da512 upstream.

On large systems with a lot of memory, walking all RAM to determine page
types may take a half second or even more.

In non-blockable contexts, the page allocator will emit a page allocation
failure warning unless __GFP_NOWARN is specified.  In such contexts, irqs
are typically disabled and such a lengthy delay may even result in NMI
watchdog timeouts.

To fix this, suppress the page walk in such contexts when printing the
page allocation failure warning.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/memory-hotplug: fix lowmem count overflow when offline pages</title>
<updated>2013-08-04T08:26:07Z</updated>
<author>
<name>Wanpeng Li</name>
<email>liwanp@linux.vnet.ibm.com</email>
</author>
<published>2013-07-03T22:02:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=19d22ea89933ca48bcb10fc7919ed7bbefd52362'/>
<id>urn:sha1:19d22ea89933ca48bcb10fc7919ed7bbefd52362</id>
<content type='text'>
commit cea27eb2a202959783f81254c48c250ddd80e129 upstream.

The logic for the memory-remove code fails to correctly account the
Total High Memory when a memory block which contains High Memory is
offlined as shown in the example below.  The following patch fixes it.

Before logic memory remove:

MemTotal:        7603740 kB
MemFree:         6329612 kB
Buffers:           94352 kB
Cached:           872008 kB
SwapCached:            0 kB
Active:           626932 kB
Inactive:         519216 kB
Active(anon):     180776 kB
Inactive(anon):   222944 kB
Active(file):     446156 kB
Inactive(file):   296272 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       7294672 kB
HighFree:        5704696 kB
LowTotal:         309068 kB
LowFree:          624916 kB

After logic memory remove:

MemTotal:        7079452 kB
MemFree:         5805976 kB
Buffers:           94372 kB
Cached:           872000 kB
SwapCached:            0 kB
Active:           626936 kB
Inactive:         519236 kB
Active(anon):     180780 kB
Inactive(anon):   222944 kB
Active(file):     446156 kB
Inactive(file):   296292 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       7294672 kB
HighFree:        5181024 kB
LowTotal:       4294752076 kB
LowFree:          624952 kB

[mhocko@suse.cz: fix CONFIG_HIGHMEM=n build]
Signed-off-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[2.6.24+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Zhouping Liu &lt;zliu@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: fix pageblock bitmap allocation</title>
<updated>2013-02-28T14:58:58Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-18T17:58:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=17c2f96bd1de5999900303b3eb923b8cdf6ee2ba'/>
<id>urn:sha1:17c2f96bd1de5999900303b3eb923b8cdf6ee2ba</id>
<content type='text'>
commit 7c45512df987c5619db041b5c9b80d281e26d3db upstream.

Commit c060f943d092 ("mm: use aligned zone start for pfn_to_bitidx
calculation") fixed out calculation of the index into the pageblock
bitmap when a !SPARSEMEM zome was not aligned to pageblock_nr_pages.

However, the _allocation_ of that bitmap had never taken this alignment
requirement into accout, so depending on the exact size and alignment of
the zone, the use of that index could then access past the allocation,
resulting in some very subtle memory corruption.

This was reported (and bisected) by Ingo Molnar: one of his random
config builds would hang with certain very specific kernel command line
options.

In the meantime, commit c060f943d092 has been marked for stable, so this
fix needs to be back-ported to the stable kernels that backported the
commit to use the right alignment.

Bisected-and-tested-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: use aligned zone start for pfn_to_bitidx calculation</title>
<updated>2013-01-17T16:50:43Z</updated>
<author>
<name>Laura Abbott</name>
<email>lauraa@codeaurora.org</email>
</author>
<published>2013-01-11T22:31:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ce19422296fb5fb5997d81e1b0043d771def7996'/>
<id>urn:sha1:ce19422296fb5fb5997d81e1b0043d771def7996</id>
<content type='text'>
commit c060f943d0929f3e429c5d9522290584f6281d6e upstream.

The current calculation in pfn_to_bitidx assumes that (pfn -
zone-&gt;zone_start_pfn) &gt;&gt; pageblock_order will return the same bit for
all pfn in a pageblock.  If zone_start_pfn is not aligned to
pageblock_nr_pages, this may not always be correct.

Consider the following with pageblock order = 10, zone start 2MB:

  pfn     | pfn - zone start | (pfn - zone start) &gt;&gt; page block order
  ----------------------------------------------------------------
  0x26000 | 0x25e00	   |  0x97
  0x26100 | 0x25f00	   |  0x97
  0x26200 | 0x26000	   |  0x98
  0x26300 | 0x26100	   |  0x98

This means that calling {get,set}_pageblock_migratetype on a single page
will not set the migratetype for the full block.  Fix this by rounding
down zone_start_pfn when doing the bitidx calculation.

For our use case, the effects of this bug were mostly tied to the fact
that CMA allocations would either take a long time or fail to happen.
Depending on the driver using CMA, this could result in anything from
visual glitches to application failures.

Signed-off-by: Laura Abbott &lt;lauraa@codeaurora.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/page_alloc: fix the page address of higher page's buddy calculation</title>
<updated>2012-10-02T17:30:05Z</updated>
<author>
<name>Li Haifeng</name>
<email>omycle@gmail.com</email>
</author>
<published>2012-09-17T21:09:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aa7994f281a5e705b5f9cb13b3219fc346263872'/>
<id>urn:sha1:aa7994f281a5e705b5f9cb13b3219fc346263872</id>
<content type='text'>
commit 0ba8f2d59304dfe69b59c034de723ad80f7ab9ac upstream.

The heuristic method for buddy has been introduced since commit
43506fad21ca ("mm/page_alloc.c: simplify calculation of combined index
of adjacent buddy lists").  But the page address of higher page's buddy
was wrongly calculated, which will lead page_is_buddy to fail for ever.
IOW, the heuristic method would be disabled with the wrong page address
of higher page's buddy.

Calculating the page address of higher page's buddy should be based
higher_page with the offset between index of higher page and index of
higher page's buddy.

Signed-off-by: Haifeng Li &lt;omycle@gmail.com&gt;
Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: KyongHo Cho &lt;pullip.cho@samsung.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: raise MemFree by reverting percpu_pagelist_fraction to 0</title>
<updated>2012-05-11T16:23:39Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-05-11T08:00:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b76b02f15c70d5f392ee2e231fbd20a26063a77'/>
<id>urn:sha1:1b76b02f15c70d5f392ee2e231fbd20a26063a77</id>
<content type='text'>
Why is there less MemFree than there used to be?  It perturbed a test,
so I've just been bisecting linux-next, and now find the offender went
upstream yesterday.

Commit 93278814d359 "mm: fix division by 0 in percpu_pagelist_fraction()"
mistakenly initialized percpu_pagelist_fraction to the sysctl's minimum 8,
which leaves 1/8th of memory on percpu lists (on each cpu??); but most of
us expect it to be left unset at 0 (and it's not then used as a divisor).

  MemTotal: 8061476kB  8061476kB  8061476kB  8061476kB  8061476kB  8061476kB
  Repetitive test with percpu_pagelist_fraction 8:
  MemFree:  6948420kB  6237172kB  6949696kB  6840692kB  6949048kB  6862984kB
  Same test with percpu_pagelist_fraction back to 0:
  MemFree:  7945000kB  7944908kB  7948568kB  7949060kB  7948796kB  7948812kB

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
[ We really should fix the crazy sysctl interface too, but that's a
  separate thing - Linus ]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix division by 0 in percpu_pagelist_fraction()</title>
<updated>2012-05-10T22:06:44Z</updated>
<author>
<name>Sasha Levin</name>
<email>levinsasha928@gmail.com</email>
</author>
<published>2012-05-10T20:01:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=93278814d3590eba0ee360b8d69a35c7f2203ea8'/>
<id>urn:sha1:93278814d3590eba0ee360b8d69a35c7f2203ea8</id>
<content type='text'>
percpu_pagelist_fraction_sysctl_handler() has only considered -EINVAL as
a possible error from proc_dointvec_minmax().

If any other error is returned, it would proceed to divide by zero since
percpu_pagelist_fraction wasn't getting initialized at any point.  For
example, writing 0 bytes into the proc file would trigger the issue.

Signed-off-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Reviewed-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: only IPI CPUs to drain local pages if they exist</title>
<updated>2012-03-29T00:14:35Z</updated>
<author>
<name>Gilad Ben-Yossef</name>
<email>gilad@benyossef.com</email>
</author>
<published>2012-03-28T21:42:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=74046494ea68676d29ef6501a4bd950f08112a2c'/>
<id>urn:sha1:74046494ea68676d29ef6501a4bd950f08112a2c</id>
<content type='text'>
Calculate a cpumask of CPUs with per-cpu pages in any zone and only send
an IPI requesting CPUs to drain these pages to the buddy allocator if they
actually have pages when asked to flush.

This patch saves 85%+ of IPIs asking to drain per-cpu pages in case of
severe memory pressure that leads to OOM since in these cases multiple,
possibly concurrent, allocation requests end up in the direct reclaim code
path so when the per-cpu pages end up reclaimed on first allocation
failure for most of the proceeding allocation attempts until the memory
pressure is off (possibly via the OOM killer) there are no per-cpu pages
on most CPUs (and there can easily be hundreds of them).

This also has the side effect of shortening the average latency of direct
reclaim by 1 or more order of magnitude since waiting for all the CPUs to
ACK the IPI takes a long time.

Tested by running "hackbench 400" on a 8 CPU x86 VM and observing the
difference between the number of direct reclaim attempts that end up in
drain_all_pages() and those were more then 1/2 of the online CPU had any
per-cpu page in them, using the vmstat counters introduced in the next
patch in the series and using proc/interrupts.

In the test sceanrio, this was seen to save around 3600 global
IPIs after trigerring an OOM on a concurrent workload:

$ cat /proc/vmstat | tail -n 2
pcp_global_drain 0
pcp_global_ipi_saved 0

$ cat /proc/interrupts | grep CAL
CAL:          1          2          1          2
          2          2          2          2   Function call interrupts

$ hackbench 400
[OOM messages snipped]

$ cat /proc/vmstat | tail -n 2
pcp_global_drain 3647
pcp_global_ipi_saved 3642

$ cat /proc/interrupts | grep CAL
CAL:          6         13          6          3
          3          3         1 2          7   Function call interrupts

Please note that if the global drain is removed from the direct reclaim
path as a patch from Mel Gorman currently suggests this should be replaced
with an on_each_cpu_cond invocation.

Signed-off-by: Gilad Ben-Yossef &lt;gilad@benyossef.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Acked-by: Michal Nazarewicz &lt;mina86@mina86.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
