<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/mm/mm_init.c, branch v4.12</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-03-17T22:09:34Z</updated>
<entry>
<title>mm: convert printk(KERN_&lt;LEVEL&gt; to pr_&lt;level&gt;</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2016-03-17T21:19:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1170532bb49f9468aedabdc1d5a560e2521a2bcc'/>
<id>urn:sha1:1170532bb49f9468aedabdc1d5a560e2521a2bcc</id>
<content type='text'>
Most of the mm subsystem uses pr_&lt;level&gt; so make it consistent.

Miscellanea:

 - Realign arguments
 - Add missing newline to format
 - kmemleak-test.c has a "kmemleak: " prefix added to the
   "Kmemleak testing" logging message via pr_fmt

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;	[percpu]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: meminit: remove mminit_verify_page_links</title>
<updated>2015-07-01T02:44:56Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-06-30T21:57:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=74033a798f5a5db368126ee6f690111cf019bf7a'/>
<id>urn:sha1:74033a798f5a5db368126ee6f690111cf019bf7a</id>
<content type='text'>
mminit_verify_page_links() is an extremely paranoid check that was
introduced when memory initialisation was being heavily reworked.
Profiles indicated that up to 10% of parallel memory initialisation was
spent on checking this for every page.  The cost could be reduced but in
practice this check only found problems very early during the
initialisation rewrite and has found nothing since.  This patch removes an
expensive unnecessary check.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Tested-by: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Tested-by: Waiman Long &lt;waiman.long@hp.com&gt;
Tested-by: Daniel J Blueman &lt;daniel@numascale.com&gt;
Acked-by: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Robin Holt &lt;robinmholt@gmail.com&gt;
Cc: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Waiman Long &lt;waiman.long@hp.com&gt;
Cc: Scott Norton &lt;scott.norton@hp.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: meminit: initialise remaining struct pages in parallel with kswapd</title>
<updated>2015-07-01T02:44:56Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-06-30T21:57:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7e18adb4f80bea90d30b62158694d97c31f71d37'/>
<id>urn:sha1:7e18adb4f80bea90d30b62158694d97c31f71d37</id>
<content type='text'>
Only a subset of struct pages are initialised at the moment.  When this
patch is applied kswapd initialise the remaining struct pages in parallel.

This should boot faster by spreading the work to multiple CPUs and
initialising data that is local to the CPU.  The user-visible effect on
large machines is that free memory will appear to rapidly increase early
in the lifetime of the system until kswapd reports that all memory is
initialised in the kernel log.  Once initialised there should be no other
user-visibile effects.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Tested-by: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Tested-by: Waiman Long &lt;waiman.long@hp.com&gt;
Tested-by: Daniel J Blueman &lt;daniel@numascale.com&gt;
Acked-by: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Robin Holt &lt;robinmholt@gmail.com&gt;
Cc: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Waiman Long &lt;waiman.long@hp.com&gt;
Cc: Scott Norton &lt;scott.norton@hp.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mm_init.c: mark mminit_loglevel __meminitdata</title>
<updated>2015-02-13T02:54:11Z</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2015-02-12T23:00:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=194e81512063e96763025a990f841f5ab24815ee'/>
<id>urn:sha1:194e81512063e96763025a990f841f5ab24815ee</id>
<content type='text'>
mminit_loglevel is only referenced from __init and __meminit functions, so
we can mark it __meminitdata.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vishnu Pratap Singh &lt;vishnu.ps@samsung.com&gt;
Cc: Pintu Kumar &lt;pintu.k@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mm_init.c: park mminit_verify_zonelist as __init</title>
<updated>2015-02-13T02:54:11Z</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2015-02-12T23:00:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0e2342c709aa568b90cde3387d6e588ca862a0ba'/>
<id>urn:sha1:0e2342c709aa568b90cde3387d6e588ca862a0ba</id>
<content type='text'>
The only caller of mminit_verify_zonelist is build_all_zonelists_init,
which is annotated with __init, so it should be safe to also mark the
former as __init, saving ~400 bytes of .text.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vishnu Pratap Singh &lt;vishnu.ps@samsung.com&gt;
Cc: Pintu Kumar &lt;pintu.k@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: bring back /sys/kernel/mm</title>
<updated>2014-01-28T05:02:39Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2014-01-28T01:06:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e82cb95d626a6bb0e4fe7db1f311dc22039c2ed3'/>
<id>urn:sha1:e82cb95d626a6bb0e4fe7db1f311dc22039c2ed3</id>
<content type='text'>
Commit da29bd36224b ("mm/mm_init.c: make creation of the mm_kobj happen
earlier than device_initcall") changed to pure_initcall(mm_sysfs_init).

That's too early: mm_sysfs_init() depends on core_initcall(ksysfs_init)
to have made the kernel_kobj directory "kernel" in which to create "mm".

Make it postcore_initcall(mm_sysfs_init).  We could use core_initcall(),
and depend upon Makefile link order kernel/ mm/ fs/ ipc/ security/ ...
as core_initcall(debugfs_init) and core_initcall(securityfs_init) do;
but better not.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mm_init.c: make creation of the mm_kobj happen earlier than device_initcall</title>
<updated>2014-01-24T00:36:52Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2014-01-23T23:53:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=da29bd36224bfa008df5d83df496c07e31a0da6d'/>
<id>urn:sha1:da29bd36224bfa008df5d83df496c07e31a0da6d</id>
<content type='text'>
The use of __initcall is to be eventually replaced by choosing one from
the prioritized groupings laid out in init.h header:

	pure_initcall               0
	core_initcall               1
	postcore_initcall           2
	arch_initcall               3
	subsys_initcall             4
	fs_initcall                 5
	device_initcall             6
	late_initcall               7

In the interim, all __initcall are mapped onto device_initcall, which as
can be seen above, comes quite late in the ordering.

Currently the mm_kobj is created with __initcall in mm_sysfs_init().
This means that any other initcalls that want to reference the mm_kobj
have to be device_initcall (or later), otherwise we will for example,
trip the BUG_ON(!kobj) in sysfs's internal_create_group().  This
unfairly restricts those users; for example something that clearly makes
sense to be an arch_initcall will not be able to choose that.

However, upon examination, it is only this way for historical reasons
(i.e.  simply not reprioritized yet).  We see that sysfs is ready quite
earlier in init/main.c via:

 vfs_caches_init
 |_ mnt_init
    |_ sysfs_init

well ahead of the processing of the prioritized calls listed above.

So we can recategorize mm_sysfs_init to be a pure_initcall, which in
turn allows any mm_kobj initcall users a wider range (1 --&gt; 7) of
initcall priorities to choose from.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: numa: Change page last {nid,pid} into {cpu,pid}</title>
<updated>2013-10-09T12:47:45Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-07T10:29:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=90572890d202527c366aa9489b32404e88a7c020'/>
<id>urn:sha1:90572890d202527c366aa9489b32404e88a7c020</id>
<content type='text'>
Change the per page last fault tracking to use cpu,pid instead of
nid,pid. This will allow us to try and lookup the alternate task more
easily. Note that even though it is the cpu that is store in the page
flags that the mpol_misplaced decision is still based on the node.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/1381141781-10992-43-git-send-email-mgorman@suse.de
[ Fixed build failure on 32-bit systems. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/numa: Set preferred NUMA node based on number of private faults</title>
<updated>2013-10-09T10:40:35Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2013-10-07T10:29:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b795854b1fa70f6aee923ae5df74ff7afeaddcaa'/>
<id>urn:sha1:b795854b1fa70f6aee923ae5df74ff7afeaddcaa</id>
<content type='text'>
Ideally it would be possible to distinguish between NUMA hinting faults that
are private to a task and those that are shared. If treated identically
there is a risk that shared pages bounce between nodes depending on
the order they are referenced by tasks. Ultimately what is desirable is
that task private pages remain local to the task while shared pages are
interleaved between sharing tasks running on different nodes to give good
average performance. This is further complicated by THP as even
applications that partition their data may not be partitioning on a huge
page boundary.

To start with, this patch assumes that multi-threaded or multi-process
applications partition their data and that in general the private accesses
are more important for cpu-&gt;memory locality in the general case. Also,
no new infrastructure is required to treat private pages properly but
interleaving for shared pages requires additional infrastructure.

To detect private accesses the pid of the last accessing task is required
but the storage requirements are a high. This patch borrows heavily from
Ingo Molnar's patch "numa, mm, sched: Implement last-CPU+PID hash tracking"
to encode some bits from the last accessing task in the page flags as
well as the node information. Collisions will occur but it is better than
just depending on the node information. Node information is then used to
determine if a page needs to migrate. The PID information is used to detect
private/shared accesses. The preferred NUMA node is selected based on where
the maximum number of approximately private faults were measured. Shared
faults are not taken into consideration for a few reasons.

First, if there are many tasks sharing the page then they'll all move
towards the same node. The node will be compute overloaded and then
scheduled away later only to bounce back again. Alternatively the shared
tasks would just bounce around nodes because the fault information is
effectively noise. Either way accounting for shared faults the same as
private faults can result in lower performance overall.

The second reason is based on a hypothetical workload that has a small
number of very important, heavily accessed private pages but a large shared
array. The shared array would dominate the number of faults and be selected
as a preferred node even though it's the wrong decision.

The third reason is that multiple threads in a process will race each
other to fault the shared page making the fault information unreliable.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
[ Fix complication error when !NUMA_BALANCING. ]
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1381141781-10992-30-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: tune vm_committed_as percpu_counter batching size</title>
<updated>2013-07-03T23:07:32Z</updated>
<author>
<name>Tim Chen</name>
<email>tim.c.chen@linux.intel.com</email>
</author>
<published>2013-07-03T22:02:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=917d9290af749fac9c4d90bacf18699c9d8ba28d'/>
<id>urn:sha1:917d9290af749fac9c4d90bacf18699c9d8ba28d</id>
<content type='text'>
Currently the per cpu counter's batch size for memory accounting is
configured as twice the number of cpus in the system.  However, for
system with very large memory, it is more appropriate to make it
proportional to the memory size per cpu in the system.

For example, for a x86_64 system with 64 cpus and 128 GB of memory, the
batch size is only 2*64 pages (0.5 MB).  So any memory accounting
changes of more than 0.5MB will overflow the per cpu counter into the
global counter.  Instead, for the new scheme, the batch size is
configured to be 0.4% of the memory/cpu = 8MB (128 GB/64 /256), which is
more inline with the memory size.

I've done a repeated brk test of 800KB (from will-it-scale test suite)
with 80 concurrent processes on a 4 socket Westmere machine with a total
of 40 cores.  Without the patch, about 80% of cpu is spent on spin-lock
contention within the vm_committed_as counter.  With the patch, there's
a 73x speedup on the benchmark and the lock contention drops off almost
entirely.

[akpm@linux-foundation.org: fix section mismatch]
Signed-off-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
