<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/swap.h, branch v3.10.44</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.10.44</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.10.44'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-04-29T22:54:38Z</updated>
<entry>
<title>mm: thp: add split tail pages to shrink page list in page reclaim</title>
<updated>2013-04-29T22:54:38Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-04-29T22:08:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5bc7b8aca942d03bf2716ddcfcb4e0b57e43a1b8'/>
<id>urn:sha1:5bc7b8aca942d03bf2716ddcfcb4e0b57e43a1b8</id>
<content type='text'>
During page reclaim, a huge page is split, and split_huge_page() adds
its tail pages to the LRU list.  Since we are reclaiming a huge page, it
is better to reclaim all of its subpages instead of just the head page.
This patch adds the split tail pages to the shrink page list so that
they can be reclaimed soon.

Before this patch, running a swap workload gives:
  thp_fault_alloc 3492
  thp_fault_fallback 608
  thp_collapse_alloc 6
  thp_collapse_alloc_failed 0
  thp_split 916

With this patch:
  thp_fault_alloc 4085
  thp_fault_fallback 16
  thp_collapse_alloc 90
  thp_collapse_alloc_failed 0
  thp_split 1272

Fallback allocations are reduced a lot (thp_fault_fallback drops from
608 to 16).

[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: allow for outstanding swap writeback accounting</title>
<updated>2013-04-29T22:54:38Z</updated>
<author>
<name>Seth Jennings</name>
<email>sjenning@linux.vnet.ibm.com</email>
</author>
<published>2013-04-29T22:08:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1eec6702a80e04416d528846a5ff2122484d95ec'/>
<id>urn:sha1:1eec6702a80e04416d528846a5ff2122484d95ec</id>
<content type='text'>
To prevent flooding the swap device with writebacks, frontswap backends
need to count and limit the number of outstanding writebacks.  The
incrementing of the counter can be done before the call to
__swap_writepage().  However, the caller must receive a notification
when the writeback completes in order to decrement the counter.

To achieve this functionality, this patch modifies __swap_writepage() to
take the bio completion callback function as an argument.

end_swap_bio_write(), the normal bio completion function, is also made
non-static so that code doing the accounting can call it after the
accounting is done.

There should be no behavioural change to existing code.

Signed-off-by: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Signed-off-by: Bob Liu &lt;bob.liu@oracle.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Reviewed-by: Dan Magenheimer &lt;dan.magenheimer@oracle.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: break up swap_writepage() for frontswap backends</title>
<updated>2013-04-29T22:54:38Z</updated>
<author>
<name>Seth Jennings</name>
<email>sjenning@linux.vnet.ibm.com</email>
</author>
<published>2013-04-29T22:08:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2f772e6cadf8ad8fca38927b17e6be028be669f5'/>
<id>urn:sha1:2f772e6cadf8ad8fca38927b17e6be028be669f5</id>
<content type='text'>
swap_writepage() is currently where frontswap hooks into the swap write
path to capture pages with the frontswap_store() function.  However, if
a frontswap backend wants to "resume" the writeback of a page to the
swap device, it can't call swap_writepage() as the page will simply
reenter the backend.

This patch separates swap_writepage() into a top and bottom half, the
bottom half named __swap_writepage() to allow a frontswap backend, like
zswap, to resume writeback beyond the frontswap_store() hook.
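
The split can be pictured with a small Python model (a sketch, not
kernel code).  swap_writepage_bottom() stands in for
__swap_writepage(); backend_pages, device_pages and
backend_resume_writeback() are hypothetical names, and the toy
frontswap_store() here accepts every page.

```python
# Illustrative model only; not kernel code.
backend_pages = set()   # pages captured by the (hypothetical) backend
device_pages = []       # pages written to the swap device


def frontswap_store(page):
    # Placeholder: this toy backend accepts every page (0 means success).
    backend_pages.add(page)
    return 0


def swap_writepage_bottom(page):
    # Stand-in for __swap_writepage(): write straight to the swap device.
    device_pages.append(page)


def swap_writepage(page):
    # Top half: offer the page to frontswap first; fall through to the
    # bottom half only if the backend did not take it.
    if frontswap_store(page) == 0:
        return
    swap_writepage_bottom(page)


def backend_resume_writeback(page):
    # Calling swap_writepage() here would re-enter frontswap_store();
    # the backend calls the bottom half directly instead.
    backend_pages.discard(page)
    swap_writepage_bottom(page)
```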

__add_to_swap_cache() is also made non-static so that the page for which
writeback is to be resumed can be added to the swap cache.

Signed-off-by: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Signed-off-by: Bob Liu &lt;bob.liu@oracle.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Reviewed-by: Dan Magenheimer &lt;dan.magenheimer@oracle.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vmscan: change type of vm_total_pages to unsigned long</title>
<updated>2013-02-24T01:50:22Z</updated>
<author>
<name>Zhang Yanfei</name>
<email>zhangyanfei@cn.fujitsu.com</email>
</author>
<published>2013-02-23T00:35:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b21e0b90ccb99a377bce0167fed1e881bb5065d7'/>
<id>urn:sha1:b21e0b90ccb99a377bce0167fed1e881bb5065d7</id>
<content type='text'>
This variable is calculated from nr_free_pagecache_pages so
change its type to unsigned long.

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix return type for functions nr_free_*_pages</title>
<updated>2013-02-24T01:50:21Z</updated>
<author>
<name>Zhang Yanfei</name>
<email>zhangyanfei@cn.fujitsu.com</email>
</author>
<published>2013-02-23T00:35:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ebec3862fd6eefe8301aa55ed2e30c685d831842'/>
<id>urn:sha1:ebec3862fd6eefe8301aa55ed2e30c685d831842</id>
<content type='text'>
Currently, the amount of RAM that functions nr_free_*_pages return is
held in unsigned int.  But in machines with big memory (exceeding 16TB),
the amount may be incorrect because of overflow, so fix it.
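
The 16TB figure can be checked with a short calculation.  This sketch
assumes 4 KiB pages and a 32-bit unsigned int; neither value is stated
in the commit message, and as_unsigned_int() is a hypothetical helper.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages
UINT_BITS = 32     # assumption: 32-bit unsigned int
TIB = 2**40


def as_unsigned_int(nr_pages):
    # Model truncation to unsigned int (wraps modulo 2**32).
    return nr_pages % 2**UINT_BITS


# Smallest page count that no longer fits, and the memory it represents.
first_overflow_pages = 2**UINT_BITS
first_overflow_bytes = first_overflow_pages * PAGE_SIZE
```

Under these assumptions first_overflow_bytes equals 16 * TIB, so a page
count for 16 TiB or more wraps around when stored in unsigned int.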

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Simon Horman &lt;horms@verge.net.au&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Eric Van Hensbergen &lt;ericvh@gmail.com&gt;
Cc: Ron Minnich &lt;rminnich@sandia.gov&gt;
Cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>swap: add per-partition lock for swapfile</title>
<updated>2013-02-24T01:50:17Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-02-23T00:34:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ec8acf20afb8534ed511f6613dd2226b9e301010'/>
<id>urn:sha1:ec8acf20afb8534ed511f6613dd2226b9e301010</id>
<content type='text'>
swap_lock is heavily contended when I test swapping to 3 fast SSDs
(throughput is even slightly lower than when swapping to 2 such SSDs).
The main contention comes from swap_info_get().  This patch tries to
close the gap by adding a new per-partition lock.

Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.

nr_swap_pages is now an atomic, so it can be changed without holding
swap_lock.  In theory, get_swap_page() may find no swap pages even
though free swap pages exist, but that does not look like a big problem.

Access to partition-specific data (scan_swap_map() and so on) is
protected only by swap_info_struct.lock.

Changing swap_info_struct.flags requires holding both swap_lock and
swap_info_struct.lock, because scan_swap_map() checks the flags.
Reading the flags is fine with either lock held.

If both swap_lock and swap_info_struct.lock must be held, we always take
the former first to avoid deadlock.
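
The locking rules above can be sketched in Python with ordinary
mutexes.  SwapInfo, set_flags() and read_flags() are hypothetical
stand-ins for swap_info_struct and its users; this is an illustration
of the ordering rule, not the kernel implementation.

```python
import threading

swap_lock = threading.Lock()   # stand-in for the global swap_lock


class SwapInfo:
    # Hypothetical stand-in for swap_info_struct.
    def __init__(self):
        self.lock = threading.Lock()   # per-partition lock
        self.flags = 0


def set_flags(si, flags):
    # Writers hold both locks, always taking swap_lock first.
    with swap_lock:
        with si.lock:
            si.flags = flags


def read_flags(si):
    # Readers may hold either lock; here the per-partition one.
    with si.lock:
        return si.flags
```

Because every path that needs both locks takes them in the same order,
two such paths cannot each end up waiting on the lock the other holds.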

swap_entry_free() can change swap_list.  To delete that code, we add a
new highest_priority_index.  Whenever get_swap_page() is called, we
check it.  If it's valid, we use it.

It's a pity that get_swap_page() still takes swap_lock.  In practice,
though, swap_lock isn't heavily contended in my test with this patch
(there are other, much heavier bottlenecks such as TLB flush).  It also
looks like get_swap_page() doesn't really need the lock: we never free
swap_info[] and we check the SWAP_WRITEOK flag.  The only risk without
the lock is that we could swap out to some low-priority swap device, but
we would quickly recover after several rounds of swap, so it doesn't
sound like a big deal to me.  I'd prefer to fix this if it turns out to
be a real problem.

"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s.  This patch further improves the
speed to 2.3G/s, an improvement of around 15%.  It's a multi-process
test, so TLB flush wasn't the biggest bottleneck before the patches.

[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Seth Jennings &lt;sjenning@linux.vnet.ibm.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Xiao Guangrong &lt;xiaoguangrong@linux.vnet.ibm.com&gt;
Cc: Dan Magenheimer &lt;dan.magenheimer@oracle.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>swap: make each swap partition have one address_space</title>
<updated>2013-02-24T01:50:17Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-02-23T00:34:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33806f06da654092182410d974b6d3c5396ea3eb'/>
<id>urn:sha1:33806f06da654092182410d974b6d3c5396ea3eb</id>
<content type='text'>
When I swap to several fast SSDs, swapper_space.tree_lock is heavily
contended.  This patch gives each swap partition its own address_space
to reduce the lock contention.  There is now an array of address_space
for swap, and the swap entry type is the index into the array.
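
A rough Python model of the per-partition layout, for illustration
only.  AddressSpace, the small MAX_SWAPFILES value, the (type, offset)
tuple used for a swap entry, and the helper bodies are all assumptions
made for this sketch, not the kernel data structures.

```python
import threading

MAX_SWAPFILES = 4   # assumption: small illustrative value


class AddressSpace:
    # Hypothetical stand-in for struct address_space.
    def __init__(self):
        self.tree_lock = threading.Lock()
        self.pages = {}   # maps swap offset to page


# One address space per swap partition, indexed by swap entry type.
swapper_spaces = [AddressSpace() for _ in range(MAX_SWAPFILES)]


def swap_address_space(entry):
    # entry is modelled as a (type, offset) tuple.
    swap_type, _offset = entry
    return swapper_spaces[swap_type]


def add_to_swap_cache(page, entry):
    space = swap_address_space(entry)
    with space.tree_lock:   # only this partition's lock is taken
        space.pages[entry[1]] = page
```

Writers targeting different partitions take different tree_lock
instances, which is where the reduced contention comes from.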

In my test with 3 SSDs, this increases swapout throughput by 20%.

[akpm@linux-foundation.org: revert unneeded change to  __add_to_swap_cache]
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: vmscan: save work scanning (almost) empty LRU lists</title>
<updated>2013-02-24T01:50:09Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2013-02-23T00:32:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d778df51c09264076fe0208c099ef7d428f21790'/>
<id>urn:sha1:d778df51c09264076fe0208c099ef7d428f21790</id>
<content type='text'>
In certain cases (kswapd reclaim, memcg target reclaim), a fixed minimum
amount of pages is scanned from the LRU lists on each iteration, to make
progress.

Do not make this minimum bigger than the respective LRU list size,
however, and save some busy work trying to isolate and reclaim pages
that are not there.
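
The clamping idea can be sketched as follows.  The value 32 for
SWAP_CLUSTER_MAX and the helper name forced_scan_count() are
assumptions made for this illustration; the commit message itself only
speaks of a fixed minimum.

```python
SWAP_CLUSTER_MAX = 32   # assumed value of the fixed minimum


def forced_scan_count(lru_size):
    # Never ask to scan more pages than the LRU list actually holds.
    return min(lru_size, SWAP_CLUSTER_MAX)
```

With this clamp an empty list yields a scan count of zero, so no work
is spent trying to isolate pages from it.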

Empty LRU lists are quite common with memory cgroups in NUMA
environments because there exists a set of LRU lists for each zone for
each memory cgroup, while the memory of a single cgroup is expected to
stay on just one node.  The number of expected empty LRU lists is thus

  memcgs * (nodes - 1) * lru types
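
As a worked instance of the formula above, a minimal Python sketch
(expected_empty_lru_lists() is a hypothetical helper, and the input
values in the note below are made up for illustration):

```python
def expected_empty_lru_lists(memcgs, nodes, lru_types):
    # Each memcg is expected to use one node, leaving the LRU lists
    # on the other (nodes - 1) nodes empty.
    return memcgs * (nodes - 1) * lru_types
```

For example, 100 memcgs on a 4-node machine with 5 LRU types would give
100 * 3 * 5 = 1500 expected empty lists.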

Each attempt to reclaim from an empty LRU list does expensive size
comparisons between lists, acquires the zone's lru lock etc.  Avoid
that.

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Satoru Moriya &lt;satoru.moriya@hds.com&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remove vma arg from page_evictable</title>
<updated>2012-10-09T07:22:55Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-10-08T23:33:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=39b5f29ac1f988c1615fbc9c69f6651ab0d0c3c7'/>
<id>urn:sha1:39b5f29ac1f988c1615fbc9c69f6651ab0d0c3c7</id>
<content type='text'>
page_evictable(page, vma) is an irritant: almost all its callers pass
NULL for vma.  Remove the vma arg and use mlocked_vma_newpage(vma, page)
explicitly in the couple of places it's needed.  But in those places we
don't even need page_evictable() itself!  They're dealing with a freshly
allocated anonymous page, which has no "mapping" and cannot be mlocked yet.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Ying Han &lt;yinghan@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: swap: implement generic handler for swap_activate</title>
<updated>2012-08-01T01:42:47Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-07-31T23:44:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a509bc1a9e487d952d9404318f7f990166ab57a7'/>
<id>urn:sha1:a509bc1a9e487d952d9404318f7f990166ab57a7</id>
<content type='text'>
The version of swap_activate introduced is sufficient for swap-over-NFS
but would not provide enough information to implement a generic handler.
This patch shuffles things slightly to ensure the same information is
available for aops-&gt;swap_activate() as is available to the core.

No functionality change.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;
Cc: Xiaotian Feng &lt;dfeng@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
