<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/Documentation/sysctl, branch stable/2.6.31.y</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=stable%2F2.6.31.y</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=stable%2F2.6.31.y'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2009-06-17T02:47:45Z</updated>
<entry>
<title>vmscan: properly account for the number of page cache pages zone_reclaim() can reclaim</title>
<updated>2009-06-17T02:47:45Z</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2009-06-16T22:33:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=90afa5de6f3fa89a733861e843377302479fcf7e'/>
<id>urn:sha1:90afa5de6f3fa89a733861e843377302479fcf7e</id>
<content type='text'>
A bug was brought to my attention against a distro kernel but it affects
mainline and I believe problems like this have been reported in various
guises on the mailing lists although I don't have specific examples at the
moment.

The reported problem was that malloc() stalled for a long time (minutes in
some cases) if a large tmpfs mount was occupying a large percentage of
memory overall.  The pages did not get cleaned or reclaimed by
zone_reclaim() because the zone_reclaim_mode was unsuitable, but the lists
are uselessly scanned frequencly making the CPU spin at near 100%.

This patchset intends to address that bug and bring the behaviour of
zone_reclaim() more in line with expectations which were noticed during
investigation.  It is based on top of mmotm and takes advantage of
Kosaki's work with respect to zone_reclaim().

Patch 1 fixes the heuristics that zone_reclaim() uses to determine if the
	scan should go ahead. The broken heuristic is what was causing the
	malloc() stall as it uselessly scanned the LRU constantly. Currently,
	zone_reclaim is assuming zone_reclaim_mode is 1 and historically it
	could not deal with tmpfs pages at all. This fixes up the heuristic so
	that an unnecessary scan is more likely to be correctly avoided.

Patch 2 notes that zone_reclaim() returning a failure automatically means
	the zone is marked full. This is not always true. It could have
	failed because the GFP mask or zone_reclaim_mode were unsuitable.

Patch 3 introduces a counter zreclaim_failed that will increment each
	time the zone_reclaim scan-avoidance heuristics fail. If that
	counter is rapidly increasing, then zone_reclaim_mode should be
	set to 0 as a temporarily resolution and a bug reported because
	the scan-avoidance heuristic is still broken.

This patch:

On NUMA machines, the administrator can configure zone_reclaim_mode that
is a more targetted form of direct reclaim.  On machines with large NUMA
distances for example, a zone_reclaim_mode defaults to 1 meaning that
clean unmapped pages will be reclaimed if the zone watermarks are not
being met.

There is a heuristic that determines if the scan is worthwhile but the
problem is that the heuristic is not being properly applied and is
basically assuming zone_reclaim_mode is 1 if it is enabled.  The lack of
proper detection can manfiest as high CPU usage as the LRU list is scanned
uselessly.

Historically, once enabled it was depending on NR_FILE_PAGES which may
include swapcache pages that the reclaim_mode cannot deal with.  Patch
vmscan-change-the-number-of-the-unmapped-files-in-zone-reclaim.patch by
Kosaki Motohiro noted that zone_page_state(zone, NR_FILE_PAGES) included
pages that were not file-backed such as swapcache and made a calculation
based on the inactive, active and mapped files.  This is far superior when
zone_reclaim==1 but if RECLAIM_SWAP is set, then NR_FILE_PAGES is a
reasonable starting figure.

This patch alters how zone_reclaim() works out how many pages it might be
able to reclaim given the current reclaim_mode.  If RECLAIM_SWAP is set in
the reclaim_mode it will either consider NR_FILE_PAGES as potential
candidates or else use NR_{IN}ACTIVE}_PAGES-NR_FILE_MAPPED to discount
swapcache and other non-file-backed pages.  If RECLAIM_WRITE is not set,
then NR_FILE_DIRTY number of pages are not candidates.  If RECLAIM_SWAP is
not set, then NR_FILE_MAPPED are not.

[kosaki.motohiro@jp.fujitsu.com: Estimate unmapped pages minus tmpfs pages]
[fengguang.wu@intel.com: Fix underflow problem in Kosaki's estimate]
Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>page allocator: use allocation flags as an index to the zone watermark</title>
<updated>2009-06-17T02:47:35Z</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2009-06-16T22:32:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=418589663d6011de9006425b6c5721e1544fb47a'/>
<id>urn:sha1:418589663d6011de9006425b6c5721e1544fb47a</id>
<content type='text'>
ALLOC_WMARK_MIN, ALLOC_WMARK_LOW and ALLOC_WMARK_HIGH determin whether
pages_min, pages_low or pages_high is used as the zone watermark when
allocating the pages.  Two branches in the allocator hotpath determine
which watermark to use.

This patch uses the flags as an array index into a watermark array that is
indexed with WMARK_* defines accessed via helpers.  All call sites that
use zone-&gt;pages_* are updated to use the helpers for accessing the values
and the array offsets for setting.

Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>trivial: Miscellaneous documentation typo fixes</title>
<updated>2009-06-12T16:01:47Z</updated>
<author>
<name>Matt LaPlante</name>
<email>kernel1@cyberdogtech.com</email>
</author>
<published>2009-04-27T13:06:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=19f594600110377ec4037fdf7fb93a25ec516212'/>
<id>urn:sha1:19f594600110377ec4037fdf7fb93a25ec516212</id>
<content type='text'>
Fix various typos in documentation txts.

Signed-off-by: Matt LaPlante &lt;kernel1@cyberdogtech.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' into next</title>
<updated>2009-05-22T08:40:59Z</updated>
<author>
<name>James Morris</name>
<email>jmorris@namei.org</email>
</author>
<published>2009-05-22T08:40:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2c9e703c618106f5383226fbb1f526cb11034f8a'/>
<id>urn:sha1:2c9e703c618106f5383226fbb1f526cb11034f8a</id>
<content type='text'>
Conflicts:
	fs/exec.c

Removed IMA changes (the IMA checks are now performed via may_open()).

Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm: add /proc controls for pdflush threads"</title>
<updated>2009-05-15T09:32:24Z</updated>
<author>
<name>Jens Axboe</name>
<email>jens.axboe@oracle.com</email>
</author>
<published>2009-05-15T09:32:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cd17cbfda004fe5f406c01b318c6378d9895896f'/>
<id>urn:sha1:cd17cbfda004fe5f406c01b318c6378d9895896f</id>
<content type='text'>
This reverts commit fafd688e4c0c34da0f3de909881117d374e4c7af.

Work is progressing to switch away from pdflush as the process backing
for flushing out dirty data. So it seems pointless to add more knobs
to control pdflush threads. The original author of the patch did not
have any specific use cases for adding the knobs, so we can easily
revert this before 2.6.30 to avoid having to maintain this API
forever.

Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' into next</title>
<updated>2009-05-08T07:56:47Z</updated>
<author>
<name>James Morris</name>
<email>jmorris@namei.org</email>
</author>
<published>2009-05-08T07:56:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d254117099d711f215e62427f55dfb8ebd5ad011'/>
<id>urn:sha1:d254117099d711f215e62427f55dfb8ebd5ad011</id>
<content type='text'>
</content>
</entry>
<entry>
<title>mm: prevent divide error for small values of vm_dirty_bytes</title>
<updated>2009-05-02T22:36:10Z</updated>
<author>
<name>Andrea Righi</name>
<email>righi.andrea@gmail.com</email>
</author>
<published>2009-04-30T22:08:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9e4a5bda89034502fb144331e71a0efdfd5fae97'/>
<id>urn:sha1:9e4a5bda89034502fb144331e71a0efdfd5fae97</id>
<content type='text'>
Avoid setting less than two pages for vm_dirty_bytes: this is necessary to
avoid potential division by 0 (like the following) in get_dirty_limits().

[   49.951610] divide error: 0000 [#1] PREEMPT SMP
[   49.952195] last sysfs file: /sys/devices/pci0000:00/0000:00:01.1/host0/target0:0:0/0:0:0:0/block/sda/uevent
[   49.952195] CPU 1
[   49.952195] Modules linked in: pcspkr
[   49.952195] Pid: 3064, comm: dd Not tainted 2.6.30-rc3 #1
[   49.952195] RIP: 0010:[&lt;ffffffff802d39a9&gt;]  [&lt;ffffffff802d39a9&gt;] get_dirty_limits+0xe9/0x2c0
[   49.952195] RSP: 0018:ffff88001de03a98  EFLAGS: 00010202
[   49.952195] RAX: 00000000000000c0 RBX: ffff88001de03b80 RCX: 28f5c28f5c28f5c3
[   49.952195] RDX: 0000000000000000 RSI: 00000000000000c0 RDI: 0000000000000000
[   49.952195] RBP: ffff88001de03ae8 R08: 0000000000000000 R09: 0000000000000000
[   49.952195] R10: ffff88001ddda9a0 R11: 0000000000000001 R12: 0000000000000001
[   49.952195] R13: ffff88001fbc8218 R14: ffff88001de03b70 R15: ffff88001de03b78
[   49.952195] FS:  00007fe9a435b6f0(0000) GS:ffff8800025d9000(0000) knlGS:0000000000000000
[   49.952195] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   49.952195] CR2: 00007fe9a39ab000 CR3: 000000001de38000 CR4: 00000000000006e0
[   49.952195] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   49.952195] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   49.952195] Process dd (pid: 3064, threadinfo ffff88001de02000, task ffff88001ddda250)
[   49.952195] Stack:
[   49.952195]  ffff88001fa0de00 ffff88001f2dbd70 ffff88001f9fe800 000080b900000000
[   49.952195]  00000000000000c0 ffff8800027a6100 0000000000000400 ffff88001fbc8218
[   49.952195]  0000000000000000 0000000000000600 ffff88001de03bb8 ffffffff802d3ed7
[   49.952195] Call Trace:
[   49.952195]  [&lt;ffffffff802d3ed7&gt;] balance_dirty_pages_ratelimited_nr+0x1d7/0x3f0
[   49.952195]  [&lt;ffffffff80368f8e&gt;] ? ext3_writeback_write_end+0x9e/0x120
[   49.952195]  [&lt;ffffffff802cc7df&gt;] generic_file_buffered_write+0x12f/0x330
[   49.952195]  [&lt;ffffffff802cce8d&gt;] __generic_file_aio_write_nolock+0x26d/0x460
[   49.952195]  [&lt;ffffffff802cda32&gt;] ? generic_file_aio_write+0x52/0xd0
[   49.952195]  [&lt;ffffffff802cda49&gt;] generic_file_aio_write+0x69/0xd0
[   49.952195]  [&lt;ffffffff80365fa6&gt;] ext3_file_write+0x26/0xc0
[   49.952195]  [&lt;ffffffff803034d1&gt;] do_sync_write+0xf1/0x140
[   49.952195]  [&lt;ffffffff80290d1a&gt;] ? get_lock_stats+0x2a/0x60
[   49.952195]  [&lt;ffffffff80280730&gt;] ? autoremove_wake_function+0x0/0x40
[   49.952195]  [&lt;ffffffff8030411b&gt;] vfs_write+0xcb/0x190
[   49.952195]  [&lt;ffffffff803042d0&gt;] sys_write+0x50/0x90
[   49.952195]  [&lt;ffffffff8022ff6b&gt;] system_call_fastpath+0x16/0x1b
[   49.952195] Code: 00 00 00 2b 05 09 1c 17 01 48 89 c6 49 0f af f4 48 c1 ee 02 48 89 f0 48 f7 e1 48 89 d6 31 d2 48 c1 ee 02 48 0f af 75 d0 48 89 f0 &lt;48&gt; f7 f7 41 8b 95 ac 01 00 00 48 89 c7 49 0f af d4 48 c1 ea 02
[   49.952195] RIP  [&lt;ffffffff802d39a9&gt;] get_dirty_limits+0xe9/0x2c0
[   49.952195]  RSP &lt;ffff88001de03a98&gt;
[   50.096523] ---[ end trace 008d7aa02f244d7b ]---

Signed-off-by: Andrea Righi &lt;righi.andrea@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Documentation/sysctl/net.txt: fix a typo</title>
<updated>2009-04-13T22:04:28Z</updated>
<author>
<name>Li Zefan</name>
<email>lizf@cn.fujitsu.com</email>
</author>
<published>2009-04-13T21:39:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ca8b9950298c84ca528a5943409a727c04ec88f8'/>
<id>urn:sha1:ca8b9950298c84ca528a5943409a727c04ec88f8</id>
<content type='text'>
s/spicified/specified

Signed-off-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: add /proc controls for pdflush threads</title>
<updated>2009-04-07T15:31:03Z</updated>
<author>
<name>Peter W Morreale</name>
<email>pmorreale@novell.com</email>
</author>
<published>2009-04-07T02:00:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fafd688e4c0c34da0f3de909881117d374e4c7af'/>
<id>urn:sha1:fafd688e4c0c34da0f3de909881117d374e4c7af</id>
<content type='text'>
Add /proc entries to give the admin the ability to control the minimum and
maximum number of pdflush threads.  This allows finer control of pdflush
on both large and small machines.

The rationale is simply one size does not fit all.  Admins on large and/or
small systems may want to tune the min/max pdflush thread count to best
suit their needs.  Right now the min/max is hardcoded to 2/8.  While
probably a fair estimate for smaller machines, large machines with large
numbers of CPUs and large numbers of filesystems/block devices may benefit
from larger numbers of threads working on different block devices.

Even if the background flushing algorithm is radically changed, it is
still likely that multiple threads will be involved and admins would still
desire finer control on the min/max other than to have to recompile the
kernel.

The patch adds '/proc/sys/vm/nr_pdflush_threads_min' and
'/proc/sys/vm/nr_pdflush_threads_max' with r/w permissions.

The minimum value for nr_pdflush_threads_min is 1 and the maximum value is
the current value of nr_pdflush_threads_max.  This minimum is required
since additional thread creation is performed in a pdflush thread itself.

The minimum value for nr_pdflush_threads_max is the current value of
nr_pdflush_threads_min and the maximum value can be 1000.

Documentation/sysctl/vm.txt is also updated.

[akpm@linux-foundation.org: fix comment, fix whitespace, use __read_mostly]
Signed-off-by: Peter W Morreale &lt;pmorreale@novell.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>documentation: fix unix_dgram_qlen description</title>
<updated>2009-04-03T02:04:53Z</updated>
<author>
<name>Li Xiaodong</name>
<email>lixd@cn.fujitsu.com</email>
</author>
<published>2009-04-02T23:57:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=45dad7bd9d9b65a30d6e790b111f6f2d8f746d22'/>
<id>urn:sha1:45dad7bd9d9b65a30d6e790b111f6f2d8f746d22</id>
<content type='text'>
Previous description about system parameter in /proc/sys/net/unix/ is
wrong (or missed).  Simply add a new description about unix_dgram_qlen
according to latest kernel.

Signed-off-by: Li Xiaodong &lt;lixd@cn.fujitsu.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
