<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/ipc, branch v3.3.4</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.3.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.3.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-01-23T16:38:48Z</updated>
<entry>
<title>SHM_UNLOCK: fix Unevictable pages stranded after swap</title>
<updated>2012-01-23T16:38:48Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-01-20T22:34:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=245132643e1cfcd145bbc86a716c1818371fcb93'/>
<id>urn:sha1:245132643e1cfcd145bbc86a716c1818371fcb93</id>
<content type='text'>
Commit cc39c6a9bbde ("mm: account skipped entries to avoid looping in
find_get_pages") correctly fixed an infinite loop; but left a problem
that find_get_pages() on shmem would return 0 (appearing to callers to
mean end of tree) when it meets a run of nr_pages swap entries.

The only uses of find_get_pages() on shmem are via pagevec_lookup(),
called from invalidate_mapping_pages(), and from shmctl SHM_UNLOCK's
scan_mapping_unevictable_pages().  The first is already commented, and
not worth worrying about; but the second can leave pages on the
Unevictable list after an unusual sequence of swapping and locking.

Fix that by using shmem_find_get_pages_and_swap() (then ignoring the
swap) instead of pagevec_lookup().

But I don't want to contaminate vmscan.c with shmem internals, nor
shmem.c with LRU locking.  So move scan_mapping_unevictable_pages() into
shmem.c, renaming it shmem_unlock_mapping(); and rename
check_move_unevictable_page() to check_move_unevictable_pages(), looping
down an array of pages, oftentimes under the same lock.

Leave out the "rotate unevictable list" block: that's a leftover from
when this was used for /proc/sys/vm/scan_unevictable_pages, whose flawed
handling involved looking at pages at tail of LRU.

Was there significance to the sequence first ClearPageUnevictable, then
test page_evictable, then SetPageUnevictable here? I think not, we're
under LRU lock, and have no barriers between those.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Shaohua Li &lt;shaohua.li@intel.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; [back to 3.1 but will need respins]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>SHM_UNLOCK: fix long unpreemptible section</title>
<updated>2012-01-23T16:38:48Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-01-20T22:34:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=85046579bde15e532983438f86b36856e358f417'/>
<id>urn:sha1:85046579bde15e532983438f86b36856e358f417</id>
<content type='text'>
scan_mapping_unevictable_pages() is used to make SysV SHM_LOCKed pages
evictable again once the shared memory is unlocked.  It does this with
pagevec_lookup()s across the whole object (which might occupy most of
memory), and takes 300ms to unlock 7GB here.  A cond_resched() every
PAGEVEC_SIZE pages would be good.

However, KOSAKI-san points out that this is called under shmem.c's
info-&gt;lock, and it's also under shm.c's shm_lock(), both spinlocks.
There is no strong reason for that: we need to take these pages off the
unevictable list soonish, but those locks are not required for it.

So move the call to scan_mapping_unevictable_pages() from shmem.c's
unlock handling up to shm.c's unlock handling.  Remove the recently
added barrier, not needed now we have spin_unlock() before the scan.

Use get_file(), with subsequent fput(), to make sure we have a reference
to mapping throughout scan_mapping_unevictable_pages(): that's something
that was previously guaranteed by the shm_lock().

Remove shmctl's lru_add_drain_all(): we don't fault in pages at SHM_LOCK
time, and we lazily discover them to be Unevictable later, so it serves
no purpose for SHM_LOCK; and serves no purpose for SHM_UNLOCK, since
pages still on pagevec are not marked Unevictable.

The original code avoided redundant rescans by checking VM_LOCKED flag
at its level: now avoid them by checking shp's SHM_LOCKED.

The original code called scan_mapping_unevictable_pages() on a locked
area at shm_destroy() time: perhaps we once had accounting cross-checks
which required that, but not now, so skip the overhead and just let
inode eviction deal with them.

Put check_move_unevictable_page() and scan_mapping_unevictable_pages()
under CONFIG_SHMEM (with stub for the TINY case when ramfs is used),
more as comment than to save space; comment them used for SHM_UNLOCK.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Shaohua Li &lt;shaohua.li@intel.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue: simplify reading msgqueue limit</title>
<updated>2012-01-23T16:38:47Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@gnu.org</email>
</author>
<published>2012-01-20T22:34:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2a4e64b8f6bcbf23ddd375b78342051ae8862284'/>
<id>urn:sha1:2a4e64b8f6bcbf23ddd375b78342051ae8862284</id>
<content type='text'>
Because the current task is being used to get the limit, we can simply
use rlimit() instead of task_rlimit().

Signed-off-by: Davidlohr Bueso &lt;dave@gnu.org&gt;
Acked-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>user namespace: make signal.c respect user namespaces</title>
<updated>2012-01-11T00:30:54Z</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serge@hallyn.com</email>
</author>
<published>2012-01-10T23:11:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6b550f9495947fc279d12c38feaf98500e8d0646'/>
<id>urn:sha1:6b550f9495947fc279d12c38feaf98500e8d0646</id>
<content type='text'>
ipc/mqueue.c: for __SI_MESQ, convert the uid being sent to recipient's
user namespace. (new, thanks Oleg)

__send_signal: convert current's uid to the recipient's user namespace
for any siginfo which is not SI_FROMKERNEL (patch from Oleg, thanks
again :)

do_notify_parent and do_notify_parent_cldstop: map task's uid to parent's
user namespace

ptrace_signal maps parent's uid into current's user namespace before
including in signal to current.  IIUC Oleg has argued that this shouldn't
matter as the debugger will play with it, but it seems like not converting
the value currently being set is misleading.

Changelog:
Sep 20: Inspired by Oleg's suggestion, define map_cred_ns() helper to
	simplify callers and help make clear what we are translating
        (which uid into which namespace).  Passing the target task would
	make callers even easier to read, but we pass in user_ns because
	current_user_ns() != task_cred_xxx(current, user_ns).
Sep 20: As recommended by Oleg, also put task_pid_vnr() under rcu_read_lock
	in ptrace_signal().
Sep 23: In send_signal(), detect when (user) signal is coming from an
	ancestor or unrelated user namespace.  Pass that on to __send_signal,
	which sets si_uid to 0 or overflowuid if needed.
Oct 12: Base on Oleg's fixup_uid() patch.  On top of that, handle all
	SI_FROMKERNEL cases at callers, because we can't assume sender is
	current in those cases.
Nov 10: (mhelsley) rename fixup_uid to more meaningful usern_fixup_signal_uid
Nov 10: (akpm) make the !CONFIG_USER_NS case clearer

Signed-off-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
From: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Subject: __send_signal: pass q-&gt;info, not info, to userns_fixup_signal_uid (v2)

Eric Biederman pointed out that passing info is a bug and could lead to a
NULL pointer deref to boot.

A collection of signal, securebits, filecaps, cap_bounds, and a few other
ltp tests passed with this kernel.

Changelog:
    Nov 18: previous patch missed a leading '&amp;'

Signed-off-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
From: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Subject: ipc/mqueue: lock() =&gt; unlock() typo

There was a double lock typo introduced in b085f4bd6b21 "user namespace:
make signal.c respect user namespaces"

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>switch mq_open() to umode_t</title>
<updated>2012-01-04T03:55:16Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-07-26T09:26:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df0a42837b86567a130c44515ab620d23e7f182b'/>
<id>urn:sha1:df0a42837b86567a130c44515ab620d23e7f182b</id>
<content type='text'>
</content>
</entry>
<entry>
<title>mqueue: propagate umode_t</title>
<updated>2012-01-04T03:55:11Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-07-24T18:18:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b9d5ff7644ddf2723c9205f4726c95ec01bf033'/>
<id>urn:sha1:1b9d5ff7644ddf2723c9205f4726c95ec01bf033</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>switch -&gt;create() to umode_t</title>
<updated>2012-01-04T03:54:53Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-07-26T05:42:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4acdaf27ebe2034c342f3be57ef49aed1ad885ef'/>
<id>urn:sha1:4acdaf27ebe2034c342f3be57ef49aed1ad885ef</id>
<content type='text'>
vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>vfs: fix the stupidity with i_dentry in inode destructors</title>
<updated>2012-01-04T03:52:40Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-12-12T20:51:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6b520e0565422966cdf1c3759bd73df77b0f248c'/>
<id>urn:sha1:6b520e0565422966cdf1c3759bd73df77b0f248c</id>
<content type='text'>
Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
the cost of taking it into inode_init_always() will be negligible for pipes
and sockets and negative for everything else.  Not to mention the removal of
boilerplate code from -&gt;destroy_inode() instances...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>... and the same kind of leak for mqueue</title>
<updated>2011-12-09T05:40:21Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-12-09T05:38:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6f686574cccc2ef66fb38e41f19cedd81e7b4504'/>
<id>urn:sha1:6f686574cccc2ef66fb38e41f19cedd81e7b4504</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>ipc/sem.c: remove private structures from public header file</title>
<updated>2011-11-02T23:07:01Z</updated>
<author>
<name>Manfred Spraul</name>
<email>manfred@colorfullife.com</email>
</author>
<published>2011-11-02T20:38:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e57940d719e9fc5223d133b631f8cb5232d6064e'/>
<id>urn:sha1:e57940d719e9fc5223d133b631f8cb5232d6064e</id>
<content type='text'>
include/linux/sem.h contains several structures that are only used within
ipc/sem.c.

The patch moves them into ipc/sem.c - there is no need to expose the
structures to the whole kernel.

No functional changes, only whitespace cleanups and 80-char per line
fixes.

Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
