user/sven/linux.git - Linux Kernel

Age	Commit message	Author
2006-04-17	[PATCH] shmat: stop mprotect from giving write permission to a readonly attachment (CVE-2006-1524)	Hugh Dickins
	I found that all of 2.4 and 2.6 have been letting mprotect give write permission to a readonly attachment of shared memory, whether or not IPC would give the caller that permission. SUS says "The behaviour of this function [mprotect] is unspecified if the mapping was not established by a call to mmap", but I don't think we can interpret that as allowing it to subvert IPC permissions. I haven't tried 2.2, but the 2.2.26 source looks like it gets it right; and the patch below reproduces that behaviour - mprotect cannot be used to add write permission to a shared memory segment attached readonly. This patch is simple, and I'm sure it's what we should have done in 2.4.0: if you want to go on to switch write permission on and off with mprotect, just don't attach the segment readonly in the first place. However, we could have accumulated apps which attach readonly (even though they would be permitted to attach read/write), and which subsequently use mprotect to switch write permission on and off: it's not unreasonable. I was going to add a second ipcperms check in do_shmat, to check for writable when readonly, and if not writable find_vma and clear VM_MAYWRITE. But security_ipc_permission might do auditing, and it seems wrong to report an attempt for write permission when there has been none. Or we could flag the vma as SHM, note the shmid or shp in vm_private_data, and then get mprotect to check. But the patch below is a lot simpler: I'd rather stick with it, if we can convince ourselves somehow that it'll be safe. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-02-10	[PATCH] shmdt cannot detach not-alined shm segment cleanly.	KAMEZAWA Hiroyuki
	sys_shmdt() can manage shm segments which are covered by multiple vmas. (This can happen when a user uses mprotect() after shmat().) This works well if shm is aligned to PAGE_SIZE, but if not, the last segment cannot be detached. This is because of a comparison in sys_shmdt(): (vma->vm_end - addr) < size, where addr is the return address of shmat() and size is shmsize, the argument to shmget(). size should be aligned to PAGE_SIZE before being compared with vma->vm_end, which is aligned. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Manfred Spraul <manfred@colorfullife.com> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
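The alignment problem described in the shmdt entry above can be sketched in userspace C. This is only an illustration, not the kernel code: DEMO_PAGE_SIZE, demo_page_align() and demo_vma_in_segment() are hypothetical names, the page size of 4096 is assumed, and the sketch uses <= where the message writes < so that a vma ending exactly at the segment end counts as part of the segment.

```c
#include <assert.h>

/* Hypothetical page size for illustration; the kernel uses the architecture's PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096UL

/* Round size up to the next multiple of the page size. */
static unsigned long demo_page_align(unsigned long size)
{
	return (size + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/*
 * Does a vma ending at vm_end still belong to the segment attached at addr?
 * The <= is an assumption of this sketch, so that a vma ending exactly at the
 * segment end counts as part of it.
 */
static int demo_vma_in_segment(unsigned long vm_end, unsigned long addr,
			       unsigned long size)
{
	return (vm_end - addr) <= size;
}

/* A 5000-byte segment occupies two pages, so its last vma ends at addr + 8192. */
void demo_shmdt_check(void)
{
	unsigned long addr = 0x10000UL;
	unsigned long vm_end = addr + 2 * DEMO_PAGE_SIZE;

	/* Raw size: the last vma looks like it lies outside the segment. */
	assert(!demo_vma_in_segment(vm_end, addr, 5000));
	/* Page-aligned size: the last vma is recognised and can be detached. */
	assert(demo_vma_in_segment(vm_end, addr, demo_page_align(5000)));
}
```

Because vm_end is always page-aligned, comparing it against an unaligned byte count can only match when the segment size happens to be a multiple of the page size, which is the case the message says worked.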
2006-02-09	[NETLINK]: Fix a severe bug	Alexey Kuznetsov
	netlink overrun was broken during the improvement of netlink. The destination socket is used where the source socket was meant to be, so overrun is now never sent to user netlink sockets when it should be, and it can even be set on a kernel socket, which results in complete deadlock of rtnetlink. The suggested fix is to restore the status quo by passing the source socket as an additional argument to netlink_attachskb(). A little explanation: overrun is set on a socket when it failed to receive some message and the sender of this message does not handle, or has no way to handle, this error. This happens in two cases: 1. when the kernel sends something. The kernel never retransmits and cannot wait for buffer space. 2. when a user sends a broadcast and the message was not delivered to some recipients. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-15	correct email address of Manfred Spraul	Christian Kujau
	I tried to send the forcedeth maintainer an email, but it came back with: "The mail address manfreds@colorfullife.com is not read anymore. Please resent your mail to manfred@ instead of manfreds@." This patch fixes this. Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-14	[PATCH] Fix double decrement of mqueue_mnt->mnt_count in sys_mq_open	Alexander Viro
	Fixed the refcounting on failure exits in sys_mq_open() and cleaned the logic up. Rules are actually pretty simple - dentry_open() expects vfsmount and dentry to be pinned down and it either transfers them into created struct file or drops them. Old code had been very confused in that area - if dentry_open() had failed either in do_open() or do_create(), we ended up with dentry and mqueue_mnt dropped twice, once by dentry_open() cleanup and then by sys_mq_open(). Fix consists of making the rules for do_create() and do_open() same as for dentry_open() and updating the sys_mq_open() accordingly; that actually leads to more straightforward code and less work on normal path. Signed-off-by: Al Viro <aviro@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11	[PATCH] move capable() to capability.h	Randy.Dunlap
	- Move capable() from sched.h to capability.h; - Use <linux/capability.h> where capable() is used (in include/, block/, ipc/, kernel/, a few drivers/, mm/, security/, & sound/; many more drivers/ to go) Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09	[PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem	Jes Sorensen
	This patch converts the inode semaphore to a mutex. I have tested it on XFS and compiled as much as one can consider on an ia64. Anyway your luck with it might be different. Modified-by: Ingo Molnar <mingo@elte.hu> (finished the conversion) Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-08	[PATCH] ipc: expand shm_flags	Andrew Morton
	Unobfuscate this struct member. Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06	[PATCH] NOMMU: Make SYSV IPC SHM use ramfs facilities on NOMMU	David Howells
	The attached patch makes the SYSV IPC shared memory facilities use the new ramfs facilities on a no-MMU kernel. The following changes are made: (1) There are now shmem_mmap() and shmem_get_unmapped_area() functions to allow the IPC SHM facilities to commune with the tiny-shmem and shmem code. (2) ramfs files now need resizing using do_truncate() rather than by modifying the inode size directly (see shmem_file_setup()). This causes ramfs to attempt to bind a block of pages of sufficient size to the inode. (3) CONFIG_SYSVIPC is no longer contingent on CONFIG_MMU. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-24	Fix silly typo ("smb" vs "smp")	Linus Torvalds
	Introduced by commit 6003a93e7bf6c02f33c02976ff364785d4273295
2005-12-24	[PATCH] add missing memory barriers to ipc/sem.c	Manfred Spraul
	Two smp_wmb() statements are missing in the sysv sem code: This could cause stack corruptions. The attached patch adds them. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07	Merge master.kernel.org:/pub/scm/linux/kernel/git/bunk/trivial	Linus Torvalds

2005-11-07	[PATCH] more kernel-doc cleanups, additions	Randy Dunlap
	Various core kernel-doc cleanups: - add missing function parameters in ipc, irq/manage, kernel/sys, kernel/sysctl, and mm/slab; - move description to just above function for kernel_restart() Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07	[PATCH] SHM_NORESERVE flags for shmget()	Badari Pulavarty
	Add SHM_NORESERVE functionality similar to MAP_NORESERVE for shared memory segments. This is mainly to avoid abuse of OVERCOMMIT_ALWAYS and this flag is ignored for OVERCOMMIT_NEVER. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-06	Update Michal Wronski contact info	Michal Wronski

2005-10-29	[PATCH] hugetlb: remove repeated code	Krishnakumar R
	Clean up some repeated code related to HugeTLB. hugetlb_zero_setup would have already allocated the file->f_op. Signed-off-by: Krishnakumar. R <rkrishnakumar@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-27	[PATCH] Make POSIX message queue sys_mq_open() honor umask	Krzysztof Benedyczak
	We ignored umask when creating new queues via mq_open (when creating with open() on mqueue fs it is ok of course). According to the specification this is a bug. This trivial patch fixes this. Signed-off-by: Krzysztof Benedyczak <golbi@mat.uni.torun.pl> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
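Honoring the umask means clearing the masked permission bits from the mode a caller requests. The sketch below shows only that arithmetic; demo_apply_umask() is a hypothetical helper and does not reproduce how sys_mq_open() itself obtains or applies the mask.

```c
#include <assert.h>
#include <sys/types.h>

/* Permission bits the creator asked for, minus whatever the umask forbids. */
static mode_t demo_apply_umask(mode_t requested, mode_t mask)
{
	return requested & ~mask;
}

void demo_umask_check(void)
{
	/* A queue requested as rw-rw-rw- under the common umask 022 ends up rw-r--r--. */
	assert(demo_apply_umask(0666, 022) == 0644);
	/* A restrictive umask of 077 strips all group and other bits. */
	assert(demo_apply_umask(0660, 077) == 0600);
}
```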
2005-09-10	[PATCH] merge some from Rusty's trivial patches	Adrian Bunk
	This patch contains the most trivial from Rusty's trivial patches: - spelling fixes - remove duplicate includes Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] ipc: convert /proc/sysvipc/* to generic seq_file interface	Mike Waychison
	Change the /proc/sysvipc/shm|sem|msg files to use the generic seq_file implementation for struct ipc_ids. Signed-off-by: Mike Waychison <mikew@google.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] ipc: add generic struct ipc_ids seq_file iteration	Mike Waychison
	The following two patches convert /proc/sysvipc/* to use seq_file. This gives us the following: - Self-consistent IPC records in proc. - O(n) reading of the files themselves. This patch: Add a generic method for ipc types to be displayed using seq_file. This patch abstracts out seq_file iterating over struct ipc_ids into ipc/util.c Signed-off-by: Mike Waychison <mikew@google.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07	[PATCH] compat: be more consistent about [ug]id_t	Stephen Rothwell
	When I first wrote the compat layer patches, I was somewhat cavalier about the definition of compat_uid_t and compat_gid_t (or maybe I just misunderstood :-)). This patch makes the compat types much more consistent with the types we are being compatible with and hopefully will fix a few bugs along the way.
	compat type	type in compat arch
	__compat_[ug]id_t	__kernel_[ug]id_t
	__compat_[ug]id32_t	__kernel_[ug]id32_t
	compat_[ug]id_t	[ug]id_t
	The difference is that compat_uid_t is always 32 bits (for the archs we care about) but __compat_uid_t may be 16 bits on some. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-05	[PATCH] Fix semundo lock leakage	Ingo Molnar
	semundo->lock can leak if semundo->refcount goes from 2 to 1 while another thread has it locked. This causes major problems for PREEMPT kernels. The simplest fix for now is to undo the single-thread optimization. This bug was found via relentless testing by Dominik Karall. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01	[PATCH] shm: CONFIG_SHMEM=n build fix	Andrew Morton
	Fix bug found by Grant Coady <lkml@dodo.com.au>'s autobuild setup. shmem_set_policy() and shmem_get_policy() are macros if !CONFIG_SHMEM, so this doesn't work. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12	[PATCH] xtensa: remove old syscalls	Chris Zankel
	This patch fixes some minor bugs introduced by the previous patch (remove old syscalls). Both patches remove the obsolete syscalls. The changes in this patch were suggested by Arnd Bergmann. The vmlinux.lds.S changes are required for the latest gcc/binutils. Signed-off-by: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07	[PATCH] put_compat_shminfo() warning fix	Jesse Millan
	GCC 4 complains because the function put_compat_shminfo() can't get to its return statement if there is no error... If the function does not return -EFAULT, it doesn't return anything at all. Looks like a typo. Signed-off-by: Jesse Millan <jessem@cs.pdx.edu> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] ipcsem: remove superflous decrease variable from sys_semtimedop	Manfred Spraul
	Patrick noticed that the initial scan of the semaphore operations logs decrease and increase operations separately, but then both cases are or'ed together and decrease is never used. The attached patch removes the decrease parameter - it shrinks sys_semtimedop() by 56 bytes. Signed-Of-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01	[PATCH] convert that currently tests _NSIG directly to use valid_signal()	Jesper Juhl
	Convert most of the current code that uses _NSIG directly to instead use valid_signal(). This avoids gcc -W warnings and off-by-one errors. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
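The off-by-one risk mentioned in the valid_signal() entry above comes from each call site writing its own range check. A minimal sketch, assuming signal numbers up to and including the limit are valid: DEMO_NSIG, demo_valid_signal() and demo_open_coded_check() are hypothetical stand-ins, and the value 64 is chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical limit for illustration; stands in for the architecture's _NSIG. */
#define DEMO_NSIG 64

/* One shared helper: every caller gets the same, inclusive upper bound. */
static int demo_valid_signal(unsigned long sig)
{
	return sig <= DEMO_NSIG;
}

/* An open-coded check of the kind a call site might get wrong. */
static int demo_open_coded_check(unsigned long sig)
{
	return sig < DEMO_NSIG;	/* off by one: rejects the highest signal */
}

void demo_signal_check(void)
{
	assert(demo_valid_signal(DEMO_NSIG));
	assert(!demo_valid_signal(DEMO_NSIG + 1));
	/* The open-coded variant disagrees exactly at the boundary. */
	assert(!demo_open_coded_check(DEMO_NSIG));
}
```

Taking the argument as an unsigned type also avoids a separate "less than zero" comparison, which is one plausible source of the compiler warnings the message mentions; that connection is an assumption here.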
2005-05-01	[PATCH] consolidate sys_shmat	Stephen Rothwell
	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01	[PATCH] use smp_mb/wmb/rmb where possible	akpm@osdl.org
	Replace a number of memory barriers with smp_ variants. This means we won't take the unnecessary hit on UP machines. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-16	Merge	Linus Torvalds

2005-03-13	[PATCH] verify_area cleanup : i386 and misc.	Jesper Juhl
	This patch converts verify_area to access_ok in arch/i386, fs/, kernel/ and a few other bits that didn't fit in the other patches or that I actually was able to test on my hardware - this is by far the best tested of all the patches. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-09	[PATCH] consolidate the last of the compat sigevent structs	Stephen Rothwell
	This patch pulls together the compat_sigevent structs. It also consolidates the copying of these structures into the kernel. The only part of the second union in sigevent that the kernel looks at currently is the _tid, so that is the only bit we copy. This patch depends on my previous two patches "add and use COMPAT_SIGEV_PAD_SIZE" and "Consolidate the last compat sigvals". Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-01	Audit IPC object owner/permission changes.	David Woodhouse
	Add linked list of auxiliary data to audit_context. Add callbacks in IPC_SET functions to record requested changes. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-01-04	[PATCH] fix missing wakeup in ipc/sem	Manfred Spraul
	My patch that removed the spin_lock calls from the tail of sys_semtimedop introduced a bug: Before my patch was merged, every operation that altered an array called update_queue. That call woke up threads that were waiting until a semaphore value becomes 0. I've accidentally removed that call. The attached patch fixes that by modifying update_queue: the function now loops internally and wakes up all threads. The patch also removes update_queue calls from the error path of sys_semtimedop: failed operations do not modify the array, no need to rescan the list of waiting threads. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-12-12	[PATCH] shmctl SHM_LOCK perms	Hugh Dickins
	Michael Kerrisk has observed that at present any process can SHM_LOCK any shm segment of size within process RLIMIT_MEMLOCK, despite having no permissions on the segment: surprising, though not obviously evil. And any process can SHM_UNLOCK any shm segment, despite no permissions on it: that is surely wrong. Unless CAP_IPC_LOCK, restrict both SHM_LOCK and SHM_UNLOCK to when the process euid matches the shm owner or creator: that seems the least surprising behaviour, which could be relaxed if a need appears later. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-27	[PATCH] handle posix message queues with /proc/sys disabled	Manfred Spraul
	register_sysctl_table() fails if sysctl support is not compiled into the kernel. The POSIX message queue subsystem aborted its initialization if register_sysctl_table() fails, and that causes an oops in sys_mq_open(). The patch fixes that by ignoring failures from register_sysctl_table(). Signed-off-by; Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-27	[PATCH] Lock initializer unifying (Core)	Thomas Gleixner
	To make spinlock/rwlock initialization consistent all over the kernel, this patch converts explicit lock-initializers into spin_lock_init() and rwlock_init() calls. Currently, spinlocks and rwlocks are initialized in two different ways: lock = SPIN_LOCK_UNLOCKED versus spin_lock_init(&lock), and rwlock = RW_LOCK_UNLOCKED versus rwlock_init(&rwlock). This patch converts all explicit lock initializations to spin_lock_init() or rwlock_init(). (Besides consistency this also helps automatic lock validators and debugging code.) The conversion was done with a script, it was verified manually and it was reviewed, compiled and tested as far as possible on x86, ARM, PPC. There is no runtime overhead or actual code change resulting out of this patch, because spin_lock_init() and rwlock_init() are macros and are thus equivalent to the explicit initialization method. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-27	[PATCH] RCU: eliminating explicit memory barriers from SysV IPC	Paul E. McKenney
	This patch uses the rcu_assign_pointer() API to eliminate a number of explicit memory barriers from the SysV IPC code that uses RCU. It also restructures the ipc_ids structure so that the array size is stored in the same memory block as the array itself (see the new struct ipc_id_ary). This prevents the race that the earlier code was subject to, where a reader could see a mismatch between the size and the actual array. With the size stored with the array, the possibility of mismatch is eliminated -- without the need for careful ordering and explicit memory barriers. This has been tested successfully on i386 and ppc64. Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
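The layout idea in the RCU entry above, keeping the size in the same allocation as the array, can be sketched in plain userspace C. The names demo_id_ary, demo_id_ary_alloc() and demo_lookup() are hypothetical, and the sketch deliberately leaves out RCU itself: it only shows why a reader holding one pointer can never pair an old size with a new array.

```c
#include <assert.h>
#include <stdlib.h>

/* Size and entries live in one block, so a single pointer names both. */
struct demo_id_ary {
	int size;
	void *p[];	/* flexible array member holding size entries */
};

/* Allocate a zeroed array with room for size entries. */
static struct demo_id_ary *demo_id_ary_alloc(int size)
{
	struct demo_id_ary *a;

	a = calloc(1, sizeof(*a) + (size_t)size * sizeof(a->p[0]));
	if (a)
		a->size = size;
	return a;
}

/* Bounds check uses the size that travels with this very array. */
static void *demo_lookup(const struct demo_id_ary *a, int idx)
{
	if (idx < 0 || idx >= a->size)
		return NULL;
	return a->p[idx];
}

void demo_id_ary_check(void)
{
	int obj = 7;
	struct demo_id_ary *a = demo_id_ary_alloc(4);

	assert(a != NULL);
	a->p[2] = &obj;
	assert(demo_lookup(a, 2) == &obj);
	assert(demo_lookup(a, 4) == NULL);	/* out of range for this array */
	free(a);
}
```

When the array grows, a writer would build a new, larger block and publish it by swapping one pointer; readers see either the old block with the old size or the new block with the new size.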
2004-10-18	[PATCH] add missing linux/syscalls.h includes	Arnd Bergmann
	I found that the prototypes for sys_waitid and sys_fcntl in <linux/syscalls.h> don't match the implementation. In order to keep all prototypes in sync in the future, now include the header from each file implementing any syscall. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18	[PATCH] make rlimit settings per-process instead of per-thread	Roland McGrath
	POSIX specifies that the limit settings provided by getrlimit/setrlimit are shared by the whole process, not specific to individual threads. This patch changes the behavior of those calls to comply with POSIX. I've moved the struct rlimit array from task_struct to signal_struct, as it has the correct sharing properties. (This reduces kernel memory usage per thread in multithreaded processes by around 100/200 bytes for 32/64 machines respectively.) I took a fairly minimal approach to the locking issues with the newly shared struct rlimit array. It turns out that all the code that is checking limits really just needs to look at one word at a time (one rlim_cur field, usually). It's only the few places like getrlimit itself (and fork), that require atomicity in accessing a whole struct rlimit, so I just used a spin lock for them and no locking for most of the checks. If it turns out that readers of struct rlimit need more atomicity where they are now cheap, or less overhead where they are now atomic (e.g. fork), then seqcount is certainly the right thing to use for them instead of readers using the spin lock. Though it's in signal_struct, I didn't use siglock since the access to rlimits never needs to disable irqs and doesn't overlap with other siglock uses. Instead of adding something new, I overloaded task_lock(task->group_leader) for this; it is used for other things that are not likely to happen simultaneously with limit tweaking. To me that seems preferable to adding a word, but it would be trivial (and arguably cleaner) to add a separate lock for these users (or e.g. just use seqlock, which adds two words but is optimal for readers). Most of the changes here are just the trivial s/->rlim/->signal->rlim/. I stumbled across what must be a long-standing bug, in reparent_to_init. 
It does: memcpy(current->rlim, init_task.rlim, sizeof(*(current->rlim))); when surely it was intended to be: memcpy(current->rlim, init_task.rlim, sizeof(current->rlim)); As rlim is an array, the * in the sizeof expression gets the size of the first element, so this just changes the first limit (RLIMIT_CPU). This is for kernel threads, where it's clear that resetting all the rlimits is what you want. With that fixed, the setting of RLIMIT_FSIZE in nfsd is superfluous since it will now already have been reset to RLIM_INFINITY. The other subtlety is removing: tsk->rlim[RLIMIT_CPU].rlim_cur = RLIM_INFINITY; in exit_notify, which was to avoid a race signalling during self-reaping exit. As the limit is now shared, a dying thread should not change it for others. Instead, I avoid that race by checking current->state before the RLIMIT_CPU check. (Adding one new conditional in that path is now required one way or another, since if not for this check there would also be a new race with self-reaping exit later on clearing current->signal that would have to be checked for.) The one loose end left by this patch is with process accounting. do_acct_process temporarily resets the RLIMIT_FSIZE limit while writing the accounting record. I left this as it was, but it is now changing a limit that might be shared by other threads still running. I left this in a dubious state because it seems to me that processing accounting may already be more generally a dubious state when it comes to NPTL threads. I would think you would want one record per process, with aggregate data about all threads that ever lived in it, not a separate record for each thread. I don't use process accounting myself, but if anyone is interested in testing it out I could provide a patch to change it this way. One final note, this is not 100% to POSIX compliance in regards to rlimits. POSIX specifies that RLIMIT_CPU refers to a whole process in aggregate, not to each individual thread. 
I will provide patches later on to achieve that change, assuming this patch goes in first. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
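The reparent_to_init bug described in the rlimit entry above is a case of sizeof measuring the first element of an array instead of the whole array. A small userspace sketch of the difference follows; demo_rlimit, demo_task and DEMO_NLIMITS are hypothetical stand-ins for the kernel structures, not their real definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limit count for illustration; the kernel uses RLIM_NLIMITS. */
#define DEMO_NLIMITS 4

struct demo_rlimit {
	unsigned long cur;
	unsigned long max;
};

struct demo_task {
	struct demo_rlimit rlim[DEMO_NLIMITS];
};

/* Buggy form: sizeof of the dereferenced array is the size of one element. */
static void demo_copy_first_only(struct demo_task *dst, const struct demo_task *src)
{
	memcpy(dst->rlim, src->rlim, sizeof(*(dst->rlim)));
}

/* Intended form: sizeof of the array itself covers every element. */
static void demo_copy_all(struct demo_task *dst, const struct demo_task *src)
{
	memcpy(dst->rlim, src->rlim, sizeof(dst->rlim));
}

void demo_rlimit_check(void)
{
	struct demo_task init = { { {1, 1}, {2, 2}, {3, 3}, {4, 4} } };
	struct demo_task t;

	memset(&t, 0, sizeof(t));
	demo_copy_first_only(&t, &init);
	assert(t.rlim[0].cur == 1);	/* only the first limit was copied */
	assert(t.rlim[1].cur == 0);

	demo_copy_all(&t, &init);
	assert(t.rlim[3].max == 4);	/* now every limit matches */
}
```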
2004-08-23	[PATCH] hugetlb: permit executable mappings	William Lee Irwin III
	During the kernel summit, some discussion was had about the support requirements for a userspace program loader that loads executables into hugetlb on behalf of a major application (Oracle). In order to support this in a robust fashion, the cleanup of the hugetlb must be robust in the presence of disorderly termination of the programs (e.g. kill -9). Hence, the cleanup semantics are those of System V shared memory, but Linux' System V shared memory needs one critical extension for this use: executability. The following microscopic patch enables this major application to provide robust hugetlb cleanup. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23	[PATCH] remove magic +1 from shm segment count	Manfred Spraul
	Michael Kerrisk found a bug in the shm accounting code: sysv shm allows creating SHMMNI+1 shared memory segments, instead of SHMMNI segments. The +1 is probably from the first shared anonymous mapping implementation that used the sysv code to implement shared anon mappings. The implementation got replaced, it's now the other way around (sysv uses the shared anon code), but the +1 remained. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] rlimit-based mlocks for unprivileged users	Rik van Riel
	Here is the last agreed-on patch that lets normal users mlock pages up to their rlimit. This patch addresses all the issues brought up by Chris and Andrea. From: Chris Wright <chrisw@osdl.org> Couple more nits. The default lockable amount is one page now (in the first patch it was 0). Why don't we keep it as 0, with the CAP_IPC_LOCK overrides in place? That way nothing is changed from user perspective, and the rest of the policy can be done by userspace as it should. This patch breaks in one scenario. When ulimit == 0, process has CAP_IPC_LOCK, and does SHM_LOCK. The subsequent unlock or destroy will corrupt the locked_shm count. It's also inconsistent in handling user_can_mlock/CAP_IPC_LOCK interaction between shm_lock and shm_hugetlb. SHM_HUGETLB can now only be done by the shm_group or CAP_IPC_LOCK. Not any can_do_mlock() user. Double check of can_do_mlock isn't needed in SHM_LOCK path. Interface names user_can_mlock and user_substract_mlock could be better. Incremental update below. Ran some simple sanity tests on this plus my patch below and didn't find any problems. * Make default RLIM_MEMLOCK limit 0. * Move CAP_IPC_LOCK check into user_can_mlock to be consistent and fix bug with ulimit == 0 && CAP_IPC_LOCK with SHM_LOCK. * Allow can_do_mlock() user to try SHM_HUGETLB setup. * Remove unnecessary extra can_do_mlock() test in shmem_lock(). * Rename user_can_mlock to user_shm_lock and user_subtract_mlock to user_shm_unlock. * Use user instead of current->user to fit in 80 cols on SHM_LOCK. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] IS_ERR() unlikeliness cleanup	Andrew Morton
	Remove now-unneeded open-coded unlikelies around IS_ERR(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] rcu: abstracted RCU dereferencing	Dipankar Sarma
	Use abstracted RCU API to dereference RCU protected data. Hides barrier details. Patch from Paul McKenney. This patch introduced an rcu_dereference() macro that replaces most uses of smp_read_barrier_depends(). The new macro has the advantage of explicitly documenting which pointers are protected by RCU -- in contrast, it is sometimes difficult to figure out which pointer is being protected by a given smp_read_barrier_depends() call. Signed-off-by: Paul McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] cleanup of ipc/msg.c	Manfred Spraul
	Attached is a cleanup of the main loops in sys_msgrcv and sys_msgsnd, based on ipc_lock_by_ptr(). Most backward gotos are gone, instead normal "for(;;)" loops until a suitable message is found. Description: - General cleanup of sys_msgrcv and sys_msgsnd: the functions were too convoluted. - Enable lockless receive, update comments. - Use ipc_getref for sys_msgsnd(), it's better than rechecking that the msqid is still valid. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] ipc: enforce SEMVMX limit for undo	Manfred Spraul
	Independent from the other patches: undo operations should not result in out of range semaphore values. The test for newval > SEMVMX is missing. The attached patch adds the test and a comment. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
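A range check of the kind the SEMVMX entry above describes might look like the sketch below. The message only says that a test for newval > SEMVMX was added; clamping the result to the limit, as done here, is an assumption of this sketch, and DEMO_SEMVMX and demo_apply_undo() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical maximum semaphore value for illustration; stands in for SEMVMX. */
#define DEMO_SEMVMX 32767

/*
 * Apply an undo adjustment to a semaphore value and keep the result in range.
 * Clamping (rather than failing) is an assumption made for this sketch.
 */
static int demo_apply_undo(int semval, int adjust)
{
	int newval = semval + adjust;

	if (newval < 0)
		newval = 0;
	if (newval > DEMO_SEMVMX)	/* the upper-bound test that was missing */
		newval = DEMO_SEMVMX;
	return newval;
}

void demo_undo_check(void)
{
	assert(demo_apply_undo(5, 3) == 8);			/* in range: unchanged */
	assert(demo_apply_undo(5, -10) == 0);			/* lower bound */
	assert(demo_apply_undo(32760, 10) == DEMO_SEMVMX);	/* upper bound */
}
```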
2004-08-22	[PATCH] ipc: remove sem_revalidate	Manfred Spraul
	The attached patch removes sem_revalidate and replaces it with ipc_rcu_getref() calls followed by ipc_lock_by_ptr(). Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22	[PATCH] ipc: Add refcount to ipc_rcu_alloc	Manfred Spraul
	The lifetime of the ipc objects (sem array, msg queue, shm mapping) is controlled by kern_ipc_perms->lock - a spinlock. There is no simple way to reacquire this spinlock after it was dropped to schedule()/kmalloc/copy_{to,from}_user/whatever. The attached patch adds a reference count as a preparation to get rid of sem_revalidate(). Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-07-12	[PATCH] sparse: ipc compat annotations and cleanups	Alexander Viro
	ipc compat code switched to compat_alloc_user_space() and annotated.