path: root/ipc/sem.c
Age | Commit message | Author
2005-03-01Audit IPC object owner/permission changes.David Woodhouse
Add linked list of auxiliary data to audit_context. Add callbacks in IPC_SET functions to record requested changes. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-01-04[PATCH] fix missing wakeup in ipc/semManfred Spraul
My patch that removed the spin_lock calls from the tail of sys_semtimedop introduced a bug: Before my patch was merged, every operation that altered an array called update_queue. That call woke up threads that were waiting until a semaphore value becomes 0. I've accidentally removed that call. The attached patch fixes that by modifying update_queue: the function now loops internally and wakes up all threads. The patch also removes update_queue calls from the error path of sys_semtimedop: failed operations do not modify the array, so there is no need to rescan the list of waiting threads. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
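The looping behaviour described above can be sketched in userspace C. This is only a model under stated assumptions, not the kernel's update_queue: `struct pending_op`, `can_proceed()` and `update_queue_model()` are hypothetical names, there is a single semaphore value, and "waking" a sleeper is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one semaphore value and an array of sleepers. */
struct pending_op {
    int delta;   /* <0: decrement, >0: increment, 0: wait until value is 0 */
    int woken;   /* set to 1 once the operation has completed */
};

/* Returns 1 if the operation can complete with the current value. */
static int can_proceed(int semval, const struct pending_op *op)
{
    if (op->delta == 0)
        return semval == 0;
    return semval + op->delta >= 0;
}

/*
 * Scan the pending list until no further operation can complete.
 * A completed operation may change the value, so the scan restarts
 * from the head. Returns the number of sleepers woken.
 */
static int update_queue_model(int *semval, struct pending_op *q, size_t n)
{
    int woken = 0;
    size_t i = 0;

    assert(semval != NULL);
    while (i < n) {
        if (!q[i].woken && can_proceed(*semval, &q[i])) {
            *semval += q[i].delta;
            q[i].woken = 1;
            woken++;
            i = 0;      /* the value may have changed: rescan */
            continue;
        }
        i++;
    }
    return woken;
}
```

With a value of 3 and sleepers wanting -2, -2 and -1, one call wakes the first and third sleeper and leaves the second one blocked, without any of the woken sleepers having to rescan the list themselves.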
2004-10-27[PATCH] RCU: eliminating explicit memory barriers from SysV IPCPaul E. McKenney
This patch uses the rcu_assign_pointer() API to eliminate a number of explicit memory barriers from the SysV IPC code that uses RCU. It also restructures the ipc_ids structure so that the array size is stored in the same memory block as the array itself (see the new struct ipc_id_ary). This prevents the race that the earlier code was subject to, where a reader could see a mismatch between the size and the actual array. With the size stored with the array, the possibility of mismatch is eliminated -- without the need for careful ordering and explicit memory barriers. This has been tested successfully on i386 and ppc64. Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
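A minimal userspace sketch of the "size stored with the array" idea follows. It is an illustration under assumptions, not the kernel code: `struct id_ary`, `struct ids` and the `ids_*` helpers are hypothetical names, C11 release/acquire atomics stand in for rcu_assign_pointer() and the RCU read side, and the old block is freed immediately where the kernel would wait for readers to finish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Size and entries share one allocation, so a reader that holds the
 * pointer always sees a size that matches the array behind it. */
struct id_ary {
    int size;
    void *p[];                          /* flexible array member */
};

struct ids {
    _Atomic(struct id_ary *) entries;   /* readers load this pointer once */
};

static struct id_ary *id_ary_alloc(int size)
{
    struct id_ary *a;

    assert(size > 0);
    a = calloc(1, sizeof(*a) + (size_t)size * sizeof(void *));
    if (a)
        a->size = size;
    return a;
}

static int ids_init(struct ids *ids, int size)
{
    struct id_ary *a = id_ary_alloc(size);

    if (!a)
        return -1;
    atomic_init(&ids->entries, a);
    return 0;
}

/* Grow: fill in the new block completely, then publish it with one store. */
static int ids_grow(struct ids *ids, int newsize)
{
    struct id_ary *old = atomic_load_explicit(&ids->entries, memory_order_relaxed);
    struct id_ary *grown;
    int i;

    if (newsize <= old->size)
        return 0;
    grown = id_ary_alloc(newsize);
    if (!grown)
        return -1;
    for (i = 0; i < old->size; i++)
        grown->p[i] = old->p[i];
    atomic_store_explicit(&ids->entries, grown, memory_order_release);
    free(old);      /* assumption: no concurrent readers in this sketch */
    return 0;
}

static void *ids_lookup(struct ids *ids, int idx)
{
    struct id_ary *a = atomic_load_explicit(&ids->entries, memory_order_acquire);

    if (idx < 0 || idx >= a->size)
        return NULL;
    return a->p[idx];
}
```

Because the bounds check and the element access both go through the same loaded pointer, there is no window in which a new size can be paired with an old array.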
2004-10-18[PATCH] add missing linux/syscalls.h includesArnd Bergmann
I found that the prototypes for sys_waitid and sys_fcntl in <linux/syscalls.h> don't match the implementation. In order to keep all prototypes in sync in the future, now include the header from each file implementing any syscall. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22[PATCH] ipc: enforce SEMVMX limit for undoManfred Spraul
Independent from the other patches: undo operations should not result in out of range semaphore values. The test for newval > SEMVMX is missing. The attached patch adds the test and a comment. Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
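As an illustration of the range check, here is a small sketch. The commit message only says that a test for newval > SEMVMX is added; clamping the result, the helper name `apply_undo()` and the constant name `SEMVMX_MODEL` are assumptions made for this example (Linux defines SEMVMX as 32767).

```c
#include <assert.h>

#define SEMVMX_MODEL 32767   /* mirrors the Linux SEMVMX value */

/* Apply an undo adjustment and keep the result inside [0, SEMVMX]. */
static int apply_undo(int semval, int semadj)
{
    int newval;

    assert(semval >= 0 && semval <= SEMVMX_MODEL);
    newval = semval + semadj;
    if (newval < 0)
        newval = 0;
    if (newval > SEMVMX_MODEL)      /* the check that was missing */
        newval = SEMVMX_MODEL;
    return newval;
}
```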
2004-08-22[PATCH] ipc: remove sem_revalidateManfred Spraul
The attached patch removes sem_revalidate and replaces it with ipc_rcu_getref() calls followed by ipc_lock_by_ptr(). Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-22[PATCH] ipc: Add refcount to ipc_rcu_allocManfred Spraul
The lifetime of the ipc objects (sem array, msg queue, shm mapping) is controlled by kern_ipc_perms->lock - a spinlock. There is no simple way to reacquire this spinlock after it was dropped to schedule()/kmalloc/copy_{to,from}_user/whatever. The attached patch adds a reference count as a preparation to get rid of sem_revalidate(). Signed-Off-By: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
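The role of the reference count can be sketched as follows. This is a single-threaded userspace model with hypothetical names (`struct ipc_obj`, `obj_get()`, `obj_put()`, `obj_remove()`), not the kernel's ipc_rcu_alloc code: a sleeper takes a reference before dropping the lock, so the memory stays valid even if the object is removed in the meantime, and the sleeper can then check a "deleted" flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

static int objects_freed;   /* counts frees so the example can observe them */

struct ipc_obj {
    atomic_int refcount;    /* starts at 1: the reference held by the ID table */
    int deleted;            /* set once the object has been removed */
};

static struct ipc_obj *obj_alloc(void)
{
    struct ipc_obj *o = calloc(1, sizeof(*o));

    if (o)
        atomic_init(&o->refcount, 1);
    return o;
}

/* Take an extra reference before dropping the object's lock. */
static void obj_get(struct ipc_obj *o)
{
    assert(atomic_load(&o->refcount) > 0);
    atomic_fetch_add(&o->refcount, 1);
}

/* Drop a reference; whoever drops the last one frees the object. */
static void obj_put(struct ipc_obj *o)
{
    if (atomic_fetch_sub(&o->refcount, 1) == 1) {
        free(o);
        objects_freed++;
    }
}

/* Removal marks the object dead and drops the table's reference. */
static void obj_remove(struct ipc_obj *o)
{
    o->deleted = 1;
    obj_put(o);
}
```

A caller that did `obj_get()` before sleeping can safely read `deleted` after waking up and must finish with `obj_put()`; the object is freed only when that last reference goes away.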
2004-06-30[PATCH] sparse: NULL vs 0 - the rest of itMika Kukkonen
2004-05-28[PATCH] sparse: ipc __user annotationAlexander Viro
2004-03-26[PATCH] ipc locking fixAndrew Morton
From: badari <pbadari@us.ibm.com> I ran into an ipc hang while trying to shut down a database. The problem is due to a missing sem_unlock() in find_undo().
2004-03-11[PATCH] Remove unneeded unlock in ipc/sem.cAndrew Morton
From: Manfred Spraul <manfred@colorfullife.com> sem_revalidate checks that a semaphore array didn't disappear while the code was running without the semaphore array spinlock. If the array disappeared, then it will return without holding a lock. find_undo calls sem_revalidate and then sem_unlock, even if sem_revalidate failed. The sem_unlock call must be removed. Mingming Cao reported a spinlock deadlock with sysv semaphores. A superfluous unlock doesn't explain the deadlock, but it's obviously a bug.
2004-02-24[PATCH] add syscalls.hAndrew Morton
From: "Randy.Dunlap" <rddunlap@osdl.org> Add syscalls.h, which contains prototypes for the kernel's system calls. Replace open-coded declarations all over the place. This patch found a couple of prior bugs. It appears to be more important with -mregparm=3 as we discover more asmlinkage mismatches. Some syscalls have arch-dependent arguments, so their prototypes are in the arch-specific unistd.h. Maybe it should have been asm/syscalls.h, but there were already arch-specific syscall prototypes in asm/unistd.h... Tested on x86, ia64, x86_64, ppc64, s390 and sparc64. May cause trivial-to-fix build breakage on other architectures.
2003-12-29[PATCH] lockless semopAndrew Morton
From: Manfred Spraul <manfred@colorfullife.com> Attached is the lockless semop patch. I did another test run with idle=poll on a Pentium III, and it remained unchanged: 99.9% direct fast path, 0.1% race with wakeup against writing the final result code: http://khack.osdl.org/stp/282936/environment/proc/slabinfo That means there is no immediate need to add the two-stage implementation to finish_wait. It reduces the spinlock operations on the semaphore array spinlock by 1/3.
2003-09-21[PATCH] Fix sem_lock deadlockAndrew Morton
From: Anton Blanchard <anton@samba.org> I saw a lockup where 2 cpus were stuck in sem_lock(). It seems like we can loop back to retry_undos with the lock held. That path takes the lock so we will deadlock.
2003-08-30[PATCH] More ->pid to ->tgid changesUlrich Drepper
One more overlooked area where the proper process ID has to be used: SysV IPC "pid" values should use the thread group ID, not the per-thread one.
2003-08-13[PATCH] sparse annotations for ipc/semDave Jones
2003-07-04[PATCH] ipc semaphore optimizationAndrew Morton
From: "Chen, Kenneth W" <kenneth.w.chen@intel.com> This patch proposes a performance fix for the current IPC semaphore implementation. There are two shortcomings in the current implementation. First, try_atomic_semop() is called twice to wake up a blocked process: once from update_queue() (executed by the process that wakes up the sleeping process) and once in the retry part of the blocked process (executed by the blocked process that gets woken up). Second, when several sleeping processes are eligible for wake-up, they wake up in a daisy chain, each one in turn waking the next process in line. However, every time a process wakes up, it scans the wait queue from the beginning, not from where the scan last stopped. This causes a large amount of unnecessary scanning when the wait queue is deep. Blocked processes come and go, but chances are that quite a few blocked processes still sit at the beginning of the queue. What we are proposing here is to merge the code in the bottom part of sys_semtimedop() (the code that gets executed when a sleeping process is woken up) into the update_queue() function. The benefit is twofold: (1) it reduces redundant calls to try_atomic_semop(), and (2) it finds eligible processes to wake up more efficiently and allows higher concurrency for multiple wake-ups. We have measured that this patch significantly improves throughput for a large application on an industry-standard benchmark. This patch is relative to 2.5.72. Any feedback is very much appreciated.
Some kernel profile data attached:

Kernel profile before optimization:
-----------------------------------------------
                0.05    0.14   40805/529060      sys_semop [133]
                0.55    1.73  488255/529060      ia64_ret_from_syscall [2]
[52]     2.5    0.59    1.88  529060             sys_semtimedop [52]
                0.05    0.83  477766/817966      schedule_timeout [62]
                0.34    0.46  529064/989340      update_queue [61]
                0.14    0.00 1006740/6473086     try_atomic_semop [75]
                0.06    0.00  529060/989336      ipcperms [149]
-----------------------------------------------
                0.30    0.40  460276/989340      semctl_main [68]
                0.34    0.46  529064/989340      sys_semtimedop [52]
[61]     1.5    0.64    0.87  989340             update_queue [61]
                0.75    0.00 5466346/6473086     try_atomic_semop [75]
                0.01    0.11  477676/576698      wake_up_process [146]
-----------------------------------------------
                0.14    0.00 1006740/6473086     sys_semtimedop [52]
                0.75    0.00 5466346/6473086     update_queue [61]
[75]     0.9    0.89    0.00 6473086             try_atomic_semop [75]
-----------------------------------------------

Kernel profile with optimization:
-----------------------------------------------
                0.03    0.05   26139/503178      sys_semop [155]
                0.46    0.92  477039/503178      ia64_ret_from_syscall [2]
[61]     1.2    0.48    0.97  503178             sys_semtimedop [61]
                0.04    0.79  470724/784394      schedule_timeout [62]
                0.05    0.00  503178/3301773     try_atomic_semop [109]
                0.05    0.00  503178/930934      ipcperms [149]
                0.00    0.03   32454/460210      update_queue [99]
-----------------------------------------------
                0.00    0.03   32454/460210      sys_semtimedop [61]
                0.06    0.36  427756/460210      semctl_main [75]
[99]     0.4    0.06    0.39  460210             update_queue [99]
                0.30    0.00 2798595/3301773     try_atomic_semop [109]
                0.00    0.09  470630/614097      wake_up_process [146]
-----------------------------------------------
                0.05    0.00  503178/3301773     sys_semtimedop [61]
                0.30    0.00 2798595/3301773     update_queue [99]
[109]    0.3    0.35    0.00 3301773             try_atomic_semop [109]
-----------------------------------------------

The number of calls to both try_atomic_semop() and update_queue() is reduced by roughly 50% as a result of the merge. The execution time of sys_semtimedop is reduced because of the reduction in the low-level functions.
2003-06-20[PATCH] sysv semundo fixesAndrew Morton
From: Manfred Spraul <manfred@colorfullife.com> The CLONE_SYSVSEM implementation is racy: it does an (atomic_read(->refcnt) == 1) instead of atomic_dec_and_test calls in the exit handling. The patch fixes that. Additionally, the patch contains the following changes:
- lock_undo() locks the list of undo structures. The lock is held throughout the semop() syscall, but that's unnecessary - we can drop it immediately after the lookup.
- undo structures are only allocated when necessary. The need for undo structures is only noticed in the middle of the semop operation, while holding the semaphore array spinlock. The result is a convoluted unlock&revalidate implementation. I've reordered the code, and now the undo allocation can happen before acquiring the semaphore array spinlock. As a bonus, less code runs under the semaphore array spinlock.
- sysvsem.sleep_list looks like code to handle oopses: if an oops kills a thread that sleeps in sys_semtimedop(), then sem_exit tries to recover. I've removed that - too fragile.
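To make the difference between "read the counter and compare with 1" and a decrement-and-test concrete, here is a userspace sketch using C11 atomics. It is an assumption-laden illustration, not the kernel code: `put_check_then_dec()`, `put_dec_and_test()` and `replay_racy_interleaving()` are hypothetical names, and the bad interleaving is replayed by hand in one thread instead of being provoked with real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Racy pattern: the check and the decrement are two separate steps. */
static bool put_check_then_dec(atomic_int *refcnt)
{
    if (atomic_load(refcnt) == 1)
        return true;                /* caller believes it is the last user */
    atomic_fetch_sub(refcnt, 1);
    return false;
}

/* Safe pattern: a single atomic step decides who was last. */
static bool put_dec_and_test(atomic_int *refcnt)
{
    return atomic_fetch_sub(refcnt, 1) == 1;
}

/*
 * Replays one bad interleaving of the racy pattern: both users read
 * the counter before either of them decrements it. Returns how many
 * of the two users ran the cleanup path.
 */
static int replay_racy_interleaving(void)
{
    atomic_int refcnt = 2;
    int seen_a = atomic_load(&refcnt);
    int seen_b = atomic_load(&refcnt);
    int cleanups = 0;

    if (seen_a == 1)
        cleanups++;
    else
        atomic_fetch_sub(&refcnt, 1);
    if (seen_b == 1)
        cleanups++;
    else
        atomic_fetch_sub(&refcnt, 1);

    assert(atomic_load(&refcnt) == 0);  /* nobody is left ... */
    return cleanups;                    /* ... and nobody cleaned up */
}
```

In the replayed interleaving both users see the value 2, so neither runs the cleanup and the structure leaks. With `put_dec_and_test()` exactly one of the two callers gets `true`, whatever the interleaving.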
2003-06-02[PATCH] remove 16-bit pid assumption from ipc/sem.cAndrew Morton
From: Manfred Spraul <manfred@colorfullife.com> SysV sem operations that involve multiple semaphores can fail in the middle, and then sempid (pid of the last successful operation) must be restored. This happens with "sempid >>= 16" - broken due to the 32-bit pid values. The attached patch fixes that by reordering the updates of the semaphore fields. Additionally, the patch fixes the corruption of the sempid value that occurs if a wait-for-zero operation fails. The patch is more than two years old, and was in -dj and -ak kernels.
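One way to picture the reordering is the all-or-nothing sketch below. It is a userspace model under assumptions, not the kernel's try_atomic_semop: `struct sem_model`, `struct op_model` and `try_atomic_model()` are hypothetical names. The point it illustrates is that sempid is written only after every operation has succeeded, so a failure in the middle never has to restore it and no pid bits need to be packed or shifted.

```c
#include <assert.h>
#include <stddef.h>

struct sem_model { int semval; int sempid; };
struct op_model  { size_t sem_num; int sem_op; };

/*
 * Apply all operations or none of them.
 * Returns 0 on success, -1 if some operation would have to block.
 */
static int try_atomic_model(struct sem_model *sems, size_t nsems,
                            const struct op_model *ops, size_t nops, int pid)
{
    size_t i;

    for (i = 0; i < nops; i++) {
        struct sem_model *s;

        assert(ops[i].sem_num < nsems);
        s = &sems[ops[i].sem_num];
        if (ops[i].sem_op == 0) {
            if (s->semval != 0)
                goto undo;              /* wait-for-zero would block */
            continue;
        }
        if (s->semval + ops[i].sem_op < 0)
            goto undo;                  /* decrement would block */
        s->semval += ops[i].sem_op;
    }

    /* Everything succeeded: only now record the pid. */
    for (i = 0; i < nops; i++)
        sems[ops[i].sem_num].sempid = pid;
    return 0;

undo:
    /* Revert the values changed by operations 0 .. i-1; sempid was
     * never touched, so there is nothing to restore for it. */
    while (i-- > 0)
        sems[ops[i].sem_num].semval -= ops[i].sem_op;
    return -1;
}
```

A failed call leaves both semval and sempid exactly as they were, including for pid values that do not fit in 16 bits.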
2003-05-12[PATCH] semop race fixAndrew Morton
From: Mingming Cao <cmm@us.ibm.com> Basically, freeary() is called with the spinlock for that semaphore set held. But after the semaphore set is removed from the ID array by calling sem_rmid(), there is no lock to protect the wait queue for that semaphore set. So, if a waiter is woken up by a signal (not by the wakeup from freeary()), it will check the q->status and q->prev fields. At that moment, freeary() may not have had a chance to update those fields yet.

static void freeary (int id)
{
	.......
	sma = sem_rmid(id);
	......
	/* Wake up all pending processes and let them fail with EIDRM.*/
	for (q = sma->sem_pending; q; q = q->next) {
		q->status = -EIDRM;
		q->prev = NULL;
		wake_up_process(q->sleeper); /* doesn't sleep */
	}
	sem_unlock(sma);
	......
}

So I propose moving sem_rmid() after the loop that wakes up every waiter. That guarantees that when the waiters are woken up, the updates to q->status and q->prev have already been made. The message queue case is similar. The patch is attached below. Comments are very welcome. I have tested this patch on the 2.5.68 kernel with the LTP tests, and it seems fine to me. Paul, could you test this with the DOTS test again? Thanks!
2003-01-09[PATCH] 2.5.52-lsm-{dummy,ipc}.patchStephen D. Smalley
This patch adds the remaining System V IPC hooks, including the inline documentation for them in security.h. This includes a restored sem_semop hook, as it does seem to be necessary to support fine-grained access. All of these System V IPC hooks are used by SELinux. The SELinux System V IPC access controls were originally described in the technical report available from http://www.nsa.gov/selinux/slinux-abs.html, and the LSM-based implementation is described in the technical report available from http://www.nsa.gov/selinux/module-abs.html.
2002-12-14[PATCH] semtimedop - semop() with a timeoutAndrew Morton
Patch from Mark Fasheh <mark.fasheh@oracle.com> (plus a few cleanups and a speedup from yours truly) Adds the semtimedop() function - semop with a timeout. Solaris has this. It's apparently worth a couple of percent to Oracle throughput and given the simplicity, that is sufficient benefit for inclusion IMO. This patch hooks up semtimedop() only for ia64 and ia32.
2002-11-26LSM: change if statements into something more readable for the ipc/*, mm/*, and net/* files.Greg Kroah-Hartman
2002-11-23MergeGreg Kroah-Hartman
2002-11-17[PATCH] nanosecond stat timefieldsAndi Kleen
stat64 has been changed to return jiffies granularity as nsec in previously unused fields. This allows make to make better decisions on when to recompile a file. Follows loosely the Solaris API. CURRENT_TIME has been redefined to return struct timespec. The users who don't use it in an inode/attr context have been changed to use a new get_seconds() function. CURRENT_TIME is implemented by an out-of-line function. There is a small performance penalty in this patch. The previous filemap code had an optimization to flush atime only once a second. This is currently gone, which will increase flushes a bit. I believe the correct solution, if it should be a problem, is to have per-superblock fields that give an arbitrary atime flush granularity - so that you can set it to be flushed only once an hour if you prefer that. I will work on that later in separate patches if the need should arise. struct inode and the attr struct have been changed to store struct timespec instead of time_t for [cma]time. Not all file systems support this granularity, but some like XFS, NFSv3, CIFS and JFS do. The others will currently truncate the nsec part on flushing to disk. There was some discussion on this rounding on l-k previously. I went for simple truncation because there is not much evidence IMHO that the more complicated roundings have any advantages. In practice applications will be rather unlikely to notice the rounding anyway - they can only see a difference when an inode is flushed from memory and reloaded in less than a second, which is rather unlikely.
2002-10-31[PATCH] use RCU for IPC lockingAndrew Morton
Patch from Mingming, Rusty, Hugh, Dipankar, me:
- It greatly reduces the lock contention by having one lock per id. The global spinlock is removed and a spinlock is added to the kern_ipc_perm structure.
- Uses Read-Copy-Update in grow_ary() for lock-free resizing.
- In the places where ipc_rmid() is called, delay calling ipc_free() to RCU callbacks. This is to prevent ipc_lock() returning an invalid pointer after ipc_rmid(). In addition, use the workqueue to enable RCU freeing of vmalloced entries.

Also some other changes:
- Remove redundant ipc_lockall/ipc_unlockall
- ipc_unlock() now takes the IPC ID pointer directly as its argument, avoiding an extra lookup in the array.

The changes are based on input from Hugh Dickins, Manfred Spraul and Dipankar Sarma. In addition, Cliff White has run OSDL's dbt1 test on a 2-way against the earlier version of this patch. Results show about a 2-6% improvement in the average number of transactions per second. Here is the summary of his tests:

                        2.5.42-mm2    2.5.42-mm2-ipclock
                        ----------------------------------
Average over 5 runs     85.0 BT       89.8 BT
Std Deviation 5 runs    7.4 BT        1.0 BT
Average over 4 best     88.15 BT      90.2 BT
Std Deviation 4 best    2.8 BT        0.5 BT

Also, another test today from Bill Hartner: I tested Mingming's RCU ipc lock patch using a *new* microbenchmark - semopbench. semopbench was written to test the performance of Mingming's patch. I also ran a 3 hour stress test and it completed successfully. Explanation of the microbenchmark is below the results. Here is a link to the microbenchmark source. http://www-124.ibm.com/developerworks/opensource/linuxperf/semopbench/semopbench.c

SUT: 8-way 700 MHz PIII

I tested 2.5.44-mm2 and 2.5.44-mm2 + RCU ipc patch

>semopbench -g 64 -s 16 -n 16384 -r > sem.results.out
>readprofile -m /boot/System.map | sort -n +0 -r > sem.profile.out

The metric is seconds per repetition. Lower is better.

kernel              run 1    run 2
                    seconds  seconds
==================  =======  =======
2.5.44-mm2          515.1    515.4
2.5.44-mm2+rcu-ipc  46.7     46.7

With Mingming's patch, the test completes 10X faster.
2002-10-17LSM: convert over the remaining security calls to the new format.Greg Kroah-Hartman
2002-10-08[PATCH] Base set of LSM hooks for SysV IPCStephen D. Smalley
The patch below adds the base set of LSM hooks for System V IPC to the 2.5.41 kernel. These hooks permit a security module to label semaphore sets, message queues, and shared memory segments and to perform security checks on these objects that parallel the existing IPC access checks. Additional LSM hooks for labeling and controlling individual messages sent on a single message queue and for providing fine-grained distinctions among IPC operations will be submitted separately after this base set of LSM IPC hooks has been accepted.
2002-07-16Merge kroah.com:/home/greg/linux/BK/bleeding_edge-2.5Greg Kroah-Hartman
into kroah.com:/home/greg/linux/BK/lsm-2.5
2002-07-14[PATCH] ipc_ staticsStephen Rothwell
This patch just makes some stuff in ipc/ static.
2002-07-14LSM: move struct shmid_kernel out of ipc/shm.c to include/linux/shm.hGreg Kroah-Hartman
Also move where we set sma->sem_perm.mode and .key to before ipc_addid() gets called.
2002-05-26[PATCH] semctl SUSv2 complianceRusty Russell
Christopher Yeoh <cyeoh@samba.org>: (Made -p1 compliant by rusty) SUSv2 semctl compliance: The semctl call with SETVAL currently does not set sempid (at the moment sempid is only set during a successful semop call). An explanation from Geoff Clare of the Open Group regarding why sempid should be set during the semctl call: "The spec isn't very clear, but there is a statement on the semget() page which I think justifies the assumption made by the test. It says that upon creation, the data structure associated with each semaphore in the set is not initialised, and that the semctl() function with SETVAL or SETALL can be used to initialise each semaphore. Therefore semctl() with SETVAL has to set sempid to *something*, and since sempid contains the "process ID of the last operation", setting it to anything other than the pid of the calling process would mean that sempid contained misleading information. It could be argued that setting it to zero would not be misleading, but zero cannot be the process ID of a process, and so is not a valid value for sempid anyway." The following patch changes semctl so when called with SETVAL sempid is set to the pid of the calling process:
2002-04-28[PATCH] 2.5.10 BKL not always released in sem_exit()Chris Wright
The patch below fixes sem_exit() so that the BKL is always released.
2002-04-23[PATCH] 2.5.9 SEM_UNDO patchDave Olien
As we discussed some time ago, here is a patch for the SEM_UNDO change that can be applied to linux-2.5.9.
2002-04-02Fix missing include due to do_exit() BKL movementLinus Torvalds
2002-04-02[PATCH] BKL reduction in do_exitDave Hansen
Push BKL down to the (few) routines that actually need it, remove it from the do_exit() path.
2002-02-04v2.4.10.1 -> v2.4.10.2Linus Torvalds
- me/Al Viro: fix bdget() oops with block device modules that don't clean up after they exit
- Alan Cox: continued merging (drivers, license tags)
- David Miller: sparc update, network fixes
- Christoph Hellwig: work around broken drivers that add a gendisk more than once
- Jakub Jelinek: handle more ELF loading special cases
- Trond Myklebust: NFS client and lockd reclaimer cleanups/fixes
- Greg KH: USB updates
- Mikael Pettersson: separate out local APIC / IO-APIC config options
2002-02-04v2.4.1.4 -> v2.4.2Linus Torvalds
- sync up more with Alan
- Urban Widmark: smbfs and HIGHMEM fix
- Chris Mason: reiserfs tail unpacking fix ("null bytes in reiserfs files")
- Adam Richter: new cpia usb ID
- Hugh Dickins: misc small sysv ipc fixes
- Andries Brouwer: remove overly restrictive sector size check for SCSI cd-roms
2002-02-04v2.4.1.2 -> v2.4.1.3Linus Torvalds
- Jens: better ordering of requests when unable to merge
- Neil Brown: make md work as a module again (we cannot autodetect in modules, not enough background information)
- Neil Brown: raid5 SMP locking cleanups
- Neil Brown: nfsd: handle Irix NFS clients named pipe behavior and dentry leak fix
- maestro3 shutdown fix
- fix dcache hash calculation that could cause bad hashes under certain circumstances (Dean Gaudet)
- David Miller: networking and sparc updates
- Jeff Garzik: include file cleanups
- Andy Grover: ACPI update
- Coda-fs error return fixes
- rth: alpha Jensen update
2002-02-04Import changesetLinus Torvalds