| Age | Commit message (Collapse) | Author |
|
Change the /proc/sysvipc/shm|sem|msg files to use the generic seq_file
implementation for struct ipc_ids.
Signed-off-by: Mike Waychison <mikew@google.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Add linked list of auxiliary data to audit_context
Add callbacks in IPC_SET functions to record requested changes.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
|
|
This patch uses the rcu_assign_pointer() API to eliminate a number of explicit
memory barriers from the SysV IPC code that uses RCU. It also restructures
the ipc_ids structure so that the array size is stored in the same memory
block as the array itself (see the new struct ipc_id_ary). This prevents the
race that the earlier code was subject to, where a reader could see a mismatch
between the size and the actual array. With the size stored with the array,
the possibility of mismatch is eliminated -- with out the need for careful
ordering and explicit memory barriers. This has been tested successfully on
i386 and ppc64.
Signed-off-by: <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
I found that the prototypes for sys_waitid and sys_fcntl in
<linux/syscalls.h> don't match the implementation. In order to keep all
prototypes in sync in the future, now include the header from each file
implementing any syscall.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Attached is a cleanup of the main loops in sys_msgrcv and sys_msgsnd, based on
ipc_lock_by_ptr(). Most backward gotos are gone, instead normal "for(;;)"
loops until a suitable message is found.
Description:
- General cleanup of sys_msgrcv and sys_msgsnd: the function were too
convoluted.
- Enable lockless receive, update comments.
- Use ipc_getref for sys_msgsnd(), it's better than rechecking that the
msqid is still valid.
Signed-Off-By: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The lifetime of the ipc objects (sem array, msg queue, shm mapping) is
controlled by kern_ipc_perms->lock - a spinlock. There is no simple way to
reacquire this spinlock after it was dropped to
schedule()/kmalloc/copy_{to,from}_user/whatever.
The attached patch adds a reference count as a preparation to get rid of
sem_revalidate().
Signed-Off-By: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
|
|
From: Manfred Spraul <manfred@colorfullife.com>
cleanup of sysv ipc as a preparation for posix message queues:
- replace !CONFIG_SYSVIPC wrappers for copy_semundo and exit_sem with
static inline wrappers. Now the whole ipc/util.c file is only used if
CONFIG_SYSVIPC is set, use makefile magic instead of #ifdef.
- remove the prototypes for copy_semundo and exit_sem from kernel/fork.c
- they belong into a header file.
- create a new msgutil.c with the helper functions for message queues.
- cleanup the helper functions: run Lindent, add __user tags.
|
|
The LSM changes broke the error checking for queue lengths in IPC_SET. The LSM check would
set set err to 0, but the next check expected it to still be -EPERM. Result was that
no error was reported, but the new parameters weren't correctly set.
|
|
Backport this fix from 2.4
|
|
One more overlooked area where the proper process ID has to be used:
SysV IPC "pid" values should use the thread group ID, not the per-thread
one.
|
|
From: Mingming Cao <cmm@us.ibm.com>
Basically, freeary() is called with the spinlock for that semaphore set
hold. But after the semaphore set is removed from the ID array by
calling sem_rmid(), there is no lock to protect the waiting queue for
that semaphore set. So, if a waiter is woken up by a signal (not by the
wakeup from freeary()), it will check the q->status and q->prev fields.
At that moment, freeary() may not have a chance to update those fields
yet.
static void freeary (int id)
{
.......
sma = sem_rmid(id);
......
/* Wake up all pending processes and let them fail with EIDRM.*/
for (q = sma->sem_pending; q; q = q->next) {
q->status = -EIDRM;
q->prev = NULL;
wake_up_process(q->sleeper); /* doesn't sleep */
}
sem_unlock(sma);
......
}
So I propose move sem_rmid() after the loop of waking up every waiters.
That could gurantee that when the waiters are woke up, the updates for
q->status and q->prev have already done. Similar thing in message queue
case. The patch is attached below. Comments are very welcomed.
I have tested this patch on 2.5.68 kernel with LTP tests, seems fine to
me. Paul, could you test this on DOTS test again? Thanks!
|
|
This patch adds the remaining System V IPC hooks, including the inline
documentation for them in security.h. This includes a restored
sem_semop hook, as it does seem to be necessary to support fine-grained
access.
All of these System V IPC hooks are used by SELinux. The SELinux System
V IPC access controls were originally described in the technical report
available from http://www.nsa.gov/selinux/slinux-abs.html, and the
LSM-based implementation is described in the technical report available
from http://www.nsa.gov/selinux/module-abs.html.
|
|
|
|
into conectiva.com.br:/home/BK/includes-2.5
|
|
and net/* files.
|
|
|
|
|
|
stat64 has been changed to return jiffies granuality as nsec in previously
unused fields. This allows make to make better decisions on when
to recompile a file. Follows losely the Solaris API.
CURRENT_TIME has been redefined to return struct timespec. The users
who don't use it in a inode/attr context have been changed to use a new
get_seconds() function. CURRENT_TIME is implemented by an out-of-line
function.
There is a small performance penalty in this patch. The previous
filemap code had an optimization to flush atime only once a second.
This is currently gone, which will increase flushes a bit. I believe
the correct solution if it should be a problem is to have per super
block fields that give an arbitary atime flush granuality - so that you
can set it to be only flushed once a hour if you prefer that. I will
work on that later in separate patches if the need should arise.
struct inode and the attr struct has been changed to store struct
timespec instead of time_t for [cma]time. Not all file systems support
this granuality, but some like XFS,NFSv3,CIFS,JFS do. The others will
currently truncate the nsec part on flushing to disk. There was some
discussion on this rounding on l-k previously. I went for simple
truncation because there is not much evidence IMHO that the more
complicated roundings have any advantages. In practice application will
be rather unlikely to notice the rounding anyways - they can only see a
difference when an inode is flush from memory and reloaded in less than
a second, which is rather unlikely.
|
|
Patch from Mingming, Rusty, Hugh, Dipankar, me:
- It greatly reduces the lock contention by having one lock per id.
The global spinlock is removed and a spinlock is added in
kern_ipc_perm structure.
- Uses ReadCopyUpdate in grow_ary() for locking-free resizing.
- In the places where ipc_rmid() is called, delay calling ipc_free()
to RCU callbacks. This is to prevent ipc_lock() returning an invalid
pointer after ipc_rmid(). In addition, use the workqueue to enable
RCU freeing vmalloced entries.
Also some other changes:
- Remove redundant ipc_lockall/ipc_unlockall
- Now ipc_unlock() directly takes IPC ID pointer as argument, avoid
extra looking up the array.
The changes are made based on the input from Huge Dickens, Manfred
Spraul and Dipankar Sarma. In addition, Cliff White has run OSDL's
dbt1 test on a 2 way against the earlier version of this patch.
Results shows about 2-6% improvement on the average number of
transactions per second. Here is the summary of his tests:
2.5.42-mm2 2.5.42-mm2-ipclock
-----------------------------
Average over 5 runs 85.0 BT 89.8 BT
Std Deviation 5 runs 7.4 BT 1.0 BT
Average over 4 best 88.15 BT 90.2 BT
Std Deviation 4 best 2.8 BT 0.5 BT
Also, another test today from Bill Hartner:
I tested Mingming's RCU ipc lock patch using a *new* microbenchmark - semopbench.
semopbench was written to test the performance of Mingming's patch.
I also ran a 3 hour stress and it completed successfully.
Explanation of the microbenchmark is below the results.
Here is a link to the microbenchmark source.
http://www-124.ibm.com/developerworks/opensource/linuxperf/semopbench/semopbench.c
SUT : 8-way 700 Mhz PIII
I tested 2.5.44-mm2 and 2.5.44-mm2 + RCU ipc patch
>semopbench -g 64 -s 16 -n 16384 -r > sem.results.out
>readprofile -m /boot/System.map | sort -n +0 -r > sem.profile.out
The metric is seconds / per repetition. Lower is better.
kernel run 1 run 2
seconds seconds
================== ======= =======
2.5.44-mm2 515.1 515.4
2.5.44-mm2+rcu-ipc 46.7 46.7
With Mingming's patch, the test completes 10X faster.
|
|
|
|
The patch below adds the base set of LSM hooks for System V IPC to the
2.5.41 kernel. These hooks permit a security module to label
semaphore sets, message queues, and shared memory segments and to
perform security checks on these objects that parallel the existing
IPC access checks. Additional LSM hooks for labeling and controlling
individual messages sent on a single message queue and for providing
fine-grained distinctions among IPC operations will be submitted
separately after this base set of LSM IPC hooks has been accepted.
|
|
into kroah.com:/home/greg/linux/BK/lsm-2.5
|
|
This patch just makes some stuff in ipc/ static.
|
|
msg.c file to the msg.h file
Also move where the msg->q_perm.mode and .key values get set to before
ipc_addid() gets called to make placing a hook there easier.
|
|
- Alan Cox: continued merging
- Mingming Cao: make msgrcv/shmat check the queue/segment ID's properly
- Greg KH: USB serial init failure fix, Xircom serial converter driver
- Neil Brown: nsfd/raid/md/lockd cleanups
- Ingo Molnar: multipath RAID personality, raid xor update
- Hugh Dickins/Marcelo Tosatti: swapin read-ahead race fix
- Vojtech Pavlik: fix up some of the infrastructure for x86-64
- Robert Love: AMD 761 AGP GART support
- Jens Axboe: fix SCSI-generic queue handling race
- me: be sane about page reference bits
|
|
- sync up more with Alan
- Urban Widmark: smbfs and HIGHMEM fix
- Chris Mason: reiserfs tail unpacking fix ("null bytes in reiserfs files")
- Adan Richter: new cpia usb ID
- Hugh Dickins: misc small sysv ipc fixes
- Andries Brouwer: remove overly restrictive sector size check for
SCSI cd-roms
|
|
- Jens: better ordering of requests when unable to merge
- Neil Brown: make md work as a module again (we cannot autodetect
in modules, not enough background information)
- Neil Brown: raid5 SMP locking cleanups
- Neil Brown: nfsd: handle Irix NFS clients named pipe behavior and
dentry leak fix
- maestro3 shutdown fix
- fix dcache hash calculation that could cause bad hashes under certain
circumstances (Dean Gaudet)
- David Miller: networking and sparc updates
- Jeff Garzik: include file cleanups
- Andy Grover: ACPI update
- Coda-fs error return fixes
- rth: alpha Jensen update
|
|
|