| Age | Commit message (Collapse) | Author |
|
Both work, but the latter can cause warnings in user space
from compilers that don't like using undefined identifiers
in preprocessor expressions (quite reasonable).
Pointed out by Randy Dunlap.
|
|
Both gcc and sparse will speed up tokenization by noticing
when a header file is protected by the standard preprocessor
#ifndef .. #endif exclusion.
However, that requires that the headers actually _use_ that
standard exclusion. Some did their own little broken dance.
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch removes CONFIG_NETPOLL_RX, as discussed.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Use hlists for the PID hashes. This halves the memory footprint of these
hashes. No benchmarks, but I think this is a worthy improvement because
the hashes are something that would be likely to have significant portions
loaded into the cache of every CPU on some workloads.
This comes at the "expense" of
1. reintroducing the memory prefetch into the hash traversal loop;
2. adding new pids to the head of the list instead of the tail. I
suspect that if this was a big problem then the hash isn't sized
well or could benefit from moving hot entries to the head.
Also, account for all the pid hashes when reporting hash memory usage.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
A 4GB, 4-way Opteron would create the smallest size table (16 entries) because
pidhash_init is called before mem_init which is where x86-64 sets up max_pfn.
nr_kernel_pages is setup by paging_init, called from setup_arch, which is also
where i386 sets up max_pfn.
So export nr_kernel_pages, nr_all_pages. Use nr_kernel_pages when sizing the
PID hash. This fixes the problem.
This also makes the pid hash dependant on the size of ZONE_NORMAL instead of
total size of memory.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch changes the rusage bookkeeping and the semantics of the
getrusage and times calls in a couple of ways.
The first change is in the c* fields counting dead child processes. POSIX
requires that children that have died be counted in these fields when they
are reaped by a wait* call, and that if they are never reaped (e.g.
because of ignoring SIGCHLD, or exitting yourself first) then they are
never counted. These were counted in release_task for all threads. I've
changed it so they are counted in wait_task_zombie, i.e. exactly when
being reaped.
POSIX also specifies for RUSAGE_CHILDREN that the report include the reaped
child processes of the calling process, i.e. whole thread group in Linux,
not just ones forked by the calling thread. POSIX specifies tms_c[us]time
fields in the times call the same way. I've moved the c* fields that
contain this information into signal_struct, where the single set of
counters accumulates data from any thread in the group that calls wait*.
Finally, POSIX specifies getrusage and times as returning cumulative totals
for the whole process (aka thread group), not just the calling thread.
I've added fields in signal_struct to accumulate the stats of detached
threads as they die. The process stats are the sums of these records plus
the stats of remaining each live/zombie thread. The times and getrusage
calls, and the internal uses for filling in wait4 results and siginfo_t,
now iterate over the threads in the thread group and sum up their stats
along with the stats recorded for threads already dead and gone.
I added a new value RUSAGE_GROUP (-3) for the getrusage system call rather
than changing the behavior of the old RUSAGE_SELF (0). POSIX specifies
RUSAGE_SELF to mean all threads, so the glibc getrusage call will just
translate it to RUSAGE_GROUP for new kernels. I did this thinking that
someone somewhere might want the old behavior with an old glibc and a new
kernel (it is only different if they are using CLONE_THREAD anyway).
However, I've changed the times system call to conform to POSIX as well and
did not provide any backward compatibility there. In that case there is
nothing easy like a parameter value to use, it would have to be a new
system call number. That seems pretty pointless. Given that, I wonder if
it is worth bothering to preserve the compatible RUSAGE_SELF behavior by
introducing RUSAGE_GROUP instead of just changing RUSAGE_SELF's meaning.
Comments?
I've done some basic testing on x86 and x86-64, and all the numbers come
out right after these fixes. (I have a test program that shows a few
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch adds a new system call `waitid'. This is a new POSIX call that
subsumes the rest of the wait* family and can do some things the older
calls cannot. A minor addition is the ability to select what kinds of
status to check for with a mask of independent bits, so you can wait for
just stops and not terminations, for example. A more significant
improvement is the WNOWAIT flag, which allows for polling child status
without reaping. This interface fills in a siginfo_t with the same details
that a SIGCHLD for the status change has; some of that info (e.g. si_uid)
is not available via wait4 or other calls.
I've added a new system call that has the parameter conventions of the
POSIX function because that seems like the cleanest thing. This patch
includes the actual system call table additions for i386 and x86-64; other
architectures will need to assign the system call number, and 64-bit ones
may need to implement 32-bit compat support for it as I did for x86-64.
The new features could instead be provided by some new kludge inventions in
the wait4 system call interface (that's what BSD did). If kludges are
preferable to adding a system call, I can work up something different.
I added a struct rusage field si_rusage to siginfo_t in the SIGCHLD case
(this does not affect the size or layout of the struct). This is not part
of the POSIX interface, but it makes it so that `waitid' subsumes all the
functionality of `wait4'. Future kernel ABIs (new arch's or whatnot) can
have only the `waitid' system call and the rest of the wait* family
including wait3 and wait4 can be implemented in user space using waitid.
There is nothing in user space as yet that would make use of the new field.
Most of the new functionality is implemented purely in the waitid system
call itself. POSIX also provides for the WCONTINUED flag to report when a
child process had been stopped by job control and then resumed with
SIGCONT. Corresponding to this, a SIGCHLD is now generated when a child
resumes (unless SA_NOCLDSTOP is set), with the value CLD_CONTINUED in
siginfo_t.si_code. To implement this, some additional bookkeeping is
required in the signal code handling job control stops.
The motivation for this work is to make it possible to implement the POSIX
semantics of the `waitid' function in glibc completely and correctly. If
changing either the system call interface used to accomplish that, or any
details of the kernel implementation work, would improve the chances of
getting this incorporated, I am more than happy to work through any issues.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
A patch to replace tmpfs/shmem with ramfs for systems without swap,
incorporating the suggestions from Andi and Hugh. It uses ramfs instead.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
A special kprobe type which can be placed on function entry points, and
employs a simple mirroring principle to allow seamless access to the
arguments of a function being probed. The probe handler routine should
have the same prototype as the function being probed. Currently
implemented for x86.
The way it works is that when the probe is hit, the breakpoint handler
simply irets to the probe handler's eip while retaining register and stack
state corresponding to the function entry. After it is done, the probe
handler calls jprobe_return() which traps again to restore processor state
and switch back to the probed function. Linus noted correctly at KS that
we need to be careful as gcc assumes that the callee owns arguments. We
save and restore enough stack bytes to cover argument space.
Sample Usage:
static int jip_queue_xmit(struct sk_buff *skb, int ipfragok)
{
... whatever ...
jprobe_return();
return 0;
}
struct jprobe jp = {
{.addr = (kprobe_opcode_t *) ip_queue_xmit},
.entry = (kprobe_opcode_t *) jip_queue_xmit
};
register_jprobe(&jp);
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch helps developers to trap at almost any kernel code address,
specifying a handler routine to be invoked when the breakpoint is hit.
Useful for analysing the Linux kernel by collecting debugging information
non-disruptively. Employs single-stepping out-of-line to avoid probe
misses on SMP and may be especially useful in aiding debugging elusive
races and problems on live systems. More elaborate dynamic tracing tools
such as DProbes can be built over the kprobes interface.
Helps developers to trap at almost any kernel code address, specifying a
handler routine to be invoked when the breakpoint is hit. Useful for
analysing the Linux kernel by collecting debugging information
non-disruptively. Employs single-stepping out-of-line to avoid probe
misses on SMP and may be especially useful in aiding debugging elusive
races and problems on live systems. More elaborate dynamic tracing tools
such as DProbes can be built over the kprobes interface.
Sample usage:
To place a probe on __blockdev_direct_IO:
static int probe_handler(struct kprobe *p, struct pt_regs *)
{
... whatever ...
}
struct kprobe kp = {
.addr = __blockdev_direct_IO,
.pre_handler = probe_handler
};
register_kprobe(&kp);
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into nuts.davemloft.net:/disk1/BK/net-2.6
|
|
Fix some issues with netem.
* fix memory link in q_destroy where distribution table not freed
* fix math error in tabledist that made it work for only small values
* change API for distribution table so scaling factor is constant and
table size is determined by looking at the rtnetlink payload size.
this is faster and simpler. Since haven't actually released the tools to load
the table yet, this is the chance to get it right ;-)
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Problem discovered, and original version of patch, by
Olaf Kirch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Paolo Giarrusso points out that the UML merge incorrectly
caused INITIAL_JIFFIES to be reset back to zero, instead
of the debug value for getting an early jiffies wrap.
Fix it back up.
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
As noticed by Alan Cox:
"It doesn't work now so it clearly isnt being used 8). We hold the lock
because its a proc function and we then replace the proc functions in the
attach method -> deadlock. It is also incredibly hard to fix without a
major rewrite."
The same is true for HDIO_SET_IDE_SCSI ioctl.
Both were broken 18 months ago in 2.5.63 as a side-effect of locking fixes.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into kernel.bkbits.net:/home/davem/net-2.6
|
|
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>'
|
|
This is a third revision of the netem extensions which provides
* packet duplication
* correlated random number
* loading distribution table
The API is backwards compatible and now uses nested elements to allow
for easier future changes.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
into mulgrave.(none):/home/jejb/BK/scsi-misc-2.6
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
into suse.cz:/home/perex/bk/linux-sound/linux-sound
|
|
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Add support for netmos devices to the parallel port driver.
NetMOS 9805 support is already in the kernel, this patch adds the support for
the missing 9735,9855,9755 and 9715 chips.
And another remark: The 9735 and 9835 seem to be chips with serial *and*
parallel interfaces, so I suppose they are already claimed somewhere in the
serial driver. I don't know whether this causes any problems. I'm sorry that
I can't test, I've only a 9805 here. Any idea how these "dual" chips have to
be handled by the kernel?
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
As David M-T points out, the default per-user mlock limit should be at least a
single page.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Account reserved memory properly as per acahalan's speecified semantics.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Remove the accounting overhead when CONFIG_PROC_FS is not defined.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Merely removing down_read(&mm->mmap_sem) from task_vsize() is too
half-assed to let stand. The following patch removes the vma iteration
as well as the down_read(&mm->mmap_sem) from both task_mem() and
task_statm() and callers for the CONFIG_MMU=y case in favor of
accounting the various stats reported at the times of vma creation,
destruction, and modification. Unlike the 2.4.x patches of the same
name, this has no per-pte-modification overhead whatsoever.
This patch quashes end user complaints of top(1) being slow as well as
kernel hacker complaints of per-pte accounting overhead simultaneously.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
task_vsize() doesn't need mm->mmap_sem for the CONFIG_MMU case; the
semaphore doesn't prevent mm->total_vm from going stale or getting
inconsistent with other numbers regardless. Also, KSTK_EIP() and
KSTK_ESP() don't want or need protection from mm->mmap_sem either. So this
pushes mm->mmap_sem to task_vsize() in the CONFIG_MMU=n task_vsize().
Also, hoist the prototype of task_vsize() into proc_fs.h
The net result of this is a small speedup of procps for CONFIG_MMU.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Anton prompted me to get this patch merged. It changes the core buffer
sync algorithm of OProfile to avoid global locks wherever possible. Anton
tested an earlier version of this patch with some success. I've lightly
tested this applied against 2.6.8.1-mm3 on my two-way machine.
The changes also have the happy side-effect of losing less samples after
munmap operations, and removing the blind spot of tasks exiting inside the
kernel.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Add a whole bunch more might_sleep() checks. We also enable might_sleep()
checking in copy_*_user(). This was non-trivial because of the "copy_*_user()
in atomic regions" trick would generate false positives. Fix that up by
adding a new __copy_*_user_inatomic(), which avoids the might_sleep() check.
Only i386 is supported in this patch.
With: Arjan van de Ven <arjanv@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Reduce the size of struct inode on 64bit architectures by reducing padding.
This assumes spinlocks are 32bit or less which is the case on most
architectures.
This reduces inode structs by 24 bytes on ppc64, and on ext2 increases the
number of inodes in a 4kB slab from 5 to 6.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The megaraid driver is calling these, but they don't exist if !CONFIG_COMPAT.
Add the necessary stubs, and clean a few things up.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Make the various bits of state no longer used anywhere else static to
kernel/profile.c
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
proc_misc.c is a trainwreck. Move the file_operations for /proc/profile into
kernel/profile.c and call the profiling setup via initcall.
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
With prof_cpu_mask and profile_pc() in hand, the core is now able to perform
all the profile accounting work on behalf of arches. Consolidate the profile
accounting and convert all arches to call the core function.
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Handling of prof_cpu_mask is grossly inconsistent. Some arches have it as a
cpumask_t, others unsigned long, and even within arches it's treated
inconsistently. This makes it cpumask_t across the board, and consolidates
the handling in kernel/profile.c
Signed-off-by: William Irwin <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into suse.cz:/home/perex/bk/linux-sound/linux-sound
|
|
into kroah.com:/home/greg/linux/BK/usb-2.6
|
|
into kernel.bkbits.net:/home/davem/net-2.6
|
|
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
The problem:
In arch/i386/signal.c, in the do_signal() function, it calls
get_signal_to_deliver() which returns the signal number to deliver (along
with siginfo). get_signal_to_deliver() grabs and releases the lock, so
the signal handler lock is not held in do_signal(). Then the do_signal()
calls handle_signal(), which uses the signal number to extract the
sa_handler, etc.
Since no lock is held, it seems like another thread with the same
signal handler set can come in and call sigaction(), it can change
sa_handler between the call to get_signal_to_deliver() and fetching the
value of sa_handler. If the sigaction() call set it to SIG_IGN, SIG_DFL,
or some other fundamental change, that bad things can happen.
The patch:
You have to get the sigaction information that will be delivered while
holding sighand->siglock in get_signal_to_deliver().
In 2.4, it can be fixed per-arch and requires no change to the
arch-independent code because the arch fetches the signal with
dequeue_signal() and does all the checking.
The test app:
The program below has three threads that share signal handlers. Thread
1 changes the signal handler for a signal from a handler to SIG_IGN and
back. Thread 0 sends signals to thread 3, which just receives them.
What I believe is happening is that thread 1 changes the signal handler
in the process of thread 3 receiving the signal, between the time that
thread 3 fetches the signal info using get_signal_to_deliver() and
actually delivers the signal with handle_signal().
Although the program is obvously an extreme case, it seems like any
time you set the handler value of a signal to SIG_IGN or SIG_DFL, you can
have this happen. Changing signal attributes might also cause problems,
although I am not so sure about that.
(akpm: this test app segv'd on SMP within milliseconds for me)
#include <signal.h>
#include <stdio.h>
#include <sched.h>
char stack1[16384];
char stack2[16384];
void sighnd(int sig)
{
}
int child1(void *data)
{
struct sigaction act;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
for (;;) {
act.sa_handler = sighnd;
sigaction(45, &act, NULL);
act.sa_handler = SIG_IGN;
sigaction(45, &act, NULL);
}
}
int child2(void *data)
{
for (;;) {
sleep(100);
}
}
int main(int argc, char *argv[])
{
int pid1, pid2;
signal(45, SIG_IGN);
pid2 = clone(child2, stack2 + sizeof(stack2) - 8,
CLONE_SIGHAND | CLONE_VM, NULL);
pid1 = clone(child1, stack1 + sizeof(stack2) - 8,
CLONE_SIGHAND | CLONE_VM, NULL);
for (;;) {
kill(pid2, 45);
}
}
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into suse.cz:/home/perex/bk/linux-sound/linux-sound
|
|
|
|
The ebtables brouting chain, traversed through the call
br_should_route_hook(), can alter a packet. The redirect target
does this, f.e., to change the MAC destination.
Bart discovered this and proposed a patch; this is a revised version.
This version cleans up the handle_bridge code in net/core/dev.c as well
as getting rid of extra rcu_read_lock and only does the br_port checking
once.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@redhat.com>
|