| Age | Commit message (Collapse) | Author |
|
32-bit syscall compatibility support. (This patch also moves all futex
related compat functionality into kernel/futex_compat.c.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
[PATCH] fix audit_init failure path
[PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
[PATCH] sem2mutex: audit_netlink_sem
[PATCH] simplify audit_free() locking
[PATCH] Fix audit operators
[PATCH] promiscuous mode
[PATCH] Add tty to syscall audit records
[PATCH] add/remove rule update
[PATCH] audit string fields interface + consumer
[PATCH] SE Linux audit events
[PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
[PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
[PATCH] Fix IA64 success/failure indication in syscall auditing.
[PATCH] Miscellaneous bug and warning fixes
[PATCH] Capture selinux subject/object context information.
[PATCH] Exclude messages by message type
[PATCH] Collect more inode information during syscall processing.
[PATCH] Pass dentry, not just name, in fsnotify creation hooks.
[PATCH] Define new range of userspace messages.
[PATCH] Filter rule comparators
...
Fixed trivial conflict in security/selinux/hooks.c
|
|
Original patch from Paul Mundt, sysfs parts removed by me since they
were broken.
Signed-off-by: Jens Axboe <axboe@suse.de>
|
|
This fixes the per-user and per-message-type filtering when syscall
auditing isn't enabled.
[AV: folded followup fix from the same author]
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Build kernel/intermodule.c only when required.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
hrtimer subsystem core. It is initialized at bootup and expired by the timer
interrupt, but is otherwise not utilized by any other subsystem yet.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
- Moving the crash_dump.c file to arch dependent part as kmap_atomic_pfn is
specific to i386 and highmem may not exist in other archs.
- Use ioremap for x86_64 to map the previous kernel memory.
- In copy_oldmem_page(), we now directly copy to the user/kernel buffer and
avoid the unneccesary copy to a kmalloc'd page.
Signed-off-by: Rachita Kothiyal <rachita@in.ibm.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
mutex implementation - add debugging code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
|
|
mutex implementation, core files: just the basic subsystem, no users of it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
|
|
This patch is a rewrite of the one submitted on October 1st, using modules
(http://marc.theaimsgroup.com/?l=linux-kernel&m=112819093522998&w=2).
This rewrite adds a tristate CONFIG_RCU_TORTURE_TEST, which enables an
intense torture test of the RCU infratructure. This is needed due to the
continued changes to the RCU infrastructure to accommodate dynamic ticks,
CPU hotplug, realtime, and so on. Most of the code is in a separate file
that is compiled only if the CONFIG variable is set. Documentation on how
to run the test and interpret the output is also included.
This code has been tested on i386 and ppc64, and an earlier version of the
code has received extensive testing on a number of architectures as part of
the PREEMPT_RT patchset.
Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Since CONFIG_IKCONFIG_PROC already depends on CONFIG_IKCONFIG, adding
configs.o again is redundant.
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch (written by me and also containing many suggestions of Arjan van
de Ven) does a major cleanup of the spinlock code. It does the following
things:
- consolidates and enhances the spinlock/rwlock debugging code
- simplifies the asm/spinlock.h files
- encapsulates the raw spinlock type and moves generic spinlock
features (such as ->break_lock) into the generic code.
- cleans up the spinlock code hierarchy to get rid of the spaghetti.
Most notably there's now only a single variant of the debugging code,
located in lib/spinlock_debug.c. (previously we had one SMP debugging
variant per architecture, plus a separate generic one for UP builds)
Also, i've enhanced the rwlock debugging facility, it will now track
write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too.
All locks have lockup detection now, which will work for both soft and hard
spin/rwlock lockups.
The arch-level include files now only contain the minimally necessary
subset of the spinlock code - all the rest that can be generalized now
lives in the generic headers:
include/asm-i386/spinlock_types.h | 16
include/asm-x86_64/spinlock_types.h | 16
I have also split up the various spinlock variants into separate files,
making it easier to see which does what. The new layout is:
SMP | UP
----------------------------|-----------------------------------
asm/spinlock_types_smp.h | linux/spinlock_types_up.h
linux/spinlock_types.h | linux/spinlock_types.h
asm/spinlock_smp.h | linux/spinlock_up.h
linux/spinlock_api_smp.h | linux/spinlock_api_up.h
linux/spinlock.h | linux/spinlock.h
/*
* here's the role of the various spinlock/rwlock related include files:
*
* on SMP builds:
*
* asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
* initializers
*
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* asm/spinlock.h: contains the __raw_spin_*()/etc. lowlevel
* implementations, mostly inline assembly code
*
* (also included on UP-debug builds:)
*
* linux/spinlock_api_smp.h:
* contains the prototypes for the _spin_*() APIs.
*
* linux/spinlock.h: builds the final spin_*() APIs.
*
* on UP builds:
*
* linux/spinlock_type_up.h:
* contains the generic, simplified UP spinlock type.
* (which is an empty structure on non-debug builds)
*
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* linux/spinlock_up.h:
* contains the __raw_spin_*()/etc. version of UP
* builds. (which are NOPs on non-debug, non-preempt
* builds)
*
* (included on UP-non-debug builds:)
*
* linux/spinlock_api_up.h:
* builds the _spin_*() APIs.
*
* linux/spinlock.h: builds the final spin_*() APIs.
*/
All SMP and UP architectures are converted by this patch.
arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should
be mostly fine.
From: Grant Grundler <grundler@parisc-linux.org>
Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
Builds 32-bit SMP kernel (not booted or tested). I did not try to build
non-SMP kernels. That should be trivial to fix up later if necessary.
I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids
some ugly nesting of linux/*.h and asm/*.h files. Those particular locks
are well tested and contained entirely inside arch specific code. I do NOT
expect any new issues to arise with them.
If someone does ever need to use debug/metrics with them, then they will
need to unravel this hairball between spinlocks, atomic ops, and bit ops
that exist only because parisc has exactly one atomic instruction: LDCW
(load and clear word).
From: "Luck, Tony" <tony.luck@intel.com>
ia64 fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch adds a new kernel debug feature: CONFIG_DETECT_SOFTLOCKUP.
When enabled then per-CPU watchdog threads are started, which try to run
once per second. If they get delayed for more than 10 seconds then a
callback from the timer interrupt detects this condition and prints out a
warning message and a stack dump (once per lockup incident). The feature
is otherwise non-intrusive, it doesnt try to unlock the box in any way, it
only gets the debug info out, automatically, and on all CPUs affected by
the lockup.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch provides the interfaces necessary to read the dump contents,
treating it as a high memory device.
Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch introduces the architecture independent implementation the
sys_kexec_load, the compat_sys_kexec_load system calls.
Kexec on panic support has been integrated into the core patch and is
relatively clean.
In addition the hopefully architecture independent option
crashkernel=size@location has been docuemented. It's purpose is to reserve
space for the panic kernel to live, and where no DMA transfer will ever be
setup to access.
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
While looking at code generated by gcc4.0 I noticed some functions still
had frame pointers, even after we stopped ppc64 from defining
CONFIG_FRAME_POINTER. It turns out kernel/Makefile hardwires
-fno-omit-frame-pointer on when compiling schedule.c.
Create CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER and define it on architectures
that dont require frame pointers in sched.c code.
(akpm: blame me for the name)
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into mars.ravnborg.org:/home/sam/bk/kbuild
|
|
This my cpuset patch, with the following changes in the last two weeks:
1) Updated to 2.6.8.1-mm1
2) [Simon Derr <Simon.Derr@bull.net>] Fix new cpuset to begin empty,
not copied from parent. Needed to avoid breaking exclusive property.
3) [Dinakar Guniguntala <dino@in.ibm.com>] Finish initializing top
cpuset from cpu_possible_map after smp_init() called.
4) [Paul Jackson <pj@sgi.com>] Check on each call to __alloc_pages()
if the current tasks cpuset mems_allowed has changed. Use a cpuset
generation number, bumped on any cpuset memory placement change,
to make this check efficient. Update the tasks mems_allowed from
its cpuset, if the cpuset has changed.
5) [Paul Jackson <pj@sgi.com>] If a task is moved to another cpuset,
then update its cpus_allowed, using set_cpus_allowed().
6) [Paul Jackson <pj@sgi.com>] Update Documentation/cpusets.txt to
reflect above changes (4) and (5).
I continue to recommend the following patch for inclusion in your 2.6.9-*mm
series, when that opens. It provides an important facility for high
performance computing on large systems. Simon Derr of Bull (France) and
myself are the primary authors. Erich Focht has indicated that NEC is also
a potential user of this patch on the TX-7 NUMA machines, and that he
"would very much welcome the inclusion of cpusets."
I offer this update to lkml, in order to invite continued feedback.
The one prerequiste patch for this cpuset patch was just posted before this
one. That was a patch to provide a new bitmap list format, of which
cpusets is the first user.
This patch has been built on top of 2.6.8.1-mm1, for the arch's:
i386 x86_64 sparc ia64 powerpc-405 powerpc-750 sparc64
with and without CONFIG_CPUSET. It has been booted and tested on ia64
(sn2_defconfig, SN2 hardware). The 'alpha' arch also built, except for
what seems to be an unrelated toolchain problem (crosstool ld sigsegv) in
the final link step.
===
Cpusets provide a mechanism for assigning a set of CPUs and Memory Nodes to
a set of tasks.
Cpusets constrain the CPU and Memory placement of tasks to only the
processor and memory resources within a tasks current cpuset. They form a
nested hierarchy visible in a virtual file system. These are the essential
hooks, beyond what is already present, required to manage dynamic job
placement on large systems.
Cpusets require small kernel hooks in init, exit, fork, mempolicy,
sched_setaffinity, page_alloc and vmscan. And they require a "struct
cpuset" pointer, a cpuset_mems_generation, and a "mems_allowed" nodemask_t
(to go along with the "cpus_allowed" cpumask_t that's already there) in
each task struct.
These hooks:
1) establish and propagate cpusets,
2) enforce CPU placement in sched_setaffinity,
3) enforce Memory placement in mbind and sys_set_mempolicy,
4) restrict page allocation and scanning to mems_allowed, and
5) restrict migration and set_cpus_allowed to cpus_allowed.
The other required hook, restricting task scheduling to CPUs in a tasks
cpus_allowed mask, is already present.
Cpusets extend the usefulness of, the existing placement support that was
added to Linux 2.6 kernels: sched_setaffinity() for CPU placement, and
mbind() and set_mempolicy() for memory placement. On smaller or dedicated
use systems, the existing calls are often sufficient.
On larger NUMA systems, running more than one, performance critical, job,
it is necessary to be able to manage jobs in their entirety. This includes
providing a job with exclusive CPU and memory that no other job can use,
and being able to list all tasks currently in a cpuset.
A given job running within a cpuset, would likely use the existing
placement calls to manage its CPU and memory placement in more detail.
Cpusets are named, nested sets of CPUs and Memory Nodes. Each cpuset is
represented by a directory in the cpuset virtual file system, normally
mounted at /dev/cpuset.
Each cpuset directory provides the following files, which can be
read and written:
cpus:
List of CPUs allowed to tasks in that cpuset.
mems:
List of Memory Nodes allowed to tasks in that cpuset.
tasks:
List of pid's of tasks in that cpuset.
cpu_exclusive:
Flag (0 or 1) - if set, cpuset has exclusive use of
its CPUs (no sibling or cousin cpuset may overlap CPUs).
mem_exclusive:
Flag (0 or 1) - if set, cpuset has exclusive use of
its Memory Nodes (no sibling or cousin may overlap).
notify_on_release:
Flag (0 or 1) - if set, then /sbin/cpuset_release_agent
will be invoked, with the name (/dev/cpuset relative path)
of that cpuset in argv[1], when the last user of it (task
or child cpuset) goes away. This supports automatic
cleanup of abandoned cpusets.
In addition one new filetype is added to the /proc file system:
/proc/<pid>/cpuset:
For each task (pid), list its cpuset path, relative to the
root of the cpuset file system. This file is read-only.
New cpusets are created using 'mkdir' (at the shell or in C). Old ones are
removed using 'rmdir'. The above files are accessed using read(2) and
write(2) system calls, or shell commands such as 'cat' and 'echo'.
The CPUs and Memory Nodes in a given cpuset are always a subset of its
parent. The root cpuset has all possible CPUs and Memory Nodes in the
system. A cpuset may be exclusive (cpu or memory) only if its parent is
similarly exclusive.
See further Documentation/cpusets.txt, at the top of the following
patch.
/proc interface:
It is useful, when learning and making new uses of cpusets and placement to be
able to see what are the current value of a tasks cpus_allowed and
mems_allowed, which are the actual placement used by the kernel scheduler and
memory allocator.
The cpus_allowed and mems_allowed values are needed by user space apps that
are micromanaging placement, such as when moving an app to a obtained by
that app within its cpuset using sched_setaffinity, mbind and
set_mempolicy.
The cpus_allowed value is also available via the sched_getaffinity system
call. But since the entire rest of the cpuset API, including the display
of mems_allowed added here, is via an ascii style presentation in /proc and
/dev/cpuset, it is worth the extra couple lines of code to display
cpus_allowed in the same way.
This patch adds the display of these two fields to the 'status' file in the
/proc/<pid> directory of each task. The fields are only added if
CONFIG_CPUSETS is enabled (which is also needed to define the mems_allowed
field of each task). The new output lines look like:
$ tail -2 /proc/1/status
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff
Mems_allowed: ffffffff,ffffffff
Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch provides support for thread and process CPU time clocks in the
POSIX clock interface. Both the existing utime and utime+stime information
(already available via getrusage et al) can be used, as well as a new
(potentially) more precise and accurate clock (which cannot distinguish user
from system time). The clock used is that provided by the `sched_clock'
function already used internally by the scheduler. This gives a way for
platforms to provide the highest-resolution CPU time tracking that is
available cheaply, and some already do so (such as x86 using TSC). Because
this clock is already sampled internally by the scheduler, this new tracking
adds only the tiniest new overhead to accomplish the bookkeeping.
Some notes:
This allows per-thread clocks to be accessed only by other threads in the same
process. The only POSIX calls that access these are defined only for
in-process use, and having this check is necessary for the userland
implementations of the POSIX clock functions to robustly refuse stale
clockid_t's in the face of potential PID reuse.
This makes no constraint on who can see whose per-process clocks. This
information is already available for the VIRT and PROF (i.e. utime and stime)
information via /proc. I am open to suggestions on if/how security
constraints on who can see whose clocks should be imposed.
The SCHED clock information is now available only via clock_* syscalls. This
means that per-thread information is not available outside the process.
Perhaps /proc should show sched_time as well? This would let ps et al show
this more-accurate information.
When this code is merged, it will be supported in glibc. I have written the
support and added some test programs for glibc, which are what I mainly used
to test the new kernel code. You can get those here:
http://people.redhat.com/roland/glibc/kernel-cpuclocks.patch
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
I'd need it merged into mainline at some point, unless anybody has strong
arguments against it. All I can guarantee here, is that I'll back it out
myself in the future, iff Cpushare will fail and nobody else started using
it in the meantime for similar security purposes.
(akpm: project details are at http://www.cpushare.com/technical. It seems
like a good idea to me, and one which is worth supporting. I agree that for
this to be successful, the added robustness of Andrea's simple and specific
jail is worthwhile).
Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch makes a needlessly global variable static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Randy Dunlap <rddunlap@osdl.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
|
|
Sticking the not-implemented syscall stuff in sys.c is a pain because the
cond_syscall()s explode when certain prototypes are in scope. And we need
those prototypes' header files for the C code in sys.c.
Fix all that up by moving all the sys_ni_syscall code into its own .c file.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
A simple ringbuffer implementation for various character drivers.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
The following patch series consolidates the various instances of waitqueue
hashing to use a uniform structure and share the per-zone hashtable among all
waitqueue hashers. This is expected to increase the number of hashtable
buckets available for waiting on bh's and inodes and eliminate statically
allocated kernel data structures for greater node locality and reduced kernel
image size. Some attempt was made to look similar to Oleg Nesterov's
suggested API in order to provide some kind of credit for independent
invention of something very similar (the original versions of these patches
predated my public postings on the subject of filtered waitqueues).
These patches have the further benefit and intention of enabling aio to use
filtered wakeups by standardizing the data structure passed to wake functions
so that embedded waitqueue elements in aio structures may be succesfully
passed to the filtered wakeup wake functions, though this patch series doesn't
implement that particular functionality.
Successfully stress-tested on x86-64, and ia64 in recent prior versions.
This patch:
Move waitqueue -related functions not needing static functions in sched.c
to kernel/wait.c
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The main goal of this patch is to consolidate all the different but still
fundamentally similar arch/*/kernel/irq.c code into the kernel/irq/ subsystem.
There are 4 new files in the kernel/irq/ directory:
- handle.c: core bits: __do_IRQ() and handle_IRQ_event(),
callable from arch-specific irq.c code.
- manage.c: the main driver apis
- spurious.c: the handling of buggy interrupt sources.
- autoprobe.c: probing of interrupts - older code but still in use.
- proc.c: /proc/irq/ code.
- internals.h for irq-core-internal interfaces not visible to drivers
nor arch PIC code.
An architecture enables the generic hardirq code by defining
CONFIG_GENERIC_HARDIRQS in its arch Kconfig. People doing this conversion
should check out the x86/x64/ppc/ppc64 patches for details - the conversion is
quite straightforward but every converted function (i.e. every function
removed from the arch irq.c) _must_ be matched to the generic version and if
there is any detail that the generic code should do it has to be added to the
generic code. All of the currently converted 4 architectures were converted
like that, and the generic code was extended/fixed along the way.
Other changes related to this patchset:
- clean up the irq include files (linux/irq.h, linux/interrupt.h,
linux/hardirq.h) and consolidate asm-*/[hard]irq.h. Note, to keep all
non-touched architectures in an untouched state this consolidation is
done carefully and strictly under CONFIG_GENERIC_HARDIRQS.
Once the consolidation is done we can do a couple of final cleanups
to reach the following logical splitup of 3 include files:
linux/interrupt.h: driver-visible APIs and details
linux/irq.h: core irq and arch-PIC code, internals
asm-*/irq.h: arch PIC and irq delivery details
the following include files will likely vanish:
linux/hardirq.h merges into linux/irq.h
asm-*/hardirq.h: merges into asm-*/irq.h
asm-*/hw_irq.h: merges into asm-*/irq.h
Christoph would like to do these once the current wave of
cleanups gets in.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into kroah.com:/home/greg/linux/BK/driver-2.6
|
|
Thanks to Kay Sievers for pointing this out.
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
|
|
This patch achieves out of line spinlocks by creating kernel/spinlock.c
and using the _raw_* inline locking functions.
Now, as much as this is supposed to be arch agnostic, there was still a
fair amount of rummaging about in archs, mostly for the cases where the
arch already has out of line locks and i wanted to avoid the extra call,
saving that extra call also makes lock profiling easier. PPC32/64 was
an example of such an arch and i have added the necessary profile_pc()
function as an example.
Size differences are with CONFIG_PREEMPT enabled since we wanted to
determine how much could be saved by moving that lot out of line too.
ppc64 = 259897 bytes:
text data bss dec hex filename
5489808 1962724 709064 8161596 7c893c vmlinux-after
5749577 1962852 709064 8421493 808075 vmlinux-before
sparc64 = 193368 bytes:
text data bss dec hex filename
3472037 633712 308920 4414669 435ccd vmlinux-after
3665285 633832 308920 4608037 465025 vmlinux-before
i386 = 416075 bytes
text data bss dec hex filename
5808371 867442 326864 7002677 6ada35 vmlinux-after
6221254 870634 326864 7418752 713380 vmlinux-before
x86-64 = 282446 bytes
text data bss dec hex filename
4598025 1450644 523632 6572301 64490d vmlinux-after
4881679 1449436 523632 6854747 68985b vmlinux-before
It has been compile tested (UP, SMP, PREEMPT) on i386, x86-64, sparc,
sparc64, ppc64, ppc32 and runtime tested on i386, x86-64 and sparc64.
Signed-off-by: Zwane Mwaikambo <zwane@fsmlabs.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
o /sys/kernel/hotplug_seqnum exports the current number
o lib/kobject.c's sequence_num is renamed to hotplug_seqnum and
exported by include/linux/kobject.h
o the source file ksysfs.c in kernel/ creates on init the
sybsystem "/sys/kernel/" in sysfs
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
|
|
This patch helps developers to trap at almost any kernel code address,
specifying a handler routine to be invoked when the breakpoint is hit.
Useful for analysing the Linux kernel by collecting debugging information
non-disruptively. Employs single-stepping out-of-line to avoid probe
misses on SMP and may be especially useful in aiding debugging elusive
races and problems on live systems. More elaborate dynamic tracing tools
such as DProbes can be built over the kprobes interface.
Helps developers to trap at almost any kernel code address, specifying a
handler routine to be invoked when the breakpoint is hit. Useful for
analysing the Linux kernel by collecting debugging information
non-disruptively. Employs single-stepping out-of-line to avoid probe
misses on SMP and may be especially useful in aiding debugging elusive
races and problems on live systems. More elaborate dynamic tracing tools
such as DProbes can be built over the kprobes interface.
Sample usage:
To place a probe on __blockdev_direct_IO:
static int probe_handler(struct kprobe *p, struct pt_regs *)
{
... whatever ...
}
struct kprobe kp = {
.addr = __blockdev_direct_IO,
.pre_handler = probe_handler
};
register_kprobe(&kp);
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
From: Andy Whitcroft <apw@shadowen.org>
Being able to recover the configuration from a kernel is very useful and it
would be nice to default this option to Yes. Currently, to have the config
available both from the image (using extract-ikconfig) and via /proc we
keep two copies of the original .config in the kernel. One in plain text
and one gzip compressed. This is not optimal.
This patch removes the plain text version of the configuration and updates
the extraction tools to locate and use the gzip'd version of the file.
This has the added bonus of providing us with the exact same results in
both cases, the original .config; including the comments.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
When using a separate output directory the in-kernel config wiere rebuild
each time the kernel was compiled. Fix this by specifying correct path to
Makefile in the prerequisite to the ikconfig.h file.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
From: Rik Faith <faith@redhat.com>
This patch provides a low-overhead system-call auditing framework for Linux
that is usable by LSM components (e.g., SELinux). This is an update of the
patch discussed in this thread:
http://marc.theaimsgroup.com/?t=107815888100001&r=1&w=2
In brief, it provides for netlink-based logging of audit records that have
been generated in other parts of the kernel (e.g., SELinux) as well as the
ability to audit system calls, either independently (using simple
filtering) or as a compliment to the audit record that another part of the
kernel generated.
The main goals were to provide system call auditing with 1) as low overhead
as possible, and 2) without duplicating functionality that is already
provided by SELinux (and/or other security infrastructures). This
framework will work "stand-alone", but is not designed to provide, e.g.,
CAPP functionality without another security component in place.
This updated patch includes changes from feedback I have received,
including the ability to compile without CONFIG_NET (and better use of
tabs, so use -w if you diff against the older patch).
Please see http://people.redhat.com/faith/audit/ for an early example
user-space client (auditd-0.4.tar.gz) and instructions on how to try it.
My future intentions at the kernel level include improving filtering (e.g.,
syscall personality/exit codes) and syscall support for more architectures.
First, though, I'm going to work on documentation, a (real) audit daemon,
and patches for other user-space tools so that people can play with the
framework and understand how it can be used with and without SELinux.
Update:
Light-weight Auditing Framework receive filter fixes
From: Rik Faith <faith@redhat.com>
Since audit_receive_filter() is only called with audit_netlink_sem held, it
cannot race with either audit_del_rule() or audit_add_rule(), so the
list_for_each_entry_rcu()s may be replaced by list_for_each_entry()s, and
the rcu_read_{un,}lock()s removed. A fix for this is part of the attached
patch.
Other features of the attached patch are:
1) generalized the ability to test for inequality
2) added syscall exit status reporting and testing
3) added ability to report and test first 4 syscall arguments (this adds
a large amount of flexibility for little cost; not implemented or tested
on ppc64)
4) added ability to report and test personality
User-space demo program enhanced for new fields and inequality testing:
http://people.redhat.com/faith/audit/auditd-0.5.tar.gz
|
|
The "bogolock" code was introduced in module.c, as a way of freezing
the machine when we wanted to remove a module. This patch moves it
out to stop_machine.c and stop_machine.h.
Since the code changes affinity and proirity, it's impolite to hijack
the current context, so we use a kthread. This means we have to pass
the function rather than implement "stop_machine()" and
"restart_machine()".
|
|
From: Rusty Russell <rusty@rustcorp.com.au>
These two patches provide the framework for stopping kernel threads to
allow hotplug CPU. This one just adds kthread.c and kthread.h, next
one uses it.
Most importantly, adds a Monty Python quote to the kernel.
Details:
The hotplug CPU code introduces two major problems:
1) Threads which previously never stopped (migration thread,
ksoftirqd, keventd) have to be stopped cleanly as CPUs go offline.
2) Threads which previously never had to be created now have
to be created when a CPU goes online.
Unfortunately, stopping a thread is fairly baroque, involving memory
barriers, a completion and spinning until the task is actually dead
(for example, complete_and_exit() must be used if inside a module).
There are also three problems in starting a thread:
1) Doing it from a random process context risks environment contamination:
better to do it from keventd to guarantee a clean environment, a-la
call_usermodehelper.
2) Getting the task struct without races is a hard: see kernel/sched.c
migration_call(), kernel/workqueue.c create_workqueue_thread().
3) There are races in starting a thread for a CPU which is not yet
online: migration thread does a complex dance at the moment for
a similar reason (there may be no migration thread to migrate us).
Place all this logic in some primitives to make life easier:
kthread_create() and kthread_stop(). These primitives require no
extra data-structures in the caller: they operate on normal "struct
task_struct"s.
Other changes:
- Expose keventd_up(), as keventd and migration threads will use kthread to
launch, and kthread normally uses workqueues and must recognize this case.
- Kthreads created at boot before "keventd" are spawned directly. However,
this means that they don't have all signals blocked, and hence can be
killed. The simplest solution is to always explicitly block all signals in
the kthread.
- Change over the migration threads, the workqueue threads and the
ksoftirqd threads to use kthread.
- module.c currently spawns threads directly to stop the machine, so a
module can be atomically tested for removal.
- Unfortunately, this means that the current task is manipulated (which
races with set_cpus_allowed, for example), and it can't set its priority
artificially high. Using a kernel thread can solve this cleanly, and with
kthread_run, it's simple.
- kthreads use keventd, so they inherit its cpus_allowed mask. Unset it.
All current users set it explicity anyway, but it's nice to fix.
- call_usermode_helper uses keventd, so the process created inherits its
cpus_allowed mask. Unset it.
- Prevent errors in boot when cpus_possible() contains a cpu which is not
online (ie. a cpu didn't come up). This doesn't happen on x86, since a
boot failure makes that CPU no longer possible (hacky, but it works).
- When the cpu fails to come up, some callbacks do kthread_stop(), which
doesn't work without keventd (which hasn't started yet). Call it directly,
and take care that it restores signal state (note: do_sigaction does a
flush on blocked signals, so we don't need to repeat it).
|
|
|
|
From: "Randy.Dunlap" <randy.dunlap@verizon.net>
The SuSE kernels place their ikconfig info at /proc/config.gz: in a
different place, and compressed. We thought it was a good idea to do it
that way in 2.6 as well.
- gzip the /proc config file, put it in /proc/config.gz;
- Based on a SuSE patch by Oliver Xymoron <oxymoron@waste.org>, which was
derived from a patch by Nicholas Leon <nicholas@binary9.net>
- change /proc/ikconfig/built_with to /proc/config_build_info;
- cleanup ikconfig init/exit entry points (static, __init, __exit);
- Makefile help from Sam Ravnborg;
DESC
ikconfig cleanup
EDESC
From: Stephen Hemminger <shemminger@osdl.org>
Simplify and cleanup the code:
- use single interface to seq_file where possible
- don't need to do as much of the /proc interface, only read
- use copy_to_user to avoid char at a time copy
- remove unneccesary globals
- use const char[] rather than const char * where possible.
Didn't change the version since interface doesn't change.
|
|
Also remove $Id$ tag.
No other code change.
|
|
From: "Randy.Dunlap" <rddunlap@osdl.org>
Please merge this makefile update from Sam.
From: Sam Ravnborg <sam@ravnborg.org>
Remark, I removed dependencies for configs.o - the are generated by kbuild
anyway. Only generated files needs explicit dependencies.
|
|
- Move kernel/pm.c to kernel/power/pm.c
- Move poweroff sysrq registration to kernel/power/poweroff.c
- Mark pm_* functions deprecated to prevent new uers.
|
|
|
|
into osdl.org:/home/mochel/src/kernel/devel/linux-2.5-power
|
|
(Randy Dunlap)
Build the kernel config data into the kernel - either unloaded or accessible
via /proc
|
|
Move kernel/suspend.c -> drivers/power/swsusp.c
|
|
From: Christopher Hoover <ch@murgatroid.com>
Not everyone needs futex support, so it should be optional. This is needed
for small platforms.
|
|
This is version 23 or so of the POSIX timer code.
Internal changelog:
- Changed the signals code to match the new order of things. Also the
new xtime_lock code needed to be picked up. It made some things a lot
simpler.
- Fixed a spin lock hand off problem in locking timers (thanks
to Randy).
- Fixed nanosleep to test for out of bound nanoseconds
(thanks to Julie).
- Fixed a couple of id deallocation bugs that left old ids
laying around (hey I get this one).
- This version has a new timer id manager. Andrew Morton
suggested elimination of recursion (done) and I added code
to allow it to release unused nodes. The prior version only
released the leaf nodes. (The id manager uses radix tree
type nodes.) Also added is a reuse count so ids will not
repeat for at least 256 alloc/ free cycles.
- The changes for the new sys_call restart now allow one
restart function to handle both nanosleep and clock_nanosleep.
Saves a bit of code, nice.
- All the requested changes and Lindent too :).
- I also broke clock_nanosleep() apart much the same way
nanosleep() was with the 2.5.50-bk5 changes.
TIMER STORMS
The POSIX clocks and timers code prevents "timer storms" by
not putting repeating timers back in the timer list until
the signal is delivered for the prior expiry. Timer events
missed by this delay are accounted for in the timer overrun
count. The net result is MUCH lower system overhead while
presenting the same info to the user as would be the case if
an interrupt and timer processing were required for each
increment in the overrun count.
|
|
One of the goals of the whole new modversions implementation:
export-objs is gone for good!
|
|
This was never used, and a bad idea to begin with.
|