path: root/include/linux/init_task.h
2005-11-13  [PATCH] aio: remove kioctx from mm_struct (Zach Brown)
Sync iocbs have a life cycle that doesn't need a kioctx. Their retrying, if any, is done in the context of their owner, who has allocated them on the stack.

The sole user of a sync iocb's ctx reference was aio_complete() checking for an elevated iocb ref count that could never happen. No path which grabs an iocb ref has access to sync iocbs. If we were to implement sync iocb cancelation it would be done by the owner of the iocb using its on-stack reference.

Removing this chunk from aio_complete allows us to remove the entire kioctx instance from mm_struct, reducing its size by a third. On an i386 testing box the slab size went from 768 to 504 bytes and from 5 to 8 objects per page.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09  [PATCH] files: files struct with RCU (Dipankar Sarma)
Patch to eliminate the struct files_struct.file_lock spinlock on the reader side and use the rcuref_xxx API for the f_count refcounter. The updates to the fdtable are done by allocating a new fdtable structure and setting files->fdt to point to the new structure. The fdtable structure is protected by RCU, thereby allowing lock-free lookup. For fd arrays/sets that are vmalloced, we use keventd to free them since RCU callbacks can't sleep. A global list of fdtables to be freed is not scalable, so we use a per-cpu list. If keventd is already handling the current cpu's work, we use a timer to defer queueing of that work.

Since the last publication, this patch has been re-written to avoid using explicit memory barriers and to use the rcu_assign_pointer() and rcu_dereference() primitives instead. This required that the fd information be kept in a separate structure (fdtable) and updated atomically.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
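To make the pattern concrete, here is a minimal, hypothetical sketch of the scheme described above; the names ending in _like are invented for illustration, the real code lives in fs/file.c and include/linux/file.h:

    #include <linux/fs.h>
    #include <linux/rcupdate.h>
    #include <linux/spinlock.h>

    /* Illustrative stand-ins only; not the kernel's real definitions. */
    struct fdtable_like {
        unsigned int max_fds;
        struct file **fd;
    };

    struct files_like {
        spinlock_t file_lock;        /* taken by updaters only */
        struct fdtable_like *fdt;    /* RCU-protected pointer */
    };

    /* Reader side: lock-free lookup under rcu_read_lock(). */
    static struct file *fd_lookup_like(struct files_like *files, unsigned int fd)
    {
        struct fdtable_like *fdt;
        struct file *file = NULL;

        rcu_read_lock();
        fdt = rcu_dereference(files->fdt);
        if (fd < fdt->max_fds)
            file = fdt->fd[fd];
        rcu_read_unlock();
        return file;
    }

    /* Update side: publish a new table; the old one is freed only after a
     * grace period (RCU callback, or keventd for vmalloc'ed arrays, as the
     * commit message explains). */
    static struct fdtable_like *fdtable_install_like(struct files_like *files,
                                                     struct fdtable_like *new_fdt)
    {
        struct fdtable_like *old;

        spin_lock(&files->file_lock);
        old = files->fdt;
        rcu_assign_pointer(files->fdt, new_fdt);
        spin_unlock(&files->file_lock);
        return old;    /* caller defers the actual free */
    }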
2005-09-09  [PATCH] files: break up files struct (Dipankar Sarma)
In order for the RCU to work, the file table array, sets and their sizes must be updated atomically. Instead of ensuring this through too many memory barriers, we put the arrays and their sizes in a separate structure. This patch takes the first step of putting the file table elements in a separate structure, fdtable, that is embedded within files_struct. It also changes all the users to refer to the file table using the files_fdtable() macro. Subsequent application of RCU becomes easier after this.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
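A rough sketch of the layout this describes; the field names are simplified stand-ins, not the kernel's actual definitions:

    #include <asm/atomic.h>
    #include <linux/fs.h>
    #include <linux/spinlock.h>
    #include <linux/types.h>

    /* Illustrative only: the fd array, the fd sets and their sizes now live
     * together in one fdtable, so a later patch can swap them atomically. */
    struct fdtable_sketch {
        unsigned int max_fds;
        int max_fdset;
        struct file **fd;        /* array of open files */
        fd_set *close_on_exec;
        fd_set *open_fds;
    };

    struct files_struct_sketch {
        atomic_t count;
        spinlock_t file_lock;
        struct fdtable_sketch fdtab;    /* embedded at this stage */
        struct fdtable_sketch *fdt;     /* everyone goes through this */
    };

    /* All users now reach the table through a files_fdtable()-style
     * accessor instead of touching the fields directly. */
    #define files_fdtable_sketch(files) ((files)->fdt)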
2005-06-27  [PATCH] Update cfq io scheduler to time sliced design (Jens Axboe)
This updates the CFQ io scheduler to the new time sliced design (cfq v3). It provides full process fairness, while giving excellent aggregate system throughput even for many competing processes. It supports io priorities, either inherited from the cpu nice value or set directly with the ioprio_get/set syscalls. The latter closely mimic set/getpriority. This import is based on my latest from -mm. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
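From user space the new calls look much like set/getpriority. A hedged sketch follows; the IOPRIO_* constants are written out by hand here and SYS_ioprio_set/SYS_ioprio_get may need the raw __NR_ numbers on older libcs, so treat those details as assumptions:

    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Hand-rolled constants; check them against the kernel headers. */
    #define IOPRIO_CLASS_SHIFT  13
    #define IOPRIO_CLASS_BE     2    /* best-effort class */
    #define IOPRIO_WHO_PROCESS  1
    #define IOPRIO_PRIO_VALUE(class, data)  (((class) << IOPRIO_CLASS_SHIFT) | (data))

    int main(void)
    {
        /* Ask for best-effort priority 7 (the lowest) for this process. */
        if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                    IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 7)) < 0)
            perror("ioprio_set");

        long prio = syscall(SYS_ioprio_get, IOPRIO_WHO_PROCESS, 0);
        if (prio < 0)
            perror("ioprio_get");
        else
            printf("class=%ld data=%ld\n", prio >> IOPRIO_CLASS_SHIFT,
                   prio & ((1L << IOPRIO_CLASS_SHIFT) - 1));
        return 0;
    }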
2005-06-25  [PATCH] sched: cleanup context switch locking (Nick Piggin)
Instead of requiring architecture code to interact with the scheduler's locking implementation, provide a couple of defines that can be used by the architecture to request runqueue unlocked context switches, and ask for interrupts to be enabled over the context switch. Also replaces the "switch_lock" used by these architectures with an oncpu flag (note, not a potentially slow bitflag). This eliminates one bus locked memory operation when context switching, and simplifies the task_running function. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-07  [PATCH] make ITIMER_REAL per-process (Roland McGrath)
POSIX requires that setitimer, getitimer, and alarm work on a per-process basis. Currently, Linux implements these for individual threads. This patch fixes these semantics for the ITIMER_REAL timer (which generates SIGALRM), making it shared by all threads in a process (thread group). Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
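To make the new semantics concrete: after this change an ITIMER_REAL armed by any thread raises SIGALRM for the whole thread group, not just for the thread that armed it. A plain user-space illustration using the standard API:

    #include <signal.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    static void on_alarm(int sig)
    {
        (void)sig;
        /* write() is async-signal-safe; printf() is not. */
        write(STDOUT_FILENO, "SIGALRM hit the process\n", 24);
    }

    int main(void)
    {
        struct sigaction sa;
        struct itimerval it;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_alarm;
        sigaction(SIGALRM, &sa, NULL);

        memset(&it, 0, sizeof(it));
        it.it_value.tv_sec = 1;             /* one shot after 1s */
        setitimer(ITIMER_REAL, &it, NULL);  /* now shared process-wide */

        pause();                            /* any thread may take the signal */
        return 0;
    }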
2005-03-07  [PATCH] posix-timers: CPU clock support for POSIX timers (Roland McGrath)
POSIX requires that when you claim _POSIX_CPUTIME and _POSIX_THREAD_CPUTIME, not only the clock_* calls but also the timer_* calls must support the thread and process CPU time clocks. This patch provides that support, building on my recent additions to support these clocks in the POSIX clock_* interfaces. This patch will not work without those changes, as well as the patch fixing the timer lock-siglock deadlock problem.

The apparently pervasive changes to posix-timers.c are simply that some fields of struct k_itimer have changed name and moved into a union. This was appropriate since the data structures required for the existing real-time timer support and for the new thread/process CPU-time timers are quite different.

The glibc patches to support CPU time clocks using the new kernel support are in http://people.redhat.com/roland/glibc/kernel-cpuclocks.patch, and that includes tests for the timer support (if you build glibc with NPTL).

From: Christoph Lameter <clameter@sgi.com>
Your patch breaks the mmtimer driver because it used k_itimer values for its own purposes. Here is a fix, defining an additional structure in k_itimer (same approach for mmtimer as for the cpu timers):

From: Roland McGrath <roland@redhat.com>
Fix bug identified by Alexander Nyberg <alexn@dsv.su.se>:

> The problem arises from code touching the union in alloc_posix_timer()
> which makes firing go non-zero. When firing is checked in
> posix_cpu_timer_set() it will be positive, causing an infinite loop.
>
> So either the below fix or, preferably, move the INIT_LIST_HEAD(x) from
> alloc_posix_timer() to somewhere later where it doesn't disturb the other
> union members.

Thanks for finding this problem. The latter is what I think is the right solution. This patch does that, and also removes some superfluous rezeroing.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
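For reference, the user-visible result is that a standard POSIX timer can now be created against the CPU-time clocks, e.g. as below (ordinary POSIX API, link with -lrt):

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    int main(void)
    {
        timer_t tid;
        struct sigevent sev;
        struct itimerspec its;

        memset(&sev, 0, sizeof(sev));
        sev.sigev_notify = SIGEV_SIGNAL;
        sev.sigev_signo = SIGPROF;    /* arbitrary choice for the example */
        sev.sigev_value.sival_ptr = &tid;

        /* Fires after the whole process has consumed 2s of CPU time. */
        if (timer_create(CLOCK_PROCESS_CPUTIME_ID, &sev, &tid) < 0) {
            perror("timer_create");
            return 1;
        }

        memset(&its, 0, sizeof(its));
        its.it_value.tv_sec = 2;
        if (timer_settime(tid, 0, &its, NULL) < 0)
            perror("timer_settime");

        /* ... burn CPU here; SIGPROF arrives once the budget is used ... */
        timer_delete(tid);
        return 0;
    }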
2005-01-20  [PATCH] fix INIT_SIGHAND warning on mips (Christoph Hellwig)
sa_handler isn't the first member of struct sigaction on mips. Use C99 initializers to avoid a compiler warning. (There don't seem to be more serious problems as mips worked with that warning for ages) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
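The underlying idiom is plain C99: name the member instead of relying on its position. A generic illustration with a made-up structure (not the actual INIT_SIGHAND macro):

    /* Hypothetical layout standing in for an arch-dependent sigaction. */
    struct sigaction_like {
        unsigned long sa_flags;      /* on some arches this comes first ... */
        void (*sa_handler)(int);     /* ... so positional init would hit sa_flags */
    };

    /* C99 designated initializer: correct no matter how members are ordered. */
    static struct sigaction_like init_action_like = {
        .sa_handler = (void (*)(int))0,    /* SIG_DFL-style default */
    };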
2005-01-04  [PATCH] move waitchld_exit from task_struct to signal_struct (Roland McGrath)
There is really no point in each task_struct having its own waitchld_exit. In the only use of it, the waitchld_exit of each thread in a group gets woken up at the same time. So, there might as well just be one wait queue for the whole thread group. This patch does that by moving the field from task_struct to signal_struct. It should have no effect on the behavior, but saves a little work and a little storage in the multithreaded case. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18  [PATCH] make rlimit settings per-process instead of per-thread (Roland McGrath)
POSIX specifies that the limit settings provided by getrlimit/setrlimit are shared by the whole process, not specific to individual threads. This patch changes the behavior of those calls to comply with POSIX. I've moved the struct rlimit array from task_struct to signal_struct, as it has the correct sharing properties. (This reduces kernel memory usage per thread in multithreaded processes by around 100/200 bytes for 32/64-bit machines respectively.)

I took a fairly minimal approach to the locking issues with the newly shared struct rlimit array. It turns out that all the code that is checking limits really just needs to look at one word at a time (one rlim_cur field, usually). It's only the few places like getrlimit itself (and fork) that require atomicity in accessing a whole struct rlimit, so I just used a spin lock for them and no locking for most of the checks. If it turns out that readers of struct rlimit need more atomicity where they are now cheap, or less overhead where they are now atomic (e.g. fork), then seqcount is certainly the right thing to use for them instead of readers using the spin lock. Though it's in signal_struct, I didn't use siglock since the access to rlimits never needs to disable irqs and doesn't overlap with other siglock uses. Instead of adding something new, I overloaded task_lock(task->group_leader) for this; it is used for other things that are not likely to happen simultaneously with limit tweaking. To me that seems preferable to adding a word, but it would be trivial (and arguably cleaner) to add a separate lock for these users (or e.g. just use seqlock, which adds two words but is optimal for readers).

Most of the changes here are just the trivial s/->rlim/->signal->rlim/.

I stumbled across what must be a long-standing bug, in reparent_to_init. It does:

    memcpy(current->rlim, init_task.rlim, sizeof(*(current->rlim)));

when surely it was intended to be:

    memcpy(current->rlim, init_task.rlim, sizeof(current->rlim));

As rlim is an array, the * in the sizeof expression gets the size of the first element, so this just changes the first limit (RLIMIT_CPU). This is for kernel threads, where it's clear that resetting all the rlimits is what you want. With that fixed, the setting of RLIMIT_FSIZE in nfsd is superfluous since it will now already have been reset to RLIM_INFINITY.

The other subtlety is removing:

    tsk->rlim[RLIMIT_CPU].rlim_cur = RLIM_INFINITY;

in exit_notify, which was there to avoid a race signalling during self-reaping exit. As the limit is now shared, a dying thread should not change it for others. Instead, I avoid that race by checking current->state before the RLIMIT_CPU check. (Adding one new conditional in that path is now required one way or another, since if not for this check there would also be a new race with self-reaping exit later on clearing current->signal that would have to be checked for.)

The one loose end left by this patch is with process accounting. do_acct_process temporarily resets the RLIMIT_FSIZE limit while writing the accounting record. I left this as it was, but it is now changing a limit that might be shared by other threads still running. I left this in a dubious state because it seems to me that process accounting is already in a generally dubious state when it comes to NPTL threads: I would think you would want one record per process, with aggregate data about all threads that ever lived in it, not a separate record for each thread. I don't use process accounting myself, but if anyone is interested in testing it out I could provide a patch to change it this way.

One final note: this does not achieve 100% POSIX compliance with regard to rlimits. POSIX specifies that RLIMIT_CPU refers to a whole process in aggregate, not to each individual thread. I will provide patches later on to achieve that change, assuming this patch goes in first.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
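The sizeof pitfall is easy to reproduce in isolation; a stand-alone illustration (rlimit_like is a made-up stand-in for the rlim array):

    #include <stdio.h>

    struct rlimit_like { unsigned long rlim_cur, rlim_max; };

    int main(void)
    {
        struct rlimit_like rlim[16];

        /* sizeof(*(rlim)) is the size of ONE element ... */
        printf("sizeof(*(rlim)) = %zu\n", sizeof(*(rlim)));    /* 16 on LP64 */
        /* ... while sizeof(rlim) covers the whole array. */
        printf("sizeof(rlim)    = %zu\n", sizeof(rlim));       /* 256 on LP64 */
        return 0;
    }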
2004-06-30  [PATCH] sparse: NULL vs 0 - the rest of it (Mika Kukkonen)
2004-04-11  [PATCH] fix posix-timers to have proper per-process scope (Andrew Morton)
From: Roland McGrath <roland@redhat.com>

The posix-timers implementation associates timers with the creating thread and destroys timers when their creator thread dies. POSIX clearly specifies that these timers are per-process, and a timer should not be torn down when the thread that created it exits. I hope there won't be any controversy on what the correct semantics are here, since POSIX is clear and the Linux feature is called "posix-timers".

The attached program, built with NPTL and -lrt -lpthread, demonstrates the bug. The program is correct by POSIX, but fails on Linux. Note that until just the other day, NPTL had a trivial bug that always disabled its use of kernel timer syscalls (check strace for lack of timer_create/SYS_259). So unless you have built your own NPTL libs very recently, you probably won't see the kernel calls actually used by this program.

Also attached is my patch to fix this. It (you guessed it) moves the posix_timers field from task_struct to signal_struct. Access is now governed by the siglock instead of the task lock. exit_itimers is called from __exit_signal, i.e. only on the death of the last thread in the group, rather than from do_exit for every thread. Timers' it_process fields store the group leader's pointer, which won't die. For the case of SIGEV_THREAD_ID, I hold a ref on the task_struct for it_process to stay robust in case the target thread dies; the ref is released and the dangling pointer cleared when the timer fires and the target thread is dead. (This should only come up in a buggy user program, so no one cares exactly how the kernel handles that case. But I think what I did is robust and sensible.)

    /* Test for bogus per-thread deletion of timers. */

    #include <stdio.h>
    #include <error.h>
    #include <time.h>
    #include <signal.h>
    #include <stdint.h>
    #include <sys/time.h>
    #include <sys/resource.h>
    #include <unistd.h>
    #include <pthread.h>

    /* Creating timers in another thread should work too. */
    static void *do_timer_create(void *arg)
    {
        struct sigevent *const sigev = arg;
        timer_t *const timerId = sigev->sigev_value.sival_ptr;
        if (timer_create(CLOCK_REALTIME, sigev, timerId) < 0) {
            perror("timer_create");
            return NULL;
        }
        return timerId;
    }

    int main(void)
    {
        int i, res;
        timer_t timerId;
        struct itimerspec itval;
        struct sigevent sigev;

        itval.it_interval.tv_sec = 2;
        itval.it_interval.tv_nsec = 0;
        itval.it_value.tv_sec = 2;
        itval.it_value.tv_nsec = 0;

        sigev.sigev_notify = SIGEV_SIGNAL;
        sigev.sigev_signo = SIGALRM;
        sigev.sigev_value.sival_ptr = (void *)&timerId;

        for (i = 0; i < 100; i++) {
            printf("cnt = %d\n", i);

            pthread_t thr;
            res = pthread_create(&thr, NULL, &do_timer_create, &sigev);
            if (res) {
                error(0, res, "pthread_create");
                continue;
            }

            void *val;
            res = pthread_join(thr, &val);
            if (res) {
                error(0, res, "pthread_join");
                continue;
            }
            if (val == NULL)
                continue;

            res = timer_settime(timerId, 0, &itval, NULL);
            if (res < 0)
                perror("timer_settime");

            res = timer_delete(timerId);
            if (res < 0)
                perror("timer_delete");
        }

        return 0;
    }
2004-02-18  [PATCH] NGROUPS 2.6.2rc2 + fixups (Andrew Morton)
From: Tim Hockin <thockin@sun.com>, Neil Brown <neilb@cse.unsw.edu.au>, me

New groups infrastructure. task->groups and task->ngroups are replaced by task->group_info. group_info is a refcounted, dynamic struct with an array of pages. This allows for large numbers of groups. The current limit of 32 groups has been raised to 64k groups. It can be raised further by changing the NGROUPS_MAX constant in limits.h.
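The shape of such a structure is roughly the following; this is a guess for illustration only, not the kernel's actual group_info definition:

    #include <asm/atomic.h>
    #include <linux/types.h>

    #define NGROUPS_SMALL_SKETCH  32   /* kept inline before spilling to blocks */

    /* Illustrative only: a refcounted container whose gid array is split
     * into per-page blocks, so huge group lists avoid one big allocation. */
    struct group_info_sketch {
        int ngroups;
        atomic_t usage;                          /* shared by all threads that hold it */
        gid_t small_block[NGROUPS_SMALL_SKETCH];
        int nblocks;
        gid_t *blocks[0];                        /* block pointers follow the struct */
    };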
2004-02-03  [PATCH] initialise cpu_vm_mask in init_mm (Andrew Morton)
From: Anton Blanchard <anton@samba.org>

Some architectures use cpu_vm_mask to optimise TLB flushes. On ppc64 we are now using a common flush infrastructure that handles both userspace and kernelspace (vmalloc) pages. In order to avoid triggering this optimisation we need to mark the init mm as having scheduled on all cpus. Things currently work by luck (we check for the cpu only having run on the local cpu, and the field is initialised to 0), but it would be safer to initialise it to CPU_MASK_ALL.
2003-08-18  [PATCH] cpumask_t: allow more than BITS_PER_LONG CPUs (Andrew Morton)
From: William Lee Irwin III <wli@holomorphy.com>

Contributions from:
Jan Dittmer <jdittmer@sfhq.hn.org>
Arnd Bergmann <arnd@arndb.de>
"Bryan O'Sullivan" <bos@serpentine.com>
"David S. Miller" <davem@redhat.com>
Badari Pulavarty <pbadari@us.ibm.com>
"Martin J. Bligh" <mbligh@aracnet.com>
Zwane Mwaikambo <zwane@linuxpower.ca>

It has been tested on x86, sparc64, x86_64, ia64 (I think), ppc and ppc64.

cpumask_t enables systems with NR_CPUS > BITS_PER_LONG to utilize all their cpus by creating an abstract data type dedicated to representing cpu bitmasks, similar to fd sets from userspace, and sweeping the appropriate code to update callers to the access API. The fd set-like structure is according to Linus' own suggestion; the macro calling convention to ambiguate representations with minimal code impact is my own invention.

Specifically, a new set of inline functions for manipulating arbitrary-width bitmaps is introduced with a relatively simple implementation, in tandem with a new data type representing bitmaps of width NR_CPUS, cpumask_t, whose accessor functions are defined in terms of the bitmap manipulation inlines. This bitmap ADT found an additional use in i386 arch code handling sparse physical APIC ID's, which was convenient to use in this case as the accounting structure was required to be wider to accommodate the physids consumed by larger numbers of cpus.

For the sake of simplicity and low code impact, these cpu bitmasks are passed primarily by value; however, an additional set of accessors along with an auxiliary data type with const call-by-reference semantics is provided to address performance concerns raised in connection with very large systems, such as SGI's larger models, where copying and call-by-value overhead would be prohibitive. Few (if any) users of the call-by-reference API are immediately introduced.

Also, in order to avoid calling convention overhead on architectures where structures are required to be passed by value, NR_CPUS <= BITS_PER_LONG is special-cased so that cpumask_t falls back to an unsigned long and the accessors perform the usual bit twiddling on unsigned longs as opposed to arrays thereof. Audits were done with the structure overhead in place, restoring this special-casing only afterward so as to ensure a more complete API conversion while undergoing the majority of its end-user exposure in -mm. More -mm's were shipped after its restoration to be sure that was tested, too.

The immediate users of this functionality are Sun sparc64 systems, SGI mips64 and ia64 systems, and IBM ia32, ppc64, and s390 systems. Of these, only the ppc64 machines needing the functionality have yet to be released; all others have had systems requiring it for full functionality for at least 6 months, and in some cases since the initial Linux port to the affected architecture.
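The NR_CPUS <= BITS_PER_LONG special case it mentions can be sketched in a few lines (illustrative only, assuming 64-bit longs; not the kernel's actual cpumask.h):

    #define NR_CPUS_SKETCH  128    /* pretend configuration */

    #if NR_CPUS_SKETCH <= 64       /* stand-in for NR_CPUS <= BITS_PER_LONG */
    /* Small systems: a plain word, cheap to pass by value. */
    typedef unsigned long cpumask_sketch_t;
    #define cpu_set_sketch(cpu, mask)  ((mask) |= 1UL << (cpu))
    #else
    /* Large systems: an array of longs, much like an fd_set. */
    typedef struct {
        unsigned long bits[(NR_CPUS_SKETCH + 63) / 64];
    } cpumask_sketch_t;
    #define cpu_set_sketch(cpu, mask) \
        ((mask).bits[(cpu) / 64] |= 1UL << ((cpu) % 64))
    #endif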
2003-06-02  [PATCH] preallocate signal queue resource - Posix timers (Jim Houston)
This adds a new interface to kernel/signal.c which allows signals to be sent using preallocated sigqueue structures. It also modifies kernel/posix-timers.c to use this interface.

The current timer code may fail to deliver a timer expiry signal if there are no sigqueue structures available at the time of the expiry. The Posix specification is clear that the signal queuing resource should be allocated at timer_create time. This allows the error to be returned to the application rather than silently losing the signal. This patch does not change the sigqueue structure allocation policy. I hope to revisit that in another patch.

Here is the definition for the new interface:

struct sigqueue *sigqueue_alloc(void)
    Preallocate a sigqueue structure for use with the functions described below.

void sigqueue_free(struct sigqueue *q)
    Free a preallocated sigqueue structure. If the sigqueue structure being freed is still queued, it will be removed from the queue. I currently leave the signal pending. It may be delivered without the siginfo structure.

int send_sigqueue(int sig, struct sigqueue *q, struct task_struct *p)
    This function is equivalent to send_sig_info(). It queues a signal to the specified thread using the supplied sigqueue structure. The caller is expected to fill in the siginfo_t which is part of the sigqueue structure.

int send_group_sigqueue(int sig, struct sigqueue *q, struct task_struct *p)
    This function is equivalent to send_group_sig_info(). It queues the signal to a process, allowing the system to select which thread will receive the signal in a multi-threaded process. Again, the sigqueue structure is used to queue the signal.

Both send_sigqueue() and send_group_sigqueue() return 0 if the signal is queued. They return 1 if the signal was not queued because the process is ignoring the signal.

Both versions include code to increment the si_overrun count if the sigqueue entry is for a Posix timer and they are called while the sigqueue entry is still queued. Yes, I know that the current code doesn't rearm the timer until the signal is delivered. Having this extra bit of code doesn't do any harm, and I plan to use it.

These routines do not check if there already is a legacy (non-realtime) signal pending. They always queue the signal. This requires that collect_signal() always checks if there is another matching siginfo before clearing the signal bit.
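Putting the interface together, the intended flow on the timer side is roughly the following sketch; only sigqueue_alloc(), sigqueue_free(), send_sigqueue() and send_group_sigqueue() are the interface described above, the surrounding structure and helpers are invented for illustration:

    #include <linux/errno.h>
    #include <linux/sched.h>
    #include <linux/signal.h>

    struct k_itimer_sketch {
        struct sigqueue *sigq;        /* preallocated at creation time */
        struct task_struct *target;
        int signo;
    };

    static int timer_create_sketch(struct k_itimer_sketch *t)
    {
        t->sigq = sigqueue_alloc();   /* fail here, at timer_create time ... */
        if (!t->sigq)
            return -EAGAIN;           /* ... so the application sees the error */
        return 0;
    }

    static void timer_expire_sketch(struct k_itimer_sketch *t)
    {
        /* Queue to the whole process; returns 1 if the signal is ignored,
         * and bumps si_overrun if the entry was still queued. */
        send_group_sigqueue(t->signo, t->sigq, t->target);
    }

    static void timer_delete_sketch(struct k_itimer_sketch *t)
    {
        sigqueue_free(t->sigq);       /* also dequeues it if still pending */
    }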
2003-05-25  [PATCH] Fix dcache_lock/tasklist_lock ranking bug (Andrew Morton)
__unhash_process acquires the dcache_lock while holding the tasklist_lock for writing. This can deadlock. Additionally, fs/proc/base.c incorrectly assumed that p->pid would be set to 0 during release_task. The patch fixes that by adding a new spinlock to the task structure and fixing all references to (!p->pid). The alternative to the new spinlock would be to hold dcache_lock around __unhash_process.

- fs/proc/base.c assumed that p->pid is reset to 0 during exit. This is not the case anymore. I now look at the count of the pid structure for PIDTYPE_PID.
- de_thread now tested - as broken as it was before: open handles to /proc/<pid> are either stale or invalid after an exec of a nptl process, if the exec was called from a secondary thread.
- a few lock_kernels removed - that part of /proc doesn't need it.
- additional instances of 'if(current->pid)' replaced with pid_alive.
2003-04-12  [PATCH] convert file_lock to a spinlock (Andrew Morton)
Time to write a 2M file, one byte at a time:

Before:
  1.09s user 4.92s system  99% cpu 6.014 total
  0.74s user 5.28s system  99% cpu 6.023 total
  1.03s user 4.97s system 100% cpu 5.991 total

After:
  0.79s user 5.17s system  99% cpu 5.993 total
  0.79s user 5.17s system 100% cpu 5.957 total
  0.84s user 5.11s system 100% cpu 5.942 total
2003-02-20  Fix x86 "switch_to()" to properly set the previous task information (Linus Torvalds)
This is needed to keep track of process usage counts correctly and efficiently.
2003-02-17  [PATCH] POSIX clocks & timers (George Anzinger)
This is version 23 or so of the POSIX timer code.

Internal changelog:

- Changed the signals code to match the new order of things. Also the new xtime_lock code needed to be picked up. It made some things a lot simpler.
- Fixed a spin lock hand-off problem in locking timers (thanks to Randy).
- Fixed nanosleep to test for out-of-bound nanoseconds (thanks to Julie).
- Fixed a couple of id deallocation bugs that left old ids laying around (hey, I get this one).
- This version has a new timer id manager. Andrew Morton suggested elimination of recursion (done) and I added code to allow it to release unused nodes. The prior version only released the leaf nodes. (The id manager uses radix tree type nodes.) Also added is a reuse count so ids will not repeat for at least 256 alloc/free cycles.
- The changes for the new sys_call restart now allow one restart function to handle both nanosleep and clock_nanosleep. Saves a bit of code, nice.
- All the requested changes and Lindent too :).
- I also broke clock_nanosleep() apart much the same way nanosleep() was with the 2.5.50-bk5 changes.

TIMER STORMS

The POSIX clocks and timers code prevents "timer storms" by not putting repeating timers back in the timer list until the signal is delivered for the prior expiry. Timer events missed by this delay are accounted for in the timer overrun count. The net result is MUCH lower system overhead while presenting the same info to the user as would be the case if an interrupt and timer processing were required for each increment in the overrun count.
2003-02-06  Split up "struct signal_struct" into "signal" and "sighand" parts. (Linus Torvalds)
This is required to make the old LinuxThread semantics work together with the fixed-for-POSIX full signal sharing. A traditional CLONE_SIGHAND thread (LinuxThread) will not see any other shared signal state, while a new-style CLONE_THREAD thread will share all of it. This way the two methods don't confuse each other.
2002-12-20  [PATCH] Fix CPU bitmask truncation (William Lee Irwin III)
Fix task->cpus_allowed bitmask truncations on 64-bit architectures. Originally by Bjorn Helgaas for 2.4.x.
2002-09-28  [PATCH] atomic-thread-signals (Ingo Molnar)
Avoid racing on signal delivery with thread signal blocking in thread groups. The method to do this is to eliminate the per-thread sigmask_lock, and use the per-group (per 'process') siglock for all signal related activities. This immensely simplified some of the locking interactions within signal.c, and enabled the fixing of the above category of signal delivery races. This became possible due to the former thread-signal patch, which made siglock an irq-safe thing. (it used to be a process-context-only spinlock.) And this is even a speedup for non-threaded applications: only one lock is used. I fixed all places within the kernel except the non-x86 arch sections. Even for them the transition is very straightforward, in almost every case the following is sufficient in arch/*/kernel/signal.c: :1,$s/->sigmask_lock/->sig->siglock/g
2002-09-22  [PATCH] pidhash cleanups, tgid-2.5.38-F3 (Ingo Molnar)
This does the following things:

- removes the ->thread_group list and uses a new PIDTYPE_TGID pid class to handle thread groups. This cleans up lots of code in signal.c and elsewhere.
- fixes sys_execve() if a non-leader thread calls it. (2.5.38 crashed in this case.)
- renames list_for_each_noprefetch to __list_for_each.
- cleans up delayed-leader parent notification.
- introduces link_pid() to optimize PIDTYPE_TGID installation in the thread-group case.

I've tested the patch with a number of threaded and non-threaded workloads, and it works just fine. Compiles & boots on UP and SMP x86. The session/pgrp bugs reported to lkml are probably still open, they are the next on my todo - now that we have a clean pidhash architecture they should be easier to fix.
2002-09-13  [PATCH] Use a sync iocb for generic_file_read (Andrew Morton)
This adds support for synchronous iocbs and converts generic_file_read to use a sync iocb to call into generic_file_aio_read. The tests I've run with lmbench on a piii-866 showed no difference in file re-read speed when forced to use a completion path via aio_complete and an -EIOCBQUEUED return from generic_file_aio_read -- people with slower machines might want to test this to see if we can tune it any better. Also, a bug fix to correct a missing call into the aio code from the fork code is present. This patch sets things up for making generic_file_aio_read actually asynchronous.
2002-09-12  [PATCH] sys_exit() threading improvements, BK-curr (Ingo Molnar)
This implements the 'keep the initial thread around until every thread in the group exits' concept in a different, less intrusive way, along your suggestions. There is no exit_done completion handling anymore, freeing of the task is still done by wait4(). This has the following side-effect: detached threads/processes can only be started within a thread group, not in a standalone way. (This also fixes the bugs introduced by the ->exit_done code, which made it possible for a zombie task to be reactivated.) I've introduced the p->group_leader pointer, which can/will be used for other purposes in the future as well - since from now on the thread group leader is always existent. Right now it's used to notify the parent of the thread group leader from the last non-leader thread that exits [if the thread group leader is a zombie already].
2002-09-08  [PATCH] Re: pinpointed: PANIC caused by dequeue_signal() in current Linus (Ingo Molnar)
This fixes the bootup crash. There were two initialization bugs:

- INIT_SIGNAL needs to set shared_pending.
- exec() needs to set up newsig properly.

The second one caused the crash Anton saw.
2002-08-18  [PATCH] O(1) sys_exit(), threading, scalable-exit-2.5.31-B4 (Ingo Molnar)
The attached patch updates a number of items:

- adds cleanups suggested by Christoph Hellwig: needed unlikely() statements, a superfluous #define and line length problems.
- splits up the global ptrace list into per-task ptrace lists. This was pretty straightforward, and it makes the worst-case exit() latency O(nr_children).

The per-task ptrace lists unearthed a bug that the previous code did not take care of: tasks on the ptrace list have to be correctly reparented as well. This patch passed my stresstests as well.
2002-08-12  [PATCH] designated initializers for include/linux (Rusty Russell)
These are the completely generic bits (linux/init_task.h and linux/wait.h).

From: Art Haas <ahaas@neosoft.com>
Here are the latest diffs for the files in include/linux. Patches are against 2.5.31.
2002-07-23  [PATCH] scheduler fixes (Ingo Molnar)
- introduce new type of context-switch locking, this is a must-have for ia64 and sparc64.
- load_balance() bug noticed by Scott Rhine and myself: scan the whole list to find imbalance number of tasks, not just the tail of the list.
- sched_yield() fix: use current->array not rq->active.
2002-05-17  [PATCH] clean up maximum priorities (Robert Love)
This patch further cleans up and separates the code in an effort to allow setting (a) a larger maximum real-time priority than default and (b) a maximum kernel RT priority that is separate than the maximum priority exported to user-space.
2002-03-14  Cleanup: use list macros for task list (Linus Torvalds)
2002-03-14  [PATCH] wait4() WIFSTOPPED starvation fix #1/2 (David Howells)
This patch (#1) just converts the task_struct to use struct list_head rather than direct pointers for maintaining the children list.
2002-02-23  New, less intrusive and faster migration method (Ingo Molnar)
/*
 * This is how migration works:
 *
 * 1) we queue a migration_req_t structure in the source CPU's
 *    runqueue and wake up that CPU's migration thread.
 * 2) we down() the locked semaphore => thread blocks.
 * 3) migration thread wakes up (implicitly it forces the migrated
 *    thread off the CPU)
 * 4) it gets the migration request and checks whether the migrated
 *    task is still in the wrong runqueue.
 * 5) if it's in the wrong runqueue then the migration thread removes
 *    it and puts it into the right queue.
 * 6) migration thread up()s the semaphore.
 * 7) we wake up and the migration is done.
 */
2002-02-21  Cleanups, speedups and fixes. Added support for non-current set_cpus_allowed(). (Ingo Molnar)
2002-02-11  Merge to the -K3 scheduler. (Ingo Molnar)
2002-02-06  [PATCH] thread information block (David Howells)
Syscall latency improvement.

* There's now an asm/thread_info.h header file with the basic structure def and asm offsets in it.
* There's now a linux/thread_info.h header file which includes the asm version and wraps some bitops calls to make convenience functions for accessing the low-level flags.
* The task_struct has had some fields removed (and some flags), and has acquired a pointer to the thread_info struct.
* task_structs are now allocated on slabs in kernel/fork.c, whereas thread_info structs are allocated at the bottom of the stack pages.
* Some more convenience functions are provided at the end of linux/sched.h to access flags in other tasks (these are here because they need to access the task_struct).
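Placing thread_info at the bottom of the stack pages is what makes the classic pointer-masking trick possible; a hypothetical sketch assuming an 8 KiB, power-of-two-aligned stack and i386 (the real implementation differs in detail):

    #define THREAD_SIZE_SKETCH  8192UL    /* two pages, power-of-two aligned */

    struct task_struct;                   /* the slab-allocated half */

    struct thread_info_sketch {
        struct task_struct *task;         /* back-pointer to the task_struct */
        unsigned long flags;              /* low-level flags (need_resched, ...) */
        /* ... asm-visible fields ... */
    };

    /* Mask the stack pointer down to the start of the stack allocation,
     * which is where thread_info lives. */
    static inline struct thread_info_sketch *current_thread_info_sketch(void)
    {
        unsigned long sp;

        __asm__("movl %%esp, %0" : "=r" (sp));
        return (struct thread_info_sketch *)(sp & ~(THREAD_SIZE_SKETCH - 1));
    }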
2002-02-05  v2.5.2.6 -> v2.5.3 (Linus Torvalds)
- Doug Ledford: i810 audio driver update
- Evgeniy Polyakov: update various SCSI drivers to new locking
- David Howells: syscall latency improvement, try 2
- Francois Romieu: dscc4 driver update
- Patrick Mochel: driver model fixes
- Andrew Morton: clean up a few details in ext3 inode initialization
- Pete Wyckoff: make x86 machine check print out right address..
- Hans Reiser: reiserfs update
- Richard Gooch: devfs update
- Greg KH: USB updates
- Dave Jones: PNPBIOS
- Nathan Scott: extended attributes
- Corey Minyard: clean up zlib duplication (triplication..)
2002-02-05  v2.5.2.5 -> v2.5.2.6 (Linus Torvalds)
- Asit Mallick: mtrr update
- Patrick Mochel: split up kernel/device.c into drivers/base
- Mikael Pettersson/Al Viro: fix missing in-core inode initialization in ext2 introduced by Al's inode trimming
- David Miller: sparc and network updates
- Frank Davis: firewire video mmap page remapping fix
- me: fix configure help scripts to fix breakage noticed by Dave Jones
- Greg KH: USB updates
- Kai Germaschewski: ISDN fixes, Config.help entries
- Douglas Gilbert: SCSI doc update
- Ingo Molnar: x86 taskswitch optimizations, scheduler updates
- Mikael Pettersson: make APIC work on old external setups
- Al Viro: more inode trimming
2002-02-05  v2.5.2.1 -> v2.5.2.1.1 (Linus Torvalds)
- David Howells: abstract out "current->need_resched" as "need_resched()"
- Frank Davis: ide-tape update for bio
- various: header file fixups
- Jens Axboe: fix up bio/ide/highmem issues
- Kai Germaschewski: ISDN update
- Tim Waugh: parport update
- Patrik Mochel: initcall update
- Greg KH: USB and Compaq PCI hotplug updates
2002-02-05  v2.5.2 -> v2.5.2.1 (Linus Torvalds)
- Al Viro: fix up silly problem in swapfile filp cleanups in 2.5.2
- Tachino Nobuhiro: fix another error return for swapfile filp code
- Robert Love: merge some of Ingo's scheduler fixes
- David Miller: networking, sparc and some scsi driver fixes
- Tim Waugh: parport update
- OGAWA Hirofumi: fatfs cleanups and bugfixes
- Roland Dreier: fix vsscanf buglets.
- Ben LaHaise: include file cleanup
- Andre Hedrick: IDE taskfile update