| author | Ingo Molnar <mingo@elte.hu> | 2003-02-05 20:49:30 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@home.transmeta.com> | 2003-02-05 20:49:30 -0800 |
| commit | ebf5ebe31d2cd1e0f13e5b65deb0b4af7afd9dc1 (patch) | |
| tree | b6af9aa99995048ac5d1731dea69610086c3b8d3 /include | |
| parent | 44a5a59c0b5d34ff01c685be87894f24132a8328 (diff) | |
[PATCH] signal-fixes-2.5.59-A4
this is the current threading patchset, which accumulated during the
past two weeks. It consists mostly of a big set of changes from Roland, to
make threaded signals work. There were still tons of testcases and
boundary conditions (mostly in the signal/exit/ptrace area) that we did
not handle correctly.
Roland's thread-signal semantics/behavior/ptrace fixes:
- fix signal delivery race with do_exit() => signals are re-queued to the
'process' if do_exit() finds pending unhandled ones. This prevents
signals getting lost upon thread-sys_exit().
- a non-main thread has died on one processor and gone to TASK_ZOMBIE,
but before it's gotten to release_task a sys_wait4 on the other
processor reaps it. It's only because it's ptraced that this gets
through eligible_child. Somewhere in there the main thread is also
dying so it reparents the child thread to hit that case. This means
that there is a race where P might be totally invalid.
- forget_original_parent is not doing the right thing when the group
leader dies, i.e. reparenting threads to init when there is a zombie
group leader. Perhaps it doesn't matter for any practical purpose
without ptrace, though it makes for ppid=1 for each thread in core
dumps, which looks funny. Incidentally, SIGCHLD here really should be
p->exit_signal.
- one of the gdb tests makes a questionable assumption about what kill
will do when it has some threads stopped by ptrace and others running.
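a toy user-space sketch of the re-queueing idea from the first item above: signals still pending on an exiting thread are moved to a group-wide set instead of being dropped. The names (toy_thread, toy_group, toy_sigbit, toy_requeue_on_exit) are made up for illustration and are not the kernel's. Real pending queues also carry siginfo, which this bitmask model ignores.

```c
#include <stdint.h>

/* Hypothetical toy model: one bit per signal number (1..63). */
struct toy_thread { uint64_t pending; };        /* signals pending on one thread */
struct toy_group  { uint64_t shared_pending; }; /* signals pending on the 'process' */

/* Bit that represents signal number sig in the toy masks. */
uint64_t toy_sigbit(int sig)
{
	return (uint64_t)1 << sig;
}

/*
 * Move every signal still pending on the exiting thread to the
 * group-wide set, so another thread can still take it.
 * Returns the number of signals that were moved.
 */
int toy_requeue_on_exit(struct toy_thread *t, struct toy_group *g)
{
	int moved = 0;

	for (int sig = 1; sig < 64; sig++) {
		if (t->pending & toy_sigbit(sig)) {
			g->shared_pending |= toy_sigbit(sig);
			moved++;
		}
	}
	t->pending = 0; /* nothing is left behind on the dying thread */
	return moved;
}
```

in the real code the two queues would have to be updated under the signal lock. The toy version leaves locking out entirely.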
exit races:
1. Processor A is in sys_wait4 case TASK_STOPPED considering task P.
Processor B is about to resume P and then switch to it.
While A is inside that case block, B starts running P and it clears
P->exit_code, or takes a pending fatal signal and sets it to a new
value. Depending on the interleaving, the possible failure modes are:
a. A gets to its put_user after B has cleared P->exit_code
=> returns with WIFSTOPPED, WSTOPSIG==0
b. A gets to its put_user after B has set P->exit_code anew
=> returns with e.g. WIFSTOPPED, WSTOPSIG==SIGKILL
A can spend an arbitrarily long time in that case block, because
there's getrusage and put_user that can take page faults, and
write_lock'ing of the tasklist_lock that can block. But even if it's
short the race is there in principle.
2. This is new with NPTL, i.e. CLONE_THREAD.
Two processors A and B are both in sys_wait4 case TASK_STOPPED
considering task P.
Both get through their tests and fetches of P->exit_code before either
gets to P->exit_code = 0. => two threads return the same pid from
waitpid.
In other interleavings where one processor gets to its put_user after
the other has cleared P->exit_code, it's like case 1(a).
3. SMP races with stop/cont signals
First, take:
kill(pid, SIGSTOP);
kill(pid, SIGCONT);
or:
kill(pid, SIGSTOP);
kill(pid, SIGKILL);
It's possible for this to leave the process stopped with a pending
SIGCONT/SIGKILL. That's a state that should never be possible.
Moreover, kill(pid, SIGKILL) without any repetition should always be
enough to kill a process. (Likewise SIGCONT when you know it's
sequenced after the last stop signal, must be sufficient to resume a
process.)
4. take:
kill(pid, SIGKILL); // or any fatal signal
kill(pid, SIGCONT); // or SIGKILL
it's possible for this to cause pid to be reaped with status 0
instead of its true termination status. The equivalent scenario
happens when the process being killed is in an _exit call or a
trap-induced fatal signal before the kills.
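a small user-space check for scenario 3, assuming a POSIX system with fork, kill and waitpid. It sends SIGSTOP immediately followed by SIGKILL to a child and reports how the child terminated. The helper name stop_then_kill is made up for this sketch. A single run does not prove the race is absent, since the bad interleaving is timing dependent.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that just sleeps, then send it SIGSTOP followed
 * directly by SIGKILL. Returns the signal that terminated the
 * child, or -1 on error or if it did not die from a signal.
 */
int stop_then_kill(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		for (;;)
			pause(); /* child: wait for signals forever */
	}

	if (kill(pid, SIGSTOP) < 0 || kill(pid, SIGKILL) < 0)
		return -1;

	/* One SIGKILL should be enough, so this wait should return. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

the expected result is SIGKILL. If the child were left stopped with the SIGKILL still pending, as described above, the waitpid would presumably never return. A status of 0 would correspond to the lost termination status from scenario 4.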
plus i've done stability fixes for bugs that popped up during
beta-testing, and minor tidying of Roland's changes:
- a rare tasklist corruption during exec, causing some very spurious and
colorful crashes.
- a copy_process()-related dereference of an already freed thread structure
if hit with a SIGKILL at the wrong moment.
- SMP spinlock deadlocks in the signal code
this patchset has been tested quite well in the 2.4 backport of the
threading changes - and i've done some stresstesting on 2.5.59 SMP,
plus an x86 UP testcompile + testboot.
Diffstat (limited to 'include')
| -rw-r--r-- | include/linux/sched.h | 10 |
1 files changed, 8 insertions, 2 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 648d4d3ace3c..d41f7a24fc14 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -235,6 +235,9 @@ struct signal_struct {
 	int group_exit;
 	int group_exit_code;
 	struct task_struct *group_exit_task;
+
+	/* thread group stop support, overloads group_exit_code too */
+	int group_stop_count;
 };
 
 /*
@@ -508,7 +511,6 @@ extern int in_egroup_p(gid_t);
 extern void proc_caches_init(void);
 extern void flush_signals(struct task_struct *);
 extern void flush_signal_handlers(struct task_struct *);
-extern void sig_exit(int, int, struct siginfo *);
 extern int dequeue_signal(sigset_t *mask, siginfo_t *info);
 extern void block_all_signals(int (*notifier)(void *priv), void *priv,
 			      sigset_t *mask);
@@ -525,7 +527,7 @@ extern void do_notify_parent(struct task_struct *, int);
 extern void force_sig(int, struct task_struct *);
 extern void force_sig_specific(int, struct task_struct *);
 extern int send_sig(int, struct task_struct *, int);
-extern int __broadcast_thread_group(struct task_struct *p, int sig);
+extern void zap_other_threads(struct task_struct *p);
 extern int kill_pg(pid_t, int, int);
 extern int kill_sl(pid_t, int, int);
 extern int kill_proc(pid_t, int, int);
@@ -590,6 +592,8 @@ extern void exit_files(struct task_struct *);
 extern void exit_sighand(struct task_struct *);
 extern void __exit_sighand(struct task_struct *);
 
+extern NORET_TYPE void do_group_exit(int);
+
 extern void reparent_to_init(void);
 extern void daemonize(void);
 extern task_t *child_reaper;
@@ -762,6 +766,8 @@ static inline void cond_resched_lock(spinlock_t * lock)
 extern FASTCALL(void recalc_sigpending_tsk(struct task_struct *t));
 extern void recalc_sigpending(void);
 
+extern void signal_wake_up(struct task_struct *t, int resume_stopped);
+
 /*
  * Wrappers for p->thread_info->cpu access. No-op on UP.
  */
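the header only says that group_stop_count is for thread group stop support and that it overloads group_exit_code. One plausible reading, sketched here as a toy model and not taken from this diff: the counter holds the number of threads that still have to stop, group_exit_code carries the stop signal in the meantime, and the thread that brings the counter to zero is the one that reports the stop. All toy_* names are hypothetical, and the real code would need the appropriate locking.

```c
/* Hypothetical toy model of the two fields touched by the comment above. */
struct toy_signal {
	int group_exit_code;  /* assumed to hold the stop signal during a group stop */
	int group_stop_count; /* assumed: threads that still have to stop */
};

/* Start a group stop for nthreads threads, caused by signal signr. */
void toy_begin_group_stop(struct toy_signal *sig, int signr, int nthreads)
{
	sig->group_exit_code = signr;
	sig->group_stop_count = nthreads;
}

/*
 * Called by each thread as it stops. Returns 1 for the last thread,
 * which in this model is the one that would notify the parent, and
 * 0 for every earlier thread.
 */
int toy_thread_stopped(struct toy_signal *sig)
{
	return --sig->group_stop_count == 0;
}
```

with three threads, the first two calls to toy_thread_stopped return 0 and the third returns 1, so the stop is reported once for the whole group and not once per thread.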
