author Roland McGrath <roland@redhat.com> 2005-01-04 05:37:48 -0800
committer Linus Torvalds <torvalds@ppc970.osdl.org> 2005-01-04 05:37:48 -0800
commit 0b4eff5d61471075a27b84b8a5348bca6d37d8e0
tree be37199476568d8184c6fc037da0542e0ccd1d33 /include/linux
parent 8baf8d2a74f77fc52a868b2f60c09789a6b5463e
[PATCH] fix stop signal race
The `sig_avoid_stop_race' checks fail to catch a related race scenario that can happen. I don't think this has been seen in nature, but it could happen in the same sorts of situations where the observed problems come up that those checks work around. This patch takes a different approach to catching this race condition. The new approach plugs the hole, and I think is also cleaner.

The issue is a race between one CPU processing a stop signal while another CPU processes a SIGCONT or SIGKILL. There is a window in stop-signal processing where the siglock must be released. If a SIGCONT or SIGKILL comes along here on another CPU, then the stop signal in the midst of being processed needs to be discarded rather than having the stop take place after the SIGCONT or SIGKILL has been generated. The existing workaround checks for this case explicitly by looking for a pending SIGCONT or SIGKILL after reacquiring the lock.

However, there is another problem related to the same race issue. In the window where the processing of the stop signal has released the siglock, the stop signal is not represented in the pending set any more, but it is still "pending" and not "delivered" in POSIX terms. The SIGCONT coming in this window is required to clear all pending stop signals. But, if a stop signal has been dequeued but not yet processed, the SIGCONT generation will fail to clear it (in handle_stop_signal). Likewise, a SIGKILL coming here should prevent the stop processing and make the thread die immediately instead.

The `sig_avoid_stop_race' code checks for this by examining the pending set to see if SIGCONT or SIGKILL is in it. But this fails to handle the case where another CPU running another thread in the same process has already dequeued the signal (so it no longer can be found in the pending set). We must catch this as well, so that the same problems do not arise when another thread on another CPU acted real fast.
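The gap described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the `toy_*` names and the single-word bitmask standing in for the shared pending set are hypothetical, and locking is reduced to the order of the calls. It shows that a check based on the pending set notices a SIGCONT that is still queued, but misses one that another thread has already dequeued.

```c
#include <assert.h>
#include <signal.h>   /* SIGCONT, SIGKILL, SIGTSTP */
#include <stdbool.h>

/* Hypothetical toy model: one bit per signal number in a shared pending set. */
struct toy_pending {
    unsigned long long bits;
};

void toy_queue(struct toy_pending *p, int sig)
{
    assert(sig > 0 && sig < 64);
    p->bits |= 1ULL << sig;
}

/* Remove sig from the pending set; report whether it was there. */
bool toy_dequeue(struct toy_pending *p, int sig)
{
    assert(sig > 0 && sig < 64);
    bool was_pending = (p->bits & (1ULL << sig)) != 0;
    p->bits &= ~(1ULL << sig);
    return was_pending;
}

/* Pending-set style check: after retaking the lock, look for a
 * SIGCONT or SIGKILL that is still queued. */
bool toy_old_check_sees_race(const struct toy_pending *p)
{
    return (p->bits & ((1ULL << SIGCONT) | (1ULL << SIGKILL))) != 0;
}

/* Replays the window. Returns true if the check notices the SIGCONT. */
bool toy_scenario(bool other_thread_dequeues_first)
{
    struct toy_pending shared = { 0 };

    toy_queue(&shared, SIGTSTP);
    toy_dequeue(&shared, SIGTSTP);      /* CPU A takes the stop, drops the lock */
    toy_queue(&shared, SIGCONT);        /* CPU B generates SIGCONT in the window */
    if (other_thread_dequeues_first)
        toy_dequeue(&shared, SIGCONT);  /* another thread consumes it quickly */
    return toy_old_check_sees_race(&shared);  /* CPU A retakes the lock */
}
```

In this model, `toy_scenario(false)` reports the race and `toy_scenario(true)` does not, which is the case the pending-set check cannot see.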
I've fixed this by dumping the `sig_avoid_stop_race' kludge in favor of a little explicit bookkeeping. Now, dequeuing any stop signal sets a flag saying that a pending stop signal has been taken on by some CPU since the last time all pending stop signals were cleared due to SIGCONT/SIGKILL. The processing of stop signals checks the flag after the window where it released the lock, and abandons the signal if the flag has been cleared. The code that clears pending stop signals on SIGCONT generation also clears this flag. The various places that are trying to ensure the process dies quickly (SIGKILL or other unhandled signals) also clear the flag.

I've made this a general flags word in signal_struct, and replaced the stop_state field with flag bits in this word.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
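The bookkeeping described in the commit message can be sketched in user space as follows. This is a minimal model, not the kernel implementation: `struct toy_signal` and the `toy_*` helpers are hypothetical names, locking is omitted, and only the `SIGNAL_STOP_DEQUEUED` value is taken from this patch. The point is that the flag lives in per-process state rather than in the pending set, so it records a SIGCONT or SIGKILL even after another thread has dequeued that signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIGNAL_STOP_DEQUEUED 0x00000002 /* value as defined in this patch */

/* Hypothetical stand-in for the flags word added to signal_struct. */
struct toy_signal {
    unsigned int flags;
};

/* Dequeuing any stop signal records that a stop is in flight. */
void toy_dequeue_stop(struct toy_signal *sig)
{
    assert(sig != NULL);
    sig->flags |= SIGNAL_STOP_DEQUEUED;
}

/* SIGCONT or SIGKILL generation cancels any in-flight stop. */
void toy_cont_or_kill(struct toy_signal *sig)
{
    assert(sig != NULL);
    sig->flags &= ~SIGNAL_STOP_DEQUEUED;
}

/* After retaking the lock: the stop proceeds only if the flag survived. */
bool toy_stop_still_valid(const struct toy_signal *sig)
{
    assert(sig != NULL);
    return (sig->flags & SIGNAL_STOP_DEQUEUED) != 0;
}

/* Replays the window. Returns true if the stop should still take place. */
bool toy_flag_scenario(bool cont_or_kill_in_window)
{
    struct toy_signal sig = { 0 };

    toy_dequeue_stop(&sig);         /* CPU A dequeues the stop, drops the lock */
    if (cont_or_kill_in_window)
        toy_cont_or_kill(&sig);     /* CPU B generates SIGCONT or SIGKILL */
    return toy_stop_still_valid(&sig);  /* CPU A retakes the lock and checks */
}
```

Here `toy_flag_scenario(true)` returns false, so the stop is abandoned regardless of whether the SIGCONT or SIGKILL is still in any pending set.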
Diffstat (limited to 'include/linux')
-rw-r--r-- include/linux/sched.h 11
1 files changed, 9 insertions, 2 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 574f879c9bf4..eb87d91df180 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -293,8 +293,7 @@ struct signal_struct {
/* thread group stop support, overloads group_exit_code too */
int group_stop_count;
- /* 1 if group stopped since last SIGCONT, -1 if SIGCONT since report */
- int stop_state;
+ unsigned int flags; /* see SIGNAL_* flags below */
/* POSIX.1b Interval Timers */
struct list_head posix_timers;
@@ -331,6 +330,14 @@ struct signal_struct {
};
/*
+ * Bits in flags field of signal_struct.
+ */
+#define SIGNAL_STOP_STOPPED 0x00000001 /* job control stop in effect */
+#define SIGNAL_STOP_DEQUEUED 0x00000002 /* stop signal dequeued */
+#define SIGNAL_STOP_CONTINUED 0x00000004 /* SIGCONT since WCONTINUED reap */
+
+
+/*
* Priority of a process goes from 0..MAX_PRIO-1, valid RT
* priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL tasks are
* in the range MAX_RT_PRIO..MAX_PRIO-1. Priority values