<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/exit.c, branch v3.0.17</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.0.17</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.0.17'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-01-06T22:14:14Z</updated>
<entry>
<title>ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD-&gt;EXIT_ZOMBIE race</title>
<updated>2012-01-06T22:14:14Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-01-04T16:29:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ef50d8d96fa350c1f602f8c3bae65eb21ddb28e3'/>
<id>urn:sha1:ef50d8d96fa350c1f602f8c3bae65eb21ddb28e3</id>
<content type='text'>
commit 50b8d257486a45cba7b65ca978986ed216bbcc10 upstream.

Test-case:

	int main(void)
	{
		int pid, status;

		pid = fork();
		if (!pid) {
			for (;;) {
				if (!fork())
					return 0;
				if (waitpid(-1, &amp;status, 0) &lt; 0) {
					printf("ERR!! wait: %m\n");
					return 0;
				}
			}
		}

		assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
		assert(waitpid(-1, NULL, 0) == pid);

		assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
					PTRACE_O_TRACEFORK) == 0);

		do {
			ptrace(PTRACE_CONT, pid, 0, 0);
			pid = waitpid(-1, NULL, 0);
		} while (pid &gt; 0);

		return 1;
	}

It fails because -&gt;real_parent sees its child in EXIT_DEAD state
while the tracer is going to change the state back to EXIT_ZOMBIE
in wait_task_zombie().

The offending commit is 823b018e which moved the EXIT_DEAD check,
but in fact we should not blame it. The original code was not
correct as well because it didn't take ptrace_reparented() into
account and because we can't really trust -&gt;ptrace.

This patch adds the additional check to close this particular
race but it doesn't solve the whole problem. We simply can't
rely on -&gt;ptrace in this case, it can be cleared if the tracer
is multithreaded by the exiting -&gt;parent.

I think we should kill EXIT_DEAD altogether, we should always
remove the soon-to-be-reaped child from -&gt;children or at least
we should never do the DEAD-&gt;ZOMBIE transition. But this is too
complex for 3.2.

Reported-and-tested-by: Denys Vlasenko &lt;vda.linux@googlemail.com&gt;
Tested-by: Lukasz Michalik &lt;lmi@ift.uni.wroc.pl&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>memcg: clear mm-&gt;owner when last possible owner leaves</title>
<updated>2011-06-16T03:04:01Z</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2011-06-15T22:08:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=733eda7ac316cd4e550fa096e4ed42356dc546e7'/>
<id>urn:sha1:733eda7ac316cd4e550fa096e4ed42356dc546e7</id>
<content type='text'>
The following crash was reported:

&gt; Call Trace:
&gt; [&lt;ffffffff81139792&gt;] mem_cgroup_from_task+0x15/0x17
&gt; [&lt;ffffffff8113a75a&gt;] __mem_cgroup_try_charge+0x148/0x4b4
&gt; [&lt;ffffffff810493f3&gt;] ? need_resched+0x23/0x2d
&gt; [&lt;ffffffff814cbf43&gt;] ? preempt_schedule+0x46/0x4f
&gt; [&lt;ffffffff8113afe8&gt;] mem_cgroup_charge_common+0x9a/0xce
&gt; [&lt;ffffffff8113b6d1&gt;] mem_cgroup_newpage_charge+0x5d/0x5f
&gt; [&lt;ffffffff81134024&gt;] khugepaged+0x5da/0xfaf
&gt; [&lt;ffffffff81078ea0&gt;] ? __init_waitqueue_head+0x4b/0x4b
&gt; [&lt;ffffffff81133a4a&gt;] ? add_mm_counter.constprop.5+0x13/0x13
&gt; [&lt;ffffffff81078625&gt;] kthread+0xa8/0xb0
&gt; [&lt;ffffffff814d13e8&gt;] ? sub_preempt_count+0xa1/0xb4
&gt; [&lt;ffffffff814d5664&gt;] kernel_thread_helper+0x4/0x10
&gt; [&lt;ffffffff814ce858&gt;] ? retint_restore_args+0x13/0x13
&gt; [&lt;ffffffff8107857d&gt;] ? __init_kthread_worker+0x5a/0x5a

What happens is that khugepaged tries to charge a huge page against an mm
whose last possible owner has already exited, and the memory controller
crashes when the stale mm-&gt;owner is used to look up the cgroup to charge.

mm-&gt;owner has never been set to NULL with the last owner going away, but
nobody cared until khugepaged came along.

Even then it wasn't a problem because the final mmput() on an mm was
forced to acquire and release mmap_sem in write-mode, preventing an
exiting owner to go away while the mmap_sem was held, and until "692e0b3
mm: thp: optimize memcg charge in khugepaged", the memory cgroup charge
was protected by mmap_sem in read-mode.

Instead of going back to relying on the mmap_sem to enforce lifetime of a
task, this patch ensures that mm-&gt;owner is properly set to NULL when the
last possible owner is exiting, which the memory controller can handle
just fine.

[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc</title>
<updated>2011-05-20T20:33:21Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-05-20T20:33:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3ed4c0583daa34dedb568b26ff99e5a7b58db612'/>
<id>urn:sha1:3ed4c0583daa34dedb568b26ff99e5a7b58db612</id>
<content type='text'>
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (41 commits)
  signal: trivial, fix the "timespec declared inside parameter list" warning
  job control: reorganize wait_task_stopped()
  ptrace: fix signal-&gt;wait_chldexit usage in task_clear_group_stop_trapping()
  signal: sys_sigprocmask() needs retarget_shared_pending()
  signal: cleanup sys_sigprocmask()
  signal: rename signandsets() to sigandnsets()
  signal: do_sigtimedwait() needs retarget_shared_pending()
  signal: introduce do_sigtimedwait() to factor out compat/native code
  signal: sys_rt_sigtimedwait: simplify the timeout logic
  signal: cleanup sys_rt_sigprocmask()
  x86: signal: sys_rt_sigreturn() should use set_current_blocked()
  x86: signal: handle_signal() should use set_current_blocked()
  signal: sigprocmask() should do retarget_shared_pending()
  signal: sigprocmask: narrow the scope of -&gt;siglock
  signal: retarget_shared_pending: optimize while_each_thread() loop
  signal: retarget_shared_pending: consider shared/unblocked signals only
  signal: introduce retarget_shared_pending()
  ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/
  signal: Turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED
  signal: do_signal_stop: Remove the unneeded task_clear_group_stop_pending()
  ...
</content>
</entry>
<entry>
<title>job control: reorganize wait_task_stopped()</title>
<updated>2011-05-13T16:56:02Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-05-12T08:47:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=19e274630c9e23a84d5940af83cf5db35103f968'/>
<id>urn:sha1:19e274630c9e23a84d5940af83cf5db35103f968</id>
<content type='text'>
wait_task_stopped() tested task_stopped_code() without acquiring
siglock and, if stop condition existed, called wait_task_stopped() and
directly returned the result.  This patch moves the initial
task_stopped_code() testing into wait_task_stopped() and make
wait_consider_task() fall through to wait_task_continue() on 0 return.

This is for the following two reasons.

* Because the initial task_stopped_code() test is done without
  acquiring siglock, it may race against SIGCONT generation.  The
  stopped condition might have been replaced by continued state by the
  time wait_task_stopped() acquired siglock.  This may lead to
  unexpected failure of WNOHANG waits.

  This reorganization addresses this single race case but there are
  other cases - TASK_RUNNING -&gt; TASK_STOPPED transition and EXIT_*
  transitions.

* Scheduled ptrace updates require changes to the initial test which
  would fit better inside wait_task_stopped().

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
</entry>
<entry>
<title>ptrace: Prepare to fix racy accesses on task breakpoints</title>
<updated>2011-04-25T15:28:24Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-04-07T14:53:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf26c018490c2fce7fe9b629083b96ce0e6ad019'/>
<id>urn:sha1:bf26c018490c2fce7fe9b629083b96ce0e6ad019</id>
<content type='text'>
When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.

Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.

Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Prasad &lt;prasad@linux.vnet.ibm.com&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: v2.6.33.. &lt;stable@kernel.org&gt;
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com
</content>
</entry>
<entry>
<title>Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into ptrace</title>
<updated>2011-04-07T18:44:11Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2011-04-07T18:44:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e46bc9b6fd65bc9f406a4211fbf95683cc9c2937'/>
<id>urn:sha1:e46bc9b6fd65bc9f406a4211fbf95683cc9c2937</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Fix common misspellings</title>
<updated>2011-03-31T14:26:23Z</updated>
<author>
<name>Lucas De Marchi</name>
<email>lucas.demarchi@profusion.mobi</email>
</author>
<published>2011-03-31T01:57:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=25985edcedea6396277003854657b5f3cb31a628'/>
<id>urn:sha1:25985edcedea6396277003854657b5f3cb31a628</id>
<content type='text'>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</content>
</entry>
<entry>
<title>job control: Allow access to job control events through ptracees</title>
<updated>2011-03-23T09:37:01Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=45cb24a1da53beb70f09efccc0373f6a47a9efe0'/>
<id>urn:sha1:45cb24a1da53beb70f09efccc0373f6a47a9efe0</id>
<content type='text'>
Currently a real parent can't access job control stopped/continued
events through a ptraced child.  This utterly breaks job control when
the children are ptraced.

For example, if a program is run from an interactive shell and then
strace(1) attaches to it, pressing ^Z would send SIGTSTP and strace(1)
would notice it but the shell has no way to tell whether the child
entered job control stop and thus can't tell when to take over the
terminal - leading to awkward lone ^Z on the terminal.

Because the job control and ptrace stopped states are independent,
there is no reason to prevent real parents from accessing the stopped
state regardless of ptrace.  The continued state isn't separate but
ptracers don't have any use for them as ptracees can never resume
without explicit command from their ptracers, so as long as ptracers
don't consume it, it should be fine.

Although this is a behavior change, because the previous behavior is
utterly broken when viewed from real parents and the change is only
visible to real parents, I don't think it's necessary to make this
behavior optional.

One situation to be careful about is when a task from the real
parent's group is ptracing.  The parent group is the recipient of both
ptrace and job control stop events and one stop can be reported as
both job control and ptrace stops.  As this can break the current
ptrace users, suppress job control stopped events for these cases.

If a real parent ptracer wants to know about both job control and
ptrace stops, it can create a separate process to serve the role of
real parent.

Note that this only updates wait(2) side of things.  The real parent
can access the states via wait(2) but still is not properly notified
(woken up and delivered signal).  Test case polls wait(2) with WNOHANG
to work around.  Notification will be updated by future patches.

Test case follows.

  #include &lt;stdio.h&gt;
  #include &lt;unistd.h&gt;
  #include &lt;time.h&gt;
  #include &lt;errno.h&gt;
  #include &lt;sys/types.h&gt;
  #include &lt;sys/ptrace.h&gt;
  #include &lt;sys/wait.h&gt;

  int main(void)
  {
	  const struct timespec ts100ms = { .tv_nsec = 100000000 };
	  pid_t tracee, tracer;
	  siginfo_t si;
	  int i;

	  tracee = fork();
	  if (tracee == 0) {
		  while (1) {
			  printf("tracee: SIGSTOP\n");
			  raise(SIGSTOP);
			  nanosleep(&amp;ts100ms, NULL);
			  printf("tracee: SIGCONT\n");
			  raise(SIGCONT);
			  nanosleep(&amp;ts100ms, NULL);
		  }
	  }

	  waitid(P_PID, tracee, &amp;si, WSTOPPED | WNOHANG | WNOWAIT);

	  tracer = fork();
	  if (tracer == 0) {
		  nanosleep(&amp;ts100ms, NULL);
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);

		  for (i = 0; i &lt; 11; i++) {
			  si.si_pid = 0;
			  waitid(P_PID, tracee, &amp;si, WSTOPPED);
			  if (si.si_pid &amp;&amp; si.si_code == CLD_TRAPPED)
				  ptrace(PTRACE_CONT, tracee, NULL,
					 (void *)(long)si.si_status);
		  }
		  printf("tracer: EXITING\n");
		  return 0;
	  }

	  while (1) {
		  si.si_pid = 0;
		  waitid(P_PID, tracee, &amp;si,
			 WSTOPPED | WCONTINUED | WEXITED | WNOHANG);
		  if (si.si_pid)
			  printf("mommy : WAIT status=%02d code=%02d\n",
				 si.si_status, si.si_code);
		  nanosleep(&amp;ts100ms, NULL);
	  }
	  return 0;
  }

Before the patch, while ptraced, the parent can't see any job control
events.

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracee: SIGCONT
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

After the patch,

  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  mommy : WAIT status=19 code=05
  tracee: SIGCONT
  mommy : WAIT status=18 code=06
  tracee: SIGSTOP
  tracer: EXITING
  mommy : WAIT status=19 code=05
  ^C

-v2: Oleg pointed out that wait(2) should be suppressed for the real
     parent's group instead of only the real parent task itself.
     Updated accordingly.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
</entry>
<entry>
<title>job control: Fix ptracer wait(2) hang and explain notask_error clearing</title>
<updated>2011-03-23T09:37:01Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b84cca2564b9a5b2d064fb44d2a55a5b44473a0'/>
<id>urn:sha1:9b84cca2564b9a5b2d064fb44d2a55a5b44473a0</id>
<content type='text'>
wait(2) and friends allow access to stopped/continued states through
zombies, which is required as the states are process-wide and should
be accessible whether the leader task is alive or undead.
wait_consider_task() implements this by always clearing notask_error
and going through wait_task_stopped/continued() for unreaped zombies.

However, while ptraced, the stopped state is per-task and as such if
the ptracee became a zombie, there's no further stopped event to
listen to and wait(2) and friends should return -ECHILD on the tracee.

Fix it by clearing notask_error only if WCONTINUED | WEXITED is set
for ptraced zombies.  While at it, document why clearing notask_error
is safe for each case.

Test case follows.

  #include &lt;stdio.h&gt;
  #include &lt;unistd.h&gt;
  #include &lt;pthread.h&gt;
  #include &lt;time.h&gt;
  #include &lt;sys/types.h&gt;
  #include &lt;sys/ptrace.h&gt;
  #include &lt;sys/wait.h&gt;

  static void *nooper(void *arg)
  {
	  pause();
	  return NULL;
  }

  int main(void)
  {
	  const struct timespec ts1s = { .tv_sec = 1 };
	  pid_t tracee, tracer;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  pthread_t thr;

		  pthread_create(&amp;thr, NULL, nooper, NULL);
		  nanosleep(&amp;ts1s, NULL);
		  printf("tracee exiting\n");
		  pthread_exit(NULL);	/* let subthread run */
	  }

	  tracer = fork();
	  if (tracer == 0) {
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		  while (1) {
			  if (waitid(P_PID, tracee, &amp;si, WSTOPPED) &lt; 0) {
				  perror("waitid");
				  break;
			  }
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(long)si.si_status);
		  }
		  return 0;
	  }

	  waitid(P_PID, tracer, &amp;si, WEXITED);
	  kill(tracee, SIGKILL);
	  return 0;
  }

Before the patch, after the tracee becomes a zombie, the tracer's
waitid(WSTOPPED) never returns and the program doesn't terminate.

  tracee exiting
  ^C

After the patch, tracee exiting triggers waitid() to fail.

  tracee exiting
  waitid: No child processes

-v2: Oleg pointed out that exited in addition to continued can happen
     for ptraced dead group leader.  Clear notask_error for ptraced
     child on WEXITED too.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
</entry>
<entry>
<title>job control: Small reorganization of wait_consider_task()</title>
<updated>2011-03-23T09:37:01Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-03-23T09:37:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=823b018e5b1196d810790559357447948f644548'/>
<id>urn:sha1:823b018e5b1196d810790559357447948f644548</id>
<content type='text'>
Move EXIT_DEAD test in wait_consider_task() above ptrace check.  As
ptraced tasks can't be EXIT_DEAD, this change doesn't cause any
behavior change.  This is to prepare for further changes.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
</entry>
</feed>
