<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v2.6.25.2</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v2.6.25.2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v2.6.25.2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2008-05-01T21:44:39Z</updated>
<entry>
<title>hrtimer: raise softirq unlocked to avoid circular lock dependency</title>
<updated>2008-05-01T21:44:39Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2008-04-29T01:15:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f5f5e084959d9c22c43c235b206b2e2fe2971e7f'/>
<id>urn:sha1:f5f5e084959d9c22c43c235b206b2e2fe2971e7f</id>
<content type='text'>
commit 0c96c5979a522c3323c30a078a70120e29b5bdbc upstream

The scheduler hrtimer bits in 2.6.25 introduced a circular lock
dependency in a rare code path:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-sched-devel.git-x86-latest.git #19
-------------------------------------------------------
X/2980 is trying to acquire lock:
 (&amp;rq-&gt;rq_lock_key#2){++..}, at: [&lt;ffffffff80230146&gt;] task_rq_lock+0x56/0xa0

but task is already holding lock:
 (&amp;cpu_base-&gt;lock){++..}, at: [&lt;ffffffff80257ae1&gt;] lock_hrtimer_base+0x31/0x60

which lock already depends on the new lock.

The scenario which leads to this is:

posix-timer signal is delivered
 -&gt; posix-timer is rearmed
    timer is already expired in hrtimer_enqueue()
     -&gt; softirq is raised

To prevent this we need to move the raise of the softirq out of the
base-&gt;lock protected code path.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>hrtimer: timeout too long when using HRTIMER_CB_SOFTIRQ</title>
<updated>2008-05-01T21:44:38Z</updated>
<author>
<name>Bodo Stroesser</name>
<email>bstroesser@fujitsu-siemens.com</email>
</author>
<published>2008-04-28T17:15:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=59c5775ada913643998fd78d8a5b1a76ba57515f'/>
<id>urn:sha1:59c5775ada913643998fd78d8a5b1a76ba57515f</id>
<content type='text'>
commit d7b41a24bfb5d7fa02f7b49be1293d468814e424 upstream

When using hrtimer with timer-&gt;cb_mode == HRTIMER_CB_SOFTIRQ
in some cases the clockevent is not programmed.
This happens, if:
 - a timer is rearmed while it's state is HRTIMER_STATE_CALLBACK
 - hrtimer_reprogram() returns -ETIME, when it is called after
   CALLBACK is finished. This occurs if the new timer-&gt;expires
   is in the past when CALLBACK is done.
In this case, the timer needs to be removed from the tree and put
onto the pending list again.

The patch is against 2.6.22.5, but AFAICS, it is relevant
for 2.6.25 also (in run_hrtimer_pending()).

Signed-off-by: Bodo Stroesser &lt;bstroesser@fujitsu-siemens.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>cgroup: fix a race condition in manipulating tsk-&gt;cg_list</title>
<updated>2008-05-01T21:44:33Z</updated>
<author>
<name>Li Zefan</name>
<email>lizf@cn.fujitsu.com</email>
</author>
<published>2008-04-18T16:25:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bc657c218dc4f0d8fbb5fb9c746c0dd9736e128a'/>
<id>urn:sha1:bc657c218dc4f0d8fbb5fb9c746c0dd9736e128a</id>
<content type='text'>
commit: 0e04388f0189fa1f6812a8e1cb6172136eada87e

When I ran a test program to fork mass processes and at the same time
'cat /cgroup/tasks', I got the following oops:

  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:72!
  invalid opcode: 0000 [#1] SMP
  Pid: 4178, comm: a.out Not tainted (2.6.25-rc9 #72)
  ...
  Call Trace:
   [&lt;c044a5f9&gt;] ? cgroup_exit+0x55/0x94
   [&lt;c0427acf&gt;] ? do_exit+0x217/0x5ba
   [&lt;c0427ed7&gt;] ? do_group_exit+0.65/0x7c
   [&lt;c0427efd&gt;] ? sys_exit_group+0xf/0x11
   [&lt;c0404842&gt;] ? syscall_call+0x7/0xb
   [&lt;c05e0000&gt;] ? init_cyrix+0x2fa/0x479
  ...
  EIP: [&lt;c04df671&gt;] list_del+0x35/0x53 SS:ESP 0068:ebc7df4
  ---[ end trace caffb7332252612b ]---
  Fixing recursive fault but reboot is needed!

After digging into the code and debugging, I finlly found out a race
situation:

				do_exit()
				  -&gt;cgroup_exit()
				    -&gt;if (!list_empty(&amp;tsk-&gt;cg_list))
				        list_del(&amp;tsk-&gt;cg_list);

  cgroup_iter_start()
    -&gt;cgroup_enable_task_cg_list()
      -&gt;list_add(&amp;tsk-&gt;cg_list, ..);

In this case the list won't be deleted though the process has exited.

We got two bug reports in the past, which seem to be the same bug as
this one:
	http://lkml.org/lkml/2008/3/5/332
	http://lkml.org/lkml/2007/10/17/224

Actually sometimes I got oops on list_del, sometimes oops on list_add.
And I can change my test program a bit to trigger other oops.

The patch has been tested both on x86_32 and x86_64.

Signed-off-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Acked-by: Paul Menage &lt;menage@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>Fix locking bug in "acquire_console_semaphore_for_printk()"</title>
<updated>2008-04-15T20:09:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-04-15T20:09:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=093a07e2fdfaddab7fc7d4adc76cc569c86603d7'/>
<id>urn:sha1:093a07e2fdfaddab7fc7d4adc76cc569c86603d7</id>
<content type='text'>
When I cleaned up printk() and split up the printk locking logic in
commit 266c2e0abeca649fa6667a1a427ad1da507c6375 ("Make printk() console
semaphore accesses sensible") I had incorrectly moved the call to
have_callable_console() outside of the console semaphore.

That was buggy.  The console semaphore protects the console_drivers list
that is used by have_callable_console().

Thanks go to Bongani Hlope who saw this as a hang on shutdown and reboot
and bisected the bug to the right commit, and tested this patch. See

	http://lkml.org/lkml/2008/4/11/315

Bisected-and-tested-by: Bongani Hlope &lt;bonganilinux@mweb.co.za&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>revert "sched: fix fair sleepers"</title>
<updated>2008-04-14T12:26:23Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-04-14T06:50:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e2df9e0905136eebeca66eb9a994ca48d0fa7990'/>
<id>urn:sha1:e2df9e0905136eebeca66eb9a994ca48d0fa7990</id>
<content type='text'>
revert "sched: fix fair sleepers" (e22ecef1d2658ba54ed7d3fdb5d60829fb434c23),
because it is causing audio skipping, see:

   http://bugzilla.kernel.org/show_bug.cgi?id=10428

the patch is correct and the real cause of the skipping is not
understood (tracing makes it go away), but time has run out so we'll
revert it and re-try in 2.6.26.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>cgroups: include hierarchy ids in /proc/&lt;pid&gt;/cgroup</title>
<updated>2008-04-11T15:06:43Z</updated>
<author>
<name>Paul Menage</name>
<email>menage@google.com</email>
</author>
<published>2008-04-11T04:29:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b6c3006d204a0b86e1ebe02ca38f9f071a03c7ef'/>
<id>urn:sha1:b6c3006d204a0b86e1ebe02ca38f9f071a03c7ef</id>
<content type='text'>
Extend the /proc/&lt;pid&gt;/cgroup file to include the appropriate hierarchy ID on
each line.

Currently this ID isn't really needed since a hierarchy can be completely
identified by the set of subsystems bound to it, but this is likely to change
in the near future in order to support stateless subsystems and
merging/rebinding of subsystems.  Getting this change into 2.6.25 reduces the
need for an API change later.

Signed-off-by: Paul Menage &lt;menage@google.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Cc: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>asmlinkage_protect replaces prevent_tail_call</title>
<updated>2008-04-11T00:28:26Z</updated>
<author>
<name>Roland McGrath</name>
<email>roland@redhat.com</email>
</author>
<published>2008-04-10T22:37:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=54a015104136974262afa4b8ddd943ea70dec8a2'/>
<id>urn:sha1:54a015104136974262afa4b8ddd943ea70dec8a2</id>
<content type='text'>
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs.  The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.

Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have work around all
the manifestations of this issue that crop up.

More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function.  This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not clobber them for something else.  It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.

Signed-off-by: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>cgroups: add cgroup support for enabling controllers at boot time</title>
<updated>2008-04-04T21:46:26Z</updated>
<author>
<name>Paul Menage</name>
<email>menage@google.com</email>
</author>
<published>2008-04-04T21:29:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8bab8dded67d026c39367bbd5e27d2f6c556c38e'/>
<id>urn:sha1:8bab8dded67d026c39367bbd5e27d2f6c556c38e</id>
<content type='text'>
The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo-&gt;create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy.  Any additional effects (e.g.  not allocating metadata)
are up to the foo subsystem.

This doesn't handle early_init subsystems (their "disabled" bit isn't set be,
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

  Ballpark figures, I'm trying to get this question out rather than
  processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
  to the affected paths, booting with cgroup_disable=memory cuts that back to
  1% overhead (due to slightly bigger struct page).

  I'm no expert on distros, they may have no interest whatever in
  CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
  without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgorup=off : Execl Throughput       43.0     3150.1      732.6
mem_cgroup=on  : Execl Throughput       43.0     2932.6      682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Cc: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: Sudhir Kumar &lt;skumar@linux.vnet.ibm.com&gt;
Cc: YAMAMOTO Takashi &lt;yamamoto@valinux.co.jp&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>markers: use synchronize_sched()</title>
<updated>2008-04-02T22:28:19Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@polymtl.ca</email>
</author>
<published>2008-04-02T20:04:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6496968e6cc3f01faafa63a5a28549a708539ac0'/>
<id>urn:sha1:6496968e6cc3f01faafa63a5a28549a708539ac0</id>
<content type='text'>
Markers do not mix well with CONFIG_PREEMPT_RCU because it uses
preempt_disable/enable() and not rcu_read_lock/unlock for minimal
intrusiveness.  We would need call_sched and sched_barrier primitives.

Currently, the modification (connection and disconnection) of probes
from markers requires changes to the data structure done in RCU-style :
a new data structure is created, the pointer is changed atomically, a
quiescent state is reached and then the old data structure is freed.

The quiescent state is reached once all the currently running
preempt_disable regions are done running.  We use the call_rcu mechanism
to execute kfree() after such quiescent state has been reached.
However, the new CONFIG_PREEMPT_RCU version of call_rcu and rcu_barrier
does not guarantee that all preempt_disable code regions have finished,
hence the race.

The "proper" way to do this is to use rcu_read_lock/unlock, but we don't
want to use it to minimize intrusiveness on the traced system.  (we do
not want the marker code to call into much of the OS code, because it
would quickly restrict what can and cannot be instrumented, such as the
scheduler).

The temporary fix, until we get call_rcu_sched and rcu_barrier_sched in
mainline, is to use synchronize_sched before each call_rcu calls, so we
wait for the quiescent state in the system call code path.  It will slow
down batch marker enable/disable, but will make sure the race is gone.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>futex_compat __user annotation</title>
<updated>2008-03-30T21:18:41Z</updated>
<author>
<name>Al Viro</name>
<email>viro@ftp.linux.org.uk</email>
</author>
<published>2008-03-29T03:07:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8481664d373e7e2cea3ea0c2d7a06c9e939b19ee'/>
<id>urn:sha1:8481664d373e7e2cea3ea0c2d7a06c9e939b19ee</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
