<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/softlockup.c, branch v2.6.26.8</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v2.6.26.8</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v2.6.26.8'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2008-07-05T06:51:24Z</updated>
<entry>
<title>softlockup: print a module list on being stuck</title>
<updated>2008-07-05T06:51:24Z</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2008-06-16T22:51:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3b7253238801a7b97b3929d8db2fa7a0721fb17b'/>
<id>urn:sha1:3b7253238801a7b97b3929d8db2fa7a0721fb17b</id>
<content type='text'>
Most places in the kernel that print a "BUG:" message also print a
module list (which is very useful for doing statistics and finding
patterns). However, the softlockup detector does not do this yet.

This patch adds the one-line change that closes this gap.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression</title>
<updated>2008-06-19T07:45:38Z</updated>
<author>
<name>Jason Wessel</name>
<email>jason.wessel@windriver.com</email>
</author>
<published>2008-05-27T17:23:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9c106c119ebedf624fbd682fd2a4d52e3c8c1a67'/>
<id>urn:sha1:9c106c119ebedf624fbd682fd2a4d52e3c8c1a67</id>
<content type='text'>
The touch_nmi_watchdog() routine on x86 ultimately calls
touch_softlockup_watchdog().  The problem is that touching the
softlockup watchdog requires calling the cpu_clock code, which can
involve multiple cpu locks and can lead to a hard hang if one of the
locks is held by a processor that is not going to return anytime soon
(as could be the case with kgdb or perhaps even with some other
kind of exception).

This patch causes the public version of touch_softlockup_watchdog()
to defer the cpu clock access to a later point.
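
A minimal user-space sketch of the deferral idea, assuming a per-CPU
timestamp in which 0 means "touched, refresh later". The names
touch_timestamp, touch_watchdog_deferred() and watchdog_tick(), and the
'now' argument standing in for the clock, are hypothetical and are not
taken from the patch itself:

```c
/* Hypothetical per-CPU timestamp in seconds; 0 means "touched, not yet refreshed". */
static unsigned long touch_timestamp;

/* Safe from NMI-like contexts: takes no locks and never reads the clock. */
void touch_watchdog_deferred(void)
{
	touch_timestamp = 0;
}

/*
 * Runs later from an ordinary timer tick, where reading the clock is safe.
 * 'now' stands in for the clock value (assumed nonzero).
 * Returns 1 if a lockup would be reported, 0 otherwise.
 */
int watchdog_tick(unsigned long now, unsigned long thresh)
{
	if (touch_timestamp == 0) {
		/* Deferred refresh: take the timestamp here, not in the NMI path. */
		touch_timestamp = now;
		return 0;
	}
	return now - touch_timestamp > thresh;
}
```

The point of the sketch is only that the touch path writes a marker and
the clock is read later, from a context where taking locks is safe.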

The test case for this problem is to use the following kernel config
options:

CONFIG_KGDB_TESTS=y
CONFIG_KGDB_TESTS_ON_BOOT=y
CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000"

It should be noted that the kgdb test suite and these options were not
available until 2.6.26-rc2, so it was necessary to patch the kgdb
test suite during the bisection.

I would consider this patch a regression fix because the problem first
appeared in commit 27ec4407790d075c325e1f4da0a19c56953cce23 when some
logic was added to try to periodically sync the clocks.  It was
possible to work around this particular problem by simply not
performing the sync anytime the system was in a critical context.
This was ok until commit 3e51f33fcc7f55e6df25d15b55ed10c8b4da84cd,
which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some
multi-cpu locks to sync the clocks.  It became clear that accessing
this code from an NMI was the source of the lockups.  Avoiding
access to the low level clock code from code inside the NMI
processing also fixed the problem with the 27ec44... commit.

Signed-off-by: Jason Wessel &lt;jason.wessel@windriver.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>softlockup: fix task state setting</title>
<updated>2008-02-29T17:46:53Z</updated>
<author>
<name>Dmitry Adamushko</name>
<email>dmitry.adamushko@gmail.com</email>
</author>
<published>2008-02-08T14:41:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7be2a03e3174cee3a3cdcdf17db357470f51caff'/>
<id>urn:sha1:7be2a03e3174cee3a3cdcdf17db357470f51caff</id>
<content type='text'>
kthread_stop() can be called when a 'watchdog' thread is executing after
kthread_should_stop() but before set_task_state(TASK_INTERRUPTIBLE).
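
A deterministic user-space model of that window, as a rough sketch only.
The globals and the helpers stop_request() and schedule_would_sleep()
are hypothetical stand-ins for kthread_stop() and schedule(), and the
"safe" ordering shown is one common way to close such a window, not
necessarily the exact change made here:

```c
/* Hypothetical task states and flags, modelled as plain globals. */
enum { TASK_RUNNING, TASK_INTERRUPTIBLE };

static int task_state;
static int should_stop;

/* Models kthread_stop(): raise the stop flag, then wake the thread. */
static void stop_request(void)
{
	should_stop = 1;
	task_state = TASK_RUNNING;
}

/* Models schedule(): the thread only sleeps if it is not runnable. */
static int schedule_would_sleep(void)
{
	return task_state == TASK_INTERRUPTIBLE;
}

/* Check first, set state second: the wakeup in between is overwritten. */
int racy_order_sleeps(void)
{
	task_state = TASK_RUNNING;
	should_stop = 0;

	if (should_stop)
		return 0;
	stop_request();				/* lands inside the window */
	task_state = TASK_INTERRUPTIBLE;	/* wipes out the wakeup */
	return schedule_would_sleep();
}

/* Set state first, check second: the wakeup survives. */
int safe_order_sleeps(void)
{
	task_state = TASK_INTERRUPTIBLE;
	should_stop = 0;

	if (should_stop)
		return 0;
	stop_request();				/* marks the task runnable again */
	return schedule_would_sleep();
}
```

In this model racy_order_sleeps() returns 1 (the thread goes to sleep
with nobody left to wake it), while safe_order_sleeps() returns 0.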

Signed-off-by: Dmitry Adamushko &lt;dmitry.adamushko@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>debug: softlockup looping fix</title>
<updated>2008-02-02T03:27:45Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2008-02-01T23:23:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ed50d6cbc394cd0966469d3e249353c9dd1d38b9'/>
<id>urn:sha1:ed50d6cbc394cd0966469d3e249353c9dd1d38b9</id>
<content type='text'>
Rafael J. Wysocki reported weird, multi-second delays during
suspend/resume and bisected it back to:

  commit 82a1fcb90287052aabfa235e7ffc693ea003fe69
  Author: Ingo Molnar &lt;mingo@elte.hu&gt;
  Date:   Fri Jan 25 21:08:02 2008 +0100

      softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks

fix it:

 - restore the old wakeup mechanism
 - fix break usage in do_each_thread() { } while_each_thread().
 - fix the hotplug switch stmt, a fall-through case was broken.
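
To illustrate the break problem: do_each_thread() { } while_each_thread()
expands to a do/while loop nested inside a for loop, so a plain break
only leaves the inner loop. The sketch below uses hypothetical stand-in
macros (do_each_item/while_each_item) over integer counters rather than
the real task list, and the 'budget' limit is an assumption made for the
example:

```c
/* Hypothetical stand-ins: an outer for loop wrapped around an inner do/while. */
#define do_each_item(g, t, groups) \
	for ((g) = 0, (t) = 0; (g) != (groups); (g)++, (t) = 0) do

#define while_each_item(t, per_group) \
	while (++(t) != (per_group))

/* Tries to stop once the budget is used up, but break only exits the inner loop. */
int scan_with_break(int groups, int per_group, int budget)
{
	int g, t, visited = 0;

	do_each_item(g, t, groups) {
		if (--budget == 0)
			break;		/* the outer loop keeps going */
		visited++;
	} while_each_item(t, per_group);

	return visited;
}

/* Same scan, but goto leaves both loops at once. */
int scan_with_goto(int groups, int per_group, int budget)
{
	int g, t, visited = 0;

	do_each_item(g, t, groups) {
		if (--budget == 0)
			goto out;
		visited++;
	} while_each_item(t, per_group);
out:
	return visited;
}
```

With 3 groups of 4 items and a budget of 3, scan_with_break() visits 10
items because the scan resumes in the next group, while scan_with_goto()
stops after 2.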

Bisected-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Tested-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Acked-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>softlockup: fix signedness</title>
<updated>2008-01-25T20:08:34Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-01-25T20:08:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=90739081ef8d5495d50abba9c5d333be9acd872a'/>
<id>urn:sha1:90739081ef8d5495d50abba9c5d333be9acd872a</id>
<content type='text'>
fix softlockup tunables signedness.

mark tunables read-mostly.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks</title>
<updated>2008-01-25T20:08:02Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-01-25T20:08:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=82a1fcb90287052aabfa235e7ffc693ea003fe69'/>
<id>urn:sha1:82a1fcb90287052aabfa235e7ffc693ea003fe69</id>
<content type='text'>
this patch extends the soft-lockup detector to automatically
detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
printed the following way:

 ------------------&gt;
 INFO: task prctl:3042 blocked for more than 120 seconds.
 "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message
 prctl         D fd5e3793     0  3042   2997
        f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
        f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
        f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
 Call Trace:
  [&lt;c04883a5&gt;] schedule_timeout+0x6d/0x8b
  [&lt;c04883d8&gt;] schedule_timeout_uninterruptible+0x15/0x17
  [&lt;c0133a76&gt;] msleep+0x10/0x16
  [&lt;c0138974&gt;] sys_prctl+0x30/0x1e2
  [&lt;c0104c52&gt;] sysenter_past_esp+0x5f/0xa5
  =======================
 2 locks held by prctl/3042:
 #0:  (&amp;sb-&gt;s_type-&gt;i_mutex_key#5){--..}, at: [&lt;c0197d11&gt;] do_fsync+0x38/0x7a
 #1:  (jbd_handle){--..}, at: [&lt;c01ca3d2&gt;] journal_start+0xc7/0xe9
 &lt;------------------

the current default timeout is 120 seconds. Such messages are printed
up to 10 times per bootup. If the system has crashed already then the
messages are not printed.
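
a rough user-space sketch of such a check, assuming a task counts as
hung when its context-switch count has not changed for longer than the
timeout. The struct task_sample type, its fields, the warnings_left
counter and check_hung_task_sketch() are hypothetical names for this
illustration, not the kernel's data structures:

```c
/* Hypothetical per-task bookkeeping for the sketch. */
struct task_sample {
	unsigned long switch_count;		/* context switches so far */
	unsigned long last_switch_count;	/* value seen at the last check */
	unsigned long last_switch_timestamp;	/* seconds, when it last changed */
};

/* Messages are printed up to 10 times per bootup. */
static int warnings_left = 10;

/* Returns 1 if a "blocked for more than N seconds" message would be printed. */
int check_hung_task_sketch(struct task_sample *t, unsigned long now,
			   unsigned long timeout)
{
	if (timeout == 0)
		return 0;	/* a timeout of 0 disables the message */

	if (t->switch_count != t->last_switch_count) {
		/* The task was scheduled since the last check: restart the clock. */
		t->last_switch_count = t->switch_count;
		t->last_switch_timestamp = now;
		return 0;
	}

	if (now - t->last_switch_timestamp > timeout) {
		if (warnings_left == 0)
			return 0;
		warnings_left--;
		/* Restart the clock so the same task is not reported on every check. */
		t->last_switch_timestamp = now;
		return 1;
	}
	return 0;
}
```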

if lockdep is enabled then all held locks are printed as well.

this feature is a natural extension to the softlockup-detector (kernel
locked up without scheduling) and to the NMI watchdog (kernel locked up
with IRQs disabled).

[ Gautham R Shenoy &lt;ego@in.ibm.com&gt;: CPU hotplug fixes. ]
[ Andrew Morton &lt;akpm@linux-foundation.org&gt;: build warning fix. ]

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>Use helpers to obtain task pid in printks</title>
<updated>2007-10-19T18:53:43Z</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2007-10-19T06:40:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ba25f9dcc4ea6e30839fcab5a5516f2176d5bfed'/>
<id>urn:sha1:ba25f9dcc4ea6e30839fcab5a5516f2176d5bfed</id>
<content type='text'>
The task_struct-&gt;pid member is going to be deprecated, so start
using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
the kernel.

The first thing to start with is the pid, printed to dmesg - in
this case we may safely use task_pid_nr(). Besides, printks produce
more (much more) than half of all the explicit pid usage.

[akpm@linux-foundation.org: git-drm went and changed lots of stuff]
Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: Dave Airlie &lt;airlied@linux.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>softlockup: add a /proc tuning parameter</title>
<updated>2007-10-17T15:42:47Z</updated>
<author>
<name>Ravikiran G Thirumalai</name>
<email>kiran@scalex86.org</email>
</author>
<published>2007-10-17T06:26:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c4f3b63fe15b4629aa1ec163c95ab30423d0f76a'/>
<id>urn:sha1:c4f3b63fe15b4629aa1ec163c95ab30423d0f76a</id>
<content type='text'>
Control the trigger limit for softlockup warnings.  This is useful for
debugging softlockups, by lowering the softlockup_thresh to identify
possible softlockups earlier.

This patch:
1. Adds a sysctl softlockup_thresh with valid values of 1-60s
   (a higher value helps avoid false positives)
2. Changes the softlockup printk to print the cpu softlockup time
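
As a rough illustration of the two points above (a sketch only; the
helper names thresh_is_valid() and stuck_time_to_report() and the
second-granularity arguments are assumptions, not the patch's code):

```c
/* Bounds from the description above: valid thresholds are 1-60 seconds. */
enum { THRESH_MIN = 1, THRESH_MAX = 60 };

/* Returns 1 if 'secs' would be accepted as a threshold, 0 otherwise. */
int thresh_is_valid(int secs)
{
	if (THRESH_MIN > secs)
		return 0;
	if (secs > THRESH_MAX)
		return 0;
	return 1;
}

/*
 * Returns how long the CPU has been stuck, in seconds, once that exceeds
 * the threshold, or 0 while it is still within the limit.
 */
unsigned long stuck_time_to_report(unsigned long now, unsigned long touched,
				   unsigned long thresh)
{
	unsigned long stuck = now - touched;

	return stuck > thresh ? stuck : 0;
}
```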

[akpm@linux-foundation.org: Fix various warnings and add definition of "two"]
Signed-off-by: Ravikiran Thirumalai &lt;kiran@scalex86.org&gt;
Signed-off-by: Shai Fultheim &lt;shai@scalex86.org&gt;
Acked-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>softlockup watchdog: style cleanups</title>
<updated>2007-10-17T15:42:47Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2007-10-17T06:26:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a5f2ce3c6024a5bb895647b6bd88ecae5001020a'/>
<id>urn:sha1:a5f2ce3c6024a5bb895647b6bd88ecae5001020a</id>
<content type='text'>
kernel/softlockup.c picked up a few style problems over the past few
months; clean them up. No functional changes:

   text    data     bss     dec     hex filename
   1126      76       4    1206     4b6 softlockup.o.before
   1129      76       4    1209     4b9 softlockup.o.after

( the 3-byte .text increase is due to the "&lt;1&gt;" appended to one of
  the printk messages. )

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>softlockup: improve debug output</title>
<updated>2007-10-17T15:42:47Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2007-10-17T06:26:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=43581a10075492445f65234384210492ff333eba'/>
<id>urn:sha1:43581a10075492445f65234384210492ff333eba</id>
<content type='text'>
Improve the debuggability of kernel lockups by enhancing the debug
output of the softlockup detector: print the task that causes the lockup
and try to print a more intelligent backtrace.

The old format was:

  BUG: soft lockup detected on CPU#1!
   [&lt;c0105e4a&gt;] show_trace_log_lvl+0x19/0x2e
   [&lt;c0105f43&gt;] show_trace+0x12/0x14
   [&lt;c0105f59&gt;] dump_stack+0x14/0x16
   [&lt;c015f6bc&gt;] softlockup_tick+0xbe/0xd0
   [&lt;c013457d&gt;] run_local_timers+0x12/0x14
   [&lt;c01346b8&gt;] update_process_times+0x3e/0x63
   [&lt;c0145fb8&gt;] tick_sched_timer+0x7c/0xc0
   [&lt;c0140a75&gt;] hrtimer_interrupt+0x135/0x1ba
   [&lt;c011bde7&gt;] smp_apic_timer_interrupt+0x6e/0x80
   [&lt;c0105aa3&gt;] apic_timer_interrupt+0x33/0x38
   [&lt;c0104f8a&gt;] syscall_call+0x7/0xb
   =======================

The new format is:

  BUG: soft lockup detected on CPU#1! [prctl:2363]

  Pid: 2363, comm:                prctl
  EIP: 0060:[&lt;c013915f&gt;] CPU: 1
  EIP is at sys_prctl+0x24/0x18c
   EFLAGS: 00000213    Not tainted  (2.6.22-cfs-v20 #26)
  EAX: 00000001 EBX: 000003e7 ECX: 00000001 EDX: f6df0000
  ESI: 000003e7 EDI: 000003e7 EBP: f6df0fb0 DS: 007b ES: 007b FS: 00d8
  CR0: 8005003b CR2: 4d8c3340 CR3: 3731d000 CR4: 000006d0
   [&lt;c0105e4a&gt;] show_trace_log_lvl+0x19/0x2e
   [&lt;c0105f43&gt;] show_trace+0x12/0x14
   [&lt;c01040be&gt;] show_regs+0x1ab/0x1b3
   [&lt;c015f807&gt;] softlockup_tick+0xef/0x108
   [&lt;c013457d&gt;] run_local_timers+0x12/0x14
   [&lt;c01346b8&gt;] update_process_times+0x3e/0x63
   [&lt;c0145fcc&gt;] tick_sched_timer+0x7c/0xc0
   [&lt;c0140a89&gt;] hrtimer_interrupt+0x135/0x1ba
   [&lt;c011bde7&gt;] smp_apic_timer_interrupt+0x6e/0x80
   [&lt;c0105aa3&gt;] apic_timer_interrupt+0x33/0x38
   [&lt;c0104f8a&gt;] syscall_call+0x7/0xb
   =======================

Note that in the old format we only knew that some system call locked
up; we didn't know _which_. With the new format we know that it's at a
specific place in sys_prctl(). [which was where I created an artificial
kernel lockup to test the new format.]

This is also useful if the lockup happens in user-space - the user-space
EIP (and other registers) will be printed too. (such a lockup would
either suggest that the task was running at SCHED_FIFO:99 and looping
for more than 10 seconds, or that the softlockup detector has a
false-positive.)

The task name is printed first, just in case we don't manage to print
a useful backtrace.

[satyam@infradead.org: fix warning]
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Satyam Sharma &lt;satyam@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
