<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git, branch v2.6.24.4</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v2.6.24.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v2.6.24.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2008-03-24T18:49:18Z</updated>
<entry>
<title>Linux 2.6.24.4</title>
<updated>2008-03-24T18:49:18Z</updated>
<author>
<name>Chris Wright</name>
<email>chrisw@sous-sol.org</email>
</author>
<published>2008-03-24T18:49:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=16c64cac7d9c6a503f49887219c4fe675e7d43d9'/>
<id>urn:sha1:16c64cac7d9c6a503f49887219c4fe675e7d43d9</id>
<content type='text'>
</content>
</entry>
<entry>
<title>S390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.</title>
<updated>2008-03-24T18:48:36Z</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2008-03-20T16:33:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=78d05881017eef1e9ea49571c52275dc2935e719'/>
<id>urn:sha1:78d05881017eef1e9ea49571c52275dc2935e719</id>
<content type='text'>
a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
robust functionality" introduces a test wether futex in atomic stuff
works or not.
It does that by writing to address 0 of the kernel address space. This
will crash on older machines where addressing mode switching is enabled
but where the mvcos instruction is not available. Page table walking is
done by hand and therefore the code tries to access current-&gt;mm which
is NULL.
Therefore add an extra check, so we survive the early test.

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>slab: NUMA slab allocator migration bugfix</title>
<updated>2008-03-24T18:48:36Z</updated>
<author>
<name>Joe Korty</name>
<email>joe.korty@ccur.com</email>
</author>
<published>2008-03-05T23:04:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0837aca7b42f1bc9cd8dc1cd4d10aa2aaddaf84d'/>
<id>urn:sha1:0837aca7b42f1bc9cd8dc1cd4d10aa2aaddaf84d</id>
<content type='text'>
NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller.  As a result
those values become invalid if an attempt to migrate the
process to another numa node occured while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit.  When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

	kernel BUG at mm/slab.c:2417
	cache_alloc_refill+0xe6
	kmem_cache_alloc+0xd0
	...

This patch was developed against 2.6.23, ported to and
compiled-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty &lt;joe.korty@ccur.com&gt;
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>relay: fix subbuf_splice_actor() adding too many pages</title>
<updated>2008-03-24T18:48:35Z</updated>
<author>
<name>Jens Axboe</name>
<email>jens.axboe@oracle.com</email>
</author>
<published>2008-03-17T08:04:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c0715b44c9330454e7f8a1b271f5f6e1ed849614'/>
<id>urn:sha1:c0715b44c9330454e7f8a1b271f5f6e1ed849614</id>
<content type='text'>
If subbuf_pages was larger than the max number of pages the pipe
buffer will hold, subbuf_splice_actor() would happily go beyond
the array size.

Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.</title>
<updated>2008-03-24T18:48:33Z</updated>
<author>
<name>Dave Young</name>
<email>hidave.darkstar@gmail.com</email>
</author>
<published>2008-02-01T02:33:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=981a64f60d0cb62846ae5a7d5cd27851dddfbed9'/>
<id>urn:sha1:981a64f60d0cb62846ae5a7d5cd27851dddfbed9</id>
<content type='text'>
Jens Axboe noticed that we were queueing &amp;conn-&gt;work on both btaddconn
and keventd_wq.

Signed-off-by: Dave Young &lt;hidave.darkstar@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>SCSI advansys: Fix bug in AdvLoadMicrocode</title>
<updated>2008-03-24T18:48:30Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>matthew@wil.cx</email>
</author>
<published>2008-03-21T15:15:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=868145d87bd058f271921c69519d4e2311171d28'/>
<id>urn:sha1:868145d87bd058f271921c69519d4e2311171d28</id>
<content type='text'>
commit: 951b62c11e86acf8c55d9828aa8c921575023c29

buf[i] can be up to 0xfd, so doubling it and assigning the result to an
unsigned char truncates the value.  Just use an unsigned int instead;
it's only a temporary.

Signed-off-by: Matthew Wilcox &lt;willy@linux.intel.com&gt;
Signed-off-by: James Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>async_tx: avoid the async xor_zero_sum path when src_cnt &gt; device-&gt;max_xor</title>
<updated>2008-03-24T18:48:26Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2008-03-19T04:40:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d51c29a9ccf12c169c13d155a22b5e683280604'/>
<id>urn:sha1:5d51c29a9ccf12c169c13d155a22b5e683280604</id>
<content type='text'>
commit: 8d8002f642886ae256a3c5d70fe8aff4faf3631a

If the channel cannot perform the operation in one call to
-&gt;device_prep_dma_zero_sum, then fallback to the xor+page_is_zero path.
This only affects users with arrays larger than 16 devices on iop13xx or
32 devices on iop3xx.

Cc: &lt;stable@kernel.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>aio: bad AIO race in aio_complete() leads to process hang</title>
<updated>2008-03-24T18:48:19Z</updated>
<author>
<name>Quentin Barnes</name>
<email>qbarnes+linux@yahoo-inc.com</email>
</author>
<published>2008-03-20T02:45:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0db49fc729eee503836ea12745b55f7f802d2abb'/>
<id>urn:sha1:0db49fc729eee503836ea12745b55f7f802d2abb</id>
<content type='text'>
commit: 6cb2a21049b8990df4576c5fce4d48d0206c22d5

My group ran into a AIO process hang on a 2.6.24 kernel with the process
sleeping indefinitely in io_getevents(2) waiting for the last wakeup to come
and it never would.

We ran the tests on x86_64 SMP.  The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe").  On the Xeon, the L2 cache isn't
shared between all eight processors, but is L2 is shared between between all
two processors on the Core2Duo we use.

My analysis of the hang is if you go down to the second while-loop
in read_events(), what happens on processor #1:
	1) add_wait_queue_exclusive() adds thread to ctx-&gt;wait
	2) aio_read_evt() to check tail
	3) if aio_read_evt() returned 0, call [io_]schedule() and sleep

In aio_complete() with processor #2:
	A) info-&gt;tail = tail;
	B) waitqueue_active(&amp;ctx-&gt;wait)
	C) if waitqueue_active() returned non-0, call wake_up()

The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.

The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors.  Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx-&gt;wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx-&gt;wait, but sees no one waiting, skips wakeup
                so proc 1 sleeps indefinitely

My patch adds a memory barrier between steps A and B.  It ensures that the
update in step 1 gets seen on processor 2 before continuing.  If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).

Before the patch our AIO process would hang virtually 100% of the time.  After
the patch, we have yet to see the process ever hang.

Signed-off-by: Quentin Barnes &lt;qbarnes+linux@yahoo-inc.com&gt;
Reviewed-by: Zach Brown &lt;zach.brown@oracle.com&gt;
Cc: Benjamin LaHaise &lt;bcrl@kvack.org&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
  coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>jbd: correctly unescape journal data blocks</title>
<updated>2008-03-24T18:48:06Z</updated>
<author>
<name>Duane Griffin</name>
<email>duaneg@dghda.com</email>
</author>
<published>2008-03-20T02:45:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d447e76ecaa4d4bb005eef4de5655a8418a4d60d'/>
<id>urn:sha1:d447e76ecaa4d4bb005eef4de5655a8418a4d60d</id>
<content type='text'>
commit: 439aeec639d7c57f3561054a6d315c40fd24bb74

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin &lt;duaneg@dghda.com&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>jbd2: correctly unescape journal data blocks</title>
<updated>2008-03-24T18:48:03Z</updated>
<author>
<name>Duane Griffin</name>
<email>duaneg@dghda.com</email>
</author>
<published>2008-03-20T02:45:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ecfdfeaf6210f17b93f4130845a9349ab893196'/>
<id>urn:sha1:9ecfdfeaf6210f17b93f4130845a9349ab893196</id>
<content type='text'>
commit: d00256766a0b4f1441931a7f569a13edf6c68200

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin &lt;duaneg@dghda.com&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
</feed>
