user/sven/linux.git, branch v2.6.24.4

Linux 2.6.24.4

2008-03-24T18:49:18Z

S390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.

2008-03-24T18:48:36Z

a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and robust functionality" introduces a test of whether the futex in-atomic operations work. It does that by writing to address 0 of the kernel address space. This crashes on older machines where addressing mode switching is enabled but the mvcos instruction is not available: page table walking is then done by hand, so the code tries to access current->mm, which is NULL. Therefore add an extra check so that we survive the early test.
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
Cc: Thomas Gleixner
Signed-off-by: Chris Wright

slab: NUMA slab allocator migration bugfix

2008-03-24T18:48:36Z

NUMA slab allocator cpu migration bugfix
The NUMA slab allocator (specifically, cache_alloc_refill) does not refresh its local copies of which cpu and which numa node it is on when it drops and reacquires the irq block that it inherited from its caller. As a result, those values become invalid if an attempt to migrate the process to another numa node occurred while the irq block was dropped.
The solution is to make cache_alloc_refill reload these variables whenever it drops and reacquires the irq block.
The error is very difficult to hit. When it does occur, one gets the following oops + stack traceback bits in check_spinlock_acquired:
    kernel BUG at mm/slab.c:2417
    cache_alloc_refill+0xe6
    kmem_cache_alloc+0xd0
    ...
This patch was developed against 2.6.23, then ported to and compile-tested only against 2.6.25-rc4.
Signed-off-by: Joe Korty
Signed-off-by: Christoph Lameter
Signed-off-by: Chris Wright

relay: fix subbuf_splice_actor() adding too many pages

2008-03-24T18:48:35Z

If subbuf_pages was larger than the max number of pages the pipe buffer will hold, subbuf_splice_actor() would happily go beyond the array size.
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Chris Wright
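
The overrun described above can be illustrated with a minimal user-space sketch. It is not the relay code: `SKETCH_PIPE_BUFFERS`, `struct sketch_spd` and `fill_pages()` are hypothetical names, and the capacity of 16 is an assumption chosen for illustration. The point is only that the page count is clamped to the array capacity before the array is filled.

```c
/* Hypothetical capacity of the fixed-size page array (assumed value). */
#define SKETCH_PIPE_BUFFERS 16

/* Hypothetical stand-in for a splice descriptor with a fixed page array. */
struct sketch_spd {
    int pages[SKETCH_PIPE_BUFFERS];
    unsigned int nr_pages;
};

/*
 * Fill at most SKETCH_PIPE_BUFFERS entries, however many pages the
 * sub-buffer spans. Without the clamp, a large subbuf_pages would
 * write past the end of spd->pages.
 */
static unsigned int fill_pages(struct sketch_spd *spd, unsigned int subbuf_pages)
{
    unsigned int nr = subbuf_pages < SKETCH_PIPE_BUFFERS
                          ? subbuf_pages
                          : SKETCH_PIPE_BUFFERS;

    for (unsigned int i = 0; i < nr; i++)
        spd->pages[i] = (int)i;   /* placeholder for a real page pointer */

    spd->nr_pages = nr;
    return nr;
}
```

A caller that needs more pages than fit in one pass would have to loop; the sketch only shows the bound.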

BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.

2008-03-24T18:48:33Z

Jens Axboe noticed that we were queueing &conn->work on both btaddconn and keventd_wq.
Signed-off-by: Dave Young
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Chris Wright

SCSI advansys: Fix bug in AdvLoadMicrocode

2008-03-24T18:48:30Z

commit: 951b62c11e86acf8c55d9828aa8c921575023c29
buf[i] can be up to 0xfd, so doubling it and assigning the result to an unsigned char truncates the value. Just use an unsigned int instead; it's only a temporary.
Signed-off-by: Matthew Wilcox
Signed-off-by: James Bottomley
Signed-off-by: Chris Wright
Signed-off-by: Greg Kroah-Hartman
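
The truncation can be reproduced in isolation. The sketch below is not the driver code; `doubled_narrow()` and `doubled_wide()` are hypothetical helpers that only contrast the two temporary types, assuming an 8-bit unsigned char.

```c
/*
 * Doubling into an unsigned char: the multiplication itself is done
 * in int (integer promotion), but storing the result back into an
 * 8-bit type keeps only the low 8 bits.
 */
static unsigned int doubled_narrow(unsigned char b)
{
    unsigned char len = (unsigned char)(b * 2);
    return len;
}

/* Doubling into an unsigned int keeps the full value. */
static unsigned int doubled_wide(unsigned char b)
{
    unsigned int len = b * 2u;
    return len;
}
```

For the worst case named above, 0xfd * 2 is 0x1fa; the narrow temporary holds 0xfa, while the wide one holds 0x1fa.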

async_tx: avoid the async xor_zero_sum path when src_cnt > device->max_xor

2008-03-24T18:48:26Z

commit: 8d8002f642886ae256a3c5d70fe8aff4faf3631a
If the channel cannot perform the operation in one call to ->device_prep_dma_zero_sum, then fall back to the xor+page_is_zero path. This only affects users with arrays larger than 16 devices on iop13xx or 32 devices on iop3xx.
Cc: Neil Brown
Signed-off-by: Dan Williams
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright
Signed-off-by: Greg Kroah-Hartman
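
The path selection can be sketched as a small decision function. This is an illustration under stated assumptions, not the async_tx code: the enum, `choose_zero_sum_path()` and its parameters are hypothetical, with `max_xor` standing in for device->max_xor.

```c
/* Hypothetical labels for the two paths named in the message above. */
enum sketch_zero_sum_path {
    SKETCH_ASYNC_ZERO_SUM,      /* one ->device_prep_dma_zero_sum call */
    SKETCH_XOR_THEN_PAGE_IS_ZERO /* xor + page_is_zero fallback */
};

/*
 * Use the offloaded zero-sum only when a channel exists and all
 * sources fit into a single operation; otherwise fall back.
 */
static enum sketch_zero_sum_path
choose_zero_sum_path(int have_channel, unsigned int src_cnt, unsigned int max_xor)
{
    if (have_channel && src_cnt <= max_xor)
        return SKETCH_ASYNC_ZERO_SUM;
    return SKETCH_XOR_THEN_PAGE_IS_ZERO;
}
```

With the limits quoted above, a 17-source request against a limit of 16 takes the fallback, while a 16-source request does not.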

aio: bad AIO race in aio_complete() leads to process hang

2008-03-24T18:48:19Z

commit: 6cb2a21049b8990df4576c5fce4d48d0206c22d5
My group ran into an AIO process hang on a 2.6.24 kernel, with the process sleeping indefinitely in io_getevents(2) waiting for a last wakeup that never came.
We ran the tests on x86_64 SMP. The hang only occurred on a Xeon box ("Clovertown") but not on a Core2Duo ("Conroe"). On the Xeon, the L2 cache isn't shared between all eight processors, but the L2 is shared between both processors on the Core2Duo we use.
My analysis of the hang: if you go down to the second while-loop in read_events(), this is what happens on processor #1:
    1) add_wait_queue_exclusive() adds thread to ctx->wait
    2) aio_read_evt() to check tail
    3) if aio_read_evt() returned 0, call [io_]schedule() and sleep
In aio_complete() with processor #2:
    A) info->tail = tail;
    B) waitqueue_active(&ctx->wait)
    C) if waitqueue_active() returned non-0, call wake_up()
The way the code is written, step 1 must be seen by all other processors before processor 1 checks for pending events in step 2 (that were recorded by step A), and step A by processor 2 must be seen by all other processors (checked in step 2) before step B is done.
The race I believe I was seeing is that steps 1 and 2 were effectively swapped due to the __list_add() being delayed by the L2 cache not shared by some of the other processors. Imagine:
    proc 2: just before step A
    proc 1, step 1: adds to ctx->wait, but is not visible by other processors yet
    proc 1, step 2: checks tail and sees no pending events
    proc 2, step A: updates tail
    proc 1, step 3: calls [io_]schedule() and sleeps
    proc 2, step B: checks ctx->wait, but sees no one waiting, skips wakeup
    so proc 1 sleeps indefinitely
My patch adds a memory barrier between steps A and B. It ensures that the update in step 1 gets seen on processor 2 before continuing. If processor 1 was just before step 1, the memory barrier makes sure that step A (update tail) gets seen by the time processor 1 makes it to step 2 (check tail).
Before the patch our AIO process would hang virtually 100% of the time. After the patch, we have yet to see the process ever hang.
Signed-off-by: Quentin Barnes
Reviewed-by: Zach Brown
Cc: Benjamin LaHaise
Cc: Nick Piggin
Signed-off-by: Andrew Morton
[ We should probably disallow that "if (waitqueue_active()) wake_up()" coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds
Signed-off-by: Chris Wright
Signed-off-by: Greg Kroah-Hartman
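
The ordering requirement in steps 1-2 and A-C can be sketched in user space with C11 atomics. This is an analogy, not the kernel code: `struct ctx_sketch`, `waiter_must_sleep()` and `complete_one()` are hypothetical, a boolean stands in for a non-empty ctx->wait, a counter stands in for wake_up(), and `atomic_thread_fence(memory_order_seq_cst)` stands in for a full memory barrier. The fence on the waiter side is an assumption of the sketch; the message above only describes adding a barrier on the completion side.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct ctx_sketch {
    atomic_uint tail;        /* stands in for info->tail */
    atomic_bool has_waiter;  /* stands in for a non-empty ctx->wait */
    atomic_uint wakeups;     /* counts wake_up() calls that would be issued */
};

/*
 * Processor 1, steps 1 and 2: publish ourselves as a waiter, then
 * check the tail. Returns true if no event is pending and the caller
 * would go to sleep (step 3).
 */
static bool waiter_must_sleep(struct ctx_sketch *c, unsigned int head)
{
    atomic_store_explicit(&c->has_waiter, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&c->tail, memory_order_relaxed) == head;
}

/*
 * Processor 2, steps A to C: publish the new tail, then check for a
 * waiter. The fence between the store and the load is what keeps the
 * waiter check from being satisfied with a stale "no waiter" value.
 */
static void complete_one(struct ctx_sketch *c, unsigned int new_tail)
{
    atomic_store_explicit(&c->tail, new_tail, memory_order_relaxed);  /* A */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&c->has_waiter, memory_order_relaxed))   /* B */
        atomic_fetch_add_explicit(&c->wakeups, 1, memory_order_relaxed); /* C */
}
```

With both fences in place, at least one side observes the other's store: either the waiter sees the new tail and does not sleep, or the completer sees the waiter and issues the wakeup.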

jbd: correctly unescape journal data blocks

2008-03-24T18:48:06Z

commit: 439aeec639d7c57f3561054a6d315c40fd24bb74
Fix a long-standing typo (predating git) that will cause data corruption if a journal data block needs unescaping. At the moment the wrong buffer head's data is being unescaped.
To test this case, mount a filesystem with data=journal, start creating and deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then pull the plug on the device. Without this patch the files will contain zeros instead of the correct data after recovery.
Signed-off-by: Duane Griffin
Acked-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Chris Wright
Signed-off-by: Greg Kroah-Hartman
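
For readers unfamiliar with escaping, the sketch below shows the general idea under stated assumptions: a data block whose first four bytes equal the journal magic number has those bytes zeroed when written to the journal, and the magic is restored when the block is written back. The helpers `escape_block()` and `unescape_block()` are hypothetical, and the big-endian on-disk encoding is an assumption of the sketch; none of this is the jbd code itself.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAGIC 0xc03b3998U  /* value quoted in the message above */

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a 32-bit value to a byte buffer in big-endian order. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Zero the leading magic, if present; report whether the block was escaped. */
static bool escape_block(unsigned char *block)
{
    if (load_be32(block) != SKETCH_MAGIC)
        return false;
    memset(block, 0, 4);
    return true;
}

/* Restore the magic in the block that was actually escaped. */
static void unescape_block(unsigned char *block, bool escaped)
{
    if (escaped)
        store_be32(block, SKETCH_MAGIC);
}
```

If the restore is applied to a different buffer than the one that was escaped, the escaped block keeps its zeroed prefix, which matches the symptom described above.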

jbd2: correctly unescape journal data blocks

2008-03-24T18:48:03Z

commit: d00256766a0b4f1441931a7f569a13edf6c68200
Fix a long-standing typo (predating git) that will cause data corruption if a journal data block needs unescaping. At the moment the wrong buffer head's data is being unescaped.
To test this case, mount a filesystem with data=journal, start creating and deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then pull the plug on the device. Without this patch the files will contain zeros instead of the correct data after recovery.
Signed-off-by: Duane Griffin
Acked-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Chris Wright
Signed-off-by: Greg Kroah-Hartman