|
It seems that all the list_*_rcu primitives are missing a memory barrier
on the very first dereference. For example,
	#define list_for_each_rcu(pos, head) \
		for (pos = (head)->next; prefetch(pos->next), pos != (head); \
			pos = rcu_dereference(pos->next))

It will go something like:

	pos = (head)->next
	prefetch(pos->next)
	pos != (head)
	do stuff

We're missing a barrier here.

	pos = rcu_dereference(pos->next)
		fetch pos->next
		barrier given by rcu_dereference(pos->next)
		store pos
Without the missing barrier, the pos->next value may turn out to be stale.
In fact, if "do stuff" were also dereferencing pos and relying on
list_for_each_rcu to provide the barrier then it may also break.
So here is a patch to make sure that we have a barrier for the first
element in the list.
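For illustration, the fix presumably amounts to performing the very first
fetch through rcu_dereference() as well, roughly:

	#define list_for_each_rcu(pos, head) \
		for (pos = rcu_dereference((head)->next); \
			prefetch(pos->next), pos != (head); \
			pos = rcu_dereference(pos->next))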
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
* RCU versions of the hlist primitives (hlist_*_rcu)
* fib_alias: partial RCU port, just what's needed for now.
Signed-off-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This also changes the list_for_each_entry_safe_continue behaviour to match its
kerneldoc comment, that is, to start after the pos passed in.
It also adds several helper functions for previously open-coded fragments, making
the code clearer.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
|
|
Used in the dccp CCID3 code, which is going to be submitted RSN.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
2.6.12-rc6-mm1 has a few remaining synchronize_kernel()s, some (but not
all) in comments. This patch changes these synchronize_kernel() calls (and
comments) to synchronize_rcu() or synchronize_sched() as follows:
- arch/x86_64/kernel/mce.c mce_read(): change to synchronize_sched() to
handle races with machine-check exceptions (synchronize_rcu() would not cut
it given RCU implementations intended for hardcore realtime use).
- drivers/input/serio/i8042.c i8042_stop(): change to synchronize_sched() to
handle races with i8042_interrupt() interrupt handler. Again,
synchronize_rcu() would not cut it given RCU implementations intended for
hardcore realtime use.
- include/*/kdebug.h comments: change to synchronize_sched() to handle races
with NMIs. As before, synchronize_rcu() would not cut it...
- include/linux/list.h comment: change to synchronize_rcu(), since this
comment is for list_del_rcu() (the usual pattern is sketched after this list).
- security/keys/key.c unregister_key_type(): change to synchronize_rcu(),
since this is interacting with RCU read side.
- security/keys/process_keys.c install_session_keyring(): change to
synchronize_rcu(), since this is interacting with RCU read side.
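For reference, the list_del_rcu() pattern the list.h comment describes looks
roughly like this (a generic sketch; 'mylist_lock' and 'entry' are made-up
names, not code from this patch):

	spin_lock(&mylist_lock);
	list_del_rcu(&entry->list);	/* readers may still be walking 'entry' */
	spin_unlock(&mylist_lock);

	synchronize_rcu();		/* wait for pre-existing RCU readers */
	kfree(entry);			/* now no reader can still reference it */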
Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
The hlist_for_each_entry_rcu() comment block refers to a nonexistent
hlist_add_rcu() API; it needs to change to hlist_add_head_rcu().
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch changes list_for_each_xxx iterators
from:

	for (pos = (head)->next, prefetch(pos->next);
	     pos != (head);
	     pos = pos->next, prefetch(pos->next))

to:

	for (pos = (head)->next;
	     prefetch(pos->next), pos != (head);
	     pos = pos->next)
Reduces my vmlinux .text size by 4401 bytes.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch uses the rcu_assign_pointer() API to eliminate a number of explicit
memory barriers from code using RCU. This has been tested successfully on
i386 and ppc64.
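The shape of the conversion is roughly the following (a generic sketch;
'foo', 'gp' and 'newp' are made-up names, not code from the patch):

	struct foo {
		int a;
	};
	static struct foo *gp;		/* pointer published to RCU readers */

	/* Before: explicit barrier between initialization and publication. */
	newp->a = 1;
	smp_wmb();
	gp = newp;

	/* After: rcu_assign_pointer() supplies the ordering and documents
	 * which pointer is RCU-protected; readers pair it with
	 * rcu_dereference(). */
	newp->a = 1;
	rcu_assign_pointer(gp, newp);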
Signed-off-by: <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch adds list_replace_rcu() to include/linux/list.h for
atomic update operations according to the RCU model.

	void list_replace_rcu(struct list_head *old, struct list_head *new)

The 'old' element is detached from the linked list and the 'new' element
is inserted at the same point, safely with respect to concurrent RCU readers.
This patch is necessary for the performance improvement of SELinux.
See, http://lkml.org/lkml/2004/8/16/54
(Subject: RCU issue with SELinux)
http://lkml.org/lkml/2004/8/30/63
(Subject: [PATCH]SELinux performance improvement by RCU)
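The idea is roughly the following: the new element is fully initialized and
then linked in where the old one was, so concurrent readers see either the
old or the new element, never a half-updated list (a sketch, not necessarily
the exact code that was merged):

	static inline void list_replace_rcu(struct list_head *old,
					    struct list_head *new)
	{
		new->next = old->next;
		new->prev = old->prev;
		smp_wmb();		/* publish new's pointers before linking it in */
		new->next->prev = new;
		new->prev->next = new;
		/* old->next is left intact so readers currently positioned
		 * on 'old' can still walk forward; 'old' itself may only be
		 * freed after a grace period. */
	}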
Signed-off-by: KaiGai, Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Use the abstracted RCU API to dereference RCU-protected data. It hides the
barrier details. Patch from Paul McKenney.
This patch introduces an rcu_dereference() macro that replaces most uses of
smp_read_barrier_depends(). The new macro has the advantage of explicitly
documenting which pointers are protected by RCU -- in contrast, it is
sometimes difficult to figure out which pointer is being protected by a given
smp_read_barrier_depends() call.
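The change is of this general shape (a generic sketch; 'gp', 'p' and
do_something() are made-up names):

	/* Both fragments run inside an RCU read-side critical section. */

	/* Before: a bare dependency barrier; it is not obvious which
	 * pointer it is ordering against. */
	p = gp;
	smp_read_barrier_depends();
	do_something(p->a);

	/* After: the barrier is bundled with the load of the protected
	 * pointer, so the protected pointer is self-documenting. */
	p = rcu_dereference(gp);
	do_something(p->a);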
Signed-off-by: Paul McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into nuts.davemloft.net:/disk1/BK/net-2.6
|
|
Convert the bridge forwarding database over to using RCU.
This avoids a read_lock and atomic_inc/dec in the fast path
of output.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Make hlist_for_each_safe generate better code (the same as
hlist_for_each_entry_safe). Get rid of the comment about prefetch, because
that was fixed a while ago. The only current user of this is the bridge
code, which I maintain.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
- missing ; between default: and } in sun4setup.c
- cast of pointer to unsigned long long instead of unsigned long in
x86_64 signal.c
- missed annotations for ioctl structure in sparc64 openpromio.h
(should've been in the same patch as the rest of drivers/sbus/*
annotations)
- 0->NULL in list.h and pmdisk.c
|
|
From: "Pedro Emanuel M. D. Pinto" <pepinto@student.dei.uc.pt>
This currently-unused function is incorrectly implemented. Fix.
|
|
- s/__inline__/inline/
- Remove lots of extraneous andi-was-here trailing whitespace
|
|
From: "Paul E. McKenney" <paulmck@us.ibm.com>
The attached patch improves the documentation of the _rcu list primitives.
|
|
correctly handle the "restart from this device/driver" case, and
caused oopses with ieee1394.
This just uses "list_for_each_entry_continue()" instead.
Add a helper macro to make usage of "list_for_each_entry_continue()"
a bit more readable.
|
|
From: Ingo Molnar <mingo@elte.hu>
I'd also suggest the following patch, to clarify the use of
unsynchronized list_empty(). list_empty_careful() can only be safe in the
very specific case of "one-shot" list entries which might be removed by
another CPU (but nothing else can happen to them, and this is their only
final state). list_empty_careful() is otherwise completely unsynchronized
at both the compiler and CPU level and is not 'SMP safe' in any way.
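For reference, list_empty_careful() amounts to checking both link pointers,
roughly as in list.h of this era:

	/*
	 * Only safe when the sole concurrent operation that can happen to
	 * the entry is list_del_init() on another CPU; any other concurrent
	 * modification makes the result meaningless.
	 */
	static inline int list_empty_careful(const struct list_head *head)
	{
		struct list_head *next = head->next;
		return (next == head) && (next == head->prev);
	}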
|
|
corruption on SMP because of another CPU still accessing a waitqueue
even after it was de-allocated.
Use a careful version of the list emptiness check to make sure we
don't de-allocate the stack frame before the waitqueue is all done.
|
|
From: "Perez-Gonzalez, Inaky" <inaky.perez-gonzalez@intel.com>
|
|
From: Mitchell Blank Jr <mitch@sfgoth.com>
Make some more of the hlist functions accept constant arguments.
|
|
From: Michael Still <mikal@stillhq.com>
The patch squelches build errors in the kernel-doc make targets by adding
documentation for arguments previously not documented, and updating the
argument names where they have changed.
|
|
|
as we delete the entry, we can only poison the back pointer, not the
traversal pointer (rcu traversal only ever walks forward).
Make __d_drop() take this into account.
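For context, the hlist deletion primitive therefore looks roughly like this:
the forward pointer is left alone for readers still traversing, and only the
back pointer gets a poison value:

	static inline void hlist_del_rcu(struct hlist_node *n)
	{
		__hlist_del(n);			/* n->next stays valid for readers */
		n->pprev = LIST_POISON2;	/* only the back pointer is poisoned */
	}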
|
|
into home.transmeta.com:/home/torvalds/v2.5/linux
|
|
list pointers to give us a nice oops if somebody is doing something
bad.
Also introduce hlist_del_rcu_init() - same as hlist_del_init().
|
|
This changeset:
1. Implements hlist_add_after().
2. Uses prefetch in hlist_for_each, with a trick that ends up equivalent to
   having the prefetch instruction in the first block of the hlist_for_each
   for statement; the compiler optimizes the second "test" away, as its
   result is constant.
3. Implements hlist_for_each_entry and hlist_for_each_entry_safe, using a
   struct hlist_node as the iterator to avoid the extra branches that an
   implementation similar to list_for_each_entry would have if it used a
   typed iterator, while avoiding the explicit hlist_entry needed with
   hlist_for_each (a sketch of the resulting macro follows this list).
4. Converts the hlist_for_each users that had explicit prefetches, i.e.
   removes the now-redundant explicit prefetch calls.
5. Fixes a harmless list_entry use in an hlist_for_each in inode.c.
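A sketch of the resulting hlist_for_each_entry(), roughly as it ended up in
list.h - 'tpos' is the typed cursor, 'pos' is the struct hlist_node iterator,
and the ({ ...; 1; }) blocks keep the prefetch and the hlist_entry() inside
the loop condition:

	#define hlist_for_each_entry(tpos, pos, head, member)			\
		for (pos = (head)->first;					\
		     pos && ({ prefetch(pos->next); 1; }) &&			\
			({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \
		     pos = pos->next)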
|
|
From: Dipankar Sarma <dipankar@in.ibm.com>
This patch makes the list macros use the SMP-only version of the barriers;
there is no need to hurt UP performance.
|
|
This patch adds an include of stddef.h to include/linux/list.h,
since it uses the NULL define.
|
|
list.h must now include stddef.h since it uses NULL.
|
|
- The inode and dcache hash tables only need half the memory/cache because
of using hlists.
- Simplify the dcache-rcu code. With NULL end markers in the hlists,
is_bucket is not needed anymore. The list walking code also
generates better code on x86 now because it doesn't need to dedicate
a register to the list head.
- Reorganize struct dentry to be more cache friendly. All the state
accessed for the hash walk is now in one chunk, together with the
inline name (all at the end).
- Add prefetching for all the list walks. The old hash lookup code didn't
use it.
- Some other minor cleanup.
|
|
Now that the device-mapper has hit the tree there's no more reason
to keep the non-compiling LVM1 code and its various hacks
to other files around; this patch removes it.
|
|
This adds a set of list macros that make handling of lists protected
by RCU simpler (a short usage sketch follows the list). The interfaces added are:
list_add_rcu
list_add_tail_rcu
- Add an element, taking care of the memory barrier (wmb()).
list_del_rcu
- Delete an element, but don't re-initialize the pointers in
the element, in order to support RCU-based traversal.
list_for_each_rcu
__list_for_each_rcu
- Traverse an RCU-protected list, taking care of memory barriers
transparently.
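A minimal usage sketch under assumed names ('item', 'items', 'items_lock'
and item_exists() are made up for illustration):

	#include <linux/list.h>
	#include <linux/rcupdate.h>
	#include <linux/spinlock.h>

	struct item {
		struct list_head list;
		int key;
	};

	static LIST_HEAD(items);
	static spinlock_t items_lock = SPIN_LOCK_UNLOCKED;

	/* Writer side: updaters still serialize among themselves as usual. */
	static void add_item(struct item *new_item)
	{
		spin_lock(&items_lock);
		list_add_rcu(&new_item->list, &items);
		spin_unlock(&items_lock);
	}

	/* Reader side: no lock, just an RCU read-side critical section. */
	static int item_exists(int key)
	{
		struct list_head *pos;
		int found = 0;

		rcu_read_lock();
		list_for_each_rcu(pos, &items) {
			if (list_entry(pos, struct item, list)->key == key) {
				found = 1;
				break;
			}
		}
		rcu_read_unlock();
		return found;
	}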
|
|
It hasn't caught any bugs, and it is causing confusion over whether
this is a permanent part of list_del() behaviour.
|
|
This does the following things:
- removes the ->thread_group list and uses a new PIDTYPE_TGID pid class
to handle thread groups. This cleans up lots of code in signal.c and
elsewhere.
- fixes sys_execve() if a non-leader thread calls it. (2.5.38 crashed in
this case.)
- renames list_for_each_noprefetch to __list_for_each.
- cleans up delayed-leader parent notification.
- introduces link_pid() to optimize PIDTYPE_TGID installation in the
thread-group case.
I've tested the patch with a number of threaded and non-threaded
workloads, and it works just fine. Compiles & boots on UP and SMP x86.
The session/pgrp bugs reported to lkml are probably still open, they are
the next on my todo - now that we have a clean pidhash architecture they
should be easier to fix.
|
|
This is the latest version of the generic pidhash patch. The biggest
change is the removal of separately allocated pid structures: they are
now part of the task structure and the first task that uses a PID will
provide the pid structure. Task refcounting is used to avoid the
freeing of the task structure before every member of a process group or
session has exited.
This approach has a number of advantages besides the performance gains.
Besides simplifying the whole hashing code significantly, attach_pid()
is now fundamentally atomic and can be called during create_process()
without worrying about task-list side-effects. It does not have to
re-search the pidhash to find out about raced PID-adding either, and
attach_pid() cannot fail due to OOM. detach_pid() can do a simple
put_task_struct() instead of the kmem_cache_free().
The only minor downside is that task structures may linger after
session leaders or group leaders have exited - but the number of orphan
sessions and process groups is usually very low - and even if it's
higher, this can be regarded as a slow execution of the final
deallocation of the session leader, not as an additional burden.
|
|
This removes list_t, which is a gratuitous typedef for a "struct
list_head". Unless there is good reason, the kernel doesn't usually
typedef, as typedefs, unlike structs, cannot be predeclared.
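A quick illustration of the predeclaration point (register_hook() is a
made-up example): a struct type can be forward-declared where only a pointer
is needed, while a typedef name cannot be used without its full definition
being visible:

	/* Fine in a header that never includes list.h: */
	struct list_head;
	void register_hook(struct list_head *entry);

	/* Not possible with the typedef - any header mentioning list_t
	 * would have had to pull in the definition of list_t itself. */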
|
|
This adds list_for_each_entry, which is the equivalent of list_for_each
and list_entry, except only one variable is needed.
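Roughly, the conversion it enables looks like this ('foo', 'foo_list' and
use_foo() are made-up names):

	struct foo *f;
	struct list_head *pos;

	/* Before: a raw cursor plus a list_entry() in every iteration. */
	list_for_each(pos, &foo_list) {
		f = list_entry(pos, struct foo, list);
		use_foo(f);
	}

	/* After: the typed object itself is the iterator. */
	list_for_each_entry(f, &foo_list, list)
		use_foo(f);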
|
|
There are a few VM-related patches in this series. Mainly fixes;
feature work is on hold.
We have some fairly serious locking contention problems with the reverse
mapping's pte_chains. Until we have a clear way out of that I believe
that it is best to not merge code which has a lot of rmap dependency.
It is apparent that these problems will not be solved by tweaking -
some redesign is needed. In the 2.5 timeframe the only practical
solution appears to be page table sharing, based on Daniel's February
work. Daniel and Dave McCracken are working on that.
Some bits and pieces here:
- list_splice() has an open-coded list_empty() in it. Use
list_empty() instead.
- in shrink_cache() we have a local `nr_pages' which shadows another
local. Rename the inner one. (Nikita Danilov)
- Add a BUG() on a can't-happen code path in page_remove_rmap().
- Tighten up the bug checks in the BH completion handlers - if the
buffer is still under IO then it must be locked, because we unlock it
inside the page_uptodate_lock.
|
|
Define container_of(), which casts from a member to its containing struct
with some type checking. This is much like list_entry but is clearly for
things other than lists. list_entry now uses container_of.
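Roughly, the definition and the resulting list_entry() wrapper look like:

	/* The dummy member pointer gives a degree of type checking. */
	#define container_of(ptr, type, member) ({			\
		const typeof(((type *)0)->member) *__mptr = (ptr);	\
		(type *)((char *)__mptr - offsetof(type, member)); })

	#define list_entry(ptr, type, member) \
		container_of(ptr, type, member)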
|