| Age | Commit message | Author |
|
sk_alloc_slab becomes proto_register, which receives a struct proto that need not be
completely filled in, but must at least have the proto name, owner and obj_size (aka the
proto-specific sock size). With this we can remove the struct sock sk_owner and sk_slab
members, using sk->sk_prot->{owner,slab} instead.
This patch also makes sk_set_owner unnecessary, as at sk_alloc time we now have
access to the struct proto owner and slab members, so we can bump the module
refcount exactly at sock allocation time.
Another nice "side effect" is that this patch removes the generic sk_cachep slab
cache; the last two protocols that used it now use plain kmalloc, specifying a
struct proto obj_size equal to sizeof(struct sock).
Ah, almost forgot: with this patch it is very easy to use a slab cache, as it is
now created at proto_register time and all protocols need to use proto_register,
so it's just a matter of switching the second parameter of proto_register to '1'. Heck,
this could even be done at module load time with a small additional patch.
Another optimization that will be possible in the future is to move the sk_protocol
and sk_type struct sock members to struct proto, but this has to wait until all protocols
have moved completely to sk_prot.
This changeset also introduces /proc/net/protocols, which lists details of the
registered protocols. Some may seem excessive, but I'd like to keep them while working
on further struct sock hierarchy changes, and also to see which protocols are old
ones, i.e. still use struct proto_ops, etc. Yeah, that is a bit of an exaggeration, as
all protos still use struct proto_ops, but in time the idea is to move them all to
sk->sk_prot and make the proto_ops infrastructure shared among all protos,
removing one level of indirection.
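The idea above can be sketched in a minimal userspace analogue (names and
structures simplified, not the actual kernel definitions): the protocol hands
proto_register a struct proto carrying name, owner and obj_size, and sk_alloc
pins the owning module at allocation time instead of via a later sk_set_owner:

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures involved. */
struct module { int refcnt; };

struct proto {
    const char    *name;
    struct module *owner;
    size_t         obj_size;   /* proto-specific sock size */
};

struct sock {
    struct proto *sk_prot;     /* replaces sk_owner/sk_slab */
};

/* sk_alloc: bump the module refcount exactly at allocation time,
 * using sk->sk_prot->owner instead of a separate sk_set_owner call. */
static struct sock *sk_alloc(struct proto *prot)
{
    struct sock *sk = calloc(1, prot->obj_size);
    if (!sk)
        return NULL;
    sk->sk_prot = prot;
    prot->owner->refcnt++;     /* module_get(prot->owner) in the kernel */
    return sk;
}

static void sk_free(struct sock *sk)
{
    sk->sk_prot->owner->refcnt--;  /* module_put() */
    free(sk);
}
```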
Signed-off-by: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
into sunset.davemloft.net:/home/davem/src/BK/acme-2.6
|
|
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replacing the open coded equivalent.
Signed-off-by: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
into nuts.davemloft.net:/disk1/BK/net-2.6.12
|
|
|
|
|
|
More of the Guninski "copy_to_user() takes a size_t" series.
|
|
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It should be unsigned long.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Required to introduce struct connection_sock.
Signed-off-by: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
When drivers other than loopback were using the LLTX
feature, a race window was present. While sending
queued packets, the packet scheduler layer drops the
queue lock, then calls directly into the driver's xmit
handler. The driver then grabs its private TX lock
and goes to work.
However, as soon as we've dropped the queue lock, another
thread doing TX processing for that card can execute
a netif_stop_queue() due to the TX queue filling up.
This race window causes problems because a properly coded
driver should never end up in its ->hard_start_xmit()
handler if the queue on the device has been stopped, and
we even BUG() trap for this condition in all of the device
drivers. That is how this race window was discovered
by Roland and the Infiniband folks.
Various suggestions were made to close this race. One
of them involved holding onto the queue lock all the
way into the ->hard_start_xmit() routine, with the
driver dropping that lock only after taking its private
TX lock. This solution was deemed grotty because it is
not wise to put queueing discipline internals into the
device drivers.
The solution taken here, which is based upon ideas from
Stephen Hemminger, is twofold:
1) Leave LLTX around for purely software devices that
need no locking at all for TX processing. The existing
example is loopback, although all tunnel devices could
be converted this way too.
2) Stop trying to use LLTX for the other devices, and instead
achieve the same goal using a different mechanism.
For #2, what we were trying to achieve with LLTX
was to eliminate excess locking. We accomplish that
now by letting the device driver use dev->xmit_lock directly
instead of a separate priv->tx_lock of some sort.
In order to allow that, we had to turn dev->xmit_lock into
a hardware-IRQ disabling lock instead of a BH disabling one.
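The invariant being restored can be shown in a toy sketch (illustrative names,
not the kernel API): with a single xmit lock shared between the queueing layer
and the driver, the "is the queue stopped?" check and the transmit itself form
one atomic unit, so the BUG() condition can never trigger:

```c
#include <stdbool.h>

/* Toy device; in the kernel, all of this runs under dev->xmit_lock. */
struct net_device {
    bool queue_stopped;   /* set by netif_stop_queue() */
    int  tx_ring_free;    /* free TX descriptors */
    int  tx_count;        /* packets handed to hardware */
};

static void netif_stop_queue(struct net_device *dev)
{
    dev->queue_stopped = true;
}

/* Called with the xmit lock held (hardware-IRQ disabling in the real
 * kernel); nobody can stop the queue between the check and the send. */
static int hard_start_xmit(struct net_device *dev)
{
    if (dev->queue_stopped)
        return -1;                /* never reached by a correct caller */
    dev->tx_count++;
    if (--dev->tx_ring_free == 0)
        netif_stop_queue(dev);    /* ring full: stop before unlocking */
    return 0;
}
```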
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the new lock initializers DEFINE_SPINLOCK and DEFINE_RWLOCK
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
Signed-off-by: walter harms <wharms@bfs.de>
Signed-off-by: Maximilian Attems <janitor@sternwelten.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
From: <tglx@linutronix.de>
To make spinlock/rwlock initialization consistent all over the kernel,
this patch converts explicit lock-initializers into spin_lock_init() and
rwlock_init() calls.
Currently, spinlocks and rwlocks are initialized in two different ways:
lock = SPIN_LOCK_UNLOCKED
spin_lock_init(&lock)
rwlock = RW_LOCK_UNLOCKED
rwlock_init(&rwlock)
This patch converts all explicit lock initializations to
spin_lock_init() or rwlock_init(). (Besides consistency, this also helps
automatic lock validators and debugging code.)
The conversion was done with a script, verified manually, and
reviewed, compiled and tested as far as possible on x86, ARM and PPC.
There is no runtime overhead or actual code change resulting from this
patch, because spin_lock_init() and rwlock_init() are macros and are
thus equivalent to the explicit initialization method.
That's the second batch of the unifying patches.
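A minimal sketch (a toy lock, not the kernel's spinlock) of why the init *call*
is preferred over the bare assignment: the function form can grow validator or
debug bookkeeping without touching every call site, which is exactly what helps
automatic lock validators:

```c
/* Toy lock with a debug field, standing in for spinlock_t. */
struct toy_lock { int locked; int magic; };

#define TOY_LOCK_MAGIC    0x5a5a
#define TOY_LOCK_UNLOCKED { 0, TOY_LOCK_MAGIC }   /* like SPIN_LOCK_UNLOCKED */

/* Like spin_lock_init(): one place to add validator bookkeeping. */
static void toy_lock_init(struct toy_lock *l)
{
    l->locked = 0;
    l->magic  = TOY_LOCK_MAGIC;
}
```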
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is the start of a series of patches to remove protocol
specific stuff out of include/linux/skbuff.h and to make the
struct sk_buff header pointers private, i.e. they will only
be accessible through foo_hdr(skb) and some other accessor
functions.
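The accessor style the series moves toward can be sketched like this (toy
structures, not the real sk_buff layout): callers get the header through a
helper instead of poking the pointer fields directly, so the fields can later
be made private:

```c
#include <stdint.h>

struct iphdr_toy { uint8_t version; };

/* Toy sk_buff: nh is the network-header pointer the series
 * intends to hide behind accessors. */
struct sk_buff {
    unsigned char *head;
    unsigned char *nh;
};

/* Accessor in the foo_hdr(skb) style. */
static struct iphdr_toy *ip_hdr(const struct sk_buff *skb)
{
    return (struct iphdr_toy *)skb->nh;
}
```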
Signed-off-by: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Based upon work by Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now there is no reason for any neigh implementation
to know the value of {P,}NEIGH_HASHMASK
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This way no code actually needs to traverse the
neigh hash tables outside of net/core/neighbour.c
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
|
|
So here is a patch to make sure that there is a barrier between the
reading of dev->*_ptr and *dev->neigh_parms.
With these barriers in place, it's clear that *dev->neigh_parms can no
longer be NULL since once the parms are allocated, that pointer is never
reset to NULL again. Therefore I've also removed the parms check in
these paths.
They were bogus to begin with since if they ever triggered then we'll
have dead neigh entries stuck in the hash table.
Unfortunately I couldn't arrange for this to happen with DECnet due
to the dn_db->parms.up() call that's sandwiched between the assignment
of dev->dn_ptr and dn_db->neigh_parms. So I've kept the parms check
there but it will now fail instead of continuing. I've also added an
smp_wmb() there so that at least we won't be reading garbage from
dn_db->neigh_parms.
DECnet is also buggy since there is no locking at all in the destruction
path. It either needs locking or RCU like IPv4.
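The ordering the barriers enforce can be sketched with C11 atomics standing in
for smp_wmb()/smp_rmb() (toy structures, not the DECnet ones): the parms must
be fully initialized before the device pointer is published, and a reader who
sees the pointer is then guaranteed to see initialized parms:

```c
#include <stdatomic.h>
#include <stddef.h>

struct neigh_parms { int up; };

struct dn_dev { struct neigh_parms parms; };

struct net_device {
    _Atomic(struct dn_dev *) dn_ptr;
};

/* Writer: set up parms, then publish with release semantics
 * (the smp_wmb() side of the pairing). */
static void publish(struct net_device *dev, struct dn_dev *db)
{
    db->parms.up = 1;
    atomic_store_explicit(&dev->dn_ptr, db, memory_order_release);
}

/* Reader: acquire load pairs with the release store (the smp_rmb()
 * side); once dn_ptr is seen non-NULL, parms is initialized. */
static struct neigh_parms *lookup(struct net_device *dev)
{
    struct dn_dev *db =
        atomic_load_explicit(&dev->dn_ptr, memory_order_acquire);
    return db ? &db->parms : NULL;
}
```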
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I've added a refcnt on neigh_parms as well as a dead flag. The latter
is checked under the tbl_lock before adding a neigh entry to the hash
table.
The non-trivial bit of the patch is the first chunk of net/core/neighbour.c.
I removed that line because not doing so would mean that I have to drop
the reference to the parms right there. That would've led to race
conditions since many places dereference neigh->parms without holding
locks. It's also unnecessary to reset n->parms since we're no longer
in a hurry to see it go due to the new ref counting.
You'll also notice that I've put all dereferences of dev->*_ptr under
the rcu_read_lock(). Without this we may get a neigh_parms that's
already been released.
Incidentally a lot of these places were racy even before the RCU change.
For example, in the IPv6 case neigh->parms may be set to a value that's
just been released.
Finally in order to make sure that all stale entries are purged as
quickly as possible I've added neigh_ifdown/arp_ifdown calls after
every neigh_parms_release call. In many cases we now have multiple
calls to neigh_ifdown in the shutdown path. I didn't remove the
earlier calls because there may be hidden dependencies for them to
be there. Once the respective maintainers have looked at them we
can probably remove most of them.
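The refcnt-plus-dead-flag scheme can be sketched in a toy form (not the actual
kernel structures or function names): a dying parms block is marked dead under
the table lock, new neigh entries refuse to attach to it, and the memory is
only reclaimed when the last reference drops:

```c
#include <stdbool.h>

struct parms {
    int  refcnt;
    bool dead;
    bool freed;    /* stands in for the memory being kfree()d */
};

static void parms_hold(struct parms *p) { p->refcnt++; }

static void parms_put(struct parms *p)
{
    if (--p->refcnt == 0)
        p->freed = true;   /* kfree() in the real code */
}

/* neigh-create path: the dead flag is checked (under tbl_lock in the
 * kernel) before the entry is added to the hash table. */
static int neigh_attach(struct parms *p)
{
    if (p->dead)
        return -1;         /* don't attach to a dying parms block */
    parms_hold(p);
    return 0;
}

/* neigh_parms_release path: mark dead, drop the table's reference. */
static void parms_release(struct parms *p)
{
    p->dead = true;
    parms_put(p);
}
```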
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Forgot to switch return type from ssize_t to int when switching to seq_file
Signed-off-by: Al Viro <viro@parcelfarce.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
into ppc970.osdl.org:/home/torvalds/v2.6/linux
|
|
Remove a whole bunch of prototypes which declare no-longer-present functions.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
- conversion to seq_file, overflow fixes
- qos_parse sanitized (3 sscanf calls instead of insane manual parsing)
- leaks plugged
- code cleaned up
We still have serious races, but they are a general problem in the atm code - it
has no locking whatsoever for any of the lists (mpcs, qos_head, per-client
lists).
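The kind of simplification meant by "3 sscanf calls instead of manual parsing"
looks roughly like this (the field names and syntax here are made up for
illustration, not the actual ATM qos format): each accepted form gets one
sscanf pattern, tried in order from most to least specific:

```c
#include <stdio.h>
#include <string.h>

struct qos_toy { int max_pcr; int min_pcr; };

/* One sscanf per accepted form replaces a hand-rolled tokenizer. */
static int qos_parse_toy(const char *s, struct qos_toy *q)
{
    memset(q, 0, sizeof(*q));
    if (sscanf(s, "max_pcr=%d,min_pcr=%d", &q->max_pcr, &q->min_pcr) == 2)
        return 0;
    if (sscanf(s, "max_pcr=%d", &q->max_pcr) == 1)
        return 0;
    return -1;
}
```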
|
|
Signed-off-by: Adrian Bunk <bunk@fs.tum.de>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Assorted pointer-to-int fixes:
a) some places want to take pointer modulo alignment or extract
integer that was cast to pointer (which is legitimate), but do that via
wrong cast, triggering sparse warnings.
b) usual %x (int)ptr -> %p ptr fixes
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
<shemminger@osdl.org>)
|
|
|
|
|
|
|
|
ATM core annotated; the ATM drivers will go in the next patch. Here we only
annotated their method prototypes.
|
|
|
|
optval (and, in the case of getsockopt, optlen) made __user; the changes
percolated down into the instances.
|