| Age | Commit message (Collapse) | Author |
|
When we recieve a FORWARD_TSN chunk, we need to reap
all the queued fast-forwarded chunks from the ordering queue
However, if we don't have them queued, we need to see if
the next expected one is there as well. If it is, start
deliver from that point instead of waiting for the next
chunk to arrive.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
|
|
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
|
|
I was notified by Randy Stewart that lksctp claims to be
"the reference implementation". First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
|
|
This patch introduces new memory accounting functions for each network
protocol. Most of them are renamed from memory accounting functions
for stream protocols. At the same time, some stream memory accounting
functions are removed since other functions do same thing.
Renaming:
sk_stream_free_skb() -> sk_wmem_free_skb()
__sk_stream_mem_reclaim() -> __sk_mem_reclaim()
sk_stream_mem_reclaim() -> sk_mem_reclaim()
sk_stream_mem_schedule -> __sk_mem_schedule()
sk_stream_pages() -> sk_mem_pages()
sk_stream_rmem_schedule() -> sk_rmem_schedule()
sk_stream_wmem_schedule() -> sk_wmem_schedule()
sk_charge_skb() -> sk_mem_charge()
Removeing
sk_stream_rfree(): consolidates into sock_rfree()
sk_stream_set_owner_r(): consolidates into skb_set_owner_r()
sk_stream_mem_schedule()
The following functions are added.
sk_has_account(): check if the protocol supports accounting
sk_mem_uncharge(): do the opposite of sk_mem_charge()
In addition, to achieve consolidation, updating sk_wmem_queued is
removed from sk_mem_charge().
Next, to consolidate memory accounting functions, this patch adds
memory accounting calls to network core functions. Moreover, present
memory accounting call is renamed to new accounting call.
Finally we replace present memory accounting calls with new interface
in TCP and SCTP.
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Hideo Aoki <haoki@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
At the end of partial delivery, we may have complete messages
sitting on the fragment queue. These messages are stuck there
until a new fragment arrives. This can comletely stall a
given association. When clearing partial delivery state, flush
any complete messages from the fragment queue and send them on
their way up.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a small bug when we process a FWD-TSN. We'll deliver
anything upto the current next expected SSN. However, if the
next expected is already in the queue, it will take another
chunk to trigger its delivery. The fix is to simply check
the current queued SSN is the next expected one.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
|
|
Both are equal, except for the list to be traversed.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch introduces autotuning to the sctp buffer management code
similar to the TCP. The buffer space can be grown if the advertised
receive window still has room. This might happen if small message
sizes are used, which is common in telecom environmens.
New tunables are introduced that provide limits to buffer growth
and memory pressure is entered if to much buffer spaces is used.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we recieve a FWD-TSN (meaning the peer has abandoned the data),
we need to clean up any partially received messages that may be
hanging out on the re-assembly or re-ordering queues. This is
a MUST requirement that was not properly done before.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com.>
|
|
Spring cleaning time...
There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals. Most commonly is a
bogus semicolon after: switch() { }
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This option induces partial delivery to run as soon
as the specified amount of data has been accumulated on
the association. However, we give preference to fully
reassembled messages over PD messages. In any case,
window and buffer is freed up.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@.hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This option was introduced in draft-ietf-tsvwg-sctpsocket-13. It
prevents head-of-line blocking in the case of one-to-many endpoint.
Applications enabling this option really must enable SCTP_SNDRCV event
so that they would know where the data belongs. Based on an
earlier patch by Ivan Skytte Jørgensen.
Additionally, this functionality now permits multiple associations
on the same endpoint to enter Partial Delivery. Applications should
be extra careful, when using this functionality, to track EOR indicators.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The way partial delivery is currently implemnted, it is possible to
intereleave a message (either from another steram, or unordered) that
is not part of partial delivery process. The only way to this is for
a message to not be a fragment and be 'in order' or unorderd for a
given stream. This will result in bypassing the reassembly/ordering
queues where things live duing partial delivery, and the
message will be delivered to the socket in the middle of partial delivery.
This is a two-fold problem, in that:
1. the app now must check the stream-id and flags which it may not
be doing.
2. this clearing partial delivery state from the association and results
in ulp hanging.
This patch is a band-aid over a much bigger problem in that we
don't do stream interleave.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During association restart we may have stale data sitting
on the ULP queue waiting for ordering or reassembly. This
data may cause severe problems if not cleaned up. In particular
stale data pending ordering may cause problems with receive
window exhaustion if our peer has decided to restart the
association.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb. The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a rare situation that causes lksctp to go into infinite recursion
and crash the system. The trigger is a packet that contains at least the
first two DATA fragments of a message bundled together. The recursion is
triggered when the user data buffer is smaller that the full data message.
The problem is that we clone the skb for every fragment in the message.
When reassembling the full message, we try to link skbs from the "first
fragment" clone using the frag_list. However, since the frag_list is shared
between two clones in this rare situation, we end up setting the frag_list
pointer of the second fragment to point to itself. This causes
sctp_skb_pull() to potentially recurse indefinitely.
Proposed solution is to make a copy of the skb when attempting to link
things using frag_list.
Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
- added typedef unsigned int __nocast gfp_t;
- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Remove the "list" member of struct sk_buff, as it is entirely
redundant. All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.
Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
|
|
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No need for two structs, follow the new inet_sock layout style.
Also introduce inet_sk_copy_descendant, to copy just the inet_sock
descendant specific area from one sock to another.
Signed-off-by: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Adrian Bunk <bunk@stutsa.de>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
|
|
into nuts.davemloft.net:/disk1/BK/snmp-2.6
|
|
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@redhat.com>
|
|
Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
|
|
|
|
|
|
only the receive specific fields of sctp_sndrcvinfo.
|
|
cleanup.
|
|
When the skb was shared with Ethereal, Ethereal was sometimes the last
user and the destructor would get called on another CPU, not
knowing anything about our sock_lock. Move our rwnd updates and I/O
redrive out of the skb destructor.
Also, if unable to allocate an skb for our transmission packet,
walk the packet's chunks and free the control chunks.
Also change list_dels to list_del_init. Fix real later, but this prevent
us from a doing damage if we list_del twice.
|
|
With this the data dependency is reduced to just making sure that the first
member of both struct sock and struct tcp_tw_bucket are a struct sock_common.
Also makes it easier to grep for struct sock and struct tcp_tw_bucket usage in
the tree as all the members in those structs are prefixed, respectively, with
sk_ and tw_, like struct inode (i_), struct block_device (bd_), etc.
Checked namespace with make tags/ctags, just one colision with the macros for
the struct sock members, with a wanrouter struct, fixed that
s/sk_state/state_sk/g in the wanrouter struct.
Checked as well if the names of the members in both structs collided with some
macro, none found.
|
|
This makes:
1. simpler primitive to access struct sock flags, shorter
2. we check if the flag is valid by using enum sock_flags
3. we can change the implementation to an open coded bit operations
if it proves to be faster than the more general bit manipulation
routines now used, i.e. we only have to change sock.h, not the
whole net tree like now
|
|
Continue typedef removal. Also, change sctp_chunk.num_times_sent
counter to a resent flag. There was a rather obscure, unlikely,
and in the end fairly benign bug sitting there if num_times_sent
wrapped. However, there's no need to keep this counter, as its
use was really just to know if we'd ever resent this chunk.
|
|
Short circuit normal case. First check if the chunk belongs at the
end of the queue. If so, don't bother walking the entire list, just
just put at end.
|
|
More typedef removals and naming consistency.
Remove sctp_association_t, sctp_endpoint_t, & sctp_endpoint_common_t.
|
|
into nuts.ninka.net:/home/davem/src/BK/net-2.5
|
|
into us.ibm.com:/home/sridhar/BK/lksctp-2.5.64
|
|
into us.ibm.com:/home/sridhar/BK/lksctp-2.5.63
|
|
|
|
Hortas.
|
|
|
|
into touki.austin.ibm.com:/home/jgrimm/bk/lksctp-2.5.work
|
|
Yes, it is _that_ obvious. If someone does a connect (its not
required, but one can) the C-E may have already been sent by
the time the first DATA is available. Don't calculate in the C-E
bundling overhead if we've already sent the C-E.
|
|
|
|
This patch provides the following spelling fix.
propogate -> propagate
|
|
If our receive buffer is full, but this is the most important TSN
to receive, make room by reneging less important TSNs. Only renege
if there is a gap and this is the next TSN to fit in the gap.
|
|
|
|
|
|
Support pushing a partial record up to the application if we
are receiving pressure on rwnd. The most common case is that
the sender is sending a record larger than our rwnd. We send
as much up the receive queue in hopes that a read will occur
up room in rwnd.
Other associations on the socket need held off until the partial
delivery condition is finally fufilled (or ABORTed). Additionally,
one must be careful to "do the right thing" with regards to
associations peeled off to new sockets, properly preserving or
clearing the partial delivery state.
|
|
sndrcvinfo.sinfo_cumtsn is new field added by the latest (05) API I-D.
Remove unused fields in ulpevent, minimally to make room for for
storing this new field. But I'll clear out even more so I can
make room for impending partial data delivery work.
See changes in comments for ulpqueue.c.
Many naming and typedef removal cleanups.
|