<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/ipv4, branch v3.2.33</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.33</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.33'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-10-30T23:26:35Z</updated>
<entry>
<title>tcp: resets are misrouted</title>
<updated>2012-10-30T23:26:35Z</updated>
<author>
<name>Alexey Kuznetsov</name>
<email>kuznet@ms2.inr.ac.ru</email>
</author>
<published>2012-10-12T04:34:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bc25463b2238562c3470c3bb0e15a9799defde7b'/>
<id>urn:sha1:bc25463b2238562c3470c3bb0e15a9799defde7b</id>
<content type='text'>
[ Upstream commit 4c67525849e0b7f4bd4fab2487ec9e43ea52ef29 ]

After commit e2446eaa ("tcp_v4_send_reset: binding oif to iif in no
sock case").. tcp resets are always lost, when routing is asymmetric.
Yes, backing out that patch will result in misrouting of resets for
dead connections which used interface binding when were alive, but we
actually cannot do anything here.  What's died that's died and correct
handling normal unbound connections is obviously a priority.

Comment to comment:
&gt; This has few benefits:
&gt;   1. tcp_v6_send_reset already did that.

It was done to route resets for IPv6 link local addresses. It was a
mistake to do so for global addresses. The patch fixes this as well.

Actually, the problem appears to be even more serious than guaranteed
loss of resets.  As reported by Sergey Soloviev &lt;sol@eqv.ru&gt;, those
misrouted resets create a lot of arp traffic and huge amount of
unresolved arp entires putting down to knees NAT firewalls which use
asymmetric routing.

Signed-off-by: Alexey Kuznetsov &lt;kuznet@ms2.inr.ac.ru&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_nat_sip: fix via header translation with multiple parameters</title>
<updated>2012-10-17T02:50:05Z</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2012-08-09T10:08:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=351f007137c71c376bc7accf97dcaa49bea24a6d'/>
<id>urn:sha1:351f007137c71c376bc7accf97dcaa49bea24a6d</id>
<content type='text'>
commit f22eb25cf5b1157b29ef88c793b71972efc47143 upstream.

Via-headers are parsed beginning at the first character after the Via-address.
When the address is translated first and its length decreases, the offset to
start parsing at is incorrect and header parameters might be missed.

Update the offset after translating the Via-address to fix this.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation</title>
<updated>2012-10-17T02:50:05Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2012-08-29T15:24:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5fd28aca055742a5cf6e02cbac9a29af48386658'/>
<id>urn:sha1:5fd28aca055742a5cf6e02cbac9a29af48386658</id>
<content type='text'>
commit 3f509c689a07a4aa989b426893d8491a7ffcc410 upstream.

We're hitting bug while trying to reinsert an already existing
expectation:

kernel BUG at kernel/timer.c:895!
invalid opcode: 0000 [#1] SMP
[...]
Call Trace:
 &lt;IRQ&gt;
 [&lt;ffffffffa0069563&gt;] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack]
 [&lt;ffffffff812d423a&gt;] ? in4_pton+0x72/0x131
 [&lt;ffffffffa00ca69e&gt;] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip]
 [&lt;ffffffffa00b5b9b&gt;] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip]
 [&lt;ffffffffa00b5f15&gt;] process_sdp+0x30c/0x3ec [nf_conntrack_sip]
 [&lt;ffffffff8103f1eb&gt;] ? irq_exit+0x9a/0x9c
 [&lt;ffffffffa00ca738&gt;] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip]

We have to remove the RTP expectation if the RTCP expectation hits EBUSY
since we keep trying with other ports until we succeed.

Reported-by: Rafal Fitt &lt;rafalf@aplusc.com.pl&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_ct_ipv4: packets with wrong ihl are invalid</title>
<updated>2012-10-17T02:50:04Z</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2012-04-03T20:02:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ffb364427629d467e0087c36ba9edcfd94032327'/>
<id>urn:sha1:ffb364427629d467e0087c36ba9edcfd94032327</id>
<content type='text'>
commit 07153c6ec074257ade76a461429b567cff2b3a1e upstream.

It was reported that the Linux kernel sometimes logs:

klogd: [2629147.402413] kernel BUG at net / netfilter /
nf_conntrack_proto_tcp.c: 447!
klogd: [1072212.887368] kernel BUG at net / netfilter /
nf_conntrack_proto_tcp.c: 392

ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in
nf_conntrack_proto_tcp.c should catch malformed packets, so the errors
at the indicated lines - TCP options parsing - should not happen.
However, tcp_error() relies on the "dataoff" offset to the TCP header,
calculated by ipv4_get_l4proto().  But ipv4_get_l4proto() does not check
bogus ihl values in IPv4 packets, which then can slip through tcp_error()
and get caught at the TCP options parsing routines.

The patch fixes ipv4_get_l4proto() by invalidating packets with bogus
ihl value.

The patch closes netfilter bugzilla id 771.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ipv4: raw: fix icmp_filter()</title>
<updated>2012-10-10T02:31:31Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-09-22T00:08:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=65a484273f1b97019bcc923be0e47921b895ac30'/>
<id>urn:sha1:65a484273f1b97019bcc923be0e47921b895ac30</id>
<content type='text'>
[ Upstream commit ab43ed8b7490cb387782423ecf74aeee7237e591 ]

icmp_filter() should not modify its input, or else its caller
would need to recompute ip_hdr() if skb-&gt;head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero</title>
<updated>2012-10-10T02:31:29Z</updated>
<author>
<name>Michal Kubeček</name>
<email>mkubecek@suse.cz</email>
</author>
<published>2012-09-14T04:59:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f189aa26fd4f45d4f445d8de85d1b29c6ba92982'/>
<id>urn:sha1:f189aa26fd4f45d4f445d8de85d1b29c6ba92982</id>
<content type='text'>
[ Upstream commit 15c041759bfcd9ab0a4e43f1c16e2644977d0467 ]

If recv() syscall is called for a TCP socket so that
  - IOAT DMA is used
  - MSG_WAITALL flag is used
  - requested length is bigger than sk_rcvbuf
  - enough data has already arrived to bring rcv_wnd to zero
then when tcp_recvmsg() gets to calling sk_wait_data(), receive
window can be still zero while sk_async_wait_queue exhausts
enough space to keep it zero. As this queue isn't cleaned until
the tcp_service_net_dma() call, sk_wait_data() cannot receive
any data and blocks forever.

If zero receive window and non-empty sk_async_wait_queue is
detected before calling sk_wait_data(), process the queue first.

Signed-off-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>net: ipv4: ipmr_expire_timer causes crash when removing net namespace</title>
<updated>2012-09-19T14:04:57Z</updated>
<author>
<name>Francesco Ruggeri</name>
<email>fruggeri@aristanetworks.com</email>
</author>
<published>2012-08-24T07:38:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4c5e5e2a49816debff35ac5400b0d56a2d57e495'/>
<id>urn:sha1:4c5e5e2a49816debff35ac5400b0d56a2d57e495</id>
<content type='text'>
[ Upstream commit acbb219d5f53821b2d0080d047800410c0420ea1 ]

When tearing down a net namespace, ipv4 mr_table structures are freed
without first deactivating their timers. This can result in a crash in
run_timer_softirq.
This patch mimics the corresponding behaviour in ipv6.
Locking and synchronization seem to be adequate.
We are about to kfree mrt, so existing code should already make sure that
no other references to mrt are pending or can be created by incoming traffic.
The functions invoked here do not cause new references to mrt or other
race conditions to be created.
Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
Both ipmr_expire_process (whose completion we may have to wait in
del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
or other synchronizations when needed, and they both only modify mrt.

Tested in Linux 3.4.8.

Signed-off-by: Francesco Ruggeri &lt;fruggeri@aristanetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: Apply device TSO segment limit earlier</title>
<updated>2012-09-19T14:04:46Z</updated>
<author>
<name>Ben Hutchings</name>
<email>bhutchings@solarflare.com</email>
</author>
<published>2012-07-30T16:11:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9f871e883277cc22c6217db806376dce52401a31'/>
<id>urn:sha1:9f871e883277cc22c6217db806376dce52401a31</id>
<content type='text'>
[ Upstream commit 1485348d2424e1131ea42efc033cbd9366462b01 ]

Cache the device gso_max_segs in sock::sk_gso_max_segs and use it to
limit the size of TSO skbs.  This avoids the need to fall back to
software GSO for local TCP senders.

Signed-off-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: perform DMA to userspace only if there is a task waiting for it</title>
<updated>2012-08-19T17:15:25Z</updated>
<author>
<name>Jiri Kosina</name>
<email>jkosina@suse.cz</email>
</author>
<published>2012-07-27T10:38:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=789e0ed6c1132db40b4e6349113516ea0f700710'/>
<id>urn:sha1:789e0ed6c1132db40b4e6349113516ea0f700710</id>
<content type='text'>
[ Upstream commit 59ea33a68a9083ac98515e4861c00e71efdc49a1 ]

Back in 2006, commit 1a2449a87b ("[I/OAT]: TCP recv offload to I/OAT")
added support for receive offloading to IOAT dma engine if available.

The code in tcp_rcv_established() tries to perform early DMA copy if
applicable. It however does so without checking whether the userspace
task is actually expecting the data in the buffer.

This is not a problem under normal circumstances, but there is a corner
case where this doesn't work -- and that's when MSG_TRUNC flag to
recvmsg() is used.

If the IOAT dma engine is not used, the code properly checks whether
there is a valid ucopy.task and the socket is owned by userspace, but
misses the check in the dmaengine case.

This problem can be observed in real trivially -- for example 'tbench' is a
good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing
IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they
have been already early-copied in tcp_rcv_established() using dma engine.

This patch introduces the same check we are performing in the simple
iovec copy case to the IOAT case as well. It fixes the indefinite
recvmsg(MSG_TRUNC) hangs.

Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tcp: Add TCP_USER_TIMEOUT negative value check</title>
<updated>2012-08-19T17:15:24Z</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2012-07-26T22:52:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df7ba2e6620095039c0aa42af7de1ec565eca2eb'/>
<id>urn:sha1:df7ba2e6620095039c0aa42af7de1ec565eca2eb</id>
<content type='text'>
[ Upstream commit 42493570100b91ef663c4c6f0c0fdab238f9d3c2 ]

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int. But
patch "tcp: Add TCP_USER_TIMEOUT socket option"(dca43c75) didn't check the negative
values. If a user assign -1 to it, the socket will set successfully and wait
for 4294967295 miliseconds. This patch add a negative value check to avoid
this issue.

Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
</feed>
