<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/socket.h, branch v5.6.10</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.6.10</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.6.10'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-03-20T14:48:36Z</updated>
<entry>
<title>io_uring: make sure accept honors rlimit nofile</title>
<updated>2020-03-20T14:48:36Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-03-20T02:16:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=09952e3e7826119ddd4357c453d54bcc7ef25156'/>
<id>urn:sha1:09952e3e7826119ddd4357c453d54bcc7ef25156</id>
<content type='text'>
Just like commit 4022e7af86be, this fixes the fact that
IORING_OP_ACCEPT ends up using get_unused_fd_flags(), which checks
current-&gt;signal-&gt;rlim[] for limits.

Add an extra argument to __sys_accept4_file() that allows us to pass
in the proper nofile limit, and grab it at request prep time.
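
In kernel terms the flow is roughly as follows (an illustrative
sketch, not the verbatim diff; field names approximate):

  /* at prep time, while still running in the submitting task */
  accept-&gt;nofile = rlimit(RLIMIT_NOFILE);

  /* later, in __sys_accept4_file(), honor that limit instead of
   * calling get_unused_fd_flags() unconditionally */
  fd = __get_unused_fd_flags(flags, nofile);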

Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: ensure async punted connect requests copy data</title>
<updated>2019-12-03T14:04:30Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-12-02T23:28:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f499a021ea8c9f70321fce3d674d8eca5bbeee2c'/>
<id>urn:sha1:f499a021ea8c9f70321fce3d674d8eca5bbeee2c</id>
<content type='text'>
Just like commit f67676d160c6 for read/write requests, this one ensures
that the sockaddr data has been copied for IORING_OP_CONNECT if we need
to punt the request to async context.
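
Conceptually, prep now does something like this (a sketch; names
approximate the real patch):

  /* io_connect_prep(): copy the user sockaddr into per-request
   * storage so an async worker never dereferences user memory */
  return move_addr_to_kernel(u64_to_user_ptr(READ_ONCE(sqe-&gt;addr)),
                             conn-&gt;addr_len, &amp;io-&gt;connect.address);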

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: ensure async punted sendmsg/recvmsg requests copy data</title>
<updated>2019-12-03T14:03:35Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-12-03T01:50:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=03b1230ca12a12e045d83b0357792075bf94a1e0'/>
<id>urn:sha1:03b1230ca12a12e045d83b0357792075bf94a1e0</id>
<content type='text'>
Just like commit f67676d160c6 for read/write requests, this one ensures
that the msghdr data is fully copied if we need to punt a recvmsg or
sendmsg system call to async context.
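
The prep side boils down to copying the msghdr (and its iovec) early
(a sketch; helper and field names approximate):

  /* io_sendmsg_prep(): duplicate the user msghdr into
   * request-private memory before any async punt */
  io-&gt;msg.iov = io-&gt;msg.fast_iov;
  return sendmsg_copy_msghdr(&amp;io-&gt;msg.msg, sr-&gt;msg,
                             sr-&gt;msg_flags, &amp;io-&gt;msg.iov);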

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>net: add __sys_connect_file() helper</title>
<updated>2019-11-26T02:56:11Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-11-23T21:17:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bd3ded3146daa2cbb57ed353749ef99cf75371b0'/>
<id>urn:sha1:bd3ded3146daa2cbb57ed353749ef99cf75371b0</id>
<content type='text'>
This is identical to __sys_connect(), except it takes a struct file
instead of an fd, and it also allows passing in extra file-&gt;f_flags
flags. The latter is done to support masking in O_NONBLOCK without
manipulating the original file flags.

No functional changes in this patch.
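
With the helper in place, __sys_connect() reduces to roughly the
following (a sketch of the post-patch shape):

  int __sys_connect(int fd, struct sockaddr __user *uservaddr,
                    int addrlen)
  {
          int ret = -EBADF;
          struct fd f = fdget(fd);

          if (f.file) {
                  /* 0: no extra file-&gt;f_flags bits forced in */
                  ret = __sys_connect_file(f.file, uservaddr,
                                           addrlen, 0);
                  fdput(f);
          }
          return ret;
  }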

Cc: netdev@vger.kernel.org
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block</title>
<updated>2019-11-25T18:40:27Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-11-25T18:40:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb4b3d3fd0c7f53168b6f6d07d1d97f55c676eeb'/>
<id>urn:sha1:fb4b3d3fd0c7f53168b6f6d07d1d97f55c676eeb</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:
 "A lot of stuff has been going on this cycle, with improving the
  support for networked IO (and hence unbounded request completion
  times) being one of the major themes. There's been a set of fixes done
  this week; I'll send those out as well once we're certain we're fully
  happy with them.

  This contains:

   - Unification of the "normal" submit path and the SQPOLL path (Pavel)

   - Support for sparse (and bigger) file sets, and updating of those
     file sets without needing to unregister/register again.

   - Independently sized CQ ring, instead of just making it always 2x
     the SQ ring size. This makes it more flexible for networked
     applications (see the sketch after this summary).

   - Support for overflowed CQ ring, never dropping events but providing
     backpressure on submits.

   - Add support for absolute timeouts, not just relative ones.

   - Support for generic cancellations. This divorces io_uring from
     workqueues as well, which additionally gets us one step closer to
     generic async system call support.

   - With cancellations, we can support grabbing the process file table
     as well, just like we do mm context. This allows support for system
     calls that create file descriptors, like accept4() support that's
     built on top of that.

   - Support for io_uring tracing (Dmitrii)

   - Support for linked timeouts. These abort an operation if it isn't
     completed by the time noted in the linked timeout.

   - Speed up tracking of poll requests

   - Various cleanups making the code easier to follow (Jackie, Pavel,
     Bob, YueHaibing, me)

   - Update MAINTAINERS with new io_uring list"
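
The independently sized CQ ring mentioned above is opted into with
IORING_SETUP_CQSIZE; a minimal liburing sketch (ring sizes are
hypothetical):

  #include &lt;liburing.h&gt;

  int main(void)
  {
          struct io_uring ring;
          struct io_uring_params p = { 0 };

          /* 64-entry SQ, but a 4096-entry CQ to absorb bursts of
           * completions from many pending network requests */
          p.flags = IORING_SETUP_CQSIZE;
          p.cq_entries = 4096;

          return io_uring_queue_init_params(64, &amp;ring, &amp;p);
  }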

* tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits)
  io_uring: make POLL_ADD/POLL_REMOVE scale better
  io-wq: remove now redundant struct io_wq_nulls_list
  io_uring: Fix getting file for non-fd opcodes
  io_uring: introduce req_need_defer()
  io_uring: clean up io_uring_cancel_files()
  io-wq: ensure free/busy list browsing see all items
  io-wq: ensure we have a stable view of -&gt;cur_work for cancellations
  io_wq: add get/put_work handlers to io_wq_create()
  io_uring: check for validity of -&gt;rings in teardown
  io_uring: fix potential deadlock in io_poll_wake()
  io_uring: use correct "is IO worker" helper
  io_uring: fix -ENOENT issue with linked timer with short timeout
  io_uring: don't do flush cancel under inflight_lock
  io_uring: flag SQPOLL busy condition to userspace
  io_uring: make ASYNC_CANCEL work with poll and timeout
  io_uring: provide fallback request for OOM situations
  io_uring: convert accept4() -ERESTARTSYS into -EINTR
  io_uring: fix error clear of -&gt;file_table in io_sqe_files_register()
  io_uring: separate the io_free_req and io_free_req_find_next interface
  io_uring: keep io_put_req only responsible for release and put req
  ...
</content>
</entry>
<entry>
<title>net: increase SOMAXCONN to 4096</title>
<updated>2019-10-31T21:01:40Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-10-30T16:36:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=19f92a030ca6d772ab44b22ee6a01378a8cb32d4'/>
<id>urn:sha1:19f92a030ca6d772ab44b22ee6a01378a8cb32d4</id>
<content type='text'>
SOMAXCONN is the default value of /proc/sys/net/core/somaxconn.

It was defined as 128 more than 20 years ago.

Since it caps the listen() backlog values, this very small value has
caused numerous problems over the years, and many people have had to
raise it on their hosts after being hit by them.

Google has been using 1024 for at least 15 years, and we increased
this to 4096 after the TCP listener rework was completed, more than
4 years ago. We got no complaints about this change breaking any
legacy application.

Many applications indeed set up a TCP listener with listen(fd, -1),
meaning they let the system select the backlog.
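
A sketch of that pattern; the kernel clamps the (unsigned) backlog
value to somaxconn, so -1 requests the largest backlog the system
allows:

  #include &lt;sys/socket.h&gt;

  static int make_listener(int fd)
  {
          /* backlog is capped at /proc/sys/net/core/somaxconn */
          return listen(fd, -1);
  }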

Raising SOMAXCONN lowers the chance of the port being unavailable
under even a small SYN flood attack, and reduces the possibility of
side-channel vulnerabilities.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Yue Cao &lt;ycao009@ucr.edu&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: add __sys_accept4_file() helper</title>
<updated>2019-10-29T18:43:06Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-10-17T20:41:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=de2ea4b64b75a79ed9cdf9bf30e0e197901084e4'/>
<id>urn:sha1:de2ea4b64b75a79ed9cdf9bf30e0e197901084e4</id>
<content type='text'>
This is identical to __sys_accept4(), except it takes a struct file
instead of an fd, and it also allows passing in extra file-&gt;f_flags
flags. The latter is done to support masking in O_NONBLOCK without
manipulating the original file flags.

No functional changes in this patch.
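
For example, io_uring can force a single non-blocking attempt without
touching the socket's persistent flags (a sketch; caller context and
argument order approximate):

  /* mask in O_NONBLOCK for this call only */
  ret = __sys_accept4_file(file, force_nonblock ? O_NONBLOCK : 0,
                           addr, addrlen, flags);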

Cc: netdev@vger.kernel.org
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>net/tls: prevent skb_orphan() from leaking TLS plain text with offload</title>
<updated>2019-08-09T05:39:35Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>jakub.kicinski@netronome.com</email>
</author>
<published>2019-08-08T00:03:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=414776621d1006e57e80e6db7fdc3837897aaa64'/>
<id>urn:sha1:414776621d1006e57e80e6db7fdc3837897aaa64</id>
<content type='text'>
sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket, such as skb_orphan() or skb_clone(), will cause clear-text
leaks.

Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and the decrypted flag set.
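
The shape of the check, simplified:

  if (skb-&gt;sk &amp;&amp; sk_fullsock(skb-&gt;sk) &amp;&amp;
      skb-&gt;sk-&gt;sk_validate_xmit_skb)
          skb = skb-&gt;sk-&gt;sk_validate_xmit_skb(skb-&gt;sk, dev, skb);
  #ifdef CONFIG_TLS_DEVICE
  else if (unlikely(skb-&gt;decrypted)) {
          /* decrypted payload but no TLS socket left to re-encrypt
           * it: drop instead of leaking plain text */
          kfree_skb(skb);
          skb = NULL;
  }
  #endif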

Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk-&gt;sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff-&gt;decrypted member.

Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.

Reusing the sk_buff-&gt;decrypted flag also protects from
leaking clear text when an incoming, decrypted skb is redirected
(e.g. by TC).

See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect
through ULP") for justification why the internal flag is safe.
The only location through which the flag could leak in is
tcp_bpf_sendmsg(), which is taken care of by clearing the previously
unused bit.

v2:
 - remove superfluous decrypted mark copy (Willem);
 - remove the stale doc entry (Boris);
 - rely entirely on EOR marking to prevent coalescing (Boris);
 - use an internal sendpages flag instead of marking the socket
   (Boris).
v3 (Willem):
 - reorganize the can_skb_orphan_partial() condition;
 - fix the flag leak-in through tcp_bpf_sendmsg.

Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: Boris Pismenny &lt;borisp@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>io_uring: add support for recvmsg()</title>
<updated>2019-07-09T20:32:14Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-04-19T19:38:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aa1fa28fc73ea6b740ee7b62bf3b07141883dbb8'/>
<id>urn:sha1:aa1fa28fc73ea6b740ee7b62bf3b07141883dbb8</id>
<content type='text'>
This is done through IORING_OP_RECVMSG. This opcode uses the same
sqe-&gt;msg_flags that IORING_OP_SENDMSG added, and the msghdr struct is
passed in the sqe-&gt;addr field as well.

We use MSG_DONTWAIT to force an inline fast path if recvmsg() doesn't
block, and punt to async execution if it would have blocked.
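
From userspace, assuming liburing, queueing one looks roughly like:

  #include &lt;liburing.h&gt;

  /* queue a recvmsg on an already-initialized ring (sketch) */
  static int queue_recvmsg(struct io_uring *ring, int fd,
                           struct msghdr *msg)
  {
          struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

          if (!sqe)
                  return -1;
          io_uring_prep_recvmsg(sqe, fd, msg, 0);
          return io_uring_submit(ring);
  }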

Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: add support for sendmsg()</title>
<updated>2019-07-09T20:32:05Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-04-19T19:34:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0fa03c624d8fc9932d0f27c39a9deca6a37e0e17'/>
<id>urn:sha1:0fa03c624d8fc9932d0f27c39a9deca6a37e0e17</id>
<content type='text'>
This is done through IORING_OP_SENDMSG. There's a new sqe-&gt;msg_flags
for the flags argument, and the msghdr struct is passed in the
sqe-&gt;addr field.

We use MSG_DONTWAIT to force an inline fast path if sendmsg() doesn't
block, and punt to async execution if it would have blocked.
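
Without liburing helpers, the raw SQE setup is roughly the following
(a sketch; sockfd and msg are assumed to exist, and len = 1 marks a
single msghdr):

  sqe-&gt;opcode = IORING_OP_SENDMSG;
  sqe-&gt;fd = sockfd;
  sqe-&gt;addr = (unsigned long) &amp;msg;  /* struct msghdr * */
  sqe-&gt;len = 1;
  sqe-&gt;msg_flags = 0;                /* passed through to sendmsg() */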

Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
