<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/uapi/linux/nsfs.h, branch v6.17</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.17</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.17'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2025-06-11T09:59:08Z</updated>
<entry>
<title>mntns: use stable inode number for initial mount ns</title>
<updated>2025-06-11T09:59:08Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2025-06-06T09:45:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7f4f229195b73606ded77e56943f463b78adf635'/>
<id>urn:sha1:7f4f229195b73606ded77e56943f463b78adf635</id>
<content type='text'>
Apart from the network and mount namespace all other namespaces expose a
stable inode number and userspace has been relying on that for a very
long time now. It's very much heavily used API. Align the mount
namespace and use a stable inode number from the reserved procfs inode
number space so this is consistent across all namespaces.

Link: https://lore.kernel.org/20250606-work-nsfs-v1-3-b8749c9a8844@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netns: use stable inode number for initial mount ns</title>
<updated>2025-06-11T09:59:08Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2025-06-06T09:45:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b0240b3ccc325c7a96cf362877180bc9e10d546'/>
<id>urn:sha1:9b0240b3ccc325c7a96cf362877180bc9e10d546</id>
<content type='text'>
Apart from the network and mount namespace all other namespaces expose a
stable inode number and userspace has been relying on that for a very
long time now. It's very much heavily used API. Align the network
namespace and use a stable inode number from the reserved procfs inode
number space so this is consistent across all namespaces.

Link: https://lore.kernel.org/20250606-work-nsfs-v1-2-b8749c9a8844@kernel.org
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>nsfs: move root inode number to uapi</title>
<updated>2025-06-11T09:59:08Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2025-06-06T09:45:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6a9e2fb1bab53b54d02714a2ee3c6612d19629ce'/>
<id>urn:sha1:6a9e2fb1bab53b54d02714a2ee3c6612d19629ce</id>
<content type='text'>
Userspace relies on the root inode numbers to identify the initial
namespaces. That's already a hard dependency. So we cannot change that
anymore. Move the initial inode numbers to a public header.

Link: https://github.com/systemd/systemd/commit/d293fade24b34ccc2f5716b0ff5513e9533cf0c4
Link: https://lore.kernel.org/20250606-work-nsfs-v1-1-b8749c9a8844@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.12.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-09-16T09:15:26Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-16T09:15:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9020d0d844ad58a051f90b1e5b82ba34123925b9'/>
<id>urn:sha1:9020d0d844ad58a051f90b1e5b82ba34123925b9</id>
<content type='text'>
Pull vfs mount updates from Christian Brauner:
 "Recently, we added the ability to list mounts in other mount
  namespaces and the ability to retrieve namespace file descriptors
  without having to go through procfs by deriving them from pidfds.

  This extends nsfs in two ways:

   (1) Add the ability to retrieve information about a mount namespace
       via NS_MNT_GET_INFO.

       This will return the mount namespace id and the number of mounts
       currently in the mount namespace. The number of mounts can be
       used to size the buffer that needs to be used for listmount() and
       is in general useful without having to actually iterate through
       all the mounts.

      The structure is extensible.

   (2) Add the ability to iterate through all mount namespaces over
       which the caller holds privilege returning the file descriptor
       for the next or previous mount namespace.

       To retrieve a mount namespace the caller must be privileged wrt
       to it's owning user namespace. This means that PID 1 on the host
       can list all mounts in all mount namespaces or that a container
       can list all mounts of its nested containers.

       Optionally pass a structure for NS_MNT_GET_INFO with
       NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount
       namespace in one go.

  (1) and (2) can be implemented for other namespace types easily.

  Together with recent api additions this means one can iterate through
  all mounts in all mount namespaces without ever touching procfs.

  The commit message in 49224a345c48 ('Merge patch series "nsfs: iterate
  through mount namespaces"') contains example code how to do this"

* tag 'vfs-6.12.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nsfs: iterate through mount namespaces
  file: add fput() cleanup helper
  fs: add put_mnt_ns() cleanup helper
  fs: allow mount namespace fd
</content>
</entry>
<entry>
<title>nsfs: fix ioctl declaration</title>
<updated>2024-08-12T20:03:26Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2024-07-31T05:47:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=42b0f8da3acc87953161baeb24f756936eb4d4b2'/>
<id>urn:sha1:42b0f8da3acc87953161baeb24f756936eb4d4b2</id>
<content type='text'>
The kernel is writing an object of type __u64, so the ioctl has to be
defined to _IOR(NSIO, 0x5, __u64) instead of _IO(NSIO, 0x5).

Reported-by: Dmitry V. Levin &lt;ldv@strace.io&gt;
Link: https://lore.kernel.org/r/20240730164554.GA18486@altlinux.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>nsfs: iterate through mount namespaces</title>
<updated>2024-08-09T10:46:59Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2024-07-19T11:41:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a1d220d9dafa8d76ba60a784a1016c3134e6a1e8'/>
<id>urn:sha1:a1d220d9dafa8d76ba60a784a1016c3134e6a1e8</id>
<content type='text'>
It is already possible to list mounts in other mount namespaces and to
retrieve namespace file descriptors without having to go through procfs
by deriving them from pidfds.

Augment these abilities by adding the ability to retrieve information
about a mount namespace via NS_MNT_GET_INFO. This will return the mount
namespace id and the number of mounts currently in the mount namespace.
The number of mounts can be used to size the buffer that needs to be
used for listmount() and is in general useful without having to actually
iterate through all the mounts. The structure is extensible.

And add the ability to iterate through all mount namespaces over which
the caller holds privilege returning the file descriptor for the next or
previous mount namespace.

To retrieve a mount namespace the caller must be privileged wrt to it's
owning user namespace. This means that PID 1 on the host can list all
mounts in all mount namespaces or that a container can list all mounts
of its nested containers.

Optionally pass a structure for NS_MNT_GET_INFO with
NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount namespace
in one go. Both ioctls can be implemented for other namespace types
easily.

Together with recent api additions this means one can iterate through
all mounts in all mount namespaces without ever touching procfs.

Link: https://lore.kernel.org/r/20240719-work-mount-namespace-v1-5-834113cab0d2@kernel.org
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.11.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-07-15T19:27:39Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-07-15T19:27:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b074abe885f43b2c207b5e748ffa60604dbc020'/>
<id>urn:sha1:1b074abe885f43b2c207b5e748ffa60604dbc020</id>
<content type='text'>
Pull namespace-fs updates from Christian Brauner:
 "This adds ioctls allowing to translate PIDs between PID namespaces.

  The motivating use-case comes from LXCFS which is a tiny fuse
  filesystem used to virtualize various aspects of procfs. LXCFS is run
  on the host. The files and directories it creates can be bind-mounted
  by e.g. a container at startup and mounted over the various procfs
  files the container wishes to have virtualized.

  When e.g. a read request for uptime is received, LXCFS will receive
  the pid of the reader. In order to virtualize the corresponding read,
  LXCFS needs to know the pid of the init process of the reader's pid
  namespace.

  In order to do this, LXCFS first needs to fork() two helper processes.
  The first helper process setns() to the readers pid namespace. The
  second helper process is needed to create a process that is a proper
  member of the pid namespace.

  The second helper process then creates a ucred message with ucred.pid
  set to 1 and sends it back to LXCFS. The kernel will translate the
  ucred.pid field to the corresponding pid number in LXCFS's pid
  namespace. This way LXCFS can learn the init pid number of the
  reader's pid namespace and can go on to virtualize.

  Since these two forks() are costly LXCFS maintains an init pid cache
  that caches a given pid for a fixed amount of time. The cache is
  pruned during new read requests. However, even with the cache the hit
  of the two forks() is singificant when a very large number of
  containers are running.

  So this adds a simple set of ioctls that let's a caller translate PIDs
  from and into a given PID namespace. This significantly improves
  performance with a very simple change.

  To protect against races pidfds can be used to check whether the
  process is still valid"

* tag 'vfs-6.11.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nsfs: add pid translation ioctls
</content>
</entry>
<entry>
<title>fs: add an ioctl to get the mnt ns id from nsfs</title>
<updated>2024-06-28T07:53:31Z</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2024-06-24T15:49:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e8e43a1fcc5c07575f37e40f8a2cd78aee46f9a0'/>
<id>urn:sha1:e8e43a1fcc5c07575f37e40f8a2cd78aee46f9a0</id>
<content type='text'>
In order to utilize the listmount() and statmount() extensions that
allow us to call them on different namespaces we need a way to get the
mnt namespace id from user space.  Add an ioctl to nsfs that will allow
us to extract the mnt namespace id in order to make these new extensions
usable.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Link: https://lore.kernel.org/r/180449959d5a756af7306d6bda55f41b9d53e3cb.1719243756.git.josef@toxicpanda.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>nsfs: add pid translation ioctls</title>
<updated>2024-06-25T21:00:41Z</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2020-06-07T20:47:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ca567df74a28a9fb368c6b2d93e864113f73f5c2'/>
<id>urn:sha1:ca567df74a28a9fb368c6b2d93e864113f73f5c2</id>
<content type='text'>
Add ioctl()s to translate pids between pid namespaces.

LXCFS is a tiny fuse filesystem used to virtualize various aspects of
procfs. LXCFS is run on the host. The files and directories it creates
can be bind-mounted by e.g. a container at startup and mounted over the
various procfs files the container wishes to have virtualized. When e.g.
a read request for uptime is received, LXCFS will receive the pid of the
reader. In order to virtualize the corresponding read, LXCFS needs to
know the pid of the init process of the reader's pid namespace. In order
to do this, LXCFS first needs to fork() two helper processes. The first
helper process setns() to the readers pid namespace. The second helper
process is needed to create a process that is a proper member of the pid
namespace. The second helper process then creates a ucred message with
ucred.pid set to 1 and sends it back to LXCFS. The kernel will translate
the ucred.pid field to the corresponding pid number in LXCFS's pid
namespace. This way LXCFS can learn the init pid number of the reader's
pid namespace and can go on to virtualize. Since these two forks() are
costly LXCFS maintains an init pid cache that caches a given pid for a
fixed amount of time. The cache is pruned during new read requests.
However, even with the cache the hit of the two forks() is singificant
when a very large number of containers are running. With this simple
patch we add an ns ioctl that let's a caller retrieve the init pid nr of
a pid namespace through its pid namespace fd. This significantly
improves performance with a very simple change.

Support translation of pids and tgids. Other concepts can be added but
there are no obvious users for this right now.

To protect against races pidfds can be used to check whether the process
is still valid. If needed, this can also be extended to work on pidfds
directly.

Link: https://lore.kernel.org/r/20240619-work-ns_ioctl-v1-1-7c0097e6bb6b@kernel.org
Reviewed-by: Alexander Mikhalitsyn &lt;aleksandr.mikhalitsyn@canonical.com&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>License cleanup: add SPDX license identifier to uapi header files with no license</title>
<updated>2017-11-02T10:19:54Z</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2017-11-01T14:08:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6f52b16c5b29b89d92c0e7236f4655dc8491ad70'/>
<id>urn:sha1:6f52b16c5b29b89d92c0e7236f4655dc8491ad70</id>
<content type='text'>
Many user space API headers are missing licensing information, which
makes it hard for compliance tools to determine the correct license.

By default are files without license information under the default
license of the kernel, which is GPLV2.  Marking them GPLV2 would exclude
them from being included in non GPLV2 code, which is obviously not
intended. The user space API headers fall under the syscall exception
which is in the kernels COPYING file:

   NOTE! This copyright does *not* cover user programs that use kernel
   services by normal system calls - this is merely considered normal use
   of the kernel, and does *not* fall under the heading of "derived work".

otherwise syscall usage would not be possible.

Update the files which contain no license information with an SPDX
license identifier.  The chosen identifier is 'GPL-2.0 WITH
Linux-syscall-note' which is the officially assigned identifier for the
Linux syscall exception.  SPDX license identifiers are a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.  See the previous patch in this series for the
methodology of how this patch was researched.

Reviewed-by: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Reviewed-by: Philippe Ombredanne &lt;pombredanne@nexb.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
