<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/fsnotify_backend.h, branch v5.15.6</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.6</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.6'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-03-16T15:49:31Z</updated>
<entry>
<title>fanotify: configurable limits via sysfs</title>
<updated>2021-03-16T15:49:31Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2021-03-04T11:29:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5b8fea65d197f408bb00b251c70d842826d6b70b'/>
<id>urn:sha1:5b8fea65d197f408bb00b251c70d842826d6b70b</id>
<content type='text'>
fanotify has some hardcoded limits. The only APIs to escape those limits
are FAN_UNLIMITED_QUEUE and FAN_UNLIMITED_MARKS.

Allow finer-grained tuning of the system limits via sysctl tunables under
/proc/sys/fs/fanotify, similar to tunables under /proc/sys/fs/inotify,
with some minor differences.

- max_queued_events - global system tunable for group queue size limit.
  Like the inotify tunable with the same name, it defaults to 16384 and
  applies on initialization of a new group.

- max_user_marks - user ns tunable for marks limit per user.
  Like the inotify tunable named max_user_watches, on a machine with
  sufficient RAM it defaults to 1048576 in init userns and can be
  further limited per containing user ns.

- max_user_groups - user ns tunable for number of groups per user.
  Like the inotify tunable named max_user_instances, it defaults to 128
  in init userns and can be further limited per containing user ns.
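
As a rough illustration (not part of this patch), the three tunables above
could be inspected from user space along these lines. The helper name
read_fanotify_limits and its base_dir parameter are hypothetical; only the
directory and the file names come from the description above.

```python
import os

# Tunable file names taken from the commit description above.
TUNABLES = ("max_queued_events", "max_user_marks", "max_user_groups")


def read_fanotify_limits(base_dir="/proc/sys/fs/fanotify"):
    """Return {tunable: int or None}; None means the file could not be read.

    read_fanotify_limits and base_dir are hypothetical names for this sketch.
    """
    limits = {}
    for name in TUNABLES:
        path = os.path.join(base_dir, name)
        try:
            with open(path) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Kernel without this patch, no permission, or unexpected content.
            limits[name] = None
    return limits
```

On a kernel that carries this patch, read_fanotify_limits() would be expected
to report the defaults listed above unless an administrator changed them.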

The slightly different tunable names used for fanotify are derived from
the "group" and "mark" terminology used in the fanotify man pages and
throughout the code.

Considering the fact that the default value for max_user_watches was
increased in kernel v5.10 from 8192 to 1048576, leaving the legacy
fanotify limit of 8192 marks per group in addition to the max_user_marks
limit makes little sense, so the per group marks limit has been removed.

Note that when a group is initialized with FAN_UNLIMITED_MARKS, its own
marks are not accounted in the per user marks account, so in effect the
limit of max_user_marks is only for the collection of groups that are
not initialized with FAN_UNLIMITED_MARKS.

Link: https://lore.kernel.org/r/20210304112921.3996419-2-amir73il@gmail.com
Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: use hash table for faster events merge</title>
<updated>2021-03-16T15:37:51Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2021-03-04T10:48:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=94e00d28a680dff18805ca472b191364347d2234'/>
<id>urn:sha1:94e00d28a680dff18805ca472b191364347d2234</id>
<content type='text'>
In order to improve event merge performance, hash events by the event
merge key in a hash table of size 128.

The fanotify_event size grows by two pointers, but we just reduced its
size by removing the objectid member, so overall its size is increased
by one pointer.

Permission events and the overflow event are not merged, so they are also
not hashed.
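
The idea can be sketched in a few lines of Python. This is only an
illustration under assumptions, not the kernel implementation: the class
HashedQueue, its method names, and the use of a (key, payload) tuple as an
event are hypothetical. Only the table size of 128 and the rule that events
are bucketed by merge key come from the text above.

```python
from collections import deque

HASH_SIZE = 128  # table size named in the commit message


class HashedQueue:
    """Toy event queue that looks for merge candidates in one bucket only."""

    def __init__(self):
        self.order = deque()                           # events in arrival order
        self.buckets = [[] for _ in range(HASH_SIZE)]  # events by merge key

    def add(self, key, payload):
        """Queue an event; return False if it merged with a queued duplicate."""
        event = (key, payload)
        bucket = self.buckets[key % HASH_SIZE]
        if event in bucket:
            return False                               # merged: nothing queued
        bucket.append(event)
        self.order.append(event)
        return True

    def remove_first(self):
        """Pop the oldest event, or return None when the queue is empty."""
        if not self.order:
            return None
        event = self.order.popleft()
        self.buckets[event[0] % HASH_SIZE].remove(event)
        return event
```

A single flat list would have to scan queued events to find a duplicate;
bucketing limits that scan to events that share the same slot.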

Link: https://lore.kernel.org/r/20210304104826.3993892-5-amir73il@gmail.com
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fanotify: reduce event objectid to 29-bit hash</title>
<updated>2021-03-16T15:14:28Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2021-03-04T10:48:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8988f11abb820bacfcc53d498370bfb30f792ec4'/>
<id>urn:sha1:8988f11abb820bacfcc53d498370bfb30f792ec4</id>
<content type='text'>
objectid is only used by the fanotify backend and it is just an
optimization for event merge before comparing all fields in the event.

Move the objectid member from the common struct fsnotify_event into struct
fanotify_event and reduce it to a 29-bit hash to cram it together with the
3-bit event type.

Events of different types are never merged, so the combination of event
type and hash forms a 32-bit key for fast comparison of events.

This reduces the size of events by one pointer and paves the way for
adding hashed queue support for fanotify.
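
A small sketch of how such a key might be composed. The bit layout chosen
here (type in the low 3 bits, hash in the upper 29 bits) and the names
make_merge_key and split_merge_key are assumptions for illustration; the
text above only states the field widths.

```python
TYPE_BITS = 3                      # width of the event type field
HASH_BITS = 29                     # width of the reduced objectid hash
TYPE_RANGE = 2 ** TYPE_BITS        # 8 possible event types
HASH_RANGE = 2 ** HASH_BITS


def make_merge_key(event_type, full_hash):
    """Pack a 3-bit type and a hash truncated to 29 bits into one integer."""
    if event_type not in range(TYPE_RANGE):
        raise ValueError("event type does not fit in 3 bits")
    return (full_hash % HASH_RANGE) * TYPE_RANGE + event_type


def split_merge_key(key):
    """Return (event_type, hash29) from a key built by make_merge_key."""
    hash29, event_type = divmod(key, TYPE_RANGE)
    return event_type, hash29
```

Two events whose keys differ can be rejected as merge candidates with one
integer comparison. Equal keys still require comparing all fields, since
different objects may share a hash.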

Link: https://lore.kernel.org/r/20210304104826.3993892-3-amir73il@gmail.com
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue</title>
<updated>2021-03-16T15:14:23Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2021-03-04T10:48:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6f73171e192366ff7c98af9fb50615ef9615f8a7'/>
<id>urn:sha1:6f73171e192366ff7c98af9fb50615ef9615f8a7</id>
<content type='text'>
Current code has an assumption that fsnotify_notify_queue_is_empty() is
called to verify that the queue is not empty before trying to peek or
remove an event from the queue.

Remove this assumption by moving the fsnotify_notify_queue_is_empty()
check into the functions, allowing them to return NULL, and checking the
return value in all callers.
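
The resulting calling convention can be sketched as follows. This is a
Python analogy, not kernel code: the functions peek_first_event and
remove_first_event and the deque-based queue are hypothetical stand-ins,
and None plays the role of NULL.

```python
from collections import deque


def peek_first_event(queue):
    """Return the oldest event without removing it, or None if empty."""
    return queue[0] if queue else None


def remove_first_event(queue):
    """Remove and return the oldest event, or None if empty."""
    return queue.popleft() if queue else None


def drain(queue):
    """Caller side: no separate emptiness check, just test the return value."""
    handled = []
    while True:
        event = remove_first_event(queue)
        if event is None:
            break
        handled.append(event)
    return handled
```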

This is a prep patch for multi event queues.

Link: https://lore.kernel.org/r/20210304104826.3993892-2-amir73il@gmail.com
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>inotify, memcg: account inotify instances to kmemcg</title>
<updated>2021-01-05T13:42:54Z</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2020-12-20T04:46:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac7b79fd190b02e7151bc7d2b9da692f537657f3'/>
<id>urn:sha1:ac7b79fd190b02e7151bc7d2b9da692f537657f3</id>
<content type='text'>
Currently the fs sysctl inotify/max_user_instances is used to limit the
number of inotify instances on the system. For systems running multiple
workloads, the per-user namespace sysctl max_inotify_instances can be
used to further partition inotify instances. However, there is no easy
way to set a sensible system level max limit on inotify instances and
further partition it between the workloads. It is much easier to charge
the underlying resource (i.e. memory) behind the inotify instances to
the memcg of the workload and let their memory limits limit the number
of inotify instances they can create.

With inotify instances charged to memcg, the admin can simply set
max_user_instances to INT_MAX and let the memcg limits of the jobs limit
their inotify instances.

Link: https://lore.kernel.org/r/20201220044608.1258123-1-shakeelb@google.com
Reviewed-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: fix events reported to watching parent and child</title>
<updated>2020-12-11T10:40:43Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-12-02T12:07:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fecc4559780d52d174ea05e3bf543669165389c3'/>
<id>urn:sha1:fecc4559780d52d174ea05e3bf543669165389c3</id>
<content type='text'>
fsnotify_parent() used to send two separate events to backends when a
parent inode is watching children and the child inode is also watching.
In an attempt to avoid duplicate events in fanotify, we unified the two
backend callbacks to a single callback and handled the reporting of the
two separate events for the relevant backends (inotify and dnotify).
However the handling is buggy and can result in inotify and dnotify
listeners receiving events of the type they never asked for or spurious
events.

The problem is the unified event callback with two inode marks (parent and
child) is called when any of the parent and child inodes are watched and
interested in the event, but the parent inode's mark that is interested
in the event on the child is not necessarily the one we are currently
reporting to (it could belong to a different group).

So before reporting the parent or child event flavor to backend we need
to check that the mark is really interested in that event flavor.

The semantics of INODE and CHILD marks were hard to follow and made the
logic more complicated than it should have been.  Replace it with INODE
and PARENT marks semantics to hopefully make the logic more clear.

Thanks to Hugh Dickins for spotting a bug in the earlier version of this
patch.

Fixes: 497b0c5a7c06 ("fsnotify: send event to parent and child with single callback")
CC: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201202120713.702387-4-amir73il@gmail.com
Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: generalize handle_inode_event()</title>
<updated>2020-12-03T13:58:35Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-12-02T12:07:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=950cc0d2bef078e1f6459900ca4d4b2a2e0e3c37'/>
<id>urn:sha1:950cc0d2bef078e1f6459900ca4d4b2a2e0e3c37</id>
<content type='text'>
The handle_inode_event() interface was added as (quoting comment):
"a simple variant of handle_event() for groups that only have inode
marks and don't have ignore mask".

In other words, all backends except fanotify.  The inotify backend
also falls under this category, but because it required extra arguments
it was left out of the initial pass of backends conversion to the
simple interface.

This results in code duplication between the generic helper
fsnotify_handle_event() and the inotify_handle_event() callback,
which also happens to be buggy.

Generalize the handle_inode_event() arguments and add the check for
FS_EXCL_UNLINK flag to the generic helper, so inotify backend could
be converted to use the simple interface.

Link: https://lore.kernel.org/r/20201202120713.702387-2-amir73il@gmail.com
CC: stable@vger.kernel.org
Fixes: b9a1b9772509 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: create method handle_inode_event() in fsnotify_operations</title>
<updated>2020-07-27T21:25:50Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-07-22T12:58:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b9a1b9772509cbc6f6aa8bcd0b019f6347a2b631'/>
<id>urn:sha1:b9a1b9772509cbc6f6aa8bcd0b019f6347a2b631</id>
<content type='text'>
The method handle_event() grew a lot of complexity due to the design of
fanotify and merging of ignore masks.

Most backends do not care about this complex functionality, so we can hide
this complexity from them.

Introduce a method handle_inode_event() that serves those backends and
passes a single inode mark and fewer arguments.

This change converts all backends except fanotify and inotify to use the
simplified handle_inode_event() method.  In principle, inotify could have
also used the new method, but that would require passing more arguments
to the simple helper (data, data_type, cookie), so we leave it with the
handle_event() method.

Link: https://lore.kernel.org/r/20200722125849.17418-9-amir73il@gmail.com
Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: send event with parent/name info to sb/mount/non-dir marks</title>
<updated>2020-07-27T21:21:02Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-07-16T08:42:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b93f33105f5f9bd3d016ff870eb6000c9d89eff'/>
<id>urn:sha1:9b93f33105f5f9bd3d016ff870eb6000c9d89eff</id>
<content type='text'>
Similar to events "on child" sent to a watching directory, send an event
with parent/name info if sb/mount/non-dir marks are interested in
parent/name info.

The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify
interest in parent/name info for events on non-directory inodes.

Events on "orphan" children (disconnected dentries) are sent without
parent/name info.

Events on directories are sent with parent/name info only if the parent
directory is watching.

After this change, even groups that do not subscribe to events on
children could get an event with mark iterator type TYPE_CHILD and
without mark iterator type TYPE_INODE if fanotify has marks on the same
objects.

dnotify and inotify event handlers can already cope with that situation.
audit does not subscribe to events that are possible on child, so won't
get to this situation. nfsd does not access the marks iterator from its
event handler at the moment, so it is not affected.

This is a bit too fragile, so we should prepare all groups to cope with
mark type TYPE_CHILD preferably using a generic helper.

Link: https://lore.kernel.org/r/20200716084230.30611-16-amir73il@gmail.com
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: pass dir and inode arguments to fsnotify()</title>
<updated>2020-07-27T21:15:48Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-07-22T12:58:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=40a100d3adc1ad7f0a34875468c499fcecd20ba4'/>
<id>urn:sha1:40a100d3adc1ad7f0a34875468c499fcecd20ba4</id>
<content type='text'>
The arguments of fsnotify() are overloaded and mean different things
for different event types.

Replace the to_tell argument with separate arguments @dir and @inode,
because we may be sending to both dir and child.  Using the @data
argument to pass the child is not enough, because dirent events pass
this argument (for audit), but we do not report to child.

Document the new fsnotify() function arguments.

Link: https://lore.kernel.org/r/20200722125849.17418-7-amir73il@gmail.com
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
</feed>
