<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/fs/notify, branch stable/4.3.y</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=stable%2F4.3.y</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=stable%2F4.3.y'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-09-06T03:34:28Z</updated>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2015-09-06T03:34:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-06T03:34:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7d9071a095023cd1db8fa18fa0d648dc1a5210e0'/>
<id>urn:sha1:7d9071a095023cd1db8fa18fa0d648dc1a5210e0</id>
<content type='text'>
Pull vfs updates from Al Viro:
 "In this one:

   - d_move fixes (Eric Biederman)

   - UFS fixes (me; locking is mostly sane now, a bunch of bugs in error
     handling ought to be fixed)

   - switch of sb_writers to percpu rwsem (Oleg Nesterov)

   - superblock scalability (Josef Bacik and Dave Chinner)

   - swapon(2) race fix (Hugh Dickins)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits)
  vfs: Test for and handle paths that are unreachable from their mnt_root
  dcache: Reduce the scope of i_lock in d_splice_alias
  dcache: Handle escaped paths in prepend_path
  mm: fix potential data race in SyS_swapon
  inode: don't softlockup when evicting inodes
  inode: rename i_wb_list to i_io_list
  sync: serialise per-superblock sync operations
  inode: convert inode_sb_list_lock to per-sb
  inode: add hlist_fake to avoid the inode hash lock in evict
  writeback: plug writeback at a high level
  change sb_writers to use percpu_rw_semaphore
  shift percpu_counter_destroy() into destroy_super_work()
  percpu-rwsem: kill CONFIG_PERCPU_RWSEM
  percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
  percpu-rwsem: introduce percpu_down_read_trylock()
  document rwsem_release() in sb_wait_write()
  fix the broken lockdep logic in __sb_start_write()
  introduce __sb_writers_{acquired,release}() helpers
  ufs_inode_get{frag,block}(): get rid of 'phys' argument
  ufs_getfrag_block(): tidy up a bit
  ...
</content>
</entry>
<entry>
<title>fsnotify: get rid of fsnotify_destroy_mark_locked()</title>
<updated>2015-09-04T23:54:41Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.com</email>
</author>
<published>2015-09-04T22:43:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4712e722f91457e60723b9cef6265a74290efba9'/>
<id>urn:sha1:4712e722f91457e60723b9cef6265a74290efba9</id>
<content type='text'>
fsnotify_destroy_mark_locked() is subtle to use because it temporarily
releases group-&gt;mark_mutex.  To avoid future problems with this
function, split it into two.

fsnotify_detach_mark() is the part that needs group-&gt;mark_mutex and
fsnotify_free_mark() is the part that must be called outside of
group-&gt;mark_mutex.  This way it's much clearer what's going on and we
also avoid some pointless acquisitions of group-&gt;mark_mutex.
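
The split can be illustrated with a minimal user-space sketch.  This is
not the kernel code: Group, detach_mark(), free_mark() and destroy_mark()
are hypothetical stand-ins, and a threading.Lock plays the role of the
group's mark_mutex.  It only shows the calling convention: detach with
the mutex held, free with it dropped.

```python
import threading


class Group:
    """Hypothetical stand-in for a notification group."""

    def __init__(self):
        self.mark_mutex = threading.Lock()
        self.marks = []


def detach_mark(group, mark):
    # Must be called with group.mark_mutex held.
    assert group.mark_mutex.locked()
    group.marks.remove(mark)


def free_mark(group, mark, on_free):
    # Must be called without group.mark_mutex held, because the
    # callback may need to take that mutex itself.
    on_free(group, mark)


def destroy_mark(group, mark, on_free):
    # The lock scope is explicit in the caller: nothing is dropped
    # and re-acquired behind its back.
    with group.mark_mutex:
        detach_mark(group, mark)
    free_mark(group, mark, on_free)
```

A callback that takes mark_mutex again is safe here because
destroy_mark() has already released the mutex before calling it.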

Signed-off-by: Jan Kara &lt;jack@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fsnotify: remove mark-&gt;free_list</title>
<updated>2015-09-04T23:54:41Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.com</email>
</author>
<published>2015-09-04T22:43:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=925d1132a03e33cb8f29a0057300d023b4f1be23'/>
<id>urn:sha1:925d1132a03e33cb8f29a0057300d023b4f1be23</id>
<content type='text'>
Free list is used when all marks on given inode / mount should be
destroyed when inode / mount is going away.  However we can free all of
the marks without using a special list with some care.

Signed-off-by: Jan Kara &lt;jack@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fsnotify: fix check in inotify fdinfo printing</title>
<updated>2015-09-04T23:54:41Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2015-09-04T22:43:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3c53e514212455db9923c203694a72007558b48f'/>
<id>urn:sha1:3c53e514212455db9923c203694a72007558b48f</id>
<content type='text'>
The check in inotify_fdinfo() for whether a mark is valid was always
true due to a bug.  Luckily we can never get to invalidated marks since
we hold mark_mutex and invalidated marks get removed from the group list
when they are invalidated under that mutex.

Anyway fix the check to make code more future proof.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/notify: optimize inotify/fsnotify code for unwatched files</title>
<updated>2015-09-04T23:54:41Z</updated>
<author>
<name>Dave Hansen</name>
<email>dave.hansen@linux.intel.com</email>
</author>
<published>2015-09-04T22:43:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7c49b8616460ebb12ee56d80d1abfbc20b6f3cbb'/>
<id>urn:sha1:7c49b8616460ebb12ee56d80d1abfbc20b6f3cbb</id>
<content type='text'>
I have a _tiny_ microbenchmark that sits in a loop and writes single
bytes to a file.  Writing one byte to a tmpfs file is around 2x slower
than reading one byte from a file, which is a _bit_ more than I expected.
This is a dumb benchmark, but I think it's hard to deny that write() is
a hot path and we should avoid unnecessary overhead there.

I did a 'perf record' of 30-second samples of read and write.  The top
item in a diffprofile is srcu_read_lock() from fsnotify().  There are
active inotify fd's from systemd, but nothing is actually listening to
the file or its part of the filesystem.

I *think* we can avoid taking the srcu_read_lock() for the common case
where there are no actual marks on the file.  This means both that
there is nothing to notify *and* that there is no need to clear the
ignore mask.
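
The idea can be sketched in user space as follows.  This is only an
illustration under stated assumptions, not the patch itself: the
Notifier class, its inode_marks table and the lock_acquisitions counter
are hypothetical, and a plain threading.Lock stands in for the SRCU
read lock.

```python
import threading


class Notifier:
    """Hypothetical model of a notify path with an early-exit check."""

    def __init__(self):
        self.lock = threading.Lock()   # stands in for the SRCU read lock
        self.lock_acquisitions = 0     # counts how often the slow path ran
        self.inode_marks = {}          # path mapped to a list of callbacks

    def notify(self, path, event):
        marks = self.inode_marks.get(path)
        if not marks:
            # Fast path: nothing is watching, so skip the lock entirely.
            return 0
        with self.lock:
            self.lock_acquisitions += 1
            for callback in list(marks):
                callback(path, event)
            return len(marks)
```

Events on an unwatched path return immediately without touching the
lock; only watched paths pay for the slow path.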

This patch gave a 13.1% speedup in writes/second on my test, which is an
improvement from the 10.8% that I saw with the last version.

Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: John McCutchan &lt;john@johnmccutchan.com&gt;
Cc: Robert Love &lt;rlove@rlove.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>inode: convert inode_sb_list_lock to per-sb</title>
<updated>2015-08-17T22:39:46Z</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2015-03-04T17:37:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=74278da9f70d84d715601fe794567a6d2bfdf078'/>
<id>urn:sha1:74278da9f70d84d715601fe794567a6d2bfdf078</id>
<content type='text'>
The process of reducing contention on per-superblock inode lists
starts with moving the locking to match the per-superblock inode
list. This takes the global lock out of the picture and reduces the
contention problems to within a single filesystem. This doesn't get
rid of contention as the locks still have global CPU scope, but it
does isolate operations on different superblocks from each other.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Dave Chinner &lt;dchinner@redhat.com&gt;
</content>
</entry>
<entry>
<title>fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()</title>
<updated>2015-08-07T01:39:41Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.com</email>
</author>
<published>2015-08-06T22:46:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8f2f3eb59dff4ec538de55f2e0592fec85966aab'/>
<id>urn:sha1:8f2f3eb59dff4ec538de55f2e0592fec85966aab</id>
<content type='text'>
fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks() so that when fsnotify_destroy_mark_locked()
drops mark_mutex, a mark from the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed and thus the next
entry pointer we have cached may become stale and we dereference free
memory.

Fix the problem by first moving marks to free to a special private list
and then always free the first entry in the special list.  This method
is safe even when entries from the list can disappear once we drop the
lock.
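
A minimal sketch of that pattern, assuming a simple list guarded by one
lock.  It is not the kernel implementation: drain_marks() and its
parameters are hypothetical names chosen for illustration.

```python
import threading


def drain_marks(group_lock, group_marks, should_free, free_mark):
    """Free selected marks without caching a next pointer across unlock."""
    private = []
    # Step 1: under the lock, move every mark to be freed to a private list.
    with group_lock:
        for mark in list(group_marks):
            if should_free(mark):
                group_marks.remove(mark)
                private.append(mark)

    # Step 2: always take the current first entry.  The emptiness check is
    # redone under the lock on each pass, so it does not matter if entries
    # vanished from the list while the lock was dropped.
    freed = []
    while True:
        with group_lock:
            if not private:
                break
            mark = private.pop(0)
        free_mark(mark)  # runs with the lock dropped
        freed.append(mark)
    return freed
```

Because each iteration re-reads the head of the private list under the
lock, no stale pointer to a following entry is ever kept.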

Signed-off-by: Jan Kara &lt;jack@suse.com&gt;
Reported-by: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Reviewed-by: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Lino Sanfilippo &lt;LinoSanfilippo@gmx.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()"</title>
<updated>2015-07-21T23:06:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-07-21T23:06:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d725e66c06ab440032f49ef17e960896d0ec6d49'/>
<id>urn:sha1:d725e66c06ab440032f49ef17e960896d0ec6d49</id>
<content type='text'>
This reverts commit a2673b6e040663bf16a552f8619e6bde9f4b9acf.

Kinglong Mee reports a memory leak with that patch, and Jan Kara confirms:

 "Thanks for report! You are right that my patch introduces a race
  between fsnotify kthread and fsnotify_destroy_group() which can result
  in leaking inotify event on group destruction.

  I haven't yet decided whether the right fix is not to queue events for
  dying notification group (as that is pointless anyway) or whether we
  should just fix the original problem differently...  Whenever I look
  at fsnotify code mark handling I get lost in the maze of locks, lists,
  and subtle differences between how different notification systems
  handle notification marks :( I'll think about it over night"

and after thinking about it, Jan says:

 "OK, I have looked into the code some more and I found another
  relatively simple way of fixing the original oops.  It will be IMHO
  better than trying to fixup this issue which has more potential for
  breakage.  I'll ask Linus to revert the fsnotify fix he already merged
  and send a new fix"

Reported-by: Kinglong Mee &lt;kinglongmee@gmail.com&gt;
Requested-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()</title>
<updated>2015-07-17T23:39:54Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2015-07-17T23:24:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a2673b6e040663bf16a552f8619e6bde9f4b9acf'/>
<id>urn:sha1:a2673b6e040663bf16a552f8619e6bde9f4b9acf</id>
<content type='text'>
fsnotify_clear_marks_by_group_flags() can race with
fsnotify_destroy_marks() so when fsnotify_destroy_mark_locked() drops
mark_mutex, a mark from the list iterated by
fsnotify_clear_marks_by_group_flags() can be freed and we dereference free
memory in the loop there.

Fix the problem by keeping mark_mutex held in
fsnotify_destroy_mark_locked().  The reason why we drop that mutex is that
we need to call a -&gt;freeing_mark() callback which may acquire mark_mutex
again.  To avoid this and similar lock inversion issues, we move the call
to -&gt;freeing_mark() callback to the kthread destroying the mark.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reported-by: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Suggested-by: Lino Sanfilippo &lt;LinoSanfilippo@gmx.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/notify: don't use module_init for non-modular inotify_user code</title>
<updated>2015-06-16T18:12:34Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2015-05-02T00:08:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c013d5a4581203e074a1065e17378984544fcaef'/>
<id>urn:sha1:c013d5a4581203e074a1065e17378984544fcaef</id>
<content type='text'>
The INOTIFY_USER option is bool, and hence this code is either
present or absent.  It will never be modular, so using
module_init as an alias for __initcall is rather misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future.  If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.

Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups.  As __initcall gets
mapped onto device_initcall, our use of fs_initcall (which
makes sense for fs code) will thus change this registration
from level 6-device to level 5-fs (i.e. slightly earlier).
However, no impact from that small difference has been seen
during testing, nor is any expected.

Cc: John McCutchan &lt;john@johnmccutchan.com&gt;
Cc: Robert Love &lt;rlove@rlove.org&gt;
Cc: Eric Paris &lt;eparis@parisplace.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
</feed>
