<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/fs/file.c, branch v4.4.27</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.27</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.27'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-11-06T07:05:32Z</updated>
<entry>
<title>vfs: clear remainder of 'full_fds_bits' in dup_fd()</title>
<updated>2015-11-06T07:05:32Z</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers3@gmail.com</email>
</author>
<published>2015-11-06T06:32:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ea5c58e70c3a148ada0d3061a8f529589bb766ba'/>
<id>urn:sha1:ea5c58e70c3a148ada0d3061a8f529589bb766ba</id>
<content type='text'>
This fixes a bug from commit f3f86e33dc3d ("vfs: Fix pathological
performance case for __alloc_fd()").

v2: refactor to share fd bitmap copying code
Signed-off-by: Eric Biggers &lt;ebiggers3@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vfs: conditionally clear close-on-exec flag</title>
<updated>2015-10-31T23:14:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-10-31T23:06:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fc90888d07b8e17eec49c04bdb26344fdea96c3b'/>
<id>urn:sha1:fc90888d07b8e17eec49c04bdb26344fdea96c3b</id>
<content type='text'>
We clear the close-on-exec flag when opening and closing files, and the
bit was almost always already clear before.  Avoid dirtying the
cacheline if the clearning isn't necessary.  That avoids unnecessary
cacheline dirtying and bouncing in multi-socket environments.

Eric Dumazet has a file descriptor benchmark that goes 4% faster from
this on his two-socket machine.  It's probably partly superlinear
improvement due to getting slightly less spinlock contention on the
file_lock spinlock due to less work in the critical section.

Tested-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vfs: Fix pathological performance case for __alloc_fd()</title>
<updated>2015-10-31T23:12:10Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-10-30T23:53:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f3f86e33dc3da437fa4f204588ce7c78ea756982'/>
<id>urn:sha1:f3f86e33dc3da437fa4f204588ce7c78ea756982</id>
<content type='text'>
Al Viro points out that:
&gt; &gt;     * [Linux-specific aside] our __alloc_fd() can degrade quite badly
&gt; &gt; with some use patterns.  The cacheline pingpong in the bitmap is probably
&gt; &gt; inevitable, unless we accept considerably heavier memory footprint,
&gt; &gt; but we also have a case when alloc_fd() takes O(n) and it's _not_ hard
&gt; &gt; to trigger - close(3);open(...); will have the next open() after that
&gt; &gt; scanning the entire in-use bitmap.

And Eric Dumazet has a somewhat realistic multithreaded microbenchmark
that opens and closes a lot of sockets with minimal work per socket.

This patch largely fixes it.  We keep a 2nd-level bitmap of the open
file bitmaps, showing which words are already full.  So then we can
traverse that second-level bitmap to efficiently skip already allocated
file descriptors.

On his benchmark, this improves performance by up to an order of
magnitude, by avoiding the excessive open file bitmap scanning.

Tested-and-acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/file.c: __fget() and dup2() atomicity rules</title>
<updated>2015-07-01T06:31:08Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-06-29T15:10:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5ba97d2832f87943c43bb69cb1ef86dbc59df5bc'/>
<id>urn:sha1:5ba97d2832f87943c43bb69cb1ef86dbc59df5bc</id>
<content type='text'>
__fget() does lockless fetch of pointer from the descriptor
table, attempts to grab a reference and treats "it was already
zero" as "it's already gone from the table, we just hadn't
seen the store, let's fail".  Unfortunately, that breaks the
atomicity of dup2() - __fget() might see the old pointer,
notice that it's been already dropped and treat that as
"it's closed".  What we should be getting is either the
old file or new one, depending whether we come before or after
dup2().

Dmitry had following test failing sometimes :

int fd;
void *Thread(void *x) {
  char buf;
  int n = read(fd, &amp;buf, 1);
  if (n != 1)
    exit(printf("read failed: n=%d errno=%d\n", n, errno));
  return 0;
}

int main()
{
  fd = open("/dev/urandom", O_RDONLY);
  int fd2 = open("/dev/urandom", O_RDONLY);
  if (fd == -1 || fd2 == -1)
    exit(printf("open failed\n"));
  pthread_t th;
  pthread_create(&amp;th, 0, Thread, 0);
  if (dup2(fd2, fd) == -1)
    exit(printf("dup2 failed\n"));
  pthread_join(th, 0);
  if (close(fd) == -1)
    exit(printf("close failed\n"));
  if (close(fd2) == -1)
    exit(printf("close failed\n"));
  printf("DONE\n");
  return 0;
}

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>fs/file.c: don't acquire files-&gt;file_lock in fd_install()</title>
<updated>2015-07-01T06:30:09Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-06-30T13:54:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8a81252b774b53e628a8a0fe18e2b8fc236d92cc'/>
<id>urn:sha1:8a81252b774b53e628a8a0fe18e2b8fc236d92cc</id>
<content type='text'>
Mateusz Guzik reported :

 Currently obtaining a new file descriptor results in locking fdtable
 twice - once in order to reserve a slot and second time to fill it.

Holding the spinlock in __fd_install() is needed in case a resize is
done, or to prevent a resize.

Mateusz provided an RFC patch and a micro benchmark :
  http://people.redhat.com/~mguzik/pipebench.c

A resize is an unlikely operation in a process lifetime,
as table size is at least doubled at every resize.

We can use RCU instead of the spinlock.

__fd_install() must wait if a resize is in progress.

The resize must block new __fd_install() callers from starting,
and wait that ongoing install are finished (synchronize_sched())

resize should be attempted by a single thread to not waste resources.

rcu_sched variant is used, as __fd_install() and expand_fdtable() run
from process context.

It gives us a ~30% speedup using pipebench on a dual Intel(R) Xeon(R)
CPU E5-2696 v2 @ 2.50GHz

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Mateusz Guzik &lt;mguzik@redhat.com&gt;
Acked-by: Mateusz Guzik &lt;mguzik@redhat.com&gt;
Tested-by: Mateusz Guzik &lt;mguzik@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>mm: rcu-protected get_mm_exe_file()</title>
<updated>2015-04-17T13:04:07Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2015-04-16T19:47:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=90f31d0ea88880f780574f3d0bb1a227c4c66ca3'/>
<id>urn:sha1:90f31d0ea88880f780574f3d0bb1a227c4c66ca3</id>
<content type='text'>
This patch removes mm-&gt;mmap_sem from mm-&gt;exe_file read side.
Also it kills dup_mm_exe_file() and moves exe_file duplication into
dup_mmap() where both mmap_sems are locked.

[akpm@linux-foundation.org: fix comment typo]
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Cc: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/file.c: replace get_unused_fd() with get_unused_fd_flags(0)</title>
<updated>2014-12-11T01:41:10Z</updated>
<author>
<name>Yann Droneaud</name>
<email>ydroneaud@opteya.com</email>
</author>
<published>2014-12-10T23:45:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8d10a035829cacbd84bf35610870c35e566d1242'/>
<id>urn:sha1:8d10a035829cacbd84bf35610870c35e566d1242</id>
<content type='text'>
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

In a further patch, get_unused_fd() will be removed so that new code
start using get_unused_fd_flags(), with the hope O_CLOEXEC could be
used, either by default or choosen by userspace.

Signed-off-by: Yann Droneaud &lt;ydroneaud@opteya.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2014-10-13T13:44:12Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-13T13:44:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d6dd50e07c5bec00db2005969b1a01f8ca3d25ef'/>
<id>urn:sha1:d6dd50e07c5bec00db2005969b1a01f8ca3d25ef</id>
<content type='text'>
Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle were:

   - changes related to No-CBs CPUs and NO_HZ_FULL

   - RCU-tasks implementation

   - torture-test updates

   - miscellaneous fixes

   - locktorture updates

   - RCU documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
  workqueue: Use cond_resched_rcu_qs macro
  workqueue: Add quiescent state between work items
  locktorture: Cleanup header usage
  locktorture: Cannot hold read and write lock
  locktorture: Fix __acquire annotation for spinlock irq
  locktorture: Support rwlocks
  rcu: Eliminate deadlock between CPU hotplug and expedited grace periods
  locktorture: Document boot/module parameters
  rcutorture: Rename rcutorture_runnable parameter
  locktorture: Add test scenario for rwsem_lock
  locktorture: Add test scenario for mutex_lock
  locktorture: Make torture scripting account for new _runnable name
  locktorture: Introduce torture context
  locktorture: Support rwsems
  locktorture: Add infrastructure for torturing read locks
  torture: Address race in module cleanup
  locktorture: Make statistics generic
  locktorture: Teach about lock debugging
  locktorture: Support mutexes
  locktorture: Add documentation
  ...
</content>
</entry>
<entry>
<title>missing annotation in fs/file.c</title>
<updated>2014-10-09T06:39:11Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-08-31T18:12:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e983094d6dce524f3890edfec44b7ca6dbfa1183'/>
<id>urn:sha1:e983094d6dce524f3890edfec44b7ca6dbfa1183</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops</title>
<updated>2014-09-07T23:27:20Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.vnet.ibm.com</email>
</author>
<published>2014-07-01T18:26:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bde6c3aa993066acb0d6ce32ecabe03b9d5df92d'/>
<id>urn:sha1:bde6c3aa993066acb0d6ce32ecabe03b9d5df92d</id>
<content type='text'>
RCU-tasks requires the occasional voluntary context switch
from CPU-bound in-kernel tasks.  In some cases, this requires
instrumenting cond_resched().  However, there is some reluctance
to countenance unconditionally instrumenting cond_resched() (see
http://lwn.net/Articles/603252/), so this commit creates a separate
cond_resched_rcu_qs() that may be used in place of cond_resched() in
locations prone to long-duration in-kernel looping.

This commit currently instruments only RCU-tasks.  Future possibilities
include also instrumenting RCU, RCU-bh, and RCU-sched in order to reduce
IPI usage.

Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
</feed>
