<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/fs.h, branch v3.3.5</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.3.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.3.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-04-13T16:13:54Z</updated>
<entry>
<title>CIFS: Fix VFS lock usage for oplocked files</title>
<updated>2012-04-13T16:13:54Z</updated>
<author>
<name>Pavel Shilovsky</name>
<email>piastry@etersoft.ru</email>
</author>
<published>2012-03-28T17:56:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49c8ca7ade694f65de205adc08c23a710244b6b0'/>
<id>urn:sha1:49c8ca7ade694f65de205adc08c23a710244b6b0</id>
<content type='text'>
commit 66189be74ff5f9f3fd6444315b85be210d07cef2 upstream.

We can deadlock if we have a write oplock and two processes
use the same file handle. In this case the first process can't
unlock its lock if the second process blocked on the lock in the
same time.

Fix it by using posix_lock_file rather than posix_lock_file_wait
under cinode-&gt;lock_mutex. If we request a blocking lock and
posix_lock_file indicates that there is another lock that prevents
us, wait untill that lock is released and restart our call.

Acked-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Pavel Shilovsky &lt;piastry@etersoft.ru&gt;
Signed-off-by: Steve French &lt;sfrench@us.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>vfs: Provide function to get superblock and wait for it to thaw</title>
<updated>2012-02-14T01:45:38Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2012-02-10T10:03:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6b6dc836a195e077e76977b6c020a73de411b46d'/>
<id>urn:sha1:6b6dc836a195e077e76977b6c020a73de411b46d</id>
<content type='text'>
In quota code we need to find a superblock corresponding to a device and wait
for superblock to be unfrozen. However this waiting has to happen without
s_umount semaphore because that is required for superblock to thaw. So provide
a function in VFS for this to keep dances with s_umount where they belong.

[AV: implementation switched to saner variant]

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>migrate_mode.h is not exported to user mode</title>
<updated>2012-01-24T01:17:45Z</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2012-01-24T00:41:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c1aab02dac690af7ff634d8e1cb3be6a04387eef'/>
<id>urn:sha1:c1aab02dac690af7ff634d8e1cb3be6a04387eef</id>
<content type='text'>
so move its include into fs.h inside the __KERNEL__ protection.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix warnings regarding enum migrate_mode</title>
<updated>2012-01-23T16:38:47Z</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2012-01-20T22:33:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6536e3123e5d3371a6f52e32a3d0694bcc987702'/>
<id>urn:sha1:6536e3123e5d3371a6f52e32a3d0694bcc987702</id>
<content type='text'>
sparc64 allmodconfig:

In file included from include/linux/compat.h:15,
                 from /usr/src/25/arch/sparc/include/asm/siginfo.h:19,
                 from include/linux/signal.h:5,
                 from include/linux/sched.h:73,
                 from arch/sparc/kernel/asm-offsets.c:13:
include/linux/fs.h:618: warning: parameter has incomplete type

It seems that my sparc64 compiler (gcc-3.4.5) doesn't like the forward
declaration of enums.

Fix this by moving the "enum migrate_mode" definition into its own header
file.

Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Andy Isaacson &lt;adi@hexapodia.org&gt;
Cc: Nai Xia &lt;nai.xia@gmail.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-3.3/core' of git://git.kernel.dk/linux-block</title>
<updated>2012-01-15T20:24:45Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-01-15T20:24:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b3c9dd182ed3bdcdaf0e42625a35924b0497afdc'/>
<id>urn:sha1:b3c9dd182ed3bdcdaf0e42625a35924b0497afdc</id>
<content type='text'>
* 'for-3.3/core' of git://git.kernel.dk/linux-block: (37 commits)
  Revert "block: recursive merge requests"
  block: Stop using macro stubs for the bio data integrity calls
  blockdev: convert some macros to static inlines
  fs: remove unneeded plug in mpage_readpages()
  block: Add BLKROTATIONAL ioctl
  block: Introduce blk_set_stacking_limits function
  block: remove WARN_ON_ONCE() in exit_io_context()
  block: an exiting task should be allowed to create io_context
  block: ioc_cgroup_changed() needs to be exported
  block: recursive merge requests
  block, cfq: fix empty queue crash caused by request merge
  block, cfq: move icq creation and rq-&gt;elv.icq association to block core
  block, cfq: restructure io_cq creation path for io_context interface cleanup
  block, cfq: move io_cq exit/release to blk-ioc.c
  block, cfq: move icq cache management to block core
  block, cfq: move io_cq lookup to blk-ioc.c
  block, cfq: move cfqd-&gt;icq_list to request_queue and add request-&gt;elv.icq
  block, cfq: reorganize cfq_io_context into generic and cfq specific parts
  block: remove elevator_queue-&gt;ops
  block: reorder elevator switch sequence
  ...

Fix up conflicts in:
 - block/blk-cgroup.c
	Switch from can_attach_task to can_attach
 - block/cfq-iosched.c
	conflict with now removed cic index changes (we now use q-&gt;id instead)
</content>
</entry>
<entry>
<title>vfs: cache request_queue in struct block_device</title>
<updated>2012-01-13T04:13:12Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2012-01-13T01:20:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=87192a2a49c475cf322cb143e0fa63b0102d8567'/>
<id>urn:sha1:87192a2a49c475cf322cb143e0fa63b0102d8567</id>
<content type='text'>
This makes it possible to get from the inode to the request_queue with one
less cache miss.  Used in followon optimization.

The livetime of the pointer is the same as the gendisk.

This assumes that the queue will always stay the same in the gendisk while
it's visible to block_devices.  I think that's safe correct?

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Acked-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: compaction: introduce sync-light migration for use by compaction</title>
<updated>2012-01-13T04:13:09Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-01-13T01:19:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a6bc32b899223a877f595ef9ddc1e89ead5072b8'/>
<id>urn:sha1:a6bc32b899223a877f595ef9ddc1e89ead5072b8</id>
<content type='text'>
This patch adds a lightweight sync migrate operation MIGRATE_SYNC_LIGHT
mode that avoids writing back pages to backing storage.  Async compaction
maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT.
For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is
used.

This avoids sync compaction stalling for an excessive length of time,
particularly when copying files to a USB stick where there might be a
large number of dirty pages backed by a filesystem that does not support
-&gt;writepages.

[aarcange@redhat.com: This patch is heavily based on Andrea's work]
[akpm@linux-foundation.org: fix fs/nfs/write.c build]
[akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Andy Isaacson &lt;adi@hexapodia.org&gt;
Cc: Nai Xia &lt;nai.xia@gmail.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: compaction: determine if dirty pages can be migrated without blocking within -&gt;migratepage</title>
<updated>2012-01-13T04:13:09Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-01-13T01:19:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b969c4ab9f182a6e1b2a0848be349f99714947b0'/>
<id>urn:sha1:b969c4ab9f182a6e1b2a0848be349f99714947b0</id>
<content type='text'>
Asynchronous compaction is used when allocating transparent hugepages to
avoid blocking for long periods of time.  Due to reports of stalling,
there was a debate on disabling synchronous compaction but this severely
impacted allocation success rates.  Part of the reason was that many dirty
pages are skipped in asynchronous compaction by the following check;

	if (PageDirty(page) &amp;&amp; !sync &amp;&amp;
		mapping-&gt;a_ops-&gt;migratepage != migrate_page)
			rc = -EBUSY;

This skips over all mapping aops using buffer_migrate_page() even though
it is possible to migrate some of these pages without blocking.  This
patch updates the -&gt;migratepage callback with a "sync" parameter.  It is
the responsibility of the callback to fail gracefully if migration would
block.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Andy Isaacson &lt;adi@hexapodia.org&gt;
Cc: Nai Xia &lt;nai.xia@gmail.com&gt;
Cc: Johannes Weiner &lt;jweiner@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>epoll: limit paths</title>
<updated>2012-01-13T04:13:04Z</updated>
<author>
<name>Jason Baron</name>
<email>jbaron@redhat.com</email>
</author>
<published>2012-01-13T01:17:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=28d82dc1c4edbc352129f97f4ca22624d1fe61de'/>
<id>urn:sha1:28d82dc1c4edbc352129f97f4ca22624d1fe61de</id>
<content type='text'>
The current epoll code can be tickled to run basically indefinitely in
both loop detection path check (on ep_insert()), and in the wakeup paths.
The programs that tickle this behavior set up deeply linked networks of
epoll file descriptors that cause the epoll algorithms to traverse them
indefinitely.  A couple of these sample programs have been previously
posted in this thread: https://lkml.org/lkml/2011/2/25/297.

To fix the loop detection path check algorithms, I simply keep track of
the epoll nodes that have been already visited.  Thus, the loop detection
becomes proportional to the number of epoll file descriptor and links.
This dramatically decreases the run-time of the loop check algorithm.  In
one diabolical case I tried it reduced the run-time from 15 mintues (all
in kernel time) to .3 seconds.

Fixing the wakeup paths could be done at wakeup time in a similar manner
by keeping track of nodes that have already been visited, but the
complexity is harder, since there can be multiple wakeups on different
cpus...Thus, I've opted to limit the number of possible wakeup paths when
the paths are created.

This is accomplished, by noting that the end file descriptor points that
are found during the loop detection pass (from the newly added link), are
actually the sources for wakeup events.  I keep a list of these file
descriptors and limit the number and length of these paths that emanate
from these 'source file descriptors'.  In the current implemetation I
allow 1000 paths of length 1, 500 of length 2, 100 of length 3, 50 of
length 4 and 10 of length 5.  Note that it is sufficient to check the
'source file descriptors' reachable from the newly added link, since no
other 'source file descriptors' will have newly added links.  This allows
us to check only the wakeup paths that may have gotten too long, and not
re-check all possible wakeup paths on the system.

In terms of the path limit selection, I think its first worth noting that
the most common case for epoll, is probably the model where you have 1
epoll file descriptor that is monitoring n number of 'source file
descriptors'.  In this case, each 'source file descriptor' has a 1 path of
length 1.  Thus, I believe that the limits I'm proposing are quite
reasonable and in fact may be too generous.  Thus, I'm hoping that the
proposed limits will not prevent any workloads that currently work to
fail.

In terms of locking, I have extended the use of the 'epmutex' to all
epoll_ctl add and remove operations.  Currently its only used in a subset
of the add paths.  I need to hold the epmutex, so that we can correctly
traverse a coherent graph, to check the number of paths.  I believe that
this additional locking is probably ok, since its in the setup/teardown
paths, and doesn't affect the running paths, but it certainly is going to
add some extra overhead.  Also, worth noting is that the epmuex was
recently added to the ep_ctl add operations in the initial path loop
detection code using the argument that it was not on a critical path.

Another thing to note here, is the length of epoll chains that is allowed.
Currently, eventpoll.c defines:

/* Maximum number of nesting allowed inside epoll sets */
#define EP_MAX_NESTS 4

This basically means that I am limited to a graph depth of 5 (EP_MAX_NESTS
+ 1).  However, this limit is currently only enforced during the loop
check detection code, and only when the epoll file descriptors are added
in a certain order.  Thus, this limit is currently easily bypassed.  The
newly added check for wakeup paths, stricly limits the wakeup paths to a
length of 5, regardless of the order in which ep's are linked together.
Thus, a side-effect of the new code is a more consistent enforcement of
the graph depth.

Thus far, I've tested this, using the sample programs previously
mentioned, which now either return quickly or return -EINVAL.  I've also
testing using the piptest.c epoll tester, which showed no difference in
performance.  I've also created a number of different epoll networks and
tested that they behave as expectded.

I believe this solves the original diabolical test cases, while still
preserving the sane epoll nesting.

Signed-off-by: Jason Baron &lt;jbaron@redhat.com&gt;
Cc: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Cc: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>block: Add BLKROTATIONAL ioctl</title>
<updated>2012-01-11T15:29:31Z</updated>
<author>
<name>Martin K. Petersen</name>
<email>martin.petersen@oracle.com</email>
</author>
<published>2012-01-11T15:29:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ef00f59c95fe6e002e7c6e3663cdea65e253f4cc'/>
<id>urn:sha1:ef00f59c95fe6e002e7c6e3663cdea65e253f4cc</id>
<content type='text'>
Introduce an ioctl which permits applications to query whether a block
device is rotational.

Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
