<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/fs, branch v3.12.43</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.43</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.43'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-05-16T07:12:43Z</updated>
<entry>
<title>mnt: Fix fs_fully_visible to verify the root directory is visible</title>
<updated>2015-05-16T07:12:43Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-05-08T21:36:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=95c021d8b720eb3ade60e98b593f38a47df039fb'/>
<id>urn:sha1:95c021d8b720eb3ade60e98b593f38a47df039fb</id>
<content type='text'>
commit 7e96c1b0e0f495c5a7450dc4aa7c9a24ba4305bd upstream.

This fixes a dumb bug in fs_fully_visible that allows proc or sys to
be mounted if there is a bind mount of part of /proc/ or /sys/ visible.
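
A hedged sketch of the kind of check the fix adds in fs_fully_visible()
(fs/namespace.c; treat the exact field names as assumptions about kernels
of that era):

    list_for_each_entry(mnt, &amp;ns-&gt;list, mnt_list) {
        if (mnt-&gt;mnt.mnt_sb-&gt;s_type != type)
            continue;
        /* A bind mount of only part of /proc or /sys has a root
         * dentry that is not the filesystem's root; such a mount
         * must not count as making the filesystem fully visible. */
        if (mnt-&gt;mnt.mnt_root != mnt-&gt;mnt.mnt_sb-&gt;s_root)
            continue;
        ...
    }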

Reported-by: Eric Windisch &lt;ewindisch@docker.com&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()</title>
<updated>2015-05-16T07:12:42Z</updated>
<author>
<name>Ryusuke Konishi</name>
<email>konishi.ryusuke@lab.ntt.co.jp</email>
</author>
<published>2015-05-05T23:24:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=579524b1a966ee12d83ab116140c493eb8e2baaa'/>
<id>urn:sha1:579524b1a966ee12d83ab116140c493eb8e2baaa</id>
<content type='text'>
commit d8fd150fe3935e1692bf57c66691e17409ebb9c1 upstream.

The range check for the b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though valid levels are limited to the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from the storage device and used to index
the nilfs_btree_path array, whose element count is NILFS_BTREE_LEVEL_MAX, a
memory overrun can occur during btree operations if the level parameter on
the device is set to the boundary value.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
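
A sketch of the off-by-one (the surrounding checks in
nilfs_btree_root_broken() are elided; the corrected comparison is the one
the text describes):

    level = nilfs_btree_node_get_level(node);
    if (level &lt; NILFS_BTREE_LEVEL_NODE_MIN ||
        level &gt;= NILFS_BTREE_LEVEL_MAX)  /* was: &gt; NILFS_BTREE_LEVEL_MAX */
        return -EINVAL;  /* the upper bound is exclusive */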

Signed-off-by: Ryusuke Konishi &lt;konishi.ryusuke@lab.ntt.co.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>ocfs2: dlm: fix race between purge and get lock resource</title>
<updated>2015-05-16T07:12:42Z</updated>
<author>
<name>Junxiao Bi</name>
<email>junxiao.bi@oracle.com</email>
</author>
<published>2015-05-05T23:24:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad02ca86cf07d341bc6b85ade0e5b7f5e96839ba'/>
<id>urn:sha1:ad02ca86cf07d341bc6b85ade0e5b7f5e96839ba</id>
<content type='text'>
commit b1432a2a35565f538586774a03bf277c27fc267d upstream.

There is a race window in dlm_get_lock_resource(), which may return a
lock resource that has been purged.  This will cause the process to
hang forever in dlmlock(), as the AST message can't be handled because
its lock resource no longer exists.

    dlm_get_lock_resource {
        ...
        spin_lock(&amp;dlm-&gt;spinlock);
        tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
        if (tmpres) {
             spin_unlock(&amp;dlm-&gt;spinlock);
             &gt;&gt;&gt;&gt;&gt;&gt;&gt;&gt; race window, dlm_run_purge_list() may run and purge
                              the lock resource
             spin_lock(&amp;tmpres-&gt;spinlock);
             ...
             spin_unlock(&amp;tmpres-&gt;spinlock);
        }
    }
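
One way to close the window, sketched under the assumption that
dlm_run_purge_list() unhashes the lockres before freeing it (field and
helper names follow the fs/ocfs2/dlm code of that era):

    spin_lock(&amp;tmpres-&gt;spinlock);
    /* The purge may have run in the window above; if the lockres
     * was unhashed, drop our reference and redo the lookup. */
    if (hlist_unhashed(&amp;tmpres-&gt;hash_node)) {
        spin_unlock(&amp;tmpres-&gt;spinlock);
        dlm_lockres_put(tmpres);
        tmpres = NULL;
        goto lookup;
    }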

Signed-off-by: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Cc: Joseph Qi &lt;joseph.qi@huawei.com&gt;
Cc: Mark Fasheh &lt;mfasheh@suse.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>fs/seq_file: fallback to vmalloc allocation</title>
<updated>2015-05-15T07:10:57Z</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2014-07-02T22:22:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=77167581c8a317d95e15b164acd323ea3aef0a2c'/>
<id>urn:sha1:77167581c8a317d95e15b164acd323ea3aef0a2c</id>
<content type='text'>
commit 058504edd02667eef8fac9be27ab3ea74332e9b4 upstream.

There are a couple of seq_files which use the single_open() interface.
This interface requires that the whole output fit into a single
buffer.

For /proc/stat, for example, allocation failures have been observed
because an order-4 memory allocation failed due to memory fragmentation.
In such situations reading /proc/stat is no longer possible.

Therefore change the seq_file code to fall back to vmalloc allocations,
which will usually result in a couple of order-0 allocations and hence
also work if memory is fragmented.
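
A minimal sketch of such a fallback helper (the shape of the change, not
necessarily the exact upstream code):

    static void *seq_buf_alloc(unsigned long size)
    {
        void *buf;

        /* Try kmalloc first, suppressing the allocation-failure
         * warning; fall back to vmalloc, which only needs
         * order-0 pages. */
        buf = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
        if (!buf &amp;&amp; size &gt; PAGE_SIZE)
            buf = vmalloc(size);
        return buf;
    }

The matching free side then has to use kvfree() (or an is_vmalloc_addr()
check) rather than a plain kfree().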

For reference, a call trace where reading from /proc/stat failed:

  sadc: page allocation failure: order:4, mode:0x1040d0
  CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
  [...]
  Call Trace:
    show_stack+0x6c/0xe8
    warn_alloc_failed+0xd6/0x138
    __alloc_pages_nodemask+0x9da/0xb68
    __get_free_pages+0x2e/0x58
    kmalloc_order_trace+0x44/0xc0
    stat_open+0x5a/0xd8
    proc_reg_open+0x8a/0x140
    do_dentry_open+0x1bc/0x2c8
    finish_open+0x46/0x60
    do_last+0x382/0x10d0
    path_openat+0xc8/0x4f8
    do_filp_open+0x46/0xa8
    do_sys_open+0x114/0x1f0
    sysc_tracego+0x14/0x1a

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Tested-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Hendrik Brueckner &lt;brueckner@linux.vnet.ibm.com&gt;
Cc: Thorsten Diehl &lt;thorsten.diehl@de.ibm.com&gt;
Cc: Andrea Righi &lt;andrea@betterlinux.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>seq_file: always clear m-&gt;count when we free m-&gt;buf</title>
<updated>2015-05-15T07:10:56Z</updated>
<author>
<name>Al Viro</name>
<email>viro@ZenIV.linux.org.uk</email>
</author>
<published>2013-11-19T01:20:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bb1317a3eb0f8cc3615e77cfa83b2902f35906d8'/>
<id>urn:sha1:bb1317a3eb0f8cc3615e77cfa83b2902f35906d8</id>
<content type='text'>
commit 801a76050bcf8d4e500eb8d048ff6265f37a61c8 upstream.

Once we'd freed m-&gt;buf, m-&gt;count should become zero - we have no valid
contents reachable via m-&gt;buf.
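
A sketch of the pattern at a buffer-reallocation site such as the
Eoverflow path in traverse() (the exact context is an assumption):

    m-&gt;op-&gt;stop(m, p);
    kfree(m-&gt;buf);
    m-&gt;count = 0;  /* the old contents are gone; never flush them */
    m-&gt;buf = kmalloc(m-&gt;size &lt;&lt;= 1, GFP_KERNEL);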

Reported-by: Charley (Hao Chuan) Chu &lt;charley.chu@broadcom.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>/proc/stat: convert to single_open_size()</title>
<updated>2015-05-15T07:10:56Z</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2014-07-02T22:22:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a58d83a1a10afa8b19a026859bcb6c47832975a8'/>
<id>urn:sha1:a58d83a1a10afa8b19a026859bcb6c47832975a8</id>
<content type='text'>
commit f74373a5cc7a0155d232c4e999648c7a95435bb2 upstream.

These two patches are supposed to "fix" failed order-4 memory
allocations which have been observed when reading /proc/stat.  The
problem has been observed on s390 as well as on x86.

To address the problem, change the seq_file memory allocations to
fall back to vmalloc, so that allocations also work if memory is
fragmented.

This approach seems to be simpler and less intrusive than changing
/proc/stat to use an iterator.  It also "fixes" other users of
seq_file's single_open() interface.

This patch (of 2):

Use seq_file's single_open_size() to preallocate a buffer that is large
enough to hold the whole output, instead of open coding it.  Also
calculate the requested size using the number of online cpus instead of
possible cpus, since the size of the output only depends on the number
of online cpus.
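
A sketch of the resulting stat_open() in fs/proc/stat.c (the size
estimate is illustrative; the real heuristic may differ):

    static int stat_open(struct inode *inode, struct file *file)
    {
        /* Size the buffer from online cpus, not possible cpus. */
        size_t size = 1024 + 128 * num_online_cpus();

        return single_open_size(file, show_stat, NULL, size);
    }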

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Hendrik Brueckner &lt;brueckner@linux.vnet.ibm.com&gt;
Cc: Thorsten Diehl &lt;thorsten.diehl@de.ibm.com&gt;
Cc: Andrea Righi &lt;andrea@betterlinux.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>ext4: fix data corruption caused by unwritten and delayed extents</title>
<updated>2015-05-15T07:10:49Z</updated>
<author>
<name>Lukas Czerner</name>
<email>lczerner@redhat.com</email>
</author>
<published>2015-05-03T01:36:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=50a61d0ce8cadb9185b6fc47bcc99ce7bf69c62c'/>
<id>urn:sha1:50a61d0ce8cadb9185b6fc47bcc99ce7bf69c62c</id>
<content type='text'>
commit d2dc317d564a46dfc683978a2e5a4f91434e9711 upstream.

Currently it is possible to lose a whole file system block's worth of
data when we hit a specific interaction between unwritten and delayed
extents in the extent status tree.

The problem is that when we insert a delayed extent into the extent
status tree, the only way to get rid of it is to write out the delayed
buffer.  However, there is a limitation in the extent status tree
implementation: when inserting an unwritten extent, should there be even
a single delayed block, the whole unwritten extent is marked as delayed.

At this point there is no way to get rid of the delayed extent, because
there are no delayed buffers to write out.  So when we write into said
unwritten extent, we will convert it to written, but it still remains
delayed.

When we later try to write into that block, ext4_da_map_blocks() will
mark the buffer new and delayed and map it to an invalid block, which
causes the rest of the block to be zeroed, losing already-written data.

For now, fix this by simply not allowing the delayed status to be set on
a written extent in the extent status tree.  Also add a WARN_ON() to
make sure that we notice if this happens in the future.

This problem can be easily reproduced by running the following xfs_io
commands:

xfs_io -f -c "pwrite -S 0xaa 4096 2048" \
          -c "falloc 0 131072" \
          -c "pwrite -S 0xbb 65536 2048" \
          -c "fsync" /mnt/test/fff

echo 3 &gt; /proc/sys/vm/drop_caches
xfs_io -c "pwrite -S 0xdd 67584 2048" /mnt/test/fff

This can, in theory, also be reproduced at random by running fsx, but
it's not very reliable, though on machines with a bigger page size
(like ppc) it can be seen more often (especially with xfstest
generic/127).
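
A hedged sketch of the guard at extent-status insertion time (flag names
follow fs/ext4/extents_status.h; the exact placement is an assumption):

    if ((status &amp; EXTENT_STATUS_DELAYED) &amp;&amp;
        (status &amp; EXTENT_STATUS_WRITTEN)) {
        /* A written extent must never be marked delayed again. */
        WARN_ON(1);
        status &amp;= ~EXTENT_STATUS_DELAYED;
    }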

Signed-off-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>fs: take i_mutex during prepare_binprm for set[ug]id executables</title>
<updated>2015-05-15T07:10:41Z</updated>
<author>
<name>Jann Horn</name>
<email>jann@thejh.net</email>
</author>
<published>2015-04-19T00:48:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5176b77f1aacdc560eaeac4685ade444bb814689'/>
<id>urn:sha1:5176b77f1aacdc560eaeac4685ade444bb814689</id>
<content type='text'>
commit 8b01fc86b9f425899f8a3a8fc1c47d73c2c20543 upstream.

This prevents a race between chown() and execve(), where chowning a
setuid-user binary to root would momentarily make the binary setuid
root.
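
A sketch of the idea: sample the set[ug]id bits and the owner under
i_mutex, the lock chown() takes, so a racing chown() cannot switch the
owner mid-exec (the helper name and exact fields are assumptions):

    static void bprm_fill_uid(struct linux_binprm *bprm)
    {
        struct inode *inode = file_inode(bprm-&gt;file);
        umode_t mode = inode-&gt;i_mode;

        if (!(mode &amp; (S_ISUID | S_ISGID)))
            return;

        /* Re-read mode and owner with chown() locked out. */
        mutex_lock(&amp;inode-&gt;i_mutex);
        mode = inode-&gt;i_mode;
        if (mode &amp; S_ISUID)
            bprm-&gt;cred-&gt;euid = inode-&gt;i_uid;
        if ((mode &amp; (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
            bprm-&gt;cred-&gt;egid = inode-&gt;i_gid;
        mutex_unlock(&amp;inode-&gt;i_mutex);
    }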

This patch was mostly written by Linus Torvalds.

Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Charles Williams &lt;ciwillia@brocade.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>RCU pathwalk breakage when running into a symlink overmounting something</title>
<updated>2015-05-15T07:10:36Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-04-24T19:47:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6e310c82af1ba014a4bb312551d1a74d1c71761a'/>
<id>urn:sha1:6e310c82af1ba014a4bb312551d1a74d1c71761a</id>
<content type='text'>
commit 3cab989afd8d8d1bc3d99fef0e7ed87c31e7b647 upstream.

Calling unlazy_walk() in walk_component() and do_last() when we find
a symlink that needs to be followed doesn't acquire a reference to vfsmount.
That's fine when the symlink is on the same vfsmount as the parent directory
(which is almost always the case), but it's not always true - one _can_
manage to bind a symlink on top of something.  And in such cases we end up
with excessive mntput().
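
A sketch of the guard in walk_component() (do_last() gets the same
treatment; the exact condition is an assumption):

    if (should_follow_link(path-&gt;dentry, follow)) {
        if (nd-&gt;flags &amp; LOOKUP_RCU) {
            /* unlazy_walk() legitimizes nd-&gt;path.mnt only; if the
             * symlink sits on a different vfsmount we hold no
             * reference to it, so bail out with -ECHILD and let
             * the caller restart in ref-walk mode. */
            if (unlikely(nd-&gt;path.mnt != path-&gt;mnt ||
                         unlazy_walk(nd, path-&gt;dentry))) {
                err = -ECHILD;
                goto out_err;
            }
        }
        ...
    }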

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>ext4: make fsync to sync parent dir in no-journal for real this time</title>
<updated>2015-05-04T09:50:05Z</updated>
<author>
<name>Lukas Czerner</name>
<email>lczerner@redhat.com</email>
</author>
<published>2015-04-03T14:46:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7b22e25281cd03ed8c7f5ebb559dc109bb592cc'/>
<id>urn:sha1:f7b22e25281cd03ed8c7f5ebb559dc109bb592cc</id>
<content type='text'>
commit e12fb97222fc41e8442896934f76d39ef99b590a upstream.

Previously, commit 14ece1028b3ed53ffec1b1213ffc6acaf79ad77c added
support for syncing the parent directory of newly created inodes, to
make sure that the inode is not lost after a power failure in
no-journal mode.

However, this does not work in the majority of cases, namely:
 - if the directory has inline data
 - if the directory is already indexed
 - if the directory already has at least one block and:
	- the new entry fits into it
	- or we've successfully converted it to indexed

So in those cases we might lose the inode entirely, even after an fsync,
in no-journal mode.  This obviously also includes the default ext2 mode.

I noticed this while running xfstest generic/321: even though the test
should fail (we need to run fsck after a crash in no-journal mode), I
could not find the newly created entries even when they had been fsynced
beforehand.

Fix this by adjusting the successful exit paths of ext4_add_entry() to
set EXT4_STATE_NEWENTRY on the inode, so that fsync has the chance to
sync the parent directory as well.
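
A sketch of the shape of the fix, funneling the successful exits of
ext4_add_entry() through one spot (the label and variable names are
illustrative):

    out:
        if (retval == 0)
            ext4_set_inode_state(inode, EXT4_STATE_NEWENTRY);
        return retval;

The existing no-journal fsync path then notices EXT4_STATE_NEWENTRY and
syncs the parent directory.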

Signed-off-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Frank Mayhar &lt;fmayhar@google.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
</feed>
