<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/fs/ext4/inode.c, branch v3.2.52</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.52</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.52'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-07-27T04:34:34Z</updated>
<entry>
<title>ext4: fix overflow when counting used blocks on 32-bit architectures</title>
<updated>2013-07-27T04:34:34Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2013-05-31T23:39:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=967c9a98626b1e402ab042bd770b7109532f49fb'/>
<id>urn:sha1:967c9a98626b1e402ab042bd770b7109532f49fb</id>
<content type='text'>
commit 8af8eecc1331dbf5e8c662022272cf667e213da5 upstream.

The arithmetics adding delalloc blocks to the number of used blocks in
ext4_getattr() can easily overflow on 32-bit archs as we first multiply
number of blocks by blocksize and then divide back by 512. Make the
arithmetics more clever and also use proper type (unsigned long long
instead of unsigned long).

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4/jbd2: don't wait (forever) for stale tid caused by wraparound</title>
<updated>2013-05-13T14:02:14Z</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2013-04-04T02:02:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=164ed4383ca615f9a4f19673ac82372ccf96a33f'/>
<id>urn:sha1:164ed4383ca615f9a4f19673ac82372ccf96a33f</id>
<content type='text'>
commit d76a3a77113db020d9bb1e894822869410450bd9 upstream.

In the case where an inode has a very stale transaction id (tid) in
i_datasync_tid or i_sync_tid, it's possible that after a very large
(2**31) number of transactions, that the tid number space might wrap,
causing tid_geq()'s calculations to fail.

Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
attempted to fix this problem, but it only avoided kjournald spinning
forever by fixing the logic in jbd2_log_start_commit().

Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
that might call jbd2_log_start_commit() with a stale tid, those
functions will subsequently call jbd2_log_wait_commit() with the same
stale tid, and then wait for a very long time.  To fix this, we
replace the calls to jbd2_log_start_commit() and
jbd2_log_wait_commit() with a call to a new function,
jbd2_complete_transaction(), which will correctly handle stale tid's.

As a bonus, jbd2_complete_transaction() will avoid locking
j_state_lock for writing unless a commit needs to be started.  This
should have a small (but probably not measurable) improvement for
ext4's scalability.

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Reported-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Reported-by: George Barnett &lt;gbarnett@atlassian.com&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: fix data=journal fast mount/umount hang</title>
<updated>2013-03-27T02:41:18Z</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2013-03-20T13:42:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c5b769c09e752469a2342105e882be3bad04134c'/>
<id>urn:sha1:c5b769c09e752469a2342105e882be3bad04134c</id>
<content type='text'>
commit 2b405bfa84063bfa35621d2d6879f52693c614b0 upstream.

In data=journal mode, if we unmount the file system before a
transaction has a chance to complete, when the journal inode is being
evicted, we can end up calling into jbd2_log_wait_commit() for the
last transaction, after the journalling machinery has been shut down.

Arguably we should adjust ext4_should_journal_data() to return FALSE
for the journal inode, but the only place it matters is
ext4_evict_inode(), and so to save a bit of CPU time, and to make the
patch much more obviously correct by inspection(tm), we'll fix it by
explicitly not trying to waiting for a journal commit when we are
evicting the journal inode, since it's guaranteed to never succeed in
this case.

This can be easily replicated via:

     mount -t ext4 -o data=journal /dev/vdb /vdb ; umount /vdb

------------[ cut here ]------------
WARNING: at /usr/projects/linux/ext4/fs/jbd2/journal.c:542 __jbd2_log_start_commit+0xba/0xcd()
Hardware name: Bochs
JBD2: bad log_start_commit: 3005630206 3005630206 0 0
Modules linked in:
Pid: 2909, comm: umount Not tainted 3.8.0-rc3 #1020
Call Trace:
 [&lt;c015c0ef&gt;] warn_slowpath_common+0x68/0x7d
 [&lt;c02b7e7d&gt;] ? __jbd2_log_start_commit+0xba/0xcd
 [&lt;c015c177&gt;] warn_slowpath_fmt+0x2b/0x2f
 [&lt;c02b7e7d&gt;] __jbd2_log_start_commit+0xba/0xcd
 [&lt;c02b8075&gt;] jbd2_log_start_commit+0x24/0x34
 [&lt;c0279ed5&gt;] ext4_evict_inode+0x71/0x2e3
 [&lt;c021f0ec&gt;] evict+0x94/0x135
 [&lt;c021f9aa&gt;] iput+0x10a/0x110
 [&lt;c02b7836&gt;] jbd2_journal_destroy+0x190/0x1ce
 [&lt;c0175284&gt;] ? bit_waitqueue+0x50/0x50
 [&lt;c028d23f&gt;] ext4_put_super+0x52/0x294
 [&lt;c020efe3&gt;] generic_shutdown_super+0x48/0xb4
 [&lt;c020f071&gt;] kill_block_super+0x22/0x60
 [&lt;c020f3e0&gt;] deactivate_locked_super+0x22/0x49
 [&lt;c020f5d6&gt;] deactivate_super+0x30/0x33
 [&lt;c0222795&gt;] mntput_no_expire+0x107/0x10c
 [&lt;c02233a7&gt;] sys_umount+0x2cf/0x2e0
 [&lt;c02233ca&gt;] sys_oldumount+0x12/0x14
 [&lt;c08096b8&gt;] syscall_call+0x7/0xb
---[ end trace 6a954cc790501c1f ]---
jbd2_log_wait_commit: error: j_commit_request=-1289337090, tid=0

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: fix possible use-after-free with AIO</title>
<updated>2013-03-06T03:23:45Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2013-01-30T03:48:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3e411534ea3be89a7d0ddf4e62e5c5b79d80556a'/>
<id>urn:sha1:3e411534ea3be89a7d0ddf4e62e5c5b79d80556a</id>
<content type='text'>
commit 091e26dfc156aeb3b73bc5c5f277e433ad39331c upstream.

Running AIO is pinning inode in memory using file reference. Once AIO
is completed using aio_complete(), file reference is put and inode can
be freed from memory. So we have to be sure that calling aio_complete()
is the last thing we do with the inode.

Reviewed-by: Carlos Maiolino &lt;cmaiolino@redhat.com&gt;
Acked-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: return ENOMEM if sb_getblk() fails</title>
<updated>2013-03-06T03:22:36Z</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2013-01-12T21:19:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=99d930c0ffe5396442f93cd3c421a9bd1984b923'/>
<id>urn:sha1:99d930c0ffe5396442f93cd3c421a9bd1984b923</id>
<content type='text'>
commit 860d21e2c585f7ee8a4ecc06f474fdc33c9474f4 upstream.

The only reason for sb_getblk() failing is if it can't allocate the
buffer_head.  So ENOMEM is more appropriate than EIO.  In addition,
make sure that the file system is marked as being inconsistent if
sb_getblk() fails.

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
[bwh: Backported to 3.2:
 - Adjust context
 - Drop change to inline.c
 - Call to ext4_ext_check() from ext4_ext_find_extent() is conditional]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: init pagevec in ext4_da_block_invalidatepages</title>
<updated>2013-01-03T03:33:09Z</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2012-11-15T03:22:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e14b82308631247bad28ba23db69468265d4e583'/>
<id>urn:sha1:e14b82308631247bad28ba23db69468265d4e583</id>
<content type='text'>
commit 66bea92c69477a75a5d37b9bfed5773c92a3c4b4 upstream.

ext4_da_block_invalidatepages is missing a pagevec_init(),
which means that pvec-&gt;cold contains random garbage.

This affects whether the page goes to the front or
back of the LRU when -&gt;cold makes it to
free_hot_cold_page()

Reviewed-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Reviewed-by: Carlos Maiolino &lt;cmaiolino@redhat.com&gt;
Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: fix fdatasync() for files with only i_size changes</title>
<updated>2012-10-17T02:49:04Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2012-09-27T01:52:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d7644110a92a74c71dfa65834eb9484a5ff7bf1f'/>
<id>urn:sha1:d7644110a92a74c71dfa65834eb9484a5ff7bf1f</id>
<content type='text'>
commit b71fc079b5d8f42b2a52743c8d2f1d35d655b1c5 upstream.

Code tracking when transaction needs to be committed on fdatasync(2) forgets
to handle a situation when only inode's i_size is changed. Thus in such
situations fdatasync(2) doesn't force transaction with new i_size to disk
and that can result in wrong i_size after a crash.

Fix the issue by updating inode's i_datasync_tid whenever its size is
updated.

Reported-by: Kristian Nielsen &lt;knielsen@knielsen-hq.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: fix potential deadlock in ext4_nonda_switch()</title>
<updated>2012-10-17T02:48:38Z</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2012-09-20T02:42:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fda55976471f47647592f59b14a907fe2c0d38a0'/>
<id>urn:sha1:fda55976471f47647592f59b14a907fe2c0d38a0</id>
<content type='text'>
commit 00d4e7362ed01987183e9528295de3213031309c upstream.

In ext4_nonda_switch(), if the file system is getting full we used to
call writeback_inodes_sb_if_idle().  The problem is that we can be
holding i_mutex already, and this causes a potential deadlock when
writeback_inodes_sb_if_idle() when it tries to take s_umount.  (See
lockdep output below).

As it turns out we don't need need to hold s_umount; the fact that we
are in the middle of the write(2) system call will keep the superblock
pinned.  Unfortunately writeback_inodes_sb() checks to make sure
s_umount is taken, and the VFS uses a different mechanism for making
sure the file system doesn't get unmounted out from under us.  The
simplest way of dealing with this is to just simply grab s_umount
using a trylock, and skip kicking the writeback flusher thread in the
very unlikely case that we can't take a read lock on s_umount without
blocking.

Also, we now check the cirteria for kicking the writeback thread
before we decide to whether to fall back to non-delayed writeback, so
if there are any outstanding delayed allocation writes, we try to get
them resolved as soon as possible.

   [ INFO: possible circular locking dependency detected ]
   3.6.0-rc1-00042-gce894ca #367 Not tainted
   -------------------------------------------------------
   dd/8298 is trying to acquire lock:
    (&amp;type-&gt;s_umount_key#18){++++..}, at: [&lt;c02277d4&gt;] writeback_inodes_sb_if_idle+0x28/0x46

   but task is already holding lock:
    (&amp;sb-&gt;s_type-&gt;i_mutex_key#8){+.+...}, at: [&lt;c01ddcce&gt;] generic_file_aio_write+0x5f/0xd3

   which lock already depends on the new lock.

   2 locks held by dd/8298:
    #0:  (sb_writers#2){.+.+.+}, at: [&lt;c01ddcc5&gt;] generic_file_aio_write+0x56/0xd3
    #1:  (&amp;sb-&gt;s_type-&gt;i_mutex_key#8){+.+...}, at: [&lt;c01ddcce&gt;] generic_file_aio_write+0x5f/0xd3

   stack backtrace:
   Pid: 8298, comm: dd Not tainted 3.6.0-rc1-00042-gce894ca #367
   Call Trace:
    [&lt;c015b79c&gt;] ? console_unlock+0x345/0x372
    [&lt;c06d62a1&gt;] print_circular_bug+0x190/0x19d
    [&lt;c019906c&gt;] __lock_acquire+0x86d/0xb6c
    [&lt;c01999db&gt;] ? mark_held_locks+0x5c/0x7b
    [&lt;c0199724&gt;] lock_acquire+0x66/0xb9
    [&lt;c02277d4&gt;] ? writeback_inodes_sb_if_idle+0x28/0x46
    [&lt;c06db935&gt;] down_read+0x28/0x58
    [&lt;c02277d4&gt;] ? writeback_inodes_sb_if_idle+0x28/0x46
    [&lt;c02277d4&gt;] writeback_inodes_sb_if_idle+0x28/0x46
    [&lt;c026f3b2&gt;] ext4_nonda_switch+0xe1/0xf4
    [&lt;c0271ece&gt;] ext4_da_write_begin+0x27/0x193
    [&lt;c01dcdb0&gt;] generic_file_buffered_write+0xc8/0x1bb
    [&lt;c01ddc47&gt;] __generic_file_aio_write+0x1dd/0x205
    [&lt;c01ddce7&gt;] generic_file_aio_write+0x78/0xd3
    [&lt;c026d336&gt;] ext4_file_write+0x480/0x4a6
    [&lt;c0198c1d&gt;] ? __lock_acquire+0x41e/0xb6c
    [&lt;c0180944&gt;] ? sched_clock_cpu+0x11a/0x13e
    [&lt;c01967e9&gt;] ? trace_hardirqs_off+0xb/0xd
    [&lt;c018099f&gt;] ? local_clock+0x37/0x4e
    [&lt;c0209f2c&gt;] do_sync_write+0x67/0x9d
    [&lt;c0209ec5&gt;] ? wait_on_retry_sync_kiocb+0x44/0x44
    [&lt;c020a7b9&gt;] vfs_write+0x7b/0xe6
    [&lt;c020a9a6&gt;] sys_write+0x3b/0x64
    [&lt;c06dd4bd&gt;] syscall_call+0x7/0xb

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: undo ext4_calc_metadata_amount if we fail to claim space</title>
<updated>2012-08-02T13:37:58Z</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2012-07-23T04:00:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d9af29329519c623a5a9c8b39439ed2886c9f30b'/>
<id>urn:sha1:d9af29329519c623a5a9c8b39439ed2886c9f30b</id>
<content type='text'>
commit 03179fe92318e7934c180d96f12eff2cb36ef7b6 upstream.

The function ext4_calc_metadata_amount() has side effects, although
it's not obvious from its function name.  So if we fail to claim
space, regardless of whether we retry to claim the space again, or
return an error, we need to undo these side effects.

Otherwise we can end up incorrectly calculating the number of metadata
blocks needed for the operation, which was responsible for an xfstests
failure for test #271 when using an ext2 file system with delalloc
enabled.

Reported-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>ext4: don't let i_reserved_meta_blocks go negative</title>
<updated>2012-08-02T13:37:58Z</updated>
<author>
<name>Brian Foster</name>
<email>bfoster@redhat.com</email>
</author>
<published>2012-07-23T03:59:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=50e7ae34b18bb8a4911349184685fc1c4d9c0f50'/>
<id>urn:sha1:50e7ae34b18bb8a4911349184685fc1c4d9c0f50</id>
<content type='text'>
commit 97795d2a5b8d3c8dc4365d4bd3404191840453ba upstream.

If we hit a condition where we have allocated metadata blocks that
were not appropriately reserved, we risk underflow of
ei-&gt;i_reserved_meta_blocks.  In turn, this can throw
sbi-&gt;s_dirtyclusters_counter significantly out of whack and undermine
the nondelalloc fallback logic in ext4_nonda_switch().  Warn if this
occurs and set i_allocated_meta_blocks to avoid this problem.

This condition is reproduced by xfstests 270 against ext2 with
delalloc enabled:

Mar 28 08:58:02 localhost kernel: [  171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
Mar 28 08:58:02 localhost kernel: [  171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost

270 ultimately fails with an inconsistent filesystem and requires an
fsck to repair.  The cause of the error is an underflow in
ext4_da_update_reserve_space() due to an unreserved meta block
allocation.

Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
</feed>
