| Age | Commit message (Collapse) | Author |
|
commit a70b52ec1aaeaf60f4739edb1b422827cb6f3893 upstream.
We had for some reason overlooked the AIO interface, and it didn't use
the proper rw_verify_area() helper function that checks (for example)
mandatory locking on the file, and that the size of the access doesn't
cause us to overflow the provided offset limits etc.
Instead, AIO did just the security_file_permission() thing (that
rw_verify_area() also does) directly.
This fixes it to do all the proper helper functions, which not only
means that now mandatory file locking works with AIO too, we can
actually remove lines of code.
Reported-by: Manish Honap <manish_honap_vit@yahoo.co.in>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 080399aaaf3531f5b8761ec0ac30ff98891e8686 upstream.
Hi,
We have a bug report open where a squashfs image mounted on ppc64 would
exhibit errors due to trying to read beyond the end of the disk. It can
easily be reproduced by doing the following:
[root@ibm-p750e-02-lp3 ~]# ls -l install.img
-rw-r--r-- 1 root root 142032896 Apr 30 16:46 install.img
[root@ibm-p750e-02-lp3 ~]# mount -o loop ./install.img /mnt/test
[root@ibm-p750e-02-lp3 ~]# dd if=/dev/loop0 of=/dev/null
dd: reading `/dev/loop0': Input/output error
277376+0 records in
277376+0 records out
142016512 bytes (142 MB) copied, 0.9465 s, 150 MB/s
In dmesg, you'll find the following:
squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 43.106012] attempt to access beyond end of device
[ 43.106029] loop0: rw=0, want=277410, limit=277408
[ 43.106039] Buffer I/O error on device loop0, logical block 138704
[ 43.106053] attempt to access beyond end of device
[ 43.106057] loop0: rw=0, want=277412, limit=277408
[ 43.106061] Buffer I/O error on device loop0, logical block 138705
[ 43.106066] attempt to access beyond end of device
[ 43.106070] loop0: rw=0, want=277414, limit=277408
[ 43.106073] Buffer I/O error on device loop0, logical block 138706
[ 43.106078] attempt to access beyond end of device
[ 43.106081] loop0: rw=0, want=277416, limit=277408
[ 43.106085] Buffer I/O error on device loop0, logical block 138707
[ 43.106089] attempt to access beyond end of device
[ 43.106093] loop0: rw=0, want=277418, limit=277408
[ 43.106096] Buffer I/O error on device loop0, logical block 138708
[ 43.106101] attempt to access beyond end of device
[ 43.106104] loop0: rw=0, want=277420, limit=277408
[ 43.106108] Buffer I/O error on device loop0, logical block 138709
[ 43.106112] attempt to access beyond end of device
[ 43.106116] loop0: rw=0, want=277422, limit=277408
[ 43.106120] Buffer I/O error on device loop0, logical block 138710
[ 43.106124] attempt to access beyond end of device
[ 43.106128] loop0: rw=0, want=277424, limit=277408
[ 43.106131] Buffer I/O error on device loop0, logical block 138711
[ 43.106135] attempt to access beyond end of device
[ 43.106139] loop0: rw=0, want=277426, limit=277408
[ 43.106143] Buffer I/O error on device loop0, logical block 138712
[ 43.106147] attempt to access beyond end of device
[ 43.106151] loop0: rw=0, want=277428, limit=277408
[ 43.106154] Buffer I/O error on device loop0, logical block 138713
[ 43.106158] attempt to access beyond end of device
[ 43.106162] loop0: rw=0, want=277430, limit=277408
[ 43.106166] attempt to access beyond end of device
[ 43.106169] loop0: rw=0, want=277432, limit=277408
...
[ 43.106307] attempt to access beyond end of device
[ 43.106311] loop0: rw=0, want=277470, limit=2774
Squashfs manages to read in the end block(s) of the disk during the
mount operation. Then, when dd reads the block device, it leads to
block_read_full_page being called with buffers that are beyond end of
disk, but are marked as mapped. Thus, it would end up submitting read
I/O against them, resulting in the errors mentioned above. I fixed the
problem by modifying init_page_buffers to only set the buffer mapped if
it fell inside of i_size.
Cheers,
Jeff
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Nick Piggin <npiggin@kernel.dk>
--
Changes from v1->v2: re-used max_block, as suggested by Nick Piggin.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f908ee9463b09ddd05e1c1a0111132212dc05fac upstream.
The number of bio_get_nr_vecs() is passed down via bio_alloc() to
bvec_alloc_bs(), which fails the bio allocation if
nr_iovecs > BIO_MAX_PAGES. For the underlying caller this causes an
unexpected bio allocation failure.
Limiting to queue_max_segments() is not sufficient, as max_segments
also might be very large.
bvec_alloc_bs(gfp_mask, nr_iovecs, ) => NULL when nr_iovecs > BIO_MAX_PAGES
bio_alloc_bioset(gfp_mask, nr_iovecs, ...)
bio_alloc(GFP_NOIO, nvecs)
xfs_alloc_ioend_bio()
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5a00689930ab975fdd1b37b034475017e460cf2a upstream.
Bug noticed in commit
bf118a342f10dafe44b14451a1392c3254629a1f
When calling GETACL, if the size of the bitmap array, the length
attribute and the acl returned by the server is greater than the
allocated buffer(args.acl_len), we can Oops with a General Protection
fault at _copy_from_pages() when we attempt to read past the pages
allocated.
This patch allocates an extra PAGE for the bitmap and checks to see that
the bitmap + attribute_length + ACLs don't exceed the buffer space
allocated to it.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
[Trond: Fixed a size_t vs unsigned int printk() warning]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5794d21ef4639f0e33440927bb903f9598c21e92 upstream.
When attempting to cache ACLs returned from the server, if the bitmap
size + the ACL size is greater than a PAGE_SIZE but the ACL size itself
is smaller than a PAGE_SIZE, we can read past the buffer page boundary.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 48a5730e5b71201e226ff06e245bf308feba5f10 upstream.
This test is always true so it means we revalidate the length every
time, which generates more network traffic. When it is SEEK_SET or
SEEK_CUR, then we don't need to revalidate.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c1bb05a657fb3d8c6179a4ef7980261fae4521d7 upstream.
Processes hang forever on a sync-mounted ext2 file system that
is mounted with the ext4 module (default in Fedora 16).
I can reproduce this reliably by mounting an ext2 partition with
"-o sync" and opening a new file an that partition with vim. vim
will hang in "D" state forever. The same happens on ext4 without
a journal.
I am attaching a small patch here that solves this issue for me.
In the sync mounted case without a journal,
ext4_handle_dirty_metadata() may call sync_dirty_buffer(), which
can't be called with buffer lock held.
Also move mb_cache_entry_release inside lock to avoid race
fixed previously by 8a2bfdcb ext[34]: EA block reference count racing fix
Note too that ext2 fixed this same problem in 2006 with
b2f49033 [PATCH] fix deadlock in ext2
Signed-off-by: Martin.Wilck@ts.fujitsu.com
[sandeen@redhat.com: move mb_cache_entry_release before unlock, edit commit msg]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 226bb7df3d22bcf4a1c0fe8206c80cc427498eae upstream.
The locking policy is such that the erase_complete_block spinlock is
nested within the alloc_sem mutex. This fixes a case in which the
acquisition order was erroneously reversed. This issue was caught by
the following lockdep splat:
=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.5 #1
-------------------------------------------------------
jffs2_gcd_mtd6/299 is trying to acquire lock:
(&c->alloc_sem){+.+.+.}, at: [<c01f7714>] jffs2_garbage_collect_pass+0x314/0x890
but task is already holding lock:
(&(&c->erase_completion_lock)->rlock){+.+...}, at: [<c01f7708>] jffs2_garbage_collect_pass+0x308/0x890
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&(&c->erase_completion_lock)->rlock){+.+...}:
[<c008bec4>] validate_chain+0xe6c/0x10bc
[<c008c660>] __lock_acquire+0x54c/0xba4
[<c008d240>] lock_acquire+0xa4/0x114
[<c046780c>] _raw_spin_lock+0x3c/0x4c
[<c01f744c>] jffs2_garbage_collect_pass+0x4c/0x890
[<c01f937c>] jffs2_garbage_collect_thread+0x1b4/0x1cc
[<c0071a68>] kthread+0x98/0xa0
[<c000f264>] kernel_thread_exit+0x0/0x8
-> #0 (&c->alloc_sem){+.+.+.}:
[<c008ad2c>] print_circular_bug+0x70/0x2c4
[<c008c08c>] validate_chain+0x1034/0x10bc
[<c008c660>] __lock_acquire+0x54c/0xba4
[<c008d240>] lock_acquire+0xa4/0x114
[<c0466628>] mutex_lock_nested+0x74/0x33c
[<c01f7714>] jffs2_garbage_collect_pass+0x314/0x890
[<c01f937c>] jffs2_garbage_collect_thread+0x1b4/0x1cc
[<c0071a68>] kthread+0x98/0xa0
[<c000f264>] kernel_thread_exit+0x0/0x8
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&(&c->erase_completion_lock)->rlock);
lock(&c->alloc_sem);
lock(&(&c->erase_completion_lock)->rlock);
lock(&c->alloc_sem);
*** DEADLOCK ***
1 lock held by jffs2_gcd_mtd6/299:
#0: (&(&c->erase_completion_lock)->rlock){+.+...}, at: [<c01f7708>] jffs2_garbage_collect_pass+0x308/0x890
stack backtrace:
[<c00155dc>] (unwind_backtrace+0x0/0x100) from [<c0463dc0>] (dump_stack+0x20/0x24)
[<c0463dc0>] (dump_stack+0x20/0x24) from [<c008ae84>] (print_circular_bug+0x1c8/0x2c4)
[<c008ae84>] (print_circular_bug+0x1c8/0x2c4) from [<c008c08c>] (validate_chain+0x1034/0x10bc)
[<c008c08c>] (validate_chain+0x1034/0x10bc) from [<c008c660>] (__lock_acquire+0x54c/0xba4)
[<c008c660>] (__lock_acquire+0x54c/0xba4) from [<c008d240>] (lock_acquire+0xa4/0x114)
[<c008d240>] (lock_acquire+0xa4/0x114) from [<c0466628>] (mutex_lock_nested+0x74/0x33c)
[<c0466628>] (mutex_lock_nested+0x74/0x33c) from [<c01f7714>] (jffs2_garbage_collect_pass+0x314/0x890)
[<c01f7714>] (jffs2_garbage_collect_pass+0x314/0x890) from [<c01f937c>] (jffs2_garbage_collect_thread+0x1b4/0x1cc)
[<c01f937c>] (jffs2_garbage_collect_thread+0x1b4/0x1cc) from [<c0071a68>] (kthread+0x98/0xa0)
[<c0071a68>] (kthread+0x98/0xa0) from [<c000f264>] (kernel_thread_exit+0x0/0x8)
This was introduce in '81cfc9f jffs2: Fix serious write stall due to erase'.
Signed-off-by: Josh Cartwright <joshc@linux.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9dc4e6c4d1182d34604ea40fef641775f5b15456 upstream.
Allow a v3 unchecked open of a non-regular file succeed as if it were a
lookup; typically a client in such a case will want to fall back on a
local open, so succeeding and giving it the filehandle is more useful
than failing with nfserr_exist, which makes it appear that nothing at
all exists by that name.
Similarly for v4, on an open-create, return the same errors we would on
an attempt to open a non-regular file, instead of returning
nfserr_exist.
This fixes a problem found doing a v4 open of a symlink with
O_RDONLY|O_CREAT, which resulted in the current client returning EEXIST.
Thanks also to Trond for analysis.
Reported-by: Orion Poplawski <orion@cora.nwra.com>
Tested-by: Orion Poplawski <orion@cora.nwra.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
[bwh: Backported to 3.2 and 3.3: use &resfh, not resfh]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 90481622d75715bfcb68501280a917dbfe516029 upstream.
hugetlbfs_{get,put}_quota() are badly named. They don't interact with the
general quota handling code, and they don't much resemble its behaviour.
Rather than being about maintaining limits on on-disk block usage by
particular users, they are instead about maintaining limits on in-memory
page usage (including anonymous MAP_PRIVATE copied-on-write pages)
associated with a particular hugetlbfs filesystem instance.
Worse, they work by having callbacks to the hugetlbfs filesystem code from
the low-level page handling code, in particular from free_huge_page().
This is a layering violation of itself, but more importantly, if the
kernel does a get_user_pages() on hugepages (which can happen from KVM
amongst others), then the free_huge_page() can be delayed until after the
associated inode has already been freed. If an unmount occurs at the
wrong time, even the hugetlbfs superblock where the "quota" limits are
stored may have been freed.
Andrew Barry proposed a patch to fix this by having hugepages, instead of
storing a pointer to their address_space and reaching the superblock from
there, had the hugepages store pointers directly to the superblock,
bumping the reference count as appropriate to avoid it being freed.
Andrew Morton rejected that version, however, on the grounds that it made
the existing layering violation worse.
This is a reworked version of Andrew's patch, which removes the extra, and
some of the existing, layering violation. It works by introducing the
concept of a hugepage "subpool" at the lower hugepage mm layer - that is a
finite logical pool of hugepages to allocate from. hugetlbfs now creates
a subpool for each filesystem instance with a page limit set, and a
pointer to the subpool gets added to each allocated hugepage, instead of
the address_space pointer used now. The subpool has its own lifetime and
is only freed once all pages in it _and_ all other references to it (i.e.
superblocks) are gone.
subpools are optional - a NULL subpool pointer is taken by the code to
mean that no subpool limits are in effect.
Previous discussion of this bug found in: "Fix refcounting in hugetlbfs
quota handling.". See: https://lkml.org/lkml/2011/8/11/28 or
http://marc.info/?l=linux-mm&m=126928970510627&w=1
v2: Fixed a bug spotted by Hillf Danton, and removed the extra parameter to
alloc_huge_page() - since it already takes the vma, it is not necessary.
Signed-off-by: Andrew Barry <abarry@cray.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d8f2799b105a24bb0bbd3380a0d56e6348484058 upstream.
The problem was that the first referral was parsed more than once
and so the caller tried the same referrals multiple times.
The problem was introduced partly by commit
066ce6899484d9026acd6ba3a8dbbedb33d7ae1b,
where 'ref += le16_to_cpu(ref->Size);' got lost,
but that was also wrong...
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Tested-by: Björn Jacke <bj@sernet.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6f24f892871acc47b40dd594c63606a17c714f77 upstream.
Commit ec81aecb2966 ("hfs: fix a potential buffer overflow") fixed a few
potential buffer overflows in the hfs filesystem. But as Timo Warns
pointed out, these changes also need to be made on the hfsplus
filesystem as well.
Reported-by: Timo Warns <warns@pre-sense.de>
Acked-by: WANG Cong <amwang@redhat.com>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Sage Weil <sage@newdream.net>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
commit e636825346b36a07ccfc8e30946d52855e21f681 upstream.
exit_notify() checks "tsk->self_exec_id != tsk->parent_exec_id"
to handle the "we have changed execution domain" case.
We can change do_thread() to always set ->exit_signal = SIGCHLD
and remove this check to simplify the code.
We could change setup_new_exec() instead, this looks more logical
because it increments ->self_exec_id. But note that de_thread()
already resets ->exit_signal if it changes the leader, let's keep
both changes close to each other.
Note that we change ->exit_signal lockless, this changes the rules.
Thereafter ->exit_signal is not stable under tasklist but this is
fine, the only possible change is OLDSIG -> SIGCHLD. This can race
with eligible_child() but the race is harmless. We can race with
reparent_leader() which changes our ->exit_signal in parallel, but
it does the same change to SIGCHLD.
The noticeable user-visible change is that the execing task is not
"visible" to do_wait()->eligible_child(__WCLONE) right after exec.
To me this looks more logical, and this is consistent with mt case.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 64f371bc3107e69efce563a3d0f0e6880de0d537 upstream.
The autofs packet size has had a very unfortunate size problem on x86:
because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
because the packet data was not 8-byte aligned, the size of the autofsv5
packet structure differed between 32-bit and 64-bit modes despite
looking otherwise identical (300 vs 304 bytes respectively).
We first fixed that up by making the 64-bit compat mode know about this
problem in commit a32744d4abae ("autofs: work around unhappy compat
problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
64-bit kernel because everything then worked the same way as on a 32-bit
kernel.
But it turned out that 'automount' had actually known and worked around
this problem in user space, so fixing the kernel to do the proper 32-bit
compatibility handling actually *broke* 32-bit automount on a 64-bit
kernel, because it knew that the packet sizes were wrong and expected
those incorrect sizes.
As a result, we ended up reverting that compatibility mode fix, and
thus breaking systemd again, in commit fcbf94b9dedd.
With both automount and systemd doing a single read() system call, and
verifying that they get *exactly* the size they expect but using
different sizes, it seemed that fixing one of them inevitably seemed to
break the other. At one point, a patch I seriously considered applying
from Michael Tokarev did a "strcmp()" to see if it was automount that
was doing the operation. Ugly, ugly.
However, a prettier solution exists now thanks to the packetized pipe
mode. By marking the communication pipe as being packetized (by simply
setting the O_DIRECT flag), we can always just write the bigger packet
size, and if user-space does a smaller read, it will just get that
partial end result and the extra alignment padding will simply be thrown
away.
This makes both automount and systemd happy, since they now get the size
they asked for, and the kernel side of autofs simply no longer needs to
care - it could pad out the packet arbitrarily.
Of course, if there is some *other* user of autofs (please, please,
please tell me it ain't so - and we haven't heard of any) that tries to
read the packets with multiple writes, that other user will now be
broken - the whole point of the packetized mode is that one system call
gets exactly one packet, and you cannot read a packet in pieces.
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Ian Kent <raven@themaw.net>
Cc: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9883035ae7edef3ec62ad215611cb8e17d6a1a5d upstream.
The actual internal pipe implementation is already really about
individual packets (called "pipe buffers"), and this simply exposes that
as a special packetized mode.
When we are in the packetized mode (marked by O_DIRECT as suggested by
Alan Cox), a write() on a pipe will not merge the new data with previous
writes, so each write will get a pipe buffer of its own. The pipe
buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
will tell the reader side to break the read at that boundary (and throw
away any partial packet contents that do not fit in the read buffer).
End result: as long as you do writes less than PIPE_BUF in size (so that
the pipe doesn't have to split them up), you can now treat the pipe as a
packet interface, where each read() system call will read one packet at
a time. You can just use a sufficiently big read buffer (PIPE_BUF is
sufficient, since bigger than that doesn't guarantee atomicity anyway),
and the return value of the read() will naturally give you the size of
the packet.
NOTE! We do not support zero-sized packets, and zero-sized reads and
writes to a pipe continue to be no-ops. Also note that big packets will
currently be split at write time, but that the size at which that
happens is not really specified (except that it's bigger than PIPE_BUF).
Currently that limit is the system page size, but we might want to
explicitly support bigger packets some day.
The main user for this is going to be the autofs packet interface,
allowing us to stop having to care so deeply about exact packet sizes
(which have had bugs with 32/64-bit compatibility modes). But user
space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
fail with an EINVAL on kernels that do not support this interface.
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Ian Kent <raven@themaw.net>
Cc: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fcbf94b9dedd2ce08e798a99aafc94fec8668161 upstream.
This reverts commit a32744d4abae24572eff7269bc17895c41bd0085.
While that commit was technically the right thing to do, and made the
x86-64 compat mode work identically to native 32-bit mode (and thus
fixing the problem with a 32-bit systemd install on a 64-bit kernel), it
turns out that the automount binaries had workarounds for this compat
problem.
Now, the workarounds are disgusting: doing an "uname()" to find out the
architecture of the kernel, and then comparing it for the 64-bit cases
and fixing up the size of the read() in automount for those. And they
were confused: it's not actually a generic 64-bit issue at all, it's
very much tied to just x86-64, which has different alignment for an
'u64' in 64-bit mode than in 32-bit mode.
But the end result is that fixing the compat layer actually breaks the
case of a 32-bit automount on a x86-64 kernel.
There are various approaches to fix this (including just doing a
"strcmp()" on current->comm and comparing it to "automount"), but I
think that I will do the one that teaches pipes about a special "packet
mode", which will allow user space to not have to care too deeply about
the padding at the end of the autofs packet.
That change will make the compat workaround unnecessary, so let's revert
it first, and get automount working again in compat mode. The
packetized pipes will then fix autofs for systemd.
Reported-and-requested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8ccd271f7a3a846ce6f85ead0760d9d12994a611 upstream.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 73fb7bc7c57d971b11f2e00536ac2d3e316e0609 upstream.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 55725513b5ef9d462aa3e18527658a0362aaae83 upstream.
Since we may be simulating flock() locks using NFS byte range locks,
we can't rely on the VFS having checked the file open mode for us.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 05ffe24f5290dc095f98fbaf84afe51ef404ccc5 upstream.
All callers of nfs4_handle_exception() that need to handle
NFS4ERR_OPENMODE correctly should set exception->inode
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 98a2139f4f4d7b5fcc3a54c7fddbe88612abed20 upstream.
When hostname contains colon (e.g. when it is an IPv6 address) it needs
to be enclosed in brackets to make parsing of NFS device string possible.
Fix nfs_do_root_mount() to enclose hostname properly when needed. NFS code
actually does not need this as it does not parse the string passed by
nfs_do_root_mount() but the device string is exposed to userspace in
/proc/mounts.
CC: Josh Boyer <jwboyer@redhat.com>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ This combines upstream commit
2f53384424251c06038ae612e56231b96ab610ee and the follow-on bug fix
commit 35f9c09fe9c72eb8ca2b8e89a593e1c151f28fc2 ]
vmsplice()/splice(pipe, socket) call do_tcp_sendpages() one page at a
time, adding at most 4096 bytes to an skb. (assuming PAGE_SIZE=4096)
The call to tcp_push() at the end of do_tcp_sendpages() forces an
immediate xmit when pipe is not already filled, and tso_fragment() try
to split these skb to MSS multiples.
4096 bytes are usually split in a skb with 2 MSS, and a remaining
sub-mss skb (assuming MTU=1500)
This makes slow start suboptimal because many small frames are sent to
qdisc/driver layers instead of big ones (constrained by cwnd and packets
in flight of course)
In fact, applications using sendmsg() (adding an additional memory copy)
instead of vmsplice()/splice()/sendfile() are a bit faster because of
this anomaly, especially if serving small files in environments with
large initial [c]wnd.
Call tcp_push() only if MSG_MORE is not set in the flags parameter.
This bit is automatically provided by splice() internals but for the
last page, or on all pages if user specified SPLICE_F_MORE splice()
flag.
In some workloads, this can reduce number of sent logical packets by an
order of magnitude, making zero-copy TCP actually faster than
one-copy :)
Reported-by: Tom Herbert <therbert@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e847469bf77a1d339274074ed068d461f0c872bc upstream.
comparing be32 values for < is not doing the right thing...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 72094e43e3af5020510f920321d71f1798fa896d upstream.
le16, not le32...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 28748b325dc2d730ccc312830a91c4ae0c0d9379 upstream.
le16, not le32...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e1bf4cc620fd143766ddfcee3b004a1d1bb34fd0 upstream.
it's le16, not le32 or le64...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3a251f04fe97c3d335b745c98e4b377e3c3899f2 upstream.
It's le16, not le32...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6ed3cf2cdfce4c9f1d73171bd3f27d9cb77b734e upstream.
->root_flags is __le64 and all accesses to it go through the helpers
that do proper conversions. Except for btrfs_root_readonly(), which
checks bit 0 as in host-endian...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit efe39651f08813180f37dc508d950fc7d92b29a8 upstream.
Restore the original logics ("fail on mountpoints, negatives and in
case of fh_compose() failures"). Since commit 8177e (nfsd: clean up
readdirplus encoding) that got broken -
rv = fh_compose(fhp, exp, dchild, &cd->fh);
if (rv)
goto out;
if (!dchild->d_inode)
goto out;
rv = 0;
out:
is equivalent to
rv = fh_compose(fhp, exp, dchild, &cd->fh);
out:
and the second check has no effect whatsoever...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 02f5fde5df0ea930e70f93763dd48beff182b208 upstream.
->ts_id_status gets nfs errno, i.e. it's already big-endian; no need
to apply htonl() to it. Broken by commit 174568 (NFSD: Added TEST_STATEID
operation) last year...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 04da6e9d63427b2d0fd04766712200c250b3278f upstream.
nfsd_open() already returns an NFS error value; only vfs_test_lock()
result needs to be fed through nfserrno(). Broken by commit 55ef12
(nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT)
three years ago...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 96f6f98501196d46ce52c2697dd758d9300c63f5 upstream.
..._want_write() returns -EROFS on failure, _not_ an NFS error value.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit af1584f570b19b0285e4402a0b54731495d31784 upstream.
->ee_len is __le16, so assigning cpu_to_le32() to it is going to do
Bad Things(tm) on big-endian hosts...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ted Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 99aa78466777083255b876293e9e83dec7cd809a upstream.
flush request is issued in transaction commit code path, so looks using
GFP_KERNEL to allocate memory for flush request bio falls into the classic
deadlock issue. I saw btrfs and dm get it right, but ext4, xfs and md are
using GFP.
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9cd70b347e9761ea2d2ac3d758c529a48a8193e6 upstream.
Andi Kleen and Tim Chen have reported that under certain circumstances
the extent cache statistics are causing scalability problems due to
cache line bounces.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8e62c2de6e23e5c1fee04f59de51b54cc2868ca5 upstream.
This reverts commit 5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf.
We've had a number of complaints of early enospc that bisect down
to this patch. We'll hae to fix the reservations differently.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7a3ae2f8c8c8432e65467b7fc84d5deab04061a0 upstream.
In commit 4692cf58 we introduced new backref walking code for btrfs. This
assumes we're searching live roots, which requires a transaction context.
While scrubbing, however, we must not join a transaction because this could
deadlock with the commit path. Additionally, what scrub really wants to do
is resolving a logical address in the commit root it's currently checking.
This patch adds support for logical to path resolving on commit roots and
makes scrub use that.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 20e0fa98b751facf9a1101edaefbc19c82616a68 upstream.
_copy_from_pages() used to copy data from the temporary buffer to the
user passed buffer is passed the wrong size parameter when copying
data. res.acl_len contains both the bitmap and acl lenghts while
acl_len contains the acl length after adjusting for the bitmap size.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 66189be74ff5f9f3fd6444315b85be210d07cef2 upstream.
We can deadlock if we have a write oplock and two processes
use the same file handle. In this case the first process can't
unlock its lock if the second process blocked on the lock in the
same time.
Fix it by using posix_lock_file rather than posix_lock_file_wait
under cinode->lock_mutex. If we request a blocking lock and
posix_lock_file indicates that there is another lock that prevents
us, wait untill that lock is released and restart our call.
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit de5b8e8e047534aac6bc9803f96e7257436aef9c upstream.
If you try to set grace_period or timeout via a module parameter
to lockd, and do this on a big-endian machine where
sizeof(int) != sizeof(unsigned long)
it won't work. This number given will be effectively shifted right
by the difference in those two sizes.
So cast kp->arg properly to get correct result.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e59d27e05a6435f8c04d5ad843f37fa795f2eaaa upstream.
Firstly, task->tk_status will always return negative error values,
so the current tests for 'NFS4ERR_DELEG_REVOKED' etc. are all being
ignored.
Secondly, clean up the code so that we only need to test
task->tk_status once!
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 05e9cfb408b24debb3a85fd98edbfd09dd148881 upstream.
We can currently loop forever in nfs4_lookup_root() and in
nfs41_proc_secinfo_no_name(), if the first iteration returns a
NFS4ERR_DELAY or something else that causes exception.retry to get
set.
Reported-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d97d32edcd732110758799ae60af725e5110b3dc upstream.
When an IO error happens during inode deletion run from
xlog_recover_process_iunlinks() filesystem gets shutdown. Thus any subsequent
attempt to read buffers fails. Code in xlog_recover_process_iunlinks() does not
count with the fact that read of a buffer which was read a while ago can
really fail which results in the oops on
agi = XFS_BUF_TO_AGI(agibp);
Fix the problem by cleaning up the buffer handling in
xlog_recover_process_iunlinks() as suggested by Dave Chinner. We release buffer
lock but keep buffer reference to AG buffer. That is enough for buffer to stay
pinned in memory and we don't have to call xfs_read_agi() all the time.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b18dafc86bb879d2f38a1743985d7ceb283c2f4d upstream.
In d_materialise_unique() there are 3 subcases to the 'aliased dentry'
case; in two subcases the inode i_lock is properly released but this
does not occur in the -ELOOP subcase.
This seems to have been introduced by commit 1836750115f2 ("fix loop
checks in d_materialise_unique()").
Signed-off-by: Michel Lespinasse <walken@google.com>
[ Added a comment, and moved the unlock to where we generate the -ELOOP,
which seems to be more natural.
You probably can't actually trigger this without a buggy network file
server - d_materialize_unique() is for finding aliases on non-local
filesystems, and the d_ancestor() case is for a hardlinked directory
loop.
But we should be robust in the case of such buggy servers anyway. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 31d4f3a2f3c73f279ff96a7135d7202ef6833f12 upstream.
Explicitly test for an extent whose length is zero, and flag that as a
corrupted extent.
This avoids a kernel BUG_ON assertion failure.
Tested: Without this patch, the file system image found in
tests/f_ext_zero_len/image.gz in the latest e2fsprogs sources causes a
kernel panic. With this patch, an ext4 file system error is noted
instead, and the file system is marked as being corrupted.
https://bugzilla.kernel.org/show_bug.cgi?id=42859
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 491caa43639abcffaa645fbab372a7ef4ce2975c upstream.
The following command line will leave the aio-stress process unkillable
on an ext4 file system (in my case, mounted on /mnt/test):
aio-stress -t 20 -s 10 -O -S -o 2 -I 1000 /mnt/test/aiostress.3561.4 /mnt/test/aiostress.3561.4.20 /mnt/test/aiostress.3561.4.19 /mnt/test/aiostress.3561.4.18 /mnt/test/aiostress.3561.4.17 /mnt/test/aiostress.3561.4.16 /mnt/test/aiostress.3561.4.15 /mnt/test/aiostress.3561.4.14 /mnt/test/aiostress.3561.4.13 /mnt/test/aiostress.3561.4.12 /mnt/test/aiostress.3561.4.11 /mnt/test/aiostress.3561.4.10 /mnt/test/aiostress.3561.4.9 /mnt/test/aiostress.3561.4.8 /mnt/test/aiostress.3561.4.7 /mnt/test/aiostress.3561.4.6 /mnt/test/aiostress.3561.4.5 /mnt/test/aiostress.3561.4.4 /mnt/test/aiostress.3561.4.3 /mnt/test/aiostress.3561.4.2
This is using the aio-stress program from the xfstests test suite.
That particular command line tells aio-stress to do random writes to
20 files from 20 threads (one thread per file). The files are NOT
preallocated, so you will get writes to random offsets within the
file, thus creating holes and extending i_size. It also opens the
file with O_DIRECT and O_SYNC.
On to the problem. When an I/O requires unwritten extent conversion,
it is queued onto the completed_io_list for the ext4 inode. Two code
paths will pull work items from this list. The first is the
ext4_end_io_work routine, and the second is ext4_flush_completed_IO,
which is called via the fsync path (and O_SYNC handling, as well).
There are two issues I've found in these code paths. First, if the
fsync path beats the work routine to a particular I/O, the work
routine will free the io_end structure! It does not take into account
the fact that the io_end may still be in use by the fsync path. I've
fixed this issue by adding yet another IO_END flag, indicating that
the io_end is being processed by the fsync path.
The second problem is that the work routine will make an assignment to
io->flag outside of the lock. I have witnessed this result in a hang
at umount. Moving the flag setting inside the lock resolved that
problem.
The problem was introduced by commit b82e384c7b ("ext4: optimize
locking for end_io extent conversion"), which first appeared in 3.2.
As such, the fix should be backported to that release (probably along
with the unwritten extent conversion race fix).
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 266991b13890049ee1a6bb95b9817f06339ee3d7 upstream.
The following comment in ext4_end_io_dio caught my attention:
/* XXX: probably should move into the real I/O completion handler */
inode_dio_done(inode);
The truncate code takes i_mutex, then calls inode_dio_wait. Because the
ext4 code path above will end up dropping the mutex before it is
reacquired by the worker thread that does the extent conversion, it
seems to me that the truncate can happen out of order. Jan Kara
mentioned that this might result in error messages in the system logs,
but that should be the extent of the "damage."
The fix is pretty straight-forward: don't call inode_dio_done until the
extent conversion is complete.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3d2b158262826e8b75bbbfb7b97010838dd92ac7 upstream.
Ext4 does not support data journalling with delayed allocation enabled.
We even do not allow to mount the file system with delayed allocation
and data journalling enabled, however it can be set via FS_IOC_SETFLAGS
so we can hit the inode with EXT4_INODE_JOURNAL_DATA set even on file
system mounted with delayed allocation (default) and that's where
problem arises. The easies way to reproduce this problem is with the
following set of commands:
mkfs.ext4 /dev/sdd
mount /dev/sdd /mnt/test1
dd if=/dev/zero of=/mnt/test1/file bs=1M count=4
chattr +j /mnt/test1/file
dd if=/dev/zero of=/mnt/test1/file bs=1M count=4 conv=notrunc
chattr -j /mnt/test1/file
Additionally it can be reproduced quite reliably with xfstests 272 and
269. In fact the above reproducer is a part of test 272.
To fix this we should ignore the EXT4_INODE_JOURNAL_DATA inode flag if
the file system is mounted with delayed allocation. This can be easily
done by fixing ext4_should_*_data() functions do ignore data journal
flag when delalloc is set (suggested by Ted). We also have to set the
appropriate address space operations for the inode (again, ignoring data
journal flag if delalloc enabled).
Additionally this commit introduces ext4_inode_journal_mode() function
because ext4_should_*_data() has already had a lot of common code and
this change is putting it all into one function so it is easier to
read.
Successfully tested with xfstests in following configurations:
delalloc + data=ordered
delalloc + data=writeback
data=journal
nodelalloc + data=ordered
nodelalloc + data=writeback
nodelalloc + data=journal
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 15291164b22a357cb211b618adfef4fa82fc0de3 upstream.
journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head
state ala discard_buffer(), but does not touch _Delay or _Unwritten as
discard_buffer() does.
This can be problematic in some areas of the ext4 code which assume
that if they have found a buffer marked unwritten or delay, then it's
a live one. Perhaps those spots should check whether it is mapped
as well, but if jbd2 is going to tear down a buffer, let's really
tear it down completely.
Without this I get some fsx failures on sub-page-block filesystems
up until v3.2, at which point 4e96b2dbbf1d7e81f22047a50f862555a6cb87cb
and 189e868fa8fdca702eb9db9d8afc46b5cb9144c9 make the failures go
away, because buried within that large change is some more flag
clearing. I still think it's worth doing in jbd2, since
->invalidatepage leads here directly, and it's the right place
to clear away these flags.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9a3ba432330e504ac61ff0043dbdaba7cea0e35a upstream.
Prevent the state manager from filling up system logs when recovery
fails on the server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|