user/sven/linux.git/fs, branch v4.9.51

xfs: fix compiler warnings

2017-09-20T06:20:02Z

commit 7bf7a193a90cadccaad21c5970435c665c40fe27 upstream. Fix up all the compiler warnings that have crept in. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Cc: Arnd Bergmann Signed-off-by: Greg Kroah-Hartman

xfs: use kmem_free to free return value of kmem_zalloc

2017-09-20T06:20:02Z

commit 6c370590cfe0c36bcd62d548148aa65c984540b7 upstream. In function xfs_test_remount_options(), kfree() is used to free memory allocated by kmem_zalloc(). But it is better to use kmem_free(). Signed-off-by: Pan Bian Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: open code end_buffer_async_write in xfs_finish_page_writeback

2017-09-20T06:20:02Z

commit 8353a814f2518dcfa79a5bb77afd0e7dfa391bb1 upstream. Our loop in xfs_finish_page_writeback, which iterates over all buffer heads in a page and then calls end_buffer_async_write, which also iterates over all buffers in the page to check if any I/O is in flight is not only inefficient, but also potentially dangerous as end_buffer_async_write can cause the page and all buffers to be freed. Replace it with a single loop that does the work of end_buffer_async_write on a per-page basis. Signed-off-by: Christoph Hellwig Reviewed-by: Brian Foster Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: don't set v3 xflags for v2 inodes

2017-09-20T06:20:02Z

commit dd60687ee541ca3f6df8758f38e6f22f57c42a37 upstream. Reject attempts to set XFLAGS that correspond to di_flags2 inode flags if the inode isn't a v3 inode, because di_flags2 only exists on v3. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: fix incorrect log_flushed on fsync

2017-09-20T06:20:02Z

commit 47c7d0b19502583120c3f396c7559e7a77288a68 upstream. When calling into _xfs_log_force{,_lsn}() with a pointer to log_flushed variable, log_flushed will be set to 1 if: 1. xlog_sync() is called to flush the active log buffer AND/OR 2. xlog_wait() is called to wait on a syncing log buffers xfs_file_fsync() checks the value of log_flushed after _xfs_log_force_lsn() call to optimize away an explicit PREFLUSH request to the data block device after writing out all the file's pages to disk. This optimization is incorrect in the following sequence of events: Task A Task B ------------------------------------------------------- xfs_file_fsync() _xfs_log_force_lsn() xlog_sync() [submit PREFLUSH] xfs_file_fsync() file_write_and_wait_range() [submit WRITE X] [endio WRITE X] _xfs_log_force_lsn() xlog_wait() [endio PREFLUSH] The write X is not guarantied to be on persistent storage when PREFLUSH request in completed, because write A was submitted after the PREFLUSH request, but xfs_file_fsync() of task A will be notified of log_flushed=1 and will skip explicit flush. If the system crashes after fsync of task A, write X may not be present on disk after reboot. This bug was discovered and demonstrated using Josef Bacik's dm-log-writes target, which can be used to record block io operations and then replay a subset of these operations onto the target device. The test goes something like this: - Use fsx to execute ops of a file and record ops on log device - Every now and then fsync the file, store md5 of file and mark the location in the log - Then replay log onto device for each mark, mount fs and compare md5 of file to stored value Cc: Christoph Hellwig Cc: Josef Bacik Cc: Signed-off-by: Amir Goldstein Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: disable per-inode DAX flag

2017-09-20T06:20:02Z

commit 742d84290739ae908f1b61b7d17ea382c8c0073a upstream. Currently flag switching can be used to easily crash the kernel. Disable the per-inode DAX flag until that is sorted out. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: relog dirty buffers during swapext bmbt owner change

2017-09-20T06:20:02Z

commit 2dd3d709fc4338681a3aa61658122fa8faa5a437 upstream. The owner change bmbt scan that occurs during extent swap operations does not handle ordered buffer failures. Buffers that cannot be marked ordered must be physically logged so previously dirty ranges of the buffer can be relogged in the transaction. Since the bmbt scan may need to process and potentially log a large number of blocks, we can't expect to complete this operation in a single transaction. Update extent swap to use a permanent transaction with enough log reservation to physically log a buffer. Update the bmbt scan to physically log any buffers that cannot be ordered and to terminate the scan with -EAGAIN. On -EAGAIN, the caller rolls the transaction and restarts the scan. Finally, update the bmbt scan helper function to skip bmbt blocks that already match the expected owner so they are not reprocessed after scan restarts. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig [darrick: fix the xfs_trans_roll call] Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: disallow marking previously dirty buffers as ordered

2017-09-20T06:20:02Z

commit a5814bceea48ee1c57c4db2bd54b0c0246daf54a upstream. Ordered buffers are used in situations where the buffer is not physically logged but must pass through the transaction/logging pipeline for a particular transaction. As a result, ordered buffers are not unpinned and written back until the transaction commits to the log. Ordered buffers have a strict requirement that the target buffer must not be currently dirty and resident in the log pipeline at the time it is marked ordered. If a dirty+ordered buffer is committed, the buffer is reinserted to the AIL but not physically relogged at the LSN of the associated checkpoint. The buffer log item is assigned the LSN of the latest checkpoint and the AIL effectively releases the previously logged buffer content from the active log before the buffer has been written back. If the tail pushes forward and a filesystem crash occurs while in this state, an inconsistent filesystem could result. It is currently the caller responsibility to ensure an ordered buffer is not already dirty from a previous modification. This is unclear and error prone when not used in situations where it is guaranteed a buffer has not been previously modified (such as new metadata allocations). To facilitate general purpose use of ordered buffers, update xfs_trans_ordered_buf() to conditionally order the buffer based on state of the log item and return the status of the result. If the bli is dirty, do not order the buffer and return false. The caller must either physically log the buffer (having acquired the appropriate log reservation) or push it from the AIL to clean it before it can be marked ordered in the current transaction. Note that ordered buffers are currently only used in two situations: 1.) inode chunk allocation where previously logged buffers are not possible and 2.) extent swap which will be updated to handle ordered buffer failures in a separate patch. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: move bmbt owner change to last step of extent swap

2017-09-20T06:20:01Z

commit 6fb10d6d22094bc4062f92b9ccbcee2f54033d04 upstream. The extent swap operation currently resets bmbt block owners before the inode forks are swapped. The bmbt buffers are marked as ordered so they do not have to be physically logged in the transaction. This use of ordered buffers is not safe as bmbt buffers may have been previously physically logged. The bmbt owner change algorithm needs to be updated to physically log buffers that are already dirty when/if they are encountered. This means that an extent swap will eventually require multiple rolling transactions to handle large btrees. In addition, all inode related changes must be logged before the bmbt owner change scan begins and can roll the transaction for the first time to preserve fs consistency via log recovery. In preparation for such fixes to the bmbt owner change algorithm, refactor the bmbt scan out of the extent fork swap code to the last operation before the transaction is committed. Update xfs_swap_extent_forks() to only set the inode log flags when an owner change scan is necessary. Update xfs_swap_extents() to trigger the owner change based on the inode log flags. Note that since the owner change now occurs after the extent fork swap, the inode btrees must be fixed up with the inode number of the current inode (similar to log recovery). Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman

xfs: skip bmbt block ino validation during owner change

2017-09-20T06:20:01Z

commit 99c794c639a65cc7b74f30a674048fd100fe9ac8 upstream. Extent swap uses xfs_btree_visit_blocks() to fix up bmbt block owners on v5 (!rmapbt) filesystems. The bmbt scan uses xfs_btree_lookup_get_block() to read bmbt blocks which verifies the current owner of the block against the parent inode of the bmbt. This works during extent swap because the bmbt owners are updated to the opposite inode number before the inode extent forks are swapped. The modified bmbt blocks are marked as ordered buffers which allows everything to commit in a single transaction. If the transaction commits to the log and the system crashes such that recovery of the extent swap is required, log recovery restarts the bmbt scan to fix up any bmbt blocks that may have not been written back before the crash. The log recovery bmbt scan occurs after the inode forks have been swapped, however. This causes the bmbt block owner verification to fail, leads to log recovery failure and requires xfs_repair to zap the log to recover. Define a new invalid inode owner flag to inform the btree block lookup mechanism that the current inode may be invalid with respect to the current owner of the bmbt block. Set this flag on the cursor used for change owner scans to allow this operation to work at runtime and during log recovery. Signed-off-by: Brian Foster Fixes: bb3be7e7c ("xfs: check for bogus values in btree block headers") Cc: stable@vger.kernel.org Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Signed-off-by: Darrick J. Wong Signed-off-by: Greg Kroah-Hartman