summaryrefslogtreecommitdiff
path: root/fs/sysv
diff options
context:
space:
mode:
authorAndrew Morton <akpm@zip.com.au>2002-05-19 02:20:58 -0700
committerArnaldo Carvalho de Melo <acme@conectiva.com.br>2002-05-19 02:20:58 -0700
commit43152186ec28f3d4adf2a79ff8becacdfca9c82d (patch)
treea0e8366d2b4cc5ebf6c4535b5c92e425a2b839e9 /fs/sysv
parent6b9f3b418f690318f88fbfd2002bbd7ca344c04a (diff)
[PATCH] i_dirty_buffers locking fix
This fixes a race between try_to_free_buffers' call to __remove_inode_queue() and other users of b_inode_buffers (fsync_inode_buffers and mark_buffer_dirty_inode()). They are presently taking different locks. The patch relocates and redefines and clarifies(?) the role of inode.i_dirty_buffers. The 2.4 definition of i_dirty_buffers is "a list of random buffers which is protected by a kernel-wide lock". This definition needs to be narrowed in the 2.5 context. It is now "a list of buffers from a different mapping, protected by a lock within that mapping". This list of buffers is specifically for fsync(). As this is a "data plane" operation, all the structures have been moved out of the inode and into the address_space. So address_space now has: list_head private_list; A list, available to the address_space for any purpose. If that address_space chooses to use the helper functions mark_buffer_dirty_inode and sync_mapping_buffers() then this list will contain buffer_heads, attached via buffer_head.b_assoc_buffers. If the address_space does not call those helper functions then the list is free for other usage. The only requirement is that the list be list_empty() at destroy_inode() time. At least, this is the objective. At present, generic_file_write() will call generic_osync_inode(), which expects that list to contain buffer_heads. So private_list isn't useful for anything else yet. spinlock_t private_lock; A spinlock, available to the address_space. If the address_space is using try_to_free_buffers(), mark_inode_dirty_buffers() and fsync_inode_buffers() then this lock is used to protect the private_list of *other* mappings which have listed buffers from *this* mapping onto themselves. That is: for buffer_heads, mapping_A->private_lock does not protect mapping_A->private_list! It protects the b_assoc_buffers list from buffers which are backed by mapping_A and it protects mapping_B->private_list, mapping_C->private_list, ... So what we have here is a cross-mapping association. S_ISREG mappings maintain a list of buffers from the blockdev's address_space which they need to know about for a successful fsync(). The locking follows the buffers: the lock in in the blockdev's mapping, not in the S_ISREG file's mapping. For address_spaces which use try_to_free_buffers, private_lock is also (and quite unrelatedly) used for protection of the buffer ring at page->private. Exclusion between try_to_free_buffers(), __get_hash_table() and __set_page_dirty_buffers(). This is in fact its major use. address_space *assoc_mapping Sigh. This is the address of the mapping which backs the buffers which are attached to private_list. It's here so that generic_osync_inode() can locate the lock which protects this mapping's private_list. Will probably go away. A consequence of all the above is that: a) All the buffers at a mapping_A's ->private_list must come from the same mapping, mapping_B. There is no requirement that mapping_B be a blockdev mapping, but that's how it's used. There is a BUG() check in mark_buffer_dirty_inode() for this. b) blockdev mappings never have any buffers on ->private_list. It just never happens, and doesn't make a lot of sense. reiserfs is using b_inode_buffers for attaching dependent buffers to its journal and that caused a few problems. Fixed in reiserfs_releasepage.patch
Diffstat (limited to 'fs/sysv')
-rw-r--r--fs/sysv/file.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/fs/sysv/file.c b/fs/sysv/file.c
index 255230c20988..9dce95103718 100644
--- a/fs/sysv/file.c
+++ b/fs/sysv/file.c
@@ -36,7 +36,7 @@ int sysv_sync_file(struct file * file, struct dentry *dentry, int datasync)
struct inode *inode = dentry->d_inode;
int err;
- err = fsync_inode_buffers(inode);
+ err = sync_mapping_buffers(inode->i_mapping);
if (!(inode->i_state & I_DIRTY))
return err;
if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))