diff options
| author | Andrew Morton <akpm@zip.com.au> | 2002-05-19 02:20:58 -0700 |
|---|---|---|
| committer | Arnaldo Carvalho de Melo <acme@conectiva.com.br> | 2002-05-19 02:20:58 -0700 |
| commit | 43152186ec28f3d4adf2a79ff8becacdfca9c82d (patch) | |
| tree | a0e8366d2b4cc5ebf6c4535b5c92e425a2b839e9 /fs/sysv | |
| parent | 6b9f3b418f690318f88fbfd2002bbd7ca344c04a (diff) | |
[PATCH] i_dirty_buffers locking fix
This fixes a race between try_to_free_buffers' call to
__remove_inode_queue() and other users of b_inode_buffers
(fsync_inode_buffers and mark_buffer_dirty_inode()). They are
presently taking different locks.
The patch relocates and redefines and clarifies(?) the role of
inode.i_dirty_buffers.
The 2.4 definition of i_dirty_buffers is "a list of random buffers
which is protected by a kernel-wide lock". This definition needs to be
narrowed in the 2.5 context. It is now
"a list of buffers from a different mapping, protected by a lock within
that mapping". This list of buffers is specifically for fsync().
As this is a "data plane" operation, all the structures have been moved
out of the inode and into the address_space. So address_space now has:
list_head private_list;
A list, available to the address_space for any purpose. If
that address_space chooses to use the helper functions
mark_buffer_dirty_inode and sync_mapping_buffers() then this list
will contain buffer_heads, attached via
buffer_head.b_assoc_buffers.
If the address_space does not call those helper functions
then the list is free for other usage. The only requirement is
that the list be list_empty() at destroy_inode() time.
At least, this is the objective. At present,
generic_file_write() will call generic_osync_inode(), which
expects that list to contain buffer_heads. So private_list isn't
useful for anything else yet.
spinlock_t private_lock;
A spinlock, available to the address_space.
If the address_space is using try_to_free_buffers(),
mark_inode_dirty_buffers() and fsync_inode_buffers() then this
lock is used to protect the private_list of *other* mappings which
have listed buffers from *this* mapping onto themselves.
That is: for buffer_heads, mapping_A->private_lock does not
protect mapping_A->private_list! It protects the b_assoc_buffers
list from buffers which are backed by mapping_A and it protects
mapping_B->private_list, mapping_C->private_list, ...
So what we have here is a cross-mapping association. S_ISREG
mappings maintain a list of buffers from the blockdev's
address_space which they need to know about for a successful
fsync(). The locking follows the buffers: the lock in in the
blockdev's mapping, not in the S_ISREG file's mapping.
For address_spaces which use try_to_free_buffers,
private_lock is also (and quite unrelatedly) used for protection
of the buffer ring at page->private. Exclusion between
try_to_free_buffers(), __get_hash_table() and
__set_page_dirty_buffers(). This is in fact its major use.
address_space *assoc_mapping
Sigh. This is the address of the mapping which backs the
buffers which are attached to private_list. It's here so that
generic_osync_inode() can locate the lock which protects this
mapping's private_list. Will probably go away.
A consequence of all the above is that:
a) All the buffers at a mapping_A's ->private_list must come
from the same mapping, mapping_B. There is no requirement that
mapping_B be a blockdev mapping, but that's how it's used.
There is a BUG() check in mark_buffer_dirty_inode() for this.
b) blockdev mappings never have any buffers on ->private_list.
It just never happens, and doesn't make a lot of sense.
reiserfs is using b_inode_buffers for attaching dependent buffers to its
journal and that caused a few problems. Fixed in reiserfs_releasepage.patch
Diffstat (limited to 'fs/sysv')
| -rw-r--r-- | fs/sysv/file.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/fs/sysv/file.c b/fs/sysv/file.c index 255230c20988..9dce95103718 100644 --- a/fs/sysv/file.c +++ b/fs/sysv/file.c @@ -36,7 +36,7 @@ int sysv_sync_file(struct file * file, struct dentry *dentry, int datasync) struct inode *inode = dentry->d_inode; int err; - err = fsync_inode_buffers(inode); + err = sync_mapping_buffers(inode->i_mapping); if (!(inode->i_state & I_DIRTY)) return err; if (datasync && !(inode->i_state & I_DIRTY_DATASYNC)) |
