| author | Neil Brown <neilb@cse.unsw.edu.au> | 2002-04-15 08:33:55 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@home.transmeta.com> | 2002-04-15 08:33:55 -0700 |
| commit | dcd2127081230d3ee15d9c8b3c705087b73b0c09 | |
| tree | 3b9fb893d51902a9ce7dc25f647bc5e203eb0100 | |
| parent | 0de4fa30014b4d6f46b0cddc3edda39803e020d3 | |
[PATCH] PATCH - Create "export_operations" interface for filesystems to describe
Create "export_operations" interface for filesystems to describe
whether and how they should be exported.
- add new field in struct super_block "s_export_op" to describe
how a filesystem is exported (i.e. how filehandles are mapped to
dentries).
- New module: fs/exportfs for holding helper code for mapping between
filehandles and dentries
- Change nfsd to use new interface if it exists.
- Change ext2 to provide new interface
- Add documentation to filesystems/Exporting
If s_export_op isn't set, old mechanism still works, but it is
planned to remove old method and only use s_export_op.
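
The filehandle mapping described above can be illustrated with a small userspace model of the default fragment layout used by this patch: two 32-bit words (inode number, generation) for the target, optionally followed by two more for the parent, with a one-byte "type" of 1 or 2 and 255 signalling that the buffer was too small. This is only a sketch. `struct obj_id`, `model_encode_fh` and `model_decode_fh` are hypothetical names invented for the illustration; the real `export_encode_fh` and `export_decode_fh` operate on dentries and a super_block and hand the decoded datums to find_exported_dentry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for what identifies an inode. */
struct obj_id {
	uint32_t ino;
	uint32_t generation;
};

/* The patch's default encode_fh returns 255 when the buffer is too short. */
enum { FH_TYPE_TOO_SMALL = 255 };

/*
 * Model of the default encode_fh layout.
 * *max_len is the buffer size in 4-byte words on entry and the number
 * of words actually used on return. The return value is the "type":
 * 1 = target only, 2 = target plus parent.
 */
static int model_encode_fh(const struct obj_id *target,
			   const struct obj_id *parent,
			   int is_dir, int connectable,
			   uint32_t *fh, int *max_len)
{
	int len = *max_len;
	int type = 1;

	assert(target && parent && fh);
	if (len < 2 || (connectable && len < 4))
		return FH_TYPE_TOO_SMALL;

	fh[0] = target->ino;
	fh[1] = target->generation;
	len = 2;
	/* Parent information is only stored for non-directories. */
	if (connectable && !is_dir) {
		fh[2] = parent->ino;
		fh[3] = parent->generation;
		len = 4;
		type = 2;
	}
	*max_len = len;
	return type;
}

/*
 * Model of the parsing step of the default decode_fh.
 * The type, not fh_len, decides how much of the fragment is meaningful,
 * because the stated length may include padding.
 * Returns 0 on success, -1 if the fragment cannot be interpreted.
 */
static int model_decode_fh(const uint32_t *fh, int fh_len, int type,
			   struct obj_id *target, struct obj_id *parent)
{
	assert(fh && target && parent);
	parent->ino = 0;
	parent->generation = 0;
	if (fh_len < 2 || type > 2)
		return -1;

	target->ino = fh[0];
	target->generation = fh[1];
	if (type == 2) {
		if (fh_len > 2)
			parent->ino = fh[2];
		if (fh_len > 3)
			parent->generation = fh[3];
	}
	return 0;
}
```

In this model a generation of 0 for the parent simply means "not recorded", mirroring the patch's comment that a generation of 0 is treated as "accept any".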
| -rw-r--r-- | Documentation/filesystems/Exporting | 80 |
| -rw-r--r-- | Documentation/filesystems/porting | 20 |
| -rw-r--r-- | fs/Config.in | 1 |
| -rw-r--r-- | fs/Makefile | 1 |
| -rw-r--r-- | fs/exportfs/Makefile | 11 |
| -rw-r--r-- | fs/exportfs/expfs.c | 516 |
| -rw-r--r-- | fs/ext2/namei.c | 27 |
| -rw-r--r-- | fs/ext2/super.c | 11 |
| -rw-r--r-- | fs/nfsd/export.c | 9 |
| -rw-r--r-- | fs/nfsd/nfsctl.c | 2 |
| -rw-r--r-- | fs/nfsd/nfsfh.c | 131 |
| -rw-r--r-- | include/linux/dcache.h | 18 |
| -rw-r--r-- | include/linux/fs.h | 105 |
13 files changed, 884 insertions, 48 deletions
diff --git a/Documentation/filesystems/Exporting b/Documentation/filesystems/Exporting
index 03621499cac1..e16492d7c720 100644
--- a/Documentation/filesystems/Exporting
+++ b/Documentation/filesystems/Exporting
@@ -86,8 +86,7 @@ Filesystem Issues
 For a filesystem to be exportable it must:
- 1/ provide the filehandle fragment routines described below
- (later).
+ 1/ provide the filehandle fragment routines described below.
 2/ make sure that d_splice_alias is used rather than d_add
 when ->lookup finds an inode for a given parent and name.
 Typically the ->lookup routine will end:
@@ -98,3 +97,80 @@ For a filesystem to be exportable it must:
 }
+
+ A file system implementation declares that instances of the filesystem
+are exportable by setting the s_export_op field in the struct
+super_block. This field must point to a "struct export_operations"
+struct which could potentially be full of NULLs, though normally at
+least get_parent will be set.
+
+ The primary operations are decode_fh and encode_fh.
+decode_fh takes a filehandle fragment and tries to find or create a
+dentry for the object referred to by the filehandle.
+encode_fh takes a dentry and creates a filehandle fragment which can
+later be used to find/create a dentry for the same object.
+
+decode_fh will probably make use of "find_exported_dentry".
+This function lives in the "exportfs" module which a filesystem does
+not need unless it is being exported. So rather that calling
+find_exported_dentry directly, each filesystem should call it through
+the find_exported_dentry pointer in it's export_operations table.
+This field is set correctly by the exporting agent (e.g. nfsd) when a
+filesystem is exported, and before any export operations are called.
+
+find_exported_dentry needs three support functions from the
+filesystem:
+ get_name. When given a parent dentry and a child dentry, this
+ should find a name in the directory identified by the parent
+ dentry, which leads to the object identified by the child dentry.
+ If no get_name function is supplied, a default implementation
+ which used vfs_readdir to find potential names, and matches inode
+ numbers to find the correct match.
+
+ get_parent. When given a dentry for a directory, this should return
+ a dentry for the parent. Quite possibly the parent dentry will
+ have been allocated by d_alloc_anon.
+ The default get_parent function just returns an error so any
+ filehandle lookup that requires finding a parent will fail.
+ ->lookup("..") is *not* used as a default as it can leave ".."
+ entries in the dcache which are too messy to work with.
+
+ get_dentry. When given a opaque datum, this should find the
+ implied object and create a dentry for it (possibly with
+ d_alloc_anon).
+ The opaque datum is whatever is passed down by the decode_fh
+ function, and is often simply a fragment of the filehandle
+ fragment.
+ decode_fh passes two datums through find_exported_dentry. One that
+ should be used to identify the target object, and one that can be
+ used to identify the objects parent, should that be necessary.
+ The default get_dentry function assumes that the datum contains an
+ inode number and a generation number, and it attempts to get the
+ inode using "iget" and check it's validity by matching the
+ generation number. A filesystem should only depend on the default
+ if iget can safely be used this way.
+
+If decode_fh and/or encode_fh are left as NULL, then default
+implementations are used. These defaults are suitable for ext2 and
+extremely similar filesystems (like ext3).
+
+The default encode_fh creates a filehandle fragment from the inode
+number and generation number of the target together with the inode
+number and generation number of the parent (if the parent is
+required).
+
+The default decode_fh extract the target and parent datums from the
+filehandle assuming the format used by the default encode_fh and
+passed them to find_exported_dentry.
+
+
+A filehandle fragment consists of an array of 1 or more 4byte words.
+Together with a one byte "type".
+The decode_fh routine should not depend on the stated size that is
+passed to it. This size may be larger than the original filehandle
+generated by encode_fh, in which case it will have been padded with
+nuls. Rather, the encode_fh routine should choose a "type" which
+indicates the decode_fh how much of the filehandle is valid, and how
+it should be interpreted.
+
+
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 1ac5d3af530e..dd28c5c12e86 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -1,7 +1,7 @@
 Changes since 2.5.0:
 ---
-[recommeneded]
+[recommended]
 New helpers: sb_bread(), sb_getblk(), sb_get_hash_table(), set_bh(),
 sb_set_blocksize() and sb_min_blocksize().
@@ -9,7 +9,7 @@ New helpers: sb_bread(), sb_getblk(), sb_get_hash_table(), set_bh(),
 Use them.
 ---
-[recommeneded]
+[recommended]
 New methods: ->alloc_inode() and ->destroy_inode().
@@ -123,3 +123,19 @@ went in - and hadn't been documented ;-/). Just remove it from fs_flags
 ->setattr() is called without BKL now. Caller _always_ holds ->i_sem, so
 watch for ->i_sem-grabbing code that might be used by your ->setattr().
 Callers of notify_change() need ->i_sem now.
+
+---
+[recommended]
+
+New super_block field "struct export_operations *s_export_op" for
+explicit support for exporting, e.g. via NFS. The structure is fully
+documented at its declaration in include/linux/fs.h, and in
+Documentation/filesystems/Exporting.
+
+Briefly it allows for the definition of decode_fh and encode_fh operations
+to encode and decode filehandles, and allows the filesystem to use
+a standard helper function for decode_fh, and provide file-system specific
+support for this helper, particularly get_parent.
+
+It is planned that this will be required for exporting once the code
+settles down a bit.
diff --git a/fs/Config.in b/fs/Config.in
index 8a4cf90999bb..bd49b9e40cbf 100644
--- a/fs/Config.in
+++ b/fs/Config.in
@@ -124,6 +124,7 @@ if [ "$CONFIG_NET" = "y" ]; then
 if [ "$CONFIG_NFSD_V3" = "y" -o "$CONFIG_NFS_V3" = "y" ]; then
 define_bool CONFIG_LOCKD_V4 y
 fi
+ define_tristate CONFIG_EXPORTFS $CONFIG_NFSD
 dep_tristate 'SMB file system support (to mount Windows shares etc.)' CONFIG_SMB_FS $CONFIG_INET
 if [ "$CONFIG_SMB_FS" != "n" ]; then
diff --git a/fs/Makefile b/fs/Makefile
index 43a3d6ffbacf..f129943df16f 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -45,6 +45,7 @@ subdir-$(CONFIG_DEVFS_FS) += devfs
 subdir-$(CONFIG_HFS_FS) += hfs
 subdir-$(CONFIG_VXFS_FS) += freevxfs
 subdir-$(CONFIG_NFS_FS) += nfs
+subdir-$(CONFIG_EXPORTFS) += exportfs
 subdir-$(CONFIG_NFSD) += nfsd
 subdir-$(CONFIG_LOCKD) += lockd
 subdir-$(CONFIG_NLS) += nls
diff --git a/fs/exportfs/Makefile b/fs/exportfs/Makefile
new file mode 100644
index 000000000000..268f2f73dbd0
--- /dev/null
+++ b/fs/exportfs/Makefile
@@ -0,0 +1,11 @@
+#
+# Makefile for the filesystem export support routines.
+ +O_TARGET := exportfs.o + +export-objs := expfs.o + +obj-y := expfs.o +obj-m := $(O_TARGET) + +include $(TOPDIR)/Rules.make diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c new file mode 100644 index 000000000000..00a75caee89f --- /dev/null +++ b/fs/exportfs/expfs.c @@ -0,0 +1,516 @@ + +#include <linux/fs.h> +#include <linux/module.h> +#include <linux/smp_lock.h> + +/** + * find_exported_dentry - helper routine to implement export_operations->decode_fh + * @sb: The &super_block identifying the filesystem + * @obj: An opaque identifier of the object to be found - passed to get_inode + * @parent: An optional opqaue identifier of the parent of the object. + * @acceptable: A function used to test possible &dentries to see of they are acceptable + * @context: A parameter to @acceptable so that it knows on what basis to judge. + * + * find_exported_dentry is the central helper routine to enable file systems to provide + * the decode_fh() export_operation. It's main task is to take an &inode, find or create an + * appropriate &dentry structure, and possibly splice this into the dcache in the + * correct place. + * + * The decode_fh() operation provided by the filesystem should call find_exported_dentry() + * with the same parameters that it received except that instead of the file handle fragment, + * pointers to opaque identifiers for the object and optionally its parent are passed. + * The default decode_fh routine passes one pointer to the start of the filehandle fragment, and + * one 8 bytes into the fragment. It is expected that most filesystems will take this + * approach, though the offset to the parent identifier may well be different. + * + * find_exported_dentry() will call get_dentry to get an dentry pointer from the file system. If + * any &dentry in the d_alias list is acceptable, it will be returned. 
Otherwise + * find_exported_dentry() will attempt to splice a new &dentry into the dcache using get_name() and + * get_parent() to find the appropriate place. + * + */ + +struct export_operations export_op_default; + +#define CALL(ops,fun) ((ops->fun)?(ops->fun):export_op_default.fun) + +#define dprintk(x, ...) do{}while(0) + +struct dentry * +find_exported_dentry(struct super_block *sb, void *obj, void *parent, + int (*acceptable)(void *context, struct dentry *de), + void *context) +{ + struct dentry *result = NULL; + struct dentry *target_dir; + int err; + struct export_operations *nops = sb->s_export_op; + struct list_head *le, *head; + struct dentry *toput = NULL; + int noprogress; + + + /* + * Attempt to find the inode. + */ + result = CALL(sb->s_export_op,get_dentry)(sb,obj); + err = -ESTALE; + if (result == NULL) + goto err_out; + if (IS_ERR(result)) { + err = PTR_ERR(result); + goto err_out; + } + if (S_ISDIR(result->d_inode->i_mode) && + (result->d_flags & DCACHE_DISCONNECTED)) { + /* it is an unconnected directory, we must connect it */ + ; + } else { + if (acceptable(context, result)) + return result; + if (S_ISDIR(result->d_inode->i_mode)) { + /* there is no other dentry, so fail */ + goto err_result; + } + /* try any other aliases */ + spin_lock(&dcache_lock); + head = &result->d_inode->i_dentry; + list_for_each(le, head) { + struct dentry *dentry = list_entry(le, struct dentry, d_alias); + dget_locked(dentry); + spin_unlock(&dcache_lock); + if (toput) + dput(toput); + toput = NULL; + if (dentry != result && + acceptable(context, dentry)) { + dput(result); + dentry->d_vfs_flags |= DCACHE_REFERENCED; + return dentry; + } + spin_lock(&dcache_lock); + toput = dentry; + } + spin_unlock(&dcache_lock); + if (toput) + dput(toput); + } + + /* It's a directory, or we are required to confirm the file's + * location in the tree based on the parent information + */ + dprintk("find_exported_dentry: need to look harder for 
%d/%d\n",kdev_t_to_nr(sb->s_dev),*(int*)obj); + if (S_ISDIR(result->d_inode->i_mode)) + target_dir = dget(result); + else { + if (parent == NULL) + goto err_result; + + target_dir = CALL(sb->s_export_op,get_dentry)(sb,parent); + if (IS_ERR(target_dir)) + err = PTR_ERR(target_dir); + if (target_dir == NULL || IS_ERR(target_dir)) + goto err_result; + } + /* + * Now we need to make sure that target_dir is properly connected. + * It may already be, as the flag isn't always updated when connection + * happens. + * So, we walk up parent links until we find a connected directory, + * or we run out of directories. Then we find the parent, find + * the name of the child in that parent, and do a lookup. + * This should connect the child into the parent + * We then repeat. + */ + + /* it is possible that a confused file system might not let us complete the + * path to the root. For example, if get_parent returns a directory + * in which we cannot find a name for the child. While this implies a very + * sick filesystem we don't want it to cause knfsd to spin. Hence the noprogress + * counter. If we go through the loop 10 times (2 is probably enough) without + * getting anywhere, we just give up + */ + lock_kernel(); + noprogress= 0; + while (target_dir->d_flags & DCACHE_DISCONNECTED && noprogress++ < 10) { + struct dentry *pd = target_dir; + read_lock(&dparent_lock); + while (!IS_ROOT(pd) && + (pd->d_parent->d_flags & DCACHE_DISCONNECTED)) + pd = pd->d_parent; + + dget(pd); + read_unlock(&dparent_lock); + + if (!IS_ROOT(pd)) { + /* must have found a connected parent - great */ + pd->d_flags &= ~DCACHE_DISCONNECTED; + noprogress = 0; + } else if (pd == sb->s_root) { + printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n"); + pd->d_flags &= ~DCACHE_DISCONNECTED; + noprogress = 0; + } else { + /* we have hit the top of a disconnected path. 
Try + * to find parent and connect + * note: racing with some other process renaming a + * directory isn't much of a problem here. If someone + * renames the directory, it will end up properly connected, + * which is what we want + */ + struct dentry *ppd; + struct dentry *npd; + char nbuf[NAME_MAX+1]; + + down(&pd->d_inode->i_sem); + ppd = CALL(nops,get_parent)(pd); + up(&pd->d_inode->i_sem); + + if (IS_ERR(ppd)) { + err = PTR_ERR(ppd); + dprintk("find_exported_dentry: get_parent of %ld failed, err %d\n", + pd->d_inode->i_ino, err); + dput(pd); + break; + } + dprintk("find_exported_dentry: find name of %lu in %lu\n", pd->d_inode->i_ino, ppd->d_inode->i_ino); + err = CALL(nops,get_name)(ppd, nbuf, pd); + if (err) { + dput(ppd); + if (err == -ENOENT) + /* some race between get_parent and get_name? + * just try again + */ + continue; + dput(pd); + break; + } + dprintk("find_exported_dentry: found name: %s\n", nbuf); + down(&ppd->d_inode->i_sem); + npd = lookup_one_len(nbuf, ppd, strlen(nbuf)); + up(&ppd->d_inode->i_sem); + if (IS_ERR(npd)) { + err = PTR_ERR(npd); + dprintk("find_exported_dentry: lookup failed: %d\n", err); + dput(ppd); + dput(pd); + break; + } + /* we didn't really want npd, we really wanted + * a side-effect of the lookup. 
+ * hopefully, npd == pd, though it isn't really + * a problem if it isn't + */ + if (npd == pd) + noprogress = 0; + else + printk("find_exported_dentry: npd != pd\n"); + dput(npd); + dput(ppd); + if (IS_ROOT(pd)) { + /* something went wrong, we will have to give up */ + dput(pd); + break; + } + } + dput(pd); + } + + if (target_dir->d_flags & DCACHE_DISCONNECTED) { + /* something went wrong - oh-well */ + if (!err) + err = -ESTALE; + unlock_kernel(); + goto err_target; + } + /* if we weren't after a directory, have one more step to go */ + if (result != target_dir) { + struct dentry *nresult; + char nbuf[NAME_MAX+1]; + err = CALL(nops,get_name)(target_dir, nbuf, result); + if (!err) { + down(&target_dir->d_inode->i_sem); + nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf)); + up(&target_dir->d_inode->i_sem); + if (!IS_ERR(nresult)) { + if (nresult->d_inode) { + dput(result); + result = nresult; + } else + dput(nresult); + } + } + } + dput(target_dir); + unlock_kernel(); + /* now result is properly connected, it is our best bet */ + if (acceptable(context, result)) + return result; + /* one last try of the aliases.. */ + spin_lock(&dcache_lock); + head = &result->d_inode->i_dentry; + list_for_each(le, head) { + struct dentry *dentry = list_entry(le, struct dentry, d_alias); + dget_locked(dentry); + spin_unlock(&dcache_lock); + if (toput) dput(toput); + if (dentry != result && + acceptable(context, dentry)) { + dput(result); + dentry->d_vfs_flags |= DCACHE_REFERENCED; + return dentry; + } + spin_lock(&dcache_lock); + toput = dentry; + } + spin_unlock(&dcache_lock); + if (toput) + dput(toput); + + /* drat - I just cannot find anything acceptable */ + dput(result); + return ERR_PTR(-ESTALE); + + err_target: + dput(target_dir); + err_result: + dput(result); + err_out: + return ERR_PTR(err); +} + + + +static struct dentry *get_parent(struct dentry *child) +{ + /* get_parent cannot be supported generically, the locking + * is too icky. 
+ * instead, we just return EACCES. If server reboots or inodes + * get flushed, you lose + */ + return ERR_PTR(-EACCES); +} + + +struct getdents_callback { + char *name; /* name that was found. It already points to a buffer NAME_MAX+1 is size */ + unsigned long ino; /* the inum we are looking for */ + int found; /* inode matched? */ + int sequence; /* sequence counter */ +}; + +/* + * A rather strange filldir function to capture + * the name matching the specified inode number. + */ +static int filldir_one(void * __buf, const char * name, int len, + loff_t pos, ino_t ino, unsigned int d_type) +{ + struct getdents_callback *buf = __buf; + int result = 0; + + buf->sequence++; + if (buf->ino == ino) { + memcpy(buf->name, name, len); + buf->name[len] = '\0'; + buf->found = 1; + result = -1; + } + return result; +} + +/** + * get_name - default export_operations->get_name function + * @dentry: the directory in which to find a name + * @name: a pointer to a %NAME_MAX+1 char buffer to store the name + * @child: the dentry for the child directory. + * + * calls readdir on the parent until it finds an entry with + * the same inode number as the child, and returns that. + */ +static int get_name(struct dentry *dentry, char *name, + struct dentry *child) +{ + struct inode *dir = dentry->d_inode; + int error; + struct file file; + struct getdents_callback buffer; + + error = -ENOTDIR; + if (!dir || !S_ISDIR(dir->i_mode)) + goto out; + error = -EINVAL; + if (!dir->i_fop) + goto out; + /* + * Open the directory ... 
+ */ + error = init_private_file(&file, dentry, FMODE_READ); + if (error) + goto out; + error = -EINVAL; + if (!file.f_op->readdir) + goto out_close; + + buffer.name = name; + buffer.ino = child->d_inode->i_ino; + buffer.found = 0; + buffer.sequence = 0; + while (1) { + int old_seq = buffer.sequence; + + error = vfs_readdir(&file, filldir_one, &buffer); + + if (error < 0) + break; + + error = 0; + if (buffer.found) + break; + error = -ENOENT; + if (old_seq == buffer.sequence) + break; + } + +out_close: + if (file.f_op->release) + file.f_op->release(dir, &file); +out: + return error; +} + + +static struct dentry *export_iget(struct super_block *sb, unsigned long ino, __u32 generation) +{ + + /* iget isn't really right if the inode is currently unallocated!! + * This should really all be done inside each filesystem + * + * ext2fs' read_inode has been strengthed to return a bad_inode if the inode + * had been deleted. + * + * Currently we don't know the generation for parent directory, so a generation + * of 0 means "accept any" + */ + struct inode *inode; + struct dentry *result; + if (ino == 0) + return ERR_PTR(-ESTALE); + inode = iget(sb, ino); + if (inode == NULL) + return ERR_PTR(-ENOMEM); + if (is_bad_inode(inode) + || (generation && inode->i_generation != generation) + ) { + /* we didn't find the right inode.. */ + dprintk("fh_verify: Inode %lu, Bad count: %d %d or version %u %u\n", + inode->i_ino, + inode->i_nlink, atomic_read(&inode->i_count), + inode->i_generation, + generation); + + iput(inode); + return ERR_PTR(-ESTALE); + } + /* now to find a dentry. 
+ * If possible, get a well-connected one + */ + result = d_alloc_anon(inode); + if (!result) { + iput(inode); + return ERR_PTR(-ENOMEM); + } + result->d_vfs_flags |= DCACHE_REFERENCED; + return result; +} + + +static struct dentry *get_object(struct super_block *sb, void *vobjp) +{ + __u32 *objp = vobjp; + unsigned long ino = objp[0]; + __u32 generation = objp[1]; + + return export_iget(sb, ino, generation); +} + + +/** + * export_encode_fh - default export_operations->encode_fh function + * dentry: the dentry to encode + * fh: where to store the file handle fragment + * max_len: maximum length to store there + * connectable: whether to store parent infomation + * + * This default encode_fh function assumes that the 32 inode number + * is suitable for locating an inode, and that the generation number + * can be used to check that it is still valid. It places them in the + * filehandle fragment where export_decode_fh expects to find them. + */ +static int export_encode_fh(struct dentry *dentry, __u32 *fh, int *max_len, + int connectable) +{ + struct inode * inode = dentry->d_inode; + struct inode *parent = dentry->d_parent->d_inode; + int len = *max_len; + int type = 1; + + if (len < 2 || (connectable && len < 4)) + return 255; + + len = 2; + fh[0] = inode->i_ino; + fh[1] = inode->i_generation; + if (connectable && !S_ISDIR(inode->i_mode)) { + fh[2] = parent->i_ino; + fh[3] = parent->i_generation; + len = 4; + type = 2; + } + *max_len = len; + return type; +} + + +/** + * export_decode_fh - default export_operations->decode_fh function + * sb: The superblock + * fh: pointer to the file handle fragment + * fh_len: length of file handle fragment + * acceptable: function for testing acceptability of dentrys + * context: context for @acceptable + * + * This is the default decode_fh() function. + * a fileid_type of 1 indicates that the filehandlefragment + * just contains an object identifier understood by get_dentry. 
+ * a fileid_type of 2 says that there is also a directory + * identifier 8 bytes in to the filehandlefragement. + */ +static struct dentry *export_decode_fh(struct super_block *sb, __u32 *fh, int fh_len, + int fileid_type, + int (*acceptable)(void *context, struct dentry *de), + void *context) +{ + __u32 parent[2]; + parent[0] = parent[1] = 0; + if (fh_len < 2 || fileid_type > 2) + return NULL; + if (fileid_type == 2) { + if (fh_len > 2) parent[0] = fh[2]; + if (fh_len > 3) parent[1] = fh[3]; + } + return find_exported_dentry(sb, fh, parent, + acceptable, context); +} + +struct export_operations export_op_default = { + decode_fh: export_decode_fh, + encode_fh: export_encode_fh, + + get_name: get_name, + get_parent: get_parent, + get_dentry: get_object, +}; + +EXPORT_SYMBOL(export_op_default); +EXPORT_SYMBOL(find_exported_dentry); diff --git a/fs/ext2/namei.c b/fs/ext2/namei.c index 53ad39ffebd2..f7335b8739c6 100644 --- a/fs/ext2/namei.c +++ b/fs/ext2/namei.c @@ -79,10 +79,37 @@ static struct dentry *ext2_lookup(struct inode * dir, struct dentry *dentry) if (!inode) return ERR_PTR(-EACCES); } + if (inode) + return d_splice_alias(inode, dentry); d_add(dentry, inode); return NULL; } +struct dentry *ext2_get_parent(struct dentry *child) +{ + unsigned long ino; + struct dentry *parent; + struct inode *inode; + struct dentry dotdot; + + dotdot.d_name.name = ".."; + dotdot.d_name.len = 2; + + ino = ext2_inode_by_name(child->d_inode, &dotdot); + if (!ino) + return ERR_PTR(-ENOENT); + inode = iget(child->d_inode->i_sb, ino); + + if (!inode) + return ERR_PTR(-EACCES); + parent = d_alloc_anon(inode); + if (!parent) { + iput(inode); + parent = ERR_PTR(-ENOMEM); + } + return parent; +} + /* * By the time this is called, we already have created * the directory cache entry for the new file, but it diff --git a/fs/ext2/super.c b/fs/ext2/super.c index d4fcb94cac45..83c73e46c70b 100644 --- a/fs/ext2/super.c +++ b/fs/ext2/super.c @@ -209,6 +209,16 @@ static struct super_operations 
ext2_sops = { remount_fs: ext2_remount, }; +/* Yes, most of these are left as NULL!! + * A NULL value implies the default, which works with ext2-like file + * systems, but can be improved upon. + * Currently only get_parent is required. + */ +struct dentry *ext2_get_parent(struct dentry *child); +static struct export_operations ext2_export_ops = { + get_parent: ext2_get_parent, +}; + /* * This function has been shamelessly adapted from the msdos fs */ @@ -687,6 +697,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent) * set up enough so that it can read an inode */ sb->s_op = &ext2_sops; + sb->s_export_op = &ext2_export_ops; sb->s_root = d_alloc_root(iget(sb, EXT2_ROOT_INO)); if (!sb->s_root || !S_ISDIR(sb->s_root->d_inode->i_mode) || !sb->s_root->d_inode->i_blocks || !sb->s_root->d_inode->i_size) { diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c index 967700a34b43..4ffd5c531027 100644 --- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -233,6 +233,10 @@ static void exp_fsid_hash(struct svc_client *clp, struct svc_export *exp) list_add(&exp->ex_fsid_hash, head); } +extern struct dentry * +find_exported_dentry(struct super_block *sb, void *obj, void *parent, + int (*acceptable)(void *context, struct dentry *de), + void *context); /* * Export a file system. 
*/ @@ -316,12 +320,17 @@ exp_export(struct nfsctl_export *nxp) || (nxp->ex_flags & NFSEXP_FSID)) && (inode->i_sb->s_op->read_inode + || inode->i_sb->s_export_op || inode->i_sb->s_op->fh_to_dentry)) /* Ok, we can export it */; else { dprintk("exp_export: export of invalid fs type.\n"); goto finish; } + if (inode->i_sb->s_export_op && + !inode->i_sb->s_export_op->find_exported_dentry) + inode->i_sb->s_export_op->find_exported_dentry = + find_exported_dentry; if ((parent = exp_child(clp, inode->i_sb, nd.dentry)) != NULL) { dprintk("exp_export: export not valid (Rule 3).\n"); diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index be2e78d9f5f9..634a40acc689 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -30,6 +30,7 @@ #include <linux/nfsd/cache.h> #include <linux/nfsd/xdr.h> #include <linux/nfsd/syscall.h> +#include <linux/nfsd/interface.h> #include <asm/uaccess.h> @@ -379,6 +380,7 @@ static struct file_system_type nfsd_fs_type = { static int __init init_nfsd(void) { printk(KERN_INFO "Installing knfsd (copyright (C) 1996 okir@monad.swb.de).\n"); + nfsd_stat_init(); /* Statistics */ nfsd_cache_init(); /* RPC reply cache */ nfsd_export_init(); /* Exports table */ diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 0ed91d714a6d..db637bc733d8 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -6,6 +6,7 @@ * Copyright (C) 1995, 1996 Olaf Kirch <okir@monad.swb.de> * Portions Copyright (C) 1999 G. Allen Morris III <gam3@acm.org> * Extensive rewrite by Neil Brown <neilb@cse.unsw.edu.au> Southern-Spring 1999 + * ... and again Southern-Winter 2001 to support export_operations */ #include <linux/sched.h> @@ -29,6 +30,11 @@ static int nfsd_nr_verified; static int nfsd_nr_put; +extern struct export_operations export_op_default; + +#define CALL(ops,fun) ((ops->fun)?(ops->fun):export_op_default.fun) + + struct nfsd_getdents_callback { char *name; /* name that was found. 
It already points to a buffer NAME_MAX+1 is size */ @@ -60,15 +66,6 @@ dprintk("filldir_one: seq=%d, ino=%ld, name=%s\n", buf->sequence, ino, name); return result; } -/** - * nfsd_get_name - default nfsd_operations->get_name function - * @dentry: the directory in which to find a name - * @name: a pointer to a %NAME_MAX+1 char buffer to store the name - * @child: the dentry for the child directory. - * - * calls readdir on the parent until it finds an entry with - * the same inode number as the child, and returns that. - */ static int nfsd_get_name(struct dentry *dentry, char *name, struct dentry *child) { @@ -120,9 +117,6 @@ out: return error; } -/* this should be provided by each filesystem in an nfsd_operations interface as - * iget isn't really the right interface - */ static struct dentry *nfsd_iget(struct super_block *sb, unsigned long ino, __u32 generation) { @@ -244,7 +238,6 @@ int d_splice(struct dentry *target, struct dentry *parent, struct qstr *name) } /* this routine finds the dentry of the parent of a given directory - * it should be in the filesystem accessed by nfsd_operations * it assumes lookup("..") works. */ struct dentry *nfsd_findparent(struct dentry *child) @@ -510,6 +503,46 @@ err_out: } /* + * our acceptability function. + * if NOSUBTREECHECK, accept anything + * if not, require that we can walk up to exp->ex_dentry + * doing some checks on the 'x' bits + */ +int nfsd_acceptable(void *expv, struct dentry *dentry) +{ + struct svc_export *exp = expv; + int rv; + struct dentry *tdentry; + + if (exp->ex_flags & NFSEXP_NOSUBTREECHECK) + return 1; + + dget(dentry); + read_lock(&dparent_lock); + for (tdentry = dentry; + tdentry != exp->ex_dentry && ! 
IS_ROOT(tdentry); + (dget(tdentry->d_parent), + dput(tdentry), + tdentry = tdentry->d_parent) + ) { + /* make sure parents give x permission to user */ + int err; + read_unlock(&dparent_lock); + err = permission(tdentry->d_parent->d_inode, S_IXOTH); + read_lock(&dparent_lock); + if (err < 0) + break; + } + read_unlock(&dparent_lock); + if (tdentry != exp->ex_dentry) + dprintk("nfsd_acceptable failed at %p %s\n", tdentry, tdentry->d_name.name); + rv = (tdentry == exp->ex_dentry); + dput(tdentry); + return rv; +} + + +/* * Perform sanity checks on the dentry in a client's file handle. * * Note that the file handle dentry may need to be freed even after @@ -536,6 +569,8 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) kdev_t xdev = NODEV; ino_t xino = 0; __u32 *datap=NULL; + __u32 tfh[3]; /* filehandle fragment for oldstyle filehandles */ + int fileid_type; int data_left = fh->fh_size/4; int nfsdev; int fsid = 0; @@ -543,8 +578,8 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) error = nfserr_stale; if (rqstp->rq_vers == 3) error = nfserr_badhandle; + if (fh->fh_version == 1) { - datap = fh->fh_auth; if (--data_left<0) goto out; switch (fh->fh_auth_type) { @@ -585,7 +620,6 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) if (!exp) { /* export entry revoked */ - nfsdstats.fh_stale++; goto out; } @@ -609,27 +643,35 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) if (rqstp->rq_vers == 3) error = nfserr_badhandle; - if (fh->fh_version == 1) { - /* if fileid_type != 0, and super_operations provide fh_to_dentry lookup, - * then should use that */ - switch (fh->fh_fileid_type) { - case 0: - dentry = dget(exp->ex_dentry); - break; - default: - dentry = find_fh_dentry(exp->ex_dentry->d_sb, - datap, data_left, fh->fh_fileid_type, - !(exp->ex_flags & NFSEXP_NOSUBTREECHECK)); - } - } else { - __u32 tfh[3]; + if (fh->fh_version != 1) { tfh[0] = fh->ofh_ino; 
tfh[1] = fh->ofh_generation; tfh[2] = fh->ofh_dirino; - dentry = find_fh_dentry(exp->ex_dentry->d_sb, - tfh, 3, fh->ofh_dirino?2:1, - !(exp->ex_flags & NFSEXP_NOSUBTREECHECK)); + datap = tfh; + data_left = 3; + if (fh->ofh_dirino == 0) + fileid_type = 1; + else + fileid_type = 2; + } else + fileid_type = fh->fh_fileid_type; + + if (fileid_type == 0) + dentry = dget(exp->ex_dentry); + else { + struct export_operations *nop = exp->ex_mnt->mnt_sb->s_export_op; + if (nop) + dentry = CALL(nop,decode_fh)(exp->ex_mnt->mnt_sb, + datap, data_left, + fileid_type, + nfsd_acceptable, exp); + else + dentry = find_fh_dentry(exp->ex_dentry->d_sb, + datap, data_left, fileid_type, + !(exp->ex_flags & NFSEXP_NOSUBTREECHECK)); } + if (dentry == NULL) + goto out; if (IS_ERR(dentry)) { if (PTR_ERR(dentry) != -EINVAL) error = nfserrno(PTR_ERR(dentry)); @@ -664,7 +706,7 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) * write call). */ - /* When is type ever negative? */ + /* Type can be negative when creating hardlinks - not to a dir */ if (type > 0 && (inode->i_mode & S_IFMT) != type) { error = (type == S_IFDIR)? 
nfserr_notdir : nfserr_isdir; goto out; @@ -676,10 +718,14 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) /* * Security: Check that the export is valid for dentry <gam3@acm.org> + * This is only needed with subtree_check, and if export_operations is + * not being used - export_operations does the check via the "acceptable" + * callback */ error = 0; - if (!(exp->ex_flags & NFSEXP_NOSUBTREECHECK)) { + if (exp->ex_mnt->mnt_sb->s_export_op == NULL && + !(exp->ex_flags & NFSEXP_NOSUBTREECHECK)) { if (exp->ex_dentry != dentry) { struct dentry *tdentry = dentry; @@ -701,13 +747,11 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) && !(tdentry->d_inode->i_mode & S_IXOTH) ) { error = nfserr_stale; - nfsdstats.fh_stale++; dprintk("fh_verify: no root_squashed access.\n"); } } while ((tdentry != tdentry->d_parent)); if (exp->ex_dentry != tdentry) { error = nfserr_stale; - nfsdstats.fh_stale++; printk("nfsd Security: %s/%s bad export.\n", dentry->d_parent->d_name.name, dentry->d_name.name); @@ -729,9 +773,12 @@ fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) } #endif out: + if (error == nfserr_stale) + nfsdstats.fh_stale++; return error; } + /* * Compose a file handle for an NFS reply. 
* @@ -742,6 +789,7 @@ out: inline int _fh_update(struct dentry *dentry, struct svc_export *exp, __u32 *datap, int *maxsize) { + struct export_operations *nop = exp->ex_mnt->mnt_sb->s_export_op; struct super_block *sb = dentry->d_sb; if (dentry == exp->ex_dentry) { @@ -749,6 +797,10 @@ inline int _fh_update(struct dentry *dentry, struct svc_export *exp, return 0; } + if (nop) + return CALL(nop,encode_fh)(dentry, datap, maxsize, + !(exp->ex_flags&NFSEXP_NOSUBTREECHECK)); + if (sb->s_op->dentry_to_fh) { int need_parent = !S_ISDIR(dentry->d_inode->i_mode) && !(exp->ex_flags & NFSEXP_NOSUBTREECHECK); @@ -853,11 +905,11 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, st _fh_update(dentry, exp, datap, &size); fhp->fh_handle.fh_size += size*4; } + if (fhp->fh_handle.fh_fileid_type == 255) + return nfserr_opnotsupp; } nfsd_nr_verified++; - if (fhp->fh_handle.fh_fileid_type == 255) - return nfserr_opnotsupp; return 0; } @@ -889,6 +941,8 @@ fh_update(struct svc_fh *fhp) fhp->fh_handle.fh_fileid_type = _fh_update(dentry, fhp->fh_export, datap, &size); fhp->fh_handle.fh_size += size*4; + if (fhp->fh_handle.fh_fileid_type == 255) + return nfserr_opnotsupp; } out: return 0; @@ -921,3 +975,4 @@ fh_put(struct svc_fh *fhp) } return; } + diff --git a/include/linux/dcache.h b/include/linux/dcache.h index a98c3fe940c0..6cf86c3e301c 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -117,12 +117,18 @@ d_iput: no no yes * renamed" and has to be * deleted on the last dput() */ -#define DCACHE_DISCONNECTED 0x0004 /* This dentry is not currently connected to the - * dcache tree. Its parent will either be itself, - * or will have this flag as well. - * If this dentry points to a directory, then - * s_nfsd_free_path semaphore will be down - */ +#define DCACHE_DISCONNECTED 0x0004 + /* This dentry is possibly not currently connected to the dcache tree, + * in which case its parent will either be itself, or will have this + * flag as well. 
nfsd will not use a dentry with this bit set, but will + * first endeavour to clear the bit either by discovering that it is + * connected, or by performing lookup operations. Any filesystem which + * supports export_operations MUST have a lookup function which, if it finds + * a directory inode with a DCACHE_DISCONNECTED dentry, will d_move + * that dentry into place and return that dentry rather than the passed one, + * typically using d_splice_alias. + */ + #define DCACHE_REFERENCED 0x0008 /* Recently used, don't discard. */ extern spinlock_t dcache_lock; diff --git a/include/linux/fs.h b/include/linux/fs.h index 17b3b586df2c..1e990e477adf 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -701,6 +701,7 @@ struct super_block { struct file_system_type *s_type; struct super_operations *s_op; struct dquot_operations *dq_op; + struct export_operations *s_export_op; unsigned long s_flags; unsigned long s_magic; struct dentry *s_root; @@ -936,6 +937,110 @@ struct dquot_operations { int (*transfer) (struct inode *, struct iattr *); }; + +/** + * &export_operations - for nfsd to communicate with file systems + * decode_fh: decode a file handle fragment and return a &struct dentry + * encode_fh: encode a file handle fragment from a dentry + * get_name: find the name for a given inode in a given directory + * get_parent: find the parent of a given directory + * get_dentry: find a dentry for the inode given a file handle sub-fragment + * + * Description: + * The export_operations structure provides a means for nfsd to communicate + * with a particular exported file system - particularly enabling nfsd and + * the filesystem to co-operate when dealing with file handles. + * + * export_operations contains two basic operations for dealing with file handles, + * decode_fh() and encode_fh(), and allows for some other operations to be defined + * which standard helper routines use to get specific information from the + * filesystem. 
+ * + * nfsd encodes information used to determine which filesystem a filehandle + * applies to in the initial part of the file handle. The remainder, termed a + * file handle fragment, is controlled completely by the filesystem. + * The standard helper routines assume that this fragment will contain one or two + * sub-fragments, one which identifies the file, and one which may be used to + * identify the (a) directory containing the file. + * + * In some situations, nfsd needs to get a dentry which is connected into a + * specific part of the file tree. To allow for this, it passes the function + * acceptable() together with a @context which can be used to see if the dentry + * is acceptable. As there can be multiple dentries for a given file, the filesystem + * should check each one for acceptability before looking for the next. As soon + * as an acceptable one is found, it should be returned. + * + * decode_fh: + * @decode_fh is given a &struct super_block (@sb), a file handle fragment (@fh, @fh_len) + * and an acceptability testing function (@acceptable, @context). It should return + * a &struct dentry which refers to the same file that the file handle fragment refers + * to, and which passes the acceptability test. If it cannot, it should return + * a %NULL pointer if the file was found but no acceptable &dentries were available, or + * an %ERR_PTR error code indicating why it couldn't be found (e.g. %ENOENT or %ENOMEM). + * + * encode_fh: + * @encode_fh should store in the file handle fragment @fh (using at most @max_len bytes) + * information that can be used by @decode_fh to recover the file referred to by the + * &struct dentry @de. If the @connectable flag is set, encode_fh() should store + * sufficient information so that a good attempt can be made to find not only + * the file but also its place in the filesystem. This typically means storing + * a reference to de->d_parent in the filehandle fragment. 
+ * encode_fh() should return the number of bytes stored or a negative error code + * such as %-ENOSPC + * + * get_name: + * @get_name should find a name for the given @child in the given @parent directory. + * The name should be stored in the @name (with the understanding that it is already + * pointing to a %NAME_MAX+1 sized buffer). get_name() should return %0 on success, + * or a negative error code on error. + * @get_name will be called without @parent->i_sem held. + * + * get_parent: + * @get_parent should find the parent directory for the given @child which is also + * a directory. In the event that it cannot be found, or storage space cannot be + * allocated, an %ERR_PTR should be returned. + * + * get_dentry: + * Given a &super_block (@sb) and a pointer to a file-system specific inode identifier, + * possibly an inode number, (@inump) get_dentry() should find the identified inode and + * return a dentry for that inode. + * Any suitable dentry can be returned including, if necessary, a new dentry created + * with d_alloc_root. The caller can then find any other extant dentries by following the + * d_alias links. If a new dentry was created using d_alloc_root, DCACHE_DISCONNECTED + * should be set, and the dentry should be d_rehash()ed. + * + * If the inode cannot be found, either a %NULL pointer or an %ERR_PTR code can be returned. + * The @inump will be whatever was passed to find_exported_dentry() in either the + * @obj or @parent parameters. 
+ * + * Locking rules: + * get_parent is called with child->d_inode->i_sem down + * get_name is not (which is possibly inconsistent) + */ + +struct export_operations { + struct dentry *(*decode_fh)(struct super_block *sb, __u32 *fh, int fh_len, int fh_type, + int (*acceptable)(void *context, struct dentry *de), + void *context); + int (*encode_fh)(struct dentry *de, __u32 *fh, int *max_len, + int connectable); + + /* the following are only called from the filesystem itself */ + int (*get_name)(struct dentry *parent, char *name, + struct dentry *child); + struct dentry * (*get_parent)(struct dentry *child); + struct dentry * (*get_dentry)(struct super_block *sb, void *inump); + + /* This is set by the exporting module to a standard helper */ + struct dentry * (*find_exported_dentry)( + struct super_block *sb, void *obj, void *parent, + int (*acceptable)(void *context, struct dentry *de), + void *context); + + +}; + + struct file_system_type { const char *name; int fs_flags; |
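The encode_fh/decode_fh contract and the acceptable() callback described in the export_operations comment can be illustrated outside the kernel. What follows is only a user-space sketch under stated assumptions, not code from this patch: `struct toy_node`, `toy_encode_fh`, `toy_decode_fh` and `toy_under_export` are hypothetical names, a flat array stands in for the filesystem, and no permission checks are made. The return convention follows the nfsd call sites in the diff, where the value returned through `_fh_update()` is stored as the fileid type (1 without a parent, 2 with one, 255 on failure) and the length is counted in 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode plus its dentry; not a kernel type. */
struct toy_node {
	uint32_t ino;
	uint32_t generation;
	struct toy_node *parent;	/* NULL for a root */
};

typedef int (*toy_acceptable_fn)(void *context, const struct toy_node *n);

/*
 * Store (ino, generation[, parent ino]) in fh. *max_len is in 32-bit words
 * and is updated to the number of words used. Returns the fragment type:
 * 1 = no parent stored, 2 = parent stored, 255 = fragment does not fit.
 */
int toy_encode_fh(const struct toy_node *n, uint32_t *fh, int *max_len,
		  int connectable)
{
	int need;

	assert(n && fh && max_len);
	need = (connectable && n->parent) ? 3 : 2;
	if (*max_len < need)
		return 255;
	fh[0] = n->ino;
	fh[1] = n->generation;
	if (need == 3)
		fh[2] = n->parent->ino;
	*max_len = need;
	return need == 3 ? 2 : 1;
}

/*
 * Look the fragment up in a flat table and apply the caller's test.
 * Simplification: returns NULL both when nothing matches and when the
 * match is not acceptable; the kernel interface distinguishes the two.
 */
const struct toy_node *toy_decode_fh(const struct toy_node *table, size_t count,
				     const uint32_t *fh, int fh_len, int fh_type,
				     toy_acceptable_fn acceptable, void *context)
{
	size_t i;

	if (fh_type < 1 || fh_type > 2 || fh_len < fh_type + 1)
		return NULL;
	for (i = 0; i < count; i++) {
		const struct toy_node *n = &table[i];

		if (n->ino != fh[0] || n->generation != fh[1])
			continue;
		return acceptable(context, n) ? n : NULL;
	}
	return NULL;
}

/* Acceptability test: walk up the parents until the export root is reached. */
int toy_under_export(void *context, const struct toy_node *n)
{
	const struct toy_node *root = context;

	for (; n; n = n->parent)
		if (n == root)
			return 1;
	return 0;
}
```

Passing the export root as the context mirrors how the patch hands `exp` to `nfsd_acceptable`: the decoder stays ignorant of export policy and simply asks the caller whether a candidate is usable.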
