path: root/include/linux/mount.h
Age | Commit message | Author
2010-08-11 | vfs: remove unused MNT_STRICTATIME | Miklos Szeredi
Commit d0adde574b8487ef30f69e2d08bba769e4be513f added MNT_STRICTATIME but it isn't actually used (MS_STRICTATIME clears MNT_RELATIME and MNT_NOATIME rather than setting any mount flag). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-07-28 | fsnotify/vfsmount: add fsnotify fields to struct vfsmount | Andreas Gruenbacher
This patch adds the list and mask fields needed to support vfsmount marks. These are the same fields fsnotify needs on an inode. They are not used yet, only declared, and we note where the cleanup hook should be (the function is not yet defined). Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
2010-03-04 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
  init: Open /dev/console from rootfs
  mqueue: fix typo "failues" -> "failures"
  mqueue: only set error codes if they are really necessary
  mqueue: simplify do_open() error handling
  mqueue: apply mathematics distributivity on mq_bytes calculation
  mqueue: remove unneeded info->messages initialization
  mqueue: fix mq_open() file descriptor leak on user-space processes
  fix race in d_splice_alias()
  set S_DEAD on unlink() and non-directory rename() victims
  vfs: add NOFOLLOW flag to umount(2)
  get rid of ->mnt_parent in tomoyo/realpath
  hppfs can use existing proc_mnt, no need for do_kern_mount() in there
  Mirror MS_KERNMOUNT in ->mnt_flags
  get rid of useless vfsmount_lock use in put_mnt_ns()
  Take vfsmount_lock to fs/internal.h
  get rid of insanity with namespace roots in tomoyo
  take check for new events in namespace (guts of mounts_poll()) to namespace.c
  Don't mess with generic_permission() under ->d_lock in hpfs
  sanitize const/signedness for udf
  nilfs: sanitize const/signedness in dealing with ->d_name.name
  ...
Fix up fairly trivial (famous last words...) conflicts in drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
2010-03-03 | Mirror MS_KERNMOUNT in ->mnt_flags | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 | Take vfsmount_lock to fs/internal.h | Al Viro
no more users left outside of fs/*.c (and very few outside of fs/namespace.c, actually) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 | VFS: Clean up shared mount flag propagation | Valerie Aurora
The handling of mount flags in set_mnt_shared() got a little tangled up during previous cleanups, with the following problems:
* MNT_PNODE_MASK is defined as a literal constant when it should be a bitwise OR of other MNT_* flags
* set_mnt_shared() clears and then sets MNT_SHARED (part of MNT_PNODE_MASK)
* MNT_PNODE_MASK could use a comment in mount.h
* MNT_PNODE_MASK is a terrible name, change to MNT_SHARED_MASK
This patch fixes these problems. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
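The cleanup described in this commit can be illustrated with a minimal sketch. The DEMO_ names and numeric flag values below are assumptions made for illustration and are not quoted from the patch; the point is only that the masks are composed from named flags and that the shared bit is set exactly once.

```c
/* Illustrative flag values; assumed here, not quoted from the patch. */
#define DEMO_MNT_SHARED      0x1000u
#define DEMO_MNT_UNBINDABLE  0x2000u

/* Masks derived from named flags instead of literal constants. */
#define DEMO_SHARED_MASK       (DEMO_MNT_UNBINDABLE)
#define DEMO_PROPAGATION_MASK  (DEMO_MNT_SHARED | DEMO_MNT_UNBINDABLE)

/*
 * Mark a mount as shared: clear the flags that conflict with sharing,
 * then set the shared bit once (no clear-then-set of the same bit).
 */
static unsigned int demo_set_shared(unsigned int flags)
{
    flags &= ~DEMO_SHARED_MASK;
    flags |= DEMO_MNT_SHARED;
    return flags;
}
```

Because the masks are built from the named flags, adding or renumbering a flag cannot silently leave a stale literal behind.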
2010-02-17 | percpu: add __percpu sparse annotations to fs | Tejun Heo
Add __percpu sparse annotations to fs. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
2009-06-11 | fs: introduce mnt_clone_write | npiggin@suse.de
This patch speeds up lmbench lat_mmap test by about another 2% after the first patch. Before: avg = 462.286 std = 5.46106 After: avg = 453.12 std = 9.58257 (50 runs of each, stddev gives a reasonable confidence) It does this by introducing mnt_clone_write, which avoids some heavyweight operations of mnt_want_write if called on a vfsmount which we know already has a write count; and mnt_want_write_file, which can call mnt_clone_write if the file is open for write. After these two patches, mnt_want_write and mnt_drop_write go from 7% on the profile down to 1.3% (including mnt_clone_write). [AV: mnt_want_write_file() should take file alone and derive mnt from it; not only all callers have that form, but that's the only mnt about which we know that it's already held for write if file is opened for write] Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 | fs: mnt_want_write speedup | npiggin@suse.de
This patch speeds up lmbench lat_mmap test by about 8%. lat_mmap is set up basically to mmap a 64MB file on tmpfs, fault in its pages, then unmap it. A microbenchmark yes, but it exercises some important paths in the mm. Before: avg = 501.9 std = 14.7773 After: avg = 462.286 std = 5.46106 (50 runs of each, stddev gives a reasonable confidence, but there is quite a bit of variation there still) It does this by removing the complex per-cpu locking and counter-cache and replaces it with a percpu counter in struct vfsmount. This makes the code much simpler, and avoids spinlocks (although the msync is still pretty costly, unfortunately). It results in about 900 bytes smaller code too. It does increase the size of a vfsmount, however. It should also give a speedup on large systems if CPUs are frequently operating on different mounts (because the existing scheme has to operate on an atomic in the struct vfsmount when switching between mounts). But I'm most interested in the single threaded path performance for the moment. [AV: minor cleanup] Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-26 | Add a strictatime mount option | Matthew Garrett
Add support for explicitly requesting full atime updates. This makes it possible for kernels to default to relatime but still allow userspace to override it. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 | include/linux/mount.h: remove CVS keyword | Adrian Bunk
Remove a CVS keyword that wasn't updated for a long time from a comment. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-01 | [PATCH] pass struct path * to do_add_mount() | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 | [PATCH] vfs: use kstrdup() and check failing allocation | Li Zefan
- use kstrdup() instead of kmalloc() + memcpy()
- return NULL if allocating ->mnt_devname failed
- mnt_devname should be const
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-30 | Remove "#ifdef __KERNEL__" checks from unexported headers | Robert P. J. Day
Remove the "#ifdef __KERNEL__" tests from unexported header files in linux/include whose entire contents are wrapped in that preprocessor test. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-23 | [patch 4/7] vfs: mountinfo: add mount peer group ID | Miklos Szeredi
Add a unique ID to each peer group using the IDR infrastructure. The identifiers are reused after the peer group dissolves. The IDR structures are protected by holding namespace_sem for write while allocating or deallocating IDs. IDs are allocated when a previously unshared vfsmount becomes the first member of a peer group. When a new member is added to an existing group, the ID is copied from one of the old members. IDs are freed when the last member of a peer group is unshared. Setting the MNT_SHARED flag on members of a subtree is done as a separate step, after all the IDs have been allocated. This way an allocation failure can be cleaned up easily, without affecting the propagation state. Based on design sketch by Al Viro. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-23 | [patch 3/7] vfs: mountinfo: add mount ID | Miklos Szeredi
Add a unique ID to each vfsmount using the IDR infrastructure. The identifiers are reused after the vfsmount is freed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-21 | [PATCH] move a bunch of declarations to fs/internal.h | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 | [PATCH] r/o bind mounts: honor mount writer counts at remount | Dave Hansen
Originally from: Herbert Poetzl <herbert@13thfloor.at> This is the core of the read-only bind mount patch set. Note that this does _not_ add a "ro" option directly to the bind mount operation. If you require such a mount, you must first do the bind, then follow it up with a 'mount -o remount,ro' operation. If you wish to have a r/o bind mount of /foo on /bar:
    mount --bind /foo /bar
    mount -o remount,ro /bar
Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 | [PATCH] r/o bind mounts: track numbers of writers to mounts | Dave Hansen
This is the real meat of the entire series. It actually implements the tracking of the number of writers to a mount. However, it causes scalability problems because there can be hundreds of cpus doing open()/close() on files on the same mnt at the same time. Even an atomic_t in the mnt has massive scaling problems because the cacheline gets so terribly contended.
This uses a statically-allocated percpu variable. All want/drop operations are local to a cpu as long as that cpu operates on the same mount, and there are no writer count imbalances. Writer count imbalances happen when a write is taken on one cpu, and released on another, like when an open/close pair is performed on two different cpus.
Upon a remount,ro request, all of the data from the percpu variables is collected (expensive, but very rare) and we determine if there are any outstanding writers to the mount.
I've written a little benchmark to sit in a loop for a couple of seconds in several cpus in parallel doing open/write/close loops. http://sr71.net/~dave/linux/openbench.c The code in here is a worst-possible case for this patch. It does opens on a _pair_ of files in two different mounts in parallel. This should cause my code to lose its "operate on the same mount" optimization completely. This worst-case scenario causes a 3% degradation in the benchmark. I could probably get rid of even this 3%, but it would be more complex than what I have here, and I think this is getting into acceptable territory. In practice, I expect writing more than 3 bytes to a file, as well as disk I/O to mask any effects that this has. (To get rid of that 3%, we could have an #defined number of mounts in the percpu variable. So, instead of a CPU getting to operate only on percpu data when it accesses only one mount, it could stay on percpu data when it only accesses N or fewer mounts.)
[AV] merged fix for __clear_mnt_mount() stepping on freed vfsmount Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
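The idea of per-CPU writer accounting with an occasional expensive sum can be modelled in plain C. This is a deliberately simplified, single-threaded sketch and not the data structure used by the patch: the names and the fixed DEMO_NR_CPUS are assumptions, and the "CPU" is just an array index passed by the caller.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4  /* hypothetical CPU count for the model */

struct demo_pcpu_mount {
    long writers[DEMO_NR_CPUS];  /* one slot per CPU; a slot may go negative */
};

/* Take a write reference on the given CPU: touches only that CPU's slot. */
static void demo_want_write(struct demo_pcpu_mount *m, int cpu)
{
    m->writers[cpu]++;
}

/* Drop a write reference, possibly on a different CPU than the one that took it. */
static void demo_drop_write(struct demo_pcpu_mount *m, int cpu)
{
    m->writers[cpu]--;
}

/* Slow path, e.g. for a remount,ro request: sum every slot for the true count. */
static long demo_count_writers(const struct demo_pcpu_mount *m)
{
    long total = 0;
    for (size_t i = 0; i < DEMO_NR_CPUS; i++)
        total += m->writers[i];
    return total;
}
```

An imbalance (take on CPU 0, drop on CPU 1) leaves individual slots at +1 and -1, which is why only the sum over all slots is meaningful.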
2008-04-19 | [PATCH] r/o bind mounts: stub functions | Dave Hansen
This patch adds two functions, mnt_want_write() and mnt_drop_write(). These are used like a lock pair around any fs operations that might cause a write to the filesystem. Before these can become useful, we must first cover each place in the VFS where writes are performed with a want/drop pair. When that is complete, we can actually introduce code that will safely check the counts before allowing r/w<->r/o transitions to occur. Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
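The want/drop pairing and the check it eventually enables can be sketched as a small userspace model. Everything here is hypothetical (the demo_ names, the struct, the single plain counter); it shows the intended end state described in the message, not the stub functions this commit actually adds.

```c
#include <errno.h>

struct demo_mnt {
    int readonly;  /* nonzero once the mount is read-only */
    int writers;   /* outstanding want/drop pairs */
};

/* Ask for write access; fails with -EROFS on a read-only mount. */
static int demo_mnt_want_write(struct demo_mnt *m)
{
    if (m->readonly)
        return -EROFS;
    m->writers++;
    return 0;
}

/* Release write access taken by a successful demo_mnt_want_write(). */
static void demo_mnt_drop_write(struct demo_mnt *m)
{
    m->writers--;
}

/* r/w -> r/o transition: refused with -EBUSY while writers are outstanding. */
static int demo_remount_ro(struct demo_mnt *m)
{
    if (m->writers > 0)
        return -EBUSY;
    m->readonly = 1;
    return 0;
}
```

Every operation that may write brackets itself with the pair, so the counter tells the remount path whether the transition is safe.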
2008-03-27 | [PATCH] do shrink_submounts() for all fs types | Al Viro
... and take it out of ->umount_begin() instances. Call with all locks already taken (by do_umount()) and leave calling release_mounts() to caller (it will do release_mounts() anyway, so we can just put into the same list). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 | [PATCH] count ghost references to vfsmounts | Al Viro
make propagate_mount_busy() exclude references from the vfsmounts that had been isolated by umount_tree() and are just waiting for release_mounts() to dispose of their ->mnt_parent/->mnt_mountpoint. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-09 | Fix misspellings collected by members of KJ list. | Robert P. J. Day
Fix the misspellings of "propogate", "writting" and (oh, the shame :-) "kenrel" in the source tree. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-02-11 | [PATCH] struct vfsmount: keep mnt_count & mnt_expiry_mark away from mnt_flags | Eric Dumazet
I noticed cache misses in touch_atime() that can be avoided if we keep mnt_count & mnt_expiry_mark in a different cache line than mnt_flags (mostly read). mnt_count & mnt_expiry_mark are modified each time a file is opened/closed in a file system. touch_atime() is called each time a file is read, and generally needs to read mnt_flags. Other fields of struct vfsmount are mostly read, so I chose to move mnt_count & mnt_expiry_mark to the end of struct vfsmount, adding a comment so that nobody tries to re-arrange fields to fill the holes :) On 64-bit platforms, the new offsetof(mnt_count) is 0xC0. On 32-bit platforms, it is 0x60, so I did not add a ____cacheline_aligned_in_smp because it would have too big an impact on the size of this object (in particular if CONFIG_X86_L1_CACHE_SHIFT=7). Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
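The layout argument can be checked mechanically with offsetof. The struct below is a made-up stand-in, not the real struct vfsmount: the field list, the 96-byte filler, and the 64-byte line size are all assumptions, and the check presumes the object itself starts on a cache-line boundary.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE 64  /* assumed line size; real hardware varies */

struct demo_vfsmount {
    void *mnt_parent;            /* read-mostly */
    void *mnt_root;              /* read-mostly */
    int   mnt_flags;             /* read on every atime check */
    char  other_read_mostly[96]; /* stand-in for the remaining fields */
    /* Written on every open/close; keep these last, do not fill holes above. */
    int   mnt_count;
    int   mnt_expiry_mark;
};

/* Index of the cache line that holds a given byte offset. */
static size_t demo_line_of(size_t offset)
{
    return offset / DEMO_CACHE_LINE;
}
```

Comparing demo_line_of(offsetof(struct demo_vfsmount, mnt_flags)) with the value for mnt_count shows whether the frequently written counters share a line with the read-mostly flags.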
2006-12-13 | [PATCH] relative atime | Valerie Henson
Add "relatime" (relative atime) support. Relative atime only updates the atime if the previous atime is older than the mtime or ctime. Like noatime, but useful for applications like mutt that need to know when a file has been read since it was last modified. A corresponding patch against mount(8) is available at http://userweb.kernel.org/~akpm/mount-relative-atime.txt Signed-off-by: Valerie Henson <val_henson@linux.intel.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Karel Zak <kzak@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
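The relatime rule stated above reduces to a single comparison. A minimal sketch, assuming timestamps are plain time_t values and using a hypothetical function name; treating an atime equal to mtime or ctime as "not older" is a choice made here from the wording of the message, not something taken from the patch.

```c
#include <time.h>

/*
 * Return 1 if a read should update atime under the relatime rule:
 * only when the stored atime is older than mtime or ctime.
 * Equal timestamps count as "not older" in this sketch (an assumption).
 */
static int demo_relatime_needs_update(time_t last_access,
                                      time_t last_modify,
                                      time_t last_change)
{
    return last_access < last_modify || last_access < last_change;
}
```

A reader such as a mail client can therefore still tell "read since last modification" from "unread", while repeated reads of an unchanged file cause no further atime writes.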
2006-12-08 | [PATCH] rename struct namespace to struct mnt_namespace | Kirill Korotaev
Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with other namespaces being developed for the containers: pid, uts, ipc, etc. 'namespace' variables and attributes are also renamed to 'mnt_ns'. Signed-off-by: Kirill Korotaev <dev@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-24 | Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ | Trond Myklebust
Conflicts: fs/nfs/inode.c fs/super.c Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch 'VFS: Permit filesystem to override root dentry on mount'
2006-06-23 | [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry | David Howells
Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-09 | VFS: Add shrink_submounts() | Trond Myklebust
Allow a submount to be marked as being 'shrinkable' by means of the vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which attempts to recursively unmount these submounts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 | VFS: Add GPL_EXPORTED function vfs_kern_mount() | Trond Myklebust
do_kern_mount() does not allow the kernel to use private mount interfaces without exposing the same interfaces to userland. The problem is that the filesystem is referenced by name, thus meaning that it and its mount interface must be registered in the global filesystem list. vfs_kern_mount() passes the struct file_system_type as an explicit parameter in order to overcome this limitation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-10 | [PATCH] per-mountpoint noatime/nodiratime | Christoph Hellwig
Turn noatime and nodiratime into per-mount instead of per-sb flags. After all the preparations this is a rather trivial patch. The mount code needs to treat the two options as per-mount instead of per-superblock, and touch_atime needs to be changed to check the new MNT_ flags in addition to the MS_ flags that are kept for filesystems that are always noatime/nodiratime but not user settable anymore. Besides that core code only nfs needed an update because it's leaving atime updates to the server and thus sets the S_NOATIME flag on every inode, but needs to know whether it's a real noatime mount for a getattr optimization. While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were only used by touch_atime. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 | [PATCH] shared mounts: cleanup | Miklos Szeredi
Small cleanups in shared mounts code. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Ram Pai <linuxram@us.ibm.com> Cc: <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 | [PATCH] unbindable mounts | Ram Pai
An unbindable mount does not forward or receive propagation. Also unbindable mount disallows bind mounts. The semantics is as follows. Bind semantics: It is invalid to bind mount an unbindable mount. Move semantics: It is invalid to move an unbindable mount under shared mount. Clone-namespace semantics: If a mount is unbindable in the parent namespace, the corresponding cloned mount in the child namespace becomes unbindable too. Note: there is subtle difference, unbindable mounts cannot be bind mounted but can be cloned during clone-namespace. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 | [PATCH] introduce slave mounts | Ram Pai
A slave mount always has a master mount from which it receives mount/umount events. Unlike shared mount the event propagation does not flow from the slave mount to the master. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 | [PATCH] introduce shared mounts | Ram Pai
This creates shared mounts. A shared mount when bind-mounted to some mountpoint, propagates mount/umount events to each other. All the shared mounts that propagate events to each other belong to the same peer-group. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 | [PATCH] beginning of the shared-subtree proper | Ram Pai
A private mount does not forward or receive propagation. This patch provides user the ability to convert any mount to private. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 | [PATCH] saner handling of auto_acct_off() and DQUOT_OFF() in umount | Al Viro
The way we currently deal with quota and process accounting that might keep vfsmount busy at umount time is inherently broken; we try to turn them off just in case (not quite correctly, at that) and
a) pray umount doesn't fail (otherwise they'll stay turned off)
b) pray nobody does anything funny just as we turn quota off
Moreover, LSM provides hooks for doing the same sort of broken logic. The proper way to deal with that is to introduce the second kind of reference to vfsmount. Semantics:
- when the last normal reference is dropped, all special ones are converted to normal ones and if there had been any, cleanup is done.
- normal reference can be cloned into a special one
- special reference can be converted to normal one; that's a no-op if we'd already passed the point of no return (i.e. mntput() had converted special references to normal and started cleanup).
The way it works: e.g. starting process accounting converts the vfsmount reference pinned by the opened file into special one and turns it back to normal when it gets shut down; acct_auto_close() is done when no normal references are left. That way it does *not* obstruct umount(2) and it silently gets turned off when the last normal reference to vfsmount is gone. Which is exactly what we want... The same should be done by LSM module that holds some internal references to vfsmount and wants to shut them down on umount - it should make them special and security_sb_umount_close() will be called exactly when the last normal reference to vfsmount is gone. quota handling is even simpler - we don't use normal file IO anymore, so there's no need to hold vfsmounts at all. DQUOT_OFF() is done from deactivate_super(), where it really belongs. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 | [PATCH] name_to_dev_t warning fix | Andrew Morton
kernel/power/disk.c needs a declaration of name_to_dev_t() in scope. mount.h seems like an appropriate choice. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 | [PATCH] namespace: rename _mntput to mntput_no_expire | Miklos Szeredi
This patch renames _mntput() to something a little more descriptive: mntput_no_expire(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 | [PATCH] namespace: rename mnt_fslink to mnt_expire | Miklos Szeredi
This patch renames vfsmount->mnt_fslink to something a little more descriptive: vfsmount->mnt_expire. Signed-off-by: Mike Waychison <michael.waychison@sun.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-11-18 | [PATCH] linux/mount.h: add atomic.h and spinlock.h #includes | Roland Dreier
<linux/mount.h> uses atomic_t and spinlock_t, but doesn't include either <asm/atomic.h> or <linux/spinlock.h>, which means that any users of <linux/mount.h> have to include them. This patch adds the necessary #includes to avoid this. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-07-10 | [PATCH] intrinsic automount and mountpoint degradation support | David Howells
Here's a patch that I worked out with Al Viro that adds support for a filesystem (such as kAFS) to perform automounting intrinsically without the need for a userspace daemon. It also adds support for such mountpoints to be degraded at the filesystem's behest until they've been untouched long enough that they'll be removed. I've a patch (to follow) that removes some #ifdef's from fs/afs/* thus allowing it to make use of this facility. There are five pieces to this:
(1) Any interested filesystem needs to have at least one list to which expirable mountpoints can be added. Access to this list is governed by the vfsmount_lock.
(2) When a filesystem wants to create an expirable mount, it calls do_kern_mount() to get a handle on the filesystem it wants mounting, and then calls do_add_mount() to mount that filesystem on the designated mountpoint, supplying the list mentioned in (1) to which the vfsmount will be added. In kAFS's case, the mountpoint is a directory with a follow_link() method defined (fs/afs/mntpt.c). This uses the struct nameidata supplied as an argument as a determination of where the new filesystem should be mounted.
(3) When something using a vfsmount finishes dealing with it, it calls mntput(). This unmarks the vfsmount for immediate expiry. There are two criteria for determining if a vfsmount may be expired - it mustn't be marked as in use for anything other than being a child of another vfsmount, and it must have an expiry mark against it already.
(4) The filesystem then determines the policy on expiring the mounts created in (2). When it feels the need to, it passes the list mentioned in (1) to mark_mounts_for_expiry() to request everything on the list be expired. This function examines each mount listed. If the vfsmount meets the criteria mentioned in (3), then the vfsmount is deleted from the namespace and disposed of as for unmounting; otherwise the vfsmount is left untouched apart from now bearing an expiration mark if it didn't before. kAFS's expiration policy is simply to invoke this process at regular intervals for all the mounts on its list.
(5) An expiration facility is also provided to userspace: by calling umount() with a MNT_EXPIRE flag, it can make a request to unmount only if the mountpoint hasn't been used since the last request and isn't in use now. This allows expiration to be driven by userspace instead of by the kernel if that is desirable. This also means that do_umount() has to use a different version of path_release() to everyone else... it can't call mntput() as that clears the expiration flag, thus rendering this unachievable; so its version of path_release() calls _mntput(), which doesn't do the clear.
My original idea was to give the kernel more knowledge of automounted things. This avoids a certain problem with stat() on a mountpoint causing it to mount (for example, do "ls -l /afs" on a machine with kAFS), but Al wanted it done this way.
> Why is autofs unsuitable?
Because:
(1) Autofs is flat; AFS requires a tree - mounts on mounts on mounts on mounts...
(2) AFS holds the data as to what the mountpoints are and where they go, and these may be cross-links to subtrees beyond your control. It's also not trivial to extract a list of mountpoints as is required for autofs.
(3) Autofs is not namespace safe.
(4) Ducking back to userspace to get that to do the mount is pretty tricky if namespaces are involved.
In fact, autofs may well want to make use of this facility. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
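The mark-then-expire protocol from pieces (3) and (4) can be modelled in a few lines of C. This is a toy sketch: the struct, its fields, and the demo_ function names are hypothetical, an array stands in for the kernel list, and "expiring" is reduced to setting a flag.

```c
struct demo_exp_mount {
    int in_use;       /* nonzero while held by anything other than its parent */
    int expiry_mark;  /* set by one expiry pass, cleared by any use */
    int expired;      /* set once the mount has been removed */
};

/* Finishing a use of the mount clears any pending expiry mark. */
static void demo_mntput(struct demo_exp_mount *m)
{
    m->expiry_mark = 0;
}

/* One expiry pass: expire mounts that are idle and were already marked. */
static void demo_mark_mounts_for_expiry(struct demo_exp_mount *list, int n)
{
    for (int i = 0; i < n; i++) {
        struct demo_exp_mount *m = &list[i];
        if (m->expired)
            continue;
        if (!m->in_use && m->expiry_mark)
            m->expired = 1;      /* untouched since the previous pass */
        else
            m->expiry_mark = 1;  /* mark it and give it one more interval */
    }
}
```

A mount therefore survives as long as something touches it at least once between two consecutive passes, which matches the "hasn't been used since the last request" wording for MNT_EXPIRE.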
2003-07-10 | [PATCH] separate locking for vfsmounts | Andrew Morton
From: Maneesh Soni <maneesh@in.ibm.com>
While path walking we do follow_mount or follow_down which uses dcache_lock for serialisation. vfsmount related operations also use dcache_lock for all updates. I think we can use a separate lock for vfsmount related work and can improve path walking. The following two patches do the same. The first one replaces dcache_lock with new vfsmount_lock in namespace.c. The lock is local to namespace.c and is not required outside. The second patch uses RCU to have lock free lookup_mnt(). The patches are quite simple and straightforward.
The lockmeter results show reduced contention, and lock acquisitions for dcache_lock while running dcachebench* on a 4-way SMP box
  SPINLOCKS                    HOLD             WAIT
                 UTIL   CON    MEAN(  MAX )     MEAN(  MAX )(% CPU)      TOTAL  NOWAIT   SPIN  RJECT  NAME
  baselkm-2569:  20.7%  20.9%  0.5us( 146us)    2.9us( 144us)(0.81%)  31590840   79.1%  20.9%     0%  dcache_lock
  mntlkm-2569:   14.3%  13.6%  0.4us( 170us)    2.9us( 187us)(0.42%)  23071746   86.4%  13.6%     0%  dcache_lock
We get more than 8% improvement on 4-way SMP and 44% improvement on 16-way NUMAQ while running dcachebench*.
                          Average (usecs/iteration)   Std. Deviation
                          (lower is better)
  4-way SMP
    2.5.69                       15739.3                  470.90
    2.5.69-mnt                   14459.6                  298.51
  16-way NUMAQ
    2.5.69                      120426.5                  363.78
    2.5.69-mnt                   63225.8                  427.60
*dcachebench is a microbenchmark written by Bill Hartner and is available at http://www-124.ibm.com/developerworks/opensource/linuxperf/dcachebench/dcachebench.html
vfsmount_lock.patch
-------------------
- Patch for replacing dcache_lock with new vfsmount_lock for all mount related operation. This removes the need to take dcache_lock while doing follow_mount or follow_down operations in path walking.
I re-ran dcachebench with 2.5.70 as base on 16-way NUMAQ box.
                          Average (usecs/iteration)   Std. Deviation
                          (lower is better)
  16-way NUMAQ
    2.5.70                      120710.9                  230.67
    + vfsmount_lock.patch        65209.6                  242.97
    + lookup_mnt-rcu.patch       64042.3                  416.61
So just the lock splitting (vfsmount_lock.patch) gives almost similar benefits
2003-05-25 | [PATCH] change get_sb prototype | Andries E. Brouwer
(i) The prototypes for free_vfsmnt(), alloc_vfsmnt(), do_kern_mount() so far occurred in several individual c files. Now they are in <linux/mount.h>.
(ii) do_kern_mount() has a third argument name that is typically a constant. It is called with "rootfs", "nfsd", type->name, "capifs", "usbdevfs", "binfmt_misc" etc. So, it should have a prototype that expresses this:
    do_kern_mount(const char *fstype, int flags, const char *name, void *data);
This makes the ugly cast
    - return do_kern_mount(type->name, 0, (char *)type->name, NULL);
    + return do_kern_mount(type->name, 0, type->name, NULL);
go away. Now do_kern_mount() calls type->get_sb(), so also get_sb() must have a const third argument. That is what the patch below does. If I am not mistaken, precisely two filesystems do not treat this argument as a constant, namely afs and cifs. A separate patch gives some cleanup there.
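The effect of the const-qualified prototype can be shown with a small stand-alone example. The function demo_kern_mount and the struct are hypothetical stand-ins that mimic the parameter list quoted above; the body (a string comparison) is invented purely so the sketch compiles and does something observable.

```c
#include <string.h>

struct demo_fs_type {
    const char *name;  /* filesystem type name, never modified */
};

/* Hypothetical stand-in with const-qualified string parameters. */
static int demo_kern_mount(const char *fstype, int flags,
                           const char *name, void *data)
{
    (void)flags;
    (void)data;
    /* The strings are only read, so const states the real contract. */
    return strcmp(fstype, name) == 0;
}

/* The caller passes type->name twice and needs no (char *) cast. */
static int demo_mount_by_type(const struct demo_fs_type *type)
{
    return demo_kern_mount(type->name, 0, type->name, NULL);
}
```

With a non-const third parameter, the second type->name would need a cast that discards the qualifier, which is exactly the line the patch removes.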
2002-05-22 | Fix build fallout from namei.h/jiffies.h changes. | David S. Miller
- Include dcache.h/namei.h in fs/autofs/autofs_i.h not dirhash.c
- Include list.h and spinlock.h in dcache.h
- Include list.h in mount.h and namei.h
2002-02-04 | v2.4.10.4 -> v2.4.10.5 | Linus Torvalds
- Keith Owens: module exporting error checking
- Greg KH: USB update
- Paul Mackerras: clean up wait_init_idle(), ppc prefetch macros
- Jan Kara: quota fixes
- Abraham vd Merwe: agpgart support for Intel 830M
- Jakub Jelinek: ELF loader cleanups
- Al Viro: more cleanups
- David Miller: sparc64 fix, netfilter fixes
- me: tweak resurrected oom handling
2002-02-04 | v2.4.9.8 -> v2.4.9.9 | Linus Torvalds
- Greg KH: start migration to new "min()/max()"
- Roman Zippel: move affs over to "min()/max()".
- Vojtech Pavlik: VIA update (make sure not to IRQ-unmask a vt82c576)
- Jan Kara: quota bug-fix (don't decrement quota for non-counted inode)
- Anton Altaparmakov: more NTFS updates
- Al Viro: make nosuid/noexec/nodev be per-mount flags, not per-filesystem
- Alan Cox: merge input/joystick layer differences, driver and alpha merge
- Keith Owens: scsi Makefile cleanup
- Trond Myklebust: fix oopsable race in locking code
- Jean Tourrilhes: IrDA update
2002-02-04 | v2.4.7.7 -> v2.4.7.8 | Linus Torvalds
- Jeff Hartmann: serverworks AGP gart unload memory leak fix
- Marcelo Tosatti: make zone_inactive_shortage() return how big the shortage is.
- Hugh Dickins: tidy up age_page_down()
- Al Viro: super block handling cleanups
2002-02-04 | v2.4.5.1 -> v2.4.5.2 | Linus Torvalds
- Takanori Kawano: brlock indexing bugfix
- Ingo Molnar, Jeff Garzik: softirq updates and fixes
- Al Viro: rampage of superblock cleanups.
- Jean Tourrilhes: Orinoco driver update v6, IrNET update
- Trond Myklebust: NFS brown-paper-bag thing
- Tim Waugh: parport update
- David Miller: networking and sparc updates
- Jes Sorensen: m68k update.
- Ben Fennema: UDF update
- Geert Uytterhoeven: fbdev logo updates
- Willem Riede: osst driver updates
- Paul Mackerras: PPC update
- Marcelo Tosatti: unlazy swap cache
- Mikulas Patocka: hpfs update
2002-02-04 | v2.4.4.6 -> v2.4.5 | Linus Torvalds
- Alan Cox: camera conversion missed parts
- Neil Brown: md graceful alloc failure
- Andrea Arkangeli: more alpha fixups, bounce buffer deadlock avoidance
- Adam Fritzler: tms380tr driver update
- Al Viro: VFS layer cleanups