summaryrefslogtreecommitdiff
path: root/include/linux/dcache.h
AgeCommit message (Collapse)Author
2006-10-11[PATCH] VFS: Destroy the dentries contributed by a superblock on unmountingDavid Howells
The attached patch destroys all the dentries attached to a superblock in one go by: (1) Destroying the tree rooted at s_root. (2) Destroying every entry in the anon list, one at a time. (3) Each entry in the anon list has its subtree consumed from the leaves inwards. This reduces the amount of work generic_shutdown_super() does, and avoids iterating through the dentry_unused list. Note that locking is almost entirely absent in the shrink_dcache_for_umount*() functions added by this patch. This is because: (1) at the point the filesystem calls generic_shutdown_super(), it is not permitted to further touch the superblock's set of dentries, and nor may it remove aliases from inodes; (2) the dcache memory shrinker now skips dentries that are being unmounted; and (3) the superblock no longer has any external references through which the VFS can reach it. Given these points, the only locking we need to do is when we remove dentries from the unused list and the name hashes, which we do a directory's worth at a time. We also don't need to guard against reference counts going to zero unexpectedly and removing bits of the tree we're working on as nothing else can call dput(). A cut down version of dentry_iput() has been folded into shrink_dcache_for_umount_subtree() function. Apart from not needing to unlock things, it also doesn't need to check for inotify watches. In this version of the patch, the complaint about a dentry still being in use has been expanded from a single BUG_ON() and now gives much more information. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: NeilBrown <neilb@suse.de> Acked-by: Ian Kent <raven@themaw.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-22NFS: Add dentry materialisation opDavid Howells
The attached patch adds a new directory cache management function that prepares a disconnected anonymous function to be connected into the dentry tree. The anonymous dentry is transferred the name and parentage from another dentry. The following changes were made in [try #2]: (*) d_materialise_dentry() now switches the parentage of the two nodes around correctly when one or other of them is self-referential. The following changes were made in [try #7]: (*) d_instantiate_unique() has had the interior part split out as function __d_instantiate_unique(). Callers of this latter function must be holding the appropriate locks. (*) _d_rehash() has been added as a wrapper around __d_rehash() to call it with the most obvious hash list (the one from the name). d_rehash() now calls _d_rehash(). (*) d_materialise_dentry() is now __d_materialise_dentry() and is static. (*) d_materialise_unique() added to perform the combination of d_find_alias(), d_materialise_dentry() and d_add_unique() that the NFS client was doing twice, all within a single dcache_lock critical section. This reduces the number of times two different spinlocks were being accessed. The following further changes were made: (*) Add the dentries onto their parents d_subdirs lists. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-03[PATCH] lockdep: annotate dcacheIngo Molnar
Teach special (recursive) locking code to the lock validator. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] VFS: Permit filesystem to override root dentry on mountDavid Howells
Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] Fix dcache race during umountNeilBrown
The race is that the shrink_dcache_memory shrinker could get called while a filesystem is being unmounted, and could try to prune a dentry belonging to that filesystem. If it does, then it will call in to iput on the inode while the dentry is no longer able to be found by the umounting process. If iput takes a while, generic_shutdown_super could get all the way though shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without ever waiting on this particular inode. Eventually the superblock gets freed anyway and if the iput tried to touch it (which some filesystems certainly do), it will lose. The promised "Self-destruct in 5 seconds" doesn't lead to a nice day. The race is closed by holding s_umount while calling prune_one_dentry on someone else's dentry. As a down_read_trylock is used, shrink_dcache_memory will no longer try to prune the dentry of a filesystem that is being unmounted, and unmount will not be able to start until any such active prune_one_dentry completes. This requires that prune_dcache *knows* which filesystem (if any) it is doing the prune on behalf of so that it can be careful of other filesystems. shrink_dcache_memory isn't called it on behalf of any filesystem, and so is careful of everything. shrink_dcache_anon is now passed a super_block rather than the s_anon list out of the superblock, so it can get the s_anon list itself, and can pass the superblock down to prune_dcache. If prune_dcache finds a dentry that it cannot free, it leaves it where it is (at the tail of the list) and exits, on the assumption that some other thread will be removing that dentry soon. To try to make sure that some work gets done, a limited number of dnetries which are untouchable are skipped over while choosing the dentry to work on. I believe this race was first found by Kirill Korotaev. Cc: Jan Blunck <jblunck@suse.de> Acked-by: Kirill Korotaev <dev@openvz.org> Cc: Olaf Hering <olh@suse.de> Acked-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31[PATCH] dcache: Add helper d_hash_and_lookupEric W. Biederman
It is very common to hash a dentry and then to call lookup. If we take fs specific hash functions into account the full hash logic can get ugly. Further full_name_hash as an inline function is almost 100 bytes on x86 so having a non-inline choice in some cases can measurably decrease code size. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25[PATCH] inotify: lock avoidance with parent watch status in dentryNick Piggin
Previous inotify work avoidance is good when inotify is completely unused, but it breaks down if even a single watch is in place anywhere in the system. Robin Holt notices that udev is one such culprit - it slows down a 512-thread application on a 512 CPU system from 6 seconds to 22 minutes. Solve this by adding a flag in the dentry that tells inotify whether or not its parent inode has a watch on it. Event queueing to parent will skip taking locks if this flag is cleared. Setting and clearing of this flag on all child dentries versus event delivery: this is no in terms of race cases, and that was shown to be equivalent to always performing the check. The essential behaviour is that activity occuring _after_ a watch has been added and _before_ it has been removed, will generate events. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07[PATCH] remove bogus asm/bug.h includes.Al Viro
A bunch of asm/bug.h includes are both not needed (since it will get pulled anyway) and bogus (since they are done too early). Removed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-03[PATCH] make "struct d_cookie" depend on CONFIG_PROFILINGMarcelo Tosatti
Shrinks "struct dentry" from 128 bytes to 124 on x86, allowing 31 objects per slab instead of 30. Cc: John Levon <levon@movementarian.org> Cc: Philippe Elie <phil.el@wanadoo.fr> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] shrink dentry structEric Dumazet
Some long time ago, dentry struct was carefully tuned so that on 32 bits UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple of memory cache lines. Then RCU was added and dentry struct enlarged by two pointers, with nice results for SMP, but not so good on UP, because breaking the above tuning (128 + 8 = 136 bytes) This patch reverts this unwanted side effect, by using an union (d_u), where d_rcu and d_child are placed so that these two fields can share their memory needs. At the time d_free() is called (and d_rcu is really used), d_child is known to be empty and not touched by the dentry freeing. Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so the previous content of d_child is not needed if said dentry was unhashed but still accessed by a CPU because of RCU constraints) As dentry cache easily contains millions of entries, a size reduction is worth the extra complexity of the ugly C union. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Ian Kent <raven@themaw.net> Cc: Paul Jackson <pj@sgi.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07[PATCH] shared mounts handling: umountRam Pai
An unmount of a mount creates a umount event on the parent. If the parent is a shared mount, it gets propagated to all mounts in the peer group. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07[PATCH] struct dentry: place d_hash close to d_parent and d_name to speedup ↵Eric Dumazet
lookups dentry cache uses sophisticated RCU technology (and prefetching if available) but touches 2 cache lines per dentry during hlist lookup. This patch moves d_hash in the same cache line than d_parent and d_name fields so that : 1) One cache line is needed instead of two. 2) the hlist_for_each_rcu() prefetching has a chance to bring all the needed data in advance, not only the part that includes d_hash.next. I also changed one old comment that was wrong for 64bits. A further optimisation would be to separate dentry in two parts, one that is mostly read, and one writen (d_count/d_lock) to avoid false sharing on SMP/NUMA but this would need different field placement depending on 32bits or 64bits platform. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-07[PATCH] d_drop should use per dentry lockJan Blunck
d_drop() must use the dentry->d_lock spinlock. In some cases __d_drop() was used without holding the dentry->d_lock spinlock, too. This could end in a race with __d_lookup(). Signed-off-by: Jan Blunck <j.blunck@tu-harburg.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-04 VFS: Avoid dentry aliasing problems in filesystems like NFS, whereTrond Myklebust
inodes may be marked as stale in one instance (causing the dentry to be dropped) then re-enabled in the next instance. Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
2004-10-18[PATCH] Remove d_bucketDipankar Sarma
Tested using dcachebench and hevy rename test. http://lse.sourceforge.net/locking/dcache/rename_test/ While going over dcache code, I realized that d_bucket which was introduced to prevent hash chain traversals from going into an infinite loop earlier, is no longer necessary. Originally, when RCU based lock-free lookup was first introduced, dcache hash chains used list_head. Hash chain traversal was terminated when dentry->next reaches the list_head in the hash bucket. However, if renames happen during a lock-free lookup, a dentry may move to different bucket and subsequent hash chain traversal from there onwards may not see the list_head in the original bucket at all. In fact, this would result in the list_head in the bucket interpreted as a list_head in dentry and bad things will happen after that. Once hlist based hash chains were introduced in dcache, the termination condition changed and lock-free traversal would be safe with NULL pointer based termination of hlists. This means that d_bucket check is no longer required. There still exist some theoritical livelocks like a dentry getting continuously moving and lock-free look-up never terminating. But that isn't really any worse that what we have. In return for these changes, we reduce the dentry size by the size of a pointer. That should make akpm and mpm happy. Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23[PATCH] reduce size of struct dentry on 64bitAnton Blanchard
Reduce size of struct dentry from 248 to 232 bytes on 64bit. - Reduce size of qstr by 8 bytes, placing int hash and int len together. We gain a further 4 byte saving when qstr is used in struct dentry since qstr goes from 24 to 16 bytes and the next member (d_lru) requires 8 byte alignment (which means 4 bytes of padding). - Move d_mounted to the end, since char d_iname[] only requires 1 byte alignment. This reduces struct dentry by another 4 bytes. With these changes the number of objects we can fit into a 4kB slab goes from 16 to 17 on ppc64. Note the above assumes the architecture naturally aligns types. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-23[PATCH] vm: vfs shrinkage tuningAndrew Morton
Some people want the dentry and inode caches shrink harder, others want them shrunk more reluctantly. The patch adds /proc/sys/vm/vfs_cache_pressure, which tunes the vfs cache versus pagecache scanning pressure. - at vfs_cache_pressure=0 we don't shrink dcache and icache at all. - at vfs_cache_pressure=100 there is no change in behaviour. - at vfs_cache_pressure > 100 we reclaim dentries and inodes harder. The number of megabytes of slab left after a slocate.cron on my 256MB test box: vfs_cache_pressure=100000 33480 vfs_cache_pressure=10000 61996 vfs_cache_pressure=1000 104056 vfs_cache_pressure=200 166340 vfs_cache_pressure=100 190200 vfs_cache_pressure=50 206168 Of course, this just left more directory and inode pagecache behind instead of vfs cache. Interestingly, on this machine the entire slocate run fits into pagecache, but not into VFS caches. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-19[PATCH] dentry size tuningAndrew Morton
Experimenting with various values of DENTRY_STORAGE dentry size objs/slab dentry size * objs/slab inline string 148 26 3848 32 152 26 3952 36 156 25 3900 40 160 24 4000 44 We're currently at 160. The patch fairly arbitrarily takes it down to 152, so we can fit a 35-char name into the inline part of the dentry. Also, go back to the old way of sizing d_iname so that any arch-specific compiler-forced alignemnts are honoured.
2004-05-14[PATCH] dentry layout tweaksAndrew Morton
Lookup typically touches three fields of the dentry: d_bucket, d_name.hash and d_parent. Change the layout of things so that these will always be in the same cacheline.
2004-05-14[PATCH] more dentry shrinkageAndrew Morton
- d_vfs_flags can be removed - just use d_flags. All modifications of dentry->d_flags are under dentry->d_lock. On x86 this takes the internal string size up to 40 bytes. The internal/external ratio on my 1.5M files hits 96%.
2004-05-14[PATCH] dentry qstr consolidationAndrew Morton
When dentries are given an external name we currently allocate an entire qstr for the external name. This isn't needed. We can use the internal qstr and kmalloc only the string itself. This saves 12 bytes from externally-allocated names and 4 bytes from the dentry itself. The saving of 4 bytes from the dentry doesn't actually decrease the dentry's storage requirements, but it makes four more bytes available for internal names, taking the internal/external ratio from 89% up to 93% on my 1.5M files. Fix: The qstr consolidation wasn't quite right, because it can cause qstr->len to be unstable during lookup lockless traverasl. Fix that up by taking d_lock earlier in lookup. This serialises against d_move. Take the lock after comparing the parent and hash to preserve the mostly-lockless behaviour. This obsoletes d_movecount, which is removed.
2004-05-14[PATCH] dentry shrinkageAndrew Morton
Rework dentries so that the inline name length is between 31 and 48 bytes. On SMP P4-compiled x86 each dentry consumes 160 bytes (24 per page). Here's the histogram of name lengths on all 1.5M files on my workstation: 1: 0% 2: 0% 3: 1% 4: 5% 5: 8% 6: 13% 7: 19% 8: 26% 9: 33% 10: 42% 11: 49% 12: 55% 13: 60% 14: 64% 15: 67% 16: 69% 17: 71% 18: 73% 19: 75% 20: 76% 21: 78% 22: 79% 23: 80% 24: 81% 25: 82% 26: 83% 27: 85% 28: 86% 29: 87% 30: 88% 31: 89% 32: 90% 33: 91% 34: 92% 35: 93% 36: 94% 37: 95% 38: 96% 39: 96% 40: 96% 41: 96% 42: 96% 43: 96% 44: 97% 45: 97% 46: 97% 47: 97% 48: 97% 49: 98% 50: 98% 51: 98% 52: 98% 53: 98% 54: 98% 55: 98% 56: 98% 57: 98% 58: 98% 59: 98% 60: 99% 61: 99% 62: 99% 63: 99% 64: 99% So on x86 we'll fit 89% of filenames into the inline name. The patch also removes the NAME_ALLOC_LEN() rounding-up of the storage for the out-of-line names. That seems unnecessary.
2004-05-08Waste less memory in dentries.Linus Torvalds
We don't bother aligining them on a cacheline boundary, since that is totally excessive in some configurations (especially P4's with 128-byte cachelines). Instead, we make the minimum inline string size a bit longer, and re-order a few fields that allow for better packing on 64-bit architectures, for better memory utilization.
2004-01-19[PATCH] if ... BUG() -> BUG_ON()Andrew Morton
From: Adrian Bunk <bunk@fs.tum.de> four months ago, Rolf Eike Beer <eike-kernel@sf-tec.de> sent a patch against 2.6.0-test5-bk1 that converted several if ... BUG() to BUG_ON() This might in some cases result in slightly faster code because BUG_ON() uses unlikely().
2003-08-06[PATCH] export lookup_create()Andrew Morton
From: jbarnes@sgi.com (Jesse Barnes) hwgfs needs lookup_create(), and intermezzo already has copied it. Document it, export it to modules and fix intermezzo.
2003-07-03[PATCH] Add open intent information to the 'struct nameidata'Trond Myklebust
- Add open intent information to the 'struct nameidata'. - Pass the struct nameidata as an optional parameter to the lookup() inode operation. - Pass the struct nameidata as an optional parameter to the d_revalidate() dentry operation. - Make link_path_walk() set the LOOKUP_CONTINUE flag in nd->flags instead of passing it as an extra parameter to d_revalidate(). - Make open_namei(), and sys_uselib() set the open()/create() intent data.
2003-06-11Fix rcu list poisoning - since another CPU may be traversing the listLinus Torvalds
as we delete the entry, we can only poison the back pointer, not the traversal pointer (rcu traversal only ever walks forward). Make __d_drop() take this into account.
2003-06-10Fix __d_drop() to properly initialize the d_hash fields,Linus Torvalds
so that __d_drop() can safely be done multiple times on a dentry without corrupting other hash entries. Noticed by Trond Myklebust.
2003-04-20[PATCH] Fix and clean up DCACHE_REFERENCED usageAndrew Morton
From: Maneesh Soni <maneesh@in.ibm.com> This patch changes the way DCACHE_REFERENCED flag is used. It got messed up in dcache_rcu iterations. I hope this will be ok now. The flag was meant to be advisory flag which is used while prune_dcache() so as not to free dentries which have recently entered d_lru list. At first pass in prune_dcache the dentries marked DCACHE_REFERENCED are left with the flag reset. and they are freed in the next pass. So, now we mark the dentry as DCACHE_REFERENCED when it is first entering the d_lru list in dput() and resetthe flag in prune_dcache(). If the flag remains reset in the next call to prune_dcache(), the dentry is then freed. Also I don't think any file system have to use this flag as it is taken care by the dcache layer. The patch removes such code from a few of file systems. Moreover these filesystems were anyway doing worng thing as they were changing the flag out of dcache_lock. Changes: o dput() marks dentry DCACHE_REFERENCED when it is added to the dentry_unused list o no need to set the flag in dget, dget_locked, d_lookup as these guys anyway increments the ref count. o check the ref count in prune_dcache and use DCACHE_REFERENCED flag just for two stage aging. o remove code for setting DACACHE_REFERENCED from reiserfs, fat, xfs and exportfs.
2003-04-02[PATCH] remove dparent_lockAndrew Morton
The big SMP machines are seeing quite some contention in dnotify_parent() (via vfs_write). This function is hammering the global dparent_lock. However we don't actually need a global dparent_lock for pinning down dentry->d_parent. We can use dentry->d_lock for this. That is already being held across d_move. This patch speeds up SDET on the 16-way by 5% and wipes dnotify_parent() off the profiles. It also uninlines dnofity_parent(). It also uses spin_lock(), which is faster than read_lock(). I'm not sure that we need to take both the source and target dentry's d_lock in d_move. The patch also does lots of s/__inline__/inline/ in dcache.h
2003-04-02[PATCH] real_lookup race fixAndrew Morton
From: Maneesh Soni <maneesh@in.ibm.com> Here is a patch to use seqlock for real_lookup race with d_lookup as suggested by Linus. The race condition can result in duplicate dentry when d_lookup fails due concurrent d_move in some unrelated directory. Apart from real_lookup, lookup_hash()->cached_lookup() can also fail due to same reason. So, for that I am doing the d_lookup again. Now we have __d_lookup (called from do_lookup() during pathwalk) and d_lookup which uses seqlock to protect againt rename race. dcachebench numbers (lower is better) don't have much difference on a 4-way PIII xeon SMP box. base-2565 Average usec/iteration 19059.4 Standard Deviation 503.07 base-2565 + seq_lock Average usec/iteration 18843.2 Standard Deviation 450.57
2003-03-03Merge penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/hlistLinus Torvalds
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
2003-03-03[PATCH] dcache/inode hlist patchkitAndi Kleen
- Inode and dcache Hash table only needs half the memory/cache because of using hlists. - Simplify dcache-rcu code. With NULL end markers in the hlists is_bucket is not needed anymore. Also the list walking code generates better code on x86 now because it doesn't need to dedicate a register for the list head. - Reorganize struct dentry to be more cache friendly. All the state accessed for the hash walk is in one chunk now together with the inline name (all at the end) - Add prefetching for all the list walks. Old hash lookup code didn't use it. - Some other minor cleanup.
2003-02-24[PATCH] Check for zero d_count in dget()Andrew Morton
Patch from Maneesh Soni <maneesh@in.ibm.com> Turns out that sysfs is doing dget() on a zero-ref dentry. That's a bug, but dcache is no longer detecting it. The check was removed because with lockless d_lookup, there can be cases when d_lookup and dput are going on concurrently, If d_lookup happens earlier then it may do dget() on a dentry for which dput() has decremented the ref count to zero. This race is handled by taking the per dentry lock and checking the DCACHE_UNHASHED flag. The patch open-codes that part of d_lookup(), and restores the BUG check in dget().
2003-02-14[PATCH] dcache_rcuAndrew Morton
Patch from Maneesh Soni <maneesh@in.ibm.com>, Dipankar Sarma <dipankar@in.ibm.com> and probably others. This patch provides dcache_lock free d_lookup() using RCU. Al pointed races with d_move and lockfree d_lookup() while concurrent rename is going on. We tested this with a test doing million renames each in 50 threads on 50 different ramfs filesystems. And simultaneously running millions of "ls". The tests were done on 4-way SMP box. 1. Lookup going to a different bucket as the current dentry is moved to a different bucket due to rename. This is solved by having a list_head pointer in the dentry structure which points to the bucket head it belongs. The bucket pointer is updated when the dentry is added to the hash chain. Lookup checks if the current dentry belongs to a different bucket, the cached lookup is failed and real lookup will be done. This condition occured nearly about 100 times during the heavy_rename test. 2. Lookup has got the dentry it is looking and it is comparing various keys and meanwhile a rename operation moves the dentry. This is solved by using a per dentry counter (d_move_count) which is updated at the end of d_move. Lookup takes a snapshot of the d_move_count before comparing the keys and once the comparision succeeds, it takes the per dentry lock to check the d_move_count again. If move_count differs, then dentry is moved (or renamed) and the lookup is failed. 3. There can be a theoritical race when a dentry keeps coming back to original bucket due to double moves. Due to this lookup may consider that it has never moved and can end up in a infinite loop. This is solved by using a loop_counter which is compared with a approximate maximum number of dentries per bucket. This never got hit during the heavy_rename test. 4. There is one more change regarding the loop termintaion condition in d_lookup, now the next hash pointer is compared with the current dentries bucket pointer (is_bucket()). 5. memcmp() in d_lookup() can go out of bounds if name pointer and length fields are not consistent. For this we used a pointer to qstr to keep length and name pointer in one structre. We also tried solving these by using a rwlock but it could not compete with lockless solution.
2003-01-12[PATCH] use <asm/bug.h> for BUG() definesRussell King
This patch moves BUG() and PAGE_BUG() from asm/page.h into asm/bug.h. We also fix up linux/dcache.h, which included asm/page.h for the sole purpose of getting the BUG() definition. Since linux/kernel.h and linux/smp.h make use of BUG(), asm/bug.h is included there as well. In addition, linux/jbd.h did not contain a clear path with which to obtain the archtecture BUG() definition, but did contain its own definition.
2002-11-17[PATCH] dcache usage cleanupsManeesh Soni
This cleans up the dcache code to always use the proper dcache functions (d_unhashed and __d_drop) instead of accessing the dentry lists directly. In other words: use "d_unhashed(dentry)" instead of doing a manual "list_empty(&dentry->d_hash)" test. And use "__d_drop(dentry)" instead of doing "list_del_init(&dentry->d_hash)" by hand. This will help the dcache-rcu patches.
2002-11-16[PATCH] don't include mount.h in dcache.hChristoph Hellwig
Once again we only need a forward-declaration of struct vfsmount.
2002-11-15[PATCH] Remove d_path from sched.hMatthew Wilcox
This patch from William Lee Irwin III privatizes __d_path() to dcache.c, uninlines d_path(), moves its declaration to dcache.h, moves it to dcache.c, and exports d_path() instead of __d_path().
2002-11-03Misc Alpha compilation fixes.Richard Henderson
2002-10-31[PATCH] improved space efficiency in dcacheAndrew Morton
Currently we are storing filenames which are 16-chars or less inside struct dentry itself and then separately allocating larger names. But this leaves spare space in the dentry - the dentry slab cache is using cacheline alignment. In my build, struct dentry is 112 bytes so there are at least an additional 16 bytes in there. And the number of files which have names in the 16-32 char range will be significant. So Manfred's patch changes the dcache code to utilise _all_ the space between the last member of the dentry and the start of the next cacheline.
2002-10-15[PATCH] oprofile - dcookiesJohn Levon
This implements the persistent path-to-dcookies mapping, and adds a system call for the user-space profiler to look up the profile data, so it can tag profiles to specific binaries.
2002-10-13[PATCH] batched slab shrink and registration APIAndrew Morton
From Ed Tomlinson, then mauled by yours truly. The current shrinking of the dentry, inode and dquot caches seems to work OK, but it is slightly CPU-inefficient: we call the shrinking functions many times, for tiny numbers of objects. So here, we just batch that up - shrinking happens at the same rate but we perform it in larger units of work. To do this, we need a way of knowing how many objects are currently in use by individual caches. slab does not actually track this information, but the existing shrinkable caches do have this on hand. So rather than adding the counters to slab, we require that the shrinker callback functions keep their own count - we query that via the callback. We add a simple registration API which is exported to modules. A subsystem may register its own callback function via set_shrinker(). set_shrinker() simply takes a function pointer. The function is called with int (*shrinker)(int nr_to_shrink, unsigned int gfp_mask); The shrinker callback must scan `nr_to_scan' objects and free all freeable scanned objects. Note: it doesn't have to *free* `nr_to_scan' objects. It need only scan that many. Which is a fairly pedantic detail, really. The shrinker callback must return the number of objects which are in its cache at the end of the scanning attempt. It will be called with nr_to_scan == 0 when we're just querying the cache size. The set_shrinker() registration API is passed a hint as to how many disk seeks a single cache object is worth. Everything uses "2" at present. I saw no need to add the traditional `here is my void *data' to the registration/callback. Because there is a one-to-one relationship between caches and their shrinkers. Various cleanups became possible: - shrink_icache_memory() is no longer exported to modules. - shrink_icache_memory() is now static to fs/inode.c - prune_icache() is now static to fs/inode.c, and made inline (single caller) - shrink_dcache_memory() is made static to fs/dcache.c - prune_dcache() is no longer exported to modules - prune_dcache() is made static to fs/dcache.c - shrink_dqcache_memory() is made static to fs/dquot.c - All the quota init code has been moved from fs/dcache.c into fs/dquot.c - All modifications to inodes_stat.nr_inodes are now inside inode_lock - the dispose_list one was racy.
2002-10-07[PATCH] increase list_del_init usage.Dave Jones
2002-09-25[PATCH] slab reclaim balancingAndrew Morton
A patch from Ed Tomlinson which improves the way in which the kernel reclaims slab objects. The theory is: a cached object's usefulness is measured in terms of the number of disk seeks which it saves. Furthermore, we assume that one dentry or inode saves as many seeks as one pagecache page. So we reap slab objects at the same rate as we reclaim pages. For each 1% of reclaimed pagecache we reclaim 1% of slab. (Actually, we _scan_ 1% of slab for each 1% of scanned pages). Furthermore we assume that one swapout costs twice as many seeks as one pagecache page, and twice as many seeks as one slab object. So we double the pressure on slab when anonymous pages are being considered for eviction. The code works nicely, and smoothly. Possibly it does not shrink slab hard enough, but that is now very easy to tune up and down. It is just: ratio *= 3; in shrink_caches(). Slab caches no longer hold onto completely empty pages. Instead, pages are freed as soon as they have zero objects. This is possibly a performance hit for slabs which have constructors, but it's doubtful. Most allocations after a batch of frees are satisfied from inside internally-fragmented pages and by the time slab gets back onto using the wholly-empty pages they'll be cache-cold. slab would be better off going and requesting a new, cache-warm page and reconstructing the objects therein. (Once we have the per-cpu hot-page allocator in place. It's happening). As a consequence of the above, kmem_cache_shrink() is now unused. No great loss there - the serialising effect of kmem_cache_shrink and its semaphore in front of page reclaim was measurably bad. Still todo: - batch up the shrinking so we don't call into prune_dcache and friends at high frequency asking for a tiny number of objects. - Maybe expose the shrink ratio via a tunable. - clean up slab.c - highmem page reclaim in prune_icache: highmem pages can pin inodes.
2002-08-27[PATCH] rename zone_struct and zonelist_struct, kill zone_t andAndrew Morton
- Remove the zonelist_t typedef. Rename struct zonelist_struct to struct zonelist and use that everywhere. - Remove the zone_t typedef. Rename struct zone_struct to struct zone and use that everywhere.
2002-05-22Fix build fallout from namei.h/jiffies.h changes.David S. Miller
- Include dcache.h/namei.h in fs/autofs/autofs_i.h not dirhash.c - Include list.h and spinlock.h in dcache.h - Include list.h in mount.h and namei.h
2002-04-24[PATCH] FastWalk DcacheHanna V. Linder
Reduce cacheline bouncing when a dentry is in the cache. Specifically, the d_count reference counter is not incremented and decremented for every dentry in a path during path walking if the dentry is in the dcache. Execcisve atomic inc/dec's are expensive on SMP systems due to the cachline bouncing.
2002-04-15[PATCH] PATCH - Create "export_operations" interface for filesystems to describeNeil Brown
Create "export_operations" interface for filesystems to describe whether and how they should be exported. - add new field in struct super_block "s_export_op" to describe how a filesystem is exported (i.e. how filehandles are mapped to dentries). - New module: fs/exportfs for holding helper code for mapping between filehandles and dentries - Change nfsd to use new interface if it exists. - Change ext2 to provide new interface - Add documention to filesystems/Exporting If s_export_op isn't set, old mechanism still works, but it is planned to remove old method and only use s_export_op.
2002-04-15[PATCH] dcache changes for preparing for "export_operations" interface for ↵Neil Brown
nfsd to use. Prepare for new export_operations interface (for filehandle lookup): - define d_splice_alias and d_alloc_anon. - define shrink_dcache_anon for removing anonymous dentries - modify d_move to work with anonymous dentries (IS_ROOT dentries) - modify d_find_alias to avoid anonymous dentries where possible as d_splice_alias and d_alloc_anon use this - put in place infrastructure for s_anon allocation and cleaning - replace a piece of code that is in nfsfh, reiserfs and fat with a call to d_alloc_anon - Rename DCACHE_NFSD_DISCONNECTED to DCACHE_DISCONNECTED - Add documentation at Documentation/filesystems/Exporting