| author | Andrew Morton <akpm@osdl.org> | 2004-05-22 08:02:36 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2004-05-22 08:02:36 -0700 |
| commit | c08689623e4e807489a2727b5362a75c11ce6342 | |
| tree | 0d416df9aad86cd79831f4727814c53ab0ed75e1 /kernel/fork.c | |
| parent | 71a1874542dd939ae1505b336b90cc6b6e95bd2d | |
[PATCH] Convert i_shared_sem back to a spinlock
Having a semaphore in there causes modest performance regressions on heavily
mmap-intensive workloads on some hardware. Specifically, up to 30% in SDET on
NUMAQ and big PPC64.
So switch it back to being a spinlock. This does mean that unmap_vmas() needs
to be told whether or not it is allowed to schedule away; that's simple to do
via the zap_details structure.
This change means that there will be high scheduling latencies when someone
truncates a large file which is currently mmapped, but nobody does that
anyway. The scheduling points in unmap_vmas() are mainly for munmap() and
exit(), and they will still work OK for that.
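As a rough illustration of the mechanism described above, below is a minimal, self-contained C sketch (userspace, not kernel code) of how a caller-supplied flag in zap_details can tell unmap_vmas() whether it may schedule away. The `atomic` field name comes from the patch discussion; the chunked loop, the stub cond_resched() and everything else are illustrative assumptions, not the kernel's actual structures or signatures.

```c
#include <stddef.h>

/* Illustrative stand-in for the kernel's zap_details; only the flag matters here. */
struct zap_details {
	int atomic;	/* nonzero: caller holds a spinlock, so we must not sleep */
	/* ... truncate bookkeeping omitted ... */
};

/* Stand-in for the kernel's cond_resched(). */
static void cond_resched(void) { }

/* Zap a range in fixed-size chunks, rescheduling between chunks only when allowed. */
static void unmap_vmas(struct zap_details *details, size_t nr_chunks)
{
	for (size_t i = 0; i < nr_chunks; i++) {
		/* ... unmap one chunk of page tables here ... */
		if (!details || !details->atomic)
			cond_resched();	/* scheduling point kept for sleepable callers */
	}
}

int main(void)
{
	struct zap_details truncate = { .atomic = 1 };

	unmap_vmas(&truncate, 4);	/* truncate path: i_mmap_lock held, never sleeps */
	unmap_vmas(NULL, 4);		/* munmap()/exit() path: may reschedule */
	return 0;
}
```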
From: Hugh Dickins <hugh@veritas.com>
Sorry, my premature optimizations (trying to pass down NULL zap_details
except when needed) have caught you out doubly: unmap_mapping_range_list was
NULLing the details even though atomic was set; and if it hadn't, then
zap_pte_range would have missed free_swap_and_cache and pte_clear when pte
not present. Moved the optimization into zap_pte_range itself. Plus
massive documentation update.
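To make this point concrete, here is a hedged, userspace-only model of why the details-based skipping has to live inside zap_pte_range(): non-present (swap) entries must still see free_swap_and_cache() and pte_clear() even when a non-NULL details is passed. The struct layouts, the check_mapping field and the page_matches() helper are hypothetical simplifications, not the real kernel API.

```c
#include <stdbool.h>
#include <stdio.h>

struct pte { bool present; bool swap_entry; };

/* Hypothetical simplification of zap_details: only zap pages of this mapping. */
struct zap_details { const void *check_mapping; };

static void free_swap_and_cache(struct pte *pte) { pte->swap_entry = false; }
static void pte_clear(struct pte *pte) { pte->present = false; pte->swap_entry = false; }

/* Hypothetical helper: does this present page belong to the mapping being zapped? */
static bool page_matches(const struct pte *pte, const struct zap_details *details)
{
	(void)pte; (void)details;
	return true;
}

static void zap_pte_range(struct pte *ptes, int n, const struct zap_details *details)
{
	for (int i = 0; i < n; i++) {
		struct pte *pte = &ptes[i];

		if (pte->present) {
			/* The optimization lives here: details is consulted only for
			 * present pages, and uninteresting ones are skipped. */
			if (details && details->check_mapping && !page_matches(pte, details))
				continue;
			pte_clear(pte);
			continue;
		}
		/* Non-present entries are handled even when details is non-NULL --
		 * the free_swap_and_cache()/pte_clear() the earlier version missed. */
		if (pte->swap_entry)
			free_swap_and_cache(pte);
		pte_clear(pte);
	}
}

int main(void)
{
	struct pte ptes[] = { { .present = true }, { .swap_entry = true }, { 0 } };
	struct zap_details details = { .check_mapping = ptes };

	zap_pte_range(ptes, 3, &details);
	printf("swap entry cleaned up: %s\n", ptes[1].swap_entry ? "no" : "yes");
	return 0;
}
```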
From: Hugh Dickins <hugh@veritas.com>
Here's a second patch to add to the first: mremap's cows can't come home
without releasing the i_mmap_lock, better move the whole "Subtle point"
locking from move_vma into move_page_tables. And it's possible for the file
that was behind an anonymous page to be truncated while we drop that lock;
we don't want to abort mremap because of VM_FAULT_SIGBUS.
(Eek, should we be checking do_swap_page of a vm_file area against the
truncate_count sequence? Technically yes, but I doubt we need bother.)
- We cannot hold i_mmap_lock across move_one_page() because
move_one_page() needs to perform __GFP_WAIT allocations of pagetable pages.
- Move the cond_resched() out so we test it once per page rather than only
when move_one_page() returns -EAGAIN (see the sketch below).
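The two bullets above imply the loop shape sketched below: the caller never holds i_mmap_lock across move_one_page(), whose page-table allocation may sleep, and cond_resched() is tested once per page rather than only when move_one_page() returns -EAGAIN. This is a hedged userspace sketch; the stubs, the retry handling, and the assumption that the locking sits inside move_one_page() around just the pte copy are illustrative, not the actual mm/mremap.c code.

```c
#include <errno.h>
#include <stdio.h>

#define PAGE_SIZE 4096UL

static void cond_resched(void) { /* the kernel might schedule away here */ }

/*
 * Assumed to take and drop mapping->i_mmap_lock internally, around just the
 * pte copy, because its __GFP_WAIT page-table allocation may sleep -- so the
 * caller never holds the spinlock across this call.
 * Returns 0 on success or -EAGAIN if the caller should retry this page.
 */
static int move_one_page(unsigned long old_addr, unsigned long new_addr)
{
	(void)old_addr;
	(void)new_addr;
	return 0;
}

static unsigned long move_page_tables(unsigned long old_addr,
				      unsigned long new_addr,
				      unsigned long len)
{
	unsigned long offset;

	for (offset = 0; offset < len; offset += PAGE_SIZE) {
		if (move_one_page(old_addr + offset, new_addr + offset) == -EAGAIN)
			offset -= PAGE_SIZE;	/* retry this page next time round */
		cond_resched();			/* tested once per page, not only on -EAGAIN */
	}
	return offset;
}

int main(void)
{
	printf("copied %lu bytes of ptes\n",
	       move_page_tables(0x100000UL, 0x900000UL, 4 * PAGE_SIZE));
	return 0;
}
```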
Diffstat (limited to 'kernel/fork.c')
| mode | path | changes |
|---|---|---|
| -rw-r--r-- | kernel/fork.c | 4 |

1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 47b8f8e6f787..3cb3bc41b0c0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -324,9 +324,9 @@ static inline int dup_mmap(struct mm_struct * mm, struct mm_struct * oldmm)
 				atomic_dec(&inode->i_writecount);
 
 			/* insert tmp into the share list, just after mpnt */
-			down(&file->f_mapping->i_shared_sem);
+			spin_lock(&file->f_mapping->i_mmap_lock);
 			list_add(&tmp->shared, &mpnt->shared);
-			up(&file->f_mapping->i_shared_sem);
+			spin_unlock(&file->f_mapping->i_mmap_lock);
 		}
 
 		/*
