From 968f11a8688e1be78719154d05bcab061bbfde2b Mon Sep 17 00:00:00 2001
From: Jamie Lokier
Date: Thu, 4 Sep 2003 18:00:45 -0700
Subject: [PATCH] Unpinned futexes v2: indexing changes

This changes the way futexes are indexed, so that they don't pin pages.
It also fixes some bugs with private mappings and COW pages.

Currently, all futexes look up the page at the userspace address and pin
it, using the pair (page,offset) as an index into a table of waiting
futexes. Any page with a futex waiting on it remains pinned in RAM,
which is a problem when many futexes are used, especially with FUTEX_FD.

Another problem is that the page is not always the correct one, if it
can be changed later by a COW (copy on write) operation. This can happen
when waiting on a futex without writing to it after fork(), exec() or
mmap(), if the page is then written to before attempting to wake a futex
at the same address.

There are two symptoms of the COW problem:

  - The wrong process can receive wakeups.
  - A process can fail to receive required wakeups.

This patch fixes both by changing the indexing so that VM_SHARED
mappings use the triple (inode,offset,index), and private mappings use
the pair (mm,virtual_address).

The former correctly handles all shared mappings, including tmpfs and
therefore all kinds of shared memory (IPC shm, /dev/shm and
MAP_ANON|MAP_SHARED). This works because every mapping which is
VM_SHARED has an associated non-zero vma->vm_file, and hence inode.
(This is ensured in do_mmap_pgoff, where it calls shmem_zero_setup).

The latter handles all private mappings, both files and anonymous. It
isn't affected by COW, because it doesn't care about the actual pages,
just the virtual address.

The patch has a few bonuses:

    1. It removes the vcache implementation, as only futexes were
       using it, and they don't any more.

    2. Removing the vcache should make COW page faults a bit faster.

    3. Futex operations no longer take the page table lock, walk the
       page table, fault in pages that aren't mapped in the page table,
       or do a vcache hash lookup - they are mostly a simple offset
       calculation with one hash for the futex table. So they should be
       noticeably faster.

Special thanks to Hugh Dickins, Andrew Morton and Rusty Russell for
insightful feedback. All suggestions are included.
---
 include/linux/vcache.h | 26 --------------------------
 1 file changed, 26 deletions(-)
 delete mode 100644 include/linux/vcache.h

(limited to 'include/linux/vcache.h')

diff --git a/include/linux/vcache.h b/include/linux/vcache.h
deleted file mode 100644
index 5708fe6a908a..000000000000
--- a/include/linux/vcache.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/*
- * virtual => physical mapping cache support.
- */
-#ifndef _LINUX_VCACHE_H
-#define _LINUX_VCACHE_H
-
-typedef struct vcache_s {
-	unsigned long address;
-	struct mm_struct *mm;
-	struct list_head hash_entry;
-	void (*callback)(struct vcache_s *data, struct page *new_page);
-} vcache_t;
-
-extern spinlock_t vcache_lock;
-
-extern void __attach_vcache(vcache_t *vcache,
-		unsigned long address,
-		struct mm_struct *mm,
-		void (*callback)(struct vcache_s *data, struct page *new_page));
-
-extern void __detach_vcache(vcache_t *vcache);
-
-extern void invalidate_vcache(unsigned long address, struct mm_struct *mm,
-				struct page *new_page);
-
-#endif
-- 
cgit v1.2.3