user/sven/linux.git - Linux Kernel

Age	Commit message (Collapse)	Author
2008-08-01	[S390] move include/asm-s390 to arch/s390/include/asm	Martin Schwidefsky
	Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-02-09	[S390] dynamic page tables.	Martin Schwidefsky
	Add support for different number of page table levels dependent on the highest address used for a process. This will cause a 31 bit process to use a two level page table instead of the four level page table that is the default after the pud has been introduced. Likewise a normal 64 bit process will use three levels instead of four. Only if a process runs out of the 4 tera bytes which can be addressed with a three level page table the fourth level is dynamically added. Then the process can use up to 8 peta byte. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-02-09	[S390] Add four level page tables for CONFIG_64BIT=y.	Martin Schwidefsky
	Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-02-09	[S390] 1K/2K page table pages.	Martin Schwidefsky
	This patch implements 1K/2K page table pages for s390. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-02-08	CONFIG_HIGHPTE vs. sub-page page tables.	Martin Schwidefsky
	Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte ) - to be introduced with a later patch. For everybody else it will be a (struct page ). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05	add mm argument to pte/pmd/pud/pgd_free	Benjamin Herrenschmidt
	(with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22	[S390] 4level-fixup cleanup	Martin Schwidefsky
	Get independent from asm-generic/4level-fixup.h Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-10-22	[S390] Cleanup page table definitions.	Martin Schwidefsky
	- De-confuse the defines for the address-space-control-elements and the segment/region table entries. - Create out of line functions for page table allocation / freeing. - Simplify get_shadow_xxx functions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-10-22	[S390] tlb flush fix.	Martin Schwidefsky
	The current tlb flushing code for page table entries violates the s390 architecture in a small detail. The relevant section from the principles of operation (SA22-7832-02 page 3-47): "A valid table entry must not be changed while it is attached to any CPU and may be used for translation by that CPU except to (1) invalidate the entry by using INVALIDATE PAGE TABLE ENTRY or INVALIDATE DAT TABLE ENTRY, (2) alter bits 56-63 of a page-table entry, or (3) make a change by means of a COMPARE AND SWAP AND PURGE instruction that purges the TLB." That means if one thread of a multithreaded applciation uses a vma while another thread does an unmap on it, the page table entries of that vma needs to get removed with IPTE, IDTE or CSP. In some strange and rare situations a cpu could check-stop (die) because a entry has been pushed out of the TLB that is still needed to complete a (milli-coded) instruction. I've never seen it happen with the current code on any of the supported machines, so right now this is a theoretical problem. But I want to fix it nevertheless, to avoid headaches in the futures. To get this implemented correctly without changing common code the primitives ptep_get_and_clear, ptep_get_and_clear_full and ptep_set_wrprotect need to use the IPTE instruction to invalidate the pte before the new pte value gets stored. If IPTE is always used for the three primitives three important operations will have a performace hit: fork, mprotect and exit_mmap. Time for some workarounds: * 1: ptep_get_and_clear_full is used in unmap_vmas to remove page tables entries in a batched tlb gather operation. If the mmu_gather context passed to unmap_vmas has been started with full_mm_flush==1 or if only one cpu is online or if the only user of a mm_struct is the current process then the fullmm indication in the mmu_gather context is set to one. All TLBs for mm_struct are flushed by the tlb_gather_mmu call. No new TLBs can be created while the unmap is in progress. In this case ptep_get_and_clear_full clears the ptes with a simple store. * 2: ptep_get_and_clear is used in change_protection to clear the ptes from the page tables before they are reentered with the new access flags. At the end of the update flush_tlb_range clears the remaining TLBs. In general the ptep_get_and_clear has to issue IPTE for each pte and flush_tlb_range is a nop. But if there is only one user of the mm_struct then ptep_get_and_clear uses simple stores to do the update and flush_tlb_range will flush the TLBs. * 3: Similar to 2, ptep_set_wrprotect is used in copy_page_range for a fork to make all ptes of a cow mapping read-only. At the end of of copy_page_range dup_mmap will flush the TLBs with a call to flush_tlb_mm. Check for mm->mm_users and if there is only one user avoid using IPTE in ptep_set_wrprotect and let flush_tlb_mm clear the TLBs. Overall for single threaded programs the tlb flush code now performs better, for multi threaded programs it is slightly worse. In particular exit_mmap() now does a single IDTE for the mm and then just frees every page cache reference and every page table page directly without a delay over the mmu_gather structure. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-08-22	[S390] vmur: fix diag14 exceptions with addresses > 2GB.	Michael Holzheu
	There are several s390 diagnose calls, which must be executed below the 2GB memory boundary. In order to enforce this, those diagnoses must be compiled into the kernel. Currently diag 14 can be called within the vmur kernel module from addresses above 2GB. This leads to specification exceptions. This patch moves diag10, diag14 and diag210 into the new diag.c file. Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2007-02-05	[S390] noexec protection	Gerald Schaefer
	This provides a noexec protection on s390 hardware. Our hardware does not have any bits left in the pte for a hw noexec bit, so this is a different approach using shadow page tables and a special addressing mode that allows separate address spaces for code and data. As a special feature of our "secondary-space" addressing mode, separate page tables can be specified for the translation of data addresses (storage operands) and instruction addresses. The shadow page table is used for the instruction addresses and the standard page table for the data addresses. The shadow page table is linked to the standard page table by a pointer in page->lru.next of the struct page corresponding to the page that contains the standard page table (since page->private is not really private with the pte_lock and the page table pages are not in the LRU list). Depending on the software bits of a pte, it is either inserted into both page tables or just into the standard (data) page table. Pages of a vma that does not have the VM_EXEC bit set get mapped only in the data address space. Any try to execute code on such a page will cause a page translation exception. The standard reaction to this is a SIGSEGV with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the kernel to the signal stack frame. Unfortunately, the signal return mechanism cannot be modified to use an SA_RESTORER because the exception unwinding code depends on the system call opcode stored behind the signal stack frame. This feature requires that user space is executed in secondary-space mode and the kernel in home-space mode, which means that the addressing modes need to be switched and that the noexec protection only works for user space. After switching the addressing modes, we cannot use the mvcp/mvcs instructions anymore to copy between kernel and user space. A new mvcos instruction has been added to the z9 EC/BC hardware which allows to copy between arbitrary address spaces, but on older hardware the page tables need to be walked manually. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-12-08	[S390] Virtual memmap for s390.	Heiko Carstens
	Virtual memmap support for s390. Inspired by the ia64 implementation. Unlike ia64 we need a mechanism which allows us to dynamically attach shared memory regions. These memory regions are accessed via the dcss device driver. dcss implements the 'direct_access' operation, which requires struct pages for every single shared page. Therefore this implementation provides an interface to attach/detach shared memory: int add_shared_memory(unsigned long start, unsigned long size); int remove_shared_memory(unsigned long start, unsigned long size); The purpose of the add_shared_memory function is to add the given memory range to the 1:1 mapping and to make sure that the corresponding range in the vmemmap is backed with physical pages. It also initialises the new struct pages. remove_shared_memory in turn only invalidates the page table entries in the 1:1 mapping. The page tables and the memory used for struct pages in the vmemmap are currently not freed. They will be reused when the next segment will be attached. Given that the maximum size of a shared memory region is 2GB and in addition all regions must reside below 2GB this is not too much of a restriction, but there is room for improvement. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-10-04	[S390] Remove open-coded mem_map usage.	Heiko Carstens
	Use page_to_phys and pfn_to_page to avoid open-coded mem_map usage. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2006-09-20	[S390] Cleanup in page table related code.	Gerald Schaefer
	Changed and simplified some page table related #defines and code. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-07-12	[S390] Fix sparse warnings.	Heiko Carstens
	Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2006-04-26	Don't include linux/config.h from anywhere else in include/	David Woodhouse
	Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-03-22	[PATCH] mm: remove set_pgdir leftovers	Christoph Hellwig
	set_pgdir isn't needed anymore for a very long time. Remove the leftover implementation on sh64 and the stub on s390. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Richard Curnow <rc@rc0.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-28	[PATCH] include cleanup in pgalloc.h	Herbert Pötzl
	This patch cleans up asm-*/pgalloc.h by removing the generous includes which are obsoleted (duplicated) by including linux/mm.h (and friends) They are double checked and verified by the PLM cross compiling service (the patched kernel gives the same warnings/errors as the unpatched) http://osdl.org/plm-cgi/plm?module=patch_info&patch_id=4313 Signed-off-by: Herbert Pötzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-02-22	[MM]: Add set_pte_at() which takes 'mm' and 'addr' args.	David S. Miller
	I'm taking a slightly different approach this time around so things are easier to integrate. Here is the first patch which builds the infrastructure. Basically: 1) Add set_pte_at() which is set_pte() with 'mm' and 'addr' arguments added. All generic code uses set_pte_at(). Most platforms simply get this define: #define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval) I chose this method over simply changing all set_pte() call sites because many platforms implement this in assembler and it would take forever to preserve the build and stabilize things if modifying that was necessary. Soon, with platform maintainer's help, we can kill of set_pte() entirely. To be honest, there are only a handful of set_pte() call sites in the arch specific code. Actually, in this patch ppc64 is completely set_pte() free and does not define it. 2) pte_clear() gets 'mm' and 'addr' arguments now. This had a cascading effect on many ptep_test_and_*() routines. Specifically: a) ptep_test_and_clear_{young,dirty}() now take 'vma' and 'address' args. b) ptep_get_and_clear now take 'mm' and 'address' args. c) ptep_mkdirty was deleted, unused by any code. d) ptep_set_wrprotect now takes 'mm' and 'address' args. I've tested this patch as follows: 1) compile and run tested on sparc64/SMP 2) compile tested on: a) ppc64/SMP b) i386 both with and without PAE enabled Signed-off-by: David S. Miller <davem@davemloft.net>
2004-04-11	[PATCH] missing NULL pointer check in pte_alloc_one.	Andrew Morton
	From: Martin Schwidefsky <schwidefsky@de.ibm.com> Just found an small bug in pgalloc for s390*. Comparing notes with other architectures I found that pte_alloc_one is sick for alpha and sparc64 as well.
2004-02-25	[PATCH] s390: collaborative memory management.	Andrew Morton
	From: Martin Schwidefsky <schwidefsky@de.ibm.com> Add collaborative memory management interface.
2004-01-18	[PATCH] s390: tlb flush optimization.	Andrew Morton
	From: Martin Schwidefsky <schwidefsky@de.ibm.com> On the s/390 architecture we still have the issue with tlb flushing and the ipte instruction. We can optimize the tlb flushing a lot with some minor interface changes between the arch backend and the memory management core. In the end the whole thing is about the Invalidate Page Table Entry (ipte) instruction. The instruction sets the invalid bit in the pte and removes the tlb for the page on all cpus for the virtual to physical mapping of the page in a particular address space. The nice thing is that only the tlb for this page gets removed, all the other tlbs stay valid. The reason we can't use ipte to implement flush_tlb_page() is one of the requirements of the instruction: the pte that should get flushed needs to be valid. I'd like to add the following four functions to the mm interface: * ptep_establish: Establish a new mapping. This sets a pte entry to a page table and flushes the tlb of the old entry on all cpus if it exists. This is more or less what establish_pte in mm/memory.c does right now but without the update_mmu_cache call. * ptep_test_and_clear_and_flush_young. Do what ptep_test_and_clear_young does and flush the tlb. * ptep_test_and_clear_and_flush_dirty. Do what ptep_test_and_clear_dirty does and flush the tlb. * ptep_get_and_clear_and_flush: Do what ptep_get_and_clear does and flush the tlb. The s/390 specific functions in include/pgtable.h define their own optimized version of these four functions by use of the ipte. I avoid the definition of these function for every architecture I added them to include/asm-generic/pgtable.h. Since i386/x86 and others don't include this header yet and define their own version of the functions found there I #ifdef'd all functions in include/asm-generic/pgtable.h to be able to pick the ones that are needed for each architecture (see patch for details). With the new functions in place it is easy to do the optimization, e.g. the sequence ptep_get_and_clear(ptep); flush_tlb_page(vma, address); gets replace by ptep_get_and_clear_and_flush(vma, address, ptep); The old sequence still works but it is suboptimal on s/390.
2004-01-18	[PATCH] s390: general update	Andrew Morton
	From: Martin Schwidefsky <schwidefsky@de.ibm.com> - Add console_unblank in machine_{restart,halt,power_off} to get all messages on the screen. - Set console_irq to -1 if condev= parameter is present. - Fix write_trylock for 64 bit. - Fix svc restarting. - System call number on 64 bit is an int. Fix compare in entry64.S. - Fix tlb flush problem. - Use the idte instruction to flush tlbs of a particular mm. - Fix ptrace. - Add fadvise64_64 system call wrapper. - Fix pfault handling. - Do not clobber _PAGE_INVALID_NONE pages in pte_wrprotect. - Fix siginfo_t size problem (needs to be 128 for s390x, not 136). - Avoid direct assignment to tsk->state, use __set_task_state. - Always make any pending restarted system call return -EINTR. - Add panic_on_oops. - Display symbol for psw address in show_trace. - Don't discard sections .exit.text, .exit.data and .eh_frame, otherwise stabs information for kerntypes will get lost. - Add memory clobber to assembler inline in ip_fast_checksum for gcc 3.3. - Fix softirq_pending calls for the current cpu (cpu == smp_processor_id()). - Remove BUG_ON in irq_enter. Two irq_enters are possible.
2003-04-20	[PATCH] use __GFP_REPEAT in pte_alloc_one()	Andrew Morton
	Remove all the open-coded retry loops in various architectures, use __GFP_REPEAT. It could be that at some time in the future we change __GFP_REPEAT to give up after ten seconds or so, so all the checks for failed allocations are retained.
2003-04-13	[PATCH] s390/s390x unification (6/7)	Martin Schwidefsky
	Merge s390x and s390 to one architecture.
2002-10-03	[PATCH] s390 update (2/27): include.	Martin Schwidefsky
	s390 include file changes for 2.5.39.
2002-08-27	[PATCH] reduced TLB invalidation rate	Andrew Morton
	It has been noticed that across a kernel build many calls to tlb_flush_mmu() do not have anything to flush, apparently because glibc is mmapping a file over a previously-mapped region which has no faulted-in ptes. This patch detects this case and optimises away a little over one third of the tlb invalidations. The functions which potentially cause an invalidate are tlb_remove_tlb_entry(), pte_free_tlb() and pmd_free_tlb(). These have been front-ended in asm-generic/tlb.h and the per-arch versions now have leading double-underscores. The generic versions tag the mmu_gather_t as needing a flush and then call the arch-specific version. tlb_flush_mmu() looks at tlb->need_flush and if it sees that no real activity has happened, the invalidation is avoided. The success rate is displayed in /proc/meminfo for the while. This should be removed later.
2002-06-08	[PATCH] s/390 patches for 2.5.20 (2 of 4).	Martin Schwidefsky
	Second patch of the s/390 update. Contains all the include file changes in include/asm-{s390,s390x}.
2002-02-05	v2.5.2 -> v2.5.2.1	Linus Torvalds
	- Al Viro: fix up silly problem in swapfile filp cleanups in 2.5.2 - Tachino Nobuhiro: fix another error return for swapfile filp code - Robert Love: merge some of Ingo's scheduler fixes - David Miller: networking, sparc and some scsi driver fixes - Tim Waugh: parport update - OGAWA Hirofumi: fatfs cleanups and bugfixes - Roland Dreier: fix vsscanf buglets. - Ben LaHaise: include file cleanup - Andre Hedrick: IDE taskfile update
2002-02-04	v2.4.12 -> v2.4.12.1	Linus Torvalds
	- Trond Myklebust: deadlock checking in lockd server - Tim Waugh: fix up parport wrong #define - Christoph Hellwig: i2c update, ext2 cleanup - Al Viro: fix partition handling sanity check. - Trond Myklebust: make NFS use SLAB_NOFS, and not play games with PF_MEMALLOC - Ben Fennema: UDF update - Alan Cox: continued merging - Chris Mason: get /proc buffer memory sizes right after buf-in-page-cache
2002-02-04	v2.4.3.6 -> v2.4.3.7	Linus Torvalds
	- Johannes Erdfelt: USB updates - David Howells: more rw-sem stuff - David Miller: network callback cleanups and fixes - Jan Harkes: make Coda use the proper VFS layer interfaces, so that it can use "non-traditional-unix" filesystems without inode numbers for backing store.
2002-02-04	v2.4.3.1 -> v2.4.3.2	Linus Torvalds
	- Ingo Molnar/Al Viro: don't use bforget() on ext2 (and minix) metadata where we may not be the only owner of the buffer! FS corruption. - Andi Kleen: IPv6 packet re-assembly fix. - David Howells: fix up rwsem implementation - Alan Cox: more merging (S/390 down, ARM to go). - Jens Axboe: LVM and loop fixes
2002-02-04	v2.4.1.3 -> v2.4.1.4	Linus Torvalds
	- big S/390x 64-bit merge - typos and license name fixes. doc updates. - more include file cleanups (phase out "malloc.h") - even more elevator corner cases.. When not merging, find the best insertion point. - pmac ide update - network fixes (netif_wake_queue on tx timeout) - USB printer select() fix - NFS client missed initialization, deamon fixed client address check
2002-02-04	Import changeset	Linus Torvalds