user/sven/linux.git/fs/proc, branch v3.12.13

procfs: call default get_unmapped_area on MMU-present architectures

2013-10-17T04:35:53Z

Commit c4fe24485729 ("sparc: fix PCI device proc file mmap(2)") added proc_reg_get_unmapped_area in proc_reg_file_ops and proc_reg_file_ops_no_compat, by which now mmap always returns EIO if get_unmapped_area method is not defined for the target procfs file, which causes regression of mmap on /proc/vmcore. To address this issue, like get_unmapped_area(), call default current->mm->get_unmapped_area on MMU-present architectures if pde->proc_fops->get_unmapped_area, i.e. the one in actual file operation in the procfs file, is not defined. Reported-by: Michael Holzheu Signed-off-by: HATAYAMA Daisuke Cc: Alexey Dobriyan Cc: David S. Miller Tested-by: Michael Holzheu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

procfs: fix unintended truncation of returned mapped address

2013-10-17T04:35:53Z

Currently, proc_reg_get_unmapped_area truncates upper 32-bit of the mapped virtual address returned from get_unmapped_area method in pde->proc_fops due to the variable rv of signed integer on x86_64. This is too small to have vitual address of unsigned long on x86_64 since on x86_64, signed integer is of 4 bytes while unsigned long is of 8 bytes. To fix this issue, use unsigned long instead. Fixes a regression added in commit c4fe24485729 ("sparc: fix PCI device proc file mmap(2)"). Signed-off-by: HATAYAMA Daisuke Cc: Alexey Dobriyan Cc: David S. Miller Tested-by: Michael Holzheu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm: /proc/pid/pagemap: inspect _PAGE_SOFT_DIRTY only on present pages

2013-10-17T04:35:52Z

If a page we are inspecting is in swap we may occasionally report it as having soft dirty bit (even if it is clean). The pte_soft_dirty helper should be called on present pte only. Signed-off-by: Cyrill Gorcunov Cc: Pavel Emelyanov Cc: Andy Lutomirski Cc: Matt Mackall Cc: Xiao Guangrong Cc: Marcelo Tosatti Cc: KOSAKI Motohiro Cc: Stephen Rothwell Cc: Peter Zijlstra Cc: "Aneesh Kumar K.V" Reviewed-by: Naoya Horiguchi Cc: Mel Gorman Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

thp: account anon transparent huge pages into NR_ANON_PAGES

2013-09-12T22:38:03Z

We use NR_ANON_PAGES as base for reporting AnonPages to user. There's not much sense in not accounting transparent huge pages there, but add them on printing to user. Let's account transparent huge pages in NR_ANON_PAGES in the first place. Signed-off-by: Kirill A. Shutemov Acked-by: Dave Hansen Cc: Andrea Arcangeli Cc: Al Viro Cc: Hugh Dickins Cc: Wu Fengguang Cc: Jan Kara Cc: Mel Gorman Cc: Andi Kleen Cc: Matthew Wilcox Cc: Hillf Danton Cc: Ning Qu Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

vmcore: enable /proc/vmcore mmap for s390

2013-09-11T22:59:14Z

The patch "s390/vmcore: Implement remap_oldmem_pfn_range for s390" allows now to use mmap also on s390. So enable mmap for s390 again. Signed-off-by: Michael Holzheu Cc: HATAYAMA Daisuke Cc: Jan Willeke Cc: Vivek Goyal Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

vmcore: introduce remap_oldmem_pfn_range()

2013-09-11T22:59:10Z

For zfcpdump we can't map the HSA storage because it is only available via a read interface. Therefore, for the new vmcore mmap feature we have introduce a new mechanism to create mappings on demand. This patch introduces a new architecture function remap_oldmem_pfn_range() that should be used to create mappings with remap_pfn_range() for oldmem areas that can be directly mapped. For zfcpdump this is everything besides of the HSA memory. For the areas that are not mapped by remap_oldmem_pfn_range() a generic vmcore a new generic vmcore fault handler mmap_vmcore_fault() is called. This handler works as follows: * Get already available or new page from page cache (find_or_create_page) * Check if /proc/vmcore page is filled with data (PageUptodate) * If yes: Return that page * If no: Fill page using __vmcore_read(), set PageUptodate, and return page Signed-off-by: Michael Holzheu Acked-by: Vivek Goyal Cc: HATAYAMA Daisuke Cc: Jan Willeke Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

vmcore: introduce ELF header in new memory feature

2013-09-11T22:59:10Z

For s390 we want to use /proc/vmcore for our SCSI stand-alone dump (zfcpdump). We have support where the first HSA_SIZE bytes are saved into a hypervisor owned memory area (HSA) before the kdump kernel is booted. When the kdump kernel starts, it is restricted to use only HSA_SIZE bytes. The advantages of this mechanism are: * No crashkernel memory has to be defined in the old kernel. * Early boot problems (before kexec_load has been done) can be dumped * Non-Linux systems can be dumped. We modify the s390 copy_oldmem_page() function to read from the HSA memory if memory below HSA_SIZE bytes is requested. Since we cannot use the kexec tool to load the kernel in this scenario, we have to build the ELF header in the 2nd (kdump/new) kernel. So with the following patch set we would like to introduce the new function that the ELF header for /proc/vmcore can be created in the 2nd kernel memory. The following steps are done during zfcpdump execution: 1. Production system crashes 2. User boots a SCSI disk that has been prepared with the zfcpdump tool 3. Hypervisor saves CPU state of boot CPU and HSA_SIZE bytes of memory into HSA 4. Boot loader loads kernel into low memory area 5. Kernel boots and uses only HSA_SIZE bytes of memory 6. Kernel saves registers of non-boot CPUs 7. Kernel does memory detection for dump memory map 8. Kernel creates ELF header for /proc/vmcore 9. /proc/vmcore uses this header for initialization 10. The zfcpdump user space reads /proc/vmcore to write dump to SCSI disk - copy_oldmem_page() copies from HSA for memory below HSA_SIZE - copy_oldmem_page() copies from real memory for memory above HSA_SIZE Currently for s390 we create the ELF core header in the 2nd kernel with a small trick. We relocate the addresses in the ELF header in a way that for the /proc/vmcore code it seems to be in the 1st kernel (old) memory and the read_from_oldmem() returns the correct data. This allows the /proc/vmcore code to use the ELF header in the 2nd kernel. This patch: Exchange the old mechanism with the new and much cleaner function call override feature that now offcially allows to create the ELF core header in the 2nd kernel. To use the new feature the following function have to be defined by the architecture backend code to read from new memory: * elfcorehdr_alloc: Allocate ELF header * elfcorehdr_free: Free the memory of the ELF header * elfcorehdr_read: Read from ELF header * elfcorehdr_read_notes: Read from ELF notes Signed-off-by: Michael Holzheu Acked-by: Vivek Goyal Cc: HATAYAMA Daisuke Cc: Jan Willeke Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

proc: make proc_fd_permission() thread-friendly

2013-09-11T22:59:03Z

proc_fd_permission() says "process can still access /proc/self/fd after it has executed a setuid()", but the "task_pid() = proc_pid() check only helps if the task is group leader, /proc/self points to /proc/. Change this check to use task_tgid() so that the whole thread group can access its /proc/self/fd or /proc//fd. Notes: - CLONE_THREAD does not require CLONE_FILES so task->files can differ, but I don't think this can lead to any security problem. And this matches same_thread_group() in __ptrace_may_access(). - /proc/self should probably point to /proc/, but it is too late to change the rules. Perhaps it makes sense to add /proc/thread though. Test-case: void *tfunc(void *arg) { assert(opendir("/proc/self/fd")); return NULL; } int main(void) { pthread_t t; pthread_create(&t, NULL, tfunc, NULL); pthread_join(t, NULL); return 0; } fails if, say, this executable is not readable and suid_dumpable = 0. Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

fs/proc/task_mmu.c: check the return value of mpol_to_str()

2013-09-11T22:59:03Z

mpol_to_str() may fail, and not fill the buffer (e.g. -EINVAL), so need check about it, or buffer may not be zero based, and next seq_printf() will cause issue. The failure return need after mpol_cond_put() to match get_vma_policy(). Signed-off-by: Chen Gang Cc: Cyrill Gorcunov Cc: Mel Gorman Cc: Andi Kleen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm: track vma changes with VM_SOFTDIRTY bit

2013-09-11T22:57:56Z

Pavel reported that in case if vma area get unmapped and then mapped (or expanded) in-place, the soft dirty tracker won't be able to recognize this situation since it works on pte level and ptes are get zapped on unmap, loosing soft dirty bit of course. So to resolve this situation we need to track actions on vma level, there VM_SOFTDIRTY flag comes in. When new vma area created (or old expanded) we set this bit, and keep it here until application calls for clearing soft dirty bit. Thus when user space application track memory changes now it can detect if vma area is renewed. Reported-by: Pavel Emelyanov Signed-off-by: Cyrill Gorcunov Cc: Andy Lutomirski Cc: Matt Mackall Cc: Xiao Guangrong Cc: Marcelo Tosatti Cc: KOSAKI Motohiro Cc: Stephen Rothwell Cc: Peter Zijlstra Cc: "Aneesh Kumar K.V" Cc: Rob Landley Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds