user/sven/linux.git/include/linux/memory_hotplug.h, branch v4.9.5

memory-hotplug: more general validation of zone during online

2016-07-26T23:19:19Z

When memory is onlined, we are only able to rezone from ZONE_MOVABLE to ZONE_KERNEL, or from (ZONE_MOVABLE - 1) to ZONE_MOVABLE. To be more flexible, use the following criteria instead; to online memory from zone X into zone Y, * Any zones between X and Y must be unused. * If X is lower than Y, the onlined memory must lie at the end of X. * If X is higher than Y, the onlined memory must lie at the start of X. Add zone_can_shift() to make this determination. Link: http://lkml.kernel.org/r/1462816419-4479-3-git-send-email-arbab@linux.vnet.ibm.com Signed-off-by: Reza Arbab Reviewd-by: Yasuaki Ishimatsu Cc: Greg Kroah-Hartman Cc: Daniel Kiper Cc: Dan Williams Cc: Vlastimil Babka Cc: Tang Chen Cc: Joonsoo Kim Cc: David Vrabel Cc: Vitaly Kuznetsov Cc: David Rientjes Cc: Andrew Banman Cc: Chen Yucong Cc: Yasunori Goto Cc: Zhang Zhen Cc: Shaohua Li Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm: fix section mismatch warning

2016-05-27T22:23:32Z

The register_page_bootmem_info_node() function needs to be marked __init in order to avoid a new warning introduced by commit f65e91df25aa ("mm: use early_pfn_to_nid in register_page_bootmem_info_node"). Otherwise you'll get a warning about how a non-init function calls early_pfn_to_nid (which is __meminit) Cc: Yang Shi Cc: Andrew Morton Signed-off-by: Linus Torvalds

mm/memory_hotplug: is_mem_section_removable() can return bool

2016-05-20T02:12:14Z

Make is_mem_section_removable() return bool to improve readability due to this particular function only using either one or zero as its return value. Signed-off-by: Yaowei Bai Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

2016-03-15T23:55:16Z

There is a performance drop report due to hugepage allocation and in there half of cpu time are spent on pageblock_pfn_to_page() in compaction [1]. In that workload, compaction is triggered to make hugepage but most of pageblocks are un-available for compaction due to pageblock type and skip bit so compaction usually fails. Most costly operations in this case is to find valid pageblock while scanning whole zone range. To check if pageblock is valid to compact, valid pfn within pageblock is required and we can obtain it by calling pageblock_pfn_to_page(). This function checks whether pageblock is in a single zone and return valid pfn if possible. Problem is that we need to check it every time before scanning pageblock even if we re-visit it and this turns out to be very expensive in this workload. Although we have no way to skip this pageblock check in the system where hole exists at arbitrary position, we can use cached value for zone continuity and just do pfn_to_page() in the system where hole doesn't exist. This optimization considerably speeds up in above workload. Before vs After Max: 1096 MB/s vs 1325 MB/s Min: 635 MB/s 1015 MB/s Avg: 899 MB/s 1194 MB/s Avg is improved by roughly 30% [2]. [1]: http://www.spinics.net/lists/linux-mm/msg97378.html [2]: https://lkml.org/lkml/2015/12/9/23 [akpm@linux-foundation.org: don't forget to restore zone->contiguous on error path, per Vlastimil] Signed-off-by: Joonsoo Kim Reported-by: Aaron Lu Acked-by: Vlastimil Babka Tested-by: Aaron Lu Cc: Mel Gorman Cc: Rik van Riel Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

memory-hotplug: add automatic onlining policy for the newly added memory

2016-03-15T23:55:16Z

Currently, all newly added memory blocks remain in 'offline' state unless someone onlines them, some linux distributions carry special udev rules like: SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online" to make this happen automatically. This is not a great solution for virtual machines where memory hotplug is being used to address high memory pressure situations as such onlining is slow and a userspace process doing this (udev) has a chance of being killed by the OOM killer as it will probably require to allocate some memory. Introduce default policy for the newly added memory blocks in /sys/devices/system/memory/auto_online_blocks file with two possible values: "offline" which preserves the current behavior and "online" which causes all newly added memory blocks to go online as soon as they're added. The default is "offline". Signed-off-by: Vitaly Kuznetsov Reviewed-by: Daniel Kiper Cc: Jonathan Corbet Cc: Greg Kroah-Hartman Cc: Daniel Kiper Cc: Dan Williams Cc: Tang Chen Cc: David Vrabel Acked-by: David Rientjes Cc: Naoya Horiguchi Cc: Xishi Qiu Cc: Mel Gorman Cc: "K. Y. Srinivasan" Cc: Igor Mammedov Cc: Kay Sievers Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

x86, mm: introduce vmem_altmap to augment vmemmap_populate()

2016-01-16T01:56:32Z

In support of providing struct page for large persistent memory capacities, use struct vmem_altmap to change the default policy for allocating memory for the memmap array. The default vmemmap_populate() allocates page table storage area from the page allocator. Given persistent memory capacities relative to DRAM it may not be feasible to store the memmap in 'System Memory'. Instead vmem_altmap represents pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf() requests. Signed-off-by: Dan Williams Reported-by: kbuild test robot Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Dave Hansen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm: memory hotplug with an existing resource

2015-10-23T13:19:58Z

Add add_memory_resource() to add memory using an existing "System RAM" resource. This is useful if the memory region is being located by finding a free resource slot with allocate_resource(). Xen guests will make use of this in their balloon driver to hotplug arbitrary amounts of memory in response to toolstack requests. Signed-off-by: David Vrabel Reviewed-by: Daniel Kiper Reviewed-by: Tang Chen

mm: ZONE_DEVICE for "device memory"

2015-08-27T23:40:58Z

While pmem is usable as a block device or via DAX mappings to userspace there are several usage scenarios that can not target pmem due to its lack of struct page coverage. In preparation for "hot plugging" pmem into the vmemmap add ZONE_DEVICE as a new zone to tag these pages separately from the ones that are subject to standard page allocations. Importantly "device memory" can be removed at will by userspace unbinding the driver of the device. Having a separate zone prevents allocation and otherwise marks these pages that are distinct from typical uniform memory. Device memory has different lifetime and performance characteristics than RAM. However, since we have run out of ZONES_SHIFT bits this functionality currently depends on sacrificing ZONE_DMA. Cc: H. Peter Anvin Cc: Ingo Molnar Cc: Dave Hansen Cc: Rik van Riel Cc: Mel Gorman Cc: Jerome Glisse [hch: various simplifications in the arch interface] Signed-off-by: Christoph Hellwig Signed-off-by: Dan Williams

mm, hotplug: fix concurrent memory hot-add deadlock

2015-04-14T23:49:00Z

There's a deadlock when concurrently hot-adding memory through the probe interface and switching a memory block from offline to online. When hot-adding memory via the probe interface, add_memory() first takes mem_hotplug_begin() and then device_lock() is later taken when registering the newly initialized memory block. This creates a lock dependency of (1) mem_hotplug.lock (2) dev->mutex. When switching a memory block from offline to online, dev->mutex is first grabbed in device_online() when the write(2) transitions an existing memory block from offline to online, and then online_pages() will take mem_hotplug_begin(). This creates a lock inversion between mem_hotplug.lock and dev->mutex. Vitaly reports that this deadlock can happen when kworker handling a probe event races with systemd-udevd switching a memory block's state. This patch requires the state transition to take mem_hotplug_begin() before dev->mutex. Hot-adding memory via the probe interface creates a memory block while holding mem_hotplug_begin(), there is no way to take dev->mutex first in this case. online_pages() and offline_pages() are only called when transitioning memory block state. We now require that mem_hotplug_begin() is taken before calling them -- this requires exporting the mem_hotplug_begin() and mem_hotplug_done() to generic code. In all hot-add and hot-remove cases, mem_hotplug_begin() is done prior to device_online(). This is all that is needed to avoid the deadlock. Signed-off-by: David Rientjes Reported-by: Vitaly Kuznetsov Tested-by: Vitaly Kuznetsov Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: "K. Y. Srinivasan" Cc: Yasuaki Ishimatsu Cc: Tang Chen Cc: Vlastimil Babka Cc: Zhang Zhen Cc: Vladimir Davydov Cc: Wang Nan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

memory-hotplug: add sysfs valid_zones attribute

2014-10-10T02:25:52Z

Currently memory-hotplug has two limits: 1. If the memory block is in ZONE_NORMAL, you can change it to ZONE_MOVABLE, but this memory block must be adjacent to ZONE_MOVABLE. 2. If the memory block is in ZONE_MOVABLE, you can change it to ZONE_NORMAL, but this memory block must be adjacent to ZONE_NORMAL. With this patch, we can easy to know a memory block can be onlined to which zone, and don't need to know the above two limits. Updated the related Documentation. [akpm@linux-foundation.org: use conventional comment layout] [akpm@linux-foundation.org: fix build with CONFIG_MEMORY_HOTREMOVE=n] [akpm@linux-foundation.org: remove unused local zone_prev] Signed-off-by: Zhang Zhen Cc: Dave Hansen Cc: David Rientjes Cc: Toshi Kani Cc: Yasuaki Ishimatsu Cc: Naoya Horiguchi Cc: Wang Nan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds