user/sven/linux.git/mm/percpu-vm.c, branch v5.3.3

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428

2019-06-05T15:37:16Z

Based on 1 normalized pattern(s): this file is released under the gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 68 file(s). Signed-off-by: Thomas Gleixner Reviewed-by: Armijn Hemel Reviewed-by: Allison Randal Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de Signed-off-by: Greg Kroah-Hartman

percpu: allow select gfp to be passed to underlying allocators

2018-02-18T13:33:01Z

The prior patch added support for passing gfp flags through to the underlying allocators. This patch allows users to pass along gfp flags (currently only __GFP_NORETRY and __GFP_NOWARN) to the underlying allocators. This should allow users to decide if they are ok with failing allocations recovering in a more graceful way. Additionally, gfp passing was done as additional flags in the previous patch. Instead, change this to caller passed semantics. GFP_KERNEL is also removed as the default flag. It continues to be used for internally caused underlying percpu allocations. V2: Removed gfp_percpu_mask in favor of doing it inline. Removed GFP_KERNEL as a default flag for __alloc_percpu_gfp. Signed-off-by: Dennis Zhou Suggested-by: Daniel Borkmann Acked-by: Christoph Lameter Signed-off-by: Tejun Heo

percpu: add __GFP_NORETRY semantics to the percpu balancing path

2018-02-18T13:33:00Z

Percpu memory using the vmalloc area based chunk allocator lazily populates chunks by first requesting the full virtual address space required for the chunk and subsequently adding pages as allocations come through. To ensure atomic allocations can succeed, a workqueue item is used to maintain a minimum number of empty pages. In certain scenarios, such as reported in [1], it is possible that physical memory becomes quite scarce which can result in either a rather long time spent trying to find free pages or worse, a kernel panic. This patch adds support for __GFP_NORETRY and __GFP_NOWARN passing them through to the underlying allocators. This should prevent any unnecessary panics potentially caused by the workqueue item. The passing of gfp around is as additional flags rather than a full set of flags. The next patch will change these to caller passed semantics. V2: Added const modifier to gfp flags in the balance path. Removed an extra whitespace. [1] https://lkml.org/lkml/2018/2/12/551 Signed-off-by: Dennis Zhou Suggested-by: Daniel Borkmann Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com Acked-by: Christoph Lameter Signed-off-by: Tejun Heo

mm: remove __GFP_COLD

2017-11-16T02:21:06Z

As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Juding from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator. This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway. The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no sigificant difference which is not surprising given that the __GFP_COLD branches are a miniscule percentage of the fault path. Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: Andi Kleen Cc: Dave Chinner Cc: Dave Hansen Cc: Jan Kara Cc: Johannes Weiner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

percpu: fix static checker warnings in pcpu_destroy_chunk

2017-06-29T15:23:38Z

From 5021b97f4026334d2c8dfad80797dd1028cddd73 Mon Sep 17 00:00:00 2001 From: Dennis Zhou Date: Thu, 29 Jun 2017 07:11:41 -0700 Add NULL check in pcpu_destroy_chunk to correct static checker warnings. Signed-off-by: Dennis Zhou Reported-by: Dan Carpenter Signed-off-by: Tejun Heo

percpu: add tracepoint support for percpu memory

2017-06-20T19:31:43Z

Add support for tracepoints to the following events: chunk allocation, chunk free, area allocation, area free, and area allocation failure. This should let us replay percpu memory requests and evaluate corresponding decisions. Signed-off-by: Dennis Zhou Signed-off-by: Tejun Heo

percpu: expose statistics about percpu memory via debugfs

2017-06-20T19:31:38Z

There is limited visibility into the use of percpu memory leaving us unable to reason about correctness of parameters and overall use of percpu memory. These counters and statistics aim to help understand basic statistics about percpu memory such as number of allocations over the lifetime, allocation sizes, and fragmentation. New Config: PERCPU_STATS Signed-off-by: Dennis Zhou Signed-off-by: Tejun Heo

percpu: remove unused chunk_alloc parameter from pcpu_get_pages()

2017-03-06T20:56:55Z

pcpu_get_pages() doesn't use chunk_alloc parameter, remove it. Fixes: fbbb7f4e149f ("percpu: remove the usage of separate populated bitmap in percpu-vm") Signed-off-by: Tahsin Erdogan Signed-off-by: Tejun Heo

percpu: move region iterations out of pcpu_[de]populate_chunk()

2014-09-02T18:46:02Z

Previously, pcpu_[de]populate_chunk() were called with the range which may contain multiple target regions in it and pcpu_[de]populate_chunk() iterated over the regions. This has the benefit of batching up cache flushes for all the regions; however, we're planning to add more bookkeeping logic around [de]population to support atomic allocations and this delegation of iterations gets in the way. This patch moves the region iterations out of pcpu_[de]populate_chunk() into its callers - pcpu_alloc() and pcpu_reclaim() - so that we can later add logic to track more states around them. This change may make cache and tlb flushes more frequent but multi-region [de]populations are rare anyway and if this actually becomes a problem, it's not difficult to factor out cache flushes as separate callbacks which are directly invoked from percpu.c. Signed-off-by: Tejun Heo

percpu: move common parts out of pcpu_[de]populate_chunk()

2014-09-02T18:46:01Z

percpu-vm and percpu-km implement separate versions of pcpu_[de]populate_chunk() and some part which is or should be common are currently in the specific implementations. Make the following changes. * Allocate area clearing is moved from the pcpu_populate_chunk() implementations to pcpu_alloc(). This makes percpu-km's version noop. * Quick exit tests in pcpu_[de]populate_chunk() of percpu-vm are moved to their respective callers so that they are applied to percpu-km too. This doesn't make any meaningful difference as both functions are noop for percpu-km; however, this is more consistent and will help implementing atomic allocation support. Signed-off-by: Tejun Heo