<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/percpu.h, branch v5.16.19</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.16.19</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.16.19'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-11-18T18:50:45Z</updated>
<entry>
<title>Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux</title>
<updated>2021-11-18T18:50:45Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-11-18T18:50:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7d5775d49e4a488bc8a07e5abb2b71a4c28aadbb'/>
<id>urn:sha1:7d5775d49e4a488bc8a07e5abb2b71a4c28aadbb</id>
<content type='text'>
Pull printk fixes from Petr Mladek:

 - Try to flush backtraces from other CPUs also on the local one. This
   was a regression caused by printk_safe buffers removal.

 - Remove header dependency warning.

* tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove printk.h inclusion in percpu.h
  printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
</content>
</entry>
<entry>
<title>printk: Remove printk.h inclusion in percpu.h</title>
<updated>2021-11-15T14:03:32Z</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2021-11-12T14:07:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a0ddee65c527d877e798205c1391c6170e580c66'/>
<id>urn:sha1:a0ddee65c527d877e798205c1391c6170e580c66</id>
<content type='text'>
After the commit 42a0bb3f7138 ("printk/nmi: generic solution for safe
printk in NMI") the printk.h is not needed anymore in percpu.h.

Moreover `make headerdep` complains (an excerpt)

In file included from linux/printk.h,
                 from linux/dynamic_debug.h:188
                 from linux/printk.h:559 &lt;-- here
                 from linux/percpu.h:9
                 from linux/idr.h:17
include/net/9p/client.h:13: warning: recursive header inclusion

Yeah, it's not a root cause of this, but removing will help to reduce
the noise.

Fixes: 42a0bb3f7138 ("printk/nmi: generic solution for safe printk in NMI")
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Reviewed-by: Petr Mladek &lt;pmladek@suse.com&gt;
Signed-off-by: Petr Mladek &lt;pmladek@suse.com&gt;
Link: https://lore.kernel.org/r/20211112140749.80042-1-andriy.shevchenko@linux.intel.com
</content>
</entry>
<entry>
<title>percpu: add __alloc_size attributes for better bounds checking</title>
<updated>2021-11-06T20:30:34Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2021-11-05T20:36:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=17197dd460469c6fca66e274f5cd51ee43bb6ddd'/>
<id>urn:sha1:17197dd460469c6fca66e274f5cd51ee43bb6ddd</id>
<content type='text'>
As already done in GrapheneOS, add the __alloc_size attribute for
appropriate percpu allocator interfaces, to provide additional hinting
for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other
compiler optimizations.

Note that due to the implementation of the percpu API, this is unlikely
to ever actually provide compile-time checking beyond very simple
non-SMP builds.  But, since they are technically allocators, mark them
as such.

Link: https://lkml.kernel.org/r/20210930222704.2631604-9-keescook@chromium.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Co-developed-by: Daniel Micay &lt;danielmicay@gmail.com&gt;
Signed-off-by: Daniel Micay &lt;danielmicay@gmail.com&gt;
Acked-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Andy Whitcroft &lt;apw@canonical.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dwaipayan Ray &lt;dwaipayanray1@gmail.com&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Lukas Bulwahn &lt;lukas.bulwahn@gmail.com&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Nathan Chancellor &lt;nathan@kernel.org&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Alexandre Bounine &lt;alex.bou9@gmail.com&gt;
Cc: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jing Xiangfeng &lt;jingxiangfeng@huawei.com&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Cc: Matt Porter &lt;mporter@kernel.crashing.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Souptick Joarder &lt;jrdr.linux@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>percpu: Make pcpu_setup_first_chunk() void function</title>
<updated>2019-07-04T15:05:52Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2019-07-03T08:25:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=163fa23435cc9c705a71001d4aa15f3f945554a1'/>
<id>urn:sha1:163fa23435cc9c705a71001d4aa15f3f945554a1</id>
<content type='text'>
pcpu_setup_first_chunk() will panic or BUG_ON if the are some
error and doesn't return any error, hence it can be defined to
return void.

Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
[Dennis: fixed kbuild warning for pcpu_page_first_chunk()]
</content>
</entry>
<entry>
<title>percpu: set PCPU_BITMAP_BLOCK_SIZE to PAGE_SIZE</title>
<updated>2019-03-13T19:25:31Z</updated>
<author>
<name>Dennis Zhou</name>
<email>dennis@kernel.org</email>
</author>
<published>2019-02-13T19:10:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b239f7daf5530f562000bf55f02cc8028703f507'/>
<id>urn:sha1:b239f7daf5530f562000bf55f02cc8028703f507</id>
<content type='text'>
Previously, block size was flexible based on the constraint that the
GCD(PCPU_BITMAP_BLOCK_SIZE, PAGE_SIZE) &gt; 1. However, this carried the
overhead that keeping a floating number of populated free pages required
scanning over the free regions of a chunk.

Setting the block size to be fixed at PAGE_SIZE lets us know when an
empty page becomes used as we will break a full contig_hint of a block.
This means we no longer have to scan the whole chunk upon breaking a
contig_hint which empty page management piggybacked off. A later patch
takes advantage of this to optimize the allocation path by only scanning
forward using the scan_hint introduced later too.

Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Reviewed-by: Peng Fan &lt;peng.fan@nxp.com&gt;
</content>
</entry>
<entry>
<title>/proc/meminfo: add percpu populated pages count</title>
<updated>2018-08-22T17:52:45Z</updated>
<author>
<name>Dennis Zhou (Facebook)</name>
<email>dennisszhou@gmail.com</email>
</author>
<published>2018-08-22T04:53:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7e8a6304d5419cbf056a59de92939e5eef039c57'/>
<id>urn:sha1:7e8a6304d5419cbf056a59de92939e5eef039c57</id>
<content type='text'>
Currently, percpu memory only exposes allocation and utilization
information via debugfs.  This more or less is only really useful for
understanding the fragmentation and allocation information at a per-chunk
level with a few global counters.  This is also gated behind a config.
BPF and cgroup, for example, have seen an increase in use causing
increased use of percpu memory.  Let's make it easier for someone to
identify how much memory is being used.

This patch adds the "Percpu" stat to meminfo to more easily look up how
much percpu memory is in use.  This number includes the cost for all
allocated backing pages and not just insight at the per a unit, per chunk
level.  Metadata is excluded.  I think excluding metadata is fair because
the backing memory scales with the numbere of cpus and can quickly
outweigh the metadata.  It also makes this calculation light.

Link: http://lkml.kernel.org/r/20180807184723.74919-1-dennisszhou@gmail.com
Signed-off-by: Dennis Zhou &lt;dennisszhou@gmail.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>License cleanup: add SPDX GPL-2.0 license identifier to files with no license</title>
<updated>2017-11-02T10:10:55Z</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2017-11-01T14:07:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b24413180f5600bcb3bb70fbed5cf186b60864bd'/>
<id>urn:sha1:b24413180f5600bcb3bb70fbed5cf186b60864bd</id>
<content type='text'>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode &amp; Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained &gt;5
   lines of source
 - File already had some variant of a license header in it (even if &lt;5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Reviewed-by: Philippe Ombredanne &lt;pombredanne@nexb.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>percpu: update free path to take advantage of contig hints</title>
<updated>2017-07-26T21:41:06Z</updated>
<author>
<name>Dennis Zhou (Facebook)</name>
<email>dennisszhou@gmail.com</email>
</author>
<published>2017-07-24T23:02:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b185cd0dc61c14875155e7bcc3f2c139b6feefd2'/>
<id>urn:sha1:b185cd0dc61c14875155e7bcc3f2c139b6feefd2</id>
<content type='text'>
The bitmap allocator must keep metadata consistent. The easiest way is
to scan after every allocation for each affected block and the entire
chunk. This is rather expensive.

The free path can take advantage of current contig hints to prevent
scanning within the start and end block.  If a scan is needed, it can
be done by scanning backwards from the start and forwards from the end
to identify the entire free area this can be combined with. The blocks
can then be updated by some basic checks rather than complete block
scans.

A chunk scan happens when the freed area makes a page free, a block
free, or spans across blocks. This is necessary as the contig hint at
this point could span across blocks. The check uses the minimum of page
size and the block size to allow for variable sized blocks. There is a
tradeoff here with not updating after every free. It is possible a
contig hint in one block can be merged with the contig hint in the next
block. This means the contig hint can be off by up to a page. However,
if the chunk's contig hint is contained in one block, the contig hint
will be accurate.

Signed-off-by: Dennis Zhou &lt;dennisszhou@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;jbacik@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>percpu: introduce bitmap metadata blocks</title>
<updated>2017-07-26T21:41:05Z</updated>
<author>
<name>Dennis Zhou (Facebook)</name>
<email>dennisszhou@gmail.com</email>
</author>
<published>2017-07-24T23:02:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ca460b3c96274d79f84b31a3fea23a6eed479917'/>
<id>urn:sha1:ca460b3c96274d79f84b31a3fea23a6eed479917</id>
<content type='text'>
This patch introduces the bitmap metadata blocks and adds the skeleton
of the code that will be used to maintain these blocks.  Each chunk's
bitmap is made up of full metadata blocks. These blocks maintain basic
metadata to help prevent scanning unnecssarily to update hints. Full
scanning methods are used for the skeleton and will be replaced in the
coming patches. A number of helper functions are added as well to do
conversion of pages to blocks and manage offsets. Comments will be
updated as the final version of each function is added.

There exists a relationship between PAGE_SIZE, PCPU_BITMAP_BLOCK_SIZE,
the region size, and unit_size. Every chunk's region (including offsets)
is page aligned at the beginning to preserve alignment. The end is
aligned to LCM(PAGE_SIZE, PCPU_BITMAP_BLOCK_SIZE) to ensure that the end
can fit with the populated page map which is by page and every metadata
block is fully accounted for. The unit_size is already page aligned, but
must also be aligned with PCPU_BITMAP_BLOCK_SIZE to ensure full metadata
blocks.

Signed-off-by: Dennis Zhou &lt;dennisszhou@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;jbacik@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>percpu: replace area map allocator with bitmap</title>
<updated>2017-07-26T21:41:05Z</updated>
<author>
<name>Dennis Zhou (Facebook)</name>
<email>dennisszhou@gmail.com</email>
</author>
<published>2017-07-12T18:27:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=40064aeca35c5c14789e2adcf3a1d7e5d4bd65f2'/>
<id>urn:sha1:40064aeca35c5c14789e2adcf3a1d7e5d4bd65f2</id>
<content type='text'>
The percpu memory allocator is experiencing scalability issues when
allocating and freeing large numbers of counters as in BPF.
Additionally, there is a corner case where iteration is triggered over
all chunks if the contig_hint is the right size, but wrong alignment.

This patch replaces the area map allocator with a basic bitmap allocator
implementation. Each subsequent patch will introduce new features and
replace full scanning functions with faster non-scanning options when
possible.

Implementation:
This patchset removes the area map allocator in favor of a bitmap
allocator backed by metadata blocks. The primary goal is to provide
consistency in performance and memory footprint with a focus on small
allocations (&lt; 64 bytes). The bitmap removes the heavy memmove from the
freeing critical path and provides a consistent memory footprint. The
metadata blocks provide a bound on the amount of scanning required by
maintaining a set of hints.

In an effort to make freeing fast, the metadata is updated on the free
path if the new free area makes a page free, a block free, or spans
across blocks. This causes the chunk's contig hint to potentially be
smaller than what it could allocate by up to the smaller of a page or a
block. If the chunk's contig hint is contained within a block, a check
occurs and the hint is kept accurate. Metadata is always kept accurate
on allocation, so there will not be a situation where a chunk has a
later contig hint than available.

Evaluation:
I have primarily done testing against a simple workload of allocation of
1 million objects (2^20) of varying size. Deallocation was done by in
order, alternating, and in reverse. These numbers were collected after
rebasing ontop of a80099a152. I present the worst-case numbers here:

  Area Map Allocator:

        Object Size | Alloc Time (ms) | Free Time (ms)
        ----------------------------------------------
              4B    |        310      |     4770
             16B    |        557      |     1325
             64B    |        436      |      273
            256B    |        776      |      131
           1024B    |       3280      |      122

  Bitmap Allocator:

        Object Size | Alloc Time (ms) | Free Time (ms)
        ----------------------------------------------
              4B    |        490      |       70
             16B    |        515      |       75
             64B    |        610      |       80
            256B    |        950      |      100
           1024B    |       3520      |      200

This data demonstrates the inability for the area map allocator to
handle less than ideal situations. In the best case of reverse
deallocation, the area map allocator was able to perform within range
of the bitmap allocator. In the worst case situation, freeing took
nearly 5 seconds for 1 million 4-byte objects. The bitmap allocator
dramatically improves the consistency of the free path. The small
allocations performed nearly identical regardless of the freeing
pattern.

While it does add to the allocation latency, the allocation scenario
here is optimal for the area map allocator. The area map allocator runs
into trouble when it is allocating in chunks where the latter half is
full. It is difficult to replicate this, so I present a variant where
the pages are second half filled. Freeing was done sequentially. Below
are the numbers for this scenario:

  Area Map Allocator:

        Object Size | Alloc Time (ms) | Free Time (ms)
        ----------------------------------------------
              4B    |       4118      |     4892
             16B    |       1651      |     1163
             64B    |        598      |      285
            256B    |        771      |      158
           1024B    |       3034      |      160

  Bitmap Allocator:

        Object Size | Alloc Time (ms) | Free Time (ms)
        ----------------------------------------------
              4B    |        481      |       67
             16B    |        506      |       69
             64B    |        636      |       75
            256B    |        892      |       90
           1024B    |       3262      |      147

The data shows a parabolic curve of performance for the area map
allocator. This is due to the memmove operation being the dominant cost
with the lower object sizes as more objects are packed in a chunk and at
higher object sizes, the traversal of the chunk slots is the dominating
cost. The bitmap allocator suffers this problem as well. The above data
shows the inability to scale for the allocation path with the area map
allocator and that the bitmap allocator demonstrates consistent
performance in general.

The second problem of additional scanning can result in the area map
allocator completing in 52 minutes when trying to allocate 1 million
4-byte objects with 8-byte alignment. The same workload takes
approximately 16 seconds to complete for the bitmap allocator.

V2:
Fixed a bug in pcpu_alloc_first_chunk end_offset was setting the bitmap
using bytes instead of bits.

Added a comment to pcpu_cnt_pop_pages to explain bitmap_weight.

Signed-off-by: Dennis Zhou &lt;dennisszhou@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;jbacik@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
