<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/bitops.h, branch v4.4.232</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.232</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.232'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-06-30T00:07:54Z</updated>
<entry>
<title>include/linux/bitops.h: avoid clang shift-count-overflow warnings</title>
<updated>2020-06-30T00:07:54Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2020-06-04T23:50:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b656010d1a06d2f63999148fb0966e2a89dec0a5'/>
<id>urn:sha1:b656010d1a06d2f63999148fb0966e2a89dec0a5</id>
<content type='text'>
[ Upstream commit bd93f003b7462ae39a43c531abca37fe7073b866 ]

Clang normally does not warn about certain issues in inline functions when
it only happens in an eliminated code path. However if something else
goes wrong, it does tend to complain about the definition of hweight_long()
on 32-bit targets:

  include/linux/bitops.h:75:41: error: shift count &gt;= width of type [-Werror,-Wshift-count-overflow]
          return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
                                                 ^~~~~~~~~~~~
  include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
   define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
                                                  ^~~~~~~~~~~~~~~~~~~~
  include/asm-generic/bitops/const_hweight.h:21:76: note: expanded from macro '__const_hweight64'
   define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) &gt;&gt; 32))
                                                                             ^  ~~
  include/asm-generic/bitops/const_hweight.h:20:49: note: expanded from macro '__const_hweight32'
   define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) &gt;&gt; 16))
                                                  ^
  include/asm-generic/bitops/const_hweight.h:19:72: note: expanded from macro '__const_hweight16'
   define __const_hweight16(w) (__const_hweight8(w)  + __const_hweight8((w)  &gt;&gt; 8 ))
                                                                         ^
  include/asm-generic/bitops/const_hweight.h:12:9: note: expanded from macro '__const_hweight8'
            (!!((w) &amp; (1ULL &lt;&lt; 2))) +     \

Adding an explicit cast to __u64 avoids that warning and makes it easier
to read other output.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Link: http://lkml.kernel.org/r/20200505135513.65265-1-arnd@arndb.de
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>include/linux/bitops.h: introduce BITS_PER_TYPE</title>
<updated>2020-03-11T06:51:14Z</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2018-08-22T04:57:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cefb5fdeb4ceb0eba488650ae71b7ece6a5b0ff9'/>
<id>urn:sha1:cefb5fdeb4ceb0eba488650ae71b7ece6a5b0ff9</id>
<content type='text'>
commit 9144d75e22cad3c89e6b2ccab551db9ee28d250a upstream.

net_dim.h has a rather useful extension to BITS_PER_BYTE to compute the
number of bits in a type (BITS_PER_BYTE * sizeof(T)), so promote the macro
to bitops.h, alongside BITS_PER_BYTE, for wider usage.

Link: http://lkml.kernel.org/r/20180706094458.14116-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Reviewed-by: Jani Nikula &lt;jani.nikula@intel.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Andy Gospodarek &lt;gospo@broadcom.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[only take the bitops.h portion for stable kernels - gregkh]
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>include/linux/bitops.h: sanitize rotate primitives</title>
<updated>2019-06-11T10:24:08Z</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2019-05-14T22:43:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ec70e2c130d6b48f6b84fde6b47700ff3012ca4a'/>
<id>urn:sha1:ec70e2c130d6b48f6b84fde6b47700ff3012ca4a</id>
<content type='text'>
commit ef4d6f6b275c498f8e5626c99dbeefdc5027f843 upstream.

The ror32 implementation (word &gt;&gt; shift) | (word &lt;&lt; (32 - shift) has
undefined behaviour if shift is outside the [1, 31] range.  Similarly
for the 64 bit variants.  Most callers pass a compile-time constant
(naturally in that range), but there's an UBSAN report that these may
actually be called with a shift count of 0.

Instead of special-casing that, we can make them DTRT for all values of
shift while also avoiding UB.  For some reason, this was already partly
done for rol32 (which was well-defined for [0, 31]).  gcc 8 recognizes
these patterns as rotates, so for example

  __u32 rol32(__u32 word, unsigned int shift)
  {
	return (word &lt;&lt; (shift &amp; 31)) | (word &gt;&gt; ((-shift) &amp; 31));
  }

compiles to

0000000000000020 &lt;rol32&gt;:
  20:   89 f8                   mov    %edi,%eax
  22:   89 f1                   mov    %esi,%ecx
  24:   d3 c0                   rol    %cl,%eax
  26:   c3                      retq

Older compilers unfortunately do not do as well, but this only affects
the small minority of users that don't pass constants.

Due to integer promotions, ro[lr]8 were already well-defined for shifts
in [0, 8], and ro[lr]16 were mostly well-defined for shifts in [0, 16]
(only mostly - u16 gets promoted to _signed_ int, so if bit 15 is set,
word &lt;&lt; 16 is undefined).  For consistency, update those as well.

Link: http://lkml.kernel.org/r/20190410211906.2190-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Reported-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Tested-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Vadim Pasternak &lt;vadimp@mellanox.com&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Jacek Anaszewski &lt;jacek.anaszewski@gmail.com&gt;
Cc: Pavel Machek &lt;pavel@ucw.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Matthias Kaehlcke &lt;mka@chromium.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/atomics, asm-generic: Move some macros from &lt;linux/bitops.h&gt; to a new &lt;linux/bits.h&gt; file</title>
<updated>2019-05-16T17:45:09Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2018-06-19T12:53:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8d1385ea4c67f17b1f5de625d06ccae47aadfb3d'/>
<id>urn:sha1:8d1385ea4c67f17b1f5de625d06ccae47aadfb3d</id>
<content type='text'>
commit 8bd9cb51daac89337295b6f037b0486911e1b408 upstream.

In preparation for implementing the asm-generic atomic bitops in terms
of atomic_long_*(), we need to prevent &lt;asm/atomic.h&gt; implementations from
pulling in &lt;linux/bitops.h&gt;. A common reason for this include is for the
BITS_PER_BYTE definition, so move this and some other BIT() and masking
macros into a new header file, &lt;linux/bits.h&gt;.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: yamada.masahiro@socionext.com
Link: https://lore.kernel.org/lkml/1529412794-17720-4-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bitops: avoid integer overflow in GENMASK(_ULL)</title>
<updated>2019-05-16T17:45:08Z</updated>
<author>
<name>Matthias Kaehlcke</name>
<email>mka@chromium.org</email>
</author>
<published>2017-09-08T23:14:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c2a357d9b42977cb511d681c3fcb6819cba939da'/>
<id>urn:sha1:c2a357d9b42977cb511d681c3fcb6819cba939da</id>
<content type='text'>
commit c32ee3d9abd284b4fcaacc250b101f93829c7bae upstream.

GENMASK(_ULL) performs a left-shift of ~0UL(L), which technically
results in an integer overflow.  clang raises a warning if the overflow
occurs in a preprocessor expression.  Clear the low-order bits through a
substraction instead of the left-shift to avoid the overflow.

(akpm: no change in .text size in my testing)

Link: http://lkml.kernel.org/r/20170803212020.24939-1-mka@chromium.org
Signed-off-by: Matthias Kaehlcke &lt;mka@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bitops.h: correctly handle rol32 with 0 byte shift</title>
<updated>2015-12-09T18:35:16Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2015-12-04T03:04:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d7e35dfa2531b53618b9e6edcd8752ce988ac555'/>
<id>urn:sha1:d7e35dfa2531b53618b9e6edcd8752ce988ac555</id>
<content type='text'>
ROL on a 32 bit integer with a shift of 32 or more is undefined and the
result is arch-dependent. Avoid this by handling the trivial case of
roling by 0 correctly.

The trivial solution of checking if shift is 0 breaks gcc's detection
of this code as a ROL instruction, which is unacceptable.

This bug was reported and fixed in GCC
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157):

	The standard rotate idiom,

	  (x &lt;&lt; n) | (x &gt;&gt; (32 - n))

	is recognized by gcc (for concreteness, I discuss only the case that x
	is an uint32_t here).

	However, this is portable C only for n in the range 0 &lt; n &lt; 32. For n
	== 0, we get x &gt;&gt; 32 which gives undefined behaviour according to the
	C standard (6.5.7, Bitwise shift operators). To portably support n ==
	0, one has to write the rotate as something like

	  (x &lt;&lt; n) | (x &gt;&gt; ((-n) &amp; 31))

	And this is apparently not recognized by gcc.

Note that this is broken on older GCCs and will result in slower ROL.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bitops.h: add sign_extend64()</title>
<updated>2015-11-07T01:50:42Z</updated>
<author>
<name>Martin Kepplinger</name>
<email>martink@posteo.de</email>
</author>
<published>2015-11-07T00:31:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=48e203e21b29cd4b2c58403fe8bca68e2e854895'/>
<id>urn:sha1:48e203e21b29cd4b2c58403fe8bca68e2e854895</id>
<content type='text'>
Months back, this was discussed, see https://lkml.org/lkml/2015/1/18/289
The result was the 64-bit version being "likely fine", "valuable" and
"correct".  The discussion fell asleep but since there are possible users,
let's add it.

Signed-off-by: Martin Kepplinger &lt;martin.kepplinger@theobroma-systems.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: George Spelvin &lt;linux@horizon.com&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Maxime Coquelin &lt;maxime.coquelin@st.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bitops.h: improve sign_extend32()'s documentation</title>
<updated>2015-11-07T01:50:42Z</updated>
<author>
<name>Martin Kepplinger</name>
<email>martink@posteo.de</email>
</author>
<published>2015-11-07T00:30:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e2eb53aa96754b97d158eff884dde88abbad925e'/>
<id>urn:sha1:e2eb53aa96754b97d158eff884dde88abbad925e</id>
<content type='text'>
It is often overlooked that sign_extend32(), despite its name, is safe to
use for 16 and 8 bit types as well.  This should help prevent sign
extension being done manually some other way.

Signed-off-by: Martin Kepplinger &lt;martin.kepplinger@theobroma-systems.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: George Spelvin &lt;linux@horizon.com&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Maxime Coquelin &lt;maxime.coquelin@st.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>linux/bitmap: Force inlining of bitmap weight functions</title>
<updated>2015-08-05T07:38:08Z</updated>
<author>
<name>Denys Vlasenko</name>
<email>dvlasenk@redhat.com</email>
</author>
<published>2015-08-04T14:15:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1a1d48a4a8fde49aedc045d894efe67173d59fe0'/>
<id>urn:sha1:1a1d48a4a8fde49aedc045d894efe67173d59fe0</id>
<content type='text'>
With this config:

  http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os

gcc-4.7.2 generates many copies of these tiny functions:

	bitmap_weight (55 copies):
	55                      push   %rbp
	48 89 e5                mov    %rsp,%rbp
	e8 3f 3a 8b 00          callq  __bitmap_weight
	5d                      pop    %rbp
	c3                      retq

	hweight_long (23 copies):
	55                      push   %rbp
	e8 b5 65 8e 00          callq  __sw_hweight64
	48 89 e5                mov    %rsp,%rbp
	5d                      pop    %rbp
	c3                      retq

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

This patch fixes this via s/inline/__always_inline/

While at it, replaced two "__inline__" with usual "inline"
(the rest of the source file uses the latter).

	    text     data      bss       dec  filename
	86971357 17195880 36659200 140826437  vmlinux.before
	86971120 17195912 36659200 140826232  vmlinux

Signed-off-by: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1438697716-28121-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>lib: find_*_bit reimplementation</title>
<updated>2015-04-17T13:03:53Z</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2015-04-16T19:43:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2c57a0e233d72f8c2e2404560dcf0188ac3cf5d7'/>
<id>urn:sha1:2c57a0e233d72f8c2e2404560dcf0188ac3cf5d7</id>
<content type='text'>
This patchset does rework to find_bit function family to achieve better
performance, and decrease size of text.  All rework is done in patch 1.
Patches 2 and 3 are about code moving and renaming.

It was boot-tested on x86_64 and MIPS (big-endian) machines.
Performance tests were ran on userspace with code like this:

	/* addr[] is filled from /dev/urandom */
	start = clock();
	while (ret &lt; nbits)
		ret = find_next_bit(addr, nbits, ret + 1);

	end = clock();
	printf("%ld\t", (unsigned long) end - start);

On Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz measurements are: (for
find_next_bit, nbits is 8M, for find_first_bit - 80K)

	find_next_bit:		find_first_bit:
	new	current		new	current
	26932	43151		14777	14925
	26947	43182		14521	15423
	26507	43824		15053	14705
	27329	43759		14473	14777
	26895	43367		14847	15023
	26990	43693		15103	15163
	26775	43299		15067	15232
	27282	42752		14544	15121
	27504	43088		14644	14858
	26761	43856		14699	15193
	26692	43075		14781	14681
	27137	42969		14451	15061
	...			...

find_next_bit performance gain is 35-40%;
find_first_bit - no measurable difference.

On ARM machine, there is arch-specific implementation for find_bit.

Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
helpful discussions.

This patch (of 3):

New implementations takes less space in source file (see diffstat) and in
object.  For me it's 710 vs 453 bytes of text.  It also shows better
performance.

find_last_bit description fixed due to obvious typo.

[akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus]
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Reviewed-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Reviewed-by: George Spelvin &lt;linux@horizon.com&gt;
Cc: Alexey Klimov &lt;klimov.linux@gmail.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: AKASHI Takahiro &lt;takahiro.akashi@linaro.org&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: Valentin Rothberg &lt;valentinrothberg@gmail.com&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
