<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux, branch v5.9.14</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.9.14</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.9.14'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-12-11T12:22:09Z</updated>
<entry>
<title>genirq/irqdomain: Add an irq_create_mapping_affinity() function</title>
<updated>2020-12-11T12:22:09Z</updated>
<author>
<name>Laurent Vivier</name>
<email>lvivier@redhat.com</email>
</author>
<published>2020-11-26T08:28:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b732e2c47f5925538917a34aff5ca7d120100a68'/>
<id>urn:sha1:b732e2c47f5925538917a34aff5ca7d120100a68</id>
<content type='text'>
commit bb4c6910c8b41623104c2e64a30615682689a54d upstream.

There is currently no way to convey the affinity of an interrupt
via irq_create_mapping(), which creates issues for devices that
expect that affinity to be managed by the kernel.

In order to sort this out, rename irq_create_mapping() to
irq_create_mapping_affinity() with an additional affinity parameter that
can be passed down to irq_domain_alloc_descs().

irq_create_mapping() is re-implemented as a wrapper around
irq_create_mapping_affinity().

No functional change.

Fixes: e75eafb9b039 ("genirq/msi: Switch to new irq spreading infrastructure")
Signed-off-by: Laurent Vivier &lt;lvivier@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Greg Kurz &lt;groug@kaod.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201126082852.1178497-2-lvivier@redhat.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>tty: Fix -&gt;session locking</title>
<updated>2020-12-11T12:22:04Z</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2020-12-03T01:25:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=730649666353d495cfa8eade6e7f57936d0466af'/>
<id>urn:sha1:730649666353d495cfa8eade6e7f57936d0466af</id>
<content type='text'>
commit c8bcd9c5be24fb9e6132e97da5a35e55a83e36b9 upstream.

Currently, locking of -&gt;session is very inconsistent; most places
protect it using the legacy tty mutex, but disassociate_ctty(),
__do_SAK(), tiocspgrp() and tiocgsid() don't.
Two of the writers hold the ctrl_lock (because they already need it for
-&gt;pgrp), but __proc_set_tty() doesn't do that yet.

On a PREEMPT=y system, an unprivileged user can theoretically abuse
this broken locking to read 4 bytes of freed memory via TIOCGSID if
tiocgsid() is preempted long enough at the right point. (Other things
might also go wrong, especially if root-only ioctls are involved; I'm
not sure about that.)

Change the locking on -&gt;session such that:

 - tty_lock() is held by all writers: By making disassociate_ctty()
   hold it. This should be fine because the same lock can already be
   taken through the call to tty_vhangup_session().
   The tricky part is that we need to shorten the area covered by
   siglock to be able to take tty_lock() without ugly retry logic; as
   far as I can tell, this should be fine, since nothing in the
   signal_struct is touched in the `if (tty)` branch.
 - ctrl_lock is held by all writers: By changing __proc_set_tty() to
   hold the lock a little longer.
 - All readers that aren't holding tty_lock() hold ctrl_lock: By
   adding locking to tiocgsid() and __do_SAK(), and expanding the area
   covered by ctrl_lock in tiocspgrp().

Cc: stable@kernel.org
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net/mlx5: DR, Proper handling of unsupported Connect-X6DX SW steering</title>
<updated>2020-12-08T09:42:03Z</updated>
<author>
<name>Yevgeny Kliteynik</name>
<email>kliteyn@nvidia.com</email>
</author>
<published>2020-12-03T04:39:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=db7af1871e78e6da4f5f8cb76067319935995169'/>
<id>urn:sha1:db7af1871e78e6da4f5f8cb76067319935995169</id>
<content type='text'>
[ Upstream commit d421e466c2373095f165ddd25cbabd6c5b077928 ]

STEs format for Connect-X5 and Connect-X6DX different. Currently, on
Connext-X6DX the SW steering would break at some point when building STEs
w/o giving a proper error message. Fix this by checking the STE format of
the current device when initializing domain: add mlx5_ifc definitions for
Connect-X6DX SW steering, read FW capability to get the current format
version, and check this version when domain is being created.

Fixes: 26d688e33f88 ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik &lt;kliteyn@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net/packet: fix packet receive on L3 devices without visible hard header</title>
<updated>2020-12-08T09:42:01Z</updated>
<author>
<name>Eyal Birger</name>
<email>eyal.birger@gmail.com</email>
</author>
<published>2020-11-21T06:28:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dadddde259a19105ee69e788835f4d4552767aa9'/>
<id>urn:sha1:dadddde259a19105ee69e788835f4d4552767aa9</id>
<content type='text'>
[ Upstream commit d549699048b4b5c22dd710455bcdb76966e55aa3 ]

In the patchset merged by commit b9fcf0a0d826
("Merge branch 'support-AF_PACKET-for-layer-3-devices'") L3 devices which
did not have header_ops were given one for the purpose of protocol parsing
on af_packet transmit path.

That change made af_packet receive path regard these devices as having a
visible L3 header and therefore aligned incoming skb-&gt;data to point to the
skb's mac_header. Some devices, such as ipip, xfrmi, and others, do not
reset their mac_header prior to ingress and therefore their incoming
packets became malformed.

Ideally these devices would reset their mac headers, or af_packet would be
able to rely on dev-&gt;hard_header_len being 0 for such cases, but it seems
this is not the case.

Fix by changing af_packet RX ll visibility criteria to include the
existence of a '.create()' header operation, which is used when creating
a device hard header - via dev_hard_header() - by upper layers, and does
not exist in these L3 devices.

As this predicate may be useful in other situations, add it as a common
dev_has_header() helper in netdevice.h.

Fixes: b9fcf0a0d826 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'")
Signed-off-by: Eyal Birger &lt;eyal.birger@gmail.com&gt;
Acked-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://lore.kernel.org/r/20201121062817.3178900-1-eyal.birger@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed</title>
<updated>2020-12-02T07:51:54Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2020-11-11T16:52:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=89aada63529fb0b175d7cb5175163f9f5ee66383'/>
<id>urn:sha1:89aada63529fb0b175d7cb5175163f9f5ee66383</id>
<content type='text'>
[ Upstream commit cef397038167ac15d085914493d6c86385773709 ]

Stefan Agner reported a bug when using zsram on 32-bit Arm machines
with RAM above the 4GB address boundary:

  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  pgd = a27bd01c
  [00000000] *pgd=236a0003, *pmd=1ffa64003
  Internal error: Oops: 207 [#1] SMP ARM
  Modules linked in: mdio_bcm_unimac(+) brcmfmac cfg80211 brcmutil raspberrypi_hwmon hci_uart crc32_arm_ce bcm2711_thermal phy_generic genet
  CPU: 0 PID: 123 Comm: mkfs.ext4 Not tainted 5.9.6 #1
  Hardware name: BCM2711
  PC is at zs_map_object+0x94/0x338
  LR is at zram_bvec_rw.constprop.0+0x330/0xa64
  pc : [&lt;c0602b38&gt;]    lr : [&lt;c0bda6a0&gt;]    psr: 60000013
  sp : e376bbe0  ip : 00000000  fp : c1e2921c
  r10: 00000002  r9 : c1dda730  r8 : 00000000
  r7 : e8ff7a00  r6 : 00000000  r5 : 02f9ffa0  r4 : e3710000
  r3 : 000fdffe  r2 : c1e0ce80  r1 : ebf979a0  r0 : 00000000
  Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
  Control: 30c5383d  Table: 235c2a80  DAC: fffffffd
  Process mkfs.ext4 (pid: 123, stack limit = 0x495a22e6)
  Stack: (0xe376bbe0 to 0xe376c000)

As it turns out, zsram needs to know the maximum memory size, which
is defined in MAX_PHYSMEM_BITS when CONFIG_SPARSEMEM is set, or in
MAX_POSSIBLE_PHYSMEM_BITS on the x86 architecture.

The same problem will be hit on all 32-bit architectures that have a
physical address space larger than 4GB and happen to not enable sparsemem
and include asm/sparsemem.h from asm/pgtable.h.

After the initial discussion, I suggested just always defining
MAX_POSSIBLE_PHYSMEM_BITS whenever CONFIG_PHYS_ADDR_T_64BIT is
set, or provoking a build error otherwise. This addresses all
configurations that can currently have this runtime bug, but
leaves all other configurations unchanged.

I looked up the possible number of bits in source code and
datasheets, here is what I found:

 - on ARC, CONFIG_ARC_HAS_PAE40 controls whether 32 or 40 bits are used
 - on ARM, CONFIG_LPAE enables 40 bit addressing, without it we never
   support more than 32 bits, even though supersections in theory allow
   up to 40 bits as well.
 - on MIPS, some MIPS32r1 or later chips support 36 bits, and MIPS32r5
   XPA supports up to 60 bits in theory, but 40 bits are more than
   anyone will ever ship
 - On PowerPC, there are three different implementations of 36 bit
   addressing, but 32-bit is used without CONFIG_PTE_64BIT
 - On RISC-V, the normal page table format can support 34 bit
   addressing. There is no highmem support on RISC-V, so anything
   above 2GB is unused, but it might be useful to eventually support
   CONFIG_ZRAM for high pages.

Fixes: 61989a80fb3a ("staging: zsmalloc: zsmalloc memory allocation library")
Fixes: 02390b87a945 ("mm/zsmalloc: Prepare to variable MAX_PHYSMEM_BITS")
Acked-by: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Reviewed-by: Stefan Agner &lt;stefan@agner.ch&gt;
Tested-by: Stefan Agner &lt;stefan@agner.ch&gt;
Acked-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Link: https://lore.kernel.org/linux-mm/bdfa44bf1c570b05d6c70898e2bbb0acf234ecdf.1604762181.git.stefan@agner.ch/
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bus: ti-sysc: Fix reset status check for modules with quirks</title>
<updated>2020-12-02T07:51:52Z</updated>
<author>
<name>Tony Lindgren</name>
<email>tony@atomide.com</email>
</author>
<published>2020-10-26T08:08:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e79c3d983c3e81474c0a718908224e60749d5c42'/>
<id>urn:sha1:e79c3d983c3e81474c0a718908224e60749d5c42</id>
<content type='text'>
[ Upstream commit e275d2109cdaea8b4554b9eb8a828bdb8f8ba068 ]

Commit d46f9fbec719 ("bus: ti-sysc: Use optional clocks on for enable and
wait for softreset bit") started showing a "OCP softreset timed out"
warning on enable if the interconnect target module is not out of reset.
This caused the warning to be often triggered for i2c and hdq while the
devices are working properly.

Turns out that some interconnect target modules seem to have an unusable
reset status bits unless the module specific reset quirks are activated.

Let's just skip the reset status check for those modules as we only want
to activate the reset quirks when doing a reset, and not on enable. This
way we don't see the bogus "OCP softreset timed out" warnings during boot.

Fixes: d46f9fbec719 ("bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit")
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>firmware: xilinx: Use hash-table for api feature check</title>
<updated>2020-12-02T07:51:46Z</updated>
<author>
<name>Amit Sunil Dhamne</name>
<email>amit.sunil.dhamne@xilinx.com</email>
</author>
<published>2020-11-24T05:52:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a38a99930883fb1e24f2a34b78a05a6598e86150'/>
<id>urn:sha1:a38a99930883fb1e24f2a34b78a05a6598e86150</id>
<content type='text'>
commit acfdd18591eaac25446e976a0c0d190f8b3dbfb1 upstream.

Currently array of fix length PM_API_MAX is used to cache
the pm_api version (valid or invalid). However ATF based
PM APIs values are much higher then PM_API_MAX.
So to include ATF based PM APIs also, use hash-table to
store the pm_api version status.

Signed-off-by: Amit Sunil Dhamne &lt;amit.sunil.dhamne@xilinx.com&gt;
Reported-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Ravi Patel &lt;ravi.patel@xilinx.com&gt;
Signed-off-by: Rajan Vaja &lt;rajan.vaja@xilinx.com&gt;
Reviewed-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Tested-by: Michal Simek &lt;michal.simek@xilinx.com&gt;
Fixes: f3217d6f2f7a ("firmware: xilinx: fix out-of-bounds access")
Cc: stable &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/r/1606197161-25976-1-git-send-email-rajan.vaja@xilinx.com
Signed-off-by: Michal Simek &lt;michal.simek@xilinx.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: fix readahead_page_batch for retry entries</title>
<updated>2020-11-24T12:39:15Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2020-11-22T06:17:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9906d7a4839606307ca6bf072fd9a9d17f5daeac'/>
<id>urn:sha1:9906d7a4839606307ca6bf072fd9a9d17f5daeac</id>
<content type='text'>
commit 4349a83a3190c1d4414371161b0f4a4c3ccd3f9d upstream.

Both btrfs and fuse have reported faults caused by seeing a retry entry
instead of the page they were looking for.  This was caused by a missing
check in the iterator.

As can be seen in the below panic log, the accessing 0x402 causes a
panic.  In the xarray.h, 0x402 means RETRY_ENTRY.

  BUG: kernel NULL pointer dereference, address: 0000000000000402
  CPU: 14 PID: 306003 Comm: as Not tainted 5.9.0-1-amd64 #1 Debian 5.9.1-1
  Hardware name: Lenovo ThinkSystem SR665/7D2VCTO1WW, BIOS D8E106Q-1.01 05/30/2020
  RIP: 0010:fuse_readahead+0x152/0x470 [fuse]
  Code: 41 8b 57 18 4c 8d 54 10 ff 4c 89 d6 48 8d 7c 24 10 e8 d2 e3 28 f9 48 85 c0 0f 84 fe 00 00 00 44 89 f2 49 89 04 d4 44 8d 72 01 &lt;48&gt; 8b 10 41 8b 4f 1c 48 c1 ea 10 83 e2 01 80 fa 01 19 d2 81 e2 01
  RSP: 0018:ffffad99ceaebc50 EFLAGS: 00010246
  RAX: 0000000000000402 RBX: 0000000000000001 RCX: 0000000000000002
  RDX: 0000000000000000 RSI: ffff94c5af90bd98 RDI: ffffad99ceaebc60
  RBP: ffff94ddc1749a00 R08: 0000000000000402 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000100 R12: ffff94de6c429ce0
  R13: ffff94de6c4d3700 R14: 0000000000000001 R15: ffffad99ceaebd68
  FS:  00007f228c5c7040(0000) GS:ffff94de8ed80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000402 CR3: 0000001dbd9b4000 CR4: 0000000000350ee0
  Call Trace:
    read_pages+0x83/0x270
    page_cache_readahead_unbounded+0x197/0x230
    generic_file_buffered_read+0x57a/0xa20
    new_sync_read+0x112/0x1a0
    vfs_read+0xf8/0x180
    ksys_read+0x5f/0xe0
    do_syscall_64+0x33/0x80
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 042124cc64c3 ("mm: add new readahead_control API")
Reported-by: David Sterba &lt;dsterba@suse.com&gt;
Reported-by: Wonhyuk Yang &lt;vvghjk1234@gmail.com&gt;
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20201103142852.8543-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20201103124349.16722-1-vvghjk1234@gmail.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>spi: Introduce device-managed SPI controller allocation</title>
<updated>2020-11-24T12:39:11Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2020-11-11T19:07:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bd1a5b2307279029faaddbecf2f2ac25eaef8dc6'/>
<id>urn:sha1:bd1a5b2307279029faaddbecf2f2ac25eaef8dc6</id>
<content type='text'>
commit 5e844cc37a5cbaa460e68f9a989d321d63088a89 upstream.

SPI driver probing currently comprises two steps, whereas removal
comprises only one step:

    spi_alloc_master()
    spi_register_controller()

    spi_unregister_controller()

That's because spi_unregister_controller() calls device_unregister()
instead of device_del(), thereby releasing the reference on the
spi_controller which was obtained by spi_alloc_master().

An SPI driver's private data is contained in the same memory allocation
as the spi_controller struct.  Thus, once spi_unregister_controller()
has been called, the private data is inaccessible.  But some drivers
need to access it after spi_unregister_controller() to perform further
teardown steps.

Introduce devm_spi_alloc_master() and devm_spi_alloc_slave(), which
release a reference on the spi_controller struct only after the driver
has unbound, thereby keeping the memory allocation accessible.  Change
spi_unregister_controller() to not release a reference if the
spi_controller was allocated by one of these new devm functions.

The present commit is small enough to be backportable to stable.
It allows fixing drivers which use the private data in their -&gt;remove()
hook after it's been freed.  It also allows fixing drivers which neglect
to release a reference on the spi_controller in the probe error path.

Long-term, most SPI drivers shall be moved over to the devm functions
introduced herein.  The few that can't shall be changed in a treewide
commit to explicitly release the last reference on the controller.
That commit shall amend spi_unregister_controller() to no longer release
a reference, thereby completing the migration.

As a result, the behaviour will be less surprising and more consistent
with subsystems such as IIO, which also includes the private data in the
allocation of the generic iio_dev struct, but calls device_del() in
iio_device_unregister().

Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Link: https://lore.kernel.org/r/272bae2ef08abd21388c98e23729886663d19192.1605121038.git.lukas@wunner.de
Signed-off-by: Mark Brown &lt;broonie@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix data-race in wakeup</title>
<updated>2020-11-24T12:39:06Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-11-17T08:08:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=09b5b608f461cbd4dca250fc480e0f7bef6a140e'/>
<id>urn:sha1:09b5b608f461cbd4dca250fc480e0f7bef6a140e</id>
<content type='text'>
[ Upstream commit f97bb5272d9e95d400d6c8643ebb146b3e3e7842 ]

Mel reported that on some ARM64 platforms loadavg goes bananas and
Will tracked it down to the following race:

  CPU0					CPU1

  schedule()
    prev-&gt;sched_contributes_to_load = X;
    deactivate_task(prev);

					try_to_wake_up()
					  if (p-&gt;on_rq &amp;&amp;) // false
					  if (smp_load_acquire(&amp;p-&gt;on_cpu) &amp;&amp; // true
					      ttwu_queue_wakelist())
					        p-&gt;sched_remote_wakeup = Y;

    smp_store_release(prev-&gt;on_cpu, 0);

where both p-&gt;sched_contributes_to_load and p-&gt;sched_remote_wakeup are
in the same word, and thus the stores X and Y race (and can clobber
one another's data).

Whereas prior to commit c6e7bd7afaeb ("sched/core: Optimize ttwu()
spinning on p-&gt;on_cpu") the p-&gt;on_cpu handoff serialized access to
p-&gt;sched_remote_wakeup (just as it still does with
p-&gt;sched_contributes_to_load) that commit broke that by calling
ttwu_queue_wakelist() with p-&gt;on_cpu != 0.

However, due to

  p-&gt;XXX = X			ttwu()
  schedule()			  if (p-&gt;on_rq &amp;&amp; ...) // false
    smp_mb__after_spinlock()	  if (smp_load_acquire(&amp;p-&gt;on_cpu) &amp;&amp;
    deactivate_task()		      ttwu_queue_wakelist())
      p-&gt;on_rq = 0;		        p-&gt;sched_remote_wakeup = Y;

We can be sure any 'current' store is complete and 'current' is
guaranteed asleep. Therefore we can move p-&gt;sched_remote_wakeup into
the current flags word.

Note: while the observed failure was loadavg accounting gone wrong due
to ttwu() cobbering p-&gt;sched_contributes_to_load, the reverse problem
is also possible where schedule() clobbers p-&gt;sched_remote_wakeup,
this could result in enqueue_entity() wrecking -&gt;vruntime and causing
scheduling artifacts.

Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p-&gt;on_cpu")
Reported-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Debugged-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20201117083016.GK3121392@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
