<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/pci/hotplug, branch v4.19.170</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.170</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.170'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-10-01T11:14:41Z</updated>
<entry>
<title>PCI: pciehp: Fix MSI interrupt race</title>
<updated>2020-10-01T11:14:41Z</updated>
<author>
<name>Stuart Hayes</name>
<email>stuart.w.hayes@gmail.com</email>
</author>
<published>2020-02-19T14:31:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a8cc52270f3d8e8f4faf01ffd6c4a95bbfb55ba4'/>
<id>urn:sha1:a8cc52270f3d8e8f4faf01ffd6c4a95bbfb55ba4</id>
<content type='text'>
[ Upstream commit 8edf5332c39340b9583cf9cba659eb7ec71f75b5 ]

Without this commit, a PCIe hotplug port can stop generating interrupts on
hotplug events, so device adds and removals will not be seen:

The pciehp interrupt handler pciehp_isr() reads the Slot Status register
and then writes back to it to clear the bits that caused the interrupt.  If
a different interrupt event bit gets set between the read and the write,
pciehp_isr() returns without having cleared all of the interrupt event
bits.  If this happens when the MSI isn't masked (which by default it isn't
in handle_edge_irq(), and which it will never be when MSI per-vector
masking is not supported), we won't get any more hotplug interrupts from
that device.

That is expected behavior, according to the PCIe Base Spec r5.0, section
6.7.3.4, "Software Notification of Hot-Plug Events".

Because the Presence Detect Changed and Data Link Layer State Changed event
bits can both get set at nearly the same time when a device is added or
removed, this is more likely to happen than it might seem.  The issue was
found (and can be reproduced rather easily) by connecting and disconnecting
an NVMe storage device on at least one system model where the NVMe devices
were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).

Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
Status register immediately after writing to it, until it sees that all of
the event status bits have been cleared.

[lukas: drop loop count limitation, write "events" instead of "status",
don't loop back in INTx and poll modes, tweak code comment &amp; commit msg]
Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de
Tested-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Signed-off-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()</title>
<updated>2020-08-21T09:05:29Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2020-06-26T17:42:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c59ea9bde42ede641006940a339eabe6669cc1be'/>
<id>urn:sha1:c59ea9bde42ede641006940a339eabe6669cc1be</id>
<content type='text'>
commit dae68d7fd4930315389117e9da35b763f12238f9 upstream.

If context is not NULL in acpiphp_grab_context(), but the
is_going_away flag is set for the device's parent, the reference
counter of the context needs to be decremented before returning
NULL or the context will never be freed, so make that happen.

Fixes: edf5bf34d408 ("ACPI / dock: Use callback pointers from devices' ACPI hotplug contexts")
Reported-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Cc: 3.15+ &lt;stable@vger.kernel.org&gt; # 3.15+
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>PCI: pciehp: Fix indefinite wait on sysfs requests</title>
<updated>2020-04-17T08:48:45Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2020-03-18T11:33:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1ada617e3627ddc734b3bda5576552c815d9014f'/>
<id>urn:sha1:1ada617e3627ddc734b3bda5576552c815d9014f</id>
<content type='text'>
commit 3e487d2e4aa466decd226353755c9d423e8fbacc upstream.

David Hoyer reports that powering pciehp slots up or down via sysfs may
hang:  The call to wait_event() in pciehp_sysfs_enable_slot() and
_disable_slot() does not return because ctrl-&gt;ist_running remains true.

This flag, which was introduced by commit 157c1062fcd8 ("PCI: pciehp: Avoid
returning prematurely from sysfs requests"), signifies that the IRQ thread
pciehp_ist() is running.  It is set to true at the top of pciehp_ist() and
reset to false at the end.  However there are two additional return
statements in pciehp_ist() before which the commit neglected to reset the
flag to false and wake up waiters for the flag.

That omission opens up the following race when powering up the slot:

* pciehp_ist() runs because a PCI_EXP_SLTSTA_PDC event was requested
  by pciehp_sysfs_enable_slot()

* pciehp_ist() turns on slot power via the following call stack:
  pciehp_handle_presence_or_link_change() -&gt; pciehp_enable_slot() -&gt;
  __pciehp_enable_slot() -&gt; board_added() -&gt; pciehp_power_on_slot()

* after slot power is turned on, the link comes up, resulting in a
  PCI_EXP_SLTSTA_DLLSC event

* the IRQ handler pciehp_isr() stores the event in ctrl-&gt;pending_events
  and returns IRQ_WAKE_THREAD

* the IRQ thread is already woken (it's bringing up the slot), but the
  genirq code remembers to re-run the IRQ thread after it has finished
  (such that it can deal with the new event) by setting IRQTF_RUNTHREAD
  via __handle_irq_event_percpu() -&gt; __irq_wake_thread()

* the IRQ thread removes PCI_EXP_SLTSTA_DLLSC from ctrl-&gt;pending_events
  via board_added() -&gt; pciehp_check_link_status() in order to deal with
  presence and link flaps per commit 6c35a1ac3da6 ("PCI: pciehp:
  Tolerate initially unstable link")

* after pciehp_ist() has successfully brought up the slot, it resets
  ctrl-&gt;ist_running to false and wakes up the sysfs requester

* the genirq code re-runs pciehp_ist(), which sets ctrl-&gt;ist_running
  to true but then returns with IRQ_NONE because ctrl-&gt;pending_events
  is empty

* pciehp_sysfs_enable_slot() is finally woken but notices that
  ctrl-&gt;ist_running is true, hence continues waiting

The only way to get the hung task going again is to trigger a hotplug
event which brings down the slot, e.g. by yanking out the card.

The same race exists when powering down the slot because remove_board()
likewise clears link or presence changes in ctrl-&gt;pending_events per commit
3943af9d01e9 ("PCI: pciehp: Ignore Link State Changes after powering off a
slot") and thereby may cause a re-run of pciehp_ist() which returns with
IRQ_NONE without resetting ctrl-&gt;ist_running to false.

Fix by adding a goto label before the teardown steps at the end of
pciehp_ist() and jumping to that label from the two return statements which
currently neglect to reset the ctrl-&gt;ist_running flag.

Fixes: 157c1062fcd8 ("PCI: pciehp: Avoid returning prematurely from sysfs requests")
Link: https://lore.kernel.org/r/cca1effa488065cb055120aa01b65719094bdcb5.1584530321.git.lukas@wunner.de
Reported-by: David Hoyer &lt;David.Hoyer@netapp.com&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Cc: stable@vger.kernel.org	# v4.19+
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>PCI: rpaphp: Correctly match ibm, my-drc-index to drc-name when using drc-info</title>
<updated>2020-01-04T18:12:55Z</updated>
<author>
<name>Tyrel Datwyler</name>
<email>tyreld@linux.ibm.com</email>
</author>
<published>2019-11-11T05:21:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=802534c0ffaeb0adb69a60c3727a66308909cd38'/>
<id>urn:sha1:802534c0ffaeb0adb69a60c3727a66308909cd38</id>
<content type='text'>
[ Upstream commit 4f9f2d3d7a434b7f882b72550194c9278f4a3925 ]

The newer ibm,drc-info property is a condensed description of the old
ibm,drc-* properties (ie. names, types, indexes, and power-domains).
When matching a drc-index to a drc-name we need to verify that the
index is within the start and last drc-index range and map it to a
drc-name using the drc-name-prefix and logical index.

Fix the mapping by checking that the index is within the range of the
current drc-info entry, and build the name from the drc-name-prefix
concatenated with the starting drc-name-suffix value and the sequential
index obtained by subtracting ibm,my-drc-index from this entries
drc-start-index.

Signed-off-by: Tyrel Datwyler &lt;tyreld@linux.ibm.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/1573449697-5448-10-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: rpaphp: Annotate and correctly byte swap DRC properties</title>
<updated>2020-01-04T18:12:54Z</updated>
<author>
<name>Tyrel Datwyler</name>
<email>tyreld@linux.ibm.com</email>
</author>
<published>2019-11-11T05:21:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1dc37bd4750fa5f283fa889b648dc7b52d93f72b'/>
<id>urn:sha1:1dc37bd4750fa5f283fa889b648dc7b52d93f72b</id>
<content type='text'>
[ Upstream commit 0737686778c6dbe0908d684dd5b9c05b127526ba ]

The device tree is in big endian format and any properties directly
retrieved using OF helpers that don't explicitly byte swap should
be annotated. In particular there are several places where we grab
the opaque property value for the old ibm,drc-* properties and the
ibm,my-drc-index property.

Fix this for better static checking by annotating values we know to
explicitly big endian, and byte swap where appropriate.

Signed-off-by: Tyrel Datwyler &lt;tyreld@linux.ibm.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/1573449697-5448-9-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: rpaphp: Don't rely on firmware feature to imply drc-info support</title>
<updated>2020-01-04T18:12:54Z</updated>
<author>
<name>Tyrel Datwyler</name>
<email>tyreld@linux.ibm.com</email>
</author>
<published>2019-11-11T05:21:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a22f89ba4421e0473f0298f95be9b30f62dd7ce0'/>
<id>urn:sha1:a22f89ba4421e0473f0298f95be9b30f62dd7ce0</id>
<content type='text'>
[ Upstream commit 52e2b0f16574afd082cff0f0e8567b2d9f68c033 ]

In the event that the partition is migrated to a platform with older
firmware that doesn't support the ibm,drc-info property the device
tree is modified to remove the ibm,drc-info property and replace it
with the older style ibm,drc-* properties for types, names, indexes,
and power-domains. One of the requirements of the drc-info firmware
feature is that the client is able to handle both the new property,
and old style properties at runtime. Therefore we can't rely on the
firmware feature alone to dictate which property is currently
present in the device tree.

Fix this short coming by checking explicitly for the ibm,drc-info
property, and falling back to the older ibm,drc-* properties if it
doesn't exist.

Signed-off-by: Tyrel Datwyler &lt;tyreld@linux.ibm.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/1573449697-5448-6-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: rpaphp: Fix up pointer to first drc-info entry</title>
<updated>2020-01-04T18:12:53Z</updated>
<author>
<name>Tyrel Datwyler</name>
<email>tyreld@linux.ibm.com</email>
</author>
<published>2019-11-11T05:21:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=68b7ccf6b0ee5f9942fa9295a98e6b5301f3aa21'/>
<id>urn:sha1:68b7ccf6b0ee5f9942fa9295a98e6b5301f3aa21</id>
<content type='text'>
[ Upstream commit 9723c25f99aff0451cfe6392e1b9fdd99d0bf9f0 ]

The first entry of the ibm,drc-info property is an int encoded count
of the number of drc-info entries that follow. The "value" pointer
returned by of_prop_next_u32() is still pointing at the this value
when we call of_read_drc_info_cell(), but the helper function
expects that value to be pointing at the first element of an entry.

Fix up by incrementing the "value" pointer to point at the first
element of the first drc-info entry prior.

Signed-off-by: Tyrel Datwyler &lt;tyreld@linux.ibm.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/1573449697-5448-5-git-send-email-tyreld@linux.ibm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI: pciehp: Avoid returning prematurely from sysfs requests</title>
<updated>2019-12-21T09:57:22Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2019-08-09T10:28:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=248e65f3220e66fb70c25192d693b5499614dda4'/>
<id>urn:sha1:248e65f3220e66fb70c25192d693b5499614dda4</id>
<content type='text'>
commit 157c1062fcd86ade3c674503705033051fd3d401 upstream.

A sysfs request to enable or disable a PCIe hotplug slot should not
return before it has been carried out.  That is sought to be achieved by
waiting until the controller's "pending_events" have been cleared.

However the IRQ thread pciehp_ist() clears the "pending_events" before
it acts on them.  If pciehp_sysfs_enable_slot() / _disable_slot() happen
to check the "pending_events" after they have been cleared but while
pciehp_ist() is still running, the functions may return prematurely
with an incorrect return value.

Fix by introducing an "ist_running" flag which must be false before a sysfs
request is allowed to return.

Fixes: 32a8cef274fe ("PCI: pciehp: Enable/disable exclusively from IRQ thread")
Link: https://lore.kernel.org/linux-pci/1562226638-54134-1-git-send-email-wangxiongfeng2@huawei.com
Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de
Reported-and-tested-by: Xiongfeng Wang &lt;wangxiongfeng2@huawei.com&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge</title>
<updated>2019-12-17T19:35:04Z</updated>
<author>
<name>Mika Westerberg</name>
<email>mika.westerberg@linux.intel.com</email>
</author>
<published>2019-10-30T15:05:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9f5ee706852e43782910de18e74ef357a4b07d23'/>
<id>urn:sha1:9f5ee706852e43782910de18e74ef357a4b07d23</id>
<content type='text'>
commit 77adf9355304f8dcf09054280af5e23fc451ab3d upstream.

Valerio and others reported that commit 84c8b58ed3ad ("ACPI / hotplug /
PCI: Don't scan bridges managed by native hotplug") prevents some recent
LG and HP laptops from booting with endless loop of:

  ACPI Error: No handler or method for GPE 08, disabling event (20190215/evgpe-835)
  ACPI Error: No handler or method for GPE 09, disabling event (20190215/evgpe-835)
  ACPI Error: No handler or method for GPE 0A, disabling event (20190215/evgpe-835)
  ...

What seems to happen is that during boot, after the initial PCI enumeration
when EC is enabled the platform triggers ACPI Notify() to one of the root
ports. The root port itself looks like this:

  pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
  pci 0000:00:1b.0:   bridge window [mem 0xc4000000-0xda0fffff]
  pci 0000:00:1b.0:   bridge window [mem 0x80000000-0xa1ffffff 64bit pref]

The BIOS has configured the root port so that it does not have I/O bridge
window.

Now when the ACPI Notify() is triggered ACPI hotplug handler calls
acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system is
using native PCIe hotplug) and pci_assign_unassigned_bridge_resources() to
allocate resources.

The device connected to the root port is a PCIe switch (Thunderbolt
controller) with two hotplug downstream ports. Because of the hotplug ports
__pci_bus_size_bridges() tries to add "additional I/O" of 256 bytes to each
(DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k as that's the
minimum I/O window size so each hotplug port gets 4k I/O window and the
same happens for the root port (which is also hotplug port). This means
3 * 4k = 12k I/O window.

Because of this pci_assign_unassigned_bridge_resources() ends up opening a
I/O bridge window for the root port at first available I/O address which
seems to be in range 0x1000 - 0x3fff. Normally this range is used for ACPI
stuff such as GPE bits (below is part of /proc/ioports):

    1800-1803 : ACPI PM1a_EVT_BLK
    1804-1805 : ACPI PM1a_CNT_BLK
    1808-180b : ACPI PM_TMR
    1810-1815 : ACPI CPU throttle
    1850-1850 : ACPI PM2_CNT_BLK
    1854-1857 : pnp 00:05
    1860-187f : ACPI GPE0_BLK

However, when the ACPI Notify() happened this range was not yet reserved
for ACPI/PNP (that happens later) so PCI gets it. It then starts writing to
this range and accidentally stomps over GPE bits among other things causing
the endless stream of messages about missing GPE handler.

This problem does not happen if "pci=hpiosize=0" is passed in the kernel
command line. The reason is that then the kernel does not try to allocate
the additional 256 bytes for each hotplug port.

Fix this by allocating resources directly below the non-hotplug bridges
where a new device may appear as a result of ACPI Notify(). This avoids the
hotplug bridges and prevents opening the additional I/O window.

Fixes: 84c8b58ed3ad ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203617
Link: https://lore.kernel.org/r/20191030150545.19885-1-mika.westerberg@linux.intel.com
Reported-by: Valerio Passini &lt;passini.valerio@gmail.com&gt;
Signed-off-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>PCI: portdrv: Initialize service drivers directly</title>
<updated>2019-11-20T17:47:10Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2018-09-20T16:27:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1eeee2fd2e0c79ca5a51f10bb255e330ecbc638c'/>
<id>urn:sha1:1eeee2fd2e0c79ca5a51f10bb255e330ecbc638c</id>
<content type='text'>
[ Upstream commit c29de84149aba5f74e87b6491c13ac7203c12f55 ]

The PCI port driver saves the PCI state after initializing the device with
the applicable service devices.  This was, however, before the service
drivers were even registered because PCI probe happens before the
device_initcall initialized those service drivers.  The config space state
that the services set up were not being saved.  The end result would cause
PCI devices to not react to events that the drivers think they did if the
PCI state ever needed to be restored.

Fix this by changing the service drivers from using the init calls to
having the portdrv driver calling the services directly.  This will get the
state saved as desired, while making the relationship between the port
driver and the services under it more explicit in the code.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Sinan Kaya &lt;okaya@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
