From 1e398eae8407abdc02cde8a449b14d17ed193d56 Mon Sep 17 00:00:00 2001 From: Lukas Wunner Date: Mon, 2 May 2016 13:48:25 -0500 Subject: [PATCH 1/2] PCI: Fix BUG on device attach failure Previously when pci_bus_add_device() called device_attach() and it returned a negative value, we emitted a WARN but carried on. Commit ab1a187bba5c ("PCI: Check device_attach() return value always"), introduced in Linux 4.6-rc1, changed this to unwind all steps preceding device_attach() and to not set dev->is_added = 1. The latter leads to a BUG if pci_bus_add_device() was called from pci_bus_add_devices(). Fix by not recursing to a child bus if device_attach() failed for the bridge leading to it. This can be triggered by plugging in a PCI device (e.g. Thunderbolt) while the system is asleep. The system locks up when woken because device_attach() returns -EPROBE_DEFER. Fixes: ab1a187bba5c ("PCI: Check device_attach() return value always") Signed-off-by: Lukas Wunner Signed-off-by: Bjorn Helgaas Acked-by: Rafael J. Wysocki --- drivers/pci/bus.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 6c9f5467bc5f..23a39fdc311e 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -324,7 +324,9 @@ void pci_bus_add_devices(const struct pci_bus *bus) } list_for_each_entry(dev, &bus->devices, bus_list) { - BUG_ON(!dev->is_added); + /* Skip if device attach failed */ + if (!dev->is_added) + continue; child = dev->subordinate; if (child) pci_bus_add_devices(child); From 9a2a5a638f8eb9c612a7a9af0afab93f506f6ba4 Mon Sep 17 00:00:00 2001 From: Lukas Wunner Date: Mon, 2 May 2016 13:48:31 -0500 Subject: [PATCH 2/2] PCI: Do not treat EPROBE_DEFER as device attach failure Linux 4.5 introduced a behavioral change in device probing during the suspend process with commit 013c074f8642 ("PM / sleep: prohibit devices probing during suspend/hibernation"): It defers device probing during the entire suspend process, starting from the prepare phase and ending with the complete phase. A rule existed before that "we rely on subsystems not to do any probing once a device is suspended" but it is enforced only now (Alan Stern, https://lkml.org/lkml/2015/9/15/908). This resulted in a WARN splat if a PCI device (e.g., Thunderbolt) is plugged in while the system is asleep: Upon waking up, pciehp_resume() discovers new devices in the resume phase and immediately tries to bind them to a driver. Since probing is now deferred, device_attach() returns -EPROBE_DEFER, which provoked a WARN in pci_bus_add_device(). Linux 4.6-rc1 aggravates the situation with commit ab1a187bba5c ("PCI: Check device_attach() return value always"): If device_attach() returns a negative value, pci_bus_add_device() now removes the sysfs and procfs entries for the device and pci_bus_add_devices() subsequently locks up with a BUG. Even with the BUG fixed we're still in trouble because the device remains on the deferred probing list even though its sysfs and procfs entries are gone and its children won't be added. Fix by not interpreting -EPROBE_DEFER as failure. The device will be probed eventually (through device_unblock_probing() in dpm_complete()) and there is proper locking in place to avoid races (e.g., if devices are unplugged again und thus deleted from the system before deferred probing happens, I have tested this). Also, those functions which dereference dev->driver (e.g. pci_pm_*()) do contain proper NULL pointer checks. So it seems safe to ignore -EPROBE_DEFER. Fixes: ab1a187bba5c ("PCI: Check device_attach() return value always") Signed-off-by: Lukas Wunner Signed-off-by: Bjorn Helgaas Acked-by: Rafael J. Wysocki Cc: Grygorii Strashko Cc: Alan Stern --- drivers/pci/bus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 23a39fdc311e..dd7cdbee8029 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -294,7 +294,7 @@ void pci_bus_add_device(struct pci_dev *dev) dev->match_driver = true; retval = device_attach(&dev->dev); - if (retval < 0) { + if (retval < 0 && retval != -EPROBE_DEFER) { dev_warn(&dev->dev, "device attach failed (%d)\n", retval); pci_proc_detach_device(dev); pci_remove_sysfs_dev_files(dev);