1
0
Fork 0
Commit Graph

603 Commits (redonkable)

Author SHA1 Message Date
Jan Kiszka e6ad7faf99 PCI: Make pci_get_new_domain_nr() static
The only user of pci_get_new_domain_nr() is of_pci_bus_find_domain_nr().
Since they are defined in the same file, pci_get_new_domain_nr() can be
made static, which also simplifies preprocessor conditionals.

No functional change intended.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
(cherry picked from commit ae07b78688)
(cherry picked from commit 3c52948d8a977b147e7b36172b9531eb2b7cb59d)
Signed-off-by: Peng Fan <peng.fan@nxp.com>
2018-10-29 11:10:38 +08:00
Daniel Drake 8ebd655833 PCI: Reprogram bridge prefetch registers on resume
commit 083874549f upstream.

On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
suspend/resume.  The affected products include multiple generations of
NVIDIA GPUs and Intel SoCs.  After resume, nouveau logs many errors such
as:

  fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
        [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
  DRM: failed to idle channel 0 [DRM]

Similarly, the NVIDIA proprietary driver also fails after resume (black
screen, 100% CPU usage in Xorg process).  We shipped a sample to NVIDIA for
diagnosis, and their response indicated that it's a problem with the parent
PCI bridge (on the Intel SoC), not the GPU.

Runtime suspend/resume works fine, only S3 suspend is affected.

We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32).  In the
cases that I checked, this register has value 0 and we just have to rewrite
that value.

Linux already saves and restores PCI config space during suspend/resume,
but this register was being skipped because upon resume, it already has
value 0 (the correct, pre-suspend value).

Intel appear to have previously acknowledged this behaviour and the
requirement to rewrite this register:
https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23

Based on that, rewrite the prefetch register values even when that appears
unnecessary.

We have confirmed this solution on all the affected models we have in-hands
(X542UQ, UX533FD, X530UN, V272UN).

Additionally, this solves an issue where r8169 MSI-X interrupts were broken
after S3 suspend/resume on ASUS X441UAR.  This issue was recently worked
around in commit 7bb05b85bc ("r8169: don't use MSI-X on RTL8106e").  It
also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
that we had not yet patched.  I suspect it will also fix the issue that was
worked around in commit 7c53a72245 ("r8169: don't use MSI-X on
RTL8168g").

Thomas Martitz reports that this change also solves an issue where the AMD
Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
suspend/resume.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-By: Peter Wu <peter@lekensteyn.nl>
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-13 09:27:24 +02:00
Sergei Shtylyov 0e66392d98 PCI: OF: Fix I/O space page leak
commit a5fb9fb023 upstream.

When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
driver was left disabled, the kernel crashed with this BUG:

  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried, finally causing the BUG due to trying to remap
already remapped pages.

Introduce the devm_pci_remap_iospace() managed API and replace the
pci_remap_iospace() call with it to fix the bug.

Fixes: dbf9826d57 ("PCI: generic: Convert to DT resource parsing API")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
[lorenzo.pieralisi@arm.com: split commit/updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24 13:09:19 +02:00
Rafael J. Wysocki 64a03d3b24 PCI / PM: Check device_may_wakeup() in pci_enable_wake()
commit cfcadfaad7 upstream.

Commit 0847684cfc (PCI / PM: Simplify device wakeup settings code)
went too far and dropped the device_may_wakeup() check from
pci_enable_wake() which causes wakeup to be enabled during system
suspend, hibernation or shutdown for some PCI devices that are not
allowed by user space to wake up the system from sleep (or power off).

As a result of this, excessive power is drawn by some of the affected
systems while in sleep states or off.

Restore the device_may_wakeup() check in pci_enable_wake(), but make
sure that the PCI bus type's runtime suspend callback will not call
device_may_wakeup() which is about system wakeup from sleep and not
about device wakeup from runtime suspend.

Fixes: 0847684cfc (PCI / PM: Simplify device wakeup settings code)
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:29 +02:00
Kai Heng Feng 89d5c4eb81 PCI / PM: Always check PME wakeup capability for runtime wakeup support
commit 8feaec33b9 upstream.

USB controller ASM1042 stops working after commit de3ef1eb1c (PM /
core: Drop run_wake flag from struct dev_pm_info).

The device in question is not power managed by platform firmware,
furthermore, it only supports PME# from D3cold:
Capabilities: [78] Power Management version 3
       Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
       Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

Before commit de3ef1eb1c, the device never gets runtime suspended.
After that commit, the device gets runtime suspended to D3hot, which can
not generate any PME#.

usb_hcd_pci_probe() unconditionally calls device_wakeup_enable(), hence
device_can_wakeup() in pci_dev_run_wake() always returns true.

So pci_dev_run_wake() needs to check PME wakeup capability as its first
condition.

In addition, change wakeup flag passed to pci_target_state() from false
to true, because we want to find the deepest state different from D3cold
that the device can still generate PME#. In this case, it's D0 for the
device in question.

Fixes: de3ef1eb1c (PM / core: Drop run_wake flag from struct dev_pm_info)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:29 +02:00
David Daney bfd66a406f PCI: Avoid bus reset if bridge itself is broken
[ Upstream commit 357027786f ]

When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.  Children marked with that flag are known not to behave well
after a bus reset.

Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.

Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.

Signed-off-by: David Daney <david.daney@cavium.com>
[jglauber@cavium.com: fixed typo]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>

Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-25 14:26:25 +01:00
Bjorn Helgaas 0f50a49e30 Revert "PCI: Avoid race while enabling upstream bridges"
This reverts commit 40f11adc7c.

Jens found that iwlwifi firmware loading failed on a Lenovo X1 Carbon,
gen4:

  iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-34.ucode failed with error -2
  iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-33.ucode failed with error -2
  iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-32.ucode failed with error -2
  iwlwifi 0000:04:00.0: loaded firmware version 31.532993.0 op_mode iwlmvm
  iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 8260, REV=0x208
  ...
  iwlwifi 0000:04:00.0: Failed to load firmware chunk!
  iwlwifi 0000:04:00.0: Could not load the [0] uCode section
  iwlwifi 0000:04:00.0: Failed to start INIT ucode: -110
  iwlwifi 0000:04:00.0: Failed to run INIT ucode: -110

He bisected it to 40f11adc7c ("PCI: Avoid race while enabling upstream
bridges").  Revert that commit to fix the regression.

Link: http://lkml.kernel.org/r/4bcbcbc1-7c79-09f0-5071-bc2f53bf6574@kernel.dk
Fixes: 40f11adc7c ("PCI: Avoid race while enabling upstream bridges")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Srinath Mannam <srinath.mannam@broadcom.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Luca Coelho <luca@coelho.fi>
CC: Johannes Berg <johannes@sipsolutions.net>
CC: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2017-09-15 01:33:51 -05:00
Linus Torvalds 0d519f2d1e pci-v4.14-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ
 vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq
 64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh
 XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc
 0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6
 B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ
 nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam
 c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz
 4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf
 h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE
 GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ
 EG76ridTolUFVV+wzJD9
 =9cLP
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 - add enhanced Downstream Port Containment support, which prints more
   details about Root Port Programmed I/O errors (Dongdong Liu)

 - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang)

 - add MediaTek MT2712 and MT7622 support (Ryder Lee)

 - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang)

 - add Qualcom IPQ8074 support (Varadarajan Narayanan)

 - add R-Car r8a7743/5 device tree support (Biju Das)

 - add Rockchip per-lane PHY support for better power management (Shawn
   Lin)

 - fix IRQ mapping for hot-added devices by replacing the
   pci_fixup_irqs() boot-time design with a host bridge hook called at
   probe-time (Lorenzo Pieralisi, Matthew Minter)

 - fix race when enabling two devices that results in upstream bridge
   not being enabled correctly (Srinath Mannam)

 - fix pciehp power fault infinite loop (Keith Busch)

 - fix SHPC bridge MSI hotplug events by enabling bus mastering
   (Aleksandr Bezzubikov)

 - fix a VFIO issue by correcting PCIe capability sizes (Alex
   Williamson)

 - fix an INTD issue on Xilinx and possibly other drivers by unifying
   INTx IRQ domain support (Paul Burton)

 - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg
   Roedel)

 - allow APM X-Gene device assignment to guests by adding an ACS quirk
   (Feng Kan)

 - fix driver crashes by disabling Extended Tags on Broadcom HT2100
   (Extended Tags support is required for PCIe Receivers but not
   Requesters, and we now enable them by default when Requesters support
   them) (Sinan Kaya)

 - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs
   use the real Requester ID (not a phantom RID) (Robin Murphy)

 - prevent assignment of Intel VMD children to guests (which may be
   supported eventually, but isn't yet) by not associating an IOMMU with
   them (Jon Derrick)

 - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott
   Bauer)

 - fix a Function-Level Reset issue with Intel 750 NVMe by waiting
   longer (up to 60sec instead of 1sec) for device to become ready
   (Sinan Kaya)

 - fix a Function-Level Reset issue on iProc Stingray by working around
   hardware defects in the CRS implementation (Oza Pawandeep)

 - fix an issue with Intel NVMe P3700 after an iProc reset by adding a
   delay during shutdown (Oza Pawandeep)

 - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking
   in compose_msi_msg() (Stephen Hemminger)

 - fix a wireless LAN driver timeout by clearing DesignWare MSI
   interrupt status after it is handled, not before (Faiz Abbas)

 - fix DesignWare ATU enable checking (Jisheng Zhang)

 - reduce Layerscape dependencies on the bootloader by doing more
   initialization in the driver (Hou Zhiqiang)

 - improve Intel VMD performance allowing allocation of more IRQ vectors
   than present CPUs (Keith Busch)

 - improve endpoint framework support for initial DMA mask, different
   BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon
   Vijay Abraham I, Stan Drozd)

 - rework CRS support to add periodic messages while we poll during
   enumeration and after Function-Level Reset and prepare for possible
   other uses of CRS (Sinan Kaya)

 - clean up Root Port AER handling by removing unnecessary code and
   moving error handler methods to struct pcie_port_service_driver
   (Christoph Hellwig)

 - clean up error handling paths in various drivers (Bjorn Andersson,
   Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen,
   Lorenzo Pieralisi, Sergei Shtylyov)

 - clean up SR-IOV resource handling by disabling VF decoding before
   updating the corresponding resource structs (Gavin Shan)

 - clean up DesignWare-based drivers by unifying quirks to update Class
   Code and Interrupt Pin and related handling of write-protected
   registers (Hou Zhiqiang)

 - clean up by adding empty generic pcibios_align_resource() and
   pcibios_fixup_bus() and removing empty arch-specific implementations
   (Palmer Dabbelt)

 - request exclusive reset control for several drivers to allow cleanup
   elsewhere (Philipp Zabel)

 - constify various structures (Arvind Yadav, Bhumika Goyal)

 - convert from full_name() to %pOF (Rob Herring)

 - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn
   Lin)

* tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits)
  PCI: xgene: Clean up whitespace
  PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset
  PCI: xgene: Fix platform_get_irq() error handling
  PCI: xilinx-nwl: Fix platform_get_irq() error handling
  PCI: rockchip: Fix platform_get_irq() error handling
  PCI: altera: Fix platform_get_irq() error handling
  PCI: spear13xx: Fix platform_get_irq() error handling
  PCI: artpec6: Fix platform_get_irq() error handling
  PCI: armada8k: Fix platform_get_irq() error handling
  PCI: dra7xx: Fix platform_get_irq() error handling
  PCI: exynos: Fix platform_get_irq() error handling
  PCI: iproc: Clean up whitespace
  PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP
  PCI: iproc: Add 500ms delay during device shutdown
  PCI: Fix typos and whitespace errors
  PCI: Remove unused "res" variable from pci_resource_io()
  PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag()
  PCI/AER: Reformat AER register definitions
  iommu/vt-d: Prevent VMD child devices from being remapping targets
  x86/PCI: Use is_vmd() rather than relying on the domain number
  ...
2017-09-08 15:47:43 -07:00
Bjorn Helgaas d872694bac Merge branch 'pci/pm' into next
* pci/pm:
  PCI/PM: Expand description of pci_set_power_state()
2017-09-07 13:24:18 -05:00
Bjorn Helgaas 33db87de6a Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Fix PCIe capability sizes
  PCI: Convert to using %pOF instead of full_name()
  PCI: Constify endpoint pci_epf_type device_type
  PCI: Constify bin_attribute structures
  PCI: Constify hotplug pci_device_id structures
  PCI: Constify hotplug attribute_group structures
  PCI: Constify label attribute_group structures
  PCI: Constify sysfs attribute_group structures
2017-09-07 13:24:16 -05:00
Rafael J. Wysocki 4467ade90d Merge branches 'acpi-scan' and 'acpi-pm'
* acpi-scan:
  ACPI / scan: Enable GPEs before scanning the namespace
  ACPICA: Make it possible to enable runtime GPEs earlier
  ACPICA: Dispatch active GPEs at init time

* acpi-pm:
  ACPI / PM: Add debug statements to acpi_pm_notify_handler()
  ACPI: Add debug statements to acpi_global_event_handler()
  ACPI / sleep: Make acpi_sleep_syscore_init() static
  ACPI / PCI / PM: Rework acpi_pci_propagate_wakeup()
  ACPI / PM: Split acpi_device_wakeup()
  PCI / PM: Skip bridges in pci_enable_wake()
2017-09-03 23:53:19 +02:00
Sinan Kaya 821cdad5c4 PCI: Wait up to 60 seconds for device to become ready after FLR
Sporadic reset issues have been observed with an Intel 750 NVMe drive while
assigning the physical function to the guest machine.  The sequence of
events observed is as follows:

  - perform a Function Level Reset (FLR)
  - sleep up to 1000ms total
  - read ~0 from PCI_COMMAND (CRS completion for config read)
  - warn that the device didn't return from FLR
  - touch the device before it's ready
  - device drops config writes when we restore register settings (there's
    no mechanism for software to learn about CRS completions for writes)
  - incomplete register restore leaves device in inconsistent state
  - device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests. See PCIe r3.1, sec 2.3.1 and 6.6.2.

Increase the timeout value from 1 second to 60 seconds to cover the period
where device responds with CRS and also report polling progress.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: include the mandatory 100ms in the delays we print]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-29 14:45:45 -05:00
Rob Herring b63773a801 PCI: Convert to using %pOF instead of full_name()
Now that we have a custom printf format specifier, convert users of
full_name() to use %pOF instead.  This is preparation for removing storing
of the full path string for each node.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
2017-08-24 11:24:59 -05:00
Srinath Mannam 40f11adc7c PCI: Avoid race while enabling upstream bridges
When we enable a device, we first enable any upstream bridges.  If a bridge
has multiple downstream devices and we enable them simultaneously, the race
to enable the upstream bridge may cause problems.  Consider this hierarchy:

  bridge A --+-- device B
             +-- device C

If drivers for B and C call pci_enable_device() simultaneously, both will
attempt to enable A, which involves setting PCI_COMMAND_MASTER via
pci_set_master() and PCI_COMMAND_MEMORY via pci_enable_resources().

In the following sequence, B's update to set A's PCI_COMMAND_MEMORY is
lost, and neither B nor C will work correctly:

      B                                C
  pci_set_master(A)
    cmd = read(A, PCI_COMMAND)
    cmd |= PCI_COMMAND_MASTER
                                   pci_set_master(A)
                                     cmd = read(A, PCI_COMMAND)
                                     cmd |= PCI_COMMAND_MASTER
    write(A, PCI_COMMAND, cmd)
  pci_enable_device(A)
    pci_enable_resources(A)
      cmd = read(A, PCI_COMMAND)
      cmd |= PCI_COMMAND_MEMORY
      write(A, PCI_COMMAND, cmd)
                                     write(A, PCI_COMMAND, cmd)

Avoid this race by holding a new pci_bridge_mutex while enabling a bridge.
This ensures that both PCI_COMMAND_MASTER and PCI_COMMAND_MEMORY will be
updated before another thread can start enabling the bridge.

Note that although pci_enable_bridge() is recursive, it enables any
upstream bridges *before* acquiring the mutex.  When it acquires the mutex
and calls pci_set_master() and pci_enable_device(), any upstream bridges
have already been enabled so pci_enable_device() will not deadlock by
calling pci_enable_bridge() again.

Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
[bhelgaas: changelog, comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-18 21:50:48 -05:00
Thierry Reding b6f6d56c91 PCI: Allow PCI express root ports to find themselves
If the pci_find_pcie_root_port() function is called on a root port
itself, return the root port rather than NULL.

This effectively reverts commit 0e40523287 ("PCI: fix oops when
try to find Root Port for a PCI device") which added an extra check
that would now be redundant.

Fixes: a99b646afa ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450eb ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 16:14:37 -07:00
Linus Torvalds 510c8a899c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix TCP checksum offload handling in iwlwifi driver, from Emmanuel
    Grumbach.

 2) In ksz DSA tagging code, free SKB if skb_put_padto() fails. From
    Vivien Didelot.

 3) Fix two regressions with bonding on wireless, from Andreas Born.

 4) Fix build when busypoll is disabled, from Daniel Borkmann.

 5) Fix copy_linear_skb() wrt. SO_PEEK_OFF, from Eric Dumazet.

 6) Set SKB cached route properly in inet_rtm_getroute(), from Florian
    Westphal.

 7) Fix PCI-E relaxed ordering handling in cxgb4 driver, from Ding
    Tianhong.

 8) Fix module refcnt leak in ULP code, from Sabrina Dubroca.

 9) Fix use of GFP_KERNEL in atomic contexts in AF_KEY code, from Eric
    Dumazet.

10) Need to purge socket write queue in dccp_destroy_sock(), also from
    Eric Dumazet.

11) Make bpf_trace_printk() work properly on 32-bit architectures, from
    Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  bpf: fix bpf_trace_printk on 32 bit archs
  PCI: fix oops when try to find Root Port for a PCI device
  sfc: don't try and read ef10 data on non-ef10 NIC
  net_sched: remove warning from qdisc_hash_add
  net_sched/sfq: update hierarchical backlog when drop packet
  net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
  ipv4: fix NULL dereference in free_fib_info_rcu()
  net: Fix a typo in comment about sock flags.
  ipv6: fix NULL dereference in ip6_route_dev_notify()
  tcp: fix possible deadlock in TCP stack vs BPF filter
  dccp: purge write queue in dccp_destroy_sock()
  udp: fix linear skb reception with PEEK_OFF
  ipv6: release rt6->rt6i_idev properly during ifdown
  af_key: do not use GFP_KERNEL in atomic contexts
  tcp: ulp: avoid module refcnt leak in tcp_set_ulp
  net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
  net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
  PCI: Disable Relaxed Ordering Attributes for AMD A1100
  PCI: Disable Relaxed Ordering for some Intel processors
  PCI: Disable PCIe Relaxed Ordering if unsupported
  ...
2017-08-15 18:52:28 -07:00
dingtianhong 0e40523287 PCI: fix oops when try to find Root Port for a PCI device
Eric report a oops when booting the system after applying
the commit a99b646afa ("PCI: Disable PCIe Relaxed..."):

[    4.241029] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[    4.247001] IP: pci_find_pcie_root_port+0x62/0x80
[    4.253011] PGD 0
[    4.253011] P4D 0
[    4.253011]
[    4.258013] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[    4.262015] Modules linked in:
[    4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316
[    4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
[    4.279002] task: ffffa2ee38cfa040 task.stack: ffffa51ec0004000
[    4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80
[    4.290012] RSP: 0000:ffffa51ec0007ab8 EFLAGS: 00010246
[    4.295003] RAX: 0000000000000000 RBX: ffffa2ee36bae000 RCX: 0000000000000006
[    4.303002] RDX: 000000000000081c RSI: ffffa2ee38cfa8c8 RDI: ffffa2ee36bae000
[    4.310013] RBP: ffffa51ec0007b58 R08: 0000000000000001 R09: 0000000000000000
[    4.317001] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa51ec0007ad0
[    4.324005] R13: ffffa2ee36bae098 R14: 0000000000000002 R15: ffffa2ee37204818
[    4.331002] FS:  0000000000000000(0000) GS:ffffa2ee3fcc0000(0000) knlGS:0000000000000000
[    4.339002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.345001] CR2: 0000000000000050 CR3: 000000401000f000 CR4: 00000000001406e0
[    4.351002] Call Trace:
[    4.354012]  ? pci_configure_device+0x19f/0x570
[    4.359002]  ? pci_conf1_read+0xb8/0xf0
[    4.363002]  ? raw_pci_read+0x23/0x40
[    4.366011]  ? pci_read+0x2c/0x30
[    4.370014]  ? pci_read_config_word+0x67/0x70
[    4.374012]  pci_device_add+0x28/0x230
[    4.378012]  ? pci_vpd_f0_read+0x50/0x80
[    4.382014]  pci_scan_single_device+0x96/0xc0
[    4.386012]  pci_scan_slot+0x79/0xf0
[    4.389001]  pci_scan_child_bus+0x31/0x180
[    4.394014]  acpi_pci_root_create+0x1c6/0x240
[    4.398013]  pci_acpi_scan_root+0x15f/0x1b0
[    4.402012]  acpi_pci_root_add+0x2e6/0x400
[    4.406012]  ? acpi_evaluate_integer+0x37/0x60
[    4.411002]  acpi_bus_attach+0xdf/0x200
[    4.415002]  acpi_bus_attach+0x6a/0x200
[    4.418014]  acpi_bus_attach+0x6a/0x200
[    4.422013]  acpi_bus_scan+0x38/0x70
[    4.426011]  acpi_scan_init+0x10c/0x271
[    4.429001]  acpi_init+0x2fa/0x348
[    4.433004]  ? acpi_sleep_proc_init+0x2d/0x2d
[    4.437001]  do_one_initcall+0x43/0x169
[    4.441001]  kernel_init_freeable+0x1d0/0x258
[    4.445003]  ? rest_init+0xe0/0xe0
[    4.449001]  kernel_init+0xe/0x150

====================== cut here =============================

It looks like the pci_find_pcie_root_port() was trying to
find the Root Port for the PCI device which is the Root
Port already, it will return NULL and trigger the problem,
so check the highest_pcie_bridge to fix thie problem.

Fixes: a99b646afa ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450eb ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:25:16 -07:00
Piotr Gregor ab4b8a47ab PCI/PM: Expand description of pci_set_power_state()
Add two reasons for returning 0 value to the description of
pci_set_power_state() to include the cases when:

  - the transition is to D1 or D2 but D1 and D2 are not supported
  - the transition is to D3 but D3 is not supported

Signed-off-by: Piotr Gregor <piotrgregor@rsyncme.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-03 18:12:56 -05:00
Marc Zyngier a477b9cd37 PCI: Add pci_reset_function_locked()
The implementation of PCI workarounds may require that the device is reset
from its probe function.  This implies that the PCI device lock is already
held, and makes calling pci_reset_function() impossible (since it will
itself try to take that lock).

Add pci_reset_function_locked(), which is the equivalent of
pci_reset_function(), except that it requires the PCI device lock to be
already held by the caller.

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[bhelgaas: folded in fix for conflict with 52354b9d1f ("PCI: Remove
__pci_dev_reset() and pci_dev_reset()")]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org	# 4.11: 52354b9d1f46: PCI: Remove __pci_dev_reset() and pci_dev_reset()
Cc: stable@vger.kernel.org	# 4.11
2017-08-01 20:11:02 -05:00
Rafael J. Wysocki baecc470d5 PCI / PM: Skip bridges in pci_enable_wake()
PCI bridges only have a reason to generate wakeup signals on behalf
of devices below them, so avoid preparing bridges for wakeup directly
in pci_enable_wake().

Also drop the pci_has_subordinate() check from pci_pm_default_resume()
as this will be done by pci_enable_wake() itself now.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-01 14:05:02 +02:00
Rafael J. Wysocki ec4b8ddcd3 Merge branch 'pm-pci'
* pm-pci:
  PCI / PM: Fix native PME handling during system suspend/resume
  PCI / PM: Restore PME Enable after config space restoration
2017-07-14 13:15:49 +02:00
Rafael J. Wysocki 0ce3fcaff9 PCI / PM: Restore PME Enable after config space restoration
Commit dc15e71eef (PCI / PM: Restore PME Enable if skipping wakeup
setup) introduced a mechanism by which the PME Enable bit can be
restored by pci_enable_wake() if dev->wakeup_prepared is set in
case it has been overwritten by PCI config space restoration.

However, that commit overlooked the fact that on some systems (Dell
XPS13 9360 in particular) the AML handling wakeup events checks PME
Status and PME Enable and it won't trigger a Notify() for devices
where those bits are not set while it is running.

That happens during resume from suspend-to-idle when pci_restore_state()
invoked by pci_pm_default_resume_early() clears PME Enable before the
wakeup events are processed by AML, effectively causing those wakeup
events to be ignored.

Fix this issue by restoring the PME Enable configuration right after
pci_restore_state() has been called instead of doing that in
pci_enable_wake().

Fixes: dc15e71eef (PCI / PM: Restore PME Enable if skipping wakeup setup)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-12 23:09:21 +02:00
Linus Torvalds f263fbb8d6 pci-v4.13-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZYAFUAAoJEFmIoMA60/r8cFQP/A4fpdjhd42WRNQXGTpZieop
 i40lBQtGdBn/UY97U6BoutcS1ygDi9OiSzg+IR6I90iMgidqyUHFhe4hGWgVHD2g
 Tg0KLzd+lKKfQ6Gqt1P6t4dLGLvyEj5NUbCeFE4XYODAUkkiBaOndax6DK1GvU54
 Vjuj63rHtMKFR/tG/4iFTigObqyI8QE6O9JVxwuvIyEX6RXKbJe+wkulv5taSnWt
 Ne94950i10MrELtNreVdi8UbCbXiqjg0r5sKI/WTJ7Bc7WsC7X5PhWlhcNrbHyBT
 Ivhoypkui3Ky8gvwWqL0KBG+cRp8prBXAdabrD9wRbz0TKnfGI6pQzseCGRnkE6T
 mhlSJpsSNIHaejoCjk93yPn5oRiTNtPMdVhMpEQL9V/crVRGRRmbd7v2TYvpMHVR
 JaPZ8bv+C2aBTY8uL3/v/rgrjsMKOYFeaxeNklpErxrknsbgb6BgubmeZXDvTBVv
 YUIbAkvveonUKisv+kbD8L7tp1+jdbRUT0AikS0NVgAJQhfArOmBcDpTL9YC51vE
 feFhkVx4A32vvOm7Zcg9A7IMXNjeSfccKGw3dJOAvzgDODuJiaCG6S0o7B5Yngze
 axMi87ixGT4QM98z/I4MC8E9rDrJdIitlpvb6ZBgiLzoO3kmvsIZZKt8UxWqf5r8
 w3U2HoyKH13Qbkn1xkum
 =mkyb
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

  - add sysfs max_link_speed/width, current_link_speed/width (Wong Vee
    Khee)

  - make host bridge IRQ mapping much more generic (Matthew Minter,
    Lorenzo Pieralisi)

  - convert most drivers to pci_scan_root_bus_bridge() (Lorenzo
    Pieralisi)

  - mutex sriov_configure() (Jakub Kicinski)

  - mutex pci_error_handlers callbacks (Christoph Hellwig)

  - split ->reset_notify() into ->reset_prepare()/reset_done()
    (Christoph Hellwig)

  - support multiple PCIe portdrv interrupts for MSI as well as MSI-X
    (Gabriele Paoloni)

  - allocate MSI/MSI-X vector for Downstream Port Containment (Gabriele
    Paoloni)

  - fix MSI IRQ affinity pre/post/min_vecs issue (Michael Hernandez)

  - test INTx masking during enumeration, not at run-time (Piotr Gregor)

  - avoid using device_may_wakeup() for runtime PM (Rafael J. Wysocki)

  - restore the status of PCI devices across hibernation (Chen Yu)

  - keep parent resources that start at 0x0 (Ard Biesheuvel)

  - enable ECRC only if device supports it (Bjorn Helgaas)

  - restore PRI and PASID state after Function-Level Reset (CQ Tang)

  - skip DPC event if device is not present (Keith Busch)

  - check domain when matching SMBIOS info (Sujith Pandel)

  - mark Intel XXV710 NIC INTx masking as broken (Alex Williamson)

  - avoid AMD SB7xx EHCI USB wakeup defect (Kai-Heng Feng)

  - work around long-standing Macbook Pro poweroff issue (Bjorn Helgaas)

  - add Switchtec "running" status flag (Logan Gunthorpe)

  - fix dra7xx incorrect RW1C IRQ register usage (Arvind Yadav)

  - modify xilinx-nwl IRQ chip for legacy interrupts (Bharat Kumar
    Gogada)

  - move VMD SRCU cleanup after bus, child device removal (Jon Derrick)

  - add Faraday clock handling (Linus Walleij)

  - configure Rockchip MPS and reorganize (Shawn Lin)

  - limit Qualcomm TLP size to 2K (hardware issue) (Srinivas Kandagatla)

  - support Tegra MSI 64-bit addressing (Thierry Reding)

  - use Rockchip normal (not privileged) register bank (Shawn Lin)

  - add HiSilicon Kirin SoC PCIe controller driver (Xiaowei Song)

  - add Sigma Designs Tango SMP8759 PCIe controller driver (Marc
    Gonzalez)

  - add MediaTek PCIe host controller support (Ryder Lee)

  - add Qualcomm IPQ4019 support (John Crispin)

  - add HyperV vPCI protocol v1.2 support (Jork Loeser)

  - add i.MX6 regulator support (Quentin Schulz)

* tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
  PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support
  PCI: Add DT binding for Sigma Designs Tango PCIe controller
  PCI: rockchip: Use normal register bank for config accessors
  dt-bindings: PCI: Add documentation for MediaTek PCIe
  PCI: Remove __pci_dev_reset() and pci_dev_reset()
  PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done()
  PCI: xilinx: Make of_device_ids const
  PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts
  PCI: vmd: Move SRCU cleanup after bus, child device removal
  PCI: vmd: Correct comment: VMD domains start at 0x10000, not 0x1000
  PCI: versatile: Add local struct device pointers
  PCI: tegra: Do not allocate MSI target memory
  PCI: tegra: Support MSI 64-bit addressing
  PCI: rockchip: Use local struct device pointer consistently
  PCI: rockchip: Check for clk_prepare_enable() errors during resume
  MAINTAINERS: Remove Wenrui Li as Rockchip PCIe driver maintainer
  PCI: rockchip: Configure RC's MPS setting
  PCI: rockchip: Reconfigure configuration space header type
  PCI: rockchip: Split out rockchip_pcie_cfg_configuration_accesses()
  PCI: rockchip: Move configuration accesses into rockchip_pcie_cfg_atu()
  ...
2017-07-08 15:51:57 -07:00
Bjorn Helgaas bb02ce95a5 Merge branch 'pci/virtualization' into next
* pci/virtualization:
  PCI: Remove __pci_dev_reset() and pci_dev_reset()
  PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done()
  PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()
  PCI: Protect pci_driver->sriov_configure() usage with device_lock()
  PCI: Mark Intel XXV710 NIC INTx masking as broken
  PCI: Restore PRI and PASID state after Function-Level Reset
  PCI: Cache PRI and PASID bits in pci_dev
2017-07-03 08:00:29 -05:00
Christoph Hellwig 52354b9d1f PCI: Remove __pci_dev_reset() and pci_dev_reset()
Implement the reset probing / reset chain directly in
__pci_probe_reset_function() and __pci_reset_function_locked()
respectively.

Link: http://lkml.kernel.org/r/20170601111039.8913-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-03 07:58:30 -05:00
Christoph Hellwig 775755ed3c PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done()
The pci_error_handlers->reset_notify() method had a flag to indicate
whether to prepare for or clean up after a reset.  The prepare and done
cases have no shared functionality whatsoever, so split them into separate
methods.

[bhelgaas: changelog, update locking comments]
Link: http://lkml.kernel.org/r/20170601111039.8913-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-07-03 07:58:30 -05:00
Bjorn Helgaas 8cd9385034 Merge branch 'pci/resource' into next
* pci/resource:
  PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11
  PCI: Do not disregard parent resources starting at 0x0

Conflicts:
arch/x86/pci/fixup.c
2017-07-02 18:49:49 -05:00
Bjorn Helgaas 2cf816a947 Merge branch 'pci/pm' into next
* pci/pm:
  PCI/PM: Avoid using device_may_wakeup() for runtime PM
  x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect
  PCI/PM: Restore the status of PCI devices across hibernation
  drm/radeon: make MacBook Pro d3_delay quirk more generic
  drm/amdgpu: remove unnecessary save/restore of pdev->d3_delay
  PCI/PM: Add needs_resume flag to avoid suspend complete optimization
  PCI: imx6: Fix config read timeout handling
  switchtec: Fix minor bug with partition ID register
  switchtec: Use new cdev_device_add() helper function
  PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
2017-07-02 18:48:49 -05:00
Rafael J. Wysocki 666ff6f83e PCI/PM: Avoid using device_may_wakeup() for runtime PM
pci_target_state() calls device_may_wakeup() which checks whether or not
the device may wake up the system from sleep states, but pci_target_state()
is used for runtime PM too.

Since runtime PM is expected to always enable remote wakeup if possible,
modify pci_target_state() to take additional argument indicating whether or
not it should look for a state from which the device can signal wakeup and
pass either the return value of device_can_wakeup(), or "false" (if the
device itself is not wakeup-capable) to it from the code related to runtime
PM.

While at it, fix the comment in pci_dev_run_wake() which is not about sleep
states.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2017-06-30 11:15:12 -05:00
Rafael J. Wysocki de3ef1eb1c PM / core: Drop run_wake flag from struct dev_pm_info
The run_wake flag in struct dev_pm_info is used to indicate whether
or not the device is capable of generating remote wakeup signals at
run time (or in the system working state), but the distinction
between runtime remote wakeup and system wakeup signaling has always
been rather artificial.  The only practical reason for it to exist
at the core level was that ACPI and PCI treated those two cases
differently, but that's not the case any more after recent changes.

For this reason, get rid of the run_wake flag and, when applicable,
use device_set_wakeup_capable() and device_can_wakeup() instead of
device_set_run_wake() and device_run_wake(), respectively.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28 01:52:52 +02:00
Rafael J. Wysocki 0847684cfc PCI / PM: Simplify device wakeup settings code
After previous changes it is not necessary to distinguish between
device wakeup for run time and device wakeup from system sleep states
any more, so rework the PCI device wakeup settings code accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28 01:52:45 +02:00
Piotr Gregor 99b3c58f7b PCI: Test INTx masking during enumeration, not at run-time
The test for INTx masking via PCI_COMMAND_INTX_DISABLE performed in
pci_intx_mask_supported() should be done before the device can be used.
This is to avoid writing PCI_COMMAND while the driver owns the device, in
case that has any effect on MSI/MSI-X interrupts.

Move the content of pci_intx_mask_supported() to pci_intx_mask_broken() and
call it from pci_setup_device().

The test result can be queried at any time later using the same
pci_intx_mask_supported() interface as before (though with changed
implementation), so callers (uio, vfio) should be unaffected.

Signed-off-by: Piotr Gregor <piotrgregor@rsyncme.org>
[bhelgaas: changelog, remove quirk check, remove locking, move
dev->broken_intx_masking assignment to caller]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 16:12:37 -05:00
Christoph Hellwig b014e96d1a PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()
Every method in struct device_driver or structures derived from it like
struct pci_driver MUST provide exclusion vs the driver's ->remove() method,
usually by using device_lock().

Protect use of pci_error_handlers->reset_notify() by holding the device
lock while calling it.

Note:

  - pci_dev_lock() calls device_lock() in addition to blocking user-space
    config accesses.

  - pci_err_handlers->reset_notify() is used inside
    pci_dev_save_and_disable() and pci_dev_restore().  We could hold the
    device lock directly in pci_reset_notify(), but we expand the region
    since we have several calls following each other.

Without this, ->reset_notify() may race with ->remove() calls, which can be
easily triggered in NVMe.

[bhelgaas: changelog, add pci_reset_notify() comment]
[bhelgaas: fold in fix from Dan Carpenter <dan.carpenter@oracle.com>:
http://lkml.kernel.org/r/20170701135323.x5vaj4e2wcs2mcro@mwanda]
Link: http://lkml.kernel.org/r/20170601111039.8913-2-hch@lst.de
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-14 21:49:13 -05:00
Rafael J. Wysocki dc15e71eef PCI / PM: Restore PME Enable if skipping wakeup setup
The wakeup_prepared PCI device flag is used for preventing subsequent
changes of PCI device wakeup settings in the same way (e.g. enabling
device wakeup twice in a row).

However, in some cases PME Enable may be updated by things like PCI
configuration space restoration in the meantime and it may need to be
set again even though the rest of the settings need not change, so
modify __pci_enable_wake() to do that when it is about to return
early.

Also, it is reasonable to expect that __pci_enable_wake() will always
clear PME Status when invoked to disable device wakeup, so make it do
so even if it is going to return early then.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-15 00:55:44 +02:00
CQ Tang 4ebeb1ec56 PCI: Restore PRI and PASID state after Function-Level Reset
After a Function-Level Reset, PCI states need to be restored.  Save PASID
features and PRI reqs cached.

[bhelgaas: search for capability only if PRI/PASID were enabled]
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jean-Phillipe Brucker <jean-philippe.brucker@arm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
2017-05-30 15:40:50 -05:00
Imre Deak 4d071c3238 PCI/PM: Add needs_resume flag to avoid suspend complete optimization
Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence.  Add a flag that when set resumes the device before
calling the driver's system suspend handlers which effectively disables
the optimization.

Needed by a future patch fixing suspend/resume on i915.

Suggested by Rafael.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
2017-05-23 14:18:17 -05:00
Ard Biesheuvel 3134233097 PCI: Do not disregard parent resources starting at 0x0
Commit f44116ae88 ("PCI: Remove pci_find_parent_resource() use for
allocation") updated the logic that iterates over all bus resources and
compares them to a given resource, in order to decide whether one is the
parent of the latter.

This change inadvertently causes pci_find_parent_resource() to disregard
resources starting at address 0x0, resulting in an error such as the one
below on ARM systems whose I/O window starts at 0x0.

  pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff window]
  pci_bus 0000:00: root bus resource [io  0x0000-0xffff window]
  pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff window]
  pci_bus 0000:00: root bus resource [bus 00-0f]
  pci 0000:00:01.0: PCI bridge to [bus 01]
  pci 0000:00:02.0: PCI bridge to [bus 02]
  pci 0000:00:03.0: PCI bridge to [bus 03]
  pci 0000:00:03.0: can't claim BAR 13 [io  0x0000-0x0fff]: no compatible bridge window
  pci 0000:03:01.0: can't claim BAR 0 [io  0x0000-0x001f]: no compatible bridge window

While this never happens on x86, it is perfectly legal in general for a PCI
MMIO or IO window to start at address 0x0, and it was supported in the code
before commit f44116ae88.

Drop the test for res->start != 0; resource_contains() already checks
whether [start, end) completely covers the resource, and so it should be
redundant.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-05-17 16:20:35 -05:00
Bjorn Helgaas ef1b5dad5a Merge branch 'pci/virtualization' into next
* pci/virtualization:
  ixgbe: Use pcie_flr() instead of duplicating it
  IB/hfi1: Use pcie_flr() instead of duplicating it
  PCI: Call pcie_flr() from reset_chelsio_generic_dev()
  PCI: Call pcie_flr() from reset_intel_82599_sfp_virtfn()
  PCI: Export pcie_flr()
  PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding
  PCI: Avoid FLR for Intel 82579 NICs

Conflicts:
	include/linux/pci.h
2017-04-28 10:36:12 -05:00
Bjorn Helgaas 78f098383a Merge branch 'pci/resource' into next
* pci/resource:
  PCI: Don't resize resources when realigning all devices in system
  PCI: Don't reassign resources that are already aligned
  PCI: Factor pci_reassigndev_resource_alignment()
  powerpc/powernv: Override pcibios_default_alignment() to force PCI devices to be page aligned
  PCI: Add pcibios_default_alignment() for arch-specific alignment control
  PCI: Fix calculation of bridge window's size and alignment
  PCI: Ignore requested alignment for IOV BARs
  PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant
2017-04-28 10:34:29 -05:00
Bjorn Helgaas acc886ec93 Merge branch 'pci/pm' into next
* pci/pm:
  PCI: Freeze PME scan before suspending devices
  PCI/PM: Don't sleep at all when d3_delay or d3cold_delay is zero
2017-04-28 10:34:24 -05:00
Bjorn Helgaas 0b0ee66c4f Merge branch 'pci/ioremap' into next
* pci/ioremap:
  PCI: versatile: Update PCI config space remap function
  PCI: keystone-dw: Update PCI config space remap function
  PCI: layerscape: Update PCI config space remap function
  PCI: hisi: Update PCI config space remap function
  PCI: tegra: Update PCI config space remap function
  PCI: xgene: Update PCI config space remap function
  PCI: armada8k: Update PCI config space remap function
  PCI: designware: Update PCI config space remap function
  PCI: iproc-platform: Update PCI config space remap function
  PCI: qcom: Update PCI config space remap function
  PCI: rockchip: Update PCI config space remap function
  PCI: spear13xx: Update PCI config space remap function
  PCI: xilinx-nwl: Update PCI config space remap function
  PCI: xilinx: Update PCI config space remap function
  PCI: ECAM: Map config region with pci_remap_cfgspace()
  PCI: Implement devm_pci_remap_cfgspace()
  devres: fix devm_ioremap_*() offset parameter kerneldoc description
  ARM: Implement pci_remap_cfgspace() interface
  ARM64: Implement pci_remap_cfgspace() interface
  linux/io.h: Add pci_remap_cfgspace() interface
  PCI: Remove __weak tag from pci_remap_iospace()
2017-04-28 10:34:05 -05:00
Bjorn Helgaas f503ee4cbe Merge branch 'pci/enumeration' into next
* pci/enumeration:
  PCI: Include PCI-to-PCIe bridges as "Downstream Ports"
  PCI: Improve __pci_read_base() robustness
  PCI: Short-circuit pci_device_is_present() for disconnected devices
  PCI/MSI: Skip disabling disconnected devices
  PCI: Don't attempt config access to disconnected devices
  PCI: Add device disconnected state
  PCI: Export PCI device config accessors
2017-04-28 10:33:55 -05:00
Lorenzo Pieralisi 490cb6ddb1 PCI: Implement devm_pci_remap_cfgspace()
The introduction of the pci_remap_cfgspace() interface allows PCI host
controller drivers to map PCI config space through a dedicated kernel
interface. Current PCI host controller drivers use the devm_ioremap_*()
devres interfaces to map PCI configuration space regions so in order to
update them to the new pci_remap_cfgspace() mapping interface a new set of
devres interfaces should be implemented so that PCI host controller drivers
can make use of them.

Introduce two new functions in the PCI kernel layer and Devres
documentation:

- devm_pci_remap_cfgspace()
- devm_pci_remap_cfg_resource()

so that PCI host controller drivers can make use of them to map PCI
configuration space regions.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
2017-04-24 13:53:13 -05:00
Brian Norris f90b087546 PCI: Export pci_remap_iospace() and pci_unmap_iospace()
These are useful for PCIe host drivers, and those drivers can be modules.

[bhelgaas: don't remove __weak; it's removed elsewhere]
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
2017-04-21 10:57:29 -05:00
Christoph Hellwig a60a2b73ba PCI: Export pcie_flr()
Currently we opencode the FLR sequence in lots of place; export a core
helper instead.  We split out the probing for FLR support as all the
non-core callers already know their hardware.

Note that in the new pci_has_flr() function the quirk check has been moved
before the capability check as there is no point in reading the capability
in this case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20 08:53:51 -05:00
Lorenzo Pieralisi 7b309aef04 PCI: Remove __weak tag from pci_remap_iospace()
pci_remap_iospace() is marked as a weak symbol even though no architecture
is currently overriding it; given that its implementation internals have
already code paths that are arch specific (ie PCI_IOBASE and
ioremap_page_range() attributes) there is no need to leave the weak symbol
in the kernel since the same functionality can be achieved by customizing
per-arch the corresponding functionality.

Remove the __weak symbol from pci_remap_iospace().

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2017-04-19 13:57:13 -05:00
Yongji Xie e3adec72a3 PCI: Don't resize resources when realigning all devices in system
The "pci=resource_alignment" argument aligns BARs of designated devices by
artificially increasing their size.  Increasing the size increases the
alignment and prevents other resources from being assigned in the same
alignment region, e.g., in the same page, but it can break drivers that use
the BAR size to locate things, e.g., ilo_map_device() does this:

  off = pci_resource_len(pdev, bar) - 0x2000;

The new pcibios_default_alignment() interface allows an arch to request
that *all* BARs in the system be aligned to a larger size.  In this case,
we don't need to artificially increase the resource size because we know
every BAR of every device will be realigned, so nothing will share the same
alignment region.

Use IORESOURCE_STARTALIGN to request realignment of PCI BARs when we know
we're realigning all BARs in the system.

[bhelgaas: comment, changelog]
Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19 12:52:47 -05:00
Bjorn Helgaas 0dde1c08d1 PCI: Don't reassign resources that are already aligned
The "pci=resource_alignment=" kernel argument designates devices for which
we want alignment greater than is required by the PCI specs.  Previously we
set IORESOURCE_UNSET for every MEM resource of those devices, even if the
resource was *already* sufficiently aligned.

If a resource is already sufficiently aligned, leave it alone and don't try
to reassign it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19 12:52:40 -05:00
Bjorn Helgaas 81a5e70e0d PCI: Factor pci_reassigndev_resource_alignment()
Pull the BAR size adjustment out into a new function,
pci_request_resource_alignment(), and add a comment about how and why we
increase the resource size and alignment.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19 12:52:22 -05:00
Yongji Xie 0a701aa637 PCI: Add pcibios_default_alignment() for arch-specific alignment control
When VFIO passes through a PCI device to a guest, it does not allow the
guest to mmap BARs that are smaller than PAGE_SIZE unless it can reserve
the rest of the page (see vfio_pci_probe_mmaps()). This is because a page
might contain several small BARs for unrelated devices and a guest should
not be able to access all of them.

VFIO emulates guest accesses to non-mappable BARs, which is functional but
slow. On systems with large page sizes, e.g., PowerNV with 64K pages, BARs
are more likely to share a page and performance is more likely to be a
problem.

Add a weak function to set default alignment for all PCI devices.  An arch
can override it to force the PCI core to place memory BARs on their own
pages.

Signed-off-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-19 12:51:25 -05:00