1
0
Fork 0
Commit Graph

2784 Commits (d2faee42f9e7dbe147de6d049e33ee9de51b404d)

Author SHA1 Message Date
Megha Dey 72a0cfeb51 iommu/vt-d: Populate debugfs if IOMMUs are detected
[ Upstream commit 1da8347d85 ]

Currently, the intel iommu debugfs directory(/sys/kernel/debug/iommu/intel)
gets populated only when DMA remapping is enabled (dmar_disabled = 0)
irrespective of whether interrupt remapping is enabled or not.

Instead, populate the intel iommu debugfs directory if any IOMMUs are
detected.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ee2636b867 ("iommu/vt-d: Enable base Intel IOMMU debugfs support")
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-01 11:01:56 +02:00
Megha Dey cb17ed60ec iommu/vt-d: Fix debugfs register reads
[ Upstream commit ba3b01d7a6 ]

Commit 6825d3ea6c ("iommu/vt-d: Add debugfs support to show register
contents") dumps the register contents for all IOMMU devices.

Currently, a 64 bit read(dmar_readq) is done for all the IOMMU registers,
even though some of the registers are 32 bits, which is incorrect.

Use the correct read function variant (dmar_readl/dmar_readq) while
reading the contents of 32/64 bit registers respectively.

Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Link: https://lore.kernel.org/r/1583784587-26126-2-git-send-email-megha.dey@linux.intel.com
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-01 11:01:55 +02:00
Qian Cai f8de95a236 iommu/vt-d: Silence RCU-list debugging warnings
[ Upstream commit f515241652 ]

Similar to the commit 02d715b4a8 ("iommu/vt-d: Fix RCU list debugging
warnings"), there are several other places that call
list_for_each_entry_rcu() outside of an RCU read side critical section
but with dmar_global_lock held. Silence those false positives as well.

 drivers/iommu/intel-iommu.c:4288 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
  #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1ad/0xb97

 drivers/iommu/dmar.c:366 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
  #0: ffffffff935892c8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x125/0xb97

 drivers/iommu/intel-iommu.c:5057 RCU-list traversed in non-reader section!!
 1 lock held by swapper/0/1:
  #0: ffffffffa71892c8 (dmar_global_lock){++++}, at: intel_iommu_init+0x61a/0xb13

Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-01 11:01:53 +02:00
Suravee Suthikulpanit 62fd4e348d iommu/amd: Fix IOMMU AVIC not properly update the is_run bit in IRTE
commit 730ad0ede1 upstream.

Commit b9c6ff94e4 ("iommu/amd: Re-factor guest virtual APIC
(de-)activation code") accidentally left out the ir_data pointer when
calling modity_irte_ga(), which causes the function amd_iommu_update_ga()
to return prematurely due to struct amd_ir_data.ref is NULL and
the "is_run" bit of IRTE does not get updated properly.

This results in bad I/O performance since IOMMU AVIC always generate GA Log
entry and notify IOMMU driver and KVM when it receives interrupt from the
PCI pass-through device instead of directly inject interrupt to the vCPU.

Fixes by passing ir_data when calling modify_irte_ga() as done previously.

Fixes: b9c6ff94e4 ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:58 +01:00
Daniel Drake 03d524d70e iommu/vt-d: Ignore devices with out-of-spec domain number
commit da72a379b2 upstream.

VMD subdevices are created with a PCI domain ID of 0x10000 or
higher.

These subdevices are also handled like all other PCI devices by
dmar_pci_bus_notifier().

However, when dmar_alloc_pci_notify_info() take records of such devices,
it will truncate the domain ID to a u16 value (in info->seg).
The device at (e.g.) 10000:00:02.0 is then treated by the DMAR code as if
it is 0000:00:02.0.

In the unlucky event that a real device also exists at 0000:00:02.0 and
also has a device-specific entry in the DMAR table,
dmar_insert_dev_scope() will crash on:
   BUG_ON(i >= devices_cnt);

That's basically a sanity check that only one PCI device matches a
single DMAR entry; in this case we seem to have two matching devices.

Fix this by ignoring devices that have a domain number higher than
what can be looked up in the DMAR table.

This problem was carefully diagnosed by Jian-Hong Pan.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Fixes: 59ce0515cd ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:58 +01:00
Zhenzhong Duan 9493a6361d iommu/vt-d: Fix the wrong printing in RHSA parsing
commit b0bb0c22c4 upstream.

When base address in RHSA structure doesn't match base address in
each DRHD structure, the base address in last DRHD is printed out.

This doesn't make sense when there are multiple DRHD units, fix it
by printing the buggy RHSA's base address.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Fixes: fd0c889489 ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:58 +01:00
Qian Cai 4f60640723 iommu/vt-d: Fix RCU-list bugs in intel_iommu_init()
commit 2d48ea0efb upstream.

There are several places traverse RCU-list without holding any lock in
intel_iommu_init(). Fix them by acquiring dmar_global_lock.

 WARNING: suspicious RCU usage
 -----------------------------
 drivers/iommu/intel-iommu.c:5216 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 no locks held by swapper/0/1.

 Call Trace:
  dump_stack+0xa0/0xea
  lockdep_rcu_suspicious+0x102/0x10b
  intel_iommu_init+0x947/0xb13
  pci_iommu_init+0x26/0x62
  do_one_initcall+0xfe/0x500
  kernel_init_freeable+0x45a/0x4f8
  kernel_init+0x11/0x139
  ret_from_fork+0x3a/0x50
 DMAR: Intel(R) Virtualization Technology for Directed I/O

Fixes: d8190dc638 ("iommu/vt-d: Enable DMA remapping after rmrr mapped")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:56 +01:00
Yonghyun Hwang 8159e369d1 iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
commit 77a1bce84b upstream.

intel_iommu_iova_to_phys() has a bug when it translates an IOVA for a huge
page onto its corresponding physical address. This commit fixes the bug by
accomodating the level of page entry for the IOVA and adds IOVA's lower
address to the physical address.

Cc: <stable@vger.kernel.org>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Yonghyun Hwang <yonghyun@google.com>
Fixes: 3871794642 ("VT-d: Changes to support KVM")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:54 +01:00
Hans de Goede 798c1441bd iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
commit 5983369644 upstream.

Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:

 * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
 * significant kernel issues that need prompt attention if they should ever
 * appear at runtime.
 *
 * Do not use these macros when checking for invalid external inputs

The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.

Some distros, e.g. Fedora, have tools watching for the kernel backtraces
logged by the WARN macros and offer the user an option to file a bug for
this when these are encountered. The WARN_TAINT in warn_invalid_dmar()
+ another iommu WARN_TAINT, addressed in another patch, have lead to over
a 100 bugs being filed this way.

This commit replaces the WARN_TAINT("...") calls, with
pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) calls
avoiding the backtrace and thus also avoiding bug-reports being filed
about this against the kernel.

Fixes: fd0c889489 ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables")
Fixes: e625b4a95d ("iommu/vt-d: Parse ANDD records")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200309140138.3753-2-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1564895
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:54 +01:00
Marc Zyngier 77abae8657 iommu/dma: Fix MSI reservation allocation
commit 65ac74f1de upstream.

The way cookie_init_hw_msi_region() allocates the iommu_dma_msi_page
structures doesn't match the way iommu_put_dma_cookie() frees them.

The former performs a single allocation of all the required structures,
while the latter tries to free them one at a time. It doesn't quite
work for the main use case (the GICv3 ITS where the range is 64kB)
when the base granule size is 4kB.

This leads to a nice slab corruption on teardown, which is easily
observable by simply creating a VF on a SRIOV-capable device, and
tearing it down immediately (no need to even make use of it).
Fortunately, this only affects systems where the ITS isn't translated
by the SMMU, which are both rare and non-standard.

Fix it by allocating iommu_dma_msi_page structures one at a time.

Fixes: 7c1b058c8b ("iommu/dma: Handle IOMMU API reserved regions")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:54 +01:00
Hans de Goede 3ca828bd0f iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint
commit 81ee85d046 upstream.

Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:

 * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
 * significant kernel issues that need prompt attention if they should ever
 * appear at runtime.
 *
 * Do not use these macros when checking for invalid external inputs

The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.

Fixes: 556ab45f9a ("ioat2: catch and recover from broken vtd configurations v6")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200309182510.373875-1-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=701847
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18 07:17:49 +01:00
Kai-Heng Feng 9b412c4aa3 iommu/amd: Disable IOMMU on Stoney Ridge systems
[ Upstream commit 3dfee47b21 ]

Serious screen flickering when Stoney Ridge outputs to a 4K monitor.

Use identity-mapping and PCI ATS doesn't help this issue.

According to Alex Deucher, IOMMU isn't enabled on Windows, so let's do
the same here to avoid screen flickering on 4K monitor.

Cc: Alex Deucher <alexander.deucher@amd.com>
Bug: https://gitlab.freedesktop.org/drm/amd/issues/961
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-12 13:00:14 +01:00
Robin Murphy b76e00b67d iommu/qcom: Fix bogus detach logic
commit faf305c51a upstream.

Currently, the implementation of qcom_iommu_domain_free() is guaranteed
to do one of two things: WARN() and leak everything, or dereference NULL
and crash. That alone is terrible, but in fact the whole idea of trying
to track the liveness of a domain via the qcom_domain->iommu pointer as
a sanity check is full of fundamentally flawed assumptions. Make things
robust and actually functional by not trying to be quite so clever.

Reported-by: Brian Masney <masneyb@onstation.org>
Tested-by: Brian Masney <masneyb@onstation.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 0ae349a0f3 ("iommu/qcom: Add qcom_iommu")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Stephan Gerhold <stephan@gerhold.net>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28 17:22:12 +01:00
Lu Baolu 777baa1baf iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
[ Upstream commit 857f081426 ]

Address field in device TLB invalidation descriptor is qualified
by the S field. If S field is zero, a single page at page address
specified by address [63:12] is requested to be invalidated. If S
field is set, the least significant bit in the address field with
value 0b (say bit N) indicates the invalidation address range. The
spec doesn't require the address [N - 1, 0] to be cleared, hence
remove the unnecessary WARN_ON_ONCE().

Otherwise, the caller might set "mask = MAX_AGAW_PFN_WIDTH" in order
to invalidating all the cached mappings on an endpoint, and below
overflow error will be triggered.

[...]
UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1354:3
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

Reported-and-tested-by: Frank <fgndev@posteo.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:57 +01:00
Will Deacon 480494e28a iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
[ Upstream commit d71e01716b ]

If, for some bizarre reason, the compiler decided to split up the write
of STE DWORD 0, we could end up making a partial structure valid.

Although this probably won't happen, follow the example of the
context-descriptor code and use WRITE_ONCE() to ensure atomicity of the
write.

Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:49 +01:00
Jacob Pan 960671ac50 iommu/vt-d: Avoid sending invalid page response
[ Upstream commit 5f75585e19 ]

Page responses should only be sent when last page in group (LPIG) or
private data is present in the page request. This patch avoids sending
invalid descriptors.

Fixes: 5d308fc1ec ("iommu/vt-d: Add 256-bit invalidation descriptor support")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:43 +01:00
Jacob Pan 2aab9e9d1f iommu/vt-d: Match CPU and IOMMU paging mode
[ Upstream commit 79db7e1b4c ]

When setting up first level page tables for sharing with CPU, we need
to ensure IOMMU can support no less than the levels supported by the
CPU.

It is not adequate, as in the current code, to set up 5-level paging
in PASID entry First Level Paging Mode(FLPM) solely based on CPU.

Currently, intel_pasid_setup_first_level() is only used by native SVM
code which already checks paging mode matches. However, future use of
this helper function may not be limited to native SVM.
https://lkml.org/lkml/2019/11/18/1037

Fixes: 437f35e1cd ("iommu/vt-d: Add first level page table interface")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:43 +01:00
Qian Cai fa0150ba88 iommu/iova: Silence warnings under memory pressure
[ Upstream commit 944c917539 ]

When running heavy memory pressure workloads, this 5+ old system is
throwing endless warnings below because disk IO is too slow to recover
from swapping. Since the volume from alloc_iova_fast() could be large,
once it calls printk(), it will trigger disk IO (writing to the log
files) and pending softirqs which could cause an infinite loop and make
no progress for days by the ongoimng memory reclaim. This is the counter
part for Intel where the AMD part has already been merged. See the
commit 3d70889532 ("iommu/amd: Silence warnings under memory
pressure"). Since the allocation failure will be reported in
intel_alloc_iova(), so just call dev_err_once() there because even the
"ratelimited" is too much, and silence the one in alloc_iova_mem() to
avoid the expensive warn_alloc().

 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 slab_out_of_memory: 66 callbacks suppressed
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order:
0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order:
0, min order: 0
   node 0: slabs: 1822, objs: 16398, free: 0
   node 1: slabs: 2051, objs: 18459, free: 31
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: iommu_iova, object size: 40, buffer size: 448, default order:
0, min order: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 1: slabs: 381, objs: 2286, free: 27
   node 0: slabs: 1822, objs: 16398, free: 0
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 1: slabs: 2051, objs: 18459, free: 31
   node 0: slabs: 697, objs: 4182, free: 0
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
   node 1: slabs: 381, objs: 2286, free: 27
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
   node 1: slabs: 381, objs: 2286, free: 27
 hpsa 0000:03:00.0: DMAR: Allocating 1-page iova failed
 warn_alloc: 96 callbacks suppressed
 kworker/11:1H: page allocation failure: order:0,
mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0-1
 CPU: 11 PID: 1642 Comm: kworker/11:1H Tainted: G    B
 Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19
12/27/2015
 Workqueue: kblockd blk_mq_run_work_fn
 Call Trace:
  dump_stack+0xa0/0xea
  warn_alloc.cold.94+0x8a/0x12d
  __alloc_pages_slowpath+0x1750/0x1870
  __alloc_pages_nodemask+0x58a/0x710
  alloc_pages_current+0x9c/0x110
  alloc_slab_page+0xc9/0x760
  allocate_slab+0x48f/0x5d0
  new_slab+0x46/0x70
  ___slab_alloc+0x4ab/0x7b0
  __slab_alloc+0x43/0x70
  kmem_cache_alloc+0x2dd/0x450
 SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
  alloc_iova+0x33/0x210
   cache: skbuff_head_cache, object size: 208, buffer size: 640, default
order: 0, min order: 0
   node 0: slabs: 697, objs: 4182, free: 0
  alloc_iova_fast+0x62/0x3d1
   node 1: slabs: 381, objs: 2286, free: 27
  intel_alloc_iova+0xce/0xe0
  intel_map_sg+0xed/0x410
  scsi_dma_map+0xd7/0x160
  scsi_queue_rq+0xbf7/0x1310
  blk_mq_dispatch_rq_list+0x4d9/0xbc0
  blk_mq_sched_dispatch_requests+0x24a/0x300
  __blk_mq_run_hw_queue+0x156/0x230
  blk_mq_run_work_fn+0x3b/0x40
  process_one_work+0x579/0xb90
  worker_thread+0x63/0x5b0
  kthread+0x1e6/0x210
  ret_from_fork+0x3a/0x50
 Mem-Info:
 active_anon:2422723 inactive_anon:361971 isolated_anon:34403
  active_file:2285 inactive_file:1838 isolated_file:0
  unevictable:0 dirty:1 writeback:5 unstable:0
  slab_reclaimable:13972 slab_unreclaimable:453879
  mapped:2380 shmem:154 pagetables:6948 bounce:0
  free:19133 free_pcp:7363 free_cma:0

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:40 +01:00
Suravee Suthikulpanit 8c35843545 iommu/amd: Only support x2APIC with IVHD type 11h/40h
[ Upstream commit 966b753cf3 ]

Current implementation for IOMMU x2APIC support makes use of
the MMIO access to MSI capability block registers, which requires
checking EFR[MsiCapMmioSup]. However, only IVHD type 11h/40h contain
the information, and not in the IVHD type 10h IOMMU feature reporting
field. Since the BIOS in newer systems, which supports x2APIC, would
normally contain IVHD type 11h/40h, remove the IOMMU_FEAT_XTSUP_SHIFT
check for IVHD type 10h, and only support x2APIC with IVHD type 11h/40h.

Fixes: 6692981295 ('iommu/amd: Add support for X2APIC IOMMU interrupts')
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:40 +01:00
Suravee Suthikulpanit b1b7add9d2 iommu/amd: Check feature support bit before accessing MSI capability registers
[ Upstream commit 813071438e ]

The IOMMU MMIO access to MSI capability registers is available only if
the EFR[MsiCapMmioSup] is set. Current implementation assumes this bit
is set if the EFR[XtSup] is set, which might not be the case.

Fix by checking the EFR[MsiCapMmioSup] before accessing the MSI address
low/high and MSI data registers via the MMIO.

Fixes: 6692981295 ('iommu/amd: Add support for X2APIC IOMMU interrupts')
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:39 +01:00
James Sewart 27a35f0936 PCI: Add nr_devfns parameter to pci_add_dma_alias()
[ Upstream commit 09298542cd ]

Add a "nr_devfns" parameter to pci_add_dma_alias() so it can be used to
create DMA aliases for a range of devfns.

[bhelgaas: incorporate nr_devfns fix from James, update
quirk_pex_vca_alias() and setup_aliases()]
Signed-off-by: James Sewart <jamessewart@arista.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:37 +01:00
Jacob Pan 9b743915bd iommu/vt-d: Fix off-by-one in PASID allocation
[ Upstream commit 39d630e332 ]

PASID allocator uses IDR which is exclusive for the end of the
allocation range. There is no need to decrement pasid_max.

Fixes: af39507305 ("iommu/vt-d: Apply global PASID in SVA")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24 08:36:26 +01:00
Shameer Kolothum 451b91d88a iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA
commit 935d43ba27 upstream.

CMDQ_OP_TLBI_NH_VA requires VMID and this was missing since
commit 1c27df1c0a ("iommu/arm-smmu: Use correct address mask
for CMD_TLBI_S2_IPA"). Add it back.

Fixes: 1c27df1c0a ("iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA")
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-14 16:34:16 -05:00
Logan Gunthorpe b02b0a6bcc iommu/amd: Support multiple PCI DMA aliases in IRQ Remapping
[ Upstream commit 3c124435e8 ]

Non-Transparent Bridge (NTB) devices (among others) may have many DMA
aliases seeing the hardware will send requests with different device ids
depending on their origin across the bridged hardware.

See commit ad281ecf1c ("PCI: Add DMA alias quirk for Microsemi Switchtec
NTB") for more information on this.

The AMD IOMMU IRQ remapping functionality ignores all PCI aliases for
IRQs so if devices send an interrupt from one of their aliases they
will be blocked on AMD hardware with the IOMMU enabled.

To fix this, ensure IRQ remapping is enabled for all aliases with
MSI interrupts.

This is analogous to the functionality added to the Intel IRQ remapping
code in commit 3f0c625c6a ("iommu/vt-d: Allow interrupts from the entire
bus for aliased devices")

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-01 09:34:48 +00:00
Logan Gunthorpe 1f03a258f2 iommu/amd: Support multiple PCI DMA aliases in device table
[ Upstream commit 3332364e4e ]

Non-Transparent Bridge (NTB) devices (among others) may have many DMA
aliases seeing the hardware will send requests with different device ids
depending on their origin across the bridged hardware.

See commit ad281ecf1c ("PCI: Add DMA alias quirk for Microsemi
Switchtec NTB") for more information on this.

The AMD IOMMU ignores all the PCI aliases except the last one so DMA
transfers from these aliases will be blocked on AMD hardware with the
IOMMU enabled.

To fix this, ensure the DTEs are cloned for every PCI alias. This is
done by copying the DTE data for each alias as well as the IVRS alias
every time it is changed.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-01 09:34:48 +00:00
Qian Cai a347d66cbe iommu/dma: fix variable 'cookie' set but not used
[ Upstream commit 55817b340a ]

The commit c18647900e ("iommu/dma: Relax locking in
iommu_dma_prepare_msi()") introduced a compliation warning,

drivers/iommu/dma-iommu.c: In function 'iommu_dma_prepare_msi':
drivers/iommu/dma-iommu.c:1206:27: warning: variable 'cookie' set but
not used [-Wunused-but-set-variable]
  struct iommu_dma_cookie *cookie;
                           ^~~~~~

Fixes: c18647900e ("iommu/dma: Relax locking in iommu_dma_prepare_msi()")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-01 09:34:45 +00:00
Shuah Khan 16aab32ff8 iommu/amd: Fix IOMMU perf counter clobbering during init
commit 8c17bbf6c8 upstream.

init_iommu_perf_ctr() clobbers the register when it checks write access
to IOMMU perf counters and fails to restore when they are writable.

Add save and restore to fix it.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fixes: 30861ddc9c ("perf/x86/amd: Add IOMMU Performance Counter resource management")
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29 16:45:29 +01:00
Jerry Snitselaar 5d1973adcf iommu/vt-d: Call __dmar_remove_one_dev_info with valid pointer
commit bf708cfb2f upstream.

It is possible for archdata.iommu to be set to
DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO so check for
those values before calling __dmar_remove_one_dev_info. Without a
check it can result in a null pointer dereference. This has been seen
while booting a kdump kernel on an HP dl380 gen9.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org # 5.3+
Cc: linux-kernel@vger.kernel.org
Fixes: ae23bfb68f ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29 16:45:27 +01:00
Yong Wu f07d3e39f0 iommu/mediatek: Add a new tlb_lock for tlb_flush
commit da3cc91b8d upstream.

The commit 4d689b6194 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API
TLB sync") help move the tlb_sync of unmap from v7s into the iommu
framework. It helps add a new function "mtk_iommu_iotlb_sync", But it
lacked the lock, then it will cause the variable "tlb_flush_active"
may be changed unexpectedly, we could see this warning log randomly:

mtk-iommu 10205000.iommu: Partial TLB flush timed out, falling back to
full flush

The HW requires tlb_flush/tlb_sync in pairs strictly, this patch adds
a new tlb_lock for tlb operations to fix this issue.

Fixes: 4d689b6194 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:59 +01:00
Yong Wu e5c3362bc2 iommu/mediatek: Correct the flush_iotlb_all callback
commit 2009122f1d upstream.

Use the correct tlb_flush_all instead of the original one.

Fixes: 4d689b6194 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API TLB sync")
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:59 +01:00
Jon Derrick 297d6a06a8 iommu: Remove device link to group on failure
commit 7d4e6ccd1f upstream.

This adds the missing teardown step that removes the device link from
the group when the device addition fails.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Fixes: 797a8b4d76 ("iommu: Handle default domain attach failure")
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:20 +01:00
Jon Derrick f9fbac39cf iommu/vt-d: Unlink device if failed to add to group
commit f78947c409 upstream.

If the device fails to be added to the group, make sure to unlink the
reference before returning.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Fixes: 39ab9555c2 ("iommu: Add sysfs bindings for struct iommu_device")
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17 19:48:20 +01:00
Patrick Steinhardt d12d10211b iommu/vt-d: Fix adding non-PCI devices to Intel IOMMU
commit 4a350a0ee5 upstream.

Starting with commit fa212a97f3 ("iommu/vt-d: Probe DMA-capable ACPI
name space devices"), we now probe DMA-capable ACPI name
space devices. On Dell XPS 13 9343, which has an Intel LPSS platform
device INTL9C60 enumerated via ACPI, this change leads to the following
warning:

    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 1 at pci_device_group+0x11a/0x130
    CPU: 1 PID: 1 Comm: swapper/0 Tainted: G                T 5.5.0-rc3+ #22
    Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A20 06/06/2019
    RIP: 0010:pci_device_group+0x11a/0x130
    Code: f0 ff ff 48 85 c0 49 89 c4 75 c4 48 8d 74 24 10 48 89 ef e8 48 ef ff ff 48 85 c0 49 89 c4 75 af e8 db f7 ff ff 49 89 c4 eb a5 <0f> 0b 49 c7 c4 ea ff ff ff eb 9a e8 96 1e c7 ff 66 0f 1f 44 00 00
    RSP: 0000:ffffc0d6c0043cb0 EFLAGS: 00010202
    RAX: 0000000000000000 RBX: ffffa3d1d43dd810 RCX: 0000000000000000
    RDX: ffffa3d1d4fecf80 RSI: ffffa3d12943dcc0 RDI: ffffa3d1d43dd810
    RBP: ffffa3d1d43dd810 R08: 0000000000000000 R09: ffffa3d1d4c04a80
    R10: ffffa3d1d4c00880 R11: ffffa3d1d44ba000 R12: 0000000000000000
    R13: ffffa3d1d4383b80 R14: ffffa3d1d4c090d0 R15: ffffa3d1d4324530
    FS:  0000000000000000(0000) GS:ffffa3d1d6700000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000000460a001 CR4: 00000000003606e0
    Call Trace:
     ? iommu_group_get_for_dev+0x81/0x1f0
     ? intel_iommu_add_device+0x61/0x170
     ? iommu_probe_device+0x43/0xd0
     ? intel_iommu_init+0x1fa2/0x2235
     ? pci_iommu_init+0x52/0xe7
     ? e820__memblock_setup+0x15c/0x15c
     ? do_one_initcall+0xcc/0x27e
     ? kernel_init_freeable+0x169/0x259
     ? rest_init+0x95/0x95
     ? kernel_init+0x5/0xeb
     ? ret_from_fork+0x35/0x40
    ---[ end trace 28473e7abc25b92c ]---
    DMAR: ACPI name space devices didn't probe correctly

The bug results from the fact that while we now enumerate ACPI devices,
we aren't able to handle any non-PCI device when generating the device
group. Fix the issue by implementing an Intel-specific callback that
returns `pci_device_group` only if the device is a PCI device.
Otherwise, it will return a generic device group.

Fixes: fa212a97f3 ("iommu/vt-d: Probe DMA-capable ACPI name space devices")
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Cc: stable@vger.kernel.org # v5.3+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-14 20:08:32 +01:00
Robin Murphy 2d26e06cb1 iommu/dma: Relax locking in iommu_dma_prepare_msi()
[ Upstream commit c18647900e ]

Since commit ece6e6f021 ("iommu/dma-iommu: Split iommu_dma_map_msi_msg()
in two parts"), iommu_dma_prepare_msi() should no longer have to worry
about preempting itself, nor being called in atomic context at all. Thus
we can downgrade the IRQ-safe locking to a simple mutex to avoid angering
the new might_sleep() check in iommu_map().

Reported-by: Qian Cai <cai@lca.pw>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:21:38 +01:00
Xiaotao Yin c65eddfef6 iommu/iova: Init the struct iova to fix the possible memleak
[ Upstream commit 472d26df5e ]

During ethernet(Marvell octeontx2) set ring buffer test:
ethtool -G eth1 rx <rx ring size> tx <tx ring size>
following kmemleak will happen sometimes:

unreferenced object 0xffff000b85421340 (size 64):
  comm "ethtool", pid 867, jiffies 4295323539 (age 550.500s)
  hex dump (first 64 bytes):
    80 13 42 85 0b 00 ff ff ff ff ff ff ff ff ff ff  ..B.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000001b204ddf>] kmem_cache_alloc+0x1b0/0x350
    [<00000000d9ef2e50>] alloc_iova+0x3c/0x168
    [<00000000ea30f99d>] alloc_iova_fast+0x7c/0x2d8
    [<00000000b8bb2f1f>] iommu_dma_alloc_iova.isra.0+0x12c/0x138
    [<000000002f1a43b5>] __iommu_dma_map+0x8c/0xf8
    [<00000000ecde7899>] iommu_dma_map_page+0x98/0xf8
    [<0000000082004e59>] otx2_alloc_rbuf+0xf4/0x158
    [<000000002b107f6b>] otx2_rq_aura_pool_init+0x110/0x270
    [<00000000c3d563c7>] otx2_open+0x15c/0x734
    [<00000000a2f5f3a8>] otx2_dev_open+0x3c/0x68
    [<00000000456a98b5>] otx2_set_ringparam+0x1ac/0x1d4
    [<00000000f2fbb819>] dev_ethtool+0xb84/0x2028
    [<0000000069b67c5a>] dev_ioctl+0x248/0x3a0
    [<00000000af38663a>] sock_ioctl+0x280/0x638
    [<000000002582384c>] do_vfs_ioctl+0x8b0/0xa80
    [<000000004e1a2c02>] ksys_ioctl+0x84/0xb8

The reason:
When alloc_iova_mem() without initial with Zero, sometimes fpn_lo will
equal to IOVA_ANCHOR by chance, so when return with -ENOMEM(iova32_full)
from __alloc_and_insert_iova_range(), the new_iova will not be freed in
free_iova_mem().

Fixes: bb68b2fbfb ("iommu/iova: Add rbtree anchor node")
Signed-off-by: Xiaotao Yin <xiaotao.yin@windriver.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:21:35 +01:00
Lu Baolu 6f2c72738d iommu/vt-d: Remove incorrect PSI capability check
commit f81b846dcd upstream.

The PSI (Page Selective Invalidation) bit in the capability register
is only valid for second-level translation. Intel IOMMU supporting
scalable mode must support page/address selective IOTLB invalidation
for first-level translation. Remove the PSI capability check in SVA
cache invalidation code.

Fixes: 8744daf4b0 ("iommu/vt-d: Remove global page flush support")
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-09 10:20:02 +01:00
Jean-Philippe Brucker 6b1400f260 iommu/arm-smmu-v3: Don't display an error when IRQ lines are missing
[ Upstream commit f7aff1a93f ]

Since commit 7723f4c5ec ("driver core: platform: Add an error message
to platform_get_irq*()"), platform_get_irq_byname() displays an error
when the IRQ isn't found. Since the SMMUv3 driver uses that function to
query which interrupt method is available, the message is now displayed
during boot for any SMMUv3 that doesn't implement the combined
interrupt, or that implements MSIs.

[   20.700337] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ combined not found
[   20.706508] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ eventq not found
[   20.712503] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ priq not found
[   20.718325] arm-smmu-v3 arm-smmu-v3.7.auto: IRQ gerror not found

Use platform_get_irq_byname_optional() to avoid displaying a spurious
error.

Fixes: 7723f4c5ec ("driver core: platform: Add an error message to platform_get_irq*()")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:17:25 +01:00
Thierry Reding d23e93e7fe iommu/tegra-smmu: Fix page tables in > 4 GiB memory
[ Upstream commit 96d3ab802e ]

Page tables that reside in physical memory beyond the 4 GiB boundary are
currently not working properly. The reason is that when the physical
address for page directory entries is read, it gets truncated at 32 bits
and can cause crashes when passing that address to the DMA API.

Fix this by first casting the PDE value to a dma_addr_t and then using
the page frame number mask for the SMMU instance to mask out the invalid
bits, which are typically used for mapping attributes, etc.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:39 +01:00
Ezequiel Garcia 4f43e37b58 iommu: rockchip: Free domain on .domain_free
[ Upstream commit 42bb97b80f ]

IOMMU domain resource life is well-defined, managed
by .domain_alloc and .domain_free.

Therefore, domain-specific resources shouldn't be tied to
the device life, but instead to its domain.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:16:38 +01:00
Jerry Snitselaar 90a7ae8add iommu/vt-d: Allocate reserved region for ISA with correct permission
commit cde9319e88 upstream.

Currently the reserved region for ISA is allocated with no
permissions. If a dma domain is being used, mapping this region will
fail. Set the permissions to DMA_PTE_READ|DMA_PTE_WRITE.

Cc: Joerg Roedel <jroedel@suse.de>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: stable@vger.kernel.org # v5.3+
Fixes: d850c2ee5f ("iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:45:54 +01:00
Alex Williamson e04f7db2bc iommu/vt-d: Set ISA bridge reserved region as relaxable
commit d8018a0e91 upstream.

Commit d850c2ee5f ("iommu/vt-d: Expose ISA direct mapping region via
iommu_get_resv_regions") created a direct-mapped reserved memory region
in order to replace the static identity mapping of the ISA address
space, where the latter was then removed in commit df4f3c603a
("iommu/vt-d: Remove static identity map code").  According to the
history of this code and the Kconfig option surrounding it, this direct
mapping exists for the benefit of legacy ISA drivers that are not
compatible with the DMA API.

In conjuntion with commit 9b77e5c798 ("vfio/type1: check dma map
request is within a valid iova range") this change introduced a
regression where the vfio IOMMU backend enforces reserved memory regions
per IOMMU group, preventing userspace from creating IOMMU mappings
conflicting with prescribed reserved regions.  A necessary prerequisite
for the vfio change was the introduction of "relaxable" direct mappings
introduced by commit adfd373820 ("iommu: Introduce
IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions").  These relaxable
direct mappings provide the same identity mapping support in the default
domain, but also indicate that the reservation is software imposed and
may be relaxed under some conditions, such as device assignment.

Convert the ISA bridge direct-mapped reserved region to relaxable to
reflect that the restriction is self imposed and need not be enforced
by drivers such as vfio.

Fixes: 1c5c59fbad ("iommu/vt-d: Differentiate relaxable and non relaxable RMRRs")
Cc: stable@vger.kernel.org # v5.3+
Link: https://lore.kernel.org/linux-iommu/20191211082304.2d4fab45@x1.home
Reported-by: cprt <cprt@protonmail.com>
Tested-by: cprt <cprt@protonmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:45:54 +01:00
Lu Baolu 8a96618477 iommu/vt-d: Fix dmar pte read access not set error
commit 75d1838539 upstream.

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.

Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Fixes: 942067f1b6 ("iommu/vt-d: Identify default domains replaced with private")
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:45:53 +01:00
Jerry Snitselaar 71730ba791 iommu: set group default domain before creating direct mappings
commit d360211524 upstream.

iommu_group_create_direct_mappings uses group->default_domain, but
right after it is called, request_default_domain_for_dev calls
iommu_domain_free for the default domain, and sets the group default
domain to a different domain. Move the
iommu_group_create_direct_mappings call to after the group default
domain is set, so the direct mappings get associated with that domain.

Cc: Joerg Roedel <jroedel@suse.de>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: stable@vger.kernel.org
Fixes: 7423e01741 ("iommu: Add API to request DMA domain for device")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:45:53 +01:00
Eric Auger fe3bcc2e23 iommu: fix KASAN use-after-free in iommu_insert_resv_region
commit 4c80ba392b upstream.

In case the new region gets merged into another one, the nr list node is
freed.  Checking its type while completing the merge algorithm leads to
a use-after-free.  Use new->type instead.

Fixes: 4dbd258ff6 ("iommu: Revisit iommu_insert_resv_region() implementation")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Qian Cai <cai@lca.pw>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Stable <stable@vger.kernel.org> #v5.3+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:45:52 +01:00
John Donnelly 160c63f909 iommu/vt-d: Fix panic after kexec -p for kdump
This cures a panic on restart after a kexec operation on 5.3 and 5.4
kernels.

The underlying state of the iommu registers (iommu->flags &
VTD_FLAG_TRANS_PRE_ENABLED) on a restart results in a domain being marked as
"DEFER_DEVICE_DOMAIN_INFO" that produces an Oops in identity_mapping().

[   43.654737] BUG: kernel NULL pointer dereference, address:
0000000000000056
[   43.655720] #PF: supervisor read access in kernel mode
[   43.655720] #PF: error_code(0x0000) - not-present page
[   43.655720] PGD 0 P4D 0
[   43.655720] Oops: 0000 [#1] SMP PTI
[   43.655720] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
5.3.2-1940.el8uek.x86_64 #1
[   43.655720] Hardware name: Oracle Corporation ORACLE SERVER
X5-2/ASM,MOTHERBOARD,1U, BIOS 30140300 09/20/2018
[   43.655720] RIP: 0010:iommu_need_mapping+0x29/0xd0
[   43.655720] Code: 00 0f 1f 44 00 00 48 8b 97 70 02 00 00 48 83 fa ff
74 53 48 8d 4a ff b8 01 00 00 00 48 83 f9 fd 76 01 c3 48 8b 35 7f 58 e0
01 <48> 39 72 58 75 f2 55 48 89 e5 41 54 53 48 8b 87 28 02 00 00 4c 8b
[   43.655720] RSP: 0018:ffffc9000001b9b0 EFLAGS: 00010246
[   43.655720] RAX: 0000000000000001 RBX: 0000000000001000 RCX:
fffffffffffffffd
[   43.655720] RDX: fffffffffffffffe RSI: ffff8880719b8000 RDI:
ffff8880477460b0
[   43.655720] RBP: ffffc9000001b9e8 R08: 0000000000000000 R09:
ffff888047c01700
[   43.655720] R10: 00002194036fc692 R11: 0000000000000000 R12:
0000000000000000
[   43.655720] R13: ffff8880477460b0 R14: 0000000000000cc0 R15:
ffff888072d2b558
[   43.655720] FS:  0000000000000000(0000) GS:ffff888071c00000(0000)
knlGS:0000000000000000
[   43.655720] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   43.655720] CR2: 0000000000000056 CR3: 000000007440a002 CR4:
00000000001606b0
[   43.655720] Call Trace:
[   43.655720]  ? intel_alloc_coherent+0x2a/0x180
[   43.655720]  ? __schedule+0x2c2/0x650
[   43.655720]  dma_alloc_attrs+0x8c/0xd0
[   43.655720]  dma_pool_alloc+0xdf/0x200
[   43.655720]  ehci_qh_alloc+0x58/0x130
[   43.655720]  ehci_setup+0x287/0x7ba
[   43.655720]  ? _dev_info+0x6c/0x83
[   43.655720]  ehci_pci_setup+0x91/0x436
[   43.655720]  usb_add_hcd.cold.48+0x1d4/0x754
[   43.655720]  usb_hcd_pci_probe+0x2bc/0x3f0
[   43.655720]  ehci_pci_probe+0x39/0x40
[   43.655720]  local_pci_probe+0x47/0x80
[   43.655720]  pci_device_probe+0xff/0x1b0
[   43.655720]  really_probe+0xf5/0x3a0
[   43.655720]  driver_probe_device+0xbb/0x100
[   43.655720]  device_driver_attach+0x58/0x60
[   43.655720]  __driver_attach+0x8f/0x150
[   43.655720]  ? device_driver_attach+0x60/0x60
[   43.655720]  bus_for_each_dev+0x74/0xb0
[   43.655720]  driver_attach+0x1e/0x20
[   43.655720]  bus_add_driver+0x151/0x1f0
[   43.655720]  ? ehci_hcd_init+0xb2/0xb2
[   43.655720]  ? do_early_param+0x95/0x95
[   43.655720]  driver_register+0x70/0xc0
[   43.655720]  ? ehci_hcd_init+0xb2/0xb2
[   43.655720]  __pci_register_driver+0x57/0x60
[   43.655720]  ehci_pci_init+0x6a/0x6c
[   43.655720]  do_one_initcall+0x4a/0x1fa
[   43.655720]  ? do_early_param+0x95/0x95
[   43.655720]  kernel_init_freeable+0x1bd/0x262
[   43.655720]  ? rest_init+0xb0/0xb0
[   43.655720]  kernel_init+0xe/0x110
[   43.655720]  ret_from_fork+0x24/0x50

Fixes: 8af46c784e ("iommu/vt-d: Implement is_attach_deferred iommu ops entry")
Cc: stable@vger.kernel.org # v5.3+

Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-30 10:30:22 +01:00
Takashi Iwai ad3e8da2d4 iommu/amd: Apply the same IVRS IOAPIC workaround to Acer Aspire A315-41
Acer Aspire A315-41 requires the very same workaround as the existing
quirk for Dell Latitude 5495.  Add the new entry for that.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1137799
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-30 10:24:03 +01:00
YueHaibing 565d454280 iommu/ipmmu-vmsa: Remove dev_err() on platform_get_irq() failure
platform_get_irq() will call dev_err() itself on failure,
so there is no need for the driver to also do this.
This is detected by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-30 10:16:37 +01:00
Linus Torvalds 964f9cfaae dma-mapping fix for 5.4
- fix a regression in the intel-iommu get_required_mask conversion
    (Arvind Sankar)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl2z3OELHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOyJhAAwQalrpmmP4NLzx8J29ZLLFTpvMkQu7khHXxnfXd0
 +/aKg6g1teyJvg/Vxb8GQeGi/mTKClCjUvlS+88AQm1vR/9wLb4OvQPcHhNmG84s
 YxJGeDcIQzeXIpV1s6bcADcIAoYHZB1Ph0VobQeSNiJEAq4/ILCUfsVgcPbOPdDQ
 49/b8jrGXk/A/MMzJo2YefqQec2D5/7LCEK++IZAOnlnL/hd+YiB8Y1W8tjAMwXO
 ANOwpRGD+tUfjlP6DvuIbefPGVW5B0fdSa04KYqg03bZOSVTThNCSdWqXTcaOXFu
 MmAHhzrRiUH184d69pjWM371Qx6dF+fallkezrZXVqyInVww9Vca708sJPP3w9YD
 QjP2eYy1xrcPI9e84Xqad8o6TRr+wzmtQIHNRcm9ZrZhi/fdjUKPeBZkBbeOpGcd
 CLaqV8lVOFtVEHqtUq9egJ77FUdmCvDpaz7XaT3o33b8Wl70cF5G1/J4+CYkMHWM
 y67h7GpBaay7d6ZJyLbtqB29AM0PQnftJRZfef1dP3hGZKswZYuDuseMfkrMwPzt
 6MRWpSN+kn4B4HugO+W8OXVO2heZFb7sqs7BwfjgWWOAn5NN9Jvq2s1PYKj0CluR
 wB8NAhulNrkVslUk3Mx0baPDhO9ut3bMXRhVJcpXFJV23oz0HA1qqxD37vDglRHd
 aNM=
 =jbwS
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.4-2' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:
 "Fix a regression in the intel-iommu get_required_mask conversion
  (Arvind Sankar)"

* tag 'dma-mapping-5.4-2' of git://git.infradead.org/users/hch/dma-mapping:
  iommu/vt-d: Return the correct dma mask when we are bypassing the IOMMU
2019-10-26 06:29:04 -04:00
Arvind Sankar 9c24eaf81c iommu/vt-d: Return the correct dma mask when we are bypassing the IOMMU
We must return a mask covering the full physical RAM when bypassing the
IOMMU mapping. Also, in iommu_need_mapping, we need to check using
dma_direct_get_required_mask to ensure that the device's dma_mask can
cover physical RAM before deciding to bypass IOMMU mapping.

Based on an earlier patch from Christoph Hellwig.

Fixes: 249baa5479 ("dma-mapping: provide a better default ->get_required_mask")
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-10-18 17:19:20 +02:00
Joerg Roedel 46ac18c347 iommu/amd: Check PM_LEVEL_SIZE() condition in locked section
The increase_address_space() function has to check the PM_LEVEL_SIZE()
condition again under the domain->lock to avoid a false trigger of the
WARN_ON_ONCE() and to avoid that the address space is increase more
often than necessary.

Reported-by: Qian Cai <cai@lca.pw>
Fixes: 754265bcab ("iommu/amd: Fix race in increase_address_space()")
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-18 16:52:37 +02:00