Commit graph

3 commits

Author SHA1 Message Date
David Woodhouse 6fbcfb3e46 intel-iommu: Workaround IOTLB hang on Ironlake GPU
To work around a hardware issue, we have to submit IOTLB flushes while
the graphics engine is idle. The graphics driver will (we hope) go to
great lengths to ensure that it gets that right on the affected
chipset(s)... so let's not screw it over by deferring the unmap and
doing it later. That wouldn't be very helpful.
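
For illustration only, a minimal sketch of the resulting unmap path; dev_needs_sync_flush and flush_iotlb_range() are hypothetical stand-ins for the chipset quirk check and the per-range IOTLB flush, while intel_iommu_strict, add_unmap() and __free_iova() name the strict-mode knob, deferred-flush queue and IOVA free the driver of this era already has:

    /*
     * Sketch only: if the device is the affected GPU, flush the IOTLB
     * synchronously on unmap (while the graphics driver guarantees the
     * engine is idle) instead of queueing the IOVA for a deferred flush.
     */
    static void sketch_unmap_range(struct dmar_domain *domain,
                                   struct intel_iommu *iommu,
                                   struct iova *iova,
                                   unsigned long start_pfn,
                                   unsigned long npages,
                                   bool dev_needs_sync_flush)
    {
            if (intel_iommu_strict || dev_needs_sync_flush) {
                    /* hypothetical helper: flush this IOVA range right now */
                    flush_iotlb_range(iommu, domain, start_pfn, npages);
                    __free_iova(&domain->iovad, iova);
            } else {
                    /* fast path: batch the flush and free the IOVA later */
                    add_unmap(domain, iova);
            }
    }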

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-10-14 20:51:44 +01:00
Roland Dreier 3e7abe2556 intel-iommu: Fix AB-BA lockdep report
When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below.  It looks like a legitimate lock
ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls
   iommu_support_dev_iotlb(), which takes device_domain_lock (inside
   iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock
   then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock.  It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.
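
For illustration, a simplified sketch of that reordering at the end of domain_remove_one_dev_info() (the work under each lock is elided into comments; 'found' mirrors the flag the function already keeps):

    spin_lock_irqsave(&device_domain_lock, flags);
    /* ... walk domain->devices, unlink and free the matching dev info ... */
    spin_unlock_irqrestore(&device_domain_lock, flags);  /* drop it here ... */

    if (found == 0) {
            /* ... so iommu->lock is never taken inside device_domain_lock */
            spin_lock_irqsave(&iommu->lock, tmp_flags);
            /* ... detach the now-unused domain from this IOMMU ... */
            spin_unlock_irqrestore(&iommu->lock, tmp_flags);
    }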

BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()
                                          domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.
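
In code, the suspect pattern on the lookup side would look roughly like this (find_matching_info() and use_info() are made-up stand-ins for the list walk and the later dereference):

    spin_lock_irqsave(&device_domain_lock, flags);
    info = find_matching_info(domain, bus, devfn);   /* hypothetical helper */
    spin_unlock_irqrestore(&device_domain_lock, flags);

    /*
     * Window: another CPU can run domain_remove_dev_info() here, find the
     * same entry and call free_devinfo_mem(info) before we touch it.
     */
    if (info)
            use_info(info);   /* hypothetical use; may be a use-after-free */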

Anyway here's the full lockdep output that prompted all of this:

     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.39.1+ #1
     -------------------------------------------------------
     bash/13954 is trying to acquire lock:
      (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

     but task is already holding lock:
      (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #1 (device_domain_lock){-.-...}:
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
            [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
            [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
            [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
            [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
            [<ffffffff81002165>] do_one_initcall+0x45/0x190
            [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
            [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

     -> #0 (&(&iommu->lock)->rlock){......}:
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

     other info that might help us debug this:

     6 locks held by bash/13954:
      #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
      #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
      #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
      #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
      #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
      #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     stack backtrace:
     Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
     Call Trace:
      [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
      [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
      [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      [<ffffffff812f8b42>] device_notifier+0x72/0x90
      [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
      [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
      [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
      [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
      [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
      [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
      [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
      [<ffffffff8117569e>] vfs_write+0xce/0x190
      [<ffffffff811759e4>] sys_write+0x54/0xa0
      [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-10-10 22:02:24 +01:00
Ohad Ben-Cohen 166e9278a3 x86/ia64: intel-iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.
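
For reference, the prototype being moved likely looks like the line below once it lands in include/linux/pci.h:

    /* moved from drivers/pci/pci.h so code outside drivers/pci/ can see it */
    extern struct pci_dev *pci_find_upstream_pcie_bridge(struct pci_dev *pdev);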

As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:30 +02:00
Renamed from drivers/pci/intel-iommu.c