remarkable-linux/drivers
Roland Dreier 3e7abe2556 intel-iommu: Fix AB-BA lockdep report
When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below.  It looks like a legitimate lock
ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls
   iommu_support_dev_iotlb(), which takes device_domain_lock (inside
   iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock
   then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock.  It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.

BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()
                                          domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.

Anyway here's the full lockdep output that prompted all of this:

     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.39.1+ #1
     -------------------------------------------------------
     bash/13954 is trying to acquire lock:
      (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

     but task is already holding lock:
      (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #1 (device_domain_lock){-.-...}:
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
            [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
            [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
            [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
            [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
            [<ffffffff81002165>] do_one_initcall+0x45/0x190
            [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
            [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

     -> #0 (&(&iommu->lock)->rlock){......}:
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

     other info that might help us debug this:

     6 locks held by bash/13954:
      #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
      #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
      #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
      #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
      #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
      #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     stack backtrace:
     Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
     Call Trace:
      [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
      [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
      [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      [<ffffffff812f8b42>] device_notifier+0x72/0x90
      [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
      [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
      [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
      [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
      [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
      [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
      [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
      [<ffffffff8117569e>] vfs_write+0xce/0x190
      [<ffffffff811759e4>] sys_write+0x54/0xa0
      [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-10-10 22:02:24 +01:00
..
accessibility
acpi Merge branches 'apei', 'bz-13195' and 'doc' into acpi 2011-09-12 20:00:00 -04:00
amba
ata drivers/ata/sata_dwc_460ex.c: add missing kfree 2011-08-18 23:58:11 -04:00
atm atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
auxdisplay
base PM / Clocks: Do not acquire a mutex under a spinlock 2011-09-26 19:40:23 +02:00
bcma bcma: add uevent to the bus, to autoload drivers 2011-08-22 14:21:41 -04:00
block floppy: use del_timer_sync() in init cleanup 2011-09-21 10:22:11 +02:00
bluetooth Bluetooth: add support for 2011 mac mini 2011-09-17 17:16:03 -03:00
cdrom drivers/cdrom/cdrom.c: relax check on dvd manufacturer value 2011-08-02 12:43:50 +02:00
char TPM: Zero buffer after copying to userspace 2011-09-23 09:46:41 +10:00
clk
clocksource Merge branch 'common/core' into sh-latest 2011-08-08 16:33:54 +09:00
connector proc_fork_connector: a lockless ->real_parent usage is not safe 2011-07-28 18:26:32 -07:00
cpufreq drivers/cpufreq/pcc-cpufreq.c: avoid NULL pointer dereference 2011-09-14 18:09:38 -07:00
cpuidle cpuidle: stop depending on pm_idle 2011-08-03 19:06:37 -04:00
crypto n2_crypto: Attach on Niagara-T3. 2011-07-28 01:30:07 -07:00
dca
dio
dma dmaengine/ste_dma40: fix memory leak due to prepared descriptors 2011-09-05 17:08:26 +05:30
edac i7core_edac: fixed typo in error count calculation 2011-08-18 14:07:15 -07:00
eisa eisa/pci_eisa.c: fix BUG introduced by 005bdad7b8 2011-08-04 06:32:51 -10:00
firewire firewire: ohci: add no MSI quirk for O2Micro controller 2011-09-16 22:22:10 +02:00
firmware firmware: fix google/gsmi.c build warning 2011-08-08 13:53:49 -07:00
gpio drivers/gpio/gpio-generic.c: fix build errors 2011-09-14 18:09:38 -07:00
gpu drm/radeon/kms: use hardcoded dig encoder to transmitter mapping for DCE4.1 2011-10-06 11:45:30 +01:00
hid Merge branch 'for-linus' of git://github.com/dtor/input 2011-09-16 14:09:19 -07:00
hwmon hwmon: (coretemp) Avoid leaving around dangling pointer 2011-09-28 08:19:21 -07:00
hwspinlock
i2c i2c-tegra: fix possible race condition after tx 2011-09-07 00:13:40 +01:00
ide ide-disk: Fix request requeuing 2011-10-03 14:28:18 -04:00
idle
ieee802154
infiniband [SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference 2011-09-26 09:28:01 -05:00
input Merge branch 'for-linus' of git://github.com/dtor/input 2011-10-05 09:22:38 -07:00
iommu intel-iommu: Fix AB-BA lockdep report 2011-10-10 22:02:24 +01:00
isdn Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-07-28 05:58:19 -07:00
leds drivers/leds/ledtrig-timer.c: fix broken sysfs delay handling 2011-09-14 18:09:38 -07:00
lguest
macintosh
mca
md Merge branch 'for-linus' of http://people.redhat.com/agk/git/linux-dm 2011-10-06 08:31:47 -07:00
media [media] omap3isp: Fix build error in ispccdc.c 2011-09-21 22:18:26 -03:00
memstick
message Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2011-07-30 08:36:02 -10:00
mfd mfd: Fix generic irq chip ack function name for jz4740-adc 2011-09-21 13:06:34 +02:00
misc lis3: fix regression of HP DriveGuard with 8bit chip 2011-10-03 20:51:51 -07:00
mmc Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2011-09-21 13:20:21 -07:00
mtd UBI: do not link debug messages when debugging is disabled 2011-08-19 19:02:27 +03:00
net macvlan/macvtap: Fix unicast between macvtap interfaces in bridge mode 2011-10-04 23:31:23 -04:00
nfc NFC: pn533: use after free in pn533_disconnect() 2011-07-26 16:27:24 -04:00
nubus
of Revert "dt: add of_alias_scan and of_alias_get_id" 2011-08-04 11:26:24 +01:00
oprofile atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
parisc
parport Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 2011-07-25 23:09:27 -07:00
pci PCI: Disable MPS configuration by default 2011-10-04 09:52:28 -07:00
pcmcia Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 2011-07-31 06:23:08 -10:00
platform acer-wmi: support Lenovo ideapad S205 wifi switch 2011-08-05 15:21:52 -04:00
pnp Merge 'akpm' patch series 2011-07-25 21:00:19 -07:00
power s3c-adc-battery: Fix compilation error due to missing header (module.h) 2011-08-19 21:01:46 +04:00
pps
ps3
ptp
rapidio rapidio: fix use of non-compatible registers 2011-08-25 16:25:34 -07:00
regulator Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 2011-08-01 14:05:46 -10:00
rtc drivers/rtc/rtc-s3c.c: fix no occurrence of alarm interrupt 2011-09-14 18:09:38 -07:00
s390 [S390] cio: fix cio_tpi ignoring adapter interrupts 2011-09-26 16:40:50 +02:00
sbus atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
scsi [SCSI] libsas: fix panic when single phy is disabled on a wide port 2011-10-02 13:28:55 -05:00
sfi
sh Merge branch 'common/core' into sh-latest 2011-08-08 16:33:54 +09:00
sn
spi spi-topcliff-pch: Fix overrun issue 2011-10-04 10:10:50 -06:00
ssb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-07-25 13:56:39 -07:00
staging Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus 2011-10-10 14:39:03 +12:00
target iscsi-target: Fix sendpage breakage with proper padding+DataDigest iovec offsets 2011-09-16 23:47:07 +00:00
tc
telephony
thermal thermal: make THERMAL_HWMON implementation fully internal 2011-08-02 14:51:57 -04:00
tty Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus 2011-10-10 14:39:03 +12:00
uio Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 2011-07-25 23:06:24 -07:00
usb USB: xHCI: prevent infinite loop when processing MSE event 2011-09-19 17:15:47 -07:00
uwb
vhost atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
video backlight: Declare backlight_types[] const 2011-09-10 14:00:02 -07:00
virt
virtio
vlynq
w1 MAINTAINERS: Evgeniy has moved 2011-08-25 16:25:33 -07:00
watchdog watchdog: Initconst section fixes for watchdog 2011-09-20 14:32:00 +02:00
xen xen/irq: Alter the locking to use a mutex instead of a spinlock. 2011-09-15 04:32:02 -04:00
zorro zorro: Defer device_register() until all devices have been identified 2011-09-22 12:59:35 -07:00
Kconfig Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2011-07-25 22:59:39 -07:00
Makefile Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2011-07-25 22:59:39 -07:00