From 33c45ef8adc8a7cf781b2566d50e6ea8e97b3596 Mon Sep 17 00:00:00 2001 From: Gregory CLEMENT Date: Mon, 19 Sep 2016 12:02:50 +0200 Subject: [PATCH 001/292] ARM: mvebu: Select corediv clk for all mvebu v7 SoC MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since the commit bd3677ff31a3 ("clk: mvebu: Remove corediv clock from Armada XP"), the corediv clk is no more selected for Armada XP, however this clock is used for Armada XP using the compatible armada-370-corediv-clock. While since commit 1594d568c6e3 ("clk: mvebu: Move corediv config to mvebu config") Armada 38x and Armada 375 got corediv support again, not only Armada XP was missed but also Armada 39x. Actually all the SoC selecting MVEBU_V7 config need this clock: git grep "\-corediv-clock" arch/arm/boot/dts arch/arm/boot/dts/armada-370-xp.dtsi: compatible = "marvell,armada-370-corediv-clock"; arch/arm/boot/dts/armada-375.dtsi: compatible = "marvell,armada-375-corediv-clock"; arch/arm/boot/dts/armada-38x.dtsi: compatible = "marvell,armada-380-corediv-clock"; arch/arm/boot/dts/armada-39x.dtsi: compatible = "marvell,armada-390-corediv-clock" This commit now fixes this behavior by letting MVEBU_V7 select MVEBU_CLK_COREDIV. Fixes: bd3677ff31a3 ("clk: mvebu: Remove corediv clock from Armada XP") Cc: stable@vger.kernel.org Reported-by: Uwe Kleine-König Acked-by: Uwe Kleine-König Signed-off-by: Gregory CLEMENT --- arch/arm/mach-mvebu/Kconfig | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/arm/mach-mvebu/Kconfig b/arch/arm/mach-mvebu/Kconfig index f9b6bd306cfe..541647f57192 100644 --- a/arch/arm/mach-mvebu/Kconfig +++ b/arch/arm/mach-mvebu/Kconfig @@ -23,6 +23,7 @@ config MACH_MVEBU_V7 select CACHE_L2X0 select ARM_CPU_SUSPEND select MACH_MVEBU_ANY + select MVEBU_CLK_COREDIV config MACH_ARMADA_370 bool "Marvell Armada 370 boards" @@ -32,7 +33,6 @@ config MACH_ARMADA_370 select CPU_PJ4B select MACH_MVEBU_V7 select PINCTRL_ARMADA_370 - select MVEBU_CLK_COREDIV help Say 'Y' here if you want your kernel to support boards based on the Marvell Armada 370 SoC with device tree. @@ -50,7 +50,6 @@ config MACH_ARMADA_375 select HAVE_SMP select MACH_MVEBU_V7 select PINCTRL_ARMADA_375 - select MVEBU_CLK_COREDIV help Say 'Y' here if you want your kernel to support boards based on the Marvell Armada 375 SoC with device tree. @@ -68,7 +67,6 @@ config MACH_ARMADA_38X select HAVE_SMP select MACH_MVEBU_V7 select PINCTRL_ARMADA_38X - select MVEBU_CLK_COREDIV help Say 'Y' here if you want your kernel to support boards based on the Marvell Armada 380/385 SoC with device tree. From 51227bf52008bd4c4c50da4b749bbc6e7bbbca52 Mon Sep 17 00:00:00 2001 From: Marcin Wojtas Date: Tue, 6 Sep 2016 19:41:11 +0200 Subject: [PATCH 002/292] arm64: dts: marvell: fix clocksource for CP110 master SPI0 I2C and SPI interfaces share common clock trees within the CP110 HW block. It occurred that SPI0 interface has wrong clock assignment in the device tree, which is fixed in this commit to a proper value. Fixes: 728dacc7f4dd ("arm64: dts: marvell: initial DT description of ...") Signed-off-by: Marcin Wojtas CC: v4.7+ Signed-off-by: Gregory CLEMENT --- arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi index da31bbbbb59e..399271853aad 100644 --- a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi +++ b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi @@ -131,7 +131,7 @@ #address-cells = <0x1>; #size-cells = <0x0>; cell-index = <1>; - clocks = <&cpm_syscon0 0 3>; + clocks = <&cpm_syscon0 1 21>; status = "disabled"; }; From 231147ee77f39f4134935686e9d7e415bdf48149 Mon Sep 17 00:00:00 2001 From: sayli karnik Date: Wed, 28 Sep 2016 21:46:51 +0530 Subject: [PATCH 003/292] iio: maxim_thermocouple: Align 16 bit big endian value of raw reads Driver was reporting invalid raw read values for MAX6675 on big endian architectures. MAX6675 buffered mode is not affected, nor is the MAX31855. The driver was losing a 2 byte read value when it used a 32 bit integer buffer to store a 16 bit big endian value. Use big endian types to properly align buffers on big endian architectures. Fixes following sparse endianness warnings: warning: cast to restricted __be16 warning: cast to restricted __be32 Fixes checkpatch issue: CHECK: No space is necessary after a cast Signed-off-by: sayli karnik Fixes: 1f25ca11d84a ("iio: temperature: add support for Maxim thermocouple chips") Signed-off-by: Jonathan Cameron --- drivers/iio/temperature/maxim_thermocouple.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/iio/temperature/maxim_thermocouple.c b/drivers/iio/temperature/maxim_thermocouple.c index 39dd2026ccc9..066161a4bccd 100644 --- a/drivers/iio/temperature/maxim_thermocouple.c +++ b/drivers/iio/temperature/maxim_thermocouple.c @@ -123,22 +123,24 @@ static int maxim_thermocouple_read(struct maxim_thermocouple_data *data, { unsigned int storage_bytes = data->chip->read_size; unsigned int shift = chan->scan_type.shift + (chan->address * 8); - unsigned int buf; + __be16 buf16; + __be32 buf32; int ret; - ret = spi_read(data->spi, (void *) &buf, storage_bytes); - if (ret) - return ret; - switch (storage_bytes) { case 2: - *val = be16_to_cpu(buf); + ret = spi_read(data->spi, (void *)&buf16, storage_bytes); + *val = be16_to_cpu(buf16); break; case 4: - *val = be32_to_cpu(buf); + ret = spi_read(data->spi, (void *)&buf32, storage_bytes); + *val = be32_to_cpu(buf32); break; } + if (ret) + return ret; + /* check to be sure this is a valid reading */ if (*val & data->chip->status_bit) return -EINVAL; From 2967999fbceffa8520987ab9b3b00a55d6997dba Mon Sep 17 00:00:00 2001 From: Mika Westerberg Date: Wed, 5 Oct 2016 17:46:50 +0300 Subject: [PATCH 004/292] iio: adc: ti-adc081c: Select IIO_TRIGGERED_BUFFER to prevent build errors Commit 08e05d1fce5c ("ti-adc081c: Initial triggered buffer support") added triggered buffer support but that also requires CONFIG_IIO_TRIGGERED_BUFFER, otherwise we get errors from linker such as: drivers/built-in.o: In function `adc081c_remove': drivers/iio/adc/ti-adc081c.c:225: undefined reference to `iio_triggered_buffer_cleanup' Fix these by explicitly selecting both CONFIG_IIO_TRIGGERED_BUFFER and CONFIG_IIO_BUFFER in Kconfig for the driver. Signed-off-by: Mika Westerberg Signed-off-by: Jonathan Cameron --- drivers/iio/adc/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig index 7edcf3238620..99c051490eff 100644 --- a/drivers/iio/adc/Kconfig +++ b/drivers/iio/adc/Kconfig @@ -437,6 +437,8 @@ config STX104 config TI_ADC081C tristate "Texas Instruments ADC081C/ADC101C/ADC121C family" depends on I2C + select IIO_BUFFER + select IIO_TRIGGERED_BUFFER help If you say yes here you get support for Texas Instruments ADC081C, ADC101C and ADC121C ADC chips. From 5c33677c87cbe44ae04df69c4a29c1750a9ec4e5 Mon Sep 17 00:00:00 2001 From: Andy Whitcroft Date: Tue, 11 Oct 2016 15:16:57 +0100 Subject: [PATCH 005/292] dm raid: fix compat_features validation In ecbfb9f118bce4 ("dm raid: add raid level takeover support") a new compatible feature flag was added. Validation for these compat_features was added but this only passes for new raid mappings with this feature flag. This causes previously created raid mappings to be failed at import. Check compat_features for the only valid combination. Fixes: ecbfb9f118bce4 ("dm raid: add raid level takeover support") Cc: stable@vger.kernel.org # v4.8 Signed-off-by: Andy Whitcroft Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer --- drivers/md/dm-raid.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 8abde6b8cedc..2a3970097991 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -2258,7 +2258,8 @@ static int super_validate(struct raid_set *rs, struct md_rdev *rdev) if (!mddev->events && super_init_validation(rs, rdev)) return -EINVAL; - if (le32_to_cpu(sb->compat_features) != FEATURE_FLAG_SUPPORTS_V190) { + if (le32_to_cpu(sb->compat_features) && + le32_to_cpu(sb->compat_features) != FEATURE_FLAG_SUPPORTS_V190) { rs->ti->error = "Unable to assemble array: Unknown flag(s) in compatible feature flags"; return -EINVAL; } From d5e84fd8d0634d056248b67463b42f6c85896a19 Mon Sep 17 00:00:00 2001 From: Filipe Manana Date: Mon, 19 Sep 2016 10:57:40 +0100 Subject: [PATCH 006/292] Btrfs: fix incremental send failure caused by balance Commit 951555856b88 ("Btrfs: send, don't bug on inconsistent snapshots") removed some BUG_ON() statements (replacing them with returning errors to user space and logging error messages) when a snapshot is in an inconsistent state due to failures to update a delayed inode item (ENOMEM or ENOSPC) after adding/updating/deleting references, xattrs or file extent items. However there is a case, when no errors happen, where a file extent item can be modified without having the corresponding inode item updated. This case happens during balance under very specific timings, when relocation is in the stage where it updates data pointers and a leaf that contains file extent items is COWed. When that happens file extent items get their disk_bytenr field updated to a new value that reflects the post relocation logical address of the extent, without updating their respective inode items (as there is nothing that needs to be updated on them). This is performed at relocation.c:replace_file_extents() through relocation.c:btrfs_reloc_cow_block(). So make an incremental send deal with this case and don't do any processing for a file extent item that got its disk_bytenr field updated by relocation, since the extent's data is the same as the one pointed by the file extent item in the parent snapshot. After the recent commit mentioned above this case resulted in EIO errors returned to user space (and an error message logged to dmesg/syslog) when doing an incremental send, while before it, it resulted in hitting a BUG_ON leading to the following trace: [ 952.206705] ------------[ cut here ]------------ [ 952.206714] kernel BUG at ../fs/btrfs/send.c:5653! [ 952.206719] Internal error: Oops - BUG: 0 [#1] SMP [ 952.209854] Modules linked in: st dm_mod nls_utf8 isofs fuse nf_log_ipv6 xt_pkttype xt_physdev br_netfilter nf_log_ipv4 nf_log_common xt_LOG xt_limit ebtable_filter ebtables af_packet bridge stp llc ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables xfs libcrc32c nls_iso8859_1 nls_cp437 vfat fat joydev aes_ce_blk ablk_helper cryptd snd_intel8x0 aes_ce_cipher snd_ac97_codec ac97_bus snd_pcm ghash_ce sha2_ce sha1_ce snd_timer snd virtio_net soundcore btrfs xor sr_mod cdrom hid_generic usbhid raid6_pq virtio_blk virtio_scsi bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_mmio xhci_pci xhci_hcd usbcore usb_common virtio_pci virtio_ring virtio drm sg efivarfs [ 952.228333] Supported: Yes [ 952.228908] CPU: 0 PID: 12779 Comm: snapperd Not tainted 4.4.14-50-default #1 [ 952.230329] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 952.231683] task: ffff800058e94100 ti: ffff8000d866c000 task.ti: ffff8000d866c000 [ 952.233279] PC is at changed_cb+0x9f4/0xa48 [btrfs] [ 952.234375] LR is at changed_cb+0x58/0xa48 [btrfs] [ 952.236552] pc : [] lr : [] pstate: 80000145 [ 952.238049] sp : ffff8000d866fa20 [ 952.238732] x29: ffff8000d866fa20 x28: 0000000000000019 [ 952.239840] x27: 00000000000028d5 x26: 00000000000024a2 [ 952.241008] x25: 0000000000000002 x24: ffff8000e66e92f0 [ 952.242131] x23: ffff8000b8c76800 x22: ffff800092879140 [ 952.243238] x21: 0000000000000002 x20: ffff8000d866fb78 [ 952.244348] x19: ffff8000b8f8c200 x18: 0000000000002710 [ 952.245607] x17: 0000ffff90d42480 x16: ffff800000237dc0 [ 952.246719] x15: 0000ffff90de7510 x14: ab000c000a2faf08 [ 952.247835] x13: 0000000000577c2b x12: ab000c000b696665 [ 952.248981] x11: 2e65726f632f6966 x10: 652d34366d72612f [ 952.250101] x9 : 32627572672f746f x8 : ab000c00092f1671 [ 952.251352] x7 : 8000000000577c2b x6 : ffff800053eadf45 [ 952.252468] x5 : 0000000000000000 x4 : ffff80005e169494 [ 952.253582] x3 : 0000000000000004 x2 : ffff8000d866fb78 [ 952.254695] x1 : 000000000003e2a3 x0 : 000000000003e2a4 [ 952.255803] [ 952.256150] Process snapperd (pid: 12779, stack limit = 0xffff8000d866c020) [ 952.257516] Stack: (0xffff8000d866fa20 to 0xffff8000d8670000) [ 952.258654] fa20: ffff8000d866fae0 ffff7ffffc308fc0 ffff800092879140 ffff8000e66e92f0 [ 952.260219] fa40: 0000000000000035 ffff800055de6000 ffff8000b8c76800 ffff8000d866fb78 [ 952.261745] fa60: 0000000000000002 00000000000024a2 00000000000028d5 0000000000000019 [ 952.263269] fa80: ffff8000d866fae0 ffff7ffffc3090f0 ffff8000d866fae0 ffff7ffffc309128 [ 952.264797] faa0: ffff800092879140 ffff8000e66e92f0 0000000000000035 ffff800055de6000 [ 952.268261] fac0: ffff8000b8c76800 ffff8000d866fb78 0000000000000002 0000000000001000 [ 952.269822] fae0: ffff8000d866fbc0 ffff7ffffc39ecfc ffff8000b8f8c200 ffff8000b8f8c368 [ 952.271368] fb00: ffff8000b8f8c378 ffff800055de6000 0000000000000001 ffff8000ecb17500 [ 952.272893] fb20: ffff8000b8c76800 ffff800092879140 ffff800062b6d000 ffff80007a9e2470 [ 952.274420] fb40: ffff8000b8f8c208 0000000005784000 ffff8000580a8000 ffff8000b8f8c200 [ 952.276088] fb60: ffff7ffffc39d488 00000002b8f8c368 0000000000000000 000000000003e2a4 [ 952.280275] fb80: 000000000000006c ffff7ffffc39ec00 000000000003e2a4 000000000000006c [ 952.283219] fba0: ffff8000b8f8c300 0000000000000100 0000000000000001 ffff8000ecb17500 [ 952.286166] fbc0: ffff8000d866fcd0 ffff7ffffc3643c0 ffff8000f8842700 0000ffff8ffe9278 [ 952.289136] fbe0: 0000000040489426 ffff800055de6000 0000ffff8ffe9278 0000000040489426 [ 952.292083] fc00: 000000000000011d 000000000000001d ffff80007a9e4598 ffff80007a9e43e8 [ 952.294959] fc20: ffff8000b8c7693f 0000000000003b24 0000000000000019 ffff8000b8f8c218 [ 952.301161] fc40: 00000001d866fc70 ffff8000b8c76800 0000000000000128 ffffffffffffff84 [ 952.305749] fc60: ffff800058e941ff 0000000000003a58 ffff8000d866fcb0 ffff8000000f7390 [ 952.308875] fc80: 000000000000012a 0000000000010290 ffff8000d866fc00 000000000000007b [ 952.311915] fca0: 0000000000010290 ffff800046c1b100 74732d7366727462 000001006d616572 [ 952.314937] fcc0: ffff8000fffc4100 cb88537fdc8ba60e ffff8000d866fe10 ffff8000002499e8 [ 952.318008] fce0: 0000000040489426 ffff8000f8842700 0000ffff8ffe9278 ffff80007a9e4598 [ 952.321321] fd00: 0000ffff8ffe9278 0000000040489426 000000000000011d 000000000000001d [ 952.324280] fd20: ffff80000072c000 ffff8000d866c000 ffff8000d866fda0 ffff8000000e997c [ 952.327156] fd40: ffff8000fffc4180 00000000000031ed ffff8000fffc4180 ffff800046c1b7d4 [ 952.329895] fd60: 0000000000000140 0000ffff907ea170 000000000000011d 00000000000000dc [ 952.334641] fd80: ffff80000072c000 ffff8000d866c000 0000000000000000 0000000000000002 [ 952.338002] fda0: ffff8000d866fdd0 ffff8000000ebacc ffff800046c1b080 ffff800046c1b7d4 [ 952.340724] fdc0: ffff8000d866fdf0 ffff8000000db67c 0000000000000040 ffff800000e69198 [ 952.343415] fde0: 0000ffff8ffea790 00000000000031ed ffff8000d866fe20 ffff800000254000 [ 952.346101] fe00: 000000000000001d 0000000000000004 ffff8000d866fe90 ffff800000249d3c [ 952.348980] fe20: ffff8000f8842700 0000000000000000 ffff8000f8842701 0000000000000008 [ 952.351696] fe40: ffff8000d866fe70 0000000000000008 ffff8000d866fe90 ffff800000249cf8 [ 952.354387] fe60: ffff8000f8842700 0000ffff8ffe9170 ffff8000f8842701 0000000000000008 [ 952.357083] fe80: 0000ffff8ffe9278 ffff80008ff85500 0000ffff8ffe90c0 ffff800000085c84 [ 952.359800] fea0: 0000000000000000 0000ffff8ffe9170 ffffffffffffffff 0000ffff90d473bc [ 952.365351] fec0: 0000000000000000 0000000000000015 0000000000000008 0000000040489426 [ 952.369550] fee0: 0000ffff8ffe9278 0000ffff907ea790 0000ffff907ea170 0000ffff907ea790 [ 952.372416] ff00: 0000ffff907ea170 0000000000000000 000000000000001d 0000000000000004 [ 952.375223] ff20: 0000ffff90a32220 00000000003d0f00 0000ffff907ea0a0 0000ffff8ffe8f30 [ 952.378099] ff40: 0000ffff9100f554 0000ffff91147000 0000ffff91117bc0 0000ffff90d473b0 [ 952.381115] ff60: 0000ffff9100f620 0000ffff880069b0 0000ffff8ffe9170 0000ffff8ffe91a0 [ 952.384003] ff80: 0000ffff8ffe9160 0000ffff8ffe9140 0000ffff88006990 0000ffff8ffe9278 [ 952.386860] ffa0: 0000ffff88008a60 0000ffff8ffe9480 0000ffff88014ca0 0000ffff8ffe90c0 [ 952.389654] ffc0: 0000ffff910be8e8 0000ffff8ffe90c0 0000ffff90d473bc 0000000000000000 [ 952.410986] ffe0: 0000000000000008 000000000000001d 6e2079747265706f 72616d223d656d61 [ 952.415497] Call trace: [ 952.417403] [] changed_cb+0x9f4/0xa48 [btrfs] [ 952.420023] [] btrfs_compare_trees+0x500/0x6b0 [btrfs] [ 952.422759] [] btrfs_ioctl_send+0xb4c/0xe10 [btrfs] [ 952.425601] [] btrfs_ioctl+0x374/0x29a4 [btrfs] [ 952.428031] [] do_vfs_ioctl+0x33c/0x600 [ 952.430360] [] SyS_ioctl+0x90/0xa4 [ 952.432552] [] el0_svc_naked+0x38/0x3c [ 952.434803] Code: 2a1503e0 17fffdac b9404282 17ffff28 (d4210000) [ 952.437457] ---[ end trace 9afd7090c466cf15 ]--- Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 978796865bfc..0b4628999b77 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -5805,6 +5805,64 @@ static int changed_extent(struct send_ctx *sctx, int ret = 0; if (sctx->cur_ino != sctx->cmp_key->objectid) { + + if (result == BTRFS_COMPARE_TREE_CHANGED) { + struct extent_buffer *leaf_l; + struct extent_buffer *leaf_r; + struct btrfs_file_extent_item *ei_l; + struct btrfs_file_extent_item *ei_r; + + leaf_l = sctx->left_path->nodes[0]; + leaf_r = sctx->right_path->nodes[0]; + ei_l = btrfs_item_ptr(leaf_l, + sctx->left_path->slots[0], + struct btrfs_file_extent_item); + ei_r = btrfs_item_ptr(leaf_r, + sctx->right_path->slots[0], + struct btrfs_file_extent_item); + + /* + * We may have found an extent item that has changed + * only its disk_bytenr field and the corresponding + * inode item was not updated. This case happens due to + * very specific timings during relocation when a leaf + * that contains file extent items is COWed while + * relocation is ongoing and its in the stage where it + * updates data pointers. So when this happens we can + * safely ignore it since we know it's the same extent, + * but just at different logical and physical locations + * (when an extent is fully replaced with a new one, we + * know the generation number must have changed too, + * since snapshot creation implies committing the current + * transaction, and the inode item must have been updated + * as well). + * This replacement of the disk_bytenr happens at + * relocation.c:replace_file_extents() through + * relocation.c:btrfs_reloc_cow_block(). + */ + if (btrfs_file_extent_generation(leaf_l, ei_l) == + btrfs_file_extent_generation(leaf_r, ei_r) && + btrfs_file_extent_ram_bytes(leaf_l, ei_l) == + btrfs_file_extent_ram_bytes(leaf_r, ei_r) && + btrfs_file_extent_compression(leaf_l, ei_l) == + btrfs_file_extent_compression(leaf_r, ei_r) && + btrfs_file_extent_encryption(leaf_l, ei_l) == + btrfs_file_extent_encryption(leaf_r, ei_r) && + btrfs_file_extent_other_encoding(leaf_l, ei_l) == + btrfs_file_extent_other_encoding(leaf_r, ei_r) && + btrfs_file_extent_type(leaf_l, ei_l) == + btrfs_file_extent_type(leaf_r, ei_r) && + btrfs_file_extent_disk_bytenr(leaf_l, ei_l) != + btrfs_file_extent_disk_bytenr(leaf_r, ei_r) && + btrfs_file_extent_disk_num_bytes(leaf_l, ei_l) == + btrfs_file_extent_disk_num_bytes(leaf_r, ei_r) && + btrfs_file_extent_offset(leaf_l, ei_l) == + btrfs_file_extent_offset(leaf_r, ei_r) && + btrfs_file_extent_num_bytes(leaf_l, ei_l) == + btrfs_file_extent_num_bytes(leaf_r, ei_r)) + return 0; + } + inconsistent_snapshot_error(sctx, result, "extent"); return -EIO; } From dcb2ff56417362c31f6b430c3c531a84581e8721 Mon Sep 17 00:00:00 2001 From: Heinz Mauelshagen Date: Mon, 10 Oct 2016 17:58:32 +0200 Subject: [PATCH 007/292] dm mirror: fix read error on recovery after default leg failure If a default leg has failed, any read will cause a new operational default leg to be selected and the read is resubmitted. But until now the read will return failure even though it was successful due to resubmission. The reason for this is bio->bi_error was not being cleared before resubmitting the bio. Fix by clearing bio->bi_error before resubmission. Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio") Cc: stable@vger.kernel.org # 4.3+ Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer --- drivers/md/dm-raid1.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c index bdf1606f67bc..7a6254d54baf 100644 --- a/drivers/md/dm-raid1.c +++ b/drivers/md/dm-raid1.c @@ -1292,6 +1292,7 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error) dm_bio_restore(bd, bio); bio_record->details.bi_bdev = NULL; + bio->bi_error = 0; queue_bio(ms, bio, rw); return DM_ENDIO_INCOMPLETE; From 12a7cf5ba6c776a2621d8972c7d42e8d3d959d20 Mon Sep 17 00:00:00 2001 From: Heinz Mauelshagen Date: Mon, 10 Oct 2016 18:48:06 +0200 Subject: [PATCH 008/292] dm mirror: use all available legs on multiple failures When any leg(s) have failed, any read will cause a new operational default leg to be selected and the read is resubmitted to it. If that new default leg fails the read too, no other still accessible legs are used to resubmit the read again -- thus failing the io. Fix by allowing the read to get resubmitted until all operational legs have been exhausted. Also, remove any details.bi_dev use as a flag. Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer --- drivers/md/dm-raid1.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c index 7a6254d54baf..9a8b71067c6e 100644 --- a/drivers/md/dm-raid1.c +++ b/drivers/md/dm-raid1.c @@ -145,7 +145,6 @@ static void dispatch_bios(void *context, struct bio_list *bio_list) struct dm_raid1_bio_record { struct mirror *m; - /* if details->bi_bdev == NULL, details were not saved */ struct dm_bio_details details; region_t write_region; }; @@ -1200,8 +1199,6 @@ static int mirror_map(struct dm_target *ti, struct bio *bio) struct dm_raid1_bio_record *bio_record = dm_per_bio_data(bio, sizeof(struct dm_raid1_bio_record)); - bio_record->details.bi_bdev = NULL; - if (rw == WRITE) { /* Save region for mirror_end_io() handler */ bio_record->write_region = dm_rh_bio_to_region(ms->rh, bio); @@ -1260,22 +1257,12 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error) } if (error == -EOPNOTSUPP) - goto out; + return error; if ((error == -EWOULDBLOCK) && (bio->bi_opf & REQ_RAHEAD)) - goto out; + return error; if (unlikely(error)) { - if (!bio_record->details.bi_bdev) { - /* - * There wasn't enough memory to record necessary - * information for a retry or there was no other - * mirror in-sync. - */ - DMERR_LIMIT("Mirror read failed."); - return -EIO; - } - m = bio_record->m; DMERR("Mirror read failed from %s. Trying alternative device.", @@ -1291,7 +1278,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error) bd = &bio_record->details; dm_bio_restore(bd, bio); - bio_record->details.bi_bdev = NULL; bio->bi_error = 0; queue_bio(ms, bio, rw); @@ -1300,9 +1286,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error) DMERR("All replicated volumes dead, failing I/O"); } -out: - bio_record->details.bi_bdev = NULL; - return error; } From 0362fcc9d6834034bc0cdb1d9b02a1b9baf96a2a Mon Sep 17 00:00:00 2001 From: Shawn Lin Date: Thu, 22 Sep 2016 12:02:19 +0800 Subject: [PATCH 009/292] arm64: dts: rockchip: remove always-on and boot-on from vcc_sd Please don't add these for vcc_sd, and mmc-core/driver will control it. Otherwise, it will waste energy even without sdmmc in slot. Moreover, it will causes a bug: If we insert/remove sd card, we could see [9.337271] mmc0: new ultra high speed SDR25 SDHC card at address 0007 [9.345144] mmcblk0: mmc0:0007 SD32G 29.3 GiB This is okay for normal sd insert/remove test, but when I debug some issues for sdmmc, I did unbind/bind test. And there is a interesting phenomenon when we bind the driver again: [58.314069] mmc0: new high speed SDHC card at address 0007 [58.320282] mmcblk0: mmc0:0007 SD32G 29.3 GiB So the sd card could just support high speed without power cycle since the vcc_sd is always on, which makes the sd card fail to reinit its internal ocr mask. Signed-off-by: Shawn Lin Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts | 2 -- arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts | 2 -- 2 files changed, 4 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts b/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts index 46cdddfcea6c..353314ca7426 100644 --- a/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts +++ b/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts @@ -258,8 +258,6 @@ }; vcc_sd: SWITCH_REG1 { - regulator-always-on; - regulator-boot-on; regulator-name = "vcc_sd"; }; diff --git a/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts b/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts index 5797933ef80e..13b7f1edad10 100644 --- a/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts +++ b/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts @@ -152,8 +152,6 @@ gpio = <&gpio3 11 GPIO_ACTIVE_LOW>; regulator-min-microvolt = <1800000>; regulator-max-microvolt = <3300000>; - regulator-always-on; - regulator-boot-on; vin-supply = <&vcc_io>; }; From 3ce0fefc51bd56381b1b9a92835cf8f9db3f2ef8 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 29 Sep 2016 10:00:14 -0700 Subject: [PATCH 010/292] ARCv2: intc: untangle SMP, MCIP and IDU The IDU intc is technically part of MCIP (Multi-core IP) hence historically was only available in a SMP hardware build (and thus only in a SMP kernel build). Now that hardware restriction has been lifted, so a UP kernel needs to support it. This requires breaking mcip.c into parts which are strictly SMP (inter-core interrupts) and IDU which in reality is just another intc and thus has no bearing on SMP. This change allows IDU in UP builds and with a suitable device tree, we can have the cascaded intc system ARCv2 core intc <---> ARCv2 IDU intc <---> periperals Signed-off-by: Vineet Gupta --- arch/arc/Kconfig | 17 +++++++++-------- arch/arc/include/asm/mcip.h | 16 ++++++++++++++++ arch/arc/kernel/mcip.c | 31 +++++++++++-------------------- 3 files changed, 36 insertions(+), 28 deletions(-) diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index ecd12379e2cd..6f67895cd9c4 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -186,14 +186,6 @@ if SMP config ARC_HAS_COH_CACHES def_bool n -config ARC_MCIP - bool "ARConnect Multicore IP (MCIP) Support " - depends on ISA_ARCV2 - help - This IP block enables SMP in ARC-HS38 cores. - It provides for cross-core interrupts, multi-core debug - hardware semaphores, shared memory,.... - config NR_CPUS int "Maximum number of CPUs (2-4096)" range 2 4096 @@ -211,6 +203,15 @@ config ARC_SMP_HALT_ON_RESET endif #SMP +config ARC_MCIP + bool "ARConnect Multicore IP (MCIP) Support " + depends on ISA_ARCV2 + default y if SMP + help + This IP block enables SMP in ARC-HS38 cores. + It provides for cross-core interrupts, multi-core debug + hardware semaphores, shared memory,.... + menuconfig ARC_CACHE bool "Enable Cache Support" default y diff --git a/arch/arc/include/asm/mcip.h b/arch/arc/include/asm/mcip.h index 847e3bbe387f..c8fbe4114bad 100644 --- a/arch/arc/include/asm/mcip.h +++ b/arch/arc/include/asm/mcip.h @@ -55,6 +55,22 @@ struct mcip_cmd { #define IDU_M_DISTRI_DEST 0x2 }; +struct mcip_bcr { +#ifdef CONFIG_CPU_BIG_ENDIAN + unsigned int pad3:8, + idu:1, llm:1, num_cores:6, + iocoh:1, gfrc:1, dbg:1, pad2:1, + msg:1, sem:1, ipi:1, pad:1, + ver:8; +#else + unsigned int ver:8, + pad:1, ipi:1, sem:1, msg:1, + pad2:1, dbg:1, gfrc:1, iocoh:1, + num_cores:6, llm:1, idu:1, + pad3:8; +#endif +}; + /* * MCIP programming model * diff --git a/arch/arc/kernel/mcip.c b/arch/arc/kernel/mcip.c index 72f9179b1a24..c424d5abc318 100644 --- a/arch/arc/kernel/mcip.c +++ b/arch/arc/kernel/mcip.c @@ -15,11 +15,12 @@ #include #include -static char smp_cpuinfo_buf[128]; -static int idu_detected; - static DEFINE_RAW_SPINLOCK(mcip_lock); +#ifdef CONFIG_SMP + +static char smp_cpuinfo_buf[128]; + static void mcip_setup_per_cpu(int cpu) { smp_ipi_irq_setup(cpu, IPI_IRQ); @@ -86,21 +87,7 @@ static void mcip_ipi_clear(int irq) static void mcip_probe_n_setup(void) { - struct mcip_bcr { -#ifdef CONFIG_CPU_BIG_ENDIAN - unsigned int pad3:8, - idu:1, llm:1, num_cores:6, - iocoh:1, gfrc:1, dbg:1, pad2:1, - msg:1, sem:1, ipi:1, pad:1, - ver:8; -#else - unsigned int ver:8, - pad:1, ipi:1, sem:1, msg:1, - pad2:1, dbg:1, gfrc:1, iocoh:1, - num_cores:6, llm:1, idu:1, - pad3:8; -#endif - } mp; + struct mcip_bcr mp; READ_BCR(ARC_REG_MCIP_BCR, mp); @@ -114,7 +101,6 @@ static void mcip_probe_n_setup(void) IS_AVAIL1(mp.gfrc, "GFRC")); cpuinfo_arc700[0].extn.gfrc = mp.gfrc; - idu_detected = mp.idu; if (mp.dbg) { __mcip_cmd_data(CMD_DEBUG_SET_SELECT, 0, 0xf); @@ -130,6 +116,8 @@ struct plat_smp_ops plat_smp_ops = { .ipi_clear = mcip_ipi_clear, }; +#endif + /*************************************************************************** * ARCv2 Interrupt Distribution Unit (IDU) * @@ -295,8 +283,11 @@ idu_of_init(struct device_node *intc, struct device_node *parent) /* Read IDU BCR to confirm nr_irqs */ int nr_irqs = of_irq_count(intc); int i, irq; + struct mcip_bcr mp; - if (!idu_detected) + READ_BCR(ARC_REG_MCIP_BCR, mp); + + if (!mp.idu) panic("IDU not detected, but DeviceTree using it"); pr_info("MCIP: IDU referenced from Devicetree %d irqs\n", nr_irqs); From 27f3d2a3b59f573a398c9acc810c16ebca07be78 Mon Sep 17 00:00:00 2001 From: Daniel Mentz Date: Tue, 4 Oct 2016 16:34:27 -0700 Subject: [PATCH 011/292] ARC: [build] Support gz, lzma compressed uImage Add support for lzma compressed uImage. Support for gzip was already available but could not be enabled because we were missing CONFIG_HAVE_KERNEL_GZIP in arch/arc/Kconfig. Signed-off-by: Daniel Mentz Cc: linux-snps-arc@lists.infradead.org Cc: Vineet Gupta Signed-off-by: Vineet Gupta --- arch/arc/Kconfig | 2 ++ arch/arc/boot/Makefile | 16 ++++++++++++++-- 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index 6f67895cd9c4..ac0b309aced5 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -41,6 +41,8 @@ config ARC select PERF_USE_VMALLOC select HAVE_DEBUG_STACKOVERFLOW select HAVE_GENERIC_DMA_COHERENT + select HAVE_KERNEL_GZIP + select HAVE_KERNEL_LZMA config MIGHT_HAVE_PCI bool diff --git a/arch/arc/boot/Makefile b/arch/arc/boot/Makefile index e597cb34c16a..f94cf151e06a 100644 --- a/arch/arc/boot/Makefile +++ b/arch/arc/boot/Makefile @@ -14,9 +14,15 @@ UIMAGE_ENTRYADDR = $(LINUX_START_TEXT) suffix-y := bin suffix-$(CONFIG_KERNEL_GZIP) := gz +suffix-$(CONFIG_KERNEL_LZMA) := lzma -targets += uImage uImage.bin uImage.gz -extra-y += vmlinux.bin vmlinux.bin.gz +targets += uImage +targets += uImage.bin +targets += uImage.gz +targets += uImage.lzma +extra-y += vmlinux.bin +extra-y += vmlinux.bin.gz +extra-y += vmlinux.bin.lzma $(obj)/vmlinux.bin: vmlinux FORCE $(call if_changed,objcopy) @@ -24,12 +30,18 @@ $(obj)/vmlinux.bin: vmlinux FORCE $(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin FORCE $(call if_changed,gzip) +$(obj)/vmlinux.bin.lzma: $(obj)/vmlinux.bin FORCE + $(call if_changed,lzma) + $(obj)/uImage.bin: $(obj)/vmlinux.bin FORCE $(call if_changed,uimage,none) $(obj)/uImage.gz: $(obj)/vmlinux.bin.gz FORCE $(call if_changed,uimage,gzip) +$(obj)/uImage.lzma: $(obj)/vmlinux.bin.lzma FORCE + $(call if_changed,uimage,lzma) + $(obj)/uImage: $(obj)/uImage.$(suffix-y) @ln -sf $(notdir $<) $@ @echo ' Image $@ is ready' From 039bea844016ae85eaf195bf25266b5eb028a319 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Wed, 12 Oct 2016 08:02:21 +0530 Subject: [PATCH 012/292] Staging: greybus: gpio: Use gbphy_dev->dev instead of bundle->dev Some of the print messages are using the incorrect device pointer, fix them. Signed-off-by: Viresh Kumar Acked-by: Johan Hovold Reviewed-by: Rui Miguel Silva Signed-off-by: Greg Kroah-Hartman --- drivers/staging/greybus/gpio.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/staging/greybus/gpio.c b/drivers/staging/greybus/gpio.c index 5e06e4229e42..250caa00de5e 100644 --- a/drivers/staging/greybus/gpio.c +++ b/drivers/staging/greybus/gpio.c @@ -702,15 +702,13 @@ static int gb_gpio_probe(struct gbphy_device *gbphy_dev, ret = gb_gpio_irqchip_add(gpio, irqc, 0, handle_level_irq, IRQ_TYPE_NONE); if (ret) { - dev_err(&connection->bundle->dev, - "failed to add irq chip: %d\n", ret); + dev_err(&gbphy_dev->dev, "failed to add irq chip: %d\n", ret); goto exit_line_free; } ret = gpiochip_add(gpio); if (ret) { - dev_err(&connection->bundle->dev, - "failed to add gpio chip: %d\n", ret); + dev_err(&gbphy_dev->dev, "failed to add gpio chip: %d\n", ret); goto exit_gpio_irqchip_remove; } From 4fa589126f23243bd998a464cb6158d343eb6a89 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Wed, 12 Oct 2016 08:02:22 +0530 Subject: [PATCH 013/292] Staging: greybus: uart: Use gbphy_dev->dev instead of bundle->dev Some of the print messages are using the incorrect device pointer, fix them. Signed-off-by: Viresh Kumar Acked-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/staging/greybus/uart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c index 5ee7954bd9f9..2633d2bfb1b4 100644 --- a/drivers/staging/greybus/uart.c +++ b/drivers/staging/greybus/uart.c @@ -888,7 +888,7 @@ static int gb_uart_probe(struct gbphy_device *gbphy_dev, minor = alloc_minor(gb_tty); if (minor < 0) { if (minor == -ENOSPC) { - dev_err(&connection->bundle->dev, + dev_err(&gbphy_dev->dev, "no more free minor numbers\n"); retval = -ENODEV; } else { From 0047b6e5f1b45b391244d78097631eb09a960202 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 12 Oct 2016 09:20:22 +0300 Subject: [PATCH 014/292] staging: android/ion: testing the wrong variable We're testing "pdev" but we intended to test "heap_pdev". This is a static checker fix and it's unlikely that anyone is affected by this bug. Fixes: 13439479c7de ('staging: ion: Add files for parsing the devicetree') Signed-off-by: Dan Carpenter Acked-by: Laura Abbott Signed-off-by: Greg Kroah-Hartman --- drivers/staging/android/ion/ion_of.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/android/ion/ion_of.c b/drivers/staging/android/ion/ion_of.c index 15bac92b7f04..46b2bb99bfd6 100644 --- a/drivers/staging/android/ion/ion_of.c +++ b/drivers/staging/android/ion/ion_of.c @@ -107,7 +107,7 @@ struct ion_platform_data *ion_parse_dt(struct platform_device *pdev, heap_pdev = of_platform_device_create(node, heaps[i].name, &pdev->dev); - if (!pdev) + if (!heap_pdev) return ERR_PTR(-ENOMEM); heap_pdev->dev.platform_data = &heaps[i]; From 1d4f1d53e1e2d5e38f4d3ca3bf60f8be5025540f Mon Sep 17 00:00:00 2001 From: Aditya Shankar Date: Fri, 7 Oct 2016 09:45:03 +0530 Subject: [PATCH 015/292] Staging: wilc1000: Fix kernel Oops on opening the device Commit 2518ac59eb27 ("staging: wilc1000: Replace kthread with workqueue for host interface") adds an unconditional destroy_workqueue() on the wilc's "hif_workqueue" soon after its creation thereby rendering it unusable. It then further attempts to queue work onto this non-existing hif_worqueue and results in: Unable to handle kernel NULL pointer dereference at virtual address 00000010 pgd = de478000 [00000010] *pgd=3eec0831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] ARM Modules linked in: wilc1000_sdio(C) wilc1000(C) CPU: 0 PID: 825 Comm: ifconfig Tainted: G C 4.8.0-rc8+ #37 Hardware name: Atmel SAMA5 task: df56f800 task.stack: deeb0000 PC is at __queue_work+0x90/0x284 LR is at __queue_work+0x58/0x284 pc : [] lr : [] psr: 600f0093 sp : deeb1aa0 ip : def22d78 fp : deea6000 r10: 00000000 r9 : c0a08150 r8 : c0a2f058 r7 : 00000001 r6 : dee9b600 r5 : def22d74 r4 : 00000000 r3 : 00000000 r2 : def22d74 r1 : 07ffffff r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none ... [] (__queue_work) from [] (queue_work_on+0x34/0x40) [] (queue_work_on) from [] (wilc_enqueue_cmd+0x54/0x64 [wilc1000]) [] (wilc_enqueue_cmd [wilc1000]) from [] (wilc_set_wfi_drv_handler+0x48/0x70 [wilc1000]) [] (wilc_set_wfi_drv_handler [wilc1000]) from [] (wilc_mac_open+0x214/0x250 [wilc1000]) [] (wilc_mac_open [wilc1000]) from [] (__dev_open+0xb8/0x11c) [] (__dev_open) from [] (__dev_change_flags+0x94/0x158) [] (__dev_change_flags) from [] (dev_change_flags+0x18/0x48) [] (dev_change_flags) from [] (devinet_ioctl+0x6b4/0x788) [] (devinet_ioctl) from [] (sock_ioctl+0x154/0x2cc) [] (sock_ioctl) from [] (do_vfs_ioctl+0x9c/0x878) [] (do_vfs_ioctl) from [] (SyS_ioctl+0x34/0x5c) [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x3c) Code: e5932004 e1520006 01a04003 0affffff (e5943010) ---[ end trace b612328adaa6bf20 ]--- This fix removes the unnecessary call to destroy_workqueue() while opening the device to avoid the above kernel panic. The deinit routine already does a good job of terminating the workqueue when no longer needed. Reported-by: Nicolas Ferre Fixes: 2518ac59eb27 ("staging: wilc1000: Replace kthread with workqueue for host interface") Cc: stable@vger.kernel.org # 4.8+ Signed-off-by: Aditya Shankar Signed-off-by: Greg Kroah-Hartman --- drivers/staging/wilc1000/host_interface.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/staging/wilc1000/host_interface.c b/drivers/staging/wilc1000/host_interface.c index 78f5613e9467..6ab7443eabde 100644 --- a/drivers/staging/wilc1000/host_interface.c +++ b/drivers/staging/wilc1000/host_interface.c @@ -3388,7 +3388,6 @@ int wilc_init(struct net_device *dev, struct host_if_drv **hif_drv_handler) clients_count++; - destroy_workqueue(hif_workqueue); _fail_: return result; } From c89d98e224b4858f42a9fec0f16766b3d7669ba3 Mon Sep 17 00:00:00 2001 From: Oleg Drokin Date: Sun, 16 Oct 2016 13:16:50 -0400 Subject: [PATCH 016/292] staging/lustre/llite: Move unstable_stats from sysfs to debugfs It's multiple values per file, so it has no business being in sysfs, besides it was assuming seqfile anyway. Fixes: d806f30e639b ("staging: lustre: osc: revise unstable pages accounting") Signed-off-by: Oleg Drokin Signed-off-by: Greg Kroah-Hartman --- .../staging/lustre/lustre/llite/lproc_llite.c | 34 +++++++++---------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c b/drivers/staging/lustre/lustre/llite/lproc_llite.c index 6eae60595905..23fda9d98bff 100644 --- a/drivers/staging/lustre/lustre/llite/lproc_llite.c +++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c @@ -871,12 +871,10 @@ static ssize_t xattr_cache_store(struct kobject *kobj, } LUSTRE_RW_ATTR(xattr_cache); -static ssize_t unstable_stats_show(struct kobject *kobj, - struct attribute *attr, - char *buf) +static int ll_unstable_stats_seq_show(struct seq_file *m, void *v) { - struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, - ll_kobj); + struct super_block *sb = m->private; + struct ll_sb_info *sbi = ll_s2sbi(sb); struct cl_client_cache *cache = sbi->ll_cache; long pages; int mb; @@ -884,19 +882,21 @@ static ssize_t unstable_stats_show(struct kobject *kobj, pages = atomic_long_read(&cache->ccc_unstable_nr); mb = (pages * PAGE_SIZE) >> 20; - return sprintf(buf, "unstable_check: %8d\n" - "unstable_pages: %12ld\n" - "unstable_mb: %8d\n", - cache->ccc_unstable_check, pages, mb); + seq_printf(m, + "unstable_check: %8d\n" + "unstable_pages: %12ld\n" + "unstable_mb: %8d\n", + cache->ccc_unstable_check, pages, mb); + + return 0; } -static ssize_t unstable_stats_store(struct kobject *kobj, - struct attribute *attr, - const char *buffer, - size_t count) +static ssize_t ll_unstable_stats_seq_write(struct file *file, + const char __user *buffer, + size_t count, loff_t *off) { - struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, - ll_kobj); + struct super_block *sb = ((struct seq_file *)file->private_data)->private; + struct ll_sb_info *sbi = ll_s2sbi(sb); char kernbuf[128]; int val, rc; @@ -922,7 +922,7 @@ static ssize_t unstable_stats_store(struct kobject *kobj, return count; } -LUSTRE_RW_ATTR(unstable_stats); +LPROC_SEQ_FOPS(ll_unstable_stats); static ssize_t root_squash_show(struct kobject *kobj, struct attribute *attr, char *buf) @@ -995,6 +995,7 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = { /* { "filegroups", lprocfs_rd_filegroups, 0, 0 }, */ { "max_cached_mb", &ll_max_cached_mb_fops, NULL }, { "statahead_stats", &ll_statahead_stats_fops, NULL, 0 }, + { "unstable_stats", &ll_unstable_stats_fops, NULL }, { "sbi_flags", &ll_sbi_flags_fops, NULL, 0 }, { .name = "nosquash_nids", .fops = &ll_nosquash_nids_fops }, @@ -1026,7 +1027,6 @@ static struct attribute *llite_attrs[] = { &lustre_attr_max_easize.attr, &lustre_attr_default_easize.attr, &lustre_attr_xattr_cache.attr, - &lustre_attr_unstable_stats.attr, &lustre_attr_root_squash.attr, NULL, }; From bbe097f092b0d13e9736bd2794d0ab24547d0e5d Mon Sep 17 00:00:00 2001 From: Alexandre Belloni Date: Thu, 15 Sep 2016 17:07:22 +0200 Subject: [PATCH 017/292] usb: gadget: udc: atmel: fix endpoint name Since commit c32b5bcfa3c4 ("ARM: dts: at91: Fix USB endpoint nodes"), atmel_usba_udc fails with: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at include/linux/usb/gadget.h:405 ecm_do_notify+0x188/0x1a0 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.7.0+ #15 Hardware name: Atmel SAMA5 [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (__warn+0xe4/0xfc) [] (__warn) from [] (warn_slowpath_null+0x20/0x28) [] (warn_slowpath_null) from [] (ecm_do_notify+0x188/0x1a0) [] (ecm_do_notify) from [] (ecm_set_alt+0x74/0x1ac) [] (ecm_set_alt) from [] (composite_setup+0xfc0/0x19f8) [] (composite_setup) from [] (usba_udc_irq+0x8f4/0xd9c) [] (usba_udc_irq) from [] (handle_irq_event_percpu+0x9c/0x158) [] (handle_irq_event_percpu) from [] (handle_irq_event+0x28/0x3c) [] (handle_irq_event) from [] (handle_fasteoi_irq+0xa0/0x168) [] (handle_fasteoi_irq) from [] (generic_handle_irq+0x24/0x34) [] (generic_handle_irq) from [] (__handle_domain_irq+0x54/0xa8) [] (__handle_domain_irq) from [] (__irq_svc+0x54/0x70) [] (__irq_svc) from [] (arch_cpu_idle+0x38/0x3c) [] (arch_cpu_idle) from [] (cpu_startup_entry+0x9c/0xdc) [] (cpu_startup_entry) from [] (start_kernel+0x354/0x360) [] (start_kernel) from [<20008078>] (0x20008078) ---[ end trace e7cf9dcebf4815a6 ]--- Fixes: c32b5bcfa3c4 ("ARM: dts: at91: Fix USB endpoint nodes") Cc: Reported-by: Richard Genoud Acked-by: Nicolas Ferre Signed-off-by: Alexandre Belloni Signed-off-by: Felipe Balbi --- drivers/usb/gadget/udc/atmel_usba_udc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/gadget/udc/atmel_usba_udc.c b/drivers/usb/gadget/udc/atmel_usba_udc.c index bb1f6c8f0f01..45bc997d0711 100644 --- a/drivers/usb/gadget/udc/atmel_usba_udc.c +++ b/drivers/usb/gadget/udc/atmel_usba_udc.c @@ -1978,7 +1978,7 @@ static struct usba_ep * atmel_udc_of_init(struct platform_device *pdev, dev_err(&pdev->dev, "of_probe: name error(%d)\n", ret); goto err; } - ep->ep.name = name; + ep->ep.name = kasprintf(GFP_KERNEL, "ep%d", ep->index); ep->ep_regs = udc->regs + USBA_EPT_BASE(i); ep->dma_regs = udc->regs + USBA_DMA_BASE(i); From 6c83f77278f17a7679001027e9231291c20f0d8a Mon Sep 17 00:00:00 2001 From: Felipe Balbi Date: Tue, 4 Oct 2016 15:14:43 +0300 Subject: [PATCH 018/292] usb: gadget: function: u_ether: don't starve tx request queue MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit If we don't guarantee that we will always get an interrupt at least when we're queueing our very last request, we could fall into situation where we queue every request with 'no_interrupt' set. This will cause the link to get stuck. The behavior above has been triggered with g_ether and dwc3. Cc: Reported-by: Ville Syrjälä Signed-off-by: Felipe Balbi --- drivers/usb/gadget/function/u_ether.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c index 9c8c9ed1dc9e..fe1811650dbc 100644 --- a/drivers/usb/gadget/function/u_ether.c +++ b/drivers/usb/gadget/function/u_ether.c @@ -590,8 +590,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb, /* throttle high/super speed IRQ rate back slightly */ if (gadget_is_dualspeed(dev->gadget)) - req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH || - dev->gadget->speed == USB_SPEED_SUPER) + req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH || + dev->gadget->speed == USB_SPEED_SUPER)) && + !list_empty(&dev->tx_reqs)) ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0) : 0; From a9c3ca5fae6bf73770f0576eaf57d5f1305ef4b3 Mon Sep 17 00:00:00 2001 From: Felipe Balbi Date: Wed, 5 Oct 2016 14:24:37 +0300 Subject: [PATCH 019/292] usb: dwc3: gadget: properly account queued requests Some requests could be accounted for multiple times. Let's fix that so each and every requests is accounted for only once. Cc: # v4.8 Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit") Signed-off-by: Felipe Balbi --- drivers/usb/dwc3/gadget.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 07cc8929f271..3c3ced128c77 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep, req->trb = trb; req->trb_dma = dwc3_trb_dma_offset(dep, trb); req->first_trb_index = dep->trb_enqueue; + dep->queued_requests++; } dwc3_ep_inc_enq(dep); @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep, trb->ctrl |= DWC3_TRB_CTRL_HWO; - dep->queued_requests++; - trace_dwc3_prepare_trb(dep, trb); } @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep, unsigned int s_pkt = 0; unsigned int trb_status; - dep->queued_requests--; dwc3_ep_inc_deq(dep); + + if (req->trb == trb) + dep->queued_requests--; + trace_dwc3_complete_trb(dep, trb); /* From d889c23ce4e3159e3d737f55f9d686a030a7ba87 Mon Sep 17 00:00:00 2001 From: Felipe Balbi Date: Thu, 29 Sep 2016 15:44:29 +0300 Subject: [PATCH 020/292] usb: dwc3: gadget: never pre-start Isochronous endpoints We cannot pre-start isochronous endpoints because we rely on the micro-frame number passed via XferNotReady command for proper Isochronous scheduling. Fixes: 08a36b543803 ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()") Signed-off-by: Felipe Balbi --- drivers/usb/dwc3/gadget.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 3c3ced128c77..f15147f79d14 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -1073,9 +1073,17 @@ static int __dwc3_gadget_ep_queue(struct dwc3_ep *dep, struct dwc3_request *req) list_add_tail(&req->list, &dep->pending_list); - if (usb_endpoint_xfer_isoc(dep->endpoint.desc) && - dep->flags & DWC3_EP_PENDING_REQUEST) { - if (list_empty(&dep->started_list)) { + /* + * NOTICE: Isochronous endpoints should NEVER be prestarted. We must + * wait for a XferNotReady event so we will know what's the current + * (micro-)frame number. + * + * Without this trick, we are very, very likely gonna get Bus Expiry + * errors which will force us issue EndTransfer command. + */ + if (usb_endpoint_xfer_isoc(dep->endpoint.desc)) { + if ((dep->flags & DWC3_EP_PENDING_REQUEST) && + list_empty(&dep->started_list)) { dwc3_stop_active_transfer(dwc, dep->number, true); dep->flags = DWC3_EP_ENABLED; } From 0f02c4e749bc79975dd23ddcc9bfe38bf76afb67 Mon Sep 17 00:00:00 2001 From: Christian Borntraeger Date: Fri, 9 Sep 2016 16:28:43 +0200 Subject: [PATCH 021/292] s390/dasd: avoid undefined behaviour the mdc value can be quite big (like 65535), so we are in undefined territory when doing the multiplication with the (also signed) FCX_MAX_DATA_FACTOR as outlined by UBSAN: UBSAN: Undefined behaviour in drivers/s390/block/dasd_eckd.c:1678:14 signed integer overflow: 65535 * 65536 cannot be represented in type 'int' CPU: 5 PID: 183 Comm: kworker/u512:1 Not tainted 4.7.0+ #150 Workqueue: events_unbound async_run_entry_fn 000000fb8b59f900 000000fb8b59f990 0000000000000002 0000000000000000 000000fb8b59fa30 000000fb8b59f9a8 000000fb8b59f9a8 000000000011732e 00000000000000a4 0000000000a309e2 0000000000a4c072 000000000000000b 000000fb8b59f9f0 000000fb8b59f990 0000000000000000 0000000000000000 0400000000d83238 000000000011732e 000000fb8b59f990 000000fb8b59f9f0 Call Trace: ([<0000000000117260>] show_trace+0x98/0xa8) ([<00000000001172e0>] show_stack+0x70/0xf0) ([<000000000053ac96>] dump_stack+0x86/0xb8) ([<000000000057f5f8>] ubsan_epilogue+0x28/0x70) ([<000000000057fe9e>] handle_overflow+0xde/0xf0) ([<00000000006c322a>] dasd_eckd_check_characteristics+0x50a/0x550) ([<00000000006b42ca>] dasd_generic_set_online+0xba/0x380) ([<0000000000693d82>] ccw_device_set_online+0x192/0x550) ([<00000000006ac1ae>] dasd_generic_auto_online+0x2e/0x70) ([<0000000000172130>] async_run_entry_fn+0x70/0x270) ([<0000000000165a72>] process_one_work+0x26a/0x638) ([<0000000000165e8a>] worker_thread+0x4a/0x658) ([<000000000016dd9c>] kthread+0x10c/0x110) ([<00000000008963ae>] kernel_thread_starter+0x6/0xc) ([<00000000008963a8>] kernel_thread_starter+0x0/0xc) As this is a runtime value there is actually no risk of any sane compiler to detect and (ab)use this undefinedness, but let's make the multiplication defined by making mdc unsigned. Signed-off-by: Christian Borntraeger Acked-by: Stefan Haberland Signed-off-by: Martin Schwidefsky --- drivers/s390/block/dasd_eckd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c index 831935af7389..a7a88476e215 100644 --- a/drivers/s390/block/dasd_eckd.c +++ b/drivers/s390/block/dasd_eckd.c @@ -1205,7 +1205,7 @@ static int verify_fcx_max_data(struct dasd_device *device, __u8 lpm) mdc, lpm); return mdc; } - fcx_max_data = mdc * FCX_MAX_DATA_FACTOR; + fcx_max_data = (u32)mdc * FCX_MAX_DATA_FACTOR; if (fcx_max_data < private->fcx_max_data) { dev_warn(&device->cdev->dev, "The maximum data size for zHPF requests %u " @@ -1675,7 +1675,7 @@ static u32 get_fcx_max_data(struct dasd_device *device) " data size for zHPF requests failed\n"); return 0; } else - return mdc * FCX_MAX_DATA_FACTOR; + return (u32)mdc * FCX_MAX_DATA_FACTOR; } /* From 12e721964e7feb555c3ee499a3f85c194d3d36ea Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Wed, 12 Oct 2016 13:43:38 +0200 Subject: [PATCH 022/292] s390: ignore pkey system calls Ignore the pkey systems calls since they don't make any sense on s390. In addition any user could trigger a warning if issueing the pkey_free system call, if it would be wired up on a system without pkey support. Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/include/asm/unistd.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/s390/include/asm/unistd.h b/arch/s390/include/asm/unistd.h index 02613bad8bbb..3066031a73fe 100644 --- a/arch/s390/include/asm/unistd.h +++ b/arch/s390/include/asm/unistd.h @@ -9,6 +9,9 @@ #include #define __IGNORE_time +#define __IGNORE_pkey_mprotect +#define __IGNORE_pkey_alloc +#define __IGNORE_pkey_free #define __ARCH_WANT_OLD_READDIR #define __ARCH_WANT_SYS_ALARM From 179a98cba11b057d9f1cc70cd2a8831f9e9a06e6 Mon Sep 17 00:00:00 2001 From: Sebastian Ott Date: Wed, 12 Oct 2016 11:14:31 +0200 Subject: [PATCH 023/292] s390/cio: don't register chpids in reserved state During IPL we register all chpids that are not in the unrecognized state. This includes chpids that are not usable and chpids for which the state could not be obtained. Change that to only register chpids in the configured (usable) or standby (usable after a configure operation) state. All other chpids could only be made available by external control for which we would receive machine checks. Signed-off-by: Sebastian Ott Reviewed-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky --- drivers/s390/cio/chp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/s390/cio/chp.c b/drivers/s390/cio/chp.c index 46be25c7461e..876c7e6e3a99 100644 --- a/drivers/s390/cio/chp.c +++ b/drivers/s390/cio/chp.c @@ -780,7 +780,7 @@ static int cfg_wait_idle(void) static int __init chp_init(void) { struct chp_id chpid; - int ret; + int state, ret; ret = crw_register_handler(CRW_RSC_CPATH, chp_process_crw); if (ret) @@ -791,7 +791,9 @@ static int __init chp_init(void) return 0; /* Register available channel-paths. */ chp_id_for_each(&chpid) { - if (chp_info_get_status(chpid) != CHP_STATUS_NOT_RECOGNIZED) + state = chp_info_get_status(chpid); + if (state == CHP_STATUS_CONFIGURED || + state == CHP_STATUS_STANDBY) chp_new(chpid); } From b5003b5f0a19b6b37ab32b1f0c6b5da2cb3f0903 Mon Sep 17 00:00:00 2001 From: Shyam Saini Date: Thu, 13 Oct 2016 21:50:07 +0530 Subject: [PATCH 024/292] s390/mm: use hugetlb_bad_size() Update setup_hugepagesz() to call hugetlb_bad_size() when unsupported hugepage size is found. Signed-off-by: Shyam Saini Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/mm/hugetlbpage.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c index cd404aa3931c..4a0c5bce3552 100644 --- a/arch/s390/mm/hugetlbpage.c +++ b/arch/s390/mm/hugetlbpage.c @@ -217,6 +217,7 @@ static __init int setup_hugepagesz(char *opt) } else if (MACHINE_HAS_EDAT2 && size == PUD_SIZE) { hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT); } else { + hugetlb_bad_size(); pr_err("hugepagesz= specifies an unsupported page size %s\n", string); return 0; From a07ce8d34eb3d9c6cec3aa25f7713e6aafad2260 Mon Sep 17 00:00:00 2001 From: Heiko Stuebner Date: Fri, 14 Oct 2016 10:47:24 -0700 Subject: [PATCH 025/292] usb: dwc2: Add msleep for host-only Although a host-only controller should not have any associated delay, some rockchip SOC platforms will not show the correct host-values of registers until after a delay. So add a 50 ms sleep when in host-only mode. Signed-off-by: John Youn Signed-off-by: Heiko Stuebner Signed-off-by: Felipe Balbi --- drivers/usb/dwc2/core.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c index fa9b26b91507..4c0fa0b17353 100644 --- a/drivers/usb/dwc2/core.c +++ b/drivers/usb/dwc2/core.c @@ -463,9 +463,18 @@ static void dwc2_clear_force_mode(struct dwc2_hsotg *hsotg) */ void dwc2_force_dr_mode(struct dwc2_hsotg *hsotg) { + bool ret; + switch (hsotg->dr_mode) { case USB_DR_MODE_HOST: - dwc2_force_mode(hsotg, true); + ret = dwc2_force_mode(hsotg, true); + /* + * NOTE: This is required for some rockchip soc based + * platforms on their host-only dwc2. + */ + if (!ret) + msleep(50); + break; case USB_DR_MODE_PERIPHERAL: dwc2_force_mode(hsotg, false); From 454915dde06a51133750c6745f0ba57361ba209d Mon Sep 17 00:00:00 2001 From: Michal Nazarewicz Date: Tue, 4 Oct 2016 02:07:33 +0200 Subject: [PATCH 026/292] usb: gadget: f_fs: edit epfile->ep under lock epfile->ep is protected by ffs->eps_lock (not epfile->mutex) so clear it while holding the spin lock. Tested-by: John Stultz Tested-by: Chen Yu Signed-off-by: Michal Nazarewicz Signed-off-by: Felipe Balbi --- drivers/usb/gadget/function/f_fs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 54ad100af35b..b31aa9572723 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -1725,17 +1725,17 @@ static void ffs_func_eps_disable(struct ffs_function *func) unsigned long flags; do { - if (epfile) - mutex_lock(&epfile->mutex); spin_lock_irqsave(&func->ffs->eps_lock, flags); /* pending requests get nuked */ if (likely(ep->ep)) usb_ep_disable(ep->ep); ++ep; + if (epfile) + epfile->ep = NULL; spin_unlock_irqrestore(&func->ffs->eps_lock, flags); if (epfile) { - epfile->ep = NULL; + mutex_lock(&epfile->mutex); kfree(epfile->read_buffer); epfile->read_buffer = NULL; mutex_unlock(&epfile->mutex); From a9e6f83c2df199187a5248f824f31b6787ae23ae Mon Sep 17 00:00:00 2001 From: Michal Nazarewicz Date: Tue, 4 Oct 2016 02:07:34 +0200 Subject: [PATCH 027/292] usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ffs_func_eps_disable is called from atomic context so it cannot sleep thus cannot grab a mutex. Change the handling of epfile->read_buffer to use non-sleeping synchronisation method. Reported-by: Chen Yu Signed-off-by: Michał Nazarewicz Fixes: 9353afbbfa7b ("buffer data from ‘oversized’ OUT requests") Tested-by: John Stultz Tested-by: Chen Yu Signed-off-by: Felipe Balbi --- drivers/usb/gadget/function/f_fs.c | 109 ++++++++++++++++++++++++----- 1 file changed, 93 insertions(+), 16 deletions(-) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index b31aa9572723..e40d47d47d82 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -136,8 +136,60 @@ struct ffs_epfile { /* * Buffer for holding data from partial reads which may happen since * we’re rounding user read requests to a multiple of a max packet size. + * + * The pointer is initialised with NULL value and may be set by + * __ffs_epfile_read_data function to point to a temporary buffer. + * + * In normal operation, calls to __ffs_epfile_read_buffered will consume + * data from said buffer and eventually free it. Importantly, while the + * function is using the buffer, it sets the pointer to NULL. This is + * all right since __ffs_epfile_read_data and __ffs_epfile_read_buffered + * can never run concurrently (they are synchronised by epfile->mutex) + * so the latter will not assign a new value to the pointer. + * + * Meanwhile ffs_func_eps_disable frees the buffer (if the pointer is + * valid) and sets the pointer to READ_BUFFER_DROP value. This special + * value is crux of the synchronisation between ffs_func_eps_disable and + * __ffs_epfile_read_data. + * + * Once __ffs_epfile_read_data is about to finish it will try to set the + * pointer back to its old value (as described above), but seeing as the + * pointer is not-NULL (namely READ_BUFFER_DROP) it will instead free + * the buffer. + * + * == State transitions == + * + * • ptr == NULL: (initial state) + * ◦ __ffs_epfile_read_buffer_free: go to ptr == DROP + * ◦ __ffs_epfile_read_buffered: nop + * ◦ __ffs_epfile_read_data allocates temp buffer: go to ptr == buf + * ◦ reading finishes: n/a, not in ‘and reading’ state + * • ptr == DROP: + * ◦ __ffs_epfile_read_buffer_free: nop + * ◦ __ffs_epfile_read_buffered: go to ptr == NULL + * ◦ __ffs_epfile_read_data allocates temp buffer: free buf, nop + * ◦ reading finishes: n/a, not in ‘and reading’ state + * • ptr == buf: + * ◦ __ffs_epfile_read_buffer_free: free buf, go to ptr == DROP + * ◦ __ffs_epfile_read_buffered: go to ptr == NULL and reading + * ◦ __ffs_epfile_read_data: n/a, __ffs_epfile_read_buffered + * is always called first + * ◦ reading finishes: n/a, not in ‘and reading’ state + * • ptr == NULL and reading: + * ◦ __ffs_epfile_read_buffer_free: go to ptr == DROP and reading + * ◦ __ffs_epfile_read_buffered: n/a, mutex is held + * ◦ __ffs_epfile_read_data: n/a, mutex is held + * ◦ reading finishes and … + * … all data read: free buf, go to ptr == NULL + * … otherwise: go to ptr == buf and reading + * • ptr == DROP and reading: + * ◦ __ffs_epfile_read_buffer_free: nop + * ◦ __ffs_epfile_read_buffered: n/a, mutex is held + * ◦ __ffs_epfile_read_data: n/a, mutex is held + * ◦ reading finishes: free buf, go to ptr == DROP */ - struct ffs_buffer *read_buffer; /* P: epfile->mutex */ + struct ffs_buffer *read_buffer; +#define READ_BUFFER_DROP ((struct ffs_buffer *)ERR_PTR(-ESHUTDOWN)) char name[5]; @@ -736,25 +788,47 @@ static void ffs_epfile_async_io_complete(struct usb_ep *_ep, schedule_work(&io_data->work); } +static void __ffs_epfile_read_buffer_free(struct ffs_epfile *epfile) +{ + /* + * See comment in struct ffs_epfile for full read_buffer pointer + * synchronisation story. + */ + struct ffs_buffer *buf = xchg(&epfile->read_buffer, READ_BUFFER_DROP); + if (buf && buf != READ_BUFFER_DROP) + kfree(buf); +} + /* Assumes epfile->mutex is held. */ static ssize_t __ffs_epfile_read_buffered(struct ffs_epfile *epfile, struct iov_iter *iter) { - struct ffs_buffer *buf = epfile->read_buffer; + /* + * Null out epfile->read_buffer so ffs_func_eps_disable does not free + * the buffer while we are using it. See comment in struct ffs_epfile + * for full read_buffer pointer synchronisation story. + */ + struct ffs_buffer *buf = xchg(&epfile->read_buffer, NULL); ssize_t ret; - if (!buf) + if (!buf || buf == READ_BUFFER_DROP) return 0; ret = copy_to_iter(buf->data, buf->length, iter); if (buf->length == ret) { kfree(buf); - epfile->read_buffer = NULL; - } else if (unlikely(iov_iter_count(iter))) { + return ret; + } + + if (unlikely(iov_iter_count(iter))) { ret = -EFAULT; } else { buf->length -= ret; buf->data += ret; } + + if (cmpxchg(&epfile->read_buffer, NULL, buf)) + kfree(buf); + return ret; } @@ -783,7 +857,15 @@ static ssize_t __ffs_epfile_read_data(struct ffs_epfile *epfile, buf->length = data_len; buf->data = buf->storage; memcpy(buf->storage, data + ret, data_len); - epfile->read_buffer = buf; + + /* + * At this point read_buffer is NULL or READ_BUFFER_DROP (if + * ffs_func_eps_disable has been called in the meanwhile). See comment + * in struct ffs_epfile for full read_buffer pointer synchronisation + * story. + */ + if (unlikely(cmpxchg(&epfile->read_buffer, NULL, buf))) + kfree(buf); return ret; } @@ -1097,8 +1179,7 @@ ffs_epfile_release(struct inode *inode, struct file *file) ENTER(); - kfree(epfile->read_buffer); - epfile->read_buffer = NULL; + __ffs_epfile_read_buffer_free(epfile); ffs_data_closed(epfile->ffs); return 0; @@ -1724,24 +1805,20 @@ static void ffs_func_eps_disable(struct ffs_function *func) unsigned count = func->ffs->eps_count; unsigned long flags; + spin_lock_irqsave(&func->ffs->eps_lock, flags); do { - spin_lock_irqsave(&func->ffs->eps_lock, flags); /* pending requests get nuked */ if (likely(ep->ep)) usb_ep_disable(ep->ep); ++ep; - if (epfile) - epfile->ep = NULL; - spin_unlock_irqrestore(&func->ffs->eps_lock, flags); if (epfile) { - mutex_lock(&epfile->mutex); - kfree(epfile->read_buffer); - epfile->read_buffer = NULL; - mutex_unlock(&epfile->mutex); + epfile->ep = NULL; + __ffs_epfile_read_buffer_free(epfile); ++epfile; } } while (--count); + spin_unlock_irqrestore(&func->ffs->eps_lock, flags); } static int ffs_func_eps_enable(struct ffs_function *func) From 51fbc7c06c8900370c6da5fc4a4685add8fa4fb0 Mon Sep 17 00:00:00 2001 From: Christophe JAILLET Date: Fri, 7 Oct 2016 22:12:39 +0200 Subject: [PATCH 028/292] usb: dwc3: Fix size used in dma_free_coherent() In commit 2abd9d5fa60f9 ("usb: dwc3: ep0: Add chained TRB support"), the size of the memory allocated with 'dma_alloc_coherent()' has been modified but the corresponding calls to 'dma_free_coherent()' have not been updated accordingly. This has been spotted with coccinelle, using the following script: //////////////////// @r@ expression x0, x1, y0, y1, z0, z1, t0, t1, ret; @@ * ret = dma_alloc_coherent(x0, y0, z0, t0); ... * dma_free_coherent(x1, y1, ret, t1); @script:python@ y0 << r.y0; y1 << r.y1; @@ if y1.find(y0) == -1: print "WARNING: sizes look different: '%s' vs '%s'" % (y0, y1) //////////////////// Fixes: 2abd9d5fa60f9 ("usb: dwc3: ep0: Add chained TRB support") Signed-off-by: Christophe JAILLET Signed-off-by: Felipe Balbi --- drivers/usb/dwc3/gadget.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index f15147f79d14..1dfa56a5f1c5 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -2990,7 +2990,7 @@ err3: kfree(dwc->setup_buf); err2: - dma_free_coherent(dwc->dev, sizeof(*dwc->ep0_trb), + dma_free_coherent(dwc->dev, sizeof(*dwc->ep0_trb) * 2, dwc->ep0_trb, dwc->ep0_trb_addr); err1: @@ -3015,7 +3015,7 @@ void dwc3_gadget_exit(struct dwc3 *dwc) kfree(dwc->setup_buf); kfree(dwc->zlp_buf); - dma_free_coherent(dwc->dev, sizeof(*dwc->ep0_trb), + dma_free_coherent(dwc->dev, sizeof(*dwc->ep0_trb) * 2, dwc->ep0_trb, dwc->ep0_trb_addr); dma_free_coherent(dwc->dev, sizeof(*dwc->ctrl_req), From a19b882c07a63174f09d1f7c036cc2aab7f04ad3 Mon Sep 17 00:00:00 2001 From: Andy Lutomirski Date: Thu, 6 Oct 2016 10:25:38 -0700 Subject: [PATCH 029/292] wusb: Stop using the stack for sg crypto scratch space Pointing an sg list at the stack is verboten and, with CONFIG_VMAP_STACK=y, will malfunction. Use kmalloc for the wusb crypto stack space instead. Untested -- I'm not entirely convinced that this hardware exists in the wild. Signed-off-by: Andy Lutomirski Signed-off-by: Greg Kroah-Hartman --- drivers/usb/wusbcore/crypto.c | 59 ++++++++++++++++++++++------------- 1 file changed, 37 insertions(+), 22 deletions(-) diff --git a/drivers/usb/wusbcore/crypto.c b/drivers/usb/wusbcore/crypto.c index 79b2b628066d..de089f3a82f3 100644 --- a/drivers/usb/wusbcore/crypto.c +++ b/drivers/usb/wusbcore/crypto.c @@ -133,6 +133,13 @@ static void bytewise_xor(void *_bo, const void *_bi1, const void *_bi2, bo[itr] = bi1[itr] ^ bi2[itr]; } +/* Scratch space for MAC calculations. */ +struct wusb_mac_scratch { + struct aes_ccm_b0 b0; + struct aes_ccm_b1 b1; + struct aes_ccm_a ax; +}; + /* * CC-MAC function WUSB1.0[6.5] * @@ -197,16 +204,15 @@ static void bytewise_xor(void *_bo, const void *_bi1, const void *_bi2, * what sg[4] is for. Maybe there is a smarter way to do this. */ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc, - struct crypto_cipher *tfm_aes, void *mic, + struct crypto_cipher *tfm_aes, + struct wusb_mac_scratch *scratch, + void *mic, const struct aes_ccm_nonce *n, const struct aes_ccm_label *a, const void *b, size_t blen) { int result = 0; SKCIPHER_REQUEST_ON_STACK(req, tfm_cbc); - struct aes_ccm_b0 b0; - struct aes_ccm_b1 b1; - struct aes_ccm_a ax; struct scatterlist sg[4], sg_dst; void *dst_buf; size_t dst_size; @@ -218,16 +224,17 @@ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc, * These checks should be compile time optimized out * ensure @a fills b1's mac_header and following fields */ - WARN_ON(sizeof(*a) != sizeof(b1) - sizeof(b1.la)); - WARN_ON(sizeof(b0) != sizeof(struct aes_ccm_block)); - WARN_ON(sizeof(b1) != sizeof(struct aes_ccm_block)); - WARN_ON(sizeof(ax) != sizeof(struct aes_ccm_block)); + WARN_ON(sizeof(*a) != sizeof(scratch->b1) - sizeof(scratch->b1.la)); + WARN_ON(sizeof(scratch->b0) != sizeof(struct aes_ccm_block)); + WARN_ON(sizeof(scratch->b1) != sizeof(struct aes_ccm_block)); + WARN_ON(sizeof(scratch->ax) != sizeof(struct aes_ccm_block)); result = -ENOMEM; zero_padding = blen % sizeof(struct aes_ccm_block); if (zero_padding) zero_padding = sizeof(struct aes_ccm_block) - zero_padding; - dst_size = blen + sizeof(b0) + sizeof(b1) + zero_padding; + dst_size = blen + sizeof(scratch->b0) + sizeof(scratch->b1) + + zero_padding; dst_buf = kzalloc(dst_size, GFP_KERNEL); if (!dst_buf) goto error_dst_buf; @@ -235,9 +242,9 @@ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc, memset(iv, 0, sizeof(iv)); /* Setup B0 */ - b0.flags = 0x59; /* Format B0 */ - b0.ccm_nonce = *n; - b0.lm = cpu_to_be16(0); /* WUSB1.0[6.5] sez l(m) is 0 */ + scratch->b0.flags = 0x59; /* Format B0 */ + scratch->b0.ccm_nonce = *n; + scratch->b0.lm = cpu_to_be16(0); /* WUSB1.0[6.5] sez l(m) is 0 */ /* Setup B1 * @@ -246,12 +253,12 @@ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc, * 14'--after clarification, it means to use A's contents * for MAC Header, EO, sec reserved and padding. */ - b1.la = cpu_to_be16(blen + 14); - memcpy(&b1.mac_header, a, sizeof(*a)); + scratch->b1.la = cpu_to_be16(blen + 14); + memcpy(&scratch->b1.mac_header, a, sizeof(*a)); sg_init_table(sg, ARRAY_SIZE(sg)); - sg_set_buf(&sg[0], &b0, sizeof(b0)); - sg_set_buf(&sg[1], &b1, sizeof(b1)); + sg_set_buf(&sg[0], &scratch->b0, sizeof(scratch->b0)); + sg_set_buf(&sg[1], &scratch->b1, sizeof(scratch->b1)); sg_set_buf(&sg[2], b, blen); /* 0 if well behaved :) */ sg_set_buf(&sg[3], bzero, zero_padding); @@ -276,11 +283,12 @@ static int wusb_ccm_mac(struct crypto_skcipher *tfm_cbc, * POS Crypto API: size is assumed to be AES's block size. * Thanks for documenting it -- tip taken from airo.c */ - ax.flags = 0x01; /* as per WUSB 1.0 spec */ - ax.ccm_nonce = *n; - ax.counter = 0; - crypto_cipher_encrypt_one(tfm_aes, (void *)&ax, (void *)&ax); - bytewise_xor(mic, &ax, iv, 8); + scratch->ax.flags = 0x01; /* as per WUSB 1.0 spec */ + scratch->ax.ccm_nonce = *n; + scratch->ax.counter = 0; + crypto_cipher_encrypt_one(tfm_aes, (void *)&scratch->ax, + (void *)&scratch->ax); + bytewise_xor(mic, &scratch->ax, iv, 8); result = 8; error_cbc_crypt: kfree(dst_buf); @@ -303,6 +311,7 @@ ssize_t wusb_prf(void *out, size_t out_size, struct aes_ccm_nonce n = *_n; struct crypto_skcipher *tfm_cbc; struct crypto_cipher *tfm_aes; + struct wusb_mac_scratch *scratch; u64 sfn = 0; __le64 sfn_le; @@ -329,17 +338,23 @@ ssize_t wusb_prf(void *out, size_t out_size, printk(KERN_ERR "E: can't set AES key: %d\n", (int)result); goto error_setkey_aes; } + scratch = kmalloc(sizeof(*scratch), GFP_KERNEL); + if (!scratch) + goto error_alloc_scratch; for (bitr = 0; bitr < (len + 63) / 64; bitr++) { sfn_le = cpu_to_le64(sfn++); memcpy(&n.sfn, &sfn_le, sizeof(n.sfn)); /* n.sfn++... */ - result = wusb_ccm_mac(tfm_cbc, tfm_aes, out + bytes, + result = wusb_ccm_mac(tfm_cbc, tfm_aes, scratch, out + bytes, &n, a, b, blen); if (result < 0) goto error_ccm_mac; bytes += result; } result = bytes; + + kfree(scratch); +error_alloc_scratch: error_ccm_mac: error_setkey_aes: crypto_free_cipher(tfm_aes); From d0208639dbc6fe97a25054df44faa2d19aca9380 Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Mon, 17 Oct 2016 11:08:31 +0200 Subject: [PATCH 030/292] s390/dumpstack: restore reliable indicator for call traces Before merging all different stack tracers the call traces printed had an indicator if an entry can be considered reliable or not. Unreliable entries were put in braces, reliable not. Currently all lines contain these extra braces. This patch restores the old behaviour by adding an extra "reliable" parameter to the callback functions. Only show_trace makes currently use of it. Before: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0) After: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] [<0000000000161d64>] create_worker+0x174/0x1c0 Fixes: 758d39ebd3d5 ("s390/dumpstack: merge all four stack tracers") Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/include/asm/processor.h | 2 +- arch/s390/kernel/dumpstack.c | 17 +++++++++++------ arch/s390/kernel/perf_event.c | 2 +- arch/s390/kernel/stacktrace.c | 4 ++-- arch/s390/oprofile/init.c | 2 +- 5 files changed, 16 insertions(+), 11 deletions(-) diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h index 03323175de30..602af692efdc 100644 --- a/arch/s390/include/asm/processor.h +++ b/arch/s390/include/asm/processor.h @@ -192,7 +192,7 @@ struct task_struct; struct mm_struct; struct seq_file; -typedef int (*dump_trace_func_t)(void *data, unsigned long address); +typedef int (*dump_trace_func_t)(void *data, unsigned long address, int reliable); void dump_trace(dump_trace_func_t func, void *data, struct task_struct *task, unsigned long sp); diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c index 6693383bc01b..518f615ad0a2 100644 --- a/arch/s390/kernel/dumpstack.c +++ b/arch/s390/kernel/dumpstack.c @@ -38,10 +38,10 @@ __dump_trace(dump_trace_func_t func, void *data, unsigned long sp, if (sp < low || sp > high - sizeof(*sf)) return sp; sf = (struct stack_frame *) sp; + if (func(data, sf->gprs[8], 0)) + return sp; /* Follow the backchain. */ while (1) { - if (func(data, sf->gprs[8])) - return sp; low = sp; sp = sf->back_chain; if (!sp) @@ -49,6 +49,8 @@ __dump_trace(dump_trace_func_t func, void *data, unsigned long sp, if (sp <= low || sp > high - sizeof(*sf)) return sp; sf = (struct stack_frame *) sp; + if (func(data, sf->gprs[8], 1)) + return sp; } /* Zero backchain detected, check for interrupt frame. */ sp = (unsigned long) (sf + 1); @@ -56,7 +58,7 @@ __dump_trace(dump_trace_func_t func, void *data, unsigned long sp, return sp; regs = (struct pt_regs *) sp; if (!user_mode(regs)) { - if (func(data, regs->psw.addr)) + if (func(data, regs->psw.addr, 1)) return sp; } low = sp; @@ -90,7 +92,7 @@ struct return_address_data { int depth; }; -static int __return_address(void *data, unsigned long address) +static int __return_address(void *data, unsigned long address, int reliable) { struct return_address_data *rd = data; @@ -109,9 +111,12 @@ unsigned long return_address(int depth) } EXPORT_SYMBOL_GPL(return_address); -static int show_address(void *data, unsigned long address) +static int show_address(void *data, unsigned long address, int reliable) { - printk("([<%016lx>] %pSR)\n", address, (void *)address); + if (reliable) + printk(" [<%016lx>] %pSR \n", address, (void *)address); + else + printk("([<%016lx>] %pSR)\n", address, (void *)address); return 0; } diff --git a/arch/s390/kernel/perf_event.c b/arch/s390/kernel/perf_event.c index 17431f63de00..955a7b6fa0a4 100644 --- a/arch/s390/kernel/perf_event.c +++ b/arch/s390/kernel/perf_event.c @@ -222,7 +222,7 @@ static int __init service_level_perf_register(void) } arch_initcall(service_level_perf_register); -static int __perf_callchain_kernel(void *data, unsigned long address) +static int __perf_callchain_kernel(void *data, unsigned long address, int reliable) { struct perf_callchain_entry_ctx *entry = data; diff --git a/arch/s390/kernel/stacktrace.c b/arch/s390/kernel/stacktrace.c index 44f84b23d4e5..355db9db8210 100644 --- a/arch/s390/kernel/stacktrace.c +++ b/arch/s390/kernel/stacktrace.c @@ -27,12 +27,12 @@ static int __save_address(void *data, unsigned long address, int nosched) return 1; } -static int save_address(void *data, unsigned long address) +static int save_address(void *data, unsigned long address, int reliable) { return __save_address(data, address, 0); } -static int save_address_nosched(void *data, unsigned long address) +static int save_address_nosched(void *data, unsigned long address, int reliable) { return __save_address(data, address, 1); } diff --git a/arch/s390/oprofile/init.c b/arch/s390/oprofile/init.c index 16f4c3960b87..9a4de4599c7b 100644 --- a/arch/s390/oprofile/init.c +++ b/arch/s390/oprofile/init.c @@ -13,7 +13,7 @@ #include #include -static int __s390_backtrace(void *data, unsigned long address) +static int __s390_backtrace(void *data, unsigned long address, int reliable) { unsigned int *depth = data; From a790634544f5f98364b0aafe9d7e669810d96360 Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Mon, 17 Oct 2016 11:59:58 +0200 Subject: [PATCH 031/292] s390/dumpstack: use pr_cont where appropriate Use pr_cont instead of simple printk calls when lines will be continued. This fixes the kernel output of various lines printed on e.g. a warning: Before: [ 0.840604] Krnl PSW : 0404c00180000000 000000000017d1d2 [ 0.840606] (try_to_wake_up+0x382/0x5e0) [ 0.840610] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 [ 0.840611] RI:0 EA:3 After: [ 0.831772] Krnl PSW : 0404c00180000000 000000000017d14a (try_to_wake_up+0x382/0x5e0) [ 0.831776] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/kernel/dumpstack.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c index 518f615ad0a2..4bebe72b7780 100644 --- a/arch/s390/kernel/dumpstack.c +++ b/arch/s390/kernel/dumpstack.c @@ -168,13 +168,13 @@ void show_registers(struct pt_regs *regs) mode = user_mode(regs) ? "User" : "Krnl"; printk("%s PSW : %p %p", mode, (void *)regs->psw.mask, (void *)regs->psw.addr); if (!user_mode(regs)) - printk(" (%pSR)", (void *)regs->psw.addr); - printk("\n"); + pr_cont(" (%pSR)", (void *)regs->psw.addr); + pr_cont("\n"); printk(" R:%x T:%x IO:%x EX:%x Key:%x M:%x W:%x " "P:%x AS:%x CC:%x PM:%x", psw->r, psw->t, psw->i, psw->e, psw->key, psw->m, psw->w, psw->p, psw->as, psw->cc, psw->pm); - printk(" RI:%x EA:%x", psw->ri, psw->eaba); - printk("\n%s GPRS: %016lx %016lx %016lx %016lx\n", mode, + pr_cont(" RI:%x EA:%x\n", psw->ri, psw->eaba); + printk("%s GPRS: %016lx %016lx %016lx %016lx\n", mode, regs->gprs[0], regs->gprs[1], regs->gprs[2], regs->gprs[3]); printk(" %016lx %016lx %016lx %016lx\n", regs->gprs[4], regs->gprs[5], regs->gprs[6], regs->gprs[7]); From 4d062487f3431f124e3a2420c0da0b7a2388dc80 Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Mon, 17 Oct 2016 12:07:35 +0200 Subject: [PATCH 032/292] s390/disassambler: use pr_cont where appropriate Just like for dumpstack use pr_cont instead of simple printk calls to fix the output when disassembling a piece of code. Before: [ 0.840627] Krnl Code: 000000000017d1c6: a77400f7 brc 7,17d3b4 [ 0.840630] 000000000017d1ca: 92015000 mvi 0(%r5),1 [ 0.840634] #000000000017d1ce: a7f40001 brc 15,17d1d0 After: [ 0.831792] Krnl Code: 000000000017d13e: a77400f7 brc 7,17d32c 000000000017d142: 92015000 mvi 0(%r5),1 #000000000017d146: a7f40001 brc 15,17d148 Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/kernel/dis.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/s390/kernel/dis.c b/arch/s390/kernel/dis.c index 43446fa2a4e5..c74c59236f44 100644 --- a/arch/s390/kernel/dis.c +++ b/arch/s390/kernel/dis.c @@ -2014,12 +2014,12 @@ void show_code(struct pt_regs *regs) *ptr++ = '\t'; ptr += print_insn(ptr, code + start, addr); start += opsize; - printk("%s", buffer); + pr_cont("%s", buffer); ptr = buffer; ptr += sprintf(ptr, "\n "); hops++; } - printk("\n"); + pr_cont("\n"); } void print_fn_code(unsigned char *code, unsigned long len) From dcddba96cdbc5d0e4d4a17bf22cfd9b2f038a4ca Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Mon, 17 Oct 2016 13:07:46 +0200 Subject: [PATCH 033/292] s390/dumpstack: get rid of return_address again With commit ef6000b4c670 ("Disable the __builtin_return_address() warning globally after all)" the kernel does not warn at all again if __builtin_return_address(n) is called with n > 0. Besides the fact that this was a false warning on s390 anyway, due to the always present backchain, we can now revert commit 5606330627ab ("s390/dumpstack: implement and use return_address()") again, to simplify the code again. After all I shouldn't have had return_address() implememted at all to workaround this issue. So get rid of this again. Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/include/asm/ftrace.h | 4 +--- arch/s390/kernel/dumpstack.c | 24 ------------------------ 2 files changed, 1 insertion(+), 27 deletions(-) diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h index 64053d9ac3f2..836c56290499 100644 --- a/arch/s390/include/asm/ftrace.h +++ b/arch/s390/include/asm/ftrace.h @@ -12,9 +12,7 @@ #ifndef __ASSEMBLY__ -unsigned long return_address(int depth); - -#define ftrace_return_address(n) return_address(n) +#define ftrace_return_address(n) __builtin_return_address(n) void _mcount(void); void ftrace_caller(void); diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c index 4bebe72b7780..34345c0a3c46 100644 --- a/arch/s390/kernel/dumpstack.c +++ b/arch/s390/kernel/dumpstack.c @@ -87,30 +87,6 @@ void dump_trace(dump_trace_func_t func, void *data, struct task_struct *task, } EXPORT_SYMBOL_GPL(dump_trace); -struct return_address_data { - unsigned long address; - int depth; -}; - -static int __return_address(void *data, unsigned long address, int reliable) -{ - struct return_address_data *rd = data; - - if (rd->depth--) - return 0; - rd->address = address; - return 1; -} - -unsigned long return_address(int depth) -{ - struct return_address_data rd = { .depth = depth + 2 }; - - dump_trace(__return_address, &rd, NULL, current_stack_pointer()); - return rd.address; -} -EXPORT_SYMBOL_GPL(return_address); - static int show_address(void *data, unsigned long address, int reliable) { if (reliable) From ca006f785fbfd7a5c901900bd3fe2b26e946a1ee Mon Sep 17 00:00:00 2001 From: Stefan Tauner Date: Thu, 6 Oct 2016 18:40:11 +0200 Subject: [PATCH 034/292] USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7 This adds support to ftdi_sio for the Infineon TriBoard TC2X7 engineering board for first-generation Aurix SoCs with Tricore CPUs. Mere addition of the device IDs does the job. Signed-off-by: Stefan Tauner Cc: stable Signed-off-by: Johan Hovold --- drivers/usb/serial/ftdi_sio.c | 3 ++- drivers/usb/serial/ftdi_sio_ids.h | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c index b2d767e743fc..0ff7f38d7800 100644 --- a/drivers/usb/serial/ftdi_sio.c +++ b/drivers/usb/serial/ftdi_sio.c @@ -986,7 +986,8 @@ static const struct usb_device_id id_table_combined[] = { /* ekey Devices */ { USB_DEVICE(FTDI_VID, FTDI_EKEY_CONV_USB_PID) }, /* Infineon Devices */ - { USB_DEVICE_INTERFACE_NUMBER(INFINEON_VID, INFINEON_TRIBOARD_PID, 1) }, + { USB_DEVICE_INTERFACE_NUMBER(INFINEON_VID, INFINEON_TRIBOARD_TC1798_PID, 1) }, + { USB_DEVICE_INTERFACE_NUMBER(INFINEON_VID, INFINEON_TRIBOARD_TC2X7_PID, 1) }, /* GE Healthcare devices */ { USB_DEVICE(GE_HEALTHCARE_VID, GE_HEALTHCARE_NEMO_TRACKER_PID) }, /* Active Research (Actisense) devices */ diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h index f87a938cf005..21011c0a4c64 100644 --- a/drivers/usb/serial/ftdi_sio_ids.h +++ b/drivers/usb/serial/ftdi_sio_ids.h @@ -626,8 +626,9 @@ /* * Infineon Technologies */ -#define INFINEON_VID 0x058b -#define INFINEON_TRIBOARD_PID 0x0028 /* DAS JTAG TriBoard TC1798 V1.0 */ +#define INFINEON_VID 0x058b +#define INFINEON_TRIBOARD_TC1798_PID 0x0028 /* DAS JTAG TriBoard TC1798 V1.0 */ +#define INFINEON_TRIBOARD_TC2X7_PID 0x0043 /* DAS JTAG TriBoard TC2X7 V1.0 */ /* * Acton Research Corp. From 4fa507992f0a1063d7326abaf705f9408548349e Mon Sep 17 00:00:00 2001 From: Jitendra Bhivare Date: Thu, 13 Oct 2016 12:08:48 +0530 Subject: [PATCH 035/292] scsi: libiscsi: Fix locking in __iscsi_conn_send_pdu The code at free_task label in __iscsi_conn_send_pdu can get executed from blk_timeout_work which takes queue_lock using spin_lock_irq. back_lock taken with spin_unlock_bh will cause WARN_ON_ONCE. The code gets executed either with bottom half or IRQ disabled hence using spin_lock/spin_unlock for back_lock is safe. Signed-off-by: Jitendra Bhivare Reviewed-by: Hannes Reinecke Reviewed-by: Chris Leech Signed-off-by: Martin K. Petersen --- drivers/scsi/libiscsi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index c051694bfcb0..f9b6fba689ff 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -791,9 +791,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr, free_task: /* regular RX path uses back_lock */ - spin_lock_bh(&session->back_lock); + spin_lock(&session->back_lock); __iscsi_put_task(task); - spin_unlock_bh(&session->back_lock); + spin_unlock(&session->back_lock); return NULL; } From 7d2c0d643244311d0ce04fde373cd371ad1f1cad Mon Sep 17 00:00:00 2001 From: Jitendra Bhivare Date: Thu, 13 Oct 2016 12:08:49 +0530 Subject: [PATCH 036/292] scsi: be2iscsi: Replace _bh with _irqsave/irqrestore [ 3843.132217] WARNING: CPU: 20 PID: 1227 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x90 [ 3843.142815] Modules linked in: ... [ 3843.294328] CPU: 20 PID: 1227 Comm: kworker/20:1H Tainted: G E 4.8.0-rc1+ #3 [ 3843.304944] Hardware name: Dell Inc. PowerEdge R720/0X6H47, BIOS 1.4.8 10/25/2012 [ 3843.314798] Workqueue: kblockd blk_timeout_work [ 3843.321350] 0000000000000086 00000000a32f4533 ffff8802216d7bd8 ffffffff8135c3cf [ 3843.331146] 0000000000000000 0000000000000000 ffff8802216d7c18 ffffffff8108d661 [ 3843.340918] 00000096216d7c50 0000000000000200 ffff8802d07cc828 ffff8801b3632550 [ 3843.350687] Call Trace: [ 3843.354866] [] dump_stack+0x63/0x84 [ 3843.362061] [] __warn+0xd1/0xf0 [ 3843.368851] [] warn_slowpath_null+0x1d/0x20 [ 3843.376791] [] __local_bh_enable_ip+0x6b/0x90 [ 3843.384903] [] _raw_spin_unlock_bh+0x1e/0x20 [ 3843.392940] [] beiscsi_alloc_pdu+0x2f0/0x6e0 [be2iscsi] [ 3843.402076] [] __iscsi_conn_send_pdu+0xf8/0x370 [libiscsi] [ 3843.411549] [] iscsi_send_nopout+0xbe/0x110 [libiscsi] [ 3843.420639] [] iscsi_eh_cmd_timed_out+0x29b/0x2b0 [libiscsi] [ 3843.430339] [] scsi_times_out+0x5e/0x250 [ 3843.438119] [] blk_rq_timed_out+0x1f/0x60 [ 3843.446009] [] blk_timeout_work+0xad/0x150 [ 3843.454010] [] process_one_work+0x152/0x400 [ 3843.462114] [] worker_thread+0x125/0x4b0 [ 3843.469961] [] ? rescuer_thread+0x380/0x380 [ 3843.478116] [] kthread+0xd8/0xf0 [ 3843.485212] [] ret_from_fork+0x1f/0x40 [ 3843.492908] [] ? kthread_park+0x60/0x60 [ 3843.500715] ---[ end trace 57ec0a1d8f0dd3a0 ]--- [ 3852.328667] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1Kernel panic - not syncing: Hard LOCKUP blk_timeout_work takes queue_lock spin_lock with interrupts disabled before invoking iscsi_eh_cmd_timed_out. This causes a WARN_ON_ONCE in spin_unlock_bh for wrb_lock/io_sgl_lock/mgmt_sgl_lock. CPU was kept busy in lot of bottom half work with interrupts disabled thus causing hard lock up. Signed-off-by: Jitendra Bhivare Reviewed-by: Hannes Reinecke Reviewed-by: Chris Leech Signed-off-by: Martin K. Petersen --- drivers/scsi/be2iscsi/be_main.c | 37 ++++++++++++++++++++------------- 1 file changed, 23 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index 6a6906f847db..75bfa14ed5ba 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -900,8 +900,9 @@ void hwi_ring_cq_db(struct beiscsi_hba *phba, static struct sgl_handle *alloc_io_sgl_handle(struct beiscsi_hba *phba) { struct sgl_handle *psgl_handle; + unsigned long flags; - spin_lock_bh(&phba->io_sgl_lock); + spin_lock_irqsave(&phba->io_sgl_lock, flags); if (phba->io_sgl_hndl_avbl) { beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_IO, "BM_%d : In alloc_io_sgl_handle," @@ -919,14 +920,16 @@ static struct sgl_handle *alloc_io_sgl_handle(struct beiscsi_hba *phba) phba->io_sgl_alloc_index++; } else psgl_handle = NULL; - spin_unlock_bh(&phba->io_sgl_lock); + spin_unlock_irqrestore(&phba->io_sgl_lock, flags); return psgl_handle; } static void free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle) { - spin_lock_bh(&phba->io_sgl_lock); + unsigned long flags; + + spin_lock_irqsave(&phba->io_sgl_lock, flags); beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_IO, "BM_%d : In free_,io_sgl_free_index=%d\n", phba->io_sgl_free_index); @@ -941,7 +944,7 @@ free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle) "value there=%p\n", phba->io_sgl_free_index, phba->io_sgl_hndl_base [phba->io_sgl_free_index]); - spin_unlock_bh(&phba->io_sgl_lock); + spin_unlock_irqrestore(&phba->io_sgl_lock, flags); return; } phba->io_sgl_hndl_base[phba->io_sgl_free_index] = psgl_handle; @@ -950,7 +953,7 @@ free_io_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle) phba->io_sgl_free_index = 0; else phba->io_sgl_free_index++; - spin_unlock_bh(&phba->io_sgl_lock); + spin_unlock_irqrestore(&phba->io_sgl_lock, flags); } static inline struct wrb_handle * @@ -958,15 +961,16 @@ beiscsi_get_wrb_handle(struct hwi_wrb_context *pwrb_context, unsigned int wrbs_per_cxn) { struct wrb_handle *pwrb_handle; + unsigned long flags; - spin_lock_bh(&pwrb_context->wrb_lock); + spin_lock_irqsave(&pwrb_context->wrb_lock, flags); pwrb_handle = pwrb_context->pwrb_handle_base[pwrb_context->alloc_index]; pwrb_context->wrb_handles_available--; if (pwrb_context->alloc_index == (wrbs_per_cxn - 1)) pwrb_context->alloc_index = 0; else pwrb_context->alloc_index++; - spin_unlock_bh(&pwrb_context->wrb_lock); + spin_unlock_irqrestore(&pwrb_context->wrb_lock, flags); if (pwrb_handle) memset(pwrb_handle->pwrb, 0, sizeof(*pwrb_handle->pwrb)); @@ -1001,14 +1005,16 @@ beiscsi_put_wrb_handle(struct hwi_wrb_context *pwrb_context, struct wrb_handle *pwrb_handle, unsigned int wrbs_per_cxn) { - spin_lock_bh(&pwrb_context->wrb_lock); + unsigned long flags; + + spin_lock_irqsave(&pwrb_context->wrb_lock, flags); pwrb_context->pwrb_handle_base[pwrb_context->free_index] = pwrb_handle; pwrb_context->wrb_handles_available++; if (pwrb_context->free_index == (wrbs_per_cxn - 1)) pwrb_context->free_index = 0; else pwrb_context->free_index++; - spin_unlock_bh(&pwrb_context->wrb_lock); + spin_unlock_irqrestore(&pwrb_context->wrb_lock, flags); } /** @@ -1037,8 +1043,9 @@ free_wrb_handle(struct beiscsi_hba *phba, struct hwi_wrb_context *pwrb_context, static struct sgl_handle *alloc_mgmt_sgl_handle(struct beiscsi_hba *phba) { struct sgl_handle *psgl_handle; + unsigned long flags; - spin_lock_bh(&phba->mgmt_sgl_lock); + spin_lock_irqsave(&phba->mgmt_sgl_lock, flags); if (phba->eh_sgl_hndl_avbl) { psgl_handle = phba->eh_sgl_hndl_base[phba->eh_sgl_alloc_index]; phba->eh_sgl_hndl_base[phba->eh_sgl_alloc_index] = NULL; @@ -1056,14 +1063,16 @@ static struct sgl_handle *alloc_mgmt_sgl_handle(struct beiscsi_hba *phba) phba->eh_sgl_alloc_index++; } else psgl_handle = NULL; - spin_unlock_bh(&phba->mgmt_sgl_lock); + spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags); return psgl_handle; } void free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle) { - spin_lock_bh(&phba->mgmt_sgl_lock); + unsigned long flags; + + spin_lock_irqsave(&phba->mgmt_sgl_lock, flags); beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_CONFIG, "BM_%d : In free_mgmt_sgl_handle," "eh_sgl_free_index=%d\n", @@ -1078,7 +1087,7 @@ free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle) "BM_%d : Double Free in eh SGL ," "eh_sgl_free_index=%d\n", phba->eh_sgl_free_index); - spin_unlock_bh(&phba->mgmt_sgl_lock); + spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags); return; } phba->eh_sgl_hndl_base[phba->eh_sgl_free_index] = psgl_handle; @@ -1088,7 +1097,7 @@ free_mgmt_sgl_handle(struct beiscsi_hba *phba, struct sgl_handle *psgl_handle) phba->eh_sgl_free_index = 0; else phba->eh_sgl_free_index++; - spin_unlock_bh(&phba->mgmt_sgl_lock); + spin_unlock_irqrestore(&phba->mgmt_sgl_lock, flags); } static void From 77f18a87186a87cab2a027335758a7244896084c Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 11 Oct 2016 11:23:23 +0200 Subject: [PATCH 037/292] scsi: NCR5380: no longer mark irq probing as __init The g_NCR5380 has been converted to more regular probing, which means its probe function can now be invoked after the __init section is discarded, as pointed out by this kbuild warning: WARNING: drivers/scsi/built-in.o(.text+0x3a105): Section mismatch in reference from the function generic_NCR5380_isa_match() to the function .init.text:probe_intr() WARNING: drivers/scsi/built-in.o(.text+0x3a145): Section mismatch in reference from the function generic_NCR5380_isa_match() to the variable .init.data:probe_irq To make sure this works correctly in all cases, let's remove the __init and __initdata annotations. Fixes: a8cfbcaec0c1 ("scsi: g_NCR5380: Stop using scsi_module.c") Signed-off-by: Arnd Bergmann Acked-by: Finn Thain Signed-off-by: Martin K. Petersen --- drivers/scsi/NCR5380.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c index db2739079cbb..790babc5ef66 100644 --- a/drivers/scsi/NCR5380.c +++ b/drivers/scsi/NCR5380.c @@ -353,7 +353,7 @@ static void NCR5380_print_phase(struct Scsi_Host *instance) #endif -static int probe_irq __initdata; +static int probe_irq; /** * probe_intr - helper for IRQ autoprobe @@ -365,7 +365,7 @@ static int probe_irq __initdata; * used by the IRQ probe code. */ -static irqreturn_t __init probe_intr(int irq, void *dev_id) +static irqreturn_t probe_intr(int irq, void *dev_id) { probe_irq = irq; return IRQ_HANDLED; @@ -380,7 +380,7 @@ static irqreturn_t __init probe_intr(int irq, void *dev_id) * and then looking to see what interrupt actually turned up. */ -static int __init __maybe_unused NCR5380_probe_irq(struct Scsi_Host *instance, +static int __maybe_unused NCR5380_probe_irq(struct Scsi_Host *instance, int possible) { struct NCR5380_hostdata *hostdata = shost_priv(instance); From b052b07c39d593c9954a84d5bbe1563999483f38 Mon Sep 17 00:00:00 2001 From: Heinz Mauelshagen Date: Mon, 17 Oct 2016 21:20:07 +0200 Subject: [PATCH 038/292] dm raid: fix activation of existing raid4/10 devices dm-raid 1.9.0 fails to activate existing RAID4/10 devices that have the old superblock format (which does not have takeover/reshaping support that was added via commit 33e53f06850f). Fix validation path for old superblocks by reverting to the old raid4 layout and basing checks on mddev->new_{level,layout,...} members in super_init_validation(). Cc: stable@vger.kernel.org # 4.8 Signed-off-by: Heinz Mauelshagen Signed-off-by: Mike Snitzer --- Documentation/device-mapper/dm-raid.txt | 1 + drivers/md/dm-raid.c | 12 +++++++----- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt index e5b6497116f4..c75b64a85859 100644 --- a/Documentation/device-mapper/dm-raid.txt +++ b/Documentation/device-mapper/dm-raid.txt @@ -309,3 +309,4 @@ Version History with a reshape in progress. 1.9.0 Add support for RAID level takeover/reshape/region size and set size reduction. +1.9.1 Fix activation of existing RAID 4/10 mapped devices diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 2a3970097991..6d53810963f7 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -266,7 +266,7 @@ static struct raid_type { {"raid10_offset", "raid10 offset (striped mirrors)", 0, 2, 10, ALGORITHM_RAID10_OFFSET}, {"raid10_near", "raid10 near (striped mirrors)", 0, 2, 10, ALGORITHM_RAID10_NEAR}, {"raid10", "raid10 (striped mirrors)", 0, 2, 10, ALGORITHM_RAID10_DEFAULT}, - {"raid4", "raid4 (dedicated last parity disk)", 1, 2, 4, ALGORITHM_PARITY_N}, /* raid4 layout = raid5_n */ + {"raid4", "raid4 (dedicated first parity disk)", 1, 2, 5, ALGORITHM_PARITY_0}, /* raid4 layout = raid5_0 */ {"raid5_n", "raid5 (dedicated last parity disk)", 1, 2, 5, ALGORITHM_PARITY_N}, {"raid5_ls", "raid5 (left symmetric)", 1, 2, 5, ALGORITHM_LEFT_SYMMETRIC}, {"raid5_rs", "raid5 (right symmetric)", 1, 2, 5, ALGORITHM_RIGHT_SYMMETRIC}, @@ -2087,11 +2087,11 @@ static int super_init_validation(struct raid_set *rs, struct md_rdev *rdev) /* * No takeover/reshaping, because we don't have the extended v1.9.0 metadata */ - if (le32_to_cpu(sb->level) != mddev->level) { + if (le32_to_cpu(sb->level) != mddev->new_level) { DMERR("Reshaping/takeover raid sets not yet supported. (raid level/stripes/size change)"); return -EINVAL; } - if (le32_to_cpu(sb->layout) != mddev->layout) { + if (le32_to_cpu(sb->layout) != mddev->new_layout) { DMERR("Reshaping raid sets not yet supported. (raid layout change)"); DMERR(" 0x%X vs 0x%X", le32_to_cpu(sb->layout), mddev->layout); DMERR(" Old layout: %s w/ %d copies", @@ -2102,7 +2102,7 @@ static int super_init_validation(struct raid_set *rs, struct md_rdev *rdev) raid10_md_layout_to_copies(mddev->layout)); return -EINVAL; } - if (le32_to_cpu(sb->stripe_sectors) != mddev->chunk_sectors) { + if (le32_to_cpu(sb->stripe_sectors) != mddev->new_chunk_sectors) { DMERR("Reshaping raid sets not yet supported. (stripe sectors change)"); return -EINVAL; } @@ -2115,6 +2115,8 @@ static int super_init_validation(struct raid_set *rs, struct md_rdev *rdev) return -EINVAL; } + DMINFO("Discovered old metadata format; upgrading to extended metadata format"); + /* Table line is checked vs. authoritative superblock */ rs_set_new(rs); } @@ -3647,7 +3649,7 @@ static void raid_resume(struct dm_target *ti) static struct target_type raid_target = { .name = "raid", - .version = {1, 9, 0}, + .version = {1, 9, 1}, .module = THIS_MODULE, .ctr = raid_ctr, .dtr = raid_dtr, From 1b283eea6228880b765bc40fe4e555416437ce58 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Fri, 7 Oct 2016 10:52:17 +0200 Subject: [PATCH 039/292] ARM: dts: fix the SD card on the Snowball This fixes a very annoying regression on the Snowball SD card that has been around for a while. It turns out that the device tree does not configure the direction pins properly, nor sets up the pins for the voltage converter properly at boot. Unless all things are correctly set up, the feedback clock will not work, and makes the driver spew messages in the console (but it works, very slowly): root@Ux500:/ mount /dev/mmcblk0p2 /mnt/ [ 9.953460] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer! [ 9.960296] mmcblk0: error -110 sending status command, retrying [ 9.966461] mmcblk0: error -110 sending status command, retrying [ 9.972534] mmcblk0: error -110 sending status command, aborting Fix this by rectifying the device tree to correspond to that of the Ux500 HREF boards plus the DAT31DIR setting that is unique for the Snowball, and things start working smoothly. Add in the SDR12 and SDR25 modes which this host can do without any problems. I don't know if this has ever been correct, sadly. It works after this patch. Cc: stable@vger.kernel.org Reported-by: Daniel Lezcano Cc: Ulf Hansson Signed-off-by: Linus Walleij Signed-off-by: Olof Johansson --- arch/arm/boot/dts/ste-snowball.dts | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/arm/boot/dts/ste-snowball.dts b/arch/arm/boot/dts/ste-snowball.dts index b3df1c60d465..386eee6de232 100644 --- a/arch/arm/boot/dts/ste-snowball.dts +++ b/arch/arm/boot/dts/ste-snowball.dts @@ -239,14 +239,25 @@ arm,primecell-periphid = <0x10480180>; max-frequency = <100000000>; bus-width = <4>; + cap-sd-highspeed; cap-mmc-highspeed; + sd-uhs-sdr12; + sd-uhs-sdr25; + /* All direction control is used */ + st,sig-dir-cmd; + st,sig-dir-dat0; + st,sig-dir-dat2; + st,sig-dir-dat31; + st,sig-pin-fbclk; + full-pwr-cycle; vmmc-supply = <&ab8500_ldo_aux3_reg>; vqmmc-supply = <&vmmci>; pinctrl-names = "default", "sleep"; pinctrl-0 = <&sdi0_default_mode>; pinctrl-1 = <&sdi0_sleep_mode>; - cd-gpios = <&gpio6 26 GPIO_ACTIVE_LOW>; // 218 + /* GPIO218 MMC_CD */ + cd-gpios = <&gpio6 26 GPIO_ACTIVE_LOW>; status = "okay"; }; @@ -549,7 +560,7 @@ /* VMMCI level-shifter enable */ snowball_cfg3 { pins = "GPIO217_AH12"; - ste,config = <&gpio_out_lo>; + ste,config = <&gpio_out_hi>; }; /* VMMCI level-shifter voltage select */ snowball_cfg4 { From 5fac7e8405a951f536dfcea09c6f6adb904a08a8 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Tue, 4 Oct 2016 13:56:19 +0200 Subject: [PATCH 040/292] bus: qcom-ebi2: depend on ARCH_QCOM or COMPILE_TEST This hides the option for people who do not want their Kconfig vision cluttered (i.e. x86) and enables compile testing apart from the supported main arch. Cc: Stephen Boyd Cc: Arnd Bergmann Signed-off-by: Linus Walleij Signed-off-by: Olof Johansson --- drivers/bus/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig index 7010dcac9328..78751057164a 100644 --- a/drivers/bus/Kconfig +++ b/drivers/bus/Kconfig @@ -111,6 +111,7 @@ config OMAP_OCP2SCP config QCOM_EBI2 bool "Qualcomm External Bus Interface 2 (EBI2)" depends on HAS_IOMEM + depends on ARCH_QCOM || COMPILE_TEST help Say y here to enable support for the Qualcomm External Bus Interface 2, which can be used to connect things like NAND Flash, From 70e1a28fe1e3fe1bddcb09e8eee25d8be1610b49 Mon Sep 17 00:00:00 2001 From: Jisheng Zhang Date: Thu, 29 Sep 2016 18:51:20 +0800 Subject: [PATCH 041/292] MAINTAINERS: add myself as Marvell berlin SoC maintainer I would like to take maintainership for Marvell berlin SoCs. Signed-off-by: Jisheng Zhang Acked-by: Sebastian Hesselbarth Signed-off-by: Olof Johansson --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 1cd38a7e0064..25f543c1814d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1442,6 +1442,7 @@ F: drivers/cpufreq/mvebu-cpufreq.c F: arch/arm/configs/mvebu_*_defconfig ARM/Marvell Berlin SoC support +M: Jisheng Zhang M: Sebastian Hesselbarth L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Maintained From 2723605169d8f58ebc2881d2b8a59e6ee6fe074a Mon Sep 17 00:00:00 2001 From: Scott Branden Date: Sat, 8 Oct 2016 13:41:04 -0700 Subject: [PATCH 042/292] ARM: multi_v7_defconfig: Enable Intel e1000e driver Enable support for the Intel e1000e driver Signed-off-by: Ray Jui Signed-off-by: Scott Branden Signed-off-by: Olof Johansson --- arch/arm/configs/multi_v7_defconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig index 437d0740dec6..11f37ed1dbff 100644 --- a/arch/arm/configs/multi_v7_defconfig +++ b/arch/arm/configs/multi_v7_defconfig @@ -850,6 +850,7 @@ CONFIG_PWM_SUN4I=y CONFIG_PWM_TEGRA=y CONFIG_PWM_VT8500=y CONFIG_PHY_HIX5HD2_SATA=y +CONFIG_E1000E=y CONFIG_PWM_STI=y CONFIG_PWM_BCM2835=y CONFIG_PWM_BRCMSTB=m From 8236d9ac4cf2e51e4355c411612cfbaec8d31942 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Mon, 17 Oct 2016 00:11:14 +0900 Subject: [PATCH 043/292] clk: uniphier: add system clock support for sLD3 SoC I do not know why, but I missed to add this compatible string in the initial commit of this driver. Signed-off-by: Masahiro Yamada Signed-off-by: Stephen Boyd --- drivers/clk/uniphier/clk-uniphier-core.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/clk/uniphier/clk-uniphier-core.c b/drivers/clk/uniphier/clk-uniphier-core.c index 5ffb898d0839..f4e0f6be5f33 100644 --- a/drivers/clk/uniphier/clk-uniphier-core.c +++ b/drivers/clk/uniphier/clk-uniphier-core.c @@ -110,6 +110,10 @@ static int uniphier_clk_remove(struct platform_device *pdev) static const struct of_device_id uniphier_clk_match[] = { /* System clock */ + { + .compatible = "socionext,uniphier-sld3-clock", + .data = uniphier_sld3_sys_clk_data, + }, { .compatible = "socionext,uniphier-ld4-clock", .data = uniphier_ld4_sys_clk_data, From c0ce317f0c8863d309e7cfb6f6c0fe86e38c6d86 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Mon, 17 Oct 2016 00:25:55 +0900 Subject: [PATCH 044/292] clk: uniphier: fix type of variable passed to regmap_read() The 3rd argument of regmap_read() takes a pointer to unsigned int. This driver is saved just because u32 happens to be typedef'ed as unsigned int, but we should not rely on that fact. Change the variable type just in case. Signed-off-by: Masahiro Yamada Signed-off-by: Stephen Boyd --- drivers/clk/uniphier/clk-uniphier-mux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/clk/uniphier/clk-uniphier-mux.c b/drivers/clk/uniphier/clk-uniphier-mux.c index 15a2f2cbe0d9..2c243a894f3b 100644 --- a/drivers/clk/uniphier/clk-uniphier-mux.c +++ b/drivers/clk/uniphier/clk-uniphier-mux.c @@ -42,7 +42,7 @@ static u8 uniphier_clk_mux_get_parent(struct clk_hw *hw) struct uniphier_clk_mux *mux = to_uniphier_clk_mux(hw); int num_parents = clk_hw_get_num_parents(hw); int ret; - u32 val; + unsigned int val; u8 i; ret = regmap_read(mux->regmap, mux->reg, &val); From 34b89b2967f284937be6759936ef3dc4d3aff2d0 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Sun, 16 Oct 2016 10:45:07 -0300 Subject: [PATCH 045/292] clk: samsung: clk-exynos-audss: Fix module autoload If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Before this patch: $ modinfo drivers/clk/samsung/clk-exynos-audss.ko | grep alias alias: platform:exynos-audss-clk After this patch: $ modinfo drivers/clk/samsung/clk-exynos-audss.ko | grep alias alias: platform:exynos-audss-clk alias: of:N*T*Csamsung,exynos5420-audss-clockC* alias: of:N*T*Csamsung,exynos5420-audss-clock alias: of:N*T*Csamsung,exynos5410-audss-clockC* alias: of:N*T*Csamsung,exynos5410-audss-clock alias: of:N*T*Csamsung,exynos5250-audss-clockC* alias: of:N*T*Csamsung,exynos5250-audss-clock alias: of:N*T*Csamsung,exynos4210-audss-clockC* alias: of:N*T*Csamsung,exynos4210-audss-clock Fixes: 4d252fd5719b ("clk: samsung: Allow modular build of the Audio Subsystem CLKCON driver") Signed-off-by: Javier Martinez Canillas Reviewed-by: Krzysztof Kozlowski Tested-by: Krzysztof Kozlowski Signed-off-by: Stephen Boyd --- drivers/clk/samsung/clk-exynos-audss.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clk/samsung/clk-exynos-audss.c b/drivers/clk/samsung/clk-exynos-audss.c index 51d152f735cc..17e68a724945 100644 --- a/drivers/clk/samsung/clk-exynos-audss.c +++ b/drivers/clk/samsung/clk-exynos-audss.c @@ -106,6 +106,7 @@ static const struct of_device_id exynos_audss_clk_of_match[] = { }, { }, }; +MODULE_DEVICE_TABLE(of, exynos_audss_clk_of_match); static void exynos_audss_clk_teardown(void) { From 234d511d8c158d62f73f1a818eb4dd494a13a6e3 Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Fri, 14 Oct 2016 14:44:13 +0200 Subject: [PATCH 046/292] clk: mediatek: Add hardware dependency Only propose the mediatek clock drivers on this platform, unless build-testing. Signed-off-by: Jean Delvare Cc: Shunli Wang Cc: James Liao Cc: Erin Lo Cc: Matthias Brugger Cc: Michael Turquette Reviewed-by: Matthias Brugger Signed-off-by: Stephen Boyd --- drivers/clk/mediatek/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/clk/mediatek/Kconfig b/drivers/clk/mediatek/Kconfig index 380c372d528e..f042bd2a6a99 100644 --- a/drivers/clk/mediatek/Kconfig +++ b/drivers/clk/mediatek/Kconfig @@ -8,6 +8,7 @@ config COMMON_CLK_MEDIATEK config COMMON_CLK_MT8135 bool "Clock driver for Mediatek MT8135" + depends on ARCH_MEDIATEK || COMPILE_TEST select COMMON_CLK_MEDIATEK default ARCH_MEDIATEK ---help--- @@ -15,6 +16,7 @@ config COMMON_CLK_MT8135 config COMMON_CLK_MT8173 bool "Clock driver for Mediatek MT8173" + depends on ARCH_MEDIATEK || COMPILE_TEST select COMMON_CLK_MEDIATEK default ARCH_MEDIATEK ---help--- From 339e1e54891c339b30023c9cc8a005cbf65a3c0c Mon Sep 17 00:00:00 2001 From: Shawn Guo Date: Sat, 8 Oct 2016 16:59:38 +0800 Subject: [PATCH 047/292] clk: core: add __init decoration for CLK_OF_DECLARE_DRIVER function The new introduced macro CLK_OF_DECLARE_DRIVER is usually used to declare clock driver init functions, which are mostly decorated with __init. Add __init decoration for CLK_OF_DECLARE_DRIVER function to avoid causing section mismatch warnings on client clock drivers. Signed-off-by: Shawn Guo Fixes: c7296c51ce5d ("clk: core: New macro CLK_OF_DECLARE_DRIVER") Signed-off-by: Stephen Boyd --- include/linux/clk-provider.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h index af596381fa0f..a428aec36ace 100644 --- a/include/linux/clk-provider.h +++ b/include/linux/clk-provider.h @@ -785,7 +785,7 @@ extern struct of_device_id __clk_of_table; * routines, one at of_clk_init(), and one at platform device probe */ #define CLK_OF_DECLARE_DRIVER(name, compat, fn) \ - static void name##_of_clk_init_driver(struct device_node *np) \ + static void __init name##_of_clk_init_driver(struct device_node *np) \ { \ of_node_clear_flag(np, OF_POPULATED); \ fn(np); \ From 981e1bea55e56abdb16505502e4a69ff868e87d3 Mon Sep 17 00:00:00 2001 From: Gregory CLEMENT Date: Thu, 29 Sep 2016 16:28:55 +0200 Subject: [PATCH 048/292] clk: mvebu: armada-37xx-periph: Fix the clock provider registration While trying using a peripheral clock on a driver, I saw that the clock pointer returned by the provider was NULL. The problem was a missing indirection. It was the pointer stored in the hws array which needed to be updated not the value it contains. Signed-off-by: Gregory CLEMENT Fixes: 8ca4746a78ab ("clk: mvebu: Add the peripheral clock driver for Armada 3700") Signed-off-by: Stephen Boyd --- drivers/clk/mvebu/armada-37xx-periph.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c index 45905fc0d75b..d5dfbad4ceab 100644 --- a/drivers/clk/mvebu/armada-37xx-periph.c +++ b/drivers/clk/mvebu/armada-37xx-periph.c @@ -305,7 +305,7 @@ static const struct of_device_id armada_3700_periph_clock_of_match[] = { }; static int armada_3700_add_composite_clk(const struct clk_periph_data *data, void __iomem *reg, spinlock_t *lock, - struct device *dev, struct clk_hw *hw) + struct device *dev, struct clk_hw **hw) { const struct clk_ops *mux_ops = NULL, *gate_ops = NULL, *rate_ops = NULL; @@ -353,13 +353,13 @@ static int armada_3700_add_composite_clk(const struct clk_periph_data *data, } } - hw = clk_hw_register_composite(dev, data->name, data->parent_names, + *hw = clk_hw_register_composite(dev, data->name, data->parent_names, data->num_parents, mux_hw, mux_ops, rate_hw, rate_ops, gate_hw, gate_ops, CLK_IGNORE_UNUSED); - if (IS_ERR(hw)) - return PTR_ERR(hw); + if (IS_ERR(*hw)) + return PTR_ERR(*hw); return 0; } @@ -400,7 +400,7 @@ static int armada_3700_periph_clock_probe(struct platform_device *pdev) spin_lock_init(&driver_data->lock); for (i = 0; i < num_periph; i++) { - struct clk_hw *hw = driver_data->hw_data->hws[i]; + struct clk_hw **hw = &driver_data->hw_data->hws[i]; if (armada_3700_add_composite_clk(&data[i], reg, &driver_data->lock, dev, hw)) From 1c7032258d568f9a7aeb4c541786699d9a219a2a Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Thu, 6 Oct 2016 11:59:59 -0300 Subject: [PATCH 049/292] clk: max77686: fix number of clocks setup for clk_hw based registration The commit 9b4cac33adc7 ("clk: max77686: Migrate to clk_hw based OF and registration APIs") converted the driver to use the new provider API to register clocks using clk_hw. But unfortunately, in the conversion it missed to set the num_clks value which lead to the following error when trying to register a clk provider: [ 1.963782] of_clk_max77686_get: invalid index 0 [ 1.967460] ERROR: could not get clock /rtc@10070000:rtc_src(1) [ 1.973638] s3c-rtc 10070000.rtc: failed to find rtc source clock Fix it by correctly set the max77686_clk_driver_data num_clks member. Fixes: 9b4cac33adc7 ("clk: max77686: Migrate to clk_hw based OF and registration APIs") Reported-by: Markus Reichl Suggested-by: Tobias Jakobi Signed-off-by: Javier Martinez Canillas Tested-by: Markus Reichl Reviewed-by: Chanwoo Choi Reviewed-by: Krzysztof Kozlowski Signed-off-by: Stephen Boyd --- drivers/clk/clk-max77686.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clk/clk-max77686.c b/drivers/clk/clk-max77686.c index b637f5979023..eb953d3b0b69 100644 --- a/drivers/clk/clk-max77686.c +++ b/drivers/clk/clk-max77686.c @@ -216,6 +216,7 @@ static int max77686_clk_probe(struct platform_device *pdev) return -EINVAL; } + drv_data->num_clks = num_clks; drv_data->max_clk_data = devm_kcalloc(dev, num_clks, sizeof(*drv_data->max_clk_data), GFP_KERNEL); From c4e634ce412d97f0e61223b2a5b3f8f9600cd4dc Mon Sep 17 00:00:00 2001 From: Eric Anholt Date: Fri, 30 Sep 2016 10:07:27 -0700 Subject: [PATCH 050/292] clk: bcm2835: Clamp the PLL's requested rate to the hardware limits. Fixes setting low-resolution video modes on HDMI. Now the PLLH_PIX divider adjusts itself until the PLLH is within bounds. Signed-off-by: Eric Anholt Signed-off-by: Stephen Boyd --- drivers/clk/bcm/clk-bcm2835.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c index b68bf573dcfb..8c7763fd9efc 100644 --- a/drivers/clk/bcm/clk-bcm2835.c +++ b/drivers/clk/bcm/clk-bcm2835.c @@ -502,8 +502,12 @@ static long bcm2835_pll_rate_from_divisors(unsigned long parent_rate, static long bcm2835_pll_round_rate(struct clk_hw *hw, unsigned long rate, unsigned long *parent_rate) { + struct bcm2835_pll *pll = container_of(hw, struct bcm2835_pll, hw); + const struct bcm2835_pll_data *data = pll->data; u32 ndiv, fdiv; + rate = clamp(rate, data->min_rate, data->max_rate); + bcm2835_pll_choose_ndiv_and_fdiv(rate, *parent_rate, &ndiv, &fdiv); return bcm2835_pll_rate_from_divisors(*parent_rate, ndiv, fdiv, 1); @@ -608,13 +612,6 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw, u32 ana[4]; int i; - if (rate < data->min_rate || rate > data->max_rate) { - dev_err(cprman->dev, "%s: rate out of spec: %lu vs (%lu, %lu)\n", - clk_hw_get_name(hw), rate, - data->min_rate, data->max_rate); - return -EINVAL; - } - if (rate > data->max_fb_rate) { use_fb_prediv = true; rate /= 2; From 4aa6c99d31c0cc471b7f243f5d314391a1abcaf3 Mon Sep 17 00:00:00 2001 From: Gregory CLEMENT Date: Fri, 30 Sep 2016 10:33:59 +0200 Subject: [PATCH 051/292] clk: mvebu: armada-37xx-periph: Fix the clock gate flag For the gate part of the peripheral clock setting the bit disables the clock and clearing it enables the clock. This is not the default behavior of clk_gate component, so we need to use the CLK_GATE_SET_TO_DISABLE flag. Signed-off-by: Gregory CLEMENT Fixes: 8ca4746a78ab ("clk: mvebu: Add the peripheral clock driver for Armada 3700") Signed-off-by: Stephen Boyd --- drivers/clk/mvebu/armada-37xx-periph.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c index d5dfbad4ceab..cecb0fdfaef6 100644 --- a/drivers/clk/mvebu/armada-37xx-periph.c +++ b/drivers/clk/mvebu/armada-37xx-periph.c @@ -329,6 +329,7 @@ static int armada_3700_add_composite_clk(const struct clk_periph_data *data, gate->lock = lock; gate_ops = gate_hw->init->ops; gate->reg = reg + (u64)gate->reg; + gate->flags = CLK_GATE_SET_TO_DISABLE; } if (data->rate_hw) { From d3397484bb5b8534289a630c1a78500ff4f2fbf4 Mon Sep 17 00:00:00 2001 From: Shawn Guo Date: Sat, 8 Oct 2016 21:38:12 +0800 Subject: [PATCH 052/292] clk: hi6220: use CLK_OF_DECLARE_DRIVER for sysctrl and mediactrl clock init The hi6220-sysctrl and hi6220-mediactrl are not only clock provider but also reset controller. It worked fine that single sysctrl/mediactrl device node in DT can be used to initialize clock driver and populate platform device for reset controller. But it stops working after commit 989eafd0b609 ("clk: core: Avoid double initialization of clocks") gets merged. The commit sets flag OF_POPULATED during clock initialization to skip the platform device populating for the same device node. On hi6220, it effectively makes hi6220-sysctrl reset driver not probe any more. The patch changes hi6220 sysctrl and mediactrl clock init macro from CLK_OF_DECLARE to CLK_OF_DECLARE_DRIVER, so that the reset driver using the same hardware block can continue working. Signed-off-by: Shawn Guo Tested-by: John Stultz Signed-off-by: Stephen Boyd --- drivers/clk/hisilicon/clk-hi6220.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/clk/hisilicon/clk-hi6220.c b/drivers/clk/hisilicon/clk-hi6220.c index fe364e63f8de..c0e8e1f196aa 100644 --- a/drivers/clk/hisilicon/clk-hi6220.c +++ b/drivers/clk/hisilicon/clk-hi6220.c @@ -195,7 +195,7 @@ static void __init hi6220_clk_sys_init(struct device_node *np) hi6220_clk_register_divider(hi6220_div_clks_sys, ARRAY_SIZE(hi6220_div_clks_sys), clk_data); } -CLK_OF_DECLARE(hi6220_clk_sys, "hisilicon,hi6220-sysctrl", hi6220_clk_sys_init); +CLK_OF_DECLARE_DRIVER(hi6220_clk_sys, "hisilicon,hi6220-sysctrl", hi6220_clk_sys_init); /* clocks in media controller */ @@ -252,7 +252,7 @@ static void __init hi6220_clk_media_init(struct device_node *np) hi6220_clk_register_divider(hi6220_div_clks_media, ARRAY_SIZE(hi6220_div_clks_media), clk_data); } -CLK_OF_DECLARE(hi6220_clk_media, "hisilicon,hi6220-mediactrl", hi6220_clk_media_init); +CLK_OF_DECLARE_DRIVER(hi6220_clk_media, "hisilicon,hi6220-mediactrl", hi6220_clk_media_init); /* clocks in pmctrl */ From 3ab7511eafdd5c4f40d2832f09554478dfbea170 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Mon, 17 Oct 2016 17:23:59 +0100 Subject: [PATCH 053/292] ALSA: hda - allow 40 bit DMA mask for NVidia devices Commit 49d9e77e72cf ("ALSA: hda - Fix system panic when DMA > 40 bits for Nvidia audio controllers") simply disabled any DMA exceeding 32 bits for NVidia devices, even though they are capable of performing DMA up to 40 bits. On some architectures (such as arm64), system memory is not guaranteed to be 32-bit addressable by PCI devices, and so this change prevents NVidia devices from working on platforms such as AMD Seattle. Since the original commit already mentioned that up to 40 bits of DMA is supported, and given that the code has been updated in the meantime to support a 40 bit DMA mask on other devices, revert commit 49d9e77e72cf and explicitly set the DMA mask to 40 bits for NVidia devices. Fixes: 49d9e77e72cf ('ALSA: hda - Fix system panic when DMA > 40 bits...') Signed-off-by: Ard Biesheuvel Cc: Signed-off-by: Takashi Iwai --- sound/pci/hda/hda_intel.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index c3469f756ec2..c64d986009a9 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -341,8 +341,7 @@ enum { /* quirks for Nvidia */ #define AZX_DCAPS_PRESET_NVIDIA \ - (AZX_DCAPS_NO_MSI | /*AZX_DCAPS_ALIGN_BUFSIZE |*/ \ - AZX_DCAPS_NO_64BIT | AZX_DCAPS_CORBRP_SELF_CLEAR |\ + (AZX_DCAPS_NO_MSI | AZX_DCAPS_CORBRP_SELF_CLEAR |\ AZX_DCAPS_SNOOP_TYPE(NVIDIA)) #define AZX_DCAPS_PRESET_CTHDA \ @@ -1716,6 +1715,10 @@ static int azx_first_init(struct azx *chip) } } + /* NVidia hardware normally only supports up to 40 bits of DMA */ + if (chip->pci->vendor == PCI_VENDOR_ID_NVIDIA) + dma_bits = 40; + /* disable 64bit DMA address on some devices */ if (chip->driver_caps & AZX_DCAPS_NO_64BIT) { dev_dbg(card->dev, "Disabling 64bit DMA\n"); From f771d5bb71d4df9573d12386400540516672208b Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Tue, 18 Oct 2016 10:59:09 +0800 Subject: [PATCH 054/292] ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table We have a new Dell laptop model which uses ALC295, the pin definition is different from the existing ones in the pin quirk table, to fix the headset mic detection and mic mute led's problem, we need to add the new pin defintion into the pin quirk table. Cc: stable@vger.kernel.org Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index b58e8c76346a..8483fc20f635 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5811,8 +5811,6 @@ static const struct hda_model_fixup alc269_fixup_models[] = { #define ALC295_STANDARD_PINS \ {0x12, 0xb7a60130}, \ {0x14, 0x90170110}, \ - {0x17, 0x21014020}, \ - {0x18, 0x21a19030}, \ {0x21, 0x04211020} #define ALC298_STANDARD_PINS \ @@ -6039,7 +6037,13 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { ALC292_STANDARD_PINS, {0x13, 0x90a60140}), SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS), + ALC295_STANDARD_PINS, + {0x17, 0x21014020}, + {0x18, 0x21a19030}), + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, + ALC295_STANDARD_PINS, + {0x17, 0x21014040}, + {0x18, 0x21a19050}), SND_HDA_PIN_QUIRK(0x10ec0298, 0x1028, "Dell", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE, ALC298_STANDARD_PINS, {0x17, 0x90170110}), From 2317eacd9cf9dc1beee74ddb453bdd7552a64a27 Mon Sep 17 00:00:00 2001 From: John Youn Date: Mon, 17 Oct 2016 17:36:23 -0700 Subject: [PATCH 055/292] Revert "usb: dwc2: gadget: change variable name to more meaningful" This reverts commit ba48eab8866c ("usb: dwc2: gadget: change variable name to more meaningful"). This is needed to cleanly revert commit aa381a7259c3 ("usb: dwc2: gadget: fix TX FIFO size and address initialization") which may cause regressions on some platforms. Signed-off-by: John Youn Cc: Robert Baldyga Cc: Stefan Wahren Signed-off-by: Felipe Balbi --- drivers/usb/dwc2/gadget.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index 4cd6403a7566..aac4af3ea68f 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -186,7 +186,7 @@ static void dwc2_hsotg_ctrl_epint(struct dwc2_hsotg *hsotg, */ static void dwc2_hsotg_init_fifo(struct dwc2_hsotg *hsotg) { - unsigned int fifo; + unsigned int ep; unsigned int addr; int timeout; u32 dptxfsizn; @@ -217,8 +217,8 @@ static void dwc2_hsotg_init_fifo(struct dwc2_hsotg *hsotg) * them to endpoints dynamically according to maxpacket size value of * given endpoint. */ - for (fifo = 1; fifo < MAX_EPS_CHANNELS; fifo++) { - dptxfsizn = dwc2_readl(hsotg->regs + DPTXFSIZN(fifo)); + for (ep = 1; ep < MAX_EPS_CHANNELS; ep++) { + dptxfsizn = dwc2_readl(hsotg->regs + DPTXFSIZN(ep)); val = (dptxfsizn & FIFOSIZE_DEPTH_MASK) | addr; addr += dptxfsizn >> FIFOSIZE_DEPTH_SHIFT; @@ -226,7 +226,7 @@ static void dwc2_hsotg_init_fifo(struct dwc2_hsotg *hsotg) if (addr > hsotg->fifo_mem) break; - dwc2_writel(val, hsotg->regs + DPTXFSIZN(fifo)); + dwc2_writel(val, hsotg->regs + DPTXFSIZN(ep)); } /* From 3fa9538539ac737096c81f3315a14670b1609092 Mon Sep 17 00:00:00 2001 From: John Youn Date: Mon, 17 Oct 2016 17:36:25 -0700 Subject: [PATCH 056/292] Revert "usb: dwc2: gadget: fix TX FIFO size and address initialization" This reverts commit aa381a7259c3 ("usb: dwc2: gadget: fix TX FIFO size and address initialization"). The original commit removed the FIFO size programming per endpoint. The DPTXFSIZn register is also used for DIEPTXFn and the SIZE field is r/w in dedicated fifo mode. So it isn't appropriate to simply remove this initialization as it might break existing behavior. Also, some cores might not have enough fifo space to handle the programming method used in the reverted patch, resulting in fifo initialization failure. Signed-off-by: John Youn Cc: Robert Baldyga Cc: Stefan Wahren Signed-off-by: Felipe Balbi --- drivers/usb/dwc2/core.h | 7 ++++++ drivers/usb/dwc2/gadget.c | 47 ++++++++++++++++++++++++++++++++------- 2 files changed, 46 insertions(+), 8 deletions(-) diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h index aad4107ef927..2a21a0414b1d 100644 --- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -259,6 +259,13 @@ enum dwc2_lx_state { DWC2_L3, /* Off state */ }; +/* + * Gadget periodic tx fifo sizes as used by legacy driver + * EP0 is not included + */ +#define DWC2_G_P_LEGACY_TX_FIFO_SIZE {256, 256, 256, 256, 768, 768, 768, \ + 768, 0, 0, 0, 0, 0, 0, 0} + /* Gadget ep0 states */ enum dwc2_ep0_state { DWC2_EP0_SETUP, diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index aac4af3ea68f..24fbebc9b409 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -189,7 +189,6 @@ static void dwc2_hsotg_init_fifo(struct dwc2_hsotg *hsotg) unsigned int ep; unsigned int addr; int timeout; - u32 dptxfsizn; u32 val; /* Reset fifo map if not correctly cleared during previous session */ @@ -218,13 +217,13 @@ static void dwc2_hsotg_init_fifo(struct dwc2_hsotg *hsotg) * given endpoint. */ for (ep = 1; ep < MAX_EPS_CHANNELS; ep++) { - dptxfsizn = dwc2_readl(hsotg->regs + DPTXFSIZN(ep)); - - val = (dptxfsizn & FIFOSIZE_DEPTH_MASK) | addr; - addr += dptxfsizn >> FIFOSIZE_DEPTH_SHIFT; - - if (addr > hsotg->fifo_mem) - break; + if (!hsotg->g_tx_fifo_sz[ep]) + continue; + val = addr; + val |= hsotg->g_tx_fifo_sz[ep] << FIFOSIZE_DEPTH_SHIFT; + WARN_ONCE(addr + hsotg->g_tx_fifo_sz[ep] > hsotg->fifo_mem, + "insufficient fifo memory"); + addr += hsotg->g_tx_fifo_sz[ep]; dwc2_writel(val, hsotg->regs + DPTXFSIZN(ep)); } @@ -3807,10 +3806,36 @@ static void dwc2_hsotg_dump(struct dwc2_hsotg *hsotg) static void dwc2_hsotg_of_probe(struct dwc2_hsotg *hsotg) { struct device_node *np = hsotg->dev->of_node; + u32 len = 0; + u32 i = 0; /* Enable dma if requested in device tree */ hsotg->g_using_dma = of_property_read_bool(np, "g-use-dma"); + /* + * Register TX periodic fifo size per endpoint. + * EP0 is excluded since it has no fifo configuration. + */ + if (!of_find_property(np, "g-tx-fifo-size", &len)) + goto rx_fifo; + + len /= sizeof(u32); + + /* Read tx fifo sizes other than ep0 */ + if (of_property_read_u32_array(np, "g-tx-fifo-size", + &hsotg->g_tx_fifo_sz[1], len)) + goto rx_fifo; + + /* Add ep0 */ + len++; + + /* Make remaining TX fifos unavailable */ + if (len < MAX_EPS_CHANNELS) { + for (i = len; i < MAX_EPS_CHANNELS; i++) + hsotg->g_tx_fifo_sz[i] = 0; + } + +rx_fifo: /* Register RX fifo size */ of_property_read_u32(np, "g-rx-fifo-size", &hsotg->g_rx_fifo_sz); @@ -3832,10 +3857,13 @@ int dwc2_gadget_init(struct dwc2_hsotg *hsotg, int irq) struct device *dev = hsotg->dev; int epnum; int ret; + int i; + u32 p_tx_fifo[] = DWC2_G_P_LEGACY_TX_FIFO_SIZE; /* Initialize to legacy fifo configuration values */ hsotg->g_rx_fifo_sz = 2048; hsotg->g_np_g_tx_fifo_sz = 1024; + memcpy(&hsotg->g_tx_fifo_sz[1], p_tx_fifo, sizeof(p_tx_fifo)); /* Device tree specific probe */ dwc2_hsotg_of_probe(hsotg); @@ -3853,6 +3881,9 @@ int dwc2_gadget_init(struct dwc2_hsotg *hsotg, int irq) dev_dbg(dev, "NonPeriodic TXFIFO size: %d\n", hsotg->g_np_g_tx_fifo_sz); dev_dbg(dev, "RXFIFO size: %d\n", hsotg->g_rx_fifo_sz); + for (i = 0; i < MAX_EPS_CHANNELS; i++) + dev_dbg(dev, "Periodic TXFIFO%2d size: %d\n", i, + hsotg->g_tx_fifo_sz[i]); hsotg->gadget.max_speed = USB_SPEED_HIGH; hsotg->gadget.ops = &dwc2_hsotg_gadget_ops; From a1aa8cf6471b17c0fa7132ea5eeef0ae07ca07cd Mon Sep 17 00:00:00 2001 From: John Youn Date: Mon, 17 Oct 2016 17:36:28 -0700 Subject: [PATCH 057/292] Revert "Documentation: devicetree: dwc2: Deprecate g-tx-fifo-size" This binding was deprecated due to commit aa381a7259c3 ("usb: dwc2: gadget: fix TX FIFO size and address initialization"). However that commit is now reverted, so also revert this commit. The binding is valid and shouldn't be deprecated. This reverts commit 65e1ff7f4b5b ("Documentation: devicetree: dwc2: Deprecate g-tx-fifo-size"). Signed-off-by: John Youn Acked-by: Rob Herring Signed-off-by: Felipe Balbi --- Documentation/devicetree/bindings/usb/dwc2.txt | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/Documentation/devicetree/bindings/usb/dwc2.txt b/Documentation/devicetree/bindings/usb/dwc2.txt index 455f2c310a1b..2c30a5479069 100644 --- a/Documentation/devicetree/bindings/usb/dwc2.txt +++ b/Documentation/devicetree/bindings/usb/dwc2.txt @@ -28,10 +28,7 @@ Refer to phy/phy-bindings.txt for generic phy consumer properties - g-use-dma: enable dma usage in gadget driver. - g-rx-fifo-size: size of rx fifo size in gadget mode. - g-np-tx-fifo-size: size of non-periodic tx fifo size in gadget mode. - -Deprecated properties: -- g-tx-fifo-size: size of periodic tx fifo per endpoint (except ep0) - in gadget mode. +- g-tx-fifo-size: size of periodic tx fifo per endpoint (except ep0) in gadget mode. Example: From d69bb92e402ff948bdcd39f19c9067874fb86873 Mon Sep 17 00:00:00 2001 From: Vlad Tsyrklevich Date: Thu, 13 Oct 2016 14:36:41 +0200 Subject: [PATCH 058/292] ALSA: asihpi: fix kernel memory disclosure Some elements in hr are not cleared before being copied to user space, leaking kernel heap memory to user space. For example, this happens in the error handling code for the HPI_ADAPTER_DELETE case. Zero the memory before it's copied. Signed-off-by: Vlad Tsyrklevich Signed-off-by: Takashi Iwai --- sound/pci/asihpi/hpioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/pci/asihpi/hpioctl.c b/sound/pci/asihpi/hpioctl.c index d17937b92331..7e3aa50b21f9 100644 --- a/sound/pci/asihpi/hpioctl.c +++ b/sound/pci/asihpi/hpioctl.c @@ -111,7 +111,7 @@ long asihpi_hpi_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return -EINVAL; hm = kmalloc(sizeof(*hm), GFP_KERNEL); - hr = kmalloc(sizeof(*hr), GFP_KERNEL); + hr = kzalloc(sizeof(*hr), GFP_KERNEL); if (!hm || !hr) { err = -ENOMEM; goto out; From d09960b0032174eb493c4c13be5b9c9ef36dc9a7 Mon Sep 17 00:00:00 2001 From: Tahsin Erdogan Date: Mon, 10 Oct 2016 05:35:19 -0700 Subject: [PATCH 059/292] dm: free io_barrier after blk_cleanup_queue call dm_old_request_fn() has paths that access md->io_barrier. The party destroying io_barrier should ensure that no future execution of dm_old_request_fn() is possible. Move io_barrier destruction to below blk_cleanup_queue() to ensure this and avoid a NULL pointer crash during request-based DM device shutdown. Cc: stable@vger.kernel.org # 4.3+ Signed-off-by: Tahsin Erdogan Signed-off-by: Mike Snitzer --- drivers/md/dm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index be35258324c1..ec513ee864f2 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1423,8 +1423,6 @@ static void cleanup_mapped_device(struct mapped_device *md) if (md->bs) bioset_free(md->bs); - cleanup_srcu_struct(&md->io_barrier); - if (md->disk) { spin_lock(&_minor_lock); md->disk->private_data = NULL; @@ -1436,6 +1434,8 @@ static void cleanup_mapped_device(struct mapped_device *md) if (md->queue) blk_cleanup_queue(md->queue); + cleanup_srcu_struct(&md->io_barrier); + if (md->bdev) { bdput(md->bdev); md->bdev = NULL; From 937fa62e8a00d0b4bc2c0a40567d7c88ab2b2e8d Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Tue, 18 Oct 2016 14:02:04 -0400 Subject: [PATCH 060/292] dm rq: clear kworker_task if kthread_run() returned an error cleanup_mapped_device() calls kthread_stop() if kworker_task is non-NULL. Currently the assigned value could be a valid task struct or an error code (e.g -ENOMEM). Reset md->kworker_task to NULL if kthread_run() returned an erorr. Fixes: 7193a9defc ("dm rq: check kthread_run return for .request_fn request-based DM") Cc: stable@vger.kernel.org # 4.8 Reported-by: Tahsin Erdogan Signed-off-by: Mike Snitzer --- drivers/md/dm-rq.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index 5eacce1ef88b..63e43f261cce 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -856,8 +856,11 @@ int dm_old_init_request_queue(struct mapped_device *md) init_kthread_worker(&md->kworker); md->kworker_task = kthread_run(kthread_worker_fn, &md->kworker, "kdmwork-%s", dm_device_name(md)); - if (IS_ERR(md->kworker_task)) - return PTR_ERR(md->kworker_task); + if (IS_ERR(md->kworker_task)) { + int error = PTR_ERR(md->kworker_task); + md->kworker_task = NULL; + return error; + } elv_register_queue(md->queue); From 7c6273194445fe1316084d3096f9311c3dfa4da6 Mon Sep 17 00:00:00 2001 From: Shawn Lin Date: Tue, 18 Oct 2016 20:52:28 +0800 Subject: [PATCH 061/292] arm64: dts: rockchip: remove the abuse of keep-power-in-suspend It was invented for sdio only, and should not be used for sdmmc or emmc. Remove it. Signed-off-by: Shawn Lin Signed-off-by: Heiko Stuebner --- arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts | 1 - arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts | 2 -- 2 files changed, 3 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts b/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts index 353314ca7426..e5eeca2c2456 100644 --- a/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts +++ b/arch/arm64/boot/dts/rockchip/rk3368-geekbox.dts @@ -116,7 +116,6 @@ cap-mmc-highspeed; clock-frequency = <150000000>; disable-wp; - keep-power-in-suspend; non-removable; num-slots = <1>; vmmc-supply = <&vcc_io>; diff --git a/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts b/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts index 13b7f1edad10..ea0a8eceefd4 100644 --- a/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts +++ b/arch/arm64/boot/dts/rockchip/rk3368-orion-r68-meta.dts @@ -199,7 +199,6 @@ bus-width = <8>; cap-mmc-highspeed; disable-wp; - keep-power-in-suspend; mmc-pwrseq = <&emmc_pwrseq>; mmc-hs200-1_2v; mmc-hs200-1_8v; @@ -348,7 +347,6 @@ clock-freq-min-max = <400000 50000000>; cap-sd-highspeed; card-detect-delay = <200>; - keep-power-in-suspend; num-slots = <1>; pinctrl-names = "default"; pinctrl-0 = <&sdmmc_clk &sdmmc_cmd &sdmmc_cd &sdmmc_bus4>; From 6d4952d9d9d4dc2bb9c0255d95a09405a1e958f7 Mon Sep 17 00:00:00 2001 From: Andrew Lutomirski Date: Mon, 17 Oct 2016 10:06:27 -0700 Subject: [PATCH 062/292] hwrng: core - Don't use a stack buffer in add_early_randomness() hw_random carefully avoids using a stack buffer except in add_early_randomness(). This causes a crash in virtio_rng if CONFIG_VMAP_STACK=y. Reported-by: Matt Mullins Tested-by: Matt Mullins Fixes: d3cc7996473a ("hwrng: fetch randomness only after device init") Signed-off-by: Andy Lutomirski Signed-off-by: Herbert Xu --- drivers/char/hw_random/core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c index 482794526e8c..d2d2c89de5b4 100644 --- a/drivers/char/hw_random/core.c +++ b/drivers/char/hw_random/core.c @@ -84,14 +84,14 @@ static size_t rng_buffer_size(void) static void add_early_randomness(struct hwrng *rng) { - unsigned char bytes[16]; int bytes_read; + size_t size = min_t(size_t, 16, rng_buffer_size()); mutex_lock(&reading_mutex); - bytes_read = rng_get_data(rng, bytes, sizeof(bytes), 1); + bytes_read = rng_get_data(rng, rng_buffer, size, 1); mutex_unlock(&reading_mutex); if (bytes_read > 0) - add_device_randomness(bytes, bytes_read); + add_device_randomness(rng_buffer, bytes_read); } static inline void cleanup_rng(struct kref *kref) From 1ee1710cd6bbf49853688e97d6e1d98b48f28586 Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Wed, 19 Oct 2016 13:19:02 +0000 Subject: [PATCH 063/292] wusb: fix error return code in wusb_prf() Fix to return error code -ENOMEM from the kmalloc() error handling case instead of 0, as done elsewhere in this function. Fixes: a19b882c07a6 ("wusb: Stop using the stack for sg crypto scratch space") Signed-off-by: Wei Yongjun Signed-off-by: Greg Kroah-Hartman --- drivers/usb/wusbcore/crypto.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/wusbcore/crypto.c b/drivers/usb/wusbcore/crypto.c index de089f3a82f3..79451f7ef1b7 100644 --- a/drivers/usb/wusbcore/crypto.c +++ b/drivers/usb/wusbcore/crypto.c @@ -339,8 +339,10 @@ ssize_t wusb_prf(void *out, size_t out_size, goto error_setkey_aes; } scratch = kmalloc(sizeof(*scratch), GFP_KERNEL); - if (!scratch) + if (!scratch) { + result = -ENOMEM; goto error_alloc_scratch; + } for (bitr = 0; bitr < (len + 63) / 64; bitr++) { sfn_le = cpu_to_le64(sfn++); From 889882bce2a5f69242c1f3acd840983f467499b9 Mon Sep 17 00:00:00 2001 From: Lukasz Odzioba Date: Tue, 4 Oct 2016 18:26:26 +0200 Subject: [PATCH 064/292] perf/x86/intel/cstate: Add C-state residency events for Knights Landing Although KNL does support C1,C6,PC2,PC3,PC6 states, the patch only supports C6,PC2,PC3,PC6, because there is no counter for C1. C6 residency counter MSR on KNL has a different address than other platforms which is handled as a new quirk flag. Signed-off-by: Lukasz Odzioba Acked-by: Peter Zijlstra Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Rafael J. Wysocki Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: bp@suse.de Cc: dave.hansen@linux.intel.com Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/1475598386-19597-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Ingo Molnar --- arch/x86/events/intel/cstate.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c index 3ca87b5a8677..4f5ac726335f 100644 --- a/arch/x86/events/intel/cstate.c +++ b/arch/x86/events/intel/cstate.c @@ -48,7 +48,8 @@ * Scope: Core * MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter * perf code: 0x02 - * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,SKL + * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW + * SKL,KNL * Scope: Core * MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter * perf code: 0x03 @@ -56,15 +57,16 @@ * Scope: Core * MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter. * perf code: 0x00 - * Available model: SNB,IVB,HSW,BDW,SKL + * Available model: SNB,IVB,HSW,BDW,SKL,KNL * Scope: Package (physical package) * MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter. * perf code: 0x01 - * Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL + * Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL * Scope: Package (physical package) * MSR_PKG_C6_RESIDENCY: Package C6 Residency Counter. * perf code: 0x02 - * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,SKL + * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW + * SKL,KNL * Scope: Package (physical package) * MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter. * perf code: 0x03 @@ -118,6 +120,7 @@ struct cstate_model { /* Quirk flags */ #define SLM_PKG_C6_USE_C7_MSR (1UL << 0) +#define KNL_CORE_C6_MSR (1UL << 1) struct perf_cstate_msr { u64 msr; @@ -488,6 +491,18 @@ static const struct cstate_model slm_cstates __initconst = { .quirks = SLM_PKG_C6_USE_C7_MSR, }; + +static const struct cstate_model knl_cstates __initconst = { + .core_events = BIT(PERF_CSTATE_CORE_C6_RES), + + .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) | + BIT(PERF_CSTATE_PKG_C3_RES) | + BIT(PERF_CSTATE_PKG_C6_RES), + .quirks = KNL_CORE_C6_MSR, +}; + + + #define X86_CSTATES_MODEL(model, states) \ { X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long) &(states) } @@ -523,6 +538,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = { X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_MOBILE, snb_cstates), X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_DESKTOP, snb_cstates), + + X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNL, knl_cstates), { }, }; MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match); @@ -558,6 +575,11 @@ static int __init cstate_probe(const struct cstate_model *cm) if (cm->quirks & SLM_PKG_C6_USE_C7_MSR) pkg_msr[PERF_CSTATE_PKG_C6_RES].msr = MSR_PKG_C7_RESIDENCY; + /* KNL has different MSR for CORE C6 */ + if (cm->quirks & KNL_CORE_C6_MSR) + pkg_msr[PERF_CSTATE_CORE_C6_RES].msr = MSR_KNL_CORE_C6_RESIDENCY; + + has_cstate_core = cstate_probe_msr(cm->core_events, PERF_CSTATE_CORE_EVENT_MAX, core_msr, core_events_attrs); From 3da43104d3187184d7417569cb3360511229f761 Mon Sep 17 00:00:00 2001 From: Noam Camus Date: Wed, 19 Oct 2016 14:25:03 +0300 Subject: [PATCH 065/292] ARC: Adjust cpuinfo for non-continuous cpu ids num_possible_cpus() returns how many CPUs may be present on system. However we want the highest possible CPU number. This may be differ in a sparsed possible CPUs map. Such map achived by OF for plat-eznps. For example if we have: possible cpus mask 0,3 Then: num_possible_cpus() is equal 2 while nr_cpu_ids is equal 4. Only for value 4 c_start() will provide correct cpuinfo at procfs. Signed-off-by: Noam Camus Signed-off-by: Vineet Gupta --- arch/arc/kernel/setup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c index 3df7f9c72f42..75e540972135 100644 --- a/arch/arc/kernel/setup.c +++ b/arch/arc/kernel/setup.c @@ -507,7 +507,7 @@ static void *c_start(struct seq_file *m, loff_t *pos) * way to pass it w/o having to kmalloc/free a 2 byte string. * Encode cpu-id as 0xFFcccc, which is decoded by show routine. */ - return *pos < num_possible_cpus() ? cpu_to_ptr(*pos) : NULL; + return *pos < nr_cpu_ids ? cpu_to_ptr(*pos) : NULL; } static void *c_next(struct seq_file *m, void *v, loff_t *pos) From 17a51f12cfbd2814fd35966a069b242569c53e27 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 18 Oct 2016 09:00:52 +0200 Subject: [PATCH 066/292] ahci: only try to use multi-MSI mode if there is more than 1 port We should only try to allocate multiple MSI or MSI-X vectors if the device actually has multiple ports. Otherwise pci_alloc_irq_vectors will return a single vector due to n_ports = 1, in which case we shouldn't set the AHCI_HFLAG_MULTI_MSI flag. Signed-off-by: Christoph Hellwig Fixes: 0b9e2988 ("ahci: use pci_alloc_irq_vectors") Reported-by: Emmanuel Benisty Tested-by: Emmanuel Benisty Signed-off-by: Tejun Heo --- drivers/ata/ahci.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index ba5f11cebee2..ed311a040fed 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -1418,21 +1418,24 @@ static int ahci_init_msi(struct pci_dev *pdev, unsigned int n_ports, * Message mode could be enforced. In this case assume that advantage * of multipe MSIs is negated and use single MSI mode instead. */ - nvec = pci_alloc_irq_vectors(pdev, n_ports, INT_MAX, - PCI_IRQ_MSIX | PCI_IRQ_MSI); - if (nvec > 0) { - if (!(readl(hpriv->mmio + HOST_CTL) & HOST_MRSM)) { - hpriv->get_irq_vector = ahci_get_irq_vector; - hpriv->flags |= AHCI_HFLAG_MULTI_MSI; - return nvec; - } + if (n_ports > 1) { + nvec = pci_alloc_irq_vectors(pdev, n_ports, INT_MAX, + PCI_IRQ_MSIX | PCI_IRQ_MSI); + if (nvec > 0) { + if (!(readl(hpriv->mmio + HOST_CTL) & HOST_MRSM)) { + hpriv->get_irq_vector = ahci_get_irq_vector; + hpriv->flags |= AHCI_HFLAG_MULTI_MSI; + return nvec; + } - /* - * Fallback to single MSI mode if the controller enforced MRSM - * mode. - */ - printk(KERN_INFO "ahci: MRSM is on, fallback to single MSI\n"); - pci_free_irq_vectors(pdev); + /* + * Fallback to single MSI mode if the controller + * enforced MRSM mode. + */ + printk(KERN_INFO + "ahci: MRSM is on, fallback to single MSI\n"); + pci_free_irq_vectors(pdev); + } } /* From 75d29713b792da4782cadfaa87e802183440694e Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 12 Oct 2016 09:34:29 +0300 Subject: [PATCH 067/292] libnvdimm, namespace: potential NULL deref on allocation error If the kcalloc() fails then "devs" can be NULL and we dereference it checking "devs[i]". Fixes: 1b40e09a1232 ('libnvdimm: blk labels and namespace instantiation') Signed-off-by: Dan Carpenter Signed-off-by: Dan Williams --- drivers/nvdimm/namespace_devs.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c index 3509cff68ef9..abe5c6bc756c 100644 --- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -2176,12 +2176,14 @@ static struct device **scan_labels(struct nd_region *nd_region) return devs; err: - for (i = 0; devs[i]; i++) - if (is_nd_blk(&nd_region->dev)) - namespace_blk_release(devs[i]); - else - namespace_pmem_release(devs[i]); - kfree(devs); + if (devs) { + for (i = 0; devs[i]; i++) + if (is_nd_blk(&nd_region->dev)) + namespace_blk_release(devs[i]); + else + namespace_pmem_release(devs[i]); + kfree(devs); + } return NULL; } From 3115bb02b5c23d960df5f1bf551ec394a9bb10ec Mon Sep 17 00:00:00 2001 From: Toshi Kani Date: Thu, 13 Oct 2016 09:54:21 -0600 Subject: [PATCH 068/292] pmem: report error on clear poison failure ACPI Clear Uncorrectable Error DSM function may fail or may be unsupported on a platform. pmem_clear_poison() returns without clearing badblocks in such cases. This failure is detected at the next read (-EIO). This behavior can lead to an issue when user keeps writing but does not read immediately. For instance, flight recorder file may be only read when it is necessary for troubleshooting. Change pmem_do_bvec() and pmem_clear_poison() to return -EIO so that filesystem can log an error message on a write error. Cc: Vishal Verma Signed-off-by: Toshi Kani Signed-off-by: Dan Williams --- drivers/nvdimm/pmem.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 42b3a8217073..24618431a14b 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -47,7 +47,7 @@ static struct nd_region *to_region(struct pmem_device *pmem) return to_nd_region(to_dev(pmem)->parent); } -static void pmem_clear_poison(struct pmem_device *pmem, phys_addr_t offset, +static int pmem_clear_poison(struct pmem_device *pmem, phys_addr_t offset, unsigned int len) { struct device *dev = to_dev(pmem); @@ -62,8 +62,12 @@ static void pmem_clear_poison(struct pmem_device *pmem, phys_addr_t offset, __func__, (unsigned long long) sector, cleared / 512, cleared / 512 > 1 ? "s" : ""); badblocks_clear(&pmem->bb, sector, cleared / 512); + } else { + return -EIO; } + invalidate_pmem(pmem->virt_addr + offset, len); + return 0; } static void write_pmem(void *pmem_addr, struct page *page, @@ -123,7 +127,7 @@ static int pmem_do_bvec(struct pmem_device *pmem, struct page *page, flush_dcache_page(page); write_pmem(pmem_addr, page, off, len); if (unlikely(bad_pmem)) { - pmem_clear_poison(pmem, pmem_off, len); + rc = pmem_clear_poison(pmem, pmem_off, len); write_pmem(pmem_addr, page, off, len); } } From 7d36b9c102318aa86aceb074359305da88ce9ef9 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Wed, 19 Oct 2016 20:49:39 +0900 Subject: [PATCH 069/292] clk: uniphier: fix memory overrun bug The first loop of this "for" statement writes memory beyond the allocated clk_hw_onecell_data. It should be: for (clk_num--; clk_num >= 0; clk_num--) ... Or more simply: while (--clk_num >= 0) ... Fixes: 734d82f4a678 ("clk: uniphier: add core support code for UniPhier clock driver") Signed-off-by: Masahiro Yamada Signed-off-by: Stephen Boyd --- drivers/clk/uniphier/clk-uniphier-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/clk/uniphier/clk-uniphier-core.c b/drivers/clk/uniphier/clk-uniphier-core.c index f4e0f6be5f33..84bc465d31aa 100644 --- a/drivers/clk/uniphier/clk-uniphier-core.c +++ b/drivers/clk/uniphier/clk-uniphier-core.c @@ -79,7 +79,7 @@ static int uniphier_clk_probe(struct platform_device *pdev) hw_data->num = clk_num; /* avoid returning NULL for unused idx */ - for (; clk_num >= 0; clk_num--) + while (--clk_num >= 0) hw_data->hws[clk_num] = ERR_PTR(-EINVAL); for (p = data; p->name; p++) { From 5c6201e60a57c6b240d446c8a2d83063283b2743 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Wed, 19 Oct 2016 17:22:07 +0900 Subject: [PATCH 070/292] clk: uniphier: rename MIO clock to SD clock for Pro5, PXs2, LD20 SoCs I made a mistake as for naming for this block. The MIO block is not implemented for these 3 SoCs in the first place. The current naming will be a trouble if an SoC with both MIO and SD-ctrl blocks appear in the future. This driver has just been merged in the previous merge window. Rename it before the release. Signed-off-by: Masahiro Yamada Signed-off-by: Stephen Boyd --- .../devicetree/bindings/clock/uniphier-clock.txt | 16 ++++++++-------- drivers/clk/uniphier/clk-uniphier-core.c | 14 +++++++------- drivers/clk/uniphier/clk-uniphier-mio.c | 2 +- drivers/clk/uniphier/clk-uniphier.h | 2 +- 4 files changed, 17 insertions(+), 17 deletions(-) diff --git a/Documentation/devicetree/bindings/clock/uniphier-clock.txt b/Documentation/devicetree/bindings/clock/uniphier-clock.txt index c7179d3b5c33..812163060fa3 100644 --- a/Documentation/devicetree/bindings/clock/uniphier-clock.txt +++ b/Documentation/devicetree/bindings/clock/uniphier-clock.txt @@ -24,7 +24,7 @@ Example: reg = <0x61840000 0x4000>; clock { - compatible = "socionext,uniphier-ld20-clock"; + compatible = "socionext,uniphier-ld11-clock"; #clock-cells = <1>; }; @@ -43,8 +43,8 @@ Provided clocks: 21: USB3 ch1 PHY1 -Media I/O (MIO) clock ---------------------- +Media I/O (MIO) clock, SD clock +------------------------------- Required properties: - compatible: should be one of the following: @@ -52,10 +52,10 @@ Required properties: "socionext,uniphier-ld4-mio-clock" - for LD4 SoC. "socionext,uniphier-pro4-mio-clock" - for Pro4 SoC. "socionext,uniphier-sld8-mio-clock" - for sLD8 SoC. - "socionext,uniphier-pro5-mio-clock" - for Pro5 SoC. - "socionext,uniphier-pxs2-mio-clock" - for PXs2/LD6b SoC. + "socionext,uniphier-pro5-sd-clock" - for Pro5 SoC. + "socionext,uniphier-pxs2-sd-clock" - for PXs2/LD6b SoC. "socionext,uniphier-ld11-mio-clock" - for LD11 SoC. - "socionext,uniphier-ld20-mio-clock" - for LD20 SoC. + "socionext,uniphier-ld20-sd-clock" - for LD20 SoC. - #clock-cells: should be 1. Example: @@ -66,7 +66,7 @@ Example: reg = <0x59810000 0x800>; clock { - compatible = "socionext,uniphier-ld20-mio-clock"; + compatible = "socionext,uniphier-ld11-mio-clock"; #clock-cells = <1>; }; @@ -112,7 +112,7 @@ Example: reg = <0x59820000 0x200>; clock { - compatible = "socionext,uniphier-ld20-peri-clock"; + compatible = "socionext,uniphier-ld11-peri-clock"; #clock-cells = <1>; }; diff --git a/drivers/clk/uniphier/clk-uniphier-core.c b/drivers/clk/uniphier/clk-uniphier-core.c index 84bc465d31aa..26c53f7963a4 100644 --- a/drivers/clk/uniphier/clk-uniphier-core.c +++ b/drivers/clk/uniphier/clk-uniphier-core.c @@ -142,7 +142,7 @@ static const struct of_device_id uniphier_clk_match[] = { .compatible = "socionext,uniphier-ld20-clock", .data = uniphier_ld20_sys_clk_data, }, - /* Media I/O clock */ + /* Media I/O clock, SD clock */ { .compatible = "socionext,uniphier-sld3-mio-clock", .data = uniphier_sld3_mio_clk_data, @@ -160,20 +160,20 @@ static const struct of_device_id uniphier_clk_match[] = { .data = uniphier_sld3_mio_clk_data, }, { - .compatible = "socionext,uniphier-pro5-mio-clock", - .data = uniphier_pro5_mio_clk_data, + .compatible = "socionext,uniphier-pro5-sd-clock", + .data = uniphier_pro5_sd_clk_data, }, { - .compatible = "socionext,uniphier-pxs2-mio-clock", - .data = uniphier_pro5_mio_clk_data, + .compatible = "socionext,uniphier-pxs2-sd-clock", + .data = uniphier_pro5_sd_clk_data, }, { .compatible = "socionext,uniphier-ld11-mio-clock", .data = uniphier_sld3_mio_clk_data, }, { - .compatible = "socionext,uniphier-ld20-mio-clock", - .data = uniphier_pro5_mio_clk_data, + .compatible = "socionext,uniphier-ld20-sd-clock", + .data = uniphier_pro5_sd_clk_data, }, /* Peripheral clock */ { diff --git a/drivers/clk/uniphier/clk-uniphier-mio.c b/drivers/clk/uniphier/clk-uniphier-mio.c index 6aa7ec768d0b..218d20f099ce 100644 --- a/drivers/clk/uniphier/clk-uniphier-mio.c +++ b/drivers/clk/uniphier/clk-uniphier-mio.c @@ -93,7 +93,7 @@ const struct uniphier_clk_data uniphier_sld3_mio_clk_data[] = { { /* sentinel */ } }; -const struct uniphier_clk_data uniphier_pro5_mio_clk_data[] = { +const struct uniphier_clk_data uniphier_pro5_sd_clk_data[] = { UNIPHIER_MIO_CLK_SD_FIXED, UNIPHIER_MIO_CLK_SD(0, 0), UNIPHIER_MIO_CLK_SD(1, 1), diff --git a/drivers/clk/uniphier/clk-uniphier.h b/drivers/clk/uniphier/clk-uniphier.h index 3ae184062388..0244dba1f4cf 100644 --- a/drivers/clk/uniphier/clk-uniphier.h +++ b/drivers/clk/uniphier/clk-uniphier.h @@ -115,7 +115,7 @@ extern const struct uniphier_clk_data uniphier_pxs2_sys_clk_data[]; extern const struct uniphier_clk_data uniphier_ld11_sys_clk_data[]; extern const struct uniphier_clk_data uniphier_ld20_sys_clk_data[]; extern const struct uniphier_clk_data uniphier_sld3_mio_clk_data[]; -extern const struct uniphier_clk_data uniphier_pro5_mio_clk_data[]; +extern const struct uniphier_clk_data uniphier_pro5_sd_clk_data[]; extern const struct uniphier_clk_data uniphier_ld4_peri_clk_data[]; extern const struct uniphier_clk_data uniphier_pro4_peri_clk_data[]; From 1dec78585328db00e33fb18dc1a6deed0e2095a5 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Wed, 19 Oct 2016 14:38:50 -0700 Subject: [PATCH 071/292] ARC: fix build warning in elf.h The cast valid since TASK_SIZE * 2 will never actually cause overflow. | CC fs/binfmt_elf.o | In file included from ../include/linux/elf.h:4:0, | from ../include/linux/module.h:15, | from ../fs/binfmt_elf.c:12: | ../fs/binfmt_elf.c: In function load_elf_binar: | ../arch/arc/include/asm/elf.h:57:29: warning: integer overflow in expression [-Woverflow] | #define ELF_ET_DYN_BASE (2 * TASK_SIZE / 3) | ^ | ../fs/binfmt_elf.c:921:16: note: in expansion of macro ELF_ET_DYN_BASE | load_bias = ELF_ET_DYN_BASE - vaddr; Signed-off-by: Vineet Gupta --- arch/arc/include/asm/elf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arc/include/asm/elf.h b/arch/arc/include/asm/elf.h index 7096f97a1434..aa2d6da9d187 100644 --- a/arch/arc/include/asm/elf.h +++ b/arch/arc/include/asm/elf.h @@ -54,7 +54,7 @@ extern int elf_check_arch(const struct elf32_hdr *); * the loader. We need to make sure that it is out of the way of the program * that it will "exec", and that there is sufficient room for the brk. */ -#define ELF_ET_DYN_BASE (2 * TASK_SIZE / 3) +#define ELF_ET_DYN_BASE (2UL * TASK_SIZE / 3) /* * When the program starts, a1 contains a pointer to a function to be From 1d55a4bfd080ff4c6c96acfccfb7cdd2615ed6c2 Mon Sep 17 00:00:00 2001 From: Colin Ian King Date: Thu, 20 Oct 2016 15:40:55 +1100 Subject: [PATCH 072/292] xfs: remove redundant assignment of ifp Remove redundant ifp = ifp statement, it does nothing. Found with static analysis by CoverityScan. Signed-off-by: Colin Ian King Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_bmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index c27344cf38e1..0283b7eaf973 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -5204,7 +5204,7 @@ xfs_bunmapi_cow( ep = xfs_bmap_search_extents(ip, del->br_startoff, XFS_COW_FORK, &eof, &eidx, &got, &new); - ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); ifp = ifp; + ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); ASSERT((eidx >= 0) && (eidx < ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))); ASSERT(del->br_blockcount > 0); From 1be7f9be0efa4e90547f50b8188f4e70710a1173 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Thu, 20 Oct 2016 15:41:48 +1100 Subject: [PATCH 073/292] xfs: Fix uninitialized variable in xfs_reflink_reserve_cow_range() with gcc 4.1.2: fs/xfs/xfs_reflink.c: In function xfs_reflink_reserve_cow_range: fs/xfs/xfs_reflink.c:327: warning: error may be used uninitialized in this function Indeed, if "count" is zero, the function will return an uninitialized error value. While "count" is unlikely to be zero, this function is called through the public iomap API. Hence fix this by preinitializing error to zero. Fixes: 2a06705cd5954030 ("xfs: create delalloc extents in CoW fork") Signed-off-by: Geert Uytterhoeven Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner --- fs/xfs/xfs_reflink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 5965e9455d91..d48a7cc2fe00 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -324,7 +324,7 @@ xfs_reflink_reserve_cow_range( struct xfs_mount *mp = ip->i_mount; xfs_fileoff_t offset_fsb, end_fsb; bool skipped = false; - int error; + int error = 0; trace_xfs_reflink_reserve_cow_range(ip, offset, count); From f1b8243c55ca6fd2a3898e2f586b8cfcfff684bb Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Thu, 20 Oct 2016 15:42:30 +1100 Subject: [PATCH 074/292] xfs: add some 'static' annotations sparse reported that several variables and a function were not forward-declared anywhere and therefore should be 'static'. Found with sparse by running 'make C=2 CF=-D__CHECK_ENDIAN__ fs/xfs/' Signed-off-by: Eric Biggers Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_btree.c | 2 +- fs/xfs/xfs_sysfs.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 5c8e6f2ce44f..0e80993c8a59 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -4826,7 +4826,7 @@ xfs_btree_calc_size( return rval; } -int +static int xfs_btree_count_blocks_helper( struct xfs_btree_cur *cur, int level, diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c index 5f8d55d29a11..276d3023d60f 100644 --- a/fs/xfs/xfs_sysfs.c +++ b/fs/xfs/xfs_sysfs.c @@ -512,13 +512,13 @@ static struct attribute *xfs_error_attrs[] = { }; -struct kobj_type xfs_error_cfg_ktype = { +static struct kobj_type xfs_error_cfg_ktype = { .release = xfs_sysfs_release, .sysfs_ops = &xfs_sysfs_ops, .default_attrs = xfs_error_attrs, }; -struct kobj_type xfs_error_ktype = { +static struct kobj_type xfs_error_ktype = { .release = xfs_sysfs_release, .sysfs_ops = &xfs_sysfs_ops, }; From 0ee7a3f6b5b2f22bb69bfc6c60d0ea0777003098 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:44:14 +1100 Subject: [PATCH 075/292] xfs: don't take the IOLOCK exclusive for direct I/O page invalidation XFS historically took the iolock exclusive when invalidating pages before direct I/O operations to protect against writeback starvations. But this writeback starvation issues has been fixed a long time ago in the core writeback code, and all other file systems manage to do without the exclusive lock. Convert XFS over to avoid the exclusive lock in this case, and also move to range invalidations like done by the other file systems. Signed-off-by: Christoph Hellwig Reviewed-by: Jan Kara Reviewed-by: Carlos Maiolino Signed-off-by: Dave Chinner --- fs/xfs/xfs_file.c | 98 +++++++++++++++-------------------------------- 1 file changed, 30 insertions(+), 68 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index a314fc7b56fa..0dc9971d3c84 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -249,6 +249,7 @@ xfs_file_dio_aio_read( struct xfs_inode *ip = XFS_I(inode); loff_t isize = i_size_read(inode); size_t count = iov_iter_count(to); + loff_t end = iocb->ki_pos + count - 1; struct iov_iter data; struct xfs_buftarg *target; ssize_t ret = 0; @@ -272,49 +273,21 @@ xfs_file_dio_aio_read( file_accessed(iocb->ki_filp); - /* - * Locking is a bit tricky here. If we take an exclusive lock for direct - * IO, we effectively serialise all new concurrent read IO to this file - * and block it behind IO that is currently in progress because IO in - * progress holds the IO lock shared. We only need to hold the lock - * exclusive to blow away the page cache, so only take lock exclusively - * if the page cache needs invalidation. This allows the normal direct - * IO case of no page cache pages to proceeed concurrently without - * serialisation. - */ xfs_rw_ilock(ip, XFS_IOLOCK_SHARED); if (mapping->nrpages) { - xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); - xfs_rw_ilock(ip, XFS_IOLOCK_EXCL); + ret = filemap_write_and_wait_range(mapping, iocb->ki_pos, end); + if (ret) + goto out_unlock; /* - * The generic dio code only flushes the range of the particular - * I/O. Because we take an exclusive lock here, this whole - * sequence is considerably more expensive for us. This has a - * noticeable performance impact for any file with cached pages, - * even when outside of the range of the particular I/O. - * - * Hence, amortize the cost of the lock against a full file - * flush and reduce the chances of repeated iolock cycles going - * forward. + * Invalidate whole pages. This can return an error if we fail + * to invalidate a page, but this should never happen on XFS. + * Warn if it does fail. */ - if (mapping->nrpages) { - ret = filemap_write_and_wait(mapping); - if (ret) { - xfs_rw_iunlock(ip, XFS_IOLOCK_EXCL); - return ret; - } - - /* - * Invalidate whole pages. This can return an error if - * we fail to invalidate a page, but this should never - * happen on XFS. Warn if it does fail. - */ - ret = invalidate_inode_pages2(mapping); - WARN_ON_ONCE(ret); - ret = 0; - } - xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL); + ret = invalidate_inode_pages2_range(mapping, + iocb->ki_pos >> PAGE_SHIFT, end >> PAGE_SHIFT); + WARN_ON_ONCE(ret); + ret = 0; } data = *to; @@ -324,8 +297,9 @@ xfs_file_dio_aio_read( iocb->ki_pos += ret; iov_iter_advance(to, ret); } - xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); +out_unlock: + xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED); return ret; } @@ -570,61 +544,49 @@ xfs_file_dio_aio_write( if ((iocb->ki_pos | count) & target->bt_logical_sectormask) return -EINVAL; - /* "unaligned" here means not aligned to a filesystem block */ + /* + * Don't take the exclusive iolock here unless the I/O is unaligned to + * the file system block size. We don't need to consider the EOF + * extension case here because xfs_file_aio_write_checks() will relock + * the inode as necessary for EOF zeroing cases and fill out the new + * inode size as appropriate. + */ if ((iocb->ki_pos & mp->m_blockmask) || - ((iocb->ki_pos + count) & mp->m_blockmask)) + ((iocb->ki_pos + count) & mp->m_blockmask)) { unaligned_io = 1; - - /* - * We don't need to take an exclusive lock unless there page cache needs - * to be invalidated or unaligned IO is being executed. We don't need to - * consider the EOF extension case here because - * xfs_file_aio_write_checks() will relock the inode as necessary for - * EOF zeroing cases and fill out the new inode size as appropriate. - */ - if (unaligned_io || mapping->nrpages) iolock = XFS_IOLOCK_EXCL; - else + } else { iolock = XFS_IOLOCK_SHARED; - xfs_rw_ilock(ip, iolock); - - /* - * Recheck if there are cached pages that need invalidate after we got - * the iolock to protect against other threads adding new pages while - * we were waiting for the iolock. - */ - if (mapping->nrpages && iolock == XFS_IOLOCK_SHARED) { - xfs_rw_iunlock(ip, iolock); - iolock = XFS_IOLOCK_EXCL; - xfs_rw_ilock(ip, iolock); } + xfs_rw_ilock(ip, iolock); + ret = xfs_file_aio_write_checks(iocb, from, &iolock); if (ret) goto out; count = iov_iter_count(from); end = iocb->ki_pos + count - 1; - /* - * See xfs_file_dio_aio_read() for why we do a full-file flush here. - */ if (mapping->nrpages) { - ret = filemap_write_and_wait(VFS_I(ip)->i_mapping); + ret = filemap_write_and_wait_range(mapping, iocb->ki_pos, end); if (ret) goto out; + /* * Invalidate whole pages. This can return an error if we fail * to invalidate a page, but this should never happen on XFS. * Warn if it does fail. */ - ret = invalidate_inode_pages2(VFS_I(ip)->i_mapping); + ret = invalidate_inode_pages2_range(mapping, + iocb->ki_pos >> PAGE_SHIFT, end >> PAGE_SHIFT); WARN_ON_ONCE(ret); ret = 0; } /* * If we are doing unaligned IO, wait for all other IO to drain, - * otherwise demote the lock if we had to flush cached pages + * otherwise demote the lock if we had to take the exclusive lock + * for other reasons in xfs_file_aio_write_checks. */ if (unaligned_io) inode_dio_wait(inode); From fe23759eaf2f6540de20c1623f066aad967ff9c9 Mon Sep 17 00:00:00 2001 From: Eric Sandeen Date: Thu, 20 Oct 2016 15:44:53 +1100 Subject: [PATCH 076/292] xfs: remove pointless error goto in xfs_bmap_remap_alloc The commit: f65306ea xfs: map an inode's offset to an exact physical block added a pointless error0: target; remove it. Addresses-Coverity-Id: 1373865 Signed-off-by: Eric Sandeen Reviewed-by: Bill O'Donnell Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_bmap.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 0283b7eaf973..80bdb11ca6bf 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -3974,9 +3974,6 @@ xfs_bmap_remap_alloc( * allocating, so skip that check by pretending to be freeing. */ error = xfs_alloc_fix_freelist(&args, XFS_ALLOC_FLAG_FREEING); - if (error) - goto error0; -error0: xfs_perag_put(args.pag); if (error) trace_xfs_bmap_remap_alloc_error(ap->ip, error, _RET_IP_); From d099245297e28fc9f8493edd9d5a1f0967a72511 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Thu, 20 Oct 2016 15:45:40 +1100 Subject: [PATCH 077/292] xfs: unset MS_ACTIVE if mount fails As part of the inode block map intent log item recovery process, we had to set the IRECOVERY flag to prevent an unlinked inode from being truncated during the first iput call. This required us to set MS_ACTIVE so that iput puts the inode on the lru instead of immediately evicting the inode. Unfortunately, if the mount fails later on, the inodes that have been loaded (root dir and realtime) actually need to be evicted since we're aborting the mount. If we don't clear MS_ACTIVE in the failure step, those inodes are not evicted and therefore leak. The leak was found by running xfs/130 and rmmoding xfs immediately after the test. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner --- fs/xfs/xfs_mount.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index fc7873942bea..b341f10cf481 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -1009,6 +1009,7 @@ xfs_mountfs( out_quota: xfs_qm_unmount_quotas(mp); out_rtunmount: + mp->m_super->s_flags &= ~MS_ACTIVE; xfs_rtunmount_inodes(mp); out_rele_rip: IRELE(rip); From 58d789678546d46d7bbd809dd7dab417c0f23655 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Thu, 20 Oct 2016 15:46:18 +1100 Subject: [PATCH 078/292] libxfs: clean up _calc_dquots_per_chunk The function xfs_calc_dquots_per_chunk takes a parameter in units of basic blocks. The kernel seems to get the units wrong, but userspace got 'fixed' by commenting out the unnecessary conversion. Fix both. cc: Signed-off-by: Darrick J. Wong Reviewed-by: Eric Sandeen Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_dquot_buf.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dquot_buf.c b/fs/xfs/libxfs/xfs_dquot_buf.c index 3cc3cf767474..ac9a003dd29a 100644 --- a/fs/xfs/libxfs/xfs_dquot_buf.c +++ b/fs/xfs/libxfs/xfs_dquot_buf.c @@ -191,8 +191,7 @@ xfs_dquot_buf_verify_crc( if (mp->m_quotainfo) ndquots = mp->m_quotainfo->qi_dqperchunk; else - ndquots = xfs_calc_dquots_per_chunk( - XFS_BB_TO_FSB(mp, bp->b_length)); + ndquots = xfs_calc_dquots_per_chunk(bp->b_length); for (i = 0; i < ndquots; i++, d++) { if (!xfs_verify_cksum((char *)d, sizeof(struct xfs_dqblk), From 8cdcc8102c0cfad20513ed1bfb96e0e9963928a8 Mon Sep 17 00:00:00 2001 From: Roger Willcocks Date: Thu, 20 Oct 2016 15:48:38 +1100 Subject: [PATCH 079/292] libxfs: v3 inodes are only valid on crc-enabled filesystems xfs_repair was not detecting that version 3 inodes are invalid for for non-CRC filesystems. The result is specific inode corruptions go undetected and hence aren't repaired if only the version number is out of range. The core of the problem is that the XFS_DINODE_GOOD_VERSION() macro doesn't know that valid inode versions are dependent on a superblock version number. Fix this in libxfs, and propagate the new function out into the rest of xfsprogs to fix the issue. [Darrick: port to kernel from xfsprogs] Reported-by: Leslie Rhorer Signed-off-by: Roger Willcocks Reviewed-by: Dave Chinner Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_format.h | 1 - fs/xfs/libxfs/xfs_inode_buf.c | 13 ++++++++++++- fs/xfs/libxfs/xfs_inode_buf.h | 2 ++ 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index f6547fc5e016..6b7579e7b60a 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -865,7 +865,6 @@ typedef struct xfs_timestamp { * padding field for v3 inodes. */ #define XFS_DINODE_MAGIC 0x494e /* 'IN' */ -#define XFS_DINODE_GOOD_VERSION(v) ((v) >= 1 && (v) <= 3) typedef struct xfs_dinode { __be16 di_magic; /* inode magic # = XFS_DINODE_MAGIC */ __be16 di_mode; /* mode and type of file */ diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 8de9a3a29589..134424fac434 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -57,6 +57,17 @@ xfs_inobp_check( } #endif +bool +xfs_dinode_good_version( + struct xfs_mount *mp, + __u8 version) +{ + if (xfs_sb_version_hascrc(&mp->m_sb)) + return version == 3; + + return version == 1 || version == 2; +} + /* * If we are doing readahead on an inode buffer, we might be in log recovery * reading an inode allocation buffer that hasn't yet been replayed, and hence @@ -91,7 +102,7 @@ xfs_inode_buf_verify( dip = xfs_buf_offset(bp, (i << mp->m_sb.sb_inodelog)); di_ok = dip->di_magic == cpu_to_be16(XFS_DINODE_MAGIC) && - XFS_DINODE_GOOD_VERSION(dip->di_version); + xfs_dinode_good_version(mp, dip->di_version); if (unlikely(XFS_TEST_ERROR(!di_ok, mp, XFS_ERRTAG_ITOBP_INOTOBP, XFS_RANDOM_ITOBP_INOTOBP))) { diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h index 62d9d4681c8c..3cfe12a4f58a 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.h +++ b/fs/xfs/libxfs/xfs_inode_buf.h @@ -74,6 +74,8 @@ void xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from); void xfs_log_dinode_to_disk(struct xfs_log_dinode *from, struct xfs_dinode *to); +bool xfs_dinode_good_version(struct xfs_mount *mp, __u8 version); + #if defined(DEBUG) void xfs_inobp_check(struct xfs_mount *, struct xfs_buf *); #else From 4fbc2c65255f77b315a4ee3ccac397d677a35737 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:48:54 +1100 Subject: [PATCH 080/292] xfs: remove the same fs check from xfs_file_share_range The VFS already does the check, and the placement of this duplicate is in the way of the following locking rework. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_file.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 0dc9971d3c84..194f8f396e4d 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -965,9 +965,6 @@ xfs_file_share_range( IS_SWAPFILE(inode_out)) return -ETXTBSY; - /* Reflink only works within this filesystem. */ - if (inode_in->i_sb != inode_out->i_sb) - return -EXDEV; same_inode = (inode_in->i_ino == inode_out->i_ino); /* Don't reflink dirs, pipes, sockets... */ From a62e82b35b97e60e9e22a4e303900f342139822f Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:49:03 +1100 Subject: [PATCH 081/292] xfs: fix the same_inode check in xfs_file_share_range The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit wide, so checking i_ino for equality could lead to rate false positives on 32-bit architectures. Just compare the inode pointers themselves to be safe. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 194f8f396e4d..d5b835e82b2d 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -965,7 +965,7 @@ xfs_file_share_range( IS_SWAPFILE(inode_out)) return -ETXTBSY; - same_inode = (inode_in->i_ino == inode_out->i_ino); + same_inode = (inode_in == inode_out); /* Don't reflink dirs, pipes, sockets... */ if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) From 576177818e6f1e65f6109ed4a8fae8b60131c861 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:49:19 +1100 Subject: [PATCH 082/292] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE values, as well as preventing new buffered writers re-dirtying the file data that we just wrote out. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_file.c | 64 +++++++++++++++++++++++++++++--------------- fs/xfs/xfs_reflink.c | 15 ----------- 2 files changed, 42 insertions(+), 37 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index d5b835e82b2d..663761edd778 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -958,38 +958,54 @@ xfs_file_share_range( inode_out = file_inode(file_out); bs = inode_out->i_sb->s_blocksize; - /* Don't touch certain kinds of inodes */ - if (IS_IMMUTABLE(inode_out)) - return -EPERM; - if (IS_SWAPFILE(inode_in) || - IS_SWAPFILE(inode_out)) - return -ETXTBSY; - + /* Lock both files against IO */ same_inode = (inode_in == inode_out); + if (same_inode) { + xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); + xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); + } else { + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), + XFS_IOLOCK_EXCL); + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), + XFS_MMAPLOCK_EXCL); + } + + /* Don't touch certain kinds of inodes */ + ret = -EPERM; + if (IS_IMMUTABLE(inode_out)) + goto out_unlock; + ret = -ETXTBSY; + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) + goto out_unlock; /* Don't reflink dirs, pipes, sockets... */ + ret = -EISDIR; if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) - return -EISDIR; + goto out_unlock; + ret = -EINVAL; if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) - return -EINVAL; + goto out_unlock; if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) - return -EINVAL; + goto out_unlock; /* Don't share DAX file data for now. */ if (IS_DAX(inode_in) || IS_DAX(inode_out)) - return -EINVAL; + goto out_unlock; /* Are we going all the way to the end? */ isize = i_size_read(inode_in); - if (isize == 0) - return 0; + if (isize == 0) { + ret = 0; + goto out_unlock; + } + if (len == 0) len = isize - pos_in; /* Ensure offsets don't wrap and the input is inside i_size */ if (pos_in + len < pos_in || pos_out + len < pos_out || pos_in + len > isize) - return -EINVAL; + goto out_unlock; /* Don't allow dedupe past EOF in the dest file */ if (is_dedupe) { @@ -997,7 +1013,7 @@ xfs_file_share_range( disize = i_size_read(inode_out); if (pos_out >= disize || pos_out + len > disize) - return -EINVAL; + goto out_unlock; } /* If we're linking to EOF, continue to the block boundary. */ @@ -1009,28 +1025,32 @@ xfs_file_share_range( /* Only reflink if we're aligned to block boundaries */ if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - return -EINVAL; + goto out_unlock; /* Don't allow overlapped reflink within the same file */ if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) - return -EINVAL; + goto out_unlock; /* Wait for the completion of any pending IOs on srcfile */ ret = xfs_file_wait_for_io(inode_in, pos_in, len); if (ret) - goto out; + goto out_unlock; ret = xfs_file_wait_for_io(inode_out, pos_out, len); if (ret) - goto out; + goto out_unlock; if (is_dedupe) flags |= XFS_REFLINK_DEDUPE; ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), pos_out, len, flags); - if (ret < 0) - goto out; -out: +out_unlock: + xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); + xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); + if (!same_inode) { + xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); + xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); + } return ret; } diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index d48a7cc2fe00..3b1c1a6bb5da 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1341,15 +1341,6 @@ xfs_reflink_remap_range( trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); - /* Lock both files against IO */ - if (src->i_ino == dest->i_ino) { - xfs_ilock(src, XFS_IOLOCK_EXCL); - xfs_ilock(src, XFS_MMAPLOCK_EXCL); - } else { - xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); - xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); - } - /* * Check that the extents are the same. */ @@ -1401,12 +1392,6 @@ xfs_reflink_remap_range( goto out_error; out_error: - xfs_iunlock(src, XFS_MMAPLOCK_EXCL); - xfs_iunlock(src, XFS_IOLOCK_EXCL); - if (src->i_ino != dest->i_ino) { - xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); - xfs_iunlock(dest, XFS_IOLOCK_EXCL); - } if (error) trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); return error; From ec40759902556f21f37641ad9f19d02c4dd4b555 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:49:55 +1100 Subject: [PATCH 083/292] xfs: remove xfs_file_wait_for_io filemap_write_and_wait_range operates on full pages, so there is no need for the rounding operations. Additionally this allows us to micro-optimize by skipping the second inode_dio_wait for a intra-file clone. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_file.c | 39 ++++++++++----------------------------- 1 file changed, 10 insertions(+), 29 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 663761edd778..93729752bccb 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -909,32 +909,6 @@ out_unlock: return error; } -/* - * Flush all file writes out to disk. - */ -static int -xfs_file_wait_for_io( - struct inode *inode, - loff_t offset, - size_t len) -{ - loff_t rounding; - loff_t ioffset; - loff_t iendoffset; - loff_t bs; - int ret; - - bs = inode->i_sb->s_blocksize; - inode_dio_wait(inode); - - rounding = max_t(xfs_off_t, bs, PAGE_SIZE); - ioffset = round_down(offset, rounding); - iendoffset = round_up(offset + len, rounding) - 1; - ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, - iendoffset); - return ret; -} - /* Hook up to the VFS reflink function */ STATIC int xfs_file_share_range( @@ -1031,11 +1005,18 @@ xfs_file_share_range( if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) goto out_unlock; - /* Wait for the completion of any pending IOs on srcfile */ - ret = xfs_file_wait_for_io(inode_in, pos_in, len); + /* Wait for the completion of any pending IOs on both files */ + inode_dio_wait(inode_in); + if (!same_inode) + inode_dio_wait(inode_out); + + ret = filemap_write_and_wait_range(inode_in->i_mapping, + pos_in, pos_in + len - 1); if (ret) goto out_unlock; - ret = xfs_file_wait_for_io(inode_out, pos_out, len); + + ret = filemap_write_and_wait_range(inode_out->i_mapping, + pos_out, pos_out + len - 1); if (ret) goto out_unlock; From 5faaf4fa0a20d38edc4df57baf24ea35b7e91178 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:50:07 +1100 Subject: [PATCH 084/292] xfs: merge xfs_reflink_remap_range and xfs_file_share_range There is no clear division of responsibility between those functions, so just merge them into one to keep the code simple. Also move xfs_file_wait_for_io to xfs_reflink.c together with its only caller. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_file.c | 132 +------------------------------- fs/xfs/xfs_reflink.c | 178 +++++++++++++++++++++++++++++++++---------- fs/xfs/xfs_reflink.h | 7 +- 3 files changed, 143 insertions(+), 174 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 93729752bccb..6e4f7f900fea 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -909,132 +909,6 @@ out_unlock: return error; } -/* Hook up to the VFS reflink function */ -STATIC int -xfs_file_share_range( - struct file *file_in, - loff_t pos_in, - struct file *file_out, - loff_t pos_out, - u64 len, - bool is_dedupe) -{ - struct inode *inode_in; - struct inode *inode_out; - ssize_t ret; - loff_t bs; - loff_t isize; - int same_inode; - loff_t blen; - unsigned int flags = 0; - - inode_in = file_inode(file_in); - inode_out = file_inode(file_out); - bs = inode_out->i_sb->s_blocksize; - - /* Lock both files against IO */ - same_inode = (inode_in == inode_out); - if (same_inode) { - xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); - xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); - } else { - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), - XFS_IOLOCK_EXCL); - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), - XFS_MMAPLOCK_EXCL); - } - - /* Don't touch certain kinds of inodes */ - ret = -EPERM; - if (IS_IMMUTABLE(inode_out)) - goto out_unlock; - ret = -ETXTBSY; - if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) - goto out_unlock; - - /* Don't reflink dirs, pipes, sockets... */ - ret = -EISDIR; - if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) - goto out_unlock; - ret = -EINVAL; - if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) - goto out_unlock; - if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) - goto out_unlock; - - /* Don't share DAX file data for now. */ - if (IS_DAX(inode_in) || IS_DAX(inode_out)) - goto out_unlock; - - /* Are we going all the way to the end? */ - isize = i_size_read(inode_in); - if (isize == 0) { - ret = 0; - goto out_unlock; - } - - if (len == 0) - len = isize - pos_in; - - /* Ensure offsets don't wrap and the input is inside i_size */ - if (pos_in + len < pos_in || pos_out + len < pos_out || - pos_in + len > isize) - goto out_unlock; - - /* Don't allow dedupe past EOF in the dest file */ - if (is_dedupe) { - loff_t disize; - - disize = i_size_read(inode_out); - if (pos_out >= disize || pos_out + len > disize) - goto out_unlock; - } - - /* If we're linking to EOF, continue to the block boundary. */ - if (pos_in + len == isize) - blen = ALIGN(isize, bs) - pos_in; - else - blen = len; - - /* Only reflink if we're aligned to block boundaries */ - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - goto out_unlock; - - /* Don't allow overlapped reflink within the same file */ - if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) - goto out_unlock; - - /* Wait for the completion of any pending IOs on both files */ - inode_dio_wait(inode_in); - if (!same_inode) - inode_dio_wait(inode_out); - - ret = filemap_write_and_wait_range(inode_in->i_mapping, - pos_in, pos_in + len - 1); - if (ret) - goto out_unlock; - - ret = filemap_write_and_wait_range(inode_out->i_mapping, - pos_out, pos_out + len - 1); - if (ret) - goto out_unlock; - - if (is_dedupe) - flags |= XFS_REFLINK_DEDUPE; - ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), - pos_out, len, flags); - -out_unlock: - xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); - xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); - if (!same_inode) { - xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); - xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); - } - return ret; -} - STATIC ssize_t xfs_file_copy_range( struct file *file_in, @@ -1046,7 +920,7 @@ xfs_file_copy_range( { int error; - error = xfs_file_share_range(file_in, pos_in, file_out, pos_out, + error = xfs_reflink_remap_range(file_in, pos_in, file_out, pos_out, len, false); if (error) return error; @@ -1061,7 +935,7 @@ xfs_file_clone_range( loff_t pos_out, u64 len) { - return xfs_file_share_range(file_in, pos_in, file_out, pos_out, + return xfs_reflink_remap_range(file_in, pos_in, file_out, pos_out, len, false); } @@ -1084,7 +958,7 @@ xfs_file_dedupe_range( if (len > XFS_MAX_DEDUPE_LEN) len = XFS_MAX_DEDUPE_LEN; - error = xfs_file_share_range(src_file, loff, dst_file, dst_loff, + error = xfs_reflink_remap_range(src_file, loff, dst_file, dst_loff, len, true); if (error) return error; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 3b1c1a6bb5da..6592daa833a4 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1312,19 +1312,26 @@ out_error: */ int xfs_reflink_remap_range( - struct xfs_inode *src, - xfs_off_t srcoff, - struct xfs_inode *dest, - xfs_off_t destoff, - xfs_off_t len, - unsigned int flags) + struct file *file_in, + loff_t pos_in, + struct file *file_out, + loff_t pos_out, + u64 len, + bool is_dedupe) { + struct inode *inode_in = file_inode(file_in); + struct xfs_inode *src = XFS_I(inode_in); + struct inode *inode_out = file_inode(file_out); + struct xfs_inode *dest = XFS_I(inode_out); struct xfs_mount *mp = src->i_mount; + loff_t bs = inode_out->i_sb->s_blocksize; + bool same_inode = (inode_in == inode_out); xfs_fileoff_t sfsbno, dfsbno; xfs_filblks_t fsblen; - int error; xfs_extlen_t cowextsize; - bool is_same; + loff_t isize; + ssize_t ret; + loff_t blen; if (!xfs_sb_version_hasreflink(&mp->m_sb)) return -EOPNOTSUPP; @@ -1332,48 +1339,135 @@ xfs_reflink_remap_range( if (XFS_FORCED_SHUTDOWN(mp)) return -EIO; + /* Lock both files against IO */ + if (same_inode) { + xfs_ilock(src, XFS_IOLOCK_EXCL); + xfs_ilock(src, XFS_MMAPLOCK_EXCL); + } else { + xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); + xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); + } + + /* Don't touch certain kinds of inodes */ + ret = -EPERM; + if (IS_IMMUTABLE(inode_out)) + goto out_unlock; + + ret = -ETXTBSY; + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) + goto out_unlock; + + + /* Don't reflink dirs, pipes, sockets... */ + ret = -EISDIR; + if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) + goto out_unlock; + ret = -EINVAL; + if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) + goto out_unlock; + if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) + goto out_unlock; + /* Don't reflink realtime inodes */ if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) - return -EINVAL; + goto out_unlock; - if (flags & ~XFS_REFLINK_ALL) - return -EINVAL; + /* Don't share DAX file data for now. */ + if (IS_DAX(inode_in) || IS_DAX(inode_out)) + goto out_unlock; - trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); + /* Are we going all the way to the end? */ + isize = i_size_read(inode_in); + if (isize == 0) { + ret = 0; + goto out_unlock; + } + + if (len == 0) + len = isize - pos_in; + + /* Ensure offsets don't wrap and the input is inside i_size */ + if (pos_in + len < pos_in || pos_out + len < pos_out || + pos_in + len > isize) + goto out_unlock; + + /* Don't allow dedupe past EOF in the dest file */ + if (is_dedupe) { + loff_t disize; + + disize = i_size_read(inode_out); + if (pos_out >= disize || pos_out + len > disize) + goto out_unlock; + } + + /* If we're linking to EOF, continue to the block boundary. */ + if (pos_in + len == isize) + blen = ALIGN(isize, bs) - pos_in; + else + blen = len; + + /* Only reflink if we're aligned to block boundaries */ + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || + !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) + goto out_unlock; + + /* Don't allow overlapped reflink within the same file */ + if (same_inode) { + if (pos_out + blen > pos_in && pos_out < pos_in + blen) + goto out_unlock; + } + + /* Wait for the completion of any pending IOs on both files */ + inode_dio_wait(inode_in); + if (!same_inode) + inode_dio_wait(inode_out); + + ret = filemap_write_and_wait_range(inode_in->i_mapping, + pos_in, pos_in + len - 1); + if (ret) + goto out_unlock; + + ret = filemap_write_and_wait_range(inode_out->i_mapping, + pos_out, pos_out + len - 1); + if (ret) + goto out_unlock; + + trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out); /* * Check that the extents are the same. */ - if (flags & XFS_REFLINK_DEDUPE) { - is_same = false; - error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest), - destoff, len, &is_same); - if (error) - goto out_error; + if (is_dedupe) { + bool is_same = false; + + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, + len, &is_same); + if (ret) + goto out_unlock; if (!is_same) { - error = -EBADE; - goto out_error; + ret = -EBADE; + goto out_unlock; } } - error = xfs_reflink_set_inode_flag(src, dest); - if (error) - goto out_error; + ret = xfs_reflink_set_inode_flag(src, dest); + if (ret) + goto out_unlock; /* * Invalidate the page cache so that we can clear any CoW mappings * in the destination file. */ - truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff, - PAGE_ALIGN(destoff + len) - 1); + truncate_inode_pages_range(&inode_out->i_data, pos_out, + PAGE_ALIGN(pos_out + len) - 1); - dfsbno = XFS_B_TO_FSBT(mp, destoff); - sfsbno = XFS_B_TO_FSBT(mp, srcoff); + dfsbno = XFS_B_TO_FSBT(mp, pos_out); + sfsbno = XFS_B_TO_FSBT(mp, pos_in); fsblen = XFS_B_TO_FSB(mp, len); - error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, - destoff + len); - if (error) - goto out_error; + ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, + pos_out + len); + if (ret) + goto out_unlock; /* * Carry the cowextsize hint from src to dest if we're sharing the @@ -1381,20 +1475,24 @@ xfs_reflink_remap_range( * has a cowextsize hint, and the destination file does not. */ cowextsize = 0; - if (srcoff == 0 && len == i_size_read(VFS_I(src)) && + if (pos_in == 0 && len == i_size_read(inode_in) && (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) && - destoff == 0 && len >= i_size_read(VFS_I(dest)) && + pos_out == 0 && len >= i_size_read(inode_out) && !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE)) cowextsize = src->i_d.di_cowextsize; - error = xfs_reflink_update_dest(dest, destoff + len, cowextsize); - if (error) - goto out_error; + ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize); -out_error: - if (error) - trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); - return error; +out_unlock: + xfs_iunlock(src, XFS_MMAPLOCK_EXCL); + xfs_iunlock(src, XFS_IOLOCK_EXCL); + if (src->i_ino != dest->i_ino) { + xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); + xfs_iunlock(dest, XFS_IOLOCK_EXCL); + } + if (ret) + trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); + return ret; } /* diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 5dc3c8ac12aa..7ddd9f69560d 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -43,11 +43,8 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset, extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset, xfs_off_t count); extern int xfs_reflink_recover_cow(struct xfs_mount *mp); -#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */ -#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE) -extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff, - struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len, - unsigned int flags); +extern int xfs_reflink_remap_range(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe); extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip, struct xfs_trans **tpp); extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, From d33fd776f992332be110f6ceca6dad60cb5d513f Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:51:28 +1100 Subject: [PATCH 085/292] iomap: add IOMAP_REPORT This allows the file system to tell a FIEMAP from a read operation, and thus avoids the need to report flags that aren't actually used in the read path. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/iomap.c | 2 +- include/linux/iomap.h | 17 +++++++++++------ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/fs/iomap.c b/fs/iomap.c index 013d1d36fbbf..a92204012e2d 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -561,7 +561,7 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi, } while (len > 0) { - ret = iomap_apply(inode, start, len, 0, ops, &ctx, + ret = iomap_apply(inode, start, len, IOMAP_REPORT, ops, &ctx, iomap_fiemap_actor); /* inode with no (attribute) mapping will give ENOENT */ if (ret == -ENOENT) diff --git a/include/linux/iomap.h b/include/linux/iomap.h index e63e288dee83..7892f55a1866 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -19,11 +19,15 @@ struct vm_fault; #define IOMAP_UNWRITTEN 0x04 /* blocks allocated @blkno in unwritten state */ /* - * Flags for iomap mappings: + * Flags for all iomap mappings: */ -#define IOMAP_F_MERGED 0x01 /* contains multiple blocks/extents */ -#define IOMAP_F_SHARED 0x02 /* block shared with another file */ -#define IOMAP_F_NEW 0x04 /* blocks have been newly allocated */ +#define IOMAP_F_NEW 0x01 /* blocks have been newly allocated */ + +/* + * Flags that only need to be reported for IOMAP_REPORT requests: + */ +#define IOMAP_F_MERGED 0x10 /* contains multiple blocks/extents */ +#define IOMAP_F_SHARED 0x20 /* block shared with another file */ /* * Magic value for blkno: @@ -42,8 +46,9 @@ struct iomap { /* * Flags for iomap_begin / iomap_end. No flag implies a read. */ -#define IOMAP_WRITE (1 << 0) -#define IOMAP_ZERO (1 << 1) +#define IOMAP_WRITE (1 << 0) /* writing, must allocate blocks */ +#define IOMAP_ZERO (1 << 1) /* zeroing operation, may skip holes */ +#define IOMAP_REPORT (1 << 2) /* report extent status, e.g. FIEMAP */ struct iomap_ops { /* From 0a0af28cad9a43d90f13c2047bd8ee3d4cffb7f3 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Thu, 20 Oct 2016 15:51:50 +1100 Subject: [PATCH 086/292] xfs: add xfs_trim_extent This helpers allows to trim an extent to a subset of it's original range while making sure the block numbers in it remain valid, In the future xfs_trim_extent and xfs_bmapi_trim_map should probably be merged in some form. Signed-off-by: Darrick J. Wong [hch: split from a previous patch from Darrick, moved around and added support for "raw" delayed extents"] Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_bmap.c | 33 +++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_bmap.h | 2 ++ 2 files changed, 35 insertions(+) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 80bdb11ca6bf..381e7659598c 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -3996,6 +3996,39 @@ xfs_bmap_alloc( return xfs_bmap_btalloc(ap); } +/* Trim extent to fit a logical block range. */ +void +xfs_trim_extent( + struct xfs_bmbt_irec *irec, + xfs_fileoff_t bno, + xfs_filblks_t len) +{ + xfs_fileoff_t distance; + xfs_fileoff_t end = bno + len; + + if (irec->br_startoff + irec->br_blockcount <= bno || + irec->br_startoff >= end) { + irec->br_blockcount = 0; + return; + } + + if (irec->br_startoff < bno) { + distance = bno - irec->br_startoff; + if (isnullstartblock(irec->br_startblock)) + irec->br_startblock = DELAYSTARTBLOCK; + if (irec->br_startblock != DELAYSTARTBLOCK && + irec->br_startblock != HOLESTARTBLOCK) + irec->br_startblock += distance; + irec->br_startoff += distance; + irec->br_blockcount -= distance; + } + + if (end < irec->br_startoff + irec->br_blockcount) { + distance = irec->br_startoff + irec->br_blockcount - end; + irec->br_blockcount -= distance; + } +} + /* * Trim the returned map to the required bounds */ diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index f97db7132564..eb86af0c988b 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -190,6 +190,8 @@ void xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt, #define XFS_BMAP_TRACE_EXLIST(ip,c,w) #endif +void xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno, + xfs_filblks_t len); int xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd); void xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork); void xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops, From 62c5ac89de7d5eecdbaa94ca3d554b0bd41578b1 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:52:00 +1100 Subject: [PATCH 087/292] xfs: handle "raw" delayed extents xfs_reflink_trim_around_shared Delalloc extents in the extent list contain the number of reserved indirect blocks in their startblock value and don't use the magic DELAYSTARTBLOCK constant. Ensure that xfs_reflink_trim_around_shared handles them properly by checking for isnullstartblock(). Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/xfs_reflink.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 6592daa833a4..6c4c215634ec 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -182,7 +182,8 @@ xfs_reflink_trim_around_shared( if (!xfs_is_reflink_inode(ip) || ISUNWRITTEN(irec) || irec->br_startblock == HOLESTARTBLOCK || - irec->br_startblock == DELAYSTARTBLOCK) { + irec->br_startblock == DELAYSTARTBLOCK || + isnullstartblock(irec->br_startblock)) { *shared = false; return 0; } From 5f9268ca53aca992106d74edde3e7cf6c1be60a0 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:53:32 +1100 Subject: [PATCH 088/292] xfs: don't bother looking at the refcount tree for reads There is no need to trim an extent into a shared or non-shared one, or report any flags for plain old reads. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/xfs_iomap.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index d907eb9f8ef3..1dabf2eb136a 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -996,11 +996,14 @@ xfs_file_iomap_begin( return error; } - /* Trim the mapping to the nearest shared extent boundary. */ - error = xfs_reflink_trim_around_shared(ip, &imap, &shared, &trimmed); - if (error) { - xfs_iunlock(ip, lockmode); - return error; + if (flags & (IOMAP_WRITE | IOMAP_ZERO | IOMAP_REPORT)) { + /* Trim the mapping to the nearest shared extent boundary. */ + error = xfs_reflink_trim_around_shared(ip, &imap, &shared, + &trimmed); + if (error) { + xfs_iunlock(ip, lockmode); + return error; + } } if ((flags & IOMAP_WRITE) && imap_needs_alloc(inode, &imap, nimaps)) { From 3ba020befef030aaabbd5eb82a09f6ddf02a9542 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:53:50 +1100 Subject: [PATCH 089/292] xfs: optimize writes to reflink files Instead of reserving space as the first thing in write_begin move it past reading the extent in the data fork. That way we only have to read from the data fork once and can reuse that information for trimming the extent to the shared/unshared boundary. Additionally this allows to easily limit the actual write size to said boundary, and avoid a roundtrip on the ilock. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/xfs_iomap.c | 56 +++++++++++------ fs/xfs/xfs_reflink.c | 142 ++++++++++++++++++------------------------- fs/xfs/xfs_reflink.h | 4 +- fs/xfs/xfs_trace.h | 3 +- 4 files changed, 100 insertions(+), 105 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 1dabf2eb136a..436e109bb01e 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -566,6 +566,17 @@ xfs_file_iomap_begin_delay( xfs_bmap_search_extents(ip, offset_fsb, XFS_DATA_FORK, &eof, &idx, &got, &prev); if (!eof && got.br_startoff <= offset_fsb) { + if (xfs_is_reflink_inode(ip)) { + bool shared; + + end_fsb = min(XFS_B_TO_FSB(mp, offset + count), + maxbytes_fsb); + xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb); + error = xfs_reflink_reserve_cow(ip, &got, &shared); + if (error) + goto out_unlock; + } + trace_xfs_iomap_found(ip, offset, count, 0, &got); goto done; } @@ -961,19 +972,13 @@ xfs_file_iomap_begin( struct xfs_mount *mp = ip->i_mount; struct xfs_bmbt_irec imap; xfs_fileoff_t offset_fsb, end_fsb; - bool shared, trimmed; int nimaps = 1, error = 0; + bool shared = false, trimmed = false; unsigned lockmode; if (XFS_FORCED_SHUTDOWN(mp)) return -EIO; - if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { - error = xfs_reflink_reserve_cow_range(ip, offset, length); - if (error < 0) - return error; - } - if ((flags & IOMAP_WRITE) && !IS_DAX(inode) && !xfs_get_extsz_hint(ip)) { /* Reserve delalloc blocks for regular writeback. */ @@ -981,7 +986,16 @@ xfs_file_iomap_begin( iomap); } - lockmode = xfs_ilock_data_map_shared(ip); + /* + * COW writes will allocate delalloc space, so we need to make sure + * to take the lock exclusively here. + */ + if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { + lockmode = XFS_ILOCK_EXCL; + xfs_ilock(ip, XFS_ILOCK_EXCL); + } else { + lockmode = xfs_ilock_data_map_shared(ip); + } ASSERT(offset <= mp->m_super->s_maxbytes); if ((xfs_fsize_t)offset + length > mp->m_super->s_maxbytes) @@ -991,19 +1005,24 @@ xfs_file_iomap_begin( error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap, &nimaps, 0); - if (error) { - xfs_iunlock(ip, lockmode); - return error; - } + if (error) + goto out_unlock; - if (flags & (IOMAP_WRITE | IOMAP_ZERO | IOMAP_REPORT)) { + if (flags & IOMAP_REPORT) { /* Trim the mapping to the nearest shared extent boundary. */ error = xfs_reflink_trim_around_shared(ip, &imap, &shared, &trimmed); - if (error) { - xfs_iunlock(ip, lockmode); - return error; - } + if (error) + goto out_unlock; + } + + if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { + error = xfs_reflink_reserve_cow(ip, &imap, &shared); + if (error) + goto out_unlock; + + end_fsb = imap.br_startoff + imap.br_blockcount; + length = XFS_FSB_TO_B(mp, end_fsb) - offset; } if ((flags & IOMAP_WRITE) && imap_needs_alloc(inode, &imap, nimaps)) { @@ -1042,6 +1061,9 @@ xfs_file_iomap_begin( if (shared) iomap->flags |= IOMAP_F_SHARED; return 0; +out_unlock: + xfs_iunlock(ip, lockmode); + return error; } static int diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 6c4c215634ec..9c477de3c1ac 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -228,50 +228,54 @@ xfs_reflink_trim_around_shared( } } -/* Create a CoW reservation for a range of blocks within a file. */ -static int -__xfs_reflink_reserve_cow( +/* + * Trim the passed in imap to the next shared/unshared extent boundary, and + * if imap->br_startoff points to a shared extent reserve space for it in the + * COW fork. In this case *shared is set to true, else to false. + * + * Note that imap will always contain the block numbers for the existing blocks + * in the data fork, as the upper layers need them for read-modify-write + * operations. + */ +int +xfs_reflink_reserve_cow( struct xfs_inode *ip, - xfs_fileoff_t *offset_fsb, - xfs_fileoff_t end_fsb, - bool *skipped) + struct xfs_bmbt_irec *imap, + bool *shared) { - struct xfs_bmbt_irec got, prev, imap; - xfs_fileoff_t orig_end_fsb; - int nimaps, eof = 0, error = 0; - bool shared = false, trimmed = false; + struct xfs_bmbt_irec got, prev; + xfs_fileoff_t end_fsb, orig_end_fsb; + int eof = 0, error = 0; + bool trimmed; xfs_extnum_t idx; xfs_extlen_t align; - /* Already reserved? Skip the refcount btree access. */ - xfs_bmap_search_extents(ip, *offset_fsb, XFS_COW_FORK, &eof, &idx, + /* + * Search the COW fork extent list first. This serves two purposes: + * first this implement the speculative preallocation using cowextisze, + * so that we also unshared block adjacent to shared blocks instead + * of just the shared blocks themselves. Second the lookup in the + * extent list is generally faster than going out to the shared extent + * tree. + */ + xfs_bmap_search_extents(ip, imap->br_startoff, XFS_COW_FORK, &eof, &idx, &got, &prev); - if (!eof && got.br_startoff <= *offset_fsb) { - end_fsb = orig_end_fsb = got.br_startoff + got.br_blockcount; - trace_xfs_reflink_cow_found(ip, &got); - goto done; - } + if (!eof && got.br_startoff <= imap->br_startoff) { + trace_xfs_reflink_cow_found(ip, imap); + xfs_trim_extent(imap, got.br_startoff, got.br_blockcount); - /* Read extent from the source file. */ - nimaps = 1; - error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, - &imap, &nimaps, 0); - if (error) - goto out_unlock; - ASSERT(nimaps == 1); + *shared = true; + return 0; + } /* Trim the mapping to the nearest shared extent boundary. */ - error = xfs_reflink_trim_around_shared(ip, &imap, &shared, &trimmed); + error = xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed); if (error) - goto out_unlock; - - end_fsb = orig_end_fsb = imap.br_startoff + imap.br_blockcount; + return error; /* Not shared? Just report the (potentially capped) extent. */ - if (!shared) { - *skipped = true; - goto done; - } + if (!*shared) + return 0; /* * Fork all the shared blocks from our write offset until the end of @@ -279,72 +283,38 @@ __xfs_reflink_reserve_cow( */ error = xfs_qm_dqattach_locked(ip, 0); if (error) - goto out_unlock; + return error; + + end_fsb = orig_end_fsb = imap->br_startoff + imap->br_blockcount; align = xfs_eof_alignment(ip, xfs_get_cowextsz_hint(ip)); if (align) end_fsb = roundup_64(end_fsb, align); retry: - error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, *offset_fsb, - end_fsb - *offset_fsb, &got, - &prev, &idx, eof); + error = xfs_bmapi_reserve_delalloc(ip, XFS_COW_FORK, imap->br_startoff, + end_fsb - imap->br_startoff, &got, &prev, &idx, eof); switch (error) { case 0: break; case -ENOSPC: case -EDQUOT: /* retry without any preallocation */ - trace_xfs_reflink_cow_enospc(ip, &imap); + trace_xfs_reflink_cow_enospc(ip, imap); if (end_fsb != orig_end_fsb) { end_fsb = orig_end_fsb; goto retry; } /*FALLTHRU*/ default: - goto out_unlock; + return error; } if (end_fsb != orig_end_fsb) xfs_inode_set_cowblocks_tag(ip); trace_xfs_reflink_cow_alloc(ip, &got); -done: - *offset_fsb = end_fsb; -out_unlock: - return error; -} - -/* Create a CoW reservation for part of a file. */ -int -xfs_reflink_reserve_cow_range( - struct xfs_inode *ip, - xfs_off_t offset, - xfs_off_t count) -{ - struct xfs_mount *mp = ip->i_mount; - xfs_fileoff_t offset_fsb, end_fsb; - bool skipped = false; - int error = 0; - - trace_xfs_reflink_reserve_cow_range(ip, offset, count); - - offset_fsb = XFS_B_TO_FSBT(mp, offset); - end_fsb = XFS_B_TO_FSB(mp, offset + count); - - xfs_ilock(ip, XFS_ILOCK_EXCL); - while (offset_fsb < end_fsb) { - error = __xfs_reflink_reserve_cow(ip, &offset_fsb, end_fsb, - &skipped); - if (error) { - trace_xfs_reflink_reserve_cow_range_error(ip, error, - _RET_IP_); - break; - } - } - xfs_iunlock(ip, XFS_ILOCK_EXCL); - - return error; + return 0; } /* Allocate all CoW reservations covering a range of blocks in a file. */ @@ -359,9 +329,8 @@ __xfs_reflink_allocate_cow( struct xfs_defer_ops dfops; struct xfs_trans *tp; xfs_fsblock_t first_block; - xfs_fileoff_t next_fsb; int nimaps = 1, error; - bool skipped = false; + bool shared; xfs_defer_init(&dfops, &first_block); @@ -372,33 +341,38 @@ __xfs_reflink_allocate_cow( xfs_ilock(ip, XFS_ILOCK_EXCL); - next_fsb = *offset_fsb; - error = __xfs_reflink_reserve_cow(ip, &next_fsb, end_fsb, &skipped); + /* Read extent from the source file. */ + nimaps = 1; + error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, + &imap, &nimaps, 0); + if (error) + goto out_unlock; + ASSERT(nimaps == 1); + + error = xfs_reflink_reserve_cow(ip, &imap, &shared); if (error) goto out_trans_cancel; - if (skipped) { - *offset_fsb = next_fsb; + if (!shared) { + *offset_fsb = imap.br_startoff + imap.br_blockcount; goto out_trans_cancel; } xfs_trans_ijoin(tp, ip, 0); - error = xfs_bmapi_write(tp, ip, *offset_fsb, next_fsb - *offset_fsb, + error = xfs_bmapi_write(tp, ip, imap.br_startoff, imap.br_blockcount, XFS_BMAPI_COWFORK, &first_block, XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK), &imap, &nimaps, &dfops); if (error) goto out_trans_cancel; - /* We might not have been able to map the whole delalloc extent */ - *offset_fsb = min(*offset_fsb + imap.br_blockcount, next_fsb); - error = xfs_defer_finish(&tp, &dfops, NULL); if (error) goto out_trans_cancel; error = xfs_trans_commit(tp); + *offset_fsb = imap.br_startoff + imap.br_blockcount; out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); return error; diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 7ddd9f69560d..fad11607c9ad 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -26,8 +26,8 @@ extern int xfs_reflink_find_shared(struct xfs_mount *mp, xfs_agnumber_t agno, extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, struct xfs_bmbt_irec *irec, bool *shared, bool *trimmed); -extern int xfs_reflink_reserve_cow_range(struct xfs_inode *ip, - xfs_off_t offset, xfs_off_t count); +extern int xfs_reflink_reserve_cow(struct xfs_inode *ip, + struct xfs_bmbt_irec *imap, bool *shared); extern int xfs_reflink_allocate_cow_range(struct xfs_inode *ip, xfs_off_t offset, xfs_off_t count); extern bool xfs_reflink_find_cow_mapping(struct xfs_inode *ip, xfs_off_t offset, diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index ad188d3a83f3..72f9f6b7a76a 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3346,7 +3346,7 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_alloc); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_found); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_enospc); -DEFINE_RW_EVENT(xfs_reflink_reserve_cow_range); +DEFINE_RW_EVENT(xfs_reflink_reserve_cow); DEFINE_RW_EVENT(xfs_reflink_allocate_cow_range); DEFINE_INODE_IREC_EVENT(xfs_reflink_bounce_dio_write); @@ -3358,7 +3358,6 @@ DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_piece); -DEFINE_INODE_ERROR_EVENT(xfs_reflink_reserve_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_allocate_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error); From fa5c836ca8eb5bad6316ddfc066acbc4e2485356 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:54:14 +1100 Subject: [PATCH 090/292] xfs: refactor xfs_bunmapi_cow Split out two helpers for deleting delayed or real extents from the COW fork. This allows to call them directly from xfs_reflink_cow_end_io once that function is refactored to iterate the extent tree. It will also allow to reuse the delalloc deletion from xfs_bunmapi in the future. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_bmap.c | 374 +++++++++++++++++++++++---------------- fs/xfs/libxfs/xfs_bmap.h | 5 + fs/xfs/xfs_reflink.c | 5 - 3 files changed, 225 insertions(+), 159 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 381e7659598c..d7ad51132f4f 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4859,6 +4859,219 @@ xfs_bmap_split_indlen( return stolen; } +int +xfs_bmap_del_extent_delay( + struct xfs_inode *ip, + int whichfork, + xfs_extnum_t *idx, + struct xfs_bmbt_irec *got, + struct xfs_bmbt_irec *del) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork); + struct xfs_bmbt_irec new; + int64_t da_old, da_new, da_diff = 0; + xfs_fileoff_t del_endoff, got_endoff; + xfs_filblks_t got_indlen, new_indlen, stolen; + int error = 0, state = 0; + bool isrt; + + XFS_STATS_INC(mp, xs_del_exlist); + + isrt = (whichfork == XFS_DATA_FORK) && XFS_IS_REALTIME_INODE(ip); + del_endoff = del->br_startoff + del->br_blockcount; + got_endoff = got->br_startoff + got->br_blockcount; + da_old = startblockval(got->br_startblock); + da_new = 0; + + ASSERT(*idx >= 0); + ASSERT(*idx < ifp->if_bytes / sizeof(struct xfs_bmbt_rec)); + ASSERT(del->br_blockcount > 0); + ASSERT(got->br_startoff <= del->br_startoff); + ASSERT(got_endoff >= del_endoff); + + if (isrt) { + int64_t rtexts = XFS_FSB_TO_B(mp, del->br_blockcount); + + do_div(rtexts, mp->m_sb.sb_rextsize); + xfs_mod_frextents(mp, rtexts); + } + + /* + * Update the inode delalloc counter now and wait to update the + * sb counters as we might have to borrow some blocks for the + * indirect block accounting. + */ + xfs_trans_reserve_quota_nblks(NULL, ip, -((long)del->br_blockcount), 0, + isrt ? XFS_QMOPT_RES_RTBLKS : XFS_QMOPT_RES_REGBLKS); + ip->i_delayed_blks -= del->br_blockcount; + + if (whichfork == XFS_COW_FORK) + state |= BMAP_COWFORK; + + if (got->br_startoff == del->br_startoff) + state |= BMAP_LEFT_CONTIG; + if (got_endoff == del_endoff) + state |= BMAP_RIGHT_CONTIG; + + switch (state & (BMAP_LEFT_CONTIG | BMAP_RIGHT_CONTIG)) { + case BMAP_LEFT_CONTIG | BMAP_RIGHT_CONTIG: + /* + * Matches the whole extent. Delete the entry. + */ + xfs_iext_remove(ip, *idx, 1, state); + --*idx; + break; + case BMAP_LEFT_CONTIG: + /* + * Deleting the first part of the extent. + */ + trace_xfs_bmap_pre_update(ip, *idx, state, _THIS_IP_); + got->br_startoff = del_endoff; + got->br_blockcount -= del->br_blockcount; + da_new = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, + got->br_blockcount), da_old); + got->br_startblock = nullstartblock((int)da_new); + xfs_bmbt_set_all(xfs_iext_get_ext(ifp, *idx), got); + trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_); + break; + case BMAP_RIGHT_CONTIG: + /* + * Deleting the last part of the extent. + */ + trace_xfs_bmap_pre_update(ip, *idx, state, _THIS_IP_); + got->br_blockcount = got->br_blockcount - del->br_blockcount; + da_new = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, + got->br_blockcount), da_old); + got->br_startblock = nullstartblock((int)da_new); + xfs_bmbt_set_all(xfs_iext_get_ext(ifp, *idx), got); + trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_); + break; + case 0: + /* + * Deleting the middle of the extent. + * + * Distribute the original indlen reservation across the two new + * extents. Steal blocks from the deleted extent if necessary. + * Stealing blocks simply fudges the fdblocks accounting below. + * Warn if either of the new indlen reservations is zero as this + * can lead to delalloc problems. + */ + trace_xfs_bmap_pre_update(ip, *idx, state, _THIS_IP_); + + got->br_blockcount = del->br_startoff - got->br_startoff; + got_indlen = xfs_bmap_worst_indlen(ip, got->br_blockcount); + + new.br_blockcount = got_endoff - del_endoff; + new_indlen = xfs_bmap_worst_indlen(ip, new.br_blockcount); + + WARN_ON_ONCE(!got_indlen || !new_indlen); + stolen = xfs_bmap_split_indlen(da_old, &got_indlen, &new_indlen, + del->br_blockcount); + + got->br_startblock = nullstartblock((int)got_indlen); + xfs_bmbt_set_all(xfs_iext_get_ext(ifp, *idx), got); + trace_xfs_bmap_post_update(ip, *idx, 0, _THIS_IP_); + + new.br_startoff = del_endoff; + new.br_state = got->br_state; + new.br_startblock = nullstartblock((int)new_indlen); + + ++*idx; + xfs_iext_insert(ip, *idx, 1, &new, state); + + da_new = got_indlen + new_indlen - stolen; + del->br_blockcount -= stolen; + break; + } + + ASSERT(da_old >= da_new); + da_diff = da_old - da_new; + if (!isrt) + da_diff += del->br_blockcount; + if (da_diff) + xfs_mod_fdblocks(mp, da_diff, false); + return error; +} + +void +xfs_bmap_del_extent_cow( + struct xfs_inode *ip, + xfs_extnum_t *idx, + struct xfs_bmbt_irec *got, + struct xfs_bmbt_irec *del) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); + struct xfs_bmbt_irec new; + xfs_fileoff_t del_endoff, got_endoff; + int state = BMAP_COWFORK; + + XFS_STATS_INC(mp, xs_del_exlist); + + del_endoff = del->br_startoff + del->br_blockcount; + got_endoff = got->br_startoff + got->br_blockcount; + + ASSERT(*idx >= 0); + ASSERT(*idx < ifp->if_bytes / sizeof(struct xfs_bmbt_rec)); + ASSERT(del->br_blockcount > 0); + ASSERT(got->br_startoff <= del->br_startoff); + ASSERT(got_endoff >= del_endoff); + ASSERT(!isnullstartblock(got->br_startblock)); + + if (got->br_startoff == del->br_startoff) + state |= BMAP_LEFT_CONTIG; + if (got_endoff == del_endoff) + state |= BMAP_RIGHT_CONTIG; + + switch (state & (BMAP_LEFT_CONTIG | BMAP_RIGHT_CONTIG)) { + case BMAP_LEFT_CONTIG | BMAP_RIGHT_CONTIG: + /* + * Matches the whole extent. Delete the entry. + */ + xfs_iext_remove(ip, *idx, 1, state); + --*idx; + break; + case BMAP_LEFT_CONTIG: + /* + * Deleting the first part of the extent. + */ + trace_xfs_bmap_pre_update(ip, *idx, state, _THIS_IP_); + got->br_startoff = del_endoff; + got->br_blockcount -= del->br_blockcount; + got->br_startblock = del->br_startblock + del->br_blockcount; + xfs_bmbt_set_all(xfs_iext_get_ext(ifp, *idx), got); + trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_); + break; + case BMAP_RIGHT_CONTIG: + /* + * Deleting the last part of the extent. + */ + trace_xfs_bmap_pre_update(ip, *idx, state, _THIS_IP_); + got->br_blockcount -= del->br_blockcount; + xfs_bmbt_set_all(xfs_iext_get_ext(ifp, *idx), got); + trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_); + break; + case 0: + /* + * Deleting the middle of the extent. + */ + trace_xfs_bmap_pre_update(ip, *idx, state, _THIS_IP_); + got->br_blockcount = del->br_startoff - got->br_startoff; + xfs_bmbt_set_all(xfs_iext_get_ext(ifp, *idx), got); + trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_); + + new.br_startoff = del_endoff; + new.br_blockcount = got_endoff - del_endoff; + new.br_state = got->br_state; + new.br_startblock = del->br_startblock + del->br_blockcount; + + ++*idx; + xfs_iext_insert(ip, *idx, 1, &new, state); + break; + } +} + /* * Called by xfs_bmapi to update file extent records and the btree * after removing space (or undoing a delayed allocation). @@ -5207,167 +5420,20 @@ xfs_bunmapi_cow( struct xfs_inode *ip, struct xfs_bmbt_irec *del) { - xfs_filblks_t da_new; - xfs_filblks_t da_old; - xfs_fsblock_t del_endblock = 0; - xfs_fileoff_t del_endoff; - int delay; struct xfs_bmbt_rec_host *ep; - int error; struct xfs_bmbt_irec got; - xfs_fileoff_t got_endoff; - struct xfs_ifork *ifp; - struct xfs_mount *mp; - xfs_filblks_t nblks; struct xfs_bmbt_irec new; - /* REFERENCED */ - uint qfield; - xfs_filblks_t temp; - xfs_filblks_t temp2; - int state = BMAP_COWFORK; int eof; xfs_extnum_t eidx; - mp = ip->i_mount; - XFS_STATS_INC(mp, xs_del_exlist); - ep = xfs_bmap_search_extents(ip, del->br_startoff, XFS_COW_FORK, &eof, - &eidx, &got, &new); - - ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); - ASSERT((eidx >= 0) && (eidx < ifp->if_bytes / - (uint)sizeof(xfs_bmbt_rec_t))); - ASSERT(del->br_blockcount > 0); - ASSERT(got.br_startoff <= del->br_startoff); - del_endoff = del->br_startoff + del->br_blockcount; - got_endoff = got.br_startoff + got.br_blockcount; - ASSERT(got_endoff >= del_endoff); - delay = isnullstartblock(got.br_startblock); - ASSERT(isnullstartblock(del->br_startblock) == delay); - qfield = 0; - error = 0; - /* - * If deleting a real allocation, must free up the disk space. - */ - if (!delay) { - nblks = del->br_blockcount; - qfield = XFS_TRANS_DQ_BCOUNT; - /* - * Set up del_endblock and cur for later. - */ - del_endblock = del->br_startblock + del->br_blockcount; - da_old = da_new = 0; - } else { - da_old = startblockval(got.br_startblock); - da_new = 0; - nblks = 0; - } - qfield = qfield; - nblks = nblks; - - /* - * Set flag value to use in switch statement. - * Left-contig is 2, right-contig is 1. - */ - switch (((got.br_startoff == del->br_startoff) << 1) | - (got_endoff == del_endoff)) { - case 3: - /* - * Matches the whole extent. Delete the entry. - */ - xfs_iext_remove(ip, eidx, 1, BMAP_COWFORK); - --eidx; - break; - - case 2: - /* - * Deleting the first part of the extent. - */ - trace_xfs_bmap_pre_update(ip, eidx, state, _THIS_IP_); - xfs_bmbt_set_startoff(ep, del_endoff); - temp = got.br_blockcount - del->br_blockcount; - xfs_bmbt_set_blockcount(ep, temp); - if (delay) { - temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), - da_old); - xfs_bmbt_set_startblock(ep, nullstartblock((int)temp)); - trace_xfs_bmap_post_update(ip, eidx, state, _THIS_IP_); - da_new = temp; - break; - } - xfs_bmbt_set_startblock(ep, del_endblock); - trace_xfs_bmap_post_update(ip, eidx, state, _THIS_IP_); - break; - - case 1: - /* - * Deleting the last part of the extent. - */ - temp = got.br_blockcount - del->br_blockcount; - trace_xfs_bmap_pre_update(ip, eidx, state, _THIS_IP_); - xfs_bmbt_set_blockcount(ep, temp); - if (delay) { - temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), - da_old); - xfs_bmbt_set_startblock(ep, nullstartblock((int)temp)); - trace_xfs_bmap_post_update(ip, eidx, state, _THIS_IP_); - da_new = temp; - break; - } - trace_xfs_bmap_post_update(ip, eidx, state, _THIS_IP_); - break; - - case 0: - /* - * Deleting the middle of the extent. - */ - temp = del->br_startoff - got.br_startoff; - trace_xfs_bmap_pre_update(ip, eidx, state, _THIS_IP_); - xfs_bmbt_set_blockcount(ep, temp); - new.br_startoff = del_endoff; - temp2 = got_endoff - del_endoff; - new.br_blockcount = temp2; - new.br_state = got.br_state; - if (!delay) { - new.br_startblock = del_endblock; - } else { - temp = xfs_bmap_worst_indlen(ip, temp); - xfs_bmbt_set_startblock(ep, nullstartblock((int)temp)); - temp2 = xfs_bmap_worst_indlen(ip, temp2); - new.br_startblock = nullstartblock((int)temp2); - da_new = temp + temp2; - while (da_new > da_old) { - if (temp) { - temp--; - da_new--; - xfs_bmbt_set_startblock(ep, - nullstartblock((int)temp)); - } - if (da_new == da_old) - break; - if (temp2) { - temp2--; - da_new--; - new.br_startblock = - nullstartblock((int)temp2); - } - } - } - trace_xfs_bmap_post_update(ip, eidx, state, _THIS_IP_); - xfs_iext_insert(ip, eidx + 1, 1, &new, state); - ++eidx; - break; - } - - /* - * Account for change in delayed indirect blocks. - * Nothing to do for disk quota accounting here. - */ - ASSERT(da_old >= da_new); - if (da_old > da_new) - xfs_mod_fdblocks(mp, (int64_t)(da_old - da_new), false); - - return error; + &eidx, &got, &new); + ASSERT(ep); + if (isnullstartblock(got.br_startblock)) + xfs_bmap_del_extent_delay(ip, XFS_COW_FORK, &eidx, &got, del); + else + xfs_bmap_del_extent_cow(ip, &eidx, &got, del); + return 0; } /* diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index eb86af0c988b..5f18248cbac2 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -224,6 +224,11 @@ int xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip, xfs_extnum_t nexts, xfs_fsblock_t *firstblock, struct xfs_defer_ops *dfops, int *done); int xfs_bunmapi_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *del); +int xfs_bmap_del_extent_delay(struct xfs_inode *ip, int whichfork, + xfs_extnum_t *idx, struct xfs_bmbt_irec *got, + struct xfs_bmbt_irec *del); +void xfs_bmap_del_extent_cow(struct xfs_inode *ip, xfs_extnum_t *idx, + struct xfs_bmbt_irec *got, struct xfs_bmbt_irec *del); int xfs_check_nostate_extents(struct xfs_ifork *ifp, xfs_extnum_t idx, xfs_extnum_t num); uint xfs_default_attroffset(struct xfs_inode *ip); diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 9c477de3c1ac..09bd5dcd90cd 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -534,11 +534,6 @@ xfs_reflink_cancel_cow_blocks( trace_xfs_reflink_cancel_cow(ip, &irec); if (irec.br_startblock == DELAYSTARTBLOCK) { - /* Free a delayed allocation. */ - xfs_mod_fdblocks(ip->i_mount, irec.br_blockcount, - false); - ip->i_delayed_blks -= irec.br_blockcount; - /* Remove the mapping from the CoW fork. */ error = xfs_bunmapi_cow(ip, &irec); if (error) From 3e0ee78f7a5a8ed43b129e9e4c5c0ff10c71df74 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:54:31 +1100 Subject: [PATCH 091/292] xfs: optimize xfs_reflink_cancel_cow_blocks Rewrite xfs_reflink_cancel_cow_blocks so that we only do a search for the first extent in the extent list and then iterate over the remaining extents using the extent index, passing the extent we operate on directly to xfs_bmap_del_extent_delay or xfs_bmap_del_extent_cow instead of going through xfs_bunmapi and doing yet another extent list lookup. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/xfs_reflink.c | 51 ++++++++++++++++++++------------------------ 1 file changed, 23 insertions(+), 28 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 09bd5dcd90cd..1e9c589d22cc 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -511,53 +511,49 @@ xfs_reflink_cancel_cow_blocks( xfs_fileoff_t offset_fsb, xfs_fileoff_t end_fsb) { - struct xfs_bmbt_irec irec; - xfs_filblks_t count_fsb; + struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); + struct xfs_bmbt_irec got, prev, del; + xfs_extnum_t idx; xfs_fsblock_t firstfsb; struct xfs_defer_ops dfops; - int error = 0; - int nimaps; + int error = 0, eof = 0; if (!xfs_is_reflink_inode(ip)) return 0; - /* Go find the old extent in the CoW fork. */ - while (offset_fsb < end_fsb) { - nimaps = 1; - count_fsb = (xfs_filblks_t)(end_fsb - offset_fsb); - error = xfs_bmapi_read(ip, offset_fsb, count_fsb, &irec, - &nimaps, XFS_BMAPI_COWFORK); - if (error) - break; - ASSERT(nimaps == 1); + xfs_bmap_search_extents(ip, offset_fsb, XFS_COW_FORK, &eof, &idx, + &got, &prev); + if (eof) + return 0; - trace_xfs_reflink_cancel_cow(ip, &irec); + while (got.br_startoff < end_fsb) { + del = got; + xfs_trim_extent(&del, offset_fsb, end_fsb - offset_fsb); + trace_xfs_reflink_cancel_cow(ip, &del); - if (irec.br_startblock == DELAYSTARTBLOCK) { - /* Remove the mapping from the CoW fork. */ - error = xfs_bunmapi_cow(ip, &irec); + if (isnullstartblock(del.br_startblock)) { + error = xfs_bmap_del_extent_delay(ip, XFS_COW_FORK, + &idx, &got, &del); if (error) break; - } else if (irec.br_startblock == HOLESTARTBLOCK) { - /* empty */ } else { xfs_trans_ijoin(*tpp, ip, 0); xfs_defer_init(&dfops, &firstfsb); /* Free the CoW orphan record. */ error = xfs_refcount_free_cow_extent(ip->i_mount, - &dfops, irec.br_startblock, - irec.br_blockcount); + &dfops, del.br_startblock, + del.br_blockcount); if (error) break; xfs_bmap_add_free(ip->i_mount, &dfops, - irec.br_startblock, irec.br_blockcount, + del.br_startblock, del.br_blockcount, NULL); /* Update quota accounting */ xfs_trans_mod_dquot_byino(*tpp, ip, XFS_TRANS_DQ_BCOUNT, - -(long)irec.br_blockcount); + -(long)del.br_blockcount); /* Roll the transaction */ error = xfs_defer_finish(tpp, &dfops, ip); @@ -567,13 +563,12 @@ xfs_reflink_cancel_cow_blocks( } /* Remove the mapping from the CoW fork. */ - error = xfs_bunmapi_cow(ip, &irec); - if (error) - break; + xfs_bmap_del_extent_cow(ip, &idx, &got, &del); } - /* Roll on... */ - offset_fsb = irec.br_startoff + irec.br_blockcount; + if (++idx >= ifp->if_bytes / sizeof(struct xfs_bmbt_rec)) + return 0; + xfs_bmbt_get_all(xfs_iext_get_ext(ifp, idx), &got); } return error; From c1112b6e626637ec09319883b63e705a931c398b Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:54:45 +1100 Subject: [PATCH 092/292] xfs: optimize xfs_reflink_end_cow Instead of doing a full extent list search for each extent that is to be deleted using xfs_bmapi_read and then doing another one inside of xfs_bunmapi_cow use the same scheme that xfs_bumapi uses: look up the last extent to be deleted and then use the extent index to walk downward until we are outside the range to be deleted. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/xfs_reflink.c | 129 ++++++++++++++++++++----------------------- fs/xfs/xfs_trace.h | 1 - 2 files changed, 61 insertions(+), 69 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 1e9c589d22cc..cd308f119e20 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -633,25 +633,26 @@ xfs_reflink_end_cow( xfs_off_t offset, xfs_off_t count) { - struct xfs_bmbt_irec irec; - struct xfs_bmbt_irec uirec; + struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); + struct xfs_bmbt_irec got, prev, del; struct xfs_trans *tp; xfs_fileoff_t offset_fsb; xfs_fileoff_t end_fsb; - xfs_filblks_t count_fsb; xfs_fsblock_t firstfsb; struct xfs_defer_ops dfops; - int error; + int error, eof = 0; unsigned int resblks; - xfs_filblks_t ilen; xfs_filblks_t rlen; - int nimaps; + xfs_extnum_t idx; trace_xfs_reflink_end_cow(ip, offset, count); + /* No COW extents? That's easy! */ + if (ifp->if_bytes == 0) + return 0; + offset_fsb = XFS_B_TO_FSBT(ip->i_mount, offset); end_fsb = XFS_B_TO_FSB(ip->i_mount, offset + count); - count_fsb = (xfs_filblks_t)(end_fsb - offset_fsb); /* Start a rolling transaction to switch the mappings */ resblks = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK); @@ -663,72 +664,65 @@ xfs_reflink_end_cow( xfs_ilock(ip, XFS_ILOCK_EXCL); xfs_trans_ijoin(tp, ip, 0); - /* Go find the old extent in the CoW fork. */ - while (offset_fsb < end_fsb) { - /* Read extent from the source file */ - nimaps = 1; - count_fsb = (xfs_filblks_t)(end_fsb - offset_fsb); - error = xfs_bmapi_read(ip, offset_fsb, count_fsb, &irec, - &nimaps, XFS_BMAPI_COWFORK); - if (error) - goto out_cancel; - ASSERT(nimaps == 1); + xfs_bmap_search_extents(ip, end_fsb - 1, XFS_COW_FORK, &eof, &idx, + &got, &prev); - ASSERT(irec.br_startblock != DELAYSTARTBLOCK); - trace_xfs_reflink_cow_remap(ip, &irec); + /* If there is a hole at end_fsb - 1 go to the previous extent */ + if (eof || got.br_startoff > end_fsb) { + ASSERT(idx > 0); + xfs_bmbt_get_all(xfs_iext_get_ext(ifp, --idx), &got); + } - /* - * We can have a hole in the CoW fork if part of a directio - * write is CoW but part of it isn't. - */ - rlen = ilen = irec.br_blockcount; - if (irec.br_startblock == HOLESTARTBLOCK) + /* Walk backwards until we're out of the I/O range... */ + while (got.br_startoff + got.br_blockcount > offset_fsb) { + del = got; + xfs_trim_extent(&del, offset_fsb, end_fsb - offset_fsb); + + /* Extent delete may have bumped idx forward */ + if (!del.br_blockcount) { + idx--; goto next_extent; - - /* Unmap the old blocks in the data fork. */ - while (rlen) { - xfs_defer_init(&dfops, &firstfsb); - error = __xfs_bunmapi(tp, ip, irec.br_startoff, - &rlen, 0, 1, &firstfsb, &dfops); - if (error) - goto out_defer; - - /* - * Trim the extent to whatever got unmapped. - * Remember, bunmapi works backwards. - */ - uirec.br_startblock = irec.br_startblock + rlen; - uirec.br_startoff = irec.br_startoff + rlen; - uirec.br_blockcount = irec.br_blockcount - rlen; - irec.br_blockcount = rlen; - trace_xfs_reflink_cow_remap_piece(ip, &uirec); - - /* Free the CoW orphan record. */ - error = xfs_refcount_free_cow_extent(tp->t_mountp, - &dfops, uirec.br_startblock, - uirec.br_blockcount); - if (error) - goto out_defer; - - /* Map the new blocks into the data fork. */ - error = xfs_bmap_map_extent(tp->t_mountp, &dfops, - ip, &uirec); - if (error) - goto out_defer; - - /* Remove the mapping from the CoW fork. */ - error = xfs_bunmapi_cow(ip, &uirec); - if (error) - goto out_defer; - - error = xfs_defer_finish(&tp, &dfops, ip); - if (error) - goto out_defer; } + ASSERT(!isnullstartblock(got.br_startblock)); + + /* Unmap the old blocks in the data fork. */ + xfs_defer_init(&dfops, &firstfsb); + rlen = del.br_blockcount; + error = __xfs_bunmapi(tp, ip, del.br_startoff, &rlen, 0, 1, + &firstfsb, &dfops); + if (error) + goto out_defer; + + /* Trim the extent to whatever got unmapped. */ + if (rlen) { + xfs_trim_extent(&del, del.br_startoff + rlen, + del.br_blockcount - rlen); + } + trace_xfs_reflink_cow_remap(ip, &del); + + /* Free the CoW orphan record. */ + error = xfs_refcount_free_cow_extent(tp->t_mountp, &dfops, + del.br_startblock, del.br_blockcount); + if (error) + goto out_defer; + + /* Map the new blocks into the data fork. */ + error = xfs_bmap_map_extent(tp->t_mountp, &dfops, ip, &del); + if (error) + goto out_defer; + + /* Remove the mapping from the CoW fork. */ + xfs_bmap_del_extent_cow(ip, &idx, &got, &del); + + error = xfs_defer_finish(&tp, &dfops, ip); + if (error) + goto out_defer; + next_extent: - /* Roll on... */ - offset_fsb = irec.br_startoff + ilen; + if (idx < 0) + break; + xfs_bmbt_get_all(xfs_iext_get_ext(ifp, idx), &got); } error = xfs_trans_commit(tp); @@ -739,7 +733,6 @@ next_extent: out_defer: xfs_defer_cancel(&dfops); -out_cancel: xfs_trans_cancel(tp); xfs_iunlock(ip, XFS_ILOCK_EXCL); out: diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 72f9f6b7a76a..0907752be62d 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3356,7 +3356,6 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_trim_irec); DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range); DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap); -DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_piece); DEFINE_INODE_ERROR_EVENT(xfs_reflink_allocate_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error); From 64e6428ddd00f864e3ca105f914a2b6920c2bc41 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 15:54:59 +1100 Subject: [PATCH 093/292] xfs: remove xfs_bunmapi_cow Since no one uses it anymore. Signed-off-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Reviewed-by: Brian Foster Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_bmap.c | 22 ---------------------- fs/xfs/libxfs/xfs_bmap.h | 1 - 2 files changed, 23 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index d7ad51132f4f..c6eb21940783 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -5414,28 +5414,6 @@ done: return error; } -/* Remove an extent from the CoW fork. Similar to xfs_bmap_del_extent. */ -int -xfs_bunmapi_cow( - struct xfs_inode *ip, - struct xfs_bmbt_irec *del) -{ - struct xfs_bmbt_rec_host *ep; - struct xfs_bmbt_irec got; - struct xfs_bmbt_irec new; - int eof; - xfs_extnum_t eidx; - - ep = xfs_bmap_search_extents(ip, del->br_startoff, XFS_COW_FORK, &eof, - &eidx, &got, &new); - ASSERT(ep); - if (isnullstartblock(got.br_startblock)) - xfs_bmap_del_extent_delay(ip, XFS_COW_FORK, &eidx, &got, del); - else - xfs_bmap_del_extent_cow(ip, &eidx, &got, del); - return 0; -} - /* * Unmap (remove) blocks from a file. * If nexts is nonzero then the number of extents to remove is limited to diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index 5f18248cbac2..7cae6ec27fa6 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -223,7 +223,6 @@ int xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip, xfs_fileoff_t bno, xfs_filblks_t len, int flags, xfs_extnum_t nexts, xfs_fsblock_t *firstblock, struct xfs_defer_ops *dfops, int *done); -int xfs_bunmapi_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *del); int xfs_bmap_del_extent_delay(struct xfs_inode *ip, int whichfork, xfs_extnum_t *idx, struct xfs_bmbt_irec *got, struct xfs_bmbt_irec *del); From aed3f249f93fe15f16242bbd4e6c3931674b8d91 Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Mon, 3 Oct 2016 12:36:24 -0700 Subject: [PATCH 094/292] thermal: intel_pch_thermal: Add an ACPI passive trip On the platforms which has an ACPI companion device associated with PCH thermal device, read passive trip temperature via ACPI _PSV control method. Signed-off-by: Srinivas Pandruvada Signed-off-by: Zhang Rui --- drivers/thermal/intel_pch_thermal.c | 51 +++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/drivers/thermal/intel_pch_thermal.c b/drivers/thermal/intel_pch_thermal.c index 9b4815e81b0d..5cf644a2c012 100644 --- a/drivers/thermal/intel_pch_thermal.c +++ b/drivers/thermal/intel_pch_thermal.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -66,9 +67,53 @@ struct pch_thermal_device { unsigned long crt_temp; int hot_trip_id; unsigned long hot_temp; + int psv_trip_id; + unsigned long psv_temp; bool bios_enabled; }; +#ifdef CONFIG_ACPI + +/* + * On some platforms, there is a companion ACPI device, which adds + * passive trip temperature using _PSV method. There is no specific + * passive temperature setting in MMIO interface of this PCI device. + */ +static void pch_wpt_add_acpi_psv_trip(struct pch_thermal_device *ptd, + int *nr_trips) +{ + struct acpi_device *adev; + + ptd->psv_trip_id = -1; + + adev = ACPI_COMPANION(&ptd->pdev->dev); + if (adev) { + unsigned long long r; + acpi_status status; + + status = acpi_evaluate_integer(adev->handle, "_PSV", NULL, + &r); + if (ACPI_SUCCESS(status)) { + unsigned long trip_temp; + + trip_temp = DECI_KELVIN_TO_MILLICELSIUS(r); + if (trip_temp) { + ptd->psv_temp = trip_temp; + ptd->psv_trip_id = *nr_trips; + ++(*nr_trips); + } + } + } +} +#else +static void pch_wpt_add_acpi_psv_trip(struct pch_thermal_device *ptd, + int *nr_trips) +{ + ptd->psv_trip_id = -1; + +} +#endif + static int pch_wpt_init(struct pch_thermal_device *ptd, int *nr_trips) { u8 tsel; @@ -119,6 +164,8 @@ read_trips: ++(*nr_trips); } + pch_wpt_add_acpi_psv_trip(ptd, nr_trips); + return 0; } @@ -194,6 +241,8 @@ static int pch_get_trip_type(struct thermal_zone_device *tzd, int trip, *type = THERMAL_TRIP_CRITICAL; else if (ptd->hot_trip_id == trip) *type = THERMAL_TRIP_HOT; + else if (ptd->psv_trip_id == trip) + *type = THERMAL_TRIP_PASSIVE; else return -EINVAL; @@ -208,6 +257,8 @@ static int pch_get_trip_temp(struct thermal_zone_device *tzd, int trip, int *tem *temp = ptd->crt_temp; else if (ptd->hot_trip_id == trip) *temp = ptd->hot_temp; + else if (ptd->psv_trip_id == trip) + *temp = ptd->psv_temp; else return -EINVAL; From 33086a9a3071812a829d4b63bed826736a5d8b17 Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Mon, 3 Oct 2016 12:36:02 -0700 Subject: [PATCH 095/292] thermal: intel_pch_thermal: Enable Haswell PCH Added missing support for Haswell PCH thermal sensor. Signed-off-by: Srinivas Pandruvada Signed-off-by: Zhang Rui --- drivers/thermal/intel_pch_thermal.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/thermal/intel_pch_thermal.c b/drivers/thermal/intel_pch_thermal.c index 5cf644a2c012..19bf2028e508 100644 --- a/drivers/thermal/intel_pch_thermal.c +++ b/drivers/thermal/intel_pch_thermal.c @@ -25,6 +25,8 @@ #include /* Intel PCH thermal Device IDs */ +#define PCH_THERMAL_DID_HSW_1 0x9C24 /* Haswell PCH */ +#define PCH_THERMAL_DID_HSW_2 0x8C24 /* Haswell PCH */ #define PCH_THERMAL_DID_WPT 0x9CA4 /* Wildcat Point */ #define PCH_THERMAL_DID_SKL 0x9D31 /* Skylake PCH */ @@ -293,6 +295,11 @@ static int intel_pch_thermal_probe(struct pci_dev *pdev, ptd->ops = &pch_dev_ops_wpt; dev_name = "pch_skylake"; break; + case PCH_THERMAL_DID_HSW_1: + case PCH_THERMAL_DID_HSW_2: + ptd->ops = &pch_dev_ops_wpt; + dev_name = "pch_haswell"; + break; default: dev_err(&pdev->dev, "unknown pch thermal device\n"); return -ENODEV; @@ -375,6 +382,8 @@ static int intel_pch_thermal_resume(struct device *device) static struct pci_device_id intel_pch_thermal_id[] = { { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCH_THERMAL_DID_WPT) }, { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCH_THERMAL_DID_SKL) }, + { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCH_THERMAL_DID_HSW_1) }, + { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCH_THERMAL_DID_HSW_2) }, { 0, }, }; MODULE_DEVICE_TABLE(pci, intel_pch_thermal_id); From 3105f234e0aba43e44e277c20f9b32ee8add43d4 Mon Sep 17 00:00:00 2001 From: Eric Ernst Date: Thu, 6 Oct 2016 08:56:49 -0700 Subject: [PATCH 096/292] thermal/powerclamp: correct cpu support check Initial logic for checking CPU match resulted in OR of CPU features rather than the intended AND. Updated to use boot_cpu_has macro rather than x86_match_cpu. In addition, MWAIT is the only required CPU feature for idle injection to work. Drop other feature requirements since they are only needed for optimal efficiency. CC: stable@vger.kernel.org #v4.7 Signed-off-by: Eric Ernst Acked-by: Jacob Pan Signed-off-by: Zhang Rui --- drivers/thermal/intel_powerclamp.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c index 0e4dc0afcfd2..7a223074df3d 100644 --- a/drivers/thermal/intel_powerclamp.c +++ b/drivers/thermal/intel_powerclamp.c @@ -669,20 +669,10 @@ static struct thermal_cooling_device_ops powerclamp_cooling_ops = { .set_cur_state = powerclamp_set_cur_state, }; -static const struct x86_cpu_id intel_powerclamp_ids[] __initconst = { - { X86_VENDOR_INTEL, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_MWAIT }, - { X86_VENDOR_INTEL, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_ARAT }, - { X86_VENDOR_INTEL, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_NONSTOP_TSC }, - { X86_VENDOR_INTEL, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_CONSTANT_TSC}, - {} -}; -MODULE_DEVICE_TABLE(x86cpu, intel_powerclamp_ids); - static int __init powerclamp_probe(void) { - if (!x86_match_cpu(intel_powerclamp_ids)) { - pr_err("Intel powerclamp does not run on family %d model %d\n", - boot_cpu_data.x86, boot_cpu_data.x86_model); + if (!boot_cpu_has(X86_FEATURE_MWAIT)) { + pr_err("CPU does not support MWAIT"); return -ENODEV; } From de24e0a108bc48062e1c7acaa97014bce32a919f Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Wed, 19 Oct 2016 15:45:07 +0200 Subject: [PATCH 097/292] USB: serial: cp210x: fix tiocmget error handling The current tiocmget implementation would fail to report errors up the stack and instead leaked a few bits from the stack as a mask of modem-status flags. Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control") Cc: stable Signed-off-by: Johan Hovold --- drivers/usb/serial/cp210x.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c index 54a4de0efdba..f61477bed3a8 100644 --- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -1077,7 +1077,9 @@ static int cp210x_tiocmget(struct tty_struct *tty) u8 control; int result; - cp210x_read_u8_reg(port, CP210X_GET_MDMSTS, &control); + result = cp210x_read_u8_reg(port, CP210X_GET_MDMSTS, &control); + if (result) + return result; result = ((control & CONTROL_DTR) ? TIOCM_DTR : 0) |((control & CONTROL_RTS) ? TIOCM_RTS : 0) From 6aecd8715802d23dc6a0859b50c62d2b0a99de3a Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Thu, 20 Oct 2016 14:03:33 +0800 Subject: [PATCH 098/292] ALSA: hda - Fix headset mic detection problem for two Dell laptops They uses the codec ALC255, and have the different pin cfg definition from the ones in the existing pin quirk table. Now adding them into the table to fix the problem. Cc: stable@vger.kernel.org Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 8483fc20f635..b582d57fe184 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5856,10 +5856,18 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x14, 0x90170110}, {0x1b, 0x02011020}, {0x21, 0x0221101f}), + SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, + {0x14, 0x90170110}, + {0x1b, 0x01011020}, + {0x21, 0x0221101f}), SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, {0x14, 0x90170130}, {0x1b, 0x01014020}, {0x21, 0x0221103f}), + SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, + {0x14, 0x90170130}, + {0x1b, 0x01011020}, + {0x21, 0x0221103f}), SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, {0x14, 0x90170130}, {0x1b, 0x02011020}, From 15e2a357876910623953a7c9b071096e6d976ca9 Mon Sep 17 00:00:00 2001 From: Pavel Machek Date: Mon, 3 Oct 2016 10:43:46 +0200 Subject: [PATCH 099/292] gpio/board.txt: point to gpiod_set_value gpiod_set_value() is preffered interface these days, so add a pointer. Also fix a missing ). Signed-off-by: Pavel Machek [Fixed some grammar and reworded] Signed-off-by: Linus Walleij --- Documentation/gpio/board.txt | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/Documentation/gpio/board.txt b/Documentation/gpio/board.txt index 40884c4fe40c..a0f61898d493 100644 --- a/Documentation/gpio/board.txt +++ b/Documentation/gpio/board.txt @@ -6,7 +6,7 @@ Note that it only applies to the new descriptor-based interface. For a description of the deprecated integer-based GPIO interface please refer to gpio-legacy.txt (actually, there is no real mapping possible with the old interface; you just fetch an integer from somewhere and request the -corresponding GPIO. +corresponding GPIO). All platforms can enable the GPIO library, but if the platform strictly requires GPIO functionality to be present, it needs to select GPIOLIB from its @@ -162,6 +162,9 @@ The driver controlling "foo.0" will then be able to obtain its GPIOs as follows: Since the "led" GPIOs are mapped as active-high, this example will switch their signals to 1, i.e. enabling the LEDs. And for the "power" GPIO, which is mapped -as active-low, its actual signal will be 0 after this code. Contrary to the legacy -integer GPIO interface, the active-low property is handled during mapping and is -thus transparent to GPIO consumers. +as active-low, its actual signal will be 0 after this code. Contrary to the +legacy integer GPIO interface, the active-low property is handled during +mapping and is thus transparent to GPIO consumers. + +A set of functions such as gpiod_set_value() is available to work with +the new descriptor-oriented interface. From 44df08198bc98d75085bb0ff4b54bf43e0bc40c0 Mon Sep 17 00:00:00 2001 From: Arvind Yadav Date: Wed, 5 Oct 2016 15:08:36 +0530 Subject: [PATCH 100/292] gpio: mxs: Unmap region obtained by of_iomap Free memory mapping, if mxs_gpio_probe is not successful. Signed-off-by: Arvind Yadav Signed-off-by: Linus Walleij --- drivers/gpio/gpio-mxs.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpio-mxs.c b/drivers/gpio/gpio-mxs.c index b9daa0bf32a4..ee1724806f46 100644 --- a/drivers/gpio/gpio-mxs.c +++ b/drivers/gpio/gpio-mxs.c @@ -308,8 +308,10 @@ static int mxs_gpio_probe(struct platform_device *pdev) writel(~0U, port->base + PINCTRL_IRQSTAT(port) + MXS_CLR); irq_base = irq_alloc_descs(-1, 0, 32, numa_node_id()); - if (irq_base < 0) - return irq_base; + if (irq_base < 0) { + err = irq_base; + goto out_iounmap; + } port->domain = irq_domain_add_legacy(np, 32, irq_base, 0, &irq_domain_simple_ops, NULL); @@ -349,6 +351,8 @@ out_irqdomain_remove: irq_domain_remove(port->domain); out_irqdesc_free: irq_free_descs(irq_base, 32); +out_iounmap: + iounmap(port->base); return err; } From d1ca19cb3bc88318a54643f7a250ec6abab51108 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 12 Oct 2016 09:25:20 +0300 Subject: [PATCH 101/292] gpio: stmpe: || vs && typo && was obviously intended here. Fixes: 6936e1f88d23 ('gpio: stmpe: Write int status register only when needed') Signed-off-by: Dan Carpenter Acked-by: Patrice Chotard Signed-off-by: Linus Walleij --- drivers/gpio/gpio-stmpe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c index e7d422a6b90b..5b0042776ec7 100644 --- a/drivers/gpio/gpio-stmpe.c +++ b/drivers/gpio/gpio-stmpe.c @@ -409,7 +409,7 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev) * 801/1801/1600, bits are cleared when read. * Edge detect register is not present on 801/1600/1801 */ - if (stmpe->partnum != STMPE801 || stmpe->partnum != STMPE1600 || + if (stmpe->partnum != STMPE801 && stmpe->partnum != STMPE1600 && stmpe->partnum != STMPE1801) { stmpe_reg_write(stmpe, statmsbreg + i, status[i]); stmpe_reg_write(stmpe, From 0cb940927df72a514588d8286916eda4aaa4d011 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 10 Oct 2016 14:42:46 +0200 Subject: [PATCH 102/292] gpio: mockup: add sysfs dependency Building the gpio-mockup driver without SYSFS results in a harmless Kconfig warning: warning: (GPIO_MOCKUP) selects GPIO_SYSFS which has unmet direct dependencies (GPIOLIB && SYSFS) We can easily avoid that warning by adding a dependency on SYSFS. Fixes: 0f98dd1b27d2 ("gpio/mockup: add virtual gpio device") Signed-off-by: Arnd Bergmann Signed-off-by: Linus Walleij --- drivers/gpio/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 26ee00f6bd58..d011cb89d25e 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -284,7 +284,7 @@ config GPIO_MM_LANTIQ config GPIO_MOCKUP tristate "GPIO Testing Driver" - depends on GPIOLIB + depends on GPIOLIB && SYSFS select GPIO_SYSFS help This enables GPIO Testing driver, which provides a way to test GPIO From 67bf5156edc4f58241fd7c119ae145c552adddd6 Mon Sep 17 00:00:00 2001 From: David Arcari Date: Wed, 12 Oct 2016 18:40:30 +0200 Subject: [PATCH 103/292] gpio / ACPI: fix returned error from acpi_dev_gpio_irq_get() acpi_dev_gpio_irq_get() currently ignores the error returned by acpi_get_gpiod_by_index() and overwrites it with -ENOENT. Problem is this error can be -EPROBE_DEFER, which just blows up some drivers when the module ordering is not correct. Cc: stable@vger.kernel.org Signed-off-by: David Arcari Signed-off-by: Benjamin Tissoires Acked-by: Mika Westerberg Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib-acpi.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c index 58ece201b8e6..72a4b326fd0d 100644 --- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -653,14 +653,17 @@ int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index) { int idx, i; unsigned int irq_flags; + int ret = -ENOENT; for (i = 0, idx = 0; idx <= index; i++) { struct acpi_gpio_info info; struct gpio_desc *desc; desc = acpi_get_gpiod_by_index(adev, NULL, i, &info); - if (IS_ERR(desc)) + if (IS_ERR(desc)) { + ret = PTR_ERR(desc); break; + } if (info.gpioint && idx++ == index) { int irq = gpiod_to_irq(desc); @@ -679,7 +682,7 @@ int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index) } } - return -ENOENT; + return ret; } EXPORT_SYMBOL_GPL(acpi_dev_gpio_irq_get); From 4c39135aa412d2f1381e43802523da110ca7855c Mon Sep 17 00:00:00 2001 From: Mathias Nyman Date: Thu, 20 Oct 2016 18:09:18 +0300 Subject: [PATCH 104/292] xhci: add restart quirk for Intel Wildcatpoint PCH xHC in Wildcatpoint-LP PCH is similar to LynxPoint-LP and need the same quirks to prevent machines from spurious restart while shutting them down. Reported-by: Hasan Mahmood CC: Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index d7b0f97abbad..314179b4824e 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -45,6 +45,7 @@ #define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31 #define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31 +#define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI 0x9cb1 #define PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI 0x22b5 #define PCI_DEVICE_ID_INTEL_SUNRISEPOINT_H_XHCI 0xa12f #define PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_XHCI 0x9d2f @@ -153,7 +154,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) xhci->quirks |= XHCI_SPURIOUS_REBOOT; } if (pdev->vendor == PCI_VENDOR_ID_INTEL && - pdev->device == PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI) { + (pdev->device == PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI || + pdev->device == PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI)) { xhci->quirks |= XHCI_SPURIOUS_REBOOT; xhci->quirks |= XHCI_SPURIOUS_WAKEUP; } From 346e99736c3ce328fd42d678343b70243aca5f36 Mon Sep 17 00:00:00 2001 From: Mathias Nyman Date: Thu, 20 Oct 2016 18:09:19 +0300 Subject: [PATCH 105/292] xhci: workaround for hosts missing CAS bit If a device is unplugged and replugged during Sx system suspend some Intel xHC hosts will overwrite the CAS (Cold attach status) flag and no device connection is noticed in resume. A device in this state can be identified in resume if its link state is in polling or compliance mode, and the current connect status is 0. A device in this state needs to be warm reset. Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8 Observed on Cherryview and Apollolake as they go into compliance mode if LFPS times out during polling, and re-plugged devices are not discovered at resume. Signed-off-by: Mathias Nyman CC: Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-hub.c | 37 +++++++++++++++++++++++++++++++++++++ drivers/usb/host/xhci-pci.c | 6 ++++++ drivers/usb/host/xhci.h | 3 +++ 3 files changed, 46 insertions(+) diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index 730b9fd26685..c047362b3768 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1355,6 +1355,35 @@ int xhci_bus_suspend(struct usb_hcd *hcd) return 0; } +/* + * Workaround for missing Cold Attach Status (CAS) if device re-plugged in S3. + * warm reset a USB3 device stuck in polling or compliance mode after resume. + * See Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8 + */ +static bool xhci_port_missing_cas_quirk(int port_index, + __le32 __iomem **port_array) +{ + u32 portsc; + + portsc = readl(port_array[port_index]); + + /* if any of these are set we are not stuck */ + if (portsc & (PORT_CONNECT | PORT_CAS)) + return false; + + if (((portsc & PORT_PLS_MASK) != XDEV_POLLING) && + ((portsc & PORT_PLS_MASK) != XDEV_COMP_MODE)) + return false; + + /* clear wakeup/change bits, and do a warm port reset */ + portsc &= ~(PORT_RWC_BITS | PORT_CEC | PORT_WAKE_BITS); + portsc |= PORT_WR; + writel(portsc, port_array[port_index]); + /* flush write */ + readl(port_array[port_index]); + return true; +} + int xhci_bus_resume(struct usb_hcd *hcd) { struct xhci_hcd *xhci = hcd_to_xhci(hcd); @@ -1392,6 +1421,14 @@ int xhci_bus_resume(struct usb_hcd *hcd) u32 temp; temp = readl(port_array[port_index]); + + /* warm reset CAS limited ports stuck in polling/compliance */ + if ((xhci->quirks & XHCI_MISSING_CAS) && + (hcd->speed >= HCD_USB3) && + xhci_port_missing_cas_quirk(port_index, port_array)) { + xhci_dbg(xhci, "reset stuck port %d\n", port_index); + continue; + } if (DEV_SUPERSPEED_ANY(temp)) temp &= ~(PORT_RWC_BITS | PORT_CEC | PORT_WAKE_BITS); else diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index 314179b4824e..e96ae80d107e 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -51,6 +51,7 @@ #define PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_XHCI 0x9d2f #define PCI_DEVICE_ID_INTEL_BROXTON_M_XHCI 0x0aa8 #define PCI_DEVICE_ID_INTEL_BROXTON_B_XHCI 0x1aa8 +#define PCI_DEVICE_ID_INTEL_APL_XHCI 0x5aa8 static const char hcd_name[] = "xhci_hcd"; @@ -171,6 +172,11 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) pdev->device == PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI) { xhci->quirks |= XHCI_SSIC_PORT_UNUSED; } + if (pdev->vendor == PCI_VENDOR_ID_INTEL && + (pdev->device == PCI_DEVICE_ID_INTEL_CHERRYVIEW_XHCI || + pdev->device == PCI_DEVICE_ID_INTEL_APL_XHCI)) + xhci->quirks |= XHCI_MISSING_CAS; + if (pdev->vendor == PCI_VENDOR_ID_ETRON && pdev->device == PCI_DEVICE_ID_EJ168) { xhci->quirks |= XHCI_RESET_ON_RESUME; diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index b2c1dc5dc0f3..f945380035d0 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -314,6 +314,8 @@ struct xhci_op_regs { #define XDEV_U2 (0x2 << 5) #define XDEV_U3 (0x3 << 5) #define XDEV_INACTIVE (0x6 << 5) +#define XDEV_POLLING (0x7 << 5) +#define XDEV_COMP_MODE (0xa << 5) #define XDEV_RESUME (0xf << 5) /* true: port has power (see HCC_PPC) */ #define PORT_POWER (1 << 9) @@ -1653,6 +1655,7 @@ struct xhci_hcd { #define XHCI_MTK_HOST (1 << 21) #define XHCI_SSIC_PORT_UNUSED (1 << 22) #define XHCI_NO_64BIT_SUPPORT (1 << 23) +#define XHCI_MISSING_CAS (1 << 24) unsigned int num_active_eps; unsigned int limit_active_eps; /* There are two roothubs to keep track of bus suspend info for */ From 7d3b016a6f5a0fa610dfd02b05654c08fa4ae514 Mon Sep 17 00:00:00 2001 From: Mathias Nyman Date: Thu, 20 Oct 2016 18:09:20 +0300 Subject: [PATCH 106/292] xhci: use default USB_RESUME_TIMEOUT when resuming ports. USB2 host inititated resume, and system suspend bus resume need to use the same USB_RESUME_TIMEOUT as elsewhere. This resolves a device disconnect issue at system resume seen on Intel Braswell and Apollolake, but is in no way limited to those platforms. Signed-off-by: Mathias Nyman CC: Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/xhci-hub.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index c047362b3768..0ef16900efed 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1166,7 +1166,7 @@ int xhci_hub_control(struct usb_hcd *hcd, u16 typeReq, u16 wValue, xhci_set_link_state(xhci, port_array, wIndex, XDEV_RESUME); spin_unlock_irqrestore(&xhci->lock, flags); - msleep(20); + msleep(USB_RESUME_TIMEOUT); spin_lock_irqsave(&xhci->lock, flags); xhci_set_link_state(xhci, port_array, wIndex, XDEV_U0); @@ -1447,7 +1447,7 @@ int xhci_bus_resume(struct usb_hcd *hcd) if (need_usb2_u3_exit) { spin_unlock_irqrestore(&xhci->lock, flags); - msleep(20); + msleep(USB_RESUME_TIMEOUT); spin_lock_irqsave(&xhci->lock, flags); } From a478b097474cd9f2268ab1beaca74ff09e652b9b Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 20 Oct 2016 17:15:41 +0200 Subject: [PATCH 107/292] ahci: fix nvec check commit 17a51f12 ("ahci: only try to use multi-MSI mode if there is more than 1 port") lead to a case where nvec isn't initialized before it's used. Fix this by moving the check into the n_ports conditional. Reported-and-reviewed-by Colin Ian King Signed-off-by: Christoph Hellwig Signed-off-by: Tejun Heo --- drivers/ata/ahci.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index ed311a040fed..60e42e2ed68f 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -1436,14 +1436,14 @@ static int ahci_init_msi(struct pci_dev *pdev, unsigned int n_ports, "ahci: MRSM is on, fallback to single MSI\n"); pci_free_irq_vectors(pdev); } - } - /* - * -ENOSPC indicated we don't have enough vectors. Don't bother trying - * a single vectors for any other error: - */ - if (nvec < 0 && nvec != -ENOSPC) - return nvec; + /* + * -ENOSPC indicated we don't have enough vectors. Don't bother + * trying a single vectors for any other error: + */ + if (nvec < 0 && nvec != -ENOSPC) + return nvec; + } /* * If the host is not capable of supporting per-port vectors, fall From 91bbc174d45c347aa7aedb2215cc7d2013c06c1f Mon Sep 17 00:00:00 2001 From: Christophe JAILLET Date: Sun, 25 Sep 2016 13:53:58 +0200 Subject: [PATCH 108/292] clk: at91: Fix a return value in case of error If 'clk_hw_register()' fails, it is likely that we expect to return an error instead of a valid pointer (which would mean success). Fix commit f5644f10dcfb ("clk: at91: Migrate to clk_hw based registration and OF APIs") Signed-off-by: Christophe JAILLET Signed-off-by: Stephen Boyd --- drivers/clk/at91/clk-programmable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/clk/at91/clk-programmable.c b/drivers/clk/at91/clk-programmable.c index 190122e64a3a..85a449cf61e3 100644 --- a/drivers/clk/at91/clk-programmable.c +++ b/drivers/clk/at91/clk-programmable.c @@ -203,7 +203,7 @@ at91_clk_register_programmable(struct regmap *regmap, ret = clk_hw_register(NULL, &prog->hw); if (ret) { kfree(prog); - hw = &prog->hw; + hw = ERR_PTR(ret); } return hw; From 1f1cc4566bd9dd8d3cf19965a4b6392143618536 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:53:59 +0200 Subject: [PATCH 109/292] gpio: GPIO_GET_CHIPINFO_IOCTL: Fix line offset validation The current line offset validation is off by one. Depending on the data stored behind the descs array this can either cause undefined behavior or disclose arbitrary, potentially sensitive, memory to the issuing userspace application. Make sure that offset is within the bounds of the desc array. Cc: stable@vger.kernel.org Fixes: 521a2ad6f862 ("gpio: add userspace ABI for GPIO line information") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index f0fc3a0d37c8..dac975397253 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -839,7 +839,7 @@ static long gpio_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) if (copy_from_user(&lineinfo, ip, sizeof(lineinfo))) return -EFAULT; - if (lineinfo.line_offset > gdev->ngpio) + if (lineinfo.line_offset >= gdev->ngpio) return -EINVAL; desc = &gdev->descs[lineinfo.line_offset]; From 0f4bbb233743bdfd51d47688b0bc2561f310488b Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:00 +0200 Subject: [PATCH 110/292] gpio: GPIO_GET_CHIPINFO_IOCTL: Fix information leak The GPIO_GET_CHIPINFO_IOCTL handler allocates a gpiochip_info struct on the stack and then passes it to copy_to_user(). But depending on the length of the GPIO chip name and label the struct is only partially initialized. This exposes the previous, potentially sensitive, stack content to the issuing userspace application. To avoid this make sure that the struct is fully initialized. Cc: stable@vger.kernel.org Fixes: 521a2ad6f862 ("gpio: add userspace ABI for GPIO line information") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index dac975397253..7a01969169e8 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -823,6 +823,8 @@ static long gpio_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) if (cmd == GPIO_GET_CHIPINFO_IOCTL) { struct gpiochip_info chipinfo; + memset(&chipinfo, 0, sizeof(chipinfo)); + strncpy(chipinfo.name, dev_name(&gdev->dev), sizeof(chipinfo.name)); chipinfo.name[sizeof(chipinfo.name)-1] = '\0'; From e405f9fcb63602d35f7a419ededa3f952a395a72 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:01 +0200 Subject: [PATCH 111/292] gpio: GPIO_GET_LINEHANDLE_IOCTL: Validate line offset The line offset that is used as an index into the descs array is provided by userspace and might go beyond the bounds of the array. If that happens undefined behavior will occur. Make sure that the offset is within the bounds of the desc array and reject any requests that specify a value outside of it. Cc: stable@vger.kernel.org Fixes: d7c51b47ac11 ("gpio: userspace ABI for reading/writing GPIO lines") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 7a01969169e8..d287cb4e97c4 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -444,6 +444,11 @@ static int linehandle_create(struct gpio_device *gdev, void __user *ip) u32 lflags = handlereq.flags; struct gpio_desc *desc; + if (offset >= gdev->ngpio) { + ret = -EINVAL; + goto out_free_descs; + } + desc = &gdev->descs[offset]; ret = gpiod_request(desc, lh->label); if (ret) From d82aa4a8f2e8df9673ddb099262355da4c9b99b1 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:04 +0200 Subject: [PATCH 112/292] gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak The GPIOHANDLE_GET_LINE_VALUES_IOCTL handler allocates a gpiohandle_data struct on the stack and then passes it to copy_to_user(). But only the first element of the values array in the struct is set, which leaves the struct partially initialized. This exposes the previous, potentially sensitive, stack content to the issuing userspace application. To avoid this make sure that the struct is fully initialized. Cc: stable@vger.kernel.org Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index d287cb4e97c4..4ed26bcf054c 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -628,6 +628,8 @@ static long lineevent_ioctl(struct file *filep, unsigned int cmd, if (cmd == GPIOHANDLE_GET_LINE_VALUES_IOCTL) { int val; + memset(&ghd, 0, sizeof(ghd)); + val = gpiod_get_value_cansleep(le->desc); if (val < 0) return val; From b8b0e3d303654b3bb7b31b0266c513fd6f4132ce Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:03 +0200 Subject: [PATCH 113/292] gpio: GPIO_GET_LINEEVENT_IOCTL: Validate line offset The line offset that is used as an index into the descs array is provided by userspace and might go beyond the bounds of the array. If that happens undefined behavior will occur. Make sure that the offset is within the bounds of the desc array and reject any requests that specify a value outside of it. Cc: stable@vger.kernel.org Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 4ed26bcf054c..db44a7da5eb4 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -733,6 +733,11 @@ static int lineevent_create(struct gpio_device *gdev, void __user *ip) lflags = eventreq.handleflags; eflags = eventreq.eventflags; + if (offset >= gdev->ngpio) { + ret = -EINVAL; + goto out_free_label; + } + /* This is just wrong: we don't look for events on output lines */ if (lflags & GPIOHANDLE_REQUEST_OUTPUT) { ret = -EINVAL; From 3eded5d83bf4e36ad78775c7ceb44a45480b0abd Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:02 +0200 Subject: [PATCH 114/292] gpio: GPIOHANDLE_GET_LINE_VALUES_IOCTL: Fix information leak The GPIOHANDLE_GET_LINE_VALUES_IOCTL handler allocates a gpiohandle_data struct on the stack and then passes it to copy_to_user(). But depending on the number of requested line handles the struct is only partially initialized. This exposes the previous, potentially sensitive, stack content to the issuing userspace application. To avoid this make sure that the struct is fully initialized. Cc: stable@vger.kernel.org Fixes: d7c51b47ac11 ("gpio: userspace ABI for reading/writing GPIO lines") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index db44a7da5eb4..7f92f8964efd 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -344,6 +344,8 @@ static long linehandle_ioctl(struct file *filep, unsigned int cmd, if (cmd == GPIOHANDLE_GET_LINE_VALUES_IOCTL) { int val; + memset(&ghd, 0, sizeof(ghd)); + /* TODO: check if descriptors are really input */ for (i = 0; i < lh->numdescs; i++) { val = gpiod_get_value_cansleep(lh->descs[i]); From e3e847c7f15a27c80f526b2a7a8d4dd7ce0960a0 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:05 +0200 Subject: [PATCH 115/292] gpio: GPIO_GET_LINEHANDLE_IOCTL: Reject invalid line flags The GPIO_GET_LINEHANDLE_IOCTL currently ignores unknown or undefined linehandle flags. From a backwards and forwards compatibility viewpoint it is highly desirable to reject unknown flags though. On one hand an application that is using newer flags and is running on an older kernel has no way to detect if the new flags were handled correctly if they are silently discarded. On the other hand an application that (accidentally) passes undefined flags will run fine on an older kernel, but may break on a newer kernel when these flags get defined. Ensure that requests that have undefined flags set are rejected with an error, rather than silently discarding the undefined flags. Cc: stable@vger.kernel.org Fixes: d7c51b47ac11 ("gpio: userspace ABI for reading/writing GPIO lines") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 7f92f8964efd..b5b1a8425907 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -333,6 +333,13 @@ struct linehandle_state { u32 numdescs; }; +#define GPIOHANDLE_REQUEST_VALID_FLAGS \ + (GPIOHANDLE_REQUEST_INPUT | \ + GPIOHANDLE_REQUEST_OUTPUT | \ + GPIOHANDLE_REQUEST_ACTIVE_LOW | \ + GPIOHANDLE_REQUEST_OPEN_DRAIN | \ + GPIOHANDLE_REQUEST_OPEN_SOURCE) + static long linehandle_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { @@ -451,6 +458,12 @@ static int linehandle_create(struct gpio_device *gdev, void __user *ip) goto out_free_descs; } + /* Return an error if a unknown flag is set */ + if (lflags & ~GPIOHANDLE_REQUEST_VALID_FLAGS) { + ret = -EINVAL; + goto out_free_descs; + } + desc = &gdev->descs[offset]; ret = gpiod_request(desc, lh->label); if (ret) From ac7dbb991ee5afc0beacce3a252dcaaa249a7786 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Tue, 18 Oct 2016 16:54:06 +0200 Subject: [PATCH 116/292] gpio: GPIO_GET_LINEEVENT_IOCTL: Reject invalid line and event flags The GPIO_GET_LINEEVENT_IOCTL currently ignores unknown or undefined linehandle and lineevent flags. From a backwards and forwards compatibility viewpoint it is highly desirable to reject unknown flags though. On one hand an application that is using newer flags and is running on an older kernel has no way to detect if the new flags were handled correctly if they are silently discarded. On the other hand an application that (accidentally) passes undefined flags will run fine on an older kernel, but may break on a newer kernel when these flags get defined. Ensure that requests that have undefined flags set are rejected with an error, rather than silently discarding the undefined flags. Cc: stable@vger.kernel.org Fixes: 61f922db7221 ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Lars-Peter Clausen Signed-off-by: Linus Walleij --- drivers/gpio/gpiolib.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index b5b1a8425907..20e09b7c2de3 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -556,6 +556,10 @@ struct lineevent_state { struct mutex read_lock; }; +#define GPIOEVENT_REQUEST_VALID_FLAGS \ + (GPIOEVENT_REQUEST_RISING_EDGE | \ + GPIOEVENT_REQUEST_FALLING_EDGE) + static unsigned int lineevent_poll(struct file *filep, struct poll_table_struct *wait) { @@ -753,6 +757,13 @@ static int lineevent_create(struct gpio_device *gdev, void __user *ip) goto out_free_label; } + /* Return an error if a unknown flag is set */ + if ((lflags & ~GPIOHANDLE_REQUEST_VALID_FLAGS) || + (eflags & ~GPIOEVENT_REQUEST_VALID_FLAGS)) { + ret = -EINVAL; + goto out_free_label; + } + /* This is just wrong: we don't look for events on output lines */ if (lflags & GPIOHANDLE_REQUEST_OUTPUT) { ret = -EINVAL; From 91f1551a746f62937957fa1bff4165e1b7a45337 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Tue, 18 Oct 2016 17:44:02 -0300 Subject: [PATCH 117/292] gpio: ts4800: Fix module autoload If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Before this patch: $ modinfo drivers/gpio/gpio-ts4800.ko | grep alias $ After this patch: $ modinfo drivers/gpio/gpio-ts4800.ko | grep alias alias: of:N*T*Ctechnologic,ts4800-gpioC* alias: of:N*T*Ctechnologic,ts4800-gpio Signed-off-by: Javier Martinez Canillas Signed-off-by: Linus Walleij --- drivers/gpio/gpio-ts4800.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpio/gpio-ts4800.c b/drivers/gpio/gpio-ts4800.c index 99256115bea5..c2a80b4cbf32 100644 --- a/drivers/gpio/gpio-ts4800.c +++ b/drivers/gpio/gpio-ts4800.c @@ -66,6 +66,7 @@ static const struct of_device_id ts4800_gpio_of_match[] = { { .compatible = "technologic,ts4800-gpio", }, {}, }; +MODULE_DEVICE_TABLE(of, ts4800_gpio_of_match); static struct platform_driver ts4800_gpio_driver = { .driver = { From 6a34e0e6b49f50d2626c15ed75b76031f12bd637 Mon Sep 17 00:00:00 2001 From: Scott Wood Date: Thu, 22 Sep 2016 03:35:16 -0500 Subject: [PATCH 118/292] arm64: dts: Add timer erratum property for LS2080A and LS1043A Both the LS1043A and LS2080A platforms are affected by the Freescale A008585 erratum. Advertise it in their respective device trees. Signed-off-by: Scott Wood Acked-by: Marc Zyngier Signed-off-by: Shawn Guo --- arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 1 + arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi index 220ac7057d12..97d331ec2500 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi @@ -123,6 +123,7 @@ <1 14 0xf08>, /* Physical Non-Secure PPI */ <1 11 0xf08>, /* Virtual PPI */ <1 10 0xf08>; /* Hypervisor PPI */ + fsl,erratum-a008585; }; pmu { diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi index 337da90bd7da..7f0dc13b4087 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi @@ -195,6 +195,7 @@ <1 14 4>, /* Physical Non-Secure PPI, active-low */ <1 11 4>, /* Virtual PPI, active-low */ <1 10 4>; /* Hypervisor PPI, active-low */ + fsl,erratum-a008585; }; pmu { From 126d26f66d9890a69158812a6caa248c05359daa Mon Sep 17 00:00:00 2001 From: Johan Hovold Date: Fri, 21 Oct 2016 12:56:27 +0200 Subject: [PATCH 119/292] USB: serial: fix potential NULL-dereference at probe Make sure we have at least one port before attempting to register a console. Currently, at least one driver binds to a "dummy" interface and requests zero ports for it. Should such an interface also lack endpoints, we get a NULL-deref during probe. Fixes: e5b1e2062e05 ("USB: serial: make minor allocation dynamic") Cc: stable # 3.11 Signed-off-by: Johan Hovold --- drivers/usb/serial/usb-serial.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c index d213cf44a7e4..4a037b4a79cf 100644 --- a/drivers/usb/serial/usb-serial.c +++ b/drivers/usb/serial/usb-serial.c @@ -1078,7 +1078,8 @@ static int usb_serial_probe(struct usb_interface *interface, serial->disconnected = 0; - usb_serial_console_init(serial->port[0]->minor); + if (num_ports > 0) + usb_serial_console_init(serial->port[0]->minor); exit: module_put(type->driver.owner); return 0; From a6c6ead14183ea4ec8ce7551e1f3451024b9c4db Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 19 Oct 2016 02:57:22 +0200 Subject: [PATCH 120/292] cpufreq: intel_pstate: Set P-state upfront in performance mode After commit a4675fbc4a7a (cpufreq: intel_pstate: Replace timers with utilization update callbacks) the cpufreq governor callbacks may not be invoked on NOHZ_FULL CPUs and, in particular, switching to the "performance" policy via sysfs may not have any effect on them. That is a problem, because it usually is desirable to squeeze the last bit of performance out of those CPUs, so work around it by setting the maximum P-state (within the limits) in intel_pstate_set_policy() upfront when the policy is CPUFREQ_POLICY_PERFORMANCE. Fixes: a4675fbc4a7a (cpufreq: intel_pstate: Replace timers with utilization update callbacks) Signed-off-by: Rafael J. Wysocki Acked-by: Srinivas Pandruvada --- drivers/cpufreq/intel_pstate.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index f535f8123258..ac7c58d20b58 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -1142,10 +1142,8 @@ static void intel_pstate_get_min_max(struct cpudata *cpu, int *min, int *max) *min = clamp_t(int, min_perf, cpu->pstate.min_pstate, max_perf); } -static void intel_pstate_set_min_pstate(struct cpudata *cpu) +static void intel_pstate_set_pstate(struct cpudata *cpu, int pstate) { - int pstate = cpu->pstate.min_pstate; - trace_cpu_frequency(pstate * cpu->pstate.scaling, cpu->cpu); cpu->pstate.current_pstate = pstate; /* @@ -1157,6 +1155,20 @@ static void intel_pstate_set_min_pstate(struct cpudata *cpu) pstate_funcs.get_val(cpu, pstate)); } +static void intel_pstate_set_min_pstate(struct cpudata *cpu) +{ + intel_pstate_set_pstate(cpu, cpu->pstate.min_pstate); +} + +static void intel_pstate_max_within_limits(struct cpudata *cpu) +{ + int min_pstate, max_pstate; + + update_turbo_state(); + intel_pstate_get_min_max(cpu, &min_pstate, &max_pstate); + intel_pstate_set_pstate(cpu, max_pstate); +} + static void intel_pstate_get_cpu_pstates(struct cpudata *cpu) { cpu->pstate.min_pstate = pstate_funcs.get_min(); @@ -1491,7 +1503,7 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) pr_debug("set_policy cpuinfo.max %u policy->max %u\n", policy->cpuinfo.max_freq, policy->max); - cpu = all_cpu_data[0]; + cpu = all_cpu_data[policy->cpu]; if (cpu->pstate.max_pstate_physical > cpu->pstate.max_pstate && policy->max < policy->cpuinfo.max_freq && policy->max > cpu->pstate.max_pstate * cpu->pstate.scaling) { @@ -1535,6 +1547,15 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) limits->max_perf = round_up(limits->max_perf, FRAC_BITS); out: + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + /* + * NOHZ_FULL CPUs need this as the governor callback may not + * be invoked on them. + */ + intel_pstate_clear_update_util_hook(policy->cpu); + intel_pstate_max_within_limits(cpu); + } + intel_pstate_set_update_util_hook(policy->cpu); intel_pstate_hwp_set_policy(policy); From 62623d5f918fb1c8ed86b03b9a86cc81f1cb1878 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Thu, 20 Oct 2016 13:32:55 +1100 Subject: [PATCH 121/292] KVM: PPC: Book3S HV: Fix build error when SMP=n Commit 5d375199ea96 ("KVM: PPC: Book3S HV: Set server for passed-through interrupts") broke the SMP=n build: arch/powerpc/kvm/book3s_hv_rm_xics.c:758:2: error: implicit declaration of function 'get_hard_smp_processor_id' That is because we lost the implicit include of asm/smp.h, so include it explicitly to get the definition for get_hard_smp_processor_id(). Fixes: 5d375199ea96 ("KVM: PPC: Book3S HV: Set server for passed-through interrupts") Signed-off-by: Michael Ellerman --- arch/powerpc/kvm/book3s_hv_rm_xics.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c b/arch/powerpc/kvm/book3s_hv_rm_xics.c index 82ff5de8b1e7..a0ea63ac2b52 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_xics.c +++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c @@ -23,6 +23,7 @@ #include #include #include +#include #include "book3s_xics.h" From 80f23935cadb1c654e81951f5a8b7ceae0acc1b4 Mon Sep 17 00:00:00 2001 From: Segher Boessenkool Date: Thu, 6 Oct 2016 13:42:19 +0000 Subject: [PATCH 122/292] powerpc: Convert cmp to cmpd in idle enter sequence PowerPC's "cmp" instruction has four operands. Normally people write "cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently people forget, and write "cmp" with just three operands. With older binutils this is silently accepted as if this was "cmpw", while often "cmpd" is wanted. With newer binutils GAS will complain about this for 64-bit code. For 32-bit code it still silently assumes "cmpw" is what is meant. In this instance the code comes directly from ISA v2.07, including the cmp, but cmpd is correct. Backport to stable so that new toolchains can build old kernels. Fixes: 948cf67c4726 ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0 Reviewed-by: Vaidyanathan Srinivasan Signed-off-by: Segher Boessenkool Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/cpuidle.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h index 01b8a13f0224..3919332965af 100644 --- a/arch/powerpc/include/asm/cpuidle.h +++ b/arch/powerpc/include/asm/cpuidle.h @@ -26,7 +26,7 @@ extern u64 pnv_first_deep_stop_state; std r0,0(r1); \ ptesync; \ ld r0,0(r1); \ -1: cmp cr0,r0,r0; \ +1: cmpd cr0,r0,r0; \ bne 1b; \ IDLE_INST; \ b . From f660f6066716b700148f60dba3461e65efff3123 Mon Sep 17 00:00:00 2001 From: Sagi Grimberg Date: Mon, 10 Oct 2016 15:10:51 +0300 Subject: [PATCH 123/292] softirq: Display IRQ_POLL for irq-poll statistics This library was moved to the generic area and was renamed to irq-poll. Hence, update proc/softirqs output accordingly. Signed-off-by: Sagi Grimberg Reviewed-by: Johannes Thumshirn Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe --- kernel/softirq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/softirq.c b/kernel/softirq.c index 1bf81ef91375..744fa611cae0 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -58,7 +58,7 @@ static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp DEFINE_PER_CPU(struct task_struct *, ksoftirqd); const char * const softirq_to_name[NR_SOFTIRQS] = { - "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL", + "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU" }; From b4a1278c78bc939b3e29c3ad21ceaa636b0ca8c8 Mon Sep 17 00:00:00 2001 From: Shaohua Li Date: Thu, 20 Oct 2016 14:40:06 -0700 Subject: [PATCH 124/292] badblocks: badblocks_set/clear update unacked_exist When bandblocks_set acknowledges a range or badblocks_clear a range, it's possible all badblocks are acknowledged. We should update unacked_exist if this occurs. Signed-off-by: Shaohua Li Reviewed-by: Tomasz Majchrzak Tested-by: Tomasz Majchrzak Signed-off-by: Jens Axboe --- block/badblocks.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/block/badblocks.c b/block/badblocks.c index 6610e282a03e..6ebcef282314 100644 --- a/block/badblocks.c +++ b/block/badblocks.c @@ -133,6 +133,26 @@ retry: } EXPORT_SYMBOL_GPL(badblocks_check); +static void badblocks_update_acked(struct badblocks *bb) +{ + u64 *p = bb->page; + int i; + bool unacked = false; + + if (!bb->unacked_exist) + return; + + for (i = 0; i < bb->count ; i++) { + if (!BB_ACK(p[i])) { + unacked = true; + break; + } + } + + if (!unacked) + bb->unacked_exist = 0; +} + /** * badblocks_set() - Add a range of bad blocks to the table. * @bb: the badblocks structure that holds all badblock information @@ -294,6 +314,8 @@ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, bb->changed = 1; if (!acknowledged) bb->unacked_exist = 1; + else + badblocks_update_acked(bb); write_sequnlock_irqrestore(&bb->lock, flags); return rv; @@ -401,6 +423,7 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors) } } + badblocks_update_acked(bb); bb->changed = 1; out: write_sequnlock_irq(&bb->lock); From bcf4f311e034dc661b5e641595ef1f50af27b5bf Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Tue, 20 Sep 2016 13:43:19 +0900 Subject: [PATCH 125/292] ARM: uniphier: select ARCH_HAS_RESET_CONTROLLER The UniPhier reset driver (drivers/reset/reset-uniphier.c) has been merged. Select ARCH_HAS_RESET_CONTROLLER from the SoC Kconfig. Signed-off-by: Masahiro Yamada Acked-by: Philipp Zabel --- arch/arm/mach-uniphier/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm/mach-uniphier/Kconfig b/arch/arm/mach-uniphier/Kconfig index 82dddee3a469..3930fbba30b4 100644 --- a/arch/arm/mach-uniphier/Kconfig +++ b/arch/arm/mach-uniphier/Kconfig @@ -1,6 +1,7 @@ config ARCH_UNIPHIER bool "Socionext UniPhier SoCs" depends on ARCH_MULTI_V7 + select ARCH_HAS_RESET_CONTROLLER select ARM_AMBA select ARM_GLOBAL_TIMER select ARM_GIC From 75924903c51d0697b989035f6baebccb2a7367cd Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Sat, 8 Oct 2016 11:25:34 +0900 Subject: [PATCH 126/292] arm64: uniphier: select ARCH_HAS_RESET_CONTROLLER The UniPhier reset driver (drivers/reset/reset-uniphier.c) has been merged. Select ARCH_HAS_RESET_CONTROLLER from the SoC Kconfig. Signed-off-by: Masahiro Yamada --- arch/arm64/Kconfig.platforms | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms index cfbdf02ef566..101794f5ce10 100644 --- a/arch/arm64/Kconfig.platforms +++ b/arch/arm64/Kconfig.platforms @@ -190,6 +190,7 @@ config ARCH_THUNDER config ARCH_UNIPHIER bool "Socionext UniPhier SoC Family" + select ARCH_HAS_RESET_CONTROLLER select PINCTRL help This enables support for Socionext UniPhier SoC family. From 19eb4a47224da4bc118f1e304e16b6c08ed172c9 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Wed, 19 Oct 2016 17:23:49 +0900 Subject: [PATCH 127/292] reset: uniphier: rename MIO reset to SD reset for Pro5, PXs2, LD20 SoCs I made a mistake as for naming for this block. The MIO block is not implemented for these 3 SoCs in the first place. The current naming will be a trouble if an SoC with both MIO and SD-ctrl blocks appear in the future. This driver has just been merged in the previous merge window. Rename it before the release. Signed-off-by: Masahiro Yamada Acked-by: Philipp Zabel --- .../bindings/reset/uniphier-reset.txt | 62 +++++++++---------- drivers/reset/reset-uniphier.c | 16 ++--- 2 files changed, 39 insertions(+), 39 deletions(-) diff --git a/Documentation/devicetree/bindings/reset/uniphier-reset.txt b/Documentation/devicetree/bindings/reset/uniphier-reset.txt index e6bbfccd56c3..5020524cddeb 100644 --- a/Documentation/devicetree/bindings/reset/uniphier-reset.txt +++ b/Documentation/devicetree/bindings/reset/uniphier-reset.txt @@ -6,25 +6,25 @@ System reset Required properties: - compatible: should be one of the following: - "socionext,uniphier-sld3-reset" - for PH1-sLD3 SoC. - "socionext,uniphier-ld4-reset" - for PH1-LD4 SoC. - "socionext,uniphier-pro4-reset" - for PH1-Pro4 SoC. - "socionext,uniphier-sld8-reset" - for PH1-sLD8 SoC. - "socionext,uniphier-pro5-reset" - for PH1-Pro5 SoC. - "socionext,uniphier-pxs2-reset" - for ProXstream2/PH1-LD6b SoC. - "socionext,uniphier-ld11-reset" - for PH1-LD11 SoC. - "socionext,uniphier-ld20-reset" - for PH1-LD20 SoC. + "socionext,uniphier-sld3-reset" - for sLD3 SoC. + "socionext,uniphier-ld4-reset" - for LD4 SoC. + "socionext,uniphier-pro4-reset" - for Pro4 SoC. + "socionext,uniphier-sld8-reset" - for sLD8 SoC. + "socionext,uniphier-pro5-reset" - for Pro5 SoC. + "socionext,uniphier-pxs2-reset" - for PXs2/LD6b SoC. + "socionext,uniphier-ld11-reset" - for LD11 SoC. + "socionext,uniphier-ld20-reset" - for LD20 SoC. - #reset-cells: should be 1. Example: sysctrl@61840000 { - compatible = "socionext,uniphier-ld20-sysctrl", + compatible = "socionext,uniphier-ld11-sysctrl", "simple-mfd", "syscon"; reg = <0x61840000 0x4000>; reset { - compatible = "socionext,uniphier-ld20-reset"; + compatible = "socionext,uniphier-ld11-reset"; #reset-cells = <1>; }; @@ -32,30 +32,30 @@ Example: }; -Media I/O (MIO) reset ---------------------- +Media I/O (MIO) reset, SD reset +------------------------------- Required properties: - compatible: should be one of the following: - "socionext,uniphier-sld3-mio-reset" - for PH1-sLD3 SoC. - "socionext,uniphier-ld4-mio-reset" - for PH1-LD4 SoC. - "socionext,uniphier-pro4-mio-reset" - for PH1-Pro4 SoC. - "socionext,uniphier-sld8-mio-reset" - for PH1-sLD8 SoC. - "socionext,uniphier-pro5-mio-reset" - for PH1-Pro5 SoC. - "socionext,uniphier-pxs2-mio-reset" - for ProXstream2/PH1-LD6b SoC. - "socionext,uniphier-ld11-mio-reset" - for PH1-LD11 SoC. - "socionext,uniphier-ld20-mio-reset" - for PH1-LD20 SoC. + "socionext,uniphier-sld3-mio-reset" - for sLD3 SoC. + "socionext,uniphier-ld4-mio-reset" - for LD4 SoC. + "socionext,uniphier-pro4-mio-reset" - for Pro4 SoC. + "socionext,uniphier-sld8-mio-reset" - for sLD8 SoC. + "socionext,uniphier-pro5-sd-reset" - for Pro5 SoC. + "socionext,uniphier-pxs2-sd-reset" - for PXs2/LD6b SoC. + "socionext,uniphier-ld11-mio-reset" - for LD11 SoC. + "socionext,uniphier-ld20-sd-reset" - for LD20 SoC. - #reset-cells: should be 1. Example: mioctrl@59810000 { - compatible = "socionext,uniphier-ld20-mioctrl", + compatible = "socionext,uniphier-ld11-mioctrl", "simple-mfd", "syscon"; reg = <0x59810000 0x800>; reset { - compatible = "socionext,uniphier-ld20-mio-reset"; + compatible = "socionext,uniphier-ld11-mio-reset"; #reset-cells = <1>; }; @@ -68,24 +68,24 @@ Peripheral reset Required properties: - compatible: should be one of the following: - "socionext,uniphier-ld4-peri-reset" - for PH1-LD4 SoC. - "socionext,uniphier-pro4-peri-reset" - for PH1-Pro4 SoC. - "socionext,uniphier-sld8-peri-reset" - for PH1-sLD8 SoC. - "socionext,uniphier-pro5-peri-reset" - for PH1-Pro5 SoC. - "socionext,uniphier-pxs2-peri-reset" - for ProXstream2/PH1-LD6b SoC. - "socionext,uniphier-ld11-peri-reset" - for PH1-LD11 SoC. - "socionext,uniphier-ld20-peri-reset" - for PH1-LD20 SoC. + "socionext,uniphier-ld4-peri-reset" - for LD4 SoC. + "socionext,uniphier-pro4-peri-reset" - for Pro4 SoC. + "socionext,uniphier-sld8-peri-reset" - for sLD8 SoC. + "socionext,uniphier-pro5-peri-reset" - for Pro5 SoC. + "socionext,uniphier-pxs2-peri-reset" - for PXs2/LD6b SoC. + "socionext,uniphier-ld11-peri-reset" - for LD11 SoC. + "socionext,uniphier-ld20-peri-reset" - for LD20 SoC. - #reset-cells: should be 1. Example: perictrl@59820000 { - compatible = "socionext,uniphier-ld20-perictrl", + compatible = "socionext,uniphier-ld11-perictrl", "simple-mfd", "syscon"; reg = <0x59820000 0x200>; reset { - compatible = "socionext,uniphier-ld20-peri-reset"; + compatible = "socionext,uniphier-ld11-peri-reset"; #reset-cells = <1>; }; diff --git a/drivers/reset/reset-uniphier.c b/drivers/reset/reset-uniphier.c index 8b2558e7363e..968c3ae4535c 100644 --- a/drivers/reset/reset-uniphier.c +++ b/drivers/reset/reset-uniphier.c @@ -154,7 +154,7 @@ const struct uniphier_reset_data uniphier_sld3_mio_reset_data[] = { UNIPHIER_RESET_END, }; -const struct uniphier_reset_data uniphier_pro5_mio_reset_data[] = { +const struct uniphier_reset_data uniphier_pro5_sd_reset_data[] = { UNIPHIER_MIO_RESET_SD(0, 0), UNIPHIER_MIO_RESET_SD(1, 1), UNIPHIER_MIO_RESET_EMMC_HW_RESET(6, 1), @@ -360,7 +360,7 @@ static const struct of_device_id uniphier_reset_match[] = { .compatible = "socionext,uniphier-ld20-reset", .data = uniphier_ld20_sys_reset_data, }, - /* Media I/O reset */ + /* Media I/O reset, SD reset */ { .compatible = "socionext,uniphier-sld3-mio-reset", .data = uniphier_sld3_mio_reset_data, @@ -378,20 +378,20 @@ static const struct of_device_id uniphier_reset_match[] = { .data = uniphier_sld3_mio_reset_data, }, { - .compatible = "socionext,uniphier-pro5-mio-reset", - .data = uniphier_pro5_mio_reset_data, + .compatible = "socionext,uniphier-pro5-sd-reset", + .data = uniphier_pro5_sd_reset_data, }, { - .compatible = "socionext,uniphier-pxs2-mio-reset", - .data = uniphier_pro5_mio_reset_data, + .compatible = "socionext,uniphier-pxs2-sd-reset", + .data = uniphier_pro5_sd_reset_data, }, { .compatible = "socionext,uniphier-ld11-mio-reset", .data = uniphier_sld3_mio_reset_data, }, { - .compatible = "socionext,uniphier-ld20-mio-reset", - .data = uniphier_pro5_mio_reset_data, + .compatible = "socionext,uniphier-ld20-sd-reset", + .data = uniphier_pro5_sd_reset_data, }, /* Peripheral reset */ { From 1bdb60ef18655596d56b9b7268ad0bf5214e00e4 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Fri, 21 Oct 2016 17:27:57 +0900 Subject: [PATCH 128/292] ARM: dts: uniphier: change MIO node to SD control node I made a mistake bacuse the Media I/O block is not implemented in these SoCs. Signed-off-by: Masahiro Yamada --- arch/arm/boot/dts/uniphier-pro5.dtsi | 4 ++-- arch/arm/boot/dts/uniphier-pxs2.dtsi | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm/boot/dts/uniphier-pro5.dtsi b/arch/arm/boot/dts/uniphier-pro5.dtsi index 2c49c3614bda..5357ea9c14b1 100644 --- a/arch/arm/boot/dts/uniphier-pro5.dtsi +++ b/arch/arm/boot/dts/uniphier-pro5.dtsi @@ -184,11 +184,11 @@ }; &mio_clk { - compatible = "socionext,uniphier-pro5-mio-clock"; + compatible = "socionext,uniphier-pro5-sd-clock"; }; &mio_rst { - compatible = "socionext,uniphier-pro5-mio-reset"; + compatible = "socionext,uniphier-pro5-sd-reset"; }; &peri_clk { diff --git a/arch/arm/boot/dts/uniphier-pxs2.dtsi b/arch/arm/boot/dts/uniphier-pxs2.dtsi index 8789cd518933..950f07ba0337 100644 --- a/arch/arm/boot/dts/uniphier-pxs2.dtsi +++ b/arch/arm/boot/dts/uniphier-pxs2.dtsi @@ -197,11 +197,11 @@ }; &mio_clk { - compatible = "socionext,uniphier-pxs2-mio-clock"; + compatible = "socionext,uniphier-pxs2-sd-clock"; }; &mio_rst { - compatible = "socionext,uniphier-pxs2-mio-reset"; + compatible = "socionext,uniphier-pxs2-sd-reset"; }; &peri_clk { From 8e68c65d111a57a4cbe41dc886bb2a1e671e0b6e Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Fri, 21 Oct 2016 16:45:21 +0900 Subject: [PATCH 129/292] arm64: dts: uniphier: change MIO node to SD control node I made a mistake bacuse the Media I/O block is not implemented in this SoC. Signed-off-by: Masahiro Yamada --- arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi index 08fd7cf7769c..56a1b2e92cf3 100644 --- a/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi +++ b/arch/arm64/boot/dts/socionext/uniphier-ld20.dtsi @@ -257,18 +257,18 @@ reg = <0x59801000 0x400>; }; - mioctrl@59810000 { - compatible = "socionext,uniphier-mioctrl", + sdctrl@59810000 { + compatible = "socionext,uniphier-ld20-sdctrl", "simple-mfd", "syscon"; reg = <0x59810000 0x800>; - mio_clk: clock { - compatible = "socionext,uniphier-ld20-mio-clock"; + sd_clk: clock { + compatible = "socionext,uniphier-ld20-sd-clock"; #clock-cells = <1>; }; - mio_rst: reset { - compatible = "socionext,uniphier-ld20-mio-reset"; + sd_rst: reset { + compatible = "socionext,uniphier-ld20-sd-reset"; #reset-cells = <1>; }; }; From d1fe85ec7702917f2f1515b4c421d5d4792201a0 Mon Sep 17 00:00:00 2001 From: Sandhya Bankar Date: Sun, 25 Sep 2016 00:46:21 +0530 Subject: [PATCH 130/292] iio:chemical:atlas-ph-sensor: Fix use of 32 bit int to hold 16 bit big endian value This will result in a random value being reported on big endian architectures. (thanks to Lars-Peter Clausen for pointing out the effects of this bug) Only effects a value printed to the log, but as this reports the settings of the probe in question it may be of direct interest to users. Also, fixes the following sparse endianness warnings: drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16 Signed-off-by: Sandhya Bankar Fixes: e8dd92bfbff25 ("iio: chemical: atlas-ph-sensor: add EC feature") Cc: Signed-off-by: Jonathan Cameron --- drivers/iio/chemical/atlas-ph-sensor.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/iio/chemical/atlas-ph-sensor.c b/drivers/iio/chemical/atlas-ph-sensor.c index bd321b305a0a..ef761a508630 100644 --- a/drivers/iio/chemical/atlas-ph-sensor.c +++ b/drivers/iio/chemical/atlas-ph-sensor.c @@ -213,13 +213,14 @@ static int atlas_check_ec_calibration(struct atlas_data *data) struct device *dev = &data->client->dev; int ret; unsigned int val; + __be16 rval; - ret = regmap_bulk_read(data->regmap, ATLAS_REG_EC_PROBE, &val, 2); + ret = regmap_bulk_read(data->regmap, ATLAS_REG_EC_PROBE, &rval, 2); if (ret) return ret; - dev_info(dev, "probe set to K = %d.%.2d", be16_to_cpu(val) / 100, - be16_to_cpu(val) % 100); + val = be16_to_cpu(rval); + dev_info(dev, "probe set to K = %d.%.2d", val / 100, val % 100); ret = regmap_read(data->regmap, ATLAS_REG_EC_CALIB_STATUS, &val); if (ret) From 64bc2d02d754f4143d65cc21c644176db12ab5c8 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 18 Oct 2016 00:13:35 +0200 Subject: [PATCH 131/292] iio: accel: sca3000_core: avoid potentially uninitialized variable The newly added __sca3000_get_base_freq function handles all valid modes of the SCA3000_REG_ADDR_MODE register, but gcc notices that any other value (i.e. 0x00) causes the base_freq variable to not get initialized: drivers/staging/iio/accel/sca3000_core.c: In function 'sca3000_write_raw': drivers/staging/iio/accel/sca3000_core.c:527:23: error: 'base_freq' may be used uninitialized in this function [-Werror=maybe-uninitialized] This adds explicit error handling for unexpected register values, to ensure this cannot happen. Fixes: e0f3fc9b47e6 ("iio: accel: sca3000_core: implemented IIO_CHAN_INFO_SAMP_FREQ") Signed-off-by: Arnd Bergmann Cc: Ico Doornekamp Cc: Jonathan Cameron Signed-off-by: Jonathan Cameron --- drivers/staging/iio/accel/sca3000_core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/staging/iio/accel/sca3000_core.c b/drivers/staging/iio/accel/sca3000_core.c index d626125d7af9..564b36d4f648 100644 --- a/drivers/staging/iio/accel/sca3000_core.c +++ b/drivers/staging/iio/accel/sca3000_core.c @@ -468,6 +468,8 @@ static inline int __sca3000_get_base_freq(struct sca3000_state *st, case SCA3000_MEAS_MODE_OP_2: *base_freq = info->option_mode_2_freq; break; + default: + ret = -EINVAL; } error_ret: return ret; From 963d790468a2f581abf039b45edac79af5e16e55 Mon Sep 17 00:00:00 2001 From: Ray Jui Date: Wed, 20 Jul 2016 14:53:51 -0700 Subject: [PATCH 132/292] arm64: dts: Updated NAND DT properties for NS2 SVK This patch adds NAND DT properties for NS2 SVK to configure the bus width width and OOB sector size Signed-off-by: Prafulla Kota Signed-off-by: Ray Jui Signed-off-by: Florian Fainelli --- arch/arm64/boot/dts/broadcom/ns2-svk.dts | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/broadcom/ns2-svk.dts b/arch/arm64/boot/dts/broadcom/ns2-svk.dts index 2d7872a36b91..b09f3bc5c6c1 100644 --- a/arch/arm64/boot/dts/broadcom/ns2-svk.dts +++ b/arch/arm64/boot/dts/broadcom/ns2-svk.dts @@ -164,6 +164,8 @@ nand-ecc-mode = "hw"; nand-ecc-strength = <8>; nand-ecc-step-size = <512>; + nand-bus-width = <16>; + brcm,nand-oob-sector-size = <16>; #address-cells = <1>; #size-cells = <1>; }; From 6d8d271eee0eaa3c3ffd6db29a825e02316359d4 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Tue, 18 Oct 2016 17:44:01 -0300 Subject: [PATCH 133/292] gpio: ath79: Fix module autoload If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Before this patch: $ modinfo drivers/gpio/gpio-ath79.ko | grep alias $ After this patch: $ modinfo drivers/gpio/gpio-ath79.ko | grep alias alias: of:N*T*Cqca,ar9340-gpioC* alias: of:N*T*Cqca,ar9340-gpio alias: of:N*T*Cqca,ar7100-gpioC* alias: of:N*T*Cqca,ar7100-gpio Signed-off-by: Javier Martinez Canillas Acked-by: Aban Bedel Signed-off-by: Linus Walleij --- drivers/gpio/gpio-ath79.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpio/gpio-ath79.c b/drivers/gpio/gpio-ath79.c index 9457e2022bf6..dc37dbe4b46d 100644 --- a/drivers/gpio/gpio-ath79.c +++ b/drivers/gpio/gpio-ath79.c @@ -219,6 +219,7 @@ static const struct of_device_id ath79_gpio_of_match[] = { { .compatible = "qca,ar9340-gpio" }, {}, }; +MODULE_DEVICE_TABLE(of, ath79_gpio_of_match); static int ath79_gpio_probe(struct platform_device *pdev) { From d71cf15b865bdd45925f7b094d169aaabd705145 Mon Sep 17 00:00:00 2001 From: Liu Gang Date: Fri, 21 Oct 2016 15:31:28 +0800 Subject: [PATCH 134/292] gpio: mpc8xxx: Correct irq handler function From the beginning of the gpio-mpc8xxx.c, the "handle_level_irq" has being used to handle GPIO interrupts in the PowerPC/Layerscape platforms. But actually, almost all PowerPC/Layerscape platforms assert an interrupt request upon either a high-to-low change or any change on the state of the signal. So the "handle_level_irq" is not reasonable for PowerPC/Layerscape GPIO interrupt, it should be "handle_edge_irq". Otherwise the system may lost some interrupts from the PIN's state changes. Signed-off-by: Liu Gang Signed-off-by: Linus Walleij --- drivers/gpio/gpio-mpc8xxx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c index 425501c39527..793518a30afe 100644 --- a/drivers/gpio/gpio-mpc8xxx.c +++ b/drivers/gpio/gpio-mpc8xxx.c @@ -239,7 +239,7 @@ static int mpc8xxx_gpio_irq_map(struct irq_domain *h, unsigned int irq, irq_hw_number_t hwirq) { irq_set_chip_data(irq, h->host_data); - irq_set_chip_and_handler(irq, &mpc8xxx_irq_chip, handle_level_irq); + irq_set_chip_and_handler(irq, &mpc8xxx_irq_chip, handle_edge_irq); return 0; } From a05b82d5149dfeef05254a11c3636a89a854520a Mon Sep 17 00:00:00 2001 From: Vaibhav Jain Date: Fri, 21 Oct 2016 14:53:53 +0530 Subject: [PATCH 135/292] cxl: Fix leaking pid refs in some error paths In some error paths in functions cxl_start_context and afu_ioctl_start_work pid references to the current & group-leader tasks can leak after they are taken. This patch fixes these error paths to release these pid references before exiting the error path. Fixes: 7b8ad495d592 ("cxl: Fix DSI misses when the context owning task exits") Cc: stable@vger.kernel.org # v4.5+ Reviewed-by: Andrew Donnellan Reported-by: Frederic Barrat Signed-off-by: Vaibhav Jain Acked-by: Frederic Barrat Signed-off-by: Michael Ellerman --- drivers/misc/cxl/api.c | 2 ++ drivers/misc/cxl/file.c | 22 +++++++++++++--------- 2 files changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c index af23d7dfe752..2e5233b60971 100644 --- a/drivers/misc/cxl/api.c +++ b/drivers/misc/cxl/api.c @@ -247,7 +247,9 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed, cxl_ctx_get(); if ((rc = cxl_ops->attach_process(ctx, kernel, wed, 0))) { + put_pid(ctx->glpid); put_pid(ctx->pid); + ctx->glpid = ctx->pid = NULL; cxl_adapter_context_put(ctx->afu->adapter); cxl_ctx_put(); goto out; diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c index d0b421f49b39..77080cc5fa0a 100644 --- a/drivers/misc/cxl/file.c +++ b/drivers/misc/cxl/file.c @@ -193,6 +193,16 @@ static long afu_ioctl_start_work(struct cxl_context *ctx, ctx->mmio_err_ff = !!(work.flags & CXL_START_WORK_ERR_FF); + /* + * Increment the mapped context count for adapter. This also checks + * if adapter_context_lock is taken. + */ + rc = cxl_adapter_context_get(ctx->afu->adapter); + if (rc) { + afu_release_irqs(ctx, ctx); + goto out; + } + /* * We grab the PID here and not in the file open to allow for the case * where a process (master, some daemon, etc) has opened the chardev on @@ -205,15 +215,6 @@ static long afu_ioctl_start_work(struct cxl_context *ctx, ctx->pid = get_task_pid(current, PIDTYPE_PID); ctx->glpid = get_task_pid(current->group_leader, PIDTYPE_PID); - /* - * Increment the mapped context count for adapter. This also checks - * if adapter_context_lock is taken. - */ - rc = cxl_adapter_context_get(ctx->afu->adapter); - if (rc) { - afu_release_irqs(ctx, ctx); - goto out; - } trace_cxl_attach(ctx, work.work_element_descriptor, work.num_interrupts, amr); @@ -221,6 +222,9 @@ static long afu_ioctl_start_work(struct cxl_context *ctx, amr))) { afu_release_irqs(ctx, ctx); cxl_adapter_context_put(ctx->afu->adapter); + put_pid(ctx->glpid); + put_pid(ctx->pid); + ctx->glpid = ctx->pid = NULL; goto out; } From c663e29f8885269c1608b5aa6057729fa9267b35 Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Mon, 24 Oct 2016 14:20:25 +1100 Subject: [PATCH 136/292] fs: Do to trim high file position bits in iomap_page_mkwrite_actor iomap_page_mkwrite_actor() calls __block_write_begin_int() with position masked as pos & ~PAGE_MASK which is equivalent to pos & (PAGE_SIZE-1). Thus it masks off high bits of file position. However __block_write_begin_int() expects full file position on input. This does not cause any visible issues because all __block_write_begin_int() really cares about are low file position bits but still it is a bug waiting to happen. Signed-off-by: Jan Kara Reviewed-by: Christoph Hellwig Signed-off-by: Dave Chinner --- fs/iomap.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/iomap.c b/fs/iomap.c index a92204012e2d..a8ee8c33ca78 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -433,8 +433,7 @@ iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length, struct page *page = data; int ret; - ret = __block_write_begin_int(page, pos & ~PAGE_MASK, length, - NULL, iomap); + ret = __block_write_begin_int(page, pos, length, NULL, iomap); if (ret) return ret; From 7b7381f043568224af798b1decb607dca97b4114 Mon Sep 17 00:00:00 2001 From: Brian Foster Date: Mon, 24 Oct 2016 14:21:00 +1100 Subject: [PATCH 137/292] xfs: fix up inode cowblocks tracking tracepoints These calls are still using the eofblocks tracepoints. The cowblocks equivalents are already defined, we just aren't actually calling them. Signed-off-by: Brian Foster Reviewed-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_icache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 14796b744e0a..f295049db681 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -1656,9 +1656,9 @@ void xfs_inode_set_cowblocks_tag( xfs_inode_t *ip) { - trace_xfs_inode_set_eofblocks_tag(ip); + trace_xfs_inode_set_cowblocks_tag(ip); return __xfs_inode_set_eofblocks_tag(ip, xfs_queue_cowblocks, - trace_xfs_perag_set_eofblocks, + trace_xfs_perag_set_cowblocks, XFS_ICI_COWBLOCKS_TAG); } @@ -1666,7 +1666,7 @@ void xfs_inode_clear_cowblocks_tag( xfs_inode_t *ip) { - trace_xfs_inode_clear_eofblocks_tag(ip); + trace_xfs_inode_clear_cowblocks_tag(ip); return __xfs_inode_clear_eofblocks_tag(ip, - trace_xfs_perag_clear_eofblocks, XFS_ICI_COWBLOCKS_TAG); + trace_xfs_perag_clear_cowblocks, XFS_ICI_COWBLOCKS_TAG); } From c17a8ef43d6b80ed3519b828c37d18645445949f Mon Sep 17 00:00:00 2001 From: Brian Foster Date: Mon, 24 Oct 2016 14:21:08 +1100 Subject: [PATCH 138/292] xfs: clear cowblocks tag when cow fork is emptied The background cowblocks scan job takes care of scanning for inodes with potentially lingering blocks in the cow fork and clearing them out. If the background scanner reclaims the cow fork blocks, however, it doesn't immediately clear the cowblocks tag from the inode. Instead, the inode remains tagged until the background scanner comes around again, discovers the inode cow fork has no blocks, clears the tag and fires the trace_xfs_inode_free_cowblocks_invalid() tracepoint to indicate that the inode may have been incorrectly tagged. This is not a major functional problem as the tag is ultimately cleared. Nonetheless, clear the tag when an inode cow fork is explicitly emptied to avoid the extra round trip through the background scanner and spurious "invalid" tracepoint. Signed-off-by: Brian Foster Reviewed-by: Christoph Hellwig Reviewed-by: Darrick J. Wong Signed-off-by: Dave Chinner --- fs/xfs/xfs_reflink.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index cd308f119e20..a279b4e7f5fe 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -567,10 +567,14 @@ xfs_reflink_cancel_cow_blocks( } if (++idx >= ifp->if_bytes / sizeof(struct xfs_bmbt_rec)) - return 0; + break; xfs_bmbt_get_all(xfs_iext_get_ext(ifp, idx), &got); } + /* clear tag if cow fork is emptied */ + if (!ifp->if_bytes) + xfs_inode_clear_cowblocks_tag(ip); + return error; } From eef0b282bb586259d35548851cf6a4ce847bb804 Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Sat, 22 Oct 2016 10:20:55 -0200 Subject: [PATCH 139/292] ARM: imx: gpc: Initialize all power domains Since commit 0159ec670763dd ("PM / Domains: Verify the PM domain is present when adding a provider") the following regression is observed on imx6: imx-gpc: probe of 20dc000.gpc failed with error -22 The gpc probe fails because of_genpd_add_provider_onecell() now checks if all the domains are initialized via pm_genpd_present() function and it fails because not all the power domains are initialized. In order to fix this error, initialize all the power domains from imx_gpc_domains[], not only the imx6q_pu_domain.base one. Reported-by: Olof's autobooter Signed-off-by: Fabio Estevam Signed-off-by: Shawn Guo --- arch/arm/mach-imx/gpc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm/mach-imx/gpc.c b/arch/arm/mach-imx/gpc.c index 0df062d8b2c9..d0463e9d8a08 100644 --- a/arch/arm/mach-imx/gpc.c +++ b/arch/arm/mach-imx/gpc.c @@ -430,7 +430,8 @@ static int imx_gpc_genpd_init(struct device *dev, struct regulator *pu_reg) if (!IS_ENABLED(CONFIG_PM_GENERIC_DOMAINS)) return 0; - pm_genpd_init(&imx6q_pu_domain.base, NULL, false); + for (i = 0; i < ARRAY_SIZE(imx_gpc_domains); i++) + pm_genpd_init(imx_gpc_domains[i], NULL, false); return of_genpd_add_provider_onecell(dev->of_node, &imx_gpc_onecell_data); From f9d1f7a7ad919c93dfb708aae6e19d33c5437443 Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Sat, 22 Oct 2016 10:20:56 -0200 Subject: [PATCH 140/292] ARM: imx: gpc: Fix the imx_gpc_genpd_init() error path If of_genpd_add_provider_onecell() fails the following kernel crash is observed on a kernel built with multi_v7_defconfig: [ 1.739301] [00000040] *pgd=00000000 [ 1.739310] Internal error: Oops: 5 [#1] SMP ARM [ 1.739319] Modules linked in: [ 1.739328] CPU: 1 PID: 95 Comm: kworker/1:4 Not tainted 4.8.0-11897-g6b5e09a #1 [ 1.739331] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 1.739352] Workqueue: pm genpd_power_off_work_fn [ 1.739356] task: ee63d400 task.stack: ee70a000 [ 1.739365] PC is at mutex_lock+0xc/0x4c [ 1.739374] LR is at regulator_disable+0x2c/0x60 [ 1.739379] pc : [] lr : [] psr: 60000013 [ 1.739379] sp : ee70beb0 ip : 10624dd3 fp : ee6e6280 [ 1.739382] r10: eefb0900 r9 : 00000000 r8 : c1309918 [ 1.739385] r7 : 00000000 r6 : 00000040 r5 : 00000000 r4 : 00000040 [ 1.739390] r3 : 0000004c r2 : 7fffd540 r1 : 000001e4 r0 : 00000040 Instead of returning of_genpd_add_provider_onecell() directly, we should check its return value and in the case of error we should unwind the previously taken actions, which in these case are: - Call imx6q_pm_pu_power_off() - Set imx6q_pu_domain.reg back to NULL Setting imx6q_pu_domain.reg to NULL in the error case is important as it will prevent further operations in the pu_reg regulator. This kernel crash is not observed with imx_v6_v7_defconfig because it selects GPU and VPU drivers, which are consumers of the GPC block and thus change the refcount of the pu_reg regulator. Signed-off-by: Fabio Estevam Signed-off-by: Shawn Guo --- arch/arm/mach-imx/gpc.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/arch/arm/mach-imx/gpc.c b/arch/arm/mach-imx/gpc.c index d0463e9d8a08..b54db47f6f32 100644 --- a/arch/arm/mach-imx/gpc.c +++ b/arch/arm/mach-imx/gpc.c @@ -408,7 +408,7 @@ static struct genpd_onecell_data imx_gpc_onecell_data = { static int imx_gpc_genpd_init(struct device *dev, struct regulator *pu_reg) { struct clk *clk; - int i; + int i, ret; imx6q_pu_domain.reg = pu_reg; @@ -432,12 +432,20 @@ static int imx_gpc_genpd_init(struct device *dev, struct regulator *pu_reg) for (i = 0; i < ARRAY_SIZE(imx_gpc_domains); i++) pm_genpd_init(imx_gpc_domains[i], NULL, false); - return of_genpd_add_provider_onecell(dev->of_node, - &imx_gpc_onecell_data); + ret = of_genpd_add_provider_onecell(dev->of_node, + &imx_gpc_onecell_data); + if (ret) + goto power_off; + + return 0; + +power_off: + imx6q_pm_pu_power_off(&imx6q_pu_domain.base); clk_err: while (i--) clk_put(imx6q_pu_domain.clk[i]); + imx6q_pu_domain.reg = NULL; return -EINVAL; } From 47ece7fef4e4206cdcee7c28ac3bca3ede0a1908 Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Wed, 19 Oct 2016 13:42:55 +0200 Subject: [PATCH 141/292] s390/dumpstack: use pr_cont within show_stack and die Use pr_cont instead of printk calls also within show_stack and die in order to avoid extra line breaks. Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky --- arch/s390/kernel/dumpstack.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c index 34345c0a3c46..55d4fe174fd9 100644 --- a/arch/s390/kernel/dumpstack.c +++ b/arch/s390/kernel/dumpstack.c @@ -119,14 +119,14 @@ void show_stack(struct task_struct *task, unsigned long *sp) else stack = (unsigned long *)task->thread.ksp; } + printk(KERN_DEFAULT "Stack:\n"); for (i = 0; i < 20; i++) { if (((addr_t) stack & (THREAD_SIZE-1)) == 0) break; - if ((i * sizeof(long) % 32) == 0) - printk("%s ", i == 0 ? "" : "\n"); - printk("%016lx ", *stack++); + if (i % 4 == 0) + printk(KERN_DEFAULT " "); + pr_cont("%016lx%c", *stack++, i % 4 == 3 ? '\n' : ' '); } - printk("\n"); show_trace(task, (unsigned long)sp); } @@ -186,14 +186,14 @@ void die(struct pt_regs *regs, const char *str) printk("%s: %04x ilc:%d [#%d] ", str, regs->int_code & 0xffff, regs->int_code >> 17, ++die_counter); #ifdef CONFIG_PREEMPT - printk("PREEMPT "); + pr_cont("PREEMPT "); #endif #ifdef CONFIG_SMP - printk("SMP "); + pr_cont("SMP "); #endif if (debug_pagealloc_enabled()) - printk("DEBUG_PAGEALLOC"); - printk("\n"); + pr_cont("DEBUG_PAGEALLOC"); + pr_cont("\n"); notify_die(DIE_OOPS, str, regs, 0, regs->int_code & 0xffff, SIGSEGV); print_modules(); show_regs(regs); From 4a65429457a5d271dd3b00598b3ec75fe8b5103c Mon Sep 17 00:00:00 2001 From: Gerald Schaefer Date: Tue, 18 Oct 2016 17:32:18 +0200 Subject: [PATCH 142/292] s390/mm: fix zone calculation in arch_add_memory() Standby (hotplug) memory should be added to ZONE_MOVABLE on s390. After commit 199071f1 "s390/mm: make arch_add_memory() NUMA aware", arch_add_memory() used memblock_end_of_DRAM() to find out the end of ZONE_NORMAL and the beginning of ZONE_MOVABLE. However, commit 7f36e3e5 "memory-hotplug: add hot-added memory ranges to memblock before allocate node_data for a node." moved the call of memblock_add_node() before the call of arch_add_memory() in add_memory_resource(), and thus changed the return value of memblock_end_of_DRAM() when called in arch_add_memory(). As a result, arch_add_memory() will think that all memory blocks should be added to ZONE_NORMAL. Fix this by changing the logic in arch_add_memory() so that it will manually iterate over all zones of a given node to find out which zone a memory block should be added to. Reviewed-by: Heiko Carstens Signed-off-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky --- arch/s390/mm/init.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index f56a39bd8ba6..b3e9d18f2ec6 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -151,36 +151,40 @@ void __init free_initrd_mem(unsigned long start, unsigned long end) #ifdef CONFIG_MEMORY_HOTPLUG int arch_add_memory(int nid, u64 start, u64 size, bool for_device) { - unsigned long normal_end_pfn = PFN_DOWN(memblock_end_of_DRAM()); - unsigned long dma_end_pfn = PFN_DOWN(MAX_DMA_ADDRESS); + unsigned long zone_start_pfn, zone_end_pfn, nr_pages; unsigned long start_pfn = PFN_DOWN(start); unsigned long size_pages = PFN_DOWN(size); - unsigned long nr_pages; - int rc, zone_enum; + pg_data_t *pgdat = NODE_DATA(nid); + struct zone *zone; + int rc, i; rc = vmem_add_mapping(start, size); if (rc) return rc; - while (size_pages > 0) { - if (start_pfn < dma_end_pfn) { - nr_pages = (start_pfn + size_pages > dma_end_pfn) ? - dma_end_pfn - start_pfn : size_pages; - zone_enum = ZONE_DMA; - } else if (start_pfn < normal_end_pfn) { - nr_pages = (start_pfn + size_pages > normal_end_pfn) ? - normal_end_pfn - start_pfn : size_pages; - zone_enum = ZONE_NORMAL; + for (i = 0; i < MAX_NR_ZONES; i++) { + zone = pgdat->node_zones + i; + if (zone_idx(zone) != ZONE_MOVABLE) { + /* Add range within existing zone limits, if possible */ + zone_start_pfn = zone->zone_start_pfn; + zone_end_pfn = zone->zone_start_pfn + + zone->spanned_pages; } else { - nr_pages = size_pages; - zone_enum = ZONE_MOVABLE; + /* Add remaining range to ZONE_MOVABLE */ + zone_start_pfn = start_pfn; + zone_end_pfn = start_pfn + size_pages; } - rc = __add_pages(nid, NODE_DATA(nid)->node_zones + zone_enum, - start_pfn, size_pages); + if (start_pfn < zone_start_pfn || start_pfn >= zone_end_pfn) + continue; + nr_pages = (start_pfn + size_pages > zone_end_pfn) ? + zone_end_pfn - start_pfn : size_pages; + rc = __add_pages(nid, zone, start_pfn, nr_pages); if (rc) break; start_pfn += nr_pages; size_pages -= nr_pages; + if (!size_pages) + break; } if (rc) vmem_remove_mapping(start, size); From 56c46222af0d09149fadec2a3ce9d4889de01cc6 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Fri, 21 Oct 2016 20:03:05 +1100 Subject: [PATCH 143/292] powerpc/64: Re-fix race condition between going idle and entering guest Commit 8117ac6a6c2f ("powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one thread entering a KVM guest could switch the MMU context to the guest while another thread was still in host kernel context with the MMU on. That commit moved the point where a thread entering a power-saving mode set its kvm_hstate.hwthread_state field in its PACA to KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the MMU had been switched off. That commit also added a comment explaining that we have to switch to real mode before setting hwthread_state to avoid this race. Nevertheless, commit 4eae2c9ae54a ("powerpc/powernv: Make pnv_powersave_common more generic", 2016-07-08) subsequently moved the setting of hwthread_state back to a point where the MMU is on, thus reintroducing the race, despite the comment saying that this should not be done being included in full in the context lines of the patch that did it. This fixes the race again and adds a bigger and shoutier comment explaining the potential race condition. Fixes: 4eae2c9ae54a ("powerpc/powernv: Make pnv_powersave_common more generic") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Paul Mackerras Reviewed-by: Shreyas B. Prabhu Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/idle_book3s.S | 32 +++++++++++++++++++++++++------ 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S index bd739fed26e3..0d8712a835d2 100644 --- a/arch/powerpc/kernel/idle_book3s.S +++ b/arch/powerpc/kernel/idle_book3s.S @@ -163,12 +163,6 @@ _GLOBAL(pnv_powersave_common) std r9,_MSR(r1) std r1,PACAR1(r13) -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE - /* Tell KVM we're entering idle */ - li r4,KVM_HWTHREAD_IN_IDLE - stb r4,HSTATE_HWTHREAD_STATE(r13) -#endif - /* * Go to real mode to do the nap, as required by the architecture. * Also, we need to be in real mode before setting hwthread_state, @@ -185,6 +179,26 @@ _GLOBAL(pnv_powersave_common) .globl pnv_enter_arch207_idle_mode pnv_enter_arch207_idle_mode: +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + /* Tell KVM we're entering idle */ + li r4,KVM_HWTHREAD_IN_IDLE + /******************************************************/ + /* N O T E W E L L ! ! ! N O T E W E L L */ + /* The following store to HSTATE_HWTHREAD_STATE(r13) */ + /* MUST occur in real mode, i.e. with the MMU off, */ + /* and the MMU must stay off until we clear this flag */ + /* and test HSTATE_HWTHREAD_REQ(r13) in the system */ + /* reset interrupt vector in exceptions-64s.S. */ + /* The reason is that another thread can switch the */ + /* MMU to a guest context whenever this flag is set */ + /* to KVM_HWTHREAD_IN_IDLE, and if the MMU was on, */ + /* that would potentially cause this thread to start */ + /* executing instructions from guest memory in */ + /* hypervisor mode, leading to a host crash or data */ + /* corruption, or worse. */ + /******************************************************/ + stb r4,HSTATE_HWTHREAD_STATE(r13) +#endif stb r3,PACA_THREAD_IDLE_STATE(r13) cmpwi cr3,r3,PNV_THREAD_SLEEP bge cr3,2f @@ -250,6 +264,12 @@ enter_winkle: * r3 - requested stop state */ power_enter_stop: +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE + /* Tell KVM we're entering idle */ + li r4,KVM_HWTHREAD_IN_IDLE + /* DO THIS IN REAL MODE! See comment above. */ + stb r4,HSTATE_HWTHREAD_STATE(r13) +#endif /* * Check if the requested state is a deep idle state. */ From 09b7e37b18eecc1e347f4b1a3bc863f32801f634 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Fri, 21 Oct 2016 20:04:17 +1100 Subject: [PATCH 144/292] powerpc/64: Fix race condition in setting lock bit in idle/wakeup code This fixes a race condition where one thread that is entering or leaving a power-saving state can inadvertently ignore the lock bit that was set by another thread, and potentially also clear it. The core_idle_lock_held function is called when the lock bit is seen to be set. It polls the lock bit until it is clear, then does a lwarx to load the word containing the lock bit and thread idle bits so it can be updated. However, it is possible that the value loaded with the lwarx has the lock bit set, even though an immediately preceding lwz loaded a value with the lock bit clear. If this happens then we go ahead and update the word despite the lock bit being set, and when called from pnv_enter_arch207_idle_mode, we will subsequently clear the lock bit. No identifiable misbehaviour has been attributed to this race. This fixes it by checking the lock bit in the value loaded by the lwarx. If it is set then we just go back and keep on polling. Fixes: b32aadc1a8ed ("powerpc/powernv: Fix race in updating core_idle_state") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Paul Mackerras Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/idle_book3s.S | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S index 0d8712a835d2..72dac0b58061 100644 --- a/arch/powerpc/kernel/idle_book3s.S +++ b/arch/powerpc/kernel/idle_book3s.S @@ -90,6 +90,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) * Threads will spin in HMT_LOW until the lock bit is cleared. * r14 - pointer to core_idle_state * r15 - used to load contents of core_idle_state + * r9 - used as a temporary variable */ core_idle_lock_held: @@ -99,6 +100,8 @@ core_idle_lock_held: bne 3b HMT_MEDIUM lwarx r15,0,r14 + andi. r9,r15,PNV_CORE_IDLE_LOCK_BIT + bne core_idle_lock_held blr /* From 44d524218c65e1f2e6d945b09165562852298015 Mon Sep 17 00:00:00 2001 From: Stefan Agner Date: Mon, 17 Oct 2016 18:51:27 -0700 Subject: [PATCH 145/292] ARM: dts: vf610: fix IRQ flag of global timer The global timer IRQ (PPI[0], PPI 11 in device tree terms) is a rising edge interrupt. The ARM Cortex-A5 MPCore TRM in Chapter 10.1.2. Interrupt types and sources says: "Interrupt is rising-edge sensitive." The bits seem to be read-only, hence this missconfiguration had no negative effect. However, with commit 992345a58e0c ("irqchip/gic: WARN if setting the interrupt type for a PPI fails") warnings such as this get printed: GIC: PPI11 is secure or misconfigured With this change the new configuration matches the default configuration and no warning is printed anymore. Signed-off-by: Stefan Agner Signed-off-by: Shawn Guo --- arch/arm/boot/dts/vf500.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/vf500.dtsi b/arch/arm/boot/dts/vf500.dtsi index a3824e61bd72..d7fdb2a7d97b 100644 --- a/arch/arm/boot/dts/vf500.dtsi +++ b/arch/arm/boot/dts/vf500.dtsi @@ -70,7 +70,7 @@ global_timer: timer@40002200 { compatible = "arm,cortex-a9-global-timer"; reg = <0x40002200 0x20>; - interrupts = ; + interrupts = ; interrupt-parent = <&intc>; clocks = <&clks VF610_CLK_PLATFORM_BUS>; }; From eeaed4bb5a35591470b545590bb2f26dbe7653a2 Mon Sep 17 00:00:00 2001 From: Sinan Kaya Date: Mon, 24 Oct 2016 00:31:30 -0400 Subject: [PATCH 146/292] ACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages We do not want to store the SCI penalty in the acpi_isa_irq_penalty[] table because acpi_isa_irq_penalty[] only holds ISA IRQ penalties and there's no guarantee that the SCI is an ISA IRQ. We add in the SCI penalty as a special case in acpi_irq_get_penalty(). But if we called acpi_penalize_isa_irq() or acpi_irq_penalty_update() for an SCI that happened to be an ISA IRQ, they stored the SCI penalty (part of the acpi_irq_get_penalty() return value) in acpi_isa_irq_penalty[]. Subsequent calls to acpi_irq_get_penalty() returned a penalty that included *two* SCI penalties. Fixes: 103544d86976 (ACPI,PCI,IRQ: reduce resource requirements) Signed-off-by: Sinan Kaya Acked-by: Bjorn Helgaas Tested-by: Jonathan Liu Signed-off-by: Rafael J. Wysocki --- drivers/acpi/pci_link.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c index c983bf733ad3..6229b022a5d4 100644 --- a/drivers/acpi/pci_link.c +++ b/drivers/acpi/pci_link.c @@ -849,7 +849,7 @@ static int __init acpi_irq_penalty_update(char *str, int used) continue; if (used) - new_penalty = acpi_irq_get_penalty(irq) + + new_penalty = acpi_isa_irq_penalty[irq] + PIRQ_PENALTY_ISA_USED; else new_penalty = 0; @@ -871,7 +871,7 @@ static int __init acpi_irq_penalty_update(char *str, int used) void acpi_penalize_isa_irq(int irq, int active) { if ((irq >= 0) && (irq < ARRAY_SIZE(acpi_isa_irq_penalty))) - acpi_isa_irq_penalty[irq] = acpi_irq_get_penalty(irq) + + acpi_isa_irq_penalty[irq] += (active ? PIRQ_PENALTY_ISA_USED : PIRQ_PENALTY_PCI_USING); } From f1caa61df2a3dc4c58316295c5dc5edba4c68d85 Mon Sep 17 00:00:00 2001 From: Sinan Kaya Date: Mon, 24 Oct 2016 00:31:31 -0400 Subject: [PATCH 147/292] ACPI/PCI: pci_link: penalize SCI correctly Ondrej reported that IRQs stopped working in v4.7 on several platforms. A typical scenario, from Ondrej's VT82C694X/694X, is: ACPI: Using PIC for interrupt routing ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 *11 12 14 15) ACPI: No IRQ available for PCI Interrupt Link [LNKA] 8139too 0000:00:0f.0: PCI INT A: no GSI We're using PIC routing, so acpi_irq_balance == 0, and LNKA is already active at IRQ 11. In that case, acpi_pci_link_allocate() only tries to use the active IRQ (IRQ 11) which also happens to be the SCI. We should penalize the SCI by PIRQ_PENALTY_PCI_USING, but irq_get_trigger_type(11) returns something other than IRQ_TYPE_LEVEL_LOW, so we penalize it by PIRQ_PENALTY_ISA_ALWAYS instead, which makes acpi_pci_link_allocate() assume the IRQ isn't available and give up. Add acpi_penalize_sci_irq() so platforms can tell us the SCI IRQ, trigger, and polarity directly and we don't have to depend on irq_get_trigger_type(). Fixes: 103544d86976 (ACPI,PCI,IRQ: reduce resource requirements) Link: http://lkml.kernel.org/r/201609251512.05657.linux@rainbow-software.org Reported-by: Ondrej Zary Acked-by: Bjorn Helgaas Signed-off-by: Sinan Kaya Tested-by: Jonathan Liu Signed-off-by: Rafael J. Wysocki --- arch/x86/kernel/acpi/boot.c | 1 + drivers/acpi/pci_link.c | 30 +++++++++++++++--------------- include/linux/acpi.h | 1 + 3 files changed, 17 insertions(+), 15 deletions(-) diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c index 8a5abaa7d453..931ced8ca345 100644 --- a/arch/x86/kernel/acpi/boot.c +++ b/arch/x86/kernel/acpi/boot.c @@ -454,6 +454,7 @@ static void __init acpi_sci_ioapic_setup(u8 bus_irq, u16 polarity, u16 trigger, polarity = acpi_sci_flags & ACPI_MADT_POLARITY_MASK; mp_override_legacy_irq(bus_irq, polarity, trigger, gsi); + acpi_penalize_sci_irq(bus_irq, trigger, polarity); /* * stash over-ride to indicate we've been here diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c index 6229b022a5d4..74bf96efae95 100644 --- a/drivers/acpi/pci_link.c +++ b/drivers/acpi/pci_link.c @@ -87,6 +87,7 @@ struct acpi_pci_link { static LIST_HEAD(acpi_link_list); static DEFINE_MUTEX(acpi_link_lock); +static int sci_irq = -1, sci_penalty; /* -------------------------------------------------------------------------- PCI Link Device Management @@ -496,25 +497,13 @@ static int acpi_irq_get_penalty(int irq) { int penalty = 0; - /* - * Penalize IRQ used by ACPI SCI. If ACPI SCI pin attributes conflict - * with PCI IRQ attributes, mark ACPI SCI as ISA_ALWAYS so it won't be - * use for PCI IRQs. - */ - if (irq == acpi_gbl_FADT.sci_interrupt) { - u32 type = irq_get_trigger_type(irq) & IRQ_TYPE_SENSE_MASK; - - if (type != IRQ_TYPE_LEVEL_LOW) - penalty += PIRQ_PENALTY_ISA_ALWAYS; - else - penalty += PIRQ_PENALTY_PCI_USING; - } + if (irq == sci_irq) + penalty += sci_penalty; if (irq < ACPI_MAX_ISA_IRQS) return penalty + acpi_isa_irq_penalty[irq]; - penalty += acpi_irq_pci_sharing_penalty(irq); - return penalty; + return penalty + acpi_irq_pci_sharing_penalty(irq); } int __init acpi_irq_penalty_init(void) @@ -881,6 +870,17 @@ bool acpi_isa_irq_available(int irq) acpi_irq_get_penalty(irq) < PIRQ_PENALTY_ISA_ALWAYS); } +void acpi_penalize_sci_irq(int irq, int trigger, int polarity) +{ + sci_irq = irq; + + if (trigger == ACPI_MADT_TRIGGER_LEVEL && + polarity == ACPI_MADT_POLARITY_ACTIVE_LOW) + sci_penalty = PIRQ_PENALTY_PCI_USING; + else + sci_penalty = PIRQ_PENALTY_ISA_ALWAYS; +} + /* * Over-ride default table to reserve additional IRQs for use by ISA * e.g. acpi_irq_isa=5 diff --git a/include/linux/acpi.h b/include/linux/acpi.h index ddbeda6dbdc8..689a8b9b9c8f 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -326,6 +326,7 @@ struct pci_dev; int acpi_pci_irq_enable (struct pci_dev *dev); void acpi_penalize_isa_irq(int irq, int active); bool acpi_isa_irq_available(int irq); +void acpi_penalize_sci_irq(int irq, int trigger, int polarity); void acpi_pci_irq_disable (struct pci_dev *dev); extern int ec_read(u8 addr, u8 *val); From 98756f5319c64c883caa910dce702d9edefe7810 Mon Sep 17 00:00:00 2001 From: Sinan Kaya Date: Mon, 24 Oct 2016 00:31:32 -0400 Subject: [PATCH 148/292] ACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs Commit 103544d86976 ("ACPI,PCI,IRQ: reduce resource requirements") replaced the addition of PIRQ_PENALTY_PCI_USING in acpi_pci_link_allocate() with an addition in acpi_irq_pci_sharing_penalty(), but f7eca374f000 ("ACPI,PCI,IRQ: separate ISA penalty calculation") removed the use of acpi_irq_pci_sharing_penalty() for ISA IRQs. Therefore, PIRQ_PENALTY_PCI_USING is missing from ISA IRQs used by interrupt links. Include that penalty by adding it in the acpi_pci_link_allocate() path. Fixes: f7eca374f000 (ACPI,PCI,IRQ: separate ISA penalty calculation) Signed-off-by: Sinan Kaya Acked-by: Bjorn Helgaas Tested-by: Jonathan Liu Signed-off-by: Rafael J. Wysocki --- drivers/acpi/pci_link.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c index 74bf96efae95..bc3d914dfc3e 100644 --- a/drivers/acpi/pci_link.c +++ b/drivers/acpi/pci_link.c @@ -608,6 +608,10 @@ static int acpi_pci_link_allocate(struct acpi_pci_link *link) acpi_device_bid(link->device)); return -ENODEV; } else { + if (link->irq.active < ACPI_MAX_ISA_IRQS) + acpi_isa_irq_penalty[link->irq.active] += + PIRQ_PENALTY_PCI_USING; + printk(KERN_WARNING PREFIX "%s [%s] enabled at IRQ %d\n", acpi_device_name(link->device), acpi_device_bid(link->device), link->irq.active); From ed19ece135a6908244e22e4ab395165999a4d3ab Mon Sep 17 00:00:00 2001 From: Wenyou Yang Date: Thu, 29 Sep 2016 18:07:07 +0800 Subject: [PATCH 149/292] usb: ohci-at91: Set RemoteWakeupConnected bit explicitly. The reset value of RWC is 0, set RemoteWakeupConnected bit explicitly before calling ohci_run, it also fixes the issue that the mass storage stick connected wasn't suspended when the system suspend. Signed-off-by: Wenyou Yang Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/ohci-at91.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/usb/host/ohci-at91.c b/drivers/usb/host/ohci-at91.c index 5b5880c0ae19..b38a228134df 100644 --- a/drivers/usb/host/ohci-at91.c +++ b/drivers/usb/host/ohci-at91.c @@ -221,6 +221,12 @@ static int usb_hcd_at91_probe(const struct hc_driver *driver, ohci->num_ports = board->ports; at91_start_hc(pdev); + /* + * The RemoteWakeupConnected bit has to be set explicitly + * before calling ohci_run. The reset value of this bit is 0. + */ + ohci->hc_control = OHCI_CTRL_RWC; + retval = usb_add_hcd(hcd, irq, IRQF_SHARED); if (retval == 0) { device_wakeup_enable(hcd->self.controller); @@ -677,9 +683,6 @@ ohci_hcd_at91_drv_suspend(struct device *dev) * REVISIT: some boards will be able to turn VBUS off... */ if (!ohci_at91->wakeup) { - ohci->hc_control = ohci_readl(ohci, &ohci->regs->control); - ohci->hc_control &= OHCI_CTRL_RWC; - ohci_writel(ohci, ohci->hc_control, &ohci->regs->control); ohci->rh_state = OHCI_RH_HALTED; /* flush the writes */ From 1e4b4348753ba555ea93470ea8af821425d9c826 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Mon, 17 Oct 2016 20:11:59 +0900 Subject: [PATCH 150/292] usb: ehci-platform: increase EHCI_MAX_RSTS to 4 Socionext LD11 SoC (arch/arm64/boot/dts/socionext/uniphier-ld11.dtsi) needs to handle 4 reset lines for EHCI. Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/ehci-platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/host/ehci-platform.c b/drivers/usb/host/ehci-platform.c index 876dca4fc216..a268d9e8d6cf 100644 --- a/drivers/usb/host/ehci-platform.c +++ b/drivers/usb/host/ehci-platform.c @@ -39,7 +39,7 @@ #define DRIVER_DESC "EHCI generic platform driver" #define EHCI_MAX_CLKS 4 -#define EHCI_MAX_RSTS 3 +#define EHCI_MAX_RSTS 4 #define hcd_to_ehci_priv(h) ((struct ehci_platform_priv *)hcd_to_ehci(h)->priv) struct ehci_platform_priv { From d8e5f0eca1e88215e45aca27115ea747e6164da1 Mon Sep 17 00:00:00 2001 From: Tony Lindgren Date: Wed, 19 Oct 2016 12:03:39 -0500 Subject: [PATCH 151/292] usb: musb: Fix hardirq-safe hardirq-unsafe lock order error If we configure musb with 2430 glue as a peripheral, and then rmmod omap2430 module, we'll get the following error: [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] ... rmmod/413 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: (&phy->mutex){+.+.+.}, at: [] phy_power_off+0x1c/0xb8 [ 204.678710] and this task is already holding: (&(&musb->lock)->rlock){-.-...}, at: [] musb_gadget_stop+0x24/0xec [musb_hdrc] which would create a new lock dependency: (&(&musb->lock)->rlock){-.-...} -> (&phy->mutex){+.+.+.} ... This is because some glue layers expect musb_platform_enable/disable to be called with spinlock held, and 2430 glue layer has USB PHY on the I2C bus using a mutex. We could fix the glue layers to take the spinlock, but we still have a problem of musb_plaform_enable/disable being called in an unbalanced manner. So that would still lead into USB PHY enable/disable related problems for omap2430 glue layer. While it makes sense to only enable USB PHY when needed from PM point of view, in this case we just can't do it yet without breaking things. So let's just revert phy_enable/disable related changes instead and reconsider this after we have fixed musb_platform_enable/disable to be balanced. Fixes: a83e17d0f73b ("usb: musb: Improve PM runtime and phy handling for 2430 glue layer") Reviewed-by: Laurent Pinchart Signed-off-by: Tony Lindgren Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman --- drivers/usb/musb/omap2430.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c index 1ab6973d4f61..cc1225485509 100644 --- a/drivers/usb/musb/omap2430.c +++ b/drivers/usb/musb/omap2430.c @@ -287,6 +287,7 @@ static int omap2430_musb_init(struct musb *musb) } musb->isr = omap2430_musb_interrupt; phy_init(musb->phy); + phy_power_on(musb->phy); l = musb_readl(musb->mregs, OTG_INTERFSEL); @@ -323,8 +324,6 @@ static void omap2430_musb_enable(struct musb *musb) struct musb_hdrc_platform_data *pdata = dev_get_platdata(dev); struct omap_musb_board_data *data = pdata->board_data; - if (!WARN_ON(!musb->phy)) - phy_power_on(musb->phy); switch (glue->status) { @@ -361,9 +360,6 @@ static void omap2430_musb_disable(struct musb *musb) struct device *dev = musb->controller; struct omap2430_glue *glue = dev_get_drvdata(dev->parent); - if (!WARN_ON(!musb->phy)) - phy_power_off(musb->phy); - if (glue->status != MUSB_UNKNOWN) omap_control_usb_set_mode(glue->control_otghs, USB_MODE_DISCONNECT); @@ -375,6 +371,7 @@ static int omap2430_musb_exit(struct musb *musb) struct omap2430_glue *glue = dev_get_drvdata(dev->parent); omap2430_low_level_exit(musb); + phy_power_off(musb->phy); phy_exit(musb->phy); musb->phy = NULL; cancel_work_sync(&glue->omap_musb_mailbox_work); From cacaaf80c3a6f6036d290f353c4c1db237b42647 Mon Sep 17 00:00:00 2001 From: Tony Lindgren Date: Wed, 19 Oct 2016 12:03:40 -0500 Subject: [PATCH 152/292] usb: musb: Call pm_runtime from musb_gadget_queue If we're booting pandaboard using NFSroot over built-in g_ether, we can get the following after booting once and doing a warm reset: g_ether gadget: ecm_open g_ether gadget: notify connect true ... WARNING: CPU: 0 PID: 1 at drivers/bus/omap_l3_noc.c:147 l3_interrupt_handler+0x220/0x34c 44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4CFG (Read): Data Access in User mode du ring Functional access ... Fix the issue by calling pm_runtime functions from musb_gadget_queue. Note that in the long run we should be able to queue the pending transfers if pm_runtime is not active, and flush the queue from pm_runtime_resume. Reported-by: Laurent Pinchart Tested-by: Laurent Pinchart Signed-off-by: Tony Lindgren Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman --- drivers/usb/musb/musb_gadget.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c index bff4869a57cd..4042ea017985 100644 --- a/drivers/usb/musb/musb_gadget.c +++ b/drivers/usb/musb/musb_gadget.c @@ -1255,6 +1255,7 @@ static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req, map_dma_buffer(request, musb, musb_ep); + pm_runtime_get_sync(musb->controller); spin_lock_irqsave(&musb->lock, lockflags); /* don't queue if the ep is down */ @@ -1275,6 +1276,9 @@ static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req, unlock: spin_unlock_irqrestore(&musb->lock, lockflags); + pm_runtime_mark_last_busy(musb->controller); + pm_runtime_put_autosuspend(musb->controller); + return status; } From ed6d6f8f42d7302f6f9b6245f34927ec20d26c12 Mon Sep 17 00:00:00 2001 From: Bryan Paluch Date: Mon, 17 Oct 2016 08:54:46 -0400 Subject: [PATCH 153/292] usb: increase ohci watchdog delay to 275 msec Increase ohci watchout delay to 275 ms. Previous delay was 250 ms with 20 ms of slack, after removing slack time some ohci controllers don't respond in time. Logs from systems with controllers that have the issue would show "HcDoneHead not written back; disabled" Signed-off-by: Bryan Paluch Acked-by: Alan Stern Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/usb/host/ohci-hcd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c index 1700908b84ef..86612ac3fda2 100644 --- a/drivers/usb/host/ohci-hcd.c +++ b/drivers/usb/host/ohci-hcd.c @@ -72,7 +72,7 @@ static const char hcd_name [] = "ohci_hcd"; #define STATECHANGE_DELAY msecs_to_jiffies(300) -#define IO_WATCHDOG_DELAY msecs_to_jiffies(250) +#define IO_WATCHDOG_DELAY msecs_to_jiffies(275) #include "ohci.h" #include "pci-quirks.h" From 806487a8fc8f385af75ed261e9ab658fc845e633 Mon Sep 17 00:00:00 2001 From: Punit Agrawal Date: Tue, 18 Oct 2016 17:07:19 +0100 Subject: [PATCH 154/292] ACPI / APEI: Fix incorrect return value of ghes_proc() Although ghes_proc() tests for errors while reading the error status, it always return success (0). Fix this by propagating the return value. Fixes: d334a49113a4a33 (ACPI, APEI, Generic Hardware Error Source memory error support) Signed-of-by: Punit Agrawal Tested-by: Tyler Baicar Reviewed-by: Borislav Petkov [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index f0a029e68d3e..0d099a24f776 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -662,7 +662,7 @@ static int ghes_proc(struct ghes *ghes) ghes_do_proc(ghes, ghes->estatus); out: ghes_clear_estatus(ghes); - return 0; + return rc; } static void ghes_add_timer(struct ghes *ghes) From b76032396d7958f006bccf5fb2535beb5526837c Mon Sep 17 00:00:00 2001 From: Yoshihiro Shimoda Date: Thu, 20 Oct 2016 13:19:19 +0900 Subject: [PATCH 155/292] usb: renesas_usbhs: add wait after initialization for R-Car Gen3 Since the controller on R-Car Gen3 doesn't have any status registers to detect initialization (LPSTS.SUSPM = 1) and the initialization needs up to 45 usec, this patch adds wait after the initialization. Otherwise, writing other registers (e.g. INTENB0) will fail. Fixes: de18757e272d ("usb: renesas_usbhs: add R-Car Gen3 power control") Cc: # v4.6+ Cc: Signed-off-by: Yoshihiro Shimoda Signed-off-by: Greg Kroah-Hartman --- drivers/usb/renesas_usbhs/rcar3.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/usb/renesas_usbhs/rcar3.c b/drivers/usb/renesas_usbhs/rcar3.c index 1d70add926f0..d544b331c9f2 100644 --- a/drivers/usb/renesas_usbhs/rcar3.c +++ b/drivers/usb/renesas_usbhs/rcar3.c @@ -9,6 +9,7 @@ * */ +#include #include #include "common.h" #include "rcar3.h" @@ -35,10 +36,13 @@ static int usbhs_rcar3_power_ctrl(struct platform_device *pdev, usbhs_write32(priv, UGCTRL2, UGCTRL2_RESERVED_3 | UGCTRL2_USB0SEL_OTG); - if (enable) + if (enable) { usbhs_bset(priv, LPSTS, LPSTS_SUSPM, LPSTS_SUSPM); - else + /* The controller on R-Car Gen3 needs to wait up to 45 usec */ + udelay(45); + } else { usbhs_bset(priv, LPSTS, LPSTS_SUSPM, 0); + } return 0; } From 1adb469b9b76276d7e5ea36a20a24c47d6618a0b Mon Sep 17 00:00:00 2001 From: Jon Hunter Date: Fri, 21 Oct 2016 16:24:09 +0100 Subject: [PATCH 156/292] PM / suspend: Fix missing KERN_CONT for suspend message Commit 4bcc595ccd80 (printk: reinstate KERN_CONT for printing continuation lines) exposed a missing KERN_CONT from one of the messages shown on entering suspend. With v4.9-rc1, the 'done.' shown after syncing the filesystems no longer appears as a continuation but a new message with its own timestamp. [ 9.259566] PM: Syncing filesystems ... [ 9.264119] done. Fix this by adding the KERN_CONT log level for the 'done.' part of the message seen after syncing filesystems. While we are at it, convert these suspend printks to pr_info and pr_cont, respectively. Signed-off-by: Jon Hunter Signed-off-by: Rafael J. Wysocki --- kernel/power/suspend.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 1e7f5da648d9..6ccb08f57fcb 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -498,9 +498,9 @@ static int enter_state(suspend_state_t state) #ifndef CONFIG_SUSPEND_SKIP_SYNC trace_suspend_resume(TPS("sync_filesystems"), 0, true); - printk(KERN_INFO "PM: Syncing filesystems ... "); + pr_info("PM: Syncing filesystems ... "); sys_sync(); - printk("done.\n"); + pr_cont("done.\n"); trace_suspend_resume(TPS("sync_filesystems"), 0, false); #endif From 4edd601c5a9c5094daa714e65063e623826f3bcc Mon Sep 17 00:00:00 2001 From: Fabio Estevam Date: Mon, 24 Oct 2016 10:32:12 -0200 Subject: [PATCH 157/292] ARM: imx: mach-imx6q: Fix the PHY ID mask for AR8031 AR8031 and AR8035 have the same PHY ID mask of 0xffffffef. So fix it and make it match with the PHY ID mask definition at drivers/net/phy/at803x.c. Signed-off-by: Fabio Estevam Signed-off-by: Shawn Guo --- arch/arm/mach-imx/mach-imx6q.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mach-imx/mach-imx6q.c b/arch/arm/mach-imx/mach-imx6q.c index 97fd25105e2c..45801b27ee5c 100644 --- a/arch/arm/mach-imx/mach-imx6q.c +++ b/arch/arm/mach-imx/mach-imx6q.c @@ -173,7 +173,7 @@ static void __init imx6q_enet_phy_init(void) ksz9021rn_phy_fixup); phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK, ksz9031rn_phy_fixup); - phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff, + phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef, ar8031_phy_fixup); phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef, ar8035_phy_fixup); From cf55902b9c306ed259eb57ff111a0c152620f4a6 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Thu, 13 Oct 2016 15:55:08 +0300 Subject: [PATCH 158/292] staging: android: ion: Fix error handling in ion_query_heaps() If the copy_to_user() fails we should unlock and return directly without updating "cnt". Also the return value should be -EFAULT instead of the number of bytes remaining. Fixes: 02b23803c6af ("staging: android: ion: Add ioctl to query available heaps") Signed-off-by: Dan Carpenter Signed-off-by: Greg Kroah-Hartman --- drivers/staging/android/ion/ion.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c index 396ded52ab70..209a8f7ef02b 100644 --- a/drivers/staging/android/ion/ion.c +++ b/drivers/staging/android/ion/ion.c @@ -1187,8 +1187,10 @@ int ion_query_heaps(struct ion_client *client, struct ion_heap_query *query) hdata.type = heap->type; hdata.heap_id = heap->id; - ret = copy_to_user(&buffer[cnt], - &hdata, sizeof(hdata)); + if (copy_to_user(&buffer[cnt], &hdata, sizeof(hdata))) { + ret = -EFAULT; + goto out; + } cnt++; if (cnt >= max_cnt) From 25633d1f5dd7ea35c77aae12c039a80e46abec01 Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Mon, 17 Oct 2016 16:37:20 +0000 Subject: [PATCH 159/292] greybus: arche-platform: Add missing of_node_put() in arche_platform_change_state() This node pointer is returned by of_find_compatible_node() with refcount incremented in this function. of_node_put() on it before exitting this function. This is detected by Coccinelle semantic patch. Signed-off-by: Wei Yongjun Acked-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/staging/greybus/arche-platform.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/staging/greybus/arche-platform.c b/drivers/staging/greybus/arche-platform.c index e36ee984485b..34307ac3f255 100644 --- a/drivers/staging/greybus/arche-platform.c +++ b/drivers/staging/greybus/arche-platform.c @@ -128,6 +128,7 @@ int arche_platform_change_state(enum arche_platform_state state, pdev = of_find_device_by_node(np); if (!pdev) { pr_err("arche-platform device not found\n"); + of_node_put(np); return -ENODEV; } From 1305f2b2f52af5986f44dfbb1a6fe58ae875aa61 Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Wed, 19 Oct 2016 13:17:53 +0000 Subject: [PATCH 160/292] greybus: es2: fix error return code in ap_probe() Fix to return a negative error code from the es2_arpc_in_enable() error handling case instead of 0, as done elsewhere in this function. Fixes: 9d9d3777a9db ("greybus: es2: Add a new bulk in endpoint for APBridgeA RPC") Signed-off-by: Wei Yongjun Acked-by: Viresh Kumar Reviewed-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/staging/greybus/es2.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/staging/greybus/es2.c b/drivers/staging/greybus/es2.c index 071bb1cfd3ae..baab460eeaa3 100644 --- a/drivers/staging/greybus/es2.c +++ b/drivers/staging/greybus/es2.c @@ -1548,7 +1548,8 @@ static int ap_probe(struct usb_interface *interface, INIT_LIST_HEAD(&es2->arpcs); spin_lock_init(&es2->arpc_lock); - if (es2_arpc_in_enable(es2)) + retval = es2_arpc_in_enable(es2); + if (retval) goto error; retval = gb_hd_add(hd); From e866dd8aab76b6a0ee8428491e65fa5c83a6ae5a Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 14 Oct 2016 22:18:21 +0300 Subject: [PATCH 161/292] greybus: fix a leak on error in gb_module_create() We should release ->interfaces[0] as well. Fixes: b15d97d77017 ("greybus: core: add module abstraction") Signed-off-by: Dan Carpenter Acked-by: Viresh Kumar Acked-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman --- drivers/staging/greybus/module.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/greybus/module.c b/drivers/staging/greybus/module.c index 69f67ddbd4a3..660b4674a76f 100644 --- a/drivers/staging/greybus/module.c +++ b/drivers/staging/greybus/module.c @@ -127,7 +127,7 @@ struct gb_module *gb_module_create(struct gb_host_device *hd, u8 module_id, return module; err_put_interfaces: - for (--i; i > 0; --i) + for (--i; i >= 0; --i) gb_interface_put(module->interfaces[i]); put_device(&module->dev); From 44b3c7af02ca2701b6b90ee30c9d1d9c3ae07653 Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Tue, 11 Oct 2016 13:34:16 +0200 Subject: [PATCH 162/292] xenbus: advertise control feature flags The Xen docs specify several flags which a guest can set to advertise which values of the xenstore control/shutdown key it will recognize. This patch adds code to write all the relevant feature-flag keys. Based-on-patch-by: Paul Durrant Signed-off-by: Juergen Gross Reviewed-by: David Vrabel Reviewed-by: Paul Durrant Signed-off-by: David Vrabel --- drivers/xen/manage.c | 45 ++++++++++++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 16 deletions(-) diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c index e12bd3635f83..26e5e8507f03 100644 --- a/drivers/xen/manage.c +++ b/drivers/xen/manage.c @@ -168,7 +168,9 @@ out: #endif /* CONFIG_HIBERNATE_CALLBACKS */ struct shutdown_handler { - const char *command; +#define SHUTDOWN_CMD_SIZE 11 + const char command[SHUTDOWN_CMD_SIZE]; + bool flag; void (*cb)(void); }; @@ -206,22 +208,22 @@ static void do_reboot(void) ctrl_alt_del(); } +static struct shutdown_handler shutdown_handlers[] = { + { "poweroff", true, do_poweroff }, + { "halt", false, do_poweroff }, + { "reboot", true, do_reboot }, +#ifdef CONFIG_HIBERNATE_CALLBACKS + { "suspend", true, do_suspend }, +#endif +}; + static void shutdown_handler(struct xenbus_watch *watch, const char **vec, unsigned int len) { char *str; struct xenbus_transaction xbt; int err; - static struct shutdown_handler handlers[] = { - { "poweroff", do_poweroff }, - { "halt", do_poweroff }, - { "reboot", do_reboot }, -#ifdef CONFIG_HIBERNATE_CALLBACKS - { "suspend", do_suspend }, -#endif - {NULL, NULL}, - }; - static struct shutdown_handler *handler; + int idx; if (shutting_down != SHUTDOWN_INVALID) return; @@ -238,13 +240,13 @@ static void shutdown_handler(struct xenbus_watch *watch, return; } - for (handler = &handlers[0]; handler->command; handler++) { - if (strcmp(str, handler->command) == 0) + for (idx = 0; idx < ARRAY_SIZE(shutdown_handlers); idx++) { + if (strcmp(str, shutdown_handlers[idx].command) == 0) break; } /* Only acknowledge commands which we are prepared to handle. */ - if (handler->cb) + if (idx < ARRAY_SIZE(shutdown_handlers)) xenbus_write(xbt, "control", "shutdown", ""); err = xenbus_transaction_end(xbt, 0); @@ -253,8 +255,8 @@ static void shutdown_handler(struct xenbus_watch *watch, goto again; } - if (handler->cb) { - handler->cb(); + if (idx < ARRAY_SIZE(shutdown_handlers)) { + shutdown_handlers[idx].cb(); } else { pr_info("Ignoring shutdown request: %s\n", str); shutting_down = SHUTDOWN_INVALID; @@ -310,6 +312,9 @@ static struct notifier_block xen_reboot_nb = { static int setup_shutdown_watcher(void) { int err; + int idx; +#define FEATURE_PATH_SIZE (SHUTDOWN_CMD_SIZE + sizeof("feature-")) + char node[FEATURE_PATH_SIZE]; err = register_xenbus_watch(&shutdown_watch); if (err) { @@ -326,6 +331,14 @@ static int setup_shutdown_watcher(void) } #endif + for (idx = 0; idx < ARRAY_SIZE(shutdown_handlers); idx++) { + if (!shutdown_handlers[idx].flag) + continue; + snprintf(node, FEATURE_PATH_SIZE, "feature-%s", + shutdown_handlers[idx].command); + xenbus_printf(XBT_NIL, "control", node, "%u", 1); + } + return 0; } From cb5f7e7c1ded5ff91b18116669c0f43c82bea3db Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Wed, 12 Oct 2016 17:20:38 +0200 Subject: [PATCH 163/292] x86: xen: move cpu_up functions out of ifdef MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three newly introduced functions are not defined when CONFIG_XEN_PVHVM is disabled, but are still being used: arch/x86/xen/enlighten.c:141:12: warning: ‘xen_cpu_up_prepare’ used but never defined arch/x86/xen/enlighten.c:142:12: warning: ‘xen_cpu_up_online’ used but never defined arch/x86/xen/enlighten.c:143:12: warning: ‘xen_cpu_dead’ used but never defined Fixes: 4d737042d6c4 ("xen/x86: Convert to hotplug state machine") Signed-off-by: Arnd Bergmann Signed-off-by: David Vrabel --- arch/x86/xen/enlighten.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 96c2dea798a1..a637f902f59e 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1838,6 +1838,7 @@ static void __init init_hvm_pv_info(void) xen_domain_type = XEN_HVM_DOMAIN; } +#endif static int xen_cpu_up_prepare(unsigned int cpu) { @@ -1888,6 +1889,7 @@ static int xen_cpu_up_online(unsigned int cpu) return 0; } +#ifdef CONFIG_XEN_PVHVM #ifdef CONFIG_KEXEC_CORE static void xen_hvm_shutdown(void) { From e1e5b3ff41983f506c3cbcf123fe7d682f61a8f1 Mon Sep 17 00:00:00 2001 From: Jan Beulich Date: Mon, 24 Oct 2016 09:03:49 -0600 Subject: [PATCH 164/292] xenbus: prefer list_for_each() This is more efficient than list_for_each_safe() when list modification is accompanied by breaking out of the loop. Signed-off-by: Jan Beulich Reviewed-by: Juergen Gross Signed-off-by: David Vrabel --- drivers/xen/xenbus/xenbus_dev_frontend.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c index 7487971f9f78..fe60b12de920 100644 --- a/drivers/xen/xenbus/xenbus_dev_frontend.c +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c @@ -364,7 +364,7 @@ out: static int xenbus_write_watch(unsigned msg_type, struct xenbus_file_priv *u) { - struct watch_adapter *watch, *tmp_watch; + struct watch_adapter *watch; char *path, *token; int err, rc; LIST_HEAD(staging_q); @@ -399,7 +399,7 @@ static int xenbus_write_watch(unsigned msg_type, struct xenbus_file_priv *u) } list_add(&watch->list, &u->watches); } else { - list_for_each_entry_safe(watch, tmp_watch, &u->watches, list) { + list_for_each_entry(watch, &u->watches, list) { if (!strcmp(watch->token, token) && !strcmp(watch->watch.node, path)) { unregister_xenbus_watch(&watch->watch); From c251f15c7dbf2cb72e7b2b282020b41f4e4d3665 Mon Sep 17 00:00:00 2001 From: Jan Beulich Date: Mon, 24 Oct 2016 09:05:18 -0600 Subject: [PATCH 165/292] xenbus: check return value of xenbus_scanf() Don't ignore errors here: Set backend state to unknown when unsuccessful. Signed-off-by: Jan Beulich Signed-off-by: David Vrabel --- drivers/xen/xenbus/xenbus_probe_frontend.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c b/drivers/xen/xenbus/xenbus_probe_frontend.c index 611a23119675..6d40a972ffb2 100644 --- a/drivers/xen/xenbus/xenbus_probe_frontend.c +++ b/drivers/xen/xenbus/xenbus_probe_frontend.c @@ -335,7 +335,9 @@ static int backend_state; static void xenbus_reset_backend_state_changed(struct xenbus_watch *w, const char **v, unsigned int l) { - xenbus_scanf(XBT_NIL, v[XS_WATCH_PATH], "", "%i", &backend_state); + if (xenbus_scanf(XBT_NIL, v[XS_WATCH_PATH], "", "%i", + &backend_state) != 1) + backend_state = XenbusStateUnknown; printk(KERN_DEBUG "XENBUS: backend %s %s\n", v[XS_WATCH_PATH], xenbus_strstate(backend_state)); wake_up(&backend_state_wq); From dafa724bf582181d9a7d54f5cb4ca0bf8ef29269 Mon Sep 17 00:00:00 2001 From: "tang.junhui" Date: Fri, 21 Oct 2016 09:35:32 +0800 Subject: [PATCH 166/292] dm table: fix missing dm_put_target_type() in dm_table_add_target() dm_get_target_type() was previously called so any error returned from dm_table_add_target() must first call dm_put_target_type(). Otherwise the DM target module's reference count will leak and the associated kernel module will be unable to be removed. Also, leverage the fact that r is already -EINVAL and remove an extra newline. Fixes: 36a0456 ("dm table: add immutable feature") Fixes: cc6cbe1 ("dm table: add always writeable feature") Fixes: 3791e2f ("dm table: add singleton feature") Cc: stable@vger.kernel.org # 3.2+ Signed-off-by: tang.junhui Signed-off-by: Mike Snitzer --- drivers/md/dm-table.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 3e407a9cde1f..c4b53b332607 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -695,37 +695,32 @@ int dm_table_add_target(struct dm_table *t, const char *type, tgt->type = dm_get_target_type(type); if (!tgt->type) { - DMERR("%s: %s: unknown target type", dm_device_name(t->md), - type); + DMERR("%s: %s: unknown target type", dm_device_name(t->md), type); return -EINVAL; } if (dm_target_needs_singleton(tgt->type)) { if (t->num_targets) { - DMERR("%s: target type %s must appear alone in table", - dm_device_name(t->md), type); - return -EINVAL; + tgt->error = "singleton target type must appear alone in table"; + goto bad; } t->singleton = true; } if (dm_target_always_writeable(tgt->type) && !(t->mode & FMODE_WRITE)) { - DMERR("%s: target type %s may not be included in read-only tables", - dm_device_name(t->md), type); - return -EINVAL; + tgt->error = "target type may not be included in a read-only table"; + goto bad; } if (t->immutable_target_type) { if (t->immutable_target_type != tgt->type) { - DMERR("%s: immutable target type %s cannot be mixed with other target types", - dm_device_name(t->md), t->immutable_target_type->name); - return -EINVAL; + tgt->error = "immutable target type cannot be mixed with other target types"; + goto bad; } } else if (dm_target_is_immutable(tgt->type)) { if (t->num_targets) { - DMERR("%s: immutable target type %s cannot be mixed with other target types", - dm_device_name(t->md), tgt->type->name); - return -EINVAL; + tgt->error = "immutable target type cannot be mixed with other target types"; + goto bad; } t->immutable_target_type = tgt->type; } @@ -740,7 +735,6 @@ int dm_table_add_target(struct dm_table *t, const char *type, */ if (!adjoin(t, tgt)) { tgt->error = "Gap in table"; - r = -EINVAL; goto bad; } From 91e040a79df73d371f70792f30380d4e44805250 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 20 Oct 2016 07:39:45 -0700 Subject: [PATCH 167/292] ARC: syscall for userspace cmpxchg assist Older ARC700 cores (ARC750 specifically) lack instructions to implement atomic r-w-w. This is problematic for userspace libraries such as NPTL which need atomic primitives. So enable them by providing kernel assist. This is costly but really the only sane soluton (othern than tight spinning using the otherwise availiable atomic exchange EX instruciton). Good thing is there are only a few of these cores running Linux out in the wild. This only works on UP systems. Reviewed-by: Colin Ian King Signed-off-by: Vineet Gupta --- arch/arc/include/asm/syscalls.h | 1 + arch/arc/include/uapi/asm/unistd.h | 9 ++++---- arch/arc/kernel/process.c | 33 ++++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+), 4 deletions(-) diff --git a/arch/arc/include/asm/syscalls.h b/arch/arc/include/asm/syscalls.h index e56f9fcc5581..772b67ca56e7 100644 --- a/arch/arc/include/asm/syscalls.h +++ b/arch/arc/include/asm/syscalls.h @@ -17,6 +17,7 @@ int sys_clone_wrapper(int, int, int, int, int); int sys_cacheflush(uint32_t, uint32_t uint32_t); int sys_arc_settls(void *); int sys_arc_gettls(void); +int sys_arc_usr_cmpxchg(int *, int, int); #include diff --git a/arch/arc/include/uapi/asm/unistd.h b/arch/arc/include/uapi/asm/unistd.h index 41fa2ec9e02c..9a34136d84b2 100644 --- a/arch/arc/include/uapi/asm/unistd.h +++ b/arch/arc/include/uapi/asm/unistd.h @@ -27,18 +27,19 @@ #define NR_syscalls __NR_syscalls +/* Generic syscall (fs/filesystems.c - lost in asm-generic/unistd.h */ +#define __NR_sysfs (__NR_arch_specific_syscall + 3) + /* ARC specific syscall */ #define __NR_cacheflush (__NR_arch_specific_syscall + 0) #define __NR_arc_settls (__NR_arch_specific_syscall + 1) #define __NR_arc_gettls (__NR_arch_specific_syscall + 2) +#define __NR_arc_usr_cmpxchg (__NR_arch_specific_syscall + 4) __SYSCALL(__NR_cacheflush, sys_cacheflush) __SYSCALL(__NR_arc_settls, sys_arc_settls) __SYSCALL(__NR_arc_gettls, sys_arc_gettls) - - -/* Generic syscall (fs/filesystems.c - lost in asm-generic/unistd.h */ -#define __NR_sysfs (__NR_arch_specific_syscall + 3) +__SYSCALL(__NR_arc_usr_cmpxchg, sys_arc_usr_cmpxchg) __SYSCALL(__NR_sysfs, sys_sysfs) #undef __SYSCALL diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c index be1972bd2729..59aa43cb146e 100644 --- a/arch/arc/kernel/process.c +++ b/arch/arc/kernel/process.c @@ -41,6 +41,39 @@ SYSCALL_DEFINE0(arc_gettls) return task_thread_info(current)->thr_ptr; } +SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new) +{ + int uval; + int ret; + + /* + * This is only for old cores lacking LLOCK/SCOND, which by defintion + * can't possibly be SMP. Thus doesn't need to be SMP safe. + * And this also helps reduce the overhead for serializing in + * the UP case + */ + WARN_ON_ONCE(IS_ENABLED(CONFIG_SMP)); + + if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int))) + return -EFAULT; + + preempt_disable(); + + ret = __get_user(uval, uaddr); + if (ret) + goto done; + + if (uval != expected) + ret = -EAGAIN; + else + ret = __put_user(new, uaddr); + +done: + preempt_enable(); + + return ret; +} + void arch_cpu_idle(void) { /* sleep, but enable all interrupts before committing */ From cf986d470208fbdd68b6934a86ccd81c04408484 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 13 Oct 2016 15:58:59 -0700 Subject: [PATCH 168/292] ARCv2: IOC: use @ioc_enable not @ioc_exist where intended if user disables IOC from debugger at startup (by clearing @ioc_enable), @ioc_exists is cleared too. This means boot prints don't capture the fact that IOC was present but disabled which could be misleading. So invert how we use @ioc_enable and @ioc_exists and make it more canonical. @ioc_exists represent whether hardware is present or not and stays same whether enabled or not. @ioc_enable is still user driven, but will be auto-disabled if IOC hardware is not present, i.e. if @ioc_exist=0. This is opposite to what we were doing before, but much clearer. This means @ioc_enable is now the "exported" toggle in rest of code such as dma mapping API. Signed-off-by: Vineet Gupta --- arch/arc/include/asm/cache.h | 2 +- arch/arc/mm/cache.c | 10 ++++++---- arch/arc/mm/dma.c | 4 ++-- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/arc/include/asm/cache.h b/arch/arc/include/asm/cache.h index fb781e34f322..b3410ff6a62d 100644 --- a/arch/arc/include/asm/cache.h +++ b/arch/arc/include/asm/cache.h @@ -53,7 +53,7 @@ extern void arc_cache_init(void); extern char *arc_cache_mumbojumbo(int cpu_id, char *buf, int len); extern void read_decode_cache_bcr(void); -extern int ioc_exists; +extern int ioc_enable; extern unsigned long perip_base, perip_end; #endif /* !__ASSEMBLY__ */ diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c index 97dddbefb86a..518ff76771f3 100644 --- a/arch/arc/mm/cache.c +++ b/arch/arc/mm/cache.c @@ -22,8 +22,8 @@ #include static int l2_line_sz; -int ioc_exists; -volatile int slc_enable = 1, ioc_enable = 1; +static int ioc_exists; +int slc_enable = 1, ioc_enable = 1; unsigned long perip_base = ARC_UNCACHED_ADDR_SPACE; /* legacy value for boot */ unsigned long perip_end = 0xFFFFFFFF; /* legacy value */ @@ -113,8 +113,10 @@ static void read_decode_cache_bcr_arcv2(int cpu) } READ_BCR(ARC_REG_CLUSTER_BCR, cbcr); - if (cbcr.c && ioc_enable) + if (cbcr.c) ioc_exists = 1; + else + ioc_enable = 0; /* HS 2.0 didn't have AUX_VOL */ if (cpuinfo_arc700[cpu].core.family > 0x51) { @@ -1002,7 +1004,7 @@ void arc_cache_init(void) read_aux_reg(ARC_REG_SLC_CTRL) | SLC_CTRL_DISABLE); } - if (is_isa_arcv2() && ioc_exists) { + if (is_isa_arcv2() && ioc_enable) { /* IO coherency base - 0x8z */ write_aux_reg(ARC_REG_IO_COH_AP0_BASE, 0x80000); /* IO coherency aperture size - 512Mb: 0x8z-0xAz */ diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c index 20afc65e22dc..60aab5a7522b 100644 --- a/arch/arc/mm/dma.c +++ b/arch/arc/mm/dma.c @@ -45,7 +45,7 @@ static void *arc_dma_alloc(struct device *dev, size_t size, * -For coherent data, Read/Write to buffers terminate early in cache * (vs. always going to memory - thus are faster) */ - if ((is_isa_arcv2() && ioc_exists) || + if ((is_isa_arcv2() && ioc_enable) || (attrs & DMA_ATTR_NON_CONSISTENT)) need_coh = 0; @@ -97,7 +97,7 @@ static void arc_dma_free(struct device *dev, size_t size, void *vaddr, int is_non_coh = 1; is_non_coh = (attrs & DMA_ATTR_NON_CONSISTENT) || - (is_isa_arcv2() && ioc_exists); + (is_isa_arcv2() && ioc_enable); if (PageHighMem(page) || !is_non_coh) iounmap((void __force __iomem *)vaddr); From 0a3ffab93fe52530602fe47cd74802cffdb19c05 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= Date: Mon, 24 Oct 2016 15:20:29 +0200 Subject: [PATCH 169/292] ANDROID: binder: Add strong ref checks MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Prevent using a binder_ref with only weak references where a strong reference is required. Signed-off-by: Arve Hjønnevåg Signed-off-by: Martijn Coenen Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 562af94bec35..3681759c22d7 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -1002,7 +1002,7 @@ static int binder_dec_node(struct binder_node *node, int strong, int internal) static struct binder_ref *binder_get_ref(struct binder_proc *proc, - uint32_t desc) + u32 desc, bool need_strong_ref) { struct rb_node *n = proc->refs_by_desc.rb_node; struct binder_ref *ref; @@ -1010,12 +1010,16 @@ static struct binder_ref *binder_get_ref(struct binder_proc *proc, while (n) { ref = rb_entry(n, struct binder_ref, rb_node_desc); - if (desc < ref->desc) + if (desc < ref->desc) { n = n->rb_left; - else if (desc > ref->desc) + } else if (desc > ref->desc) { n = n->rb_right; - else + } else if (need_strong_ref && !ref->strong) { + binder_user_error("tried to use weak ref as strong ref\n"); + return NULL; + } else { return ref; + } } return NULL; } @@ -1285,7 +1289,10 @@ static void binder_transaction_buffer_release(struct binder_proc *proc, } break; case BINDER_TYPE_HANDLE: case BINDER_TYPE_WEAK_HANDLE: { - struct binder_ref *ref = binder_get_ref(proc, fp->handle); + struct binder_ref *ref; + + ref = binder_get_ref(proc, fp->handle, + fp->type == BINDER_TYPE_HANDLE); if (ref == NULL) { pr_err("transaction release %d bad handle %d\n", @@ -1380,7 +1387,7 @@ static void binder_transaction(struct binder_proc *proc, if (tr->target.handle) { struct binder_ref *ref; - ref = binder_get_ref(proc, tr->target.handle); + ref = binder_get_ref(proc, tr->target.handle, true); if (ref == NULL) { binder_user_error("%d:%d got transaction to invalid handle\n", proc->pid, thread->pid); @@ -1589,7 +1596,10 @@ static void binder_transaction(struct binder_proc *proc, } break; case BINDER_TYPE_HANDLE: case BINDER_TYPE_WEAK_HANDLE: { - struct binder_ref *ref = binder_get_ref(proc, fp->handle); + struct binder_ref *ref; + + ref = binder_get_ref(proc, fp->handle, + fp->type == BINDER_TYPE_HANDLE); if (ref == NULL) { binder_user_error("%d:%d got transaction with invalid handle, %d\n", @@ -1800,7 +1810,9 @@ static int binder_thread_write(struct binder_proc *proc, ref->desc); } } else - ref = binder_get_ref(proc, target); + ref = binder_get_ref(proc, target, + cmd == BC_ACQUIRE || + cmd == BC_RELEASE); if (ref == NULL) { binder_user_error("%d:%d refcount change on invalid ref %d\n", proc->pid, thread->pid, target); @@ -1996,7 +2008,7 @@ static int binder_thread_write(struct binder_proc *proc, if (get_user(cookie, (binder_uintptr_t __user *)ptr)) return -EFAULT; ptr += sizeof(binder_uintptr_t); - ref = binder_get_ref(proc, target); + ref = binder_get_ref(proc, target, false); if (ref == NULL) { binder_user_error("%d:%d %s invalid ref %d\n", proc->pid, thread->pid, From 4afb604e2d14d429ac9e1fd84b952602853b2df5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= Date: Mon, 24 Oct 2016 15:20:30 +0200 Subject: [PATCH 170/292] ANDROID: binder: Clear binder and cookie when setting handle in flat binder struct MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Prevents leaking pointers between processes Signed-off-by: Arve Hjønnevåg Signed-off-by: Martijn Coenen Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/android/binder.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 3681759c22d7..3c71b982bf2a 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -1584,7 +1584,9 @@ static void binder_transaction(struct binder_proc *proc, fp->type = BINDER_TYPE_HANDLE; else fp->type = BINDER_TYPE_WEAK_HANDLE; + fp->binder = 0; fp->handle = ref->desc; + fp->cookie = 0; binder_inc_ref(ref, fp->type == BINDER_TYPE_HANDLE, &thread->todo); @@ -1634,7 +1636,9 @@ static void binder_transaction(struct binder_proc *proc, return_error = BR_FAILED_REPLY; goto err_binder_get_ref_for_node_failed; } + fp->binder = 0; fp->handle = new_ref->desc; + fp->cookie = 0; binder_inc_ref(new_ref, fp->type == BINDER_TYPE_HANDLE, NULL); trace_binder_transaction_ref_to_ref(t, ref, new_ref); @@ -1688,6 +1692,7 @@ static void binder_transaction(struct binder_proc *proc, binder_debug(BINDER_DEBUG_TRANSACTION, " fd %d -> %d\n", fp->handle, target_fd); /* TODO: fput? */ + fp->binder = 0; fp->handle = target_fd; } break; From 43605e293eb13c07acb546c14f407a271837af17 Mon Sep 17 00:00:00 2001 From: Alexander Usyskin Date: Wed, 19 Oct 2016 01:34:48 +0300 Subject: [PATCH 171/292] mei: txe: don't clean an unprocessed interrupt cause. SEC registers are not accessible when the TXE device is in low power state, hence the SEC interrupt cannot be processed if device is not awake. In some rare cases entrance to low power state (aliveness off) and input ready bits can be signaled at the same time, resulting in communication stall as input ready won't be signaled again after waking up. To resolve this IPC_HHIER_SEC bit in HHISR_REG should not be cleaned if the interrupt is not processed. Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin Signed-off-by: Tomas Winkler Signed-off-by: Greg Kroah-Hartman --- drivers/misc/mei/hw-txe.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/misc/mei/hw-txe.c b/drivers/misc/mei/hw-txe.c index e6e5e55a12ed..60415a2bfcbd 100644 --- a/drivers/misc/mei/hw-txe.c +++ b/drivers/misc/mei/hw-txe.c @@ -981,11 +981,13 @@ static bool mei_txe_check_and_ack_intrs(struct mei_device *dev, bool do_ack) hisr = mei_txe_br_reg_read(hw, HISR_REG); aliveness = mei_txe_aliveness_get(dev); - if (hhisr & IPC_HHIER_SEC && aliveness) + if (hhisr & IPC_HHIER_SEC && aliveness) { ipc_isr = mei_txe_sec_reg_read_silent(hw, SEC_IPC_HOST_INT_STATUS_REG); - else + } else { ipc_isr = 0; + hhisr &= ~IPC_HHIER_SEC; + } generated = generated || (hisr & HISR_INT_STS_MSK) || From d62a9025aee446994a706c711e45c6a655d9d348 Mon Sep 17 00:00:00 2001 From: Amir Goldstein Date: Fri, 21 Oct 2016 07:33:57 +0300 Subject: [PATCH 172/292] orangefs: user file_inode() where it is due Replace wrong use of file->f_path.dentry->d_inode with file_inode(file). In case orangefs ever finds itself as an overelayfs layer, it would want to get its own inode and not overlayfs's inode. DISCLAIMER: I did not test this patch because I do not know how to setup an orangefs mount Signed-off-by: Amir Goldstein Signed-off-by: Mike Marshall --- fs/orangefs/file.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c index 66ea0cc37b18..02cc6139ec90 100644 --- a/fs/orangefs/file.c +++ b/fs/orangefs/file.c @@ -621,9 +621,9 @@ static int orangefs_file_release(struct inode *inode, struct file *file) * readahead cache (if any); this forces an expensive refresh of * data for the next caller of mmap (or 'get_block' accesses) */ - if (file->f_path.dentry->d_inode && - file->f_path.dentry->d_inode->i_mapping && - mapping_nrpages(&file->f_path.dentry->d_inode->i_data)) { + if (file_inode(file) && + file_inode(file)->i_mapping && + mapping_nrpages(&file_inode(file)->i_data)) { if (orangefs_features & ORANGEFS_FEATURE_READAHEAD) { gossip_debug(GOSSIP_INODE_DEBUG, "calling flush_racache on %pU\n", @@ -632,7 +632,7 @@ static int orangefs_file_release(struct inode *inode, struct file *file) gossip_debug(GOSSIP_INODE_DEBUG, "flush_racache finished\n"); } - truncate_inode_pages(file->f_path.dentry->d_inode->i_mapping, + truncate_inode_pages(file_inode(file)->i_mapping, 0); } return 0; @@ -648,7 +648,7 @@ static int orangefs_fsync(struct file *file, { int ret = -EINVAL; struct orangefs_inode_s *orangefs_inode = - ORANGEFS_I(file->f_path.dentry->d_inode); + ORANGEFS_I(file_inode(file)); struct orangefs_kernel_op_s *new_op = NULL; /* required call */ @@ -661,7 +661,7 @@ static int orangefs_fsync(struct file *file, ret = service_operation(new_op, "orangefs_fsync", - get_interruptible_flag(file->f_path.dentry->d_inode)); + get_interruptible_flag(file_inode(file))); gossip_debug(GOSSIP_FILE_DEBUG, "orangefs_fsync got return value of %d\n", @@ -669,7 +669,7 @@ static int orangefs_fsync(struct file *file, op_release(new_op); - orangefs_flush_inode(file->f_path.dentry->d_inode); + orangefs_flush_inode(file_inode(file)); return ret; } From 804b1737d71253f01621d2a37a0dce6279a2d440 Mon Sep 17 00:00:00 2001 From: Miklos Szeredi Date: Mon, 17 Oct 2016 10:14:23 +0200 Subject: [PATCH 173/292] orangefs: don't use d_time Instead use d_fsdata which is the same size. Hoping to get rid of d_time, which is used by very few filesystems by this time. Signed-off-by: Miklos Szeredi Reviewed-by: Martin Brandenburg Signed-off-by: Mike Marshall --- fs/orangefs/dcache.c | 5 +++-- fs/orangefs/namei.c | 8 ++++---- fs/orangefs/orangefs-kernel.h | 7 +++++++ 3 files changed, 14 insertions(+), 6 deletions(-) diff --git a/fs/orangefs/dcache.c b/fs/orangefs/dcache.c index 1e8fe844e69f..5355efba4bc8 100644 --- a/fs/orangefs/dcache.c +++ b/fs/orangefs/dcache.c @@ -73,7 +73,7 @@ static int orangefs_revalidate_lookup(struct dentry *dentry) } } - dentry->d_time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000; + orangefs_set_timeout(dentry); ret = 1; out_release_op: op_release(new_op); @@ -94,8 +94,9 @@ out_drop: static int orangefs_d_revalidate(struct dentry *dentry, unsigned int flags) { int ret; + unsigned long time = (unsigned long) dentry->d_fsdata; - if (time_before(jiffies, dentry->d_time)) + if (time_before(jiffies, time)) return 1; if (flags & LOOKUP_RCU) diff --git a/fs/orangefs/namei.c b/fs/orangefs/namei.c index d15d3d2dba62..a290ff6ec756 100644 --- a/fs/orangefs/namei.c +++ b/fs/orangefs/namei.c @@ -72,7 +72,7 @@ static int orangefs_create(struct inode *dir, d_instantiate(dentry, inode); unlock_new_inode(inode); - dentry->d_time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000; + orangefs_set_timeout(dentry); ORANGEFS_I(inode)->getattr_time = jiffies - 1; gossip_debug(GOSSIP_NAME_DEBUG, @@ -183,7 +183,7 @@ static struct dentry *orangefs_lookup(struct inode *dir, struct dentry *dentry, goto out; } - dentry->d_time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000; + orangefs_set_timeout(dentry); inode = orangefs_iget(dir->i_sb, &new_op->downcall.resp.lookup.refn); if (IS_ERR(inode)) { @@ -322,7 +322,7 @@ static int orangefs_symlink(struct inode *dir, d_instantiate(dentry, inode); unlock_new_inode(inode); - dentry->d_time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000; + orangefs_set_timeout(dentry); ORANGEFS_I(inode)->getattr_time = jiffies - 1; gossip_debug(GOSSIP_NAME_DEBUG, @@ -386,7 +386,7 @@ static int orangefs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode d_instantiate(dentry, inode); unlock_new_inode(inode); - dentry->d_time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000; + orangefs_set_timeout(dentry); ORANGEFS_I(inode)->getattr_time = jiffies - 1; gossip_debug(GOSSIP_NAME_DEBUG, diff --git a/fs/orangefs/orangefs-kernel.h b/fs/orangefs/orangefs-kernel.h index 0a82048f3aaf..3bf803d732c5 100644 --- a/fs/orangefs/orangefs-kernel.h +++ b/fs/orangefs/orangefs-kernel.h @@ -580,4 +580,11 @@ static inline void orangefs_i_size_write(struct inode *inode, loff_t i_size) #endif } +static inline void orangefs_set_timeout(struct dentry *dentry) +{ + unsigned long time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000; + + dentry->d_fsdata = (void *) time; +} + #endif /* __ORANGEFSKERNEL_H */ From 423221d1745b53656db896bd34646d09d620c759 Mon Sep 17 00:00:00 2001 From: "John W. Linville" Date: Mon, 24 Oct 2016 15:13:25 -0400 Subject: [PATCH 174/292] nbd: fix incorrect unlock of nbd->sock_lock in sock_shutdown Commit 0eadf37afc250 ("nbd: allow block mq to deal with timeouts") changed normal usage of nbd->sock_lock to use spin_lock/spin_unlock rather than the *_irq variants, but it missed this unlock in an error path. Found by Coverity, CID 1373871. Signed-off-by: John W. Linville Cc: Josef Bacik Cc: Jens Axboe Cc: Markus Pargmann Fixes: 0eadf37afc250 ("nbd: allow block mq to deal with timeouts") Signed-off-by: Jens Axboe --- drivers/block/nbd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index ba405b55329f..19a16b2dbb91 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -164,7 +164,7 @@ static void sock_shutdown(struct nbd_device *nbd) spin_lock(&nbd->sock_lock); if (!nbd->sock) { - spin_unlock_irq(&nbd->sock_lock); + spin_unlock(&nbd->sock_lock); return; } From 2f1d407adab026b34a105ed27b1d4d7e910c4448 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 24 Oct 2016 23:20:25 +0200 Subject: [PATCH 175/292] cpufreq: intel_pstate: Always set max P-state in performance mode The only times at which intel_pstate checks the policy set for a given CPU is the initialization of that CPU and updates of its policy settings from cpufreq when intel_pstate_set_policy() is invoked. That is insufficient, however, because intel_pstate uses the same P-state selection function for all CPUs regardless of the policy setting for each of them and the P-state limits are shared between them. Thus if the policy is set to "performance" for a particular CPU, it may not behave as expected if the cpufreq settings are changed subsequently for another CPU. That can be easily demonstrated by writing "performance" to scaling_governor for all CPUs and then switching it to "powersave" for one of them in which case all of the CPUs will behave as though their scaling_governor were all "powersave" (even though the policy still appears to be "performance" for the remaining CPUs). Fix this problem by modifying intel_pstate_adjust_busy_pstate() to always set the P-state to the maximum allowed by the current limits for all CPUs whose policy is set to "performance". Note that it still is recommended to always change the policy setting in the same way for all CPUs even with this fix applied to avoid confusion. Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/intel_pstate.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index ac7c58d20b58..4737520ec823 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -179,6 +179,7 @@ struct _pid { /** * struct cpudata - Per CPU instance data storage * @cpu: CPU number for this instance data + * @policy: CPUFreq policy value * @update_util: CPUFreq utility callback information * @update_util_set: CPUFreq utility callback is set * @iowait_boost: iowait-related boost fraction @@ -201,6 +202,7 @@ struct _pid { struct cpudata { int cpu; + unsigned int policy; struct update_util_data update_util; bool update_util_set; @@ -1337,7 +1339,8 @@ static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu) from = cpu->pstate.current_pstate; - target_pstate = pstate_funcs.get_target_pstate(cpu); + target_pstate = cpu->policy == CPUFREQ_POLICY_PERFORMANCE ? + cpu->pstate.turbo_pstate : pstate_funcs.get_target_pstate(cpu); intel_pstate_update_pstate(cpu, target_pstate); @@ -1504,6 +1507,8 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) policy->cpuinfo.max_freq, policy->max); cpu = all_cpu_data[policy->cpu]; + cpu->policy = policy->policy; + if (cpu->pstate.max_pstate_physical > cpu->pstate.max_pstate && policy->max < policy->cpuinfo.max_freq && policy->max > cpu->pstate.max_pstate * cpu->pstate.scaling) { @@ -1511,7 +1516,7 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) policy->max = policy->cpuinfo.max_freq; } - if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { limits = &performance_limits; if (policy->max >= policy->cpuinfo.max_freq) { pr_debug("set performance\n"); @@ -1547,7 +1552,7 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy) limits->max_perf = round_up(limits->max_perf, FRAC_BITS); out: - if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) { + if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) { /* * NOHZ_FULL CPUs need this as the governor callback may not * be invoked on them. From 272ddc8b37354c3fe111ab26d25e792629148eee Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Mon, 24 Oct 2016 19:00:44 -0700 Subject: [PATCH 176/292] proc: don't use FOLL_FORCE for reading cmdline and environment Now that Lorenzo cleaned things up and made the FOLL_FORCE users explicit, it becomes obvious how some of them don't really need FOLL_FORCE at all. So remove FOLL_FORCE from the proc code that reads the command line and arguments from user space. The mem_rw() function actually does want FOLL_FORCE, because gdd (and possibly many other debuggers) use it as a much more convenient version of PTRACE_PEEKDATA, but we should consider making the FOLL_FORCE part conditional on actually being a ptracer. This does not actually do that, just moves adds a comment to that effect and moves the gup_flags settings next to each other. Signed-off-by: Linus Torvalds --- fs/proc/base.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 8e654468ab67..adfc5b4986f5 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -252,7 +252,7 @@ static ssize_t proc_pid_cmdline_read(struct file *file, char __user *buf, * Inherently racy -- command line shares address space * with code and data. */ - rv = access_remote_vm(mm, arg_end - 1, &c, 1, FOLL_FORCE); + rv = access_remote_vm(mm, arg_end - 1, &c, 1, 0); if (rv <= 0) goto out_free_page; @@ -270,8 +270,7 @@ static ssize_t proc_pid_cmdline_read(struct file *file, char __user *buf, int nr_read; _count = min3(count, len, PAGE_SIZE); - nr_read = access_remote_vm(mm, p, page, _count, - FOLL_FORCE); + nr_read = access_remote_vm(mm, p, page, _count, 0); if (nr_read < 0) rv = nr_read; if (nr_read <= 0) @@ -306,8 +305,7 @@ static ssize_t proc_pid_cmdline_read(struct file *file, char __user *buf, bool final; _count = min3(count, len, PAGE_SIZE); - nr_read = access_remote_vm(mm, p, page, _count, - FOLL_FORCE); + nr_read = access_remote_vm(mm, p, page, _count, 0); if (nr_read < 0) rv = nr_read; if (nr_read <= 0) @@ -356,8 +354,7 @@ skip_argv: bool final; _count = min3(count, len, PAGE_SIZE); - nr_read = access_remote_vm(mm, p, page, _count, - FOLL_FORCE); + nr_read = access_remote_vm(mm, p, page, _count, 0); if (nr_read < 0) rv = nr_read; if (nr_read <= 0) @@ -835,7 +832,7 @@ static ssize_t mem_rw(struct file *file, char __user *buf, unsigned long addr = *ppos; ssize_t copied; char *page; - unsigned int flags = FOLL_FORCE; + unsigned int flags; if (!mm) return 0; @@ -848,6 +845,8 @@ static ssize_t mem_rw(struct file *file, char __user *buf, if (!atomic_inc_not_zero(&mm->mm_users)) goto free; + /* Maybe we should limit FOLL_FORCE to actual ptrace users? */ + flags = FOLL_FORCE; if (write) flags |= FOLL_WRITE; @@ -971,8 +970,7 @@ static ssize_t environ_read(struct file *file, char __user *buf, max_len = min_t(size_t, PAGE_SIZE, count); this_len = min(max_len, this_len); - retval = access_remote_vm(mm, (env_start + src), - page, this_len, FOLL_FORCE); + retval = access_remote_vm(mm, (env_start + src), page, this_len, 0); if (retval <= 0) { ret = retval; From 0d7317598214134d73da59990b846481a9527a00 Mon Sep 17 00:00:00 2001 From: Lorenzo Stoakes Date: Mon, 24 Oct 2016 10:57:25 +0100 Subject: [PATCH 177/292] mm: unexport __get_user_pages() This patch unexports the low-level __get_user_pages() function. Recent refactoring of the get_user_pages* functions allow flags to be passed through get_user_pages() which eliminates the need for access to this function from its one user, kvm. We can see that the two calls to get_user_pages() which replace __get_user_pages() in kvm_main.c are equivalent by examining their call stacks: get_user_page_nowait(): get_user_pages(start, 1, flags, page, NULL) __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL, false, flags | FOLL_TOUCH) __get_user_pages(current, current->mm, start, 1, flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL) check_user_page_hwpoison(): get_user_pages(addr, 1, flags, NULL, NULL) __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL, false, flags | FOLL_TOUCH) __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL, NULL, NULL) Signed-off-by: Lorenzo Stoakes Acked-by: Paolo Bonzini Acked-by: Michal Hocko Signed-off-by: Linus Torvalds --- include/linux/mm.h | 4 ---- mm/gup.c | 3 +-- mm/nommu.c | 2 +- virt/kvm/kvm_main.c | 10 ++++------ 4 files changed, 6 insertions(+), 13 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 3a191853faaa..a92c8d73aeaf 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1271,10 +1271,6 @@ extern int access_process_vm(struct task_struct *tsk, unsigned long addr, void * extern int access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf, int len, unsigned int gup_flags); -long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, - unsigned long start, unsigned long nr_pages, - unsigned int foll_flags, struct page **pages, - struct vm_area_struct **vmas, int *nonblocking); long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm, unsigned long start, unsigned long nr_pages, unsigned int gup_flags, struct page **pages, diff --git a/mm/gup.c b/mm/gup.c index 7aa113c2d373..ec4f82704b6f 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -526,7 +526,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) * instead of __get_user_pages. __get_user_pages should be used only if * you need some special @gup_flags. */ -long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, +static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, unsigned long start, unsigned long nr_pages, unsigned int gup_flags, struct page **pages, struct vm_area_struct **vmas, int *nonblocking) @@ -631,7 +631,6 @@ next_page: } while (nr_pages); return i; } -EXPORT_SYMBOL(__get_user_pages); bool vma_permits_fault(struct vm_area_struct *vma, unsigned int fault_flags) { diff --git a/mm/nommu.c b/mm/nommu.c index db5fd1795298..8b8faaf2a9e9 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -109,7 +109,7 @@ unsigned int kobjsize(const void *objp) return PAGE_SIZE << compound_order(page); } -long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, +static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, unsigned long start, unsigned long nr_pages, unsigned int foll_flags, struct page **pages, struct vm_area_struct **vmas, int *nonblocking) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 28510e72618a..2907b7b78654 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1346,21 +1346,19 @@ unsigned long kvm_vcpu_gfn_to_hva_prot(struct kvm_vcpu *vcpu, gfn_t gfn, bool *w static int get_user_page_nowait(unsigned long start, int write, struct page **page) { - int flags = FOLL_TOUCH | FOLL_NOWAIT | FOLL_HWPOISON | FOLL_GET; + int flags = FOLL_NOWAIT | FOLL_HWPOISON; if (write) flags |= FOLL_WRITE; - return __get_user_pages(current, current->mm, start, 1, flags, page, - NULL, NULL); + return get_user_pages(start, 1, flags, page, NULL); } static inline int check_user_page_hwpoison(unsigned long addr) { - int rc, flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_WRITE; + int rc, flags = FOLL_HWPOISON | FOLL_WRITE; - rc = __get_user_pages(current, current->mm, addr, 1, - flags, NULL, NULL, NULL); + rc = get_user_pages(addr, 1, flags, NULL, NULL); return rc == -EHWPOISON; } From 407a3aee6ee2d2cb46d9ba3fc380bc29f35d020c Mon Sep 17 00:00:00 2001 From: Long Li Date: Wed, 5 Oct 2016 16:57:46 -0700 Subject: [PATCH 178/292] hv: do not lose pending heartbeat vmbus packets The host keeps sending heartbeat packets independent of the guest responding to them. Even though we respond to the heartbeat messages at interrupt level, we can have situations where there maybe multiple heartbeat messages pending that have not been responded to. For instance this occurs when the VM is paused and the host continues to send the heartbeat messages. Address this issue by draining and responding to all the heartbeat messages that maybe pending. Signed-off-by: Long Li Signed-off-by: K. Y. Srinivasan CC: Stable Signed-off-by: Greg Kroah-Hartman --- drivers/hv/hv_util.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c index 4aa3cb63fd41..bcd06306f3e8 100644 --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -314,10 +314,14 @@ static void heartbeat_onchannelcallback(void *context) u8 *hbeat_txf_buf = util_heartbeat.recv_buffer; struct icmsg_negotiate *negop = NULL; - vmbus_recvpacket(channel, hbeat_txf_buf, - PAGE_SIZE, &recvlen, &requestid); + while (1) { + + vmbus_recvpacket(channel, hbeat_txf_buf, + PAGE_SIZE, &recvlen, &requestid); + + if (!recvlen) + break; - if (recvlen > 0) { icmsghdrp = (struct icmsg_hdr *)&hbeat_txf_buf[ sizeof(struct vmbuspipe_hdr)]; From 1a3f099101b85cc93d864eb030d97e7725c72ea7 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 20 Oct 2016 12:14:51 +0200 Subject: [PATCH 179/292] ALSA: hda - Fix surround output pins for ASRock B150M mobo ASRock B150M Pro4/D3 mobo with ALC892 codec doesn't seem to provide proper pins for the surround outputs, hence we need to specify the pincfgs manually with a couple of other corrections. Reported-and-tested-by: Benjamin Valentin Cc: Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index b582d57fe184..2f909dd8b7b8 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6625,6 +6625,7 @@ enum { ALC891_FIXUP_HEADSET_MODE, ALC891_FIXUP_DELL_MIC_NO_PRESENCE, ALC662_FIXUP_ACER_VERITON, + ALC892_FIXUP_ASROCK_MOBO, }; static const struct hda_fixup alc662_fixups[] = { @@ -6901,6 +6902,16 @@ static const struct hda_fixup alc662_fixups[] = { { } } }, + [ALC892_FIXUP_ASROCK_MOBO] = { + .type = HDA_FIXUP_PINS, + .v.pins = (const struct hda_pintbl[]) { + { 0x15, 0x40f000f0 }, /* disabled */ + { 0x16, 0x40f000f0 }, /* disabled */ + { 0x18, 0x01014011 }, /* LO */ + { 0x1a, 0x01014012 }, /* LO */ + { } + } + }, }; static const struct snd_pci_quirk alc662_fixup_tbl[] = { @@ -6938,6 +6949,7 @@ static const struct snd_pci_quirk alc662_fixup_tbl[] = { SND_PCI_QUIRK(0x144d, 0xc051, "Samsung R720", ALC662_FIXUP_IDEAPAD), SND_PCI_QUIRK(0x17aa, 0x38af, "Lenovo Ideapad Y550P", ALC662_FIXUP_IDEAPAD), SND_PCI_QUIRK(0x17aa, 0x3a0d, "Lenovo Ideapad Y550", ALC662_FIXUP_IDEAPAD), + SND_PCI_QUIRK(0x1849, 0x5892, "ASRock B150M", ALC892_FIXUP_ASROCK_MOBO), SND_PCI_QUIRK(0x19da, 0xa130, "Zotac Z68", ALC662_FIXUP_ZOTAC_Z68), SND_PCI_QUIRK(0x1b0a, 0x01b8, "ACER Veriton", ALC662_FIXUP_ACER_VERITON), SND_PCI_QUIRK(0x1b35, 0x2206, "CZC P10T", ALC662_FIXUP_CZC_P10T), From 991d5add50a5bb6ab8f12f2129f5c7487f6baaf6 Mon Sep 17 00:00:00 2001 From: Stefan Wahren Date: Sat, 10 Sep 2016 12:53:21 +0000 Subject: [PATCH 180/292] usb: chipidea: host: fix NULL ptr dereference during shutdown After commit b09b5224fe86 ("usb: chipidea: implement platform shutdown callback") and commit 43a404577a93 ("usb: chipidea: host: set host to be null after hcd is freed") a NULL pointer dereference is caused on i.MX23 during shutdown. So ensure that role is set to CI_ROLE_END and we finish interrupt handling before the hcd is deallocated. This avoids the NULL pointer dereference. Suggested-by: Alan Stern Signed-off-by: Stefan Wahren Fixes: b09b5224fe86 ("usb: chipidea: implement platform shutdown callback") Signed-off-by: Peter Chen --- drivers/usb/chipidea/host.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/chipidea/host.c b/drivers/usb/chipidea/host.c index 96ae69502c86..111b0e0b8698 100644 --- a/drivers/usb/chipidea/host.c +++ b/drivers/usb/chipidea/host.c @@ -188,6 +188,8 @@ static void host_stop(struct ci_hdrc *ci) if (hcd) { usb_remove_hcd(hcd); + ci->role = CI_ROLE_END; + synchronize_irq(ci->irq); usb_put_hcd(hcd); if (ci->platdata->reg_vbus && !ci_otg_is_fsm_mode(ci) && (ci->platdata->flags & CI_HDRC_TURN_VBUS_EARLY_ON)) From ae824f00241f495e8d55ebdc0341f3ce61a77da6 Mon Sep 17 00:00:00 2001 From: Ruqiang Ju Date: Mon, 24 Oct 2016 16:39:49 +0800 Subject: [PATCH 181/292] i2c: hix5hd2: allow build with ARCH_HISI This driver should be buildable with ARCH_HISI, because some of other HiSilicon SoCs also use it. Signed-off-by: Ruqiang Ju Signed-off-by: Wolfram Sang --- drivers/i2c/busses/Kconfig | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 6d94e2ec5b4f..41abb20f082d 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -79,12 +79,12 @@ config I2C_AMD8111 config I2C_HIX5HD2 tristate "Hix5hd2 high-speed I2C driver" - depends on ARCH_HIX5HD2 || COMPILE_TEST + depends on ARCH_HISI || ARCH_HIX5HD2 || COMPILE_TEST help - Say Y here to include support for high-speed I2C controller in the - Hisilicon based hix5hd2 SoCs. + Say Y here to include support for the high-speed I2C controller + used in HiSilicon hix5hd2 SoCs. - This driver can also be built as a module. If so, the module + This driver can also be built as a module. If so, the module will be called i2c-hix5hd2. config I2C_I801 From 399c168ab5ab5e12ed55b6c91d61c24eb84c9164 Mon Sep 17 00:00:00 2001 From: David Wu Date: Sat, 22 Oct 2016 16:43:42 +0800 Subject: [PATCH 182/292] i2c: rk3x: Give the tuning value 0 during rk3x_i2c_v0_calc_timings We found a bug that i2c transfer sometimes failed on 3066a board with stabel-4.8, the con register would be updated by uninitialized tuning value, it made the i2c transfer failed. So give the tuning value to be zero during rk3x_i2c_v0_calc_timings. Signed-off-by: David Wu Tested-by: Andy Yan Reviewed-by: Douglas Anderson Signed-off-by: Wolfram Sang Cc: stable@kernel.org --- drivers/i2c/busses/i2c-rk3x.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c index 50702c7bb244..df220666d627 100644 --- a/drivers/i2c/busses/i2c-rk3x.c +++ b/drivers/i2c/busses/i2c-rk3x.c @@ -694,6 +694,8 @@ static int rk3x_i2c_v0_calc_timings(unsigned long clk_rate, t_calc->div_low--; t_calc->div_high--; + /* Give the tuning value 0, that would not update con register */ + t_calc->tuning = 0; /* Maximum divider supported by hw is 0xffff */ if (t_calc->div_low > 0xffff) { t_calc->div_low = 0xffff; From 7fbe6ac02485504b964b283aca62b36b4313ca79 Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Mon, 24 Oct 2016 08:31:27 -0500 Subject: [PATCH 183/292] x86/unwind: Fix empty stack dereference in guess unwinder Vince Waver reported the following bug: WARNING: CPU: 0 PID: 21338 at arch/x86/mm/fault.c:435 vmalloc_fault+0x58/0x1f0 CPU: 0 PID: 21338 Comm: perf_fuzzer Not tainted 4.8.0+ #37 Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013 Call Trace: ? dump_stack+0x46/0x59 ? __warn+0xd5/0xee ? vmalloc_fault+0x58/0x1f0 ? __do_page_fault+0x6d/0x48e ? perf_log_throttle+0xa4/0xf4 ? trace_page_fault+0x22/0x30 ? __unwind_start+0x28/0x42 ? perf_callchain_kernel+0x75/0xac ? get_perf_callchain+0x13a/0x1f0 ? perf_callchain+0x6a/0x6c ? perf_prepare_sample+0x71/0x2eb ? perf_event_output_forward+0x1a/0x54 ? __default_send_IPI_shortcut+0x10/0x2d ? __perf_event_overflow+0xfb/0x167 ? x86_pmu_handle_irq+0x113/0x150 ? native_read_msr+0x6/0x34 ? perf_event_nmi_handler+0x22/0x39 ? perf_ibs_nmi_handler+0x4a/0x51 ? perf_event_nmi_handler+0x22/0x39 ? nmi_handle+0x4d/0xf0 ? perf_ibs_handle_irq+0x3d1/0x3d1 ? default_do_nmi+0x3c/0xd5 ? do_nmi+0x92/0x102 ? end_repeat_nmi+0x1a/0x1e ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ? entry_SYSCALL_64_after_swapgs+0x12/0x4a ^A4---[ end trace 632723104d47d31a ]--- BUG: stack guard page was hit at ffffc90008500000 (stack is ffffc900084fc000..ffffc900084fffff) kernel stack overflow (page fault): 0000 [#1] SMP ... The NMI hit in the entry code right after setting up the stack pointer from 'cpu_current_top_of_stack', so the kernel stack was empty. The 'guess' version of __unwind_start() attempted to dereference the "top of stack" pointer, which is not actually *on* the stack. Add a check in the guess unwinder to deal with an empty stack. (The frame pointer unwinder already has such a check.) Reported-by: Vince Weaver Signed-off-by: Josh Poimboeuf Cc: Andy Lutomirski Cc: Arnaldo Carvalho de Melo Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Fixes: 7c7900f89770 ("x86/unwind: Add new unwind interface and implementations") Link: http://lkml.kernel.org/r/20161024133127.e5evgeebdbohnmpb@treble Signed-off-by: Ingo Molnar --- arch/x86/kernel/unwind_guess.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/unwind_guess.c b/arch/x86/kernel/unwind_guess.c index 9298993dc8b7..2d721e533cf4 100644 --- a/arch/x86/kernel/unwind_guess.c +++ b/arch/x86/kernel/unwind_guess.c @@ -47,7 +47,14 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task, get_stack_info(first_frame, state->task, &state->stack_info, &state->stack_mask); - if (!__kernel_text_address(*first_frame)) + /* + * The caller can provide the address of the first frame directly + * (first_frame) or indirectly (regs->sp) to indicate which stack frame + * to start unwinding at. Skip ahead until we reach it. + */ + if (!unwind_done(state) && + (!on_stack(&state->stack_info, first_frame, sizeof(long)) || + !__kernel_text_address(*first_frame))) unwind_next_frame(state); } EXPORT_SYMBOL_GPL(__unwind_start); From a2209b742e6cf978b85d4f31a25a269c3d3b062b Mon Sep 17 00:00:00 2001 From: Jan Beulich Date: Mon, 24 Oct 2016 09:00:12 -0600 Subject: [PATCH 184/292] x86/build: Fix build with older GCC versions Older GCC (observed with 4.1.x) doesn't support -Wno-override-init and also doesn't ignore unknown -Wno-* options. Signed-off-by: Jan Beulich Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Valdis Kletnieks Cc: Valdis.Kletnieks@vt.edu Fixes: 5e44258d16 "x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables" Link: http://lkml.kernel.org/r/580E3E1C02000078001191C4@prv-mh.provo.novell.com Signed-off-by: Ingo Molnar --- arch/x86/entry/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile index 77f28ce9c646..9976fcecd17e 100644 --- a/arch/x86/entry/Makefile +++ b/arch/x86/entry/Makefile @@ -5,8 +5,8 @@ OBJECT_FILES_NON_STANDARD_entry_$(BITS).o := y OBJECT_FILES_NON_STANDARD_entry_64_compat.o := y -CFLAGS_syscall_64.o += -Wno-override-init -CFLAGS_syscall_32.o += -Wno-override-init +CFLAGS_syscall_64.o += $(call cc-option,-Wno-override-init,) +CFLAGS_syscall_32.o += $(call cc-option,-Wno-override-init,) obj-y := entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o obj-y += common.o From d320b9a5bd85f6178cc3ed8b0a1a9960f2b5bc7b Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 24 Oct 2016 17:33:18 +0200 Subject: [PATCH 185/292] x86/quirks: Hide maybe-uninitialized warning MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit gcc -Wmaybe-uninitialized detects that quirk_intel_brickland_xeon_ras_cap uses uninitialized data when CONFIG_PCI is not set: arch/x86/kernel/quirks.c: In function ‘quirk_intel_brickland_xeon_ras_cap’: arch/x86/kernel/quirks.c:641:13: error: ‘capid0’ is used uninitialized in this function [-Werror=uninitialized] However, the function is also not called in this configuration, so we can avoid the warning by moving the existing #ifdef to cover it as well. Signed-off-by: Arnd Bergmann Cc: Bjorn Helgaas Cc: Borislav Petkov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Tony Luck Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20161024153325.2752428-1-arnd@arndb.de Signed-off-by: Ingo Molnar --- arch/x86/kernel/quirks.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c index 51402a7e4ca6..0bee04d41bed 100644 --- a/arch/x86/kernel/quirks.c +++ b/arch/x86/kernel/quirks.c @@ -625,8 +625,6 @@ static void amd_disable_seq_and_redirect_scrub(struct pci_dev *dev) DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_16H_NB_F3, amd_disable_seq_and_redirect_scrub); -#endif - #if defined(CONFIG_X86_64) && defined(CONFIG_X86_MCE) #include #include @@ -657,3 +655,4 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, quirk_intel_brickland_xeon_ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x6fc0, quirk_intel_brickland_xeon_ras_cap); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2083, quirk_intel_purley_xeon_ras_cap); #endif +#endif From 6a676fb69dcbf3310b9e462c1db66c8e7f6ead38 Mon Sep 17 00:00:00 2001 From: Ralf Ramsauer Date: Mon, 17 Oct 2016 15:59:57 +0200 Subject: [PATCH 186/292] i2c: mark device nodes only in case of successful instantiation Instantiated I2C device nodes are marked with OF_POPULATE. This was introduced in 4f001fd30145a6. On unloading, loaded device nodes will of course be unmarked. The problem are nodes that fail during initialisation: If a node fails, it won't be unloaded and hence not be unmarked. If a I2C driver module is unloaded and reloaded, it will skip nodes that failed before. Skip device nodes that are already populated and mark them only in case of success. Fixes: 4f001fd30145a6 ("i2c: Mark instantiated device nodes with OF_POPULATE") Signed-off-by: Ralf Ramsauer Reviewed-by: Geert Uytterhoeven Acked-by: Pantelis Antoniou [wsa: use 14-digit commit sha] Signed-off-by: Wolfram Sang Cc: stable@kernel.org --- drivers/i2c/i2c-core.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c index 5ab67219f71e..1704fc84d647 100644 --- a/drivers/i2c/i2c-core.c +++ b/drivers/i2c/i2c-core.c @@ -1681,6 +1681,7 @@ static struct i2c_client *of_i2c_register_device(struct i2c_adapter *adap, static void of_i2c_register_devices(struct i2c_adapter *adap) { struct device_node *bus, *node; + struct i2c_client *client; /* Only register child devices if the adapter has a node pointer set */ if (!adap->dev.of_node) @@ -1695,7 +1696,14 @@ static void of_i2c_register_devices(struct i2c_adapter *adap) for_each_available_child_of_node(bus, node) { if (of_node_test_and_set_flag(node, OF_POPULATED)) continue; - of_i2c_register_device(adap, node); + + client = of_i2c_register_device(adap, node); + if (IS_ERR(client)) { + dev_warn(&adap->dev, + "Failed to create I2C device for %s\n", + node->full_name); + of_node_clear_flag(node, OF_POPULATED); + } } of_node_put(bus); @@ -2299,6 +2307,7 @@ static int of_i2c_notify(struct notifier_block *nb, unsigned long action, if (IS_ERR(client)) { dev_err(&adap->dev, "failed to create client for '%s'\n", rd->dn->full_name); + of_node_clear_flag(rd->dn, OF_POPULATED); return notifier_from_errno(PTR_ERR(client)); } break; From 17791650c356b3cb220946d697d89afc1e32c315 Mon Sep 17 00:00:00 2001 From: Greg Ungerer Date: Mon, 17 Oct 2016 11:54:05 +1000 Subject: [PATCH 187/292] i2c: allow configuration of imx driver for ColdFire architecture The i2c controller used by Freescales iMX processors is the same hardware module used on Freescales ColdFire family of processors. We can use the existing i2c-imx driver on ColdFire family members. Modify the configuration to allow it to be selected when compiling for ColdFire targets. Signed-off-by: Greg Ungerer Signed-off-by: Wolfram Sang --- drivers/i2c/busses/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 41abb20f082d..d252276feadf 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -589,10 +589,10 @@ config I2C_IMG config I2C_IMX tristate "IMX I2C interface" - depends on ARCH_MXC || ARCH_LAYERSCAPE + depends on ARCH_MXC || ARCH_LAYERSCAPE || COLDFIRE help Say Y here if you want to use the IIC bus controller on - the Freescale i.MX/MXC or Layerscape processors. + the Freescale i.MX/MXC, Layerscape or ColdFire processors. This driver can also be built as a module. If so, the module will be called i2c-imx. From 3855ada848dbdbeae39fcf9eaaf6a7678315f153 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Tue, 18 Oct 2016 18:01:45 -0300 Subject: [PATCH 188/292] i2c: jz4780: Fix module autoload If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-jz4780.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/i2c/busses/i2c-jz4780.c b/drivers/i2c/busses/i2c-jz4780.c index b8ea62105f42..30132c3957cd 100644 --- a/drivers/i2c/busses/i2c-jz4780.c +++ b/drivers/i2c/busses/i2c-jz4780.c @@ -729,6 +729,7 @@ static const struct of_device_id jz4780_i2c_of_matches[] = { { .compatible = "ingenic,jz4780-i2c", }, { /* sentinel */ } }; +MODULE_DEVICE_TABLE(of, jz4780_i2c_of_matches); static int jz4780_i2c_probe(struct platform_device *pdev) { From 06e7b10a8711d10b4c4c21ab1aac3f1529a084df Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Tue, 18 Oct 2016 18:01:46 -0300 Subject: [PATCH 189/292] i2c: xlp9xx: Fix module autoload If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-xlp9xx.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/i2c/busses/i2c-xlp9xx.c b/drivers/i2c/busses/i2c-xlp9xx.c index 2a972ed7aa0d..e29ff37a43bd 100644 --- a/drivers/i2c/busses/i2c-xlp9xx.c +++ b/drivers/i2c/busses/i2c-xlp9xx.c @@ -426,6 +426,7 @@ static const struct of_device_id xlp9xx_i2c_of_match[] = { { .compatible = "netlogic,xlp980-i2c", }, { /* sentinel */ }, }; +MODULE_DEVICE_TABLE(of, xlp9xx_i2c_of_match); #ifdef CONFIG_ACPI static const struct acpi_device_id xlp9xx_i2c_acpi_ids[] = { From 2cb496db3d56baa86d53022044c4d25ffe7fa038 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Tue, 18 Oct 2016 18:01:47 -0300 Subject: [PATCH 190/292] i2c: xlr: Fix module autoload for OF registration If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-xlr.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/i2c/busses/i2c-xlr.c b/drivers/i2c/busses/i2c-xlr.c index 0968f59b6df5..ad17d88d8573 100644 --- a/drivers/i2c/busses/i2c-xlr.c +++ b/drivers/i2c/busses/i2c-xlr.c @@ -358,6 +358,7 @@ static const struct of_device_id xlr_i2c_dt_ids[] = { }, { } }; +MODULE_DEVICE_TABLE(of, xlr_i2c_dt_ids); static int xlr_i2c_probe(struct platform_device *pdev) { From 60a951af8e1656e2a17a96d64941aafe0668d750 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Tue, 18 Oct 2016 18:01:48 -0300 Subject: [PATCH 191/292] i2c: digicolor: Fix module autoload If the driver is built as a module, autoload won't work because the module alias information is not filled. So user-space can't match the registered device with the corresponding module. Export the module alias information using the MODULE_DEVICE_TABLE() macro. Signed-off-by: Javier Martinez Canillas Acked-by: Baruch Siach Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-digicolor.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/i2c/busses/i2c-digicolor.c b/drivers/i2c/busses/i2c-digicolor.c index 9604024e0eb0..49f2084f7bb5 100644 --- a/drivers/i2c/busses/i2c-digicolor.c +++ b/drivers/i2c/busses/i2c-digicolor.c @@ -368,6 +368,7 @@ static const struct of_device_id dc_i2c_match[] = { { .compatible = "cnxt,cx92755-i2c" }, { }, }; +MODULE_DEVICE_TABLE(of, dc_i2c_match); static struct platform_driver dc_i2c_driver = { .probe = dc_i2c_probe, From 603616017c35f4d0fbdbcace72adf9bf949c4a65 Mon Sep 17 00:00:00 2001 From: Hoan Tran Date: Mon, 10 Oct 2016 10:13:10 -0700 Subject: [PATCH 192/292] i2c: xgene: Avoid dma_buffer overrun SMBus block command uses the first byte of buffer for the data length. The dma_buffer should be increased by 1 to avoid the overrun issue. Reported-by: Phil Endecott Signed-off-by: Hoan Tran Signed-off-by: Wolfram Sang Cc: stable@kernel.org --- drivers/i2c/busses/i2c-xgene-slimpro.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-xgene-slimpro.c b/drivers/i2c/busses/i2c-xgene-slimpro.c index 263685c7a512..05cf192ef1ac 100644 --- a/drivers/i2c/busses/i2c-xgene-slimpro.c +++ b/drivers/i2c/busses/i2c-xgene-slimpro.c @@ -105,7 +105,7 @@ struct slimpro_i2c_dev { struct mbox_chan *mbox_chan; struct mbox_client mbox_client; struct completion rd_complete; - u8 dma_buffer[I2C_SMBUS_BLOCK_MAX]; + u8 dma_buffer[I2C_SMBUS_BLOCK_MAX + 1]; /* dma_buffer[0] is used for length */ u32 *resp_msg; }; From ba9ad2af7019956b990ad654c56da5bac1e8b71b Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Tue, 11 Oct 2016 13:13:27 +0200 Subject: [PATCH 193/292] i2c: i801: Fix I2C Block Read on 8-Series/C220 and later Starting with the 8-Series/C220 PCH (Lynx Point), the SMBus controller includes a SPD EEPROM protection mechanism. Once the SPD Write Disable bit is set, only reads are allowed to slave addresses 0x50-0x57. However the legacy implementation of I2C Block Read since the ICH5 looks like a write, and is therefore blocked by the SPD protection mechanism. This causes the eeprom and at24 drivers to fail. So assume that I2C Block Read is implemented as an actual read on these chipsets. I tested it on my Q87 chipset and it seems to work just fine. Signed-off-by: Jean Delvare Tested-by: Jarkko Nikula [wsa: rebased to v4.9-rc2] Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-i801.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 08847e8b8998..eb3627f35d12 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -146,6 +146,7 @@ #define SMBHSTCFG_HST_EN 1 #define SMBHSTCFG_SMB_SMI_EN 2 #define SMBHSTCFG_I2C_EN 4 +#define SMBHSTCFG_SPD_WD 0x10 /* TCO configuration bits for TCOCTL */ #define TCOCTL_EN 0x0100 @@ -865,9 +866,16 @@ static s32 i801_access(struct i2c_adapter *adap, u16 addr, block = 1; break; case I2C_SMBUS_I2C_BLOCK_DATA: - /* NB: page 240 of ICH5 datasheet shows that the R/#W - * bit should be cleared here, even when reading */ - outb_p((addr & 0x7f) << 1, SMBHSTADD(priv)); + /* + * NB: page 240 of ICH5 datasheet shows that the R/#W + * bit should be cleared here, even when reading. + * However if SPD Write Disable is set (Lynx Point and later), + * the read will fail if we don't set the R/#W bit. + */ + outb_p(((addr & 0x7f) << 1) | + ((priv->original_hstcfg & SMBHSTCFG_SPD_WD) ? + (read_write & 0x01) : 0), + SMBHSTADD(priv)); if (read_write == I2C_SMBUS_READ) { /* NB: page 240 of ICH5 datasheet also shows * that DATA1 is the cmd field when reading */ @@ -1573,6 +1581,8 @@ static int i801_probe(struct pci_dev *dev, const struct pci_device_id *id) /* Disable SMBus interrupt feature if SMBus using SMI# */ priv->features &= ~FEATURE_IRQ; } + if (temp & SMBHSTCFG_SPD_WD) + dev_info(&dev->dev, "SPD Write Disable is set\n"); /* Clear special mode bits */ if (priv->features & (FEATURE_SMBUS_PEC | FEATURE_BLOCK_BUFFER)) From 171e23e150acfb285f1772cedf04d35694af740b Mon Sep 17 00:00:00 2001 From: Jarkko Nikula Date: Thu, 29 Sep 2016 16:04:59 +0300 Subject: [PATCH 194/292] i2c: designware: Avoid aborted transfers with fast reacting I2C slaves I2C DesignWare may abort transfer with arbitration lost if I2C slave pulls SDA down quickly after falling edge of SCL. Reason for this is unknown but after trial and error it was found this can be avoided by enabling non-zero SDA RX hold time for the receiver. By the specification SDA RX hold time extends incoming SDA low to high transition by n * ic_clk cycles but only when SCL is high. However it seems to help avoid above faulty arbitration lost error. Bits 23:16 in IC_SDA_HOLD register define the SDA RX hold time for the receiver. Be conservative and enable 1 ic_clk cycle long hold time in case boot firmware hasn't set it up. Reported-by: Jukka Laitinen Signed-off-by: Jarkko Nikula Tested-by: Jukka Laitinen Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-designware-core.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/i2c/busses/i2c-designware-core.c b/drivers/i2c/busses/i2c-designware-core.c index 1fe93c43215c..11e866d05368 100644 --- a/drivers/i2c/busses/i2c-designware-core.c +++ b/drivers/i2c/busses/i2c-designware-core.c @@ -95,6 +95,9 @@ #define DW_IC_STATUS_TFE BIT(2) #define DW_IC_STATUS_MST_ACTIVITY BIT(5) +#define DW_IC_SDA_HOLD_RX_SHIFT 16 +#define DW_IC_SDA_HOLD_RX_MASK GENMASK(23, DW_IC_SDA_HOLD_RX_SHIFT) + #define DW_IC_ERR_TX_ABRT 0x1 #define DW_IC_TAR_10BITADDR_MASTER BIT(12) @@ -420,12 +423,20 @@ int i2c_dw_init(struct dw_i2c_dev *dev) /* Configure SDA Hold Time if required */ reg = dw_readl(dev, DW_IC_COMP_VERSION); if (reg >= DW_IC_SDA_HOLD_MIN_VERS) { - if (dev->sda_hold_time) { - dw_writel(dev, dev->sda_hold_time, DW_IC_SDA_HOLD); - } else { + if (!dev->sda_hold_time) { /* Keep previous hold time setting if no one set it */ dev->sda_hold_time = dw_readl(dev, DW_IC_SDA_HOLD); } + /* + * Workaround for avoiding TX arbitration lost in case I2C + * slave pulls SDA down "too quickly" after falling egde of + * SCL by enabling non-zero SDA RX hold. Specification says it + * extends incoming SDA low to high transition while SCL is + * high but it apprears to help also above issue. + */ + if (!(dev->sda_hold_time & DW_IC_SDA_HOLD_RX_MASK)) + dev->sda_hold_time |= 1 << DW_IC_SDA_HOLD_RX_SHIFT; + dw_writel(dev, dev->sda_hold_time, DW_IC_SDA_HOLD); } else { dev_warn(dev->dev, "Hardware too old to adjust SDA hold time.\n"); From 533169d164c6b4c8571d0d48779f6ff6be593d72 Mon Sep 17 00:00:00 2001 From: Stefan Agner Date: Mon, 26 Sep 2016 17:18:58 -0700 Subject: [PATCH 195/292] i2c: imx: defer probe if bus recovery GPIOs are not ready MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Some SoC might load the GPIO driver after the I2C driver and using the I2C bus recovery mechanism via GPIOs. In this case it is crucial to defer probing if the GPIO request functions do so, otherwise the I2C driver gets loaded without recovery mechanisms enabled. Signed-off-by: Stefan Agner Acked-by: Uwe Kleine-König Acked-by: Li Yang Signed-off-by: Wolfram Sang Cc: stable@kernel.org --- drivers/i2c/busses/i2c-imx.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index 592a8f26a708..47fc1f1acff7 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -1009,10 +1009,13 @@ static int i2c_imx_init_recovery_info(struct imx_i2c_struct *i2c_imx, rinfo->sda_gpio = of_get_named_gpio(pdev->dev.of_node, "sda-gpios", 0); rinfo->scl_gpio = of_get_named_gpio(pdev->dev.of_node, "scl-gpios", 0); - if (!gpio_is_valid(rinfo->sda_gpio) || - !gpio_is_valid(rinfo->scl_gpio) || - IS_ERR(i2c_imx->pinctrl_pins_default) || - IS_ERR(i2c_imx->pinctrl_pins_gpio)) { + if (rinfo->sda_gpio == -EPROBE_DEFER || + rinfo->scl_gpio == -EPROBE_DEFER) { + return -EPROBE_DEFER; + } else if (!gpio_is_valid(rinfo->sda_gpio) || + !gpio_is_valid(rinfo->scl_gpio) || + IS_ERR(i2c_imx->pinctrl_pins_default) || + IS_ERR(i2c_imx->pinctrl_pins_gpio)) { dev_dbg(&pdev->dev, "recovery information incomplete\n"); return 0; } From 9b50898ad96c793a8f7cde9d8f281596d752a7dd Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Oct 2016 15:56:35 +0200 Subject: [PATCH 196/292] ALSA: seq: Fix time account regression The recent rewrite of the sequencer time accounting using timespec64 in the commit [3915bf294652: ALSA: seq_timer: use monotonic times internally] introduced a bad regression. Namely, the time reported back doesn't increase but goes back and forth. The culprit was obvious: the delta is stored to the result (cur_time = delta), instead of adding the delta (cur_time += delta)! Let's fix it. Fixes: 3915bf294652 ('ALSA: seq_timer: use monotonic times internally') Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=177571 Reported-by: Yves Guillemot Cc: # v4.8+ Signed-off-by: Takashi Iwai --- sound/core/seq/seq_timer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/core/seq/seq_timer.c b/sound/core/seq/seq_timer.c index dcc102813aef..37d9cfbc29f9 100644 --- a/sound/core/seq/seq_timer.c +++ b/sound/core/seq/seq_timer.c @@ -448,8 +448,8 @@ snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr) ktime_get_ts64(&tm); tm = timespec64_sub(tm, tmr->last_update); - cur_time.tv_nsec = tm.tv_nsec; - cur_time.tv_sec = tm.tv_sec; + cur_time.tv_nsec += tm.tv_nsec; + cur_time.tv_sec += tm.tv_sec; snd_seq_sanity_real_time(&cur_time); } spin_unlock_irqrestore(&tmr->lock, flags); From b831275a3553c32091222ac619cfddd73a5553fb Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 24 Oct 2016 11:41:56 +0200 Subject: [PATCH 197/292] timers: Plug locking race vs. timer migration Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the timer flags. As a consequence the compiler is allowed to reload the flags between the initial check for TIMER_MIGRATION and the following timer base computation and the spin lock of the base. While this has not been observed (yet), we need to make sure that it never happens. Fixes: 0eeda71bc30d ("timer: Replace timer base by a cpu index") Reported-by: Linus Torvalds Signed-off-by: Thomas Gleixner Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: stable@vger.kernel.org Cc: Andrew Morton Cc: Peter Zijlstra --- kernel/time/timer.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 2d47980a1bc4..0d4b91c5a374 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -943,7 +943,14 @@ static struct timer_base *lock_timer_base(struct timer_list *timer, { for (;;) { struct timer_base *base; - u32 tf = timer->flags; + u32 tf; + + /* + * We need to use READ_ONCE() here, otherwise the compiler + * might re-read @tf between the check for TIMER_MIGRATING + * and spin_lock(). + */ + tf = READ_ONCE(timer->flags); if (!(tf & TIMER_MIGRATING)) { base = get_timer_base(tf); From 4da9152a4308dcbf611cde399c695c359fc9145f Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Mon, 24 Oct 2016 11:55:10 +0200 Subject: [PATCH 198/292] timers: Lock base for same bucket optimization Linus stumbled over the unlocked modification of the timer expiry value in mod_timer() which is an optimization for timers which stay in the same bucket - due to the bucket granularity - despite their expiry time getting updated. The optimization itself still makes sense even if we take the lock, because in case that the bucket stays the same, we avoid the pointless queue/enqueue dance. Make the check and the modification of timer->expires protected by the base lock and shuffle the remaining code around so we can keep the lock held when we actually have to requeue the timer to a different bucket. Fixes: f00c0afdfa62 ("timers: Implement optimization for same expiry time in mod_timer()") Reported-by: Linus Torvalds Signed-off-by: Thomas Gleixner Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos Cc: stable@vger.kernel.org Cc: Andrew Morton Cc: Peter Zijlstra --- kernel/time/timer.c | 32 +++++++++++++++++++------------- 1 file changed, 19 insertions(+), 13 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 0d4b91c5a374..ccf913038f9f 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -971,6 +971,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) unsigned long clk = 0, flags; int ret = 0; + BUG_ON(!timer->function); + /* * This is a common optimization triggered by the networking code - if * the timer is re-modified to have the same timeout or ends up in the @@ -979,13 +981,16 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) if (timer_pending(timer)) { if (timer->expires == expires) return 1; - /* - * Take the current timer_jiffies of base, but without holding - * the lock! - */ - base = get_timer_base(timer->flags); - clk = base->clk; + /* + * We lock timer base and calculate the bucket index right + * here. If the timer ends up in the same bucket, then we + * just update the expiry time and avoid the whole + * dequeue/enqueue dance. + */ + base = lock_timer_base(timer, &flags); + + clk = base->clk; idx = calc_wheel_index(expires, clk); /* @@ -995,14 +1000,14 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) */ if (idx == timer_get_idx(timer)) { timer->expires = expires; - return 1; + ret = 1; + goto out_unlock; } + } else { + base = lock_timer_base(timer, &flags); } timer_stats_timer_set_start_info(timer); - BUG_ON(!timer->function); - - base = lock_timer_base(timer, &flags); ret = detach_if_pending(timer, base, false); if (!ret && pending_only) @@ -1035,9 +1040,10 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) timer->expires = expires; /* * If 'idx' was calculated above and the base time did not advance - * between calculating 'idx' and taking the lock, only enqueue_timer() - * and trigger_dyntick_cpu() is required. Otherwise we need to - * (re)calculate the wheel index via internal_add_timer(). + * between calculating 'idx' and possibly switching the base, only + * enqueue_timer() and trigger_dyntick_cpu() is required. Otherwise + * we need to (re)calculate the wheel index via + * internal_add_timer(). */ if (idx != UINT_MAX && clk == base->clk) { enqueue_timer(base, timer, idx); From 041ad7bc758db259bb960ef795197dd14aab19a6 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 22 Oct 2016 11:07:35 +0000 Subject: [PATCH 199/292] timers: Prevent base clock rewind when forwarding clock Ashton and Michael reported, that kernel versions 4.8 and later suffer from USB timeouts which are caused by the timer wheel rework. This is caused by a bug in the base clock forwarding mechanism, which leads to timers expiring early. The scenario which leads to this is: run_timers() while (jiffies >= base->clk) { collect_expired_timers(); base->clk++; expire_timers(); } So base->clk = jiffies + 1. Now the cpu goes idle: idle() get_next_timer_interrupt() nextevt = __next_time_interrupt(); if (time_after(nextevt, base->clk)) base->clk = jiffies; jiffies has not advanced since run_timers(), so this assignment effectively decrements base->clk by one. base->clk is the index into the timer wheel arrays. So let's assume the following state after the base->clk increment in run_timers(): jiffies = 0 base->clk = 1 A timer gets enqueued with an expiry delta of 63 ticks (which is the case with the USB timeout and HZ=250) so the resulting bucket index is: base->clk + delta = 1 + 63 = 64 The timer goes into the first wheel level. The array size is 64 so it ends up in bucket 0, which is correct as it takes 63 ticks to advance base->clk to index into bucket 0 again. If the cpu goes idle before jiffies advance, then the bug in the forwarding mechanism sets base->clk back to 0, so the next invocation of run_timers() at the next tick will index into bucket 0 and therefore expire the timer 62 ticks too early. Instead of blindly setting base->clk to jiffies we must make the forwarding conditional on jiffies > base->clk, but we cannot use jiffies for this as we might run into the following issue: if (time_after(jiffies, base->clk) { if (time_after(nextevt, base->clk)) base->clk = jiffies; jiffies can increment between the check and the assigment far enough to advance beyond nextevt. So we need to use a stable value for checking. get_next_timer_interrupt() has the basej argument which is the jiffies value snapshot taken in the calling code. So we can just that. Thanks to Ashton for bisecting and providing trace data! Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes Reported-by: Michael Thayer Signed-off-by: Thomas Gleixner Cc: Michal Necasek Cc: Peter Zijlstra Cc: knut.osmundsen@oracle.com Cc: stable@vger.kernel.org Cc: stern@rowland.harvard.edu Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.de Signed-off-by: Thomas Gleixner --- kernel/time/timer.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index ccf913038f9f..7c446fb5163a 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1523,12 +1523,16 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) is_max_delta = (nextevt == base->clk + NEXT_TIMER_MAX_DELTA); base->next_expiry = nextevt; /* - * We have a fresh next event. Check whether we can forward the base: + * We have a fresh next event. Check whether we can forward the + * base. We can only do that when @basej is past base->clk + * otherwise we might rewind base->clk. */ - if (time_after(nextevt, jiffies)) - base->clk = jiffies; - else if (time_after(nextevt, base->clk)) - base->clk = nextevt; + if (time_after(basej, base->clk)) { + if (time_after(nextevt, basej)) + base->clk = basej; + else if (time_after(nextevt, base->clk)) + base->clk = nextevt; + } if (time_before_eq(nextevt, basej)) { expires = basem; From 6bad6bccf2d717f652d37e63cf261eaa23466009 Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 22 Oct 2016 11:07:37 +0000 Subject: [PATCH 200/292] timers: Prevent base clock corruption when forwarding When a timer is enqueued we try to forward the timer base clock. This mechanism has two issues: 1) Forwarding a remote base unlocked The forwarding function is called from get_target_base() with the current timer base lock held. But if the new target base is a different base than the current base (can happen with NOHZ, sigh!) then the forwarding is done on an unlocked base. This can lead to corruption of base->clk. Solution is simple: Invoke the forwarding after the target base is locked. 2) Possible corruption due to jiffies advancing This is similar to the issue in get_net_timer_interrupt() which was fixed in the previous patch. jiffies can advance between check and assignement and therefore advancing base->clk beyond the next expiry value. So we need to read jiffies into a local variable once and do the checks and assignment with the local copy. Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible") Reported-by: Ashton Holmes Reported-by: Michael Thayer Signed-off-by: Thomas Gleixner Cc: Michal Necasek Cc: Peter Zijlstra Cc: knut.osmundsen@oracle.com Cc: stable@vger.kernel.org Cc: stern@rowland.harvard.edu Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de Signed-off-by: Thomas Gleixner --- kernel/time/timer.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 7c446fb5163a..c611c47de884 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -878,7 +878,7 @@ static inline struct timer_base *get_timer_base(u32 tflags) #ifdef CONFIG_NO_HZ_COMMON static inline struct timer_base * -__get_target_base(struct timer_base *base, unsigned tflags) +get_target_base(struct timer_base *base, unsigned tflags) { #ifdef CONFIG_SMP if ((tflags & TIMER_PINNED) || !base->migration_enabled) @@ -891,25 +891,27 @@ __get_target_base(struct timer_base *base, unsigned tflags) static inline void forward_timer_base(struct timer_base *base) { + unsigned long jnow = READ_ONCE(jiffies); + /* * We only forward the base when it's idle and we have a delta between * base clock and jiffies. */ - if (!base->is_idle || (long) (jiffies - base->clk) < 2) + if (!base->is_idle || (long) (jnow - base->clk) < 2) return; /* * If the next expiry value is > jiffies, then we fast forward to * jiffies otherwise we forward to the next expiry value. */ - if (time_after(base->next_expiry, jiffies)) - base->clk = jiffies; + if (time_after(base->next_expiry, jnow)) + base->clk = jnow; else base->clk = base->next_expiry; } #else static inline struct timer_base * -__get_target_base(struct timer_base *base, unsigned tflags) +get_target_base(struct timer_base *base, unsigned tflags) { return get_timer_this_cpu_base(tflags); } @@ -917,14 +919,6 @@ __get_target_base(struct timer_base *base, unsigned tflags) static inline void forward_timer_base(struct timer_base *base) { } #endif -static inline struct timer_base * -get_target_base(struct timer_base *base, unsigned tflags) -{ - struct timer_base *target = __get_target_base(base, tflags); - - forward_timer_base(target); - return target; -} /* * We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means @@ -1037,6 +1031,9 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) } } + /* Try to forward a stale timer base clock */ + forward_timer_base(base); + timer->expires = expires; /* * If 'idx' was calculated above and the base time did not advance From 0ce57f8af1782fd12d3a81872a4ab97244989802 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Tue, 25 Oct 2016 14:04:34 +0200 Subject: [PATCH 201/292] ahci: fix the single MSI-X case in ahci_init_one We need to make sure hpriv->irq is set properly if we don't use per-port vectors, so switch from blindly assigning pdev->irq to using pci_irq_vector, which handles all interrupt types correctly. Signed-off-by: Christoph Hellwig Reported-by: Robert Richter Tested-by: Robert Richter Tested-by: David Daney Fixes: 0b9e2988ab22 ("ahci: use pci_alloc_irq_vectors") Signed-off-by: Tejun Heo --- drivers/ata/ahci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 60e42e2ed68f..9669fc7c19df 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -1620,7 +1620,7 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) /* legacy intx interrupts */ pci_intx(pdev, 1); } - hpriv->irq = pdev->irq; + hpriv->irq = pci_irq_vector(pdev, 0); if (!(hpriv->cap & HOST_CAP_SSS) || ahci_ignore_sss) host->flags |= ATA_HOST_PARALLEL_SCAN; From a467a672cf097ec11332a9b22db6e31d3ef50359 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Wed, 26 Oct 2016 10:07:44 +1030 Subject: [PATCH 202/292] MAINTAINERS: Begin module maintainer transition Being a Linux kernel maintainer has been my proudest professional accomplishment, spanning the last 19 years. But now we have a surfeit of excellent hackers, and I can hand this over without regret. I'll still be around as co-maintainer for another cycle, but Jessica is now the one to convince if you want your patches applied. She rocks, and is far more timely than me too! Signed-off-by: Rusty Russell Acked-by: Jessica Yu --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index c44795306342..f30b8ea700fd 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8100,6 +8100,7 @@ S: Maintained F: drivers/media/dvb-frontends/mn88473* MODULE SUPPORT +M: Jessica Yu M: Rusty Russell S: Maintained F: include/linux/module.h From 8ef4227615e158faa4ee85a1d6466782f7e22f2f Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Mon, 24 Oct 2016 15:27:59 +1000 Subject: [PATCH 203/292] x86/io: add interface to reserve io memtype for a resource range. (v1.1) A recent change to the mm code in: 87744ab3832b mm: fix cache mode tracking in vm_insert_mixed() started enforcing checking the memory type against the registered list for amixed pfn insertion mappings. It happens that the drm drivers for a number of gpus relied on this being broken. Currently the driver only inserted VRAM mappings into the tracking table when they came from the kernel, and userspace mappings never landed in the table. This led to a regression where all the mapping end up as UC instead of WC now. I've considered a number of solutions but since this needs to be fixed in fixes and not next, and some of the solutions were going to introduce overhead that hadn't been there before I didn't consider them viable at this stage. These mainly concerned hooking into the TTM io reserve APIs, but these API have a bunch of fast paths I didn't want to unwind to add this to. The solution I've decided on is to add a new API like the arch_phys_wc APIs (these would have worked but wc_del didn't take a range), and use them from the drivers to add a WC compatible mapping to the table for all VRAM on those GPUs. This means we can then create userspace mapping that won't get degraded to UC. v1.1: use CONFIG_X86_PAT + add some comments in io.h Cc: Toshi Kani Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Andy Lutomirski Cc: Denys Vlasenko Cc: Brian Gerst Cc: x86@kernel.org Cc: mcgrof@suse.com Cc: Dan Williams Acked-by: Ingo Molnar Reviewed-by: Thomas Gleixner Signed-off-by: Dave Airlie --- arch/x86/include/asm/io.h | 6 ++++++ arch/x86/mm/pat.c | 14 ++++++++++++++ include/linux/io.h | 22 ++++++++++++++++++++++ 3 files changed, 42 insertions(+) diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index de25aad07853..d34bd370074b 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -351,4 +351,10 @@ extern void arch_phys_wc_del(int handle); #define arch_phys_wc_add arch_phys_wc_add #endif +#ifdef CONFIG_X86_PAT +extern int arch_io_reserve_memtype_wc(resource_size_t start, resource_size_t size); +extern void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size); +#define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc +#endif + #endif /* _ASM_X86_IO_H */ diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 170cc4ff057b..83e701f160a9 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -730,6 +730,20 @@ void io_free_memtype(resource_size_t start, resource_size_t end) free_memtype(start, end); } +int arch_io_reserve_memtype_wc(resource_size_t start, resource_size_t size) +{ + enum page_cache_mode type = _PAGE_CACHE_MODE_WC; + + return io_reserve_memtype(start, start + size, &type); +} +EXPORT_SYMBOL(arch_io_reserve_memtype_wc); + +void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size) +{ + io_free_memtype(start, start + size); +} +EXPORT_SYMBOL(arch_io_free_memtype_wc); + pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, unsigned long size, pgprot_t vma_prot) { diff --git a/include/linux/io.h b/include/linux/io.h index e2c8419278c1..82ef36eac8a1 100644 --- a/include/linux/io.h +++ b/include/linux/io.h @@ -141,4 +141,26 @@ enum { void *memremap(resource_size_t offset, size_t size, unsigned long flags); void memunmap(void *addr); +/* + * On x86 PAT systems we have memory tracking that keeps track of + * the allowed mappings on memory ranges. This tracking works for + * all the in-kernel mapping APIs (ioremap*), but where the user + * wishes to map a range from a physical device into user memory + * the tracking won't be updated. This API is to be used by + * drivers which remap physical device pages into userspace, + * and wants to make sure they are mapped WC and not UC. + */ +#ifndef arch_io_reserve_memtype_wc +static inline int arch_io_reserve_memtype_wc(resource_size_t base, + resource_size_t size) +{ + return 0; +} + +static inline void arch_io_free_memtype_wc(resource_size_t base, + resource_size_t size) +{ +} +#endif + #endif /* _LINUX_IO_H */ From 7cf321d118a825c1541b43ca45294126fd474efa Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Mon, 24 Oct 2016 15:37:48 +1000 Subject: [PATCH 204/292] drm/drivers: add support for using the arch wc mapping API. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This fixes a regression in all these drivers since the cache mode tracking was fixed for mixed mappings. It uses the new arch API to add the VRAM range to the PAT mapping tracking tables. Fixes: 87744ab3832 (mm: fix cache mode tracking in vm_insert_mixed()) Reviewed-by: Christian König . Signed-off-by: Dave Airlie --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 +++++ drivers/gpu/drm/ast/ast_ttm.c | 6 ++++++ drivers/gpu/drm/cirrus/cirrus_ttm.c | 7 +++++++ drivers/gpu/drm/mgag200/mgag200_ttm.c | 7 +++++++ drivers/gpu/drm/nouveau/nouveau_ttm.c | 8 ++++++++ drivers/gpu/drm/radeon/radeon_object.c | 5 +++++ 6 files changed, 38 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index aa074fac0c7f..f3efb1c5dae9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -754,6 +754,10 @@ static const char *amdgpu_vram_names[] = { int amdgpu_bo_init(struct amdgpu_device *adev) { + /* reserve PAT memory space to WC for VRAM */ + arch_io_reserve_memtype_wc(adev->mc.aper_base, + adev->mc.aper_size); + /* Add an MTRR for the VRAM */ adev->mc.vram_mtrr = arch_phys_wc_add(adev->mc.aper_base, adev->mc.aper_size); @@ -769,6 +773,7 @@ void amdgpu_bo_fini(struct amdgpu_device *adev) { amdgpu_ttm_fini(adev); arch_phys_wc_del(adev->mc.vram_mtrr); + arch_io_free_memtype_wc(adev->mc.aper_base, adev->mc.aper_size); } int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo, diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c index 608df4c90520..0743e65cb240 100644 --- a/drivers/gpu/drm/ast/ast_ttm.c +++ b/drivers/gpu/drm/ast/ast_ttm.c @@ -267,6 +267,8 @@ int ast_mm_init(struct ast_private *ast) return ret; } + arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0), + pci_resource_len(dev->pdev, 0)); ast->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0), pci_resource_len(dev->pdev, 0)); @@ -275,11 +277,15 @@ int ast_mm_init(struct ast_private *ast) void ast_mm_fini(struct ast_private *ast) { + struct drm_device *dev = ast->dev; + ttm_bo_device_release(&ast->ttm.bdev); ast_ttm_global_release(ast); arch_phys_wc_del(ast->fb_mtrr); + arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0), + pci_resource_len(dev->pdev, 0)); } void ast_ttm_placement(struct ast_bo *bo, int domain) diff --git a/drivers/gpu/drm/cirrus/cirrus_ttm.c b/drivers/gpu/drm/cirrus/cirrus_ttm.c index bb2438dd8733..5e7e63ce7bce 100644 --- a/drivers/gpu/drm/cirrus/cirrus_ttm.c +++ b/drivers/gpu/drm/cirrus/cirrus_ttm.c @@ -267,6 +267,9 @@ int cirrus_mm_init(struct cirrus_device *cirrus) return ret; } + arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0), + pci_resource_len(dev->pdev, 0)); + cirrus->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0), pci_resource_len(dev->pdev, 0)); @@ -276,6 +279,8 @@ int cirrus_mm_init(struct cirrus_device *cirrus) void cirrus_mm_fini(struct cirrus_device *cirrus) { + struct drm_device *dev = cirrus->dev; + if (!cirrus->mm_inited) return; @@ -285,6 +290,8 @@ void cirrus_mm_fini(struct cirrus_device *cirrus) arch_phys_wc_del(cirrus->fb_mtrr); cirrus->fb_mtrr = 0; + arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0), + pci_resource_len(dev->pdev, 0)); } void cirrus_ttm_placement(struct cirrus_bo *bo, int domain) diff --git a/drivers/gpu/drm/mgag200/mgag200_ttm.c b/drivers/gpu/drm/mgag200/mgag200_ttm.c index 919b35f2ad24..dcf7d11ac380 100644 --- a/drivers/gpu/drm/mgag200/mgag200_ttm.c +++ b/drivers/gpu/drm/mgag200/mgag200_ttm.c @@ -266,6 +266,9 @@ int mgag200_mm_init(struct mga_device *mdev) return ret; } + arch_io_reserve_memtype_wc(pci_resource_start(dev->pdev, 0), + pci_resource_len(dev->pdev, 0)); + mdev->fb_mtrr = arch_phys_wc_add(pci_resource_start(dev->pdev, 0), pci_resource_len(dev->pdev, 0)); @@ -274,10 +277,14 @@ int mgag200_mm_init(struct mga_device *mdev) void mgag200_mm_fini(struct mga_device *mdev) { + struct drm_device *dev = mdev->dev; + ttm_bo_device_release(&mdev->ttm.bdev); mgag200_ttm_global_release(mdev); + arch_io_free_memtype_wc(pci_resource_start(dev->pdev, 0), + pci_resource_len(dev->pdev, 0)); arch_phys_wc_del(mdev->fb_mtrr); mdev->fb_mtrr = 0; } diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c index 1825dbc33192..a6dbe8258040 100644 --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c @@ -398,6 +398,9 @@ nouveau_ttm_init(struct nouveau_drm *drm) /* VRAM init */ drm->gem.vram_available = drm->device.info.ram_user; + arch_io_reserve_memtype_wc(device->func->resource_addr(device, 1), + device->func->resource_size(device, 1)); + ret = ttm_bo_init_mm(&drm->ttm.bdev, TTM_PL_VRAM, drm->gem.vram_available >> PAGE_SHIFT); if (ret) { @@ -430,6 +433,8 @@ nouveau_ttm_init(struct nouveau_drm *drm) void nouveau_ttm_fini(struct nouveau_drm *drm) { + struct nvkm_device *device = nvxx_device(&drm->device); + ttm_bo_clean_mm(&drm->ttm.bdev, TTM_PL_VRAM); ttm_bo_clean_mm(&drm->ttm.bdev, TTM_PL_TT); @@ -439,4 +444,7 @@ nouveau_ttm_fini(struct nouveau_drm *drm) arch_phys_wc_del(drm->ttm.mtrr); drm->ttm.mtrr = 0; + arch_io_free_memtype_wc(device->func->resource_addr(device, 1), + device->func->resource_size(device, 1)); + } diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index be30861afae9..41b72ce6613f 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -446,6 +446,10 @@ void radeon_bo_force_delete(struct radeon_device *rdev) int radeon_bo_init(struct radeon_device *rdev) { + /* reserve PAT memory space to WC for VRAM */ + arch_io_reserve_memtype_wc(rdev->mc.aper_base, + rdev->mc.aper_size); + /* Add an MTRR for the VRAM */ if (!rdev->fastfb_working) { rdev->mc.vram_mtrr = arch_phys_wc_add(rdev->mc.aper_base, @@ -463,6 +467,7 @@ void radeon_bo_fini(struct radeon_device *rdev) { radeon_ttm_fini(rdev); arch_phys_wc_del(rdev->mc.vram_mtrr); + arch_io_free_memtype_wc(rdev->mc.aper_base, rdev->mc.aper_size); } /* Returns how many bytes TTM can move per IB. From 2925d366f450dc1db0ac88f2509a0eec0fef4226 Mon Sep 17 00:00:00 2001 From: Stephen Boyd Date: Mon, 17 Oct 2016 17:16:02 -0700 Subject: [PATCH 205/292] extcon: qcom-spmi-misc: Sync the extcon state on interrupt The driver was changed after submission to use the new style APIs like extcon_set_state(). Unfortunately, that only sets the state, and doesn't notify any consumers that the cable state has changed. Use extcon_set_state_sync() here instead so that we notify cable consumers of the state change. This fixes USB host-device role switching on the db8074 platform. Fixes: 38085c987f52 ("extcon: Add support for qcom SPMI PMIC USB id detection hardware") Signed-off-by: Stephen Boyd Signed-off-by: Chanwoo Choi --- drivers/extcon/extcon-qcom-spmi-misc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/extcon/extcon-qcom-spmi-misc.c b/drivers/extcon/extcon-qcom-spmi-misc.c index ca957a5f4291..b8cde096a808 100644 --- a/drivers/extcon/extcon-qcom-spmi-misc.c +++ b/drivers/extcon/extcon-qcom-spmi-misc.c @@ -51,7 +51,7 @@ static void qcom_usb_extcon_detect_cable(struct work_struct *work) if (ret) return; - extcon_set_state(info->edev, EXTCON_USB_HOST, !id); + extcon_set_state_sync(info->edev, EXTCON_USB_HOST, !id); } static irqreturn_t qcom_usb_irq_handler(int irq, void *dev_id) From 62c61514191bfe5731b43619b9b1bf4b423beeb0 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Sun, 23 Oct 2016 09:32:34 -0700 Subject: [PATCH 206/292] doc: Add missing parameter for msi_setup commit 92ca8d20dee2 ("genirq/msi: Switch to new irq spreading") introduced new parameter to msi_init_setup and but did not update docbook comments. Fixes 'make htmldocs' warning. Signed-off-by: Stephen Hemminger Cc: bhelgaas@google.com Cc: linux-pci@vger.kernel.org Signed-off-by: Thomas Gleixner --- drivers/pci/msi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index bfdd0744b686..ad70507cfb56 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -610,6 +610,7 @@ static int msi_verify_entries(struct pci_dev *dev) * msi_capability_init - configure device's MSI capability structure * @dev: pointer to the pci_dev data structure of MSI device function * @nvec: number of interrupts to allocate + * @affinity: flag to indicate cpu irq affinity mask should be set * * Setup the MSI capability structure of the device with the requested * number of interrupts. A return value of zero indicates the successful @@ -752,6 +753,7 @@ static void msix_program_entries(struct pci_dev *dev, * @dev: pointer to the pci_dev data structure of MSI-X device function * @entries: pointer to an array of struct msix_entry entries * @nvec: number of @entries + * @affinity: flag to indicate cpu irq affinity mask should be set * * Setup the MSI-X capability structure of device function with a * single MSI-X irq. A return of zero indicates the successful setup of From 5de0a8c0c240338cb5b73363b0673c6aa804bb1c Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Mon, 24 Oct 2016 15:01:48 -0400 Subject: [PATCH 207/292] x86: Fix export for mcount and __fentry__ Commit 784d5699eddc5 ("x86: move exports to actual definitions") removed the EXPORT_SYMBOL(__fentry__) and EXPORT_SYMBOL(mcount) from x8664_ksyms_64.c, and added EXPORT_SYMBOL(function_hook) in mcount_64.S instead. The problem is that function_hook isn't a function at all, but a macro that is defined as either mcount or __fentry__ depending on the support from gcc. Originally, I thought this was a macro issue, like what __stringify() is used for. But the problem is a bit deeper. The Makefile.build has some magic that does post processing of files to create the CRC bindings. It does some searches for EXPORT_SYMBOL() and because it finds a macro name and not the actual functions, this causes function_hook not to be converted into mcount or __fentry__ and they are missed. Instead of adding more magic to Makefile.build, just add EXPORT_SYMBOL() for mcount and __fentry__ where the ifdef is used. Since this is assembly and not C, it doesn't require being set after the function is defined. Signed-off-by: Steven Rostedt Tested-by: Borislav Petkov Cc: Gabriel C Cc: Nicholas Piggin Cc: Al Viro Link: http://lkml.kernel.org/r/20161024150148.4f9d90e4@gandalf.local.home Signed-off-by: Thomas Gleixner --- arch/x86/kernel/mcount_64.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S index efe73aacf966..7b0d3da52fb4 100644 --- a/arch/x86/kernel/mcount_64.S +++ b/arch/x86/kernel/mcount_64.S @@ -18,8 +18,10 @@ #ifdef CC_USING_FENTRY # define function_hook __fentry__ +EXPORT_SYMBOL(__fentry__) #else # define function_hook mcount +EXPORT_SYMBOL(mcount) #endif /* All cases save the original rbp (8 bytes) */ @@ -295,7 +297,6 @@ trace: jmp fgraph_trace END(function_hook) #endif /* CONFIG_DYNAMIC_FTRACE */ -EXPORT_SYMBOL(function_hook) #endif /* CONFIG_FUNCTION_TRACER */ #ifdef CONFIG_FUNCTION_GRAPH_TRACER From 94d7dea448fae6cbb83395323c1d2fd7f19dc388 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Wed, 26 Oct 2016 16:57:15 +0800 Subject: [PATCH 208/292] block: flush: fix IO hang in case of flood fua req This patch fixes one issue reported by Kent, which can be triggered in bcachefs over sata disk. Actually it is a generic issue in block flush vs. blk-tag. Cc: Christoph Hellwig Reported-by: Kent Overstreet Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- block/blk-flush.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/block/blk-flush.c b/block/blk-flush.c index 6a14b68b9135..3c882cbc7541 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -342,6 +342,34 @@ static void flush_data_end_io(struct request *rq, int error) struct request_queue *q = rq->q; struct blk_flush_queue *fq = blk_get_flush_queue(q, NULL); + /* + * Updating q->in_flight[] here for making this tag usable + * early. Because in blk_queue_start_tag(), + * q->in_flight[BLK_RW_ASYNC] is used to limit async I/O and + * reserve tags for sync I/O. + * + * More importantly this way can avoid the following I/O + * deadlock: + * + * - suppose there are 40 fua requests comming to flush queue + * and queue depth is 31 + * - 30 rqs are scheduled then blk_queue_start_tag() can't alloc + * tag for async I/O any more + * - all the 30 rqs are completed before FLUSH_PENDING_TIMEOUT + * and flush_data_end_io() is called + * - the other rqs still can't go ahead if not updating + * q->in_flight[BLK_RW_ASYNC] here, meantime these rqs + * are held in flush data queue and make no progress of + * handling post flush rq + * - only after the post flush rq is handled, all these rqs + * can be completed + */ + + elv_completed_request(q, rq); + + /* for avoiding double accounting */ + rq->cmd_flags &= ~REQ_STARTED; + /* * After populating an empty queue, kick it to avoid stall. Read * the comment in flush_end_io(). From 26984c3bc29aa15b705475177842feddcd3d9df0 Mon Sep 17 00:00:00 2001 From: Yisheng Xie Date: Fri, 21 Oct 2016 16:13:55 +0800 Subject: [PATCH 209/292] arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity The pcpu_build_alloc_info() function group CPUs according to their proximity, by call callback function @cpu_distance_fn from different ARCHs. For arm64 the callback of @cpu_distance_fn is pcpu_cpu_distance(from, to) -> node_distance(from, to) The @from and @to for function node_distance() should be nid. However, pcpu_cpu_distance() in arch/arm64/mm/numa.c just past the cpu id for @from and @to, and didn't convert to numa node id. For this incorrect cpu proximity get from ARCH, it may cause each CPU in one group and make group_cnt out of bound: setup_per_cpu_areas() pcpu_embed_first_chunk() pcpu_build_alloc_info() in pcpu_build_alloc_info, since cpu_distance_fn will return REMOTE_DISTANCE if we pass cpu ids (0,1,2...), so cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE will wrongly be ture. This may results in triggering the BUG_ON(unit != nr_units) later: [ 0.000000] kernel BUG at mm/percpu.c:1916! [ 0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00003-g14155ca-dirty #26 [ 0.000000] Hardware name: Hisilicon Hi1616 Evaluation Board (DT) [ 0.000000] task: ffff000008d6e900 task.stack: ffff000008d60000 [ 0.000000] PC is at pcpu_embed_first_chunk+0x420/0x704 [ 0.000000] LR is at pcpu_embed_first_chunk+0x3bc/0x704 [ 0.000000] pc : [] lr : [] pstate: 800000c5 [ 0.000000] sp : ffff000008d63eb0 [ 0.000000] x29: ffff000008d63eb0 [ 0.000000] x28: 0000000000000000 [ 0.000000] x27: 0000000000000040 [ 0.000000] x26: ffff8413fbfcef00 [ 0.000000] x25: 0000000000000042 [ 0.000000] x24: 0000000000000042 [ 0.000000] x23: 0000000000001000 [ 0.000000] x22: 0000000000000046 [ 0.000000] x21: 0000000000000001 [ 0.000000] x20: ffff000008cb3bc8 [ 0.000000] x19: ffff8413fbfcf570 [ 0.000000] x18: 0000000000000000 [ 0.000000] x17: ffff000008e49ae0 [ 0.000000] x16: 0000000000000003 [ 0.000000] x15: 000000000000001e [ 0.000000] x14: 0000000000000004 [ 0.000000] x13: 0000000000000000 [ 0.000000] x12: 000000000000006f [ 0.000000] x11: 00000413fbffff00 [ 0.000000] x10: 0000000000000004 [ 0.000000] x9 : 0000000000000000 [ 0.000000] x8 : 0000000000000001 [ 0.000000] x7 : ffff8413fbfcf63c [ 0.000000] x6 : ffff000008d65d28 [ 0.000000] x5 : ffff000008d65e50 [ 0.000000] x4 : 0000000000000000 [ 0.000000] x3 : ffff000008cb3cc8 [ 0.000000] x2 : 0000000000000040 [ 0.000000] x1 : 0000000000000040 [ 0.000000] x0 : 0000000000000000 [...] [ 0.000000] Call trace: [ 0.000000] Exception stack(0xffff000008d63ce0 to 0xffff000008d63e10) [ 0.000000] 3ce0: ffff8413fbfcf570 0001000000000000 ffff000008d63eb0 ffff000008c754f4 [ 0.000000] 3d00: ffff000008d63d50 ffff0000081af210 00000413fbfff010 0000000000001000 [ 0.000000] 3d20: ffff000008d63d50 ffff0000081af220 00000413fbfff010 0000000000001000 [ 0.000000] 3d40: 00000413fbfcef00 0000000000000004 ffff000008d63db0 ffff0000081af390 [ 0.000000] 3d60: 00000413fbfcef00 0000000000001000 0000000000000000 0000000000001000 [ 0.000000] 3d80: 0000000000000000 0000000000000040 0000000000000040 ffff000008cb3cc8 [ 0.000000] 3da0: 0000000000000000 ffff000008d65e50 ffff000008d65d28 ffff8413fbfcf63c [ 0.000000] 3dc0: 0000000000000001 0000000000000000 0000000000000004 00000413fbffff00 [ 0.000000] 3de0: 000000000000006f 0000000000000000 0000000000000004 000000000000001e [ 0.000000] 3e00: 0000000000000003 ffff000008e49ae0 [ 0.000000] [] pcpu_embed_first_chunk+0x420/0x704 [ 0.000000] [] setup_per_cpu_areas+0x38/0xc8 [ 0.000000] [] start_kernel+0x10c/0x390 [ 0.000000] [] __primary_switched+0x5c/0x64 [ 0.000000] Code: b8018660 17ffffd7 6b16037f 54000080 (d4210000) [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! Fix by getting cpu's node id with early_cpu_to_node() then pass it to node_distance() as the original intention. Fixes: 7af3a0a99252 ("arm64/numa: support HAVE_SETUP_PER_CPU_AREA") Signed-off-by: Yisheng Xie Signed-off-by: Hanjun Guo Cc: Catalin Marinas Cc: Lorenzo Pieralisi Cc: Will Deacon Cc: Zhen Lei Signed-off-by: Will Deacon --- arch/arm64/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c index 778a985c8a70..9a71d0678a08 100644 --- a/arch/arm64/mm/numa.c +++ b/arch/arm64/mm/numa.c @@ -147,7 +147,7 @@ static int __init early_cpu_to_node(int cpu) static int __init pcpu_cpu_distance(unsigned int from, unsigned int to) { - return node_distance(from, to); + return node_distance(early_cpu_to_node(from), early_cpu_to_node(to)); } static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, From 3f7a09f44e5ef8a2629842f2d22892114e603fc1 Mon Sep 17 00:00:00 2001 From: Hanjun Guo Date: Fri, 21 Oct 2016 16:13:56 +0800 Subject: [PATCH 210/292] arm64/numa: fix incorrect log for memory-less node When booting on NUMA system with memory-less node (no memory dimm on this memory controller), the print for setup_node_data() is incorrect: NUMA: Initmem setup node 2 [mem 0x00000000-0xffffffffffffffff] It can be fixed by printing [mem 0x00000000-0x00000000] when end_pfn is 0, but print will be more useful. Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.") Signed-off-by: Hanjun Guo Cc: Catalin Marinas Cc: Ganapatrao Kulkarni Cc: Lorenzo Pieralisi Cc: Mark Rutland Cc: Will Deacon Cc: Yisheng Xie Signed-off-by: Will Deacon --- arch/arm64/mm/numa.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c index 9a71d0678a08..4b32168cf91a 100644 --- a/arch/arm64/mm/numa.c +++ b/arch/arm64/mm/numa.c @@ -223,8 +223,11 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn) void *nd; int tnid; - pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n", - nid, start_pfn << PAGE_SHIFT, (end_pfn << PAGE_SHIFT) - 1); + if (start_pfn < end_pfn) + pr_info("Initmem setup node %d [mem %#010Lx-%#010Lx]\n", nid, + start_pfn << PAGE_SHIFT, (end_pfn << PAGE_SHIFT) - 1); + else + pr_info("Initmem setup node %d []\n", nid); nd_pa = memblock_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid); nd = __va(nd_pa); From 3fa72fe9c614717d22ae75b84d45f41da65c10fe Mon Sep 17 00:00:00 2001 From: Neeraj Upadhyay Date: Fri, 21 Oct 2016 14:28:46 +0530 Subject: [PATCH 211/292] arm64: mm: fix __page_to_voff definition Fix parameter name for __page_to_voff, to match its definition. At present, we don't see any issue, as page_to_virt's caller declares 'page'. Fixes: 9f2875912dac ("arm64: mm: restrict virt_to_page() to the linear mapping") Acked-by: Mark Rutland Acked-by: Ard Biesheuvel Signed-off-by: Neeraj Upadhyay Signed-off-by: Will Deacon --- arch/arm64/include/asm/memory.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index ba62df8c6e35..b71086d25195 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -217,7 +217,7 @@ static inline void *phys_to_virt(phys_addr_t x) #define _virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT) #else #define __virt_to_pgoff(kaddr) (((u64)(kaddr) & ~PAGE_OFFSET) / PAGE_SIZE * sizeof(struct page)) -#define __page_to_voff(kaddr) (((u64)(page) & ~VMEMMAP_START) * PAGE_SIZE / sizeof(struct page)) +#define __page_to_voff(page) (((u64)(page) & ~VMEMMAP_START) * PAGE_SIZE / sizeof(struct page)) #define page_to_virt(page) ((void *)((__page_to_voff(page)) | PAGE_OFFSET)) #define virt_to_page(vaddr) ((struct page *)((__virt_to_pgoff(vaddr)) | VMEMMAP_START)) From 03dab869b7b239c4e013ec82aea22e181e441cfc Mon Sep 17 00:00:00 2001 From: David Howells Date: Wed, 26 Oct 2016 15:01:54 +0100 Subject: [PATCH 212/292] KEYS: Fix short sprintf buffer in /proc/keys show function This fixes CVE-2016-7042. Fix a short sprintf buffer in proc_keys_show(). If the gcc stack protector is turned on, this can cause a panic due to stack corruption. The problem is that xbuf[] is not big enough to hold a 64-bit timeout rendered as weeks: (gdb) p 0xffffffffffffffffULL/(60*60*24*7) $2 = 30500568904943 That's 14 chars plus NUL, not 11 chars plus NUL. Expand the buffer to 16 chars. I think the unpatched code apparently works if the stack-protector is not enabled because on a 32-bit machine the buffer won't be overflowed and on a 64-bit machine there's a 64-bit aligned pointer at one side and an int that isn't checked again on the other side. The panic incurred looks something like: Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6 ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679 Call Trace: [] dump_stack+0x63/0x84 [] panic+0xde/0x22a [] ? proc_keys_show+0x3ce/0x3d0 [] __stack_chk_fail+0x19/0x30 [] proc_keys_show+0x3ce/0x3d0 [] ? key_validate+0x50/0x50 [] ? key_default_cmp+0x20/0x20 [] seq_read+0x2cc/0x390 [] proc_reg_read+0x42/0x70 [] __vfs_read+0x37/0x150 [] ? security_file_permission+0xa0/0xc0 [] vfs_read+0x96/0x130 [] SyS_read+0x55/0xc0 [] entry_SYSCALL_64_fastpath+0x1a/0xa4 Reported-by: Ondrej Kozina Signed-off-by: David Howells Tested-by: Ondrej Kozina cc: stable@vger.kernel.org Signed-off-by: James Morris --- security/keys/proc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/keys/proc.c b/security/keys/proc.c index f0611a6368cd..b9f531c9e4fa 100644 --- a/security/keys/proc.c +++ b/security/keys/proc.c @@ -181,7 +181,7 @@ static int proc_keys_show(struct seq_file *m, void *v) struct timespec now; unsigned long timo; key_ref_t key_ref, skey_ref; - char xbuf[12]; + char xbuf[16]; int rc; struct keyring_search_context ctx = { From 7df3e59c3d1df4f87fe874c7956ef7a3d2f4d5fb Mon Sep 17 00:00:00 2001 From: David Howells Date: Wed, 26 Oct 2016 15:02:01 +0100 Subject: [PATCH 213/292] KEYS: Sort out big_key initialisation big_key has two separate initialisation functions, one that registers the key type and one that registers the crypto. If the key type fails to register, there's no problem if the crypto registers successfully because there's no way to reach the crypto except through the key type. However, if the key type registers successfully but the crypto does not, big_key_rng and big_key_blkcipher may end up set to NULL - but the code neither checks for this nor unregisters the big key key type. Furthermore, since the key type is registered before the crypto, it is theoretically possible for the kernel to try adding a big_key before the crypto is set up, leading to the same effect. Fix this by merging big_key_crypto_init() and big_key_init() and calling the resulting function late. If they're going to be encrypted, we shouldn't be creating big_keys before we have the facilities to do the encryption available. The key type registration is also moved after the crypto initialisation. The fix also includes message printing on failure. If the big_key type isn't correctly set up, simply doing: dd if=/dev/zero bs=4096 count=1 | keyctl padd big_key a @s ought to cause an oops. Fixes: 13100a72f40f5748a04017e0ab3df4cf27c809ef ('Security: Keys: Big keys stored encrypted') Signed-off-by: David Howells cc: Peter Hlavaty cc: Kirill Marinushkin cc: Artem Savkov cc: stable@vger.kernel.org Signed-off-by: James Morris --- security/keys/big_key.c | 59 ++++++++++++++++++++++------------------- 1 file changed, 32 insertions(+), 27 deletions(-) diff --git a/security/keys/big_key.c b/security/keys/big_key.c index c0b3030b5634..835c1ab30d01 100644 --- a/security/keys/big_key.c +++ b/security/keys/big_key.c @@ -9,6 +9,7 @@ * 2 of the Licence, or (at your option) any later version. */ +#define pr_fmt(fmt) "big_key: "fmt #include #include #include @@ -341,44 +342,48 @@ error: */ static int __init big_key_init(void) { - return register_key_type(&key_type_big_key); -} + struct crypto_skcipher *cipher; + struct crypto_rng *rng; + int ret; -/* - * Initialize big_key crypto and RNG algorithms - */ -static int __init big_key_crypto_init(void) -{ - int ret = -EINVAL; - - /* init RNG */ - big_key_rng = crypto_alloc_rng(big_key_rng_name, 0, 0); - if (IS_ERR(big_key_rng)) { - big_key_rng = NULL; - return -EFAULT; + rng = crypto_alloc_rng(big_key_rng_name, 0, 0); + if (IS_ERR(rng)) { + pr_err("Can't alloc rng: %ld\n", PTR_ERR(rng)); + return PTR_ERR(rng); } + big_key_rng = rng; + /* seed RNG */ - ret = crypto_rng_reset(big_key_rng, NULL, crypto_rng_seedsize(big_key_rng)); - if (ret) - goto error; + ret = crypto_rng_reset(rng, NULL, crypto_rng_seedsize(rng)); + if (ret) { + pr_err("Can't reset rng: %d\n", ret); + goto error_rng; + } /* init block cipher */ - big_key_skcipher = crypto_alloc_skcipher(big_key_alg_name, - 0, CRYPTO_ALG_ASYNC); - if (IS_ERR(big_key_skcipher)) { - big_key_skcipher = NULL; - ret = -EFAULT; - goto error; + cipher = crypto_alloc_skcipher(big_key_alg_name, 0, CRYPTO_ALG_ASYNC); + if (IS_ERR(cipher)) { + ret = PTR_ERR(cipher); + pr_err("Can't alloc crypto: %d\n", ret); + goto error_rng; + } + + big_key_skcipher = cipher; + + ret = register_key_type(&key_type_big_key); + if (ret < 0) { + pr_err("Can't register type: %d\n", ret); + goto error_cipher; } return 0; -error: +error_cipher: + crypto_free_skcipher(big_key_skcipher); +error_rng: crypto_free_rng(big_key_rng); - big_key_rng = NULL; return ret; } -device_initcall(big_key_init); -late_initcall(big_key_crypto_init); +late_initcall(big_key_init); From 31e6ec4519c0fe0ee4a2f6ba3ab278e9506b9500 Mon Sep 17 00:00:00 2001 From: Artem Savkov Date: Wed, 26 Oct 2016 15:02:09 +0100 Subject: [PATCH 214/292] security/keys: make BIG_KEYS dependent on stdrng. Since BIG_KEYS can't be compiled as module it requires one of the "stdrng" providers to be compiled into kernel. Otherwise big_key_crypto_init() fails on crypto_alloc_rng step and next dereference of big_key_skcipher (e.g. in big_key_preparse()) results in a NULL pointer dereference. Fixes: 13100a72f40f5748a04017e0ab3df4cf27c809ef ('Security: Keys: Big keys stored encrypted') Signed-off-by: Artem Savkov Signed-off-by: David Howells cc: Stephan Mueller cc: Kirill Marinushkin cc: stable@vger.kernel.org Signed-off-by: James Morris --- security/keys/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/keys/Kconfig b/security/keys/Kconfig index f826e8739023..d942c7c2bc0a 100644 --- a/security/keys/Kconfig +++ b/security/keys/Kconfig @@ -41,7 +41,7 @@ config BIG_KEYS bool "Large payload keys" depends on KEYS depends on TMPFS - select CRYPTO + depends on (CRYPTO_ANSI_CPRNG = y || CRYPTO_DRBG = y) select CRYPTO_AES select CRYPTO_ECB select CRYPTO_RNG From 56fb2d6eb63acd48b50437b415b6f7d2fcffe75d Mon Sep 17 00:00:00 2001 From: Josh Poimboeuf Date: Wed, 26 Oct 2016 10:34:08 -0500 Subject: [PATCH 215/292] objtool: Fix rare switch jump table pattern detection The following commit: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection") ... improved objtool's ability to detect GCC switch statement jump tables for GCC 6. However the check to allow short jumps with the scanned range of instructions wasn't quite right. The pattern detection should allow jumps to the indirect jump instruction itself. This fixes the following warning: drivers/infiniband/sw/rxe/rxe_comp.o: warning: objtool: rxe_completer()+0x315: sibling call from callable instruction with changed frame pointer Reported-by: Arnd Bergmann Signed-off-by: Josh Poimboeuf Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Fixes: 3732710ff6f2 ("objtool: Improve rare switch jump table pattern detection") Link: http://lkml.kernel.org/r/20161026153408.2rifnw7bvoc5sex7@treble Signed-off-by: Ingo Molnar --- tools/objtool/builtin-check.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c index 4490601a9235..e8a1f699058a 100644 --- a/tools/objtool/builtin-check.c +++ b/tools/objtool/builtin-check.c @@ -754,7 +754,7 @@ static struct rela *find_switch_table(struct objtool_file *file, if (insn->type == INSN_JUMP_UNCONDITIONAL && insn->jump_dest && (insn->jump_dest->offset <= insn->offset || - insn->jump_dest->offset >= orig_insn->offset)) + insn->jump_dest->offset > orig_insn->offset)) break; text_rela = find_rela_by_dest_range(insn->sec, insn->offset, From f5d6d2da0d9098a4aa0ebcc187aa0fc167045d6b Mon Sep 17 00:00:00 2001 From: Tobias Klauser Date: Wed, 26 Oct 2016 13:37:04 +0200 Subject: [PATCH 216/292] sched/fair: Remove unused but set variable 'rq' MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Since commit: 8663e24d56dc ("sched/fair: Reorder cgroup creation code") ... the variable 'rq' in alloc_fair_sched_group() is set but no longer used. Remove it to fix the following GCC warning when building with 'W=1': kernel/sched/fair.c:8842:13: warning: variable ‘rq’ set but not used [-Wunused-but-set-variable] Signed-off-by: Tobias Klauser Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20161026113704.8981-1-tklauser@distanz.ch Signed-off-by: Ingo Molnar --- kernel/sched/fair.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d941c97dfbc3..c242944f5cbd 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8839,7 +8839,6 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent) { struct sched_entity *se; struct cfs_rq *cfs_rq; - struct rq *rq; int i; tg->cfs_rq = kzalloc(sizeof(cfs_rq) * nr_cpu_ids, GFP_KERNEL); @@ -8854,8 +8853,6 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent) init_cfs_bandwidth(tg_cfs_bandwidth(tg)); for_each_possible_cpu(i) { - rq = cpu_rq(i); - cfs_rq = kzalloc_node(sizeof(struct cfs_rq), GFP_KERNEL, cpu_to_node(i)); if (!cfs_rq) From bdc3478f90cd4d2928197f36629d5cf93b64dbe9 Mon Sep 17 00:00:00 2001 From: Marcel Hasler Date: Thu, 27 Oct 2016 00:42:27 +0200 Subject: [PATCH 217/292] ALSA: usb-audio: Add quirk for Syntek STK1160 The stk1160 chip needs QUIRK_AUDIO_ALIGN_TRANSFER. This patch resolves the issue reported on the mailing list (http://marc.info/?l=linux-sound&m=139223599126215&w=2) and also fixes bug 180071 (https://bugzilla.kernel.org/show_bug.cgi?id=180071). Signed-off-by: Marcel Hasler Cc: Signed-off-by: Takashi Iwai --- sound/usb/quirks-table.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h index c60a776e815d..8a59d4782a0f 100644 --- a/sound/usb/quirks-table.h +++ b/sound/usb/quirks-table.h @@ -2907,6 +2907,23 @@ AU0828_DEVICE(0x2040, 0x7260, "Hauppauge", "HVR-950Q"), AU0828_DEVICE(0x2040, 0x7213, "Hauppauge", "HVR-950Q"), AU0828_DEVICE(0x2040, 0x7270, "Hauppauge", "HVR-950Q"), +/* Syntek STK1160 */ +{ + .match_flags = USB_DEVICE_ID_MATCH_DEVICE | + USB_DEVICE_ID_MATCH_INT_CLASS | + USB_DEVICE_ID_MATCH_INT_SUBCLASS, + .idVendor = 0x05e1, + .idProduct = 0x0408, + .bInterfaceClass = USB_CLASS_AUDIO, + .bInterfaceSubClass = USB_SUBCLASS_AUDIOCONTROL, + .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) { + .vendor_name = "Syntek", + .product_name = "STK1160", + .ifnum = QUIRK_ANY_INTERFACE, + .type = QUIRK_AUDIO_ALIGN_TRANSFER + } +}, + /* Digidesign Mbox */ { /* Thanks to Clemens Ladisch */ From 39715bf972ed4fee18fe5409609a971fb16b1771 Mon Sep 17 00:00:00 2001 From: Valentin Rothberg Date: Wed, 5 Oct 2016 07:57:26 +0200 Subject: [PATCH 218/292] powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state() It should be ALTIVEC, not ALIVEC. Cyril explains: If a thread performs a transaction with altivec and then gets preempted for whatever reason, this bug may cause the kernel to not re-enable altivec when that thread runs again. This will result in an altivec unavailable fault, when that fault happens inside a user transaction the kernel has no choice but to enable altivec and doom the transaction. The result is that transactions using altivec may get aborted more often than they should. The difficulty in catching this with a selftest is my deliberate use of the word may above. Optimisations to avoid FPU/altivec/VSX faults mean that the kernel will always leave them on for 255 switches. This code prevents the kernel turning it off if it got to the 256th switch (and userspace was transactional). Fixes: dc16b553c949 ("powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use") Reviewed-by: Cyril Bur Signed-off-by: Valentin Rothberg Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/process.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 9e7c10fe205f..ce6dc61b15b2 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1012,7 +1012,7 @@ void restore_tm_state(struct pt_regs *regs) /* Ensure that restore_math() will restore */ if (msr_diff & MSR_FP) current->thread.load_fp = 1; -#ifdef CONFIG_ALIVEC +#ifdef CONFIG_ALTIVEC if (cpu_has_feature(CPU_FTR_ALTIVEC) && msr_diff & MSR_VEC) current->thread.load_vec = 1; #endif From bd77c4498616e27d5725b5959d880ce2272fefa9 Mon Sep 17 00:00:00 2001 From: "Aneesh Kumar K.V" Date: Mon, 24 Oct 2016 08:50:43 +0530 Subject: [PATCH 219/292] powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu Before this patch, we used tlbiel, if we ever ran only on this core. That was mostly derived from the nohash usage of the same. But is incorrect, the ISA 3.0 clarifies tlbiel such that: "All TLB entries that have all of the following properties are made invalid on the thread executing the tlbiel instruction" ie. tlbiel only invalidates TLB entries on the current thread. So if the mm has been used on any other thread (aka. cpu) then we must broadcast the invalidate. This bug could lead to invalid TLB entries if a program runs on multiple threads of a core. Hence use tlbiel, if we only ever ran on only the current cpu. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Aneesh Kumar K.V Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/tlb.h | 12 ++++++++++++ arch/powerpc/mm/tlb-radix.c | 8 ++++---- 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index f6f68f73e858..99e1397b71da 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -52,11 +52,23 @@ static inline int mm_is_core_local(struct mm_struct *mm) return cpumask_subset(mm_cpumask(mm), topology_sibling_cpumask(smp_processor_id())); } + +static inline int mm_is_thread_local(struct mm_struct *mm) +{ + return cpumask_equal(mm_cpumask(mm), + cpumask_of(smp_processor_id())); +} + #else static inline int mm_is_core_local(struct mm_struct *mm) { return 1; } + +static inline int mm_is_thread_local(struct mm_struct *mm) +{ + return 1; +} #endif #endif /* __KERNEL__ */ diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 0e49ec541ab5..bda8c43be78a 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -175,7 +175,7 @@ void radix__flush_tlb_mm(struct mm_struct *mm) if (unlikely(pid == MMU_NO_CONTEXT)) goto no_context; - if (!mm_is_core_local(mm)) { + if (!mm_is_thread_local(mm)) { int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); if (lock_tlbie) @@ -201,7 +201,7 @@ void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr) if (unlikely(pid == MMU_NO_CONTEXT)) goto no_context; - if (!mm_is_core_local(mm)) { + if (!mm_is_thread_local(mm)) { int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); if (lock_tlbie) @@ -226,7 +226,7 @@ void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr, pid = mm ? mm->context.id : 0; if (unlikely(pid == MMU_NO_CONTEXT)) goto bail; - if (!mm_is_core_local(mm)) { + if (!mm_is_thread_local(mm)) { int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); if (lock_tlbie) @@ -321,7 +321,7 @@ void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start, { unsigned long pid; unsigned long addr; - int local = mm_is_core_local(mm); + int local = mm_is_thread_local(mm); unsigned long ap = mmu_get_ap(psize); int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE); unsigned long page_size = 1UL << mmu_psize_defs[psize].shift; From fb479e44a9e240a23c2d208c2ace23542a47f41c Mon Sep 17 00:00:00 2001 From: Nicholas Piggin Date: Thu, 13 Oct 2016 13:17:14 +1100 Subject: [PATCH 220/292] powerpc/64s: relocation, register save fixes for system reset interrupt This patch does a couple of things. First of all, powernv immediately explodes when running a relocated kernel, because the system reset exception for handling sleeps does not do correct relocated branches. Secondly, the sleep handling code trashes the condition and cfar registers, which we would like to preserve for debugging purposes (for non-sleep case exception). This patch changes the exception to use the standard format that saves registers before any tests or branches are made. It adds the test for idle-wakeup as an "extra" to break out of the normal exception path. Then it branches to a relocated idle handler that calls the various idle handling functions. After this patch, POWER8 CPU simulator now boots powernv kernel that is running at non-zero. Fixes: 948cf67c4726 ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin Acked-by: Gautham R. Shenoy Acked-by: Balbir Singh Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/exception-64s.h | 16 ++++++++ arch/powerpc/kernel/exceptions-64s.S | 50 ++++++++++++++---------- 2 files changed, 45 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 2e4e7d878c8e..84d49b197c32 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -93,6 +93,10 @@ ld reg,PACAKBASE(r13); /* get high part of &label */ \ ori reg,reg,(FIXED_SYMBOL_ABS_ADDR(label))@l; +#define __LOAD_HANDLER(reg, label) \ + ld reg,PACAKBASE(r13); \ + ori reg,reg,(ABS_ADDR(label))@l; + /* Exception register prefixes */ #define EXC_HV H #define EXC_STD @@ -208,6 +212,18 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) #define kvmppc_interrupt kvmppc_interrupt_pr #endif +#ifdef CONFIG_RELOCATABLE +#define BRANCH_TO_COMMON(reg, label) \ + __LOAD_HANDLER(reg, label); \ + mtctr reg; \ + bctr + +#else +#define BRANCH_TO_COMMON(reg, label) \ + b label + +#endif + #define __KVM_HANDLER_PROLOG(area, n) \ BEGIN_FTR_SECTION_NESTED(947) \ ld r10,area+EX_CFAR(r13); \ diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index f129408c6022..08ba447a4b3d 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -95,19 +95,35 @@ __start_interrupts: /* No virt vectors corresponding with 0x0..0x100 */ EXC_VIRT_NONE(0x4000, 0x4100) + +#ifdef CONFIG_PPC_P7_NAP + /* + * If running native on arch 2.06 or later, check if we are waking up + * from nap/sleep/winkle, and branch to idle handler. + */ +#define IDLETEST(n) \ + BEGIN_FTR_SECTION ; \ + mfspr r10,SPRN_SRR1 ; \ + rlwinm. r10,r10,47-31,30,31 ; \ + beq- 1f ; \ + cmpwi cr3,r10,2 ; \ + BRANCH_TO_COMMON(r10, system_reset_idle_common) ; \ +1: \ + END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) +#else +#define IDLETEST NOTEST +#endif + EXC_REAL_BEGIN(system_reset, 0x100, 0x200) SET_SCRATCH0(r13) -#ifdef CONFIG_PPC_P7_NAP -BEGIN_FTR_SECTION - /* Running native on arch 2.06 or later, check if we are - * waking up from nap/sleep/winkle. - */ - mfspr r13,SPRN_SRR1 - rlwinm. r13,r13,47-31,30,31 - beq 9f + EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common, EXC_STD, + IDLETEST, 0x100) - cmpwi cr3,r13,2 - GET_PACA(r13) +EXC_REAL_END(system_reset, 0x100, 0x200) +EXC_VIRT_NONE(0x4100, 0x4200) + +#ifdef CONFIG_PPC_P7_NAP +EXC_COMMON_BEGIN(system_reset_idle_common) bl pnv_restore_hyp_resource li r0,PNV_THREAD_RUNNING @@ -130,14 +146,8 @@ BEGIN_FTR_SECTION blt cr3,2f b pnv_wakeup_loss 2: b pnv_wakeup_noloss +#endif -9: -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) -#endif /* CONFIG_PPC_P7_NAP */ - EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common, EXC_STD, - NOTEST, 0x100) -EXC_REAL_END(system_reset, 0x100, 0x200) -EXC_VIRT_NONE(0x4100, 0x4200) EXC_COMMON(system_reset_common, 0x100, system_reset_exception) #ifdef CONFIG_PPC_PSERIES @@ -817,10 +827,8 @@ EXC_VIRT(trap_0b, 0x4b00, 0x4c00, 0xb00) TRAMP_KVM(PACA_EXGEN, 0xb00) EXC_COMMON(trap_0b_common, 0xb00, unknown_exception) - -#define LOAD_SYSCALL_HANDLER(reg) \ - ld reg,PACAKBASE(r13); \ - ori reg,reg,(ABS_ADDR(system_call_common))@l; +#define LOAD_SYSCALL_HANDLER(reg) \ + __LOAD_HANDLER(reg, system_call_common) /* Syscall routine is used twice, in reloc-off and reloc-on paths */ #define SYSCALL_PSERIES_1 \ From 42acfc6615f47e465731c263bee0c799edb098f2 Mon Sep 17 00:00:00 2001 From: Jiri Slaby Date: Mon, 3 Oct 2016 11:00:17 +0200 Subject: [PATCH 221/292] tty: vt, fix bogus division in csi_J MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In csi_J(3), the third parameter of scr_memsetw (vc_screenbuf_size) is divided by 2 inappropriatelly. But scr_memsetw expects size, not count, because it divides the size by 2 on its own before doing actual memset-by-words. So remove the bogus division. Signed-off-by: Jiri Slaby Cc: Petr Písař Fixes: f8df13e0a9 (tty: Clean console safely) Signed-off-by: Greg Kroah-Hartman --- drivers/tty/vt/vt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 06fb39c1d6dd..ae203c2cd15a 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -1176,7 +1176,7 @@ static void csi_J(struct vc_data *vc, int vpar) break; case 3: /* erase scroll-back buffer (and whole display) */ scr_memsetw(vc->vc_screenbuf, vc->vc_video_erase_char, - vc->vc_screenbuf_size >> 1); + vc->vc_screenbuf_size); set_origin(vc); if (con_is_visible(vc)) update_screen(vc); From b20fb13c7c093f9170fbf162fc7f333d7a5cf77a Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Thu, 6 Oct 2016 15:42:54 +0200 Subject: [PATCH 222/292] serial: stm32: Fix comparisons with undefined register MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit drivers/tty/serial/stm32-usart.c: In function ‘stm32_receive_chars’: drivers/tty/serial/stm32-usart.c:130: warning: comparison is always true due to limited range of data type drivers/tty/serial/stm32-usart.c: In function ‘stm32_tx_dma_complete’: drivers/tty/serial/stm32-usart.c:177: warning: comparison is always false due to limited range of data type stm32_usart_offsets.icr is u8, while UNDEF_REG = ~0 is int, and thus 0xffffffff. As all registers in stm32_usart_offsets are u8, change the definition of UNDEF_REG to 0xff to fix this. Fixes: ada8618ff3bfe183 ("serial: stm32: adding support for stm32f7") Signed-off-by: Geert Uytterhoeven Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/stm32-usart.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/stm32-usart.h b/drivers/tty/serial/stm32-usart.h index 41d974923102..cd97ceb76e4f 100644 --- a/drivers/tty/serial/stm32-usart.h +++ b/drivers/tty/serial/stm32-usart.h @@ -31,7 +31,7 @@ struct stm32_usart_info { struct stm32_usart_config cfg; }; -#define UNDEF_REG ~0 +#define UNDEF_REG 0xff /* Register offsets */ struct stm32_usart_info stm32f4_info = { From bc2a024f865d712bb5748aabe77cd515d7d5eea9 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Thu, 6 Oct 2016 15:55:24 +0200 Subject: [PATCH 223/292] serial: SERIAL_STM32 should depend on HAS_DMA If NO_DMA=y: drivers/built-in.o: In function `stm32_serial_remove': stm32-usart.c:(.text+0xcea1a): undefined reference to `bad_dma_ops' stm32-usart.c:(.text+0xcea7a): undefined reference to `bad_dma_ops' Add a dependency on HAS_DMA to fix this. Signed-off-by: Geert Uytterhoeven Acked-by: Alexandre Torgue Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig index c7831407a882..25c1d7bc0100 100644 --- a/drivers/tty/serial/Kconfig +++ b/drivers/tty/serial/Kconfig @@ -1625,6 +1625,7 @@ config SERIAL_SPRD_CONSOLE config SERIAL_STM32 tristate "STMicroelectronics STM32 serial port support" select SERIAL_CORE + depends on HAS_DMA depends on ARM || COMPILE_TEST help This driver is for the on-chip Serial Controller on From 0267a4ff9836a1a4e59044db5bb8cdaddb986d3f Mon Sep 17 00:00:00 2001 From: Nava kishore Manne Date: Wed, 12 Oct 2016 13:17:27 +0530 Subject: [PATCH 224/292] serial: xuartps: Add new compatible string for ZynqMP This patch Adds the new compatible string for ZynqMP SoC. Signed-off-by: Nava kishore Manne Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/xilinx_uartps.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c index f37edaa5ac75..dd4c02fa4820 100644 --- a/drivers/tty/serial/xilinx_uartps.c +++ b/drivers/tty/serial/xilinx_uartps.c @@ -1200,6 +1200,7 @@ static int __init cdns_early_console_setup(struct earlycon_device *device, OF_EARLYCON_DECLARE(cdns, "xlnx,xuartps", cdns_early_console_setup); OF_EARLYCON_DECLARE(cdns, "cdns,uart-r1p8", cdns_early_console_setup); OF_EARLYCON_DECLARE(cdns, "cdns,uart-r1p12", cdns_early_console_setup); +OF_EARLYCON_DECLARE(cdns, "xlnx,zynqmp-uart", cdns_early_console_setup); /** * cdns_uart_console_write - perform write operation @@ -1438,6 +1439,7 @@ static const struct of_device_id cdns_uart_of_match[] = { { .compatible = "xlnx,xuartps", }, { .compatible = "cdns,uart-r1p8", }, { .compatible = "cdns,uart-r1p12", .data = &zynqmp_uart_def }, + { .compatible = "xlnx,zynqmp-uart", .data = &zynqmp_uart_def }, {} }; MODULE_DEVICE_TABLE(of, cdns_uart_of_match); From 78c22449f2d32aafd8804047f7e3bee4926b52eb Mon Sep 17 00:00:00 2001 From: Nava kishore Manne Date: Wed, 12 Oct 2016 13:17:28 +0530 Subject: [PATCH 225/292] devicetree: bindings: uart: Add new compatible string for ZynqMP This patch Adds the new compatible string for ZynqMP SoC. Signed-off-by: Nava kishore Manne Acked-by: Rob Herring Signed-off-by: Greg Kroah-Hartman --- Documentation/devicetree/bindings/serial/cdns,uart.txt | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/serial/cdns,uart.txt b/Documentation/devicetree/bindings/serial/cdns,uart.txt index a3eb154c32ca..227bb770b027 100644 --- a/Documentation/devicetree/bindings/serial/cdns,uart.txt +++ b/Documentation/devicetree/bindings/serial/cdns,uart.txt @@ -1,7 +1,9 @@ Binding for Cadence UART Controller Required properties: -- compatible : should be "cdns,uart-r1p8", or "xlnx,xuartps" +- compatible : + Use "xlnx,xuartps","cdns,uart-r1p8" for Zynq-7xxx SoC. + Use "xlnx,zynqmp-uart","cdns,uart-r1p12" for Zynq Ultrascale+ MPSoC. - reg: Should contain UART controller registers location and length. - interrupts: Should contain UART controller interrupts. - clocks: Must contain phandles to the UART clocks From beadba5e19e2c44ec3527d3d1fc3ac3eda957e09 Mon Sep 17 00:00:00 2001 From: Wei Yongjun Date: Sun, 23 Oct 2016 11:38:18 +0000 Subject: [PATCH 226/292] serial: pch_uart: add terminate entry for dmi_system_id tables Make sure dmi_system_id tables are NULL terminated. Signed-off-by: Wei Yongjun Acked-by: Jiri Slaby Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/pch_uart.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/tty/serial/pch_uart.c b/drivers/tty/serial/pch_uart.c index d391650b82e7..42caccb5e87e 100644 --- a/drivers/tty/serial/pch_uart.c +++ b/drivers/tty/serial/pch_uart.c @@ -419,6 +419,7 @@ static struct dmi_system_id pch_uart_dmi_table[] = { }, (void *)MINNOW_UARTCLK, }, + { } }; /* Return UART clock, checking for board specific clocks. */ From 0ead21ad25f53117a1e39f0bddcb363e38886996 Mon Sep 17 00:00:00 2001 From: Denys Vlasenko Date: Mon, 24 Oct 2016 17:00:27 +0900 Subject: [PATCH 227/292] serial: 8250_uniphier: fix more unterminated string Commit 1681d2116c96 ("serial: 8250_uniphier: add "\n" at the end of error log") missed this. Signed-off-by: Denys Vlasenko [masahiro: add commit log] Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_uniphier.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/8250/8250_uniphier.c b/drivers/tty/serial/8250/8250_uniphier.c index b8d9c8c9d02a..a8babb0cf659 100644 --- a/drivers/tty/serial/8250/8250_uniphier.c +++ b/drivers/tty/serial/8250/8250_uniphier.c @@ -199,7 +199,7 @@ static int uniphier_uart_probe(struct platform_device *pdev) regs = platform_get_resource(pdev, IORESOURCE_MEM, 0); if (!regs) { - dev_err(dev, "failed to get memory resource"); + dev_err(dev, "failed to get memory resource\n"); return -EINVAL; } From 09065c5f0f1ff4bfb309975e182b746989a869c5 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Mon, 24 Oct 2016 17:00:28 +0900 Subject: [PATCH 228/292] serial: 8250_uniphier: fix clearing divisor latch access bit At this point, 'value' is always a byte, then this code is clearing bit 15, which is already clear. I meant to clear bit 7. Signed-off-by: Masahiro Yamada Reported-by: Denys Vlasenko Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_uniphier.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/8250/8250_uniphier.c b/drivers/tty/serial/8250/8250_uniphier.c index a8babb0cf659..417d9e7038e1 100644 --- a/drivers/tty/serial/8250/8250_uniphier.c +++ b/drivers/tty/serial/8250/8250_uniphier.c @@ -99,7 +99,7 @@ static void uniphier_serial_out(struct uart_port *p, int offset, int value) case UART_LCR: valshift = UNIPHIER_UART_LCR_SHIFT; /* Divisor latch access bit does not exist. */ - value &= ~(UART_LCR_DLAB << valshift); + value &= ~UART_LCR_DLAB; /* fall through */ case UART_MCR: offset = UNIPHIER_UART_LCR_MCR; From be2c92b8f1648527620058fdac2bae12b07f1fe9 Mon Sep 17 00:00:00 2001 From: Rob Herring Date: Mon, 24 Oct 2016 15:56:49 -0500 Subject: [PATCH 229/292] serial: core: fix console problems on uart_close MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 761ed4a94582 ('tty: serial_core: convert uart_close to use tty_port_close') started setting the ttyport console flag for serial drivers. This is causing crashes, hangs, or garbage output on several platforms because the serial shutdown is skipped and IRQs are left enabled. Partially revert commit 761ed4a94582 and drop reporting UART tty_ports as a console leaving the console handling to the serial_core as it was before. Fixes: 761ed4a94582ab29 ("tty: serial_core: convert uart_close to use tty_port_close") Reported-by: Niklas Söderlund Reported-by: Mike Galbraith Reported-by: Mugunthan V N Cc: Peter Hurley Cc: Geert Uytterhoeven Cc: Alan Cox Cc: Greg Kroah-Hartman Cc: Jiri Slaby Cc: linux-serial@vger.kernel.org Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/serial_core.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index 6e4f63627479..664c99aeeca5 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -2746,8 +2746,6 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *uport) uport->cons = drv->cons; uport->minor = drv->tty_driver->minor_start + uport->line; - port->console = uart_console(uport); - /* * If this port is a console, then the spinlock is already * initialised. From f00a7c57569db04633818bc5e0c0e35d62733b02 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Mon, 10 Oct 2016 11:13:48 +0300 Subject: [PATCH 230/292] serial: 8250_lpss: enable MSI for sure The commit 4fe0d154880b ("PCI: Use positive flags in pci_alloc_irq_vectors()") replaces flags from negative to positive values which makes mandatory to have the last argument in pci_alloc_irq_vectors() non-zero (if we want to be no-op). This basically drops MSI enabling in 8250_lpss driver. Restore desired behaviour in 8250_lpss by passing PCI_IRQ_ALL_TYPES instead of 0 to pci_alloc_irq_vectors(). Fixes: 60a9244a5d14 ("serial: 8250_lpss: enable MSI for Intel Quark") Cc: Christoph Hellwig Signed-off-by: Andy Shevchenko Reviewed-by: Bryan O'Donoghue Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_lpss.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/serial/8250/8250_lpss.c b/drivers/tty/serial/8250/8250_lpss.c index 886fcf37f291..b9923464599f 100644 --- a/drivers/tty/serial/8250/8250_lpss.c +++ b/drivers/tty/serial/8250/8250_lpss.c @@ -213,7 +213,7 @@ static int qrk_serial_setup(struct lpss8250 *lpss, struct uart_port *port) struct pci_dev *pdev = to_pci_dev(port->dev); int ret; - ret = pci_alloc_irq_vectors(pdev, 1, 1, 0); + ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES); if (ret < 0) return ret; From d704b2d32c39c256dea659e142a31b875a13c63b Mon Sep 17 00:00:00 2001 From: Aaron Brice Date: Thu, 6 Oct 2016 15:13:04 -0700 Subject: [PATCH 231/292] tty: serial: fsl_lpuart: Fix Tx DMA edge case In the case where head == 0 on the circular buffer, there should be one DMA buffer, not two. The second zero-length buffer would break the lpuart driver, transfer would never complete. Signed-off-by: Aaron Brice Acked-by: Stefan Agner Tested-by: Stefan Agner Tested-by: Bhuvanchandra DV Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/fsl_lpuart.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c index de9d5107c00a..76103f2c4a80 100644 --- a/drivers/tty/serial/fsl_lpuart.c +++ b/drivers/tty/serial/fsl_lpuart.c @@ -328,7 +328,7 @@ static void lpuart_dma_tx(struct lpuart_port *sport) sport->dma_tx_bytes = uart_circ_chars_pending(xmit); - if (xmit->tail < xmit->head) { + if (xmit->tail < xmit->head || xmit->head == 0) { sport->dma_tx_nents = 1; sg_init_one(sgl, xmit->buf + xmit->tail, sport->dma_tx_bytes); } else { @@ -359,7 +359,6 @@ static void lpuart_dma_tx(struct lpuart_port *sport) sport->dma_tx_in_progress = true; sport->dma_tx_cookie = dmaengine_submit(sport->dma_tx_desc); dma_async_issue_pending(sport->dma_tx_chan); - } static void lpuart_dma_tx_complete(void *arg) From 32b2921e6a7461fe63b71217067a6cf4bddb132f Mon Sep 17 00:00:00 2001 From: Dmitry Vyukov Date: Fri, 14 Oct 2016 15:18:28 +0200 Subject: [PATCH 232/292] tty: limit terminal size to 4M chars Size of kmalloc() in vc_do_resize() is controlled by user. Too large kmalloc() size triggers WARNING message on console. Put a reasonable upper bound on terminal size to prevent WARNINGs. Signed-off-by: Dmitry Vyukov CC: David Rientjes Cc: One Thousand Gnomes Cc: Greg Kroah-Hartman Cc: Jiri Slaby Cc: Peter Hurley Cc: linux-kernel@vger.kernel.org Cc: syzkaller@googlegroups.com Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/tty/vt/vt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index ae203c2cd15a..26cda08bc611 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -870,6 +870,8 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc, if (new_cols == vc->vc_cols && new_rows == vc->vc_rows) return 0; + if (new_screen_size > (4 << 20)) + return -EINVAL; newscreen = kmalloc(new_screen_size, GFP_USER); if (!newscreen) return -ENOMEM; From ecb988a3b7985913d1f0112f66667cdd15e40711 Mon Sep 17 00:00:00 2001 From: Steve Shih Date: Mon, 17 Oct 2016 09:51:05 -0700 Subject: [PATCH 233/292] tty: serial: 8250: 8250_core: NXP SC16C2552 workaround NXP SC16C2552 requires that we always write a reset to the RX FIFO and TX FIFO whenever we enable the FIFOs Cc: xe-kernel@external.cisco.com Signed-off-by: Steve Shih Signed-off-by: David Singleton Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/8250/8250_port.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c index 1bfb6fdbaa20..1731b98d2471 100644 --- a/drivers/tty/serial/8250/8250_port.c +++ b/drivers/tty/serial/8250/8250_port.c @@ -83,7 +83,8 @@ static const struct serial8250_config uart_config[] = { .name = "16550A", .fifo_size = 16, .tx_loadsz = 16, - .fcr = UART_FCR_ENABLE_FIFO | UART_FCR_R_TRIG_10, + .fcr = UART_FCR_ENABLE_FIFO | UART_FCR_R_TRIG_10 | + UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT, .rxtrig_bytes = {1, 4, 8, 14}, .flags = UART_CAP_FIFO, }, From c03e1b8703a4a0c8bac7f6b65d8fcb539ec62d82 Mon Sep 17 00:00:00 2001 From: Sergei Shtylyov Date: Fri, 21 Oct 2016 23:00:43 +0300 Subject: [PATCH 234/292] sh-sci: document R8A7743/5 support Renesas RZ/G SoC also have the SCIF, SCIFA, SCIFB, and HSCIF ports and they seem compatible with the R-Car gen2 SoC in this respect... Document RZ/G1[ME] (also known as R8A774[35]) SoC bindings. Signed-off-by: Sergei Shtylyov Acked-by: Simon Horman Acked-by: Geert Uytterhoeven Signed-off-by: Greg Kroah-Hartman --- .../devicetree/bindings/serial/renesas,sci-serial.txt | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/Documentation/devicetree/bindings/serial/renesas,sci-serial.txt b/Documentation/devicetree/bindings/serial/renesas,sci-serial.txt index 1e4000d83aee..8d27d1a603e7 100644 --- a/Documentation/devicetree/bindings/serial/renesas,sci-serial.txt +++ b/Documentation/devicetree/bindings/serial/renesas,sci-serial.txt @@ -9,6 +9,14 @@ Required properties: - "renesas,scifb-r8a73a4" for R8A73A4 (R-Mobile APE6) SCIFB compatible UART. - "renesas,scifa-r8a7740" for R8A7740 (R-Mobile A1) SCIFA compatible UART. - "renesas,scifb-r8a7740" for R8A7740 (R-Mobile A1) SCIFB compatible UART. + - "renesas,scif-r8a7743" for R8A7743 (RZ/G1M) SCIF compatible UART. + - "renesas,scifa-r8a7743" for R8A7743 (RZ/G1M) SCIFA compatible UART. + - "renesas,scifb-r8a7743" for R8A7743 (RZ/G1M) SCIFB compatible UART. + - "renesas,hscif-r8a7743" for R8A7743 (RZ/G1M) HSCIF compatible UART. + - "renesas,scif-r8a7745" for R8A7745 (RZ/G1E) SCIF compatible UART. + - "renesas,scifa-r8a7745" for R8A7745 (RZ/G1E) SCIFA compatible UART. + - "renesas,scifb-r8a7745" for R8A7745 (RZ/G1E) SCIFB compatible UART. + - "renesas,hscif-r8a7745" for R8A7745 (RZ/G1E) HSCIF compatible UART. - "renesas,scif-r8a7778" for R8A7778 (R-Car M1) SCIF compatible UART. - "renesas,scif-r8a7779" for R8A7779 (R-Car H1) SCIF compatible UART. - "renesas,scif-r8a7790" for R8A7790 (R-Car H2) SCIF compatible UART. From 03842c17397e14cb8bb1adc2015f5dce6c733ffe Mon Sep 17 00:00:00 2001 From: Francois Berder Date: Tue, 25 Oct 2016 13:24:13 +0100 Subject: [PATCH 235/292] sc16is7xx: always write state when configuring GPIO as an output The regmap_update first reads the IOState register and then triggers a write if needed. However, GPIOS might be configured as an input so the read to IOState on this GPIO is the current state which might be random. Signed-off-by: Francois Berder Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/sc16is7xx.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index 2675792a8f59..fb0672554123 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -1130,9 +1130,13 @@ static int sc16is7xx_gpio_direction_output(struct gpio_chip *chip, { struct sc16is7xx_port *s = gpiochip_get_data(chip); struct uart_port *port = &s->p[0].port; + u8 state = sc16is7xx_port_read(port, SC16IS7XX_IOSTATE_REG); - sc16is7xx_port_update(port, SC16IS7XX_IOSTATE_REG, BIT(offset), - val ? BIT(offset) : 0); + if (val) + state |= BIT(offset); + else + state &= ~BIT(offset); + sc16is7xx_port_write(port, SC16IS7XX_IOSTATE_REG, state); sc16is7xx_port_update(port, SC16IS7XX_IODIR_REG, BIT(offset), BIT(offset)); From 009e39ae44f4191188aeb6dfbf661b771dbbe515 Mon Sep 17 00:00:00 2001 From: Scot Doyle Date: Thu, 13 Oct 2016 12:12:43 -0500 Subject: [PATCH 236/292] vt: clear selection before resizing When resizing a vt its selection may exceed the new size, resulting in an invalid memory access [1]. Clear the selection before resizing. [1] http://lkml.kernel.org/r/CACT4Y+acDTwy4umEvf5ROBGiRJNrxHN4Cn5szCXE5Jw-d1B=Xw@mail.gmail.com Reported-and-tested-by: Dmitry Vyukov Signed-off-by: Scot Doyle Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/tty/vt/vt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 26cda08bc611..8c3bf3d613c0 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -876,6 +876,9 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc, if (!newscreen) return -ENOMEM; + if (vc == sel_cons) + clear_selection(); + old_rows = vc->vc_rows; old_row_size = vc->vc_size_row; From 2a9becdd4dbed499815938308bdab9aae70dd561 Mon Sep 17 00:00:00 2001 From: Tony Luck Date: Fri, 14 Oct 2016 10:56:42 -0700 Subject: [PATCH 237/292] kernfs: Add noop_fsync to supported kernfs_file_fops If you edit a kernfs backed file with vi(1), you see an ugly error message when you write the file because vi tries to fsync(2) the file after writing, which fails. We have noop_fsync() for this, use it. Signed-off-by: Tony Luck Signed-off-by: Greg Kroah-Hartman --- fs/kernfs/file.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 2bcb86e6e6ca..78219d5644e9 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -911,6 +911,7 @@ const struct file_operations kernfs_file_fops = { .open = kernfs_fop_open, .release = kernfs_fop_release, .poll = kernfs_fop_poll, + .fsync = noop_fsync, }; /** From 248ff02165437864146d6fbd2d99b2359c3723e6 Mon Sep 17 00:00:00 2001 From: Laura Abbott Date: Fri, 7 Oct 2016 09:09:30 -0700 Subject: [PATCH 238/292] driver core: Make Kconfig text for DEBUG_TEST_DRIVER_REMOVE stronger The current state of driver removal is not great. CONFIG_DEBUG_TEST_DRIVER_REMOVE finds lots of errors. The help text currently undersells exactly how many errors this option will find. Add a bit more description to indicate this option shouldn't be turned on unless you actually want to debug driver removal. The text can be changed later when more drivers are fixed up. Signed-off-by: Laura Abbott Signed-off-by: Greg Kroah-Hartman --- drivers/base/Kconfig | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index fdf44cac08e6..d02e7c0f5bfd 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -213,14 +213,16 @@ config DEBUG_DEVRES If you are unsure about this, Say N here. config DEBUG_TEST_DRIVER_REMOVE - bool "Test driver remove calls during probe" + bool "Test driver remove calls during probe (UNSTABLE)" depends on DEBUG_KERNEL help Say Y here if you want the Driver core to test driver remove functions by calling probe, remove, probe. This tests the remove path without having to unbind the driver or unload the driver module. - If you are unsure about this, say N here. + This option is expected to find errors and may render your system + unusable. You should say N here unless you are explicitly looking to + test this functionality. config SYS_HYPERVISOR bool From 7fe311302f7d52601cd799ad508a6f92cb3d748d Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Thu, 27 Oct 2016 09:49:19 -0600 Subject: [PATCH 239/292] blk-mq: update hardware and software queues for sleeping alloc If we end up sleeping due to running out of requests, we should update the hardware and software queues in the map ctx structure. Otherwise we could end up having rq->mq_ctx point to the pre-sleep context, and risk corrupting ctx->rq_list since we'll be grabbing the wrong lock when inserting the request. Reported-by: Dave Jones Reported-by: Chris Mason Tested-by: Chris Mason Fixes: 63581af3f31e ("blk-mq: remove non-blocking pass in blk_mq_map_request") Signed-off-by: Jens Axboe --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index ddc2eed64771..f3d27a6dee09 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1217,9 +1217,9 @@ static struct request *blk_mq_map_request(struct request_queue *q, blk_mq_set_alloc_data(&alloc_data, q, 0, ctx, hctx); rq = __blk_mq_alloc_request(&alloc_data, op, op_flags); - hctx->queued++; - data->hctx = hctx; - data->ctx = ctx; + data->hctx = alloc_data.hctx; + data->ctx = alloc_data.ctx; + data->hctx->queued++; return rq; } From 9dcb8b685fc30813b35ab4b4bf39244430753190 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Wed, 26 Oct 2016 10:15:30 -0700 Subject: [PATCH 240/292] mm: remove per-zone hashtable of bitlock waitqueues The per-zone waitqueues exist because of a scalability issue with the page waitqueues on some NUMA machines, but it turns out that they hurt normal loads, and now with the vmalloced stacks they also end up breaking gfs2 that uses a bit_wait on a stack object: wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE) where 'gh' can be a reference to the local variable 'mount_gh' on the stack of fill_super(). The reason the per-zone hash table breaks for this case is that there is no "zone" for virtual allocations, and trying to look up the physical page to get at it will fail (with a BUG_ON()). It turns out that I actually complained to the mm people about the per-zone hash table for another reason just a month ago: the zone lookup also hurts the regular use of "unlock_page()" a lot, because the zone lookup ends up forcing several unnecessary cache misses and generates horrible code. As part of that earlier discussion, we had a much better solution for the NUMA scalability issue - by just making the page lock have a separate contention bit, the waitqueue doesn't even have to be looked at for the normal case. Peter Zijlstra already has a patch for that, but let's see if anybody even notices. In the meantime, let's fix the actual gfs2 breakage by simplifying the bitlock waitqueues and removing the per-zone issue. Reported-by: Andreas Gruenbacher Tested-by: Bob Peterson Acked-by: Mel Gorman Cc: Peter Zijlstra Cc: Andy Lutomirski Cc: Steven Whitehouse Signed-off-by: Linus Torvalds --- include/linux/mmzone.h | 30 +---------- kernel/sched/core.c | 16 ++++++ kernel/sched/wait.c | 10 ---- mm/filemap.c | 4 +- mm/memory_hotplug.c | 28 ---------- mm/page_alloc.c | 115 +---------------------------------------- 6 files changed, 21 insertions(+), 182 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 7f2ae99e5daf..0f088f3a2fed 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -440,33 +440,7 @@ struct zone { seqlock_t span_seqlock; #endif - /* - * wait_table -- the array holding the hash table - * wait_table_hash_nr_entries -- the size of the hash table array - * wait_table_bits -- wait_table_size == (1 << wait_table_bits) - * - * The purpose of all these is to keep track of the people - * waiting for a page to become available and make them - * runnable again when possible. The trouble is that this - * consumes a lot of space, especially when so few things - * wait on pages at a given time. So instead of using - * per-page waitqueues, we use a waitqueue hash table. - * - * The bucket discipline is to sleep on the same queue when - * colliding and wake all in that wait queue when removing. - * When something wakes, it must check to be sure its page is - * truly available, a la thundering herd. The cost of a - * collision is great, but given the expected load of the - * table, they should be so rare as to be outweighed by the - * benefits from the saved space. - * - * __wait_on_page_locked() and unlock_page() in mm/filemap.c, are the - * primary users of these fields, and in mm/page_alloc.c - * free_area_init_core() performs the initialization of them. - */ - wait_queue_head_t *wait_table; - unsigned long wait_table_hash_nr_entries; - unsigned long wait_table_bits; + int initialized; /* Write-intensive fields used from the page allocator */ ZONE_PADDING(_pad1_) @@ -546,7 +520,7 @@ static inline bool zone_spans_pfn(const struct zone *zone, unsigned long pfn) static inline bool zone_is_initialized(struct zone *zone) { - return !!zone->wait_table; + return zone->initialized; } static inline bool zone_is_empty(struct zone *zone) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 94732d1ab00a..42d4027f9e26 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7515,11 +7515,27 @@ static struct kmem_cache *task_group_cache __read_mostly; DECLARE_PER_CPU(cpumask_var_t, load_balance_mask); DECLARE_PER_CPU(cpumask_var_t, select_idle_mask); +#define WAIT_TABLE_BITS 8 +#define WAIT_TABLE_SIZE (1 << WAIT_TABLE_BITS) +static wait_queue_head_t bit_wait_table[WAIT_TABLE_SIZE] __cacheline_aligned; + +wait_queue_head_t *bit_waitqueue(void *word, int bit) +{ + const int shift = BITS_PER_LONG == 32 ? 5 : 6; + unsigned long val = (unsigned long)word << shift | bit; + + return bit_wait_table + hash_long(val, WAIT_TABLE_BITS); +} +EXPORT_SYMBOL(bit_waitqueue); + void __init sched_init(void) { int i, j; unsigned long alloc_size = 0, ptr; + for (i = 0; i < WAIT_TABLE_SIZE; i++) + init_waitqueue_head(bit_wait_table + i); + #ifdef CONFIG_FAIR_GROUP_SCHED alloc_size += 2 * nr_cpu_ids * sizeof(void **); #endif diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c index 4f7053579fe3..9453efe9b25a 100644 --- a/kernel/sched/wait.c +++ b/kernel/sched/wait.c @@ -480,16 +480,6 @@ void wake_up_bit(void *word, int bit) } EXPORT_SYMBOL(wake_up_bit); -wait_queue_head_t *bit_waitqueue(void *word, int bit) -{ - const int shift = BITS_PER_LONG == 32 ? 5 : 6; - const struct zone *zone = page_zone(virt_to_page(word)); - unsigned long val = (unsigned long)word << shift | bit; - - return &zone->wait_table[hash_long(val, zone->wait_table_bits)]; -} -EXPORT_SYMBOL(bit_waitqueue); - /* * Manipulate the atomic_t address to produce a better bit waitqueue table hash * index (we're keying off bit -1, but that would produce a horrible hash diff --git a/mm/filemap.c b/mm/filemap.c index 849f459ad078..c7fe2f16503f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -790,9 +790,7 @@ EXPORT_SYMBOL(__page_cache_alloc); */ wait_queue_head_t *page_waitqueue(struct page *page) { - const struct zone *zone = page_zone(page); - - return &zone->wait_table[hash_ptr(page, zone->wait_table_bits)]; + return bit_waitqueue(page, 0); } EXPORT_SYMBOL(page_waitqueue); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 962927309b6e..b18dab401be6 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -268,7 +268,6 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat) unsigned long i, pfn, end_pfn, nr_pages; int node = pgdat->node_id; struct page *page; - struct zone *zone; nr_pages = PAGE_ALIGN(sizeof(struct pglist_data)) >> PAGE_SHIFT; page = virt_to_page(pgdat); @@ -276,19 +275,6 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat) for (i = 0; i < nr_pages; i++, page++) get_page_bootmem(node, page, NODE_INFO); - zone = &pgdat->node_zones[0]; - for (; zone < pgdat->node_zones + MAX_NR_ZONES - 1; zone++) { - if (zone_is_initialized(zone)) { - nr_pages = zone->wait_table_hash_nr_entries - * sizeof(wait_queue_head_t); - nr_pages = PAGE_ALIGN(nr_pages) >> PAGE_SHIFT; - page = virt_to_page(zone->wait_table); - - for (i = 0; i < nr_pages; i++, page++) - get_page_bootmem(node, page, NODE_INFO); - } - } - pfn = pgdat->node_start_pfn; end_pfn = pgdat_end_pfn(pgdat); @@ -2158,20 +2144,6 @@ void try_offline_node(int nid) */ node_set_offline(nid); unregister_one_node(nid); - - /* free waittable in each zone */ - for (i = 0; i < MAX_NR_ZONES; i++) { - struct zone *zone = pgdat->node_zones + i; - - /* - * wait_table may be allocated from boot memory, - * here only free if it's allocated by vmalloc. - */ - if (is_vmalloc_addr(zone->wait_table)) { - vfree(zone->wait_table); - zone->wait_table = NULL; - } - } } EXPORT_SYMBOL(try_offline_node); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 2b3bf6767d54..de7c6e43b1c9 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4976,72 +4976,6 @@ void __ref build_all_zonelists(pg_data_t *pgdat, struct zone *zone) #endif } -/* - * Helper functions to size the waitqueue hash table. - * Essentially these want to choose hash table sizes sufficiently - * large so that collisions trying to wait on pages are rare. - * But in fact, the number of active page waitqueues on typical - * systems is ridiculously low, less than 200. So this is even - * conservative, even though it seems large. - * - * The constant PAGES_PER_WAITQUEUE specifies the ratio of pages to - * waitqueues, i.e. the size of the waitq table given the number of pages. - */ -#define PAGES_PER_WAITQUEUE 256 - -#ifndef CONFIG_MEMORY_HOTPLUG -static inline unsigned long wait_table_hash_nr_entries(unsigned long pages) -{ - unsigned long size = 1; - - pages /= PAGES_PER_WAITQUEUE; - - while (size < pages) - size <<= 1; - - /* - * Once we have dozens or even hundreds of threads sleeping - * on IO we've got bigger problems than wait queue collision. - * Limit the size of the wait table to a reasonable size. - */ - size = min(size, 4096UL); - - return max(size, 4UL); -} -#else -/* - * A zone's size might be changed by hot-add, so it is not possible to determine - * a suitable size for its wait_table. So we use the maximum size now. - * - * The max wait table size = 4096 x sizeof(wait_queue_head_t). ie: - * - * i386 (preemption config) : 4096 x 16 = 64Kbyte. - * ia64, x86-64 (no preemption): 4096 x 20 = 80Kbyte. - * ia64, x86-64 (preemption) : 4096 x 24 = 96Kbyte. - * - * The maximum entries are prepared when a zone's memory is (512K + 256) pages - * or more by the traditional way. (See above). It equals: - * - * i386, x86-64, powerpc(4K page size) : = ( 2G + 1M)byte. - * ia64(16K page size) : = ( 8G + 4M)byte. - * powerpc (64K page size) : = (32G +16M)byte. - */ -static inline unsigned long wait_table_hash_nr_entries(unsigned long pages) -{ - return 4096UL; -} -#endif - -/* - * This is an integer logarithm so that shifts can be used later - * to extract the more random high bits from the multiplicative - * hash function before the remainder is taken. - */ -static inline unsigned long wait_table_bits(unsigned long size) -{ - return ffz(~size); -} - /* * Initially all pages are reserved - free ones are freed * up by free_all_bootmem() once the early boot process is @@ -5304,49 +5238,6 @@ void __init setup_per_cpu_pageset(void) alloc_percpu(struct per_cpu_nodestat); } -static noinline __ref -int zone_wait_table_init(struct zone *zone, unsigned long zone_size_pages) -{ - int i; - size_t alloc_size; - - /* - * The per-page waitqueue mechanism uses hashed waitqueues - * per zone. - */ - zone->wait_table_hash_nr_entries = - wait_table_hash_nr_entries(zone_size_pages); - zone->wait_table_bits = - wait_table_bits(zone->wait_table_hash_nr_entries); - alloc_size = zone->wait_table_hash_nr_entries - * sizeof(wait_queue_head_t); - - if (!slab_is_available()) { - zone->wait_table = (wait_queue_head_t *) - memblock_virt_alloc_node_nopanic( - alloc_size, zone->zone_pgdat->node_id); - } else { - /* - * This case means that a zone whose size was 0 gets new memory - * via memory hot-add. - * But it may be the case that a new node was hot-added. In - * this case vmalloc() will not be able to use this new node's - * memory - this wait_table must be initialized to use this new - * node itself as well. - * To use this new node's memory, further consideration will be - * necessary. - */ - zone->wait_table = vmalloc(alloc_size); - } - if (!zone->wait_table) - return -ENOMEM; - - for (i = 0; i < zone->wait_table_hash_nr_entries; ++i) - init_waitqueue_head(zone->wait_table + i); - - return 0; -} - static __meminit void zone_pcp_init(struct zone *zone) { /* @@ -5367,10 +5258,7 @@ int __meminit init_currently_empty_zone(struct zone *zone, unsigned long size) { struct pglist_data *pgdat = zone->zone_pgdat; - int ret; - ret = zone_wait_table_init(zone, size); - if (ret) - return ret; + pgdat->nr_zones = zone_idx(zone) + 1; zone->zone_start_pfn = zone_start_pfn; @@ -5382,6 +5270,7 @@ int __meminit init_currently_empty_zone(struct zone *zone, zone_start_pfn, (zone_start_pfn + size)); zone_init_free_lists(zone); + zone->initialized = 1; return 0; } From 570dd45042a7c8a7aba1ee029c5dd0f5ccf41b9b Mon Sep 17 00:00:00 2001 From: Chris Mason Date: Thu, 27 Oct 2016 10:42:20 -0700 Subject: [PATCH 241/292] btrfs: fix races on root_log_ctx lists btrfs_remove_all_log_ctxs takes a shortcut where it avoids walking the list because it knows all of the waiters are patiently waiting for the commit to finish. But, there's a small race where btrfs_sync_log can remove itself from the list if it finds a log commit is already done. Also, it uses list_del_init() to remove itself from the list, but there's no way to know if btrfs_remove_all_log_ctxs has already run, so we don't know for sure if it is safe to call list_del_init(). This gets rid of all the shortcuts for btrfs_remove_all_log_ctxs(), and just calls it with the proper locking. This is part two of the corruption fixed by cbd60aa7cd1. I should have done this in the first place, but convinced myself the optimizations were safe. A 12 hour run of dbench 2048 will eventually trigger a list debug WARN_ON for the list_del_init() in btrfs_sync_log(). Fixes: d1433debe7f4346cf9fc0dafc71c3137d2a97bc4 Reported-by: Dave Jones cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Chris Mason --- fs/btrfs/tree-log.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 688df71c1bf7..4b1c0a6eee04 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -2713,14 +2713,12 @@ static inline void btrfs_remove_all_log_ctxs(struct btrfs_root *root, int index, int error) { struct btrfs_log_ctx *ctx; + struct btrfs_log_ctx *safe; - if (!error) { - INIT_LIST_HEAD(&root->log_ctxs[index]); - return; - } - - list_for_each_entry(ctx, &root->log_ctxs[index], list) + list_for_each_entry_safe(ctx, safe, &root->log_ctxs[index], list) { + list_del_init(&ctx->list); ctx->log_ret = error; + } INIT_LIST_HEAD(&root->log_ctxs[index]); } @@ -2961,13 +2959,9 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, mutex_unlock(&root->log_mutex); out_wake_log_root: - /* - * We needn't get log_mutex here because we are sure all - * the other tasks are blocked. - */ + mutex_lock(&log_root_tree->log_mutex); btrfs_remove_all_log_ctxs(log_root_tree, index2, ret); - mutex_lock(&log_root_tree->log_mutex); log_root_tree->log_transid_committed++; atomic_set(&log_root_tree->log_commit[index2], 0); mutex_unlock(&log_root_tree->log_mutex); @@ -2978,10 +2972,8 @@ out_wake_log_root: if (waitqueue_active(&log_root_tree->log_commit_wait[index2])) wake_up(&log_root_tree->log_commit_wait[index2]); out: - /* See above. */ - btrfs_remove_all_log_ctxs(root, index1, ret); - mutex_lock(&root->log_mutex); + btrfs_remove_all_log_ctxs(root, index1, ret); root->log_transid_committed++; atomic_set(&root->log_commit[index1], 0); mutex_unlock(&root->log_mutex); From 9db4f36e82c2394c958d8e42a498fb664684bc22 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Thu, 27 Oct 2016 15:49:12 -0700 Subject: [PATCH 242/292] mm: remove unused variable in memory hotplug When I removed the per-zone bitlock hashed waitqueues in commit 9dcb8b685fc3 ("mm: remove per-zone hashtable of bitlock waitqueues"), I removed all the magic hotplug memory initialization of said waitqueues too. But when I actually _tested_ the resulting build, I stupidly assumed that "allmodconfig" would enable memory hotplug. And it doesn't, because it enables KASAN instead, which then disables hotplug memory support. As a result, my build test of the per-zone waitqueues was totally broken, and I didn't notice that the compiler warns about the now unused iterator variable 'i'. I guess I should be happy that that seems to be the worst breakage from my clearly horribly failed test coverage. Reported-by: Stephen Rothwell Signed-off-by: Linus Torvalds --- mm/memory_hotplug.c | 1 - 1 file changed, 1 deletion(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index b18dab401be6..cad4b9125695 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -2117,7 +2117,6 @@ void try_offline_node(int nid) unsigned long start_pfn = pgdat->node_start_pfn; unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages; unsigned long pfn; - int i; for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) { unsigned long section_nr = pfn_to_section_nr(pfn); From 867dfe342118b1ea0256a85f7c0d9ceb0ead032a Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 25 Oct 2016 17:52:04 +0200 Subject: [PATCH 243/292] nvdimm: make CONFIG_NVDIMM_DAX 'bool' A bugfix just tried to address a randconfig build problem and introduced a variant of the same problem: with CONFIG_LIBNVDIMM=y and CONFIG_NVDIMM_DAX=m, the nvdimm module now fails to link: drivers/nvdimm/built-in.o: In function `to_nd_device_type': bus.c:(.text+0x1b5d): undefined reference to `is_nd_dax' drivers/nvdimm/built-in.o: In function `nd_region_notify_driver_action.constprop.2': region_devs.c:(.text+0x6b6c): undefined reference to `is_nd_dax' region_devs.c:(.text+0x6b8c): undefined reference to `to_nd_dax' drivers/nvdimm/built-in.o: In function `nd_region_probe': region.c:(.text+0x70f3): undefined reference to `nd_dax_create' drivers/nvdimm/built-in.o: In function `mode_show': namespace_devs.c:(.text+0xa196): undefined reference to `is_nd_dax' drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe': (.text+0xa55f): undefined reference to `is_nd_dax' drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe': (.text+0xa56e): undefined reference to `to_nd_dax' This reverts the earlier fix, making NVDIMM_DAX a 'bool' option again as it should be (it gets linked into the libnvdimm module). To fix the original problem, I'm adding a dependency on LIBNVDIMM to DEV_DAX_PMEM, which ensures we can't have that one built-in if the rest is a module. Fixes: 4e65e9381c7a ("/dev/dax: fix Kconfig dependency build breakage") Signed-off-by: Arnd Bergmann Reviewed-by: Ross Zwisler Signed-off-by: Dan Williams --- drivers/dax/Kconfig | 2 +- drivers/nvdimm/Kconfig | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index daadd20aa936..3e2ab3b14eea 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -14,7 +14,7 @@ if DEV_DAX config DEV_DAX_PMEM tristate "PMEM DAX: direct access to persistent memory" - depends on NVDIMM_DAX + depends on LIBNVDIMM && NVDIMM_DAX default DEV_DAX help Support raw access to persistent memory. Note that this diff --git a/drivers/nvdimm/Kconfig b/drivers/nvdimm/Kconfig index 8b2b740d6679..124c2432ac9c 100644 --- a/drivers/nvdimm/Kconfig +++ b/drivers/nvdimm/Kconfig @@ -89,7 +89,7 @@ config NVDIMM_PFN Select Y if unsure config NVDIMM_DAX - tristate "NVDIMM DAX: Raw access to persistent memory" + bool "NVDIMM DAX: Raw access to persistent memory" default LIBNVDIMM depends on NVDIMM_PFN help From 67463e54beb63114965c3d2c7cb81d1d524e2697 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Thu, 27 Oct 2016 16:23:01 -0700 Subject: [PATCH 244/292] Allow KASAN and HOTPLUG_MEMORY to co-exist when doing build testing No, KASAN may not be able to co-exist with HOTPLUG_MEMORY at runtime, but for build testing there is no reason not to allow them together. This hopefully means better build coverage and fewer embarrasing silly problems like the one fixed by commit 9db4f36e82c2 ("mm: remove unused variable in memory hotplug") in the future. Cc: Stephen Rothwell Cc: Andrey Ryabinin Cc: Alexander Potapenko Signed-off-by: Linus Torvalds --- mm/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/Kconfig b/mm/Kconfig index be0ee11fa0d9..86e3e0e74d20 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -187,7 +187,7 @@ config MEMORY_HOTPLUG bool "Allow for memory hot-add" depends on SPARSEMEM || X86_64_ACPI_NUMA depends on ARCH_ENABLE_MEMORY_HOTPLUG - depends on !KASAN + depends on COMPILE_TEST || !KASAN config MEMORY_HOTPLUG_SPARSE def_bool y From 52e73eb2872c9af6f382b2b22954ca8214397a4e Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Thu, 27 Oct 2016 17:04:05 -0700 Subject: [PATCH 245/292] device-dax: fix percpu_ref_exit ordering We need to wait until the percpu_ref is released before exit. Otherwise, we sometimes lose the race and trigger this new warning that was added in v4.9 (commit a67823c1ed10 "percpu-refcount: init ->confirm_switch member properly"): WARNING: CPU: 0 PID: 3629 at lib/percpu-refcount.c:107 percpu_ref_exit+0x51/0x60 [..] Call Trace: [] dump_stack+0x85/0xc2 [] __warn+0xcb/0xf0 [] warn_slowpath_null+0x1d/0x20 [] percpu_ref_exit+0x51/0x60 [] dax_pmem_percpu_exit+0x1a/0x50 [dax_pmem] [] devm_action_release+0xf/0x20 Cc: Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory") Signed-off-by: Dan Williams --- drivers/dax/pmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c index 9630d8837ba9..4a15fa5df98b 100644 --- a/drivers/dax/pmem.c +++ b/drivers/dax/pmem.c @@ -44,7 +44,6 @@ static void dax_pmem_percpu_exit(void *data) dev_dbg(dax_pmem->dev, "%s\n", __func__); percpu_ref_exit(ref); - wait_for_completion(&dax_pmem->cmp); } static void dax_pmem_percpu_kill(void *data) @@ -54,6 +53,7 @@ static void dax_pmem_percpu_kill(void *data) dev_dbg(dax_pmem->dev, "%s\n", __func__); percpu_ref_kill(ref); + wait_for_completion(&dax_pmem->cmp); } static int dax_pmem_probe(struct device *dev) From 86d9f48534e800e4d62cdc1b5aaf539f4c1d47d6 Mon Sep 17 00:00:00 2001 From: Joonsoo Kim Date: Thu, 27 Oct 2016 17:46:18 -0700 Subject: [PATCH 246/292] mm/slab: fix kmemcg cache creation delayed issue There is a bug report that SLAB makes extreme load average due to over 2000 kworker thread. https://bugzilla.kernel.org/show_bug.cgi?id=172981 This issue is caused by kmemcg feature that try to create new set of kmem_caches for each memcg. Recently, kmem_cache creation is slowed by synchronize_sched() and futher kmem_cache creation is also delayed since kmem_cache creation is synchronized by a global slab_mutex lock. So, the number of kworker that try to create kmem_cache increases quietly. synchronize_sched() is for lockless access to node's shared array but it's not needed when a new kmem_cache is created. So, this patch rules out that case. Fixes: 801faf0db894 ("mm/slab: lockless decision to grow cache") Link: http://lkml.kernel.org/r/1475734855-4837-1-git-send-email-iamjoonsoo.kim@lge.com Reported-by: Doug Smythies Tested-by: Doug Smythies Signed-off-by: Joonsoo Kim Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/slab.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slab.c b/mm/slab.c index 090fb26b3a39..c451e3f6bbf6 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -966,7 +966,7 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep, * guaranteed to be valid until irq is re-enabled, because it will be * freed after synchronize_sched(). */ - if (force_change) + if (old_shared && force_change) synchronize_sched(); fail: From b274c0bb394c6a69ac12feac7c2db81f5aff5a55 Mon Sep 17 00:00:00 2001 From: Andrey Konovalov Date: Thu, 27 Oct 2016 17:46:21 -0700 Subject: [PATCH 247/292] kcov: properly check if we are in an interrupt in_interrupt() returns a nonzero value when we are either in an interrupt or have bh disabled via local_bh_disable(). Since we are interested in only ignoring coverage from actual interrupts, do a proper check instead of just calling in_interrupt(). As a result of this change, kcov will start to collect coverage from within local_bh_disable()/local_bh_enable() sections. Link: http://lkml.kernel.org/r/1476115803-20712-1-git-send-email-andreyknvl@google.com Signed-off-by: Andrey Konovalov Acked-by: Dmitry Vyukov Cc: Nicolai Stange Cc: Andrey Ryabinin Cc: Kees Cook Cc: James Morse Cc: Vegard Nossum Cc: Quentin Casasnovas Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- kernel/kcov.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/kernel/kcov.c b/kernel/kcov.c index 8d44b3fea9d0..30e6d05aa5a9 100644 --- a/kernel/kcov.c +++ b/kernel/kcov.c @@ -53,8 +53,15 @@ void notrace __sanitizer_cov_trace_pc(void) /* * We are interested in code coverage as a function of a syscall inputs, * so we ignore code executed in interrupts. + * The checks for whether we are in an interrupt are open-coded, because + * 1. We can't use in_interrupt() here, since it also returns true + * when we are inside local_bh_disable() section. + * 2. We don't want to use (in_irq() | in_serving_softirq() | in_nmi()), + * since that leads to slower generated code (three separate tests, + * one for each of the flags). */ - if (!t || in_interrupt()) + if (!t || (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_OFFSET + | NMI_MASK))) return; mode = READ_ONCE(t->kcov_mode); if (mode == KCOV_MODE_TRACE) { From 21753583056d48a5fad964d6f272e28168426845 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Thu, 27 Oct 2016 17:46:24 -0700 Subject: [PATCH 248/292] h8300: fix syscall restarting Back in commit f56141e3e2d9 ("all arches, signal: move restart_block to struct task_struct"), all architectures and core code were changed to use task_struct::restart_block. However, when h8300 support was subsequently restored in v4.2, it was not updated to account for this, and maintains thread_info::restart_block, which is not kept in sync. This patch drops the redundant restart_block from thread_info, and moves h8300 to the common one in task_struct, ensuring that syscall restarting always works as expected. Fixes: f56141e3e2d9 ("all arches, signal: move restart_block to struct task_struct") Link: http://lkml.kernel.org/r/1476714934-11635-1-git-send-email-mark.rutland@arm.com Signed-off-by: Mark Rutland Cc: Andy Lutomirski Cc: Yoshinori Sato Cc: uclinux-h8-devel@lists.sourceforge.jp Cc: [4.2+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/h8300/include/asm/thread_info.h | 4 ---- arch/h8300/kernel/signal.c | 2 +- 2 files changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/h8300/include/asm/thread_info.h b/arch/h8300/include/asm/thread_info.h index b408fe660cf8..3cef06875f5c 100644 --- a/arch/h8300/include/asm/thread_info.h +++ b/arch/h8300/include/asm/thread_info.h @@ -31,7 +31,6 @@ struct thread_info { int cpu; /* cpu we're on */ int preempt_count; /* 0 => preemptable, <0 => BUG */ mm_segment_t addr_limit; - struct restart_block restart_block; }; /* @@ -44,9 +43,6 @@ struct thread_info { .cpu = 0, \ .preempt_count = INIT_PREEMPT_COUNT, \ .addr_limit = KERNEL_DS, \ - .restart_block = { \ - .fn = do_no_restart_syscall, \ - }, \ } #define init_thread_info (init_thread_union.thread_info) diff --git a/arch/h8300/kernel/signal.c b/arch/h8300/kernel/signal.c index ad1f81f574e5..7138303cbbf2 100644 --- a/arch/h8300/kernel/signal.c +++ b/arch/h8300/kernel/signal.c @@ -79,7 +79,7 @@ restore_sigcontext(struct sigcontext *usc, int *pd0) unsigned int er0; /* Always make any pending restarted system calls return -EINTR */ - current_thread_info()->restart_block.fn = do_no_restart_syscall; + current->restart_block.fn = do_no_restart_syscall; /* restore passed registers */ #define COPY(r) do { err |= get_user(regs->r, &usc->sc_##r); } while (0) From 1bc11d70b5db7c6bb1414b283d7f09b1fe1ac0d0 Mon Sep 17 00:00:00 2001 From: Alexander Polakov Date: Thu, 27 Oct 2016 17:46:27 -0700 Subject: [PATCH 249/292] mm/list_lru.c: avoid error-path NULL pointer deref As described in https://bugzilla.kernel.org/show_bug.cgi?id=177821: After some analysis it seems to be that the problem is in alloc_super(). In case list_lru_init_memcg() fails it goes into destroy_super(), which calls list_lru_destroy(). And in list_lru_init() we see that in case memcg_init_list_lru() fails, lru->node is freed, but not set NULL, which then leads list_lru_destroy() to believe it is initialized and call memcg_destroy_list_lru(). memcg_destroy_list_lru() in turn can access lru->node[i].memcg_lrus, which is NULL. [akpm@linux-foundation.org: add comment] Signed-off-by: Alexander Polakov Acked-by: Vladimir Davydov Cc: Al Viro Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/list_lru.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/list_lru.c b/mm/list_lru.c index 1d05cb9d363d..234676e31edd 100644 --- a/mm/list_lru.c +++ b/mm/list_lru.c @@ -554,6 +554,8 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware, err = memcg_init_list_lru(lru, memcg_aware); if (err) { kfree(lru->node); + /* Do this so a list_lru_destroy() doesn't crash: */ + lru->node = NULL; goto out; } From 1f84a18fc010d7a62667199c9be35872bbf31526 Mon Sep 17 00:00:00 2001 From: Joe Perches Date: Thu, 27 Oct 2016 17:46:29 -0700 Subject: [PATCH 250/292] mm: page_alloc: use KERN_CONT where appropriate Recent changes to printk require KERN_CONT uses to continue logging messages. So add KERN_CONT where necessary. [akpm@linux-foundation.org: coding-style fixes] Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") Link: http://lkml.kernel.org/r/c7df37c8665134654a17aaeb8b9f6ace1d6db58b.1476239034.git.joe@perches.com Reported-by: Mark Rutland Signed-off-by: Joe Perches Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/page_alloc.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index de7c6e43b1c9..8fd42aa7c4bd 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4224,7 +4224,7 @@ static void show_migration_types(unsigned char type) } *p = '\0'; - printk("(%s) ", tmp); + printk(KERN_CONT "(%s) ", tmp); } /* @@ -4335,7 +4335,8 @@ void show_free_areas(unsigned int filter) free_pcp += per_cpu_ptr(zone->pageset, cpu)->pcp.count; show_node(zone); - printk("%s" + printk(KERN_CONT + "%s" " free:%lukB" " min:%lukB" " low:%lukB" @@ -4382,8 +4383,8 @@ void show_free_areas(unsigned int filter) K(zone_page_state(zone, NR_FREE_CMA_PAGES))); printk("lowmem_reserve[]:"); for (i = 0; i < MAX_NR_ZONES; i++) - printk(" %ld", zone->lowmem_reserve[i]); - printk("\n"); + printk(KERN_CONT " %ld", zone->lowmem_reserve[i]); + printk(KERN_CONT "\n"); } for_each_populated_zone(zone) { @@ -4394,7 +4395,7 @@ void show_free_areas(unsigned int filter) if (skip_free_areas_node(filter, zone_to_nid(zone))) continue; show_node(zone); - printk("%s: ", zone->name); + printk(KERN_CONT "%s: ", zone->name); spin_lock_irqsave(&zone->lock, flags); for (order = 0; order < MAX_ORDER; order++) { @@ -4412,11 +4413,12 @@ void show_free_areas(unsigned int filter) } spin_unlock_irqrestore(&zone->lock, flags); for (order = 0; order < MAX_ORDER; order++) { - printk("%lu*%lukB ", nr[order], K(1UL) << order); + printk(KERN_CONT "%lu*%lukB ", + nr[order], K(1UL) << order); if (nr[order]) show_migration_types(types[order]); } - printk("= %lukB\n", K(total)); + printk(KERN_CONT "= %lukB\n", K(total)); } hugetlb_show_meminfo(); From 07a63c41fa1f6533f5668e5b33a295bfd63aa534 Mon Sep 17 00:00:00 2001 From: Aruna Ramakrishna Date: Thu, 27 Oct 2016 17:46:32 -0700 Subject: [PATCH 251/292] mm/slab: improve performance of gathering slabinfo stats On large systems, when some slab caches grow to millions of objects (and many gigabytes), running 'cat /proc/slabinfo' can take up to 1-2 seconds. During this time, interrupts are disabled while walking the slab lists (slabs_full, slabs_partial, and slabs_free) for each node, and this sometimes causes timeouts in other drivers (for instance, Infiniband). This patch optimizes 'cat /proc/slabinfo' by maintaining a counter for total number of allocated slabs per node, per cache. This counter is updated when a slab is created or destroyed. This enables us to skip traversing the slabs_full list while gathering slabinfo statistics, and since slabs_full tends to be the biggest list when the cache is large, it results in a dramatic performance improvement. Getting slabinfo statistics now only requires walking the slabs_free and slabs_partial lists, and those lists are usually much smaller than slabs_full. We tested this after growing the dentry cache to 70GB, and the performance improved from 2s to 5ms. Link: http://lkml.kernel.org/r/1472517876-26814-1-git-send-email-aruna.ramakrishna@oracle.com Signed-off-by: Aruna Ramakrishna Acked-by: David Rientjes Cc: Mike Kravetz Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/slab.c | 43 +++++++++++++++++++++++++++---------------- mm/slab.h | 1 + 2 files changed, 28 insertions(+), 16 deletions(-) diff --git a/mm/slab.c b/mm/slab.c index c451e3f6bbf6..0b0550ca85b4 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -233,6 +233,7 @@ static void kmem_cache_node_init(struct kmem_cache_node *parent) spin_lock_init(&parent->list_lock); parent->free_objects = 0; parent->free_touched = 0; + parent->num_slabs = 0; } #define MAKE_LIST(cachep, listp, slab, nodeid) \ @@ -1382,24 +1383,27 @@ slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nodeid) for_each_kmem_cache_node(cachep, node, n) { unsigned long active_objs = 0, num_objs = 0, free_objects = 0; unsigned long active_slabs = 0, num_slabs = 0; + unsigned long num_slabs_partial = 0, num_slabs_free = 0; + unsigned long num_slabs_full; spin_lock_irqsave(&n->list_lock, flags); - list_for_each_entry(page, &n->slabs_full, lru) { - active_objs += cachep->num; - active_slabs++; - } + num_slabs = n->num_slabs; list_for_each_entry(page, &n->slabs_partial, lru) { active_objs += page->active; - active_slabs++; + num_slabs_partial++; } list_for_each_entry(page, &n->slabs_free, lru) - num_slabs++; + num_slabs_free++; free_objects += n->free_objects; spin_unlock_irqrestore(&n->list_lock, flags); - num_slabs += active_slabs; num_objs = num_slabs * cachep->num; + active_slabs = num_slabs - num_slabs_free; + num_slabs_full = num_slabs - + (num_slabs_partial + num_slabs_free); + active_objs += (num_slabs_full * cachep->num); + pr_warn(" node %d: slabs: %ld/%ld, objs: %ld/%ld, free: %ld\n", node, active_slabs, num_slabs, active_objs, num_objs, free_objects); @@ -2314,6 +2318,7 @@ static int drain_freelist(struct kmem_cache *cache, page = list_entry(p, struct page, lru); list_del(&page->lru); + n->num_slabs--; /* * Safe to drop the lock. The slab is no longer linked * to the cache. @@ -2752,6 +2757,8 @@ static void cache_grow_end(struct kmem_cache *cachep, struct page *page) list_add_tail(&page->lru, &(n->slabs_free)); else fixup_slab_list(cachep, n, page, &list); + + n->num_slabs++; STATS_INC_GROWN(cachep); n->free_objects += cachep->num - page->active; spin_unlock(&n->list_lock); @@ -3443,6 +3450,7 @@ static void free_block(struct kmem_cache *cachep, void **objpp, page = list_last_entry(&n->slabs_free, struct page, lru); list_move(&page->lru, list); + n->num_slabs--; } } @@ -4099,6 +4107,8 @@ void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo) unsigned long num_objs; unsigned long active_slabs = 0; unsigned long num_slabs, free_objects = 0, shared_avail = 0; + unsigned long num_slabs_partial = 0, num_slabs_free = 0; + unsigned long num_slabs_full = 0; const char *name; char *error = NULL; int node; @@ -4111,33 +4121,34 @@ void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo) check_irq_on(); spin_lock_irq(&n->list_lock); - list_for_each_entry(page, &n->slabs_full, lru) { - if (page->active != cachep->num && !error) - error = "slabs_full accounting error"; - active_objs += cachep->num; - active_slabs++; - } + num_slabs += n->num_slabs; + list_for_each_entry(page, &n->slabs_partial, lru) { if (page->active == cachep->num && !error) error = "slabs_partial accounting error"; if (!page->active && !error) error = "slabs_partial accounting error"; active_objs += page->active; - active_slabs++; + num_slabs_partial++; } + list_for_each_entry(page, &n->slabs_free, lru) { if (page->active && !error) error = "slabs_free accounting error"; - num_slabs++; + num_slabs_free++; } + free_objects += n->free_objects; if (n->shared) shared_avail += n->shared->avail; spin_unlock_irq(&n->list_lock); } - num_slabs += active_slabs; num_objs = num_slabs * cachep->num; + active_slabs = num_slabs - num_slabs_free; + num_slabs_full = num_slabs - (num_slabs_partial + num_slabs_free); + active_objs += (num_slabs_full * cachep->num); + if (num_objs - active_objs != free_objects && !error) error = "free_objects accounting error"; diff --git a/mm/slab.h b/mm/slab.h index 9653f2e2591a..bc05fdc3edce 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -432,6 +432,7 @@ struct kmem_cache_node { struct list_head slabs_partial; /* partial list first, better asm code */ struct list_head slabs_full; struct list_head slabs_free; + unsigned long num_slabs; unsigned long free_objects; unsigned int free_limit; unsigned int colour_next; /* Per-node cache coloring */ From 8c8d4d45204902e144abc0f15b7c658828028fa1 Mon Sep 17 00:00:00 2001 From: Aristeu Rozanski Date: Thu, 27 Oct 2016 17:46:35 -0700 Subject: [PATCH 252/292] ipc: account for kmem usage on mqueue and msg When kmem accounting switched from account by default to only account if flagged by __GFP_ACCOUNT, IPC mqueue and messages was left out. The production use case at hand is that mqueues should be customizable via sysctls in Docker containers in a Kubernetes cluster. This can only be safely allowed to the users of the cluster (without the risk that they can cause resource shortage on a node, influencing other users' containers) if all resources they control are bounded, i.e. accounted for. Link: http://lkml.kernel.org/r/1476806075-1210-1-git-send-email-arozansk@redhat.com Signed-off-by: Aristeu Rozanski Reported-by: Stefan Schimanski Acked-by: Davidlohr Bueso Cc: Alexey Dobriyan Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Stefan Schimanski Cc: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- ipc/msgutil.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ipc/msgutil.c b/ipc/msgutil.c index a521999de4f1..bf74eaa5c39f 100644 --- a/ipc/msgutil.c +++ b/ipc/msgutil.c @@ -53,7 +53,7 @@ static struct msg_msg *alloc_msg(size_t len) size_t alen; alen = min(len, DATALEN_MSG); - msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL); + msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT); if (msg == NULL) return NULL; @@ -65,7 +65,7 @@ static struct msg_msg *alloc_msg(size_t len) while (len > 0) { struct msg_msgseg *seg; alen = min(len, DATALEN_SEG); - seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL); + seg = kmalloc(sizeof(*seg) + alen, GFP_KERNEL_ACCOUNT); if (seg == NULL) goto out_err; *pseg = seg; From c0a0aba8e478229b2f0956918542152fbad3f794 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Thu, 27 Oct 2016 17:46:38 -0700 Subject: [PATCH 253/292] kconfig.h: remove config_enabled() macro The use of config_enabled() is ambiguous. For config options, IS_ENABLED(), IS_REACHABLE(), etc. will make intention clearer. Sometimes config_enabled() has been used for non-config options because it is useful to check whether the given symbol is defined or not. I have been tackling on deprecating config_enabled(), and now is the time to finish this work. Some new users have appeared for v4.9-rc1, but it is trivial to replace them: - arch/x86/mm/kaslr.c replace config_enabled() with IS_ENABLED() because CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean. - include/asm-generic/export.h replace config_enabled() with __is_defined(). Then, config_enabled() can be removed now. Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config options, and __is_defined() for non-config symbols. Link: http://lkml.kernel.org/r/1476616078-32252-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada Acked-by: Ingo Molnar Acked-by: Nicolas Pitre Cc: Peter Oberparleiter Cc: Arnd Bergmann Cc: Kees Cook Cc: Michal Marek Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Thomas Garnier Cc: Paul Bolle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/x86/mm/kaslr.c | 6 +++--- include/asm-generic/export.h | 2 +- include/linux/kconfig.h | 5 ++--- 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c index ddd2661c4502..887e57182716 100644 --- a/arch/x86/mm/kaslr.c +++ b/arch/x86/mm/kaslr.c @@ -104,10 +104,10 @@ void __init kernel_randomize_memory(void) * consistent with the vaddr_start/vaddr_end variables. */ BUILD_BUG_ON(vaddr_start >= vaddr_end); - BUILD_BUG_ON(config_enabled(CONFIG_X86_ESPFIX64) && + BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_ESPFIX64) && vaddr_end >= EFI_VA_START); - BUILD_BUG_ON((config_enabled(CONFIG_X86_ESPFIX64) || - config_enabled(CONFIG_EFI)) && + BUILD_BUG_ON((IS_ENABLED(CONFIG_X86_ESPFIX64) || + IS_ENABLED(CONFIG_EFI)) && vaddr_end >= __START_KERNEL_map); BUILD_BUG_ON(vaddr_end > __START_KERNEL_map); diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index 43199a049da5..63554e9f6e0c 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -70,7 +70,7 @@ KSYM(__kcrctab_\name): #include #define __EXPORT_SYMBOL(sym, val, sec) \ - __cond_export_sym(sym, val, sec, config_enabled(__KSYM_##sym)) + __cond_export_sym(sym, val, sec, __is_defined(__KSYM_##sym)) #define __cond_export_sym(sym, val, sec, conf) \ ___cond_export_sym(sym, val, sec, conf) #define ___cond_export_sym(sym, val, sec, enabled) \ diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h index 15ec117ec537..8f2e059e4d45 100644 --- a/include/linux/kconfig.h +++ b/include/linux/kconfig.h @@ -31,7 +31,6 @@ * When CONFIG_BOOGER is not defined, we generate a (... 1, 0) pair, and when * the last step cherry picks the 2nd arg, we get a zero. */ -#define config_enabled(cfg) ___is_defined(cfg) #define __is_defined(x) ___is_defined(x) #define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val) #define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0) @@ -41,13 +40,13 @@ * otherwise. For boolean options, this is equivalent to * IS_ENABLED(CONFIG_FOO). */ -#define IS_BUILTIN(option) config_enabled(option) +#define IS_BUILTIN(option) __is_defined(option) /* * IS_MODULE(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'm', 0 * otherwise. */ -#define IS_MODULE(option) config_enabled(option##_MODULE) +#define IS_MODULE(option) __is_defined(option##_MODULE) /* * IS_REACHABLE(CONFIG_FOO) evaluates to 1 if the currently compiled From 0e07f663c90154079b287931257defbdb1d5405a Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Thu, 27 Oct 2016 17:46:41 -0700 Subject: [PATCH 254/292] latent_entropy: raise CONFIG_FRAME_WARN by default When building with the latent_entropy plugin, set the default CONFIG_FRAME_WARN to 2048, since some __init functions have many basic blocks that, when instrumented by the latent_entropy plugin, grow beyond 1024 byte stack size on 32-bit builds. Link: http://lkml.kernel.org/r/20161018211216.GA39687@beast Signed-off-by: Kees Cook Reported-by: kbuild test robot Cc: Emese Revfy Cc: Ingo Molnar Cc: Michal Marek Cc: "Paul E. McKenney" Cc: Dan Williams Cc: Andrey Ryabinin Cc: Josh Poimboeuf Cc: Tejun Heo Cc: Nikolay Aleksandrov Cc: Dmitry Vyukov Cc: Shuah Khan Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- lib/Kconfig.debug | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 33bc56cf60d7..b01e547d4d04 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -198,6 +198,7 @@ config FRAME_WARN int "Warn for stack frames larger than (needs gcc 4.4)" range 0 8192 default 0 if KASAN + default 2048 if GCC_PLUGIN_LATENT_ENTROPY default 1024 if !64BIT default 2048 if 64BIT help From 02754e0a484a50a92d44c38879f2cb2792ebc572 Mon Sep 17 00:00:00 2001 From: Dmitry Vyukov Date: Thu, 27 Oct 2016 17:46:44 -0700 Subject: [PATCH 255/292] lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB KASAN uses stackdepot to memorize stacks for all kmalloc/kfree calls. Current stackdepot capacity is 16MB (1024 top level entries x 4 pages on second level). Size of each stack is (num_frames + 3) * sizeof(long). Which gives us ~84K stacks. This capacity was chosen empirically and it is enough to run kernel normally. However, when lots of configs are enabled and a fuzzer tries to maximize code coverage, it easily hits the limit within tens of minutes. I've tested for long a time with number of top level entries bumped 4x (4096). And I think I've seen overflow only once. But I don't have all configs enabled and code coverage has not reached maximum yet. So bump it 8x to 8192. Since we have two-level table, memory cost of this is very moderate -- currently the top-level table is 8KB, with this patch it is 64KB, which is negligible under KASAN. Here is some approx math. 128MB allows us to memorize ~670K stacks (assuming stack is ~200b). I've grepped kernel for kmalloc|kfree|kmem_cache_alloc|kmem_cache_free| kzalloc|kstrdup|kstrndup|kmemdup and it gives ~60K matches. Most of alloc/free call sites are reachable with only one stack. But some utility functions can have large fanout. Assuming average fanout is 5x, total number of alloc/free stacks is ~300K. Link: http://lkml.kernel.org/r/1476458416-122131-1-git-send-email-dvyukov@google.com Signed-off-by: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Joonsoo Kim Cc: Baozeng Ding Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- lib/stackdepot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/stackdepot.c b/lib/stackdepot.c index 60f77f1d470a..4d830e299989 100644 --- a/lib/stackdepot.c +++ b/lib/stackdepot.c @@ -50,7 +50,7 @@ STACK_ALLOC_ALIGN) #define STACK_ALLOC_INDEX_BITS (DEPOT_STACK_BITS - \ STACK_ALLOC_NULL_PROTECTION_BITS - STACK_ALLOC_OFFSET_BITS) -#define STACK_ALLOC_SLABS_CAP 1024 +#define STACK_ALLOC_SLABS_CAP 8192 #define STACK_ALLOC_MAX_SLABS \ (((1LL << (STACK_ALLOC_INDEX_BITS)) < STACK_ALLOC_SLABS_CAP) ? \ (1LL << (STACK_ALLOC_INDEX_BITS)) : STACK_ALLOC_SLABS_CAP) From 37df49f433bc3a11f5716fe65aaec5189c6402cb Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Thu, 27 Oct 2016 17:46:47 -0700 Subject: [PATCH 256/292] mm: kmemleak: ensure that the task stack is not freed during scanning Commit 68f24b08ee89 ("sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK") may cause the task->stack to be freed during kmemleak_scan() execution, leading to either a NULL pointer fault (if task->stack is NULL) or kmemleak accessing already freed memory. This patch uses the new try_get_task_stack() API to ensure that the task stack is not freed during kmemleak stack scanning. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=173901. Fixes: 68f24b08ee89 ("sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK") Link: http://lkml.kernel.org/r/1476266223-14325-1-git-send-email-catalin.marinas@arm.com Signed-off-by: Catalin Marinas Reported-by: CAI Qian Tested-by: CAI Qian Acked-by: Michal Hocko Cc: Andy Lutomirski Cc: CAI Qian Cc: Hillf Danton Cc: Oleg Nesterov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/kmemleak.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/mm/kmemleak.c b/mm/kmemleak.c index a5e453cf05c4..e5355a5b423f 100644 --- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -1453,8 +1453,11 @@ static void kmemleak_scan(void) read_lock(&tasklist_lock); do_each_thread(g, p) { - scan_block(task_stack_page(p), task_stack_page(p) + - THREAD_SIZE, NULL); + void *stack = try_get_task_stack(p); + if (stack) { + scan_block(stack, stack + THREAD_SIZE, NULL); + put_task_stack(p); + } } while_each_thread(g, p); read_unlock(&tasklist_lock); } From 06b2849d103f4a91212876a211d0d7df227a9513 Mon Sep 17 00:00:00 2001 From: Leon Yu Date: Thu, 27 Oct 2016 17:46:50 -0700 Subject: [PATCH 257/292] proc: fix NULL dereference when reading /proc//auxv Reading auxv of any kernel thread results in NULL pointer dereferencing in auxv_read() where mm can be NULL. Fix that by checking for NULL mm and bailing out early. This is also the original behavior changed by recent commit c5317167854e ("proc: switch auxv to use of __mem_open()"). # cat /proc/2/auxv Unable to handle kernel NULL pointer dereference at virtual address 000000a8 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 3 PID: 113 Comm: cat Not tainted 4.9.0-rc1-ARCH+ #1 Hardware name: BCM2709 task: ea3b0b00 task.stack: e99b2000 PC is at auxv_read+0x24/0x4c LR is at do_readv_writev+0x2fc/0x37c Process cat (pid: 113, stack limit = 0xe99b2210) Call chain: auxv_read do_readv_writev vfs_readv default_file_splice_read splice_direct_to_actor do_splice_direct do_sendfile SyS_sendfile64 ret_fast_syscall Fixes: c5317167854e ("proc: switch auxv to use of __mem_open()") Link: http://lkml.kernel.org/r/1476966200-14457-1-git-send-email-chianglungyu@gmail.com Signed-off-by: Leon Yu Acked-by: Oleg Nesterov Acked-by: Michal Hocko Cc: Al Viro Cc: Kees Cook Cc: John Stultz Cc: Mateusz Guzik Cc: Janis Danisevskis Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/proc/base.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/proc/base.c b/fs/proc/base.c index adfc5b4986f5..ca651ac00660 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -1012,6 +1012,9 @@ static ssize_t auxv_read(struct file *file, char __user *buf, { struct mm_struct *mm = file->private_data; unsigned int nwords = 0; + + if (!mm) + return 0; do { nwords += 2; } while (mm->saved_auxv[nwords - 2] != 0); /* AT_NULL */ From 8f72cb4ef90c63bcb5111c2e3ec7ea2727eab2f8 Mon Sep 17 00:00:00 2001 From: Martin Kepplinger Date: Thu, 27 Oct 2016 17:46:53 -0700 Subject: [PATCH 258/292] CREDITS: update credit information for Martin Kepplinger Content and employer changed. Link: http://lkml.kernel.org/r/1477304102-28830-1-git-send-email-martin.kepplinger@ginzinger.com Signed-off-by: Martin Kepplinger Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- CREDITS | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/CREDITS b/CREDITS index 513aaa3546bf..837367624e45 100644 --- a/CREDITS +++ b/CREDITS @@ -1864,10 +1864,11 @@ S: The Netherlands N: Martin Kepplinger E: martink@posteo.de -E: martin.kepplinger@theobroma-systems.com +E: martin.kepplinger@ginzinger.com W: http://www.martinkepplinger.com D: mma8452 accelerators iio driver -D: Kernel cleanups +D: pegasus_notetaker input driver +D: Kernel fixes and cleanups S: Garnisonstraße 26 S: 4020 Linz S: Austria From 89a2848381b5fcd9c4d9c0cd97680e3b28730e31 Mon Sep 17 00:00:00 2001 From: Johannes Weiner Date: Thu, 27 Oct 2016 17:46:56 -0700 Subject: [PATCH 259/292] mm: memcontrol: do not recurse in direct reclaim On 4.0, we saw a stack corruption from a page fault entering direct memory cgroup reclaim, calling into btrfs_releasepage(), which then tried to allocate an extent and recursed back into a kmem charge ad nauseam: [...] btrfs_releasepage+0x2c/0x30 try_to_release_page+0x32/0x50 shrink_page_list+0x6da/0x7a0 shrink_inactive_list+0x1e5/0x510 shrink_lruvec+0x605/0x7f0 shrink_zone+0xee/0x320 do_try_to_free_pages+0x174/0x440 try_to_free_mem_cgroup_pages+0xa7/0x130 try_charge+0x17b/0x830 memcg_charge_kmem+0x40/0x80 new_slab+0x2d9/0x5a0 __slab_alloc+0x2fd/0x44f kmem_cache_alloc+0x193/0x1e0 alloc_extent_state+0x21/0xc0 __clear_extent_bit+0x2b5/0x400 try_release_extent_mapping+0x1a3/0x220 __btrfs_releasepage+0x31/0x70 btrfs_releasepage+0x2c/0x30 try_to_release_page+0x32/0x50 shrink_page_list+0x6da/0x7a0 shrink_inactive_list+0x1e5/0x510 shrink_lruvec+0x605/0x7f0 shrink_zone+0xee/0x320 do_try_to_free_pages+0x174/0x440 try_to_free_mem_cgroup_pages+0xa7/0x130 try_charge+0x17b/0x830 mem_cgroup_try_charge+0x65/0x1c0 handle_mm_fault+0x117f/0x1510 __do_page_fault+0x177/0x420 do_page_fault+0xc/0x10 page_fault+0x22/0x30 On later kernels, kmem charging is opt-in rather than opt-out, and that particular kmem allocation in btrfs_releasepage() is no longer being charged and won't recurse and overrun the stack anymore. But it's not impossible for an accounted allocation to happen from the memcg direct reclaim context, and we needed to reproduce this crash many times before we even got a useful stack trace out of it. Like other direct reclaimers, mark tasks in memcg reclaim PF_MEMALLOC to avoid recursing into any other form of direct reclaim. Then let recursive charges from PF_MEMALLOC contexts bypass the cgroup limit. Link: http://lkml.kernel.org/r/20161025141050.GA13019@cmpxchg.org Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Cc: Vladimir Davydov Cc: Tejun Heo Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/memcontrol.c | 9 +++++++++ mm/vmscan.c | 2 ++ 2 files changed, 11 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index ae052b5e3315..0f870ba43942 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1917,6 +1917,15 @@ retry: current->flags & PF_EXITING)) goto force; + /* + * Prevent unbounded recursion when reclaim operations need to + * allocate memory. This might exceed the limits temporarily, + * but we prefer facilitating memory reclaim and getting back + * under the limit over triggering OOM kills in these cases. + */ + if (unlikely(current->flags & PF_MEMALLOC)) + goto force; + if (unlikely(task_in_memcg_oom(current))) goto nomem; diff --git a/mm/vmscan.c b/mm/vmscan.c index 744f926af442..76fda2268148 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3043,7 +3043,9 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg, sc.gfp_mask, sc.reclaim_idx); + current->flags |= PF_MEMALLOC; nr_reclaimed = do_try_to_free_pages(zonelist, &sc); + current->flags &= ~PF_MEMALLOC; trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed); From 62e931fac45b17c2a42549389879411572f75804 Mon Sep 17 00:00:00 2001 From: Daniel Mentz Date: Thu, 27 Oct 2016 17:46:59 -0700 Subject: [PATCH 260/292] lib/genalloc.c: start search from start of chunk gen_pool_alloc_algo() iterates over the chunks of a pool trying to find a contiguous block of memory that satisfies the allocation request. The shortcut if (size > atomic_read(&chunk->avail)) continue; makes the loop skip over chunks that do not have enough bytes left to fulfill the request. There are two situations, though, where an allocation might still fail: (1) The available memory is not contiguous, i.e. the request cannot be fulfilled due to external fragmentation. (2) A race condition. Another thread runs the same code concurrently and is quicker to grab the available memory. In those situations, the loop calls pool->algo() to search the entire chunk, and pool->algo() returns some value that is >= end_bit to indicate that the search failed. This return value is then assigned to start_bit. The variables start_bit and end_bit describe the range that should be searched, and this range should be reset for every chunk that is searched. Today, the code fails to reset start_bit to 0. As a result, prefixes of subsequent chunks are ignored. Memory allocations might fail even though there is plenty of room left in these prefixes of those other chunks. Fixes: 7f184275aa30 ("lib, Make gen_pool memory allocator lockless") Link: http://lkml.kernel.org/r/1477420604-28918-1-git-send-email-danielmentz@google.com Signed-off-by: Daniel Mentz Reviewed-by: Mathieu Desnoyers Acked-by: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- lib/genalloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/genalloc.c b/lib/genalloc.c index 0a1139644d32..144fe6b1a03e 100644 --- a/lib/genalloc.c +++ b/lib/genalloc.c @@ -292,7 +292,7 @@ unsigned long gen_pool_alloc_algo(struct gen_pool *pool, size_t size, struct gen_pool_chunk *chunk; unsigned long addr = 0; int order = pool->min_alloc_order; - int nbits, start_bit = 0, end_bit, remain; + int nbits, start_bit, end_bit, remain; #ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG BUG_ON(in_nmi()); @@ -307,6 +307,7 @@ unsigned long gen_pool_alloc_algo(struct gen_pool *pool, size_t size, if (size > atomic_read(&chunk->avail)) continue; + start_bit = 0; end_bit = chunk_size(chunk) >> order; retry: start_bit = algo(chunk->bits, end_bit, start_bit, From 14f947c87a2164b19c7f8c02234f4f348e03f409 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= Date: Thu, 27 Oct 2016 17:47:02 -0700 Subject: [PATCH 261/292] fs: exofs: print a hex number after a 0x prefix MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It makes the message hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/20161026125658.25728-2-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König Cc: Boaz Harrosh Cc: Benny Halevy Cc: Joe Perches Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exofs/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/exofs/dir.c b/fs/exofs/dir.c index 79101651fe9e..42f9a0a0c4ca 100644 --- a/fs/exofs/dir.c +++ b/fs/exofs/dir.c @@ -137,7 +137,7 @@ Espan: bad_entry: EXOFS_ERR( "ERROR [exofs_check_page]: bad entry in directory(0x%lx): %s - " - "offset=%lu, inode=0x%llu, rec_len=%d, name_len=%d\n", + "offset=%lu, inode=0x%llx, rec_len=%d, name_len=%d\n", dir->i_ino, error, (page->index<inode_no)), rec_len, p->name_len); From ee52c44dee63ff2686a7b0d98fff7c80852ac022 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= Date: Thu, 27 Oct 2016 17:47:04 -0700 Subject: [PATCH 262/292] block: DAC960: print a hex number after a 0x prefix MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It makes the message hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/20161026125658.25728-3-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- drivers/block/DAC960.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/DAC960.c b/drivers/block/DAC960.c index 811e11c82f32..0809cda93cc0 100644 --- a/drivers/block/DAC960.c +++ b/drivers/block/DAC960.c @@ -2954,7 +2954,7 @@ DAC960_DetectController(struct pci_dev *PCI_Device, case DAC960_PD_Controller: if (!request_region(Controller->IO_Address, 0x80, Controller->FullModelName)) { - DAC960_Error("IO port 0x%d busy for Controller at\n", + DAC960_Error("IO port 0x%lx busy for Controller at\n", Controller, Controller->IO_Address); goto Failure; } @@ -2990,7 +2990,7 @@ DAC960_DetectController(struct pci_dev *PCI_Device, case DAC960_P_Controller: if (!request_region(Controller->IO_Address, 0x80, Controller->FullModelName)){ - DAC960_Error("IO port 0x%d busy for Controller at\n", + DAC960_Error("IO port 0x%lx busy for Controller at\n", Controller, Controller->IO_Address); goto Failure; } From 9105585d13bd2946f0eb2a664e32ec9a9b23bf1b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= Date: Thu, 27 Oct 2016 17:47:07 -0700 Subject: [PATCH 263/292] ipack: print a hex number after a 0x prefix MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It makes the result hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/20161026125658.25728-4-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König Cc: Samuel Iglesias Gonsalvez Cc: Jens Taprogge Cc: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- drivers/ipack/ipack.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/ipack/ipack.c b/drivers/ipack/ipack.c index c0e7b624ce54..12102448fddd 100644 --- a/drivers/ipack/ipack.c +++ b/drivers/ipack/ipack.c @@ -178,7 +178,7 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, idev->id_vendor, idev->id_device); } -ipack_device_attr(id_format, "0x%hhu\n"); +ipack_device_attr(id_format, "0x%hhx\n"); static DEVICE_ATTR_RO(id); static DEVICE_ATTR_RO(id_device); From 17a88939568be28a8ea5195b55ef3a84a469777e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= Date: Thu, 27 Oct 2016 17:47:10 -0700 Subject: [PATCH 264/292] cris/arch-v32: cryptocop: print a hex number after a 0x prefix MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It makes the result hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/20161026125658.25728-6-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König Cc: Mikael Starvik Cc: Jesper Nilsson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/cris/arch-v32/drivers/cryptocop.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/cris/arch-v32/drivers/cryptocop.c b/arch/cris/arch-v32/drivers/cryptocop.c index 099e170a93ee..0068fd411a84 100644 --- a/arch/cris/arch-v32/drivers/cryptocop.c +++ b/arch/cris/arch-v32/drivers/cryptocop.c @@ -3149,7 +3149,7 @@ static void print_dma_descriptors(struct cryptocop_int_operation *iop) printk("print_dma_descriptors start\n"); printk("iop:\n"); - printk("\tsid: 0x%lld\n", iop->sid); + printk("\tsid: 0x%llx\n", iop->sid); printk("\tcdesc_out: 0x%p\n", iop->cdesc_out); printk("\tcdesc_in: 0x%p\n", iop->cdesc_in); From 8e819101ce6fcc58801c9a813ea99c4da0255eef Mon Sep 17 00:00:00 2001 From: Dimitri Sivanich Date: Thu, 27 Oct 2016 17:47:12 -0700 Subject: [PATCH 265/292] drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Would like to have this be a decimal number. Link: http://lkml.kernel.org/r/20161026134746.GA30169@sgi.com Signed-off-by: Dimitri Sivanich Reported-by: Uwe Kleine-König Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- drivers/misc/sgi-gru/grumain.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/sgi-gru/grumain.c b/drivers/misc/sgi-gru/grumain.c index 1525870f460a..33741ad4a74a 100644 --- a/drivers/misc/sgi-gru/grumain.c +++ b/drivers/misc/sgi-gru/grumain.c @@ -283,7 +283,7 @@ static void gru_unload_mm_tracker(struct gru_state *gru, spin_lock(&gru->gs_asid_lock); BUG_ON((asids->mt_ctxbitmap & ctxbitmap) != ctxbitmap); asids->mt_ctxbitmap ^= ctxbitmap; - gru_dbg(grudev, "gid %d, gts %p, gms %p, ctxnum 0x%d, asidmap 0x%lx\n", + gru_dbg(grudev, "gid %d, gts %p, gms %p, ctxnum %d, asidmap 0x%lx\n", gru->gs_gid, gts, gms, gts->ts_ctxnum, gms->ms_asidmap[0]); spin_unlock(&gru->gs_asid_lock); spin_unlock(&gms->ms_asid_lock); From 1c27f646b18fb56308dff82784ca61951bad0b48 Mon Sep 17 00:00:00 2001 From: Borislav Petkov Date: Thu, 27 Oct 2016 14:36:23 +0200 Subject: [PATCH 266/292] x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y We needed the physical address of the container in order to compute the offset within the relocated ramdisk. And we did this by doing __pa() on the virtual address. However, __pa() does checks whether the physical address is within PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address which *doesn't* have the randomization offset into a function which uses PAGE_OFFSET which *does* have that offset. This makes this check fire: VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x)); ^^^^^^ due to the randomization offset. The fix is as simple as using __pa_nodebug() because we do that randomization offset accounting later in that function ourselves. Reported-by: Bob Peterson Tested-by: Bob Peterson Signed-off-by: Borislav Petkov Cc: Andreas Gruenbacher Cc: Andy Lutomirski Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Josh Poimboeuf Cc: Linus Torvalds Cc: Mel Gorman Cc: Peter Zijlstra Cc: Steven Whitehouse Cc: Thomas Gleixner Cc: linux-mm Cc: stable@vger.kernel.org # 4.9 Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnic Signed-off-by: Ingo Molnar --- arch/x86/kernel/cpu/microcode/amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c index 620ab06bcf45..017bda12caae 100644 --- a/arch/x86/kernel/cpu/microcode/amd.c +++ b/arch/x86/kernel/cpu/microcode/amd.c @@ -429,7 +429,7 @@ int __init save_microcode_in_initrd_amd(void) * We need the physical address of the container for both bitness since * boot_params.hdr.ramdisk_image is a physical address. */ - cont = __pa(container); + cont = __pa_nodebug(container); cont_va = container; #endif From 0933840acf7b65d6d30a5b6089d882afea57aca3 Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Thu, 20 Oct 2016 13:10:11 +0200 Subject: [PATCH 267/292] perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic CAI Qian reported a crash in the PMU uncore device removal code, enabled by the CONFIG_DEBUG_TEST_DRIVER_REMOVE=y option: https://marc.info/?l=linux-kernel&m=147688837328451 The reason for the crash is that perf_pmu_unregister() tries to remove a PMU device which is not added at this point. We add PMU devices only after pmu_bus is registered, which happens in the perf_event_sysfs_init() call and sets the 'pmu_bus_running' flag. The fix is to get the 'pmu_bus_running' flag state at the point the PMU is taken out of the PMU list and remove the device later only if it's set. Reported-by: CAI Qian Tested-by: CAI Qian Signed-off-by: Jiri Olsa Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Greg Kroah-Hartman Cc: Jiri Olsa Cc: Kan Liang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Rob Herring Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20161020111011.GA13361@krava Signed-off-by: Ingo Molnar --- kernel/events/core.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index c6e47e97b33f..a5d2e62faf7e 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8855,7 +8855,10 @@ EXPORT_SYMBOL_GPL(perf_pmu_register); void perf_pmu_unregister(struct pmu *pmu) { + int remove_device; + mutex_lock(&pmus_lock); + remove_device = pmu_bus_running; list_del_rcu(&pmu->entry); mutex_unlock(&pmus_lock); @@ -8869,10 +8872,12 @@ void perf_pmu_unregister(struct pmu *pmu) free_percpu(pmu->pmu_disable_count); if (pmu->type >= PERF_TYPE_MAX) idr_remove(&pmu_idr, pmu->type); - if (pmu->nr_addr_filters) - device_remove_file(pmu->dev, &dev_attr_nr_addr_filters); - device_del(pmu->dev); - put_device(pmu->dev); + if (remove_device) { + if (pmu->nr_addr_filters) + device_remove_file(pmu->dev, &dev_attr_nr_addr_filters); + device_del(pmu->dev); + put_device(pmu->dev); + } free_pmu_context(pmu); } EXPORT_SYMBOL_GPL(perf_pmu_unregister); From 5aab90ce1ec449912a2ebc4d45e0c85dac29e9dd Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Wed, 26 Oct 2016 11:48:24 +0200 Subject: [PATCH 268/292] perf/powerpc: Don't call perf_event_disable() from atomic context The trinity syscall fuzzer triggered following WARN() on powerpc: WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278 ... NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0 LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 Call Trace: [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable) [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 Followed by a lockdep warning: =============================== [ INFO: suspicious RCU usage. ] 4.8.0-rc5+ #7 Tainted: G W ------------------------------- ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ls/2998: #0: (rcu_read_lock){......}, at: [] .__atomic_notifier_call_chain+0x0/0x1c0 #1: (rcu_read_lock){......}, at: [] .hw_breakpoint_handler+0x0/0x2b0 stack backtrace: CPU: 9 PID: 2998 Comm: ls Tainted: G W 4.8.0-rc5+ #7 Call Trace: [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable) [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180 [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0 [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0 [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380 [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60 [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0 [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 While it looks like the first WARN() is probably valid, the other one is triggered by disabling event via perf_event_disable() from atomic context. The event is disabled here in case we were not able to emulate the instruction that hit the breakpoint. By disabling the event we unschedule the event and make sure it's not scheduled back. But we can't call perf_event_disable() from atomic context, instead we need to use the event's pending_disable irq_work method to disable it. Reported-by: Jan Stancek Signed-off-by: Jiri Olsa Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Huang Ying Cc: Jiri Olsa Cc: Linus Torvalds Cc: Michael Neuling Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20161026094824.GA21397@krava Signed-off-by: Ingo Molnar --- arch/powerpc/kernel/hw_breakpoint.c | 2 +- include/linux/perf_event.h | 1 + kernel/events/core.c | 10 ++++++++-- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c index 9781c69eae57..03d089b3ed72 100644 --- a/arch/powerpc/kernel/hw_breakpoint.c +++ b/arch/powerpc/kernel/hw_breakpoint.c @@ -275,7 +275,7 @@ int hw_breakpoint_handler(struct die_args *args) if (!stepped) { WARN(1, "Unable to handle hardware breakpoint. Breakpoint at " "0x%lx will be disabled.", info->address); - perf_event_disable(bp); + perf_event_disable_inatomic(bp); goto out; } /* diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 060d0ede88df..4741ecdb9817 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1257,6 +1257,7 @@ extern u64 perf_swevent_set_period(struct perf_event *event); extern void perf_event_enable(struct perf_event *event); extern void perf_event_disable(struct perf_event *event); extern void perf_event_disable_local(struct perf_event *event); +extern void perf_event_disable_inatomic(struct perf_event *event); extern void perf_event_task_tick(void); #else /* !CONFIG_PERF_EVENTS: */ static inline void * diff --git a/kernel/events/core.c b/kernel/events/core.c index a5d2e62faf7e..0e292132efac 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1960,6 +1960,12 @@ void perf_event_disable(struct perf_event *event) } EXPORT_SYMBOL_GPL(perf_event_disable); +void perf_event_disable_inatomic(struct perf_event *event) +{ + event->pending_disable = 1; + irq_work_queue(&event->pending); +} + static void perf_set_shadow_time(struct perf_event *event, struct perf_event_context *ctx, u64 tstamp) @@ -7075,8 +7081,8 @@ static int __perf_event_overflow(struct perf_event *event, if (events && atomic_dec_and_test(&event->event_limit)) { ret = 1; event->pending_kill = POLL_HUP; - event->pending_disable = 1; - irq_work_queue(&event->pending); + + perf_event_disable_inatomic(event); } READ_ONCE(event->overflow_handler)(event, data, regs); From f92b7604149a55cb601fc0b52911b1e11f0f2514 Mon Sep 17 00:00:00 2001 From: Imre Palik Date: Fri, 21 Oct 2016 01:18:59 -0700 Subject: [PATCH 269/292] perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors perf doesn't seem to honour the number of fixed counters specified by CPUID leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters. So if some of the fixed counters are masked out by the hypervisor, it still tries to check/set them. This patch makes perf behave nicer when the kernel is running under a hypervisor that doesn't expose all the counters. This patch contains some ideas from Matt Wilson. Signed-off-by: Imre Palik Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Andi Kleen Cc: Alexander Kozyrev Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Artyom Kuanbekov Cc: David Carrillo-Cisneros Cc: David Woodhouse Cc: H. Peter Anvin Cc: Jiri Olsa Cc: Kan Liang Cc: Linus Torvalds Cc: Matt Wilson Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.com Signed-off-by: Ingo Molnar --- arch/x86/events/intel/core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index eab0915f5995..a74a2dbc0180 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3607,10 +3607,14 @@ __init int intel_pmu_init(void) /* * Quirk: v2 perfmon does not report fixed-purpose events, so - * assume at least 3 events: + * assume at least 3 events, when not running in a hypervisor: */ - if (version > 1) - x86_pmu.num_counters_fixed = max((int)edx.split.num_counters_fixed, 3); + if (version > 1) { + int assume = 3 * !boot_cpu_has(X86_FEATURE_HYPERVISOR); + + x86_pmu.num_counters_fixed = + max((int)edx.split.num_counters_fixed, assume); + } if (boot_cpu_has(X86_FEATURE_PDCM)) { u64 capabilities; From 9bcffe7575b721d7b6d9b3090fe18809d9806e78 Mon Sep 17 00:00:00 2001 From: Richard Genoud Date: Thu, 27 Oct 2016 18:04:06 +0200 Subject: [PATCH 270/292] tty/serial: at91: fix hardware handshake on Atmel platforms MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit After commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled"), the hardware handshake wasn't functional anymore on Atmel platforms (beside SAMA5D2). To understand why, one has to understand the flag ATMEL_US_USMODE_HWHS first: Before commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled"), this flag was never set. Thus, the CTS/RTS where only handled by serial_core (and everything worked just fine). This commit introduced the use of the ATMEL_US_USMODE_HWHS flag, enabling it for all boards when the user space enables flow control. When the ATMEL_US_USMODE_HWHS is set, the Atmel USART controller handles a part of the flow control job: - disable the transmitter when the CTS pin gets high. - drive the RTS pin high when the DMA buffer transfer is completed or PDC RX buffer full or RX FIFO is beyond threshold. (depending on the controller version). NB: This feature is *not* mandatory for the flow control to work. (Nevertheless, it's very useful if low latencies are needed.) Now, the specifics of the ATMEL_US_USMODE_HWHS flag: - For platforms with DMAC and no FIFOs (sam9x25, sam9x35, sama5D3, sama5D4, sam9g15, sam9g25, sam9g35)* this feature simply doesn't work. ( source: https://lkml.org/lkml/2016/9/7/598 ) Tested it on sam9g35, the RTS pins always stays up, even when RXEN=1 or a new DMA transfer descriptor is set. => ATMEL_US_USMODE_HWHS must not be used for those platforms - For platforms with a PDC (sam926{0,1,3}, sam9g10, sam9g20, sam9g45, sam9g46)*, there's another kind of problem. Once the flag ATMEL_US_USMODE_HWHS is set, the RTS pin can't be driven anymore via RTSEN/RTSDIS in USART Control Register. The RTS pin can only be driven by enabling/disabling the receiver or setting RCR=RNCR=0 in the PDC (Receive (Next) Counter Register). => Doing this is beyond the scope of this patch and could add other bugs, so the original (and working) behaviour should be set for those platforms (meaning ATMEL_US_USMODE_HWHS flag should be unset). - For platforms with a FIFO (sama5d2)*, the RTS pin is driven according to the RX FIFO thresholds, and can be also driven by RTSEN/RTSDIS in USART Control Register. No problem here. (This was the use case of commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled")) NB: If the CTS pin declared as a GPIO in the DTS, (for instance cts-gpios = <&pioA PIN_PB31 GPIO_ACTIVE_LOW>), the transmitter will be disabled. => ATMEL_US_USMODE_HWHS flag can be set for this platform ONLY IF the CTS pin is not a GPIO. So, the only case when ATMEL_US_USMODE_HWHS can be enabled is when (atmel_use_fifo(port) && !mctrl_gpio_to_gpiod(atmel_port->gpios, UART_GPIO_CTS)) Tested on all Atmel USART controller flavours: AT91SAM9G35-CM (DMAC flavour), AT91SAM9G20-EK (PDC flavour), SAMA5D2xplained (FIFO flavour). * the list may not be exhaustive Cc: #4.4+ (beware, missing atmel_port variable) Fixes: 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled") Signed-off-by: Richard Genoud Acked-by: Alexandre Belloni Acked-by: Cyrille Pitchen Acked-by: Uwe Kleine-König Acked-by: Nicolas Ferre Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/atmel_serial.c | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c index fd8aa1f4ba78..168b10cad47b 100644 --- a/drivers/tty/serial/atmel_serial.c +++ b/drivers/tty/serial/atmel_serial.c @@ -2132,11 +2132,29 @@ static void atmel_set_termios(struct uart_port *port, struct ktermios *termios, mode |= ATMEL_US_USMODE_RS485; } else if (termios->c_cflag & CRTSCTS) { /* RS232 with hardware handshake (RTS/CTS) */ - if (atmel_use_dma_rx(port) && !atmel_use_fifo(port)) { - dev_info(port->dev, "not enabling hardware flow control because DMA is used"); - termios->c_cflag &= ~CRTSCTS; - } else { + if (atmel_use_fifo(port) && + !mctrl_gpio_to_gpiod(atmel_port->gpios, UART_GPIO_CTS)) { + /* + * with ATMEL_US_USMODE_HWHS set, the controller will + * be able to drive the RTS pin high/low when the RX + * FIFO is above RXFTHRES/below RXFTHRES2. + * It will also disable the transmitter when the CTS + * pin is high. + * This mode is not activated if CTS pin is a GPIO + * because in this case, the transmitter is always + * disabled (there must be an internal pull-up + * responsible for this behaviour). + * If the RTS pin is a GPIO, the controller won't be + * able to drive it according to the FIFO thresholds, + * but it will be handled by the driver. + */ mode |= ATMEL_US_USMODE_HWHS; + } else { + /* + * For platforms without FIFO, the flow control is + * handled by the driver. + */ + mode |= ATMEL_US_USMODE_NORMAL; } } else { /* RS232 without hadware handshake */ From 4dda864d73079a1eb01fab4ec29b97db150163bf Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Fri, 28 Oct 2016 07:07:47 -0500 Subject: [PATCH 271/292] tty: serial_core: Fix serial console crash on port shutdown MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The port->console flag is always false, as uart_console() is called before the serial console has been registered. Hence for a serial port used as the console, uart_tty_port_shutdown() will still be called when userspace closes the port, powering it down. This may lead to a system lock up when the serial console driver writes to the serial port's registers. To fix this, move the setting of port->console after the call to uart_configure_port(), which registers the serial console. Fixes: 761ed4a94582ab29 ("tty: serial_core: convert uart_close to use tty_port_close") Reported-by: Niklas Söderlund Signed-off-by: Geert Uytterhoeven Acked-by: Rob Herring Tested-by: Mugunthan V N Tested-by: Niklas Söderlund [robh: rebased on tty-linus] Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/serial_core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index 664c99aeeca5..ce8899c13af3 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -2759,6 +2759,8 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *uport) uart_configure_port(drv, state, uport); + port->console = uart_console(uport); + num_groups = 2; if (uport->attr_group) num_groups++; From d0f4bce2bce7e998abc906f3590e9032af7a41ba Mon Sep 17 00:00:00 2001 From: Rob Herring Date: Fri, 28 Oct 2016 07:07:48 -0500 Subject: [PATCH 272/292] tty: serial_core: fix NULL struct tty pointer access in uart_write_wakeup Since commit 761ed4a94582ab29 ("tty: serial_core: convert uart_close to use tty_port_close"), the serial console is broken on various systems and typing "reboot" splats the following on the serial console: INIT: Sending p[ 427.863916] BUG: unable to handle kernel NULL pointer dereference at 00000000000001e0 [ 427.885156] IP: [] tty_wakeup+0xc/0x70 [ 427.898337] PGD 0 [ 427.902051] [ 427.907498] Oops: 0000 [#1] PREEMPT SMP [ 427.917635] Modules linked in: nfsv3 nfs_acl nfs fscache lockd sunrpc grace edd af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave fuse loop md_mod dm_mod joydev hid_generic usbhid ipmi_ssif ohci_pci ohci_hcd ehci_pci ehci_hcd e1000e ptp firewire_ohci edac_core pps_core tpm_infineon sp5100_tco firewire_core acpi_cpufreq serio_raw pcspkr fjes usbcore shpchp edac_mce_amd tpm_tis ipmi_si tpm_tis_core i2c_piix4 k10temp sg ipmi_msghandler tpm sr_mod button cdrom kvm_amd kvm irqbypass crc_itu_t ast ttm drm_kms_helper drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw ata_generic pata_atiixp [ 428.054179] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc1-1.g73e3f23-default #1 [ 428.072868] Hardware name: System manufacturer System Product Name/KGP(M)E-D16, BIOS 0902 12/03/2010 [ 428.094755] task: ffffffffa2c0d500 task.stack: ffffffffa2c00000 [ 428.109717] RIP: 0010:[] [] tty_wakeup+0xc/0x70 [ 428.128407] RSP: 0018:ffff9a1a5fc03df8 EFLAGS: 00010086 [ 428.142184] RAX: ffff9a1857258000 RBX: ffffffffa3050ea0 RCX: 0000000000000000 [ 428.159649] RDX: 000000000000001b RSI: 0000000000000000 RDI: 0000000000000000 [ 428.177109] RBP: ffff9a1a5fc03e08 R08: 0000000000000000 R09: 0000000000000000 [ 428.194547] R10: 0000000000021c77 R11: 0000000000000000 R12: ffff9a1857258000 [ 428.212002] R13: 0000000000000000 R14: 0000000000000020 R15: 0000000000000020 [ 428.229481] FS: 0000000000000000(0000) GS:ffff9a1a5fc00000(0000) knlGS:0000000000000000 [ 428.248938] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 428.263726] CR2: 00000000000001e0 CR3: 0000000390c06000 CR4: 00000000000006f0 [ 428.281331] Stack: [ 428.288696] ffffffffa3050ea0 ffff9a1857258000 ffff9a1a5fc03e18 ffffffffa24e0ab1 [ 428.307064] ffff9a1a5fc03e40 ffffffffa24e8865 ffffffffa3050ea0 00000000000000c2 [ 428.325456] 0000000000000046 ffff9a1a5fc03e78 ffffffffa24e8a5f ffffffffa3050ea0 [ 428.343905] Call Trace: [ 428.352319] [ 428.356216] [] uart_write_wakeup+0x21/0x30 The problem is for console ports, the serial port is not shutdown and interrupts may fire after the struct tty is gone. Simply calling the tty_port helper tty_port_tty_wakeup instead of tty_wakeup directly will ensure there is a valid struct tty. Fixes: 761ed4a94582ab29 ("tty: serial_core: convert uart_close to use tty_port_close") Reported-by: Borislav Petkov Reported-by: Mike Galbraith Cc: Jiri Slaby Cc: Greg Kroah-Hartman Cc: linux-serial@vger.kernel.org Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman --- drivers/tty/serial/serial_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index ce8899c13af3..f2303f390345 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -111,7 +111,7 @@ void uart_write_wakeup(struct uart_port *port) * closed. No cookie for you. */ BUG_ON(!state); - tty_wakeup(state->port.tty); + tty_port_tty_wakeup(&state->port); } static void uart_stop(struct tty_struct *tty) @@ -632,7 +632,7 @@ static void uart_flush_buffer(struct tty_struct *tty) if (port->ops->flush_buffer) port->ops->flush_buffer(port); uart_port_unlock(port, flags); - tty_wakeup(tty); + tty_port_tty_wakeup(&state->port); } /* From 6ad37567b6b886121e250036e489d82cde5e5e94 Mon Sep 17 00:00:00 2001 From: Martyn Welch Date: Fri, 21 Oct 2016 17:36:19 +0100 Subject: [PATCH 273/292] vme: vme_get_size potentially returning incorrect value on failure The function vme_get_size returns the size of the window to the caller, however it doesn't check the return value of the call to vme_master_get. Return 0 on failure rather than anything else. Suggested-by: Dan Carpenter Signed-off-by: Martyn Welch Signed-off-by: Greg Kroah-Hartman --- drivers/vme/vme.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/vme/vme.c b/drivers/vme/vme.c index 15b64076bc26..bdbadaa47ef3 100644 --- a/drivers/vme/vme.c +++ b/drivers/vme/vme.c @@ -156,12 +156,16 @@ size_t vme_get_size(struct vme_resource *resource) case VME_MASTER: retval = vme_master_get(resource, &enabled, &base, &size, &aspace, &cycle, &dwidth); + if (retval) + return 0; return size; break; case VME_SLAVE: retval = vme_slave_get(resource, &enabled, &base, &size, &buf_base, &aspace, &cycle); + if (retval) + return 0; return size; break; From a7a7aeefbca2982586ba2c9fd7739b96416a6d1d Mon Sep 17 00:00:00 2001 From: Gerald Schaefer Date: Wed, 19 Oct 2016 12:29:41 +0200 Subject: [PATCH 274/292] GenWQE: Fix bad page access during abort of resource allocation When interrupting an application which was allocating DMAable memory, it was possible, that the DMA memory was deallocated twice, leading to the error symptoms below. Thanks to Gerald, who analyzed the problem and provided this patch. I agree with his analysis of the problem: ddcb_cmd_fixups() -> genwqe_alloc_sync_sgl() (fails in f/lpage, but sgl->sgl != NULL and f/lpage maybe also != NULL) -> ddcb_cmd_cleanup() -> genwqe_free_sync_sgl() (double free, because sgl->sgl != NULL and f/lpage maybe also != NULL) In this scenario we would have exactly the kind of double free that would explain the WARNING / Bad page state, and as expected it is caused by broken error handling (cleanup). Using the Ubuntu git source, tag Ubuntu-4.4.0-33.52, he was able to reproduce the "Bad page state" issue, and with the patch on top he could not reproduce it any more. ------------[ cut here ]------------ WARNING: at /build/linux-o03cxz/linux-4.4.0/arch/s390/include/asm/pci_dma.h:141 Modules linked in: qeth_l2 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common genwqe_card qeth crc_itu_t qdio ccwgroup vmur dm_multipath dasd_eckd_mod dasd_mod CPU: 2 PID: 3293 Comm: genwqe_gunzip Not tainted 4.4.0-33-generic #52-Ubuntu task: 0000000032c7e270 ti: 00000000324e4000 task.ti: 00000000324e4000 Krnl PSW : 0404c00180000000 0000000000156346 (dma_update_cpu_trans+0x9e/0xa8) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3 Krnl GPRS: 00000000324e7bcd 0000000000c3c34a 0000000027628298 000000003215b400 0000000000000400 0000000000001fff 0000000000000400 0000000116853000 07000000324e7b1e 0000000000000001 0000000000000001 0000000000000001 0000000000001000 0000000116854000 0000000000156402 00000000324e7a38 Krnl Code: 000000000015633a: 95001000 cli 0(%r1),0 000000000015633e: a774ffc3 brc 7,1562c4 #0000000000156342: a7f40001 brc 15,156344 >0000000000156346: 92011000 mvi 0(%r1),1 000000000015634a: a7f4ffbd brc 15,1562c4 000000000015634e: 0707 bcr 0,%r7 0000000000156350: c00400000000 brcl 0,156350 0000000000156356: eb7ff0500024 stmg %r7,%r15,80(%r15) Call Trace: ([<00000000001563e0>] dma_update_trans+0x90/0x228) [<00000000001565dc>] s390_dma_unmap_pages+0x64/0x160 [<00000000001567c2>] s390_dma_free+0x62/0x98 [<000003ff801310ce>] __genwqe_free_consistent+0x56/0x70 [genwqe_card] [<000003ff801316d0>] genwqe_free_sync_sgl+0xf8/0x160 [genwqe_card] [<000003ff8012bd6e>] ddcb_cmd_cleanup+0x86/0xa8 [genwqe_card] [<000003ff8012c1c0>] do_execute_ddcb+0x110/0x348 [genwqe_card] [<000003ff8012c914>] genwqe_ioctl+0x51c/0xc20 [genwqe_card] [<000000000032513a>] do_vfs_ioctl+0x3b2/0x518 [<0000000000325344>] SyS_ioctl+0xa4/0xb8 [<00000000007b86c6>] system_call+0xd6/0x264 [<000003ff9e8e520a>] 0x3ff9e8e520a Last Breaking-Event-Address: [<0000000000156342>] dma_update_cpu_trans+0x9a/0xa8 ---[ end trace 35996336235145c8 ]--- BUG: Bad page state in process jbd2/dasdb1-8 pfn:3215b page:000003d100c856c0 count:-1 mapcount:0 mapping: (null) index:0x0 flags: 0x3fffc0000000000() page dumped because: nonzero _count Signed-off-by: Gerald Schaefer Signed-off-by: Frank Haverkamp Cc: stable Signed-off-by: Greg Kroah-Hartman --- drivers/misc/genwqe/card_utils.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/misc/genwqe/card_utils.c b/drivers/misc/genwqe/card_utils.c index 8a679ecc8fd1..fc2794b513fa 100644 --- a/drivers/misc/genwqe/card_utils.c +++ b/drivers/misc/genwqe/card_utils.c @@ -352,17 +352,27 @@ int genwqe_alloc_sync_sgl(struct genwqe_dev *cd, struct genwqe_sgl *sgl, if (copy_from_user(sgl->lpage, user_addr + user_size - sgl->lpage_size, sgl->lpage_size)) { rc = -EFAULT; - goto err_out1; + goto err_out2; } } return 0; + err_out2: + __genwqe_free_consistent(cd, PAGE_SIZE, sgl->lpage, + sgl->lpage_dma_addr); + sgl->lpage = NULL; + sgl->lpage_dma_addr = 0; err_out1: __genwqe_free_consistent(cd, PAGE_SIZE, sgl->fpage, sgl->fpage_dma_addr); + sgl->fpage = NULL; + sgl->fpage_dma_addr = 0; err_out: __genwqe_free_consistent(cd, sgl->sgl_size, sgl->sgl, sgl->sgl_dma_addr); + sgl->sgl = NULL; + sgl->sgl_dma_addr = 0; + sgl->sgl_size = 0; return -ENOMEM; } From eb94cd68abd9b7c92bf70ddc452d65f1a84c46e2 Mon Sep 17 00:00:00 2001 From: Jorgen Hansen Date: Thu, 6 Oct 2016 04:43:08 -0700 Subject: [PATCH 275/292] VMCI: Doorbell create and destroy fixes This change consists of two changes: 1) If vmci_doorbell_create is called when neither guest nor host personality as been initialized, vmci_get_context_id will return VMCI_INVALID_ID. In that case, we should fail the create call. 2) In doorbell destroy, we assume that vmci_guest_code_active() has the same return value on create and destroy. That may not be the case, so we may end up with the wrong refcount. Instead, destroy should check explicitly whether the doorbell is in the index table as an indicator of whether the guest code was active at create time. Reviewed-by: Adit Ranadive Signed-off-by: Jorgen Hansen Signed-off-by: Greg Kroah-Hartman --- drivers/misc/vmw_vmci/vmci_doorbell.c | 8 +++++++- drivers/misc/vmw_vmci/vmci_driver.c | 2 +- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/misc/vmw_vmci/vmci_doorbell.c b/drivers/misc/vmw_vmci/vmci_doorbell.c index a8cee33ae8d2..b3fa738ae005 100644 --- a/drivers/misc/vmw_vmci/vmci_doorbell.c +++ b/drivers/misc/vmw_vmci/vmci_doorbell.c @@ -431,6 +431,12 @@ int vmci_doorbell_create(struct vmci_handle *handle, if (vmci_handle_is_invalid(*handle)) { u32 context_id = vmci_get_context_id(); + if (context_id == VMCI_INVALID_ID) { + pr_warn("Failed to get context ID\n"); + result = VMCI_ERROR_NO_RESOURCES; + goto free_mem; + } + /* Let resource code allocate a free ID for us */ new_handle = vmci_make_handle(context_id, VMCI_INVALID_ID); } else { @@ -525,7 +531,7 @@ int vmci_doorbell_destroy(struct vmci_handle handle) entry = container_of(resource, struct dbell_entry, resource); - if (vmci_guest_code_active()) { + if (!hlist_unhashed(&entry->node)) { int result; dbell_index_table_remove(entry); diff --git a/drivers/misc/vmw_vmci/vmci_driver.c b/drivers/misc/vmw_vmci/vmci_driver.c index 896be150e28f..d7eaf1eb11e7 100644 --- a/drivers/misc/vmw_vmci/vmci_driver.c +++ b/drivers/misc/vmw_vmci/vmci_driver.c @@ -113,5 +113,5 @@ module_exit(vmci_drv_exit); MODULE_AUTHOR("VMware, Inc."); MODULE_DESCRIPTION("VMware Virtual Machine Communication Interface."); -MODULE_VERSION("1.1.4.0-k"); +MODULE_VERSION("1.1.5.0-k"); MODULE_LICENSE("GPL v2"); From a7d5afe82d4931289250d6e807e35db3885b3b12 Mon Sep 17 00:00:00 2001 From: Gabriel Krisman Bertazi Date: Tue, 4 Oct 2016 15:26:48 -0300 Subject: [PATCH 276/292] MAINTAINERS: Add entry for genwqe driver Frank and I maintain this Signed-off-by: Gabriel Krisman Bertazi Cc: haver@linux.vnet.ibm.com Acked-by: Frank Haverkamp = Signed-off-by: Greg Kroah-Hartman --- MAINTAINERS | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index c44795306342..61f201b7cd95 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5287,6 +5287,12 @@ M: Joe Perches S: Maintained F: scripts/get_maintainer.pl +GENWQE (IBM Generic Workqueue Card) +M: Frank Haverkamp +M: Gabriel Krisman Bertazi +S: Supported +F: drivers/misc/genwqe/ + GFS2 FILE SYSTEM M: Steven Whitehouse M: Bob Peterson From 40b6e61ac72e99672e47cdb99c8d7d226004169b Mon Sep 17 00:00:00 2001 From: Boris Brezillon Date: Fri, 28 Oct 2016 11:08:44 +0200 Subject: [PATCH 277/292] ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap() Commit e96a8a3bb671 ("UBI: Fastmap: Do not add vol if it already exists") introduced a bug by changing the possible error codes returned by add_vol(): - this function no longer returns NULL in case of allocation failure but return ERR_PTR(-ENOMEM) - when a duplicate entry in the volume RB tree is found it returns ERR_PTR(-EEXIST) instead of ERR_PTR(-EINVAL) Fix the tests done on add_vol() return val to match this new behavior. Fixes: e96a8a3bb671 ("UBI: Fastmap: Do not add vol if it already exists") Reported-by: Dan Carpenter Signed-off-by: Boris Brezillon Acked-by: Sheng Yong Signed-off-by: Richard Weinberger --- drivers/mtd/ubi/fastmap.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c index 2ff62157d3bb..c1f5c29e458e 100644 --- a/drivers/mtd/ubi/fastmap.c +++ b/drivers/mtd/ubi/fastmap.c @@ -707,11 +707,11 @@ static int ubi_attach_fastmap(struct ubi_device *ubi, fmvhdr->vol_type, be32_to_cpu(fmvhdr->last_eb_bytes)); - if (!av) - goto fail_bad; - if (PTR_ERR(av) == -EINVAL) { - ubi_err(ubi, "volume (ID %i) already exists", - fmvhdr->vol_id); + if (IS_ERR(av)) { + if (PTR_ERR(av) == -EEXIST) + ubi_err(ubi, "volume (ID %i) already exists", + fmvhdr->vol_id); + goto fail_bad; } From a00052a296e54205cf238c75bd98d17d5d02a6db Mon Sep 17 00:00:00 2001 From: Richard Weinberger Date: Fri, 28 Oct 2016 11:49:03 +0200 Subject: [PATCH 278/292] ubifs: Fix regression in ubifs_readdir() Commit c83ed4c9dbb35 ("ubifs: Abort readdir upon error") broke overlayfs support because the fix exposed an internal error code to VFS. Reported-by: Peter Rosin Tested-by: Peter Rosin Reported-by: Ralph Sennhauser Tested-by: Ralph Sennhauser Fixes: c83ed4c9dbb35 ("ubifs: Abort readdir upon error") Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger --- fs/ubifs/dir.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index bd4a5e8ce441..ca16c5d7bab1 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -543,6 +543,14 @@ out: if (err != -ENOENT) ubifs_err(c, "cannot find next direntry, error %d", err); + else + /* + * -ENOENT is a non-fatal error in this context, the TNC uses + * it to indicate that the cursor moved past the current directory + * and readdir() has to stop. + */ + err = 0; + /* 2 is a special value indicating that there are no more direntries */ ctx->pos = 2; From 711c1f2671174c918045e2cb20aece976ac516cd Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 13 Oct 2016 15:53:02 -0700 Subject: [PATCH 279/292] ARCv2: boot log: print IOC exists as well as enabled status Previously we would not print the case when IOC existed but was not enabled. And while at it, reduce one line off boot printing by consolidating the Peripheral address space and IO-Coherency which in a way applies to them Signed-off-by: Vineet Gupta --- arch/arc/include/asm/setup.h | 1 + arch/arc/kernel/setup.c | 4 +--- arch/arc/mm/cache.c | 9 +++------ 3 files changed, 5 insertions(+), 9 deletions(-) diff --git a/arch/arc/include/asm/setup.h b/arch/arc/include/asm/setup.h index 48b37c693db3..bdc43df922c9 100644 --- a/arch/arc/include/asm/setup.h +++ b/arch/arc/include/asm/setup.h @@ -43,5 +43,6 @@ void __init setup_arch_memory(void); #define IS_USED_RUN(v) ((v) ? "" : "(not used) ") #define IS_USED_CFG(cfg) IS_USED_RUN(IS_ENABLED(cfg)) #define IS_AVAIL2(v, s, cfg) IS_AVAIL1(v, s), IS_AVAIL1(v, IS_USED_CFG(cfg)) +#define IS_AVAIL3(v, v2, s) IS_AVAIL1(v, s), IS_AVAIL1(v, IS_DISABLED_RUN(v2)) #endif /* __ASMARC_SETUP_H */ diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c index 75e540972135..a77efa173df0 100644 --- a/arch/arc/kernel/setup.c +++ b/arch/arc/kernel/setup.c @@ -272,9 +272,7 @@ static char *arc_extn_mumbojumbo(int cpu_id, char *buf, int len) FIX_PTR(cpu); - n += scnprintf(buf + n, len - n, - "Vector Table\t: %#x\nPeripherals\t: %#lx:%#lx\n", - cpu->vec_base, perip_base, perip_end); + n += scnprintf(buf + n, len - n, "Vector Table\t: %#x\n", cpu->vec_base); if (cpu->extn.fpu_sp || cpu->extn.fpu_dp) n += scnprintf(buf + n, len - n, "FPU\t\t: %s%s\n", diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c index 518ff76771f3..2b96cfc3be75 100644 --- a/arch/arc/mm/cache.c +++ b/arch/arc/mm/cache.c @@ -53,18 +53,15 @@ char *arc_cache_mumbojumbo(int c, char *buf, int len) PR_CACHE(&cpuinfo_arc700[c].icache, CONFIG_ARC_HAS_ICACHE, "I-Cache"); PR_CACHE(&cpuinfo_arc700[c].dcache, CONFIG_ARC_HAS_DCACHE, "D-Cache"); - if (!is_isa_arcv2()) - return buf; - p = &cpuinfo_arc700[c].slc; if (p->ver) n += scnprintf(buf + n, len - n, "SLC\t\t: %uK, %uB Line%s\n", p->sz_k, p->line_len, IS_USED_RUN(slc_enable)); - if (ioc_exists) - n += scnprintf(buf + n, len - n, "IOC\t\t:%s\n", - IS_DISABLED_RUN(ioc_enable)); + n += scnprintf(buf + n, len - n, "Peripherals\t: %#lx%s%s\n", + perip_base, + IS_AVAIL3(ioc_exists, ioc_enable, ", IO-Coherency ")); return buf; } From 73e284d2572581d848267c74552215f95f0f0996 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 20 Oct 2016 17:49:15 -0700 Subject: [PATCH 280/292] ARC: boot log: refactor printing abt features not captured in BCRs On older arc700 cores, some of the features configured were not present in Build config registers. To print about them at boot, we just use the Kconfig option i.e. whether linux is built to use them or not. So yes this seems bogus, but what else can be done. Moreover if linux is booting with these enabled, then the Kconfig info is a good indicator anyways. Over time these "hacks" accumulated in read_arc_build_cfg_regs() as well as arc_cpu_mumbojumbo(). so refactor and move all of those in a single place: read_arc_build_cfg_regs(). This causes some code redcution too: | bloat-o-meter2 arch/arc/kernel/setup.o.0 arch/arc/kernel/setup.o.1 | add/remove: 0/0 grow/shrink: 2/1 up/down: 64/-132 (-68) | function old new delta | setup_processor 610 670 +60 | cpuinfo_arc700 76 80 +4 | arc_cpu_mumbojumbo 752 620 -132 Signed-off-by: Vineet Gupta --- arch/arc/include/asm/arcregs.h | 1 + arch/arc/kernel/setup.c | 89 ++++++++++++++++------------------ 2 files changed, 44 insertions(+), 46 deletions(-) diff --git a/arch/arc/include/asm/arcregs.h b/arch/arc/include/asm/arcregs.h index db25c65155cb..819b44c1a719 100644 --- a/arch/arc/include/asm/arcregs.h +++ b/arch/arc/include/asm/arcregs.h @@ -349,6 +349,7 @@ struct cpuinfo_arc { struct cpuinfo_arc_bpu bpu; struct bcr_identity core; struct bcr_isa isa; + const char *details; unsigned int vec_base; struct cpuinfo_arc_ccm iccm, dccm; struct { diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c index a77efa173df0..0170d94f3860 100644 --- a/arch/arc/kernel/setup.c +++ b/arch/arc/kernel/setup.c @@ -40,6 +40,20 @@ struct task_struct *_current_task[NR_CPUS]; /* For stack switching */ struct cpuinfo_arc cpuinfo_arc700[NR_CPUS]; +static const struct cpuinfo_data arc_cpu_tbl[] = { +#ifdef CONFIG_ISA_ARCOMPACT + { {0x20, "ARC 600" }, 0x2F}, + { {0x30, "ARC 700" }, 0x33}, + { {0x34, "ARC 700 R4.10"}, 0x34}, + { {0x35, "ARC 700 R4.11"}, 0x35}, +#else + { {0x50, "ARC HS38 R2.0"}, 0x51}, + { {0x52, "ARC HS38 R2.1"}, 0x52}, + { {0x53, "ARC HS38 R3.0"}, 0x53}, +#endif + { {0x00, NULL } } +}; + static void read_decode_ccm_bcr(struct cpuinfo_arc *cpu) { if (is_isa_arcompact()) { @@ -92,11 +106,24 @@ static void read_arc_build_cfg_regs(void) struct bcr_timer timer; struct bcr_generic bcr; struct cpuinfo_arc *cpu = &cpuinfo_arc700[smp_processor_id()]; + const struct cpuinfo_data *tbl; + FIX_PTR(cpu); READ_BCR(AUX_IDENTITY, cpu->core); READ_BCR(ARC_REG_ISA_CFG_BCR, cpu->isa); + for (tbl = &arc_cpu_tbl[0]; tbl->info.id != 0; tbl++) { + if ((cpu->core.family >= tbl->info.id) && + (cpu->core.family <= tbl->up_range)) { + cpu->details = tbl->info.str; + break; + } + } + + if (tbl->info.id == 0) + cpu->details = "UNKNOWN"; + READ_BCR(ARC_REG_TIMERS_BCR, timer); cpu->extn.timer0 = timer.t0; cpu->extn.timer1 = timer.t1; @@ -160,64 +187,34 @@ static void read_arc_build_cfg_regs(void) cpu->extn.rtt = bcr.ver ? 1 : 0; cpu->extn.debug = cpu->extn.ap | cpu->extn.smart | cpu->extn.rtt; + + /* some hacks for lack of feature BCR info in old ARC700 cores */ + if (is_isa_arcompact()) { + if (!cpu->isa.ver) /* ISA BCR absent, use Kconfig info */ + cpu->isa.atomic = IS_ENABLED(CONFIG_ARC_HAS_LLSC); + else + cpu->isa.atomic = cpu->isa.atomic1; + + cpu->isa.be = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN); + } } -static const struct cpuinfo_data arc_cpu_tbl[] = { -#ifdef CONFIG_ISA_ARCOMPACT - { {0x20, "ARC 600" }, 0x2F}, - { {0x30, "ARC 700" }, 0x33}, - { {0x34, "ARC 700 R4.10"}, 0x34}, - { {0x35, "ARC 700 R4.11"}, 0x35}, -#else - { {0x50, "ARC HS38 R2.0"}, 0x51}, - { {0x52, "ARC HS38 R2.1"}, 0x52}, - { {0x53, "ARC HS38 R3.0"}, 0x53}, -#endif - { {0x00, NULL } } -}; - - static char *arc_cpu_mumbojumbo(int cpu_id, char *buf, int len) { struct cpuinfo_arc *cpu = &cpuinfo_arc700[cpu_id]; struct bcr_identity *core = &cpu->core; - const struct cpuinfo_data *tbl; - char *isa_nm; - int i, be, atomic; - int n = 0; + int i, n = 0; FIX_PTR(cpu); - if (is_isa_arcompact()) { - isa_nm = "ARCompact"; - be = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN); - - atomic = cpu->isa.atomic1; - if (!cpu->isa.ver) /* ISA BCR absent, use Kconfig info */ - atomic = IS_ENABLED(CONFIG_ARC_HAS_LLSC); - } else { - isa_nm = "ARCv2"; - be = cpu->isa.be; - atomic = cpu->isa.atomic; - } - n += scnprintf(buf + n, len - n, "\nIDENTITY\t: ARCVER [%#02x] ARCNUM [%#02x] CHIPID [%#4x]\n", core->family, core->cpu_id, core->chip_id); - for (tbl = &arc_cpu_tbl[0]; tbl->info.id != 0; tbl++) { - if ((core->family >= tbl->info.id) && - (core->family <= tbl->up_range)) { - n += scnprintf(buf + n, len - n, - "processor [%d]\t: %s (%s ISA) %s\n", - cpu_id, tbl->info.str, isa_nm, - IS_AVAIL1(be, "[Big-Endian]")); - break; - } - } - - if (tbl->info.id == 0) - n += scnprintf(buf + n, len - n, "UNKNOWN ARC Processor\n"); + n += scnprintf(buf + n, len - n, "processor [%d]\t: %s (%s ISA) %s\n", + cpu_id, cpu->details, + is_isa_arcompact() ? "ARCompact" : "ARCv2", + IS_AVAIL1(cpu->isa.be, "[Big-Endian]")); n += scnprintf(buf + n, len - n, "Timers\t\t: %s%s%s%s\nISA Extn\t: ", IS_AVAIL1(cpu->extn.timer0, "Timer0 "), @@ -226,7 +223,7 @@ static char *arc_cpu_mumbojumbo(int cpu_id, char *buf, int len) CONFIG_ARC_HAS_RTC)); n += i = scnprintf(buf + n, len - n, "%s%s%s%s%s", - IS_AVAIL2(atomic, "atomic ", CONFIG_ARC_HAS_LLSC), + IS_AVAIL2(cpu->isa.atomic, "atomic ", CONFIG_ARC_HAS_LLSC), IS_AVAIL2(cpu->isa.ldd, "ll64 ", CONFIG_ARC_HAS_LL64), IS_AVAIL1(cpu->isa.unalign, "unalign (not used)")); From a024fd9bc4d0b102b8aa66b8ecba678d2d32fdcf Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 20 Oct 2016 18:08:10 -0700 Subject: [PATCH 281/292] ARC: boot log: don't assume SWAPE instruction support This came to light when helping a customer with oldish ARC750 core who were getting instruction errors because of lack of SWAPE but boot log was incorrectly printing it as being present Signed-off-by: Vineet Gupta --- arch/arc/include/asm/arcregs.h | 2 +- arch/arc/kernel/setup.c | 5 ++++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/arc/include/asm/arcregs.h b/arch/arc/include/asm/arcregs.h index 819b44c1a719..b8d29b136b96 100644 --- a/arch/arc/include/asm/arcregs.h +++ b/arch/arc/include/asm/arcregs.h @@ -353,7 +353,7 @@ struct cpuinfo_arc { unsigned int vec_base; struct cpuinfo_arc_ccm iccm, dccm; struct { - unsigned int swap:1, norm:1, minmax:1, barrel:1, crc:1, pad1:3, + unsigned int swap:1, norm:1, minmax:1, barrel:1, crc:1, swape:1, pad1:2, fpu_sp:1, fpu_dp:1, pad2:6, debug:1, ap:1, smart:1, rtt:1, pad3:4, timer0:1, timer1:1, rtc:1, gfrc:1, pad4:4; diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c index 0170d94f3860..156981aecd74 100644 --- a/arch/arc/kernel/setup.c +++ b/arch/arc/kernel/setup.c @@ -138,6 +138,9 @@ static void read_arc_build_cfg_regs(void) cpu->extn.swap = read_aux_reg(ARC_REG_SWAP_BCR) ? 1 : 0; /* 1,3 */ cpu->extn.crc = read_aux_reg(ARC_REG_CRC_BCR) ? 1 : 0; cpu->extn.minmax = read_aux_reg(ARC_REG_MIXMAX_BCR) > 1 ? 1 : 0; /* 2 */ + cpu->extn.swape = (cpu->core.family >= 0x34) ? 1 : + IS_ENABLED(CONFIG_ARC_HAS_SWAPE); + READ_BCR(ARC_REG_XY_MEM_BCR, cpu->extn_xymem); /* Read CCM BCRs for boot reporting even if not enabled in Kconfig */ @@ -250,7 +253,7 @@ static char *arc_cpu_mumbojumbo(int cpu_id, char *buf, int len) IS_AVAIL1(cpu->extn.swap, "swap "), IS_AVAIL1(cpu->extn.minmax, "minmax "), IS_AVAIL1(cpu->extn.crc, "crc "), - IS_AVAIL2(1, "swape", CONFIG_ARC_HAS_SWAPE)); + IS_AVAIL2(cpu->extn.swape, "swape", CONFIG_ARC_HAS_SWAPE)); if (cpu->bpu.ver) n += scnprintf(buf + n, len - n, From d7c46114e356fe41b7291ebff70d7ca09c0f0ac9 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Tue, 25 Oct 2016 13:45:11 -0700 Subject: [PATCH 282/292] ARC: boot log: remove awkward space comma from MMU line Signed-off-by: Vineet Gupta --- arch/arc/mm/tlb.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c index ec868a9081a1..bdb295e09160 100644 --- a/arch/arc/mm/tlb.c +++ b/arch/arc/mm/tlb.c @@ -793,16 +793,16 @@ char *arc_mmu_mumbojumbo(int cpu_id, char *buf, int len) char super_pg[64] = ""; if (p_mmu->s_pg_sz_m) - scnprintf(super_pg, 64, "%dM Super Page%s, ", + scnprintf(super_pg, 64, "%dM Super Page %s", p_mmu->s_pg_sz_m, IS_USED_CFG(CONFIG_TRANSPARENT_HUGEPAGE)); n += scnprintf(buf + n, len - n, - "MMU [v%x]\t: %dk PAGE, %sJTLB %d (%dx%d), uDTLB %d, uITLB %d %s%s\n", + "MMU [v%x]\t: %dk PAGE, %sJTLB %d (%dx%d), uDTLB %d, uITLB %d%s%s\n", p_mmu->ver, p_mmu->pg_sz_k, super_pg, p_mmu->sets * p_mmu->ways, p_mmu->sets, p_mmu->ways, p_mmu->u_dtlb, p_mmu->u_itlb, - IS_AVAIL2(p_mmu->pae, "PAE40 ", CONFIG_ARC_HAS_PAE40)); + IS_AVAIL2(p_mmu->pae, ", PAE40 ", CONFIG_ARC_HAS_PAE40)); return buf; } From d975cbc8acb6f4a52ac46a57b13bd6a7f871b5e9 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Thu, 27 Oct 2016 14:33:19 -0700 Subject: [PATCH 283/292] ARC: boot log: refactor cpu name/release printing The motivation is to identify ARC750 vs. ARC770 (we currently print generic "ARC700"). A given ARC700 release could be 750 or 770, with same ARCNUM (or family identifier which is unfortunate). The existing arc_cpu_tbl[] kept a single concatenated string for core name and release which thus doesn't work for 750 vs. 770 identification. So split this into 2 tables, one with core names and other with release. And while we are at it, get rid of the range checking for family numbers. We just document the known to exist cores running Linux and ditch others. With this in place, we add detection of ARC750 which is - cores 0x33 and before - cores 0x34 and later with MMUv2 Signed-off-by: Vineet Gupta --- arch/arc/include/asm/arcregs.h | 2 +- arch/arc/include/asm/setup.h | 5 ---- arch/arc/kernel/setup.c | 51 ++++++++++++++++++++++------------ 3 files changed, 34 insertions(+), 24 deletions(-) diff --git a/arch/arc/include/asm/arcregs.h b/arch/arc/include/asm/arcregs.h index b8d29b136b96..7f3f9f63708c 100644 --- a/arch/arc/include/asm/arcregs.h +++ b/arch/arc/include/asm/arcregs.h @@ -349,7 +349,7 @@ struct cpuinfo_arc { struct cpuinfo_arc_bpu bpu; struct bcr_identity core; struct bcr_isa isa; - const char *details; + const char *details, *name; unsigned int vec_base; struct cpuinfo_arc_ccm iccm, dccm; struct { diff --git a/arch/arc/include/asm/setup.h b/arch/arc/include/asm/setup.h index bdc43df922c9..cb954cdab070 100644 --- a/arch/arc/include/asm/setup.h +++ b/arch/arc/include/asm/setup.h @@ -27,11 +27,6 @@ struct id_to_str { const char *str; }; -struct cpuinfo_data { - struct id_to_str info; - int up_range; -}; - extern int root_mountflags, end_mem; void setup_processor(void); diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c index 156981aecd74..0385df77a697 100644 --- a/arch/arc/kernel/setup.c +++ b/arch/arc/kernel/setup.c @@ -40,18 +40,27 @@ struct task_struct *_current_task[NR_CPUS]; /* For stack switching */ struct cpuinfo_arc cpuinfo_arc700[NR_CPUS]; -static const struct cpuinfo_data arc_cpu_tbl[] = { +static const struct id_to_str arc_cpu_rel[] = { #ifdef CONFIG_ISA_ARCOMPACT - { {0x20, "ARC 600" }, 0x2F}, - { {0x30, "ARC 700" }, 0x33}, - { {0x34, "ARC 700 R4.10"}, 0x34}, - { {0x35, "ARC 700 R4.11"}, 0x35}, + { 0x34, "R4.10"}, + { 0x35, "R4.11"}, #else - { {0x50, "ARC HS38 R2.0"}, 0x51}, - { {0x52, "ARC HS38 R2.1"}, 0x52}, - { {0x53, "ARC HS38 R3.0"}, 0x53}, + { 0x51, "R2.0" }, + { 0x52, "R2.1" }, + { 0x53, "R3.0" }, #endif - { {0x00, NULL } } + { 0x00, NULL } +}; + +static const struct id_to_str arc_cpu_nm[] = { +#ifdef CONFIG_ISA_ARCOMPACT + { 0x20, "ARC 600" }, + { 0x30, "ARC 770" }, /* 750 identified seperately */ +#else + { 0x40, "ARC EM" }, + { 0x50, "ARC HS38" }, +#endif + { 0x00, "Unknown" } }; static void read_decode_ccm_bcr(struct cpuinfo_arc *cpu) @@ -106,23 +115,25 @@ static void read_arc_build_cfg_regs(void) struct bcr_timer timer; struct bcr_generic bcr; struct cpuinfo_arc *cpu = &cpuinfo_arc700[smp_processor_id()]; - const struct cpuinfo_data *tbl; + const struct id_to_str *tbl; FIX_PTR(cpu); READ_BCR(AUX_IDENTITY, cpu->core); READ_BCR(ARC_REG_ISA_CFG_BCR, cpu->isa); - for (tbl = &arc_cpu_tbl[0]; tbl->info.id != 0; tbl++) { - if ((cpu->core.family >= tbl->info.id) && - (cpu->core.family <= tbl->up_range)) { - cpu->details = tbl->info.str; + for (tbl = &arc_cpu_rel[0]; tbl->id != 0; tbl++) { + if (cpu->core.family == tbl->id) { + cpu->details = tbl->str; break; } } - if (tbl->info.id == 0) - cpu->details = "UNKNOWN"; + for (tbl = &arc_cpu_nm[0]; tbl->id != 0; tbl++) { + if ((cpu->core.family & 0xF0) == tbl->id) + break; + } + cpu->name = tbl->str; READ_BCR(ARC_REG_TIMERS_BCR, timer); cpu->extn.timer0 = timer.t0; @@ -199,6 +210,10 @@ static void read_arc_build_cfg_regs(void) cpu->isa.atomic = cpu->isa.atomic1; cpu->isa.be = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN); + + /* there's no direct way to distinguish 750 vs. 770 */ + if (unlikely(cpu->core.family < 0x34 || cpu->mmu.ver < 3)) + cpu->name = "ARC750"; } } @@ -214,8 +229,8 @@ static char *arc_cpu_mumbojumbo(int cpu_id, char *buf, int len) "\nIDENTITY\t: ARCVER [%#02x] ARCNUM [%#02x] CHIPID [%#4x]\n", core->family, core->cpu_id, core->chip_id); - n += scnprintf(buf + n, len - n, "processor [%d]\t: %s (%s ISA) %s\n", - cpu_id, cpu->details, + n += scnprintf(buf + n, len - n, "processor [%d]\t: %s %s (%s ISA) %s\n", + cpu_id, cpu->name, cpu->details, is_isa_arcompact() ? "ARCompact" : "ARCv2", IS_AVAIL1(cpu->isa.be, "[Big-Endian]")); From c3005475889c7c730638f95d13be3360f0b33e98 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Fri, 21 Oct 2016 16:04:37 -0700 Subject: [PATCH 284/292] ARC: build: retire old toggles These are really ancient toggles and tools no longer require them to be passed. This paves way for deprecating them in long run. Signed-off-by: Vineet Gupta --- arch/arc/Makefile | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/arc/Makefile b/arch/arc/Makefile index aa82d13d4213..864adad52280 100644 --- a/arch/arc/Makefile +++ b/arch/arc/Makefile @@ -50,9 +50,6 @@ atleast_gcc44 := $(call cc-ifversion, -ge, 0404, y) cflags-$(atleast_gcc44) += -fsection-anchors -cflags-$(CONFIG_ARC_HAS_LLSC) += -mlock -cflags-$(CONFIG_ARC_HAS_SWAPE) += -mswape - ifdef CONFIG_ISA_ARCV2 ifndef CONFIG_ARC_HAS_LL64 From f644e3688855902ad11549029098a62cbbc8f558 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Tue, 25 Oct 2016 08:58:17 -0700 Subject: [PATCH 285/292] ARC: mm: retire ARC_DBG_TLB_MISS_COUNT... ... given that we have perf counters abel to do the same thing non intrusively Signed-off-by: Vineet Gupta --- arch/arc/Kconfig | 8 --- arch/arc/kernel/troubleshoot.c | 110 --------------------------------- arch/arc/mm/tlbex.S | 21 ------- 3 files changed, 139 deletions(-) diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index ac0b309aced5..bd204bfa29ed 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -540,14 +540,6 @@ config ARC_DBG_TLB_PARANOIA bool "Paranoia Checks in Low Level TLB Handlers" default n -config ARC_DBG_TLB_MISS_COUNT - bool "Profile TLB Misses" - default n - select DEBUG_FS - help - Counts number of I and D TLB Misses and exports them via Debugfs - The counters can be cleared via Debugfs as well - endif config ARC_UBOOT_SUPPORT diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c index 934150e7ac48..82f9bc819f4a 100644 --- a/arch/arc/kernel/troubleshoot.c +++ b/arch/arc/kernel/troubleshoot.c @@ -237,113 +237,3 @@ void show_kernel_fault_diag(const char *str, struct pt_regs *regs, if (!user_mode(regs)) show_stacktrace(current, regs); } - -#ifdef CONFIG_DEBUG_FS - -#include -#include -#include -#include -#include -#include -#include - -static struct dentry *test_dentry; -static struct dentry *test_dir; -static struct dentry *test_u32_dentry; - -static u32 clr_on_read = 1; - -#ifdef CONFIG_ARC_DBG_TLB_MISS_COUNT -u32 numitlb, numdtlb, num_pte_not_present; - -static int fill_display_data(char *kbuf) -{ - size_t num = 0; - num += sprintf(kbuf + num, "I-TLB Miss %x\n", numitlb); - num += sprintf(kbuf + num, "D-TLB Miss %x\n", numdtlb); - num += sprintf(kbuf + num, "PTE not present %x\n", num_pte_not_present); - - if (clr_on_read) - numitlb = numdtlb = num_pte_not_present = 0; - - return num; -} - -static int tlb_stats_open(struct inode *inode, struct file *file) -{ - file->private_data = (void *)__get_free_page(GFP_KERNEL); - return 0; -} - -/* called on user read(): display the counters */ -static ssize_t tlb_stats_output(struct file *file, /* file descriptor */ - char __user *user_buf, /* user buffer */ - size_t len, /* length of buffer */ - loff_t *offset) /* offset in the file */ -{ - size_t num; - char *kbuf = (char *)file->private_data; - - /* All of the data can he shoved in one iteration */ - if (*offset != 0) - return 0; - - num = fill_display_data(kbuf); - - /* simple_read_from_buffer() is helper for copy to user space - It copies up to @2 (num) bytes from kernel buffer @4 (kbuf) at offset - @3 (offset) into the user space address starting at @1 (user_buf). - @5 (len) is max size of user buffer - */ - return simple_read_from_buffer(user_buf, num, offset, kbuf, len); -} - -/* called on user write : clears the counters */ -static ssize_t tlb_stats_clear(struct file *file, const char __user *user_buf, - size_t length, loff_t *offset) -{ - numitlb = numdtlb = num_pte_not_present = 0; - return length; -} - -static int tlb_stats_close(struct inode *inode, struct file *file) -{ - free_page((unsigned long)(file->private_data)); - return 0; -} - -static const struct file_operations tlb_stats_file_ops = { - .read = tlb_stats_output, - .write = tlb_stats_clear, - .open = tlb_stats_open, - .release = tlb_stats_close -}; -#endif - -static int __init arc_debugfs_init(void) -{ - test_dir = debugfs_create_dir("arc", NULL); - -#ifdef CONFIG_ARC_DBG_TLB_MISS_COUNT - test_dentry = debugfs_create_file("tlb_stats", 0444, test_dir, NULL, - &tlb_stats_file_ops); -#endif - - test_u32_dentry = - debugfs_create_u32("clr_on_read", 0444, test_dir, &clr_on_read); - - return 0; -} - -module_init(arc_debugfs_init); - -static void __exit arc_debugfs_exit(void) -{ - debugfs_remove(test_u32_dentry); - debugfs_remove(test_dentry); - debugfs_remove(test_dir); -} -module_exit(arc_debugfs_exit); - -#endif diff --git a/arch/arc/mm/tlbex.S b/arch/arc/mm/tlbex.S index f1967eeb32e7..b30e4e36bb00 100644 --- a/arch/arc/mm/tlbex.S +++ b/arch/arc/mm/tlbex.S @@ -237,15 +237,6 @@ ex_saved_reg1: 2: -#ifdef CONFIG_ARC_DBG_TLB_MISS_COUNT - and.f 0, r0, _PAGE_PRESENT - bz 1f - ld r3, [num_pte_not_present] - add r3, r3, 1 - st r3, [num_pte_not_present] -1: -#endif - .endm ;----------------------------------------------------------------- @@ -309,12 +300,6 @@ ENTRY(EV_TLBMissI) TLBMISS_FREEUP_REGS -#ifdef CONFIG_ARC_DBG_TLB_MISS_COUNT - ld r0, [@numitlb] - add r0, r0, 1 - st r0, [@numitlb] -#endif - ;---------------------------------------------------------------- ; Get the PTE corresponding to V-addr accessed, r2 is setup with EFA LOAD_FAULT_PTE @@ -349,12 +334,6 @@ ENTRY(EV_TLBMissD) TLBMISS_FREEUP_REGS -#ifdef CONFIG_ARC_DBG_TLB_MISS_COUNT - ld r0, [@numdtlb] - add r0, r0, 1 - st r0, [@numdtlb] -#endif - ;---------------------------------------------------------------- ; Get the PTE corresponding to V-addr accessed ; If PTE exists, it will setup, r0 = PTE, r1 = Ptr to PTE, r2 = EFA From d65283f7b695b5d04ca1ab58b6bb41f443b96286 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Tue, 25 Oct 2016 10:43:20 -0700 Subject: [PATCH 286/292] ARC: module: elide loop to save reference to .eh_frame The loop was really needed in .debug_frame regime where wanted make it as SH_ALLOC so that apply_relocate_add() would process it. That's not needed for .eh_frame, so we check this in apply_relocate_add() which gets called for each section. Note that we need to save reference to "section name strings" section in module_frob_arch_sections() since apply_relocate_add() doesn't get that Signed-off-by: Vineet Gupta --- arch/arc/include/asm/module.h | 1 + arch/arc/kernel/module.c | 18 ++++++++---------- 2 files changed, 9 insertions(+), 10 deletions(-) diff --git a/arch/arc/include/asm/module.h b/arch/arc/include/asm/module.h index 518222bb3f8e..6e91d8b339c3 100644 --- a/arch/arc/include/asm/module.h +++ b/arch/arc/include/asm/module.h @@ -18,6 +18,7 @@ struct mod_arch_specific { void *unw_info; int unw_sec_idx; + const char *secstr; }; #endif diff --git a/arch/arc/kernel/module.c b/arch/arc/kernel/module.c index 9a2849756022..24bd2ffb90b7 100644 --- a/arch/arc/kernel/module.c +++ b/arch/arc/kernel/module.c @@ -30,17 +30,9 @@ int module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, char *secstr, struct module *mod) { #ifdef CONFIG_ARC_DW2_UNWIND - int i; - mod->arch.unw_sec_idx = 0; mod->arch.unw_info = NULL; - - for (i = 1; i < hdr->e_shnum; i++) { - if (strcmp(secstr+sechdrs[i].sh_name, ".eh_frame") == 0) { - mod->arch.unw_sec_idx = i; - break; - } - } + mod->arch.secstr = secstr; #endif return 0; } @@ -66,8 +58,10 @@ int apply_relocate_add(Elf32_Shdr *sechdrs, Elf32_Addr location; Elf32_Addr sec_to_patch; int relo_type; + unsigned int tgtsec; - sec_to_patch = sechdrs[sechdrs[relsec].sh_info].sh_addr; + tgtsec = sechdrs[relsec].sh_info; + sec_to_patch = sechdrs[tgtsec].sh_addr; sym_sec = (Elf32_Sym *) sechdrs[symindex].sh_addr; n = sechdrs[relsec].sh_size / sizeof(*rel_entry); @@ -111,6 +105,10 @@ int apply_relocate_add(Elf32_Shdr *sechdrs, goto relo_err; } + + if (strcmp(module->arch.secstr+sechdrs[tgtsec].sh_name, ".eh_frame") == 0) + module->arch.unw_sec_idx = tgtsec; + return 0; relo_err: From b75dcd9c7d352c7d9ea9010e95c708595094896a Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Tue, 25 Oct 2016 11:23:19 -0700 Subject: [PATCH 287/292] ARC: module: print pretty section names Now that we have referece to section name string table in apply_relocate_add(), use it to - print the name of section being relocated - print symbol with NULL name (since it refers to a section) before | Section to fixup 7000a060 | ========================================================= | rela->r_off | rela->addend | sym->st_value | ADDR | VALUE | ========================================================= | 1c 0 7000e000 7000a07c 7000e000 [] | 40 0 7000a000 7000a0a0 7000a000 [] after | Section to fixup .eh_frame @7000a060 | ========================================================= | r_off r_add st_value ADDRESS VALUE | ========================================================= | 1c 0 7000e000 7000a07c 7000e000 [.init.text] | 40 0 7000a000 7000a0a0 7000a000 [.exit.text] Signed-off-by: Vineet Gupta --- arch/arc/kernel/module.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/arch/arc/kernel/module.c b/arch/arc/kernel/module.c index 24bd2ffb90b7..42e964db2967 100644 --- a/arch/arc/kernel/module.c +++ b/arch/arc/kernel/module.c @@ -51,31 +51,33 @@ int apply_relocate_add(Elf32_Shdr *sechdrs, unsigned int relsec, /* sec index for relo sec */ struct module *module) { - int i, n; + int i, n, relo_type; Elf32_Rela *rel_entry = (void *)sechdrs[relsec].sh_addr; Elf32_Sym *sym_entry, *sym_sec; - Elf32_Addr relocation; - Elf32_Addr location; - Elf32_Addr sec_to_patch; - int relo_type; + Elf32_Addr relocation, location, tgt_addr; unsigned int tgtsec; + /* + * @relsec has relocations e.g. .rela.init.text + * @tgtsec is section to patch e.g. .init.text + */ tgtsec = sechdrs[relsec].sh_info; - sec_to_patch = sechdrs[tgtsec].sh_addr; + tgt_addr = sechdrs[tgtsec].sh_addr; sym_sec = (Elf32_Sym *) sechdrs[symindex].sh_addr; n = sechdrs[relsec].sh_size / sizeof(*rel_entry); - pr_debug("\n========== Module Sym reloc ===========================\n"); - pr_debug("Section to fixup %x\n", sec_to_patch); + pr_debug("\nSection to fixup %s @%x\n", + module->arch.secstr + sechdrs[tgtsec].sh_name, tgt_addr); pr_debug("=========================================================\n"); - pr_debug("rela->r_off | rela->addend | sym->st_value | ADDR | VALUE\n"); + pr_debug("r_off\tr_add\tst_value ADDRESS VALUE\n"); pr_debug("=========================================================\n"); /* Loop thru entries in relocation section */ for (i = 0; i < n; i++) { + const char *s; /* This is where to make the change */ - location = sec_to_patch + rel_entry[i].r_offset; + location = tgt_addr + rel_entry[i].r_offset; /* This is the symbol it is referring to. Note that all undefined symbols have been resolved. */ @@ -83,10 +85,15 @@ int apply_relocate_add(Elf32_Shdr *sechdrs, relocation = sym_entry->st_value + rel_entry[i].r_addend; - pr_debug("\t%x\t\t%x\t\t%x %x %x [%s]\n", - rel_entry[i].r_offset, rel_entry[i].r_addend, - sym_entry->st_value, location, relocation, - strtab + sym_entry->st_name); + if (sym_entry->st_name == 0 && ELF_ST_TYPE (sym_entry->st_info) == STT_SECTION) { + s = module->arch.secstr + sechdrs[sym_entry->st_shndx].sh_name; + } else { + s = strtab + sym_entry->st_name; + } + + pr_debug(" %x\t%x\t%x %x %x [%s]\n", + rel_entry[i].r_offset, rel_entry[i].r_addend, + sym_entry->st_value, location, relocation, s); /* This assumes modules are built with -mlong-calls * so any branches/jumps are absolute 32 bit jmps From 25ccd2429fd0af5b9c2c136c8c593491aeabf162 Mon Sep 17 00:00:00 2001 From: Lv Zheng Date: Wed, 26 Oct 2016 15:40:12 +0800 Subject: [PATCH 288/292] ACPICA: Dispatcher: Fix order issue of method termination The last step of the method termination should be the end of the method serialization. Otherwise, the steps happening after it will face the race issues that cannot be protected by the method serialization mechanism. This patch fixes this issue by moving the per-method-object deletion code prior than the end of the method serialization. Otherwise, the possible race issues may result in AE_ALREADY_EXISTS error in a parallel environment. Fixes: 74f51b80a0c4 (ACPICA: Namespace: Fix dynamic table loading issues) Reported-and-tested-by: Imre Deak Signed-off-by: Lv Zheng Signed-off-by: Rafael J. Wysocki --- drivers/acpi/acpica/dsmethod.c | 40 +++++++++++++++++----------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/acpi/acpica/dsmethod.c b/drivers/acpi/acpica/dsmethod.c index 32e9ddc0cf2b..c4028a8dacd5 100644 --- a/drivers/acpi/acpica/dsmethod.c +++ b/drivers/acpi/acpica/dsmethod.c @@ -730,26 +730,6 @@ acpi_ds_terminate_control_method(union acpi_operand_object *method_desc, acpi_ds_method_data_delete_all(walk_state); - /* - * If method is serialized, release the mutex and restore the - * current sync level for this thread - */ - if (method_desc->method.mutex) { - - /* Acquisition Depth handles recursive calls */ - - method_desc->method.mutex->mutex.acquisition_depth--; - if (!method_desc->method.mutex->mutex.acquisition_depth) { - walk_state->thread->current_sync_level = - method_desc->method.mutex->mutex. - original_sync_level; - - acpi_os_release_mutex(method_desc->method. - mutex->mutex.os_mutex); - method_desc->method.mutex->mutex.thread_id = 0; - } - } - /* * Delete any namespace objects created anywhere within the * namespace by the execution of this method. Unless: @@ -786,6 +766,26 @@ acpi_ds_terminate_control_method(union acpi_operand_object *method_desc, ~ACPI_METHOD_MODIFIED_NAMESPACE; } } + + /* + * If method is serialized, release the mutex and restore the + * current sync level for this thread + */ + if (method_desc->method.mutex) { + + /* Acquisition Depth handles recursive calls */ + + method_desc->method.mutex->mutex.acquisition_depth--; + if (!method_desc->method.mutex->mutex.acquisition_depth) { + walk_state->thread->current_sync_level = + method_desc->method.mutex->mutex. + original_sync_level; + + acpi_os_release_mutex(method_desc->method. + mutex->mutex.os_mutex); + method_desc->method.mutex->mutex.thread_id = 0; + } + } } /* Decrement the thread count on the method */ From 8121aa26e32012ca89afafa5e503b879950ac0fe Mon Sep 17 00:00:00 2001 From: Lv Zheng Date: Wed, 26 Oct 2016 15:40:20 +0800 Subject: [PATCH 289/292] ACPICA: Dispatcher: Fix an unbalanced lock exit path in acpi_ds_auto_serialize_method() There is a lock unbalanced exit path in acpi_ds_initialize_method(), this patch corrects it. Fixes: 441ad11d078f (ACPICA: Dispatcher: Fix a mutex issue for method auto serialization) Tested-by: Imre Deak Signed-off-by: Lv Zheng Signed-off-by: Rafael J. Wysocki --- drivers/acpi/acpica/dsmethod.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/acpica/dsmethod.c b/drivers/acpi/acpica/dsmethod.c index c4028a8dacd5..5997e592e5a6 100644 --- a/drivers/acpi/acpica/dsmethod.c +++ b/drivers/acpi/acpica/dsmethod.c @@ -128,7 +128,7 @@ acpi_ds_auto_serialize_method(struct acpi_namespace_node *node, if (ACPI_FAILURE(status)) { acpi_ds_delete_walk_state(walk_state); acpi_ps_free_op(op); - return_ACPI_STATUS(status); + goto unlock; } walk_state->descending_callback = acpi_ds_detect_named_opcodes; From 8633db6b027952449e155a316f4ae3a530bbe18f Mon Sep 17 00:00:00 2001 From: Lv Zheng Date: Wed, 26 Oct 2016 15:42:01 +0800 Subject: [PATCH 290/292] ACPICA: Dispatcher: Fix interpreter locking around acpi_ev_initialize_region() In the code path of acpi_ev_initialize_region(), there is namespace modification code unlocked. This patch tunes the code to make sure such modification are always locked. Fixes: 74f51b80a0c4 (ACPICA: Namespace: Fix dynamic table loading issues) Tested-by: Imre Deak Signed-off-by: Lv Zheng Signed-off-by: Rafael J. Wysocki --- drivers/acpi/acpica/dsinit.c | 11 +++-------- drivers/acpi/acpica/dsmethod.c | 12 +++--------- drivers/acpi/acpica/dswload2.c | 2 -- drivers/acpi/acpica/evrgnini.c | 3 +++ drivers/acpi/acpica/nsload.c | 2 ++ 5 files changed, 11 insertions(+), 19 deletions(-) diff --git a/drivers/acpi/acpica/dsinit.c b/drivers/acpi/acpica/dsinit.c index f1e6dcc7a827..54d48b90de2c 100644 --- a/drivers/acpi/acpica/dsinit.c +++ b/drivers/acpi/acpica/dsinit.c @@ -46,6 +46,7 @@ #include "acdispat.h" #include "acnamesp.h" #include "actables.h" +#include "acinterp.h" #define _COMPONENT ACPI_DISPATCHER ACPI_MODULE_NAME("dsinit") @@ -214,23 +215,17 @@ acpi_ds_initialize_objects(u32 table_index, /* Walk entire namespace from the supplied root */ - status = acpi_ut_acquire_mutex(ACPI_MTX_NAMESPACE); - if (ACPI_FAILURE(status)) { - return_ACPI_STATUS(status); - } - /* * We don't use acpi_walk_namespace since we do not want to acquire * the namespace reader lock. */ status = acpi_ns_walk_namespace(ACPI_TYPE_ANY, start_node, ACPI_UINT32_MAX, - ACPI_NS_WALK_UNLOCK, acpi_ds_init_one_object, - NULL, &info, NULL); + 0, acpi_ds_init_one_object, NULL, &info, + NULL); if (ACPI_FAILURE(status)) { ACPI_EXCEPTION((AE_INFO, status, "During WalkNamespace")); } - (void)acpi_ut_release_mutex(ACPI_MTX_NAMESPACE); status = acpi_get_table_by_index(table_index, &table); if (ACPI_FAILURE(status)) { diff --git a/drivers/acpi/acpica/dsmethod.c b/drivers/acpi/acpica/dsmethod.c index 5997e592e5a6..2b3210f42a46 100644 --- a/drivers/acpi/acpica/dsmethod.c +++ b/drivers/acpi/acpica/dsmethod.c @@ -99,14 +99,11 @@ acpi_ds_auto_serialize_method(struct acpi_namespace_node *node, "Method auto-serialization parse [%4.4s] %p\n", acpi_ut_get_node_name(node), node)); - acpi_ex_enter_interpreter(); - /* Create/Init a root op for the method parse tree */ op = acpi_ps_alloc_op(AML_METHOD_OP, obj_desc->method.aml_start); if (!op) { - status = AE_NO_MEMORY; - goto unlock; + return_ACPI_STATUS(AE_NO_MEMORY); } acpi_ps_set_name(op, node->name.integer); @@ -118,8 +115,7 @@ acpi_ds_auto_serialize_method(struct acpi_namespace_node *node, acpi_ds_create_walk_state(node->owner_id, NULL, NULL, NULL); if (!walk_state) { acpi_ps_free_op(op); - status = AE_NO_MEMORY; - goto unlock; + return_ACPI_STATUS(AE_NO_MEMORY); } status = acpi_ds_init_aml_walk(walk_state, op, node, @@ -128,7 +124,7 @@ acpi_ds_auto_serialize_method(struct acpi_namespace_node *node, if (ACPI_FAILURE(status)) { acpi_ds_delete_walk_state(walk_state); acpi_ps_free_op(op); - goto unlock; + return_ACPI_STATUS(status); } walk_state->descending_callback = acpi_ds_detect_named_opcodes; @@ -138,8 +134,6 @@ acpi_ds_auto_serialize_method(struct acpi_namespace_node *node, status = acpi_ps_parse_aml(walk_state); acpi_ps_delete_parse_tree(op); -unlock: - acpi_ex_exit_interpreter(); return_ACPI_STATUS(status); } diff --git a/drivers/acpi/acpica/dswload2.c b/drivers/acpi/acpica/dswload2.c index 028b22a3154e..e36218206bb0 100644 --- a/drivers/acpi/acpica/dswload2.c +++ b/drivers/acpi/acpica/dswload2.c @@ -607,11 +607,9 @@ acpi_status acpi_ds_load2_end_op(struct acpi_walk_state *walk_state) } } - acpi_ex_exit_interpreter(); status = acpi_ev_initialize_region (acpi_ns_get_attached_object(node), FALSE); - acpi_ex_enter_interpreter(); if (ACPI_FAILURE(status)) { /* diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c index 3843f1fc5dbb..75ddd160a716 100644 --- a/drivers/acpi/acpica/evrgnini.c +++ b/drivers/acpi/acpica/evrgnini.c @@ -45,6 +45,7 @@ #include "accommon.h" #include "acevents.h" #include "acnamesp.h" +#include "acinterp.h" #define _COMPONENT ACPI_EVENTS ACPI_MODULE_NAME("evrgnini") @@ -597,9 +598,11 @@ acpi_ev_initialize_region(union acpi_operand_object *region_obj, } } + acpi_ex_exit_interpreter(); status = acpi_ev_execute_reg_method(region_obj, ACPI_REG_CONNECT); + acpi_ex_enter_interpreter(); if (acpi_ns_locked) { status = diff --git a/drivers/acpi/acpica/nsload.c b/drivers/acpi/acpica/nsload.c index 334d3c5ba617..d1f20143bb11 100644 --- a/drivers/acpi/acpica/nsload.c +++ b/drivers/acpi/acpica/nsload.c @@ -137,7 +137,9 @@ unlock: ACPI_DEBUG_PRINT((ACPI_DB_INFO, "**** Begin Table Object Initialization\n")); + acpi_ex_enter_interpreter(); status = acpi_ds_initialize_objects(table_index, node); + acpi_ex_exit_interpreter(); ACPI_DEBUG_PRINT((ACPI_DB_INFO, "**** Completed Table Object Initialization\n")); From 1e90a13d0c3dc94512af1ccb2b6563e8297838fa Mon Sep 17 00:00:00 2001 From: Thomas Gleixner Date: Sat, 29 Oct 2016 13:42:42 +0200 Subject: [PATCH 291/292] x86/smpboot: Init apic mapping before usage The recent changes, which forced the registration of the boot cpu on UP systems, which do not have ACPI tables, have been fixed for systems w/o local APIC, but left a wreckage for systems which have neither ACPI nor mptables, but the CPU has an APIC, e.g. virtualbox. The boot process crashes in prefill_possible_map() as it wants to register the boot cpu, which needs to access the local apic, but the local APIC is not yet mapped. There is no reason why init_apic_mapping() can't be invoked before prefill_possible_map(). So instead of playing another silly early mapping game, as the ACPI/mptables code does, we just move init_apic_mapping() before the call to prefill_possible_map(). In hindsight, I should have noticed that combination earlier. Sorry for the churn (also in stable)! Fixes: ff8560512b8d ("x86/boot/smp: Don't try to poke disabled/non-existent APIC") Reported-and-debugged-by: Michal Necasek Reported-and-tested-by: Wolfgang Bauer Cc: prarit@redhat.com Cc: ville.syrjala@linux.intel.com Cc: michael.thayer@oracle.com Cc: knut.osmundsen@oracle.com Cc: frank.mehnert@oracle.com Cc: Borislav Petkov Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanos Signed-off-by: Thomas Gleixner --- arch/x86/kernel/setup.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index bbfbca5fea0c..9c337b0e8ba7 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1221,11 +1221,16 @@ void __init setup_arch(char **cmdline_p) */ get_smp_config(); + /* + * Systems w/o ACPI and mptables might not have it mapped the local + * APIC yet, but prefill_possible_map() might need to access it. + */ + init_apic_mappings(); + prefill_possible_map(); init_cpu_to_node(); - init_apic_mappings(); io_apic_init_mappings(); kvm_guest_init(); From a909d3e636995ba7c349e2ca5dbb528154d4ac30 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sat, 29 Oct 2016 13:52:02 -0700 Subject: [PATCH 292/292] Linux 4.9-rc3 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 93beca4312c4..a2650f9c6a25 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 4 PATCHLEVEL = 9 SUBLEVEL = 0 -EXTRAVERSION = -rc2 +EXTRAVERSION = -rc3 NAME = Psychotic Stoned Sheep # *DOCUMENTATION*