From 4dbfb8be1989118cd013c058c2f0ce199ecaf93f Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Wed, 18 May 2016 16:55:56 +0200 Subject: [PATCH 001/312] crypto: public_key: select CRYPTO_AKCIPHER [ Upstream commit bad6a185b4d6f81d0ed2b6e4c16307969f160b95 ] In some rare randconfig builds, we can end up with ASYMMETRIC_PUBLIC_KEY_SUBTYPE enabled but CRYPTO_AKCIPHER disabled, which fails to link because of the reference to crypto_alloc_akcipher: crypto/built-in.o: In function `public_key_verify_signature': :(.text+0x110e4): undefined reference to `crypto_alloc_akcipher' This adds a Kconfig 'select' statement to ensure the dependency is always there. Cc: Signed-off-by: Arnd Bergmann Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- crypto/asymmetric_keys/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/crypto/asymmetric_keys/Kconfig b/crypto/asymmetric_keys/Kconfig index 4870f28403f5..05bfe568cd30 100644 --- a/crypto/asymmetric_keys/Kconfig +++ b/crypto/asymmetric_keys/Kconfig @@ -14,6 +14,7 @@ config ASYMMETRIC_PUBLIC_KEY_SUBTYPE select MPILIB select PUBLIC_KEY_ALGO_RSA select CRYPTO_HASH_INFO + select CRYPTO_AKCIPHER help This option provides support for asymmetric public key type handling. If signature generation and/or verification are to be used, From 8cdfb73d8c344384469ddd019320ea9cd9e718a9 Mon Sep 17 00:00:00 2001 From: James Bottomley Date: Fri, 13 May 2016 12:04:06 -0700 Subject: [PATCH 002/312] scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands [ Upstream commit a621bac3044ed6f7ec5fa0326491b2d4838bfa93 ] When SCSI was written, all commands coming from the filesystem (REQ_TYPE_FS commands) had data. This meant that our signal for needing to complete the command was the number of bytes completed being equal to the number of bytes in the request. Unfortunately, with the advent of flush barriers, we can now get zero length REQ_TYPE_FS commands, which confuse this logic because they satisfy the condition every time. This means they never get retried even for retryable conditions, like UNIT ATTENTION because we complete them early assuming they're done. Fix this by special casing the early completion condition to recognise zero length commands with errors and let them drop through to the retry code. Cc: stable@vger.kernel.org Reported-by: Sebastian Parschauer Signed-off-by: James E.J. Bottomley Tested-by: Jack Wang Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/scsi_lib.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 448ebdaa3d69..17fbf1d3eadc 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -909,9 +909,12 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) } /* - * If we finished all bytes in the request we are done now. + * special case: failed zero length commands always need to + * drop down into the retry code. Otherwise, if we finished + * all bytes in the request we are done now. */ - if (!scsi_end_request(req, error, good_bytes, 0)) + if (!(blk_rq_bytes(req) == 0 && error) && + !scsi_end_request(req, error, good_bytes, 0)) return; /* From b182f703e49b54b6a7d135b279f4ac53abf337a2 Mon Sep 17 00:00:00 2001 From: Tom Lendacky Date: Fri, 20 May 2016 17:33:03 -0500 Subject: [PATCH 003/312] crypto: ccp - Fix AES XTS error for request sizes above 4096 [ Upstream commit ab6a11a7c8ef47f996974dd3c648c2c0b1a36ab1 ] The ccp-crypto module for AES XTS support has a bug that can allow requests greater than 4096 bytes in size to be passed to the CCP hardware. The CCP hardware does not support request sizes larger than 4096, resulting in incorrect output. The request should actually be handled by the fallback mechanism instantiated by the ccp-crypto module. Add a check to insure the request size is less than or equal to the maximum supported size and use the fallback mechanism if it is not. Cc: # 3.14.x- Signed-off-by: Tom Lendacky Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/ccp/ccp-crypto-aes-xts.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c b/drivers/crypto/ccp/ccp-crypto-aes-xts.c index 52c7395cb8d8..0d0d4529ee36 100644 --- a/drivers/crypto/ccp/ccp-crypto-aes-xts.c +++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c @@ -122,6 +122,7 @@ static int ccp_aes_xts_crypt(struct ablkcipher_request *req, struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm); struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req); unsigned int unit; + u32 unit_size; int ret; if (!ctx->u.aes.key_len) @@ -133,11 +134,17 @@ static int ccp_aes_xts_crypt(struct ablkcipher_request *req, if (!req->info) return -EINVAL; - for (unit = 0; unit < ARRAY_SIZE(unit_size_map); unit++) - if (!(req->nbytes & (unit_size_map[unit].size - 1))) - break; + unit_size = CCP_XTS_AES_UNIT_SIZE__LAST; + if (req->nbytes <= unit_size_map[0].size) { + for (unit = 0; unit < ARRAY_SIZE(unit_size_map); unit++) { + if (!(req->nbytes & (unit_size_map[unit].size - 1))) { + unit_size = unit_size_map[unit].value; + break; + } + } + } - if ((unit_size_map[unit].value == CCP_XTS_AES_UNIT_SIZE__LAST) || + if ((unit_size == CCP_XTS_AES_UNIT_SIZE__LAST) || (ctx->u.aes.key_len != AES_KEYSIZE_128)) { /* Use the fallback to process the request for any * unsupported unit sizes or key sizes @@ -158,7 +165,7 @@ static int ccp_aes_xts_crypt(struct ablkcipher_request *req, rctx->cmd.engine = CCP_ENGINE_XTS_AES_128; rctx->cmd.u.xts.action = (encrypt) ? CCP_AES_ACTION_ENCRYPT : CCP_AES_ACTION_DECRYPT; - rctx->cmd.u.xts.unit_size = unit_size_map[unit].value; + rctx->cmd.u.xts.unit_size = unit_size; rctx->cmd.u.xts.key = &ctx->u.aes.key_sg; rctx->cmd.u.xts.key_len = ctx->u.aes.key_len; rctx->cmd.u.xts.iv = &rctx->iv_sg; From f98a3feb04c264bdb6b1127831bca65484b6185e Mon Sep 17 00:00:00 2001 From: Russell Currey Date: Thu, 7 Apr 2016 16:28:26 +1000 Subject: [PATCH 004/312] powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge [ Upstream commit 871e178e0f2c4fa788f694721a10b4758d494ce1 ] In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the spec states that values of 9900-9905 can be returned, indicating that software should delay for 10^x (where x is the last digit, i.e. 990x) milliseconds and attempt the call again. Currently, the kernel doesn't know about this, and respecting it fixes some PCI failures when the hypervisor is busy. The delay is capped at 0.2 seconds. Cc: # 3.10+ Signed-off-by: Russell Currey Acked-by: Gavin Shan Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/platforms/pseries/eeh_pseries.c | 51 ++++++++++++++------ 1 file changed, 36 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c index 2039397cc75d..d2a44cd476f2 100644 --- a/arch/powerpc/platforms/pseries/eeh_pseries.c +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c @@ -623,29 +623,50 @@ static int pseries_eeh_configure_bridge(struct eeh_pe *pe) { int config_addr; int ret; + /* Waiting 0.2s maximum before skipping configuration */ + int max_wait = 200; /* Figure out the PE address */ config_addr = pe->config_addr; if (pe->addr) config_addr = pe->addr; - /* Use new configure-pe function, if supported */ - if (ibm_configure_pe != RTAS_UNKNOWN_SERVICE) { - ret = rtas_call(ibm_configure_pe, 3, 1, NULL, - config_addr, BUID_HI(pe->phb->buid), - BUID_LO(pe->phb->buid)); - } else if (ibm_configure_bridge != RTAS_UNKNOWN_SERVICE) { - ret = rtas_call(ibm_configure_bridge, 3, 1, NULL, - config_addr, BUID_HI(pe->phb->buid), - BUID_LO(pe->phb->buid)); - } else { - return -EFAULT; + while (max_wait > 0) { + /* Use new configure-pe function, if supported */ + if (ibm_configure_pe != RTAS_UNKNOWN_SERVICE) { + ret = rtas_call(ibm_configure_pe, 3, 1, NULL, + config_addr, BUID_HI(pe->phb->buid), + BUID_LO(pe->phb->buid)); + } else if (ibm_configure_bridge != RTAS_UNKNOWN_SERVICE) { + ret = rtas_call(ibm_configure_bridge, 3, 1, NULL, + config_addr, BUID_HI(pe->phb->buid), + BUID_LO(pe->phb->buid)); + } else { + return -EFAULT; + } + + if (!ret) + return ret; + + /* + * If RTAS returns a delay value that's above 100ms, cut it + * down to 100ms in case firmware made a mistake. For more + * on how these delay values work see rtas_busy_delay_time + */ + if (ret > RTAS_EXTENDED_DELAY_MIN+2 && + ret <= RTAS_EXTENDED_DELAY_MAX) + ret = RTAS_EXTENDED_DELAY_MIN+2; + + max_wait -= rtas_busy_delay_time(ret); + + if (max_wait < 0) + break; + + rtas_busy_delay(ret); } - if (ret) - pr_warn("%s: Unable to configure bridge PHB#%d-PE#%x (%d)\n", - __func__, pe->phb->global_number, pe->addr, ret); - + pr_warn("%s: Unable to configure bridge PHB#%d-PE#%x (%d)\n", + __func__, pe->phb->global_number, pe->addr, ret); return ret; } From 3de5aac328d5006b251fff8b3bf0c4940898ad42 Mon Sep 17 00:00:00 2001 From: Kailang Yang Date: Mon, 30 May 2016 15:58:28 +0800 Subject: [PATCH 005/312] ALSA: hda/realtek - ALC256 speaker noise issue [ Upstream commit e69e7e03ed225abf3e1c43545aa3bcb68dc81d5f ] That is some different register for ALC255 and ALC256. ALC256 can't fit with some ALC255 register. This issue is cause from LDO output voltage control. This patch is updated the right LDO register value. Signed-off-by: Kailang Yang Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/pci/hda/patch_realtek.c | 52 +++++++++++++++++++++++++++++++---- 1 file changed, 47 insertions(+), 5 deletions(-) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index bee74795c9b9..560948fda46f 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -3608,13 +3608,20 @@ static void alc269_fixup_hp_line1_mic1_led(struct hda_codec *codec, static void alc_headset_mode_unplugged(struct hda_codec *codec) { static struct coef_fw coef0255[] = { - WRITE_COEF(0x1b, 0x0c0b), /* LDO and MISC control */ WRITE_COEF(0x45, 0xd089), /* UAJ function set to menual mode */ UPDATE_COEFEX(0x57, 0x05, 1<<14, 0), /* Direct Drive HP Amp control(Set to verb control)*/ WRITE_COEF(0x06, 0x6104), /* Set MIC2 Vref gate with HP */ WRITE_COEFEX(0x57, 0x03, 0x8aa6), /* Direct Drive HP Amp control */ {} }; + static struct coef_fw coef0255_1[] = { + WRITE_COEF(0x1b, 0x0c0b), /* LDO and MISC control */ + {} + }; + static struct coef_fw coef0256[] = { + WRITE_COEF(0x1b, 0x0c4b), /* LDO and MISC control */ + {} + }; static struct coef_fw coef0233[] = { WRITE_COEF(0x1b, 0x0c0b), WRITE_COEF(0x45, 0xc429), @@ -3657,7 +3664,11 @@ static void alc_headset_mode_unplugged(struct hda_codec *codec) switch (codec->core.vendor_id) { case 0x10ec0255: + alc_process_coef_fw(codec, coef0255_1); + alc_process_coef_fw(codec, coef0255); + break; case 0x10ec0256: + alc_process_coef_fw(codec, coef0256); alc_process_coef_fw(codec, coef0255); break; case 0x10ec0233: @@ -3854,6 +3865,12 @@ static void alc_headset_mode_ctia(struct hda_codec *codec) WRITE_COEFEX(0x57, 0x03, 0x8ea6), {} }; + static struct coef_fw coef0256[] = { + WRITE_COEF(0x45, 0xd489), /* Set to CTIA type */ + WRITE_COEF(0x1b, 0x0c6b), + WRITE_COEFEX(0x57, 0x03, 0x8ea6), + {} + }; static struct coef_fw coef0233[] = { WRITE_COEF(0x45, 0xd429), WRITE_COEF(0x1b, 0x0c2b), @@ -3887,9 +3904,11 @@ static void alc_headset_mode_ctia(struct hda_codec *codec) switch (codec->core.vendor_id) { case 0x10ec0255: - case 0x10ec0256: alc_process_coef_fw(codec, coef0255); break; + case 0x10ec0256: + alc_process_coef_fw(codec, coef0256); + break; case 0x10ec0233: case 0x10ec0283: alc_process_coef_fw(codec, coef0233); @@ -3922,6 +3941,12 @@ static void alc_headset_mode_omtp(struct hda_codec *codec) WRITE_COEFEX(0x57, 0x03, 0x8ea6), {} }; + static struct coef_fw coef0256[] = { + WRITE_COEF(0x45, 0xe489), /* Set to OMTP Type */ + WRITE_COEF(0x1b, 0x0c6b), + WRITE_COEFEX(0x57, 0x03, 0x8ea6), + {} + }; static struct coef_fw coef0233[] = { WRITE_COEF(0x45, 0xe429), WRITE_COEF(0x1b, 0x0c2b), @@ -3955,9 +3980,11 @@ static void alc_headset_mode_omtp(struct hda_codec *codec) switch (codec->core.vendor_id) { case 0x10ec0255: - case 0x10ec0256: alc_process_coef_fw(codec, coef0255); break; + case 0x10ec0256: + alc_process_coef_fw(codec, coef0256); + break; case 0x10ec0233: case 0x10ec0283: alc_process_coef_fw(codec, coef0233); @@ -4181,7 +4208,7 @@ static void alc_fixup_headset_mode_no_hp_mic(struct hda_codec *codec, static void alc255_set_default_jack_type(struct hda_codec *codec) { /* Set to iphone type */ - static struct coef_fw fw[] = { + static struct coef_fw alc255fw[] = { WRITE_COEF(0x1b, 0x880b), WRITE_COEF(0x45, 0xd089), WRITE_COEF(0x1b, 0x080b), @@ -4189,7 +4216,22 @@ static void alc255_set_default_jack_type(struct hda_codec *codec) WRITE_COEF(0x1b, 0x0c0b), {} }; - alc_process_coef_fw(codec, fw); + static struct coef_fw alc256fw[] = { + WRITE_COEF(0x1b, 0x884b), + WRITE_COEF(0x45, 0xd089), + WRITE_COEF(0x1b, 0x084b), + WRITE_COEF(0x46, 0x0004), + WRITE_COEF(0x1b, 0x0c4b), + {} + }; + switch (codec->core.vendor_id) { + case 0x10ec0255: + alc_process_coef_fw(codec, alc255fw); + break; + case 0x10ec0256: + alc_process_coef_fw(codec, alc256fw); + break; + } msleep(30); } From 466a158ee7f1913f104b04b7e14baccd95af15bf Mon Sep 17 00:00:00 2001 From: "hongkun.cao" Date: Sat, 21 May 2016 15:23:39 +0800 Subject: [PATCH 006/312] pinctrl: mediatek: fix dual-edge code defect [ Upstream commit 5edf673d07fdcb6498be24914f3f38f8d8843199 ] When a dual-edge irq is triggered, an incorrect irq will be reported on condition that the external signal is not stable and this incorrect irq has been registered. Correct the register offset. Cc: stable@vger.kernel.org Signed-off-by: Hongkun Cao Reviewed-by: Matthias Brugger Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/mediatek/pinctrl-mtk-common.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/pinctrl/mediatek/pinctrl-mtk-common.c b/drivers/pinctrl/mediatek/pinctrl-mtk-common.c index de08175aef0a..d32a72e96c72 100644 --- a/drivers/pinctrl/mediatek/pinctrl-mtk-common.c +++ b/drivers/pinctrl/mediatek/pinctrl-mtk-common.c @@ -1030,9 +1030,10 @@ static void mtk_eint_irq_handler(unsigned irq, struct irq_desc *desc) const struct mtk_desc_pin *pin; chained_irq_enter(chip, desc); - for (eint_num = 0; eint_num < pctl->devdata->ap_num; eint_num += 32) { + for (eint_num = 0; + eint_num < pctl->devdata->ap_num; + eint_num += 32, reg += 4) { status = readl(reg); - reg += 4; while (status) { offset = __ffs(status); index = eint_num + offset; From d58d489f2e03ceb48f8dd979d94cac9b3451d0cd Mon Sep 17 00:00:00 2001 From: Thomas Huth Date: Thu, 12 May 2016 13:26:44 +0200 Subject: [PATCH 007/312] powerpc: Fix definition of SIAR and SDAR registers [ Upstream commit d23fac2b27d94aeb7b65536a50d32bfdc21fe01e ] The SIAR and SDAR registers are available twice, one time as SPRs 780 / 781 (unprivileged, but read-only), and one time as the SPRs 796 / 797 (privileged, but read and write). The Linux kernel code currently uses the unprivileged SPRs - while this is OK for reading, writing to that register of course does not work. Since the KVM code tries to write to this register, too (see the mtspr in book3s_hv_rmhandlers.S), the contents of this register sometimes get lost for the guests, e.g. during migration of a VM. To fix this issue, simply switch to the privileged SPR numbers instead. Cc: stable@vger.kernel.org Signed-off-by: Thomas Huth Acked-by: Paul Mackerras Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/include/asm/reg.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index f4f99f01b746..6fe14b422fd7 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -745,13 +745,13 @@ #define SPRN_PMC6 792 #define SPRN_PMC7 793 #define SPRN_PMC8 794 -#define SPRN_SIAR 780 -#define SPRN_SDAR 781 #define SPRN_SIER 784 #define SIER_SIPR 0x2000000 /* Sampled MSR_PR */ #define SIER_SIHV 0x1000000 /* Sampled MSR_HV */ #define SIER_SIAR_VALID 0x0400000 /* SIAR contents valid */ #define SIER_SDAR_VALID 0x0200000 /* SDAR contents valid */ +#define SPRN_SIAR 796 +#define SPRN_SDAR 797 #define SPRN_TACR 888 #define SPRN_TCSCR 889 #define SPRN_CSIGR 890 From dd254ba2b75a05dd7e2d843a30aa6371bb7dd67a Mon Sep 17 00:00:00 2001 From: Thomas Huth Date: Thu, 12 May 2016 13:29:11 +0200 Subject: [PATCH 008/312] powerpc: Use privileged SPR number for MMCR2 [ Upstream commit 8dd75ccb571f3c92c48014b3dabd3d51a115ab41 ] We are already using the privileged versions of MMCR0, MMCR1 and MMCRA in the kernel, so for MMCR2, we should better use the privileged versions, too, to be consistent. Fixes: 240686c13687 ("powerpc: Initialise PMU related regs on Power8") Cc: stable@vger.kernel.org # v3.10+ Suggested-by: Paul Mackerras Signed-off-by: Thomas Huth Acked-by: Paul Mackerras Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/include/asm/reg.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 6fe14b422fd7..a4bf6e0eb813 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -708,7 +708,7 @@ #define MMCR0_FCWAIT 0x00000002UL /* freeze counter in WAIT state */ #define MMCR0_FCHV 0x00000001UL /* freeze conditions in hypervisor mode */ #define SPRN_MMCR1 798 -#define SPRN_MMCR2 769 +#define SPRN_MMCR2 785 #define SPRN_MMCRA 0x312 #define MMCRA_SDSYNC 0x80000000UL /* SDAR synced with SIAR */ #define MMCRA_SDAR_DCACHE_MISS 0x40000000UL From 6e19f72011a87e576c00b2c38108cfa461542fca Mon Sep 17 00:00:00 2001 From: Martin Willi Date: Fri, 13 May 2016 12:41:48 +0200 Subject: [PATCH 009/312] mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNAL [ Upstream commit 62397da50bb20a6b812c949ef465d7e69fe54bb6 ] A wmediumd that does not send this attribute causes a NULL pointer dereference, as the attribute is accessed even if it does not exist. The attribute was required but never checked ever since userspace frame forwarding has been introduced. The issue gets more problematic once we allow wmediumd registration from user namespaces. Cc: stable@vger.kernel.org Fixes: 7882513bacb1 ("mac80211_hwsim driver support userspace frame tx/rx") Signed-off-by: Martin Willi Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- drivers/net/wireless/mac80211_hwsim.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index d5c0a1af08b9..eafaeb01aa3e 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -2724,6 +2724,7 @@ static int hwsim_tx_info_frame_received_nl(struct sk_buff *skb_2, if (!info->attrs[HWSIM_ATTR_ADDR_TRANSMITTER] || !info->attrs[HWSIM_ATTR_FLAGS] || !info->attrs[HWSIM_ATTR_COOKIE] || + !info->attrs[HWSIM_ATTR_SIGNAL] || !info->attrs[HWSIM_ATTR_TX_INFO]) goto out; From 2b8944f05530cb759468c42895571e46d103b883 Mon Sep 17 00:00:00 2001 From: Bob Copeland Date: Sun, 15 May 2016 13:19:16 -0400 Subject: [PATCH 010/312] mac80211: mesh: flush mesh paths unconditionally [ Upstream commit fe7a7c57629e8dcbc0e297363a9b2366d67a6dc5 ] Currently, the mesh paths associated with a nexthop station are cleaned up in the following code path: __sta_info_destroy_part1 synchronize_net() __sta_info_destroy_part2 -> cleanup_single_sta -> mesh_sta_cleanup -> mesh_plink_deactivate -> mesh_path_flush_by_nexthop However, there are a couple of problems here: 1) the paths aren't flushed at all if the MPM is running in userspace (e.g. when using wpa_supplicant or authsae) 2) there is no synchronize_rcu between removing the path and readers accessing the nexthop, which means the following race is possible: CPU0 CPU1 ~~~~ ~~~~ sta_info_destroy_part1() synchronize_net() rcu_read_lock() mesh_nexthop_resolve() mpath = mesh_path_lookup() [...] -> mesh_path_flush_by_nexthop() sta = rcu_dereference( mpath->next_hop) kfree(sta) access sta <-- CRASH Fix both of these by unconditionally flushing paths before destroying the sta, and by adding a synchronize_net() after path flush to ensure no active readers can still dereference the sta. Fixes this crash: [ 348.529295] BUG: unable to handle kernel paging request at 00020040 [ 348.530014] IP: [] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] [ 348.530014] *pde = 00000000 [ 348.530014] Oops: 0000 [#1] PREEMPT [ 348.530014] Modules linked in: drbg ansi_cprng ctr ccm ppp_generic slhc ipt_MASQUERADE nf_nat_masquerade_ipv4 8021q ] [ 348.530014] CPU: 0 PID: 20597 Comm: wget Tainted: G O 4.6.0-rc5-wt=V1 #1 [ 348.530014] Hardware name: To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080016 11/07/2014 [ 348.530014] task: f64fa280 ti: f4f9c000 task.ti: f4f9c000 [ 348.530014] EIP: 0060:[] EFLAGS: 00010246 CPU: 0 [ 348.530014] EIP is at ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] [ 348.530014] EAX: f4ce63e0 EBX: 00000088 ECX: f3788416 EDX: 00020008 [ 348.530014] ESI: 00000000 EDI: 00000088 EBP: f6409a4c ESP: f6409a40 [ 348.530014] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [ 348.530014] CR0: 80050033 CR2: 00020040 CR3: 33190000 CR4: 00000690 [ 348.530014] Stack: [ 348.530014] 00000000 f4ce63e0 f5f9bd80 f6409a64 f9291d80 0000ce67 f5d51e00 f4ce63e0 [ 348.530014] f3788416 f6409a80 f9291dc1 f4ce8320 f4ce63e0 f5d51e00 f4ce63e0 f4ce8320 [ 348.530014] f6409a98 f9277f6f 00000000 00000000 0000007c 00000000 f6409b2c f9278dd1 [ 348.530014] Call Trace: [ 348.530014] [] mesh_nexthop_lookup+0xbb/0xc8 [mac80211] [ 348.530014] [] mesh_nexthop_resolve+0x34/0xd8 [mac80211] [ 348.530014] [] ieee80211_xmit+0x92/0xc1 [mac80211] [ 348.530014] [] __ieee80211_subif_start_xmit+0x807/0x83c [mac80211] [ 348.530014] [] ? sch_direct_xmit+0xd7/0x1b3 [ 348.530014] [] ? __local_bh_enable_ip+0x5d/0x7b [ 348.530014] [] ? nf_nat_ipv4_out+0x4c/0xd0 [nf_nat_ipv4] [ 348.530014] [] ? iptable_nat_ipv4_fn+0xf/0xf [iptable_nat] [ 348.530014] [] ? netif_skb_features+0x14d/0x30a [ 348.530014] [] ieee80211_subif_start_xmit+0xa/0xe [mac80211] [ 348.530014] [] dev_hard_start_xmit+0x1f8/0x267 [ 348.530014] [] ? validate_xmit_skb.isra.120.part.121+0x10/0x253 [ 348.530014] [] sch_direct_xmit+0x8b/0x1b3 [ 348.530014] [] __dev_queue_xmit+0x2c8/0x513 [ 348.530014] [] dev_queue_xmit+0xa/0xc [ 348.530014] [] batadv_send_skb_packet+0xd6/0xec [batman_adv] [ 348.530014] [] batadv_send_unicast_skb+0x15/0x4a [batman_adv] [ 348.530014] [] batadv_dat_send_data+0x27e/0x310 [batman_adv] [ 348.530014] [] ? batadv_tt_global_hash_find.isra.11+0x8/0xa [batman_adv] [ 348.530014] [] batadv_dat_snoop_outgoing_arp_request+0x208/0x23d [batman_adv] [ 348.530014] [] batadv_interface_tx+0x206/0x385 [batman_adv] [ 348.530014] [] dev_hard_start_xmit+0x1f8/0x267 [ 348.530014] [] ? validate_xmit_skb.isra.120.part.121+0x10/0x253 [ 348.530014] [] sch_direct_xmit+0x8b/0x1b3 [ 348.530014] [] __dev_queue_xmit+0x2c8/0x513 [ 348.530014] [] ? igb_xmit_frame+0x57/0x72 [igb] [ 348.530014] [] dev_queue_xmit+0xa/0xc [ 348.530014] [] br_dev_queue_push_xmit+0xeb/0xfb [bridge] [ 348.530014] [] br_forward_finish+0x29/0x74 [bridge] [ 348.530014] [] ? deliver_clone+0x3b/0x3b [bridge] [ 348.530014] [] __br_forward+0x89/0xe7 [bridge] [ 348.530014] [] ? br_dev_queue_push_xmit+0xfb/0xfb [bridge] [ 348.530014] [] deliver_clone+0x34/0x3b [bridge] [ 348.530014] [] ? br_flood+0x95/0x95 [bridge] [ 348.530014] [] br_flood+0x77/0x95 [bridge] [ 348.530014] [] br_flood_forward+0x13/0x1a [bridge] [ 348.530014] [] ? br_flood+0x95/0x95 [bridge] [ 348.530014] [] br_handle_frame_finish+0x392/0x3db [bridge] [ 348.530014] [] ? nf_iterate+0x2b/0x6b [ 348.530014] [] br_handle_frame+0x1e6/0x240 [bridge] [ 348.530014] [] ? br_handle_local_finish+0x6a/0x6a [bridge] [ 348.530014] [] __netif_receive_skb_core+0x43a/0x66b [ 348.530014] [] ? br_handle_frame_finish+0x3db/0x3db [bridge] [ 348.530014] [] ? resched_curr+0x19/0x37 [ 348.530014] [] ? check_preempt_wakeup+0xbf/0xfe [ 348.530014] [] ? ktime_get_with_offset+0x5c/0xfc [ 348.530014] [] __netif_receive_skb+0x47/0x55 [ 348.530014] [] netif_receive_skb_internal+0x40/0x5a [ 348.530014] [] napi_gro_receive+0x3a/0x94 [ 348.530014] [] igb_poll+0x6fd/0x9ad [igb] [ 348.530014] [] ? swake_up_locked+0x14/0x26 [ 348.530014] [] net_rx_action+0xde/0x250 [ 348.530014] [] __do_softirq+0x8a/0x163 [ 348.530014] [] ? __hrtimer_tasklet_trampoline+0x19/0x19 [ 348.530014] [] do_softirq_own_stack+0x26/0x2c [ 348.530014] [ 348.530014] [] irq_exit+0x31/0x6f [ 348.530014] [] do_IRQ+0x8d/0xa0 [ 348.530014] [] common_interrupt+0x2c/0x40 [ 348.530014] Code: e7 8c 00 66 81 ff 88 00 75 12 85 d2 75 0e b2 c3 b8 83 e9 29 f9 e8 a7 5f f9 c6 eb 74 66 81 e3 8c 005 [ 348.530014] EIP: [] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] SS:ESP 0068:f6409a40 [ 348.530014] CR2: 0000000000020040 [ 348.530014] ---[ end trace 48556ac26779732e ]--- [ 348.530014] Kernel panic - not syncing: Fatal exception in interrupt [ 348.530014] Kernel Offset: disabled Cc: stable@vger.kernel.org Reported-by: Fred Veldini Tested-by: Fred Veldini Signed-off-by: Bob Copeland Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/mac80211/mesh.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/net/mac80211/mesh.c b/net/mac80211/mesh.c index afcc67a157fd..c5af4e3d4497 100644 --- a/net/mac80211/mesh.c +++ b/net/mac80211/mesh.c @@ -161,6 +161,10 @@ void mesh_sta_cleanup(struct sta_info *sta) del_timer_sync(&sta->plink_timer); } + /* make sure no readers can access nexthop sta from here on */ + mesh_path_flush_by_nexthop(sta); + synchronize_net(); + if (changed) ieee80211_mbss_info_change_notify(sdata, changed); } From fbcda466f1b3cb237eb5416393671e9e1bece57b Mon Sep 17 00:00:00 2001 From: "Ewan D. Milne" Date: Tue, 31 May 2016 09:42:29 -0400 Subject: [PATCH 011/312] scsi: Add QEMU CD-ROM to VPD Inquiry Blacklist [ Upstream commit fbd83006e3e536fcb103228d2422ea63129ccb03 ] Linux fails to boot as a guest with a QEMU CD-ROM: [ 4.439488] ata2.00: ATAPI: QEMU CD-ROM, 0.8.2, max UDMA/100 [ 4.443649] ata2.00: configured for MWDMA2 [ 4.450267] scsi 1:0:0:0: CD-ROM QEMU QEMU CD-ROM 0.8. PQ: 0 ANSI: 5 [ 4.464317] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [ 4.464319] ata2.00: BMDMA stat 0x5 [ 4.464339] ata2.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 0 dma 16640 in [ 4.464339] Inquiry 12 01 00 00 ff 00res 48/20:02:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) [ 4.464341] ata2.00: status: { DRDY DRQ } [ 4.465864] ata2: soft resetting link [ 4.625971] ata2.00: configured for MWDMA2 [ 4.628290] ata2: EH complete [ 4.646670] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [ 4.646671] ata2.00: BMDMA stat 0x5 [ 4.646683] ata2.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 0 dma 16640 in [ 4.646683] Inquiry 12 01 00 00 ff 00res 48/20:02:00:24:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) [ 4.646685] ata2.00: status: { DRDY DRQ } [ 4.648193] ata2: soft resetting link ... Fix this by suppressing VPD inquiry for this device. Signed-off-by: Ewan D. Milne Reported-by: Jan Stancek Tested-by: Jan Stancek Cc: Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/scsi_devinfo.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c index ac418e73536d..42d3f82e75c7 100644 --- a/drivers/scsi/scsi_devinfo.c +++ b/drivers/scsi/scsi_devinfo.c @@ -227,6 +227,7 @@ static struct { {"PIONEER", "CD-ROM DRM-624X", NULL, BLIST_FORCELUN | BLIST_SINGLELUN}, {"Promise", "VTrak E610f", NULL, BLIST_SPARSELUN | BLIST_NO_RSOC}, {"Promise", "", NULL, BLIST_SPARSELUN}, + {"QEMU", "QEMU CD-ROM", NULL, BLIST_SKIP_VPD_PAGES}, {"QNAP", "iSCSI Storage", NULL, BLIST_MAX_1024}, {"SYNOLOGY", "iSCSI Storage", NULL, BLIST_MAX_1024}, {"QUANTUM", "XP34301", "1071", BLIST_NOTQ}, From 3327b0c01cf4941a7692e715c158bce85011e0a6 Mon Sep 17 00:00:00 2001 From: Thomas Huth Date: Tue, 31 May 2016 07:51:17 +0200 Subject: [PATCH 012/312] powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call [ Upstream commit 7cc851039d643a2ee7df4d18177150f2c3a484f5 ] If we do not provide the PVR for POWER8NVL, a guest on this system currently ends up in PowerISA 2.06 compatibility mode on KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet. So some new instructions from POWER8 (like "mtvsrd") get disabled for the guest, resulting in crashes when using code compiled explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC). Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Thomas Huth Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/kernel/prom_init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index fd1fe4c37599..42654360f84e 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -647,6 +647,7 @@ unsigned char ibm_architecture_vec[] = { W(0xffff0000), W(0x003e0000), /* POWER6 */ W(0xffff0000), W(0x003f0000), /* POWER7 */ W(0xffff0000), W(0x004b0000), /* POWER8E */ + W(0xffff0000), W(0x004c0000), /* POWER8NVL */ W(0xffff0000), W(0x004d0000), /* POWER8 */ W(0xffffffff), W(0x0f000004), /* all 2.07-compliant */ W(0xffffffff), W(0x0f000003), /* all 2.06-compliant */ From f5a826aa29aba9e2800ae8269df2f274d6bb322f Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Thu, 30 Jul 2015 12:40:33 +0530 Subject: [PATCH 013/312] thermal/cpu_cooling: rename cpufreq_val as clipped_freq [ Upstream commit 59f0d21883f39d27f14408d4ca211dce80658963 ] That's what it is for, lets name it properly. Signed-off-by: Viresh Kumar Signed-off-by: Eduardo Valentin Signed-off-by: Sasha Levin --- drivers/thermal/cpu_cooling.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index f65f0d109fc8..b7693555942f 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -52,7 +52,7 @@ * registered cooling device. * @cpufreq_state: integer value representing the current state of cpufreq * cooling devices. - * @cpufreq_val: integer value representing the absolute value of the clipped + * @clipped_freq: integer value representing the absolute value of the clipped * frequency. * @max_level: maximum cooling level. One less than total number of valid * cpufreq frequencies. @@ -66,7 +66,7 @@ struct cpufreq_cooling_device { int id; struct thermal_cooling_device *cool_dev; unsigned int cpufreq_state; - unsigned int cpufreq_val; + unsigned int clipped_freq; unsigned int max_level; unsigned int *freq_table; /* In descending order */ struct cpumask allowed_cpus; @@ -195,7 +195,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb, &cpufreq_dev->allowed_cpus)) continue; - max_freq = cpufreq_dev->cpufreq_val; + max_freq = cpufreq_dev->clipped_freq; if (policy->max != max_freq) cpufreq_verify_within_limits(policy, 0, max_freq); @@ -273,7 +273,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev, clip_freq = cpufreq_device->freq_table[state]; cpufreq_device->cpufreq_state = state; - cpufreq_device->cpufreq_val = clip_freq; + cpufreq_device->clipped_freq = clip_freq; cpufreq_update_policy(cpu); @@ -383,7 +383,7 @@ __cpufreq_cooling_register(struct device_node *np, pr_debug("%s: freq:%u KHz\n", __func__, freq); } - cpufreq_dev->cpufreq_val = cpufreq_dev->freq_table[0]; + cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0]; cpufreq_dev->cool_dev = cool_dev; mutex_lock(&cooling_cpufreq_lock); From 1c4732bd6d31d8ded9b6ce802e60cd92fc1701bb Mon Sep 17 00:00:00 2001 From: Lukasz Luba Date: Tue, 31 May 2016 11:32:02 +0100 Subject: [PATCH 014/312] thermal: cpu_cooling: fix improper order during initialization [ Upstream commit f840ab18bdf2e415dac21d09fbbbd2873111bd48 ] The freq_table array is not populated before calling thermal_of_cooling_register. The code which populates the freq table was introduced in commit f6859014. This should be done before registering new thermal cooling device. The log shows effects of this wrong decision. [ 2.172614] cpu cpu1: Failed to get voltage for frequency 1984518656000: -34 [ 2.220863] cpu cpu0: Failed to get voltage for frequency 1984524416000: -34 Cc: # 3.19+ Fixes: f6859014c7e7 ("thermal: cpu_cooling: Store frequencies in descending order") Signed-off-by: Lukasz Luba Acked-by: Javi Merino Acked-by: Viresh Kumar Signed-off-by: Zhang Rui Signed-off-by: Sasha Levin --- drivers/thermal/cpu_cooling.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index b7693555942f..5db5e91fd12f 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -363,14 +363,6 @@ __cpufreq_cooling_register(struct device_node *np, goto free_table; } - snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d", - cpufreq_dev->id); - - cool_dev = thermal_of_cooling_device_register(np, dev_name, cpufreq_dev, - &cpufreq_cooling_ops); - if (IS_ERR(cool_dev)) - goto remove_idr; - /* Fill freq-table in descending order of frequencies */ for (i = 0, freq = -1; i <= cpufreq_dev->max_level; i++) { freq = find_next_max(table, freq); @@ -383,6 +375,14 @@ __cpufreq_cooling_register(struct device_node *np, pr_debug("%s: freq:%u KHz\n", __func__, freq); } + snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d", + cpufreq_dev->id); + + cool_dev = thermal_of_cooling_device_register(np, dev_name, cpufreq_dev, + &cpufreq_cooling_ops); + if (IS_ERR(cool_dev)) + goto remove_idr; + cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0]; cpufreq_dev->cool_dev = cool_dev; From 3d69d142d6431c9daecc6e47daf22abd386311af Mon Sep 17 00:00:00 2001 From: Ilia Mirkin Date: Wed, 7 Oct 2015 18:39:32 -0400 Subject: [PATCH 015/312] drm/nouveau/gr: document mp error 0x10 [ Upstream commit 3988f645f053a6889d00324dac3e57bd62cb8900 ] NVIDIA provided the documentation for mp error 0x10, INVALID_ADDR_SPACE, which apparently happens when trying to use an atomic operation on local or shared memory (instead of global memory). Signed-off-by: Ilia Mirkin Signed-off-by: Ben Skeggs Signed-off-by: Sasha Levin --- drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c index 5606c25e5d02..eb043217e8af 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c @@ -804,6 +804,7 @@ static const struct nvkm_enum gf100_mp_warp_error[] = { { 0x0d, "GPR_OUT_OF_BOUNDS" }, { 0x0e, "MEM_OUT_OF_BOUNDS" }, { 0x0f, "UNALIGNED_MEM_ACCESS" }, + { 0x10, "INVALID_ADDR_SPACE" }, { 0x11, "INVALID_PARAM" }, {} }; From 063c66caefc18d026a1f9d5c49bf6bf761feb318 Mon Sep 17 00:00:00 2001 From: Ben Skeggs Date: Wed, 1 Jun 2016 16:20:10 +1000 Subject: [PATCH 016/312] drm/nouveau/gr/gf100-: update sm error decoding from gk20a nvgpu headers [ Upstream commit 383d0a419f8e63e3d65e706c3c515fa9505ce364 ] Signed-off-by: Ben Skeggs Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- .../gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 37 ++++++++++++++----- 1 file changed, 28 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c index eb043217e8af..6d9fea664c6a 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c @@ -796,22 +796,41 @@ gf100_gr_trap_gpc_rop(struct gf100_gr_priv *priv, int gpc) } static const struct nvkm_enum gf100_mp_warp_error[] = { - { 0x00, "NO_ERROR" }, - { 0x01, "STACK_MISMATCH" }, + { 0x01, "STACK_ERROR" }, + { 0x02, "API_STACK_ERROR" }, + { 0x03, "RET_EMPTY_STACK_ERROR" }, + { 0x04, "PC_WRAP" }, { 0x05, "MISALIGNED_PC" }, - { 0x08, "MISALIGNED_GPR" }, - { 0x09, "INVALID_OPCODE" }, - { 0x0d, "GPR_OUT_OF_BOUNDS" }, - { 0x0e, "MEM_OUT_OF_BOUNDS" }, - { 0x0f, "UNALIGNED_MEM_ACCESS" }, + { 0x06, "PC_OVERFLOW" }, + { 0x07, "MISALIGNED_IMMC_ADDR" }, + { 0x08, "MISALIGNED_REG" }, + { 0x09, "ILLEGAL_INSTR_ENCODING" }, + { 0x0a, "ILLEGAL_SPH_INSTR_COMBO" }, + { 0x0b, "ILLEGAL_INSTR_PARAM" }, + { 0x0c, "INVALID_CONST_ADDR" }, + { 0x0d, "OOR_REG" }, + { 0x0e, "OOR_ADDR" }, + { 0x0f, "MISALIGNED_ADDR" }, { 0x10, "INVALID_ADDR_SPACE" }, - { 0x11, "INVALID_PARAM" }, + { 0x11, "ILLEGAL_INSTR_PARAM2" }, + { 0x12, "INVALID_CONST_ADDR_LDC" }, + { 0x13, "GEOMETRY_SM_ERROR" }, + { 0x14, "DIVERGENT" }, + { 0x15, "WARP_EXIT" }, {} }; static const struct nvkm_bitfield gf100_mp_global_error[] = { + { 0x00000001, "SM_TO_SM_FAULT" }, + { 0x00000002, "L1_ERROR" }, { 0x00000004, "MULTIPLE_WARP_ERRORS" }, - { 0x00000008, "OUT_OF_STACK_SPACE" }, + { 0x00000008, "PHYSICAL_STACK_OVERFLOW" }, + { 0x00000010, "BPT_INT" }, + { 0x00000020, "BPT_PAUSE" }, + { 0x00000040, "SINGLE_STEP_COMPLETE" }, + { 0x20000000, "ECC_SEC_ERROR" }, + { 0x40000000, "ECC_DED_ERROR" }, + { 0x80000000, "TIMEOUT" }, {} }; From 34e6c73e0bebcec0ff4788170b1abd51cbf683ad Mon Sep 17 00:00:00 2001 From: Ben Skeggs Date: Thu, 2 Jun 2016 12:23:31 +1000 Subject: [PATCH 017/312] drm/nouveau/fbcon: fix out-of-bounds memory accesses [ Upstream commit f045f459d925138fe7d6193a8c86406bda7e49da ] Reported by KASAN. Signed-off-by: Ben Skeggs Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/nouveau/nouveau_fbcon.c | 1 + drivers/gpu/drm/nouveau/nv04_fbcon.c | 7 ++----- drivers/gpu/drm/nouveau/nv50_fbcon.c | 6 ++---- drivers/gpu/drm/nouveau/nvc0_fbcon.c | 6 ++---- 4 files changed, 7 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c index 567791b27d6d..cf43f77be254 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c +++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c @@ -572,6 +572,7 @@ nouveau_fbcon_init(struct drm_device *dev) if (ret) goto fini; + fbcon->helper.fbdev->pixmap.buf_align = 4; return 0; fini: diff --git a/drivers/gpu/drm/nouveau/nv04_fbcon.c b/drivers/gpu/drm/nouveau/nv04_fbcon.c index 495c57644ced..7a92d15d474e 100644 --- a/drivers/gpu/drm/nouveau/nv04_fbcon.c +++ b/drivers/gpu/drm/nouveau/nv04_fbcon.c @@ -82,7 +82,6 @@ nv04_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) uint32_t fg; uint32_t bg; uint32_t dsize; - uint32_t width; uint32_t *data = (uint32_t *)image->data; int ret; @@ -93,9 +92,6 @@ nv04_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) if (ret) return ret; - width = ALIGN(image->width, 8); - dsize = ALIGN(width * image->height, 32) >> 5; - if (info->fix.visual == FB_VISUAL_TRUECOLOR || info->fix.visual == FB_VISUAL_DIRECTCOLOR) { fg = ((uint32_t *) info->pseudo_palette)[image->fg_color]; @@ -111,10 +107,11 @@ nv04_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) ((image->dx + image->width) & 0xffff)); OUT_RING(chan, bg); OUT_RING(chan, fg); - OUT_RING(chan, (image->height << 16) | width); + OUT_RING(chan, (image->height << 16) | image->width); OUT_RING(chan, (image->height << 16) | image->width); OUT_RING(chan, (image->dy << 16) | (image->dx & 0xffff)); + dsize = ALIGN(image->width * image->height, 32) >> 5; while (dsize) { int iter_len = dsize > 128 ? 128 : dsize; diff --git a/drivers/gpu/drm/nouveau/nv50_fbcon.c b/drivers/gpu/drm/nouveau/nv50_fbcon.c index 394c89abcc97..cb2a71ada99e 100644 --- a/drivers/gpu/drm/nouveau/nv50_fbcon.c +++ b/drivers/gpu/drm/nouveau/nv50_fbcon.c @@ -95,7 +95,7 @@ nv50_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) struct nouveau_fbdev *nfbdev = info->par; struct nouveau_drm *drm = nouveau_drm(nfbdev->dev); struct nouveau_channel *chan = drm->channel; - uint32_t width, dwords, *data = (uint32_t *)image->data; + uint32_t dwords, *data = (uint32_t *)image->data; uint32_t mask = ~(~0 >> (32 - info->var.bits_per_pixel)); uint32_t *palette = info->pseudo_palette; int ret; @@ -107,9 +107,6 @@ nv50_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) if (ret) return ret; - width = ALIGN(image->width, 32); - dwords = (width * image->height) >> 5; - BEGIN_NV04(chan, NvSub2D, 0x0814, 2); if (info->fix.visual == FB_VISUAL_TRUECOLOR || info->fix.visual == FB_VISUAL_DIRECTCOLOR) { @@ -128,6 +125,7 @@ nv50_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) OUT_RING(chan, 0); OUT_RING(chan, image->dy); + dwords = ALIGN(image->width * image->height, 32) >> 5; while (dwords) { int push = dwords > 2047 ? 2047 : dwords; diff --git a/drivers/gpu/drm/nouveau/nvc0_fbcon.c b/drivers/gpu/drm/nouveau/nvc0_fbcon.c index 61246677e8dc..69f760e8c54f 100644 --- a/drivers/gpu/drm/nouveau/nvc0_fbcon.c +++ b/drivers/gpu/drm/nouveau/nvc0_fbcon.c @@ -95,7 +95,7 @@ nvc0_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) struct nouveau_fbdev *nfbdev = info->par; struct nouveau_drm *drm = nouveau_drm(nfbdev->dev); struct nouveau_channel *chan = drm->channel; - uint32_t width, dwords, *data = (uint32_t *)image->data; + uint32_t dwords, *data = (uint32_t *)image->data; uint32_t mask = ~(~0 >> (32 - info->var.bits_per_pixel)); uint32_t *palette = info->pseudo_palette; int ret; @@ -107,9 +107,6 @@ nvc0_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) if (ret) return ret; - width = ALIGN(image->width, 32); - dwords = (width * image->height) >> 5; - BEGIN_NVC0(chan, NvSub2D, 0x0814, 2); if (info->fix.visual == FB_VISUAL_TRUECOLOR || info->fix.visual == FB_VISUAL_DIRECTCOLOR) { @@ -128,6 +125,7 @@ nvc0_fbcon_imageblit(struct fb_info *info, const struct fb_image *image) OUT_RING (chan, 0); OUT_RING (chan, image->dy); + dwords = ALIGN(image->width * image->height, 32) >> 5; while (dwords) { int push = dwords > 2047 ? 2047 : dwords; From 5642859e6a68f8c8c1e0c5163fad29ad4497f6f5 Mon Sep 17 00:00:00 2001 From: Russell King Date: Mon, 30 May 2016 23:14:56 +0100 Subject: [PATCH 018/312] ARM: fix PTRACE_SETVFPREGS on SMP systems [ Upstream commit e2dfb4b880146bfd4b6aa8e138c0205407cebbaf ] PTRACE_SETVFPREGS fails to properly mark the VFP register set to be reloaded, because it undoes one of the effects of vfp_flush_hwstate(). Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to an invalid CPU number, but vfp_set() overwrites this with the original CPU number, thereby rendering the hardware state as apparently "valid", even though the software state is more recent. Fix this by reverting the previous change. Cc: Fixes: 8130b9d7b9d8 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers") Acked-by: Will Deacon Tested-by: Simon Marchi Signed-off-by: Russell King Signed-off-by: Sasha Levin --- arch/arm/kernel/ptrace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c index ef9119f7462e..4d9375814b53 100644 --- a/arch/arm/kernel/ptrace.c +++ b/arch/arm/kernel/ptrace.c @@ -733,8 +733,8 @@ static int vfp_set(struct task_struct *target, if (ret) return ret; - vfp_flush_hwstate(thread); thread->vfpstate.hard = new_vfp; + vfp_flush_hwstate(thread); return 0; } From f9301f929c0a02505a1e151b636c5a9af4f04bfc Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Wed, 1 Jun 2016 14:09:21 +0200 Subject: [PATCH 019/312] KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit c622a3c21ede892e370b56e1ceb9eb28f8bbda6b ] Found by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 0000000000000120 IP: [] kvm_irq_map_gsi+0x12/0x90 [kvm] PGD 6f80b067 PUD b6535067 PMD 0 Oops: 0000 [#1] SMP CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 [...] Call Trace: [] irqfd_update+0x32/0xc0 [kvm] [] kvm_irqfd+0x3dc/0x5b0 [kvm] [] kvm_vm_ioctl+0x164/0x6f0 [kvm] [] do_vfs_ioctl+0x298/0x480 [] SyS_ioctl+0x79/0x90 [] tracesys_phase2+0x84/0x89 Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85 RIP [] kvm_irq_map_gsi+0x12/0x90 [kvm] RSP CR2: 0000000000000120 Testcase: #include #include #include #include #include #include #include long r[26]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", 0); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); struct kvm_irqfd ifd; ifd.fd = syscall(SYS_eventfd2, 5, 0); ifd.gsi = 3; ifd.flags = 2; ifd.resamplefd = ifd.fd; r[25] = ioctl(r[3], KVM_IRQFD, &ifd); return 0; } Reported-by: Dmitry Vyukov Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin --- virt/kvm/irqchip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c index 1d56a901e791..1470f2aab091 100644 --- a/virt/kvm/irqchip.c +++ b/virt/kvm/irqchip.c @@ -51,7 +51,7 @@ int kvm_irq_map_gsi(struct kvm *kvm, irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm->irq_srcu, lockdep_is_held(&kvm->irq_lock)); - if (gsi < irq_rt->nr_rt_entries) { + if (irq_rt && gsi < irq_rt->nr_rt_entries) { hlist_for_each_entry(e, &irq_rt->map[gsi], link) { entries[n] = *e; ++n; From 839c2669344ed8c10f3b6f968c467071b6288067 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Wed, 1 Jun 2016 14:09:23 +0200 Subject: [PATCH 020/312] KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit d14bdb553f9196169f003058ae1cdabe514470e6 ] MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS time, and the next KVM_RUN oopses: general protection fault: 0000 [#1] SMP CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 [...] Call Trace: [] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm] [] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [] do_vfs_ioctl+0x298/0x480 [] SyS_ioctl+0x79/0x90 [] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 RIP [] native_set_debugreg+0x2b/0x40 RSP Testcase (beautified/reduced from syzkaller output): #include #include #include #include #include #include #include long r[8]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); memcpy(&dr, "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72" "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8" "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9" "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb", 48); r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr); r[6] = ioctl(r[4], KVM_RUN, 0); } Reported-by: Dmitry Vyukov Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin --- arch/x86/kvm/x86.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 75eb9603ed29..bd84d2226ca1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3174,6 +3174,11 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu, if (dbgregs->flags) return -EINVAL; + if (dbgregs->dr6 & ~0xffffffffull) + return -EINVAL; + if (dbgregs->dr7 & ~0xffffffffull) + return -EINVAL; + memcpy(vcpu->arch.db, dbgregs->db, sizeof(vcpu->arch.db)); kvm_update_dr0123(vcpu); vcpu->arch.dr6 = dbgregs->dr6; From c0410f18e2b4f62090b04743f0d5d8e672335225 Mon Sep 17 00:00:00 2001 From: Marc Zyngier Date: Thu, 2 Jun 2016 09:00:28 +0100 Subject: [PATCH 021/312] irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding mask [ Upstream commit dd5f1b049dc139876801db3cdd0f20d21fd428cc ] The INTID mask is wrong, and is made a signed value, which has nteresting effects in the KVM emulation. Let's sanitize it. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier Signed-off-by: Sasha Levin --- include/linux/irqchip/arm-gic-v3.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h index ffbc034c8810..cbf1ce800fd1 100644 --- a/include/linux/irqchip/arm-gic-v3.h +++ b/include/linux/irqchip/arm-gic-v3.h @@ -307,7 +307,7 @@ #define ICC_SGI1R_AFFINITY_1_SHIFT 16 #define ICC_SGI1R_AFFINITY_1_MASK (0xff << ICC_SGI1R_AFFINITY_1_SHIFT) #define ICC_SGI1R_SGI_ID_SHIFT 24 -#define ICC_SGI1R_SGI_ID_MASK (0xff << ICC_SGI1R_SGI_ID_SHIFT) +#define ICC_SGI1R_SGI_ID_MASK (0xfULL << ICC_SGI1R_SGI_ID_SHIFT) #define ICC_SGI1R_AFFINITY_2_SHIFT 32 #define ICC_SGI1R_AFFINITY_2_MASK (0xffULL << ICC_SGI1R_AFFINITY_1_SHIFT) #define ICC_SGI1R_IRQ_ROUTING_MODE_BIT 40 From 5ec6f21ee5c6d2207708a8469771efbde823488f Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Thu, 26 May 2016 21:08:17 +0100 Subject: [PATCH 022/312] locking/ww_mutex: Report recursive ww_mutex locking early [ Upstream commit 0422e83d84ae24b933e4b0d4c1e0f0b4ae8a0a3b ] Recursive locking for ww_mutexes was originally conceived as an exception. However, it is heavily used by the DRM atomic modesetting code. Currently, the recursive deadlock is checked after we have queued up for a busy-spin and as we never release the lock, we spin until kicked, whereupon the deadlock is discovered and reported. A simple solution for the now common problem is to move the recursive deadlock discovery to the first action when taking the ww_mutex. Suggested-by: Maarten Lankhorst Signed-off-by: Chris Wilson Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Maarten Lankhorst Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1464293297-19777-1-git-send-email-chris@chris-wilson.co.uk Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- kernel/locking/mutex.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 4cccea6b8934..6f1c3e41b2a6 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -486,9 +486,6 @@ __ww_mutex_lock_check_stamp(struct mutex *lock, struct ww_acquire_ctx *ctx) if (!hold_ctx) return 0; - if (unlikely(ctx == hold_ctx)) - return -EALREADY; - if (ctx->stamp - hold_ctx->stamp <= LONG_MAX && (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) { #ifdef CONFIG_DEBUG_MUTEXES @@ -514,6 +511,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, unsigned long flags; int ret; + if (use_ww_ctx) { + struct ww_mutex *ww = container_of(lock, struct ww_mutex, base); + if (unlikely(ww_ctx == READ_ONCE(ww->ctx))) + return -EALREADY; + } + preempt_disable(); mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip); From f518d7a9791398c181e77b0f645802678055a554 Mon Sep 17 00:00:00 2001 From: AceLan Kao Date: Fri, 3 Jun 2016 14:45:25 +0800 Subject: [PATCH 023/312] ALSA: hda - Fix headset mic detection problem for Dell machine [ Upstream commit f90d83b301701026b2e4c437a3613f377f63290e ] Add the pin configuration value of this machine into the pin_quirk table to make DELL1_MIC_NO_PRESENCE apply to this machine. Signed-off-by: AceLan Kao Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/pci/hda/patch_realtek.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 560948fda46f..ddee824e67ea 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5740,6 +5740,10 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x17, 0x40000000}, {0x1d, 0x40700001}, {0x21, 0x02211040}), + SND_HDA_PIN_QUIRK(0x10ec0255, 0x1028, "Dell Inspiron 5565", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, + {0x12, 0x90a60180}, + {0x14, 0x90170120}, + {0x21, 0x02211030}), SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, ALC255_STANDARD_PINS, {0x12, 0x90a60160}, From 3821aa37f7bd4a98d7c9f01eb915cdae06dde1fb Mon Sep 17 00:00:00 2001 From: Tony Luck Date: Tue, 31 May 2016 11:50:28 -0700 Subject: [PATCH 024/312] EDAC, sb_edac: Fix rank lookup on Broadwell [ Upstream commit c7103f650a11328f28b9fa1c95027db331b7774b ] Broadwell made a small change to the rank target register moving the target rank ID field up from bits 16:19 to bits 20:23. Also found that the offset field grew by one bit in the IVY_BRIDGE to HASWELL transition, so fix the RIR_OFFSET() macro too. Signed-off-by: Tony Luck Cc: stable@vger.kernel.org # v3.19+ Cc: Aristeu Rozanski Cc: Mauro Carvalho Chehab Cc: linux-edac Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.com Signed-off-by: Borislav Petkov Signed-off-by: Sasha Levin --- drivers/edac/sb_edac.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c index a7e7be0a8ae8..cb46c468b01e 100644 --- a/drivers/edac/sb_edac.c +++ b/drivers/edac/sb_edac.c @@ -218,8 +218,11 @@ static const u32 rir_offset[MAX_RIR_RANGES][MAX_RIR_WAY] = { { 0x1a0, 0x1a4, 0x1a8, 0x1ac, 0x1b0, 0x1b4, 0x1b8, 0x1bc }, }; -#define RIR_RNK_TGT(reg) GET_BITFIELD(reg, 16, 19) -#define RIR_OFFSET(reg) GET_BITFIELD(reg, 2, 14) +#define RIR_RNK_TGT(type, reg) (((type) == BROADWELL) ? \ + GET_BITFIELD(reg, 20, 23) : GET_BITFIELD(reg, 16, 19)) + +#define RIR_OFFSET(type, reg) (((type) == HASWELL || (type) == BROADWELL) ? \ + GET_BITFIELD(reg, 2, 15) : GET_BITFIELD(reg, 2, 14)) /* Device 16, functions 2-7 */ @@ -1101,14 +1104,14 @@ static void get_memory_layout(const struct mem_ctl_info *mci) pci_read_config_dword(pvt->pci_tad[i], rir_offset[j][k], ®); - tmp_mb = RIR_OFFSET(reg) << 6; + tmp_mb = RIR_OFFSET(pvt->info.type, reg) << 6; gb = div_u64_rem(tmp_mb, 1024, &mb); edac_dbg(0, "CH#%d RIR#%d INTL#%d, offset %u.%03u GB (0x%016Lx), tgt: %d, reg=0x%08x\n", i, j, k, gb, (mb*1000)/1024, ((u64)tmp_mb) << 20L, - (u32)RIR_RNK_TGT(reg), + (u32)RIR_RNK_TGT(pvt->info.type, reg), reg); } } @@ -1432,7 +1435,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci, pci_read_config_dword(pvt->pci_tad[base_ch], rir_offset[n_rir][idx], ®); - *rank = RIR_RNK_TGT(reg); + *rank = RIR_RNK_TGT(pvt->info.type, reg); edac_dbg(0, "RIR#%d: channel address 0x%08Lx < 0x%08Lx, RIR interleave %d, index %d\n", n_rir, From 7c8f7a240e802368d9788cc62cd2324dac53e735 Mon Sep 17 00:00:00 2001 From: Sergei Shtylyov Date: Sat, 28 May 2016 23:02:50 +0300 Subject: [PATCH 025/312] of: irq: fix of_irq_get[_byname]() kernel-doc [ Upstream commit 3993546646baf1dab5f5c4f7d9bb58f2046fd1c1 ] The kernel-doc for the of_irq_get[_byname]() is clearly inadequate in describing the return values -- of_irq_get_byname() is documented better than of_irq_get() but it still doesn't mention that 0 is returned iff irq_create_of_mapping() fails (it doesn't return an error code in this case). Document all possible return value variants, making the writing of the word "IRQ" consistent, while at it... Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Fixes: ad69674e73a1 ("of/irq: do irq resolution in platform_get_irq_byname()") Signed-off-by: Sergei Shtylyov CC: stable@vger.kernel.org Signed-off-by: Rob Herring Signed-off-by: Sasha Levin --- drivers/of/irq.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/drivers/of/irq.c b/drivers/of/irq.c index 1a7980692f25..f5d497989fcd 100644 --- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -385,13 +385,13 @@ int of_irq_to_resource(struct device_node *dev, int index, struct resource *r) EXPORT_SYMBOL_GPL(of_irq_to_resource); /** - * of_irq_get - Decode a node's IRQ and return it as a Linux irq number + * of_irq_get - Decode a node's IRQ and return it as a Linux IRQ number * @dev: pointer to device tree node - * @index: zero-based index of the irq - * - * Returns Linux irq number on success, or -EPROBE_DEFER if the irq domain - * is not yet created. + * @index: zero-based index of the IRQ * + * Returns Linux IRQ number on success, or 0 on the IRQ mapping failure, or + * -EPROBE_DEFER if the IRQ domain is not yet created, or error code in case + * of any other failure. */ int of_irq_get(struct device_node *dev, int index) { @@ -412,12 +412,13 @@ int of_irq_get(struct device_node *dev, int index) EXPORT_SYMBOL_GPL(of_irq_get); /** - * of_irq_get_byname - Decode a node's IRQ and return it as a Linux irq number + * of_irq_get_byname - Decode a node's IRQ and return it as a Linux IRQ number * @dev: pointer to device tree node - * @name: irq name + * @name: IRQ name * - * Returns Linux irq number on success, or -EPROBE_DEFER if the irq domain - * is not yet created, or error code in case of any other failure. + * Returns Linux IRQ number on success, or 0 on the IRQ mapping failure, or + * -EPROBE_DEFER if the IRQ domain is not yet created, or error code in case + * of any other failure. */ int of_irq_get_byname(struct device_node *dev, const char *name) { From eb1eba6ac8e902b378d1196acb1026e91c40e736 Mon Sep 17 00:00:00 2001 From: Helge Deller Date: Sat, 4 Jun 2016 17:21:33 +0200 Subject: [PATCH 026/312] parisc: Fix pagefault crash in unaligned __get_user() call [ Upstream commit 8b78f260887df532da529f225c49195d18fef36b ] One of the debian buildd servers had this crash in the syslog without any other information: Unaligned handler failed, ret = -2 clock_adjtime (pid 22578): Unaligned data reference (code 28) CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1 task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111100000001111 Tainted: G E r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0 r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4 r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218 r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000 r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0 r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218 sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88 IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff ORIG_R28: 00000002369fe628 IAOQ[0]: compat_get_timex+0x2dc/0x3c0 IAOQ[1]: compat_get_timex+0x2e0/0x3c0 RP(r2): compat_get_timex+0x40/0x3c0 Backtrace: [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0 [<0000000040205024>] syscall_exit+0x0/0x14 This means the userspace program clock_adjtime called the clock_adjtime() syscall and then crashed inside the compat_get_timex() function. Syscalls should never crash programs, but instead return EFAULT. The IIR register contains the executed instruction, which disassebles into "ldw 0(sr3,r5),r9". This load-word instruction is part of __get_user() which tried to read the word at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The unaligned handler is able to emulate all ldw instructions, but it fails if it fails to read the source e.g. because of page fault. The following program reproduces the problem: #define _GNU_SOURCE #include #include #include int main(void) { /* allocate 8k */ char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); /* free second half (upper 4k) and make it invalid. */ munmap(ptr+4096, 4096); /* syscall where first int is unaligned and clobbers into invalid memory region */ /* syscall should return EFAULT */ return syscall(__NR_clock_adjtime, 0, ptr+4095); } To fix this issue we simply need to check if the faulting instruction address is in the exception fixup table when the unaligned handler failed. If it is, call the fixup routine instead of crashing. While looking at the unaligned handler I found another issue as well: The target register should not be modified if the handler was unsuccessful. Signed-off-by: Helge Deller Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- arch/parisc/kernel/unaligned.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/parisc/kernel/unaligned.c b/arch/parisc/kernel/unaligned.c index d7c0acb35ec2..8d49614d600d 100644 --- a/arch/parisc/kernel/unaligned.c +++ b/arch/parisc/kernel/unaligned.c @@ -666,7 +666,7 @@ void handle_unaligned(struct pt_regs *regs) break; } - if (modify && R1(regs->iir)) + if (ret == 0 && modify && R1(regs->iir)) regs->gr[R1(regs->iir)] = newbase; @@ -677,6 +677,14 @@ void handle_unaligned(struct pt_regs *regs) if (ret) { + /* + * The unaligned handler failed. + * If we were called by __get_user() or __put_user() jump + * to it's exception fixup handler instead of crashing. + */ + if (!user_mode(regs) && fixup_exception(regs)) + return; + printk(KERN_CRIT "Unaligned handler failed, ret = %d\n", ret); die_if_kernel("Unaligned data reference", regs, 28); From fced2a819164c305544bcc8d9a45beea15374528 Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Mon, 6 Jun 2016 15:36:07 -0500 Subject: [PATCH 027/312] mnt: If fs_fully_visible fails call put_filesystem. [ Upstream commit 97c1df3e54e811aed484a036a798b4b25d002ecf ] Add this trivial missing error handling. Cc: stable@vger.kernel.org Fixes: 1b852bceb0d1 ("mnt: Refactor the logic for mounting sysfs and proc in a user namespace") Signed-off-by: "Eric W. Biederman" Signed-off-by: Sasha Levin --- fs/namespace.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index fce3cc1a3fa7..2b801290b286 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2390,8 +2390,10 @@ static int do_new_mount(struct path *path, const char *fstype, int flags, mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV; } if (type->fs_flags & FS_USERNS_VISIBLE) { - if (!fs_fully_visible(type, &mnt_flags)) + if (!fs_fully_visible(type, &mnt_flags)) { + put_filesystem(type); return -EPERM; + } } } From f530dd0176f24aa31dd76672777de06f735d7bdc Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Fri, 27 May 2016 14:50:05 -0500 Subject: [PATCH 028/312] mnt: fs_fully_visible test the proper mount for MNT_LOCKED [ Upstream commit d71ed6c930ac7d8f88f3cef6624a7e826392d61f ] MNT_LOCKED implies on a child mount implies the child is locked to the parent. So while looping through the children the children should be tested (not their parent). Typically an unshare of a mount namespace locks all mounts together making both the parent and the slave as locked but there are a few corner cases where other things work. Cc: stable@vger.kernel.org Fixes: ceeb0e5d39fc ("vfs: Ignore unlocked mounts in fs_fully_visible") Reported-by: Seth Forshee Signed-off-by: "Eric W. Biederman" Signed-off-by: Sasha Levin --- fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index 2b801290b286..6257268147ee 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3238,7 +3238,7 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags) list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) { struct inode *inode = child->mnt_mountpoint->d_inode; /* Only worry about locked mounts */ - if (!(mnt->mnt.mnt_flags & MNT_LOCKED)) + if (!(child->mnt.mnt_flags & MNT_LOCKED)) continue; /* Is the directory permanetly empty? */ if (!is_empty_dir_inode(inode)) From 02813a36b33b46d5be433f230de2f064cb6c504b Mon Sep 17 00:00:00 2001 From: Torsten Hilbrich Date: Tue, 7 Jun 2016 13:14:21 +0200 Subject: [PATCH 029/312] ALSA: hda/realtek: Add T560 docking unit fixup [ Upstream commit dab38e43b298501a4e8807b56117c029e2e98383 ] Tested with Lenovo Ultradock. Fixes the non-working headphone jack on the docking unit. Signed-off-by: Torsten Hilbrich Tested-by: Torsten Hilbrich Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index ddee824e67ea..a62872f7b41a 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5488,6 +5488,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x2218, "Thinkpad X1 Carbon 2nd", ALC292_FIXUP_TPT440_DOCK), SND_PCI_QUIRK(0x17aa, 0x2223, "ThinkPad T550", ALC292_FIXUP_TPT440_DOCK), SND_PCI_QUIRK(0x17aa, 0x2226, "ThinkPad X250", ALC292_FIXUP_TPT440_DOCK), + SND_PCI_QUIRK(0x17aa, 0x2231, "Thinkpad T560", ALC292_FIXUP_TPT460), SND_PCI_QUIRK(0x17aa, 0x2233, "Thinkpad", ALC292_FIXUP_TPT460), SND_PCI_QUIRK(0x17aa, 0x30bb, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), SND_PCI_QUIRK(0x17aa, 0x30e2, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), From f0461457f13c2c7441b86a830c52724ec928b9ab Mon Sep 17 00:00:00 2001 From: "H. Peter Anvin" Date: Tue, 5 Apr 2016 17:01:33 -0700 Subject: [PATCH 030/312] x86, build: copy ldlinux.c32 to image.iso [ Upstream commit 9c77679cadb118c0aa99e6f88533d91765a131ba ] For newer versions of Syslinux, we need ldlinux.c32 in addition to isolinux.bin to reside on the boot disk, so if the latter is found, copy it, too, to the isoimage tree. Signed-off-by: H. Peter Anvin Cc: Linux Stable Tree Signed-off-by: Sasha Levin --- arch/x86/boot/Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile index 57bbf2fb21f6..78c366462e70 100644 --- a/arch/x86/boot/Makefile +++ b/arch/x86/boot/Makefile @@ -162,6 +162,9 @@ isoimage: $(obj)/bzImage for i in lib lib64 share end ; do \ if [ -f /usr/$$i/syslinux/isolinux.bin ] ; then \ cp /usr/$$i/syslinux/isolinux.bin $(obj)/isoimage ; \ + if [ -f /usr/$$i/syslinux/ldlinux.c32 ]; then \ + cp /usr/$$i/syslinux/ldlinux.c32 $(obj)/isoimage ; \ + fi ; \ break ; \ fi ; \ if [ $$i = end ] ; then exit 1 ; fi ; \ From 7c630ac4bf02eea4f0eb2aa266877b5d9c9d7efd Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Wed, 8 Jun 2016 10:01:23 +1000 Subject: [PATCH 031/312] powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added [ Upstream commit 2c2a63e301fd19ccae673e79de59b30a232ff7f9 ] The recent commit 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") added a new PVR mask & value to the start of the ibm_architecture_vec[] array. However it missed the fact that further down in the array, we hard code the offset of one of the fields, and then at boot use that value to patch the value in the array. This means every update to the array must also update the #define, ugh. This means that on pseries machines we will misreport to firmware the number of cores we support, by a factor of threads_per_core. Fix it for now by updating the #define. Fixes: 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/kernel/prom_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index 42654360f84e..ae97ba211d8e 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -710,7 +710,7 @@ unsigned char ibm_architecture_vec[] = { * must match by the macro below. Update the definition if * the structure layout changes. */ -#define IBM_ARCH_VEC_NRCORES_OFFSET 125 +#define IBM_ARCH_VEC_NRCORES_OFFSET 133 W(NR_CPUS), /* number of cores supported */ 0, 0, From d6126a7032f4af5477dedf6ca9c6f1ee3498135a Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Fri, 17 Jun 2016 17:30:50 -0400 Subject: [PATCH 032/312] of: fix autoloading due to broken modalias with no 'compatible' [ Upstream commit b3c0a4dab7e35a9b6d69c0415641d2280fdefb2b ] Because of an improper dereference, a stray 'C' character was output to the modalias when no 'compatible' was specified. This is the case for some old PowerMac drivers which only set the 'name' property. Fix it to let them match again. Reported-by: Mathieu Malaterre Signed-off-by: Wolfram Sang Tested-by: Mathieu Malaterre Cc: Philipp Zabel Cc: Andreas Schwab Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for cross compiling") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- scripts/mod/file2alias.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index 78691d51a479..0ef6956dcfa6 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -653,7 +653,7 @@ static int do_of_entry (const char *filename, void *symval, char *alias) len = sprintf(alias, "of:N%sT%s", (*name)[0] ? *name : "*", (*type)[0] ? *type : "*"); - if (compatible[0]) + if ((*compatible)[0]) sprintf(&alias[len], "%sC%s", (*type)[0] ? "*" : "", *compatible); From 1dbd163a91b5e778fe757cfa6286d3a7c0b7dc32 Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Tue, 7 Jun 2016 17:38:53 -0700 Subject: [PATCH 033/312] cpufreq: intel_pstate: Fix ->set_policy() interface for no_turbo [ Upstream commit 983e600e88835f0321d1a0ea06f52d48b7b5a544 ] When turbo is disabled, the ->set_policy() interface is broken. For example, when turbo is disabled and cpuinfo.max = 2900000 (full max turbo frequency), setting the limits results in frequency less than the requested one: Set 1000000 KHz results in 0700000 KHz Set 1500000 KHz results in 1100000 KHz Set 2000000 KHz results in 1500000 KHz This is because the limits->max_perf fraction is calculated using the max turbo frequency as the reference, but when the max P-State is capped in intel_pstate_get_min_max(), the reference is not the max turbo P-State. This results in reducing max P-State. One option is to always use max turbo as reference for calculating limits. But this will not be correct. By definition the intel_pstate sysfs limits, shows percentage of available performance. So when BIOS has disabled turbo, the available performance is max non turbo. So the max_perf_pct should still show 100%. Signed-off-by: Srinivas Pandruvada [ rjw : Subject & changelog, rewrite in fewer lines of code ] Cc: All applicable Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin --- drivers/cpufreq/intel_pstate.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 1ee2ab58e37d..e6eed20b1401 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -1037,8 +1037,11 @@ static int intel_pstate_cpu_init(struct cpufreq_policy *policy) /* cpuinfo and default policy values */ policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu->pstate.scaling; - policy->cpuinfo.max_freq = - cpu->pstate.turbo_pstate * cpu->pstate.scaling; + update_turbo_state(); + policy->cpuinfo.max_freq = limits.turbo_disabled ? + cpu->pstate.max_pstate : cpu->pstate.turbo_pstate; + policy->cpuinfo.max_freq *= cpu->pstate.scaling; + policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL; cpumask_set_cpu(policy->cpu, policy->cpus); From 7296467cfbe04536b8ccec5345a1fd0a2cab1cdc Mon Sep 17 00:00:00 2001 From: Al Viro Date: Tue, 7 Jun 2016 21:26:55 -0400 Subject: [PATCH 034/312] fix d_walk()/non-delayed __d_free() race [ Upstream commit 3d56c25e3bb0726a5c5e16fc2d9e38f8ed763085 ] Ascend-to-parent logics in d_walk() depends on all encountered child dentries not getting freed without an RCU delay. Unfortunately, in quite a few cases it is not true, with hard-to-hit oopsable race as the result. Fortunately, the fix is simiple; right now the rule is "if it ever been hashed, freeing must be delayed" and changing it to "if it ever had a parent, freeing must be delayed" closes that hole and covers all cases the old rule used to cover. Moreover, pipes and sockets remain _not_ covered, so we do not introduce RCU delay in the cases which are the reason for having that delay conditional in the first place. Cc: stable@vger.kernel.org # v3.2+ (and watch out for __d_materialise_dentry()) Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- fs/dcache.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 10bce74c427f..2c75b393d31a 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1618,7 +1618,7 @@ struct dentry *d_alloc(struct dentry * parent, const struct qstr *name) struct dentry *dentry = __d_alloc(parent->d_sb, name); if (!dentry) return NULL; - + dentry->d_flags |= DCACHE_RCUACCESS; spin_lock(&parent->d_lock); /* * don't need child lock because it is not subject @@ -2410,7 +2410,6 @@ static void __d_rehash(struct dentry * entry, struct hlist_bl_head *b) { BUG_ON(!d_unhashed(entry)); hlist_bl_lock(b); - entry->d_flags |= DCACHE_RCUACCESS; hlist_bl_add_head_rcu(&entry->d_hash, b); hlist_bl_unlock(b); } @@ -2629,6 +2628,7 @@ static void __d_move(struct dentry *dentry, struct dentry *target, /* ... and switch them in the tree */ if (IS_ROOT(dentry)) { /* splicing a tree */ + dentry->d_flags |= DCACHE_RCUACCESS; dentry->d_parent = target->d_parent; target->d_parent = target; list_del_init(&target->d_child); From 796795362c839447eadc671420f8c4bdd1deb182 Mon Sep 17 00:00:00 2001 From: Ricardo Ribalda Delgado Date: Fri, 3 Jun 2016 19:10:01 +0200 Subject: [PATCH 035/312] gpiolib: Fix NULL pointer deference [ Upstream commit 11f33a6d15bfa397867ac0d7f3481b6dd683286f ] Under some circumstances, a gpiochip might be half cleaned from the gpio_device list. This patch makes sure that the chip pointer is still valid, before calling the match function. [ 104.088296] BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 [ 104.089772] IP: [] of_gpiochip_find_and_xlate+0x15/0x80 [ 104.128273] Call Trace: [ 104.129802] [] ? of_parse_own_gpio+0x1f0/0x1f0 [ 104.131353] [] gpiochip_find+0x60/0x90 [ 104.132868] [] of_get_named_gpiod_flags+0x9a/0x120 ... [ 104.141586] [] gpio_led_probe+0x11b/0x360 Cc: stable@vger.kernel.org Signed-off-by: Ricardo Ribalda Delgado Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/gpio/gpiolib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index 6bc612b8a49f..95752d38b7fe 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -375,7 +375,7 @@ struct gpio_chip *gpiochip_find(void *data, spin_lock_irqsave(&gpio_lock, flags); list_for_each_entry(chip, &gpio_chips, list) - if (match(chip, data)) + if (chip && match(chip, data)) break; /* No match? */ From 8921c300bb0f67ddced9e50330b894556c72c87d Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Tue, 7 Jun 2016 17:22:17 +0100 Subject: [PATCH 036/312] gpio: bcm-kona: fix bcm_kona_gpio_reset() warnings [ Upstream commit b66b2a0adf0e48973b582e055758b9907a7eee7c ] The bcm_kona_gpio_reset() calls bcm_kona_gpio_write_lock_regs() with what looks like the wrong parameter. The write_lock_regs function takes a pointer to the registers, not the bcm_kona_gpio structure. Fix the warning, and probably bug by changing the function to pass reg_base instead of kona_gpio, fixing the following warning: drivers/gpio/gpio-bcm-kona.c:550:47: warning: incorrect type in argument 1 (different address spaces) expected void [noderef] *reg_base got struct bcm_kona_gpio *kona_gpio warning: incorrect type in argument 1 (different address spaces) expected void [noderef] *reg_base got struct bcm_kona_gpio *kona_gpio Cc: stable@vger.kernel.org Signed-off-by: Ben Dooks Acked-by: Ray Jui Reviewed-by: Markus Mayer Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/gpio/gpio-bcm-kona.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpio/gpio-bcm-kona.c b/drivers/gpio/gpio-bcm-kona.c index b164ce837b43..81ddd1d6d84b 100644 --- a/drivers/gpio/gpio-bcm-kona.c +++ b/drivers/gpio/gpio-bcm-kona.c @@ -549,11 +549,11 @@ static void bcm_kona_gpio_reset(struct bcm_kona_gpio *kona_gpio) /* disable interrupts and clear status */ for (i = 0; i < kona_gpio->num_bank; i++) { /* Unlock the entire bank first */ - bcm_kona_gpio_write_lock_regs(kona_gpio, i, UNLOCK_CODE); + bcm_kona_gpio_write_lock_regs(reg_base, i, UNLOCK_CODE); writel(0xffffffff, reg_base + GPIO_INT_MASK(i)); writel(0xffffffff, reg_base + GPIO_INT_STATUS(i)); /* Now re-lock the bank */ - bcm_kona_gpio_write_lock_regs(kona_gpio, i, LOCK_CODE); + bcm_kona_gpio_write_lock_regs(reg_base, i, LOCK_CODE); } } From e90b6fdf60b1e5b292c0b8a4eacc862da97e46fc Mon Sep 17 00:00:00 2001 From: Prasun Maiti Date: Mon, 6 Jun 2016 20:04:19 +0530 Subject: [PATCH 037/312] wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel [ Upstream commit 3d5fdff46c4b2b9534fa2f9fc78e90a48e0ff724 ] iwpriv app uses iw_point structure to send data to Kernel. The iw_point structure holds a pointer. For compatibility Kernel converts the pointer as required for WEXT IOCTLs (SIOCIWFIRST to SIOCIWLAST). Some drivers may use iw_handler_def.private_args to populate iwpriv commands instead of iw_handler_def.private. For those case, the IOCTLs from SIOCIWFIRSTPRIV to SIOCIWLASTPRIV will follow the path ndo_do_ioctl(). Accordingly when the filled up iw_point structure comes from 32 bit iwpriv to 64 bit Kernel, Kernel will not convert the pointer and sends it to driver. So, the driver may get the invalid data. The pointer conversion for the IOCTLs (SIOCIWFIRSTPRIV to SIOCIWLASTPRIV), which follow the path ndo_do_ioctl(), is mandatory. This patch adds pointer conversion from 32 bit to 64 bit and vice versa, if the ioctl comes from 32 bit iwpriv to 64 bit Kernel. Cc: stable@vger.kernel.org Signed-off-by: Prasun Maiti Signed-off-by: Ujjal Roy Tested-by: Dibyajyoti Ghosh Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/wireless/wext-core.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c index b50ee5d622e1..c753211cb83f 100644 --- a/net/wireless/wext-core.c +++ b/net/wireless/wext-core.c @@ -955,8 +955,29 @@ static int wireless_process_ioctl(struct net *net, struct ifreq *ifr, return private(dev, iwr, cmd, info, handler); } /* Old driver API : call driver ioctl handler */ - if (dev->netdev_ops->ndo_do_ioctl) - return dev->netdev_ops->ndo_do_ioctl(dev, ifr, cmd); + if (dev->netdev_ops->ndo_do_ioctl) { +#ifdef CONFIG_COMPAT + if (info->flags & IW_REQUEST_FLAG_COMPAT) { + int ret = 0; + struct iwreq iwr_lcl; + struct compat_iw_point *iwp_compat = (void *) &iwr->u.data; + + memcpy(&iwr_lcl, iwr, sizeof(struct iwreq)); + iwr_lcl.u.data.pointer = compat_ptr(iwp_compat->pointer); + iwr_lcl.u.data.length = iwp_compat->length; + iwr_lcl.u.data.flags = iwp_compat->flags; + + ret = dev->netdev_ops->ndo_do_ioctl(dev, (void *) &iwr_lcl, cmd); + + iwp_compat->pointer = ptr_to_compat(iwr_lcl.u.data.pointer); + iwp_compat->length = iwr_lcl.u.data.length; + iwp_compat->flags = iwr_lcl.u.data.flags; + + return ret; + } else +#endif + return dev->netdev_ops->ndo_do_ioctl(dev, ifr, cmd); + } return -EOPNOTSUPP; } From c96e6bf5705254a4c93ca25d6d3c68a04fc7ab5b Mon Sep 17 00:00:00 2001 From: Jann Horn Date: Wed, 1 Jun 2016 11:55:05 +0200 Subject: [PATCH 038/312] proc: prevent stacking filesystems on top [ Upstream commit e54ad7f1ee263ffa5a2de9c609d58dfa27b21cd9 ] This prevents stacking filesystems (ecryptfs and overlayfs) from using procfs as lower filesystem. There is too much magic going on inside procfs, and there is no good reason to stack stuff on top of procfs. (For example, procfs does access checks in VFS open handlers, and ecryptfs by design calls open handlers from a kernel thread that doesn't drop privileges or so.) Signed-off-by: Jann Horn Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- fs/proc/root.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/proc/root.c b/fs/proc/root.c index 68feb0f70e63..c3e1bc595e6d 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -121,6 +121,13 @@ static struct dentry *proc_mount(struct file_system_type *fs_type, if (IS_ERR(sb)) return ERR_CAST(sb); + /* + * procfs isn't actually a stacking filesystem; however, there is + * too much magic going on inside it to permit stacking things on + * top of it + */ + sb->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH; + if (!proc_parse_options(options, ns)) { deactivate_locked_super(sb); return ERR_PTR(-EINVAL); From ca1950cd168857c811f217a390ed8c91b57d2c09 Mon Sep 17 00:00:00 2001 From: Jann Horn Date: Wed, 1 Jun 2016 11:55:06 +0200 Subject: [PATCH 039/312] ecryptfs: forbid opening files without mmap handler [ Upstream commit 2f36db71009304b3f0b95afacd8eba1f9f046b87 ] This prevents users from triggering a stack overflow through a recursive invocation of pagefault handling that involves mapping procfs files into virtual memory. Signed-off-by: Jann Horn Acked-by: Tyler Hicks Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- fs/ecryptfs/kthread.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/ecryptfs/kthread.c b/fs/ecryptfs/kthread.c index 866bb18efefe..e818f5ac7a26 100644 --- a/fs/ecryptfs/kthread.c +++ b/fs/ecryptfs/kthread.c @@ -25,6 +25,7 @@ #include #include #include +#include #include "ecryptfs_kernel.h" struct ecryptfs_open_req { @@ -147,7 +148,7 @@ int ecryptfs_privileged_open(struct file **lower_file, flags |= IS_RDONLY(d_inode(lower_dentry)) ? O_RDONLY : O_RDWR; (*lower_file) = dentry_open(&req.path, flags, cred); if (!IS_ERR(*lower_file)) - goto out; + goto have_file; if ((flags & O_ACCMODE) == O_RDONLY) { rc = PTR_ERR((*lower_file)); goto out; @@ -165,8 +166,16 @@ int ecryptfs_privileged_open(struct file **lower_file, mutex_unlock(&ecryptfs_kthread_ctl.mux); wake_up(&ecryptfs_kthread_ctl.wait); wait_for_completion(&req.done); - if (IS_ERR(*lower_file)) + if (IS_ERR(*lower_file)) { rc = PTR_ERR(*lower_file); + goto out; + } +have_file: + if ((*lower_file)->f_op->mmap == NULL) { + fput(*lower_file); + *lower_file = NULL; + rc = -EMEDIUMTYPE; + } out: return rc; } From 7cf9f2300c12ba5fba23e7c61a3477927b0cc6d1 Mon Sep 17 00:00:00 2001 From: Andy Lutomirski Date: Tue, 24 May 2016 15:13:02 -0700 Subject: [PATCH 040/312] uvc: Forward compat ioctls to their handlers directly [ Upstream commit a44323e2a8f342848bb77e8e04fcd85fcb91b3b4 ] The current code goes through a lot of indirection just to call a known handler. Simplify it: just call the handlers directly. Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski Signed-off-by: Sasha Levin --- drivers/media/usb/uvc/uvc_v4l2.c | 39 +++++++++++++++----------------- 1 file changed, 18 insertions(+), 21 deletions(-) diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c index c4b1ac6750d8..3f86e548d795 100644 --- a/drivers/media/usb/uvc/uvc_v4l2.c +++ b/drivers/media/usb/uvc/uvc_v4l2.c @@ -1379,47 +1379,44 @@ static int uvc_v4l2_put_xu_query(const struct uvc_xu_control_query *kp, static long uvc_v4l2_compat_ioctl32(struct file *file, unsigned int cmd, unsigned long arg) { + struct uvc_fh *handle = file->private_data; union { struct uvc_xu_control_mapping xmap; struct uvc_xu_control_query xqry; } karg; void __user *up = compat_ptr(arg); - mm_segment_t old_fs; long ret; switch (cmd) { case UVCIOC_CTRL_MAP32: - cmd = UVCIOC_CTRL_MAP; ret = uvc_v4l2_get_xu_mapping(&karg.xmap, up); + if (ret) + return ret; + ret = uvc_ioctl_ctrl_map(handle->chain, &karg.xmap); + if (ret) + return ret; + ret = uvc_v4l2_put_xu_mapping(&karg.xmap, up); + if (ret) + return ret; + break; case UVCIOC_CTRL_QUERY32: - cmd = UVCIOC_CTRL_QUERY; ret = uvc_v4l2_get_xu_query(&karg.xqry, up); + if (ret) + return ret; + ret = uvc_xu_ctrl_query(handle->chain, &karg.xqry); + if (ret) + return ret; + ret = uvc_v4l2_put_xu_query(&karg.xqry, up); + if (ret) + return ret; break; default: return -ENOIOCTLCMD; } - old_fs = get_fs(); - set_fs(KERNEL_DS); - ret = video_ioctl2(file, cmd, (unsigned long)&karg); - set_fs(old_fs); - - if (ret < 0) - return ret; - - switch (cmd) { - case UVCIOC_CTRL_MAP: - ret = uvc_v4l2_put_xu_mapping(&karg.xmap, up); - break; - - case UVCIOC_CTRL_QUERY: - ret = uvc_v4l2_put_xu_query(&karg.xqry, up); - break; - } - return ret; } #endif From 95123c0b81d9478b8155fe15093b88f57ef7d0bd Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Wed, 22 Jun 2016 23:59:54 -0400 Subject: [PATCH 041/312] Linux 4.1.27 Signed-off-by: Sasha Levin --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 080a87e290b9..54b3d8ae8624 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ VERSION = 4 PATCHLEVEL = 1 -SUBLEVEL = 26 +SUBLEVEL = 27 EXTRAVERSION = NAME = Series 4800 From 272d474fc6e9e4362c69df4b77adc0008b53b4ad Mon Sep 17 00:00:00 2001 From: Rainer Weikusat Date: Sun, 3 Jan 2016 18:56:38 +0000 Subject: [PATCH 042/312] af_unix: Fix splice-bind deadlock [ Upstream commit c845acb324aa85a39650a14e7696982ceea75dc1 ] On 2015/11/06, Dmitry Vyukov reported a deadlock involving the splice system call and AF_UNIX sockets, http://lists.openwall.net/netdev/2015/11/06/24 The situation was analyzed as (a while ago) A: socketpair() B: splice() from a pipe to /mnt/regular_file does sb_start_write() on /mnt C: try to freeze /mnt wait for B to finish with /mnt A: bind() try to bind our socket to /mnt/new_socket_name lock our socket, see it not bound yet decide that it needs to create something in /mnt try to do sb_start_write() on /mnt, block (it's waiting for C). D: splice() from the same pipe to our socket lock the pipe, see that socket is connected try to lock the socket, block waiting for A B: get around to actually feeding a chunk from pipe to file, try to lock the pipe. Deadlock. on 2015/11/10 by Al Viro, http://lists.openwall.net/netdev/2015/11/10/4 The patch fixes this by removing the kern_path_create related code from unix_mknod and executing it as part of unix_bind prior acquiring the readlock of the socket in question. This means that A (as used above) will sb_start_write on /mnt before it acquires the readlock, hence, it won't indirectly block B which first did a sb_start_write and then waited for a thread trying to acquire the readlock. Consequently, A being blocked by C waiting for B won't cause a deadlock anymore (effectively, both A and B acquire two locks in opposite order in the situation described above). Dmitry Vyukov() tested the original patch. Signed-off-by: Rainer Weikusat Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/unix/af_unix.c | 66 ++++++++++++++++++++++++++++------------------ 1 file changed, 40 insertions(+), 26 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 535a642a1688..03da879008d7 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -935,32 +935,20 @@ fail: return NULL; } -static int unix_mknod(const char *sun_path, umode_t mode, struct path *res) +static int unix_mknod(struct dentry *dentry, struct path *path, umode_t mode, + struct path *res) { - struct dentry *dentry; - struct path path; - int err = 0; - /* - * Get the parent directory, calculate the hash for last - * component. - */ - dentry = kern_path_create(AT_FDCWD, sun_path, &path, 0); - err = PTR_ERR(dentry); - if (IS_ERR(dentry)) - return err; + int err; - /* - * All right, let's create it. - */ - err = security_path_mknod(&path, dentry, mode, 0); + err = security_path_mknod(path, dentry, mode, 0); if (!err) { - err = vfs_mknod(d_inode(path.dentry), dentry, mode, 0); + err = vfs_mknod(d_inode(path->dentry), dentry, mode, 0); if (!err) { - res->mnt = mntget(path.mnt); + res->mnt = mntget(path->mnt); res->dentry = dget(dentry); } } - done_path_create(&path, dentry); + return err; } @@ -971,10 +959,12 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) struct unix_sock *u = unix_sk(sk); struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr; char *sun_path = sunaddr->sun_path; - int err; + int err, name_err; unsigned int hash; struct unix_address *addr; struct hlist_head *list; + struct path path; + struct dentry *dentry; err = -EINVAL; if (sunaddr->sun_family != AF_UNIX) @@ -990,14 +980,34 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) goto out; addr_len = err; + name_err = 0; + dentry = NULL; + if (sun_path[0]) { + /* Get the parent directory, calculate the hash for last + * component. + */ + dentry = kern_path_create(AT_FDCWD, sun_path, &path, 0); + + if (IS_ERR(dentry)) { + /* delay report until after 'already bound' check */ + name_err = PTR_ERR(dentry); + dentry = NULL; + } + } + err = mutex_lock_interruptible(&u->readlock); if (err) - goto out; + goto out_path; err = -EINVAL; if (u->addr) goto out_up; + if (name_err) { + err = name_err == -EEXIST ? -EADDRINUSE : name_err; + goto out_up; + } + err = -ENOMEM; addr = kmalloc(sizeof(*addr)+addr_len, GFP_KERNEL); if (!addr) @@ -1008,11 +1018,11 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) addr->hash = hash ^ sk->sk_type; atomic_set(&addr->refcnt, 1); - if (sun_path[0]) { - struct path path; + if (dentry) { + struct path u_path; umode_t mode = S_IFSOCK | (SOCK_INODE(sock)->i_mode & ~current_umask()); - err = unix_mknod(sun_path, mode, &path); + err = unix_mknod(dentry, &path, mode, &u_path); if (err) { if (err == -EEXIST) err = -EADDRINUSE; @@ -1020,9 +1030,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) goto out_up; } addr->hash = UNIX_HASH_SIZE; - hash = d_backing_inode(path.dentry)->i_ino & (UNIX_HASH_SIZE-1); + hash = d_backing_inode(dentry)->i_ino & (UNIX_HASH_SIZE - 1); spin_lock(&unix_table_lock); - u->path = path; + u->path = u_path; list = &unix_socket_table[hash]; } else { spin_lock(&unix_table_lock); @@ -1045,6 +1055,10 @@ out_unlock: spin_unlock(&unix_table_lock); out_up: mutex_unlock(&u->readlock); +out_path: + if (dentry) + done_path_create(&path, dentry); + out: return err; } From d273823dc63bb51e3adc11e0f7c324d86e2d2009 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 25 May 2016 11:48:25 -0400 Subject: [PATCH 043/312] percpu: fix synchronization between chunk->map_extend_work and chunk destruction [ Upstream commit 4f996e234dad488e5d9ba0858bc1bae12eff82c3 ] Atomic allocations can trigger async map extensions which is serviced by chunk->map_extend_work. pcpu_balance_work which is responsible for destroying idle chunks wasn't synchronizing properly against chunk->map_extend_work and may end up freeing the chunk while the work item is still in flight. This patch fixes the bug by rolling async map extension operations into pcpu_balance_work. Signed-off-by: Tejun Heo Reported-and-tested-by: Alexei Starovoitov Reported-by: Vlastimil Babka Reported-by: Sasha Levin Cc: stable@vger.kernel.org # v3.18+ Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space") Signed-off-by: Sasha Levin --- mm/percpu.c | 57 +++++++++++++++++++++++++++++++++-------------------- 1 file changed, 36 insertions(+), 21 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index 2dd74487a0af..4c2615e02b26 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -110,7 +110,7 @@ struct pcpu_chunk { int map_used; /* # of map entries used before the sentry */ int map_alloc; /* # of map entries allocated */ int *map; /* allocation map */ - struct work_struct map_extend_work;/* async ->map[] extension */ + struct list_head map_extend_list;/* on pcpu_map_extend_chunks */ void *data; /* chunk data */ int first_free; /* no free below this */ @@ -164,6 +164,9 @@ static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop */ static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */ +/* chunks which need their map areas extended, protected by pcpu_lock */ +static LIST_HEAD(pcpu_map_extend_chunks); + /* * The number of empty populated pages, protected by pcpu_lock. The * reserved chunk doesn't contribute to the count. @@ -397,13 +400,19 @@ static int pcpu_need_to_extend(struct pcpu_chunk *chunk, bool is_atomic) { int margin, new_alloc; + lockdep_assert_held(&pcpu_lock); + if (is_atomic) { margin = 3; if (chunk->map_alloc < - chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW && - pcpu_async_enabled) - schedule_work(&chunk->map_extend_work); + chunk->map_used + PCPU_ATOMIC_MAP_MARGIN_LOW) { + if (list_empty(&chunk->map_extend_list)) { + list_add_tail(&chunk->map_extend_list, + &pcpu_map_extend_chunks); + pcpu_schedule_balance_work(); + } + } } else { margin = PCPU_ATOMIC_MAP_MARGIN_HIGH; } @@ -469,20 +478,6 @@ out_unlock: return 0; } -static void pcpu_map_extend_workfn(struct work_struct *work) -{ - struct pcpu_chunk *chunk = container_of(work, struct pcpu_chunk, - map_extend_work); - int new_alloc; - - spin_lock_irq(&pcpu_lock); - new_alloc = pcpu_need_to_extend(chunk, false); - spin_unlock_irq(&pcpu_lock); - - if (new_alloc) - pcpu_extend_area_map(chunk, new_alloc); -} - /** * pcpu_fit_in_area - try to fit the requested allocation in a candidate area * @chunk: chunk the candidate area belongs to @@ -742,7 +737,7 @@ static struct pcpu_chunk *pcpu_alloc_chunk(void) chunk->map_used = 1; INIT_LIST_HEAD(&chunk->list); - INIT_WORK(&chunk->map_extend_work, pcpu_map_extend_workfn); + INIT_LIST_HEAD(&chunk->map_extend_list); chunk->free_size = pcpu_unit_size; chunk->contig_hint = pcpu_unit_size; @@ -1131,6 +1126,7 @@ static void pcpu_balance_workfn(struct work_struct *work) if (chunk == list_first_entry(free_head, struct pcpu_chunk, list)) continue; + list_del_init(&chunk->map_extend_list); list_move(&chunk->list, &to_free); } @@ -1148,6 +1144,25 @@ static void pcpu_balance_workfn(struct work_struct *work) pcpu_destroy_chunk(chunk); } + /* service chunks which requested async area map extension */ + do { + int new_alloc = 0; + + spin_lock_irq(&pcpu_lock); + + chunk = list_first_entry_or_null(&pcpu_map_extend_chunks, + struct pcpu_chunk, map_extend_list); + if (chunk) { + list_del_init(&chunk->map_extend_list); + new_alloc = pcpu_need_to_extend(chunk, false); + } + + spin_unlock_irq(&pcpu_lock); + + if (new_alloc) + pcpu_extend_area_map(chunk, new_alloc); + } while (chunk); + /* * Ensure there are certain number of free populated pages for * atomic allocs. Fill up from the most packed so that atomic @@ -1646,7 +1661,7 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, */ schunk = memblock_virt_alloc(pcpu_chunk_struct_size, 0); INIT_LIST_HEAD(&schunk->list); - INIT_WORK(&schunk->map_extend_work, pcpu_map_extend_workfn); + INIT_LIST_HEAD(&schunk->map_extend_list); schunk->base_addr = base_addr; schunk->map = smap; schunk->map_alloc = ARRAY_SIZE(smap); @@ -1676,7 +1691,7 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai, if (dyn_size) { dchunk = memblock_virt_alloc(pcpu_chunk_struct_size, 0); INIT_LIST_HEAD(&dchunk->list); - INIT_WORK(&dchunk->map_extend_work, pcpu_map_extend_workfn); + INIT_LIST_HEAD(&dchunk->map_extend_list); dchunk->base_addr = base_addr; dchunk->map = dmap; dchunk->map_alloc = ARRAY_SIZE(dmap); From 62f7175fa64005be2e090618da5a20f87752704d Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 25 May 2016 11:48:25 -0400 Subject: [PATCH 044/312] percpu: fix synchronization between synchronous map extension and chunk destruction [ Upstream commit 6710e594f71ccaad8101bc64321152af7cd9ea28 ] For non-atomic allocations, pcpu_alloc() can try to extend the area map synchronously after dropping pcpu_lock; however, the extension wasn't synchronized against chunk destruction and the chunk might get freed while extension is in progress. This patch fixes the bug by putting most of non-atomic allocations under pcpu_alloc_mutex to synchronize against pcpu_balance_work which is responsible for async chunk management including destruction. Signed-off-by: Tejun Heo Reported-and-tested-by: Alexei Starovoitov Reported-by: Vlastimil Babka Reported-by: Sasha Levin Cc: stable@vger.kernel.org # v3.18+ Fixes: 1a4d76076cda ("percpu: implement asynchronous chunk population") Signed-off-by: Sasha Levin --- mm/percpu.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index 4c2615e02b26..b97617587620 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -160,7 +160,7 @@ static struct pcpu_chunk *pcpu_reserved_chunk; static int pcpu_reserved_chunk_limit; static DEFINE_SPINLOCK(pcpu_lock); /* all internal data structures */ -static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop */ +static DEFINE_MUTEX(pcpu_alloc_mutex); /* chunk create/destroy, [de]pop, map ext */ static struct list_head *pcpu_slot __read_mostly; /* chunk list slots */ @@ -446,6 +446,8 @@ static int pcpu_extend_area_map(struct pcpu_chunk *chunk, int new_alloc) size_t old_size = 0, new_size = new_alloc * sizeof(new[0]); unsigned long flags; + lockdep_assert_held(&pcpu_alloc_mutex); + new = pcpu_mem_zalloc(new_size); if (!new) return -ENOMEM; @@ -892,6 +894,9 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved, return NULL; } + if (!is_atomic) + mutex_lock(&pcpu_alloc_mutex); + spin_lock_irqsave(&pcpu_lock, flags); /* serve reserved allocations from the reserved chunk if available */ @@ -964,12 +969,9 @@ restart: if (is_atomic) goto fail; - mutex_lock(&pcpu_alloc_mutex); - if (list_empty(&pcpu_slot[pcpu_nr_slots - 1])) { chunk = pcpu_create_chunk(); if (!chunk) { - mutex_unlock(&pcpu_alloc_mutex); err = "failed to allocate new chunk"; goto fail; } @@ -980,7 +982,6 @@ restart: spin_lock_irqsave(&pcpu_lock, flags); } - mutex_unlock(&pcpu_alloc_mutex); goto restart; area_found: @@ -990,8 +991,6 @@ area_found: if (!is_atomic) { int page_start, page_end, rs, re; - mutex_lock(&pcpu_alloc_mutex); - page_start = PFN_DOWN(off); page_end = PFN_UP(off + size); @@ -1002,7 +1001,6 @@ area_found: spin_lock_irqsave(&pcpu_lock, flags); if (ret) { - mutex_unlock(&pcpu_alloc_mutex); pcpu_free_area(chunk, off, &occ_pages); err = "failed to populate"; goto fail_unlock; @@ -1042,6 +1040,8 @@ fail: /* see the flag handling in pcpu_blance_workfn() */ pcpu_atomic_alloc_failed = true; pcpu_schedule_balance_work(); + } else { + mutex_unlock(&pcpu_alloc_mutex); } return NULL; } From 6ef304587cf41e19312fa9fd81448ea6880ad475 Mon Sep 17 00:00:00 2001 From: Wenwei Tao Date: Fri, 13 May 2016 22:59:20 +0800 Subject: [PATCH 045/312] cgroup: remove redundant cleanup in css_create [ Upstream commit b00c52dae6d9ee8d0f2407118ef6544ae5524781 ] When create css failed, before call css_free_rcu_fn, we remove the css id and exit the percpu_ref, but we will do these again in css_free_work_fn, so they are redundant. Especially the css id, that would cause problem if we remove it twice, since it may be assigned to another css after the first remove. tj: This was broken by two commits updating the free path without synchronizing the creation failure path. This can be easily triggered by trying to create more than 64k memory cgroups. Signed-off-by: Wenwei Tao Signed-off-by: Tejun Heo Cc: Vladimir Davydov Fixes: 9a1049da9bd2 ("percpu-refcount: require percpu_ref to be exited explicitly") Fixes: 01e586598b22 ("cgroup: release css->id after css_free") Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: Sasha Levin --- kernel/cgroup.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 359da3abb004..530fecfa3992 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4563,7 +4563,7 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss, err = cgroup_idr_alloc(&ss->css_idr, NULL, 2, 0, GFP_NOWAIT); if (err < 0) - goto err_free_percpu_ref; + goto err_free_css; css->id = err; if (visible) { @@ -4595,9 +4595,6 @@ err_list_del: list_del_rcu(&css->sibling); cgroup_clear_dir(css->cgroup, 1 << css->ss->id); err_free_id: - cgroup_idr_remove(&ss->css_idr, css->id); -err_free_percpu_ref: - percpu_ref_exit(&css->refcnt); err_free_css: call_rcu(&css->rcu_head, css_free_rcu_fn); return err; From c6ec15d8965d03a604bdcfde28a5c506b9de7043 Mon Sep 17 00:00:00 2001 From: Ludovic Desroches Date: Thu, 12 May 2016 16:54:08 +0200 Subject: [PATCH 046/312] dmaengine: at_xdmac: align descriptors on 64 bits [ Upstream commit 4a9723e8df68cfce4048517ee32e37f78854b6fb ] Having descriptors aligned on 64 bits allows update CNDA and CUBC in an atomic way. Signed-off-by: Ludovic Desroches Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Cc: stable@vger.kernel.org #v4.1 and later Reviewed-by: Nicolas Ferre Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin --- drivers/dma/at_xdmac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c index ffa809f30b19..690d811d56b7 100644 --- a/drivers/dma/at_xdmac.c +++ b/drivers/dma/at_xdmac.c @@ -238,7 +238,7 @@ struct at_xdmac_lld { u32 mbr_cfg; /* Configuration Register */ }; - +/* 64-bit alignment needed to update CNDA and CUBC registers in an atomic way. */ struct at_xdmac_desc { struct at_xdmac_lld lld; enum dma_transfer_direction direction; @@ -249,7 +249,7 @@ struct at_xdmac_desc { unsigned int xfer_size; struct list_head descs_list; struct list_head xfer_node; -}; +} __aligned(sizeof(u64)); static inline void __iomem *at_xdmac_chan_reg_base(struct at_xdmac *atxdmac, unsigned int chan_nb) { From 6d8fde793d0efbf1e48bca89ee261c18ddafea84 Mon Sep 17 00:00:00 2001 From: Ludovic Desroches Date: Thu, 12 May 2016 16:54:09 +0200 Subject: [PATCH 047/312] dmaengine: at_xdmac: fix residue corruption [ Upstream commit 53398f488821c2b5b15291e3debec6ad33f75d3d ] An unexpected value of CUBC can lead to a corrupted residue. A more complex sequence is needed to detect an inaccurate value for NCA or CUBC. Signed-off-by: Ludovic Desroches Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Cc: stable@vger.kernel.org #v4.1 and later Reviewed-by: Nicolas Ferre Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin --- drivers/dma/at_xdmac.c | 54 +++++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 22 deletions(-) diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c index 690d811d56b7..2c413e52c5e5 100644 --- a/drivers/dma/at_xdmac.c +++ b/drivers/dma/at_xdmac.c @@ -930,6 +930,7 @@ at_xdmac_tx_status(struct dma_chan *chan, dma_cookie_t cookie, u32 cur_nda, check_nda, cur_ubc, mask, value; u8 dwidth = 0; unsigned long flags; + bool initd; ret = dma_cookie_status(chan, cookie, txstate); if (ret == DMA_COMPLETE) @@ -965,34 +966,43 @@ at_xdmac_tx_status(struct dma_chan *chan, dma_cookie_t cookie, } /* - * When processing the residue, we need to read two registers but we - * can't do it in an atomic way. AT_XDMAC_CNDA is used to find where - * we stand in the descriptor list and AT_XDMAC_CUBC is used - * to know how many data are remaining for the current descriptor. - * Since the dma channel is not paused to not loose data, between the - * AT_XDMAC_CNDA and AT_XDMAC_CUBC read, we may have change of - * descriptor. - * For that reason, after reading AT_XDMAC_CUBC, we check if we are - * still using the same descriptor by reading a second time - * AT_XDMAC_CNDA. If AT_XDMAC_CNDA has changed, it means we have to - * read again AT_XDMAC_CUBC. + * The easiest way to compute the residue should be to pause the DMA + * but doing this can lead to miss some data as some devices don't + * have FIFO. + * We need to read several registers because: + * - DMA is running therefore a descriptor change is possible while + * reading these registers + * - When the block transfer is done, the value of the CUBC register + * is set to its initial value until the fetch of the next descriptor. + * This value will corrupt the residue calculation so we have to skip + * it. + * + * INITD -------- ------------ + * |____________________| + * _______________________ _______________ + * NDA @desc2 \/ @desc3 + * _______________________/\_______________ + * __________ ___________ _______________ + * CUBC 0 \/ MAX desc1 \/ MAX desc2 + * __________/\___________/\_______________ + * + * Since descriptors are aligned on 64 bits, we can assume that + * the update of NDA and CUBC is atomic. * Memory barriers are used to ensure the read order of the registers. - * A max number of retries is set because unlikely it can never ends if - * we are transferring a lot of data with small buffers. + * A max number of retries is set because unlikely it could never ends. */ - cur_nda = at_xdmac_chan_read(atchan, AT_XDMAC_CNDA) & 0xfffffffc; - rmb(); - cur_ubc = at_xdmac_chan_read(atchan, AT_XDMAC_CUBC); for (retry = 0; retry < AT_XDMAC_RESIDUE_MAX_RETRIES; retry++) { - rmb(); check_nda = at_xdmac_chan_read(atchan, AT_XDMAC_CNDA) & 0xfffffffc; - - if (likely(cur_nda == check_nda)) - break; - - cur_nda = check_nda; + rmb(); + initd = !!(at_xdmac_chan_read(atchan, AT_XDMAC_CC) & AT_XDMAC_CC_INITD); rmb(); cur_ubc = at_xdmac_chan_read(atchan, AT_XDMAC_CUBC); + rmb(); + cur_nda = at_xdmac_chan_read(atchan, AT_XDMAC_CNDA) & 0xfffffffc; + rmb(); + + if ((check_nda == cur_nda) && initd) + break; } if (unlikely(retry >= AT_XDMAC_RESIDUE_MAX_RETRIES)) { From 039f59796079951d6e0ffb40d7882b5597e5e16c Mon Sep 17 00:00:00 2001 From: Ludovic Desroches Date: Thu, 12 May 2016 16:54:10 +0200 Subject: [PATCH 048/312] dmaengine: at_xdmac: double FIFO flush needed to compute residue [ Upstream commit 9295c41d77ca93aac79cfca6fa09fa1ca5cab66f ] Due to the way CUBC register is updated, a double flush is needed to compute an accurate residue. First flush aim is to get data from the DMA FIFO and second one ensures that we won't report data which are not in memory. Signed-off-by: Ludovic Desroches Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Cc: stable@vger.kernel.org #v4.1 and later Reviewed-by: Nicolas Ferre Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin --- drivers/dma/at_xdmac.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c index 2c413e52c5e5..c5e6c82516ce 100644 --- a/drivers/dma/at_xdmac.c +++ b/drivers/dma/at_xdmac.c @@ -955,7 +955,16 @@ at_xdmac_tx_status(struct dma_chan *chan, dma_cookie_t cookie, residue = desc->xfer_size; /* * Flush FIFO: only relevant when the transfer is source peripheral - * synchronized. + * synchronized. Flush is needed before reading CUBC because data in + * the FIFO are not reported by CUBC. Reporting a residue of the + * transfer length while we have data in FIFO can cause issue. + * Usecase: atmel USART has a timeout which means I have received + * characters but there is no more character received for a while. On + * timeout, it requests the residue. If the data are in the DMA FIFO, + * we will return a residue of the transfer length. It means no data + * received. If an application is waiting for these data, it will hang + * since we won't have another USART timeout without receiving new + * data. */ mask = AT_XDMAC_CC_TYPE | AT_XDMAC_CC_DSYNC; value = AT_XDMAC_CC_TYPE_PER_TRAN | AT_XDMAC_CC_DSYNC_PER2MEM; @@ -1010,6 +1019,19 @@ at_xdmac_tx_status(struct dma_chan *chan, dma_cookie_t cookie, goto spin_unlock; } + /* + * Flush FIFO: only relevant when the transfer is source peripheral + * synchronized. Another flush is needed here because CUBC is updated + * when the controller sends the data write command. It can lead to + * report data that are not written in the memory or the device. The + * FIFO flush ensures that data are really written. + */ + if ((desc->lld.mbr_cfg & mask) == value) { + at_xdmac_write(atxdmac, AT_XDMAC_GSWF, atchan->mask); + while (!(at_xdmac_chan_read(atchan, AT_XDMAC_CIS) & AT_XDMAC_CIS_FIS)) + cpu_relax(); + } + /* * Remove size of all microblocks already transferred and the current * one. Then add the remaining size to transfer of the current From 01a129f8a323fc3c393a4d55851f5c671f4c9a4e Mon Sep 17 00:00:00 2001 From: Heiko Stuebner Date: Tue, 17 May 2016 20:57:50 +0200 Subject: [PATCH 049/312] clk: rockchip: initialize flags of clk_init_data in mmc-phase clock [ Upstream commit 595144c1141c951a3c6bb9004ae6a2bc29aad66f ] The flags element of clk_init_data was never initialized for mmc- phase-clocks resulting in the element containing a random value and thus possibly enabling unwanted clock flags. Fixes: 89bf26cbc1a0 ("clk: rockchip: Add support for the mmc clock phases using the framework") Cc: stable@vger.kernel.org Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin --- drivers/clk/rockchip/clk-mmc-phase.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/clk/rockchip/clk-mmc-phase.c b/drivers/clk/rockchip/clk-mmc-phase.c index c842e3b60f21..2c111c2cdccc 100644 --- a/drivers/clk/rockchip/clk-mmc-phase.c +++ b/drivers/clk/rockchip/clk-mmc-phase.c @@ -131,6 +131,7 @@ struct clk *rockchip_clk_register_mmc(const char *name, if (!mmc_clock) return NULL; + init.flags = 0; init.num_parents = num_parents; init.parent_names = parent_names; init.ops = &rockchip_mmc_clk_ops; From a56c72fc0cda3382a339172a35c76cea10df72ad Mon Sep 17 00:00:00 2001 From: "Steinar H. Gunderson" Date: Tue, 24 May 2016 20:13:15 +0200 Subject: [PATCH 050/312] usb: dwc3: exynos: Fix deferred probing storm. [ Upstream commit 4879efb34f7d49235fac334d76d9c6a77a021413 ] dwc3-exynos has two problems during init if the regulators are slow to come up (for instance if the I2C bus driver is not on the initramfs) and return probe deferral. First, every time this happens, the driver leaks the USB phys created; they need to be deallocated on error. Second, since the phy devices are created before the regulators fail, this means that there's a new device to re-trigger deferred probing, which causes it to essentially go into a busy loop of re-probing the device until the regulators come up. Move the phy creation to after the regulators have succeeded, and also fix cleanup on failure. On my ODROID XU4 system (with Debian's initramfs which doesn't contain the I2C driver), this reduces the number of probe attempts (for each of the two controllers) from more than 2000 to eight. Signed-off-by: Steinar H. Gunderson Reviewed-by: Krzysztof Kozlowski Reviewed-by: Vivek Gautam Fixes: d720f057fda4 ("usb: dwc3: exynos: add nop transceiver support") Cc: Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin --- drivers/usb/dwc3/dwc3-exynos.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/usb/dwc3/dwc3-exynos.c b/drivers/usb/dwc3/dwc3-exynos.c index 7bd0a95b2815..0a465a90f0d6 100644 --- a/drivers/usb/dwc3/dwc3-exynos.c +++ b/drivers/usb/dwc3/dwc3-exynos.c @@ -128,12 +128,6 @@ static int dwc3_exynos_probe(struct platform_device *pdev) platform_set_drvdata(pdev, exynos); - ret = dwc3_exynos_register_phys(exynos); - if (ret) { - dev_err(dev, "couldn't register PHYs\n"); - return ret; - } - exynos->dev = dev; exynos->clk = devm_clk_get(dev, "usbdrd30"); @@ -183,20 +177,29 @@ static int dwc3_exynos_probe(struct platform_device *pdev) goto err3; } + ret = dwc3_exynos_register_phys(exynos); + if (ret) { + dev_err(dev, "couldn't register PHYs\n"); + goto err4; + } + if (node) { ret = of_platform_populate(node, NULL, NULL, dev); if (ret) { dev_err(dev, "failed to add dwc3 core\n"); - goto err4; + goto err5; } } else { dev_err(dev, "no device node, failed to add dwc3 core\n"); ret = -ENODEV; - goto err4; + goto err5; } return 0; +err5: + platform_device_unregister(exynos->usb2_phy); + platform_device_unregister(exynos->usb3_phy); err4: regulator_disable(exynos->vdd10); err3: From 1aaee5de33d2b615b35afd77cdc6002da8919f0d Mon Sep 17 00:00:00 2001 From: Bin Liu Date: Thu, 26 May 2016 11:43:45 -0500 Subject: [PATCH 051/312] usb: gadget: fix spinlock dead lock in gadgetfs [ Upstream commit d246dcb2331c5783743720e6510892eb1d2801d9 ] [ 40.467381] ============================================= [ 40.473013] [ INFO: possible recursive locking detected ] [ 40.478651] 4.6.0-08691-g7f3db9a #37 Not tainted [ 40.483466] --------------------------------------------- [ 40.489098] usb/733 is trying to acquire lock: [ 40.493734] (&(&dev->lock)->rlock){-.....}, at: [] ep0_complete+0x18/0xdc [gadgetfs] [ 40.502882] [ 40.502882] but task is already holding lock: [ 40.508967] (&(&dev->lock)->rlock){-.....}, at: [] ep0_read+0x20/0x5e0 [gadgetfs] [ 40.517811] [ 40.517811] other info that might help us debug this: [ 40.524623] Possible unsafe locking scenario: [ 40.524623] [ 40.530798] CPU0 [ 40.533346] ---- [ 40.535894] lock(&(&dev->lock)->rlock); [ 40.540088] lock(&(&dev->lock)->rlock); [ 40.544284] [ 40.544284] *** DEADLOCK *** [ 40.544284] [ 40.550461] May be due to missing lock nesting notation [ 40.550461] [ 40.557544] 2 locks held by usb/733: [ 40.561271] #0: (&f->f_pos_lock){+.+.+.}, at: [] __fdget_pos+0x40/0x48 [ 40.569219] #1: (&(&dev->lock)->rlock){-.....}, at: [] ep0_read+0x20/0x5e0 [gadgetfs] [ 40.578523] [ 40.578523] stack backtrace: [ 40.583075] CPU: 0 PID: 733 Comm: usb Not tainted 4.6.0-08691-g7f3db9a #37 [ 40.590246] Hardware name: Generic AM33XX (Flattened Device Tree) [ 40.596625] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 40.604718] [] (show_stack) from [] (dump_stack+0xb0/0xe4) [ 40.612267] [] (dump_stack) from [] (__lock_acquire+0xf68/0x1994) [ 40.620440] [] (__lock_acquire) from [] (lock_acquire+0xd8/0x238) [ 40.628621] [] (lock_acquire) from [] (_raw_spin_lock_irqsave+0x38/0x4c) [ 40.637440] [] (_raw_spin_lock_irqsave) from [] (ep0_complete+0x18/0xdc [gadgetfs]) [ 40.647339] [] (ep0_complete [gadgetfs]) from [] (musb_g_giveback+0x118/0x1b0 [musb_hdrc]) [ 40.657842] [] (musb_g_giveback [musb_hdrc]) from [] (musb_g_ep0_queue+0x16c/0x188 [musb_hdrc]) [ 40.668772] [] (musb_g_ep0_queue [musb_hdrc]) from [] (ep0_read+0x544/0x5e0 [gadgetfs]) [ 40.678963] [] (ep0_read [gadgetfs]) from [] (__vfs_read+0x20/0x110) [ 40.687414] [] (__vfs_read) from [] (vfs_read+0x88/0x114) [ 40.694864] [] (vfs_read) from [] (SyS_read+0x44/0x9c) [ 40.702051] [] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c) This is caused by the spinlock bug in ep0_read(). Fix the two other deadlock sources in gadgetfs_setup() too. Cc: # v3.16+ Signed-off-by: Bin Liu Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin --- drivers/usb/gadget/legacy/inode.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c index 2030565c6789..bccc5788bb98 100644 --- a/drivers/usb/gadget/legacy/inode.c +++ b/drivers/usb/gadget/legacy/inode.c @@ -934,8 +934,11 @@ ep0_read (struct file *fd, char __user *buf, size_t len, loff_t *ptr) struct usb_ep *ep = dev->gadget->ep0; struct usb_request *req = dev->req; - if ((retval = setup_req (ep, req, 0)) == 0) - retval = usb_ep_queue (ep, req, GFP_ATOMIC); + if ((retval = setup_req (ep, req, 0)) == 0) { + spin_unlock_irq (&dev->lock); + retval = usb_ep_queue (ep, req, GFP_KERNEL); + spin_lock_irq (&dev->lock); + } dev->state = STATE_DEV_CONNECTED; /* assume that was SET_CONFIGURATION */ @@ -1453,8 +1456,11 @@ delegate: w_length); if (value < 0) break; + + spin_unlock (&dev->lock); value = usb_ep_queue (gadget->ep0, dev->req, - GFP_ATOMIC); + GFP_KERNEL); + spin_lock (&dev->lock); if (value < 0) { clean_req (gadget->ep0, dev->req); break; @@ -1477,11 +1483,14 @@ delegate: if (value >= 0 && dev->state != STATE_DEV_SETUP) { req->length = value; req->zero = value < w_length; - value = usb_ep_queue (gadget->ep0, req, GFP_ATOMIC); + + spin_unlock (&dev->lock); + value = usb_ep_queue (gadget->ep0, req, GFP_KERNEL); if (value < 0) { DBG (dev, "ep_queue --> %d\n", value); req->status = 0; } + return value; } /* device stalls when value < 0 */ From b71d1794461e3be4e193b76a04132ac55afd5ae1 Mon Sep 17 00:00:00 2001 From: Oliver Neukum Date: Tue, 31 May 2016 14:48:15 +0200 Subject: [PATCH 052/312] HID: elo: kill not flush the work [ Upstream commit ed596a4a88bd161f868ccba078557ee7ede8a6ef ] Flushing a work that reschedules itself is not a sensible operation. It needs to be killed. Failure to do so leads to a kernel panic in the timer code. CC: stable@vger.kernel.org Signed-off-by: Oliver Neukum Reviewed-by: Benjamin Tissoires Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/hid-elo.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hid/hid-elo.c b/drivers/hid/hid-elo.c index 4e49462870ab..d0c8a1c1e1fe 100644 --- a/drivers/hid/hid-elo.c +++ b/drivers/hid/hid-elo.c @@ -259,7 +259,7 @@ static void elo_remove(struct hid_device *hdev) struct elo_priv *priv = hid_get_drvdata(hdev); hid_hw_stop(hdev); - flush_workqueue(wq); + cancel_delayed_work_sync(&priv->work); kfree(priv); } From e78c8a575fd956a4e8a04125c50c23264e012e8e Mon Sep 17 00:00:00 2001 From: Mathias Nyman Date: Wed, 1 Jun 2016 18:09:08 +0300 Subject: [PATCH 053/312] xhci: Fix handling timeouted commands on hosts in weird states. [ Upstream commit 3425aa03f484d45dc21e0e791c2f6c74ea656421 ] If commands timeout we mark them for abortion, then stop the command ring, and turn the commands to no-ops and finally restart the command ring. If the host is working properly the no-op commands will finish and pending completions are called. If we notice the host is failing, driver clears the command ring and completes, deletes and frees all pending commands. There are two separate cases reported where host is believed to work properly but is not. In the first case we successfully stop the ring but no abort or stop command ring event is ever sent and host locks up. The second case is if a host is removed, command times out and driver believes the ring is stopped, and assumes it will be restarted, but actually ends up timing out on the same command forever. If one of the pending commands has the xhci->mutex held it will block xhci_stop() in the remove codepath which otherwise would cleanup pending commands. Add a check that clears all pending commands in case host is removed, or we are stuck timing out on the same command. Also restart the command timeout timer when stopping the command ring to ensure we recive an ring stop/abort event. Cc: stable Tested-by: Joe Lawrence Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/host/xhci-ring.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 6fe0377ec5cf..6ef255142e01 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -289,6 +289,14 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci) temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring); xhci->cmd_ring_state = CMD_RING_STATE_ABORTED; + + /* + * Writing the CMD_RING_ABORT bit should cause a cmd completion event, + * however on some host hw the CMD_RING_RUNNING bit is correctly cleared + * but the completion event in never sent. Use the cmd timeout timer to + * handle those cases. Use twice the time to cover the bit polling retry + */ + mod_timer(&xhci->cmd_timer, jiffies + (2 * XHCI_CMD_DEFAULT_TIMEOUT)); xhci_write_64(xhci, temp_64 | CMD_RING_ABORT, &xhci->op_regs->cmd_ring); @@ -313,6 +321,7 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci) xhci_err(xhci, "Stopped the command ring failed, " "maybe the host is dead\n"); + del_timer(&xhci->cmd_timer); xhci->xhc_state |= XHCI_STATE_DYING; xhci_quiesce(xhci); xhci_halt(xhci); @@ -1252,22 +1261,21 @@ void xhci_handle_command_timeout(unsigned long data) int ret; unsigned long flags; u64 hw_ring_state; - struct xhci_command *cur_cmd = NULL; + bool second_timeout = false; xhci = (struct xhci_hcd *) data; /* mark this command to be cancelled */ spin_lock_irqsave(&xhci->lock, flags); if (xhci->current_cmd) { - cur_cmd = xhci->current_cmd; - cur_cmd->status = COMP_CMD_ABORT; + if (xhci->current_cmd->status == COMP_CMD_ABORT) + second_timeout = true; + xhci->current_cmd->status = COMP_CMD_ABORT; } - /* Make sure command ring is running before aborting it */ hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring); if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) && (hw_ring_state & CMD_RING_RUNNING)) { - spin_unlock_irqrestore(&xhci->lock, flags); xhci_dbg(xhci, "Command timeout\n"); ret = xhci_abort_cmd_ring(xhci); @@ -1279,6 +1287,15 @@ void xhci_handle_command_timeout(unsigned long data) } return; } + + /* command ring failed to restart, or host removed. Bail out */ + if (second_timeout || xhci->xhc_state & XHCI_STATE_REMOVING) { + spin_unlock_irqrestore(&xhci->lock, flags); + xhci_dbg(xhci, "command timed out twice, ring start fail?\n"); + xhci_cleanup_command_queue(xhci); + return; + } + /* command timeout on stopped ring, ring can't be aborted */ xhci_dbg(xhci, "Command timeout on stopped ring\n"); xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd); From 0c3f25d8c6aa0ff475a86cd5d3b7e2c7b6eb496f Mon Sep 17 00:00:00 2001 From: Thomas Petazzoni Date: Wed, 1 Jun 2016 18:09:09 +0300 Subject: [PATCH 054/312] usb: xhci-plat: properly handle probe deferral for devm_clk_get() [ Upstream commit de95c40d5beaa47f6dc8fe9ac4159b4672b51523 ] On some platforms, the clocks might be registered by a platform driver. When this is the case, the clock platform driver may very well be probed after xhci-plat, in which case the first probe() invocation of xhci-plat will receive -EPROBE_DEFER as the return value of devm_clk_get(). The current code handles that as a normal error, and simply assumes that this means that the system doesn't have a clock for the XHCI controller, and continues probing without calling clk_prepare_enable(). Unfortunately, this doesn't work on systems where the XHCI controller does have a clock, but that clock is provided by another platform driver. In order to fix this situation, we handle the -EPROBE_DEFER error condition specially, and abort the XHCI controller probe(). It will be retried later automatically, the clock will be available, devm_clk_get() will succeed, and the probe() will continue with the clock prepared and enabled as expected. In practice, such issue is seen on the ARM64 Marvell 7K/8K platform, where the clocks are registered by a platform driver. Cc: Signed-off-by: Thomas Petazzoni Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/host/xhci-plat.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c index 783e819139a7..7606710baf43 100644 --- a/drivers/usb/host/xhci-plat.c +++ b/drivers/usb/host/xhci-plat.c @@ -116,6 +116,9 @@ static int xhci_plat_probe(struct platform_device *pdev) ret = clk_prepare_enable(clk); if (ret) goto put_hcd; + } else if (PTR_ERR(clk) == -EPROBE_DEFER) { + ret = -EPROBE_DEFER; + goto put_hcd; } if (of_device_is_compatible(pdev->dev.of_node, From 7d2831a8cd331f8d8d214b2a78a217a4ee1b5138 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Thu, 19 May 2016 17:12:19 +0200 Subject: [PATCH 055/312] usb: quirks: Fix sorting [ Upstream commit 81099f97bd31e25ff2719a435b1860fc3876122f ] Properly sort all the entries by vendor id. Signed-off-by: Hans de Goede Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/core/quirks.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 017c1de53aa5..34aa57bffdac 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -44,6 +44,9 @@ static const struct usb_device_id usb_quirk_list[] = { /* Creative SB Audigy 2 NX */ { USB_DEVICE(0x041e, 0x3020), .driver_info = USB_QUIRK_RESET_RESUME }, + /* USB3503 */ + { USB_DEVICE(0x0424, 0x3503), .driver_info = USB_QUIRK_RESET_RESUME }, + /* Microsoft Wireless Laser Mouse 6000 Receiver */ { USB_DEVICE(0x045e, 0x00e1), .driver_info = USB_QUIRK_RESET_RESUME }, @@ -170,6 +173,10 @@ static const struct usb_device_id usb_quirk_list[] = { /* MAYA44USB sound device */ { USB_DEVICE(0x0a92, 0x0091), .driver_info = USB_QUIRK_RESET_RESUME }, + /* ASUS Base Station(T100) */ + { USB_DEVICE(0x0b05, 0x17e0), .driver_info = + USB_QUIRK_IGNORE_REMOTE_WAKEUP }, + /* Action Semiconductor flash disk */ { USB_DEVICE(0x10d6, 0x2200), .driver_info = USB_QUIRK_STRING_FETCH_255 }, @@ -185,16 +192,6 @@ static const struct usb_device_id usb_quirk_list[] = { { USB_DEVICE(0x1908, 0x1315), .driver_info = USB_QUIRK_HONOR_BNUMINTERFACES }, - /* INTEL VALUE SSD */ - { USB_DEVICE(0x8086, 0xf1a5), .driver_info = USB_QUIRK_RESET_RESUME }, - - /* USB3503 */ - { USB_DEVICE(0x0424, 0x3503), .driver_info = USB_QUIRK_RESET_RESUME }, - - /* ASUS Base Station(T100) */ - { USB_DEVICE(0x0b05, 0x17e0), .driver_info = - USB_QUIRK_IGNORE_REMOTE_WAKEUP }, - /* Protocol and OTG Electrical Test Device */ { USB_DEVICE(0x1a0a, 0x0200), .driver_info = USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL }, @@ -205,6 +202,9 @@ static const struct usb_device_id usb_quirk_list[] = { /* Blackmagic Design UltraStudio SDI */ { USB_DEVICE(0x1edb, 0xbd4f), .driver_info = USB_QUIRK_NO_LPM }, + /* INTEL VALUE SSD */ + { USB_DEVICE(0x8086, 0xf1a5), .driver_info = USB_QUIRK_RESET_RESUME }, + { } /* terminating entry must be last */ }; From 378cd9e7d03774e60d58ad8fcbd67fd226234d40 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Thu, 19 May 2016 17:12:20 +0200 Subject: [PATCH 056/312] usb: quirks: Add no-lpm quirk for Acer C120 LED Projector [ Upstream commit 32cb0b37098f4beeff5ad9e325f11b42a6ede56c ] The Acer C120 LED Projector is a USB-3 connected pico projector which takes both its power and video data from USB-3. In combination with some hubs this device does not play well with lpm, so disable lpm for it. Signed-off-by: Hans de Goede Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/core/quirks.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 34aa57bffdac..f28b5375e2c8 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -196,6 +196,9 @@ static const struct usb_device_id usb_quirk_list[] = { { USB_DEVICE(0x1a0a, 0x0200), .driver_info = USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL }, + /* Acer C120 LED Projector */ + { USB_DEVICE(0x1de1, 0xc102), .driver_info = USB_QUIRK_NO_LPM }, + /* Blackmagic Design Intensity Shuttle */ { USB_DEVICE(0x1edb, 0xbd3b), .driver_info = USB_QUIRK_NO_LPM }, From 95cb83b7af2bf9d7a466b51427966eba42333977 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Wed, 1 Jun 2016 21:01:29 +0200 Subject: [PATCH 057/312] USB: xhci: Add broken streams quirk for Frescologic device id 1009 [ Upstream commit d95815ba6a0f287213118c136e64d8c56daeaeab ] I got one of these cards for testing uas with, it seems that with streams it dma-s all over the place, corrupting memory. On my first tests it managed to dma over the BIOS of the motherboard somehow and completely bricked it. Tests on another motherboard show that it does work with streams disabled. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/host/xhci-pci.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index c6027acb6263..54caaf87c567 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -37,6 +37,7 @@ /* Device for a quirk */ #define PCI_VENDOR_ID_FRESCO_LOGIC 0x1b73 #define PCI_DEVICE_ID_FRESCO_LOGIC_PDK 0x1000 +#define PCI_DEVICE_ID_FRESCO_LOGIC_FL1009 0x1009 #define PCI_DEVICE_ID_FRESCO_LOGIC_FL1400 0x1400 #define PCI_VENDOR_ID_ETRON 0x1b6f @@ -108,6 +109,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) xhci->quirks |= XHCI_TRUST_TX_LENGTH; } + if (pdev->vendor == PCI_VENDOR_ID_FRESCO_LOGIC && + pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_FL1009) + xhci->quirks |= XHCI_BROKEN_STREAMS; + if (pdev->vendor == PCI_VENDOR_ID_NEC) xhci->quirks |= XHCI_NEC_HOST; From ffb84c1774409125134fe0eebd5fd4da82993f22 Mon Sep 17 00:00:00 2001 From: Andrew Goodbody Date: Tue, 31 May 2016 10:05:26 -0500 Subject: [PATCH 058/312] usb: musb: Ensure rx reinit occurs for shared_fifo endpoints [ Upstream commit f3eec0cf784e0d6c47822ca6b66df3d5812af7e6 ] shared_fifo endpoints would only get a previous tx state cleared out, the rx state was only cleared for non shared_fifo endpoints Change this so that the rx state is cleared for all endpoints. This addresses an issue that resulted in rx packets being dropped silently. Signed-off-by: Andrew Goodbody Cc: stable@vger.kernel.org Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/musb/musb_host.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c index c3d5fc9dfb5b..383a6574c2aa 100644 --- a/drivers/usb/musb/musb_host.c +++ b/drivers/usb/musb/musb_host.c @@ -583,14 +583,13 @@ musb_rx_reinit(struct musb *musb, struct musb_qh *qh, struct musb_hw_ep *ep) musb_writew(ep->regs, MUSB_TXCSR, 0); /* scrub all previous state, clearing toggle */ - } else { - csr = musb_readw(ep->regs, MUSB_RXCSR); - if (csr & MUSB_RXCSR_RXPKTRDY) - WARNING("rx%d, packet/%d ready?\n", ep->epnum, - musb_readw(ep->regs, MUSB_RXCOUNT)); - - musb_h_flush_rxfifo(ep, MUSB_RXCSR_CLRDATATOG); } + csr = musb_readw(ep->regs, MUSB_RXCSR); + if (csr & MUSB_RXCSR_RXPKTRDY) + WARNING("rx%d, packet/%d ready?\n", ep->epnum, + musb_readw(ep->regs, MUSB_RXCOUNT)); + + musb_h_flush_rxfifo(ep, MUSB_RXCSR_CLRDATATOG); /* target addr and (for multipoint) hub addr/port */ if (musb->is_multipoint) { From 5cf1329f2004c34966fd4de2bdba14b470c1895f Mon Sep 17 00:00:00 2001 From: Andrew Goodbody Date: Tue, 31 May 2016 10:05:27 -0500 Subject: [PATCH 059/312] usb: musb: Stop bulk endpoint while queue is rotated [ Upstream commit 7b2c17f829545df27a910e8d82e133c21c9a8c9c ] Ensure that the endpoint is stopped by clearing REQPKT before clearing DATAERR_NAKTIMEOUT before rotating the queue on the dedicated bulk endpoint. This addresses an issue where a race could result in the endpoint receiving data before it was reprogrammed resulting in a warning about such data from musb_rx_reinit before it was thrown away. The data thrown away was a valid packet that had been correctly ACKed which meant the host and device got out of sync. Signed-off-by: Andrew Goodbody Cc: stable@vger.kernel.org Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/musb/musb_host.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c index 383a6574c2aa..06853d7c89fd 100644 --- a/drivers/usb/musb/musb_host.c +++ b/drivers/usb/musb/musb_host.c @@ -949,9 +949,15 @@ static void musb_bulk_nak_timeout(struct musb *musb, struct musb_hw_ep *ep, if (is_in) { dma = is_dma_capable() ? ep->rx_channel : NULL; - /* clear nak timeout bit */ + /* + * Need to stop the transaction by clearing REQPKT first + * then the NAK Timeout bit ref MUSBMHDRC USB 2.0 HIGH-SPEED + * DUAL-ROLE CONTROLLER Programmer's Guide, section 9.2.2 + */ rx_csr = musb_readw(epio, MUSB_RXCSR); rx_csr |= MUSB_RXCSR_H_WZC_BITS; + rx_csr &= ~MUSB_RXCSR_H_REQPKT; + musb_writew(epio, MUSB_RXCSR, rx_csr); rx_csr &= ~MUSB_RXCSR_DATAERROR; musb_writew(epio, MUSB_RXCSR, rx_csr); From e342d574a6ce550e14ae1def0eaf2c7cbdc3d991 Mon Sep 17 00:00:00 2001 From: Thierry Reding Date: Thu, 26 May 2016 17:23:29 +0200 Subject: [PATCH 060/312] usb: host: ehci-tegra: Grab the correct UTMI pads reset [ Upstream commit f8a15a9650694feaa0dabf197b0c94d37cd3fb42 ] There are three EHCI controllers on Tegra SoCs, each with its own reset line. However, the first controller contains a set of UTMI configuration registers that are shared with its siblings. These registers will only be reset as part of the first controller's reset. For proper operation it must be ensured that the UTMI configuration registers are reset before any of the EHCI controllers are enabled, irrespective of the probe order. Commit a47cc24cd1e5 ("USB: EHCI: tegra: Fix probe order issue leading to broken USB") introduced code that ensures the first controller is always reset before setting up any of the controllers, and is never again reset afterwards. This code, however, grabs the wrong reset. Each EHCI controller has two reset controls attached: 1) the USB controller reset and 2) the UTMI pads reset (really the first controller's reset). In order to reset the UTMI pads registers the code must grab the second reset, but instead it grabbing the first. Fixes: a47cc24cd1e5 ("USB: EHCI: tegra: Fix probe order issue leading to broken USB") Acked-by: Jon Hunter Cc: stable@vger.kernel.org Signed-off-by: Thierry Reding Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/host/ehci-tegra.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/host/ehci-tegra.c b/drivers/usb/host/ehci-tegra.c index ff9af29b4e9f..d888a00195ac 100644 --- a/drivers/usb/host/ehci-tegra.c +++ b/drivers/usb/host/ehci-tegra.c @@ -89,7 +89,7 @@ static int tegra_reset_usb_controller(struct platform_device *pdev) if (!usb1_reset_attempted) { struct reset_control *usb1_reset; - usb1_reset = of_reset_control_get(phy_np, "usb"); + usb1_reset = of_reset_control_get(phy_np, "utmi-pads"); if (IS_ERR(usb1_reset)) { dev_warn(&pdev->dev, "can't get utmi-pads reset from the PHY\n"); From bcef1c817adc16f798ff75e7910ab8b3361b85c8 Mon Sep 17 00:00:00 2001 From: Wei Fang Date: Tue, 7 Jun 2016 14:53:56 +0800 Subject: [PATCH 061/312] scsi: fix race between simultaneous decrements of ->host_failed [ Upstream commit 72d8c36ec364c82bf1bf0c64dfa1041cfaf139f7 ] sas_ata_strategy_handler() adds the works of the ata error handler to system_unbound_wq. This workqueue asynchronously runs work items, so the ata error handler will be performed concurrently on different CPUs. In this case, ->host_failed will be decreased simultaneously in scsi_eh_finish_cmd() on different CPUs, and become abnormal. It will lead to permanently inequality between ->host_failed and ->host_busy, and scsi error handler thread won't start running. IO errors after that won't be handled. Since all scmds must have been handled in the strategy handler, just remove the decrement in scsi_eh_finish_cmd() and zero ->host_busy after the strategy handler to fix this race. Fixes: 50824d6c5657 ("[SCSI] libsas: async ata-eh") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang Reviewed-by: James Bottomley Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- Documentation/scsi/scsi_eh.txt | 8 ++++++-- drivers/ata/libata-eh.c | 2 +- drivers/scsi/scsi_error.c | 4 +++- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt index 8638f61c8c9d..37eca00796ee 100644 --- a/Documentation/scsi/scsi_eh.txt +++ b/Documentation/scsi/scsi_eh.txt @@ -263,19 +263,23 @@ scmd->allowed. 3. scmd recovered ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd - - shost->host_failed-- - clear scmd->eh_eflags - scsi_setup_cmd_retry() - move from local eh_work_q to local eh_done_q LOCKING: none + CONCURRENCY: at most one thread per separate eh_work_q to + keep queue manipulation lockless 4. EH completes ACTION: scsi_eh_flush_done_q() retries scmds or notifies upper - layer of failure. + layer of failure. May be called concurrently but must have + a no more than one thread per separate eh_work_q to + manipulate the queue locklessly - scmd is removed from eh_done_q and scmd->eh_entry is cleared - if retry is necessary, scmd is requeued using scsi_queue_insert() - otherwise, scsi_finish_command() is invoked for scmd + - zero shost->host_failed LOCKING: queue or finish function performs appropriate locking diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index cb0508af1459..5ab6fa9cfc2f 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -606,7 +606,7 @@ void ata_scsi_error(struct Scsi_Host *host) ata_scsi_port_error_handler(host, ap); /* finish or retry handled scmd's and clean up */ - WARN_ON(host->host_failed || !list_empty(&eh_work_q)); + WARN_ON(!list_empty(&eh_work_q)); DPRINTK("EXIT\n"); } diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index c6b93d273799..841fdf745fcf 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1122,7 +1122,6 @@ static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn) */ void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q) { - scmd->device->host->host_failed--; scmd->eh_eflags = 0; list_move_tail(&scmd->eh_entry, done_q); } @@ -2216,6 +2215,9 @@ int scsi_error_handler(void *data) else scsi_unjam_host(shost); + /* All scmds have been handled */ + shost->host_failed = 0; + /* * Note - if the above fails completely, the action is to take * individual devices offline and flush the queue of any From 017864ed4d67f66d11cf951ad8d8f497c81e915c Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Tue, 7 Jun 2016 17:57:54 +0100 Subject: [PATCH 062/312] ARM: 8578/1: mm: ensure pmd_present only checks the valid bit [ Upstream commit 624531886987f0f1b5d01fb598034d039198e090 ] In a subsequent patch, pmd_mknotpresent will clear the valid bit of the pmd entry, resulting in a not-present entry from the hardware's perspective. Unfortunately, pmd_present simply checks for a non-zero pmd value and will therefore continue to return true even after a pmd_mknotpresent operation. Since pmd_mknotpresent is only used for managing huge entries, this is only an issue for the 3-level case. This patch fixes the 3-level pmd_present implementation to take into account the valid bit. For bisectability, the change is made before the fix to pmd_mknotpresent. [catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch] Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: # 3.11+ Cc: Russell King Cc: Steve Capper Signed-off-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: Russell King Signed-off-by: Sasha Levin --- arch/arm/include/asm/pgtable-2level.h | 1 + arch/arm/include/asm/pgtable-3level.h | 1 + arch/arm/include/asm/pgtable.h | 1 - 3 files changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h index bfd662e49a25..89b5a0a00dc9 100644 --- a/arch/arm/include/asm/pgtable-2level.h +++ b/arch/arm/include/asm/pgtable-2level.h @@ -164,6 +164,7 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr) #define pmd_large(pmd) (pmd_val(pmd) & 2) #define pmd_bad(pmd) (pmd_val(pmd) & 2) +#define pmd_present(pmd) (pmd_val(pmd)) #define copy_pmd(pmdpd,pmdps) \ do { \ diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h index a745a2a53853..e3dad00abcb3 100644 --- a/arch/arm/include/asm/pgtable-3level.h +++ b/arch/arm/include/asm/pgtable-3level.h @@ -212,6 +212,7 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr) : !!(pmd_val(pmd) & (val))) #define pmd_isclear(pmd, val) (!(pmd_val(pmd) & (val))) +#define pmd_present(pmd) (pmd_isset((pmd), L_PMD_SECT_VALID)) #define pmd_young(pmd) (pmd_isset((pmd), PMD_SECT_AF)) #define pte_special(pte) (pte_isset((pte), L_PTE_SPECIAL)) static inline pte_t pte_mkspecial(pte_t pte) diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index f40354198bad..7fa12e0f1bc9 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -182,7 +182,6 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; #define pgd_offset_k(addr) pgd_offset(&init_mm, addr) #define pmd_none(pmd) (!pmd_val(pmd)) -#define pmd_present(pmd) (pmd_val(pmd)) static inline pte_t *pmd_page_vaddr(pmd_t pmd) { From 515f30350a5e240febb0b710dbb68abe1f293464 Mon Sep 17 00:00:00 2001 From: Steve Capper Date: Tue, 7 Jun 2016 17:58:06 +0100 Subject: [PATCH 063/312] ARM: 8579/1: mm: Fix definition of pmd_mknotpresent [ Upstream commit 56530f5d2ddc9b9fade7ef8db9cb886e9dc689b5 ] Currently pmd_mknotpresent will use a zero entry to respresent an invalidated pmd. Unfortunately this definition clashes with pmd_none, thus it is possible for a race condition to occur if zap_pmd_range sees pmd_none whilst __split_huge_pmd_locked is running too with pmdp_invalidate just called. This patch fixes the race condition by modifying pmd_mknotpresent to create non-zero faulting entries (as is done in other architectures), removing the ambiguity with pmd_none. [catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT] Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: # 3.11+ Reported-by: Kirill A. Shutemov Acked-by: Will Deacon Cc: Russell King Signed-off-by: Steve Capper Signed-off-by: Catalin Marinas Signed-off-by: Russell King Signed-off-by: Sasha Levin --- arch/arm/include/asm/pgtable-3level.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h index e3dad00abcb3..fd929b5ded9e 100644 --- a/arch/arm/include/asm/pgtable-3level.h +++ b/arch/arm/include/asm/pgtable-3level.h @@ -258,10 +258,10 @@ PMD_BIT_FUNC(mkyoung, |= PMD_SECT_AF); #define pfn_pmd(pfn,prot) (__pmd(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))) #define mk_pmd(page,prot) pfn_pmd(page_to_pfn(page),prot) -/* represent a notpresent pmd by zero, this is used by pmdp_invalidate */ +/* represent a notpresent pmd by faulting entry, this is used by pmdp_invalidate */ static inline pmd_t pmd_mknotpresent(pmd_t pmd) { - return __pmd(0); + return __pmd(pmd_val(pmd) & ~L_PMD_SECT_VALID); } static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) From 9341331d504d3acd5b3ad0763bb9d23cd3b28944 Mon Sep 17 00:00:00 2001 From: Anton Blanchard Date: Fri, 10 Jun 2016 16:47:03 +1000 Subject: [PATCH 064/312] crypto: vmx - Increase priority of aes-cbc cipher [ Upstream commit 12d3f49e1ffbbf8cbbb60acae5a21103c5c841ac ] All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at priority 1000. Unfortunately this means we never use AES-CBC and AES-CTR, because the base AES-CBC cipher that is implemented on top of AES inherits its priority. To fix this, AES-CBC and AES-CTR have to be a higher priority. Set them to 2000. Testing on a POWER8 with: cryptsetup benchmark --cipher aes --key-size 256 Shows decryption speed increase from 402.4 MB/s to 3069.2 MB/s, over 7x faster. Thanks to Mike Strosaker for helping me debug this issue. Fixes: 8c755ace357c ("crypto: vmx - Adding CBC routines for VMX module") Cc: stable@vger.kernel.org Signed-off-by: Anton Blanchard Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/vmx/aes_cbc.c | 2 +- drivers/crypto/vmx/aes_ctr.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/crypto/vmx/aes_cbc.c b/drivers/crypto/vmx/aes_cbc.c index c8e7f653e5d3..f508cea02039 100644 --- a/drivers/crypto/vmx/aes_cbc.c +++ b/drivers/crypto/vmx/aes_cbc.c @@ -167,7 +167,7 @@ struct crypto_alg p8_aes_cbc_alg = { .cra_name = "cbc(aes)", .cra_driver_name = "p8_aes_cbc", .cra_module = THIS_MODULE, - .cra_priority = 1000, + .cra_priority = 2000, .cra_type = &crypto_blkcipher_type, .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK, .cra_alignmask = 0, diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c index 266e708d63df..d8fa3b4ec17f 100644 --- a/drivers/crypto/vmx/aes_ctr.c +++ b/drivers/crypto/vmx/aes_ctr.c @@ -151,7 +151,7 @@ struct crypto_alg p8_aes_ctr_alg = { .cra_name = "ctr(aes)", .cra_driver_name = "p8_aes_ctr", .cra_module = THIS_MODULE, - .cra_priority = 1000, + .cra_priority = 2000, .cra_type = &crypto_blkcipher_type, .cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_NEED_FALLBACK, .cra_alignmask = 0, From 4a088cba60485acbdceee2c3f903e7b0c7846737 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Wed, 8 Jun 2016 14:56:39 +0200 Subject: [PATCH 065/312] crypto: ux500 - memmove the right size [ Upstream commit 19ced623db2fe91604d69f7d86b03144c5107739 ] The hash buffer is really HASH_BLOCK_SIZE bytes, someone must have thought that memmove takes n*u32 words by mistake. Tests work as good/bad as before after this patch. Cc: Joakim Bech Cc: stable@vger.kernel.org Reported-by: David Binderman Signed-off-by: Linus Walleij Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- drivers/crypto/ux500/hash/hash_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/crypto/ux500/hash/hash_core.c b/drivers/crypto/ux500/hash/hash_core.c index 5f5f360628fc..c5e6222a6ca2 100644 --- a/drivers/crypto/ux500/hash/hash_core.c +++ b/drivers/crypto/ux500/hash/hash_core.c @@ -797,7 +797,7 @@ static int hash_process_data(struct hash_device_data *device_data, &device_data->state); memmove(req_ctx->state.buffer, device_data->state.buffer, - HASH_BLOCK_SIZE / sizeof(u32)); + HASH_BLOCK_SIZE); if (ret) { dev_err(device_data->dev, "%s: hash_resume_state() failed!\n", @@ -848,7 +848,7 @@ static int hash_process_data(struct hash_device_data *device_data, memmove(device_data->state.buffer, req_ctx->state.buffer, - HASH_BLOCK_SIZE / sizeof(u32)); + HASH_BLOCK_SIZE); if (ret) { dev_err(device_data->dev, "%s: hash_save_state() failed!\n", __func__); From 7ac3a704d7dd6c0b8a72048ee66523f190afda8d Mon Sep 17 00:00:00 2001 From: Junichi Nomura Date: Fri, 10 Jun 2016 04:31:52 +0000 Subject: [PATCH 066/312] ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg() [ Upstream commit ae4ea9a2460c7fee2ae8feeb4dfe96f5f6c3e562 ] Commit 7ea0ed2b5be8 ("ipmi: Make the message handler easier to use for SMI interfaces") changed handle_new_recv_msgs() to call handle_one_recv_msg() for a smi_msg while the smi_msg is still connected to waiting_rcv_msgs list. That could lead to following list corruption problems: 1) low-level function treats smi_msg as not connected to list handle_one_recv_msg() could end up calling smi_send(), which assumes the msg is not connected to list. For example, the following sequence could corrupt list by doing list_add_tail() for the entry still connected to other list. handle_new_recv_msgs() msg = list_entry(waiting_rcv_msgs) handle_one_recv_msg(msg) handle_ipmb_get_msg_cmd(msg) smi_send(msg) spin_lock(xmit_msgs_lock) list_add_tail(msg) spin_unlock(xmit_msgs_lock) 2) race between multiple handle_new_recv_msgs() instances handle_new_recv_msgs() once releases waiting_rcv_msgs_lock before calling handle_one_recv_msg() then retakes the lock and list_del() it. If others call handle_new_recv_msgs() during the window shown below list_del() will be done twice for the same smi_msg. handle_new_recv_msgs() spin_lock(waiting_rcv_msgs_lock) msg = list_entry(waiting_rcv_msgs) spin_unlock(waiting_rcv_msgs_lock) | | handle_one_recv_msg(msg) | spin_lock(waiting_rcv_msgs_lock) list_del(msg) spin_unlock(waiting_rcv_msgs_lock) Fixes: 7ea0ed2b5be8 ("ipmi: Make the message handler easier to use for SMI interfaces") Signed-off-by: Jun'ichi Nomura [Added a comment to describe why this works.] Signed-off-by: Corey Minyard Cc: stable@vger.kernel.org # 3.19 Tested-by: Ye Feng Signed-off-by: Sasha Levin --- drivers/char/ipmi/ipmi_msghandler.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c index bf75f6361773..4bc508c14900 100644 --- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -3813,6 +3813,7 @@ static void handle_new_recv_msgs(ipmi_smi_t intf) while (!list_empty(&intf->waiting_rcv_msgs)) { smi_msg = list_entry(intf->waiting_rcv_msgs.next, struct ipmi_smi_msg, link); + list_del(&smi_msg->link); if (!run_to_completion) spin_unlock_irqrestore(&intf->waiting_rcv_msgs_lock, flags); @@ -3822,11 +3823,14 @@ static void handle_new_recv_msgs(ipmi_smi_t intf) if (rv > 0) { /* * To preserve message order, quit if we - * can't handle a message. + * can't handle a message. Add the message + * back at the head, this is safe because this + * tasklet is the only thing that pulls the + * messages. */ + list_add(&smi_msg->link, &intf->waiting_rcv_msgs); break; } else { - list_del(&smi_msg->link); if (rv == 0) /* Message handled */ ipmi_free_smi_msg(smi_msg); From d7afed774eec54eabc25de2623cda1aa8176a2bb Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Mon, 13 Jun 2016 15:37:34 -0400 Subject: [PATCH 067/312] drm/radeon: fix asic initialization for virtualized environments [ Upstream commit 05082b8bbd1a0ffc74235449c4b8930a8c240f85 ] When executing in a PCI passthrough based virtuzliation environment, the hypervisor will usually attempt to send a PCIe bus reset signal to the ASIC when the VM reboots. In this scenario, the card is not correctly initialized, but we still consider it to be posted. Therefore, in a passthrough based environemnt we should always post the card to guarantee it is in a good state for driver initialization. Ported from amdgpu commit: amdgpu: fix asic initialization for virtualized environments Cc: Andres Rodriguez Cc: Alex Williamson Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_device.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 604c44d88e7a..83b3eb2e444a 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -630,6 +630,23 @@ void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc) /* * GPU helpers function. */ + +/** + * radeon_device_is_virtual - check if we are running is a virtual environment + * + * Check if the asic has been passed through to a VM (all asics). + * Used at driver startup. + * Returns true if virtual or false if not. + */ +static bool radeon_device_is_virtual(void) +{ +#ifdef CONFIG_X86 + return boot_cpu_has(X86_FEATURE_HYPERVISOR); +#else + return false; +#endif +} + /** * radeon_card_posted - check if the hw has already been initialized * @@ -643,6 +660,10 @@ bool radeon_card_posted(struct radeon_device *rdev) { uint32_t reg; + /* for pass through, always force asic_init */ + if (radeon_device_is_virtual()) + return false; + /* required for EFI mode on macbook2,1 which uses an r5xx asic */ if (efi_enabled(EFI_BOOT) && (rdev->pdev->subsystem_vendor == PCI_VENDOR_ID_APPLE) && From 505291afa41b4918f3edbd775210853778b7939f Mon Sep 17 00:00:00 2001 From: Oscar Date: Tue, 14 Jun 2016 14:14:35 +0800 Subject: [PATCH 068/312] usb: common: otg-fsm: add license to usb-otg-fsm [ Upstream commit ea1d39a31d3b1b6060b6e83e5a29c069a124c68a ] Fix warning about tainted kernel because usb-otg-fsm has no license. WARNING: with this patch usb-otg-fsm module can be loaded but then the kernel will hang. Tested with a udoo quad board. Cc: #v4.1+ Signed-off-by: Oscar Signed-off-by: Peter Chen Signed-off-by: Sasha Levin --- drivers/usb/common/usb-otg-fsm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/usb/common/usb-otg-fsm.c b/drivers/usb/common/usb-otg-fsm.c index 61d538aa2346..4f4f06a5889f 100644 --- a/drivers/usb/common/usb-otg-fsm.c +++ b/drivers/usb/common/usb-otg-fsm.c @@ -21,6 +21,7 @@ * 675 Mass Ave, Cambridge, MA 02139, USA. */ +#include #include #include #include @@ -365,3 +366,4 @@ int otg_statemachine(struct otg_fsm *fsm) return state_changed; } EXPORT_SYMBOL_GPL(otg_statemachine); +MODULE_LICENSE("GPL"); From d1c7fc1c906d7f7ca1788a53ab67043f3094fd67 Mon Sep 17 00:00:00 2001 From: James Hogan Date: Thu, 9 Jun 2016 10:50:43 +0100 Subject: [PATCH 069/312] MIPS: KVM: Fix modular KVM under QEMU MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 797179bc4fe06c89e47a9f36f886f68640b423f8 ] Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never get a TLB refill exception in it when KVM is built as a module. This was observed to happen with the host MIPS kernel running under QEMU, due to a not entirely transparent optimisation in the QEMU TLB handling where TLB entries replaced with TLBWR are copied to a separate part of the TLB array. Code in those pages continue to be executable, but those mappings persist only until the next ASID switch, even if they are marked global. An ASID switch happens in __kvm_mips_vcpu_run() at exception level after switching to the guest exception base. Subsequent TLB mapped kernel instructions just prior to switching to the guest trigger a TLB refill exception, which enters the guest exception handlers without updating EPC. This appears as a guest triggered TLB refill on a host kernel mapped (host KSeg2) address, which is not handled correctly as user (guest) mode accesses to kernel (host) segments always generate address error exceptions. Signed-off-by: James Hogan Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Ralf Baechle Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: # 3.10.x- Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin --- arch/mips/include/asm/kvm_host.h | 1 + arch/mips/kvm/interrupt.h | 1 + arch/mips/kvm/locore.S | 1 + arch/mips/kvm/mips.c | 11 ++++++++++- 4 files changed, 13 insertions(+), 1 deletion(-) diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index 3585af093576..ab518d14b7b0 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -370,6 +370,7 @@ struct kvm_mips_tlb { #define KVM_MIPS_GUEST_TLB_SIZE 64 struct kvm_vcpu_arch { void *host_ebase, *guest_ebase; + int (*vcpu_run)(struct kvm_run *run, struct kvm_vcpu *vcpu); unsigned long host_stack; unsigned long host_gp; diff --git a/arch/mips/kvm/interrupt.h b/arch/mips/kvm/interrupt.h index 4ab4bdfad703..2143884709e4 100644 --- a/arch/mips/kvm/interrupt.h +++ b/arch/mips/kvm/interrupt.h @@ -28,6 +28,7 @@ #define MIPS_EXC_MAX 12 /* XXXSL More to follow */ +extern char __kvm_mips_vcpu_run_end[]; extern char mips32_exception[], mips32_exceptionEnd[]; extern char mips32_GuestException[], mips32_GuestExceptionEnd[]; diff --git a/arch/mips/kvm/locore.S b/arch/mips/kvm/locore.S index d1ee95a7f7dd..ae01e7fe4e1c 100644 --- a/arch/mips/kvm/locore.S +++ b/arch/mips/kvm/locore.S @@ -235,6 +235,7 @@ FEXPORT(__kvm_mips_load_k0k1) /* Jump to guest */ eret +EXPORT(__kvm_mips_vcpu_run_end) VECTOR(MIPSX(exception), unknown) /* Find out what mode we came from and jump to the proper handler. */ diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index ace4ed7d41c6..485fdc462243 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -312,6 +312,15 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id) memcpy(gebase + offset, mips32_GuestException, mips32_GuestExceptionEnd - mips32_GuestException); +#ifdef MODULE + offset += mips32_GuestExceptionEnd - mips32_GuestException; + memcpy(gebase + offset, (char *)__kvm_mips_vcpu_run, + __kvm_mips_vcpu_run_end - (char *)__kvm_mips_vcpu_run); + vcpu->arch.vcpu_run = gebase + offset; +#else + vcpu->arch.vcpu_run = __kvm_mips_vcpu_run; +#endif + /* Invalidate the icache for these ranges */ local_flush_icache_range((unsigned long)gebase, (unsigned long)gebase + ALIGN(size, PAGE_SIZE)); @@ -401,7 +410,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) /* Disable hardware page table walking while in guest */ htw_stop(); - r = __kvm_mips_vcpu_run(run, vcpu); + r = vcpu->arch.vcpu_run(run, vcpu); /* Re-enable HTW before enabling interrupts */ htw_start(); From fcc5d265d134e891abd67169c77358cd5ea2fc77 Mon Sep 17 00:00:00 2001 From: Michal Suchanek Date: Mon, 13 Jun 2016 17:46:49 +0000 Subject: [PATCH 070/312] spi: sun4i: fix FIFO limit [ Upstream commit 6d9fe44bd73d567d04d3a68a2d2fa521ab9532f2 ] When testing SPI without DMA I noticed that filling the FIFO on the spi controller causes timeout. Always leave room for one byte in the FIFO. Signed-off-by: Michal Suchanek Acked-by: Maxime Ripard Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/spi/spi-sun4i.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/spi/spi-sun4i.c b/drivers/spi/spi-sun4i.c index fbb0a4d74e91..c854248209a3 100644 --- a/drivers/spi/spi-sun4i.c +++ b/drivers/spi/spi-sun4i.c @@ -176,7 +176,10 @@ static int sun4i_spi_transfer_one(struct spi_master *master, /* We don't support transfer larger than the FIFO */ if (tfr->len > SUN4I_FIFO_DEPTH) - return -EINVAL; + return -EMSGSIZE; + + if (tfr->tx_buf && tfr->len >= SUN4I_FIFO_DEPTH) + return -EMSGSIZE; reinit_completion(&sspi->done); sspi->tx_buf = tfr->tx_buf; @@ -269,8 +272,12 @@ static int sun4i_spi_transfer_one(struct spi_master *master, sun4i_spi_write(sspi, SUN4I_BURST_CNT_REG, SUN4I_BURST_CNT(tfr->len)); sun4i_spi_write(sspi, SUN4I_XMIT_CNT_REG, SUN4I_XMIT_CNT(tx_len)); - /* Fill the TX FIFO */ - sun4i_spi_fill_fifo(sspi, SUN4I_FIFO_DEPTH); + /* + * Fill the TX FIFO + * Filling the FIFO fully causes timeout for some reason + * at least on spi2 on A10s + */ + sun4i_spi_fill_fifo(sspi, SUN4I_FIFO_DEPTH - 1); /* Enable the interrupts */ sun4i_spi_write(sspi, SUN4I_INT_CTL_REG, SUN4I_INT_CTL_TC); From 01c93ba6d66687ff46116a46527b7d3b314e0787 Mon Sep 17 00:00:00 2001 From: Michal Suchanek Date: Mon, 13 Jun 2016 17:46:49 +0000 Subject: [PATCH 071/312] spi: sunxi: fix transfer timeout [ Upstream commit 719bd6542044efd9b338a53dba1bef45f40ca169 ] The trasfer timeout is fixed at 1000 ms. Reading a 4Mbyte flash over 1MHz SPI bus takes way longer than that. Calculate the timeout from the actual time the transfer is supposed to take and multiply by 2 for good measure. Signed-off-by: Michal Suchanek Acked-by: Maxime Ripard Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/spi/spi-sun4i.c | 10 +++++++++- drivers/spi/spi-sun6i.c | 10 +++++++++- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-sun4i.c b/drivers/spi/spi-sun4i.c index c854248209a3..39d7c7c70112 100644 --- a/drivers/spi/spi-sun4i.c +++ b/drivers/spi/spi-sun4i.c @@ -170,6 +170,7 @@ static int sun4i_spi_transfer_one(struct spi_master *master, { struct sun4i_spi *sspi = spi_master_get_devdata(master); unsigned int mclk_rate, div, timeout; + unsigned int start, end, tx_time; unsigned int tx_len = 0; int ret = 0; u32 reg; @@ -286,9 +287,16 @@ static int sun4i_spi_transfer_one(struct spi_master *master, reg = sun4i_spi_read(sspi, SUN4I_CTL_REG); sun4i_spi_write(sspi, SUN4I_CTL_REG, reg | SUN4I_CTL_XCH); + tx_time = max(tfr->len * 8 * 2 / (tfr->speed_hz / 1000), 100U); + start = jiffies; timeout = wait_for_completion_timeout(&sspi->done, - msecs_to_jiffies(1000)); + msecs_to_jiffies(tx_time)); + end = jiffies; if (!timeout) { + dev_warn(&master->dev, + "%s: timeout transferring %u bytes@%iHz for %i(%i)ms", + dev_name(&spi->dev), tfr->len, tfr->speed_hz, + jiffies_to_msecs(end - start), tx_time); ret = -ETIMEDOUT; goto out; } diff --git a/drivers/spi/spi-sun6i.c b/drivers/spi/spi-sun6i.c index ac48f59705a8..e77add01b0e9 100644 --- a/drivers/spi/spi-sun6i.c +++ b/drivers/spi/spi-sun6i.c @@ -160,6 +160,7 @@ static int sun6i_spi_transfer_one(struct spi_master *master, { struct sun6i_spi *sspi = spi_master_get_devdata(master); unsigned int mclk_rate, div, timeout; + unsigned int start, end, tx_time; unsigned int tx_len = 0; int ret = 0; u32 reg; @@ -269,9 +270,16 @@ static int sun6i_spi_transfer_one(struct spi_master *master, reg = sun6i_spi_read(sspi, SUN6I_TFR_CTL_REG); sun6i_spi_write(sspi, SUN6I_TFR_CTL_REG, reg | SUN6I_TFR_CTL_XCH); + tx_time = max(tfr->len * 8 * 2 / (tfr->speed_hz / 1000), 100U); + start = jiffies; timeout = wait_for_completion_timeout(&sspi->done, - msecs_to_jiffies(1000)); + msecs_to_jiffies(tx_time)); + end = jiffies; if (!timeout) { + dev_warn(&master->dev, + "%s: timeout transferring %u bytes@%iHz for %i(%i)ms", + dev_name(&spi->dev), tfr->len, tfr->speed_hz, + jiffies_to_msecs(end - start), tx_time); ret = -ETIMEDOUT; goto out; } From c5ffc99b6e519420a791e942606d48fe030383d2 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Sat, 11 Jun 2016 23:06:53 +0900 Subject: [PATCH 072/312] kprobes/x86: Clear TF bit in fault on single-stepping [ Upstream commit dcfc47248d3f7d28df6f531e6426b933de94370d ] Fix kprobe_fault_handler() to clear the TF (trap flag) bit of the flags register in the case of a fault fixup on single-stepping. If we put a kprobe on the instruction which caused a page fault (e.g. actual mov instructions in copy_user_*), that fault happens on the single-stepping buffer. In this case, kprobes resets running instance so that the CPU can retry execution on the original ip address. However, current code forgets to reset the TF bit. Since this fault happens with TF bit set for enabling single-stepping, when it retries, it causes a debug exception and kprobes can not handle it because it already reset itself. On the most of x86-64 platform, it can be easily reproduced by using kprobe tracer. E.g. # cd /sys/kernel/debug/tracing # echo p copy_user_enhanced_fast_string+5 > kprobe_events # echo 1 > events/kprobes/enable And you'll see a kernel panic on do_debug(), since the debug trap is not handled by kprobes. To fix this problem, we just need to clear the TF bit when resetting running kprobe. Signed-off-by: Masami Hiramatsu Reviewed-by: Ananth N Mavinakayanahalli Acked-by: Steven Rostedt Cc: Alexander Shishkin Cc: Andy Lutomirski Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: systemtap@sourceware.org Cc: stable@vger.kernel.org # All the way back to ancient kernels Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox [ Updated the comments. ] Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- arch/x86/kernel/kprobes/core.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index 1deffe6cc873..023c442c33bb 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -959,7 +959,19 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr) * normal page fault. */ regs->ip = (unsigned long)cur->addr; + /* + * Trap flag (TF) has been set here because this fault + * happened where the single stepping will be done. + * So clear it by resetting the current kprobe: + */ + regs->flags &= ~X86_EFLAGS_TF; + + /* + * If the TF flag was set before the kprobe hit, + * don't touch it: + */ regs->flags |= kcb->kprobe_old_flags; + if (kcb->kprobe_status == KPROBE_REENTER) restore_previous_kprobe(kcb); else From 9f67dcf663004c56aaf153835f624f9f87d9a643 Mon Sep 17 00:00:00 2001 From: Andrey Ryabinin Date: Thu, 9 Jun 2016 15:20:05 +0300 Subject: [PATCH 073/312] kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w [ Upstream commit 57675cb976eff977aefb428e68e4e0236d48a9ff ] Lengthy output of sysrq-w may take a lot of time on slow serial console. Currently we reset NMI-watchdog on the current CPU to avoid spurious lockup messages. Sometimes this doesn't work since softlockup watchdog might trigger on another CPU which is waiting for an IPI to proceed. We reset softlockup watchdogs on all CPUs, but we do this only after listing all tasks, and this may be too late on a busy system. So, reset watchdogs CPUs earlier, in for_each_process_thread() loop. Signed-off-by: Andrey Ryabinin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Link: http://lkml.kernel.org/r/1465474805-14641-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- kernel/sched/core.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 3b0f4c09ab92..98e607121d09 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4568,14 +4568,16 @@ void show_state_filter(unsigned long state_filter) /* * reset the NMI-timeout, listing all files on a slow * console might take a lot of time: + * Also, reset softlockup watchdogs on all CPUs, because + * another CPU might be blocked waiting for us to process + * an IPI. */ touch_nmi_watchdog(); + touch_all_softlockup_watchdogs(); if (!state_filter || (p->state & state_filter)) sched_show_task(p); } - touch_all_softlockup_watchdogs(); - #ifdef CONFIG_SCHED_DEBUG sysrq_sched_debug_show(); #endif From 260c505e55b51645affb70a2c456b350f7e7460a Mon Sep 17 00:00:00 2001 From: Rhyland Klein Date: Thu, 9 Jun 2016 17:28:39 -0400 Subject: [PATCH 074/312] power_supply: power_supply_read_temp only if use_cnt > 0 [ Upstream commit 5bc28b93a36e3cb3acc2870fb75cb6ffb182fece ] Change power_supply_read_temp() to use power_supply_get_property() so that it will check the use_cnt and ensure it is > 0. The use_cnt will be incremented at the end of __power_supply_register, so this will block to case where get_property can be called before the supply is fully registered. This fixes the issue show in the stack below: [ 1.452598] power_supply_read_temp+0x78/0x80 [ 1.458680] thermal_zone_get_temp+0x5c/0x11c [ 1.464765] thermal_zone_device_update+0x34/0xb4 [ 1.471195] thermal_zone_device_register+0x87c/0x8cc [ 1.477974] __power_supply_register+0x364/0x424 [ 1.484317] power_supply_register_no_ws+0x10/0x18 [ 1.490833] bq27xxx_battery_setup+0x10c/0x164 [ 1.497003] bq27xxx_battery_i2c_probe+0xd0/0x1b0 [ 1.503435] i2c_device_probe+0x174/0x240 [ 1.509172] driver_probe_device+0x1fc/0x29c [ 1.515167] __driver_attach+0xa4/0xa8 [ 1.520643] bus_for_each_dev+0x58/0x98 [ 1.526204] driver_attach+0x20/0x28 [ 1.531505] bus_add_driver+0x1c8/0x22c [ 1.537067] driver_register+0x68/0x108 [ 1.542630] i2c_register_driver+0x38/0x7c [ 1.548457] bq27xxx_battery_i2c_driver_init+0x18/0x20 [ 1.555321] do_one_initcall+0x38/0x12c [ 1.560886] kernel_init_freeable+0x148/0x1ec [ 1.566972] kernel_init+0x10/0xfc [ 1.572101] ret_from_fork+0x10/0x40 Also make the same change to ps_get_max_charge_cntl_limit() and ps_get_cur_chrage_cntl_limit() to be safe. Lastly, change the return value of power_supply_get_property() to -EAGAIN from -ENODEV if use_cnt <= 0. Fixes: 297d716f6260 ("power_supply: Change ownership from driver to core") Cc: stable@vger.kernel.org Signed-off-by: Rhyland Klein Reviewed-by: Krzysztof Kozlowski Signed-off-by: Sebastian Reichel Signed-off-by: Sasha Levin --- drivers/power/power_supply_core.c | 27 ++++++++++++++++----------- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/drivers/power/power_supply_core.c b/drivers/power/power_supply_core.c index 4bc0c7f459a5..b2bf48c7dc36 100644 --- a/drivers/power/power_supply_core.c +++ b/drivers/power/power_supply_core.c @@ -526,11 +526,12 @@ static int power_supply_read_temp(struct thermal_zone_device *tzd, WARN_ON(tzd == NULL); psy = tzd->devdata; - ret = psy->desc->get_property(psy, POWER_SUPPLY_PROP_TEMP, &val); + ret = power_supply_get_property(psy, POWER_SUPPLY_PROP_TEMP, &val); + if (ret) + return ret; /* Convert tenths of degree Celsius to milli degree Celsius. */ - if (!ret) - *temp = val.intval * 100; + *temp = val.intval * 100; return ret; } @@ -573,10 +574,12 @@ static int ps_get_max_charge_cntl_limit(struct thermal_cooling_device *tcd, int ret; psy = tcd->devdata; - ret = psy->desc->get_property(psy, - POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT_MAX, &val); - if (!ret) - *state = val.intval; + ret = power_supply_get_property(psy, + POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT_MAX, &val); + if (ret) + return ret; + + *state = val.intval; return ret; } @@ -589,10 +592,12 @@ static int ps_get_cur_chrage_cntl_limit(struct thermal_cooling_device *tcd, int ret; psy = tcd->devdata; - ret = psy->desc->get_property(psy, - POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT, &val); - if (!ret) - *state = val.intval; + ret = power_supply_get_property(psy, + POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT, &val); + if (ret) + return ret; + + *state = val.intval; return ret; } From 5a6c735c13e0ad34cdb4fc24a782924b64efec92 Mon Sep 17 00:00:00 2001 From: Lyude Date: Tue, 14 Jun 2016 11:04:09 -0400 Subject: [PATCH 075/312] drm/i915/ilk: Don't disable SSC source if it's in use MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 476490a945e1f0f6bd58e303058d2d8ca93a974c ] Thanks to Ville Syrjälä for pointing me towards the cause of this issue. Unfortunately one of the sideaffects of having the refclk for a DPLL set to SSC is that as long as it's set to SSC, the GPU will prevent us from powering down any of the pipes or transcoders using it. A couple of BIOSes enable SSC in both PCH_DREF_CONTROL and in the DPLL configurations. This causes issues on the first modeset, since we don't expect SSC to be left on and as a result, can't successfully power down the pipes or the transcoders using it. Here's an example from this Dell OptiPlex 990: [drm:intel_modeset_init] SSC enabled by BIOS, overriding VBT which says disabled [drm:intel_modeset_init] 2 display pipes available. [drm:intel_update_cdclk] Current CD clock rate: 400000 kHz [drm:intel_update_max_cdclk] Max CD clock rate: 400000 kHz [drm:intel_update_max_cdclk] Max dotclock rate: 360000 kHz vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem [drm:intel_crt_reset] crt adpa set to 0xf40000 [drm:intel_dp_init_connector] Adding DP connector on port C [drm:intel_dp_aux_init] registering DPDDC-C bus for card0-DP-1 [drm:ironlake_init_pch_refclk] has_panel 0 has_lvds 0 has_ck505 0 [drm:ironlake_init_pch_refclk] Disabling SSC entirely … later we try committing the first modeset … [drm:intel_dump_pipe_config] [CRTC:26][modeset] config ffff88041b02e800 for pipe A [drm:intel_dump_pipe_config] cpu_transcoder: A … [drm:intel_dump_pipe_config] dpll_hw_state: dpll: 0xc4016001, dpll_md: 0x0, fp0: 0x20e08, fp1: 0x30d07 [drm:intel_dump_pipe_config] planes on this crtc [drm:intel_dump_pipe_config] STANDARD PLANE:23 plane: 0.0 idx: 0 enabled [drm:intel_dump_pipe_config] FB:42, fb = 800x600 format = 0x34325258 [drm:intel_dump_pipe_config] scaler:0 src (0, 0) 800x600 dst (0, 0) 800x600 [drm:intel_dump_pipe_config] CURSOR PLANE:25 plane: 0.1 idx: 1 disabled, scaler_id = 0 [drm:intel_dump_pipe_config] STANDARD PLANE:27 plane: 0.1 idx: 2 disabled, scaler_id = 0 [drm:intel_get_shared_dpll] CRTC:26 allocated PCH DPLL A [drm:intel_get_shared_dpll] using PCH DPLL A for pipe A [drm:ilk_audio_codec_disable] Disable audio codec on port C, pipe A [drm:intel_disable_pipe] disabling pipe A ------------[ cut here ]------------ WARNING: CPU: 1 PID: 130 at drivers/gpu/drm/i915/intel_display.c:1146 intel_disable_pipe+0x297/0x2d0 [i915] pipe_off wait timed out … ---[ end trace 94fc8aa03ae139e8 ]--- [drm:intel_dp_link_down] [drm:ironlake_crtc_disable [i915]] *ERROR* failed to disable transcoder A Later modesets succeed since they reset the DPLL's configuration anyway, but this is enough to get stuck with a big fat warning in dmesg. A better solution would be to add refcounts for the SSC source, but for now leaving the source clock on should suffice. Changes since v4: - Fix calculation of final for systems with LVDS panels (fixes BUG() on CI test suite) Changes since v3: - Move temp variable into loop - Move checks for using_ssc_source to after we've figured out has_ck505 - Add using_ssc_source to debug output Changes since v2: - Fix debug output for when we disable the CPU source Changes since v1: - Leave the SSC source clock on instead of just shutting it off on all of the DPLL configurations. Cc: stable@vger.kernel.org Reviewed-by: Ville Syrjälä Signed-off-by: Lyude Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1465916649-10228-1-git-send-email-cpaul@redhat.com Signed-off-by: Daniel Vetter Signed-off-by: Sasha Levin --- drivers/gpu/drm/i915/intel_display.c | 48 ++++++++++++++++++++-------- 1 file changed, 34 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 56323732c748..5250596a612e 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -7129,12 +7129,14 @@ static void ironlake_init_pch_refclk(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; struct intel_encoder *encoder; + int i; u32 val, final; bool has_lvds = false; bool has_cpu_edp = false; bool has_panel = false; bool has_ck505 = false; bool can_ssc = false; + bool using_ssc_source = false; /* We need to take the global config into account */ for_each_intel_encoder(dev, encoder) { @@ -7161,8 +7163,22 @@ static void ironlake_init_pch_refclk(struct drm_device *dev) can_ssc = true; } - DRM_DEBUG_KMS("has_panel %d has_lvds %d has_ck505 %d\n", - has_panel, has_lvds, has_ck505); + /* Check if any DPLLs are using the SSC source */ + for (i = 0; i < dev_priv->num_shared_dpll; i++) { + u32 temp = I915_READ(PCH_DPLL(i)); + + if (!(temp & DPLL_VCO_ENABLE)) + continue; + + if ((temp & PLL_REF_INPUT_MASK) == + PLLB_REF_INPUT_SPREADSPECTRUMIN) { + using_ssc_source = true; + break; + } + } + + DRM_DEBUG_KMS("has_panel %d has_lvds %d has_ck505 %d using_ssc_source %d\n", + has_panel, has_lvds, has_ck505, using_ssc_source); /* Ironlake: try to setup display ref clock before DPLL * enabling. This is only under driver's control after @@ -7199,9 +7215,9 @@ static void ironlake_init_pch_refclk(struct drm_device *dev) final |= DREF_CPU_SOURCE_OUTPUT_NONSPREAD; } else final |= DREF_CPU_SOURCE_OUTPUT_DISABLE; - } else { - final |= DREF_SSC_SOURCE_DISABLE; - final |= DREF_CPU_SOURCE_OUTPUT_DISABLE; + } else if (using_ssc_source) { + final |= DREF_SSC_SOURCE_ENABLE; + final |= DREF_SSC1_ENABLE; } if (final == val) @@ -7247,7 +7263,7 @@ static void ironlake_init_pch_refclk(struct drm_device *dev) POSTING_READ(PCH_DREF_CONTROL); udelay(200); } else { - DRM_DEBUG_KMS("Disabling SSC entirely\n"); + DRM_DEBUG_KMS("Disabling CPU source output\n"); val &= ~DREF_CPU_SOURCE_OUTPUT_MASK; @@ -7258,16 +7274,20 @@ static void ironlake_init_pch_refclk(struct drm_device *dev) POSTING_READ(PCH_DREF_CONTROL); udelay(200); - /* Turn off the SSC source */ - val &= ~DREF_SSC_SOURCE_MASK; - val |= DREF_SSC_SOURCE_DISABLE; + if (!using_ssc_source) { + DRM_DEBUG_KMS("Disabling SSC source\n"); - /* Turn off SSC1 */ - val &= ~DREF_SSC1_ENABLE; + /* Turn off the SSC source */ + val &= ~DREF_SSC_SOURCE_MASK; + val |= DREF_SSC_SOURCE_DISABLE; - I915_WRITE(PCH_DREF_CONTROL, val); - POSTING_READ(PCH_DREF_CONTROL); - udelay(200); + /* Turn off SSC1 */ + val &= ~DREF_SSC1_ENABLE; + + I915_WRITE(PCH_DREF_CONTROL, val); + POSTING_READ(PCH_DREF_CONTROL); + udelay(200); + } } BUG_ON(val != final); From 040ad2a3262124ac7d4676a5409c194baca514a9 Mon Sep 17 00:00:00 2001 From: Andrey Grodzovsky Date: Wed, 25 May 2016 16:45:43 -0400 Subject: [PATCH 076/312] drm/dp/mst: Always clear proposed vcpi table for port. [ Upstream commit fd2d2bac6e79b0be91ab86a6075a0c46ffda658a ] Not clearing mst manager's proposed vcpis table for destroyed connectors when the manager is stopped leaves it pointing to unrefernced memory, this causes pagefault when the manager is restarted when plugging back a branch. Fixes: 91a25e463130 ("drm/dp/mst: deallocate payload on port destruction") Signed-off-by: Andrey Grodzovsky Reviewed-by: Lyude Cc: stable@vger.kernel.org Cc: Mykola Lysenko Cc: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/drm_dp_mst_topology.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 10b8839cbd0c..52dea773bb1b 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -2862,11 +2862,9 @@ static void drm_dp_destroy_connector_work(struct work_struct *work) drm_dp_port_teardown_pdt(port, port->pdt); if (!port->input && port->vcpi.vcpi > 0) { - if (mgr->mst_state) { - drm_dp_mst_reset_vcpi_slots(mgr, port); - drm_dp_update_payload_part1(mgr); - drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi); - } + drm_dp_mst_reset_vcpi_slots(mgr, port); + drm_dp_update_payload_part1(mgr); + drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi); } kref_put(&port->kref, drm_dp_free_mst_port); From c2edadb9814a6ca12e1442c574d2191ca7352c06 Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Mon, 16 May 2016 17:03:42 -0400 Subject: [PATCH 077/312] nfsd4/rpc: move backchannel create logic into rpc code [ Upstream commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 ] Also simplify the logic a bit. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields Acked-by: Trond Myklebust Signed-off-by: Sasha Levin --- fs/nfsd/nfs4callback.c | 18 +----------------- include/linux/sunrpc/clnt.h | 2 -- net/sunrpc/clnt.c | 12 ++++++++++-- 3 files changed, 11 insertions(+), 21 deletions(-) diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c index 5694cfb7a47b..29c4bff1e6e1 100644 --- a/fs/nfsd/nfs4callback.c +++ b/fs/nfsd/nfs4callback.c @@ -710,22 +710,6 @@ static struct rpc_cred *get_backchannel_cred(struct nfs4_client *clp, struct rpc } } -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) -{ - struct rpc_xprt *xprt; - - if (args->protocol != XPRT_TRANSPORT_BC_TCP) - return rpc_create(args); - - xprt = args->bc_xprt->xpt_bc_xprt; - if (xprt) { - xprt_get(xprt); - return rpc_create_xprt(args, xprt); - } - - return rpc_create(args); -} - static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *conn, struct nfsd4_session *ses) { int maxtime = max_cb_time(clp->net); @@ -768,7 +752,7 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c args.authflavor = ses->se_cb_sec.flavor; } /* Create RPC client */ - client = create_backchannel_client(&args); + client = rpc_create(&args); if (IS_ERR(client)) { dprintk("NFSD: couldn't create callback client: %ld\n", PTR_ERR(client)); diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 598ba80ec30c..ee29cb43470f 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -134,8 +134,6 @@ struct rpc_create_args { #define RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT (1UL << 9) struct rpc_clnt *rpc_create(struct rpc_create_args *args); -struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, - struct rpc_xprt *xprt); struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *, const struct rpc_program *, u32); void rpc_task_reset_client(struct rpc_task *task, struct rpc_clnt *clnt); diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index e6ce1517367f..16e831dcfde0 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -442,7 +442,7 @@ out_no_rpciod: return ERR_PTR(err); } -struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, +static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, struct rpc_xprt *xprt) { struct rpc_clnt *clnt = NULL; @@ -474,7 +474,6 @@ struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, return clnt; } -EXPORT_SYMBOL_GPL(rpc_create_xprt); /** * rpc_create - create an RPC client and transport with one call @@ -500,6 +499,15 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) }; char servername[48]; + if (args->bc_xprt) { + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); + xprt = args->bc_xprt->xpt_bc_xprt; + if (xprt) { + xprt_get(xprt); + return rpc_create_xprt(args, xprt); + } + } + if (args->flags & RPC_CLNT_CREATE_INFINITE_SLOTS) xprtargs.flags |= XPRT_CREATE_INFINITE_SLOTS; if (args->flags & RPC_CLNT_CREATE_NO_IDLE_TIMEOUT) From aec2ddb9595f130759ca20c8f089e0361c95ca97 Mon Sep 17 00:00:00 2001 From: Jiri Slaby Date: Fri, 10 Jun 2016 10:54:32 +0200 Subject: [PATCH 078/312] base: make module_create_drivers_dir race-free [ Upstream commit 7e1b1fc4dabd6ec8e28baa0708866e13fa93c9b3 ] Modules which register drivers via standard path (driver_register) in parallel can cause a warning: WARNING: CPU: 2 PID: 3492 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/module/saa7146/drivers' Modules linked in: hexium_gemini(+) mxb(+) ... ... Call Trace: ... [] sysfs_warn_dup+0x62/0x80 [] sysfs_create_dir_ns+0x77/0x90 [] kobject_add_internal+0xb4/0x340 [] kobject_add+0x68/0xb0 [] kobject_create_and_add+0x31/0x70 [] module_add_driver+0xc3/0xd0 [] bus_add_driver+0x154/0x280 [] driver_register+0x60/0xe0 [] __pci_register_driver+0x60/0x70 [] saa7146_register_extension+0x64/0x90 [saa7146] [] hexium_init_module+0x11/0x1000 [hexium_gemini] ... As can be (mostly) seen, driver_register causes this call sequence: -> bus_add_driver -> module_add_driver -> module_create_drivers_dir The last one creates "drivers" directory in /sys/module/<...>. When this is done in parallel, the directory is attempted to be created twice at the same time. This can be easily reproduced by loading mxb and hexium_gemini in parallel: while :; do modprobe mxb & modprobe hexium_gemini wait rmmod mxb hexium_gemini saa7146_vv saa7146 done saa7146 calls pci_register_driver for both mxb and hexium_gemini, which means /sys/module/saa7146/drivers is to be created for both of them. Fix this by a new mutex in module_create_drivers_dir which makes the test-and-create "drivers" dir atomic. I inverted the condition and removed 'return' to avoid multiple unlocks or a goto. Signed-off-by: Jiri Slaby Fixes: fe480a2675ed (Modules: only add drivers/ direcory if needed) Cc: v2.6.21+ Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/base/module.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/base/module.c b/drivers/base/module.c index db930d3ee312..2a215780eda2 100644 --- a/drivers/base/module.c +++ b/drivers/base/module.c @@ -24,10 +24,12 @@ static char *make_driver_name(struct device_driver *drv) static void module_create_drivers_dir(struct module_kobject *mk) { - if (!mk || mk->drivers_dir) - return; + static DEFINE_MUTEX(drivers_dir_mutex); - mk->drivers_dir = kobject_create_and_add("drivers", &mk->kobj); + mutex_lock(&drivers_dir_mutex); + if (mk && !mk->drivers_dir) + mk->drivers_dir = kobject_create_and_add("drivers", &mk->kobj); + mutex_unlock(&drivers_dir_mutex); } void module_add_driver(struct module *mod, struct device_driver *drv) From 316515981206627bf97be37eca397234d604395d Mon Sep 17 00:00:00 2001 From: Xiubo Li Date: Wed, 15 Jun 2016 18:00:33 +0800 Subject: [PATCH 079/312] kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES [ Upstream commit caf1ff26e1aa178133df68ac3d40815fed2187d9 ] These days, we experienced one guest crash with 8 cores and 3 disks, with qemu error logs as bellow: qemu-system-x86_64: /build/qemu-2.0.0/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. And then we found one patch(bdf026317d) in qemu tree, which said could fix this bug. Execute the following script will reproduce the BUG quickly: irq_affinity.sh ======================================================================== vda_irq_num=25 vdb_irq_num=27 while [ 1 ] do for irq in {1,2,4,8,10,20,40,80} do echo $irq > /proc/irq/$vda_irq_num/smp_affinity echo $irq > /proc/irq/$vdb_irq_num/smp_affinity dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct done done ======================================================================== The following qemu log is added in the qemu code and is displayed when this bug reproduced: kvm_irqchip_commit_routes: max gsi: 1008, nr_allocated_irq_routes: 1024, irq_routes->nr: 1024, gsi_count: 1024. That's to say when irq_routes->nr == 1024, there are 1024 routing entries, but in the kernel code when routes->nr >= 1024, will just return -EINVAL; The nr is the number of the routing entries which is in of [1 ~ KVM_MAX_IRQ_ROUTES], not the index in [0 ~ KVM_MAX_IRQ_ROUTES - 1]. This patch fix the BUG above. Cc: stable@vger.kernel.org Signed-off-by: Xiubo Li Signed-off-by: Wei Tang Signed-off-by: Zhang Zhuoyu Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin --- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index c2f87ff0061d..d93deb5ce4f2 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2622,7 +2622,7 @@ static long kvm_vm_ioctl(struct file *filp, if (copy_from_user(&routing, argp, sizeof(routing))) goto out; r = -EINVAL; - if (routing.nr >= KVM_MAX_IRQ_ROUTES) + if (routing.nr > KVM_MAX_IRQ_ROUTES) goto out; if (routing.flags) goto out; From 4c9b1064a186967df8c81f7305df533f56678abf Mon Sep 17 00:00:00 2001 From: "Ocquidant, Sebastien" Date: Wed, 15 Jun 2016 13:47:35 +0200 Subject: [PATCH 080/312] memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing [ Upstream commit 8f50b8e57442d28e41bb736c173d8a2490549a82 ] In the omap gpmc driver it can be noticed that GPMC_CONFIG4_OEEXTRADELAY is overwritten by the WEEXTRADELAY value from the device tree and GPMC_CONFIG4_WEEXTRADELAY is not updated by the value from the device tree. As a consequence, the memory accesses cannot be configured properly when the extra delay are needed for OE and WE. Fix the update of GPMC_CONFIG4_WEEXTRADELAY with the value from the device tree file and prevents GPMC_CONFIG4_OEXTRADELAY being overwritten by the WEXTRADELAY value from the device tree. Cc: stable@vger.kernel.org Signed-off-by: Ocquidant, Sebastien Signed-off-by: Roger Quadros Signed-off-by: Sasha Levin --- drivers/memory/omap-gpmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/memory/omap-gpmc.c b/drivers/memory/omap-gpmc.c index c94ea0d68746..2c51acce4b34 100644 --- a/drivers/memory/omap-gpmc.c +++ b/drivers/memory/omap-gpmc.c @@ -394,7 +394,7 @@ static void gpmc_cs_bool_timings(int cs, const struct gpmc_bool_timings *p) gpmc_cs_modify_reg(cs, GPMC_CS_CONFIG4, GPMC_CONFIG4_OEEXTRADELAY, p->oe_extra_delay); gpmc_cs_modify_reg(cs, GPMC_CS_CONFIG4, - GPMC_CONFIG4_OEEXTRADELAY, p->we_extra_delay); + GPMC_CONFIG4_WEEXTRADELAY, p->we_extra_delay); gpmc_cs_modify_reg(cs, GPMC_CS_CONFIG6, GPMC_CONFIG6_CYCLE2CYCLESAMECSEN, p->cycle2cyclesamecsen); From 0356b95fcc4e7eb670633be53db29147201ddce4 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Thu, 26 May 2016 15:42:13 -0400 Subject: [PATCH 081/312] cgroup: set css->id to -1 during init [ Upstream commit 8fa3b8d689a54d6d04ff7803c724fb7aca6ce98e ] If percpu_ref initialization fails during css_create(), the free path can end up trying to free css->id of zero. As ID 0 is unused, it doesn't cause a critical breakage but it does trigger a warning message. Fix it by setting css->id to -1 from init_and_link_css(). Signed-off-by: Tejun Heo Cc: Wenwei Tao Fixes: 01e586598b22 ("cgroup: release css->id after css_free") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin --- kernel/cgroup.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 530fecfa3992..d01c60425761 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4478,6 +4478,7 @@ static void init_and_link_css(struct cgroup_subsys_state *css, memset(css, 0, sizeof(*css)); css->cgroup = cgrp; css->ss = ss; + css->id = -1; INIT_LIST_HEAD(&css->sibling); INIT_LIST_HEAD(&css->children); css->serial_nr = css_serial_nr_next++; From c262505cdb45765ddea20a1f85f0023990276772 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Thu, 16 Jun 2016 15:48:57 +0100 Subject: [PATCH 082/312] KEYS: potential uninitialized variable [ Upstream commit 38327424b40bcebe2de92d07312c89360ac9229a ] If __key_link_begin() failed then "edit" would be uninitialized. I've added a check to fix that. This allows a random user to crash the kernel, though it's quite difficult to achieve. There are three ways it can be done as the user would have to cause an error to occur in __key_link(): (1) Cause the kernel to run out of memory. In practice, this is difficult to achieve without ENOMEM cropping up elsewhere and aborting the attempt. (2) Revoke the destination keyring between the keyring ID being looked up and it being tested for revocation. In practice, this is difficult to time correctly because the KEYCTL_REJECT function can only be used from the request-key upcall process. Further, users can only make use of what's in /sbin/request-key.conf, though this does including a rejection debugging test - which means that the destination keyring has to be the caller's session keyring in practice. (3) Have just enough key quota available to create a key, a new session keyring for the upcall and a link in the session keyring, but not then sufficient quota to create a link in the nominated destination keyring so that it fails with EDQUOT. The bug can be triggered using option (3) above using something like the following: echo 80 >/proc/sys/kernel/keys/root_maxbytes keyctl request2 user debug:fred negate @t The above sets the quota to something much lower (80) to make the bug easier to trigger, but this is dependent on the system. Note also that the name of the keyring created contains a random number that may be between 1 and 10 characters in size, so may throw the test off by changing the amount of quota used. Assuming the failure occurs, something like the following will be seen: kfree_debugcheck: out of range ptr 6b6b6b6b6b6b6b68h ------------[ cut here ]------------ kernel BUG at ../mm/slab.c:2821! ... RIP: 0010:[] kfree_debugcheck+0x20/0x25 RSP: 0018:ffff8804014a7de8 EFLAGS: 00010092 RAX: 0000000000000034 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000 RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300 RBP: ffff8804014a7df0 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8804014a7e68 R11: 0000000000000054 R12: 0000000000000202 R13: ffffffff81318a66 R14: 0000000000000000 R15: 0000000000000001 ... Call Trace: kfree+0xde/0x1bc assoc_array_cancel_edit+0x1f/0x36 __key_link_end+0x55/0x63 key_reject_and_link+0x124/0x155 keyctl_reject_key+0xb6/0xe0 keyctl_negate_key+0x10/0x12 SyS_keyctl+0x9f/0xe7 do_syscall_64+0x63/0x13a entry_SYSCALL64_slow_path+0x25/0x25 Fixes: f70e2e06196a ('KEYS: Do preallocation for __key_link()') Signed-off-by: Dan Carpenter Signed-off-by: David Howells cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- security/keys/key.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/keys/key.c b/security/keys/key.c index aee2ec5a18fc..970b58ee3e20 100644 --- a/security/keys/key.c +++ b/security/keys/key.c @@ -578,7 +578,7 @@ int key_reject_and_link(struct key *key, mutex_unlock(&key_construction_mutex); - if (keyring) + if (keyring && link_ret == 0) __key_link_end(keyring, &key->index_key, edit); /* wake up anyone waiting for a key to be constructed */ From ec3e73223a8e5a5ec66ca6a2b12d37ddaf2530dd Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 21 Apr 2016 17:49:11 +0200 Subject: [PATCH 083/312] ALSA: hda - Fix possible race on regmap bypass flip [ Upstream commit 3194ed497939c6448005542e3ca4fa2386968fa0 ] HD-audio driver uses regmap cache bypass feature for reading a raw value without the cache. But this is racy since both the cached and the uncached reads may occur concurrently. The former is done via the normal control API access while the latter comes from the proc file read. Even though the regmap itself has the protection against the concurrent accesses, the flag set/reset is done without the protection, so it may lead to inconsistent state of bypass flag that doesn't match with the current read and occasionally result in a kernel WARNING like: WARNING: CPU: 3 PID: 2731 at drivers/base/regmap/regcache.c:499 regcache_cache_only+0x78/0x93 One way to work around such a problem is to wrap with a mutex. But in this case, the solution is simpler: for the uncached read, we just skip the regmap and directly calls its accessor. The verb execution there is protected by itself, so basically it's safe to call individually. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116171 Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- include/sound/hda_regmap.h | 2 ++ sound/hda/hdac_device.c | 10 ++++------ sound/hda/hdac_regmap.c | 40 ++++++++++++++++++++++++++------------ 3 files changed, 34 insertions(+), 18 deletions(-) diff --git a/include/sound/hda_regmap.h b/include/sound/hda_regmap.h index df705908480a..2f48d648e051 100644 --- a/include/sound/hda_regmap.h +++ b/include/sound/hda_regmap.h @@ -17,6 +17,8 @@ int snd_hdac_regmap_add_vendor_verb(struct hdac_device *codec, unsigned int verb); int snd_hdac_regmap_read_raw(struct hdac_device *codec, unsigned int reg, unsigned int *val); +int snd_hdac_regmap_read_raw_uncached(struct hdac_device *codec, + unsigned int reg, unsigned int *val); int snd_hdac_regmap_write_raw(struct hdac_device *codec, unsigned int reg, unsigned int val); int snd_hdac_regmap_update_raw(struct hdac_device *codec, unsigned int reg, diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c index 961ca32ee989..37f2dacd6691 100644 --- a/sound/hda/hdac_device.c +++ b/sound/hda/hdac_device.c @@ -261,13 +261,11 @@ EXPORT_SYMBOL_GPL(_snd_hdac_read_parm); int snd_hdac_read_parm_uncached(struct hdac_device *codec, hda_nid_t nid, int parm) { - int val; + unsigned int cmd, val; - if (codec->regmap) - regcache_cache_bypass(codec->regmap, true); - val = snd_hdac_read_parm(codec, nid, parm); - if (codec->regmap) - regcache_cache_bypass(codec->regmap, false); + cmd = snd_hdac_regmap_encode_verb(nid, AC_VERB_PARAMETERS) | parm; + if (snd_hdac_regmap_read_raw_uncached(codec, cmd, &val) < 0) + return -1; return val; } EXPORT_SYMBOL_GPL(snd_hdac_read_parm_uncached); diff --git a/sound/hda/hdac_regmap.c b/sound/hda/hdac_regmap.c index b0ed870ffb88..c785e5a84b9c 100644 --- a/sound/hda/hdac_regmap.c +++ b/sound/hda/hdac_regmap.c @@ -420,14 +420,30 @@ int snd_hdac_regmap_write_raw(struct hdac_device *codec, unsigned int reg, EXPORT_SYMBOL_GPL(snd_hdac_regmap_write_raw); static int reg_raw_read(struct hdac_device *codec, unsigned int reg, - unsigned int *val) + unsigned int *val, bool uncached) { - if (!codec->regmap) + if (uncached || !codec->regmap) return hda_reg_read(codec, reg, val); else return regmap_read(codec->regmap, reg, val); } +static int __snd_hdac_regmap_read_raw(struct hdac_device *codec, + unsigned int reg, unsigned int *val, + bool uncached) +{ + int err; + + err = reg_raw_read(codec, reg, val, uncached); + if (err == -EAGAIN) { + err = snd_hdac_power_up_pm(codec); + if (!err) + err = reg_raw_read(codec, reg, val, uncached); + snd_hdac_power_down_pm(codec); + } + return err; +} + /** * snd_hdac_regmap_read_raw - read a pseudo register with power mgmt * @codec: the codec object @@ -439,19 +455,19 @@ static int reg_raw_read(struct hdac_device *codec, unsigned int reg, int snd_hdac_regmap_read_raw(struct hdac_device *codec, unsigned int reg, unsigned int *val) { - int err; - - err = reg_raw_read(codec, reg, val); - if (err == -EAGAIN) { - err = snd_hdac_power_up_pm(codec); - if (!err) - err = reg_raw_read(codec, reg, val); - snd_hdac_power_down_pm(codec); - } - return err; + return __snd_hdac_regmap_read_raw(codec, reg, val, false); } EXPORT_SYMBOL_GPL(snd_hdac_regmap_read_raw); +/* Works like snd_hdac_regmap_read_raw(), but this doesn't read from the + * cache but always via hda verbs. + */ +int snd_hdac_regmap_read_raw_uncached(struct hdac_device *codec, + unsigned int reg, unsigned int *val) +{ + return __snd_hdac_regmap_read_raw(codec, reg, val, true); +} + /** * snd_hdac_regmap_update_raw - update a pseudo register with power mgmt * @codec: the codec object From 0b1ca750b5fe6e457768c1b835f7f380a288ca58 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Fri, 17 Jun 2016 13:35:56 +0200 Subject: [PATCH 084/312] ALSA: hdac_regmap - fix the register access for runtime PM [ Upstream commit 8198868f0a283eb23e264951632ce61ec2f82228 ] Call path: 1) snd_hdac_power_up_pm() 2) snd_hdac_power_up() 3) pm_runtime_get_sync() 4) __pm_runtime_resume() 5) rpm_resume() The rpm_resume() returns 1 when the device is already active. Because the return value is unmodified, the hdac regmap read/write functions should allow this value for the retry I/O operation, too. Signed-off-by: Jaroslav Kysela Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/hda/hdac_regmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/hda/hdac_regmap.c b/sound/hda/hdac_regmap.c index c785e5a84b9c..f3eb78e47ced 100644 --- a/sound/hda/hdac_regmap.c +++ b/sound/hda/hdac_regmap.c @@ -411,7 +411,7 @@ int snd_hdac_regmap_write_raw(struct hdac_device *codec, unsigned int reg, err = reg_raw_write(codec, reg, val); if (err == -EAGAIN) { err = snd_hdac_power_up_pm(codec); - if (!err) + if (err >= 0) err = reg_raw_write(codec, reg, val); snd_hdac_power_down_pm(codec); } @@ -437,7 +437,7 @@ static int __snd_hdac_regmap_read_raw(struct hdac_device *codec, err = reg_raw_read(codec, reg, val, uncached); if (err == -EAGAIN) { err = snd_hdac_power_up_pm(codec); - if (!err) + if (err >= 0) err = reg_raw_read(codec, reg, val, uncached); snd_hdac_power_down_pm(codec); } From efe098c364208b0d1fcc1b252290df0fd225d352 Mon Sep 17 00:00:00 2001 From: Jeff Mahoney Date: Wed, 8 Jun 2016 00:36:38 -0400 Subject: [PATCH 085/312] btrfs: account for non-CoW'd blocks in btrfs_abort_transaction [ Upstream commit 64c12921e11b3a0c10d088606e328c58e29274d8 ] The test for !trans->blocks_used in btrfs_abort_transaction is insufficient to determine whether it's safe to drop the transaction handle on the floor. btrfs_cow_block, informed by should_cow_block, can return blocks that have already been CoW'd in the current transaction. trans->blocks_used is only incremented for new block allocations. If an operation overlaps the blocks in the current transaction entirely and must abort the transaction, we'll happily let it clean up the trans handle even though it may have modified the blocks and will commit an incomplete operation. In the long-term, I'd like to do closer tracking of when the fs is actually modified so we can still recover as gracefully as possible, but that approach will need some discussion. In the short term, since this is the only code using trans->blocks_used, let's just switch it to a bool indicating whether any blocks were used and set it when should_cow_block returns false. Cc: stable@vger.kernel.org # 3.4+ Signed-off-by: Jeff Mahoney Reviewed-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Sasha Levin --- fs/btrfs/ctree.c | 5 ++++- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/super.c | 2 +- fs/btrfs/transaction.c | 1 - fs/btrfs/transaction.h | 2 +- 5 files changed, 7 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 0f11ebc92f02..844c883a7169 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1548,6 +1548,7 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, trans->transid, root->fs_info->generation); if (!should_cow_block(trans, root, buf)) { + trans->dirty = true; *cow_ret = buf; return 0; } @@ -2767,8 +2768,10 @@ again: * then we don't want to set the path blocking, * so we test it here */ - if (!should_cow_block(trans, root, b)) + if (!should_cow_block(trans, root, b)) { + trans->dirty = true; goto cow_done; + } /* * must have write locks on this node and the diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index d1ae1322648a..2771bc32dbd9 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -7504,7 +7504,7 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root, set_extent_dirty(&trans->transaction->dirty_pages, buf->start, buf->start + buf->len - 1, GFP_NOFS); } - trans->blocks_used++; + trans->dirty = true; /* this returns a buffer locked for blocking */ return buf; } diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 70734d89193a..a40b454aea44 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -262,7 +262,7 @@ void __btrfs_abort_transaction(struct btrfs_trans_handle *trans, trans->aborted = errno; /* Nothing used. The other threads that have joined this * transaction may be able to continue. */ - if (!trans->blocks_used && list_empty(&trans->new_bgs)) { + if (!trans->dirty && list_empty(&trans->new_bgs)) { const char *errstr; errstr = btrfs_decode_error(errno); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 00d18c2bdb0f..6d43b2ab183b 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -507,7 +507,6 @@ again: h->transid = cur_trans->transid; h->transaction = cur_trans; - h->blocks_used = 0; h->bytes_reserved = 0; h->root = root; h->delayed_ref_updates = 0; diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h index 0b24755596ba..4ce102be6d6b 100644 --- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -105,7 +105,6 @@ struct btrfs_trans_handle { u64 qgroup_reserved; unsigned long use_count; unsigned long blocks_reserved; - unsigned long blocks_used; unsigned long delayed_ref_updates; struct btrfs_transaction *transaction; struct btrfs_block_rsv *block_rsv; @@ -115,6 +114,7 @@ struct btrfs_trans_handle { bool allocating_chunk; bool reloc_reserved; bool sync; + bool dirty; unsigned int type; /* * this root is only needed to validate that the root passed to From 0764832cd4b2472693529696578563892929701a Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Wed, 8 Jun 2016 17:28:29 -0600 Subject: [PATCH 086/312] IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs [ Upstream commit 8c5122e45a10a9262f872b53f151a592e870f905 ] When this code was reworked for IBoE support the order of assignments for the sl_tclass_flowlabel got flipped around resulting in TClass & FlowLabel being permanently set to 0 in the packet headers. This breaks IB routers that rely on these headers, but only affects kernel users - libmlx4 does this properly for user space. Cc: stable@vger.kernel.org Fixes: fa417f7b520e ("IB/mlx4: Add support for IBoE") Signed-off-by: Jason Gunthorpe Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin --- drivers/infiniband/hw/mlx4/ah.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c index 33fdd50123f7..9fa27b0cda32 100644 --- a/drivers/infiniband/hw/mlx4/ah.c +++ b/drivers/infiniband/hw/mlx4/ah.c @@ -47,6 +47,7 @@ static struct ib_ah *create_ib_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, ah->av.ib.port_pd = cpu_to_be32(to_mpd(pd)->pdn | (ah_attr->port_num << 24)); ah->av.ib.g_slid = ah_attr->src_path_bits; + ah->av.ib.sl_tclass_flowlabel = cpu_to_be32(ah_attr->sl << 28); if (ah_attr->ah_flags & IB_AH_GRH) { ah->av.ib.g_slid |= 0x80; ah->av.ib.gid_index = ah_attr->grh.sgid_index; @@ -64,7 +65,6 @@ static struct ib_ah *create_ib_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, !(1 << ah->av.ib.stat_rate & dev->caps.stat_rate_support)) --ah->av.ib.stat_rate; } - ah->av.ib.sl_tclass_flowlabel = cpu_to_be32(ah_attr->sl << 28); return &ah->ibah; } From 293745be2931d248a1244918b2318b7cbcdda542 Mon Sep 17 00:00:00 2001 From: Thor Thayer Date: Thu, 16 Jun 2016 11:10:19 -0500 Subject: [PATCH 087/312] can: c_can: Update D_CAN TX and RX functions to 32 bit - fix Altera Cyclone access [ Upstream commit 427460c83cdf55069eee49799a0caef7dde8df69 ] When testing CAN write floods on Altera's CycloneV, the first 2 bytes are sometimes 0x00, 0x00 or corrupted instead of the values sent. Also observed bytes 4 & 5 were corrupted in some cases. The D_CAN Data registers are 32 bits and changing from 16 bit writes to 32 bit writes fixes the problem. Testing performed on Altera CycloneV (D_CAN). Requesting tests on other C_CAN & D_CAN platforms. Reported-by: Richard Andrysek Signed-off-by: Thor Thayer Cc: Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin --- drivers/net/can/c_can/c_can.c | 38 ++++++++++++++++++++++++++++------- 1 file changed, 31 insertions(+), 7 deletions(-) diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c index 5d214d135332..c076414103d2 100644 --- a/drivers/net/can/c_can/c_can.c +++ b/drivers/net/can/c_can/c_can.c @@ -332,9 +332,23 @@ static void c_can_setup_tx_object(struct net_device *dev, int iface, priv->write_reg(priv, C_CAN_IFACE(MSGCTRL_REG, iface), ctrl); - for (i = 0; i < frame->can_dlc; i += 2) { - priv->write_reg(priv, C_CAN_IFACE(DATA1_REG, iface) + i / 2, - frame->data[i] | (frame->data[i + 1] << 8)); + if (priv->type == BOSCH_D_CAN) { + u32 data = 0, dreg = C_CAN_IFACE(DATA1_REG, iface); + + for (i = 0; i < frame->can_dlc; i += 4, dreg += 2) { + data = (u32)frame->data[i]; + data |= (u32)frame->data[i + 1] << 8; + data |= (u32)frame->data[i + 2] << 16; + data |= (u32)frame->data[i + 3] << 24; + priv->write_reg32(priv, dreg, data); + } + } else { + for (i = 0; i < frame->can_dlc; i += 2) { + priv->write_reg(priv, + C_CAN_IFACE(DATA1_REG, iface) + i / 2, + frame->data[i] | + (frame->data[i + 1] << 8)); + } } } @@ -402,10 +416,20 @@ static int c_can_read_msg_object(struct net_device *dev, int iface, u32 ctrl) } else { int i, dreg = C_CAN_IFACE(DATA1_REG, iface); - for (i = 0; i < frame->can_dlc; i += 2, dreg ++) { - data = priv->read_reg(priv, dreg); - frame->data[i] = data; - frame->data[i + 1] = data >> 8; + if (priv->type == BOSCH_D_CAN) { + for (i = 0; i < frame->can_dlc; i += 4, dreg += 2) { + data = priv->read_reg32(priv, dreg); + frame->data[i] = data; + frame->data[i + 1] = data >> 8; + frame->data[i + 2] = data >> 16; + frame->data[i + 3] = data >> 24; + } + } else { + for (i = 0; i < frame->can_dlc; i += 2, dreg++) { + data = priv->read_reg(priv, dreg); + frame->data[i] = data; + frame->data[i + 1] = data >> 8; + } } } From 1105110de5bac70124e63663c618cab9d4e5c3b7 Mon Sep 17 00:00:00 2001 From: Wolfgang Grandegger Date: Mon, 13 Jun 2016 15:44:19 +0200 Subject: [PATCH 088/312] can: at91_can: RX queue could get stuck at high bus load [ Upstream commit 43200a4480cbbe660309621817f54cbb93907108 ] At high bus load it could happen that "at91_poll()" enters with all RX message boxes filled up. If then at the end the "quota" is exceeded as well, "rx_next" will not be reset to the first RX mailbox and hence the interrupts remain disabled. Signed-off-by: Wolfgang Grandegger Tested-by: Amr Bekhit Cc: Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin --- drivers/net/can/at91_can.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c index f4e40aa4d2a2..35233aa5a88a 100644 --- a/drivers/net/can/at91_can.c +++ b/drivers/net/can/at91_can.c @@ -733,9 +733,10 @@ static int at91_poll_rx(struct net_device *dev, int quota) /* upper group completed, look again in lower */ if (priv->rx_next > get_mb_rx_low_last(priv) && - quota > 0 && mb > get_mb_rx_last(priv)) { + mb > get_mb_rx_last(priv)) { priv->rx_next = get_mb_rx_first(priv); - goto again; + if (quota > 0) + goto again; } return received; From 335f166ee3341efcf402c46ea0e7d6362e7c9122 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (Red Hat)" Date: Fri, 17 Jun 2016 16:10:42 -0400 Subject: [PATCH 089/312] tracing: Handle NULL formats in hold_module_trace_bprintk_format() [ Upstream commit 70c8217acd4383e069fe1898bbad36ea4fcdbdcc ] If a task uses a non constant string for the format parameter in trace_printk(), then the trace_printk_fmt variable is set to NULL. This variable is then saved in the __trace_printk_fmt section. The function hold_module_trace_bprintk_format() checks to see if duplicate formats are used by modules, and reuses them if so (saves them to the list if it is new). But this function calls lookup_format() that does a strcmp() to the value (which is now NULL) and can cause a kernel oops. This wasn't an issue till 3debb0a9ddb ("tracing: Fix trace_printk() to print when not using bprintk()") which added "__used" to the trace_printk_fmt variable, and before that, the kernel simply optimized it out (no NULL value was saved). The fix is simply to handle the NULL pointer in lookup_format() and have the caller ignore the value if it was NULL. Link: http://lkml.kernel.org/r/1464769870-18344-1-git-send-email-zhengjun.xing@intel.com Reported-by: xingzhen Acked-by: Namhyung Kim Fixes: 3debb0a9ddb ("tracing: Fix trace_printk() to print when not using bprintk()") Cc: stable@vger.kernel.org # v3.5+ Signed-off-by: Steven Rostedt Signed-off-by: Sasha Levin --- kernel/trace/trace_printk.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c index 6d6c0411cbe8..fc0c74f18288 100644 --- a/kernel/trace/trace_printk.c +++ b/kernel/trace/trace_printk.c @@ -36,6 +36,10 @@ struct trace_bprintk_fmt { static inline struct trace_bprintk_fmt *lookup_format(const char *fmt) { struct trace_bprintk_fmt *pos; + + if (!fmt) + return ERR_PTR(-EINVAL); + list_for_each_entry(pos, &trace_bprintk_fmt_list, list) { if (!strcmp(pos->fmt, fmt)) return pos; @@ -57,7 +61,8 @@ void hold_module_trace_bprintk_format(const char **start, const char **end) for (iter = start; iter < end; iter++) { struct trace_bprintk_fmt *tb_fmt = lookup_format(*iter); if (tb_fmt) { - *iter = tb_fmt->fmt; + if (!IS_ERR(tb_fmt)) + *iter = tb_fmt->fmt; continue; } From 78016458e6bc8d2e3a352be3e7d879ab43a233e6 Mon Sep 17 00:00:00 2001 From: Boris Brezillon Date: Fri, 27 May 2016 16:09:25 +0200 Subject: [PATCH 090/312] drm: atmel-hlcdc: actually disable scaling when no scaling is required [ Upstream commit 1b7e38b92b0bbd363369f5160f13f4d26140972d ] The driver is only enabling scaling, but never disabling it, thus, if you enable the scaling feature once it stays enabled forever. Signed-off-by: Boris Brezillon Reported-by: Alex Vazquez Reviewed-by: Nicolas Ferre Fixes: 1a396789f65a ("drm: add Atmel HLCDC Display Controller support") Cc: Signed-off-by: Sasha Levin --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c index be9fa8220499..767d0eaabe97 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c @@ -335,6 +335,8 @@ atmel_hlcdc_plane_update_pos_and_size(struct atmel_hlcdc_plane *plane, atmel_hlcdc_layer_update_cfg(&plane->layer, 13, 0xffffffff, factor_reg); + } else { + atmel_hlcdc_layer_update_cfg(&plane->layer, 13, 0xffffffff, 0); } } From 32bab0754d80dc086111d93bdc8ce39cf37c9eb9 Mon Sep 17 00:00:00 2001 From: Shaokun Zhang Date: Tue, 21 Jun 2016 15:32:57 +0800 Subject: [PATCH 091/312] arm64: mm: remove page_mapping check in __sync_icache_dcache [ Upstream commit 20c27a4270c775d7ed661491af8ac03264d60fc6 ] __sync_icache_dcache unconditionally skips the cache maintenance for anonymous pages, under the assumption that flushing is only required in the presence of D-side aliases [see 7249b79f6b4cc ("arm64: Do not flush the D-cache for anonymous pages")]. Unfortunately, this breaks migration of anonymous pages holding self-modifying code, where userspace cannot be reasonably expected to reissue maintenance instructions in response to a migration. This patch fixes the problem by removing the broken page_mapping(page) check from the cache syncing code, otherwise we may end up fetching and executing stale instructions from the PoU. Cc: Catalin Marinas Cc: Will Deacon Cc: Mark Rutland Cc: Reviewed-by: Catalin Marinas Signed-off-by: Shaokun Zhang Signed-off-by: Will Deacon Signed-off-by: Sasha Levin --- arch/arm64/mm/flush.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c index b6f14e8d2121..bfb8eb168f2d 100644 --- a/arch/arm64/mm/flush.c +++ b/arch/arm64/mm/flush.c @@ -74,10 +74,6 @@ void __sync_icache_dcache(pte_t pte, unsigned long addr) { struct page *page = pte_page(pte); - /* no flushing needed for anonymous pages */ - if (!page_mapping(page)) - return; - if (!test_and_set_bit(PG_dcache_clean, &page->flags)) { __flush_dcache_area(page_address(page), PAGE_SIZE << compound_order(page)); From 65c68fd8c1c79178d17217fb045b80badecdd472 Mon Sep 17 00:00:00 2001 From: Alexander Shiyan Date: Wed, 1 Jun 2016 22:21:53 +0300 Subject: [PATCH 092/312] pinctrl: imx: Do not treat a PIN without MUX register as an error [ Upstream commit ba562d5e54fd3136bfea0457add3675850247774 ] Some PINs do not have a MUX register, it is not an error. It is necessary to allow the continuation of the PINs configuration, otherwise the whole PIN-group will be configured incorrectly. Cc: stable@vger.kernel.org Signed-off-by: Alexander Shiyan Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/freescale/pinctrl-imx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/pinctrl/freescale/pinctrl-imx.c b/drivers/pinctrl/freescale/pinctrl-imx.c index e261f1cf85c6..09c05bffe026 100644 --- a/drivers/pinctrl/freescale/pinctrl-imx.c +++ b/drivers/pinctrl/freescale/pinctrl-imx.c @@ -205,9 +205,9 @@ static int imx_pmx_set(struct pinctrl_dev *pctldev, unsigned selector, pin_reg = &info->pin_regs[pin_id]; if (pin_reg->mux_reg == -1) { - dev_err(ipctl->dev, "Pin(%s) does not support mux function\n", + dev_dbg(ipctl->dev, "Pin(%s) does not support mux function\n", info->pins[pin_id].name); - return -EINVAL; + continue; } if (info->flags & SHARE_MUX_CONF_REG) { From 0e16f4fb69810c58bdf142dd85ee92e46576b877 Mon Sep 17 00:00:00 2001 From: Tony Lindgren Date: Tue, 31 May 2016 14:17:06 -0700 Subject: [PATCH 093/312] pinctrl: single: Fix missing flush of posted write for a wakeirq [ Upstream commit 0ac3c0a4025f41748a083bdd4970cb3ede802b15 ] With many repeated suspend resume cycles, the pin specific wakeirq may not always work on omaps. This is because the write to enable the pin interrupt may not have reached the device over the interconnect before suspend happens. Let's fix the issue with a flush of posted write with a readback. Cc: stable@vger.kernel.org Reported-by: Nishanth Menon Signed-off-by: Tony Lindgren Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/pinctrl-single.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pinctrl/pinctrl-single.c b/drivers/pinctrl/pinctrl-single.c index 13b45f297727..11bf750d3761 100644 --- a/drivers/pinctrl/pinctrl-single.c +++ b/drivers/pinctrl/pinctrl-single.c @@ -1576,6 +1576,9 @@ static inline void pcs_irq_set(struct pcs_soc_data *pcs_soc, else mask &= ~soc_mask; pcs->write(mask, pcswi->reg); + + /* flush posted write */ + mask = pcs->read(pcswi->reg); raw_spin_unlock(&pcs->lock); } From 59523335cd4ae9e0b3359ce6577afea1fff9524d Mon Sep 17 00:00:00 2001 From: Richard Weinberger Date: Tue, 21 Jun 2016 00:31:50 +0200 Subject: [PATCH 094/312] ubi: Make recover_peb power cut aware MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 972228d87445dc46c0a01f5f3de673ac017626f7 ] recover_peb() was never power cut aware, if a power cut happened right after writing the VID header upon next attach UBI would blindly use the new partial written PEB and all data from the old PEB is lost. In order to make recover_peb() power cut aware, write the new VID with a proper crc and copy_flag set such that the UBI attach process will detect whether the new PEB is completely written or not. We cannot directly use ubi_eba_atomic_leb_change() since we'd have to unlock the LEB which is facing a write error. Cc: stable@vger.kernel.org Reported-by: Jörg Pfähler Reviewed-by: Jörg Pfähler Signed-off-by: Richard Weinberger Signed-off-by: Sasha Levin --- drivers/mtd/ubi/eba.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c index 1c73ba6efdbd..f109aeed9883 100644 --- a/drivers/mtd/ubi/eba.c +++ b/drivers/mtd/ubi/eba.c @@ -575,6 +575,7 @@ static int recover_peb(struct ubi_device *ubi, int pnum, int vol_id, int lnum, int err, idx = vol_id2idx(ubi, vol_id), new_pnum, data_size, tries = 0; struct ubi_volume *vol = ubi->volumes[idx]; struct ubi_vid_hdr *vid_hdr; + uint32_t crc; vid_hdr = ubi_zalloc_vid_hdr(ubi, GFP_NOFS); if (!vid_hdr) @@ -599,14 +600,8 @@ retry: goto out_put; } - vid_hdr->sqnum = cpu_to_be64(ubi_next_sqnum(ubi)); - err = ubi_io_write_vid_hdr(ubi, new_pnum, vid_hdr); - if (err) { - up_read(&ubi->fm_eba_sem); - goto write_error; - } + ubi_assert(vid_hdr->vol_type == UBI_VID_DYNAMIC); - data_size = offset + len; mutex_lock(&ubi->buf_mutex); memset(ubi->peb_buf + offset, 0xFF, len); @@ -621,6 +616,19 @@ retry: memcpy(ubi->peb_buf + offset, buf, len); + data_size = offset + len; + crc = crc32(UBI_CRC32_INIT, ubi->peb_buf, data_size); + vid_hdr->sqnum = cpu_to_be64(ubi_next_sqnum(ubi)); + vid_hdr->copy_flag = 1; + vid_hdr->data_size = cpu_to_be32(data_size); + vid_hdr->data_crc = cpu_to_be32(crc); + err = ubi_io_write_vid_hdr(ubi, new_pnum, vid_hdr); + if (err) { + mutex_unlock(&ubi->buf_mutex); + up_read(&ubi->fm_eba_sem); + goto write_error; + } + err = ubi_io_write_data(ubi, ubi->peb_buf, new_pnum, 0, data_size); if (err) { mutex_unlock(&ubi->buf_mutex); From a13b0f0a244b15e576f6edf4ffb9ce41ea6f3837 Mon Sep 17 00:00:00 2001 From: Richard Weinberger Date: Thu, 16 Jun 2016 23:26:14 +0200 Subject: [PATCH 095/312] mm: Export migrate_page_move_mapping and migrate_page_copy [ Upstream commit 1118dce773d84f39ebd51a9fe7261f9169cb056e ] Export these symbols such that UBIFS can implement ->migratepage. Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger Acked-by: Christoph Hellwig Signed-off-by: Sasha Levin --- mm/migrate.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/migrate.c b/mm/migrate.c index fe71f91c7b27..2599977221aa 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -389,6 +389,7 @@ int migrate_page_move_mapping(struct address_space *mapping, return MIGRATEPAGE_SUCCESS; } +EXPORT_SYMBOL(migrate_page_move_mapping); /* * The expected number of remaining references is the same as that @@ -549,6 +550,7 @@ void migrate_page_copy(struct page *newpage, struct page *page) if (PageWriteback(newpage)) end_page_writeback(newpage); } +EXPORT_SYMBOL(migrate_page_copy); /************************************************************ * Migration functions From f3d1ae6f55fd1c9bedb26fd742693f629facd01a Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" Date: Thu, 16 Jun 2016 23:26:15 +0200 Subject: [PATCH 096/312] UBIFS: Implement ->migratepage() [ Upstream commit 4ac1c17b2044a1b4b2fbed74451947e905fc2992 ] During page migrations UBIFS might get confused and the following assert triggers: [ 213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436) [ 213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008 [ 213.490000] Hardware name: Allwinner sun4i/sun5i Families [ 213.490000] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 213.490000] [] (show_stack) from [] (dump_stack+0x8c/0xa0) [ 213.490000] [] (dump_stack) from [] (ubifs_set_page_dirty+0x44/0x50) [ 213.490000] [] (ubifs_set_page_dirty) from [] (try_to_unmap_one+0x10c/0x3a8) [ 213.490000] [] (try_to_unmap_one) from [] (rmap_walk+0xb4/0x290) [ 213.490000] [] (rmap_walk) from [] (try_to_unmap+0x64/0x80) [ 213.490000] [] (try_to_unmap) from [] (migrate_pages+0x328/0x7a0) [ 213.490000] [] (migrate_pages) from [] (alloc_contig_range+0x168/0x2f4) [ 213.490000] [] (alloc_contig_range) from [] (cma_alloc+0x170/0x2c0) [ 213.490000] [] (cma_alloc) from [] (__alloc_from_contiguous+0x38/0xd8) [ 213.490000] [] (__alloc_from_contiguous) from [] (__dma_alloc+0x23c/0x274) [ 213.490000] [] (__dma_alloc) from [] (arm_dma_alloc+0x54/0x5c) [ 213.490000] [] (arm_dma_alloc) from [] (drm_gem_cma_create+0xb8/0xf0) [ 213.490000] [] (drm_gem_cma_create) from [] (drm_gem_cma_create_with_handle+0x1c/0xe8) [ 213.490000] [] (drm_gem_cma_create_with_handle) from [] (drm_gem_cma_dumb_create+0x3c/0x48) [ 213.490000] [] (drm_gem_cma_dumb_create) from [] (drm_ioctl+0x12c/0x444) [ 213.490000] [] (drm_ioctl) from [] (do_vfs_ioctl+0x3f4/0x614) [ 213.490000] [] (do_vfs_ioctl) from [] (SyS_ioctl+0x34/0x5c) [ 213.490000] [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x34) UBIFS is using PagePrivate() which can have different meanings across filesystems. Therefore the generic page migration code cannot handle this case correctly. We have to implement our own migration function which basically does a plain copy but also duplicates the page private flag. UBIFS is not a block device filesystem and cannot use buffer_migrate_page(). Cc: stable@vger.kernel.org Signed-off-by: Kirill A. Shutemov [rw: Massaged changelog, build fixes, etc...] Signed-off-by: Richard Weinberger Acked-by: Christoph Hellwig Signed-off-by: Sasha Levin --- fs/ubifs/file.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c index 35efc103c39c..75e9b2db14ab 100644 --- a/fs/ubifs/file.c +++ b/fs/ubifs/file.c @@ -53,6 +53,7 @@ #include #include #include +#include static int read_block(struct inode *inode, void *addr, unsigned int block, struct ubifs_data_node *dn) @@ -1420,6 +1421,26 @@ static int ubifs_set_page_dirty(struct page *page) return ret; } +#ifdef CONFIG_MIGRATION +static int ubifs_migrate_page(struct address_space *mapping, + struct page *newpage, struct page *page, enum migrate_mode mode) +{ + int rc; + + rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0); + if (rc != MIGRATEPAGE_SUCCESS) + return rc; + + if (PagePrivate(page)) { + ClearPagePrivate(page); + SetPagePrivate(newpage); + } + + migrate_page_copy(newpage, page); + return MIGRATEPAGE_SUCCESS; +} +#endif + static int ubifs_releasepage(struct page *page, gfp_t unused_gfp_flags) { /* @@ -1556,6 +1577,9 @@ const struct address_space_operations ubifs_file_address_operations = { .write_end = ubifs_write_end, .invalidatepage = ubifs_invalidatepage, .set_page_dirty = ubifs_set_page_dirty, +#ifdef CONFIG_MIGRATION + .migratepage = ubifs_migrate_page, +#endif .releasepage = ubifs_releasepage, }; From 488ba7c5a08649e4f05075a8237135b2ae4c37b4 Mon Sep 17 00:00:00 2001 From: Oliver Hartkopp Date: Tue, 21 Jun 2016 12:14:07 +0200 Subject: [PATCH 097/312] can: fix handling of unmodifiable configuration options fix [ Upstream commit bce271f255dae8335dc4d2ee2c4531e09cc67f5a ] With upstream commit bb208f144cf3f59 (can: fix handling of unmodifiable configuration options) a new can_validate() function was introduced. When invoking 'ip link set can0 type can' without any configuration data can_validate() tries to validate the content without taking into account that there's totally no content. This patch adds a check for missing content. Reported-by: ajneu Signed-off-by: Oliver Hartkopp Cc: Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin --- drivers/net/can/dev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index 910c12e2638e..348dd5001fa4 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -798,6 +798,9 @@ static int can_validate(struct nlattr *tb[], struct nlattr *data[]) * - control mode with CAN_CTRLMODE_FD set */ + if (!data) + return 0; + if (data[IFLA_CAN_CTRLMODE]) { struct can_ctrlmode *cm = nla_data(data[IFLA_CAN_CTRLMODE]); From 760102cd663d7a2f952177942f0cf44bbb06320d Mon Sep 17 00:00:00 2001 From: Oliver Hartkopp Date: Tue, 21 Jun 2016 15:45:47 +0200 Subject: [PATCH 098/312] can: fix oops caused by wrong rtnl dellink usage [ Upstream commit 25e1ed6e64f52a692ba3191c4fde650aab3ecc07 ] For 'real' hardware CAN devices the netlink interface is used to set CAN specific communication parameters. Real CAN hardware can not be created nor removed with the ip tool ... This patch adds a private dellink function for the CAN device driver interface that does just nothing. It's a follow up to commit 993e6f2fd ("can: fix oops caused by wrong rtnl newlink usage") but for dellink. Reported-by: ajneu Signed-off-by: Oliver Hartkopp Cc: Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin --- drivers/net/can/dev.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c index 348dd5001fa4..ad535a854e5c 100644 --- a/drivers/net/can/dev.c +++ b/drivers/net/can/dev.c @@ -1011,6 +1011,11 @@ static int can_newlink(struct net *src_net, struct net_device *dev, return -EOPNOTSUPP; } +static void can_dellink(struct net_device *dev, struct list_head *head) +{ + return; +} + static struct rtnl_link_ops can_link_ops __read_mostly = { .kind = "can", .maxtype = IFLA_CAN_MAX, @@ -1019,6 +1024,7 @@ static struct rtnl_link_ops can_link_ops __read_mostly = { .validate = can_validate, .newlink = can_newlink, .changelink = can_changelink, + .dellink = can_dellink, .get_size = can_get_size, .fill_info = can_fill_info, .get_xstats_size = can_get_xstats_size, From 90a12c571f30254108126ee82ca7f4179e934b33 Mon Sep 17 00:00:00 2001 From: Andrey Grodzovsky Date: Tue, 21 Jun 2016 14:26:36 -0400 Subject: [PATCH 099/312] xen/pciback: Fix conf_space read/write overlap check. [ Upstream commit 02ef871ecac290919ea0c783d05da7eedeffc10e ] Current overlap check is evaluating to false a case where a filter field is fully contained (proper subset) of a r/w request. This change applies classical overlap check instead to include all the scenarios. More specifically, for (Hilscher GmbH CIFX 50E-DP(M/S)) device driver the logic is such that the entire confspace is read and written in 4 byte chunks. In this case as an example, CACHE_LINE_SIZE, LATENCY_TIMER and PCI_BIST are arriving together in one call to xen_pcibk_config_write() with offset == 0xc and size == 4. With the exsisting overlap check the LATENCY_TIMER field (offset == 0xd, length == 1) is fully contained in the write request and hence is excluded from write, which is incorrect. Signed-off-by: Andrey Grodzovsky Reviewed-by: Boris Ostrovsky Reviewed-by: Jan Beulich Cc: Signed-off-by: David Vrabel Signed-off-by: Sasha Levin --- drivers/xen/xen-pciback/conf_space.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/xen/xen-pciback/conf_space.c b/drivers/xen/xen-pciback/conf_space.c index 9c234209d8b5..47a4177b16d2 100644 --- a/drivers/xen/xen-pciback/conf_space.c +++ b/drivers/xen/xen-pciback/conf_space.c @@ -183,8 +183,7 @@ int xen_pcibk_config_read(struct pci_dev *dev, int offset, int size, field_start = OFFSET(cfg_entry); field_end = OFFSET(cfg_entry) + field->size; - if ((req_start >= field_start && req_start < field_end) - || (req_end > field_start && req_end <= field_end)) { + if (req_end > field_start && field_end > req_start) { err = conf_space_read(dev, cfg_entry, field_start, &tmp_val); if (err) @@ -230,8 +229,7 @@ int xen_pcibk_config_write(struct pci_dev *dev, int offset, int size, u32 value) field_start = OFFSET(cfg_entry); field_end = OFFSET(cfg_entry) + field->size; - if ((req_start >= field_start && req_start < field_end) - || (req_end > field_start && req_end <= field_end)) { + if (req_end > field_start && field_end > req_start) { tmp_val = 0; err = xen_pcibk_config_read(dev, field_start, From 37d925d2750f0efcbdc04e17d6bc46d3c82dc214 Mon Sep 17 00:00:00 2001 From: Ping Cheng Date: Thu, 23 Jun 2016 10:54:17 -0700 Subject: [PATCH 100/312] Input: wacom_w8001 - w8001_MAX_LENGTH should be 13 [ Upstream commit 12afb34400eb2b301f06b2aa3535497d14faee59 ] Somehow the patch that added two-finger touch support forgot to update W8001_MAX_LENGTH from 11 to 13. Signed-off-by: Ping Cheng Reviewed-by: Peter Hutterer Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin --- drivers/input/touchscreen/wacom_w8001.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/input/touchscreen/wacom_w8001.c b/drivers/input/touchscreen/wacom_w8001.c index 2792ca397dd0..3ed0ce1e4dcb 100644 --- a/drivers/input/touchscreen/wacom_w8001.c +++ b/drivers/input/touchscreen/wacom_w8001.c @@ -27,7 +27,7 @@ MODULE_AUTHOR("Jaya Kumar "); MODULE_DESCRIPTION(DRIVER_DESC); MODULE_LICENSE("GPL"); -#define W8001_MAX_LENGTH 11 +#define W8001_MAX_LENGTH 13 #define W8001_LEAD_MASK 0x80 #define W8001_LEAD_BYTE 0x80 #define W8001_TAB_MASK 0x40 From 97e2a92930008f6087b6d59f761277f0839b30be Mon Sep 17 00:00:00 2001 From: Dmitry Torokhov Date: Tue, 21 Jun 2016 16:09:00 -0700 Subject: [PATCH 101/312] Input: elantech - add more IC body types to the list [ Upstream commit 226ba707744a51acb4244724e09caacb1d96aed9 ] The touchpad in HP Pavilion 14-ab057ca reports it's version as 12 and according to Elan both 11 and 12 are valid IC types and should be identified as hw_version 4. Reported-by: Patrick Lessard Tested-by: Patrick Lessard Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin --- drivers/input/mouse/elantech.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/drivers/input/mouse/elantech.c b/drivers/input/mouse/elantech.c index 0f5b400706d7..c3c5d492cba0 100644 --- a/drivers/input/mouse/elantech.c +++ b/drivers/input/mouse/elantech.c @@ -1550,13 +1550,7 @@ static int elantech_set_properties(struct elantech_data *etd) case 5: etd->hw_version = 3; break; - case 6: - case 7: - case 8: - case 9: - case 10: - case 13: - case 14: + case 6 ... 14: etd->hw_version = 4; break; default: From 6a2f15857e4debf46d34fd897e9d3eaf70590e33 Mon Sep 17 00:00:00 2001 From: Dmitrii Tcvetkov Date: Mon, 20 Jun 2016 13:52:14 +0300 Subject: [PATCH 102/312] drm/nouveau: fix for disabled fbdev emulation [ Upstream commit 52dfcc5ccfbb6697ac3cac7f7ff1e712760e1216 ] Hello, after this commit: commit f045f459d925138fe7d6193a8c86406bda7e49da Author: Ben Skeggs Date: Thu Jun 2 12:23:31 2016 +1000 drm/nouveau/fbcon: fix out-of-bounds memory accesses kernel started to oops when loading nouveau module when using GTX 780 Ti video adapter. This patch fixes the problem. Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=120591 Signed-off-by: Dmitrii Tcvetkov Suggested-by: Ilia Mirkin Fixes: f045f459d925 ("nouveau_fbcon_init()") Signed-off-by: Ben Skeggs Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/nouveau/nouveau_fbcon.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c index cf43f77be254..bb29f1e482d7 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c +++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c @@ -572,7 +572,8 @@ nouveau_fbcon_init(struct drm_device *dev) if (ret) goto fini; - fbcon->helper.fbdev->pixmap.buf_align = 4; + if (fbcon->helper.fbdev) + fbcon->helper.fbdev->pixmap.buf_align = 4; return 0; fini: From 60243e6760bb680c9c6c6d5676a0669c44890c74 Mon Sep 17 00:00:00 2001 From: Sinclair Yeh Date: Thu, 23 Jun 2016 17:37:34 -0700 Subject: [PATCH 103/312] Input: vmmouse - remove port reservation [ Upstream commit 60842ef8128e7bf58c024814cd0dc14319232b6c ] The VMWare EFI BIOS will expose port 0x5658 as an ACPI resource. This causes the port to be reserved by the APCI module as the system comes up, making it unavailable to be reserved again by other drivers, thus preserving this VMWare port for special use in a VMWare guest. This port is designed to be shared among multiple VMWare services, such as the VMMOUSE. Because of this, VMMOUSE should not try to reserve this port on its own. The VMWare non-EFI BIOS does not do this to preserve compatibility with existing/legacy VMs. It is known that there is small chance a VM may be configured such that these ports get reserved by other non-VMWare devices, and if this ever happens, the result is undefined. Signed-off-by: Sinclair Yeh Reviewed-by: Thomas Hellstrom Cc: # 4.1- Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin --- drivers/input/mouse/vmmouse.c | 22 ++-------------------- 1 file changed, 2 insertions(+), 20 deletions(-) diff --git a/drivers/input/mouse/vmmouse.c b/drivers/input/mouse/vmmouse.c index a3f0f5a47490..0f586780ceb4 100644 --- a/drivers/input/mouse/vmmouse.c +++ b/drivers/input/mouse/vmmouse.c @@ -355,18 +355,11 @@ int vmmouse_detect(struct psmouse *psmouse, bool set_properties) return -ENXIO; } - if (!request_region(VMMOUSE_PROTO_PORT, 4, "vmmouse")) { - psmouse_dbg(psmouse, "VMMouse port in use.\n"); - return -EBUSY; - } - /* Check if the device is present */ response = ~VMMOUSE_PROTO_MAGIC; VMMOUSE_CMD(GETVERSION, 0, version, response, dummy1, dummy2); - if (response != VMMOUSE_PROTO_MAGIC || version == 0xffffffffU) { - release_region(VMMOUSE_PROTO_PORT, 4); + if (response != VMMOUSE_PROTO_MAGIC || version == 0xffffffffU) return -ENXIO; - } if (set_properties) { psmouse->vendor = VMMOUSE_VENDOR; @@ -374,8 +367,6 @@ int vmmouse_detect(struct psmouse *psmouse, bool set_properties) psmouse->model = version; } - release_region(VMMOUSE_PROTO_PORT, 4); - return 0; } @@ -394,7 +385,6 @@ static void vmmouse_disconnect(struct psmouse *psmouse) psmouse_reset(psmouse); input_unregister_device(priv->abs_dev); kfree(priv); - release_region(VMMOUSE_PROTO_PORT, 4); } /** @@ -438,15 +428,10 @@ int vmmouse_init(struct psmouse *psmouse) struct input_dev *rel_dev = psmouse->dev, *abs_dev; int error; - if (!request_region(VMMOUSE_PROTO_PORT, 4, "vmmouse")) { - psmouse_dbg(psmouse, "VMMouse port in use.\n"); - return -EBUSY; - } - psmouse_reset(psmouse); error = vmmouse_enable(psmouse); if (error) - goto release_region; + return error; priv = kzalloc(sizeof(*priv), GFP_KERNEL); abs_dev = input_allocate_device(); @@ -502,8 +487,5 @@ init_fail: kfree(priv); psmouse->private = NULL; -release_region: - release_region(VMMOUSE_PROTO_PORT, 4); - return error; } From 32dc059d132c7fb4f45a7aeab70e08d2a47ed90d Mon Sep 17 00:00:00 2001 From: Jerome Marchand Date: Thu, 26 May 2016 11:52:25 +0200 Subject: [PATCH 104/312] cifs: dynamic allocation of ntlmssp blob [ Upstream commit b8da344b74c822e966c6d19d6b2321efe82c5d97 ] In sess_auth_rawntlmssp_authenticate(), the ntlmssp blob is allocated statically and its size is an "empirical" 5*sizeof(struct _AUTHENTICATE_MESSAGE) (320B on x86_64). I don't know where this value comes from or if it was ever appropriate, but it is currently insufficient: the user and domain name in UTF16 could take 1kB by themselves. Because of that, build_ntlmssp_auth_blob() might corrupt memory (out-of-bounds write). The size of ntlmssp_blob in SMB2_sess_setup() is too small too (sizeof(struct _NEGOTIATE_MESSAGE) + 500). This patch allocates the blob dynamically in build_ntlmssp_auth_blob(). Signed-off-by: Jerome Marchand Signed-off-by: Steve French CC: Stable Signed-off-by: Sasha Levin --- fs/cifs/ntlmssp.h | 2 +- fs/cifs/sess.c | 76 ++++++++++++++++++++++++++--------------------- fs/cifs/smb2pdu.c | 10 ++----- 3 files changed, 45 insertions(+), 43 deletions(-) diff --git a/fs/cifs/ntlmssp.h b/fs/cifs/ntlmssp.h index 848249fa120f..3079b38f0afb 100644 --- a/fs/cifs/ntlmssp.h +++ b/fs/cifs/ntlmssp.h @@ -133,6 +133,6 @@ typedef struct _AUTHENTICATE_MESSAGE { int decode_ntlmssp_challenge(char *bcc_ptr, int blob_len, struct cifs_ses *ses); void build_ntlmssp_negotiate_blob(unsigned char *pbuffer, struct cifs_ses *ses); -int build_ntlmssp_auth_blob(unsigned char *pbuffer, u16 *buflen, +int build_ntlmssp_auth_blob(unsigned char **pbuffer, u16 *buflen, struct cifs_ses *ses, const struct nls_table *nls_cp); diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c index 8ffda5084dbf..5f9229ddf335 100644 --- a/fs/cifs/sess.c +++ b/fs/cifs/sess.c @@ -364,19 +364,43 @@ void build_ntlmssp_negotiate_blob(unsigned char *pbuffer, sec_blob->DomainName.MaximumLength = 0; } -/* We do not malloc the blob, it is passed in pbuffer, because its - maximum possible size is fixed and small, making this approach cleaner. - This function returns the length of the data in the blob */ -int build_ntlmssp_auth_blob(unsigned char *pbuffer, +static int size_of_ntlmssp_blob(struct cifs_ses *ses) +{ + int sz = sizeof(AUTHENTICATE_MESSAGE) + ses->auth_key.len + - CIFS_SESS_KEY_SIZE + CIFS_CPHTXT_SIZE + 2; + + if (ses->domainName) + sz += 2 * strnlen(ses->domainName, CIFS_MAX_DOMAINNAME_LEN); + else + sz += 2; + + if (ses->user_name) + sz += 2 * strnlen(ses->user_name, CIFS_MAX_USERNAME_LEN); + else + sz += 2; + + return sz; +} + +int build_ntlmssp_auth_blob(unsigned char **pbuffer, u16 *buflen, struct cifs_ses *ses, const struct nls_table *nls_cp) { int rc; - AUTHENTICATE_MESSAGE *sec_blob = (AUTHENTICATE_MESSAGE *)pbuffer; + AUTHENTICATE_MESSAGE *sec_blob; __u32 flags; unsigned char *tmp; + rc = setup_ntlmv2_rsp(ses, nls_cp); + if (rc) { + cifs_dbg(VFS, "Error %d during NTLMSSP authentication\n", rc); + *buflen = 0; + goto setup_ntlmv2_ret; + } + *pbuffer = kmalloc(size_of_ntlmssp_blob(ses), GFP_KERNEL); + sec_blob = (AUTHENTICATE_MESSAGE *)*pbuffer; + memcpy(sec_blob->Signature, NTLMSSP_SIGNATURE, 8); sec_blob->MessageType = NtLmAuthenticate; @@ -391,7 +415,7 @@ int build_ntlmssp_auth_blob(unsigned char *pbuffer, flags |= NTLMSSP_NEGOTIATE_KEY_XCH; } - tmp = pbuffer + sizeof(AUTHENTICATE_MESSAGE); + tmp = *pbuffer + sizeof(AUTHENTICATE_MESSAGE); sec_blob->NegotiateFlags = cpu_to_le32(flags); sec_blob->LmChallengeResponse.BufferOffset = @@ -399,13 +423,9 @@ int build_ntlmssp_auth_blob(unsigned char *pbuffer, sec_blob->LmChallengeResponse.Length = 0; sec_blob->LmChallengeResponse.MaximumLength = 0; - sec_blob->NtChallengeResponse.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->NtChallengeResponse.BufferOffset = + cpu_to_le32(tmp - *pbuffer); if (ses->user_name != NULL) { - rc = setup_ntlmv2_rsp(ses, nls_cp); - if (rc) { - cifs_dbg(VFS, "Error %d during NTLMSSP authentication\n", rc); - goto setup_ntlmv2_ret; - } memcpy(tmp, ses->auth_key.response + CIFS_SESS_KEY_SIZE, ses->auth_key.len - CIFS_SESS_KEY_SIZE); tmp += ses->auth_key.len - CIFS_SESS_KEY_SIZE; @@ -423,7 +443,7 @@ int build_ntlmssp_auth_blob(unsigned char *pbuffer, } if (ses->domainName == NULL) { - sec_blob->DomainName.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->DomainName.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->DomainName.Length = 0; sec_blob->DomainName.MaximumLength = 0; tmp += 2; @@ -432,14 +452,14 @@ int build_ntlmssp_auth_blob(unsigned char *pbuffer, len = cifs_strtoUTF16((__le16 *)tmp, ses->domainName, CIFS_MAX_USERNAME_LEN, nls_cp); len *= 2; /* unicode is 2 bytes each */ - sec_blob->DomainName.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->DomainName.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->DomainName.Length = cpu_to_le16(len); sec_blob->DomainName.MaximumLength = cpu_to_le16(len); tmp += len; } if (ses->user_name == NULL) { - sec_blob->UserName.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->UserName.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->UserName.Length = 0; sec_blob->UserName.MaximumLength = 0; tmp += 2; @@ -448,13 +468,13 @@ int build_ntlmssp_auth_blob(unsigned char *pbuffer, len = cifs_strtoUTF16((__le16 *)tmp, ses->user_name, CIFS_MAX_USERNAME_LEN, nls_cp); len *= 2; /* unicode is 2 bytes each */ - sec_blob->UserName.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->UserName.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->UserName.Length = cpu_to_le16(len); sec_blob->UserName.MaximumLength = cpu_to_le16(len); tmp += len; } - sec_blob->WorkstationName.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->WorkstationName.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->WorkstationName.Length = 0; sec_blob->WorkstationName.MaximumLength = 0; tmp += 2; @@ -463,19 +483,19 @@ int build_ntlmssp_auth_blob(unsigned char *pbuffer, (ses->ntlmssp->server_flags & NTLMSSP_NEGOTIATE_EXTENDED_SEC)) && !calc_seckey(ses)) { memcpy(tmp, ses->ntlmssp->ciphertext, CIFS_CPHTXT_SIZE); - sec_blob->SessionKey.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->SessionKey.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->SessionKey.Length = cpu_to_le16(CIFS_CPHTXT_SIZE); sec_blob->SessionKey.MaximumLength = cpu_to_le16(CIFS_CPHTXT_SIZE); tmp += CIFS_CPHTXT_SIZE; } else { - sec_blob->SessionKey.BufferOffset = cpu_to_le32(tmp - pbuffer); + sec_blob->SessionKey.BufferOffset = cpu_to_le32(tmp - *pbuffer); sec_blob->SessionKey.Length = 0; sec_blob->SessionKey.MaximumLength = 0; } + *buflen = tmp - *pbuffer; setup_ntlmv2_ret: - *buflen = tmp - pbuffer; return rc; } @@ -1266,7 +1286,7 @@ sess_auth_rawntlmssp_authenticate(struct sess_data *sess_data) struct cifs_ses *ses = sess_data->ses; __u16 bytes_remaining; char *bcc_ptr; - char *ntlmsspblob = NULL; + unsigned char *ntlmsspblob = NULL; u16 blob_len; cifs_dbg(FYI, "rawntlmssp session setup authenticate phase\n"); @@ -1279,19 +1299,7 @@ sess_auth_rawntlmssp_authenticate(struct sess_data *sess_data) /* Build security blob before we assemble the request */ pSMB = (SESSION_SETUP_ANDX *)sess_data->iov[0].iov_base; smb_buf = (struct smb_hdr *)pSMB; - /* - * 5 is an empirical value, large enough to hold - * authenticate message plus max 10 of av paris, - * domain, user, workstation names, flags, etc. - */ - ntlmsspblob = kzalloc(5*sizeof(struct _AUTHENTICATE_MESSAGE), - GFP_KERNEL); - if (!ntlmsspblob) { - rc = -ENOMEM; - goto out; - } - - rc = build_ntlmssp_auth_blob(ntlmsspblob, + rc = build_ntlmssp_auth_blob(&ntlmsspblob, &blob_len, ses, sess_data->nls_cp); if (rc) goto out_free_ntlmsspblob; diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 14e845e8996f..1a6191d688b7 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -532,7 +532,7 @@ SMB2_sess_setup(const unsigned int xid, struct cifs_ses *ses, u16 blob_length = 0; struct key *spnego_key = NULL; char *security_blob = NULL; - char *ntlmssp_blob = NULL; + unsigned char *ntlmssp_blob = NULL; bool use_spnego = false; /* else use raw ntlmssp */ cifs_dbg(FYI, "Session Setup\n"); @@ -657,13 +657,7 @@ ssetup_ntlmssp_authenticate: iov[1].iov_len = blob_length; } else if (phase == NtLmAuthenticate) { req->hdr.SessionId = ses->Suid; - ntlmssp_blob = kzalloc(sizeof(struct _NEGOTIATE_MESSAGE) + 500, - GFP_KERNEL); - if (ntlmssp_blob == NULL) { - rc = -ENOMEM; - goto ssetup_exit; - } - rc = build_ntlmssp_auth_blob(ntlmssp_blob, &blob_length, ses, + rc = build_ntlmssp_auth_blob(&ntlmssp_blob, &blob_length, ses, nls_cp); if (rc) { cifs_dbg(FYI, "build_ntlmssp_auth_blob failed %d\n", From f67b6920a0cf03d363c5f3bfb14f5d258168dc8c Mon Sep 17 00:00:00 2001 From: Scott Bauer Date: Thu, 23 Jun 2016 08:59:47 -0600 Subject: [PATCH 105/312] HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands [ Upstream commit 93a2001bdfd5376c3dc2158653034c20392d15c5 ] This patch validates the num_values parameter from userland during the HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter leading to a heap overflow. Cc: stable@vger.kernel.org Signed-off-by: Scott Bauer Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/usbhid/hiddev.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c index 2f1ddca6f2e0..700145b15088 100644 --- a/drivers/hid/usbhid/hiddev.c +++ b/drivers/hid/usbhid/hiddev.c @@ -516,13 +516,13 @@ static noinline int hiddev_ioctl_usage(struct hiddev *hiddev, unsigned int cmd, goto inval; } else if (uref->usage_index >= field->report_count) goto inval; - - else if ((cmd == HIDIOCGUSAGES || cmd == HIDIOCSUSAGES) && - (uref_multi->num_values > HID_MAX_MULTI_USAGES || - uref->usage_index + uref_multi->num_values > field->report_count)) - goto inval; } + if ((cmd == HIDIOCGUSAGES || cmd == HIDIOCSUSAGES) && + (uref_multi->num_values > HID_MAX_MULTI_USAGES || + uref->usage_index + uref_multi->num_values > field->report_count)) + goto inval; + switch (cmd) { case HIDIOCGUSAGE: uref->value = field->value[uref->usage_index]; From 65f6fab102a9da8bbaf9e9fed3a8dec76f120636 Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Wed, 22 Jul 2015 10:33:34 +0800 Subject: [PATCH 106/312] ALSA: hda - remove one pin from ALC292_STANDARD_PINS [ Upstream commit 21e9d017b88ea0baa367ef0b6516d794fa23e85e ] One more Dell laptop with alc293 codec needs ALC293_FIXUP_DELL1_MIC_NO_PRESENCE, but the pin 0x1e does not match the corresponding one in the ALC292_STANDARD_PINS. To use this macro for this machine, we need to remove pin 0x1e from it. BugLink: https://bugs.launchpad.net/bugs/1476888 Cc: Signed-off-by: Hui Wang Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/pci/hda/patch_realtek.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index a62872f7b41a..e5ed8dfb6329 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5635,8 +5635,7 @@ static const struct hda_model_fixup alc269_fixup_models[] = { {0x15, 0x0221401f}, \ {0x1a, 0x411111f0}, \ {0x1b, 0x411111f0}, \ - {0x1d, 0x40700001}, \ - {0x1e, 0x411111f0} + {0x1d, 0x40700001} #define ALC298_STANDARD_PINS \ {0x18, 0x411111f0}, \ @@ -5920,35 +5919,48 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = { {0x13, 0x411111f0}, {0x16, 0x01014020}, {0x18, 0x411111f0}, - {0x19, 0x01a19030}), + {0x19, 0x01a19030}, + {0x1e, 0x411111f0}), SND_HDA_PIN_QUIRK(0x10ec0292, 0x1028, "Dell", ALC269_FIXUP_DELL2_MIC_NO_PRESENCE, ALC292_STANDARD_PINS, {0x12, 0x90a60140}, {0x13, 0x411111f0}, {0x16, 0x01014020}, {0x18, 0x02a19031}, - {0x19, 0x01a1903e}), + {0x19, 0x01a1903e}, + {0x1e, 0x411111f0}), SND_HDA_PIN_QUIRK(0x10ec0292, 0x1028, "Dell", ALC269_FIXUP_DELL3_MIC_NO_PRESENCE, ALC292_STANDARD_PINS, {0x12, 0x90a60140}, {0x13, 0x411111f0}, {0x16, 0x411111f0}, {0x18, 0x411111f0}, - {0x19, 0x411111f0}), + {0x19, 0x411111f0}, + {0x1e, 0x411111f0}), SND_HDA_PIN_QUIRK(0x10ec0293, 0x1028, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE, ALC292_STANDARD_PINS, {0x12, 0x40000000}, {0x13, 0x90a60140}, {0x16, 0x21014020}, {0x18, 0x411111f0}, - {0x19, 0x21a19030}), + {0x19, 0x21a19030}, + {0x1e, 0x411111f0}), SND_HDA_PIN_QUIRK(0x10ec0293, 0x1028, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE, ALC292_STANDARD_PINS, {0x12, 0x40000000}, {0x13, 0x90a60140}, {0x16, 0x411111f0}, {0x18, 0x411111f0}, - {0x19, 0x411111f0}), + {0x19, 0x411111f0}, + {0x1e, 0x411111f0}), + SND_HDA_PIN_QUIRK(0x10ec0293, 0x1028, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE, + ALC292_STANDARD_PINS, + {0x12, 0x40000000}, + {0x13, 0x90a60140}, + {0x16, 0x21014020}, + {0x18, 0x411111f0}, + {0x19, 0x21a19030}, + {0x1e, 0x411111ff}), SND_HDA_PIN_QUIRK(0x10ec0298, 0x1028, "Dell", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE, ALC298_STANDARD_PINS, {0x12, 0x90a60130}, From 9bbe4e5d2bfb5a94cc40e75969df16f3cdf85927 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Fri, 24 Jun 2016 15:13:16 +0200 Subject: [PATCH 107/312] ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup [ Upstream commit 0f087ee3f3b86a4507db4ff1d2d5a3880e4cfd16 ] See: https://bugzilla.redhat.com/show_bug.cgi?id=1349539 See: https://bugzilla.kernel.org/show_bug.cgi?id=120961 Signed-off-by: Jaroslav Kysela Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/pci/hda/patch_realtek.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index e5ed8dfb6329..abf8d342f1f4 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5504,6 +5504,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x503c, "Thinkpad L450", ALC292_FIXUP_TPT440_DOCK), SND_PCI_QUIRK(0x17aa, 0x504a, "ThinkPad X260", ALC292_FIXUP_TPT440_DOCK), SND_PCI_QUIRK(0x17aa, 0x504b, "Thinkpad", ALC293_FIXUP_LENOVO_SPK_NOISE), + SND_PCI_QUIRK(0x17aa, 0x5050, "Thinkpad T560p", ALC292_FIXUP_TPT460), + SND_PCI_QUIRK(0x17aa, 0x5053, "Thinkpad T460", ALC292_FIXUP_TPT460), SND_PCI_QUIRK(0x17aa, 0x5109, "Thinkpad", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x17aa, 0x3bf8, "Quanta FL1", ALC269_FIXUP_PCM_44K), SND_PCI_QUIRK(0x17aa, 0x9e54, "LENOVO NB", ALC269_FIXUP_LENOVO_EAPD), From 1ff20a560eba527ba652502a2da1cd431e1e2fea Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 24 Jun 2016 15:15:26 +0200 Subject: [PATCH 108/312] ALSA: dummy: Fix a use-after-free at closing [ Upstream commit d5dbbe6569481bf12dcbe3e12cff72c5f78d272c ] syzkaller fuzzer spotted a potential use-after-free case in snd-dummy driver when hrtimer is used as backend: > ================================================================== > BUG: KASAN: use-after-free in rb_erase+0x1b17/0x2010 at addr ffff88005e5b6f68 > Read of size 8 by task syz-executor/8984 > ============================================================================= > BUG kmalloc-192 (Not tainted): kasan: bad access detected > ----------------------------------------------------------------------------- > > Disabling lock debugging due to kernel taint > INFO: Allocated in 0xbbbbbbbbbbbbbbbb age=18446705582212484632 > .... > [< none >] dummy_hrtimer_create+0x49/0x1a0 sound/drivers/dummy.c:464 > .... > INFO: Freed in 0xfffd8e09 age=18446705496313138713 cpu=2164287125 pid=-1 > [< none >] dummy_hrtimer_free+0x68/0x80 sound/drivers/dummy.c:481 > .... > Call Trace: > [] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:333 > [< inline >] rb_set_parent include/linux/rbtree_augmented.h:111 > [< inline >] __rb_erase_augmented include/linux/rbtree_augmented.h:218 > [] rb_erase+0x1b17/0x2010 lib/rbtree.c:427 > [] timerqueue_del+0x78/0x170 lib/timerqueue.c:86 > [] __remove_hrtimer+0x90/0x220 kernel/time/hrtimer.c:903 > [< inline >] remove_hrtimer kernel/time/hrtimer.c:945 > [] hrtimer_try_to_cancel+0x22a/0x570 kernel/time/hrtimer.c:1046 > [] hrtimer_cancel+0x22/0x40 kernel/time/hrtimer.c:1066 > [] dummy_hrtimer_stop+0x91/0xb0 sound/drivers/dummy.c:417 > [] dummy_pcm_trigger+0x17f/0x1e0 sound/drivers/dummy.c:507 > [] snd_pcm_do_stop+0x160/0x1b0 sound/core/pcm_native.c:1106 > [] snd_pcm_action_single+0x76/0x120 sound/core/pcm_native.c:956 > [] snd_pcm_action+0x231/0x290 sound/core/pcm_native.c:974 > [< inline >] snd_pcm_stop sound/core/pcm_native.c:1139 > [] snd_pcm_drop+0x12d/0x1d0 sound/core/pcm_native.c:1784 > [] snd_pcm_common_ioctl1+0xfae/0x2150 sound/core/pcm_native.c:2805 > [] snd_pcm_capture_ioctl1+0x2a1/0x5e0 sound/core/pcm_native.c:2976 > [] snd_pcm_kernel_ioctl+0x11c/0x160 sound/core/pcm_native.c:3020 > [] snd_pcm_oss_sync+0x3a4/0xa30 sound/core/oss/pcm_oss.c:1693 > [] snd_pcm_oss_release+0x1ad/0x280 sound/core/oss/pcm_oss.c:2483 > ..... A workaround is to call hrtimer_cancel() in dummy_hrtimer_sync() which is called certainly before other blocking ops. Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/drivers/dummy.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c index c5d5217a4180..4df5dc1a3765 100644 --- a/sound/drivers/dummy.c +++ b/sound/drivers/dummy.c @@ -420,6 +420,7 @@ static int dummy_hrtimer_stop(struct snd_pcm_substream *substream) static inline void dummy_hrtimer_sync(struct dummy_hrtimer_pcm *dpcm) { + hrtimer_cancel(&dpcm->timer); tasklet_kill(&dpcm->tasklet); } From 94d06a437eb0ad1a492ed75f929d82715b6c05f8 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Mon, 3 Aug 2015 17:38:33 -0400 Subject: [PATCH 109/312] pNFS: Tighten up locking around DS commit buckets [ Upstream commit 27571297a7e9a2a845c232813a7ba7e1227f5ec6 ] I'm not aware of any bugreports around this issue, but the locking around the pnfs_commit_bucket is inconsistent at best. This patch tightens it up by ensuring that the 'bucket->committing' list is always changed atomically w.r.t. the 'bucket->clseg' layout segment tracking. Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin --- fs/nfs/pnfs_nfs.c | 50 ++++++++++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 18 deletions(-) diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index 1705c78ee2d8..b28214c30e6d 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -124,11 +124,12 @@ pnfs_generic_scan_ds_commit_list(struct pnfs_commit_bucket *bucket, if (ret) { cinfo->ds->nwritten -= ret; cinfo->ds->ncommitting += ret; - bucket->clseg = bucket->wlseg; - if (list_empty(src)) + if (bucket->clseg == NULL) + bucket->clseg = pnfs_get_lseg(bucket->wlseg); + if (list_empty(src)) { + pnfs_put_lseg_locked(bucket->wlseg); bucket->wlseg = NULL; - else - pnfs_get_lseg(bucket->clseg); + } } return ret; } @@ -182,19 +183,23 @@ static void pnfs_generic_retry_commit(struct nfs_commit_info *cinfo, int idx) struct pnfs_ds_commit_info *fl_cinfo = cinfo->ds; struct pnfs_commit_bucket *bucket; struct pnfs_layout_segment *freeme; + LIST_HEAD(pages); int i; + spin_lock(cinfo->lock); for (i = idx; i < fl_cinfo->nbuckets; i++) { bucket = &fl_cinfo->buckets[i]; if (list_empty(&bucket->committing)) continue; - nfs_retry_commit(&bucket->committing, bucket->clseg, cinfo, i); - spin_lock(cinfo->lock); freeme = bucket->clseg; bucket->clseg = NULL; + list_splice_init(&bucket->committing, &pages); spin_unlock(cinfo->lock); + nfs_retry_commit(&pages, freeme, cinfo, i); pnfs_put_lseg(freeme); + spin_lock(cinfo->lock); } + spin_unlock(cinfo->lock); } static unsigned int @@ -216,10 +221,6 @@ pnfs_generic_alloc_ds_commits(struct nfs_commit_info *cinfo, if (!data) break; data->ds_commit_index = i; - spin_lock(cinfo->lock); - data->lseg = bucket->clseg; - bucket->clseg = NULL; - spin_unlock(cinfo->lock); list_add(&data->pages, list); nreq++; } @@ -229,6 +230,22 @@ pnfs_generic_alloc_ds_commits(struct nfs_commit_info *cinfo, return nreq; } +static inline +void pnfs_fetch_commit_bucket_list(struct list_head *pages, + struct nfs_commit_data *data, + struct nfs_commit_info *cinfo) +{ + struct pnfs_commit_bucket *bucket; + + bucket = &cinfo->ds->buckets[data->ds_commit_index]; + spin_lock(cinfo->lock); + list_splice_init(pages, &bucket->committing); + data->lseg = bucket->clseg; + bucket->clseg = NULL; + spin_unlock(cinfo->lock); + +} + /* This follows nfs_commit_list pretty closely */ int pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, @@ -243,7 +260,7 @@ pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, if (!list_empty(mds_pages)) { data = nfs_commitdata_alloc(); if (data != NULL) { - data->lseg = NULL; + data->ds_commit_index = -1; list_add(&data->pages, &list); nreq++; } else { @@ -265,19 +282,16 @@ pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, list_for_each_entry_safe(data, tmp, &list, pages) { list_del_init(&data->pages); - if (!data->lseg) { + if (data->ds_commit_index < 0) { nfs_init_commit(data, mds_pages, NULL, cinfo); nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(data->inode), data->mds_ops, how, 0); } else { - struct pnfs_commit_bucket *buckets; + LIST_HEAD(pages); - buckets = cinfo->ds->buckets; - nfs_init_commit(data, - &buckets[data->ds_commit_index].committing, - data->lseg, - cinfo); + pnfs_fetch_commit_bucket_list(&pages, data, cinfo); + nfs_init_commit(data, &pages, data->lseg, cinfo); initiate_commit(data, how); } } From 691c507ec01fa0cab2a9cfb5bd4398ddd5480a8a Mon Sep 17 00:00:00 2001 From: Weston Andros Adamson Date: Wed, 25 May 2016 10:07:23 -0400 Subject: [PATCH 110/312] nfs: avoid race that crashes nfs_init_commit [ Upstream commit ade8febde0271513360bac44883dbebad44276c3 ] Since the patch "NFS: Allow multiple commit requests in flight per file" we can run multiple simultaneous commits on the same inode. This introduced a race over collecting pages to commit that made it possible to call nfs_init_commit() with an empty list - which causes crashes like the one below. The fix is to catch this race and avoid calling nfs_init_commit and initiate_commit when there is no work to do. Here is the crash: [600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 [600522.078475] IP: [] nfs_init_commit+0x22/0x130 [nfs] [600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0 [600522.078972] Oops: 0000 [#1] SMP [600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3 [600522.081380] vmw_pvscsi ata_generic pata_acpi [600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1 [600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 [600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000 [600522.083378] RIP: 0010:[] [] nfs_init_commit+0x22/0x130 [nfs] [600522.083973] RSP: 0018:ffff88042ae87438 EFLAGS: 00010246 [600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588 [600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40 [600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0 [600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0 [600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840 [600522.087484] FS: 00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000 [600522.088070] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0 [600522.089327] Stack: [600522.089926] 0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa [600522.090549] 0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930 [600522.091169] ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840 [600522.091789] Call Trace: [600522.092420] [] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4] [600522.093052] [] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles] [600522.093696] [] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles] [600522.094359] [] nfs_generic_commit_list+0xe8/0x120 [nfs] [600522.095032] [] nfs_commit_inode+0xba/0x110 [nfs] [600522.095719] [] nfs_release_page+0x44/0xd0 [nfs] [600522.096410] [] try_to_release_page+0x32/0x50 [600522.097109] [] shrink_page_list+0x961/0xb30 [600522.097812] [] shrink_inactive_list+0x1cd/0x550 [600522.098530] [] shrink_lruvec+0x635/0x840 [600522.099250] [] shrink_zone+0xf0/0x2f0 [600522.099974] [] do_try_to_free_pages+0x192/0x470 [600522.100709] [] try_to_free_pages+0xda/0x170 [600522.101464] [] __alloc_pages_nodemask+0x588/0x970 [600522.102235] [] alloc_pages_vma+0xb5/0x230 [600522.103000] [] ? cpumask_any_but+0x39/0x50 [600522.103774] [] wp_page_copy.isra.55+0x95/0x490 [600522.104558] [] ? __wake_up+0x48/0x60 [600522.105357] [] do_wp_page+0xab/0x4f0 [600522.106137] [] ? release_task+0x36b/0x470 [600522.106902] [] ? eventfd_ctx_read+0x67/0x1c0 [600522.107659] [] handle_mm_fault+0xc78/0x1900 [600522.108431] [] __do_page_fault+0x181/0x420 [600522.109173] [] ? __audit_syscall_exit+0x1e6/0x280 [600522.109893] [] do_page_fault+0x30/0x80 [600522.110594] [] ? syscall_trace_leave+0xc6/0x120 [600522.111288] [] page_fault+0x28/0x30 [600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce <48> 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08 [600522.113343] RIP [] nfs_init_commit+0x22/0x130 [nfs] [600522.114003] RSP [600522.114636] CR2: 0000000000000040 Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file) CC: stable@vger.kernel.org Signed-off-by: Weston Andros Adamson Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin --- fs/nfs/pnfs_nfs.c | 28 ++++++++++++++++++++++++++++ fs/nfs/write.c | 4 ++++ 2 files changed, 32 insertions(+) diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index b28214c30e6d..8bd645a1913b 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -246,6 +246,23 @@ void pnfs_fetch_commit_bucket_list(struct list_head *pages, } +/* Helper function for pnfs_generic_commit_pagelist to catch an empty + * page list. This can happen when two commits race. */ +static bool +pnfs_generic_commit_cancel_empty_pagelist(struct list_head *pages, + struct nfs_commit_data *data, + struct nfs_commit_info *cinfo) +{ + if (list_empty(pages)) { + if (atomic_dec_and_test(&cinfo->mds->rpcs_out)) + wake_up_atomic_t(&cinfo->mds->rpcs_out); + nfs_commitdata_release(data); + return true; + } + + return false; +} + /* This follows nfs_commit_list pretty closely */ int pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, @@ -283,6 +300,11 @@ pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, list_for_each_entry_safe(data, tmp, &list, pages) { list_del_init(&data->pages); if (data->ds_commit_index < 0) { + /* another commit raced with us */ + if (pnfs_generic_commit_cancel_empty_pagelist(mds_pages, + data, cinfo)) + continue; + nfs_init_commit(data, mds_pages, NULL, cinfo); nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(data->inode), @@ -291,6 +313,12 @@ pnfs_generic_commit_pagelist(struct inode *inode, struct list_head *mds_pages, LIST_HEAD(pages); pnfs_fetch_commit_bucket_list(&pages, data, cinfo); + + /* another commit raced with us */ + if (pnfs_generic_commit_cancel_empty_pagelist(&pages, + data, cinfo)) + continue; + nfs_init_commit(data, &pages, data->lseg, cinfo); initiate_commit(data, how); } diff --git a/fs/nfs/write.c b/fs/nfs/write.c index d9851a6a2813..f98cd9adbc0d 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1653,6 +1653,10 @@ nfs_commit_list(struct inode *inode, struct list_head *head, int how, { struct nfs_commit_data *data; + /* another commit raced with us */ + if (list_empty(head)) + return 0; + data = nfs_commitdata_alloc(); if (!data) From eba391c749fe8a47aea9de2e78fadc02434b5417 Mon Sep 17 00:00:00 2001 From: Weston Andros Adamson Date: Fri, 17 Jun 2016 16:48:24 -0400 Subject: [PATCH 111/312] pnfs_nfs: fix _cancel_empty_pagelist [ Upstream commit 5e3a98883e7ebdd1440f829a9e9dd5c3d2c5903b ] pnfs_generic_commit_cancel_empty_pagelist calls nfs_commitdata_release, but that is wrong: nfs_commitdata_release puts the open context, something that isn't valid until nfs_init_commit is called, which is never the case when pnfs_generic_commit_cancel_empty_pagelist is called. This was introduced in "nfs: avoid race that crashes nfs_init_commit". Signed-off-by: Weston Andros Adamson Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin --- fs/nfs/pnfs_nfs.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c index 8bd645a1913b..19c1bcf70e3e 100644 --- a/fs/nfs/pnfs_nfs.c +++ b/fs/nfs/pnfs_nfs.c @@ -247,7 +247,11 @@ void pnfs_fetch_commit_bucket_list(struct list_head *pages, } /* Helper function for pnfs_generic_commit_pagelist to catch an empty - * page list. This can happen when two commits race. */ + * page list. This can happen when two commits race. + * + * This must be called instead of nfs_init_commit - call one or the other, but + * not both! + */ static bool pnfs_generic_commit_cancel_empty_pagelist(struct list_head *pages, struct nfs_commit_data *data, @@ -256,7 +260,11 @@ pnfs_generic_commit_cancel_empty_pagelist(struct list_head *pages, if (list_empty(pages)) { if (atomic_dec_and_test(&cinfo->mds->rpcs_out)) wake_up_atomic_t(&cinfo->mds->rpcs_out); - nfs_commitdata_release(data); + /* don't call nfs_commitdata_release - it tries to put + * the open_context which is not acquired until nfs_init_commit + * which has not been called on @data */ + WARN_ON_ONCE(data->context); + nfs_commit_free(data); return true; } From e20c888e2b3576e5f498c167729d274ef60b86f8 Mon Sep 17 00:00:00 2001 From: Steve French Date: Wed, 22 Jun 2016 20:12:05 -0500 Subject: [PATCH 112/312] Fix reconnect to not defer smb3 session reconnect long after socket reconnect [ Upstream commit 4fcd1813e6404dd4420c7d12fb483f9320f0bf93 ] Azure server blocks clients that open a socket and don't do anything on it. In our reconnect scenarios, we can reconnect the tcp session and detect the socket is available but we defer the negprot and SMB3 session setup and tree connect reconnection until the next i/o is requested, but this looks suspicous to some servers who expect SMB3 negprog and session setup soon after a socket is created. In the echo thread, reconnect SMB3 sessions and tree connections that are disconnected. A later patch will replay persistent (and resilient) handle opens. CC: Stable Signed-off-by: Steve French Acked-by: Pavel Shilovsky Signed-off-by: Sasha Levin --- fs/cifs/connect.c | 4 +++- fs/cifs/smb2pdu.c | 27 +++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index de626b939811..17998d19b166 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -414,7 +414,9 @@ cifs_echo_request(struct work_struct *work) * server->ops->need_neg() == true. Also, no need to ping if * we got a response recently. */ - if (!server->ops->need_neg || server->ops->need_neg(server) || + + if (server->tcpStatus == CifsNeedReconnect || + server->tcpStatus == CifsExiting || server->tcpStatus == CifsNew || (server->ops->can_echo && !server->ops->can_echo(server)) || time_before(jiffies, server->lstrp + SMB_ECHO_INTERVAL - HZ)) goto requeue_echo; diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 1a6191d688b7..8f527c867f78 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -1626,6 +1626,33 @@ SMB2_echo(struct TCP_Server_Info *server) cifs_dbg(FYI, "In echo request\n"); + if (server->tcpStatus == CifsNeedNegotiate) { + struct list_head *tmp, *tmp2; + struct cifs_ses *ses; + struct cifs_tcon *tcon; + + cifs_dbg(FYI, "Need negotiate, reconnecting tcons\n"); + spin_lock(&cifs_tcp_ses_lock); + list_for_each(tmp, &server->smb_ses_list) { + ses = list_entry(tmp, struct cifs_ses, smb_ses_list); + list_for_each(tmp2, &ses->tcon_list) { + tcon = list_entry(tmp2, struct cifs_tcon, + tcon_list); + /* add check for persistent handle reconnect */ + if (tcon && tcon->need_reconnect) { + spin_unlock(&cifs_tcp_ses_lock); + rc = smb2_reconnect(SMB2_ECHO, tcon); + spin_lock(&cifs_tcp_ses_lock); + } + } + } + spin_unlock(&cifs_tcp_ses_lock); + } + + /* if no session, renegotiate failed above */ + if (server->tcpStatus == CifsNeedNegotiate) + return -EIO; + rc = small_smb2_init(SMB2_ECHO, NULL, (void **)&req); if (rc) return rc; From 655d8d195ad10e5bf3252eb7822136303d7c65fe Mon Sep 17 00:00:00 2001 From: Steve French Date: Wed, 22 Jun 2016 21:07:32 -0500 Subject: [PATCH 113/312] File names with trailing period or space need special case conversion [ Upstream commit 45e8a2583d97ca758a55c608f78c4cef562644d1 ] POSIX allows files with trailing spaces or a trailing period but SMB3 does not, so convert these using the normal Services For Mac mapping as we do for other reserved characters such as : < > | ? * This is similar to what Macs do for the same problem over SMB3. CC: Stable Signed-off-by: Steve French Acked-by: Pavel Shilovsky Signed-off-by: Sasha Levin --- fs/cifs/cifs_unicode.c | 33 +++++++++++++++++++++++++++++---- fs/cifs/cifs_unicode.h | 2 ++ 2 files changed, 31 insertions(+), 4 deletions(-) diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c index 5a53ac6b1e02..02b071bf3732 100644 --- a/fs/cifs/cifs_unicode.c +++ b/fs/cifs/cifs_unicode.c @@ -101,6 +101,12 @@ convert_sfm_char(const __u16 src_char, char *target) case SFM_SLASH: *target = '\\'; break; + case SFM_SPACE: + *target = ' '; + break; + case SFM_PERIOD: + *target = '.'; + break; default: return false; } @@ -404,7 +410,7 @@ static __le16 convert_to_sfu_char(char src_char) return dest_char; } -static __le16 convert_to_sfm_char(char src_char) +static __le16 convert_to_sfm_char(char src_char, bool end_of_string) { __le16 dest_char; @@ -427,6 +433,18 @@ static __le16 convert_to_sfm_char(char src_char) case '|': dest_char = cpu_to_le16(SFM_PIPE); break; + case '.': + if (end_of_string) + dest_char = cpu_to_le16(SFM_PERIOD); + else + dest_char = 0; + break; + case ' ': + if (end_of_string) + dest_char = cpu_to_le16(SFM_SPACE); + else + dest_char = 0; + break; default: dest_char = 0; } @@ -469,9 +487,16 @@ cifsConvertToUTF16(__le16 *target, const char *source, int srclen, /* see if we must remap this char */ if (map_chars == SFU_MAP_UNI_RSVD) dst_char = convert_to_sfu_char(src_char); - else if (map_chars == SFM_MAP_UNI_RSVD) - dst_char = convert_to_sfm_char(src_char); - else + else if (map_chars == SFM_MAP_UNI_RSVD) { + bool end_of_string; + + if (i == srclen - 1) + end_of_string = true; + else + end_of_string = false; + + dst_char = convert_to_sfm_char(src_char, end_of_string); + } else dst_char = 0; /* * FIXME: We can not handle remapping backslash (UNI_SLASH) diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h index bdc52cb9a676..479bc0a941f3 100644 --- a/fs/cifs/cifs_unicode.h +++ b/fs/cifs/cifs_unicode.h @@ -64,6 +64,8 @@ #define SFM_LESSTHAN ((__u16) 0xF023) #define SFM_PIPE ((__u16) 0xF027) #define SFM_SLASH ((__u16) 0xF026) +#define SFM_PERIOD ((__u16) 0xF028) +#define SFM_SPACE ((__u16) 0xF029) /* * Mapping mechanism to use when one of the seven reserved characters is From 7f3724b8951735ef1d5ae4f2846b8af98a665d73 Mon Sep 17 00:00:00 2001 From: Alan Stern Date: Thu, 23 Jun 2016 14:54:37 -0400 Subject: [PATCH 114/312] USB: EHCI: declare hostpc register as zero-length array [ Upstream commit 7e8b3dfef16375dbfeb1f36a83eb9f27117c51fd ] The HOSTPC extension registers found in some EHCI implementations form a variable-length array, with one element for each port. Therefore the hostpc field in struct ehci_regs should be declared as a zero-length array, not a single-element array. This fixes a problem reported by UBSAN. Signed-off-by: Alan Stern Reported-by: Wilfried Klaebe Tested-by: Wilfried Klaebe CC: Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- include/linux/usb/ehci_def.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/usb/ehci_def.h b/include/linux/usb/ehci_def.h index 966889a20ea3..e479033bd782 100644 --- a/include/linux/usb/ehci_def.h +++ b/include/linux/usb/ehci_def.h @@ -180,11 +180,11 @@ struct ehci_regs { * PORTSCx */ /* HOSTPC: offset 0x84 */ - u32 hostpc[1]; /* HOSTPC extension */ + u32 hostpc[0]; /* HOSTPC extension */ #define HOSTPC_PHCD (1<<22) /* Phy clock disable */ #define HOSTPC_PSPD (3<<25) /* Port speed detection */ - u32 reserved5[16]; + u32 reserved5[17]; /* USBMODE_EX: offset 0xc8 */ u32 usbmode_ex; /* USB Device mode extension */ From c5bcec6cbcbf520f088dc7939934bbf10c20c5a5 Mon Sep 17 00:00:00 2001 From: Anthony Romano Date: Fri, 24 Jun 2016 14:48:43 -0700 Subject: [PATCH 115/312] tmpfs: don't undo fallocate past its last page [ Upstream commit b9b4bb26af017dbe930cd4df7f9b2fc3a0497bfe ] When fallocate is interrupted it will undo a range that extends one byte past its range of allocated pages. This can corrupt an in-use page by zeroing out its first byte. Instead, undo using the inclusive byte range. Fixes: 1635f6a74152f1d ("tmpfs: undo fallocation on failure") Link: http://lkml.kernel.org/r/1462713387-16724-1-git-send-email-anthony.romano@coreos.com Signed-off-by: Anthony Romano Cc: Vlastimil Babka Cc: Hugh Dickins Cc: Brandon Philips Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/shmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/shmem.c b/mm/shmem.c index 47d536e59fc0..86c150bf8235 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2139,7 +2139,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, /* Remove the !PageUptodate pages we added */ shmem_undo_range(inode, (loff_t)start << PAGE_CACHE_SHIFT, - (loff_t)index << PAGE_CACHE_SHIFT, true); + ((loff_t)index << PAGE_CACHE_SHIFT) - 1, true); goto undone; } From c5ad33184354260be6d05de57e46a5498692f6d6 Mon Sep 17 00:00:00 2001 From: Lukasz Odzioba Date: Fri, 24 Jun 2016 14:50:01 -0700 Subject: [PATCH 116/312] mm/swap.c: flush lru pvecs on compound page arrival [ Upstream commit 8f182270dfec432e93fae14f9208a6b9af01009f ] Currently we can have compound pages held on per cpu pagevecs, which leads to a lot of memory unavailable for reclaim when needed. In the systems with hundreads of processors it can be GBs of memory. On of the way of reproducing the problem is to not call munmap explicitly on all mapped regions (i.e. after receiving SIGTERM). After that some pages (with THP enabled also huge pages) may end up on lru_add_pvec, example below. void main() { #pragma omp parallel { size_t size = 55 * 1000 * 1000; // smaller than MEM/CPUS void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS , -1, 0); if (p != MAP_FAILED) memset(p, 0, size); //munmap(p, size); // uncomment to make the problem go away } } When we run it with THP enabled it will leave significant amount of memory on lru_add_pvec. This memory will be not reclaimed if we hit OOM, so when we run above program in a loop: for i in `seq 100`; do ./a.out; done many processes (95% in my case) will be killed by OOM. The primary point of the LRU add cache is to save the zone lru_lock contention with a hope that more pages will belong to the same zone and so their addition can be batched. The huge page is already a form of batched addition (it will add 512 worth of memory in one go) so skipping the batching seems like a safer option when compared to a potential excess in the caching which can be quite large and much harder to fix because lru_add_drain_all is way to expensive and it is not really clear what would be a good moment to call it. Similarly we can reproduce the problem on lru_deactivate_pvec by adding: madvise(p, size, MADV_FREE); after memset. This patch flushes lru pvecs on compound page arrival making the problem less severe - after applying it kill rate of above example drops to 0%, due to reducing maximum amount of memory held on pvec from 28MB (with THP) to 56kB per CPU. Suggested-by: Michal Hocko Link: http://lkml.kernel.org/r/1466180198-18854-1-git-send-email-lukasz.odzioba@intel.com Signed-off-by: Lukasz Odzioba Acked-by: Michal Hocko Cc: Kirill Shutemov Cc: Andrea Arcangeli Cc: Vladimir Davydov Cc: Ming Li Cc: Minchan Kim Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/swap.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/mm/swap.c b/mm/swap.c index a7251a8ed532..b523f0a4cbfb 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -483,7 +483,7 @@ void rotate_reclaimable_page(struct page *page) page_cache_get(page); local_irq_save(flags); pvec = this_cpu_ptr(&lru_rotate_pvecs); - if (!pagevec_add(pvec, page)) + if (!pagevec_add(pvec, page) || PageCompound(page)) pagevec_move_tail(pvec); local_irq_restore(flags); } @@ -539,7 +539,7 @@ void activate_page(struct page *page) struct pagevec *pvec = &get_cpu_var(activate_page_pvecs); page_cache_get(page); - if (!pagevec_add(pvec, page)) + if (!pagevec_add(pvec, page) || PageCompound(page)) pagevec_lru_move_fn(pvec, __activate_page, NULL); put_cpu_var(activate_page_pvecs); } @@ -631,9 +631,8 @@ static void __lru_cache_add(struct page *page) struct pagevec *pvec = &get_cpu_var(lru_add_pvec); page_cache_get(page); - if (!pagevec_space(pvec)) + if (!pagevec_space(pvec) || PageCompound(page)) __pagevec_lru_add(pvec); - pagevec_add(pvec, page); put_cpu_var(lru_add_pvec); } @@ -846,7 +845,7 @@ void deactivate_file_page(struct page *page) if (likely(get_page_unless_zero(page))) { struct pagevec *pvec = &get_cpu_var(lru_deactivate_file_pvecs); - if (!pagevec_add(pvec, page)) + if (!pagevec_add(pvec, page) || PageCompound(page)) pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL); put_cpu_var(lru_deactivate_file_pvecs); } From 683854270f84daa09baffe2b21d64ec88c614fa9 Mon Sep 17 00:00:00 2001 From: Vlastimil Babka Date: Tue, 8 Sep 2015 15:02:49 -0700 Subject: [PATCH 117/312] mm, compaction: skip compound pages by order in free scanner [ Upstream commit 9fcd6d2e052eef525e94a9ae58dbe7ed4df4f5a7 ] The compaction free scanner is looking for PageBuddy() pages and skipping all others. For large compound pages such as THP or hugetlbfs, we can save a lot of iterations if we skip them at once using their compound_order(). This is generally unsafe and we can read a bogus value of order due to a race, but if we are careful, the only danger is skipping too much. When tested with stress-highalloc from mmtests on 4GB system with 1GB hugetlbfs pages, the vmstat compact_free_scanned count decreased by at least 15%. Signed-off-by: Vlastimil Babka Cc: Minchan Kim Cc: Mel Gorman Acked-by: Joonsoo Kim Acked-by: Michal Nazarewicz Cc: Naoya Horiguchi Cc: Christoph Lameter Cc: Rik van Riel Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/compaction.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/mm/compaction.c b/mm/compaction.c index 3dcf93cd622b..ac3ebfd69f4d 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -431,6 +431,24 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, if (!valid_page) valid_page = page; + + /* + * For compound pages such as THP and hugetlbfs, we can save + * potentially a lot of iterations if we skip them at once. + * The check is racy, but we can consider only valid values + * and the only danger is skipping too much. + */ + if (PageCompound(page)) { + unsigned int comp_order = compound_order(page); + + if (likely(comp_order < MAX_ORDER)) { + blockpfn += (1UL << comp_order) - 1; + cursor += (1UL << comp_order) - 1; + } + + goto isolate_fail; + } + if (!PageBuddy(page)) goto isolate_fail; @@ -490,6 +508,13 @@ isolate_fail: } + /* + * There is a tiny chance that we have read bogus compound_order(), + * so be careful to not go outside of the pageblock. + */ + if (unlikely(blockpfn > end_pfn)) + blockpfn = end_pfn; + trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn, nr_scanned, total_isolated); From 284f69fb49e2e385203f52441b324b9a68461d6b Mon Sep 17 00:00:00 2001 From: David Rientjes Date: Fri, 24 Jun 2016 14:50:10 -0700 Subject: [PATCH 118/312] mm, compaction: abort free scanner if split fails [ Upstream commit a4f04f2c6955aff5e2c08dcb40aca247ff4d7370 ] If the memory compaction free scanner cannot successfully split a free page (only possible due to per-zone low watermark), terminate the free scanner rather than continuing to scan memory needlessly. If the watermark is insufficient for a free page of order <= cc->order, then terminate the scanner since all future splits will also likely fail. This prevents the compaction freeing scanner from scanning all memory on very large zones (very noticeable for zones > 128GB, for instance) when all splits will likely fail while holding zone->lock. compaction_alloc() iterating a 128GB zone has been benchmarked to take over 400ms on some systems whereas any free page isolated and ready to be split ends up failing in split_free_page() because of the low watermark check and thus the iteration continues. The next time compaction occurs, the freeing scanner will likely start at the end of the zone again since no success was made previously and we get the same lengthy iteration until the zone is brought above the low watermark. All thp page faults can take >400ms in such a state without this fix. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211820350.97086@chino.kir.corp.google.com Signed-off-by: David Rientjes Acked-by: Vlastimil Babka Cc: Minchan Kim Cc: Joonsoo Kim Cc: Mel Gorman Cc: Hugh Dickins Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/compaction.c | 39 +++++++++++++++++++++------------------ 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index ac3ebfd69f4d..32c719a4bc3d 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -480,25 +480,23 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, /* Found a free page, break it into order-0 pages */ isolated = split_free_page(page); + if (!isolated) + break; + total_isolated += isolated; + cc->nr_freepages += isolated; for (i = 0; i < isolated; i++) { list_add(&page->lru, freelist); page++; } - - /* If a page was split, advance to the end of it */ - if (isolated) { - cc->nr_freepages += isolated; - if (!strict && - cc->nr_migratepages <= cc->nr_freepages) { - blockpfn += isolated; - break; - } - - blockpfn += isolated - 1; - cursor += isolated - 1; - continue; + if (!strict && cc->nr_migratepages <= cc->nr_freepages) { + blockpfn += isolated; + break; } + /* Advance to the end of split page */ + blockpfn += isolated - 1; + cursor += isolated - 1; + continue; isolate_fail: if (strict) @@ -508,6 +506,9 @@ isolate_fail: } + if (locked) + spin_unlock_irqrestore(&cc->zone->lock, flags); + /* * There is a tiny chance that we have read bogus compound_order(), * so be careful to not go outside of the pageblock. @@ -529,9 +530,6 @@ isolate_fail: if (strict && blockpfn < end_pfn) total_isolated = 0; - if (locked) - spin_unlock_irqrestore(&cc->zone->lock, flags); - /* Update the pageblock-skip if the whole pageblock was scanned */ if (blockpfn == end_pfn) update_pageblock_skip(cc, valid_page, total_isolated, false); @@ -955,6 +953,7 @@ static void isolate_freepages(struct compact_control *cc) block_end_pfn = block_start_pfn, block_start_pfn -= pageblock_nr_pages, isolate_start_pfn = block_start_pfn) { + unsigned long isolated; /* * This can iterate a massively long zone without finding any @@ -979,8 +978,12 @@ static void isolate_freepages(struct compact_control *cc) continue; /* Found a block suitable for isolating free pages from. */ - isolate_freepages_block(cc, &isolate_start_pfn, - block_end_pfn, freelist, false); + isolated = isolate_freepages_block(cc, &isolate_start_pfn, + block_end_pfn, freelist, false); + /* If isolation failed early, do not continue needlessly */ + if (!isolated && isolate_start_pfn < block_end_pfn && + cc->nr_migratepages > cc->nr_freepages) + break; /* * Remember where the free scanner should restart next time, From cc6fd729b8a04fbb4b88e45209c1241dd89a3fbe Mon Sep 17 00:00:00 2001 From: Torsten Hilbrich Date: Fri, 24 Jun 2016 14:50:18 -0700 Subject: [PATCH 119/312] fs/nilfs2: fix potential underflow in call to crc32_le [ Upstream commit 63d2f95d63396059200c391ca87161897b99e74a ] The value `bytes' comes from the filesystem which is about to be mounted. We cannot trust that the value is always in the range we expect it to be. Check its value before using it to calculate the length for the crc32_le call. It value must be larger (or equal) sumoff + 4. This fixes a kernel bug when accidentially mounting an image file which had the nilfs2 magic value 0x3434 at the right offset 0x406 by chance. The bytes 0x01 0x00 were stored at 0x408 and were interpreted as a s_bytes value of 1. This caused an underflow when substracting sumoff + 4 (20) in the call to crc32_le. BUG: unable to handle kernel paging request at ffff88021e600000 IP: crc32_le+0x36/0x100 ... Call Trace: nilfs_valid_sb.part.5+0x52/0x60 [nilfs2] nilfs_load_super_block+0x142/0x300 [nilfs2] init_nilfs+0x60/0x390 [nilfs2] nilfs_mount+0x302/0x520 [nilfs2] mount_fs+0x38/0x160 vfs_kern_mount+0x67/0x110 do_mount+0x269/0xe00 SyS_mount+0x9f/0x100 entry_SYSCALL_64_fastpath+0x16/0x71 Link: http://lkml.kernel.org/r/1466778587-5184-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp Signed-off-by: Torsten Hilbrich Tested-by: Torsten Hilbrich Signed-off-by: Ryusuke Konishi Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- fs/nilfs2/the_nilfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c index 69bd801afb53..37e49cb2ac4c 100644 --- a/fs/nilfs2/the_nilfs.c +++ b/fs/nilfs2/the_nilfs.c @@ -443,7 +443,7 @@ static int nilfs_valid_sb(struct nilfs_super_block *sbp) if (!sbp || le16_to_cpu(sbp->s_magic) != NILFS_SUPER_MAGIC) return 0; bytes = le16_to_cpu(sbp->s_bytes); - if (bytes > BLOCK_SIZE) + if (bytes < sumoff + 4 || bytes > BLOCK_SIZE) return 0; crc = crc32_le(le32_to_cpu(sbp->s_crc_seed), (unsigned char *)sbp, sumoff); From 848be4770beb10fcc6f971c58e80aa2c2b6dad66 Mon Sep 17 00:00:00 2001 From: Cyril Bur Date: Fri, 17 Jun 2016 14:58:34 +1000 Subject: [PATCH 120/312] powerpc/tm: Always reclaim in start_thread() for exec() class syscalls [ Upstream commit 8e96a87c5431c256feb65bcfc5aec92d9f7839b6 ] Userspace can quite legitimately perform an exec() syscall with a suspended transaction. exec() does not return to the old process, rather it load a new one and starts that, the expectation therefore is that the new process starts not in a transaction. Currently exec() is not treated any differently to any other syscall which creates problems. Firstly it could allow a new process to start with a suspended transaction for a binary that no longer exists. This means that the checkpointed state won't be valid and if the suspended transaction were ever to be resumed and subsequently aborted (a possibility which is exceedingly likely as exec()ing will likely doom the transaction) the new process will jump to invalid state. Secondly the incorrect attempt to keep the transactional state while still zeroing state for the new process creates at least two TM Bad Things. The first triggers on the rfid to return to userspace as start_thread() has given the new process a 'clean' MSR but the suspend will still be set in the hardware MSR. The second TM Bad Thing triggers in __switch_to() as the processor is still transactionally suspended but __switch_to() wants to zero the TM sprs for the new process. This is an example of the outcome of calling exec() with a suspended transaction. Note the first 700 is likely the first TM bad thing decsribed earlier only the kernel can't report it as we've loaded userspace registers. c000000000009980 is the rfid in fast_exception_return() Bad kernel stack pointer 3fffcfa1a370 at c000000000009980 Oops: Bad kernel stack pointer, sig: 6 [#1] CPU: 0 PID: 2006 Comm: tm-execed Not tainted NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000 REGS: c00000003ffefd40 TRAP: 0700 Not tainted MSR: 8000000300201031 CR: 00000000 XER: 00000000 CFAR: c0000000000098b4 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000 NIP [c000000000009980] fast_exception_return+0xb0/0xb8 LR [0000000000000000] (null) Call Trace: Instruction dump: f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070 e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b Kernel BUG at c000000000043e80 [verbose debug info unavailable] Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033) Oops: Unrecoverable exception, sig: 6 [#2] CPU: 0 PID: 2006 Comm: tm-execed Tainted: G D task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000 NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000 REGS: c00000003ffef7e0 TRAP: 0700 Tainted: G D MSR: 8000000300201033 CR: 28002828 XER: 00000000 CFAR: c000000000015a20 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000 GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000 GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004 GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000 GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000 GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80 NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c LR [c000000000015a24] __switch_to+0x1f4/0x420 Call Trace: Instruction dump: 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020 This fixes CVE-2016-5828. Fixes: bc2a9408fa65 ("powerpc: Hook in new transactional memory code") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Cyril Bur Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/kernel/process.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index c8c8275765e7..dd023904bac5 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1240,6 +1240,16 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp) current->thread.regs = regs - 1; } +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM + /* + * Clear any transactional state, we're exec()ing. The cause is + * not important as there will never be a recheckpoint so it's not + * user visible. + */ + if (MSR_TM_SUSPENDED(mfmsr())) + tm_reclaim_current(0); +#endif + memset(regs->gpr, 0, sizeof(regs->gpr)); regs->ctr = 0; regs->link = 0; From d4b08964d00a0b99e999a2bb1ce417e54b5c607f Mon Sep 17 00:00:00 2001 From: James Morse Date: Wed, 8 Jun 2016 17:24:45 +0100 Subject: [PATCH 121/312] KVM: arm/arm64: Stop leaking vcpu pid references [ Upstream commit 591d215afcc2f94e8e2c69a63c924c044677eb31 ] kvm provides kvm_vcpu_uninit(), which amongst other things, releases the last reference to the struct pid of the task that was last running the vcpu. On arm64 built with CONFIG_DEBUG_KMEMLEAK, starting a guest with kvmtool, then killing it with SIGKILL results (after some considerable time) in: > cat /sys/kernel/debug/kmemleak > unreferenced object 0xffff80007d5ea080 (size 128): > comm "lkvm", pid 2025, jiffies 4294942645 (age 1107.776s) > hex dump (first 32 bytes): > 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [] create_object+0xfc/0x278 > [] kmemleak_alloc+0x34/0x70 > [] kmem_cache_alloc+0x16c/0x1d8 > [] alloc_pid+0x34/0x4d0 > [] copy_process.isra.6+0x79c/0x1338 > [] _do_fork+0x74/0x320 > [] SyS_clone+0x18/0x20 > [] el0_svc_naked+0x24/0x28 > [] 0xffffffffffffffff On x86 kvm_vcpu_uninit() is called on the path from kvm_arch_destroy_vm(), on arm no equivalent call is made. Add the call to kvm_arch_vcpu_free(). Signed-off-by: James Morse Fixes: 749cf76c5a36 ("KVM: ARM: Initial skeleton to compile KVM support") Cc: # 3.10+ Acked-by: Marc Zyngier Signed-off-by: Christoffer Dall Signed-off-by: Sasha Levin --- arch/arm/kvm/arm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index d6223cbcb661..5414081c0bbf 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -257,6 +257,7 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu) kvm_mmu_free_memory_caches(vcpu); kvm_timer_vcpu_terminate(vcpu); kvm_vgic_vcpu_destroy(vcpu); + kvm_vcpu_uninit(vcpu); kmem_cache_free(kvm_vcpu_cache, vcpu); } From 3f3d3526c73a85950a3d963347b0de675fb3bea5 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Mon, 20 Jun 2016 13:14:36 -0400 Subject: [PATCH 122/312] make nfs_atomic_open() call d_drop() on all ->open_context() errors. [ Upstream commit d20cb71dbf3487f24549ede1a8e2d67579b4632e ] In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code" unconditional d_drop() after the ->open_context() had been removed. It had been correct for success cases (there ->open_context() itself had been doing dcache manipulations), but not for error ones. Only one of those (ENOENT) got a compensatory d_drop() added in that commit, but in fact it should've been done for all errors. As it is, the case of O_CREAT non-exclusive open on a hashed negative dentry racing with e.g. symlink creation from another client ended up with ->open_context() getting an error and proceeding to call nfs_lookup(). On a hashed dentry, which would've instantly triggered BUG_ON() in d_materialise_unique() (or, these days, its equivalent in d_splice_alias()). Cc: stable@vger.kernel.org # v3.10+ Tested-by: Oleg Drokin Signed-off-by: Al Viro Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin --- fs/nfs/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index b2c8b31b2be7..aadb4af4a0fe 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1542,9 +1542,9 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry, err = PTR_ERR(inode); trace_nfs_atomic_open_exit(dir, ctx, open_flags, err); put_nfs_open_context(ctx); + d_drop(dentry); switch (err) { case -ENOENT: - d_drop(dentry); d_add(dentry, NULL); nfs_set_verifier(dentry, nfs_save_change_attribute(dir)); break; From 7421921581dc77ffb4017ce747f95931f74a624b Mon Sep 17 00:00:00 2001 From: Alan Stern Date: Mon, 27 Jun 2016 10:23:10 -0400 Subject: [PATCH 123/312] USB: don't free bandwidth_mutex too early [ Upstream commit ab2a4bf83902c170d29ba130a8abb5f9d90559e1 ] The USB core contains a bug that can show up when a USB-3 host controller is removed. If the primary (USB-2) hcd structure is released before the shared (USB-3) hcd, the core will try to do a double-free of the common bandwidth_mutex. The problem was described in graphical form by Chung-Geol Kim, who first reported it: ================================================= At *remove USB(3.0) Storage sequence <1> --> <5> ((Problem Case)) ================================================= VOLD ------------------------------------|------------ (uevent) ________|_________ |<1> | |dwc3_otg_sm_work | |usb_put_hcd | |peer_hcd(kref=2)| |__________________| ________|_________ |<2> | |New USB BUS #2 | | | |peer_hcd(kref=1) | | | --(Link)-bandXX_mutex| | |__________________| | ___________________ | |<3> | | |dwc3_otg_sm_work | | |usb_put_hcd | | |primary_hcd(kref=1)| | |___________________| | _________|_________ | |<4> | | |New USB BUS #1 | | |hcd_release | | |primary_hcd(kref=0)| | | | | |bandXX_mutex(free) |<- |___________________| (( VOLD )) ______|___________ |<5> | | SCSI | |usb_put_hcd | |peer_hcd(kref=0) | |*hcd_release | |bandXX_mutex(free*)|<- double free |__________________| ================================================= This happens because hcd_release() frees the bandwidth_mutex whenever it sees a primary hcd being released (which is not a very good idea in any case), but in the course of releasing the primary hcd, it changes the pointers in the shared hcd in such a way that the shared hcd will appear to be primary when it gets released. This patch fixes the problem by changing hcd_release() so that it deallocates the bandwidth_mutex only when the _last_ hcd structure referencing it is released. The patch also removes an unnecessary test, so that when an hcd is released, both the shared_hcd and primary_hcd pointers in the hcd's peer will be cleared. Signed-off-by: Alan Stern Reported-by: Chung-Geol Kim Tested-by: Chung-Geol Kim CC: Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/core/hcd.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c index e47cfcd5640c..3a49ba2910df 100644 --- a/drivers/usb/core/hcd.c +++ b/drivers/usb/core/hcd.c @@ -2522,26 +2522,23 @@ EXPORT_SYMBOL_GPL(usb_create_hcd); * Don't deallocate the bandwidth_mutex until the last shared usb_hcd is * deallocated. * - * Make sure to only deallocate the bandwidth_mutex when the primary HCD is - * freed. When hcd_release() is called for either hcd in a peer set - * invalidate the peer's ->shared_hcd and ->primary_hcd pointers to - * block new peering attempts + * Make sure to deallocate the bandwidth_mutex only when the last HCD is + * freed. When hcd_release() is called for either hcd in a peer set, + * invalidate the peer's ->shared_hcd and ->primary_hcd pointers. */ static void hcd_release(struct kref *kref) { struct usb_hcd *hcd = container_of (kref, struct usb_hcd, kref); mutex_lock(&usb_port_peer_mutex); - if (usb_hcd_is_primary_hcd(hcd)) { - kfree(hcd->address0_mutex); - kfree(hcd->bandwidth_mutex); - } if (hcd->shared_hcd) { struct usb_hcd *peer = hcd->shared_hcd; peer->shared_hcd = NULL; - if (peer->primary_hcd == hcd) - peer->primary_hcd = NULL; + peer->primary_hcd = NULL; + } else { + kfree(hcd->address0_mutex); + kfree(hcd->bandwidth_mutex); } mutex_unlock(&usb_port_peer_mutex); kfree(hcd); From 7678c949a43aefb98ac7e64d6b8b2c32ad442fc3 Mon Sep 17 00:00:00 2001 From: Vineet Gupta Date: Tue, 28 Jun 2016 09:42:25 +0530 Subject: [PATCH 124/312] ARC: unwind: ensure that .debug_frame is generated (vs. .eh_frame) [ Upstream commit f52e126cc7476196f44f3c313b7d9f0699a881fc ] With recent binutils update to support dwarf CFI pseudo-ops in gas, we now get .eh_frame vs. .debug_frame. Although the call frame info is exactly the same in both, the CIE differs, which the current kernel unwinder can't cope with. This broke both the kernel unwinder as well as loadable modules (latter because of a new unhandled relo R_ARC_32_PCREL from .rela.eh_frame in the module loader) The ideal solution would be to switch unwinder to .eh_frame. For now however we can make do by just ensureing .debug_frame is generated by removing -fasynchronous-unwind-tables .eh_frame generated with -gdwarf-2 -fasynchronous-unwind-tables .debug_frame generated with -gdwarf-2 Fixes STAR 9001058196 Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta Signed-off-by: Sasha Levin --- arch/arc/Makefile | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/arc/Makefile b/arch/arc/Makefile index 2f21e1e0ecf7..305dbdf6c944 100644 --- a/arch/arc/Makefile +++ b/arch/arc/Makefile @@ -34,7 +34,6 @@ cflags-$(atleast_gcc44) += -fsection-anchors cflags-$(CONFIG_ARC_HAS_LLSC) += -mlock cflags-$(CONFIG_ARC_HAS_SWAPE) += -mswape cflags-$(CONFIG_ARC_HAS_RTSC) += -mrtsc -cflags-$(CONFIG_ARC_DW2_UNWIND) += -fasynchronous-unwind-tables # By default gcc 4.8 generates dwarf4 which kernel unwinder can't grok ifeq ($(atleast_gcc48),y) From 49cacd2b68d3dfc88bbf5cf16c885138ba947561 Mon Sep 17 00:00:00 2001 From: Alexey Brodkin Date: Thu, 23 Jun 2016 11:00:39 +0300 Subject: [PATCH 125/312] arc: unwind: warn only once if DW2_UNWIND is disabled [ Upstream commit 9bd54517ee86cb164c734f72ea95aeba4804f10b ] If CONFIG_ARC_DW2_UNWIND is disabled every time arc_unwind_core() gets called following message gets printed in debug console: ----------------->8--------------- CONFIG_ARC_DW2_UNWIND needs to be enabled ----------------->8--------------- That message makes sense if user indeed wants to see a backtrace or get nice function call-graphs in perf but what if user disabled unwinder for the purpose? Why pollute his debug console? So instead we'll warn user about possibly missing feature once and let him decide if that was what he or she really wanted. Signed-off-by: Alexey Brodkin Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta Signed-off-by: Sasha Levin --- arch/arc/kernel/stacktrace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arc/kernel/stacktrace.c b/arch/arc/kernel/stacktrace.c index 92320d6f737c..5086cc767c0b 100644 --- a/arch/arc/kernel/stacktrace.c +++ b/arch/arc/kernel/stacktrace.c @@ -144,7 +144,7 @@ arc_unwind_core(struct task_struct *tsk, struct pt_regs *regs, * prelogue is setup (callee regs saved and then fp set and not other * way around */ - pr_warn("CONFIG_ARC_DW2_UNWIND needs to be enabled\n"); + pr_warn_once("CONFIG_ARC_DW2_UNWIND needs to be enabled\n"); return 0; #endif From 2bbd6a57310cd7dd8fd4c484e629bb5389b78d9b Mon Sep 17 00:00:00 2001 From: Michael Holzheu Date: Mon, 13 Jun 2016 17:03:48 +0200 Subject: [PATCH 126/312] Revert "s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL" [ Upstream commit 5419447e2142d6ed68c9f5c1a28630b3a290a845 ] This reverts commit 852ffd0f4e23248b47531058e531066a988434b5. There are use cases where an intermediate boot kernel (1) uses kexec to boot the final production kernel (2). For this scenario we should provide the original boot information to the production kernel (2). Therefore clearing the boot information during kexec() should not be done. Cc: stable@vger.kernel.org # v3.17+ Reported-by: Steffen Maier Signed-off-by: Michael Holzheu Reviewed-by: Heiko Carstens Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin --- arch/s390/kernel/ipl.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c index 52fbef91d1d9..7963c6aa1196 100644 --- a/arch/s390/kernel/ipl.c +++ b/arch/s390/kernel/ipl.c @@ -2044,13 +2044,6 @@ void s390_reset_system(void (*fn_pre)(void), S390_lowcore.program_new_psw.addr = PSW_ADDR_AMODE | (unsigned long) s390_base_pgm_handler; - /* - * Clear subchannel ID and number to signal new kernel that no CCW or - * SCSI IPL has been done (for kexec and kdump) - */ - S390_lowcore.subchannel_id = 0; - S390_lowcore.subchannel_nr = 0; - /* Store status at absolute zero */ store_status(); From 4a883b820416683360258a08becad92d99f7fab1 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sat, 25 Jun 2016 19:19:28 -0400 Subject: [PATCH 127/312] NFS: Fix another OPEN_DOWNGRADE bug [ Upstream commit e547f2628327fec6afd2e03b46f113f614cca05b ] Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE. fd0 = open(foo, RDRW) -- should be open on the wire for "both" fd1 = open(foo, RDONLY) -- should be open on the wire for "read" close(fd0) -- should trigger an open_downgrade read(fd1) close(fd1) The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia Fixes: cd9288ffaea4 ("NFSv4: Fix another bug in the close/open_downgrade code") Cc: stable@vger.kernel.org # 2.6.33+ Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin --- fs/nfs/nfs4proc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 84706204cc33..eef16ec0638a 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2715,12 +2715,11 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data) call_close |= is_wronly; else if (is_wronly) calldata->arg.fmode |= FMODE_WRITE; + if (calldata->arg.fmode != (FMODE_READ|FMODE_WRITE)) + call_close |= is_rdwr; } else if (is_rdwr) calldata->arg.fmode |= FMODE_READ|FMODE_WRITE; - if (calldata->arg.fmode == 0) - call_close |= is_rdwr; - if (!nfs4_valid_open_stateid(state)) call_close = 0; spin_unlock(&state->owner->so_lock); From 2119a62b133e4b4ade51cbc446f31b728662eed8 Mon Sep 17 00:00:00 2001 From: Andrey Ulanov Date: Fri, 15 Apr 2016 14:24:41 -0700 Subject: [PATCH 128/312] namespace: update event counter when umounting a deleted dentry [ Upstream commit e06b933e6ded42384164d28a2060b7f89243b895 ] - m_start() in fs/namespace.c expects that ns->event is incremented each time a mount added or removed from ns->list. - umount_tree() removes items from the list but does not increment event counter, expecting that it's done before the function is called. - There are some codepaths that call umount_tree() without updating "event" counter. e.g. from __detach_mounts(). - When this happens m_start may reuse a cached mount structure that no longer belongs to ns->list (i.e. use after free which usually leads to infinite loop). This change fixes the above problem by incrementing global event counter before invoking umount_tree(). Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589 Cc: stable@vger.kernel.org Signed-off-by: Andrey Ulanov Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- fs/namespace.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/namespace.c b/fs/namespace.c index 6257268147ee..556721fb0cf6 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1551,6 +1551,7 @@ void __detach_mounts(struct dentry *dentry) goto out_unlock; lock_mount_hash(); + event++; while (!hlist_empty(&mp->m_list)) { mnt = hlist_entry(mp->m_list.first, struct mount, mnt_mp_list); if (mnt->mnt.mnt_flags & MNT_UMOUNT) { From a1f678e55da552523dd45a531580a7e533b3bd8f Mon Sep 17 00:00:00 2001 From: Miklos Szeredi Date: Fri, 1 Jul 2016 14:56:07 +0200 Subject: [PATCH 129/312] locks: use file_inode() [ Upstream commit 6343a2120862f7023006c8091ad95c1f16a32077 ] (Another one for the f_path debacle.) ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask. The reason is that generic_add_lease() used filp->f_path.dentry->inode while all the others use file_inode(). This makes a difference for files opened on overlayfs since the former will point to the overlay inode the latter to the underlying inode. So generic_add_lease() added the lease to the overlay inode and generic_delete_lease() removed it from the underlying inode. When the file was released the lease remained on the overlay inode's lock list, resulting in use after free. Reported-by: Eryu Guan Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: Signed-off-by: Miklos Szeredi Reviewed-by: Jeff Layton Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin --- fs/locks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/locks.c b/fs/locks.c index 8501eecb2af0..3c234b9fbdd9 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1596,7 +1596,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr { struct file_lock *fl, *my_fl = NULL, *lease; struct dentry *dentry = filp->f_path.dentry; - struct inode *inode = dentry->d_inode; + struct inode *inode = file_inode(filp); struct file_lock_context *ctx; bool is_deleg = (*flp)->fl_flags & FL_DELEG; int error; From 85aa23b07fabe1d50a26968d07b0e040feb4007c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Krzysztof=20Ha=C5=82asa?= Date: Tue, 1 Mar 2016 07:07:18 +0100 Subject: [PATCH 130/312] PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 54c6e2dd00c313d0add58e5befe62fe6f286d03b ] pci_create_root_bus() passes a "parent" pointer to pci_bus_assign_domain_nr(). When CONFIG_PCI_DOMAINS_GENERIC is defined, pci_bus_assign_domain_nr() dereferences that pointer. Many callers of pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL pointer dereference error. 7c674700098c ("PCI: Move domain assignment from arm64 to generic code") moved the "parent" dereference from arm64 to generic code. Only arm64 used that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it always supplied a valid "parent" pointer. Other arches supplied NULL "parent" pointers but didn't defined CONFIG_PCI_DOMAINS_GENERIC, so they used a no-op version of pci_bus_assign_domain_nr(). 8c7d14746abc ("ARM/PCI: Move to generic PCI domains") defined CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use pci_common_init(), which supplies a NULL "parent" pointer. These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash with a NULL pointer dereference like this while probing PCI: Unable to handle kernel NULL pointer dereference at virtual address 000000a4 PC is at pci_bus_assign_domain_nr+0x10/0x84 LR is at pci_create_root_bus+0x48/0x2e4 Kernel panic - not syncing: Attempted to kill init! [bhelgaas: changelog, add "Reported:" and "Fixes:" tags] Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1 Fixes: 8c7d14746abc ("ARM/PCI: Move to generic PCI domains") Fixes: 7c674700098c ("PCI: Move domain assignment from arm64 to generic code") Signed-off-by: Krzysztof Hałasa Signed-off-by: Bjorn Helgaas Acked-by: Lorenzo Pieralisi CC: stable@vger.kernel.org # v4.0+ Signed-off-by: Sasha Levin --- drivers/pci/pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index c44393f26fd3..4e720ed402ef 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4520,8 +4520,10 @@ int pci_get_new_domain_nr(void) void pci_bus_assign_domain_nr(struct pci_bus *bus, struct device *parent) { static int use_dt_domains = -1; - int domain = of_get_pci_domain_nr(parent->of_node); + int domain = -1; + if (parent) + domain = of_get_pci_domain_nr(parent->of_node); /* * Check DT domain and use_dt_domains values. * From 8f82841376c2dc033b6b3f976cfb463678c8b80f Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Wed, 18 Nov 2015 15:25:23 +0100 Subject: [PATCH 131/312] ASoC: samsung: pass DMA channels as pointers [ Upstream commit b9a1a743818ea3265abf98f9431623afa8c50c86 ] ARM64 allmodconfig produces a bunch of warnings when building the samsung ASoC code: sound/soc/samsung/dmaengine.c: In function 'samsung_asoc_init_dma_data': sound/soc/samsung/dmaengine.c:53:32: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] playback_data->filter_data = (void *)playback->channel; sound/soc/samsung/dmaengine.c:60:31: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] capture_data->filter_data = (void *)capture->channel; We could easily shut up the warning by adding an intermediate cast, but there is a bigger underlying problem: The use of IORESOURCE_DMA to pass data from platform code to device drivers is dubious to start with, as what we really want is a pointer that can be passed into a filter function. Note that on s3c64xx, the pl08x DMA data is already a pointer, but gets cast to resource_size_t so we can pass it as a resource, and it then gets converted back to a pointer. In contrast, the data we pass for s3c24xx is an index into a device specific table, and we artificially convert that into a pointer for the filter function. Signed-off-by: Arnd Bergmann Reviewed-by: Krzysztof Kozlowski Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- arch/arm/mach-s3c64xx/dev-audio.c | 41 +++++++++++-------- arch/arm/mach-s3c64xx/include/mach/dma.h | 52 ++++++++++++------------ arch/arm/plat-samsung/devs.c | 11 +++-- include/linux/platform_data/asoc-s3c.h | 4 ++ sound/soc/samsung/ac97.c | 26 ++---------- sound/soc/samsung/dma.h | 2 +- sound/soc/samsung/dmaengine.c | 4 +- sound/soc/samsung/i2s.c | 26 +++--------- sound/soc/samsung/pcm.c | 20 +++------ sound/soc/samsung/s3c2412-i2s.c | 4 +- sound/soc/samsung/s3c24xx-i2s.c | 4 +- sound/soc/samsung/spdif.c | 10 +---- 12 files changed, 84 insertions(+), 120 deletions(-) diff --git a/arch/arm/mach-s3c64xx/dev-audio.c b/arch/arm/mach-s3c64xx/dev-audio.c index ff780a8d8366..9a42736ef4ac 100644 --- a/arch/arm/mach-s3c64xx/dev-audio.c +++ b/arch/arm/mach-s3c64xx/dev-audio.c @@ -54,12 +54,12 @@ static int s3c64xx_i2s_cfg_gpio(struct platform_device *pdev) static struct resource s3c64xx_iis0_resource[] = { [0] = DEFINE_RES_MEM(S3C64XX_PA_IIS0, SZ_256), - [1] = DEFINE_RES_DMA(DMACH_I2S0_OUT), - [2] = DEFINE_RES_DMA(DMACH_I2S0_IN), }; -static struct s3c_audio_pdata i2sv3_pdata = { +static struct s3c_audio_pdata i2s0_pdata = { .cfg_gpio = s3c64xx_i2s_cfg_gpio, + .dma_playback = DMACH_I2S0_OUT, + .dma_capture = DMACH_I2S0_IN, }; struct platform_device s3c64xx_device_iis0 = { @@ -68,15 +68,19 @@ struct platform_device s3c64xx_device_iis0 = { .num_resources = ARRAY_SIZE(s3c64xx_iis0_resource), .resource = s3c64xx_iis0_resource, .dev = { - .platform_data = &i2sv3_pdata, + .platform_data = &i2s0_pdata, }, }; EXPORT_SYMBOL(s3c64xx_device_iis0); static struct resource s3c64xx_iis1_resource[] = { [0] = DEFINE_RES_MEM(S3C64XX_PA_IIS1, SZ_256), - [1] = DEFINE_RES_DMA(DMACH_I2S1_OUT), - [2] = DEFINE_RES_DMA(DMACH_I2S1_IN), +}; + +static struct s3c_audio_pdata i2s1_pdata = { + .cfg_gpio = s3c64xx_i2s_cfg_gpio, + .dma_playback = DMACH_I2S1_OUT, + .dma_capture = DMACH_I2S1_IN, }; struct platform_device s3c64xx_device_iis1 = { @@ -85,19 +89,19 @@ struct platform_device s3c64xx_device_iis1 = { .num_resources = ARRAY_SIZE(s3c64xx_iis1_resource), .resource = s3c64xx_iis1_resource, .dev = { - .platform_data = &i2sv3_pdata, + .platform_data = &i2s1_pdata, }, }; EXPORT_SYMBOL(s3c64xx_device_iis1); static struct resource s3c64xx_iisv4_resource[] = { [0] = DEFINE_RES_MEM(S3C64XX_PA_IISV4, SZ_256), - [1] = DEFINE_RES_DMA(DMACH_HSI_I2SV40_TX), - [2] = DEFINE_RES_DMA(DMACH_HSI_I2SV40_RX), }; static struct s3c_audio_pdata i2sv4_pdata = { .cfg_gpio = s3c64xx_i2s_cfg_gpio, + .dma_playback = DMACH_HSI_I2SV40_TX, + .dma_capture = DMACH_HSI_I2SV40_RX, .type = { .i2s = { .quirks = QUIRK_PRI_6CHAN, @@ -142,12 +146,12 @@ static int s3c64xx_pcm_cfg_gpio(struct platform_device *pdev) static struct resource s3c64xx_pcm0_resource[] = { [0] = DEFINE_RES_MEM(S3C64XX_PA_PCM0, SZ_256), - [1] = DEFINE_RES_DMA(DMACH_PCM0_TX), - [2] = DEFINE_RES_DMA(DMACH_PCM0_RX), }; static struct s3c_audio_pdata s3c_pcm0_pdata = { .cfg_gpio = s3c64xx_pcm_cfg_gpio, + .dma_capture = DMACH_PCM0_RX, + .dma_playback = DMACH_PCM0_TX, }; struct platform_device s3c64xx_device_pcm0 = { @@ -163,12 +167,12 @@ EXPORT_SYMBOL(s3c64xx_device_pcm0); static struct resource s3c64xx_pcm1_resource[] = { [0] = DEFINE_RES_MEM(S3C64XX_PA_PCM1, SZ_256), - [1] = DEFINE_RES_DMA(DMACH_PCM1_TX), - [2] = DEFINE_RES_DMA(DMACH_PCM1_RX), }; static struct s3c_audio_pdata s3c_pcm1_pdata = { .cfg_gpio = s3c64xx_pcm_cfg_gpio, + .dma_playback = DMACH_PCM1_TX, + .dma_capture = DMACH_PCM1_RX, }; struct platform_device s3c64xx_device_pcm1 = { @@ -196,13 +200,14 @@ static int s3c64xx_ac97_cfg_gpe(struct platform_device *pdev) static struct resource s3c64xx_ac97_resource[] = { [0] = DEFINE_RES_MEM(S3C64XX_PA_AC97, SZ_256), - [1] = DEFINE_RES_DMA(DMACH_AC97_PCMOUT), - [2] = DEFINE_RES_DMA(DMACH_AC97_PCMIN), - [3] = DEFINE_RES_DMA(DMACH_AC97_MICIN), - [4] = DEFINE_RES_IRQ(IRQ_AC97), + [1] = DEFINE_RES_IRQ(IRQ_AC97), }; -static struct s3c_audio_pdata s3c_ac97_pdata; +static struct s3c_audio_pdata s3c_ac97_pdata = { + .dma_playback = DMACH_AC97_PCMOUT, + .dma_capture = DMACH_AC97_PCMIN, + .dma_capture_mic = DMACH_AC97_MICIN, +}; static u64 s3c64xx_ac97_dmamask = DMA_BIT_MASK(32); diff --git a/arch/arm/mach-s3c64xx/include/mach/dma.h b/arch/arm/mach-s3c64xx/include/mach/dma.h index 096e14073bd9..9c739eafe95c 100644 --- a/arch/arm/mach-s3c64xx/include/mach/dma.h +++ b/arch/arm/mach-s3c64xx/include/mach/dma.h @@ -14,38 +14,38 @@ #define S3C64XX_DMA_CHAN(name) ((unsigned long)(name)) /* DMA0/SDMA0 */ -#define DMACH_UART0 S3C64XX_DMA_CHAN("uart0_tx") -#define DMACH_UART0_SRC2 S3C64XX_DMA_CHAN("uart0_rx") -#define DMACH_UART1 S3C64XX_DMA_CHAN("uart1_tx") -#define DMACH_UART1_SRC2 S3C64XX_DMA_CHAN("uart1_rx") -#define DMACH_UART2 S3C64XX_DMA_CHAN("uart2_tx") -#define DMACH_UART2_SRC2 S3C64XX_DMA_CHAN("uart2_rx") -#define DMACH_UART3 S3C64XX_DMA_CHAN("uart3_tx") -#define DMACH_UART3_SRC2 S3C64XX_DMA_CHAN("uart3_rx") -#define DMACH_PCM0_TX S3C64XX_DMA_CHAN("pcm0_tx") -#define DMACH_PCM0_RX S3C64XX_DMA_CHAN("pcm0_rx") -#define DMACH_I2S0_OUT S3C64XX_DMA_CHAN("i2s0_tx") -#define DMACH_I2S0_IN S3C64XX_DMA_CHAN("i2s0_rx") +#define DMACH_UART0 "uart0_tx" +#define DMACH_UART0_SRC2 "uart0_rx" +#define DMACH_UART1 "uart1_tx" +#define DMACH_UART1_SRC2 "uart1_rx" +#define DMACH_UART2 "uart2_tx" +#define DMACH_UART2_SRC2 "uart2_rx" +#define DMACH_UART3 "uart3_tx" +#define DMACH_UART3_SRC2 "uart3_rx" +#define DMACH_PCM0_TX "pcm0_tx" +#define DMACH_PCM0_RX "pcm0_rx" +#define DMACH_I2S0_OUT "i2s0_tx" +#define DMACH_I2S0_IN "i2s0_rx" #define DMACH_SPI0_TX S3C64XX_DMA_CHAN("spi0_tx") #define DMACH_SPI0_RX S3C64XX_DMA_CHAN("spi0_rx") -#define DMACH_HSI_I2SV40_TX S3C64XX_DMA_CHAN("i2s2_tx") -#define DMACH_HSI_I2SV40_RX S3C64XX_DMA_CHAN("i2s2_rx") +#define DMACH_HSI_I2SV40_TX "i2s2_tx" +#define DMACH_HSI_I2SV40_RX "i2s2_rx" /* DMA1/SDMA1 */ -#define DMACH_PCM1_TX S3C64XX_DMA_CHAN("pcm1_tx") -#define DMACH_PCM1_RX S3C64XX_DMA_CHAN("pcm1_rx") -#define DMACH_I2S1_OUT S3C64XX_DMA_CHAN("i2s1_tx") -#define DMACH_I2S1_IN S3C64XX_DMA_CHAN("i2s1_rx") +#define DMACH_PCM1_TX "pcm1_tx" +#define DMACH_PCM1_RX "pcm1_rx" +#define DMACH_I2S1_OUT "i2s1_tx" +#define DMACH_I2S1_IN "i2s1_rx" #define DMACH_SPI1_TX S3C64XX_DMA_CHAN("spi1_tx") #define DMACH_SPI1_RX S3C64XX_DMA_CHAN("spi1_rx") -#define DMACH_AC97_PCMOUT S3C64XX_DMA_CHAN("ac97_out") -#define DMACH_AC97_PCMIN S3C64XX_DMA_CHAN("ac97_in") -#define DMACH_AC97_MICIN S3C64XX_DMA_CHAN("ac97_mic") -#define DMACH_PWM S3C64XX_DMA_CHAN("pwm") -#define DMACH_IRDA S3C64XX_DMA_CHAN("irda") -#define DMACH_EXTERNAL S3C64XX_DMA_CHAN("external") -#define DMACH_SECURITY_RX S3C64XX_DMA_CHAN("sec_rx") -#define DMACH_SECURITY_TX S3C64XX_DMA_CHAN("sec_tx") +#define DMACH_AC97_PCMOUT "ac97_out" +#define DMACH_AC97_PCMIN "ac97_in" +#define DMACH_AC97_MICIN "ac97_mic" +#define DMACH_PWM "pwm" +#define DMACH_IRDA "irda" +#define DMACH_EXTERNAL "external" +#define DMACH_SECURITY_RX "sec_rx" +#define DMACH_SECURITY_TX "sec_tx" enum dma_ch { DMACH_MAX = 32 diff --git a/arch/arm/plat-samsung/devs.c b/arch/arm/plat-samsung/devs.c index 83c7d154bde0..8b67db8c1213 100644 --- a/arch/arm/plat-samsung/devs.c +++ b/arch/arm/plat-samsung/devs.c @@ -65,6 +65,7 @@ #include #include #include +#include #include static u64 samsung_device_dma_mask = DMA_BIT_MASK(32); @@ -74,9 +75,12 @@ static u64 samsung_device_dma_mask = DMA_BIT_MASK(32); static struct resource s3c_ac97_resource[] = { [0] = DEFINE_RES_MEM(S3C2440_PA_AC97, S3C2440_SZ_AC97), [1] = DEFINE_RES_IRQ(IRQ_S3C244X_AC97), - [2] = DEFINE_RES_DMA_NAMED(DMACH_PCM_OUT, "PCM out"), - [3] = DEFINE_RES_DMA_NAMED(DMACH_PCM_IN, "PCM in"), - [4] = DEFINE_RES_DMA_NAMED(DMACH_MIC_IN, "Mic in"), +}; + +static struct s3c_audio_pdata s3c_ac97_pdata = { + .dma_playback = (void *)DMACH_PCM_OUT, + .dma_capture = (void *)DMACH_PCM_IN, + .dma_capture_mic = (void *)DMACH_MIC_IN, }; struct platform_device s3c_device_ac97 = { @@ -87,6 +91,7 @@ struct platform_device s3c_device_ac97 = { .dev = { .dma_mask = &samsung_device_dma_mask, .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = &s3c_ac97_pdata, } }; #endif /* CONFIG_CPU_S3C2440 */ diff --git a/include/linux/platform_data/asoc-s3c.h b/include/linux/platform_data/asoc-s3c.h index 5e0bc779e6c5..33f88b4479e4 100644 --- a/include/linux/platform_data/asoc-s3c.h +++ b/include/linux/platform_data/asoc-s3c.h @@ -39,6 +39,10 @@ struct samsung_i2s { */ struct s3c_audio_pdata { int (*cfg_gpio)(struct platform_device *); + void *dma_playback; + void *dma_capture; + void *dma_play_sec; + void *dma_capture_mic; union { struct samsung_i2s i2s; } type; diff --git a/sound/soc/samsung/ac97.c b/sound/soc/samsung/ac97.c index e4145509d63c..9c5219392460 100644 --- a/sound/soc/samsung/ac97.c +++ b/sound/soc/samsung/ac97.c @@ -324,7 +324,7 @@ static const struct snd_soc_component_driver s3c_ac97_component = { static int s3c_ac97_probe(struct platform_device *pdev) { - struct resource *mem_res, *dmatx_res, *dmarx_res, *dmamic_res, *irq_res; + struct resource *mem_res, *irq_res; struct s3c_audio_pdata *ac97_pdata; int ret; @@ -335,24 +335,6 @@ static int s3c_ac97_probe(struct platform_device *pdev) } /* Check for availability of necessary resource */ - dmatx_res = platform_get_resource(pdev, IORESOURCE_DMA, 0); - if (!dmatx_res) { - dev_err(&pdev->dev, "Unable to get AC97-TX dma resource\n"); - return -ENXIO; - } - - dmarx_res = platform_get_resource(pdev, IORESOURCE_DMA, 1); - if (!dmarx_res) { - dev_err(&pdev->dev, "Unable to get AC97-RX dma resource\n"); - return -ENXIO; - } - - dmamic_res = platform_get_resource(pdev, IORESOURCE_DMA, 2); - if (!dmamic_res) { - dev_err(&pdev->dev, "Unable to get AC97-MIC dma resource\n"); - return -ENXIO; - } - irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0); if (!irq_res) { dev_err(&pdev->dev, "AC97 IRQ not provided!\n"); @@ -364,11 +346,11 @@ static int s3c_ac97_probe(struct platform_device *pdev) if (IS_ERR(s3c_ac97.regs)) return PTR_ERR(s3c_ac97.regs); - s3c_ac97_pcm_out.channel = dmatx_res->start; + s3c_ac97_pcm_out.slave = ac97_pdata->dma_playback; s3c_ac97_pcm_out.dma_addr = mem_res->start + S3C_AC97_PCM_DATA; - s3c_ac97_pcm_in.channel = dmarx_res->start; + s3c_ac97_pcm_in.slave = ac97_pdata->dma_capture; s3c_ac97_pcm_in.dma_addr = mem_res->start + S3C_AC97_PCM_DATA; - s3c_ac97_mic_in.channel = dmamic_res->start; + s3c_ac97_mic_in.slave = ac97_pdata->dma_capture_mic; s3c_ac97_mic_in.dma_addr = mem_res->start + S3C_AC97_MIC_DATA; init_completion(&s3c_ac97.done); diff --git a/sound/soc/samsung/dma.h b/sound/soc/samsung/dma.h index 0e85dcfec023..085ef30f5ca2 100644 --- a/sound/soc/samsung/dma.h +++ b/sound/soc/samsung/dma.h @@ -15,7 +15,7 @@ #include struct s3c_dma_params { - int channel; /* Channel ID */ + void *slave; /* Channel ID */ dma_addr_t dma_addr; int dma_size; /* Size of the DMA transfer */ char *ch_name; diff --git a/sound/soc/samsung/dmaengine.c b/sound/soc/samsung/dmaengine.c index 506f5bf6d082..727008d57d14 100644 --- a/sound/soc/samsung/dmaengine.c +++ b/sound/soc/samsung/dmaengine.c @@ -50,14 +50,14 @@ void samsung_asoc_init_dma_data(struct snd_soc_dai *dai, if (playback) { playback_data = &playback->dma_data; - playback_data->filter_data = (void *)playback->channel; + playback_data->filter_data = playback->slave; playback_data->chan_name = playback->ch_name; playback_data->addr = playback->dma_addr; playback_data->addr_width = playback->dma_size; } if (capture) { capture_data = &capture->dma_data; - capture_data->filter_data = (void *)capture->channel; + capture_data->filter_data = capture->slave; capture_data->chan_name = capture->ch_name; capture_data->addr = capture->dma_addr; capture_data->addr_width = capture->dma_size; diff --git a/sound/soc/samsung/i2s.c b/sound/soc/samsung/i2s.c index 5e8ccb0a7028..9456c78c9051 100644 --- a/sound/soc/samsung/i2s.c +++ b/sound/soc/samsung/i2s.c @@ -1260,27 +1260,14 @@ static int samsung_i2s_probe(struct platform_device *pdev) pri_dai->lock = &pri_dai->spinlock; if (!np) { - res = platform_get_resource(pdev, IORESOURCE_DMA, 0); - if (!res) { - dev_err(&pdev->dev, - "Unable to get I2S-TX dma resource\n"); - return -ENXIO; - } - pri_dai->dma_playback.channel = res->start; - - res = platform_get_resource(pdev, IORESOURCE_DMA, 1); - if (!res) { - dev_err(&pdev->dev, - "Unable to get I2S-RX dma resource\n"); - return -ENXIO; - } - pri_dai->dma_capture.channel = res->start; - if (i2s_pdata == NULL) { dev_err(&pdev->dev, "Can't work without s3c_audio_pdata\n"); return -EINVAL; } + pri_dai->dma_playback.slave = i2s_pdata->dma_playback; + pri_dai->dma_capture.slave = i2s_pdata->dma_capture; + if (&i2s_pdata->type) i2s_cfg = &i2s_pdata->type.i2s; @@ -1341,11 +1328,8 @@ static int samsung_i2s_probe(struct platform_device *pdev) sec_dai->dma_playback.dma_addr = regs_base + I2STXDS; sec_dai->dma_playback.ch_name = "tx-sec"; - if (!np) { - res = platform_get_resource(pdev, IORESOURCE_DMA, 2); - if (res) - sec_dai->dma_playback.channel = res->start; - } + if (!np) + sec_dai->dma_playback.slave = i2s_pdata->dma_play_sec; sec_dai->dma_playback.dma_size = 4; sec_dai->addr = pri_dai->addr; diff --git a/sound/soc/samsung/pcm.c b/sound/soc/samsung/pcm.c index b320a9d3fbf8..c77f324e0bb8 100644 --- a/sound/soc/samsung/pcm.c +++ b/sound/soc/samsung/pcm.c @@ -486,7 +486,7 @@ static const struct snd_soc_component_driver s3c_pcm_component = { static int s3c_pcm_dev_probe(struct platform_device *pdev) { struct s3c_pcm_info *pcm; - struct resource *mem_res, *dmatx_res, *dmarx_res; + struct resource *mem_res; struct s3c_audio_pdata *pcm_pdata; int ret; @@ -499,18 +499,6 @@ static int s3c_pcm_dev_probe(struct platform_device *pdev) pcm_pdata = pdev->dev.platform_data; /* Check for availability of necessary resource */ - dmatx_res = platform_get_resource(pdev, IORESOURCE_DMA, 0); - if (!dmatx_res) { - dev_err(&pdev->dev, "Unable to get PCM-TX dma resource\n"); - return -ENXIO; - } - - dmarx_res = platform_get_resource(pdev, IORESOURCE_DMA, 1); - if (!dmarx_res) { - dev_err(&pdev->dev, "Unable to get PCM-RX dma resource\n"); - return -ENXIO; - } - mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 0); if (!mem_res) { dev_err(&pdev->dev, "Unable to get register resource\n"); @@ -568,8 +556,10 @@ static int s3c_pcm_dev_probe(struct platform_device *pdev) s3c_pcm_stereo_out[pdev->id].dma_addr = mem_res->start + S3C_PCM_TXFIFO; - s3c_pcm_stereo_in[pdev->id].channel = dmarx_res->start; - s3c_pcm_stereo_out[pdev->id].channel = dmatx_res->start; + if (pcm_pdata) { + s3c_pcm_stereo_in[pdev->id].slave = pcm_pdata->dma_capture; + s3c_pcm_stereo_out[pdev->id].slave = pcm_pdata->dma_playback; + } pcm->dma_capture = &s3c_pcm_stereo_in[pdev->id]; pcm->dma_playback = &s3c_pcm_stereo_out[pdev->id]; diff --git a/sound/soc/samsung/s3c2412-i2s.c b/sound/soc/samsung/s3c2412-i2s.c index 2b766d212ce0..77d27c85a32a 100644 --- a/sound/soc/samsung/s3c2412-i2s.c +++ b/sound/soc/samsung/s3c2412-i2s.c @@ -34,13 +34,13 @@ #include "s3c2412-i2s.h" static struct s3c_dma_params s3c2412_i2s_pcm_stereo_out = { - .channel = DMACH_I2S_OUT, + .slave = (void *)(uintptr_t)DMACH_I2S_OUT, .ch_name = "tx", .dma_size = 4, }; static struct s3c_dma_params s3c2412_i2s_pcm_stereo_in = { - .channel = DMACH_I2S_IN, + .slave = (void *)(uintptr_t)DMACH_I2S_IN, .ch_name = "rx", .dma_size = 4, }; diff --git a/sound/soc/samsung/s3c24xx-i2s.c b/sound/soc/samsung/s3c24xx-i2s.c index 5bf723689692..9da3a77ea2c7 100644 --- a/sound/soc/samsung/s3c24xx-i2s.c +++ b/sound/soc/samsung/s3c24xx-i2s.c @@ -32,13 +32,13 @@ #include "s3c24xx-i2s.h" static struct s3c_dma_params s3c24xx_i2s_pcm_stereo_out = { - .channel = DMACH_I2S_OUT, + .slave = (void *)(uintptr_t)DMACH_I2S_OUT, .ch_name = "tx", .dma_size = 2, }; static struct s3c_dma_params s3c24xx_i2s_pcm_stereo_in = { - .channel = DMACH_I2S_IN, + .slave = (void *)(uintptr_t)DMACH_I2S_IN, .ch_name = "rx", .dma_size = 2, }; diff --git a/sound/soc/samsung/spdif.c b/sound/soc/samsung/spdif.c index 36dbc0e96004..9dd7ee6d03ff 100644 --- a/sound/soc/samsung/spdif.c +++ b/sound/soc/samsung/spdif.c @@ -359,7 +359,7 @@ static const struct snd_soc_component_driver samsung_spdif_component = { static int spdif_probe(struct platform_device *pdev) { struct s3c_audio_pdata *spdif_pdata; - struct resource *mem_res, *dma_res; + struct resource *mem_res; struct samsung_spdif_info *spdif; int ret; @@ -367,12 +367,6 @@ static int spdif_probe(struct platform_device *pdev) dev_dbg(&pdev->dev, "Entered %s\n", __func__); - dma_res = platform_get_resource(pdev, IORESOURCE_DMA, 0); - if (!dma_res) { - dev_err(&pdev->dev, "Unable to get dma resource.\n"); - return -ENXIO; - } - mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 0); if (!mem_res) { dev_err(&pdev->dev, "Unable to get register resource.\n"); @@ -432,7 +426,7 @@ static int spdif_probe(struct platform_device *pdev) spdif_stereo_out.dma_size = 2; spdif_stereo_out.dma_addr = mem_res->start + DATA_OUTBUF; - spdif_stereo_out.channel = dma_res->start; + spdif_stereo_out.slave = spdif_pdata ? spdif_pdata->dma_playback : NULL; spdif->dma_playback = &spdif_stereo_out; From b5ba0d06632445b3810b50093cd22a2ab06900de Mon Sep 17 00:00:00 2001 From: DingXiang Date: Tue, 2 Feb 2016 12:29:18 +0800 Subject: [PATCH 132/312] dm snapshot: disallow the COW and origin devices from being identical [ Upstream commit 4df2bf466a9c9c92f40d27c4aa9120f4e8227bfc ] Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[] [] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [] ? dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [] ? memzero_explicit+0x12/0x20 [ 1959.261816] [] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [] SyS_ioctl+0x79/0x90 [ 1959.294450] [] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Cc: stable@vger.kernel.org Signed-off-by: Ding Xiang Signed-off-by: Mike Snitzer Signed-off-by: Sasha Levin --- drivers/md/dm-snap.c | 9 +++++++++ drivers/md/dm-table.c | 36 +++++++++++++++++++++++------------ include/linux/device-mapper.h | 2 ++ 3 files changed, 35 insertions(+), 12 deletions(-) diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index 11ec9d2a27df..38f375516ae6 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -1099,6 +1099,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) int i; int r = -EINVAL; char *origin_path, *cow_path; + dev_t origin_dev, cow_dev; unsigned args_used, num_flush_bios = 1; fmode_t origin_mode = FMODE_READ; @@ -1129,11 +1130,19 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) ti->error = "Cannot get origin device"; goto bad_origin; } + origin_dev = s->origin->bdev->bd_dev; cow_path = argv[0]; argv++; argc--; + cow_dev = dm_get_dev_t(cow_path); + if (cow_dev && cow_dev == origin_dev) { + ti->error = "COW device cannot be the same as origin device"; + r = -EINVAL; + goto bad_cow; + } + r = dm_get_device(ti, cow_path, dm_table_get_mode(ti->table), &s->cow); if (r) { ti->error = "Cannot get COW device"; diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 16ba55ad7089..e411ccba0af6 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -364,6 +364,26 @@ static int upgrade_mode(struct dm_dev_internal *dd, fmode_t new_mode, return 0; } +/* + * Convert the path to a device + */ +dev_t dm_get_dev_t(const char *path) +{ + dev_t uninitialized_var(dev); + struct block_device *bdev; + + bdev = lookup_bdev(path); + if (IS_ERR(bdev)) + dev = name_to_dev_t(path); + else { + dev = bdev->bd_dev; + bdput(bdev); + } + + return dev; +} +EXPORT_SYMBOL_GPL(dm_get_dev_t); + /* * Add a device to the list, or just increment the usage count if * it's already present. @@ -372,23 +392,15 @@ int dm_get_device(struct dm_target *ti, const char *path, fmode_t mode, struct dm_dev **result) { int r; - dev_t uninitialized_var(dev); + dev_t dev; struct dm_dev_internal *dd; struct dm_table *t = ti->table; - struct block_device *bdev; BUG_ON(!t); - /* convert the path to a device */ - bdev = lookup_bdev(path); - if (IS_ERR(bdev)) { - dev = name_to_dev_t(path); - if (!dev) - return -ENODEV; - } else { - dev = bdev->bd_dev; - bdput(bdev); - } + dev = dm_get_dev_t(path); + if (!dev) + return -ENODEV; dd = find_device(&t->devices, dev); if (!dd) { diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h index 51cc1deb7af3..30c8971bc615 100644 --- a/include/linux/device-mapper.h +++ b/include/linux/device-mapper.h @@ -127,6 +127,8 @@ struct dm_dev { char name[16]; }; +dev_t dm_get_dev_t(const char *path); + /* * Constructors should call these functions to ensure destination devices * are opened/closed correctly. From 00ef5df8687ff1a51660a5d27c202417a69b5018 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 15 Mar 2016 12:14:49 +0100 Subject: [PATCH 133/312] ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk() [ Upstream commit 902eb7fd1e4af3ac69b9b30f8373f118c92b9729 ] Just a minor code cleanup: unify the error paths. Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/usb/quirks.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 194fa7f60a38..dcd4b232544a 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -164,23 +164,18 @@ static int create_fixed_stream_quirk(struct snd_usb_audio *chip, stream = (fp->endpoint & USB_DIR_IN) ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK; err = snd_usb_add_audio_stream(chip, stream, fp); - if (err < 0) { - kfree(fp); - kfree(rate_table); - return err; - } + if (err < 0) + goto error; if (fp->iface != get_iface_desc(&iface->altsetting[0])->bInterfaceNumber || fp->altset_idx >= iface->num_altsetting) { - kfree(fp); - kfree(rate_table); - return -EINVAL; + err = -EINVAL; + goto error; } alts = &iface->altsetting[fp->altset_idx]; altsd = get_iface_desc(alts); if (altsd->bNumEndpoints < 1) { - kfree(fp); - kfree(rate_table); - return -EINVAL; + err = -EINVAL; + goto error; } fp->protocol = altsd->bInterfaceProtocol; @@ -193,6 +188,11 @@ static int create_fixed_stream_quirk(struct snd_usb_audio *chip, snd_usb_init_pitch(chip, fp->iface, alts, fp); snd_usb_init_sample_rate(chip, fp->iface, alts, fp, fp->rate_max); return 0; + + error: + kfree(fp); + kfree(rate_table); + return err; } static int create_auto_pcm_quirk(struct snd_usb_audio *chip, From 03e046efdbd3bccaac952553065adb70fb6976aa Mon Sep 17 00:00:00 2001 From: Vladis Dronov Date: Thu, 31 Mar 2016 12:05:43 -0400 Subject: [PATCH 134/312] ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call [ Upstream commit 836b34a935abc91e13e63053d0a83b24dfb5ea78 ] create_fixed_stream_quirk(), snd_usb_parse_audio_interface() and create_uaxx_quirk() functions allocate the audioformat object by themselves and free it upon error before returning. However, once the object is linked to a stream, it's freed again in snd_usb_audio_pcm_free(), thus it'll be double-freed, eventually resulting in a memory corruption. This patch fixes these failures in the error paths by unlinking the audioformat object before freeing it. Based on a patch by Takashi Iwai [Note for stable backports: this patch requires the commit 902eb7fd1e4a ('ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk()')] Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1283358 Reported-by: Ralf Spenneberg Cc: # see the note above Signed-off-by: Vladis Dronov Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/usb/quirks.c | 4 ++++ sound/usb/stream.c | 6 +++++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index dcd4b232544a..e27df0d3898b 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -147,6 +147,7 @@ static int create_fixed_stream_quirk(struct snd_usb_audio *chip, usb_audio_err(chip, "cannot memdup\n"); return -ENOMEM; } + INIT_LIST_HEAD(&fp->list); if (fp->nr_rates > MAX_NR_RATES) { kfree(fp); return -EINVAL; @@ -190,6 +191,7 @@ static int create_fixed_stream_quirk(struct snd_usb_audio *chip, return 0; error: + list_del(&fp->list); /* unlink for avoiding double-free */ kfree(fp); kfree(rate_table); return err; @@ -465,6 +467,7 @@ static int create_uaxx_quirk(struct snd_usb_audio *chip, fp->ep_attr = get_endpoint(alts, 0)->bmAttributes; fp->datainterval = 0; fp->maxpacksize = le16_to_cpu(get_endpoint(alts, 0)->wMaxPacketSize); + INIT_LIST_HEAD(&fp->list); switch (fp->maxpacksize) { case 0x120: @@ -488,6 +491,7 @@ static int create_uaxx_quirk(struct snd_usb_audio *chip, ? SNDRV_PCM_STREAM_CAPTURE : SNDRV_PCM_STREAM_PLAYBACK; err = snd_usb_add_audio_stream(chip, stream, fp); if (err < 0) { + list_del(&fp->list); /* unlink for avoiding double-free */ kfree(fp); return err; } diff --git a/sound/usb/stream.c b/sound/usb/stream.c index 310a3822d2b7..25e8075f9ea3 100644 --- a/sound/usb/stream.c +++ b/sound/usb/stream.c @@ -315,7 +315,9 @@ static struct snd_pcm_chmap_elem *convert_chmap(int channels, unsigned int bits, /* * add this endpoint to the chip instance. * if a stream with the same endpoint already exists, append to it. - * if not, create a new pcm stream. + * if not, create a new pcm stream. note, fp is added to the substream + * fmt_list and will be freed on the chip instance release. do not free + * fp or do remove it from the substream fmt_list to avoid double-free. */ int snd_usb_add_audio_stream(struct snd_usb_audio *chip, int stream, @@ -668,6 +670,7 @@ int snd_usb_parse_audio_interface(struct snd_usb_audio *chip, int iface_no) * (fp->maxpacksize & 0x7ff); fp->attributes = parse_uac_endpoint_attributes(chip, alts, protocol, iface_no); fp->clock = clock; + INIT_LIST_HEAD(&fp->list); /* some quirks for attributes here */ @@ -716,6 +719,7 @@ int snd_usb_parse_audio_interface(struct snd_usb_audio *chip, int iface_no) dev_dbg(&dev->dev, "%u:%d: add audio endpoint %#x\n", iface_no, altno, fp->endpoint); err = snd_usb_add_audio_stream(chip, stream, fp); if (err < 0) { + list_del(&fp->list); /* unlink for avoiding double-free */ kfree(fp->rate_table); kfree(fp->chmap); kfree(fp); From eae5f796a5de5ebc33e745126ce232f534fd0d0e Mon Sep 17 00:00:00 2001 From: Jarkko Sakkinen Date: Mon, 8 Feb 2016 22:31:08 +0200 Subject: [PATCH 135/312] tpm: fix the cleanup of struct tpm_chip [ Upstream commit 8e0ee3c9faed7ca68807ea45141775856c438ac0 ] If the initialization fails before tpm_chip_register(), put_device() will be not called, which causes release callback not to be called. This patch fixes the issue by adding put_device() to devres list of the parent device. Fixes: 313d21eeab ("tpm: device class for tpm") Signed-off-by: Jarkko Sakkinen cc: stable@vger.kernel.org Reviewed-by: Jason Gunthorpe Signed-off-by: Sasha Levin --- drivers/char/tpm/tpm-chip.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c index 1082d4bb016a..591629cc32d5 100644 --- a/drivers/char/tpm/tpm-chip.c +++ b/drivers/char/tpm/tpm-chip.c @@ -133,6 +133,8 @@ struct tpm_chip *tpmm_chip_alloc(struct device *dev, chip->cdev.owner = chip->pdev->driver->owner; chip->cdev.kobj.parent = &chip->dev.kobj; + devm_add_action(dev, (void (*)(void *)) put_device, &chip->dev); + return chip; } EXPORT_SYMBOL_GPL(tpmm_chip_alloc); @@ -168,7 +170,7 @@ static int tpm_dev_add_device(struct tpm_chip *chip) static void tpm_dev_del_device(struct tpm_chip *chip) { cdev_del(&chip->cdev); - device_unregister(&chip->dev); + device_del(&chip->dev); } static int tpm1_chip_register(struct tpm_chip *chip) From e4e503b2a3c6ad22375010933041f05b2ef00367 Mon Sep 17 00:00:00 2001 From: Grazvydas Ignotas Date: Sat, 13 Feb 2016 22:41:51 +0200 Subject: [PATCH 136/312] HID: logitech: fix Dual Action gamepad support [ Upstream commit 5d74325a2201376a95520a4a38a1ce2c65761c49 ] The patch that added Logitech Dual Action gamepad support forgot to update the special driver list for the device. This caused the logitech driver not to probe unless kernel module load order was favorable. Update the special driver list to fix it. Thanks to Simon Wood for the idea. Cc: Vitaly Katraew Fixes: 56d0c8b7c8fb ("HID: add support for Logitech Dual Action gamepads") Signed-off-by: Grazvydas Ignotas Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin --- drivers/hid/hid-core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index bc23db196930..bf039dbaa7eb 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -1864,6 +1864,7 @@ static const struct hid_device_id hid_have_special_driver[] = { { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_ELITE_KBD) }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_CORDLESS_DESKTOP_LX500) }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_EXTREME_3D) }, + { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_DUAL_ACTION) }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_WHEEL) }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RUMBLEPAD_CORD) }, { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_RUMBLEPAD) }, From c6aa8bae477bf2fdaacbc8683f5801511aa3b859 Mon Sep 17 00:00:00 2001 From: Sebastian Frias Date: Fri, 18 Dec 2015 17:40:05 +0100 Subject: [PATCH 137/312] 8250: use callbacks to access UART_DLL/UART_DLM [ Upstream commit 0b41ce991052022c030fd868e03877700220b090 ] Some UART HW has a single register combining UART_DLL/UART_DLM (this was probably forgotten in the change that introduced the callbacks, commit b32b19b8ffc05cbd3bf91c65e205f6a912ca15d9) Fixes: b32b19b8ffc0 ("[SERIAL] 8250: set divisor register correctly ...") Signed-off-by: Sebastian Frias Reviewed-by: Peter Hurley Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/tty/serial/8250/8250_core.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c index b4fd8debf941..a64d53f7b1d1 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ b/drivers/tty/serial/8250/8250_core.c @@ -791,22 +791,16 @@ static int size_fifo(struct uart_8250_port *up) */ static unsigned int autoconfig_read_divisor_id(struct uart_8250_port *p) { - unsigned char old_dll, old_dlm, old_lcr; - unsigned int id; + unsigned char old_lcr; + unsigned int id, old_dl; old_lcr = serial_in(p, UART_LCR); serial_out(p, UART_LCR, UART_LCR_CONF_MODE_A); + old_dl = serial_dl_read(p); + serial_dl_write(p, 0); + id = serial_dl_read(p); + serial_dl_write(p, old_dl); - old_dll = serial_in(p, UART_DLL); - old_dlm = serial_in(p, UART_DLM); - - serial_out(p, UART_DLL, 0); - serial_out(p, UART_DLM, 0); - - id = serial_in(p, UART_DLL) | serial_in(p, UART_DLM) << 8; - - serial_out(p, UART_DLL, old_dll); - serial_out(p, UART_DLM, old_dlm); serial_out(p, UART_LCR, old_lcr); return id; From 30dbed7b3dcc4b968ec050589f7acdcdabea8c7a Mon Sep 17 00:00:00 2001 From: Asai Thambi SP Date: Wed, 24 Feb 2016 21:17:47 -0800 Subject: [PATCH 138/312] mtip32xx: Fix for rmmod crash when drive is in FTL rebuild [ Upstream commit 59cf70e236c96594d9f1e065755d8fce9df5356b ] When FTL rebuild is in progress, alloc_disk() initializes the disk but device node will be created by add_disk() only after successful completion of FTL rebuild. So, skip deletion of device node in removal path when FTL rebuild is in progress. Signed-off-by: Selvan Mani Signed-off-by: Asai Thambi S P Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- drivers/block/mtip32xx/mtip32xx.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 2af8b29656af..09212c19cab3 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -3004,10 +3004,8 @@ restart_eh: } if (test_bit(MTIP_PF_REBUILD_BIT, &port->flags)) { - if (mtip_ftl_rebuild_poll(dd) < 0) - set_bit(MTIP_DDF_REBUILD_FAILED_BIT, - &dd->dd_flag); - clear_bit(MTIP_PF_REBUILD_BIT, &port->flags); + if (mtip_ftl_rebuild_poll(dd) == 0) + clear_bit(MTIP_PF_REBUILD_BIT, &port->flags); } } @@ -3887,7 +3885,6 @@ static int mtip_block_initialize(struct driver_data *dd) mtip_hw_debugfs_init(dd); -skip_create_disk: memset(&dd->tags, 0, sizeof(dd->tags)); dd->tags.ops = &mtip_mq_ops; dd->tags.nr_hw_queues = 1; @@ -3917,6 +3914,7 @@ skip_create_disk: dd->disk->queue = dd->queue; dd->queue->queuedata = dd; +skip_create_disk: /* Initialize the protocol layer. */ wait_for_rebuild = mtip_hw_get_identify(dd); if (wait_for_rebuild < 0) { @@ -4078,7 +4076,8 @@ static int mtip_block_remove(struct driver_data *dd) dd->bdev = NULL; } if (dd->disk) { - del_gendisk(dd->disk); + if (test_bit(MTIP_DDF_INIT_DONE_BIT, &dd->dd_flag)) + del_gendisk(dd->disk); if (dd->disk->queue) { blk_cleanup_queue(dd->queue); blk_mq_free_tag_set(&dd->tags); @@ -4119,7 +4118,8 @@ static int mtip_block_shutdown(struct driver_data *dd) dev_info(&dd->pdev->dev, "Shutting down %s ...\n", dd->disk->disk_name); - del_gendisk(dd->disk); + if (test_bit(MTIP_DDF_INIT_DONE_BIT, &dd->dd_flag)) + del_gendisk(dd->disk); if (dd->disk->queue) { blk_cleanup_queue(dd->queue); blk_mq_free_tag_set(&dd->tags); From cab680963342190c38a0df2e16b6184f9f67189a Mon Sep 17 00:00:00 2001 From: Asai Thambi SP Date: Wed, 24 Feb 2016 21:16:00 -0800 Subject: [PATCH 139/312] mtip32xx: Fix broken service thread handling [ Upstream commit 1b899eb4833d3394f37272d38b4b1a26eac30feb ] commit cfc05bd31384c4898bf2437a4de5557f3cf9803a upstream. Service thread does not detect the need for taskfile error hanlding. Fixed the flag condition to process taskfile error. Signed-off-by: Selvan Mani Signed-off-by: Asai Thambi S P Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/block/mtip32xx/mtip32xx.c | 12 ++++++++++-- drivers/block/mtip32xx/mtip32xx.h | 5 +++++ 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 09212c19cab3..eecaa02ec222 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -2946,9 +2946,15 @@ static int mtip_service_thread(void *data) * is in progress nor error handling is active */ wait_event_interruptible(port->svc_wait, (port->flags) && - !(port->flags & MTIP_PF_PAUSE_IO)); + (port->flags & MTIP_PF_SVC_THD_WORK)); - set_bit(MTIP_PF_SVC_THD_ACTIVE_BIT, &port->flags); + if (kthread_should_stop() || + test_bit(MTIP_PF_SVC_THD_STOP_BIT, &port->flags)) + goto st_out; + + if (unlikely(test_bit(MTIP_DDF_REMOVE_PENDING_BIT, + &dd->dd_flag))) + goto st_out; if (kthread_should_stop() || test_bit(MTIP_PF_SVC_THD_STOP_BIT, &port->flags)) @@ -2962,6 +2968,8 @@ static int mtip_service_thread(void *data) &dd->dd_flag))) goto st_out; + set_bit(MTIP_PF_SVC_THD_ACTIVE_BIT, &port->flags); + restart_eh: /* Demux bits: start with error handling */ if (test_bit(MTIP_PF_EH_ACTIVE_BIT, &port->flags)) { diff --git a/drivers/block/mtip32xx/mtip32xx.h b/drivers/block/mtip32xx/mtip32xx.h index 76695265dffb..578ad36c9913 100644 --- a/drivers/block/mtip32xx/mtip32xx.h +++ b/drivers/block/mtip32xx/mtip32xx.h @@ -145,6 +145,11 @@ enum { MTIP_PF_SR_CLEANUP_BIT = 7, MTIP_PF_SVC_THD_STOP_BIT = 8, + MTIP_PF_SVC_THD_WORK = ((1 << MTIP_PF_EH_ACTIVE_BIT) | + (1 << MTIP_PF_ISSUE_CMDS_BIT) | + (1 << MTIP_PF_REBUILD_BIT) | + (1 << MTIP_PF_SVC_THD_STOP_BIT)), + /* below are bit numbers in 'dd_flag' defined in driver_data */ MTIP_DDF_SEC_LOCK_BIT = 0, MTIP_DDF_REMOVE_PENDING_BIT = 1, From 4c06bf7e573a4e8cd8bf42ed99524e30b3aa8cbd Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 10 Mar 2016 20:56:20 +0100 Subject: [PATCH 140/312] ALSA: pcm: Avoid "BUG:" string for warnings again [ Upstream commit 0ab1ace856205d10cbc1924b2d931c01ffd216a6 ] The commit [d507941beb1e: ALSA: pcm: Correct PCM BUG error message] made the warning prefix back to "BUG:" due to its previous wrong prefix. But a kernel message containing "BUG:" seems taken as an Oops message wrongly by some brain-dead daemons, and it annoys users in the end. Instead of teaching daemons, change the string again to a more reasonable one. Fixes: 507941beb1e ('ALSA: pcm: Correct PCM BUG error message') Cc: # v3.19+ Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/core/pcm_lib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 7d45645f10ba..253a2da05cf0 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -322,7 +322,7 @@ static int snd_pcm_update_hw_ptr0(struct snd_pcm_substream *substream, char name[16]; snd_pcm_debug_name(substream, name, sizeof(name)); pcm_err(substream->pcm, - "BUG: %s, pos = %ld, buffer size = %ld, period size = %ld\n", + "invalid position: %s, pos = %ld, buffer size = %ld, period size = %ld\n", name, pos, runtime->buffer_size, runtime->period_size); } From bf15ba34c2a9afb0090294597ec4122f92616b38 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Sat, 26 Mar 2016 12:28:05 -0700 Subject: [PATCH 141/312] hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated [ Upstream commit 3c2e2266a5bd2d1cef258e6e54dca1d99946379f ] arm:pxa_defconfig can result in the following crash if the max1111 driver is not instantiated. Unhandled fault: page domain fault (0x01b) at 0x00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: : 1b [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 300 Comm: kworker/0:1 Not tainted 4.5.0-01301-g1701f680407c #10 Hardware name: SHARP Akita Workqueue: events sharpsl_charge_toggle task: c390a000 ti: c391e000 task.ti: c391e000 PC is at max1111_read_channel+0x20/0x30 LR is at sharpsl_pm_pxa_read_max1111+0x2c/0x3c pc : [] lr : [] psr: 20000013 ... [] (max1111_read_channel) from [] (sharpsl_pm_pxa_read_max1111+0x2c/0x3c) [] (sharpsl_pm_pxa_read_max1111) from [] (spitzpm_read_devdata+0x5c/0xc4) [] (spitzpm_read_devdata) from [] (sharpsl_check_battery_temp+0x78/0x110) [] (sharpsl_check_battery_temp) from [] (sharpsl_charge_toggle+0x48/0x110) [] (sharpsl_charge_toggle) from [] (process_one_work+0x14c/0x48c) [] (process_one_work) from [] (worker_thread+0x3c/0x5d4) [] (worker_thread) from [] (kthread+0xd0/0xec) [] (kthread) from [] (ret_from_fork+0x14/0x24) This can occur because the SPI controller driver (SPI_PXA2XX) is built as module and thus not necessarily loaded. While building SPI_PXA2XX into the kernel would make the problem disappear, it appears prudent to ensure that the driver is instantiated before accessing its data structures. Cc: Arnd Bergmann Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin --- drivers/hwmon/max1111.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/hwmon/max1111.c b/drivers/hwmon/max1111.c index f67d71ee8386..159f50d0ae39 100644 --- a/drivers/hwmon/max1111.c +++ b/drivers/hwmon/max1111.c @@ -85,6 +85,9 @@ static struct max1111_data *the_max1111; int max1111_read_channel(int channel) { + if (!the_max1111 || !the_max1111->spi) + return -ENODEV; + return max1111_read(&the_max1111->spi->dev, channel); } EXPORT_SYMBOL(max1111_read_channel); @@ -258,6 +261,9 @@ static int max1111_remove(struct spi_device *spi) { struct max1111_data *data = spi_get_drvdata(spi); +#ifdef CONFIG_SHARPSL_PM + the_max1111 = NULL; +#endif hwmon_device_unregister(data->hwmon_dev); sysfs_remove_group(&spi->dev.kobj, &max1110_attr_group); sysfs_remove_group(&spi->dev.kobj, &max1111_attr_group); From fb7a806fe08816a6fd2eb258db5e28218624bd07 Mon Sep 17 00:00:00 2001 From: Nicolai Stange Date: Sun, 20 Mar 2016 23:23:46 +0100 Subject: [PATCH 142/312] PKCS#7: pkcs7_validate_trust(): initialize the _trusted output argument [ Upstream commit e54358915d0a00399c11c2c23ae1be674cba188a ] Despite what the DocBook comment to pkcs7_validate_trust() says, the *_trusted argument is never set to false. pkcs7_validate_trust() only positively sets *_trusted upon encountering a trusted PKCS#7 SignedInfo block. This is quite unfortunate since its callers, system_verify_data() for example, depend on pkcs7_validate_trust() clearing *_trusted on non-trust. Indeed, UBSAN splats when attempting to load the uninitialized local variable 'trusted' from system_verify_data() in pkcs7_validate_trust(): UBSAN: Undefined behaviour in crypto/asymmetric_keys/pkcs7_trust.c:194:14 load of value 82 is not a valid value for type '_Bool' [...] Call Trace: [] dump_stack+0xbc/0x117 [] ? _atomic_dec_and_lock+0x169/0x169 [] ubsan_epilogue+0xd/0x4e [] __ubsan_handle_load_invalid_value+0x111/0x158 [] ? val_to_string.constprop.12+0xcf/0xcf [] ? x509_request_asymmetric_key+0x114/0x370 [] ? kfree+0x220/0x370 [] ? public_key_verify_signature_2+0x32/0x50 [] pkcs7_validate_trust+0x524/0x5f0 [] system_verify_data+0xca/0x170 [] ? top_trace_array+0x9b/0x9b [] ? __vfs_read+0x279/0x3d0 [] mod_verify_sig+0x1ff/0x290 [...] The implication is that pkcs7_validate_trust() effectively grants trust when it really shouldn't have. Fix this by explicitly setting *_trusted to false at the very beginning of pkcs7_validate_trust(). Cc: Signed-off-by: Nicolai Stange Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin --- crypto/asymmetric_keys/pkcs7_trust.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/crypto/asymmetric_keys/pkcs7_trust.c b/crypto/asymmetric_keys/pkcs7_trust.c index 0f6463b6692b..44c8447eb97e 100644 --- a/crypto/asymmetric_keys/pkcs7_trust.c +++ b/crypto/asymmetric_keys/pkcs7_trust.c @@ -174,6 +174,8 @@ int pkcs7_validate_trust(struct pkcs7_message *pkcs7, int cached_ret = -ENOKEY; int ret; + *_trusted = false; + for (p = pkcs7->certs; p; p = p->next) p->seen = false; From b9f2aab60bfe7f85dc9ffaeb6f258bfcd4d5bc1a Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 1 Apr 2016 12:28:16 +0200 Subject: [PATCH 143/312] ALSA: timer: Use mod_timer() for rearming the system timer [ Upstream commit 4a07083ed613644c96c34a7dd2853dc5d7c70902 ] ALSA system timer backend stops the timer via del_timer() without sync and leaves del_timer_sync() at the close instead. This is because of the restriction by the design of ALSA timer: namely, the stop callback may be called from the timer handler, and calling the sync shall lead to a hangup. However, this also triggers a kernel BUG() when the timer is rearmed immediately after stopping without sync: kernel BUG at kernel/time/timer.c:966! Call Trace: [] snd_timer_s_start+0x13e/0x1a0 [] snd_timer_interrupt+0x504/0xec0 [] ? debug_check_no_locks_freed+0x290/0x290 [] snd_timer_s_function+0xb4/0x120 [] call_timer_fn+0x162/0x520 [] ? call_timer_fn+0xcd/0x520 [] ? snd_timer_interrupt+0xec0/0xec0 .... It's the place where add_timer() checks the pending timer. It's clear that this may happen after the immediate restart without sync in our cases. So, the workaround here is just to use mod_timer() instead of add_timer(). This looks like a band-aid fix, but it's a right move, as snd_timer_interrupt() takes care of the continuous rearm of timer. Reported-by: Jiri Slaby Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin --- sound/core/timer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/core/timer.c b/sound/core/timer.c index bf48e71f73cd..1782555fcaca 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -1051,8 +1051,8 @@ static int snd_timer_s_start(struct snd_timer * timer) njiff += timer->sticks - priv->correction; priv->correction = 0; } - priv->last_expires = priv->tlist.expires = njiff; - add_timer(&priv->tlist); + priv->last_expires = njiff; + mod_timer(&priv->tlist, njiff); return 0; } From 39bf5645d861cb5dc7913b183fecfc6fad033b4e Mon Sep 17 00:00:00 2001 From: Xishi Qiu Date: Fri, 1 Apr 2016 14:31:20 -0700 Subject: [PATCH 144/312] mm: fix invalid node in alloc_migrate_target() [ Upstream commit 6f25a14a7053b69917e2ebea0d31dd444cd31fd5 ] It is incorrect to use next_node to find a target node, it will return MAX_NUMNODES or invalid node. This will lead to crash in buddy system allocation. Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Xishi Qiu Acked-by: Vlastimil Babka Acked-by: Naoya Horiguchi Cc: Joonsoo Kim Cc: David Rientjes Cc: "Laura Abbott" Cc: Hui Zhu Cc: Wang Xiaoqiang Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/page_isolation.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/page_isolation.c b/mm/page_isolation.c index 303c908790ef..4b640c20f5c5 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -300,11 +300,11 @@ struct page *alloc_migrate_target(struct page *page, unsigned long private, * now as a simple work-around, we use the next node for destination. */ if (PageHuge(page)) { - nodemask_t src = nodemask_of_node(page_to_nid(page)); - nodemask_t dst; - nodes_complement(dst, src); + int node = next_online_node(page_to_nid(page)); + if (node == MAX_NUMNODES) + node = first_online_node; return alloc_huge_page_node(page_hstate(compound_head(page)), - next_node(page_to_nid(page), dst)); + node); } if (PageHighMem(page)) From 7e69c7b26e03e6f573d8613326c94f7ddc5935b8 Mon Sep 17 00:00:00 2001 From: Sebastian Siewior Date: Tue, 8 Mar 2016 10:03:56 +0100 Subject: [PATCH 145/312] powerpc/mm: Fixup preempt underflow with huge pages [ Upstream commit 08a5bb2921e490939f78f38fd0d02858bb709942 ] hugepd_free() used __get_cpu_var() once. Nothing ensured that the code accessing the variable did not migrate from one CPU to another and soon this was noticed by Tiejun Chen in 94b09d755462 ("powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var"). So we had it fixed. Christoph Lameter was doing his __get_cpu_var() replaces and forgot PowerPC. Then he noticed this and sent his fixed up batch again which got applied as 69111bac42f5 ("powerpc: Replace __get_cpu_var uses"). The careful reader will noticed one little detail: get_cpu_var() got replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does a preempt_enable() and nothing that does preempt_disable() so we underflow the preempt counter. Cc: Benjamin Herrenschmidt Cc: Christoph Lameter Cc: stable@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Reviewed-by: Aneesh Kumar K.V Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/mm/hugetlbpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 3385e3d0506e..4d29154a4987 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -472,13 +472,13 @@ static void hugepd_free(struct mmu_gather *tlb, void *hugepte) { struct hugepd_freelist **batchp; - batchp = this_cpu_ptr(&hugepd_freelist_cur); + batchp = &get_cpu_var(hugepd_freelist_cur); if (atomic_read(&tlb->mm->mm_users) < 2 || cpumask_equal(mm_cpumask(tlb->mm), cpumask_of(smp_processor_id()))) { kmem_cache_free(hugepte_cache, hugepte); - put_cpu_var(hugepd_freelist_cur); + put_cpu_var(hugepd_freelist_cur); return; } From da38a239f06b9bd2e8338c0f8c468d2ae04eb046 Mon Sep 17 00:00:00 2001 From: Todd Previte Date: Sat, 18 Apr 2015 00:04:18 -0700 Subject: [PATCH 146/312] drm: Fix for DP CTS test 4.2.2.5 - I2C DEFER handling [ Upstream commit 396aa4451e865d1e36d6d4e0686a9303c038b606 ] For test 4.2.2.5 to pass per the Link CTS Core 1.2 rev1.1 spec, the source device must attempt at least 7 times to read the EDID when it receives an I2C defer. The normal DRM code makes only 7 retries, regardless of whether or not the response is a native defer or an I2C defer. Test 4.2.2.5 fails since there are native defers interspersed with the I2C defers which results in less than 7 EDID read attempts. The solution is to add the numer of defers to the retry counter when an I2C DEFER is returned such that another read attempt will be made. This situation should normally only occur in compliance testing, however, as a worse case real-world scenario, it would result in 13 attempts ( 6 native defers, 7 I2C defers) for a single transaction to complete. The net result is a slightly slower response to an EDID read that shouldn't significantly impact overall performance. V2: - Added a check on the number of I2C Defers to limit the number of times that the retries variable will be decremented. This is to address review feedback regarding possible infinite loops from misbehaving sink devices. V3: - Fixed the limit value to 7 instead of 8 to get the correct retry count. - Combined the increment of the defer count into the if-statement V4: - Removed i915 tag from subject as the patch is not i915-specific V5: - Updated the for-loop to add the number of i2c defers to the retry counter such that the correct number of retry attempts will be made Signed-off-by: Todd Previte Cc: dri-devel@lists.freedesktop.org Reviewed-by: Paulo Zanoni Signed-off-by: Daniel Vetter Signed-off-by: Sasha Levin --- drivers/gpu/drm/drm_dp_helper.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c index 71dcbc64ae98..7f0356ea0bbf 100644 --- a/drivers/gpu/drm/drm_dp_helper.c +++ b/drivers/gpu/drm/drm_dp_helper.c @@ -432,7 +432,7 @@ static u32 drm_dp_i2c_functionality(struct i2c_adapter *adapter) */ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { - unsigned int retry; + unsigned int retry, defer_i2c; int ret; /* @@ -440,7 +440,7 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) * is required to retry at least seven times upon receiving AUX_DEFER * before giving up the AUX transaction. */ - for (retry = 0; retry < 7; retry++) { + for (retry = 0, defer_i2c = 0; retry < (7 + defer_i2c); retry++) { mutex_lock(&aux->hw_mutex); ret = aux->transfer(aux, msg); mutex_unlock(&aux->hw_mutex); @@ -499,7 +499,13 @@ static int drm_dp_i2c_do_msg(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) case DP_AUX_I2C_REPLY_DEFER: DRM_DEBUG_KMS("I2C defer\n"); + /* DP Compliance Test 4.2.2.5 Requirement: + * Must have at least 7 retries for I2C defers on the + * transaction to pass this test + */ aux->i2c_defer_count++; + if (defer_i2c < 7) + defer_i2c++; usleep_range(400, 500); continue; From 28a201e2e8323ffeb3ac2fc51b454a410dde3527 Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Wed, 30 Mar 2016 11:40:43 +0200 Subject: [PATCH 147/312] drm/udl: Use unlocked gem unreferencing [ Upstream commit 72b9ff0612ad8fc969b910cd00ac16b57a1a9ba4 ] For drm_gem_object_unreference callers are required to hold dev->struct_mutex, which these paths don't. Enforcing this requirement has become a bit more strict with commit ef4c6270bf2867e2f8032e9614d1a8cfc6c71663 Author: Daniel Vetter Date: Thu Oct 15 09:36:25 2015 +0200 drm/gem: Check locking in drm_gem_object_unreference Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter Signed-off-by: Dave Airlie Signed-off-by: Sasha Levin --- drivers/gpu/drm/udl/udl_fb.c | 2 +- drivers/gpu/drm/udl/udl_gem.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/udl/udl_fb.c b/drivers/gpu/drm/udl/udl_fb.c index 5fc16cecd3ba..cd8d183dcfe5 100644 --- a/drivers/gpu/drm/udl/udl_fb.c +++ b/drivers/gpu/drm/udl/udl_fb.c @@ -546,7 +546,7 @@ static int udlfb_create(struct drm_fb_helper *helper, return ret; out_gfree: - drm_gem_object_unreference(&ufbdev->ufb.obj->base); + drm_gem_object_unreference_unlocked(&ufbdev->ufb.obj->base); out: return ret; } diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c index 2a0a784ab6ee..d7528e0d8442 100644 --- a/drivers/gpu/drm/udl/udl_gem.c +++ b/drivers/gpu/drm/udl/udl_gem.c @@ -52,7 +52,7 @@ udl_gem_create(struct drm_file *file, return ret; } - drm_gem_object_unreference(&obj->base); + drm_gem_object_unreference_unlocked(&obj->base); *handle_p = handle; return 0; } From cd18a79a6445fecf42114a9124c5ff23d72201cd Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Fri, 25 Mar 2016 10:31:04 -0400 Subject: [PATCH 148/312] drm/radeon: add a dpm quirk for sapphire Dual-X R7 370 2G D5 [ Upstream commit f971f2263deaa4a441e377b385c11aee0f3b3f9a ] bug: https://bugs.freedesktop.org/show_bug.cgi?id=94692 Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/si_dpm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c index c4e0e69b688d..27e8baed4cc3 100644 --- a/drivers/gpu/drm/radeon/si_dpm.c +++ b/drivers/gpu/drm/radeon/si_dpm.c @@ -2925,6 +2925,7 @@ static struct si_dpm_quirk si_dpm_quirk_list[] = { /* PITCAIRN - https://bugs.freedesktop.org/show_bug.cgi?id=76490 */ { PCI_VENDOR_ID_ATI, 0x6810, 0x1462, 0x3036, 0, 120000 }, { PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0xe271, 0, 120000 }, + { PCI_VENDOR_ID_ATI, 0x6811, 0x174b, 0x2015, 0, 120000 }, { PCI_VENDOR_ID_ATI, 0x6810, 0x174b, 0xe271, 85000, 90000 }, { PCI_VENDOR_ID_ATI, 0x6811, 0x1762, 0x2015, 0, 120000 }, { PCI_VENDOR_ID_ATI, 0x6811, 0x1043, 0x2015, 0, 120000 }, From d393b6c7cba4d7e9fd07e0d774e875e6ff0d1dd2 Mon Sep 17 00:00:00 2001 From: Alex Deucher Date: Mon, 28 Mar 2016 10:21:20 -0400 Subject: [PATCH 149/312] drm/radeon: add a dpm quirk for all R7 370 parts [ Upstream commit 0e5585dc870af947fab2af96a88c2d8b4270247c ] Higher mclk values are not stable due to a bug somewhere. Limit them for now. Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/si_dpm.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c index 27e8baed4cc3..f666277a8993 100644 --- a/drivers/gpu/drm/radeon/si_dpm.c +++ b/drivers/gpu/drm/radeon/si_dpm.c @@ -2960,6 +2960,10 @@ static void si_apply_state_adjust_rules(struct radeon_device *rdev, } ++p; } + /* limit mclk on all R7 370 parts for stability */ + if (rdev->pdev->device == 0x6811 && + rdev->pdev->revision == 0x81) + max_mclk = 120000; if ((rdev->pm.dpm.new_active_crtc_count > 1) || ni_dpm_vblank_too_short(rdev)) From 3de6011e6ba781498f9a5ba52638c0dbcd346832 Mon Sep 17 00:00:00 2001 From: Konstantin Khlebnikov Date: Sun, 21 Feb 2016 10:12:39 +0300 Subject: [PATCH 150/312] tcp: convert cached rtt from usec to jiffies when feeding initial rto [ Upstream commit 9bdfb3b79e61c60e1a3e2dc05ad164528afa6b8a ] Currently it's converted into msecs, thus HZ=1000 intact. Signed-off-by: Konstantin Khlebnikov Fixes: 740b0f1841f6 ("tcp: switch rtt estimations to usec resolution") Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv4/tcp_metrics.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c index a51d63a43e33..9c840c5c6047 100644 --- a/net/ipv4/tcp_metrics.c +++ b/net/ipv4/tcp_metrics.c @@ -566,7 +566,7 @@ reset: */ if (crtt > tp->srtt_us) { /* Set RTO like tcp_rtt_estimator(), but from cached RTT. */ - crtt /= 8 * USEC_PER_MSEC; + crtt /= 8 * USEC_PER_SEC / HZ; inet_csk(sk)->icsk_rto = crtt + max(2 * crtt, tcp_rto_min(sk)); } else if (tp->srtt_us == 0) { /* RFC6298: 5.7 We've failed to get a valid RTT sample from From ea82b38fce682e4277cb20f385b3cebbb91cd1e5 Mon Sep 17 00:00:00 2001 From: Bernie Harris Date: Mon, 22 Feb 2016 12:58:05 +1300 Subject: [PATCH 151/312] tunnel: Clear IPCB(skb)->opt before dst_link_failure called [ Upstream commit 5146d1f151122e868e594c7b45115d64825aee5f ] IPCB may contain data from previous layers (in the observed case the qdisc layer). In the observed scenario, the data was misinterpreted as ip header options, which later caused the ihl to be set to an invalid value (<5). This resulted in an infinite loop in the mips implementation of ip_fast_csum. This patch clears IPCB(skb)->opt before dst_link_failure can be called for various types of tunnels. This change only applies to encapsulated ipv4 packets. The code introduced in 11c21a30 which clears all of IPCB has been removed to be consistent with these changes, and instead the opt field is cleared unconditionally in ip_tunnel_xmit. The change in ip_tunnel_xmit applies to SIT, GRE, and IPIP tunnels. The relevant vti, l2tp, and pptp functions already contain similar code for clearing the IPCB. Signed-off-by: Bernie Harris Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv4/ip_tunnel.c | 3 ++- net/ipv4/udp_tunnel.c | 2 ++ net/ipv6/ip6_gre.c | 2 ++ net/ipv6/ip6_tunnel.c | 2 ++ 4 files changed, 8 insertions(+), 1 deletion(-) diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 626d9e56a6bd..35080a708b59 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -652,6 +652,8 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, inner_iph = (const struct iphdr *)skb_inner_network_header(skb); connected = (tunnel->parms.iph.daddr != 0); + memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); + dst = tnl_params->daddr; if (dst == 0) { /* NBMA tunnel */ @@ -749,7 +751,6 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, tunnel->err_time + IPTUNNEL_ERR_TIMEO)) { tunnel->err_count--; - memset(IPCB(skb), 0, sizeof(*IPCB(skb))); dst_link_failure(skb); } else tunnel->err_count = 0; diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c index 6bb98cc193c9..7b534ac04056 100644 --- a/net/ipv4/udp_tunnel.c +++ b/net/ipv4/udp_tunnel.c @@ -90,6 +90,8 @@ int udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb, uh->source = src_port; uh->len = htons(skb->len); + memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); + udp_set_csum(nocheck, skb, src, dst, skb->len); return iptunnel_xmit(sk, rt, skb, src, dst, IPPROTO_UDP, diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index 76be7d311cc4..b1311da5d7b8 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -783,6 +783,8 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev) __u32 mtu; int err; + memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); + if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT)) encap_limit = t->parms.encap_limit; diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 5cafd92c2312..6fd0f96fdac1 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1124,6 +1124,8 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev) u8 tproto; int err; + memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); + tproto = ACCESS_ONCE(t->parms.proto); if (tproto != IPPROTO_IPIP && tproto != 0) return -1; From 2131abc39f1033260ffc66a37e26bef9ddab8b16 Mon Sep 17 00:00:00 2001 From: Diego Viola Date: Tue, 23 Feb 2016 12:04:04 -0300 Subject: [PATCH 152/312] net: jme: fix suspend/resume on JMC260 [ Upstream commit ee50c130c82175eaa0820c96b6d3763928af2241 ] The JMC260 network card fails to suspend/resume because the call to jme_start_irq() was too early, moving the call to jme_start_irq() after the call to jme_reset_link() makes it work. Prior this change suspend/resume would fail unless /sys/power/pm_async=0 was explicitly specified. Relevant bug report: https://bugzilla.kernel.org/show_bug.cgi?id=112351 Signed-off-by: Diego Viola Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/jme.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c index 6e9a792097d3..32d240b332d2 100644 --- a/drivers/net/ethernet/jme.c +++ b/drivers/net/ethernet/jme.c @@ -3316,13 +3316,14 @@ jme_resume(struct device *dev) jme_reset_phy_processor(jme); jme_phy_calibration(jme); jme_phy_setEA(jme); - jme_start_irq(jme); netif_device_attach(netdev); atomic_inc(&jme->link_changing); jme_reset_link(jme); + jme_start_irq(jme); + return 0; } From 830b50eabc046fee60aedf8dc6384a9894033f31 Mon Sep 17 00:00:00 2001 From: Stefan Wahren Date: Tue, 23 Feb 2016 19:23:23 +0000 Subject: [PATCH 153/312] net: qca_spi: Don't clear IFF_BROADCAST [ Upstream commit 2b70bad23c89b121a3e4a00f8968d14ebb78887d ] Currently qcaspi_netdev_setup accidentally clears IFF_BROADCAST. So fix this by keeping the flags from ether_setup. Reported-by: Michael Heimpold Signed-off-by: Stefan Wahren Fixes: 291ab06ecf67 (net: qualcomm: new Ethernet over SPI driver for QCA7000) Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/qualcomm/qca_spi.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/net/ethernet/qualcomm/qca_spi.c b/drivers/net/ethernet/qualcomm/qca_spi.c index 97e4df9bf407..412715c11eb0 100644 --- a/drivers/net/ethernet/qualcomm/qca_spi.c +++ b/drivers/net/ethernet/qualcomm/qca_spi.c @@ -811,7 +811,6 @@ qcaspi_netdev_setup(struct net_device *dev) dev->netdev_ops = &qcaspi_netdev_ops; qcaspi_set_ethtool_ops(dev); dev->watchdog_timeo = QCASPI_TX_TIMEOUT; - dev->flags = IFF_MULTICAST; dev->tx_queue_len = 100; qca = netdev_priv(dev); From 8590142dc5b464c3bbf60ef0b14322850983c1c1 Mon Sep 17 00:00:00 2001 From: Stefan Wahren Date: Tue, 23 Feb 2016 19:23:24 +0000 Subject: [PATCH 154/312] net: qca_spi: clear IFF_TX_SKB_SHARING [ Upstream commit a4690afeb0d2d7ba4d60dfa98a89f3bb1ce60ecd ] ether_setup sets IFF_TX_SKB_SHARING but this is not supported by qca_spi as it modifies the skb on xmit. Signed-off-by: Stefan Wahren Fixes: 291ab06ecf67 (net: qualcomm: new Ethernet over SPI driver for QCA7000) Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/qualcomm/qca_spi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/qualcomm/qca_spi.c b/drivers/net/ethernet/qualcomm/qca_spi.c index 412715c11eb0..cba41860167c 100644 --- a/drivers/net/ethernet/qualcomm/qca_spi.c +++ b/drivers/net/ethernet/qualcomm/qca_spi.c @@ -811,6 +811,7 @@ qcaspi_netdev_setup(struct net_device *dev) dev->netdev_ops = &qcaspi_netdev_ops; qcaspi_set_ethtool_ops(dev); dev->watchdog_timeo = QCASPI_TX_TIMEOUT; + dev->priv_flags &= ~IFF_TX_SKB_SHARING; dev->tx_queue_len = 100; qca = netdev_priv(dev); From 67eab3249eacc0861cf57f6d929e66bf78159379 Mon Sep 17 00:00:00 2001 From: Xin Long Date: Sun, 28 Feb 2016 10:03:51 +0800 Subject: [PATCH 155/312] sctp: lack the check for ports in sctp_v6_cmp_addr [ Upstream commit 40b4f0fd74e46c017814618d67ec9127ff20f157 ] As the member .cmp_addr of sctp_af_inet6, sctp_v6_cmp_addr should also check the port of addresses, just like sctp_v4_cmp_addr, cause it's invoked by sctp_cmp_addr_exact(). Now sctp_v6_cmp_addr just check the port when two addresses have different family, and lack the port check for two ipv6 addresses. that will make sctp_hash_cmp() cannot work well. so fix it by adding ports comparison in sctp_v6_cmp_addr(). Signed-off-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/sctp/ipv6.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 3267a5cbb3e8..18361cbfc882 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -519,6 +519,8 @@ static int sctp_v6_cmp_addr(const union sctp_addr *addr1, } return 0; } + if (addr1->v6.sin6_port != addr2->v6.sin6_port) + return 0; if (!ipv6_addr_equal(&addr1->v6.sin6_addr, &addr2->v6.sin6_addr)) return 0; /* If this is a linklocal address, compare the scope_id. */ From fda740f68c700b1da33a512ec8005d86f275265a Mon Sep 17 00:00:00 2001 From: Benjamin Poirier Date: Mon, 29 Feb 2016 15:03:33 -0800 Subject: [PATCH 156/312] mld, igmp: Fix reserved tailroom calculation [ Upstream commit 1837b2e2bcd23137766555a63867e649c0b637f0 ] The current reserved_tailroom calculation fails to take hlen and tlen into account. skb: [__hlen__|__data____________|__tlen___|__extra__] ^ ^ head skb_end_offset In this representation, hlen + data + tlen is the size passed to alloc_skb. "extra" is the extra space made available in __alloc_skb because of rounding up by kmalloc. We can reorder the representation like so: [__hlen__|__data____________|__extra__|__tlen___] ^ ^ head skb_end_offset The maximum space available for ip headers and payload without fragmentation is min(mtu, data + extra). Therefore, reserved_tailroom = data + extra + tlen - min(mtu, data + extra) = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen) = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen) Compare the second line to the current expression: reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset) and we can see that hlen and tlen are not taken into account. The min() in the third line can be expanded into: if mtu < skb_tailroom - tlen: reserved_tailroom = skb_tailroom - mtu else: reserved_tailroom = tlen Depending on hlen, tlen, mtu and the number of multicast address records, the current code may output skbs that have less tailroom than dev->needed_tailroom or it may output more skbs than needed because not all space available is used. Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs") Signed-off-by: Benjamin Poirier Acked-by: Hannes Frederic Sowa Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/linux/skbuff.h | 24 ++++++++++++++++++++++++ net/ipv4/igmp.c | 3 +-- net/ipv6/mcast.c | 3 +-- 3 files changed, 26 insertions(+), 4 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 6633b0cd3fb9..ca2e26a486ee 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1781,6 +1781,30 @@ static inline void skb_reserve(struct sk_buff *skb, int len) skb->tail += len; } +/** + * skb_tailroom_reserve - adjust reserved_tailroom + * @skb: buffer to alter + * @mtu: maximum amount of headlen permitted + * @needed_tailroom: minimum amount of reserved_tailroom + * + * Set reserved_tailroom so that headlen can be as large as possible but + * not larger than mtu and tailroom cannot be smaller than + * needed_tailroom. + * The required headroom should already have been reserved before using + * this function. + */ +static inline void skb_tailroom_reserve(struct sk_buff *skb, unsigned int mtu, + unsigned int needed_tailroom) +{ + SKB_LINEAR_ASSERT(skb); + if (mtu < skb_tailroom(skb) - needed_tailroom) + /* use at most mtu */ + skb->reserved_tailroom = skb_tailroom(skb) - mtu; + else + /* use up to all available space */ + skb->reserved_tailroom = needed_tailroom; +} + #define ENCAP_TYPE_ETHER 0 #define ENCAP_TYPE_IPPROTO 1 diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index a3a697f5ffba..218abf9fb1ed 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -353,9 +353,8 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu) skb_dst_set(skb, &rt->dst); skb->dev = dev; - skb->reserved_tailroom = skb_end_offset(skb) - - min(mtu, skb_end_offset(skb)); skb_reserve(skb, hlen); + skb_tailroom_reserve(skb, mtu, tlen); skb_reset_network_header(skb); pip = ip_hdr(skb); diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 41e3b5ee8d0b..9a63110b6548 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -1574,9 +1574,8 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu) return NULL; skb->priority = TC_PRIO_CONTROL; - skb->reserved_tailroom = skb_end_offset(skb) - - min(mtu, skb_end_offset(skb)); skb_reserve(skb, hlen); + skb_tailroom_reserve(skb, mtu, tlen); if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) { /* : From 4ded71d9c776483496b7249fafd375e7e09bb205 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= Date: Tue, 1 Mar 2016 14:31:02 +0100 Subject: [PATCH 157/312] qmi_wwan: add Sierra Wireless EM74xx device ID MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit bf13c94ccb33c3182efc92ce4989506a0f541243 ] The MC74xx and EM74xx modules use different IDs by default, according to the Lenovo EM7455 driver for Windows. Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/usb/qmi_wwan.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index cffb25280a3b..8677c6a56cdf 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -767,8 +767,10 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x1199, 0x9061, 8)}, /* Sierra Wireless Modem */ {QMI_FIXED_INTF(0x1199, 0x9070, 8)}, /* Sierra Wireless MC74xx/EM74xx */ {QMI_FIXED_INTF(0x1199, 0x9070, 10)}, /* Sierra Wireless MC74xx/EM74xx */ - {QMI_FIXED_INTF(0x1199, 0x9071, 8)}, /* Sierra Wireless MC74xx/EM74xx */ - {QMI_FIXED_INTF(0x1199, 0x9071, 10)}, /* Sierra Wireless MC74xx/EM74xx */ + {QMI_FIXED_INTF(0x1199, 0x9071, 8)}, /* Sierra Wireless MC74xx */ + {QMI_FIXED_INTF(0x1199, 0x9071, 10)}, /* Sierra Wireless MC74xx */ + {QMI_FIXED_INTF(0x1199, 0x9079, 8)}, /* Sierra Wireless EM74xx */ + {QMI_FIXED_INTF(0x1199, 0x9079, 10)}, /* Sierra Wireless EM74xx */ {QMI_FIXED_INTF(0x1bbb, 0x011e, 4)}, /* Telekom Speedstick LTE II (Alcatel One Touch L100V LTE) */ {QMI_FIXED_INTF(0x1bbb, 0x0203, 2)}, /* Alcatel L800MA */ {QMI_FIXED_INTF(0x2357, 0x0201, 4)}, /* TP-LINK HSUPA Modem MA180 */ From af4f7a3ca6a0928551220a7a8bea3cd3e70b873e Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 1 Mar 2016 16:15:16 +0100 Subject: [PATCH 158/312] ipv6: re-enable fragment header matching in ipv6_find_hdr [ Upstream commit 5d150a985520bbe3cb2aa1ceef24a7e32f20c15f ] When ipv6_find_hdr is used to find a fragment header (caller specifies target NEXTHDR_FRAGMENT) we erronously return -ENOENT for all fragments with nonzero offset. Before commit 9195bb8e381d, when target was specified, we did not enter the exthdr walk loop as nexthdr == target so this used to work. Now we do (so we can skip empty route headers). When we then stumble upon a frag with nonzero frag_off we must return -ENOENT ("header not found") only if the caller did not specifically request NEXTHDR_FRAGMENT. This allows nfables exthdr expression to match ipv6 fragments, e.g. via nft add rule ip6 filter input frag frag-off gt 0 Fixes: 9195bb8e381d ("ipv6: improve ipv6_find_hdr() to skip empty routing headers") Signed-off-by: Florian Westphal Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv6/exthdrs_core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/ipv6/exthdrs_core.c b/net/ipv6/exthdrs_core.c index 5c5d23e59da5..9508a20fbf61 100644 --- a/net/ipv6/exthdrs_core.c +++ b/net/ipv6/exthdrs_core.c @@ -257,7 +257,11 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset, *fragoff = _frag_off; return hp->nexthdr; } - return -ENOENT; + if (!found) + return -ENOENT; + if (fragoff) + *fragoff = _frag_off; + break; } hdrlen = 8; } else if (nexthdr == NEXTHDR_AUTH) { From cf852e989018e95f25e8a620f77e380812a9460f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= Date: Thu, 3 Mar 2016 22:20:53 +0100 Subject: [PATCH 159/312] cdc_ncm: toggle altsetting to force reset before setup MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 48906f62c96cc2cd35753e59310cb70eb08cc6a5 ] Some devices will silently fail setup unless they are reset first. This is necessary even if the data interface is already in altsetting 0, which it will be when the device is probed for the first time. Briefly toggling the altsetting forces a function reset regardless of the initial state. This fixes a setup problem observed on a number of Huawei devices, appearing to operate in NTB-32 mode even if we explicitly set them to NTB-16 mode. Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/usb/cdc_ncm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index 0b481c30979b..5db25e46a962 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -843,7 +843,11 @@ advance: iface_no = ctx->data->cur_altsetting->desc.bInterfaceNumber; - /* reset data interface */ + /* Reset data interface. Some devices will not reset properly + * unless they are configured first. Toggle the altsetting to + * force a reset + */ + usb_set_interface(dev->udev, iface_no, data_altsetting); temp = usb_set_interface(dev->udev, iface_no, 0); if (temp) { dev_dbg(&intf->dev, "set interface failed\n"); From 900aa9788446a8bc7fc40f21199bbe1dbdc1bf19 Mon Sep 17 00:00:00 2001 From: Oliver Neukum Date: Mon, 7 Mar 2016 11:31:10 +0100 Subject: [PATCH 160/312] usbnet: cleanup after bind() in probe() [ Upstream commit 1666984c8625b3db19a9abc298931d35ab7bc64b ] In case bind() works, but a later error forces bailing in probe() in error cases work and a timer may be scheduled. They must be killed. This fixes an error case related to the double free reported in http://www.spinics.net/lists/netdev/msg367669.html and needs to go on top of Linus' fix to cdc-ncm. Signed-off-by: Oliver Neukum Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/usb/usbnet.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index e0498571ae26..edbb2f389337 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -1754,6 +1754,13 @@ out3: if (info->unbind) info->unbind (dev, udev); out1: + /* subdrivers must undo all they did in bind() if they + * fail it, but we may fail later and a deferred kevent + * may trigger an error resubmitting itself and, worse, + * schedule a timer. So we kill it all just in case. + */ + cancel_work_sync(&dev->kevent); + del_timer_sync(&dev->delay); free_netdev(net); out: return status; From 6def7a990b10af3532caed172e7ada9dbb596a53 Mon Sep 17 00:00:00 2001 From: Bill Sommerfeld Date: Fri, 4 Mar 2016 14:47:21 -0800 Subject: [PATCH 161/312] udp6: fix UDP/IPv6 encap resubmit path [ Upstream commit 59dca1d8a6725a121dae6c452de0b2611d5865dc ] IPv4 interprets a negative return value from a protocol handler as a request to redispatch to a new protocol. In contrast, IPv6 interprets a negative value as an error, and interprets a positive value as a request for redispatch. UDP for IPv6 was unaware of this difference. Change __udp6_lib_rcv() to return a positive value for redispatch. Note that the socket's encap_rcv hook still needs to return a negative value to request dispatch, and in the case of IPv6 packets, adjust IP6CB(skb)->nhoff to identify the byte containing the next protocol. Signed-off-by: Bill Sommerfeld Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv6/udp.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 7333f3575fc5..d28b2a1ab0e5 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -913,11 +913,9 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, ret = udpv6_queue_rcv_skb(sk, skb); sock_put(sk); - /* a return value > 0 means to resubmit the input, but - * it wants the return to be -protocol, or 0 - */ + /* a return value > 0 means to resubmit the input */ if (ret > 0) - return -ret; + return ret; return 0; } From 58c92f1359c6811478159cfe3cc3118bdb995608 Mon Sep 17 00:00:00 2001 From: Martin Blumenstingl Date: Sun, 22 Nov 2015 17:46:09 +0100 Subject: [PATCH 162/312] packet: Allow packets with only a header (but no payload) [ Upstream commit 880621c2605b82eb5af91a2c94223df6f5a3fb64 ] Commit 9c7077622dd91 ("packet: make packet_snd fail on len smaller than l2 header") added validation for the packet size in packet_snd. This change enforces that every packet needs a header (with at least hard_header_len bytes) plus a payload with at least one byte. Before this change the payload was optional. This fixes PPPoE connections which do not have a "Service" or "Host-Uniq" configured (which is violating the spec, but is still widely used in real-world setups). Those are currently failing with the following message: "pppd: packet size is too short (24 <= 24)" Signed-off-by: Martin Blumenstingl Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/linux/netdevice.h | 3 ++- net/packet/af_packet.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 05b9a694e213..f6e5cad155e7 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1372,7 +1372,8 @@ enum netdev_priv_flags { * @dma: DMA channel * @mtu: Interface MTU value * @type: Interface hardware type - * @hard_header_len: Hardware header length + * @hard_header_len: Hardware header length, which means that this is the + * minimum size of a packet. * * @needed_headroom: Extra headroom the hardware may need, but not in all * cases can this be guaranteed diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index ebc39e66d704..738c8bac0320 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2112,8 +2112,8 @@ static void tpacket_destruct_skb(struct sk_buff *skb) static bool ll_header_truncated(const struct net_device *dev, int len) { /* net device doesn't like empty head */ - if (unlikely(len <= dev->hard_header_len)) { - net_warn_ratelimited("%s: packet size is too short (%d <= %d)\n", + if (unlikely(len < dev->hard_header_len)) { + net_warn_ratelimited("%s: packet size is too short (%d < %d)\n", current->comm, len, dev->hard_header_len); return true; } From 1df16498dfd0d5a129bdf2982d9a08df73e8923d Mon Sep 17 00:00:00 2001 From: Willem de Bruijn Date: Wed, 9 Mar 2016 21:58:32 -0500 Subject: [PATCH 163/312] net: validate variable length ll headers [ Upstream commit 2793a23aacbd754dbbb5cb75093deb7e4103bace ] Netdevice parameter hard_header_len is variously interpreted both as an upper and lower bound on link layer header length. The field is used as upper bound when reserving room at allocation, as lower bound when validating user input in PF_PACKET. Clarify the definition to be maximum header length. For validation of untrusted headers, add an optional validate member to header_ops. Allow bypassing of validation by passing CAP_SYS_RAWIO, for instance for deliberate testing of corrupt input. In this case, pad trailing bytes, as some device drivers expect completely initialized headers. See also http://comments.gmane.org/gmane.linux.network/401064 Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/linux/netdevice.h | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f6e5cad155e7..6c86c7edafa7 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -265,6 +265,7 @@ struct header_ops { void (*cache_update)(struct hh_cache *hh, const struct net_device *dev, const unsigned char *haddr); + bool (*validate)(const char *ll_header, unsigned int len); }; /* These flag bits are private to the generic network queueing @@ -1372,8 +1373,7 @@ enum netdev_priv_flags { * @dma: DMA channel * @mtu: Interface MTU value * @type: Interface hardware type - * @hard_header_len: Hardware header length, which means that this is the - * minimum size of a packet. + * @hard_header_len: Maximum hardware header length. * * @needed_headroom: Extra headroom the hardware may need, but not in all * cases can this be guaranteed @@ -2417,6 +2417,24 @@ static inline int dev_parse_header(const struct sk_buff *skb, return dev->header_ops->parse(skb, haddr); } +/* ll_header must have at least hard_header_len allocated */ +static inline bool dev_validate_header(const struct net_device *dev, + char *ll_header, int len) +{ + if (likely(len >= dev->hard_header_len)) + return true; + + if (capable(CAP_SYS_RAWIO)) { + memset(ll_header + len, 0, dev->hard_header_len - len); + return true; + } + + if (dev->header_ops && dev->header_ops->validate) + return dev->header_ops->validate(ll_header, len); + + return false; +} + typedef int gifconf_func_t(struct net_device * dev, char __user * bufptr, int len); int register_gifconf(unsigned int family, gifconf_func_t *gifconf); static inline int unregister_gifconf(unsigned int family) From 58ba5f64ecb7c1909c060793be9e46cc28b417db Mon Sep 17 00:00:00 2001 From: Willem de Bruijn Date: Wed, 9 Mar 2016 21:58:33 -0500 Subject: [PATCH 164/312] ax25: add link layer header validation function [ Upstream commit ea47781c26510e5d97f80f9aceafe9065bd5e3aa ] As variable length protocol, AX25 fails link layer header validation tests based on a minimum length. header_ops.validate allows protocols to validate headers that are shorter than hard_header_len. Implement this callback for AX25. See also http://comments.gmane.org/gmane.linux.network/401064 Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ax25/ax25_ip.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/net/ax25/ax25_ip.c b/net/ax25/ax25_ip.c index 7c646bb2c6f7..98d206b22653 100644 --- a/net/ax25/ax25_ip.c +++ b/net/ax25/ax25_ip.c @@ -229,8 +229,23 @@ netdev_tx_t ax25_ip_xmit(struct sk_buff *skb) } #endif +static bool ax25_validate_header(const char *header, unsigned int len) +{ + ax25_digi digi; + + if (!len) + return false; + + if (header[0]) + return true; + + return ax25_addr_parse(header + 1, len - 1, NULL, NULL, &digi, NULL, + NULL); +} + const struct header_ops ax25_header_ops = { .create = ax25_hard_header, + .validate = ax25_validate_header, }; EXPORT_SYMBOL(ax25_header_ops); From 795f5dc952b62c0a4afe5d8872f6f16372243293 Mon Sep 17 00:00:00 2001 From: Willem de Bruijn Date: Wed, 9 Mar 2016 21:58:34 -0500 Subject: [PATCH 165/312] packet: validate variable length ll headers [ Upstream commit 9ed988cd591500c040b2a6257bc68543e08ceeef ] Replace link layer header validation check ll_header_truncate with more generic dev_validate_header. Validation based on hard_header_len incorrectly drops valid packets in variable length protocols, such as AX25. dev_validate_header calls header_ops.validate for such protocols to ensure correctness below hard_header_len. See also http://comments.gmane.org/gmane.linux.network/401064 Fixes 9c7077622dd9 ("packet: make packet_snd fail on len smaller than l2 header") Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/packet/af_packet.c | 37 ++++++++++++++++--------------------- 1 file changed, 16 insertions(+), 21 deletions(-) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 738c8bac0320..68a048472e5f 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1699,6 +1699,10 @@ retry: goto retry; } + if (!dev_validate_header(dev, skb->data, len)) { + err = -EINVAL; + goto out_unlock; + } if (len > (dev->mtu + dev->hard_header_len + extra_len) && !packet_extra_vlan_len_allowed(dev, skb)) { err = -EMSGSIZE; @@ -2109,18 +2113,6 @@ static void tpacket_destruct_skb(struct sk_buff *skb) sock_wfree(skb); } -static bool ll_header_truncated(const struct net_device *dev, int len) -{ - /* net device doesn't like empty head */ - if (unlikely(len < dev->hard_header_len)) { - net_warn_ratelimited("%s: packet size is too short (%d < %d)\n", - current->comm, len, dev->hard_header_len); - return true; - } - - return false; -} - static void tpacket_set_protocol(const struct net_device *dev, struct sk_buff *skb) { @@ -2203,19 +2195,19 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb, if (unlikely(err < 0)) return -EINVAL; } else if (dev->hard_header_len) { - if (ll_header_truncated(dev, tp_len)) - return -EINVAL; + int hdrlen = min_t(int, dev->hard_header_len, tp_len); skb_push(skb, dev->hard_header_len); - err = skb_store_bits(skb, 0, data, - dev->hard_header_len); + err = skb_store_bits(skb, 0, data, hdrlen); if (unlikely(err)) return err; + if (!dev_validate_header(dev, skb->data, hdrlen)) + return -EINVAL; if (!skb->protocol) tpacket_set_protocol(dev, skb); - data += dev->hard_header_len; - to_write -= dev->hard_header_len; + data += hdrlen; + to_write -= hdrlen; } offset = offset_in_page(data); @@ -2538,9 +2530,6 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) offset = dev_hard_header(skb, dev, ntohs(proto), addr, NULL, len); if (unlikely(offset < 0)) goto out_free; - } else { - if (ll_header_truncated(dev, len)) - goto out_free; } /* Returns -EFAULT on error */ @@ -2548,6 +2537,12 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) if (err) goto out_free; + if (sock->type == SOCK_RAW && + !dev_validate_header(dev, skb->data, len)) { + err = -EINVAL; + goto out_free; + } + sock_tx_timestamp(sk, &skb_shinfo(skb)->tx_flags); if (!gso_type && (len > dev->mtu + reserve + extra_len) && From c5e6283a4379eaae2d7fe596fc29ea02a14ac818 Mon Sep 17 00:00:00 2001 From: Sergei Shtylyov Date: Tue, 8 Mar 2016 01:36:28 +0300 Subject: [PATCH 166/312] sh_eth: fix NULL pointer dereference in sh_eth_ring_format() [ Upstream commit c1b7fca65070bfadca94dd53a4e6b71cd4f69715 ] In a low memory situation, if netdev_alloc_skb() fails on a first RX ring loop iteration in sh_eth_ring_format(), 'rxdesc' is still NULL. Avoid kernel oops by adding the 'rxdesc' check after the loop. Reported-by: Wolfram Sang Signed-off-by: Sergei Shtylyov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/renesas/sh_eth.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c index 13463c4acc86..51eff9fefb4e 100644 --- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -1173,7 +1173,8 @@ static void sh_eth_ring_format(struct net_device *ndev) mdp->dirty_rx = (u32) (i - mdp->num_rx_ring); /* Mark the last entry as wrapping the ring. */ - rxdesc->status |= cpu_to_edmac(mdp, RD_RDEL); + if (rxdesc) + rxdesc->status |= cpu_to_edmac(mdp, RD_RDEL); memset(mdp->tx_ring, 0, tx_ringsize); From 52643323baa9f6aea370fb00c9a01798ff727302 Mon Sep 17 00:00:00 2001 From: Sergei Shtylyov Date: Sat, 24 Oct 2015 00:46:03 +0300 Subject: [PATCH 167/312] sh_eth: fix RX buffer size alignment [ Upstream commit ab8579169b79c062935dade949287113c7c1ba73 ] Both Renesas R-Car and RZ/A1 manuals state that RX buffer length must be a multiple of 32 bytes, while the driver only uses 16 byte granularity... Signed-off-by: Sergei Shtylyov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/renesas/sh_eth.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c index 51eff9fefb4e..c93a458f96f7 100644 --- a/drivers/net/ethernet/renesas/sh_eth.c +++ b/drivers/net/ethernet/renesas/sh_eth.c @@ -1148,8 +1148,8 @@ static void sh_eth_ring_format(struct net_device *ndev) /* RX descriptor */ rxdesc = &mdp->rx_ring[i]; - /* The size of the buffer is a multiple of 16 bytes. */ - rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16); + /* The size of the buffer is a multiple of 32 bytes. */ + rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 32); dma_addr = dma_map_single(&ndev->dev, skb->data, rxdesc->buffer_length, DMA_FROM_DEVICE); @@ -1507,7 +1507,7 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota) if (mdp->cd->rpadir) skb_reserve(skb, NET_IP_ALIGN); dma_unmap_single(&ndev->dev, rxdesc->addr, - ALIGN(mdp->rx_buf_sz, 16), + ALIGN(mdp->rx_buf_sz, 32), DMA_FROM_DEVICE); skb_put(skb, pkt_len); skb->protocol = eth_type_trans(skb, ndev); @@ -1525,8 +1525,8 @@ static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota) for (; mdp->cur_rx - mdp->dirty_rx > 0; mdp->dirty_rx++) { entry = mdp->dirty_rx % mdp->num_rx_ring; rxdesc = &mdp->rx_ring[entry]; - /* The size of the buffer is 16 byte boundary. */ - rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 16); + /* The size of the buffer is 32 byte boundary. */ + rxdesc->buffer_length = ALIGN(mdp->rx_buf_sz, 32); if (mdp->rx_skbuff[entry] == NULL) { skb = netdev_alloc_skb(ndev, skbuff_size); From fa565f5389b3886d3b28a8c12ad7ee8511098ca3 Mon Sep 17 00:00:00 2001 From: Rajesh Borundia Date: Tue, 8 Mar 2016 02:39:57 -0500 Subject: [PATCH 168/312] qlcnic: Remove unnecessary usage of atomic_t [ Upstream commit 5bf93251cee1fb66141d1d2eaff86e04a9397bdf ] o atomic_t usage is incorrect as we are not implementing any atomicity. Signed-off-by: Rajesh Borundia Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/qlogic/qlcnic/qlcnic.h | 2 +- drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 9 ++++----- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h index f221126a5c4e..77472de033f2 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h @@ -1099,7 +1099,7 @@ struct qlcnic_mailbox { unsigned long status; spinlock_t queue_lock; /* Mailbox queue lock */ spinlock_t aen_lock; /* Mailbox response/AEN lock */ - atomic_t rsp_status; + u32 rsp_status; u32 num_cmds; }; diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index 840bf36b5e9d..98ac6134d6a2 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -489,7 +489,7 @@ irqreturn_t qlcnic_83xx_clear_legacy_intr(struct qlcnic_adapter *adapter) static inline void qlcnic_83xx_notify_mbx_response(struct qlcnic_mailbox *mbx) { - atomic_set(&mbx->rsp_status, QLC_83XX_MBX_RESPONSE_ARRIVED); + mbx->rsp_status = QLC_83XX_MBX_RESPONSE_ARRIVED; complete(&mbx->completion); } @@ -508,7 +508,7 @@ static void qlcnic_83xx_poll_process_aen(struct qlcnic_adapter *adapter) if (event & QLCNIC_MBX_ASYNC_EVENT) { __qlcnic_83xx_process_aen(adapter); } else { - if (atomic_read(&mbx->rsp_status) != rsp_status) + if (mbx->rsp_status != rsp_status) qlcnic_83xx_notify_mbx_response(mbx); } out: @@ -1023,7 +1023,7 @@ static void qlcnic_83xx_process_aen(struct qlcnic_adapter *adapter) if (event & QLCNIC_MBX_ASYNC_EVENT) { __qlcnic_83xx_process_aen(adapter); } else { - if (atomic_read(&mbx->rsp_status) != rsp_status) + if (mbx->rsp_status != rsp_status) qlcnic_83xx_notify_mbx_response(mbx); } } @@ -4025,7 +4025,6 @@ static void qlcnic_83xx_mailbox_worker(struct work_struct *work) struct qlcnic_adapter *adapter = mbx->adapter; struct qlcnic_mbx_ops *mbx_ops = mbx->ops; struct device *dev = &adapter->pdev->dev; - atomic_t *rsp_status = &mbx->rsp_status; struct list_head *head = &mbx->cmd_q; struct qlcnic_hardware_context *ahw; struct qlcnic_cmd_args *cmd = NULL; @@ -4038,7 +4037,7 @@ static void qlcnic_83xx_mailbox_worker(struct work_struct *work) return; } - atomic_set(rsp_status, QLC_83XX_MBX_RESPONSE_WAIT); + mbx->rsp_status = QLC_83XX_MBX_RESPONSE_WAIT; spin_lock(&mbx->queue_lock); From 5284ee1e9da2f8f78b5b800109ef4c436c58876d Mon Sep 17 00:00:00 2001 From: Rajesh Borundia Date: Tue, 8 Mar 2016 02:39:58 -0500 Subject: [PATCH 169/312] qlcnic: Fix mailbox completion handling during spurious interrupt [ Upstream commit 819bfe764dceec2f6b4551768453f374b4c60443 ] o While the driver is in the middle of a MB completion processing and it receives a spurious MB interrupt, it is mistaken as a good MB completion interrupt leading to premature completion of the next MB request. Fix the driver to guard against this by checking the current state of MB processing and ignore the spurious interrupt. Also added a stats counter to record this condition. Signed-off-by: Rajesh Borundia Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/qlogic/qlcnic/qlcnic.h | 1 + .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 15 +++++++++++---- .../net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c | 3 ++- 3 files changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h index 77472de033f2..d0992825c47c 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h @@ -567,6 +567,7 @@ struct qlcnic_adapter_stats { u64 tx_dma_map_error; u64 spurious_intr; u64 mac_filter_limit_overrun; + u64 mbx_spurious_intr; }; /* diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index 98ac6134d6a2..dd618d7ed257 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -2338,9 +2338,9 @@ static void qlcnic_83xx_handle_link_aen(struct qlcnic_adapter *adapter, static irqreturn_t qlcnic_83xx_handle_aen(int irq, void *data) { + u32 mask, resp, event, rsp_status = QLC_83XX_MBX_RESPONSE_ARRIVED; struct qlcnic_adapter *adapter = data; struct qlcnic_mailbox *mbx; - u32 mask, resp, event; unsigned long flags; mbx = adapter->ahw->mailbox; @@ -2350,10 +2350,14 @@ static irqreturn_t qlcnic_83xx_handle_aen(int irq, void *data) goto out; event = readl(QLCNIC_MBX_FW(adapter->ahw, 0)); - if (event & QLCNIC_MBX_ASYNC_EVENT) + if (event & QLCNIC_MBX_ASYNC_EVENT) { __qlcnic_83xx_process_aen(adapter); - else - qlcnic_83xx_notify_mbx_response(mbx); + } else { + if (mbx->rsp_status != rsp_status) + qlcnic_83xx_notify_mbx_response(mbx); + else + adapter->stats.mbx_spurious_intr++; + } out: mask = QLCRDX(adapter->ahw, QLCNIC_DEF_INT_MASK); @@ -4028,6 +4032,7 @@ static void qlcnic_83xx_mailbox_worker(struct work_struct *work) struct list_head *head = &mbx->cmd_q; struct qlcnic_hardware_context *ahw; struct qlcnic_cmd_args *cmd = NULL; + unsigned long flags; ahw = adapter->ahw; @@ -4037,7 +4042,9 @@ static void qlcnic_83xx_mailbox_worker(struct work_struct *work) return; } + spin_lock_irqsave(&mbx->aen_lock, flags); mbx->rsp_status = QLC_83XX_MBX_RESPONSE_WAIT; + spin_unlock_irqrestore(&mbx->aen_lock, flags); spin_lock(&mbx->queue_lock); diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c index 494e8105adee..0a2318cad34d 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c @@ -59,7 +59,8 @@ static const struct qlcnic_stats qlcnic_gstrings_stats[] = { QLC_OFF(stats.mac_filter_limit_overrun)}, {"spurious intr", QLC_SIZEOF(stats.spurious_intr), QLC_OFF(stats.spurious_intr)}, - + {"mbx spurious intr", QLC_SIZEOF(stats.mbx_spurious_intr), + QLC_OFF(stats.mbx_spurious_intr)}, }; static const char qlcnic_device_gstrings_stats[][ETH_GSTRING_LEN] = { From 686263920d666722c42b5347123113bd85ec67cc Mon Sep 17 00:00:00 2001 From: Willem de Bruijn Date: Tue, 8 Mar 2016 15:18:54 -0500 Subject: [PATCH 170/312] macvtap: always pass ethernet header in linear [ Upstream commit 8e2ad4113ce4671686740f808ff2795395c39eef ] The stack expects link layer headers in the skb linear section. Macvtap can create skbs with llheader in frags in edge cases: when (IFF_VNET_HDR is off or vnet_hdr.hdr_len < ETH_HLEN) and prepad + len > PAGE_SIZE and vnet_hdr.flags has no or bad csum. Add checks to ensure linear is always at least ETH_HLEN. At this point, len is already ensured to be >= ETH_HLEN. For backwards compatiblity, rounds up short vnet_hdr.hdr_len. This differs from tap and packet, which return an error. Fixes b9fb9ee07e67 ("macvtap: add GSO/csum offload support") Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/macvtap.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index 4dba5fbc735e..2b212f3e140c 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -710,6 +710,8 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m, macvtap16_to_cpu(q, vnet_hdr.hdr_len) : GOODCOPY_LEN; if (copylen > good_linear) copylen = good_linear; + else if (copylen < ETH_HLEN) + copylen = ETH_HLEN; linear = copylen; i = *from; iov_iter_advance(&i, copylen); @@ -719,10 +721,11 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m, if (!zerocopy) { copylen = len; - if (macvtap16_to_cpu(q, vnet_hdr.hdr_len) > good_linear) + linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len); + if (linear > good_linear) linear = good_linear; - else - linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len); + else if (linear < ETH_HLEN) + linear = ETH_HLEN; } skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen, From 86de8271be91cce66aace5a3ae8afd3f28094957 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sun, 13 Mar 2016 23:28:00 -0400 Subject: [PATCH 171/312] ipv4: Don't do expensive useless work during inetdev destroy. [ Upstream commit fbd40ea0180a2d328c5adc61414dc8bab9335ce2 ] When an inetdev is destroyed, every address assigned to the interface is removed. And in this scenerio we do two pointless things which can be very expensive if the number of assigned interfaces is large: 1) Address promotion. We are deleting all addresses, so there is no point in doing this. 2) A full nf conntrack table purge for every address. We only need to do this once, as is already caught by the existing masq_dev_notifier so masq_inet_event() can skip this. Reported-by: Solar Designer Signed-off-by: David S. Miller Tested-by: Cyrill Gorcunov Signed-off-by: Sasha Levin --- net/ipv4/devinet.c | 4 ++++ net/ipv4/fib_frontend.c | 4 ++++ net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 12 ++++++++++-- 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 280d46f947ea..a57056d87a43 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -334,6 +334,9 @@ static void __inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, ASSERT_RTNL(); + if (in_dev->dead) + goto no_promotions; + /* 1. Deleting primary ifaddr forces deletion all secondaries * unless alias promotion is set **/ @@ -380,6 +383,7 @@ static void __inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, fib_del_ifaddr(ifa, ifa1); } +no_promotions: /* 2. Unlink it */ *ifap = ifa1->ifa_next; diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 872494e6e6eb..87766363d2a4 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -876,6 +876,9 @@ void fib_del_ifaddr(struct in_ifaddr *ifa, struct in_ifaddr *iprim) subnet = 1; } + if (in_dev->dead) + goto no_promotions; + /* Deletion is more complicated than add. * We should take care of not to delete too much :-) * @@ -951,6 +954,7 @@ void fib_del_ifaddr(struct in_ifaddr *ifa, struct in_ifaddr *iprim) } } +no_promotions: if (!(ok & BRD_OK)) fib_magic(RTM_DELROUTE, RTN_BROADCAST, ifa->ifa_broadcast, 32, prim); if (subnet && ifa->ifa_prefixlen < 31) { diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c index c6eb42100e9a..ea91058b5f6f 100644 --- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c +++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c @@ -108,10 +108,18 @@ static int masq_inet_event(struct notifier_block *this, unsigned long event, void *ptr) { - struct net_device *dev = ((struct in_ifaddr *)ptr)->ifa_dev->dev; + struct in_device *idev = ((struct in_ifaddr *)ptr)->ifa_dev; struct netdev_notifier_info info; - netdev_notifier_info_init(&info, dev); + /* The masq_dev_notifier will catch the case of the device going + * down. So if the inetdev is dead and being destroyed we have + * no work to do. Otherwise this is an individual address removal + * and we have to perform the flush. + */ + if (idev->dead) + return NOTIFY_DONE; + + netdev_notifier_info_init(&info, idev->dev); return masq_device_event(this, event, &info); } From 8ca7bf099ae0e6ff096b3910895b5285a112aeb5 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Mon, 14 Mar 2016 09:56:35 -0300 Subject: [PATCH 172/312] net: Fix use after free in the recvmmsg exit path [ Upstream commit 34b88a68f26a75e4fded796f1a49c40f82234b7d ] The syzkaller fuzzer hit the following use-after-free: Call Trace: [] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295 [] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261 [< inline >] SYSC_recvmmsg net/socket.c:2281 [] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270 [] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 And, as Dmitry rightly assessed, that is because we can drop the reference and then touch it when the underlying recvmsg calls return some packets and then hit an error, which will make recvmmsg to set sock->sk->sk_err, oops, fix it. Reported-and-Tested-by: Dmitry Vyukov Cc: Alexander Potapenko Cc: Eric Dumazet Cc: Kostya Serebryany Cc: Sasha Levin Fixes: a2e2725541fa ("net: Introduce recvmmsg socket syscall") http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/socket.c | 44 ++++++++++++++++++++++---------------------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/net/socket.c b/net/socket.c index dcbfa868e398..e66e4f357506 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2247,31 +2247,31 @@ int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen, break; } + if (err == 0) + goto out_put; + + if (datagrams == 0) { + datagrams = err; + goto out_put; + } + + /* + * We may return less entries than requested (vlen) if the + * sock is non block and there aren't enough datagrams... + */ + if (err != -EAGAIN) { + /* + * ... or if recvmsg returns an error after we + * received some datagrams, where we record the + * error to return on the next call or if the + * app asks about it using getsockopt(SO_ERROR). + */ + sock->sk->sk_err = -err; + } out_put: fput_light(sock->file, fput_needed); - if (err == 0) - return datagrams; - - if (datagrams != 0) { - /* - * We may return less entries than requested (vlen) if the - * sock is non block and there aren't enough datagrams... - */ - if (err != -EAGAIN) { - /* - * ... or if recvmsg returns an error after we - * received some datagrams, where we record the - * error to return on the next call or if the - * app asks about it using getsockopt(SO_ERROR). - */ - sock->sk->sk_err = -err; - } - - return datagrams; - } - - return err; + return datagrams; } SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg, From c7c3a3219def53ece4b535f220deefa021ee09f7 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 14 Mar 2016 15:18:34 +0100 Subject: [PATCH 173/312] mlx4: add missing braces in verify_qp_parameters [ Upstream commit baefd7015cdb304ce6c94f9679d0486c71954766 ] The implementation of QP paravirtualization back in linux-3.7 included some code that looks very dubious, and gcc-6 has grown smart enough to warn about it: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3154:5: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation] if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) { ^~ drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3144:4: note: ...this 'if' clause, but it is not if (slave != mlx4_master_func_num(dev)) >From looking at the context, I'm reasonably sure that the indentation is correct but that it should have contained curly braces from the start, as the update_gid() function in the same patch correctly does. Signed-off-by: Arnd Bergmann Fixes: 54679e148287 ("mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop") Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c index bafe2180cf0c..e662ab39499e 100644 --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c @@ -2960,7 +2960,7 @@ static int verify_qp_parameters(struct mlx4_dev *dev, case QP_TRANS_RTS2RTS: case QP_TRANS_SQD2SQD: case QP_TRANS_SQD2RTS: - if (slave != mlx4_master_func_num(dev)) + if (slave != mlx4_master_func_num(dev)) { if (optpar & MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH) { port = (qp_ctx->pri_path.sched_queue >> 6 & 1) + 1; if (dev->caps.port_mask[port] != MLX4_PORT_TYPE_IB) @@ -2979,6 +2979,7 @@ static int verify_qp_parameters(struct mlx4_dev *dev, if (qp_ctx->alt_path.mgid_index >= num_gids) return -EINVAL; } + } break; default: break; From 7b95465ef5afa3924fc3dbb5d27a9667a18f4f81 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 14 Mar 2016 15:18:35 +0100 Subject: [PATCH 174/312] farsync: fix off-by-one bug in fst_add_one [ Upstream commit e725a66c0202b5f36c2f9d59d26a65c53bbf21f7 ] gcc-6 finds an out of bounds access in the fst_add_one function when calculating the end of the mmio area: drivers/net/wan/farsync.c: In function 'fst_add_one': drivers/net/wan/farsync.c:418:53: error: index 2 denotes an offset greater than size of 'u8[2][8192] {aka unsigned char[2][8192]}' [-Werror=array-bounds] #define BUF_OFFSET(X) (BFM_BASE + offsetof(struct buf_window, X)) ^ include/linux/compiler-gcc.h:158:21: note: in definition of macro '__compiler_offsetof' __builtin_offsetof(a, b) ^ drivers/net/wan/farsync.c:418:37: note: in expansion of macro 'offsetof' #define BUF_OFFSET(X) (BFM_BASE + offsetof(struct buf_window, X)) ^~~~~~~~ drivers/net/wan/farsync.c:2519:36: note: in expansion of macro 'BUF_OFFSET' + BUF_OFFSET ( txBuffer[i][NUM_TX_BUFFER][0]); ^~~~~~~~~~ The warning is correct, but not critical because this appears to be a write-only variable that is set by each WAN driver but never accessed afterwards. I'm taking the minimal fix here, using the correct pointer by pointing 'mem_end' to the last byte inside of the register area as all other WAN drivers do, rather than the first byte outside of it. An alternative would be to just remove the mem_end member entirely. Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/wan/farsync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wan/farsync.c b/drivers/net/wan/farsync.c index 44541dbc5c28..69b994f3b8c5 100644 --- a/drivers/net/wan/farsync.c +++ b/drivers/net/wan/farsync.c @@ -2516,7 +2516,7 @@ fst_add_one(struct pci_dev *pdev, const struct pci_device_id *ent) dev->mem_start = card->phys_mem + BUF_OFFSET ( txBuffer[i][0][0]); dev->mem_end = card->phys_mem - + BUF_OFFSET ( txBuffer[i][NUM_TX_BUFFER][0]); + + BUF_OFFSET ( txBuffer[i][NUM_TX_BUFFER - 1][LEN_RX_BUFFER - 1]); dev->base_addr = card->pci_conf; dev->irq = card->irq; From 58bcdf4573b78c0eecf682abe636ad7363e19635 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 14 Mar 2016 15:18:36 +0100 Subject: [PATCH 175/312] ath9k: fix buffer overrun for ar9287 [ Upstream commit 83d6f1f15f8cce844b0a131cbc63e444620e48b5 ] Code that was added back in 2.6.38 has an obvious overflow when accessing a static array, and at the time it was added only a code comment was put in front of it as a reminder to have it reviewed properly. This has not happened, but gcc-6 now points to the specific overflow: drivers/net/wireless/ath/ath9k/eeprom.c: In function 'ath9k_hw_get_gain_boundaries_pdadcs': drivers/net/wireless/ath/ath9k/eeprom.c:483:44: error: array subscript is above array bounds [-Werror=array-bounds] maxPwrT4[i] = data_9287[idxL].pwrPdg[i][4]; ~~~~~~~~~~~~~~~~~~~~~~~~~^~~ It turns out that the correct array length exists in the local 'intercepts' variable of this function, so we can just use that instead of hardcoding '4', so this patch changes all three instances to use that variable. The other two instances were already correct, but it's more consistent this way. Signed-off-by: Arnd Bergmann Fixes: 940cd2c12ebf ("ath9k_hw: merge the ar9287 version of ath9k_hw_get_gain_boundaries_pdadcs") Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/wireless/ath/ath9k/eeprom.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/ath/ath9k/eeprom.c b/drivers/net/wireless/ath/ath9k/eeprom.c index cc81482c934d..113a43fca9cf 100644 --- a/drivers/net/wireless/ath/ath9k/eeprom.c +++ b/drivers/net/wireless/ath/ath9k/eeprom.c @@ -403,10 +403,9 @@ void ath9k_hw_get_gain_boundaries_pdadcs(struct ath_hw *ah, if (match) { if (AR_SREV_9287(ah)) { - /* FIXME: array overrun? */ for (i = 0; i < numXpdGains; i++) { minPwrT4[i] = data_9287[idxL].pwrPdg[i][0]; - maxPwrT4[i] = data_9287[idxL].pwrPdg[i][4]; + maxPwrT4[i] = data_9287[idxL].pwrPdg[i][intercepts - 1]; ath9k_hw_fill_vpd_table(minPwrT4[i], maxPwrT4[i], data_9287[idxL].pwrPdg[i], data_9287[idxL].vpdPdg[i], @@ -416,7 +415,7 @@ void ath9k_hw_get_gain_boundaries_pdadcs(struct ath_hw *ah, } else if (eeprom_4k) { for (i = 0; i < numXpdGains; i++) { minPwrT4[i] = data_4k[idxL].pwrPdg[i][0]; - maxPwrT4[i] = data_4k[idxL].pwrPdg[i][4]; + maxPwrT4[i] = data_4k[idxL].pwrPdg[i][intercepts - 1]; ath9k_hw_fill_vpd_table(minPwrT4[i], maxPwrT4[i], data_4k[idxL].pwrPdg[i], data_4k[idxL].vpdPdg[i], @@ -426,7 +425,7 @@ void ath9k_hw_get_gain_boundaries_pdadcs(struct ath_hw *ah, } else { for (i = 0; i < numXpdGains; i++) { minPwrT4[i] = data_def[idxL].pwrPdg[i][0]; - maxPwrT4[i] = data_def[idxL].pwrPdg[i][4]; + maxPwrT4[i] = data_def[idxL].pwrPdg[i][intercepts - 1]; ath9k_hw_fill_vpd_table(minPwrT4[i], maxPwrT4[i], data_def[idxL].pwrPdg[i], data_def[idxL].vpdPdg[i], From 827e50724c0009dc327627df1edb478cda1ca32a Mon Sep 17 00:00:00 2001 From: Guillaume Nault Date: Mon, 14 Mar 2016 21:17:16 +0100 Subject: [PATCH 176/312] ppp: ensure file->private_data can't be overridden [ Upstream commit e8e56ffd9d2973398b60ece1f1bebb8d67b4d032 ] Locking ppp_mutex must be done before dereferencing file->private_data, otherwise it could be modified before ppp_unattached_ioctl() takes the lock. This could lead ppp_unattached_ioctl() to override ->private_data, thus leaking reference to the ppp_file previously pointed to. v2: lock all ppp_ioctl() instead of just checking private_data in ppp_unattached_ioctl(), to avoid ambiguous behaviour. Fixes: f3ff8a4d80e8 ("ppp: push BKL down into the driver") Signed-off-by: Guillaume Nault Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ppp/ppp_generic.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index cfe49a07c7c1..221334a9f1a1 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -563,7 +563,7 @@ static int get_filter(void __user *arg, struct sock_filter **p) static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { - struct ppp_file *pf = file->private_data; + struct ppp_file *pf; struct ppp *ppp; int err = -EFAULT, val, val2, i; struct ppp_idle idle; @@ -573,9 +573,14 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg) void __user *argp = (void __user *)arg; int __user *p = argp; - if (!pf) - return ppp_unattached_ioctl(current->nsproxy->net_ns, - pf, file, cmd, arg); + mutex_lock(&ppp_mutex); + + pf = file->private_data; + if (!pf) { + err = ppp_unattached_ioctl(current->nsproxy->net_ns, + pf, file, cmd, arg); + goto out; + } if (cmd == PPPIOCDETACH) { /* @@ -590,7 +595,6 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg) * this fd and reopening /dev/ppp. */ err = -EINVAL; - mutex_lock(&ppp_mutex); if (pf->kind == INTERFACE) { ppp = PF_TO_PPP(pf); if (file == ppp->owner) @@ -602,15 +606,13 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg) } else pr_warn("PPPIOCDETACH file->f_count=%ld\n", atomic_long_read(&file->f_count)); - mutex_unlock(&ppp_mutex); - return err; + goto out; } if (pf->kind == CHANNEL) { struct channel *pch; struct ppp_channel *chan; - mutex_lock(&ppp_mutex); pch = PF_TO_CHANNEL(pf); switch (cmd) { @@ -632,17 +634,16 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg) err = chan->ops->ioctl(chan, cmd, arg); up_read(&pch->chan_sem); } - mutex_unlock(&ppp_mutex); - return err; + goto out; } if (pf->kind != INTERFACE) { /* can't happen */ pr_err("PPP: not interface or channel??\n"); - return -EINVAL; + err = -EINVAL; + goto out; } - mutex_lock(&ppp_mutex); ppp = PF_TO_PPP(pf); switch (cmd) { case PPPIOCSMRU: @@ -817,7 +818,10 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg) default: err = -ENOTTY; } + +out: mutex_unlock(&ppp_mutex); + return err; } @@ -830,7 +834,6 @@ static int ppp_unattached_ioctl(struct net *net, struct ppp_file *pf, struct ppp_net *pn; int __user *p = (int __user *)arg; - mutex_lock(&ppp_mutex); switch (cmd) { case PPPIOCNEWUNIT: /* Create a new ppp unit */ @@ -881,7 +884,7 @@ static int ppp_unattached_ioctl(struct net *net, struct ppp_file *pf, default: err = -ENOTTY; } - mutex_unlock(&ppp_mutex); + return err; } From bb9863ec605470a48c67b91a37c25d04136acab6 Mon Sep 17 00:00:00 2001 From: Manish Chopra Date: Tue, 15 Mar 2016 07:13:45 -0400 Subject: [PATCH 177/312] qlge: Fix receive packets drop. [ Upstream commit 2c9a266afefe137bff06bbe0fc48b4d3b3cb348c ] When running small packets [length < 256 bytes] traffic, packets were being dropped due to invalid data in those packets which were delivered by the driver upto the stack. Using pci_dma_sync_single_for_cpu ensures copying latest and updated data into skb from the receive buffer. Signed-off-by: Sony Chacko Signed-off-by: Manish Chopra Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/qlogic/qlge/qlge_main.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c index 25800a1dedcb..b915de060a42 100644 --- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c +++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c @@ -1648,7 +1648,18 @@ static void ql_process_mac_rx_skb(struct ql_adapter *qdev, return; } skb_reserve(new_skb, NET_IP_ALIGN); + + pci_dma_sync_single_for_cpu(qdev->pdev, + dma_unmap_addr(sbq_desc, mapaddr), + dma_unmap_len(sbq_desc, maplen), + PCI_DMA_FROMDEVICE); + memcpy(skb_put(new_skb, length), skb->data, length); + + pci_dma_sync_single_for_device(qdev->pdev, + dma_unmap_addr(sbq_desc, mapaddr), + dma_unmap_len(sbq_desc, maplen), + PCI_DMA_FROMDEVICE); skb = new_skb; /* Frame error, so drop the packet. */ From 1fdc6943a177598ea960188f8972808bfa20dd84 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 17 Mar 2016 11:57:06 -0700 Subject: [PATCH 178/312] net: bcmgenet: fix dma api length mismatch [ Upstream commit eee577232203842b4dcadb7ab477a298479633ed ] When un-mapping skb->data in __bcmgenet_tx_reclaim(), we must use the length that was used in original dma_map_single(), instead of skb->len that might be bigger (includes the frags) We simply can store skb_len into tx_cb_ptr->dma_len and use it at unmap time. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Eric Dumazet Acked-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index 6043734ea613..a9fcac044e9e 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1048,7 +1048,7 @@ static unsigned int __bcmgenet_tx_reclaim(struct net_device *dev, dev->stats.tx_bytes += tx_cb_ptr->skb->len; dma_unmap_single(&dev->dev, dma_unmap_addr(tx_cb_ptr, dma_addr), - tx_cb_ptr->skb->len, + dma_unmap_len(tx_cb_ptr, dma_len), DMA_TO_DEVICE); bcmgenet_free_cb(tx_cb_ptr); } else if (dma_unmap_addr(tx_cb_ptr, dma_addr)) { @@ -1159,7 +1159,7 @@ static int bcmgenet_xmit_single(struct net_device *dev, } dma_unmap_addr_set(tx_cb_ptr, dma_addr, mapping); - dma_unmap_len_set(tx_cb_ptr, dma_len, skb->len); + dma_unmap_len_set(tx_cb_ptr, dma_len, skb_len); length_status = (skb_len << DMA_BUFLENGTH_SHIFT) | dma_desc_flags | (priv->hw_params->qtag_mask << DMA_TX_QTAG_SHIFT) | DMA_TX_APPEND_CRC; From 6acfc65a07790d15de590a2441ab55f51d36733a Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 17 Mar 2016 17:23:36 -0700 Subject: [PATCH 179/312] bonding: fix bond_get_stats() [ Upstream commit fe30937b65354c7fec244caebbdaae68e28ca797 ] bond_get_stats() can be called from rtnetlink (with RTNL held) or from /proc/net/dev seq handler (with RCU held) The logic added in commit 5f0c5f73e5ef ("bonding: make global bonding stats more reliable") kind of assumed only one cpu could run there. If multiple threads are reading /proc/net/dev, stats can be really messed up after a while. A second problem is that some fields are 32bit, so we need to properly handle the wrap around problem. Given that RTNL is not always held, we need to use bond_for_each_slave_rcu(). Fixes: 5f0c5f73e5ef ("bonding: make global bonding stats more reliable") Signed-off-by: Eric Dumazet Cc: Andy Gospodarek Cc: Jay Vosburgh Cc: Veaceslav Falico Reviewed-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/bonding/bond_main.c | 63 ++++++++++++++++++--------------- include/net/bonding.h | 1 + 2 files changed, 35 insertions(+), 29 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index bd744e31c434..9ba92e23e67f 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3246,6 +3246,30 @@ static int bond_close(struct net_device *bond_dev) return 0; } +/* fold stats, assuming all rtnl_link_stats64 fields are u64, but + * that some drivers can provide 32bit values only. + */ +static void bond_fold_stats(struct rtnl_link_stats64 *_res, + const struct rtnl_link_stats64 *_new, + const struct rtnl_link_stats64 *_old) +{ + const u64 *new = (const u64 *)_new; + const u64 *old = (const u64 *)_old; + u64 *res = (u64 *)_res; + int i; + + for (i = 0; i < sizeof(*_res) / sizeof(u64); i++) { + u64 nv = new[i]; + u64 ov = old[i]; + + /* detects if this particular field is 32bit only */ + if (((nv | ov) >> 32) == 0) + res[i] += (u32)nv - (u32)ov; + else + res[i] += nv - ov; + } +} + static struct rtnl_link_stats64 *bond_get_stats(struct net_device *bond_dev, struct rtnl_link_stats64 *stats) { @@ -3254,43 +3278,23 @@ static struct rtnl_link_stats64 *bond_get_stats(struct net_device *bond_dev, struct list_head *iter; struct slave *slave; + spin_lock(&bond->stats_lock); memcpy(stats, &bond->bond_stats, sizeof(*stats)); - bond_for_each_slave(bond, slave, iter) { - const struct rtnl_link_stats64 *sstats = + rcu_read_lock(); + bond_for_each_slave_rcu(bond, slave, iter) { + const struct rtnl_link_stats64 *new = dev_get_stats(slave->dev, &temp); - struct rtnl_link_stats64 *pstats = &slave->slave_stats; - stats->rx_packets += sstats->rx_packets - pstats->rx_packets; - stats->rx_bytes += sstats->rx_bytes - pstats->rx_bytes; - stats->rx_errors += sstats->rx_errors - pstats->rx_errors; - stats->rx_dropped += sstats->rx_dropped - pstats->rx_dropped; - - stats->tx_packets += sstats->tx_packets - pstats->tx_packets;; - stats->tx_bytes += sstats->tx_bytes - pstats->tx_bytes; - stats->tx_errors += sstats->tx_errors - pstats->tx_errors; - stats->tx_dropped += sstats->tx_dropped - pstats->tx_dropped; - - stats->multicast += sstats->multicast - pstats->multicast; - stats->collisions += sstats->collisions - pstats->collisions; - - stats->rx_length_errors += sstats->rx_length_errors - pstats->rx_length_errors; - stats->rx_over_errors += sstats->rx_over_errors - pstats->rx_over_errors; - stats->rx_crc_errors += sstats->rx_crc_errors - pstats->rx_crc_errors; - stats->rx_frame_errors += sstats->rx_frame_errors - pstats->rx_frame_errors; - stats->rx_fifo_errors += sstats->rx_fifo_errors - pstats->rx_fifo_errors; - stats->rx_missed_errors += sstats->rx_missed_errors - pstats->rx_missed_errors; - - stats->tx_aborted_errors += sstats->tx_aborted_errors - pstats->tx_aborted_errors; - stats->tx_carrier_errors += sstats->tx_carrier_errors - pstats->tx_carrier_errors; - stats->tx_fifo_errors += sstats->tx_fifo_errors - pstats->tx_fifo_errors; - stats->tx_heartbeat_errors += sstats->tx_heartbeat_errors - pstats->tx_heartbeat_errors; - stats->tx_window_errors += sstats->tx_window_errors - pstats->tx_window_errors; + bond_fold_stats(stats, new, &slave->slave_stats); /* save off the slave stats for the next run */ - memcpy(pstats, sstats, sizeof(*sstats)); + memcpy(&slave->slave_stats, new, sizeof(*new)); } + rcu_read_unlock(); + memcpy(&bond->bond_stats, stats, sizeof(*stats)); + spin_unlock(&bond->stats_lock); return stats; } @@ -4102,6 +4106,7 @@ void bond_setup(struct net_device *bond_dev) struct bonding *bond = netdev_priv(bond_dev); spin_lock_init(&bond->mode_lock); + spin_lock_init(&bond->stats_lock); bond->params = bonding_defaults; /* Initialize pointers */ diff --git a/include/net/bonding.h b/include/net/bonding.h index 78ed135e9dea..5cba8f3c3fe4 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -211,6 +211,7 @@ struct bonding { * ALB mode (6) - to sync the use and modifications of its hash table */ spinlock_t mode_lock; + spinlock_t stats_lock; u8 send_peer_notif; u8 igmp_retrans; #ifdef CONFIG_PROC_FS From 263a20bc8ce3c3cb2ad3078457dc7abd29bc9165 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Tue, 22 Mar 2016 09:19:38 +0100 Subject: [PATCH 180/312] ipv4: fix broadcast packets reception [ Upstream commit ad0ea1989cc4d5905941d0a9e62c63ad6d859cef ] Currently, ingress ipv4 broadcast datagrams are dropped since, in udp_v4_early_demux(), ip_check_mc_rcu() is invoked even on bcast packets. This patch addresses the issue, invoking ip_check_mc_rcu() only for mcast packets. Fixes: 6e5403093261 ("ipv4/udp: Verify multicast group is ours in upd_v4_early_demux()") Signed-off-by: Paolo Abeni Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv4/udp.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index a390174b96de..031752efe1ab 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1979,10 +1979,14 @@ void udp_v4_early_demux(struct sk_buff *skb) if (!in_dev) return; - ours = ip_check_mc_rcu(in_dev, iph->daddr, iph->saddr, - iph->protocol); - if (!ours) - return; + /* we are supposed to accept bcast packets */ + if (skb->pkt_type == PACKET_MULTICAST) { + ours = ip_check_mc_rcu(in_dev, iph->daddr, iph->saddr, + iph->protocol); + if (!ours) + return; + } + sk = __udp4_lib_mcast_demux_lookup(net, uh->dest, iph->daddr, uh->source, iph->saddr, dif); } else if (skb->pkt_type == PACKET_HOST) { From fc74ace8df9bffbab3b886686db02f0809bdc5e9 Mon Sep 17 00:00:00 2001 From: Guillaume Nault Date: Wed, 23 Mar 2016 16:38:55 +0100 Subject: [PATCH 181/312] ppp: take reference on channels netns [ Upstream commit 1f461dcdd296eecedaffffc6bae2bfa90bd7eb89 ] Let channels hold a reference on their network namespace. Some channel types, like ppp_async and ppp_synctty, can have their userspace controller running in a different namespace. Therefore they can't rely on them to preclude their netns from being removed from under them. ================================================================== BUG: KASAN: use-after-free in ppp_unregister_channel+0x372/0x3a0 at addr ffff880064e217e0 Read of size 8 by task syz-executor/11581 ============================================================================= BUG net_namespace (Not tainted): kasan: bad access detected ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in copy_net_ns+0x6b/0x1a0 age=92569 cpu=3 pid=6906 [< none >] ___slab_alloc+0x4c7/0x500 kernel/mm/slub.c:2440 [< none >] __slab_alloc+0x4c/0x90 kernel/mm/slub.c:2469 [< inline >] slab_alloc_node kernel/mm/slub.c:2532 [< inline >] slab_alloc kernel/mm/slub.c:2574 [< none >] kmem_cache_alloc+0x23a/0x2b0 kernel/mm/slub.c:2579 [< inline >] kmem_cache_zalloc kernel/include/linux/slab.h:597 [< inline >] net_alloc kernel/net/core/net_namespace.c:325 [< none >] copy_net_ns+0x6b/0x1a0 kernel/net/core/net_namespace.c:360 [< none >] create_new_namespaces+0x2f6/0x610 kernel/kernel/nsproxy.c:95 [< none >] copy_namespaces+0x297/0x320 kernel/kernel/nsproxy.c:150 [< none >] copy_process.part.35+0x1bf4/0x5760 kernel/kernel/fork.c:1451 [< inline >] copy_process kernel/kernel/fork.c:1274 [< none >] _do_fork+0x1bc/0xcb0 kernel/kernel/fork.c:1723 [< inline >] SYSC_clone kernel/kernel/fork.c:1832 [< none >] SyS_clone+0x37/0x50 kernel/kernel/fork.c:1826 [< none >] entry_SYSCALL_64_fastpath+0x16/0x7a kernel/arch/x86/entry/entry_64.S:185 INFO: Freed in net_drop_ns+0x67/0x80 age=575 cpu=2 pid=2631 [< none >] __slab_free+0x1fc/0x320 kernel/mm/slub.c:2650 [< inline >] slab_free kernel/mm/slub.c:2805 [< none >] kmem_cache_free+0x2a0/0x330 kernel/mm/slub.c:2814 [< inline >] net_free kernel/net/core/net_namespace.c:341 [< none >] net_drop_ns+0x67/0x80 kernel/net/core/net_namespace.c:348 [< none >] cleanup_net+0x4e5/0x600 kernel/net/core/net_namespace.c:448 [< none >] process_one_work+0x794/0x1440 kernel/kernel/workqueue.c:2036 [< none >] worker_thread+0xdb/0xfc0 kernel/kernel/workqueue.c:2170 [< none >] kthread+0x23f/0x2d0 kernel/drivers/block/aoe/aoecmd.c:1303 [< none >] ret_from_fork+0x3f/0x70 kernel/arch/x86/entry/entry_64.S:468 INFO: Slab 0xffffea0001938800 objects=3 used=0 fp=0xffff880064e20000 flags=0x5fffc0000004080 INFO: Object 0xffff880064e20000 @offset=0 fp=0xffff880064e24200 CPU: 1 PID: 11581 Comm: syz-executor Tainted: G B 4.4.0+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 00000000ffffffff ffff8800662c7790 ffffffff8292049d ffff88003e36a300 ffff880064e20000 ffff880064e20000 ffff8800662c77c0 ffffffff816f2054 ffff88003e36a300 ffffea0001938800 ffff880064e20000 0000000000000000 Call Trace: [< inline >] __dump_stack kernel/lib/dump_stack.c:15 [] dump_stack+0x6f/0xa2 kernel/lib/dump_stack.c:50 [] print_trailer+0xf4/0x150 kernel/mm/slub.c:654 [] object_err+0x2f/0x40 kernel/mm/slub.c:661 [< inline >] print_address_description kernel/mm/kasan/report.c:138 [] kasan_report_error+0x215/0x530 kernel/mm/kasan/report.c:236 [< inline >] kasan_report kernel/mm/kasan/report.c:259 [] __asan_report_load8_noabort+0x3e/0x40 kernel/mm/kasan/report.c:280 [< inline >] ? ppp_pernet kernel/include/linux/compiler.h:218 [] ? ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392 [< inline >] ppp_pernet kernel/include/linux/compiler.h:218 [] ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392 [< inline >] ? ppp_pernet kernel/drivers/net/ppp/ppp_generic.c:293 [] ? ppp_unregister_channel+0xe6/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392 [] ppp_asynctty_close+0xa3/0x130 kernel/drivers/net/ppp/ppp_async.c:241 [] ? async_lcp_peek+0x5b0/0x5b0 kernel/drivers/net/ppp/ppp_async.c:1000 [] tty_ldisc_close.isra.1+0x99/0xe0 kernel/drivers/tty/tty_ldisc.c:478 [] tty_ldisc_kill+0x40/0x170 kernel/drivers/tty/tty_ldisc.c:744 [] tty_ldisc_release+0x1b3/0x260 kernel/drivers/tty/tty_ldisc.c:772 [] tty_release+0xac1/0x13e0 kernel/drivers/tty/tty_io.c:1901 [] ? release_tty+0x320/0x320 kernel/drivers/tty/tty_io.c:1688 [] __fput+0x236/0x780 kernel/fs/file_table.c:208 [] ____fput+0x15/0x20 kernel/fs/file_table.c:244 [] task_work_run+0x16b/0x200 kernel/kernel/task_work.c:115 [< inline >] exit_task_work kernel/include/linux/task_work.h:21 [] do_exit+0x8b5/0x2c60 kernel/kernel/exit.c:750 [] ? debug_check_no_locks_freed+0x290/0x290 kernel/kernel/locking/lockdep.c:4123 [] ? mm_update_next_owner+0x6f0/0x6f0 kernel/kernel/exit.c:357 [] ? __dequeue_signal+0x136/0x470 kernel/kernel/signal.c:550 [] ? recalc_sigpending_tsk+0x13b/0x180 kernel/kernel/signal.c:145 [] do_group_exit+0x108/0x330 kernel/kernel/exit.c:880 [] get_signal+0x5e4/0x14f0 kernel/kernel/signal.c:2307 [< inline >] ? kretprobe_table_lock kernel/kernel/kprobes.c:1113 [] ? kprobe_flush_task+0xb5/0x450 kernel/kernel/kprobes.c:1158 [] do_signal+0x83/0x1c90 kernel/arch/x86/kernel/signal.c:712 [] ? recycle_rp_inst+0x310/0x310 kernel/include/linux/list.h:655 [] ? setup_sigcontext+0x780/0x780 kernel/arch/x86/kernel/signal.c:165 [] ? finish_task_switch+0x424/0x5f0 kernel/kernel/sched/core.c:2692 [< inline >] ? finish_lock_switch kernel/kernel/sched/sched.h:1099 [] ? finish_task_switch+0x120/0x5f0 kernel/kernel/sched/core.c:2678 [< inline >] ? context_switch kernel/kernel/sched/core.c:2807 [] ? __schedule+0x919/0x1bd0 kernel/kernel/sched/core.c:3283 [] exit_to_usermode_loop+0xf1/0x1a0 kernel/arch/x86/entry/common.c:247 [< inline >] prepare_exit_to_usermode kernel/arch/x86/entry/common.c:282 [] syscall_return_slowpath+0x19f/0x210 kernel/arch/x86/entry/common.c:344 [] int_ret_from_sys_call+0x25/0x9f kernel/arch/x86/entry/entry_64.S:281 Memory state around the buggy address: ffff880064e21680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff880064e21700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff880064e21780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff880064e21800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff880064e21880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fixes: 273ec51dd7ce ("net: ppp_generic - introduce net-namespace functionality v2") Reported-by: Baozeng Ding Signed-off-by: Guillaume Nault Reviewed-by: Cyrill Gorcunov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ppp/ppp_generic.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 221334a9f1a1..51ba895f0522 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -2247,7 +2247,7 @@ int ppp_register_net_channel(struct net *net, struct ppp_channel *chan) pch->ppp = NULL; pch->chan = chan; - pch->chan_net = net; + pch->chan_net = get_net(net); chan->ppp = pch; init_ppp_file(&pch->file, CHANNEL); pch->file.hdrlen = chan->hdrlen; @@ -2344,6 +2344,8 @@ ppp_unregister_channel(struct ppp_channel *chan) spin_lock_bh(&pn->all_channels_lock); list_del(&pch->list); spin_unlock_bh(&pn->all_channels_lock); + put_net(pch->chan_net); + pch->chan_net = NULL; pch->file.dead = 1; wake_up_interruptible(&pch->file.rwait); From a23597733aed572b2de06562f5d3913b76bddfdc Mon Sep 17 00:00:00 2001 From: "subashab@codeaurora.org" Date: Wed, 23 Mar 2016 22:39:50 -0600 Subject: [PATCH 182/312] xfrm: Fix crash observed during device unregistration and decryption [ Upstream commit 071d36bf21bcc837be00cea55bcef8d129e7f609 ] A crash is observed when a decrypted packet is processed in receive path. get_rps_cpus() tries to dereference the skb->dev fields but it appears that the device is freed from the poison pattern. [] get_rps_cpu+0x94/0x2f0 [] netif_rx_internal+0x140/0x1cc [] netif_rx+0x74/0x94 [] xfrm_input+0x754/0x7d0 [] xfrm_input_resume+0x10/0x1c [] esp_input_done+0x20/0x30 [] process_one_work+0x244/0x3fc [] worker_thread+0x2f8/0x418 [] kthread+0xe0/0xec -013|get_rps_cpu( | dev = 0xFFFFFFC08B688000, | skb = 0xFFFFFFC0C76AAC00 -> ( | dev = 0xFFFFFFC08B688000 -> ( | name = "...................................................... | name_hlist = (next = 0xAAAAAAAAAAAAAAAA, pprev = 0xAAAAAAAAAAA Following are the sequence of events observed - - Encrypted packet in receive path from netdevice is queued - Encrypted packet queued for decryption (asynchronous) - Netdevice brought down and freed - Packet is decrypted and returned through callback in esp_input_done - Packet is queued again for process in network stack using netif_rx Since the device appears to have been freed, the dereference of skb->dev in get_rps_cpus() leads to an unhandled page fault exception. Fix this by holding on to device reference when queueing packets asynchronously and releasing the reference on call back return. v2: Make the change generic to xfrm as mentioned by Steffen and update the title to xfrm Suggested-by: Herbert Xu Signed-off-by: Jerome Stanislaus Signed-off-by: Subash Abhinov Kasiviswanathan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/xfrm/xfrm_input.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c index b58286ecd156..cbaf52c837f4 100644 --- a/net/xfrm/xfrm_input.c +++ b/net/xfrm/xfrm_input.c @@ -292,12 +292,15 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type) XFRM_SKB_CB(skb)->seq.input.hi = seq_hi; skb_dst_force(skb); + dev_hold(skb->dev); nexthdr = x->type->input(x, skb); if (nexthdr == -EINPROGRESS) return 0; resume: + dev_put(skb->dev); + spin_lock(&x->lock); if (nexthdr <= 0) { if (nexthdr == -EBADMSG) { From 95e4695bd33e9f8ddabaf4fa64857e9bed1bda80 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bj=C3=B8rn=20Mork?= Date: Mon, 28 Mar 2016 22:38:16 +0200 Subject: [PATCH 183/312] qmi_wwan: add "D-Link DWM-221 B1" device id MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit e84810c7b85a2d7897797b3ad3e879168a8e032a ] Thomas reports: "Windows: 00 diagnostics 01 modem 02 at-port 03 nmea 04 nic Linux: T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2001 ProdID=7e19 Rev=02.32 S: Manufacturer=Mobile Connect S: Product=Mobile Connect S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage" Reported-by: Thomas Schäfer Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/usb/qmi_wwan.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 8677c6a56cdf..8153e97408e7 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -749,6 +749,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x19d2, 0x1426, 2)}, /* ZTE MF91 */ {QMI_FIXED_INTF(0x19d2, 0x1428, 2)}, /* Telewell TW-LTE 4G v2 */ {QMI_FIXED_INTF(0x19d2, 0x2002, 4)}, /* ZTE (Vodafone) K3765-Z */ + {QMI_FIXED_INTF(0x2001, 0x7e19, 4)}, /* D-Link DWM-221 B1 */ {QMI_FIXED_INTF(0x0f3d, 0x68a2, 8)}, /* Sierra Wireless MC7700 */ {QMI_FIXED_INTF(0x114f, 0x68a2, 8)}, /* Sierra Wireless MC7750 */ {QMI_FIXED_INTF(0x1199, 0x68a2, 8)}, /* Sierra Wireless MC7710 in QMI mode */ From b95228fd0efb56afd9aaec13a76f530f85fcb502 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 29 Mar 2016 08:43:41 -0700 Subject: [PATCH 184/312] ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates [ Upstream commit 2d4212261fdf13e29728ddb5ea9d60c342cc92b5 ] IPv6 counters updates use a different macro than IPv4. Fixes: 36cbb2452cbaf ("udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts") Signed-off-by: Eric Dumazet Cc: Rick Jones Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv6/udp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index d28b2a1ab0e5..1173557ea551 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -834,8 +834,8 @@ start_lookup: flush_stack(stack, count, skb, count - 1); } else { if (!inner_flushed) - UDP_INC_STATS_BH(net, UDP_MIB_IGNOREDMULTI, - proto == IPPROTO_UDPLITE); + UDP6_INC_STATS_BH(net, UDP_MIB_IGNOREDMULTI, + proto == IPPROTO_UDPLITE); consume_skb(skb); } return 0; From 5e6a091fd631b29332d3f61e92857ebd5d85a4be Mon Sep 17 00:00:00 2001 From: Nicolas Dichtel Date: Thu, 31 Mar 2016 18:10:31 +0200 Subject: [PATCH 185/312] rtnl: fix msg size calculation in if_nlmsg_size() [ Upstream commit c57c7a95da842807b475b823ed2e5435c42cb3b0 ] Size of the attribute IFLA_PHYS_PORT_NAME was missing. Fixes: db24a9044ee1 ("net: add support for phys_port_name") CC: David Ahern Signed-off-by: Nicolas Dichtel Acked-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/core/rtnetlink.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index fe95cb704aaa..31a46a7bd3d6 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -884,7 +884,9 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev, + rtnl_link_get_size(dev) /* IFLA_LINKINFO */ + rtnl_link_get_af_size(dev) /* IFLA_AF_SPEC */ + nla_total_size(MAX_PHYS_ITEM_ID_LEN) /* IFLA_PHYS_PORT_ID */ - + nla_total_size(MAX_PHYS_ITEM_ID_LEN); /* IFLA_PHYS_SWITCH_ID */ + + nla_total_size(MAX_PHYS_ITEM_ID_LEN) /* IFLA_PHYS_SWITCH_ID */ + + nla_total_size(IFNAMSIZ); /* IFLA_PHYS_PORT_NAME */ + } static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device *dev) From a9c5107769719e76a895fd59408442e5629fc40f Mon Sep 17 00:00:00 2001 From: Haishuang Yan Date: Sun, 3 Apr 2016 22:09:23 +0800 Subject: [PATCH 186/312] ipv4: l2tp: fix a potential issue in l2tp_ip_recv [ Upstream commit 5745b8232e942abd5e16e85fa9b27cc21324acf0 ] pskb_may_pull() can change skb->data, so we have to load ptr/optr at the right place. Signed-off-by: Haishuang Yan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/l2tp/l2tp_ip.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c index 79649937ec71..44ee0683b14b 100644 --- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -123,12 +123,11 @@ static int l2tp_ip_recv(struct sk_buff *skb) struct l2tp_tunnel *tunnel = NULL; int length; - /* Point to L2TP header */ - optr = ptr = skb->data; - if (!pskb_may_pull(skb, 4)) goto discard; + /* Point to L2TP header */ + optr = ptr = skb->data; session_id = ntohl(*((__be32 *) ptr)); ptr += 4; @@ -156,6 +155,9 @@ static int l2tp_ip_recv(struct sk_buff *skb) if (!pskb_may_pull(skb, length)) goto discard; + /* Point to L2TP header */ + optr = ptr = skb->data; + ptr += 4; pr_debug("%s: ip recv\n", tunnel->name); print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length); } From 1d8ad31a7ef1e1dcc4968518d5cefcd492bc2c58 Mon Sep 17 00:00:00 2001 From: Haishuang Yan Date: Sun, 3 Apr 2016 22:09:24 +0800 Subject: [PATCH 187/312] ipv6: l2tp: fix a potential issue in l2tp_ip6_recv [ Upstream commit be447f305494e019dfc37ea4cdf3b0e4200b4eba ] pskb_may_pull() can change skb->data, so we have to load ptr/optr at the right place. Signed-off-by: Haishuang Yan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/l2tp/l2tp_ip6.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index 0ce9da948ad7..36f8fa223a78 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -135,12 +135,11 @@ static int l2tp_ip6_recv(struct sk_buff *skb) struct l2tp_tunnel *tunnel = NULL; int length; - /* Point to L2TP header */ - optr = ptr = skb->data; - if (!pskb_may_pull(skb, 4)) goto discard; + /* Point to L2TP header */ + optr = ptr = skb->data; session_id = ntohl(*((__be32 *) ptr)); ptr += 4; @@ -168,6 +167,9 @@ static int l2tp_ip6_recv(struct sk_buff *skb) if (!pskb_may_pull(skb, length)) goto discard; + /* Point to L2TP header */ + optr = ptr = skb->data; + ptr += 4; pr_debug("%s: ip recv\n", tunnel->name); print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length); } From 80618333cec2c6b69244c7a478580e5ffccb6d24 Mon Sep 17 00:00:00 2001 From: Thadeu Lima de Souza Cascardo Date: Fri, 1 Apr 2016 17:17:50 -0300 Subject: [PATCH 188/312] ip6_tunnel: set rtnl_link_ops before calling register_netdevice [ Upstream commit b6ee376cb0b7fb4e7e07d6cd248bd40436fb9ba6 ] When creating an ip6tnl tunnel with ip tunnel, rtnl_link_ops is not set before ip6_tnl_create2 is called. When register_netdevice is called, there is no linkinfo attribute in the NEWLINK message because of that. Setting rtnl_link_ops before calling register_netdevice fixes that. Fixes: 0b112457229d ("ip6tnl: add support of link creation via rtnl") Signed-off-by: Thadeu Lima de Souza Cascardo Acked-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv6/ip6_tunnel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 6fd0f96fdac1..c7c2c33aa4af 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -284,12 +284,12 @@ static int ip6_tnl_create2(struct net_device *dev) t = netdev_priv(dev); + dev->rtnl_link_ops = &ip6_link_ops; err = register_netdevice(dev); if (err < 0) goto out; strcpy(t->parms.name, dev->name); - dev->rtnl_link_ops = &ip6_link_ops; dev_hold(dev); ip6_tnl_link(ip6n, t); From 921aa6dd8753b3c81e3daaeb1ba2acd70ed36a2d Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Thu, 24 Mar 2016 13:15:45 +0100 Subject: [PATCH 189/312] pinctrl: nomadik: fix pull debug print inversion [ Upstream commit 6ee334559324a55725e22463de633b99ad99fcad ] Pull up was reported as pull down and vice versa. Fix this. Fixes: 8f1774a2a971 "pinctrl: nomadik: improve GPIO debug prints" Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/nomadik/pinctrl-nomadik.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pinctrl/nomadik/pinctrl-nomadik.c b/drivers/pinctrl/nomadik/pinctrl-nomadik.c index a6a22054c0ba..f4b1dac45aca 100644 --- a/drivers/pinctrl/nomadik/pinctrl-nomadik.c +++ b/drivers/pinctrl/nomadik/pinctrl-nomadik.c @@ -1025,7 +1025,7 @@ static void nmk_gpio_dbg_show_one(struct seq_file *s, int pullidx = 0; if (pull) - pullidx = data_out ? 1 : 2; + pullidx = data_out ? 2 : 1; seq_printf(s, " gpio-%-3d (%-20.20s) in %s %s", gpio, From 80a3bf1c098ce3e0b711695743037b859db536c0 Mon Sep 17 00:00:00 2001 From: Philipp Zabel Date: Fri, 26 Feb 2016 08:21:35 -0300 Subject: [PATCH 190/312] [media] coda: fix error path in case of missing pdata on non-DT platform [ Upstream commit bc717d5e92c8c079280eb4acbe335c6f25041aa2 ] If we bail out this early, v4l2_device_register() has not been called yet, so no need to call v4l2_device_unregister(). Fixes: b7bd660a51f0 ("[media] coda: Call v4l2_device_unregister() from a single location") Reported-by: Michael Olbrich Signed-off-by: Philipp Zabel Reviewed-by: Fabio Estevam Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin --- drivers/media/platform/coda/coda-common.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/drivers/media/platform/coda/coda-common.c b/drivers/media/platform/coda/coda-common.c index 8e6fe0200117..e14b1a19f4e6 100644 --- a/drivers/media/platform/coda/coda-common.c +++ b/drivers/media/platform/coda/coda-common.c @@ -2100,14 +2100,12 @@ static int coda_probe(struct platform_device *pdev) pdev_id = of_id ? of_id->data : platform_get_device_id(pdev); - if (of_id) { + if (of_id) dev->devtype = of_id->data; - } else if (pdev_id) { + else if (pdev_id) dev->devtype = &coda_devdata[pdev_id->driver_data]; - } else { - ret = -EINVAL; - goto err_v4l2_register; - } + else + return -EINVAL; spin_lock_init(&dev->irqlock); INIT_LIST_HEAD(&dev->instances); From f46752f2993159836cb0591f5b6083f9ff2d7039 Mon Sep 17 00:00:00 2001 From: Laurent Pinchart Date: Wed, 9 Sep 2015 11:38:56 -0300 Subject: [PATCH 191/312] [media] v4l: vsp1: Set the SRU CTRL0 register when starting the stream [ Upstream commit f6acfcdc5b8cdc9ddd53a459361820b9efe958c4 ] Commit 58f896d859ce ("[media] v4l: vsp1: sru: Make the intensity controllable during streaming") refactored the stream start code and removed the SRU CTRL0 register write by mistake. Add it back. Fixes: 58f896d859ce ("[media] v4l: vsp1: sru: Make the intensity controllable during streaming") Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin --- drivers/media/platform/vsp1/vsp1_sru.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/media/platform/vsp1/vsp1_sru.c b/drivers/media/platform/vsp1/vsp1_sru.c index 6310acab60e7..d41ae950d1a1 100644 --- a/drivers/media/platform/vsp1/vsp1_sru.c +++ b/drivers/media/platform/vsp1/vsp1_sru.c @@ -154,6 +154,7 @@ static int sru_s_stream(struct v4l2_subdev *subdev, int enable) mutex_lock(sru->ctrls.lock); ctrl0 |= vsp1_sru_read(sru, VI6_SRU_CTRL0) & (VI6_SRU_CTRL0_PARAM0_MASK | VI6_SRU_CTRL0_PARAM1_MASK); + vsp1_sru_write(sru, VI6_SRU_CTRL0, ctrl0); mutex_unlock(sru->ctrls.lock); vsp1_sru_write(sru, VI6_SRU_CTRL1, VI6_SRU_CTRL1_PARAM5); From 6c59578f592a6873a71bc1397ccf60615d2abec0 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 26 Jan 2016 23:05:31 +0100 Subject: [PATCH 192/312] mac80211: avoid excessive stack usage in sta_info [ Upstream commit 0ef049dc1167fe834d0ad5d63f89eddc5c70f6e4 ] When CONFIG_OPTIMIZE_INLINING is set, the sta_info_insert_finish function consumes more stack than normally, exceeding the 1024 byte limit on ARM: net/mac80211/sta_info.c: In function 'sta_info_insert_finish': net/mac80211/sta_info.c:561:1: error: the frame size of 1080 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] It turns out that there are two functions that put a 'struct station_info' on the stack: __sta_info_destroy_part2 and sta_info_insert_finish, and this structure alone requires up to 792 bytes. Hoping that both are called rarely enough, this replaces the on-stack structure with a dynamic allocation, which unfortunately requires some suboptimal error handling for out-of-memory. The __sta_info_destroy_part2 function is actually affected by the stack usage twice because it calls cfg80211_del_sta_sinfo(), which has another instance of struct station_info on its stack. Signed-off-by: Arnd Bergmann Fixes: 98b6218388e3 ("mac80211/cfg80211: add station events") Fixes: 6f7a8d26e266 ("mac80211: send statistics with delete station event") Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/mac80211/sta_info.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index a7027190f298..bcdbda289d75 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -472,11 +472,17 @@ static int sta_info_insert_finish(struct sta_info *sta) __acquires(RCU) { struct ieee80211_local *local = sta->local; struct ieee80211_sub_if_data *sdata = sta->sdata; - struct station_info sinfo; + struct station_info *sinfo; int err = 0; lockdep_assert_held(&local->sta_mtx); + sinfo = kzalloc(sizeof(struct station_info), GFP_KERNEL); + if (!sinfo) { + err = -ENOMEM; + goto out_err; + } + /* check if STA exists already */ if (sta_info_get_bss(sdata, sta->sta.addr)) { err = -EEXIST; @@ -510,10 +516,9 @@ static int sta_info_insert_finish(struct sta_info *sta) __acquires(RCU) ieee80211_sta_debugfs_add(sta); rate_control_add_sta_debugfs(sta); - memset(&sinfo, 0, sizeof(sinfo)); - sinfo.filled = 0; - sinfo.generation = local->sta_generation; - cfg80211_new_sta(sdata->dev, sta->sta.addr, &sinfo, GFP_KERNEL); + sinfo->generation = local->sta_generation; + cfg80211_new_sta(sdata->dev, sta->sta.addr, sinfo, GFP_KERNEL); + kfree(sinfo); sta_dbg(sdata, "Inserted STA %pM\n", sta->sta.addr); @@ -876,7 +881,7 @@ static void __sta_info_destroy_part2(struct sta_info *sta) { struct ieee80211_local *local = sta->local; struct ieee80211_sub_if_data *sdata = sta->sdata; - struct station_info sinfo = {}; + struct station_info *sinfo; int ret; /* @@ -914,8 +919,11 @@ static void __sta_info_destroy_part2(struct sta_info *sta) sta_dbg(sdata, "Removed STA %pM\n", sta->sta.addr); - sta_set_sinfo(sta, &sinfo); - cfg80211_del_sta_sinfo(sdata->dev, sta->sta.addr, &sinfo, GFP_KERNEL); + sinfo = kzalloc(sizeof(*sinfo), GFP_KERNEL); + if (sinfo) + sta_set_sinfo(sta, sinfo); + cfg80211_del_sta_sinfo(sdata->dev, sta->sta.addr, sinfo, GFP_KERNEL); + kfree(sinfo); rate_control_remove_sta_debugfs(sta); ieee80211_sta_debugfs_remove(sta); From bd6b2f89b2357c2daabd596d8c1c78940030042c Mon Sep 17 00:00:00 2001 From: Sara Sharon Date: Mon, 25 Jan 2016 15:46:35 +0200 Subject: [PATCH 193/312] mac80211: fix ibss scan parameters [ Upstream commit d321cd014e51baab475efbdec468255b9e0ec822 ] When joining IBSS a full scan should be initiated in order to search for existing cell, unless the fixed_channel parameter was set. A default channel to create the IBSS on if no cell was found is provided as well. However - a scan is initiated only on the default channel provided regardless of whether ifibss->fixed_channel is set or not, with the obvious result of the cell not joining existing IBSS cell that is on another channel. Fixes: 76bed0f43b27 ("mac80211: IBSS fix scan request") Signed-off-by: Sara Sharon Signed-off-by: Emmanuel Grumbach Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/mac80211/ibss.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/net/mac80211/ibss.c b/net/mac80211/ibss.c index 41adfc898a18..6f1f3bdddea2 100644 --- a/net/mac80211/ibss.c +++ b/net/mac80211/ibss.c @@ -7,6 +7,7 @@ * Copyright 2007, Michael Wu * Copyright 2009, Johannes Berg * Copyright 2013-2014 Intel Mobile Communications GmbH + * Copyright(c) 2016 Intel Deutschland GmbH * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -1479,14 +1480,21 @@ static void ieee80211_sta_find_ibss(struct ieee80211_sub_if_data *sdata) sdata_info(sdata, "Trigger new scan to find an IBSS to join\n"); - num = ieee80211_ibss_setup_scan_channels(local->hw.wiphy, - &ifibss->chandef, - channels, - ARRAY_SIZE(channels)); scan_width = cfg80211_chandef_to_scan_width(&ifibss->chandef); - ieee80211_request_ibss_scan(sdata, ifibss->ssid, - ifibss->ssid_len, channels, num, - scan_width); + + if (ifibss->fixed_channel) { + num = ieee80211_ibss_setup_scan_channels(local->hw.wiphy, + &ifibss->chandef, + channels, + ARRAY_SIZE(channels)); + ieee80211_request_ibss_scan(sdata, ifibss->ssid, + ifibss->ssid_len, channels, + num, scan_width); + } else { + ieee80211_request_ibss_scan(sdata, ifibss->ssid, + ifibss->ssid_len, NULL, + 0, scan_width); + } } else { int interval = IEEE80211_SCAN_INTERVAL; From 193517ef5a16dedb1ba9bca6a9e490780cdf3f3a Mon Sep 17 00:00:00 2001 From: Michal Kazior Date: Mon, 25 Jan 2016 14:43:24 +0100 Subject: [PATCH 194/312] mac80211: fix unnecessary frame drops in mesh fwding [ Upstream commit cf44012810ccdd8fd947518e965cb04b7b8498be ] The ieee80211_queue_stopped() expects hw queue number but it was given raw WMM AC number instead. This could cause frame drops and problems with traffic in some cases - most notably if driver doesn't map AC numbers to queue numbers 1:1 and uses ieee80211_stop_queues() and ieee80211_wake_queue() only without ever calling ieee80211_wake_queues(). On ath10k it was possible to hit this problem in the following case: 1. wlan0 uses queue 0 (ath10k maps queues per vif) 2. offchannel uses queue 15 3. queues 1-14 are unused 4. ieee80211_stop_queues() 5. ieee80211_wake_queue(q=0) 6. ieee80211_wake_queue(q=15) (other queues are not woken up because both driver and mac80211 know other queues are unused) 7. ieee80211_rx_h_mesh_fwding() 8. ieee80211_select_queue_80211() returns 2 9. ieee80211_queue_stopped(q=2) returns true 10. frame is dropped (oops!) Fixes: d3c1597b8d1b ("mac80211: fix forwarded mesh frame queue mapping") Signed-off-by: Michal Kazior Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/mac80211/rx.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index d4b08d87537c..3073164a6fcf 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -2227,7 +2227,7 @@ ieee80211_rx_h_mesh_fwding(struct ieee80211_rx_data *rx) struct ieee80211_sub_if_data *sdata = rx->sdata; struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); struct ieee80211_if_mesh *ifmsh = &sdata->u.mesh; - u16 q, hdrlen; + u16 ac, q, hdrlen; hdr = (struct ieee80211_hdr *) skb->data; hdrlen = ieee80211_hdrlen(hdr->frame_control); @@ -2297,7 +2297,8 @@ ieee80211_rx_h_mesh_fwding(struct ieee80211_rx_data *rx) ether_addr_equal(sdata->vif.addr, hdr->addr3)) return RX_CONTINUE; - q = ieee80211_select_queue_80211(sdata, skb, hdr); + ac = ieee80211_select_queue_80211(sdata, skb, hdr); + q = sdata->vif.hw_queue[ac]; if (ieee80211_queue_stopped(&local->hw, q)) { IEEE80211_IFSTA_MESH_CTR_INC(ifmsh, dropped_frames_congestion); return RX_DROP_MONITOR; From 17e8cd1e540985cd11cc6f007867fa1679b8f866 Mon Sep 17 00:00:00 2001 From: Michal Kazior Date: Thu, 21 Jan 2016 14:23:07 +0100 Subject: [PATCH 195/312] mac80211: fix txq queue related crashes [ Upstream commit 2a58d42c1e018ad514d4e23fd33fb2ded95d3ee6 ] The driver can access the queue simultanously while mac80211 tears down the interface. Without spinlock protection this could lead to corrupting sk_buff_head and subsequently to an invalid pointer dereference. Fixes: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") Signed-off-by: Michal Kazior Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin --- net/mac80211/iface.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index 84cef600c573..6e89ab8eac44 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -980,7 +980,10 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, if (sdata->vif.txq) { struct txq_info *txqi = to_txq_info(sdata->vif.txq); + spin_lock_bh(&txqi->queue.lock); ieee80211_purge_tx_queue(&local->hw, &txqi->queue); + spin_unlock_bh(&txqi->queue.lock); + atomic_set(&sdata->txqs_len[txqi->txq.ac], 0); } From 0ee1b6dbc57dc78541df83157e1251b983c51485 Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Wed, 20 Apr 2016 20:09:24 -0700 Subject: [PATCH 196/312] futex: Acknowledge a new waiter in counter before plist [ Upstream commit fe1bce9e2107ba3a8faffe572483b6974201a0e6 ] Otherwise an incoming waker on the dest hash bucket can miss the waiter adding itself to the plist during the lockless check optimization (small window but still the correct way of doing this); similarly to the decrement counterpart. Suggested-by: Peter Zijlstra Signed-off-by: Davidlohr Bueso Cc: Davidlohr Bueso Cc: bigeasy@linutronix.de Cc: dvhart@infradead.org Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1461208164-29150-1-git-send-email-dave@stgolabs.net Signed-off-by: Thomas Gleixner Signed-off-by: Sasha Levin --- kernel/futex.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/futex.c b/kernel/futex.c index 46b168e19c98..2214b70f1910 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1381,8 +1381,8 @@ void requeue_futex(struct futex_q *q, struct futex_hash_bucket *hb1, if (likely(&hb1->chain != &hb2->chain)) { plist_del(&q->list, &hb1->chain); hb_waiters_dec(hb1); - plist_add(&q->list, &hb2->chain); hb_waiters_inc(hb2); + plist_add(&q->list, &hb2->chain); q->lock_ptr = &hb2->lock; } get_futex_key_refs(key2); From f49eb503f0f9ab66874a81992b14249a1c59b6ad Mon Sep 17 00:00:00 2001 From: Anton Blanchard Date: Fri, 15 Apr 2016 12:08:19 +1000 Subject: [PATCH 197/312] powerpc: Update TM user feature bits in scan_features() [ Upstream commit 4705e02498d6d5a7ab98dfee9595cd5e91db2017 ] We need to update the user TM feature bits (PPC_FEATURE2_HTM and PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature bit. At the moment, if firmware reports TM is not available we turn off the kernel TM feature bit but leave the userspace ones on. Userspace thinks it can execute TM instructions and it dies trying. This (together with a QEMU patch) fixes PR KVM, which doesn't currently support TM. Signed-off-by: Anton Blanchard Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin --- arch/powerpc/kernel/prom.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index abe9cdc390a5..28dbbb0d12c4 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -162,11 +162,12 @@ static struct ibm_pa_feature { {0, MMU_FTR_CI_LARGE_PAGE, 0, 0, 1, 2, 0}, {CPU_FTR_REAL_LE, 0, PPC_FEATURE_TRUE_LE, 0, 5, 0, 0}, /* - * If the kernel doesn't support TM (ie. CONFIG_PPC_TRANSACTIONAL_MEM=n), - * we don't want to turn on CPU_FTR_TM here, so we use CPU_FTR_TM_COMP - * which is 0 if the kernel doesn't support TM. + * If the kernel doesn't support TM (ie CONFIG_PPC_TRANSACTIONAL_MEM=n), + * we don't want to turn on TM here, so we use the *_COMP versions + * which are 0 if the kernel doesn't support TM. */ - {CPU_FTR_TM_COMP, 0, 0, 0, 22, 0, 0}, + {CPU_FTR_TM_COMP, 0, 0, + PPC_FEATURE2_HTM_COMP|PPC_FEATURE2_HTM_NOSC_COMP, 22, 0, 0}, }; static void __init scan_features(unsigned long node, const unsigned char *ftrs, From e9f20ca83ec68dddf1366fe99456a12072dc8bac Mon Sep 17 00:00:00 2001 From: Stephen Boyd Date: Sun, 17 Apr 2016 05:21:42 -0700 Subject: [PATCH 198/312] Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay [ Upstream commit eda5ecc0a6b865561997e177c393f0b0136fe3b7 ] The trigger delay algorithm that converts from microseconds to the register value looks incorrect. According to most of the PMIC documentation, the equation is delay (Seconds) = (1 / 1024) * 2 ^ (x + 4) except for one case where the documentation looks to have a formatting issue and the equation looks like delay (Seconds) = (1 / 1024) * 2 x + 4 Most likely this driver was written with the improper documentation to begin with. According to the downstream sources the valid delays are from 2 seconds to 1/64 second, and the latter equation just doesn't make sense for that. Let's fix the algorithm and the range check to match the documentation and the downstream sources. Reported-by: Bjorn Andersson Fixes: 92d57a73e410 ("input: Add support for Qualcomm PMIC8XXX power key") Signed-off-by: Stephen Boyd Tested-by: John Stultz Acked-by: Bjorn Andersson Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin --- drivers/input/misc/pmic8xxx-pwrkey.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/input/misc/pmic8xxx-pwrkey.c b/drivers/input/misc/pmic8xxx-pwrkey.c index c4ca20e63221..b6d14bba6645 100644 --- a/drivers/input/misc/pmic8xxx-pwrkey.c +++ b/drivers/input/misc/pmic8xxx-pwrkey.c @@ -92,7 +92,8 @@ static int pmic8xxx_pwrkey_probe(struct platform_device *pdev) if (of_property_read_u32(pdev->dev.of_node, "debounce", &kpd_delay)) kpd_delay = 15625; - if (kpd_delay > 62500 || kpd_delay == 0) { + /* Valid range of pwr key trigger delay is 1/64 sec to 2 seconds. */ + if (kpd_delay > USEC_PER_SEC * 2 || kpd_delay < USEC_PER_SEC / 64) { dev_err(&pdev->dev, "invalid power key trigger delay\n"); return -EINVAL; } @@ -122,8 +123,8 @@ static int pmic8xxx_pwrkey_probe(struct platform_device *pdev) pwr->name = "pmic8xxx_pwrkey"; pwr->phys = "pmic8xxx_pwrkey/input0"; - delay = (kpd_delay << 10) / USEC_PER_SEC; - delay = 1 + ilog2(delay); + delay = (kpd_delay << 6) / USEC_PER_SEC; + delay = ilog2(delay); err = regmap_read(regmap, PON_CNTL_1, &pon_cntl); if (err < 0) { From 372574134cc15a03608d87a3577885b049e40db3 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 16 Feb 2016 16:03:23 +0100 Subject: [PATCH 199/312] xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND" [ Upstream commit 13aa38e291bdd4e4018f40dd2f75e464814dcbf3 ] The Xen framebuffer driver selects the xen keyboard driver, so the latter will be built-in if XEN_FBDEV_FRONTEND=y. However, when CONFIG_INPUT is a loadable module, this configuration cannot work. On mainline kernels, the symbol will be enabled but not used, while in combination with a patch I have to detect such useless configurations, we get the expected link failure: drivers/input/built-in.o: In function `xenkbd_remove': xen-kbdfront.c:(.text+0x2f0): undefined reference to `input_unregister_device' xen-kbdfront.c:(.text+0x30e): undefined reference to `input_unregister_device' This removes the extra "select", as it just causes more trouble than it helps. In theory, some defconfig file might break if it has XEN_FBDEV_FRONTEND in it but not INPUT_XEN_KBDDEV_FRONTEND. The Kconfig fragment we ship in the kernel (kernel/configs/xen.config) however already enables both, and anyone using an old .config file would keep having both enabled. Signed-off-by: Arnd Bergmann Suggested-by: David Vrabel Fixes: 36c1132e34bd ("xen kconfig: fix select INPUT_XEN_KBDDEV_FRONTEND") Acked-by: Stefano Stabellini Signed-off-by: Tomi Valkeinen Signed-off-by: Sasha Levin --- drivers/video/fbdev/Kconfig | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig index d1e1e1704da1..44eb7c737ea2 100644 --- a/drivers/video/fbdev/Kconfig +++ b/drivers/video/fbdev/Kconfig @@ -2246,7 +2246,6 @@ config XEN_FBDEV_FRONTEND select FB_SYS_IMAGEBLIT select FB_SYS_FOPS select FB_DEFERRED_IO - select INPUT_XEN_KBDDEV_FRONTEND if INPUT_MISC select XEN_XENBUS_FRONTEND default y help From 4e57736de7ebcaf667ef07a8b7d34060ac3d153b Mon Sep 17 00:00:00 2001 From: Keerthy Date: Thu, 14 Apr 2016 10:29:16 +0530 Subject: [PATCH 200/312] pinctrl: single: Fix pcs_parse_bits_in_pinctrl_entry to use __ffs than ffs [ Upstream commit 56b367c0cd67d4c3006738e7dc9dda9273fd2bfe ] pcs_parse_bits_in_pinctrl_entry uses ffs which gives bit indices ranging from 1 to MAX. This leads to a corner case where we try to request the pin number = MAX and fails. bit_pos value is being calculted using ffs. pin_num_from_lsb uses bit_pos value. pins array is populated with: pin + pin_num_from_lsb. The above is 1 more than usual bit indices as bit_pos uses ffs to compute first set bit. Hence the last of the pins array is populated with the MAX value and not MAX - 1 which causes error when we call pin_request. mask_pos is rightly calculated as ((pcs->fmask) << (bit_pos - 1)) Consequently val_pos and submask are correct. Hence use __ffs which gives (ffs(x) - 1) as the first bit set. fixes: 4e7e8017a8 ("pinctrl: pinctrl-single: enhance to configure multiple pins of different modules") Signed-off-by: Keerthy Acked-by: Tony Lindgren Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin --- drivers/pinctrl/pinctrl-single.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/pinctrl/pinctrl-single.c b/drivers/pinctrl/pinctrl-single.c index 11bf750d3761..f2e4232ea98d 100644 --- a/drivers/pinctrl/pinctrl-single.c +++ b/drivers/pinctrl/pinctrl-single.c @@ -1273,9 +1273,9 @@ static int pcs_parse_bits_in_pinctrl_entry(struct pcs_device *pcs, /* Parse pins in each row from LSB */ while (mask) { - bit_pos = ffs(mask); + bit_pos = __ffs(mask); pin_num_from_lsb = bit_pos / pcs->bits_per_pin; - mask_pos = ((pcs->fmask) << (bit_pos - 1)); + mask_pos = ((pcs->fmask) << bit_pos); val_pos = val & mask_pos; submask = mask & mask_pos; @@ -1854,7 +1854,7 @@ static int pcs_probe(struct platform_device *pdev) ret = of_property_read_u32(np, "pinctrl-single,function-mask", &pcs->fmask); if (!ret) { - pcs->fshift = ffs(pcs->fmask) - 1; + pcs->fshift = __ffs(pcs->fmask); pcs->fmax = pcs->fmask >> pcs->fshift; } else { /* If mask property doesn't exist, function mux is invalid. */ From be109716b05f8c1f2bd4c1389327c7848b894e20 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= Date: Mon, 11 Jan 2016 20:48:32 +0200 Subject: [PATCH 201/312] drm/i915: Cleanup phys status page too MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 7d3fdfff23852fe458a0d0979a3555fe60f1e563 ] Restore the lost phys status page cleanup. Fixes the following splat with DMA_API_DEBUG=y: WARNING: CPU: 0 PID: 21615 at ../lib/dma-debug.c:974 dma_debug_device_change+0x190/0x1f0() pci 0000:00:02.0: DMA-API: device driver has pending DMA allocations while released from device [count=1] One of leaked entries details: [device address=0x0000000023163000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent] Modules linked in: i915(-) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm sha256_generic hmac drbg ctr ccm sch_fq_codel binfmt_misc joydev mousedev arc4 ath5k iTCO_wdt mac80211 smsc_ircc2 ath snd_intel8x0m snd_intel8x0 snd_ac97_codec ac97_bus psmouse snd_pcm input_leds i2c_i801 pcspkr snd_timer cfg80211 snd soundcore i2c_core ehci_pci firewire_ohci ehci_hcd firewire_core lpc_ich 8139too rfkill crc_itu_t mfd_core mii usbcore rng_core intel_agp intel_gtt usb_common agpgart irda crc_ccitt fujitsu_laptop led_class parport_pc video parport evdev backlight CPU: 0 PID: 21615 Comm: rmmod Tainted: G U 4.4.0-rc4-mgm-ovl+ #4 Hardware name: FUJITSU SIEMENS LIFEBOOK S6120/FJNB16C, BIOS Version 1.26 05/10/2004 e31a3de0 e31a3de0 e31a3d9c c128d4bd e31a3dd0 c1045a0c c15e00c4 e31a3dfc 0000546f c15dfad2 000003ce c12b3740 000003ce c12b3740 00000000 00000001 f61fb8a0 e31a3de8 c1045a83 00000009 e31a3de0 c15e00c4 e31a3dfc e31a3e4c Call Trace: [] dump_stack+0x16/0x19 [] warn_slowpath_common+0x8c/0xd0 [] ? dma_debug_device_change+0x190/0x1f0 [] ? dma_debug_device_change+0x190/0x1f0 [] warn_slowpath_fmt+0x33/0x40 [] dma_debug_device_change+0x190/0x1f0 [] notifier_call_chain+0x59/0x70 [] __blocking_notifier_call_chain+0x3f/0x80 [] blocking_notifier_call_chain+0x1f/0x30 [] __device_release_driver+0xc3/0xf0 [] driver_detach+0x97/0xa0 [] bus_remove_driver+0x40/0x90 [] driver_unregister+0x28/0x60 [] ? trace_hardirqs_on_caller+0x12c/0x1d0 [] pci_unregister_driver+0x18/0x80 [] drm_pci_exit+0x87/0xb0 [drm] [] i915_exit+0x1b/0x1ee [i915] [] SyS_delete_module+0x14c/0x210 [] ? trace_hardirqs_on_caller+0x12c/0x1d0 [] ? ____fput+0xd/0x10 [] do_fast_syscall_32+0xa4/0x450 [] sysenter_past_esp+0x3b/0x5d ---[ end trace c2ecbc77760f10a0 ]--- Mapped at: [] debug_dma_alloc_coherent+0x33/0x90 [] drm_pci_alloc+0x18c/0x1e0 [drm] [] intel_init_ring_buffer+0x2af/0x490 [i915] [] intel_init_render_ring_buffer+0x130/0x750 [i915] [] i915_gem_init_rings+0x1e/0x110 [i915] v2: s/BUG_ON/WARN_ON/ since dim doens't like the former anymore Cc: Chris Wilson Fixes: 5c6c600 ("drm/i915: Remove DRI1 ring accessors and API") Signed-off-by: Ville Syrjälä Reviewed-by: Chris Wilson (v1) Link: http://patchwork.freedesktop.org/patch/msgid/1452538112-5331-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter Signed-off-by: Sasha Levin --- drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index b7e20dee64c4..09844b5fe250 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1814,6 +1814,17 @@ i915_dispatch_execbuffer(struct intel_engine_cs *ring, return 0; } +static void cleanup_phys_status_page(struct intel_engine_cs *ring) +{ + struct drm_i915_private *dev_priv = to_i915(ring->dev); + + if (!dev_priv->status_page_dmah) + return; + + drm_pci_free(ring->dev, dev_priv->status_page_dmah); + ring->status_page.page_addr = NULL; +} + static void cleanup_status_page(struct intel_engine_cs *ring) { struct drm_i915_gem_object *obj; @@ -1830,9 +1841,9 @@ static void cleanup_status_page(struct intel_engine_cs *ring) static int init_status_page(struct intel_engine_cs *ring) { - struct drm_i915_gem_object *obj; + struct drm_i915_gem_object *obj = ring->status_page.obj; - if ((obj = ring->status_page.obj) == NULL) { + if (obj == NULL) { unsigned flags; int ret; @@ -1985,7 +1996,7 @@ static int intel_init_ring_buffer(struct drm_device *dev, if (ret) goto error; } else { - BUG_ON(ring->id != RCS); + WARN_ON(ring->id != RCS); ret = init_phys_status_page(ring); if (ret) goto error; @@ -2049,7 +2060,12 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring) if (ring->cleanup) ring->cleanup(ring); - cleanup_status_page(ring); + if (I915_NEED_GFX_HWS(ring->dev)) { + cleanup_status_page(ring); + } else { + WARN_ON(ring->id != RCS); + cleanup_phys_status_page(ring); + } i915_cmd_parser_fini_ring(ring); From 44783088d3fea68985f644312b3e76cd6dde8d18 Mon Sep 17 00:00:00 2001 From: Javier Martinez Canillas Date: Sat, 16 Apr 2016 21:14:52 -0400 Subject: [PATCH 202/312] i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared [ Upstream commit 10ff4c5239a137abfc896ec73ef3d15a0f86a16a ] The exynos5 I2C controller driver always prepares and enables a clock before using it and then disables unprepares it when the clock is not used anymore. But this can cause a possible ABBA deadlock in some scenarios since a driver that uses regmap to access its I2C registers, will first grab the regmap lock and then the I2C xfer function will grab the prepare lock when preparing the I2C clock. But since the clock driver also uses regmap for I2C accesses, preparing a clock will first grab the prepare lock and then the regmap lock when using the regmap API. An example of this happens on the Exynos5422 Odroid XU4 board where a s2mps11 PMIC is used and both the s2mps11 regulators and clk drivers share the same I2C regmap. The possible deadlock is reported by the kernel lockdep: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(sec_core:428:(regmap)->lock); lock(prepare_lock); lock(sec_core:428:(regmap)->lock); lock(prepare_lock); *** DEADLOCK *** Fix it by leaving the code prepared on probe and use {en,dis}able in the I2C transfer function. This patch is similar to commit 34e81ad5f0b6 ("i2c: s3c2410: fix ABBA deadlock by keeping clock prepared") that fixes the same bug in other driver for an I2C controller found in Samsung SoCs. Reported-by: Anand Moon Signed-off-by: Javier Martinez Canillas Reviewed-by: Anand Moon Reviewed-by: Krzysztof Kozlowski Signed-off-by: Wolfram Sang Cc: stable@kernel.org Signed-off-by: Sasha Levin --- drivers/i2c/busses/i2c-exynos5.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/i2c/busses/i2c-exynos5.c b/drivers/i2c/busses/i2c-exynos5.c index b29c7500461a..f54ece8fce78 100644 --- a/drivers/i2c/busses/i2c-exynos5.c +++ b/drivers/i2c/busses/i2c-exynos5.c @@ -671,7 +671,9 @@ static int exynos5_i2c_xfer(struct i2c_adapter *adap, return -EIO; } - clk_prepare_enable(i2c->clk); + ret = clk_enable(i2c->clk); + if (ret) + return ret; for (i = 0; i < num; i++, msgs++) { stop = (i == num - 1); @@ -695,7 +697,7 @@ static int exynos5_i2c_xfer(struct i2c_adapter *adap, } out: - clk_disable_unprepare(i2c->clk); + clk_disable(i2c->clk); return ret; } @@ -747,7 +749,9 @@ static int exynos5_i2c_probe(struct platform_device *pdev) return -ENOENT; } - clk_prepare_enable(i2c->clk); + ret = clk_prepare_enable(i2c->clk); + if (ret) + return ret; mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); i2c->regs = devm_ioremap_resource(&pdev->dev, mem); @@ -799,6 +803,10 @@ static int exynos5_i2c_probe(struct platform_device *pdev) platform_set_drvdata(pdev, i2c); + clk_disable(i2c->clk); + + return 0; + err_clk: clk_disable_unprepare(i2c->clk); return ret; @@ -810,6 +818,8 @@ static int exynos5_i2c_remove(struct platform_device *pdev) i2c_del_adapter(&i2c->adap); + clk_unprepare(i2c->clk); + return 0; } @@ -821,6 +831,8 @@ static int exynos5_i2c_suspend_noirq(struct device *dev) i2c->suspended = 1; + clk_unprepare(i2c->clk); + return 0; } @@ -830,7 +842,9 @@ static int exynos5_i2c_resume_noirq(struct device *dev) struct exynos5_i2c *i2c = platform_get_drvdata(pdev); int ret = 0; - clk_prepare_enable(i2c->clk); + ret = clk_prepare_enable(i2c->clk); + if (ret) + return ret; ret = exynos5_hsi2c_clock_setup(i2c); if (ret) { @@ -839,7 +853,7 @@ static int exynos5_i2c_resume_noirq(struct device *dev) } exynos5_i2c_init(i2c); - clk_disable_unprepare(i2c->clk); + clk_disable(i2c->clk); i2c->suspended = 0; return 0; From 0cbc7967e970b92f2b60a5d935dc49eef08cfa2e Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 25 Jan 2016 18:07:33 +0100 Subject: [PATCH 203/312] ASoC: s3c24xx: use const snd_soc_component_driver pointer [ Upstream commit ba4bc32eaa39ba7687f0958ae90eec94da613b46 ] An older patch to convert the API in the s3c i2s driver ended up passing a const pointer into a function that takes a non-const pointer, so we now get a warning: sound/soc/samsung/s3c2412-i2s.c: In function 's3c2412_iis_dev_probe': sound/soc/samsung/s3c2412-i2s.c:172:9: error: passing argument 3 of 's3c_i2sv2_register_component' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers] However, the s3c_i2sv2_register_component() function again passes the pointer into another function taking a const, so we just need to change its prototype. Fixes: eca3b01d0885 ("ASoC: switch over to use snd_soc_register_component() on s3c i2s") Signed-off-by: Arnd Bergmann Reviewed-by: Krzysztof Kozlowski Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/samsung/s3c-i2s-v2.c | 2 +- sound/soc/samsung/s3c-i2s-v2.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/soc/samsung/s3c-i2s-v2.c b/sound/soc/samsung/s3c-i2s-v2.c index df65c5b494b1..b6ab3fc5789e 100644 --- a/sound/soc/samsung/s3c-i2s-v2.c +++ b/sound/soc/samsung/s3c-i2s-v2.c @@ -709,7 +709,7 @@ static int s3c2412_i2s_resume(struct snd_soc_dai *dai) #endif int s3c_i2sv2_register_component(struct device *dev, int id, - struct snd_soc_component_driver *cmp_drv, + const struct snd_soc_component_driver *cmp_drv, struct snd_soc_dai_driver *dai_drv) { struct snd_soc_dai_ops *ops = (struct snd_soc_dai_ops *)dai_drv->ops; diff --git a/sound/soc/samsung/s3c-i2s-v2.h b/sound/soc/samsung/s3c-i2s-v2.h index 90abab364b49..d0684145ed1f 100644 --- a/sound/soc/samsung/s3c-i2s-v2.h +++ b/sound/soc/samsung/s3c-i2s-v2.h @@ -101,7 +101,7 @@ extern int s3c_i2sv2_probe(struct snd_soc_dai *dai, * soc core. */ extern int s3c_i2sv2_register_component(struct device *dev, int id, - struct snd_soc_component_driver *cmp_drv, + const struct snd_soc_component_driver *cmp_drv, struct snd_soc_dai_driver *dai_drv); #endif /* __SND_SOC_S3C24XX_S3C_I2SV2_I2S_H */ From fad61c46cc9c999169b95d14a67c60a0a7141d00 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Wed, 27 Jan 2016 14:26:18 +0100 Subject: [PATCH 204/312] ASoC: ssm4567: Reset device before regcache_sync() [ Upstream commit 712a8038cc24dba668afe82f0413714ca87184e0 ] When the ssm4567 is powered up the driver calles regcache_sync() to restore the register map content. regcache_sync() assumes that the device is in its power-on reset state. Make sure that this is the case by explicitly resetting the ssm4567 register map before calling regcache_sync() otherwise we might end up with a incorrect register map which leads to undefined behaviour. One such undefined behaviour was observed when returning from system suspend while a playback stream is active, in that case the ssm4567 was kept muted after resume. Fixes: 1ee44ce03011 ("ASoC: ssm4567: Add driver for Analog Devices SSM4567 amplifier") Reported-by: Harsha Priya Tested-by: Fang, Yang A Signed-off-by: Lars-Peter Clausen Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- sound/soc/codecs/ssm4567.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/sound/soc/codecs/ssm4567.c b/sound/soc/codecs/ssm4567.c index f7549cc7ea85..d1f53abb60de 100644 --- a/sound/soc/codecs/ssm4567.c +++ b/sound/soc/codecs/ssm4567.c @@ -338,6 +338,11 @@ static int ssm4567_set_power(struct ssm4567 *ssm4567, bool enable) regcache_cache_only(ssm4567->regmap, !enable); if (enable) { + ret = regmap_write(ssm4567->regmap, SSM4567_REG_SOFT_RESET, + 0x00); + if (ret) + return ret; + ret = regmap_update_bits(ssm4567->regmap, SSM4567_REG_POWER_CTRL, SSM4567_POWER_SPWDN, 0x00); From 339861186370771cdf9df0ba6d1b2cb2440a1d61 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Mon, 1 Feb 2016 22:06:55 +0000 Subject: [PATCH 205/312] efi: Expose non-blocking set_variable() wrapper to efivars [ Upstream commit 9c6672ac9c91f7eb1ec436be1442b8c26d098e55 ] Commit 6d80dba1c9fe ("efi: Provide a non-blocking SetVariable() operation") implemented a non-blocking alternative for the UEFI SetVariable() invocation performed by efivars, since it may occur in atomic context. However, this version of the function was never exposed via the efivars struct, so the non-blocking versions was not actually callable. Fix that. Signed-off-by: Ard Biesheuvel Signed-off-by: Matt Fleming Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-efi@vger.kernel.org Fixes: 6d80dba1c9fe ("efi: Provide a non-blocking SetVariable() operation") Link: http://lkml.kernel.org/r/1454364428-494-2-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- drivers/firmware/efi/efi.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 63226e9036a1..1f2c86d81176 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -170,6 +170,7 @@ static int generic_ops_register(void) { generic_ops.get_variable = efi.get_variable; generic_ops.set_variable = efi.set_variable; + generic_ops.set_variable_nonblocking = efi.set_variable_nonblocking; generic_ops.get_next_variable = efi.get_next_variable; generic_ops.query_variable_store = efi_query_variable_store; From cef12dbf18ab7617cacf7c271da43624afac5520 Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Thu, 21 Jan 2016 15:32:15 -0500 Subject: [PATCH 206/312] cgroup: make sure a parent css isn't freed before its children [ Upstream commit 8bb5ef79bc0f4016ecf79e8dce6096a3c63603e4 ] There are three subsystem callbacks in css shutdown path - css_offline(), css_released() and css_free(). Except for css_released(), cgroup core didn't guarantee the order of invocation. css_offline() or css_free() could be called on a parent css before its children. This behavior is unexpected and led to bugs in cpu and memory controller. The previous patch updated ordering for css_offline() which fixes the cpu controller issue. While there currently isn't a known bug caused by misordering of css_free() invocations, let's fix it too for consistency. css_free() ordering can be trivially fixed by moving putting of the parent css below css_free() invocation. Signed-off-by: Tejun Heo Cc: Peter Zijlstra Signed-off-by: Sasha Levin --- kernel/cgroup.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index d01c60425761..3abce1e0f910 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4378,14 +4378,15 @@ static void css_free_work_fn(struct work_struct *work) if (ss) { /* css free path */ + struct cgroup_subsys_state *parent = css->parent; int id = css->id; - if (css->parent) - css_put(css->parent); - ss->css_free(css); cgroup_idr_remove(&ss->css_idr, id); cgroup_put(cgrp); + + if (parent) + css_put(parent); } else { /* cgroup free path */ atomic_dec(&cgrp->root->nr_cgrps); From 25c871c07f37b8cbaebc97403233185479af095d Mon Sep 17 00:00:00 2001 From: Ignat Korchagin Date: Thu, 17 Mar 2016 18:00:29 +0000 Subject: [PATCH 207/312] USB: usbip: fix potential out-of-bounds write [ Upstream commit b348d7dddb6c4fbfc810b7a0626e8ec9e29f7cbb ] Fix potential out-of-bounds write to urb->transfer_buffer usbip handles network communication directly in the kernel. When receiving a packet from its peer, usbip code parses headers according to protocol. As part of this parsing urb->actual_length is filled. Since the input for urb->actual_length comes from the network, it should be treated as untrusted. Any entity controlling the network may put any value in the input and the preallocated urb->transfer_buffer may not be large enough to hold the data. Thus, the malicious entity is able to write arbitrary data to kernel memory. Signed-off-by: Ignat Korchagin Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/usb/usbip/usbip_common.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c index facaaf003f19..e40da7759a0e 100644 --- a/drivers/usb/usbip/usbip_common.c +++ b/drivers/usb/usbip/usbip_common.c @@ -741,6 +741,17 @@ int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb) if (!(size > 0)) return 0; + if (size > urb->transfer_buffer_length) { + /* should not happen, probably malicious packet */ + if (ud->side == USBIP_STUB) { + usbip_event_add(ud, SDEV_EVENT_ERROR_TCP); + return 0; + } else { + usbip_event_add(ud, VDEV_EVENT_ERROR_TCP); + return -EPIPE; + } + } + ret = usbip_recv(ud->tcp_socket, urb->transfer_buffer, size); if (ret != size) { dev_err(&urb->dev->dev, "recv xbuf, %d\n", ret); From 377e05c42b3078841377779f6f7f0f37b0ef31a2 Mon Sep 17 00:00:00 2001 From: Huibin Hong Date: Wed, 24 Feb 2016 18:00:04 +0800 Subject: [PATCH 208/312] spi/rockchip: Make sure spi clk is on in rockchip_spi_set_cs [ Upstream commit b920cc3191d7612f26f36ee494e05b5ffd9044c0 ] Rockchip_spi_set_cs could be called by spi_setup, but spi_setup may be called by device driver after runtime suspend. Then the spi clock is closed, rockchip_spi_set_cs may access the spi registers, which causes cpu block in some socs. Fixes: 64e36824b32 ("spi/rockchip: add driver for Rockchip RK3xxx") Signed-off-by: Huibin Hong Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/spi/spi-rockchip.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c index 68e7efeb9a27..1d308cba29b1 100644 --- a/drivers/spi/spi-rockchip.c +++ b/drivers/spi/spi-rockchip.c @@ -265,7 +265,10 @@ static inline u32 rx_max(struct rockchip_spi *rs) static void rockchip_spi_set_cs(struct spi_device *spi, bool enable) { u32 ser; - struct rockchip_spi *rs = spi_master_get_devdata(spi->master); + struct spi_master *master = spi->master; + struct rockchip_spi *rs = spi_master_get_devdata(master); + + pm_runtime_get_sync(rs->dev); ser = readl_relaxed(rs->regs + ROCKCHIP_SPI_SER) & SER_MASK; @@ -290,6 +293,8 @@ static void rockchip_spi_set_cs(struct spi_device *spi, bool enable) ser &= ~(1 << spi->chip_select); writel_relaxed(ser, rs->regs + ROCKCHIP_SPI_SER); + + pm_runtime_put_sync(rs->dev); } static int rockchip_spi_prepare_message(struct spi_master *master, From 1cc58547374cfa39193c7bc66119b50fcfbc9342 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 16 Feb 2016 15:53:11 +0100 Subject: [PATCH 209/312] regulator: s5m8767: fix get_register() error handling [ Upstream commit e07ff9434167981c993a26d2edbbcb8e13801dbb ] The s5m8767_pmic_probe() function calls s5m8767_get_register() to read data without checking the return code, which produces a compile-time warning when that data is accessed: drivers/regulator/s5m8767.c: In function 's5m8767_pmic_probe': drivers/regulator/s5m8767.c:924:7: error: 'enable_reg' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/regulator/s5m8767.c:944:30: error: 'enable_val' may be used uninitialized in this function [-Werror=maybe-uninitialized] This changes the s5m8767_get_register() function to return a -EINVAL not just for an invalid register number but also for an invalid regulator number, as both would result in returning uninitialized data. The s5m8767_pmic_probe() function is then changed accordingly to fail on a read error, as all the other callers of s5m8767_get_register() already do. In practice this probably cannot happen, as we don't call s5m8767_get_register() with invalid arguments, but the gcc warning seems valid in principle, in terms writing safe error checking. Signed-off-by: Arnd Bergmann Fixes: 9c4c60554acf ("regulator: s5m8767: Convert to use regulator_[enable|disable|is_enabled]_regmap") Signed-off-by: Mark Brown Signed-off-by: Sasha Levin --- drivers/regulator/s5m8767.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/regulator/s5m8767.c b/drivers/regulator/s5m8767.c index 58f5d3b8e981..27343e1c43ef 100644 --- a/drivers/regulator/s5m8767.c +++ b/drivers/regulator/s5m8767.c @@ -202,9 +202,10 @@ static int s5m8767_get_register(struct s5m8767_info *s5m8767, int reg_id, } } - if (i < s5m8767->num_regulators) - *enable_ctrl = - s5m8767_opmode_reg[reg_id][mode] << S5M8767_ENCTRL_SHIFT; + if (i >= s5m8767->num_regulators) + return -EINVAL; + + *enable_ctrl = s5m8767_opmode_reg[reg_id][mode] << S5M8767_ENCTRL_SHIFT; return 0; } @@ -937,8 +938,12 @@ static int s5m8767_pmic_probe(struct platform_device *pdev) else regulators[id].vsel_mask = 0xff; - s5m8767_get_register(s5m8767, id, &enable_reg, + ret = s5m8767_get_register(s5m8767, id, &enable_reg, &enable_val); + if (ret) { + dev_err(s5m8767->dev, "error reading registers\n"); + return ret; + } regulators[id].enable_reg = enable_reg; regulators[id].enable_mask = S5M8767_ENCTRL_MASK; regulators[id].enable_val = enable_val; From 2d7efc56371c93957871838253fefe2fb4446b77 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 15 Mar 2016 14:53:29 -0700 Subject: [PATCH 210/312] paride: make 'verbose' parameter an 'int' again [ Upstream commit dec63a4dec2d6d01346fd5d96062e67c0636852b ] gcc-6.0 found an ancient bug in the paride driver, which had a "module_param(verbose, bool, 0);" since before 2.6.12, but actually uses it to accept '0', '1' or '2' as arguments: drivers/block/paride/pd.c: In function 'pd_init_dev_parms': drivers/block/paride/pd.c:298:29: warning: comparison of constant '1' with boolean expression is always false [-Wbool-compare] #define DBMSG(msg) ((verbose>1)?(msg):NULL) In 2012, Rusty did a cleanup patch that also changed the type of the variable to 'bool', which introduced what is now a gcc warning. This changes the type back to 'int' and adapts the module_param() line instead, so it should work as documented in case anyone ever cares about running the ancient driver with debugging. Fixes: 90ab5ee94171 ("module_param: make bool parameters really bool (drivers & misc)") Signed-off-by: Arnd Bergmann Rusty Russell Cc: Tim Waugh Cc: Sudip Mukherjee Cc: Jens Axboe Cc: Greg Kroah-Hartman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- drivers/block/paride/pd.c | 4 ++-- drivers/block/paride/pt.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/block/paride/pd.c b/drivers/block/paride/pd.c index d48715b287e6..b0414702e61a 100644 --- a/drivers/block/paride/pd.c +++ b/drivers/block/paride/pd.c @@ -126,7 +126,7 @@ */ #include -static bool verbose = 0; +static int verbose = 0; static int major = PD_MAJOR; static char *name = PD_NAME; static int cluster = 64; @@ -161,7 +161,7 @@ enum {D_PRT, D_PRO, D_UNI, D_MOD, D_GEO, D_SBY, D_DLY, D_SLV}; static DEFINE_MUTEX(pd_mutex); static DEFINE_SPINLOCK(pd_lock); -module_param(verbose, bool, 0); +module_param(verbose, int, 0); module_param(major, int, 0); module_param(name, charp, 0); module_param(cluster, int, 0); diff --git a/drivers/block/paride/pt.c b/drivers/block/paride/pt.c index 2596042eb987..ada45058e04d 100644 --- a/drivers/block/paride/pt.c +++ b/drivers/block/paride/pt.c @@ -117,7 +117,7 @@ */ -static bool verbose = 0; +static int verbose = 0; static int major = PT_MAJOR; static char *name = PT_NAME; static int disable = 0; @@ -152,7 +152,7 @@ static int (*drives[4])[6] = {&drive0, &drive1, &drive2, &drive3}; #include -module_param(verbose, bool, 0); +module_param(verbose, int, 0); module_param(major, int, 0); module_param(name, charp, 0); module_param_array(drive0, int, NULL, 0); From a5e1649e0c4382a0994e91d958ffa70aa11ac226 Mon Sep 17 00:00:00 2001 From: Sushaanth Srirangapathi Date: Mon, 29 Feb 2016 18:42:19 +0530 Subject: [PATCH 211/312] fbdev: da8xx-fb: fix videomodes of lcd panels [ Upstream commit 713fced8d10fa1c759c8fb6bf9aaa681bae68cad ] Commit 028cd86b794f4a ("video: da8xx-fb: fix the polarities of the hsync/vsync pulse") fixes polarities of HSYNC/VSYNC pulse but forgot to update known_lcd_panels[] which had sync values according to old logic. This breaks LCD at least on DA850 EVM. This patch fixes this issue and I have tested this for panel "Sharp_LK043T1DG01" using DA850 EVM board. Fixes: 028cd86b794f4a ("video: da8xx-fb: fix the polarities of the hsync/vsync pulse") Signed-off-by: Sushaanth Srirangapathi Signed-off-by: Tomi Valkeinen Signed-off-by: Sasha Levin --- drivers/video/fbdev/da8xx-fb.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/video/fbdev/da8xx-fb.c b/drivers/video/fbdev/da8xx-fb.c index 0081725c6b5b..d00510029c93 100644 --- a/drivers/video/fbdev/da8xx-fb.c +++ b/drivers/video/fbdev/da8xx-fb.c @@ -209,8 +209,7 @@ static struct fb_videomode known_lcd_panels[] = { .lower_margin = 2, .hsync_len = 0, .vsync_len = 0, - .sync = FB_SYNC_CLK_INVERT | - FB_SYNC_HOR_HIGH_ACT | FB_SYNC_VERT_HIGH_ACT, + .sync = FB_SYNC_CLK_INVERT, }, /* Sharp LK043T1DG01 */ [1] = { @@ -224,7 +223,7 @@ static struct fb_videomode known_lcd_panels[] = { .lower_margin = 2, .hsync_len = 41, .vsync_len = 10, - .sync = FB_SYNC_HOR_HIGH_ACT | FB_SYNC_VERT_HIGH_ACT, + .sync = 0, .flag = 0, }, [2] = { @@ -239,7 +238,7 @@ static struct fb_videomode known_lcd_panels[] = { .lower_margin = 10, .hsync_len = 10, .vsync_len = 10, - .sync = FB_SYNC_HOR_HIGH_ACT | FB_SYNC_VERT_HIGH_ACT, + .sync = 0, .flag = 0, }, [3] = { From 2de533754013067d325cc1211edcc03e4d14096c Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Mon, 14 Dec 2015 14:29:23 +0000 Subject: [PATCH 212/312] misc/bmp085: Enable building as a module [ Upstream commit 50e6315dba721cbc24ccd6d7b299f1782f210a98 ] Commit 985087dbcb02 'misc: add support for bmp18x chips to the bmp085 driver' changed the BMP085 config symbol to a boolean. I see no reason why the shared code cannot be built as a module, so change it back to tristate. Fixes: 985087dbcb02 ("misc: add support for bmp18x chips to the bmp085 driver") Cc: Eric Andersson Signed-off-by: Ben Hutchings Acked-by: Arnd Bergmann Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/misc/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index 006242c8bca0..b3c10b7dae1f 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -429,7 +429,7 @@ config ARM_CHARLCD still useful. config BMP085 - bool + tristate depends on SYSFS config BMP085_I2C From aa9d9e960c1f06ff27438ab61194e877edd6ff83 Mon Sep 17 00:00:00 2001 From: Alexander Kochetkov Date: Sun, 6 Mar 2016 12:43:57 +0300 Subject: [PATCH 213/312] rtc: hym8563: fix invalid year calculation [ Upstream commit d5861262210067fc01b2fb4f7af2fd85a3453f15 ] Year field must be in BCD format, according to hym8563 datasheet. Due to the bug year 2016 became 2010. Fixes: dcaf03849352 ("rtc: add hym8563 rtc-driver") Signed-off-by: Alexander Kochetkov Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin --- drivers/rtc/rtc-hym8563.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/rtc/rtc-hym8563.c b/drivers/rtc/rtc-hym8563.c index 0f710e98538f..3db1557e5394 100644 --- a/drivers/rtc/rtc-hym8563.c +++ b/drivers/rtc/rtc-hym8563.c @@ -144,7 +144,7 @@ static int hym8563_rtc_set_time(struct device *dev, struct rtc_time *tm) * it does not seem to carry it over a subsequent write/read. * So we'll limit ourself to 100 years, starting at 2000 for now. */ - buf[6] = tm->tm_year - 100; + buf[6] = bin2bcd(tm->tm_year - 100); /* * CTL1 only contains TEST-mode bits apart from stop, From 1d22cdabe02c2c89fa78fc29795336edcc4445c0 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Tue, 1 Mar 2016 09:50:01 +0100 Subject: [PATCH 214/312] rtc: vr41xx: Wire up alarm_irq_enable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit a25f4a95ec3cded34c1250364eba704c5e4fdac4 ] drivers/rtc/rtc-vr41xx.c:229: warning: ‘vr41xx_rtc_alarm_irq_enable’ defined but not used Apparently the conversion to alarm_irq_enable forgot to wire up the callback. Fixes: 16380c153a69c378 ("RTC: Convert rtc drivers to use the alarm_irq_enable method") Signed-off-by: Geert Uytterhoeven Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin --- drivers/rtc/rtc-vr41xx.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/rtc/rtc-vr41xx.c b/drivers/rtc/rtc-vr41xx.c index f64c282275b3..e1b86bb01062 100644 --- a/drivers/rtc/rtc-vr41xx.c +++ b/drivers/rtc/rtc-vr41xx.c @@ -272,12 +272,13 @@ static irqreturn_t rtclong1_interrupt(int irq, void *dev_id) } static const struct rtc_class_ops vr41xx_rtc_ops = { - .release = vr41xx_rtc_release, - .ioctl = vr41xx_rtc_ioctl, - .read_time = vr41xx_rtc_read_time, - .set_time = vr41xx_rtc_set_time, - .read_alarm = vr41xx_rtc_read_alarm, - .set_alarm = vr41xx_rtc_set_alarm, + .release = vr41xx_rtc_release, + .ioctl = vr41xx_rtc_ioctl, + .read_time = vr41xx_rtc_read_time, + .set_time = vr41xx_rtc_set_time, + .read_alarm = vr41xx_rtc_read_alarm, + .set_alarm = vr41xx_rtc_set_alarm, + .alarm_irq_enable = vr41xx_rtc_alarm_irq_enable, }; static int rtc_probe(struct platform_device *pdev) From 49d22c01a3ee52d3783465079c64c3974ce53ab7 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 2 Mar 2016 13:07:45 +0300 Subject: [PATCH 215/312] rtc: ds1685: passing bogus values to irq_restore [ Upstream commit 8c09b9fdecab1f4a289f07b46e2ad174b6641928 ] We call spin_lock_irqrestore with "flags" set to zero instead of to the value from spin_lock_irqsave(). Fixes: aaaf5fbf56f1 ('rtc: add driver for DS1685 family of real time clocks') Signed-off-by: Dan Carpenter Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin --- drivers/rtc/rtc-ds1685.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/rtc/rtc-ds1685.c b/drivers/rtc/rtc-ds1685.c index 818a3635a8c8..86865881ce4b 100644 --- a/drivers/rtc/rtc-ds1685.c +++ b/drivers/rtc/rtc-ds1685.c @@ -187,9 +187,9 @@ ds1685_rtc_end_data_access(struct ds1685_priv *rtc) * Only use this where you are certain another lock will not be held. */ static inline void -ds1685_rtc_begin_ctrl_access(struct ds1685_priv *rtc, unsigned long flags) +ds1685_rtc_begin_ctrl_access(struct ds1685_priv *rtc, unsigned long *flags) { - spin_lock_irqsave(&rtc->lock, flags); + spin_lock_irqsave(&rtc->lock, *flags); ds1685_rtc_switch_to_bank1(rtc); } @@ -1304,7 +1304,7 @@ ds1685_rtc_sysfs_ctrl_regs_store(struct device *dev, { struct ds1685_priv *rtc = dev_get_drvdata(dev); u8 reg = 0, bit = 0, tmp; - unsigned long flags = 0; + unsigned long flags; long int val = 0; const struct ds1685_rtc_ctrl_regs *reg_info = ds1685_rtc_sysfs_ctrl_regs_lookup(attr->attr.name); @@ -1325,7 +1325,7 @@ ds1685_rtc_sysfs_ctrl_regs_store(struct device *dev, bit = reg_info->bit; /* Safe to spinlock during a write. */ - ds1685_rtc_begin_ctrl_access(rtc, flags); + ds1685_rtc_begin_ctrl_access(rtc, &flags); tmp = rtc->read(rtc, reg); rtc->write(rtc, reg, (val ? (tmp | bit) : (tmp & ~(bit)))); ds1685_rtc_end_ctrl_access(rtc, flags); From d8b755902390e7ceb32a08a82f4c9e1ce72fb11e Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Thu, 4 Feb 2016 09:26:35 +0900 Subject: [PATCH 216/312] rtc: max77686: Properly handle regmap_irq_get_virq() error code [ Upstream commit fb166ba1d7f0a662f7332f4ff660a0d6f4d76915 ] The regmap_irq_get_virq() can return 0 or -EINVAL in error conditions but driver checked only for value of 0. This could lead to a cast of -EINVAL to an unsigned int used as a interrupt number for devm_request_threaded_irq(). Although this is not yet fatal (devm_request_threaded_irq() will just fail with -EINVAL) but might be a misleading when diagnosing errors. Signed-off-by: Krzysztof Kozlowski Fixes: 6f1c1e71d933 ("mfd: max77686: Convert to use regmap_irq") Reviewed-by: Javier Martinez Canillas Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin --- drivers/rtc/rtc-max77686.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/rtc/rtc-max77686.c b/drivers/rtc/rtc-max77686.c index 7632a87784c3..d42cef0ca939 100644 --- a/drivers/rtc/rtc-max77686.c +++ b/drivers/rtc/rtc-max77686.c @@ -465,7 +465,7 @@ static int max77686_rtc_probe(struct platform_device *pdev) info->virq = regmap_irq_get_virq(max77686->rtc_irq_data, MAX77686_RTCIRQ_RTCA1); - if (!info->virq) { + if (info->virq <= 0) { ret = -ENXIO; goto err_rtc; } From 4385d1a4421af970858491b1c35676271c89c2d9 Mon Sep 17 00:00:00 2001 From: Michael Hennerich Date: Mon, 22 Feb 2016 10:20:24 +0100 Subject: [PATCH 217/312] drivers/misc/ad525x_dpot: AD5274 fix RDAC read back errors [ Upstream commit f3df53e4d70b5736368a8fe8aa1bb70c1cb1f577 ] Fix RDAC read back errors caused by a typo. Value must shift by 2. Fixes: a4bd394956f2 ("drivers/misc/ad525x_dpot.c: new features") Signed-off-by: Michael Hennerich Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/misc/ad525x_dpot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/misc/ad525x_dpot.c b/drivers/misc/ad525x_dpot.c index 15e88078ba1e..f1a0b99f5a9a 100644 --- a/drivers/misc/ad525x_dpot.c +++ b/drivers/misc/ad525x_dpot.c @@ -216,7 +216,7 @@ static s32 dpot_read_i2c(struct dpot_data *dpot, u8 reg) */ value = swab16(value); - if (dpot->uid == DPOT_UID(AD5271_ID)) + if (dpot->uid == DPOT_UID(AD5274_ID)) value = value >> 2; return value; default: From 6e939f787e830777bdafd5d70f13ed6498dc7ac0 Mon Sep 17 00:00:00 2001 From: Karol Herbst Date: Thu, 3 Mar 2016 02:03:11 +0100 Subject: [PATCH 218/312] x86/mm/kmmio: Fix mmiotrace for hugepages [ Upstream commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 ] Because Linux might use bigger pages than the 4K pages to handle those mmio ioremaps, the kmmio code shouldn't rely on the pade id as it currently does. Using the memory address instead of the page id lets us look up how big the page is and what its base address is, so that we won't get a page fault within the same page twice anymore. Tested-by: Pierre Moreau Signed-off-by: Karol Herbst Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Toshi Kani Cc: linux-mm@kvack.org Cc: linux-x86_64@vger.kernel.org Cc: nouveau@lists.freedesktop.org Cc: pq@iki.fi Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/1456966991-6861-1-git-send-email-nouveau@karolherbst.de Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- arch/x86/mm/kmmio.c | 88 ++++++++++++++++++++++++++++++--------------- 1 file changed, 59 insertions(+), 29 deletions(-) diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c index 637ab34ed632..ddb2244b06a1 100644 --- a/arch/x86/mm/kmmio.c +++ b/arch/x86/mm/kmmio.c @@ -33,7 +33,7 @@ struct kmmio_fault_page { struct list_head list; struct kmmio_fault_page *release_next; - unsigned long page; /* location of the fault page */ + unsigned long addr; /* the requested address */ pteval_t old_presence; /* page presence prior to arming */ bool armed; @@ -70,9 +70,16 @@ unsigned int kmmio_count; static struct list_head kmmio_page_table[KMMIO_PAGE_TABLE_SIZE]; static LIST_HEAD(kmmio_probes); -static struct list_head *kmmio_page_list(unsigned long page) +static struct list_head *kmmio_page_list(unsigned long addr) { - return &kmmio_page_table[hash_long(page, KMMIO_PAGE_HASH_BITS)]; + unsigned int l; + pte_t *pte = lookup_address(addr, &l); + + if (!pte) + return NULL; + addr &= page_level_mask(l); + + return &kmmio_page_table[hash_long(addr, KMMIO_PAGE_HASH_BITS)]; } /* Accessed per-cpu */ @@ -98,15 +105,19 @@ static struct kmmio_probe *get_kmmio_probe(unsigned long addr) } /* You must be holding RCU read lock. */ -static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long page) +static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr) { struct list_head *head; struct kmmio_fault_page *f; + unsigned int l; + pte_t *pte = lookup_address(addr, &l); - page &= PAGE_MASK; - head = kmmio_page_list(page); + if (!pte) + return NULL; + addr &= page_level_mask(l); + head = kmmio_page_list(addr); list_for_each_entry_rcu(f, head, list) { - if (f->page == page) + if (f->addr == addr) return f; } return NULL; @@ -137,10 +148,10 @@ static void clear_pte_presence(pte_t *pte, bool clear, pteval_t *old) static int clear_page_presence(struct kmmio_fault_page *f, bool clear) { unsigned int level; - pte_t *pte = lookup_address(f->page, &level); + pte_t *pte = lookup_address(f->addr, &level); if (!pte) { - pr_err("no pte for page 0x%08lx\n", f->page); + pr_err("no pte for addr 0x%08lx\n", f->addr); return -1; } @@ -156,7 +167,7 @@ static int clear_page_presence(struct kmmio_fault_page *f, bool clear) return -1; } - __flush_tlb_one(f->page); + __flush_tlb_one(f->addr); return 0; } @@ -176,12 +187,12 @@ static int arm_kmmio_fault_page(struct kmmio_fault_page *f) int ret; WARN_ONCE(f->armed, KERN_ERR pr_fmt("kmmio page already armed.\n")); if (f->armed) { - pr_warning("double-arm: page 0x%08lx, ref %d, old %d\n", - f->page, f->count, !!f->old_presence); + pr_warning("double-arm: addr 0x%08lx, ref %d, old %d\n", + f->addr, f->count, !!f->old_presence); } ret = clear_page_presence(f, true); - WARN_ONCE(ret < 0, KERN_ERR pr_fmt("arming 0x%08lx failed.\n"), - f->page); + WARN_ONCE(ret < 0, KERN_ERR pr_fmt("arming at 0x%08lx failed.\n"), + f->addr); f->armed = true; return ret; } @@ -191,7 +202,7 @@ static void disarm_kmmio_fault_page(struct kmmio_fault_page *f) { int ret = clear_page_presence(f, false); WARN_ONCE(ret < 0, - KERN_ERR "kmmio disarming 0x%08lx failed.\n", f->page); + KERN_ERR "kmmio disarming at 0x%08lx failed.\n", f->addr); f->armed = false; } @@ -215,6 +226,12 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr) struct kmmio_context *ctx; struct kmmio_fault_page *faultpage; int ret = 0; /* default to fault not handled */ + unsigned long page_base = addr; + unsigned int l; + pte_t *pte = lookup_address(addr, &l); + if (!pte) + return -EINVAL; + page_base &= page_level_mask(l); /* * Preemption is now disabled to prevent process switch during @@ -227,7 +244,7 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr) preempt_disable(); rcu_read_lock(); - faultpage = get_kmmio_fault_page(addr); + faultpage = get_kmmio_fault_page(page_base); if (!faultpage) { /* * Either this page fault is not caused by kmmio, or @@ -239,7 +256,7 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr) ctx = &get_cpu_var(kmmio_ctx); if (ctx->active) { - if (addr == ctx->addr) { + if (page_base == ctx->addr) { /* * A second fault on the same page means some other * condition needs handling by do_page_fault(), the @@ -267,9 +284,9 @@ int kmmio_handler(struct pt_regs *regs, unsigned long addr) ctx->active++; ctx->fpage = faultpage; - ctx->probe = get_kmmio_probe(addr); + ctx->probe = get_kmmio_probe(page_base); ctx->saved_flags = (regs->flags & (X86_EFLAGS_TF | X86_EFLAGS_IF)); - ctx->addr = addr; + ctx->addr = page_base; if (ctx->probe && ctx->probe->pre_handler) ctx->probe->pre_handler(ctx->probe, regs, addr); @@ -354,12 +371,11 @@ out: } /* You must be holding kmmio_lock. */ -static int add_kmmio_fault_page(unsigned long page) +static int add_kmmio_fault_page(unsigned long addr) { struct kmmio_fault_page *f; - page &= PAGE_MASK; - f = get_kmmio_fault_page(page); + f = get_kmmio_fault_page(addr); if (f) { if (!f->count) arm_kmmio_fault_page(f); @@ -372,26 +388,25 @@ static int add_kmmio_fault_page(unsigned long page) return -1; f->count = 1; - f->page = page; + f->addr = addr; if (arm_kmmio_fault_page(f)) { kfree(f); return -1; } - list_add_rcu(&f->list, kmmio_page_list(f->page)); + list_add_rcu(&f->list, kmmio_page_list(f->addr)); return 0; } /* You must be holding kmmio_lock. */ -static void release_kmmio_fault_page(unsigned long page, +static void release_kmmio_fault_page(unsigned long addr, struct kmmio_fault_page **release_list) { struct kmmio_fault_page *f; - page &= PAGE_MASK; - f = get_kmmio_fault_page(page); + f = get_kmmio_fault_page(addr); if (!f) return; @@ -420,18 +435,27 @@ int register_kmmio_probe(struct kmmio_probe *p) int ret = 0; unsigned long size = 0; const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); + unsigned int l; + pte_t *pte; spin_lock_irqsave(&kmmio_lock, flags); if (get_kmmio_probe(p->addr)) { ret = -EEXIST; goto out; } + + pte = lookup_address(p->addr, &l); + if (!pte) { + ret = -EINVAL; + goto out; + } + kmmio_count++; list_add_rcu(&p->list, &kmmio_probes); while (size < size_lim) { if (add_kmmio_fault_page(p->addr + size)) pr_err("Unable to set page fault.\n"); - size += PAGE_SIZE; + size += page_level_size(l); } out: spin_unlock_irqrestore(&kmmio_lock, flags); @@ -506,11 +530,17 @@ void unregister_kmmio_probe(struct kmmio_probe *p) const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK); struct kmmio_fault_page *release_list = NULL; struct kmmio_delayed_release *drelease; + unsigned int l; + pte_t *pte; + + pte = lookup_address(p->addr, &l); + if (!pte) + return; spin_lock_irqsave(&kmmio_lock, flags); while (size < size_lim) { release_kmmio_fault_page(p->addr + size, &release_list); - size += PAGE_SIZE; + size += page_level_size(l); } list_del_rcu(&p->list); kmmio_count--; From 7925f4fb3d8b5b0c1bb9c5ded85078197c6f87af Mon Sep 17 00:00:00 2001 From: Eryu Guan Date: Sat, 12 Mar 2016 21:40:32 -0500 Subject: [PATCH 219/312] ext4: fix NULL pointer dereference in ext4_mark_inode_dirty() [ Upstream commit 5e1021f2b6dff1a86a468a1424d59faae2bc63c1 ] ext4_reserve_inode_write() in ext4_mark_inode_dirty() could fail on error (e.g. EIO) and iloc.bh can be NULL in this case. But the error is ignored in the following "if" condition and ext4_expand_extra_isize() might be called with NULL iloc.bh set, which triggers NULL pointer dereference. This is uncovered by commit 8b4953e13f4c ("ext4: reserve code points for the project quota feature"), which enlarges the ext4_inode size, and run the following script on new kernel but with old mke2fs: #/bin/bash mnt=/mnt/ext4 devname=ext4-error dev=/dev/mapper/$devname fsimg=/home/fs.img trap cleanup 0 1 2 3 9 15 cleanup() { umount $mnt >/dev/null 2>&1 dmsetup remove $devname losetup -d $backend_dev rm -f $fsimg exit 0 } rm -f $fsimg fallocate -l 1g $fsimg backend_dev=`losetup -f --show $fsimg` devsize=`blockdev --getsz $backend_dev` good_tab="0 $devsize linear $backend_dev 0" error_tab="0 $devsize error $backend_dev 0" dmsetup create $devname --table "$good_tab" mkfs -t ext4 $dev mount -t ext4 -o errors=continue,strictatime $dev $mnt dmsetup load $devname --table "$error_tab" && dmsetup resume $devname echo 3 > /proc/sys/vm/drop_caches ls -l $mnt exit 0 [ Patch changed to simplify the function a tiny bit. -- Ted ] Signed-off-by: Eryu Guan Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin --- fs/ext4/inode.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index f43996884242..ba12e2953aec 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5094,6 +5094,8 @@ int ext4_mark_inode_dirty(handle_t *handle, struct inode *inode) might_sleep(); trace_ext4_mark_inode_dirty(inode, _RET_IP_); err = ext4_reserve_inode_write(handle, inode, &iloc); + if (err) + return err; if (ext4_handle_valid(handle) && EXT4_I(inode)->i_extra_isize < sbi->s_want_extra_isize && !ext4_test_inode_state(inode, EXT4_STATE_NO_EXPAND)) { @@ -5124,9 +5126,7 @@ int ext4_mark_inode_dirty(handle_t *handle, struct inode *inode) } } } - if (!err) - err = ext4_mark_iloc_dirty(handle, inode, &iloc); - return err; + return ext4_mark_iloc_dirty(handle, inode, &iloc); } /* From cedfda9c15cbb1404ba6acdc2d3ea3fc73aa76f8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marcin=20=C5=9Alusarz?= Date: Tue, 19 Jan 2016 20:03:03 +0100 Subject: [PATCH 220/312] perf tools: handle spaces in file names obtained from /proc/pid/maps MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 89fee59b504f86925894fcc9ba79d5c933842f93 ] Steam frequently puts game binaries in folders with spaces. Note: "(deleted)" markers are now treated as part of the file name. Signed-off-by: Marcin Ślusarz Acked-by: Namhyung Kim Fixes: 6064803313ba ("perf tools: Use sscanf for parsing /proc/pid/maps") Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720 Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin --- tools/perf/util/event.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index ff866c4d2e2f..d18a59ab4ed5 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -251,7 +251,7 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool, strcpy(execname, ""); /* 00400000-0040c000 r-xp 00000000 fd:01 41038 /bin/cat */ - n = sscanf(bf, "%"PRIx64"-%"PRIx64" %s %"PRIx64" %x:%x %u %s\n", + n = sscanf(bf, "%"PRIx64"-%"PRIx64" %s %"PRIx64" %x:%x %u %[^\n]\n", &event->mmap2.start, &event->mmap2.len, prot, &event->mmap2.pgoff, &event->mmap2.maj, &event->mmap2.min, From 95297bf99436694051ee91fd5eee3c3aa02fda5d Mon Sep 17 00:00:00 2001 From: Borislav Petkov Date: Mon, 7 Mar 2016 16:44:44 -0300 Subject: [PATCH 221/312] perf stat: Document --detailed option [ Upstream commit f594bae08183fb6b57db55387794ece3e1edf6f6 ] I'm surprised this remained undocumented since at least 2011. And it is actually a very useful switch, as Steve and I came to realize recently. Add the text from 2cba3ffb9a9d ("perf stat: Add -d -d and -d -d -d options to show more CPU events") which added the incrementing aspect to -d. Tested-by: Arnaldo Carvalho de Melo Signed-off-by: Borislav Petkov Signed-off-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: David Ahern Cc: Davidlohr Bueso Cc: Jiri Olsa Cc: Mel Gorman Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Thomas Gleixner Fixes: 2cba3ffb9a9d ("perf stat: Add -d -d and -d -d -d options to show more CPU events") Link: http://lkml.kernel.org/r/1457347294-32546-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin --- tools/perf/Documentation/perf-stat.txt | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 04e150d83e7d..756ed9fdc9ad 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -62,6 +62,14 @@ OPTIONS --scale:: scale/normalize counter values +-d:: +--detailed:: + print more detailed statistics, can be specified up to 3 times + + -d: detailed events, L1 and LLC data cache + -d -d: more detailed events, dTLB and iTLB events + -d -d -d: very detailed events, adding prefetch events + -r:: --repeat=:: repeat command and print average + stddev (max: 100). 0 means forever. From 61116b29f126e6e7a6975f9a391d18ecd3bad83d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Pali=20Roh=C3=A1r?= Date: Fri, 19 Feb 2016 10:35:39 -0800 Subject: [PATCH 222/312] ARM: OMAP3: Add cpuidle parameters table for omap3430 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 98f42221501353067251fbf11e732707dbb68ce3 ] Based on CPU type choose generic omap3 or omap3430 specific cpuidle parameters. Parameters for omap3430 were measured on Nokia N900 device and added by commit 5a1b1d3a9efa ("OMAP3: RX-51: Pass cpu idle parameters") which were later removed by commit 231900afba52 ("ARM: OMAP3: cpuidle - remove rx51 cpuidle parameters table") due to huge code complexity. This patch brings cpuidle parameters for omap3430 devices again, but uses simple condition based on CPU type. Fixes: 231900afba52 ("ARM: OMAP3: cpuidle - remove rx51 cpuidle parameters table") Signed-off-by: Pali Rohár Acked-by: Daniel Lezcano Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin --- arch/arm/mach-omap2/cpuidle34xx.c | 69 ++++++++++++++++++++++++++++++- 1 file changed, 68 insertions(+), 1 deletion(-) diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c index aa7b379e2661..2a3db0bd9e15 100644 --- a/arch/arm/mach-omap2/cpuidle34xx.c +++ b/arch/arm/mach-omap2/cpuidle34xx.c @@ -34,6 +34,7 @@ #include "pm.h" #include "control.h" #include "common.h" +#include "soc.h" /* Mach specific information to be recorded in the C-state driver_data */ struct omap3_idle_statedata { @@ -315,6 +316,69 @@ static struct cpuidle_driver omap3_idle_driver = { .safe_state_index = 0, }; +/* + * Numbers based on measurements made in October 2009 for PM optimized kernel + * with CPU freq enabled on device Nokia N900. Assumes OPP2 (main idle OPP, + * and worst case latencies). + */ +static struct cpuidle_driver omap3430_idle_driver = { + .name = "omap3430_idle", + .owner = THIS_MODULE, + .states = { + { + .enter = omap3_enter_idle_bm, + .exit_latency = 110 + 162, + .target_residency = 5, + .name = "C1", + .desc = "MPU ON + CORE ON", + }, + { + .enter = omap3_enter_idle_bm, + .exit_latency = 106 + 180, + .target_residency = 309, + .name = "C2", + .desc = "MPU ON + CORE ON", + }, + { + .enter = omap3_enter_idle_bm, + .exit_latency = 107 + 410, + .target_residency = 46057, + .name = "C3", + .desc = "MPU RET + CORE ON", + }, + { + .enter = omap3_enter_idle_bm, + .exit_latency = 121 + 3374, + .target_residency = 46057, + .name = "C4", + .desc = "MPU OFF + CORE ON", + }, + { + .enter = omap3_enter_idle_bm, + .exit_latency = 855 + 1146, + .target_residency = 46057, + .name = "C5", + .desc = "MPU RET + CORE RET", + }, + { + .enter = omap3_enter_idle_bm, + .exit_latency = 7580 + 4134, + .target_residency = 484329, + .name = "C6", + .desc = "MPU OFF + CORE RET", + }, + { + .enter = omap3_enter_idle_bm, + .exit_latency = 7505 + 15274, + .target_residency = 484329, + .name = "C7", + .desc = "MPU OFF + CORE OFF", + }, + }, + .state_count = ARRAY_SIZE(omap3_idle_data), + .safe_state_index = 0, +}; + /* Public functions */ /** @@ -333,5 +397,8 @@ int __init omap3_idle_init(void) if (!mpu_pd || !core_pd || !per_pd || !cam_pd) return -ENODEV; - return cpuidle_register(&omap3_idle_driver, NULL); + if (cpu_is_omap3430()) + return cpuidle_register(&omap3430_idle_driver, NULL); + else + return cpuidle_register(&omap3_idle_driver, NULL); } From 781fff9fdb879457cd067994f162fa2dc21f4bbc Mon Sep 17 00:00:00 2001 From: Guo-Fu Tseng Date: Sat, 5 Mar 2016 08:11:55 +0800 Subject: [PATCH 223/312] jme: Do not enable NIC WoL functions on S0 [ Upstream commit 0772a99b818079e628a1da122ac7ee023faed83e ] Otherwise it might be back on resume right after going to suspend in some hardware. Reported-by: Diego Viola Signed-off-by: Guo-Fu Tseng Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/jme.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c index 32d240b332d2..73351d20646c 100644 --- a/drivers/net/ethernet/jme.c +++ b/drivers/net/ethernet/jme.c @@ -270,11 +270,17 @@ jme_reset_mac_processor(struct jme_adapter *jme) } static inline void -jme_clear_pm(struct jme_adapter *jme) +jme_clear_pm_enable_wol(struct jme_adapter *jme) { jwrite32(jme, JME_PMCS, PMCS_STMASK | jme->reg_pmcs); } +static inline void +jme_clear_pm_disable_wol(struct jme_adapter *jme) +{ + jwrite32(jme, JME_PMCS, PMCS_STMASK); +} + static int jme_reload_eeprom(struct jme_adapter *jme) { @@ -1857,7 +1863,7 @@ jme_open(struct net_device *netdev) struct jme_adapter *jme = netdev_priv(netdev); int rc; - jme_clear_pm(jme); + jme_clear_pm_disable_wol(jme); JME_NAPI_ENABLE(jme); tasklet_init(&jme->linkch_task, jme_link_change_tasklet, @@ -1933,7 +1939,7 @@ jme_powersave_phy(struct jme_adapter *jme) jme_set_100m_half(jme); if (jme->reg_pmcs & (PMCS_LFEN | PMCS_LREN)) jme_wait_link(jme); - jme_clear_pm(jme); + jme_clear_pm_enable_wol(jme); } else { jme_phy_off(jme); } @@ -2650,7 +2656,6 @@ jme_set_wol(struct net_device *netdev, if (wol->wolopts & WAKE_MAGIC) jme->reg_pmcs |= PMCS_MFEN; - jwrite32(jme, JME_PMCS, jme->reg_pmcs); device_set_wakeup_enable(&jme->pdev->dev, !!(jme->reg_pmcs)); return 0; @@ -3176,7 +3181,7 @@ jme_init_one(struct pci_dev *pdev, jme->mii_if.mdio_read = jme_mdio_read; jme->mii_if.mdio_write = jme_mdio_write; - jme_clear_pm(jme); + jme_clear_pm_disable_wol(jme); device_set_wakeup_enable(&pdev->dev, true); jme_set_phyfifo_5level(jme); @@ -3308,7 +3313,7 @@ jme_resume(struct device *dev) if (!netif_running(netdev)) return 0; - jme_clear_pm(jme); + jme_clear_pm_disable_wol(jme); jme_phy_on(jme); if (test_bit(JME_FLAG_SSET, &jme->flags)) jme_set_settings(netdev, &jme->old_ecmd); From 00c0a9f44f68bfe9af73f84cdb93b1695da66b44 Mon Sep 17 00:00:00 2001 From: Guo-Fu Tseng Date: Sat, 5 Mar 2016 08:11:56 +0800 Subject: [PATCH 224/312] jme: Fix device PM wakeup API usage [ Upstream commit 81422e672f8181d7ad1ee6c60c723aac649f538f ] According to Documentation/power/devices.txt The driver should not use device_set_wakeup_enable() which is the policy for user to decide. Using device_init_wakeup() to initialize dev->power.should_wakeup and dev->power.can_wakeup on driver initialization. And use device_may_wakeup() on suspend to decide if WoL function should be enabled on NIC. Reported-by: Diego Viola Signed-off-by: Guo-Fu Tseng Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/jme.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c index 73351d20646c..46e8d5b12c1a 100644 --- a/drivers/net/ethernet/jme.c +++ b/drivers/net/ethernet/jme.c @@ -1935,7 +1935,7 @@ jme_wait_link(struct jme_adapter *jme) static void jme_powersave_phy(struct jme_adapter *jme) { - if (jme->reg_pmcs) { + if (jme->reg_pmcs && device_may_wakeup(&jme->pdev->dev)) { jme_set_100m_half(jme); if (jme->reg_pmcs & (PMCS_LFEN | PMCS_LREN)) jme_wait_link(jme); @@ -2656,8 +2656,6 @@ jme_set_wol(struct net_device *netdev, if (wol->wolopts & WAKE_MAGIC) jme->reg_pmcs |= PMCS_MFEN; - device_set_wakeup_enable(&jme->pdev->dev, !!(jme->reg_pmcs)); - return 0; } @@ -3182,7 +3180,7 @@ jme_init_one(struct pci_dev *pdev, jme->mii_if.mdio_write = jme_mdio_write; jme_clear_pm_disable_wol(jme); - device_set_wakeup_enable(&pdev->dev, true); + device_init_wakeup(&pdev->dev, true); jme_set_phyfifo_5level(jme); jme->pcirev = pdev->revision; From e28574abf31fbce4ae38606989b3f18e72838b97 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Fri, 4 Mar 2016 17:20:13 +1100 Subject: [PATCH 225/312] sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race [ Upstream commit a6ab1e8126d205238defbb55d23661a3a5c6a0d8 ] sunrpc_cache_pipe_upcall() can detect a race if CACHE_PENDING is no longer set. In this case it aborts the queuing of the upcall. However it has already taken a new counted reference on "h" and doesn't "put" it, even though it frees the data structure holding the reference. So let's delay the "cache_get" until we know we need it. Fixes: f9e1aedc6c79 ("sunrpc/cache: remove races with queuing an upcall.") Signed-off-by: NeilBrown Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin --- net/sunrpc/cache.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c index 8d79e70bd978..9ec709b9707c 100644 --- a/net/sunrpc/cache.c +++ b/net/sunrpc/cache.c @@ -1175,14 +1175,14 @@ int sunrpc_cache_pipe_upcall(struct cache_detail *detail, struct cache_head *h) } crq->q.reader = 0; - crq->item = cache_get(h); crq->buf = buf; crq->len = 0; crq->readers = 0; spin_lock(&queue_lock); - if (test_bit(CACHE_PENDING, &h->flags)) + if (test_bit(CACHE_PENDING, &h->flags)) { + crq->item = cache_get(h); list_add_tail(&crq->q.list, &detail->queue); - else + } else /* Lost a race, no longer PENDING, so don't enqueue */ ret = -EAGAIN; spin_unlock(&queue_lock); From 7267c9680f5a0ca9e068bb8986b3c336fd02bb1e Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 14 Mar 2016 15:29:45 +0100 Subject: [PATCH 226/312] megaraid_sas: add missing curly braces in ioctl handler [ Upstream commit 3deb9438d34a09f6796639b652a01d110aca9f75 ] gcc-6 found a dubious indentation in the megasas_mgmt_fw_ioctl function: drivers/scsi/megaraid/megaraid_sas_base.c: In function 'megasas_mgmt_fw_ioctl': drivers/scsi/megaraid/megaraid_sas_base.c:6658:4: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] kbuff_arr[i] = NULL; ^~~~~~~~~ drivers/scsi/megaraid/megaraid_sas_base.c:6653:3: note: ...this 'if' clause, but it is not if (kbuff_arr[i]) ^~ The code is actually correct, as there is no downside in clearing a NULL pointer again. This clarifies the code and avoids the warning by adding extra curly braces. Signed-off-by: Arnd Bergmann Fixes: 90dc9d98f01b ("megaraid_sas : MFI MPT linked list corruption fix") Reviewed-by: Hannes Reinecke Acked-by: Sumit Saxena Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/megaraid/megaraid_sas_base.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 890637fdd61e..7a1c4b4e764b 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -6203,12 +6203,13 @@ megasas_mgmt_fw_ioctl(struct megasas_instance *instance, } for (i = 0; i < ioc->sge_count; i++) { - if (kbuff_arr[i]) + if (kbuff_arr[i]) { dma_free_coherent(&instance->pdev->dev, le32_to_cpu(kern_sge32[i].length), kbuff_arr[i], le32_to_cpu(kern_sge32[i].phys_addr)); kbuff_arr[i] = NULL; + } } if (instance->ctrl_context && cmd->mpt_pthr_cmd_blocked) From 98f453547cfb0e1a1295e13905214a9f592d1e1c Mon Sep 17 00:00:00 2001 From: Marco Angaroni Date: Sat, 5 Mar 2016 12:10:02 +0100 Subject: [PATCH 227/312] ipvs: correct initial offset of Call-ID header search in SIP persistence engine [ Upstream commit 7617a24f83b5d67f4dab1844956be1cebc44aec8 ] The IPVS SIP persistence engine is not able to parse the SIP header "Call-ID" when such header is inserted in the first positions of the SIP message. When IPVS is configured with "--pe sip" option, like for example: ipvsadm -A -u 1.2.3.4:5060 -s rr --pe sip -p 120 -o some particular messages (see below for details) do not create entries in the connection template table, which can be listed with: ipvsadm -Lcn --persistent-conn Problematic SIP messages are SIP responses having "Call-ID" header positioned just after message first line: SIP/2.0 200 OK [Call-ID header here] [rest of the headers] When "Call-ID" header is positioned down (after a few other headers) it is correctly recognized. This is due to the data offset used in get_callid function call inside ip_vs_pe_sip.c file: since dptr already points to the start of the SIP message, the value of dataoff should be initially 0. Otherwise the header is searched starting from some bytes after the first character of the SIP message. Fixes: 758ff0338722 ("IPVS: sip persistence engine") Signed-off-by: Marco Angaroni Acked-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Sasha Levin --- net/netfilter/ipvs/ip_vs_pe_sip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/ipvs/ip_vs_pe_sip.c b/net/netfilter/ipvs/ip_vs_pe_sip.c index bed5f7042529..bb318e4623a3 100644 --- a/net/netfilter/ipvs/ip_vs_pe_sip.c +++ b/net/netfilter/ipvs/ip_vs_pe_sip.c @@ -88,7 +88,7 @@ ip_vs_sip_fill_param(struct ip_vs_conn_param *p, struct sk_buff *skb) dptr = skb->data + dataoff; datalen = skb->len - dataoff; - if (get_callid(dptr, dataoff, datalen, &matchoff, &matchlen)) + if (get_callid(dptr, 0, datalen, &matchoff, &matchlen)) return -EINVAL; /* N.B: pe_data is only set on success, From 7361f006e36e20a81edede6724be39e718d4347f Mon Sep 17 00:00:00 2001 From: Julian Anastasov Date: Sat, 5 Mar 2016 15:03:22 +0200 Subject: [PATCH 228/312] ipvs: drop first packet to redirect conntrack [ Upstream commit f719e3754ee2f7275437e61a6afd520181fdd43b ] Jiri Bohac is reporting for a problem where the attempt to reschedule existing connection to another real server needs proper redirect for the conntrack used by the IPVS connection. For example, when IPVS connection is created to NAT-ed real server we alter the reply direction of conntrack. If we later decide to select different real server we can not alter again the conntrack. And if we expire the old connection, the new connection is left without conntrack. So, the only way to redirect both the IPVS connection and the Netfilter's conntrack is to drop the SYN packet that hits existing connection, to wait for the next jiffie to expire the old connection and its conntrack and to rely on client's retransmission to create new connection as usually. Jiri Bohac provided a fix that drops all SYNs on rescheduling, I extended his patch to do such drops only for connections that use conntrack. Here is the original report from Jiri Bohac: Since commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead"), new connections to dead servers are redistributed immediately to new servers. The old connection is expired using ip_vs_conn_expire_now() which sets the connection timer to expire immediately. However, before the timer callback, ip_vs_conn_expire(), is run to clean the connection's conntrack entry, the new redistributed connection may already be established and its conntrack removed instead. Fix this by dropping the first packet of the new connection instead, like we do when the destination server is not available. The timer will have deleted the old conntrack entry long before the first packet of the new connection is retransmitted. Fixes: dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead") Signed-off-by: Jiri Bohac Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Sasha Levin --- include/net/ip_vs.h | 17 +++++++++++++++ net/netfilter/ipvs/ip_vs_core.c | 37 +++++++++++++++++++++++++-------- 2 files changed, 45 insertions(+), 9 deletions(-) diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 4e3731ee4eac..cd8594528d4b 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -1584,6 +1584,23 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp) } #endif /* CONFIG_IP_VS_NFCT */ +/* Really using conntrack? */ +static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp, + struct sk_buff *skb) +{ +#ifdef CONFIG_IP_VS_NFCT + enum ip_conntrack_info ctinfo; + struct nf_conn *ct; + + if (!(cp->flags & IP_VS_CONN_F_NFCT)) + return false; + ct = nf_ct_get(skb, &ctinfo); + if (ct && !nf_ct_is_untracked(ct)) + return true; +#endif + return false; +} + static inline int ip_vs_dest_conn_overhead(struct ip_vs_dest *dest) { diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c index 38fbc194b9cb..a26bd6532829 100644 --- a/net/netfilter/ipvs/ip_vs_core.c +++ b/net/netfilter/ipvs/ip_vs_core.c @@ -1689,15 +1689,34 @@ ip_vs_in(unsigned int hooknum, struct sk_buff *skb, int af) cp = pp->conn_in_get(af, skb, &iph, 0); conn_reuse_mode = sysctl_conn_reuse_mode(ipvs); - if (conn_reuse_mode && !iph.fragoffs && - is_new_conn(skb, &iph) && cp && - ((unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest && - unlikely(!atomic_read(&cp->dest->weight))) || - unlikely(is_new_conn_expected(cp, conn_reuse_mode)))) { - if (!atomic_read(&cp->n_control)) - ip_vs_conn_expire_now(cp); - __ip_vs_conn_put(cp); - cp = NULL; + if (conn_reuse_mode && !iph.fragoffs && is_new_conn(skb, &iph) && cp) { + bool uses_ct = false, resched = false; + + if (unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest && + unlikely(!atomic_read(&cp->dest->weight))) { + resched = true; + uses_ct = ip_vs_conn_uses_conntrack(cp, skb); + } else if (is_new_conn_expected(cp, conn_reuse_mode)) { + uses_ct = ip_vs_conn_uses_conntrack(cp, skb); + if (!atomic_read(&cp->n_control)) { + resched = true; + } else { + /* Do not reschedule controlling connection + * that uses conntrack while it is still + * referenced by controlled connection(s). + */ + resched = !uses_ct; + } + } + + if (resched) { + if (!atomic_read(&cp->n_control)) + ip_vs_conn_expire_now(cp); + __ip_vs_conn_put(cp); + if (uses_ct) + return NF_DROP; + cp = NULL; + } } if (unlikely(!cp) && !iph.fragoffs) { From 9eb52957b6a4a3e86ae25735c5cf1f1945ad33c4 Mon Sep 17 00:00:00 2001 From: Dan Streetman Date: Thu, 14 Jan 2016 13:42:32 -0500 Subject: [PATCH 229/312] nbd: ratelimit error msgs after socket close [ Upstream commit da6ccaaa79caca4f38b540b651238f87215217a2 ] Make the "Attempted send on closed socket" error messages generated in nbd_request_handler() ratelimited. When the nbd socket is shutdown, the nbd_request_handler() function emits an error message for every request remaining in its queue. If the queue is large, this will spam a large amount of messages to the log. There's no need for a separate error message for each request, so this patch ratelimits it. In the specific case this was found, the system was virtual and the error messages were logged to the serial port, which overwhelmed it. Fixes: 4d48a542b427 ("nbd: fix I/O hang on disconnected nbds") Signed-off-by: Dan Streetman Signed-off-by: Markus Pargmann Signed-off-by: Sasha Levin --- drivers/block/nbd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 39e5f7fae3ef..9911b2067286 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -557,8 +557,8 @@ static void do_nbd_request(struct request_queue *q) req, req->cmd_type); if (unlikely(!nbd->sock)) { - dev_err(disk_to_dev(nbd->disk), - "Attempted send on closed socket\n"); + dev_err_ratelimited(disk_to_dev(nbd->disk), + "Attempted send on closed socket\n"); req->errors++; nbd_end_request(nbd, req); spin_lock_irq(q->queue_lock); From ea0b24134918a838f1ff94ac4707d3dcc637630c Mon Sep 17 00:00:00 2001 From: Shawn Lin Date: Tue, 2 Feb 2016 11:37:50 +0800 Subject: [PATCH 230/312] clk: rockchip: free memory in error cases when registering clock branches [ Upstream commit 2467b6745e0ae9c6cdccff24c4cceeb14b1cce3f ] Add free memeory if rockchip_clk_register_branch fails. Fixes: a245fecbb806 ("clk: rockchip: add basic infrastructure...") Signed-off-by: Shawn Lin Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin --- drivers/clk/rockchip/clk.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/clk/rockchip/clk.c b/drivers/clk/rockchip/clk.c index edb5d489ae61..3b9de3264534 100644 --- a/drivers/clk/rockchip/clk.c +++ b/drivers/clk/rockchip/clk.c @@ -70,7 +70,7 @@ static struct clk *rockchip_clk_register_branch(const char *name, if (gate_offset >= 0) { gate = kzalloc(sizeof(*gate), GFP_KERNEL); if (!gate) - return ERR_PTR(-ENOMEM); + goto err_gate; gate->flags = gate_flags; gate->reg = base + gate_offset; @@ -82,7 +82,7 @@ static struct clk *rockchip_clk_register_branch(const char *name, if (div_width > 0) { div = kzalloc(sizeof(*div), GFP_KERNEL); if (!div) - return ERR_PTR(-ENOMEM); + goto err_div; div->flags = div_flags; div->reg = base + muxdiv_offset; @@ -100,6 +100,11 @@ static struct clk *rockchip_clk_register_branch(const char *name, flags); return clk; +err_div: + kfree(gate); +err_gate: + kfree(mux); + return ERR_PTR(-ENOMEM); } static struct clk *rockchip_clk_register_frac_branch(const char *name, From 5356deeafda4e139a44f6f82a99439d93a7b84cf Mon Sep 17 00:00:00 2001 From: Srinivas Kandagatla Date: Mon, 22 Feb 2016 11:43:39 +0000 Subject: [PATCH 231/312] clk: qcom: msm8960: fix ce3_core clk enable register [ Upstream commit 732d6913691848db9fabaa6a25b4d6fad10ddccf ] This patch corrects the enable register offset which is actually 0x36cc instead of 0x36c4 Signed-off-by: Srinivas Kandagatla Fixes: 5f775498bdc4 ("clk: qcom: Fully support apq8064 global clock control") Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin --- drivers/clk/qcom/gcc-msm8960.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/clk/qcom/gcc-msm8960.c b/drivers/clk/qcom/gcc-msm8960.c index eb6a4f9fa107..9ce658ae72b7 100644 --- a/drivers/clk/qcom/gcc-msm8960.c +++ b/drivers/clk/qcom/gcc-msm8960.c @@ -2769,7 +2769,7 @@ static struct clk_branch ce3_core_clk = { .halt_reg = 0x2fdc, .halt_bit = 5, .clkr = { - .enable_reg = 0x36c4, + .enable_reg = 0x36cc, .enable_mask = BIT(4), .hw.init = &(struct clk_init_data){ .name = "ce3_core_clk", From b1a774dd632be6ec5494b0eaf55981b561ddec22 Mon Sep 17 00:00:00 2001 From: Linus Walleij Date: Wed, 24 Feb 2016 09:39:11 +0100 Subject: [PATCH 232/312] clk: versatile: sp810: support reentrance [ Upstream commit ec7957a6aa0aaf981fb8356dc47a2cdd01cde03c ] Despite care take to allocate clocks state containers the SP810 driver actually just supports creating one instance: all clocks registered for every instance will end up with the exact same name and __clk_init() will fail. Rename the timclken<0> .. timclken to sp810__ so every clock on every instance gets a unique name. This is necessary for the RealView PBA8 which has two SP810 blocks: the second block will not register its clocks unless every clock on every instance is unique and results in boot logs like this: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at ../drivers/clk/versatile/clk-sp810.c:137 clk_sp810_of_setup+0x110/0x154() Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.5.0-rc2-00030-g352718fc39f6-dirty #225 Hardware name: ARM RealView Machine (Device Tree Support) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x84/0x9c) [] (dump_stack) from [] (warn_slowpath_common+0x74/0xb0) [] (warn_slowpath_common) from [] (warn_slowpath_null+0x1c/0x24) [] (warn_slowpath_null) from [] (clk_sp810_of_setup+0x110/0x154) [] (clk_sp810_of_setup) from [] (of_clk_init+0x12c/0x1c8) [] (of_clk_init) from [] (time_init+0x20/0x2c) [] (time_init) from [] (start_kernel+0x244/0x3c4) [] (start_kernel) from [<7000807c>] (0x7000807c) ---[ end trace cb88537fdc8fa200 ]--- Cc: Michael Turquette Cc: Pawel Moll Fixes: 6e973d2c4385 "clk: vexpress: Add separate SP810 driver" Signed-off-by: Linus Walleij Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin --- drivers/clk/versatile/clk-sp810.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/clk/versatile/clk-sp810.c b/drivers/clk/versatile/clk-sp810.c index 5122ef25f595..e63c3ef9b5ec 100644 --- a/drivers/clk/versatile/clk-sp810.c +++ b/drivers/clk/versatile/clk-sp810.c @@ -141,6 +141,7 @@ void __init clk_sp810_of_setup(struct device_node *node) const char *parent_names[2]; char name[12]; struct clk_init_data init; + static int instance; int i; if (!sp810) { @@ -172,7 +173,7 @@ void __init clk_sp810_of_setup(struct device_node *node) init.num_parents = ARRAY_SIZE(parent_names); for (i = 0; i < ARRAY_SIZE(sp810->timerclken); i++) { - snprintf(name, ARRAY_SIZE(name), "timerclken%d", i); + snprintf(name, sizeof(name), "sp810_%d_%d", instance, i); sp810->timerclken[i].sp810 = sp810; sp810->timerclken[i].channel = i; @@ -184,5 +185,6 @@ void __init clk_sp810_of_setup(struct device_node *node) } of_clk_add_provider(node, clk_sp810_timerclken_of_get, sp810); + instance++; } CLK_OF_DECLARE(sp810, "arm,sp810", clk_sp810_of_setup); From 8859f16ee6d1c3ceb81e7b295cd9b3ff0689228e Mon Sep 17 00:00:00 2001 From: Stephen Boyd Date: Tue, 1 Mar 2016 17:26:48 -0800 Subject: [PATCH 233/312] clk: qcom: msm8960: Fix ce3_src register offset [ Upstream commit 0f75e1a370fd843c9e508fc1ccf0662833034827 ] The offset seems to have been copied from the sata clk. Fix it so that enabling the crypto engine source clk works. Tested-by: Srinivas Kandagatla Tested-by: Bjorn Andersson Fixes: 5f775498bdc4 ("clk: qcom: Fully support apq8064 global clock control") Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin --- drivers/clk/qcom/gcc-msm8960.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/clk/qcom/gcc-msm8960.c b/drivers/clk/qcom/gcc-msm8960.c index 9ce658ae72b7..476d2090c96b 100644 --- a/drivers/clk/qcom/gcc-msm8960.c +++ b/drivers/clk/qcom/gcc-msm8960.c @@ -2753,7 +2753,7 @@ static struct clk_rcg ce3_src = { }, .freq_tbl = clk_tbl_ce3, .clkr = { - .enable_reg = 0x2c08, + .enable_reg = 0x36c0, .enable_mask = BIT(7), .hw.init = &(struct clk_init_data){ .name = "ce3_src", From b7c61d507c7f67c6441da26942c5c65c49511d35 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Mon, 14 Mar 2016 15:29:44 +0100 Subject: [PATCH 234/312] lpfc: fix misleading indentation [ Upstream commit aeb6641f8ebdd61939f462a8255b316f9bfab707 ] gcc-6 complains about the indentation of the lpfc_destroy_vport_work_array() call in lpfc_online(), which clearly doesn't look right: drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_online': drivers/scsi/lpfc/lpfc_init.c:2880:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] lpfc_destroy_vport_work_array(phba, vports); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/lpfc/lpfc_init.c:2863:2: note: ...this 'if' clause, but it is not if (vports != NULL) ^~ Looking at the patch that introduced this code, it's clear that the behavior is correct and the indentation is wrong. This fixes the indentation and adds curly braces around the previous if() block for clarity, as that is most likely what caused the code to be misindented in the first place. Signed-off-by: Arnd Bergmann Fixes: 549e55cd2a1b ("[SCSI] lpfc 8.2.2 : Fix locking around HBA's port_list") Reviewed-by: Sebastian Herbszt Reviewed-by: Hannes Reinecke Reviewed-by: Ewan D. Milne Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/lpfc/lpfc_init.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index e8c8c1ecc1f5..bf8fd38abbbd 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -2848,7 +2848,7 @@ lpfc_online(struct lpfc_hba *phba) } vports = lpfc_create_vport_work_array(phba); - if (vports != NULL) + if (vports != NULL) { for (i = 0; i <= phba->max_vports && vports[i] != NULL; i++) { struct Scsi_Host *shost; shost = lpfc_shost_from_vport(vports[i]); @@ -2865,7 +2865,8 @@ lpfc_online(struct lpfc_hba *phba) } spin_unlock_irq(shost->host_lock); } - lpfc_destroy_vport_work_array(phba, vports); + } + lpfc_destroy_vport_work_array(phba, vports); lpfc_unblock_mgmt_io(phba); return 0; From d6c812a33d16d1e2dd9ab209c69d9755da5cc425 Mon Sep 17 00:00:00 2001 From: Knut Wohlrab Date: Mon, 25 Apr 2016 14:08:25 -0700 Subject: [PATCH 235/312] Input: zforce_ts - fix dual touch recognition [ Upstream commit 6984ab1ab35f422292b7781c65284038bcc0f6a6 ] A wrong decoding of the touch coordinate message causes a wrong touch ID. Touch ID for dual touch must be 0 or 1. According to the actual Neonode nine byte touch coordinate coding, the state is transported in the lower nibble and the touch ID in the higher nibble of payload byte five. Signed-off-by: Knut Wohlrab Signed-off-by: Oleksij Rempel Signed-off-by: Dirk Behme Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin --- drivers/input/touchscreen/zforce_ts.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/input/touchscreen/zforce_ts.c b/drivers/input/touchscreen/zforce_ts.c index 19880c7385e3..d618f67a9f48 100644 --- a/drivers/input/touchscreen/zforce_ts.c +++ b/drivers/input/touchscreen/zforce_ts.c @@ -359,8 +359,8 @@ static int zforce_touch_event(struct zforce_ts *ts, u8 *payload) point.coord_x = point.coord_y = 0; } - point.state = payload[9 * i + 5] & 0x03; - point.id = (payload[9 * i + 5] & 0xfc) >> 2; + point.state = payload[9 * i + 5] & 0x0f; + point.id = (payload[9 * i + 5] & 0xf0) >> 4; /* determine touch major, minor and orientation */ point.area_major = max(payload[9 * i + 6], From 93c4863f4435023fcfdae542039860349189b334 Mon Sep 17 00:00:00 2001 From: Mathias Krause Date: Thu, 5 May 2016 16:22:26 -0700 Subject: [PATCH 236/312] proc: prevent accessing /proc//environ until it's ready [ Upstream commit 8148a73c9901a8794a50f950083c00ccf97d43b3 ] If /proc//environ gets read before the envp[] array is fully set up in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to read more bytes than are actually written, as env_start will already be set but env_end will still be zero, making the range calculation underflow, allowing to read beyond the end of what has been written. Fix this as it is done for /proc//cmdline by testing env_end for zero. It is, apparently, intentionally set last in create_*_tables(). This bug was found by the PaX size_overflow plugin that detected the arithmetic underflow of 'this_len = env_end - (env_start + src)' when env_end is still zero. The expected consequence is that userland trying to access /proc//environ of a not yet fully set up process may get inconsistent data as we're in the middle of copying in the environment variables. Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&t=4363 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461 Signed-off-by: Mathias Krause Cc: Emese Revfy Cc: Pax Team Cc: Al Viro Cc: Mateusz Guzik Cc: Alexey Dobriyan Cc: Cyrill Gorcunov Cc: Jarod Wilson Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- fs/proc/base.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 68d51ed1666f..239dca3fb676 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -759,7 +759,8 @@ static ssize_t environ_read(struct file *file, char __user *buf, int ret = 0; struct mm_struct *mm = file->private_data; - if (!mm) + /* Ensure the process spawned far enough to have an environment. */ + if (!mm || !mm->env_end) return 0; page = (char *)__get_free_page(GFP_TEMPORARY); From 7b1f624b9e5dbba4d7613a9b9050526eba6da626 Mon Sep 17 00:00:00 2001 From: Jason Baron Date: Thu, 5 May 2016 16:22:12 -0700 Subject: [PATCH 237/312] mm: update min_free_kbytes from khugepaged after core initialization [ Upstream commit bc22af74f271ef76b2e6f72f3941f91f0da3f5f8 ] Khugepaged attempts to raise min_free_kbytes if its set too low. However, on boot khugepaged sets min_free_kbytes first from subsys_initcall(), and then the mm 'core' over-rides min_free_kbytes after from init_per_zone_wmark_min(), via a module_init() call. Khugepaged used to use a late_initcall() to set min_free_kbytes (such that it occurred after the core initialization), however this was removed when the initialization of min_free_kbytes was integrated into the starting of the khugepaged thread. The fix here is simply to invoke the core initialization using a core_initcall() instead of module_init(), such that the previous initialization ordering is restored. I didn't restore the late_initcall() since start_stop_khugepaged() already sets min_free_kbytes via set_recommended_min_free_kbytes(). This was noticed when we had a number of page allocation failures when moving a workload to a kernel with this new initialization ordering. On an 8GB system this restores min_free_kbytes back to 67584 from 11365 when CONFIG_TRANSPARENT_HUGEPAGE=y is set and either CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y or CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y. Fixes: 79553da293d3 ("thp: cleanup khugepaged startup") Signed-off-by: Jason Baron Acked-by: Kirill A. Shutemov Acked-by: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 551923097bbc..f6f6831cec52 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5926,7 +5926,7 @@ int __meminit init_per_zone_wmark_min(void) setup_per_zone_inactive_ratio(); return 0; } -module_init(init_per_zone_wmark_min) +core_initcall(init_per_zone_wmark_min) /* * min_free_kbytes_sysctl_handler - just a wrapper around proc_dointvec() so From f7ac631102975609851be1969f7d89a77ecaa52e Mon Sep 17 00:00:00 2001 From: Sven Eckelmann Date: Fri, 26 Feb 2016 17:56:13 +0100 Subject: [PATCH 238/312] batman-adv: Check skb size before using encapsulated ETH+VLAN header [ Upstream commit c78296665c3d81f040117432ab9e1cb125521b0c ] The encapsulated ethernet and VLAN header may be outside the received ethernet frame. Thus the skb buffer size has to be checked before it can be parsed to find out if it encapsulates another batman-adv packet. Fixes: 420193573f11 ("batman-adv: softif bridge loop avoidance") Signed-off-by: Sven Eckelmann Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli Signed-off-by: Sasha Levin --- net/batman-adv/soft-interface.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c index a0b1b861b968..b38a8aa3cce8 100644 --- a/net/batman-adv/soft-interface.c +++ b/net/batman-adv/soft-interface.c @@ -377,11 +377,17 @@ void batadv_interface_rx(struct net_device *soft_iface, */ nf_reset(skb); + if (unlikely(!pskb_may_pull(skb, ETH_HLEN))) + goto dropped; + vid = batadv_get_vid(skb, 0); ethhdr = eth_hdr(skb); switch (ntohs(ethhdr->h_proto)) { case ETH_P_8021Q: + if (!pskb_may_pull(skb, VLAN_ETH_HLEN)) + goto dropped; + vhdr = (struct vlan_ethhdr *)skb->data; if (vhdr->h_vlan_encapsulated_proto != ethertype) @@ -393,8 +399,6 @@ void batadv_interface_rx(struct net_device *soft_iface, } /* skb->dev & skb->pkt_type are set here */ - if (unlikely(!pskb_may_pull(skb, ETH_HLEN))) - goto dropped; skb->protocol = eth_type_trans(skb, soft_iface); /* should not be necessary anymore as we use skb_pull_rcsum() From eaa401a078fffcd5e1d078cb10b0efcd8e2d6c20 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Linus=20L=C3=BCssing?= Date: Fri, 11 Mar 2016 14:04:49 +0100 Subject: [PATCH 239/312] batman-adv: Fix broadcast/ogm queue limit on a removed interface MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit c4fdb6cff2aa0ae740c5f19b6f745cbbe786d42f ] When removing a single interface while a broadcast or ogm packet is still pending then we will free the forward packet without releasing the queue slots again. This patch is supposed to fix this issue. Fixes: 6d5808d4ae1b ("batman-adv: Add missing hardif_free_ref in forw_packet_free") Signed-off-by: Linus Lüssing [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli Signed-off-by: Sasha Levin --- net/batman-adv/send.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c index 3d64ed20c393..6004c2de7b2a 100644 --- a/net/batman-adv/send.c +++ b/net/batman-adv/send.c @@ -611,6 +611,9 @@ batadv_purge_outstanding_packets(struct batadv_priv *bat_priv, if (pending) { hlist_del(&forw_packet->list); + if (!forw_packet->own) + atomic_inc(&bat_priv->bcast_queue_left); + batadv_forw_packet_free(forw_packet); } } @@ -638,6 +641,9 @@ batadv_purge_outstanding_packets(struct batadv_priv *bat_priv, if (pending) { hlist_del(&forw_packet->list); + if (!forw_packet->own) + atomic_inc(&bat_priv->batman_queue_left); + batadv_forw_packet_free(forw_packet); } } From b55c212a0f25d8999b9601a120506861d85f6fd3 Mon Sep 17 00:00:00 2001 From: Sven Eckelmann Date: Sun, 20 Mar 2016 12:27:53 +0100 Subject: [PATCH 240/312] batman-adv: Reduce refcnt of removed router when updating route [ Upstream commit d1a65f1741bfd9c69f9e4e2ad447a89b6810427d ] _batadv_update_route rcu_derefences orig_ifinfo->router outside of a spinlock protected region to print some information messages to the debug log. But this pointer is not checked again when the new pointer is assigned in the spinlock protected region. Thus is can happen that the value of orig_ifinfo->router changed in the meantime and thus the reference counter of the wrong router gets reduced after the spinlock protected region. Just rcu_dereferencing the value of orig_ifinfo->router inside the spinlock protected region (which also set the new pointer) is enough to get the correct old router object. Fixes: e1a5382f978b ("batman-adv: Make orig_node->router an rcu protected pointer") Signed-off-by: Sven Eckelmann Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli Signed-off-by: Sasha Levin --- net/batman-adv/routing.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c index da83982bf974..f77dafc114a6 100644 --- a/net/batman-adv/routing.c +++ b/net/batman-adv/routing.c @@ -88,6 +88,15 @@ static void _batadv_update_route(struct batadv_priv *bat_priv, neigh_node = NULL; spin_lock_bh(&orig_node->neigh_list_lock); + /* curr_router used earlier may not be the current orig_ifinfo->router + * anymore because it was dereferenced outside of the neigh_list_lock + * protected region. After the new best neighbor has replace the current + * best neighbor the reference counter needs to decrease. Consequently, + * the code needs to ensure the curr_router variable contains a pointer + * to the replaced best neighbor. + */ + curr_router = rcu_dereference_protected(orig_ifinfo->router, true); + rcu_assign_pointer(orig_ifinfo->router, neigh_node); spin_unlock_bh(&orig_node->neigh_list_lock); batadv_orig_ifinfo_free_ref(orig_ifinfo); From fd4453a9c8a4ccc784a51c71411b113f8a997ea1 Mon Sep 17 00:00:00 2001 From: Richard Leitner Date: Tue, 5 Apr 2016 15:03:48 +0200 Subject: [PATCH 241/312] iio: ak8975: fix maybe-uninitialized warning MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 05be8d4101d960bad271d32b4f6096af1ccb1534 ] If i2c_device_id *id is NULL and acpi_match_device returns NULL too, then chipset may be unitialized when accessing &ak_def_array[chipset] in ak8975_probe. Therefore initialize chipset to AK_MAX_TYPE, which will return an error when not changed. This patch fixes the following maybe-uninitialized warning: drivers/iio/magnetometer/ak8975.c: In function ‘ak8975_probe’: drivers/iio/magnetometer/ak8975.c:788:14: warning: ‘chipset’ may be used uninitialized in this function [-Wmaybe-uninitialized] data->def = &ak_def_array[chipset]; Signed-off-by: Richard Leitner Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin --- drivers/iio/magnetometer/ak8975.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iio/magnetometer/ak8975.c b/drivers/iio/magnetometer/ak8975.c index fd780bbcd07e..f2a7f72f7aa6 100644 --- a/drivers/iio/magnetometer/ak8975.c +++ b/drivers/iio/magnetometer/ak8975.c @@ -732,7 +732,7 @@ static int ak8975_probe(struct i2c_client *client, int eoc_gpio; int err; const char *name = NULL; - enum asahi_compass_chipset chipset; + enum asahi_compass_chipset chipset = AK_MAX_TYPE; /* Grab and set up the supplied GPIO. */ if (client->dev.platform_data) From 51319f40e513500085f25ae42ace31b8bc4bcfc2 Mon Sep 17 00:00:00 2001 From: Srinivas Pandruvada Date: Wed, 23 Mar 2016 21:07:39 -0700 Subject: [PATCH 242/312] ACPI / processor: Request native thermal interrupt handling via _OSC [ Upstream commit a21211672c9a1d730a39aa65d4a5b3414700adfb ] There are several reports of freeze on enabling HWP (Hardware PStates) feature on Skylake-based systems by the Intel P-states driver. The root cause is identified as the HWP interrupts causing BIOS code to freeze. HWP interrupts use the thermal LVT which can be handled by Linux natively, but on the affected Skylake-based systems SMM will respond to it by default. This is a problem for several reasons: - On the affected systems the SMM thermal LVT handler is broken (it will crash when invoked) and a BIOS update is necessary to fix it. - With thermal interrupt handled in SMM we lose all of the reporting features of the arch/x86/kernel/cpu/mcheck/therm_throt driver. - Some thermal drivers like x86-package-temp depend on the thermal threshold interrupts signaled via the thermal LVT. - The HWP interrupts are useful for debugging and tuning performance (if the kernel can handle them). The native handling of thermal interrupts needs to be enabled because of that. This requires some way to tell SMM that the OS can handle thermal interrupts. That can be done by using _OSC/_PDC in processor scope very early during ACPI initialization. The meaning of _OSC/_PDC bit 12 in processor scope is whether or not the OS supports native handling of interrupts for Collaborative Processor Performance Control (CPPC) notifications. Since on HWP-capable systems CPPC is a firmware interface to HWP, setting this bit effectively tells the firmware that the OS will handle thermal interrupts natively going forward. For details on _OSC/_PDC refer to: http://www.intel.com/content/www/us/en/standards/processor-vendor-specific-acpi-specification.html To implement the _OSC/_PDC handshake as described, introduce a new function, acpi_early_processor_osc(), that walks the ACPI namespace looking for ACPI processor objects and invokes _OSC for them with bit 12 in the capabilities buffer set and terminates the namespace walk on the first success. Also modify intel_thermal_interrupt() to clear HWP status bits in the HWP_STATUS MSR to acknowledge HWP interrupts (which prevents them from firing continuously). Signed-off-by: Srinivas Pandruvada [ rjw: Subject & changelog, function rename ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin --- arch/x86/kernel/cpu/mcheck/therm_throt.c | 3 ++ drivers/acpi/acpi_processor.c | 52 ++++++++++++++++++++++++ drivers/acpi/bus.c | 3 ++ drivers/acpi/internal.h | 6 +++ 4 files changed, 64 insertions(+) diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c index 1af51b1586d7..56270f0f05e6 100644 --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c @@ -385,6 +385,9 @@ static void intel_thermal_interrupt(void) { __u64 msr_val; + if (static_cpu_has(X86_FEATURE_HWP)) + wrmsrl_safe(MSR_HWP_STATUS, 0); + rdmsrl(MSR_IA32_THERM_STATUS, msr_val); /* Check for violation of core thermal thresholds*/ diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c index 58f335ca2e75..568f2b942aac 100644 --- a/drivers/acpi/acpi_processor.c +++ b/drivers/acpi/acpi_processor.c @@ -475,6 +475,58 @@ static void acpi_processor_remove(struct acpi_device *device) } #endif /* CONFIG_ACPI_HOTPLUG_CPU */ +#ifdef CONFIG_X86 +static bool acpi_hwp_native_thermal_lvt_set; +static acpi_status __init acpi_hwp_native_thermal_lvt_osc(acpi_handle handle, + u32 lvl, + void *context, + void **rv) +{ + u8 sb_uuid_str[] = "4077A616-290C-47BE-9EBD-D87058713953"; + u32 capbuf[2]; + struct acpi_osc_context osc_context = { + .uuid_str = sb_uuid_str, + .rev = 1, + .cap.length = 8, + .cap.pointer = capbuf, + }; + + if (acpi_hwp_native_thermal_lvt_set) + return AE_CTRL_TERMINATE; + + capbuf[0] = 0x0000; + capbuf[1] = 0x1000; /* set bit 12 */ + + if (ACPI_SUCCESS(acpi_run_osc(handle, &osc_context))) { + if (osc_context.ret.pointer && osc_context.ret.length > 1) { + u32 *capbuf_ret = osc_context.ret.pointer; + + if (capbuf_ret[1] & 0x1000) { + acpi_handle_info(handle, + "_OSC native thermal LVT Acked\n"); + acpi_hwp_native_thermal_lvt_set = true; + } + } + kfree(osc_context.ret.pointer); + } + + return AE_OK; +} + +void __init acpi_early_processor_osc(void) +{ + if (boot_cpu_has(X86_FEATURE_HWP)) { + acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT, + ACPI_UINT32_MAX, + acpi_hwp_native_thermal_lvt_osc, + NULL, NULL, NULL); + acpi_get_devices(ACPI_PROCESSOR_DEVICE_HID, + acpi_hwp_native_thermal_lvt_osc, + NULL, NULL); + } +} +#endif + /* * The following ACPI IDs are known to be suitable for representing as * processor devices. diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 513e7230e3d0..fd6053908d24 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -612,6 +612,9 @@ static int __init acpi_bus_init(void) goto error1; } + /* Set capability bits for _OSC under processor scope */ + acpi_early_processor_osc(); + /* * _OSC method may exist in module level code, * so it must be run after ACPI_FULL_INITIALIZATION diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h index ba4a61e964be..7db7f9dd7c47 100644 --- a/drivers/acpi/internal.h +++ b/drivers/acpi/internal.h @@ -121,6 +121,12 @@ void acpi_early_processor_set_pdc(void); static inline void acpi_early_processor_set_pdc(void) {} #endif +#ifdef CONFIG_X86 +void acpi_early_processor_osc(void); +#else +static inline void acpi_early_processor_osc(void) {} +#endif + /* -------------------------------------------------------------------------- Embedded Controller -------------------------------------------------------------------------- */ From 1ea4df4a2642ba2856a70e4530ae922f0c8e672e Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sun, 10 Apr 2016 23:01:30 -0400 Subject: [PATCH 243/312] decnet: Do not build routes to devices without decnet private data. [ Upstream commit a36a0d4008488fa545c74445d69eaf56377d5d4e ] In particular, make sure we check for decnet private presence for loopback devices. Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/decnet/dn_route.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c index 03227ffd19ce..76d3bf70c31a 100644 --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -1036,10 +1036,13 @@ source_ok: if (!fld.daddr) { fld.daddr = fld.saddr; - err = -EADDRNOTAVAIL; if (dev_out) dev_put(dev_out); + err = -EINVAL; dev_out = init_net.loopback_dev; + if (!dev_out->dn_ptr) + goto out; + err = -EADDRNOTAVAIL; dev_hold(dev_out); if (!fld.daddr) { fld.daddr = @@ -1112,6 +1115,8 @@ source_ok: if (dev_out == NULL) goto out; dn_db = rcu_dereference_raw(dev_out->dn_ptr); + if (!dn_db) + goto e_inval; /* Possible improvement - check all devices for local addr */ if (dn_dev_islocal(dev_out, fld.daddr)) { dev_put(dev_out); @@ -1153,6 +1158,8 @@ select_source: dev_put(dev_out); dev_out = init_net.loopback_dev; dev_hold(dev_out); + if (!dev_out->dn_ptr) + goto e_inval; fld.flowidn_oif = dev_out->ifindex; if (res.fi) dn_fib_info_put(res.fi); From 80565d16c385958217a0c204f19475ce8210e64a Mon Sep 17 00:00:00 2001 From: Chris Friesen Date: Fri, 8 Apr 2016 15:21:30 -0600 Subject: [PATCH 244/312] route: do not cache fib route info on local routes with oif [ Upstream commit d6d5e999e5df67f8ec20b6be45e2229455ee3699 ] For local routes that require a particular output interface we do not want to cache the result. Caching the result causes incorrect behaviour when there are multiple source addresses on the interface. The end result being that if the intended recipient is waiting on that interface for the packet he won't receive it because it will be delivered on the loopback interface and the IP_PKTINFO ipi_ifindex will be set to the loopback interface as well. This can be tested by running a program such as "dhcp_release" which attempts to inject a packet on a particular interface so that it is received by another program on the same board. The receiving process should see an IP_PKTINFO ipi_ifndex value of the source interface (e.g., eth1) instead of the loopback interface (e.g., lo). The packet will still appear on the loopback interface in tcpdump but the important aspect is that the CMSG info is correct. Sample dhcp_release command line: dhcp_release eth1 192.168.204.222 02:11:33:22:44:66 Signed-off-by: Allain Legacy Signed off-by: Chris Friesen Reviewed-by: Julian Anastasov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv4/route.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 1d3cdb4d4ebc..eb1d9839a257 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1976,6 +1976,18 @@ static struct rtable *__mkroute_output(const struct fib_result *res, */ if (fi && res->prefixlen < 4) fi = NULL; + } else if ((type == RTN_LOCAL) && (orig_oif != 0) && + (orig_oif != dev_out->ifindex)) { + /* For local routes that require a particular output interface + * we do not want to cache the result. Caching the result + * causes incorrect behaviour when there are multiple source + * addresses on the interface, the end result being that if the + * intended recipient is waiting on that interface for the + * packet he won't receive it because it will be delivered on + * the loopback interface and the IP_PKTINFO ipi_ifindex will + * be set to the loopback interface as well. + */ + fi = NULL; } fnhe = NULL; From 4b5223be98e1972e177f319159a29eb3bab2720e Mon Sep 17 00:00:00 2001 From: Mathias Krause Date: Sun, 10 Apr 2016 12:52:28 +0200 Subject: [PATCH 245/312] packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface [ Upstream commit 309cf37fe2a781279b7675d4bb7173198e532867 ] Because we miss to wipe the remainder of i->addr[] in packet_mc_add(), pdiag_put_mclist() leaks uninitialized heap bytes via the PACKET_DIAG_MCLIST netlink attribute. Fix this by explicitly memset(0)ing the remaining bytes in i->addr[]. Fixes: eea68e2f1a00 ("packet: Report socket mclist info via diag module") Signed-off-by: Mathias Krause Cc: Eric W. Biederman Cc: Pavel Emelyanov Acked-by: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/packet/af_packet.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 68a048472e5f..a3654d929814 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -3207,6 +3207,7 @@ static int packet_mc_add(struct sock *sk, struct packet_mreq_max *mreq) i->ifindex = mreq->mr_ifindex; i->alen = mreq->mr_alen; memcpy(i->addr, mreq->mr_address, i->alen); + memset(i->addr + i->alen, 0, sizeof(i->addr) - i->alen); i->count = 1; i->next = po->mclist; po->mclist = i; From 5730fd5d72071c9ae96929292351029ea56de7c0 Mon Sep 17 00:00:00 2001 From: Lars Persson Date: Tue, 12 Apr 2016 08:45:52 +0200 Subject: [PATCH 246/312] net: sched: do not requeue a NULL skb [ Upstream commit 3dcd493fbebfd631913df6e2773cc295d3bf7d22 ] A failure in validate_xmit_skb_list() triggered an unconditional call to dev_requeue_skb with skb=NULL. This slowly grows the queue discipline's qlen count until all traffic through the queue stops. We take the optimistic approach and continue running the queue after a failure since it is unknown if later packets also will fail in the validate path. Fixes: 55a93b3ea780 ("qdisc: validate skb without holding lock") Signed-off-by: Lars Persson Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/sched/sch_generic.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 3c6f6b774ba6..9821e6d641bb 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -159,12 +159,15 @@ int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q, if (validate) skb = validate_xmit_skb_list(skb, dev); - if (skb) { + if (likely(skb)) { HARD_TX_LOCK(dev, txq, smp_processor_id()); if (!netif_xmit_frozen_or_stopped(txq)) skb = dev_hard_start_xmit(skb, dev, txq, &ret); HARD_TX_UNLOCK(dev, txq); + } else { + spin_lock(root_lock); + return qdisc_qlen(q); } spin_lock(root_lock); From a4e1611c8bc5e331e07bea540d3571911df6ac52 Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Tue, 12 Apr 2016 10:26:19 -0700 Subject: [PATCH 247/312] bpf/verifier: reject invalid LD_ABS | BPF_DW instruction [ Upstream commit d82bccc69041a51f7b7b9b4a36db0772f4cdba21 ] verifier must check for reserved size bits in instruction opcode and reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions, otherwise interpreter will WARN_RATELIMIT on them during execution. Fixes: ddd872bc3098 ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions") Signed-off-by: Alexei Starovoitov Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- kernel/bpf/verifier.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 6582410a71c7..dd16d1993c04 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1227,6 +1227,7 @@ static int check_ld_abs(struct verifier_env *env, struct bpf_insn *insn) } if (insn->dst_reg != BPF_REG_0 || insn->off != 0 || + BPF_SIZE(insn->code) == BPF_DW || (mode == BPF_ABS && insn->src_reg != BPF_REG_0)) { verbose("BPF_LD_ABS uses reserved fields\n"); return -EINVAL; From a2e388f2537a23348810b20ae82468f13d3fb123 Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Wed, 20 Apr 2016 23:23:08 +0100 Subject: [PATCH 248/312] atl2: Disable unimplemented scatter/gather feature [ Upstream commit f43bfaeddc79effbf3d0fcb53ca477cca66f3db8 ] atl2 includes NETIF_F_SG in hw_features even though it has no support for non-linear skbs. This bug was originally harmless since the driver does not claim to implement checksum offload and that used to be a requirement for SG. Now that SG and checksum offload are independent features, if you explicitly enable SG *and* use one of the rare protocols that can use SG without checkusm offload, this potentially leaks sensitive information (before you notice that it just isn't working). Therefore this obscure bug has been designated CVE-2016-2117. Reported-by: Justin Yackoski Signed-off-by: Ben Hutchings Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.") Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/atheros/atlx/atl2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/atheros/atlx/atl2.c b/drivers/net/ethernet/atheros/atlx/atl2.c index 46a535318c7a..972ee645fac6 100644 --- a/drivers/net/ethernet/atheros/atlx/atl2.c +++ b/drivers/net/ethernet/atheros/atlx/atl2.c @@ -1412,7 +1412,7 @@ static int atl2_probe(struct pci_dev *pdev, const struct pci_device_id *ent) err = -EIO; - netdev->hw_features = NETIF_F_SG | NETIF_F_HW_VLAN_CTAG_RX; + netdev->hw_features = NETIF_F_HW_VLAN_CTAG_RX; netdev->features |= (NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX); /* Init PHY as early as possible due to power saving issue */ From 303c69a17dc9ec4c26682dfcf7be27805544354e Mon Sep 17 00:00:00 2001 From: Simon Horman Date: Thu, 21 Apr 2016 11:49:15 +1000 Subject: [PATCH 249/312] openvswitch: use flow protocol when recalculating ipv6 checksums [ Upstream commit b4f70527f052b0c00be4d7cac562baa75b212df5 ] When using masked actions the ipv6_proto field of an action to set IPv6 fields may be zero rather than the prevailing protocol which will result in skipping checksum recalculation. This patch resolves the problem by relying on the protocol in the flow key rather than that in the set field action. Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.") Cc: Jarno Rajahalme Signed-off-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/openvswitch/actions.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index b491c1c296fe..9920f7502f6d 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -441,7 +441,7 @@ static int set_ipv6(struct sk_buff *skb, struct sw_flow_key *flow_key, mask_ipv6_addr(saddr, key->ipv6_src, mask->ipv6_src, masked); if (unlikely(memcmp(saddr, masked, sizeof(masked)))) { - set_ipv6_addr(skb, key->ipv6_proto, saddr, masked, + set_ipv6_addr(skb, flow_key->ip.proto, saddr, masked, true); memcpy(&flow_key->ipv6.addr.src, masked, sizeof(flow_key->ipv6.addr.src)); @@ -463,7 +463,7 @@ static int set_ipv6(struct sk_buff *skb, struct sw_flow_key *flow_key, NULL, &flags) != NEXTHDR_ROUTING); - set_ipv6_addr(skb, key->ipv6_proto, daddr, masked, + set_ipv6_addr(skb, flow_key->ip.proto, daddr, masked, recalc_csum); memcpy(&flow_key->ipv6.addr.dst, masked, sizeof(flow_key->ipv6.addr.dst)); From 2c5ac2bfe56da842b7c84e99e9811f118b6f7a17 Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Thu, 21 Apr 2016 22:23:31 +0200 Subject: [PATCH 250/312] ipv4/fib: don't warn when primary address is missing if in_dev is dead [ Upstream commit 391a20333b8393ef2e13014e6e59d192c5594471 ] After commit fbd40ea0180a ("ipv4: Don't do expensive useless work during inetdev destroy.") when deleting an interface, fib_del_ifaddr() can be executed without any primary address present on the dead interface. The above is safe, but triggers some "bug: prim == NULL" warnings. This commit avoids warning if the in_dev is dead Signed-off-by: Paolo Abeni Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv4/fib_frontend.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 87766363d2a4..513b6aabc5b7 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -861,7 +861,11 @@ void fib_del_ifaddr(struct in_ifaddr *ifa, struct in_ifaddr *iprim) if (ifa->ifa_flags & IFA_F_SECONDARY) { prim = inet_ifa_byprefix(in_dev, any, ifa->ifa_mask); if (!prim) { - pr_warn("%s: bug: prim == NULL\n", __func__); + /* if the device has been deleted, we don't perform + * address promotion + */ + if (!in_dev->dead) + pr_warn("%s: bug: prim == NULL\n", __func__); return; } if (iprim && iprim != prim) { From eeee948a652e9ee02643bf031cd613339ccc6864 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Sat, 23 Apr 2016 11:35:46 -0700 Subject: [PATCH 251/312] net/mlx4_en: fix spurious timestamping callbacks [ Upstream commit fc96256c906362e845d848d0f6a6354450059e81 ] When multiple skb are TX-completed in a row, we might incorrectly keep a timestamp of a prior skb and cause extra work. Fixes: ec693d47010e8 ("net/mlx4_en: Add HW timestamping (TS) support") Signed-off-by: Eric Dumazet Cc: Willem de Bruijn Reviewed-by: Eran Ben Elisha Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx4/en_tx.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c index c10d98f6ad96..a1b4301f719a 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c @@ -400,7 +400,6 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev, u32 packets = 0; u32 bytes = 0; int factor = priv->cqe_factor; - u64 timestamp = 0; int done = 0; int budget = priv->tx_work_limit; u32 last_nr_txbb; @@ -440,9 +439,12 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev, new_index = be16_to_cpu(cqe->wqe_index) & size_mask; do { + u64 timestamp = 0; + txbbs_skipped += last_nr_txbb; ring_index = (ring_index + last_nr_txbb) & size_mask; - if (ring->tx_info[ring_index].ts_requested) + + if (unlikely(ring->tx_info[ring_index].ts_requested)) timestamp = mlx4_en_get_cqe_ts(cqe); /* free next descriptor */ From 7b6460337ed23f3b29958e853f7d1f191bfd844d Mon Sep 17 00:00:00 2001 From: Jann Horn Date: Tue, 26 Apr 2016 22:26:26 +0200 Subject: [PATCH 252/312] bpf: fix double-fdput in replace_map_fd_with_map_ptr() [ Upstream commit 8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7 ] When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode references a non-map file descriptor as a map file descriptor, the error handling code called fdput() twice instead of once (in __bpf_map_get() and in replace_map_fd_with_map_ptr()). If the file descriptor table of the current task is shared, this causes f_count to be decremented too much, allowing the struct file to be freed while it is still in use (use-after-free). This can be exploited to gain root privileges by an unprivileged user. This bug was introduced in commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only exploitable since commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because previously, CAP_SYS_ADMIN was required to reach the vulnerable code. (posted publicly according to request by maintainer) Signed-off-by: Jann Horn Signed-off-by: Linus Torvalds Acked-by: Alexei Starovoitov Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- kernel/bpf/verifier.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index dd16d1993c04..efd143dcedf1 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1865,7 +1865,6 @@ static int replace_map_fd_with_map_ptr(struct verifier_env *env) if (IS_ERR(map)) { verbose("fd %d is not pointing to valid bpf_map\n", insn->imm); - fdput(f); return PTR_ERR(map); } From 11316d7eef2230000bb4fa3d4b9056690fda3ef2 Mon Sep 17 00:00:00 2001 From: WANG Cong Date: Thu, 25 Feb 2016 14:55:00 -0800 Subject: [PATCH 253/312] net_sched: introduce qdisc_replace() helper [ Upstream commit 86a7996cc8a078793670d82ed97d5a99bb4e8496 ] Remove nearly duplicated code and prepare for the following patch. Cc: Jamal Hadi Salim Acked-by: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/net/sch_generic.h | 17 +++++++++++++++++ net/sched/sch_cbq.c | 7 +------ net/sched/sch_drr.c | 6 +----- net/sched/sch_dsmark.c | 8 +------- net/sched/sch_hfsc.c | 6 +----- net/sched/sch_htb.c | 9 +-------- net/sched/sch_multiq.c | 8 +------- net/sched/sch_netem.c | 10 +--------- net/sched/sch_prio.c | 8 +------- net/sched/sch_qfq.c | 6 +----- net/sched/sch_red.c | 7 +------ net/sched/sch_sfb.c | 7 +------ net/sched/sch_tbf.c | 8 +------- 13 files changed, 29 insertions(+), 78 deletions(-) diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 080b657ef8fb..1fb8a5b42836 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -691,6 +691,23 @@ static inline void qdisc_reset_queue(struct Qdisc *sch) sch->qstats.backlog = 0; } +static inline struct Qdisc *qdisc_replace(struct Qdisc *sch, struct Qdisc *new, + struct Qdisc **pold) +{ + struct Qdisc *old; + + sch_tree_lock(sch); + old = *pold; + *pold = new; + if (old != NULL) { + qdisc_tree_decrease_qlen(old, old->q.qlen); + qdisc_reset(old); + } + sch_tree_unlock(sch); + + return old; +} + static inline unsigned int __qdisc_queue_drop(struct Qdisc *sch, struct sk_buff_head *list) { diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c index beeb75f80fdb..17ad79ebd720 100644 --- a/net/sched/sch_cbq.c +++ b/net/sched/sch_cbq.c @@ -1624,13 +1624,8 @@ static int cbq_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, new->reshape_fail = cbq_reshape_fail; #endif } - sch_tree_lock(sch); - *old = cl->q; - cl->q = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &cl->q); return 0; } diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c index 338706092c27..d4b3f82758b4 100644 --- a/net/sched/sch_drr.c +++ b/net/sched/sch_drr.c @@ -226,11 +226,7 @@ static int drr_graft_class(struct Qdisc *sch, unsigned long arg, new = &noop_qdisc; } - sch_tree_lock(sch); - drr_purge_queue(cl); - *old = cl->qdisc; - cl->qdisc = new; - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &cl->qdisc); return 0; } diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c index 66700a6116aa..d2084e744433 100644 --- a/net/sched/sch_dsmark.c +++ b/net/sched/sch_dsmark.c @@ -67,13 +67,7 @@ static int dsmark_graft(struct Qdisc *sch, unsigned long arg, new = &noop_qdisc; } - sch_tree_lock(sch); - *old = p->q; - p->q = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); - + *old = qdisc_replace(sch, new, &p->q); return 0; } diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index e6c7416d0332..134f7d2116c9 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -1215,11 +1215,7 @@ hfsc_graft_class(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, new = &noop_qdisc; } - sch_tree_lock(sch); - hfsc_purge_queue(sch, cl); - *old = cl->qdisc; - cl->qdisc = new; - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &cl->qdisc); return 0; } diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index f1acb0f60dc3..520ffe9eaddb 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1165,14 +1165,7 @@ static int htb_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, cl->common.classid)) == NULL) return -ENOBUFS; - sch_tree_lock(sch); - *old = cl->un.leaf.q; - cl->un.leaf.q = new; - if (*old != NULL) { - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - } - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &cl->un.leaf.q); return 0; } diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c index 42dd218871e0..f36ff83fac34 100644 --- a/net/sched/sch_multiq.c +++ b/net/sched/sch_multiq.c @@ -303,13 +303,7 @@ static int multiq_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, if (new == NULL) new = &noop_qdisc; - sch_tree_lock(sch); - *old = q->queues[band]; - q->queues[band] = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); - + *old = qdisc_replace(sch, new, &q->queues[band]); return 0; } diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 956ead2cab9a..8d56016621ca 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -1037,15 +1037,7 @@ static int netem_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, { struct netem_sched_data *q = qdisc_priv(sch); - sch_tree_lock(sch); - *old = q->qdisc; - q->qdisc = new; - if (*old) { - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - } - sch_tree_unlock(sch); - + *old = qdisc_replace(sch, new, &q->qdisc); return 0; } diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index 8e5cd34aaa74..a677f543a25c 100644 --- a/net/sched/sch_prio.c +++ b/net/sched/sch_prio.c @@ -268,13 +268,7 @@ static int prio_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, if (new == NULL) new = &noop_qdisc; - sch_tree_lock(sch); - *old = q->queues[band]; - q->queues[band] = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); - + *old = qdisc_replace(sch, new, &q->queues[band]); return 0; } diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index 3ec7e88a43ca..6fea3209a7c6 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -619,11 +619,7 @@ static int qfq_graft_class(struct Qdisc *sch, unsigned long arg, new = &noop_qdisc; } - sch_tree_lock(sch); - qfq_purge_queue(cl); - *old = cl->qdisc; - cl->qdisc = new; - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &cl->qdisc); return 0; } diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c index 6c0534cc7758..d5abcee454d8 100644 --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -313,12 +313,7 @@ static int red_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, if (new == NULL) new = &noop_qdisc; - sch_tree_lock(sch); - *old = q->qdisc; - q->qdisc = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &q->qdisc); return 0; } diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index 5819dd82630d..fa7f0a36cc31 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -614,12 +614,7 @@ static int sfb_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, if (new == NULL) new = &noop_qdisc; - sch_tree_lock(sch); - *old = q->qdisc; - q->qdisc = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); + *old = qdisc_replace(sch, new, &q->qdisc); return 0; } diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index a4afde14e865..56a1aef3495f 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -502,13 +502,7 @@ static int tbf_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, if (new == NULL) new = &noop_qdisc; - sch_tree_lock(sch); - *old = q->qdisc; - q->qdisc = new; - qdisc_tree_decrease_qlen(*old, (*old)->q.qlen); - qdisc_reset(*old); - sch_tree_unlock(sch); - + *old = qdisc_replace(sch, new, &q->qdisc); return 0; } From 236094acb4b5c224d68d3b279941d24d555078d4 Mon Sep 17 00:00:00 2001 From: WANG Cong Date: Thu, 25 Feb 2016 14:55:01 -0800 Subject: [PATCH 254/312] net_sched: update hierarchical backlog too [ Upstream commit 2ccccf5fb43ff62b2b96cc58d95fc0b3596516e4 ] When the bottom qdisc decides to, for example, drop some packet, it calls qdisc_tree_decrease_qlen() to update the queue length for all its ancestors, we need to update the backlog too to keep the stats on root qdisc accurate. Cc: Jamal Hadi Salim Acked-by: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- include/net/codel.h | 4 ++++ include/net/sch_generic.h | 5 +++-- net/sched/sch_api.c | 8 +++++--- net/sched/sch_cbq.c | 5 +++-- net/sched/sch_choke.c | 6 ++++-- net/sched/sch_codel.c | 10 ++++++---- net/sched/sch_drr.c | 3 ++- net/sched/sch_fq.c | 4 +++- net/sched/sch_fq_codel.c | 17 ++++++++++++----- net/sched/sch_hfsc.c | 3 ++- net/sched/sch_hhf.c | 10 +++++++--- net/sched/sch_htb.c | 10 ++++++---- net/sched/sch_multiq.c | 8 +++++--- net/sched/sch_netem.c | 3 ++- net/sched/sch_pie.c | 5 +++-- net/sched/sch_prio.c | 7 ++++--- net/sched/sch_qfq.c | 3 ++- net/sched/sch_red.c | 3 ++- net/sched/sch_sfb.c | 3 ++- net/sched/sch_sfq.c | 16 +++++++++------- net/sched/sch_tbf.c | 7 +++++-- 21 files changed, 91 insertions(+), 49 deletions(-) diff --git a/include/net/codel.h b/include/net/codel.h index 1e18005f7f65..0ee76108e741 100644 --- a/include/net/codel.h +++ b/include/net/codel.h @@ -160,11 +160,13 @@ struct codel_vars { * struct codel_stats - contains codel shared variables and stats * @maxpacket: largest packet we've seen so far * @drop_count: temp count of dropped packets in dequeue() + * @drop_len: bytes of dropped packets in dequeue() * ecn_mark: number of packets we ECN marked instead of dropping */ struct codel_stats { u32 maxpacket; u32 drop_count; + u32 drop_len; u32 ecn_mark; }; @@ -301,6 +303,7 @@ static struct sk_buff *codel_dequeue(struct Qdisc *sch, vars->rec_inv_sqrt); goto end; } + stats->drop_len += qdisc_pkt_len(skb); qdisc_drop(skb, sch); stats->drop_count++; skb = dequeue_func(vars, sch); @@ -323,6 +326,7 @@ static struct sk_buff *codel_dequeue(struct Qdisc *sch, if (params->ecn && INET_ECN_set_ce(skb)) { stats->ecn_mark++; } else { + stats->drop_len += qdisc_pkt_len(skb); qdisc_drop(skb, sch); stats->drop_count++; diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 1fb8a5b42836..530bdca19803 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -395,7 +395,8 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue, struct Qdisc *qdisc); void qdisc_reset(struct Qdisc *qdisc); void qdisc_destroy(struct Qdisc *qdisc); -void qdisc_tree_decrease_qlen(struct Qdisc *qdisc, unsigned int n); +void qdisc_tree_reduce_backlog(struct Qdisc *qdisc, unsigned int n, + unsigned int len); struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue, const struct Qdisc_ops *ops); struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue, @@ -700,7 +701,7 @@ static inline struct Qdisc *qdisc_replace(struct Qdisc *sch, struct Qdisc *new, old = *pold; *pold = new; if (old != NULL) { - qdisc_tree_decrease_qlen(old, old->q.qlen); + qdisc_tree_reduce_backlog(old, old->q.qlen, old->qstats.backlog); qdisc_reset(old); } sch_tree_unlock(sch); diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index 68c599a5e1d1..c244a49ae4ac 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -744,14 +744,15 @@ static u32 qdisc_alloc_handle(struct net_device *dev) return 0; } -void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n) +void qdisc_tree_reduce_backlog(struct Qdisc *sch, unsigned int n, + unsigned int len) { const struct Qdisc_class_ops *cops; unsigned long cl; u32 parentid; int drops; - if (n == 0) + if (n == 0 && len == 0) return; drops = max_t(int, n, 0); rcu_read_lock(); @@ -774,11 +775,12 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n) cops->put(sch, cl); } sch->q.qlen -= n; + sch->qstats.backlog -= len; __qdisc_qstats_drop(sch, drops); } rcu_read_unlock(); } -EXPORT_SYMBOL(qdisc_tree_decrease_qlen); +EXPORT_SYMBOL(qdisc_tree_reduce_backlog); static void notify_and_destroy(struct net *net, struct sk_buff *skb, struct nlmsghdr *n, u32 clid, diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c index 17ad79ebd720..f6e7a60012b1 100644 --- a/net/sched/sch_cbq.c +++ b/net/sched/sch_cbq.c @@ -1909,7 +1909,7 @@ static int cbq_delete(struct Qdisc *sch, unsigned long arg) { struct cbq_sched_data *q = qdisc_priv(sch); struct cbq_class *cl = (struct cbq_class *)arg; - unsigned int qlen; + unsigned int qlen, backlog; if (cl->filters || cl->children || cl == &q->link) return -EBUSY; @@ -1917,8 +1917,9 @@ static int cbq_delete(struct Qdisc *sch, unsigned long arg) sch_tree_lock(sch); qlen = cl->q->q.qlen; + backlog = cl->q->qstats.backlog; qdisc_reset(cl->q); - qdisc_tree_decrease_qlen(cl->q, qlen); + qdisc_tree_reduce_backlog(cl->q, qlen, backlog); if (cl->next_alive) cbq_deactivate_class(cl); diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c index c009eb9045ce..3f6437db9b0f 100644 --- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -128,8 +128,8 @@ static void choke_drop_by_idx(struct Qdisc *sch, unsigned int idx) choke_zap_tail_holes(q); qdisc_qstats_backlog_dec(sch, skb); + qdisc_tree_reduce_backlog(sch, 1, qdisc_pkt_len(skb)); qdisc_drop(skb, sch); - qdisc_tree_decrease_qlen(sch, 1); --sch->q.qlen; } @@ -449,6 +449,7 @@ static int choke_change(struct Qdisc *sch, struct nlattr *opt) old = q->tab; if (old) { unsigned int oqlen = sch->q.qlen, tail = 0; + unsigned dropped = 0; while (q->head != q->tail) { struct sk_buff *skb = q->tab[q->head]; @@ -460,11 +461,12 @@ static int choke_change(struct Qdisc *sch, struct nlattr *opt) ntab[tail++] = skb; continue; } + dropped += qdisc_pkt_len(skb); qdisc_qstats_backlog_dec(sch, skb); --sch->q.qlen; qdisc_drop(skb, sch); } - qdisc_tree_decrease_qlen(sch, oqlen - sch->q.qlen); + qdisc_tree_reduce_backlog(sch, oqlen - sch->q.qlen, dropped); q->head = 0; q->tail = tail; } diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c index 7a0bdb16ac92..9a9068d00833 100644 --- a/net/sched/sch_codel.c +++ b/net/sched/sch_codel.c @@ -79,12 +79,13 @@ static struct sk_buff *codel_qdisc_dequeue(struct Qdisc *sch) skb = codel_dequeue(sch, &q->params, &q->vars, &q->stats, dequeue); - /* We cant call qdisc_tree_decrease_qlen() if our qlen is 0, + /* We cant call qdisc_tree_reduce_backlog() if our qlen is 0, * or HTB crashes. Defer it for next round. */ if (q->stats.drop_count && sch->q.qlen) { - qdisc_tree_decrease_qlen(sch, q->stats.drop_count); + qdisc_tree_reduce_backlog(sch, q->stats.drop_count, q->stats.drop_len); q->stats.drop_count = 0; + q->stats.drop_len = 0; } if (skb) qdisc_bstats_update(sch, skb); @@ -115,7 +116,7 @@ static int codel_change(struct Qdisc *sch, struct nlattr *opt) { struct codel_sched_data *q = qdisc_priv(sch); struct nlattr *tb[TCA_CODEL_MAX + 1]; - unsigned int qlen; + unsigned int qlen, dropped = 0; int err; if (!opt) @@ -149,10 +150,11 @@ static int codel_change(struct Qdisc *sch, struct nlattr *opt) while (sch->q.qlen > sch->limit) { struct sk_buff *skb = __skb_dequeue(&sch->q); + dropped += qdisc_pkt_len(skb); qdisc_qstats_backlog_dec(sch, skb); qdisc_drop(skb, sch); } - qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen); + qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped); sch_tree_unlock(sch); return 0; diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c index d4b3f82758b4..e599803caa1e 100644 --- a/net/sched/sch_drr.c +++ b/net/sched/sch_drr.c @@ -53,9 +53,10 @@ static struct drr_class *drr_find_class(struct Qdisc *sch, u32 classid) static void drr_purge_queue(struct drr_class *cl) { unsigned int len = cl->qdisc->q.qlen; + unsigned int backlog = cl->qdisc->qstats.backlog; qdisc_reset(cl->qdisc); - qdisc_tree_decrease_qlen(cl->qdisc, len); + qdisc_tree_reduce_backlog(cl->qdisc, len, backlog); } static const struct nla_policy drr_policy[TCA_DRR_MAX + 1] = { diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c index f377702d4b91..4816778566d3 100644 --- a/net/sched/sch_fq.c +++ b/net/sched/sch_fq.c @@ -659,6 +659,7 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt) struct fq_sched_data *q = qdisc_priv(sch); struct nlattr *tb[TCA_FQ_MAX + 1]; int err, drop_count = 0; + unsigned drop_len = 0; u32 fq_log; if (!opt) @@ -733,10 +734,11 @@ static int fq_change(struct Qdisc *sch, struct nlattr *opt) if (!skb) break; + drop_len += qdisc_pkt_len(skb); kfree_skb(skb); drop_count++; } - qdisc_tree_decrease_qlen(sch, drop_count); + qdisc_tree_reduce_backlog(sch, drop_count, drop_len); sch_tree_unlock(sch); return err; diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index 9291598b5aad..96971c7ab228 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -173,7 +173,7 @@ static unsigned int fq_codel_drop(struct Qdisc *sch) static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch) { struct fq_codel_sched_data *q = qdisc_priv(sch); - unsigned int idx; + unsigned int idx, prev_backlog; struct fq_codel_flow *flow; int uninitialized_var(ret); @@ -201,6 +201,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch) if (++sch->q.qlen <= sch->limit) return NET_XMIT_SUCCESS; + prev_backlog = sch->qstats.backlog; q->drop_overlimit++; /* Return Congestion Notification only if we dropped a packet * from this flow. @@ -209,7 +210,7 @@ static int fq_codel_enqueue(struct sk_buff *skb, struct Qdisc *sch) return NET_XMIT_CN; /* As we dropped a packet, better let upper stack know this */ - qdisc_tree_decrease_qlen(sch, 1); + qdisc_tree_reduce_backlog(sch, 1, prev_backlog - sch->qstats.backlog); return NET_XMIT_SUCCESS; } @@ -239,6 +240,7 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch) struct fq_codel_flow *flow; struct list_head *head; u32 prev_drop_count, prev_ecn_mark; + unsigned int prev_backlog; begin: head = &q->new_flows; @@ -257,6 +259,7 @@ begin: prev_drop_count = q->cstats.drop_count; prev_ecn_mark = q->cstats.ecn_mark; + prev_backlog = sch->qstats.backlog; skb = codel_dequeue(sch, &q->cparams, &flow->cvars, &q->cstats, dequeue); @@ -274,12 +277,14 @@ begin: } qdisc_bstats_update(sch, skb); flow->deficit -= qdisc_pkt_len(skb); - /* We cant call qdisc_tree_decrease_qlen() if our qlen is 0, + /* We cant call qdisc_tree_reduce_backlog() if our qlen is 0, * or HTB crashes. Defer it for next round. */ if (q->cstats.drop_count && sch->q.qlen) { - qdisc_tree_decrease_qlen(sch, q->cstats.drop_count); + qdisc_tree_reduce_backlog(sch, q->cstats.drop_count, + q->cstats.drop_len); q->cstats.drop_count = 0; + q->cstats.drop_len = 0; } return skb; } @@ -347,11 +352,13 @@ static int fq_codel_change(struct Qdisc *sch, struct nlattr *opt) while (sch->q.qlen > sch->limit) { struct sk_buff *skb = fq_codel_dequeue(sch); + q->cstats.drop_len += qdisc_pkt_len(skb); kfree_skb(skb); q->cstats.drop_count++; } - qdisc_tree_decrease_qlen(sch, q->cstats.drop_count); + qdisc_tree_reduce_backlog(sch, q->cstats.drop_count, q->cstats.drop_len); q->cstats.drop_count = 0; + q->cstats.drop_len = 0; sch_tree_unlock(sch); return 0; diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index 134f7d2116c9..d3e21dac8b40 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -895,9 +895,10 @@ static void hfsc_purge_queue(struct Qdisc *sch, struct hfsc_class *cl) { unsigned int len = cl->qdisc->q.qlen; + unsigned int backlog = cl->qdisc->qstats.backlog; qdisc_reset(cl->qdisc); - qdisc_tree_decrease_qlen(cl->qdisc, len); + qdisc_tree_reduce_backlog(cl->qdisc, len, backlog); } static void diff --git a/net/sched/sch_hhf.c b/net/sched/sch_hhf.c index 15d3aabfe250..792c6f330f77 100644 --- a/net/sched/sch_hhf.c +++ b/net/sched/sch_hhf.c @@ -390,6 +390,7 @@ static int hhf_enqueue(struct sk_buff *skb, struct Qdisc *sch) struct hhf_sched_data *q = qdisc_priv(sch); enum wdrr_bucket_idx idx; struct wdrr_bucket *bucket; + unsigned int prev_backlog; idx = hhf_classify(skb, sch); @@ -417,6 +418,7 @@ static int hhf_enqueue(struct sk_buff *skb, struct Qdisc *sch) if (++sch->q.qlen <= sch->limit) return NET_XMIT_SUCCESS; + prev_backlog = sch->qstats.backlog; q->drop_overlimit++; /* Return Congestion Notification only if we dropped a packet from this * bucket. @@ -425,7 +427,7 @@ static int hhf_enqueue(struct sk_buff *skb, struct Qdisc *sch) return NET_XMIT_CN; /* As we dropped a packet, better let upper stack know this. */ - qdisc_tree_decrease_qlen(sch, 1); + qdisc_tree_reduce_backlog(sch, 1, prev_backlog - sch->qstats.backlog); return NET_XMIT_SUCCESS; } @@ -535,7 +537,7 @@ static int hhf_change(struct Qdisc *sch, struct nlattr *opt) { struct hhf_sched_data *q = qdisc_priv(sch); struct nlattr *tb[TCA_HHF_MAX + 1]; - unsigned int qlen; + unsigned int qlen, prev_backlog; int err; u64 non_hh_quantum; u32 new_quantum = q->quantum; @@ -585,12 +587,14 @@ static int hhf_change(struct Qdisc *sch, struct nlattr *opt) } qlen = sch->q.qlen; + prev_backlog = sch->qstats.backlog; while (sch->q.qlen > sch->limit) { struct sk_buff *skb = hhf_dequeue(sch); kfree_skb(skb); } - qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen); + qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, + prev_backlog - sch->qstats.backlog); sch_tree_unlock(sch); return 0; diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 520ffe9eaddb..6b118b1587ca 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1267,7 +1267,6 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg) { struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl = (struct htb_class *)arg; - unsigned int qlen; struct Qdisc *new_q = NULL; int last_child = 0; @@ -1287,9 +1286,11 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg) sch_tree_lock(sch); if (!cl->level) { - qlen = cl->un.leaf.q->q.qlen; + unsigned int qlen = cl->un.leaf.q->q.qlen; + unsigned int backlog = cl->un.leaf.q->qstats.backlog; + qdisc_reset(cl->un.leaf.q); - qdisc_tree_decrease_qlen(cl->un.leaf.q, qlen); + qdisc_tree_reduce_backlog(cl->un.leaf.q, qlen, backlog); } /* delete from hash and active; remainder in destroy_class */ @@ -1423,10 +1424,11 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, sch_tree_lock(sch); if (parent && !parent->level) { unsigned int qlen = parent->un.leaf.q->q.qlen; + unsigned int backlog = parent->un.leaf.q->qstats.backlog; /* turn parent into inner node */ qdisc_reset(parent->un.leaf.q); - qdisc_tree_decrease_qlen(parent->un.leaf.q, qlen); + qdisc_tree_reduce_backlog(parent->un.leaf.q, qlen, backlog); qdisc_destroy(parent->un.leaf.q); if (parent->prio_activity) htb_deactivate(q, parent); diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c index f36ff83fac34..23437d62a8db 100644 --- a/net/sched/sch_multiq.c +++ b/net/sched/sch_multiq.c @@ -218,7 +218,8 @@ static int multiq_tune(struct Qdisc *sch, struct nlattr *opt) if (q->queues[i] != &noop_qdisc) { struct Qdisc *child = q->queues[i]; q->queues[i] = &noop_qdisc; - qdisc_tree_decrease_qlen(child, child->q.qlen); + qdisc_tree_reduce_backlog(child, child->q.qlen, + child->qstats.backlog); qdisc_destroy(child); } } @@ -238,8 +239,9 @@ static int multiq_tune(struct Qdisc *sch, struct nlattr *opt) q->queues[i] = child; if (old != &noop_qdisc) { - qdisc_tree_decrease_qlen(old, - old->q.qlen); + qdisc_tree_reduce_backlog(old, + old->q.qlen, + old->qstats.backlog); qdisc_destroy(old); } sch_tree_unlock(sch); diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 8d56016621ca..cc00329c379c 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -598,7 +598,8 @@ deliver: if (unlikely(err != NET_XMIT_SUCCESS)) { if (net_xmit_drop_count(err)) { qdisc_qstats_drop(sch); - qdisc_tree_decrease_qlen(sch, 1); + qdisc_tree_reduce_backlog(sch, 1, + qdisc_pkt_len(skb)); } } goto tfifo_dequeue; diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c index b783a446d884..71ae3b9629f9 100644 --- a/net/sched/sch_pie.c +++ b/net/sched/sch_pie.c @@ -183,7 +183,7 @@ static int pie_change(struct Qdisc *sch, struct nlattr *opt) { struct pie_sched_data *q = qdisc_priv(sch); struct nlattr *tb[TCA_PIE_MAX + 1]; - unsigned int qlen; + unsigned int qlen, dropped = 0; int err; if (!opt) @@ -232,10 +232,11 @@ static int pie_change(struct Qdisc *sch, struct nlattr *opt) while (sch->q.qlen > sch->limit) { struct sk_buff *skb = __skb_dequeue(&sch->q); + dropped += qdisc_pkt_len(skb); qdisc_qstats_backlog_dec(sch, skb); qdisc_drop(skb, sch); } - qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen); + qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped); sch_tree_unlock(sch); return 0; diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index a677f543a25c..e671b1a4e815 100644 --- a/net/sched/sch_prio.c +++ b/net/sched/sch_prio.c @@ -191,7 +191,7 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt) struct Qdisc *child = q->queues[i]; q->queues[i] = &noop_qdisc; if (child != &noop_qdisc) { - qdisc_tree_decrease_qlen(child, child->q.qlen); + qdisc_tree_reduce_backlog(child, child->q.qlen, child->qstats.backlog); qdisc_destroy(child); } } @@ -210,8 +210,9 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt) q->queues[i] = child; if (old != &noop_qdisc) { - qdisc_tree_decrease_qlen(old, - old->q.qlen); + qdisc_tree_reduce_backlog(old, + old->q.qlen, + old->qstats.backlog); qdisc_destroy(old); } sch_tree_unlock(sch); diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index 6fea3209a7c6..e2b8fd47008b 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -221,9 +221,10 @@ static struct qfq_class *qfq_find_class(struct Qdisc *sch, u32 classid) static void qfq_purge_queue(struct qfq_class *cl) { unsigned int len = cl->qdisc->q.qlen; + unsigned int backlog = cl->qdisc->qstats.backlog; qdisc_reset(cl->qdisc); - qdisc_tree_decrease_qlen(cl->qdisc, len); + qdisc_tree_reduce_backlog(cl->qdisc, len, backlog); } static const struct nla_policy qfq_policy[TCA_QFQ_MAX + 1] = { diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c index d5abcee454d8..8c0508c0e287 100644 --- a/net/sched/sch_red.c +++ b/net/sched/sch_red.c @@ -210,7 +210,8 @@ static int red_change(struct Qdisc *sch, struct nlattr *opt) q->flags = ctl->flags; q->limit = ctl->limit; if (child) { - qdisc_tree_decrease_qlen(q->qdisc, q->qdisc->q.qlen); + qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen, + q->qdisc->qstats.backlog); qdisc_destroy(q->qdisc); q->qdisc = child; } diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index fa7f0a36cc31..e1d634e3c255 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -518,7 +518,8 @@ static int sfb_change(struct Qdisc *sch, struct nlattr *opt) sch_tree_lock(sch); - qdisc_tree_decrease_qlen(q->qdisc, q->qdisc->q.qlen); + qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen, + q->qdisc->qstats.backlog); qdisc_destroy(q->qdisc); q->qdisc = child; diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index b877140beda5..4417fb25166f 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -369,7 +369,7 @@ static int sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch) { struct sfq_sched_data *q = qdisc_priv(sch); - unsigned int hash; + unsigned int hash, dropped; sfq_index x, qlen; struct sfq_slot *slot; int uninitialized_var(ret); @@ -484,7 +484,7 @@ enqueue: return NET_XMIT_SUCCESS; qlen = slot->qlen; - sfq_drop(sch); + dropped = sfq_drop(sch); /* Return Congestion Notification only if we dropped a packet * from this flow. */ @@ -492,7 +492,7 @@ enqueue: return NET_XMIT_CN; /* As we dropped a packet, better let upper stack know this */ - qdisc_tree_decrease_qlen(sch, 1); + qdisc_tree_reduce_backlog(sch, 1, dropped); return NET_XMIT_SUCCESS; } @@ -560,6 +560,7 @@ static void sfq_rehash(struct Qdisc *sch) struct sfq_slot *slot; struct sk_buff_head list; int dropped = 0; + unsigned int drop_len = 0; __skb_queue_head_init(&list); @@ -588,6 +589,7 @@ static void sfq_rehash(struct Qdisc *sch) if (x >= SFQ_MAX_FLOWS) { drop: qdisc_qstats_backlog_dec(sch, skb); + drop_len += qdisc_pkt_len(skb); kfree_skb(skb); dropped++; continue; @@ -617,7 +619,7 @@ drop: } } sch->q.qlen -= dropped; - qdisc_tree_decrease_qlen(sch, dropped); + qdisc_tree_reduce_backlog(sch, dropped, drop_len); } static void sfq_perturbation(unsigned long arg) @@ -641,7 +643,7 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt) struct sfq_sched_data *q = qdisc_priv(sch); struct tc_sfq_qopt *ctl = nla_data(opt); struct tc_sfq_qopt_v1 *ctl_v1 = NULL; - unsigned int qlen; + unsigned int qlen, dropped = 0; struct red_parms *p = NULL; if (opt->nla_len < nla_attr_size(sizeof(*ctl))) @@ -690,8 +692,8 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt) qlen = sch->q.qlen; while (sch->q.qlen > q->limit) - sfq_drop(sch); - qdisc_tree_decrease_qlen(sch, qlen - sch->q.qlen); + dropped += sfq_drop(sch); + qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped); del_timer(&q->perturb_timer); if (q->perturb_period) { diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index 56a1aef3495f..c2fbde742f37 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -160,6 +160,7 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch) struct tbf_sched_data *q = qdisc_priv(sch); struct sk_buff *segs, *nskb; netdev_features_t features = netif_skb_features(skb); + unsigned int len = 0, prev_len = qdisc_pkt_len(skb); int ret, nb; segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK); @@ -172,6 +173,7 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch) nskb = segs->next; segs->next = NULL; qdisc_skb_cb(segs)->pkt_len = segs->len; + len += segs->len; ret = qdisc_enqueue(segs, q->qdisc); if (ret != NET_XMIT_SUCCESS) { if (net_xmit_drop_count(ret)) @@ -183,7 +185,7 @@ static int tbf_segment(struct sk_buff *skb, struct Qdisc *sch) } sch->q.qlen += nb; if (nb > 1) - qdisc_tree_decrease_qlen(sch, 1 - nb); + qdisc_tree_reduce_backlog(sch, 1 - nb, prev_len - len); consume_skb(skb); return nb > 0 ? NET_XMIT_SUCCESS : NET_XMIT_DROP; } @@ -399,7 +401,8 @@ static int tbf_change(struct Qdisc *sch, struct nlattr *opt) sch_tree_lock(sch); if (child) { - qdisc_tree_decrease_qlen(q->qdisc, q->qdisc->q.qlen); + qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen, + q->qdisc->qstats.backlog); qdisc_destroy(q->qdisc); q->qdisc = child; } From 64e1762ae4128508d4a9ef483d77ac52c89f5630 Mon Sep 17 00:00:00 2001 From: WANG Cong Date: Thu, 25 Feb 2016 14:55:02 -0800 Subject: [PATCH 255/312] sch_htb: update backlog as well [ Upstream commit 431e3a8e36a05a37126f34b41aa3a5a6456af04e ] We saw qlen!=0 but backlog==0 on our production machine: qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 1 direct_packets_stat 0 ver 3.17 Sent 172680457356 bytes 222469449 pkt (dropped 0, overlimits 123575834 requeues 0) backlog 0b 72p requeues 0 The problem is we only count qlen for HTB qdisc but not backlog. We need to update backlog too when we update qlen, so that we can at least know the average packet length. Cc: Jamal Hadi Salim Acked-by: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/sched/sch_htb.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 6b118b1587ca..ccff00640713 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -600,6 +600,7 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch) htb_activate(q, cl); } + qdisc_qstats_backlog_inc(sch, skb); sch->q.qlen++; return NET_XMIT_SUCCESS; } @@ -889,6 +890,7 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) ok: qdisc_bstats_update(sch, skb); qdisc_unthrottled(sch); + qdisc_qstats_backlog_dec(sch, skb); sch->q.qlen--; return skb; } @@ -955,6 +957,7 @@ static unsigned int htb_drop(struct Qdisc *sch) unsigned int len; if (cl->un.leaf.q->ops->drop && (len = cl->un.leaf.q->ops->drop(cl->un.leaf.q))) { + sch->qstats.backlog -= len; sch->q.qlen--; if (!cl->un.leaf.q->q.qlen) htb_deactivate(q, cl); @@ -984,12 +987,12 @@ static void htb_reset(struct Qdisc *sch) } cl->prio_activity = 0; cl->cmode = HTB_CAN_SEND; - } } qdisc_watchdog_cancel(&q->watchdog); __skb_queue_purge(&q->direct_queue); sch->q.qlen = 0; + sch->qstats.backlog = 0; memset(q->hlevel, 0, sizeof(q->hlevel)); memset(q->row_mask, 0, sizeof(q->row_mask)); for (i = 0; i < TC_HTB_NUMPRIO; i++) From dfc58a04d16c3132f3691c021a67938617905679 Mon Sep 17 00:00:00 2001 From: WANG Cong Date: Thu, 25 Feb 2016 14:55:03 -0800 Subject: [PATCH 256/312] sch_dsmark: update backlog as well [ Upstream commit bdf17661f63a79c3cb4209b970b1cc39e34f7543 ] Similarly, we need to update backlog too when we update qlen. Cc: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/sched/sch_dsmark.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c index d2084e744433..7288dda2a7fb 100644 --- a/net/sched/sch_dsmark.c +++ b/net/sched/sch_dsmark.c @@ -256,6 +256,7 @@ static int dsmark_enqueue(struct sk_buff *skb, struct Qdisc *sch) return err; } + qdisc_qstats_backlog_inc(sch, skb); sch->q.qlen++; return NET_XMIT_SUCCESS; @@ -278,6 +279,7 @@ static struct sk_buff *dsmark_dequeue(struct Qdisc *sch) return NULL; qdisc_bstats_update(sch, skb); + qdisc_qstats_backlog_dec(sch, skb); sch->q.qlen--; index = skb->tc_index & (p->indices - 1); @@ -393,6 +395,7 @@ static void dsmark_reset(struct Qdisc *sch) pr_debug("%s(sch %p,[qdisc %p])\n", __func__, sch, p); qdisc_reset(p->q); + sch->qstats.backlog = 0; sch->q.qlen = 0; } From 5305392b443d3ba1414deff068ca1542f197d5ec Mon Sep 17 00:00:00 2001 From: Neil Horman Date: Mon, 2 May 2016 12:20:15 -0400 Subject: [PATCH 257/312] netem: Segment GSO packets on enqueue [ Upstream commit 6071bd1aa13ed9e41824bafad845b7b7f4df5cfd ] This was recently reported to me, and reproduced on the latest net kernel, when attempting to run netperf from a host that had a netem qdisc attached to the egress interface: [ 788.073771] ---------------------[ cut here ]--------------------------- [ 788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda() [ 788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962 data_len=0 gso_size=1448 gso_type=1 ip_summed=3 [ 788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log dm_mod [ 788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G W ------------ 3.10.0-327.el7.x86_64 #1 [ 788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012 [ 788.542260] ffff880437c036b8 f7afc56532a53db9 ffff880437c03670 ffffffff816351f1 [ 788.576332] ffff880437c036a8 ffffffff8107b200 ffff880633e74200 ffff880231674000 [ 788.611943] 0000000000000001 0000000000000003 0000000000000000 ffff880437c03710 [ 788.647241] Call Trace: [ 788.658817] [] dump_stack+0x19/0x1b [ 788.686193] [] warn_slowpath_common+0x70/0xb0 [ 788.713803] [] warn_slowpath_fmt+0x5c/0x80 [ 788.741314] [] ? ___ratelimit+0x93/0x100 [ 788.767018] [] skb_warn_bad_offload+0xcd/0xda [ 788.796117] [] skb_checksum_help+0x17c/0x190 [ 788.823392] [] netem_enqueue+0x741/0x7c0 [sch_netem] [ 788.854487] [] dev_queue_xmit+0x2a8/0x570 [ 788.880870] [] ip_finish_output+0x53d/0x7d0 ... The problem occurs because netem is not prepared to handle GSO packets (as it uses skb_checksum_help in its enqueue path, which cannot manipulate these frames). The solution I think is to simply segment the skb in a simmilar fashion to the way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes. When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt the first segment, and enqueue the remaining ones. tested successfully by myself on the latest net kernel, to which this applies Signed-off-by: Neil Horman CC: Jamal Hadi Salim CC: "David S. Miller" CC: netem@lists.linux-foundation.org CC: eric.dumazet@gmail.com CC: stephen@networkplumber.org Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/sched/sch_netem.c | 61 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 59 insertions(+), 2 deletions(-) diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index cc00329c379c..80124c1edbba 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -395,6 +395,25 @@ static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch) sch->q.qlen++; } +/* netem can't properly corrupt a megapacket (like we get from GSO), so instead + * when we statistically choose to corrupt one, we instead segment it, returning + * the first packet to be corrupted, and re-enqueue the remaining frames + */ +static struct sk_buff *netem_segment(struct sk_buff *skb, struct Qdisc *sch) +{ + struct sk_buff *segs; + netdev_features_t features = netif_skb_features(skb); + + segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK); + + if (IS_ERR_OR_NULL(segs)) { + qdisc_reshape_fail(skb, sch); + return NULL; + } + consume_skb(skb); + return segs; +} + /* * Insert one skb into qdisc. * Note: parent depends on return value to account for queue length. @@ -407,7 +426,11 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) /* We don't fill cb now as skb_unshare() may invalidate it */ struct netem_skb_cb *cb; struct sk_buff *skb2; + struct sk_buff *segs = NULL; + unsigned int len = 0, last_len, prev_len = qdisc_pkt_len(skb); + int nb = 0; int count = 1; + int rc = NET_XMIT_SUCCESS; /* Random duplication */ if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor)) @@ -453,10 +476,23 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) * do it now in software before we mangle it. */ if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) { + if (skb_is_gso(skb)) { + segs = netem_segment(skb, sch); + if (!segs) + return NET_XMIT_DROP; + } else { + segs = skb; + } + + skb = segs; + segs = segs->next; + if (!(skb = skb_unshare(skb, GFP_ATOMIC)) || (skb->ip_summed == CHECKSUM_PARTIAL && - skb_checksum_help(skb))) - return qdisc_drop(skb, sch); + skb_checksum_help(skb))) { + rc = qdisc_drop(skb, sch); + goto finish_segs; + } skb->data[prandom_u32() % skb_headlen(skb)] ^= 1<<(prandom_u32() % 8); @@ -516,6 +552,27 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) sch->qstats.requeues++; } +finish_segs: + if (segs) { + while (segs) { + skb2 = segs->next; + segs->next = NULL; + qdisc_skb_cb(segs)->pkt_len = segs->len; + last_len = segs->len; + rc = qdisc_enqueue(segs, sch); + if (rc != NET_XMIT_SUCCESS) { + if (net_xmit_drop_count(rc)) + qdisc_qstats_drop(sch); + } else { + nb++; + len += last_len; + } + segs = skb2; + } + sch->q.qlen += nb; + if (nb > 1) + qdisc_tree_reduce_backlog(sch, 1 - nb, prev_len - len); + } return NET_XMIT_SUCCESS; } From 9a7e38d01e351fc5de11b2a033ced5ff04090b81 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= Date: Tue, 3 May 2016 16:38:53 +0200 Subject: [PATCH 258/312] net: fec: only clear a queue's work bit if the queue was emptied MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 1c021bb717a70aaeaa4b25c91f43c2aeddd922de ] In the receive path a queue's work bit was cleared unconditionally even if fec_enet_rx_queue only read out a part of the available packets from the hardware. This resulted in not reading any packets in the next napi turn and so packets were delayed or lost. The obvious fix is to only clear a queue's bit when the queue was emptied. Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue") Signed-off-by: Uwe Kleine-König Reviewed-by: Lucas Stach Tested-by: Fugang Duan Acked-by: Fugang Duan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/freescale/fec_main.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 570390b5cd42..67aec18dd76c 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1546,9 +1546,15 @@ fec_enet_rx(struct net_device *ndev, int budget) struct fec_enet_private *fep = netdev_priv(ndev); for_each_set_bit(queue_id, &fep->work_rx, FEC_ENET_MAX_RX_QS) { - clear_bit(queue_id, &fep->work_rx); - pkt_received += fec_enet_rx_queue(ndev, + int ret; + + ret = fec_enet_rx_queue(ndev, budget - pkt_received, queue_id); + + if (ret < budget - pkt_received) + clear_bit(queue_id, &fep->work_rx); + + pkt_received += ret; } return pkt_received; } From 5923f46563d1ce74c1f1178cba5a67735bb83e6d Mon Sep 17 00:00:00 2001 From: Kangjie Lu Date: Tue, 3 May 2016 16:35:05 -0400 Subject: [PATCH 259/312] net: fix infoleak in llc MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit b8670c09f37bdf2847cc44f36511a53afc6161fd ] The stack object “info” has a total size of 12 bytes. Its last byte is padding which is not initialized and leaked via “put_cmsg”. Signed-off-by: Kangjie Lu Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/llc/af_llc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c index 17a8dff06090..c58f242c00f1 100644 --- a/net/llc/af_llc.c +++ b/net/llc/af_llc.c @@ -626,6 +626,7 @@ static void llc_cmsg_rcv(struct msghdr *msg, struct sk_buff *skb) if (llc->cmsg_flags & LLC_CMSG_PKTINFO) { struct llc_pktinfo info; + memset(&info, 0, sizeof(info)); info.lpi_ifindex = llc_sk(skb->sk)->dev->ifindex; llc_pdu_decode_dsap(skb, &info.lpi_sap); llc_pdu_decode_da(skb, info.lpi_mac); From 9a9390bcf56680c487a8e4c89c813a48bfedc4b6 Mon Sep 17 00:00:00 2001 From: Kangjie Lu Date: Tue, 3 May 2016 16:46:24 -0400 Subject: [PATCH 260/312] net: fix infoleak in rtnetlink MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6 ] The stack object “map” has a total size of 32 bytes. Its last 4 bytes are padding generated by compiler. These padding bytes are not initialized and sent out via “nla_put”. Signed-off-by: Kangjie Lu Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/core/rtnetlink.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 31a46a7bd3d6..e9c4b51525de 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1072,14 +1072,16 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, goto nla_put_failure; if (1) { - struct rtnl_link_ifmap map = { - .mem_start = dev->mem_start, - .mem_end = dev->mem_end, - .base_addr = dev->base_addr, - .irq = dev->irq, - .dma = dev->dma, - .port = dev->if_port, - }; + struct rtnl_link_ifmap map; + + memset(&map, 0, sizeof(map)); + map.mem_start = dev->mem_start; + map.mem_end = dev->mem_end; + map.base_addr = dev->base_addr; + map.irq = dev->irq; + map.dma = dev->dma; + map.port = dev->if_port; + if (nla_put(skb, IFLA_MAP, sizeof(map), &map)) goto nla_put_failure; } From 8bba1625512245771bdb2cb1502697228fe7e1b2 Mon Sep 17 00:00:00 2001 From: Daniel Jurgens Date: Wed, 4 May 2016 15:00:33 +0300 Subject: [PATCH 261/312] net/mlx4_en: Fix endianness bug in IPV6 csum calculation [ Upstream commit 82d69203df634b4dfa765c94f60ce9482bcc44d6 ] Use htons instead of unconditionally byte swapping nexthdr. On a little endian systems shifting the byte is correct behavior, but it results in incorrect csums on big endian architectures. Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE') Signed-off-by: Daniel Jurgens Reviewed-by: Carol Soto Tested-by: Carol Soto Signed-off-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 80aac20104de..f6095d2b77de 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -710,7 +710,7 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb, if (ipv6h->nexthdr == IPPROTO_FRAGMENT || ipv6h->nexthdr == IPPROTO_HOPOPTS) return -1; - hw_checksum = csum_add(hw_checksum, (__force __wsum)(ipv6h->nexthdr << 8)); + hw_checksum = csum_add(hw_checksum, (__force __wsum)htons(ipv6h->nexthdr)); csum_pseudo_hdr = csum_partial(&ipv6h->saddr, sizeof(ipv6h->saddr) + sizeof(ipv6h->daddr), 0); From 8817c1b1dad9fe118779eb3df237674856514fc8 Mon Sep 17 00:00:00 2001 From: Ian Campbell Date: Wed, 4 May 2016 14:21:53 +0100 Subject: [PATCH 262/312] VSOCK: do not disconnect socket when peer has shutdown SEND only [ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ] The peer may be expecting a reply having sent a request and then done a shutdown(SHUT_WR), so tearing down the whole socket at this point seems wrong and breaks for me with a client which does a SHUT_WR. Looking at other socket family's stream_recvmsg callbacks doing a shutdown here does not seem to be the norm and removing it does not seem to have had any adverse effects that I can see. I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact on the vmci transport. Signed-off-by: Ian Campbell Cc: "David S. Miller" Cc: Stefan Hajnoczi Cc: Claudio Imbrenda Cc: Andy King Cc: Dmitry Torokhov Cc: Jorgen Hansen Cc: Adit Ranadive Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/vmw_vsock/af_vsock.c | 21 +-------------------- 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 2ec86e652a19..e1c69b216db3 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1794,27 +1794,8 @@ vsock_stream_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, else if (sk->sk_shutdown & RCV_SHUTDOWN) err = 0; - if (copied > 0) { - /* We only do these additional bookkeeping/notification steps - * if we actually copied something out of the queue pair - * instead of just peeking ahead. - */ - - if (!(flags & MSG_PEEK)) { - /* If the other side has shutdown for sending and there - * is nothing more to read, then modify the socket - * state. - */ - if (vsk->peer_shutdown & SEND_SHUTDOWN) { - if (vsock_stream_has_data(vsk) <= 0) { - sk->sk_state = SS_UNCONNECTED; - sock_set_flag(sk, SOCK_DONE); - sk->sk_state_change(sk); - } - } - } + if (copied > 0) err = copied; - } out_wait: finish_wait(sk_sleep(sk), &wait); From 806d70c7da5bc0dc43e54fe0362fc0fcf573bfe2 Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Wed, 4 May 2016 16:18:45 +0200 Subject: [PATCH 263/312] net: bridge: fix old ioctl unlocked net device walk [ Upstream commit 31ca0458a61a502adb7ed192bf9716c6d05791a5 ] get_bridge_ifindices() is used from the old "deviceless" bridge ioctl calls which aren't called with rtnl held. The comment above says that it is called with rtnl but that is not really the case. Here's a sample output from a test ASSERT_RTNL() which I put in get_bridge_ifindices and executed "brctl show": [ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30) [ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O 4.6.0-rc4+ #157 [ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 957.423009] 0000000000000000 ffff880058adfdf0 ffffffff8138dec5 0000000000000400 [ 957.423009] ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32 0000000000000001 [ 957.423009] 00007ffec1a444b0 0000000000000400 ffff880053c19130 0000000000008940 [ 957.423009] Call Trace: [ 957.423009] [] dump_stack+0x85/0xc0 [ 957.423009] [] br_ioctl_deviceless_stub+0x212/0x2e0 [bridge] [ 957.423009] [] sock_ioctl+0x22b/0x290 [ 957.423009] [] do_vfs_ioctl+0x95/0x700 [ 957.423009] [] SyS_ioctl+0x79/0x90 [ 957.423009] [] entry_SYSCALL_64_fastpath+0x23/0xc1 Since it only reads bridge ifindices, we can use rcu to safely walk the net device list. Also remove the wrong rtnl comment above. Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/bridge/br_ioctl.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/net/bridge/br_ioctl.c b/net/bridge/br_ioctl.c index 8d423bc649b9..f876f707fd9e 100644 --- a/net/bridge/br_ioctl.c +++ b/net/bridge/br_ioctl.c @@ -21,18 +21,19 @@ #include #include "br_private.h" -/* called with RTNL */ static int get_bridge_ifindices(struct net *net, int *indices, int num) { struct net_device *dev; int i = 0; - for_each_netdev(net, dev) { + rcu_read_lock(); + for_each_netdev_rcu(net, dev) { if (i >= num) break; if (dev->priv_flags & IFF_EBRIDGE) indices[i++] = dev->ifindex; } + rcu_read_unlock(); return i; } From b2b95b3fbd93c910210922809f6c4d24be172b1c Mon Sep 17 00:00:00 2001 From: Kangjie Lu Date: Sun, 8 May 2016 12:10:14 -0400 Subject: [PATCH 264/312] net: fix a kernel infoleak in x25 module [ Upstream commit 79e48650320e6fba48369fccf13fd045315b19b8 ] Stack object "dte_facilities" is allocated in x25_rx_call_request(), which is supposed to be initialized in x25_negotiate_facilities. However, 5 fields (8 bytes in total) are not initialized. This object is then copied to userland via copy_to_user, thus infoleak occurs. Signed-off-by: Kangjie Lu Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/x25/x25_facilities.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/x25/x25_facilities.c b/net/x25/x25_facilities.c index 7ecd04c21360..997ff7b2509b 100644 --- a/net/x25/x25_facilities.c +++ b/net/x25/x25_facilities.c @@ -277,6 +277,7 @@ int x25_negotiate_facilities(struct sk_buff *skb, struct sock *sk, memset(&theirs, 0, sizeof(theirs)); memcpy(new, ours, sizeof(*new)); + memset(dte, 0, sizeof(*dte)); len = x25_parse_facilities(skb, &theirs, dte, &x25->vc_facil_mask); if (len < 0) From 90eb6718b9db5c145f7c2d4a14df6a4b8d96e7b3 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 9 May 2016 20:55:16 -0700 Subject: [PATCH 265/312] tcp: refresh skb timestamp at retransmit time [ Upstream commit 10a81980fc47e64ffac26a073139813d3f697b64 ] In the very unlikely case __tcp_retransmit_skb() can not use the cloning done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing the copy and transmit, otherwise TCP TS val will be an exact copy of original transmit. Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when") Signed-off-by: Eric Dumazet Cc: Yuchung Cheng Acked-by: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv4/tcp_output.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 1ea4322c3b0c..ae66c8426ad0 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2627,8 +2627,10 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb) */ if (unlikely((NET_IP_ALIGN && ((unsigned long)skb->data & 3)) || skb_headroom(skb) >= 0xFFFF)) { - struct sk_buff *nskb = __pskb_copy(skb, MAX_TCP_HEADER, - GFP_ATOMIC); + struct sk_buff *nskb; + + skb_mstamp_get(&skb->skb_mstamp); + nskb = __pskb_copy(skb, MAX_TCP_HEADER, GFP_ATOMIC); err = nskb ? tcp_transmit_skb(sk, nskb, 0, GFP_ATOMIC) : -ENOBUFS; } else { From a36c14ba24b1ba16964d0289fe8b282b780a5554 Mon Sep 17 00:00:00 2001 From: Gerald Schaefer Date: Fri, 15 Apr 2016 16:38:40 +0200 Subject: [PATCH 266/312] s390/mm: fix asce_bits handling with dynamic pagetable levels [ Upstream commit 723cacbd9dc79582e562c123a0bacf8bfc69e72a ] There is a race with multi-threaded applications between context switch and pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a pagetable upgrade on another thread in crst_table_upgrade() could already have set new asce_bits, but not yet the new mm->pgd. This would result in a corrupt user_asce in switch_mm(), and eventually in a kernel panic from a translation exception. Fix this by storing the complete asce instead of just the asce_bits, which can then be read atomically from switch_mm(), so that it either sees the old value or the new value, but no mixture. Both cases are OK. Having the old value would result in a page fault on access to the higher level memory, but the fault handler would see the new mm->pgd, if it was a valid access after the mmap on the other thread has completed. So as worst-case scenario we would have a page fault loop for the racing thread until the next time slice. Also remove dead code and simplify the upgrade/downgrade path, there are no upgrades from 2 levels, and only downgrades from 3 levels for compat tasks. There are also no concurrent upgrades, because the mmap_sem is held with down_write() in do_mmap, so the flush and table checks during upgrade can be removed. Reported-by: Michael Munday Reviewed-by: Martin Schwidefsky Signed-off-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin --- arch/s390/include/asm/mmu.h | 2 +- arch/s390/include/asm/mmu_context.h | 28 ++++++++-- arch/s390/include/asm/pgalloc.h | 4 +- arch/s390/include/asm/processor.h | 2 +- arch/s390/include/asm/tlbflush.h | 9 +-- arch/s390/mm/init.c | 3 +- arch/s390/mm/mmap.c | 6 +- arch/s390/mm/pgtable.c | 85 ++++++++++------------------- 8 files changed, 62 insertions(+), 77 deletions(-) diff --git a/arch/s390/include/asm/mmu.h b/arch/s390/include/asm/mmu.h index d29ad9545b41..081b2ad99d73 100644 --- a/arch/s390/include/asm/mmu.h +++ b/arch/s390/include/asm/mmu.h @@ -11,7 +11,7 @@ typedef struct { spinlock_t list_lock; struct list_head pgtable_list; struct list_head gmap_list; - unsigned long asce_bits; + unsigned long asce; unsigned long asce_limit; unsigned long vdso_base; /* The mmu context allocates 4K page tables. */ diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h index e485817f7b1a..22877c9440ea 100644 --- a/arch/s390/include/asm/mmu_context.h +++ b/arch/s390/include/asm/mmu_context.h @@ -26,12 +26,28 @@ static inline int init_new_context(struct task_struct *tsk, mm->context.has_pgste = 0; mm->context.use_skey = 0; #endif - if (mm->context.asce_limit == 0) { + switch (mm->context.asce_limit) { + case 1UL << 42: + /* + * forked 3-level task, fall through to set new asce with new + * mm->pgd + */ + case 0: /* context created by exec, set asce limit to 4TB */ - mm->context.asce_bits = _ASCE_TABLE_LENGTH | - _ASCE_USER_BITS | _ASCE_TYPE_REGION3; mm->context.asce_limit = STACK_TOP_MAX; - } else if (mm->context.asce_limit == (1UL << 31)) { + mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH | + _ASCE_USER_BITS | _ASCE_TYPE_REGION3; + break; + case 1UL << 53: + /* forked 4-level task, set new asce with new mm->pgd */ + mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH | + _ASCE_USER_BITS | _ASCE_TYPE_REGION2; + break; + case 1UL << 31: + /* forked 2-level compat task, set new asce with new mm->pgd */ + mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH | + _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT; + /* pgd_alloc() did not increase mm->nr_pmds */ mm_inc_nr_pmds(mm); } crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm)); @@ -42,7 +58,7 @@ static inline int init_new_context(struct task_struct *tsk, static inline void set_user_asce(struct mm_struct *mm) { - S390_lowcore.user_asce = mm->context.asce_bits | __pa(mm->pgd); + S390_lowcore.user_asce = mm->context.asce; if (current->thread.mm_segment.ar4) __ctl_load(S390_lowcore.user_asce, 7, 7); set_cpu_flag(CIF_ASCE); @@ -71,7 +87,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, { int cpu = smp_processor_id(); - S390_lowcore.user_asce = next->context.asce_bits | __pa(next->pgd); + S390_lowcore.user_asce = next->context.asce; if (prev == next) return; if (MACHINE_HAS_TLB_LC) diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h index d7cc79fb6191..5991cdcb5b40 100644 --- a/arch/s390/include/asm/pgalloc.h +++ b/arch/s390/include/asm/pgalloc.h @@ -56,8 +56,8 @@ static inline unsigned long pgd_entry_type(struct mm_struct *mm) return _REGION2_ENTRY_EMPTY; } -int crst_table_upgrade(struct mm_struct *, unsigned long limit); -void crst_table_downgrade(struct mm_struct *, unsigned long limit); +int crst_table_upgrade(struct mm_struct *); +void crst_table_downgrade(struct mm_struct *); static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address) { diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h index dedb6218544b..7ce53f682ec9 100644 --- a/arch/s390/include/asm/processor.h +++ b/arch/s390/include/asm/processor.h @@ -155,7 +155,7 @@ struct stack_frame { regs->psw.mask = PSW_USER_BITS | PSW_MASK_BA; \ regs->psw.addr = new_psw | PSW_ADDR_AMODE; \ regs->gprs[15] = new_stackp; \ - crst_table_downgrade(current->mm, 1UL << 31); \ + crst_table_downgrade(current->mm); \ execve_tail(); \ } while (0) diff --git a/arch/s390/include/asm/tlbflush.h b/arch/s390/include/asm/tlbflush.h index ca148f7c3eaa..a2e6ef32e054 100644 --- a/arch/s390/include/asm/tlbflush.h +++ b/arch/s390/include/asm/tlbflush.h @@ -110,8 +110,7 @@ static inline void __tlb_flush_asce(struct mm_struct *mm, unsigned long asce) static inline void __tlb_flush_kernel(void) { if (MACHINE_HAS_IDTE) - __tlb_flush_idte((unsigned long) init_mm.pgd | - init_mm.context.asce_bits); + __tlb_flush_idte(init_mm.context.asce); else __tlb_flush_global(); } @@ -133,8 +132,7 @@ static inline void __tlb_flush_asce(struct mm_struct *mm, unsigned long asce) static inline void __tlb_flush_kernel(void) { if (MACHINE_HAS_TLB_LC) - __tlb_flush_idte_local((unsigned long) init_mm.pgd | - init_mm.context.asce_bits); + __tlb_flush_idte_local(init_mm.context.asce); else __tlb_flush_local(); } @@ -148,8 +146,7 @@ static inline void __tlb_flush_mm(struct mm_struct * mm) * only ran on the local cpu. */ if (MACHINE_HAS_IDTE && list_empty(&mm->context.gmap_list)) - __tlb_flush_asce(mm, (unsigned long) mm->pgd | - mm->context.asce_bits); + __tlb_flush_asce(mm, mm->context.asce); else __tlb_flush_full(mm); } diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 80875c43a4a4..728744918a07 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -112,7 +112,8 @@ void __init paging_init(void) asce_bits = _ASCE_TYPE_REGION3 | _ASCE_TABLE_LENGTH; pgd_type = _REGION3_ENTRY_EMPTY; } - S390_lowcore.kernel_asce = (__pa(init_mm.pgd) & PAGE_MASK) | asce_bits; + init_mm.context.asce = (__pa(init_mm.pgd) & PAGE_MASK) | asce_bits; + S390_lowcore.kernel_asce = init_mm.context.asce; clear_table((unsigned long *) init_mm.pgd, pgd_type, sizeof(unsigned long)*2048); vmem_map_init(); diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c index 6e552af08c76..e2f8685d9981 100644 --- a/arch/s390/mm/mmap.c +++ b/arch/s390/mm/mmap.c @@ -184,7 +184,7 @@ int s390_mmap_check(unsigned long addr, unsigned long len, unsigned long flags) if (!(flags & MAP_FIXED)) addr = 0; if ((addr + len) >= TASK_SIZE) - return crst_table_upgrade(current->mm, 1UL << 53); + return crst_table_upgrade(current->mm); return 0; } @@ -201,7 +201,7 @@ s390_get_unmapped_area(struct file *filp, unsigned long addr, return area; if (area == -ENOMEM && !is_compat_task() && TASK_SIZE < (1UL << 53)) { /* Upgrade the page table to 4 levels and retry. */ - rc = crst_table_upgrade(mm, 1UL << 53); + rc = crst_table_upgrade(mm); if (rc) return (unsigned long) rc; area = arch_get_unmapped_area(filp, addr, len, pgoff, flags); @@ -223,7 +223,7 @@ s390_get_unmapped_area_topdown(struct file *filp, const unsigned long addr, return area; if (area == -ENOMEM && !is_compat_task() && TASK_SIZE < (1UL << 53)) { /* Upgrade the page table to 4 levels and retry. */ - rc = crst_table_upgrade(mm, 1UL << 53); + rc = crst_table_upgrade(mm); if (rc) return (unsigned long) rc; area = arch_get_unmapped_area_topdown(filp, addr, len, diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index b33f66110ca9..ebf82a99df45 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -56,81 +56,52 @@ static void __crst_table_upgrade(void *arg) __tlb_flush_local(); } -int crst_table_upgrade(struct mm_struct *mm, unsigned long limit) +int crst_table_upgrade(struct mm_struct *mm) { unsigned long *table, *pgd; - unsigned long entry; - int flush; - BUG_ON(limit > (1UL << 53)); - flush = 0; -repeat: + /* upgrade should only happen from 3 to 4 levels */ + BUG_ON(mm->context.asce_limit != (1UL << 42)); + table = crst_table_alloc(mm); if (!table) return -ENOMEM; + spin_lock_bh(&mm->page_table_lock); - if (mm->context.asce_limit < limit) { - pgd = (unsigned long *) mm->pgd; - if (mm->context.asce_limit <= (1UL << 31)) { - entry = _REGION3_ENTRY_EMPTY; - mm->context.asce_limit = 1UL << 42; - mm->context.asce_bits = _ASCE_TABLE_LENGTH | - _ASCE_USER_BITS | - _ASCE_TYPE_REGION3; - } else { - entry = _REGION2_ENTRY_EMPTY; - mm->context.asce_limit = 1UL << 53; - mm->context.asce_bits = _ASCE_TABLE_LENGTH | - _ASCE_USER_BITS | - _ASCE_TYPE_REGION2; - } - crst_table_init(table, entry); - pgd_populate(mm, (pgd_t *) table, (pud_t *) pgd); - mm->pgd = (pgd_t *) table; - mm->task_size = mm->context.asce_limit; - table = NULL; - flush = 1; - } + pgd = (unsigned long *) mm->pgd; + crst_table_init(table, _REGION2_ENTRY_EMPTY); + pgd_populate(mm, (pgd_t *) table, (pud_t *) pgd); + mm->pgd = (pgd_t *) table; + mm->context.asce_limit = 1UL << 53; + mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH | + _ASCE_USER_BITS | _ASCE_TYPE_REGION2; + mm->task_size = mm->context.asce_limit; spin_unlock_bh(&mm->page_table_lock); - if (table) - crst_table_free(mm, table); - if (mm->context.asce_limit < limit) - goto repeat; - if (flush) - on_each_cpu(__crst_table_upgrade, mm, 0); + + on_each_cpu(__crst_table_upgrade, mm, 0); return 0; } -void crst_table_downgrade(struct mm_struct *mm, unsigned long limit) +void crst_table_downgrade(struct mm_struct *mm) { pgd_t *pgd; + /* downgrade should only happen from 3 to 2 levels (compat only) */ + BUG_ON(mm->context.asce_limit != (1UL << 42)); + if (current->active_mm == mm) { clear_user_asce(); __tlb_flush_mm(mm); } - while (mm->context.asce_limit > limit) { - pgd = mm->pgd; - switch (pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) { - case _REGION_ENTRY_TYPE_R2: - mm->context.asce_limit = 1UL << 42; - mm->context.asce_bits = _ASCE_TABLE_LENGTH | - _ASCE_USER_BITS | - _ASCE_TYPE_REGION3; - break; - case _REGION_ENTRY_TYPE_R3: - mm->context.asce_limit = 1UL << 31; - mm->context.asce_bits = _ASCE_TABLE_LENGTH | - _ASCE_USER_BITS | - _ASCE_TYPE_SEGMENT; - break; - default: - BUG(); - } - mm->pgd = (pgd_t *) (pgd_val(*pgd) & _REGION_ENTRY_ORIGIN); - mm->task_size = mm->context.asce_limit; - crst_table_free(mm, (unsigned long *) pgd); - } + + pgd = mm->pgd; + mm->pgd = (pgd_t *) (pgd_val(*pgd) & _REGION_ENTRY_ORIGIN); + mm->context.asce_limit = 1UL << 31; + mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH | + _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT; + mm->task_size = mm->context.asce_limit; + crst_table_free(mm, (unsigned long *) pgd); + if (current->active_mm == mm) set_user_asce(mm); } From 545655ed2eea768dc9abd8899b65b5c272bc5104 Mon Sep 17 00:00:00 2001 From: Lucas Stach Date: Thu, 5 May 2016 10:16:44 -0400 Subject: [PATCH 267/312] drm/radeon: fix PLL sharing on DCE6.1 (v2) [ Upstream commit e3c00d87845ab375f90fa6e10a5e72a3a5778cd3 ] On DCE6.1 PPLL2 is exclusively available to UNIPHYA, so it should not be taken into consideration when looking for an already enabled PLL to be shared with other outputs. This fixes the broken VGA port (TRAVIS DP->VGA bridge) on my Richland based laptop, where the internal display is connected to UNIPHYA through a TRAVIS DP->LVDS bridge. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=78987 v2: agd: add check in radeon_get_shared_nondp_ppll as well, drop extra parameter. Signed-off-by: Lucas Stach Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/atombios_crtc.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c index dac78ad24b31..79bab6fd76bb 100644 --- a/drivers/gpu/drm/radeon/atombios_crtc.c +++ b/drivers/gpu/drm/radeon/atombios_crtc.c @@ -1739,6 +1739,7 @@ static u32 radeon_get_pll_use_mask(struct drm_crtc *crtc) static int radeon_get_shared_dp_ppll(struct drm_crtc *crtc) { struct drm_device *dev = crtc->dev; + struct radeon_device *rdev = dev->dev_private; struct drm_crtc *test_crtc; struct radeon_crtc *test_radeon_crtc; @@ -1748,6 +1749,10 @@ static int radeon_get_shared_dp_ppll(struct drm_crtc *crtc) test_radeon_crtc = to_radeon_crtc(test_crtc); if (test_radeon_crtc->encoder && ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) { + /* PPLL2 is exclusive to UNIPHYA on DCE61 */ + if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) && + test_radeon_crtc->pll_id == ATOM_PPLL2) + continue; /* for DP use the same PLL for all */ if (test_radeon_crtc->pll_id != ATOM_PPLL_INVALID) return test_radeon_crtc->pll_id; @@ -1769,6 +1774,7 @@ static int radeon_get_shared_nondp_ppll(struct drm_crtc *crtc) { struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc); struct drm_device *dev = crtc->dev; + struct radeon_device *rdev = dev->dev_private; struct drm_crtc *test_crtc; struct radeon_crtc *test_radeon_crtc; u32 adjusted_clock, test_adjusted_clock; @@ -1784,6 +1790,10 @@ static int radeon_get_shared_nondp_ppll(struct drm_crtc *crtc) test_radeon_crtc = to_radeon_crtc(test_crtc); if (test_radeon_crtc->encoder && !ENCODER_MODE_IS_DP(atombios_get_encoder_mode(test_radeon_crtc->encoder))) { + /* PPLL2 is exclusive to UNIPHYA on DCE61 */ + if (ASIC_IS_DCE61(rdev) && !ASIC_IS_DCE8(rdev) && + test_radeon_crtc->pll_id == ATOM_PPLL2) + continue; /* check if we are already driving this connector with another crtc */ if (test_radeon_crtc->connector == radeon_crtc->connector) { /* if we are, return that pll */ From ec2744d3c8e70f1beedeaa361dd8b162017a12e7 Mon Sep 17 00:00:00 2001 From: Josef Bacik Date: Fri, 25 Mar 2016 10:02:41 -0400 Subject: [PATCH 268/312] Btrfs: don't use src fd for printk [ Upstream commit c79b4713304f812d3d6c95826fc3e5fc2c0b0c14 ] The fd we pass in may not be on a btrfs file system, so don't try to do BTRFS_I() on it. Thanks, Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin --- fs/btrfs/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 43502247e176..2eca30adb3e3 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1639,7 +1639,7 @@ static noinline int btrfs_ioctl_snap_create_transid(struct file *file, src_inode = file_inode(src.file); if (src_inode->i_sb != file_inode(file)->i_sb) { - btrfs_info(BTRFS_I(src_inode)->root->fs_info, + btrfs_info(BTRFS_I(file_inode(file))->root->fs_info, "Snapshot src from another FS"); ret = -EXDEV; } else if (!inode_owner_or_capable(src_inode)) { From 616ffbf1a53617037e2e46d2f40232a5cb1ea8c0 Mon Sep 17 00:00:00 2001 From: Andy Gross Date: Tue, 3 May 2016 15:24:11 -0500 Subject: [PATCH 269/312] clk: qcom: msm8916: Fix crypto clock flags [ Upstream commit 2a0974aa1a0b40a92387ea03dbfeacfbc9ba182c ] This patch adds the CLK_SET_RATE_PARENT flag for the crypto core and ahb blocks. Without this flag, clk_set_rate can fail for certain frequency requests. Signed-off-by: Andy Gross Fixes: 3966fab8b6ab ("clk: qcom: Add MSM8916 Global Clock Controller support") Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin --- drivers/clk/qcom/gcc-msm8916.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/clk/qcom/gcc-msm8916.c b/drivers/clk/qcom/gcc-msm8916.c index 5d75bffab141..0f721998b2d4 100644 --- a/drivers/clk/qcom/gcc-msm8916.c +++ b/drivers/clk/qcom/gcc-msm8916.c @@ -1961,6 +1961,7 @@ static struct clk_branch gcc_crypto_ahb_clk = { "pcnoc_bfdcd_clk_src", }, .num_parents = 1, + .flags = CLK_SET_RATE_PARENT, .ops = &clk_branch2_ops, }, }, @@ -1996,6 +1997,7 @@ static struct clk_branch gcc_crypto_clk = { "crypto_clk_src", }, .num_parents = 1, + .flags = CLK_SET_RATE_PARENT, .ops = &clk_branch2_ops, }, }, From 4b6986fcceb200b2edd97d7cc5799d8a7ae53086 Mon Sep 17 00:00:00 2001 From: Lars-Peter Clausen Date: Wed, 30 Mar 2016 13:49:14 +0200 Subject: [PATCH 270/312] usb: gadget: f_fs: Fix EFAULT generation for async read operations [ Upstream commit 332a5b446b7916d272c2a659a3b20909ce34d2c1 ] In the current implementation functionfs generates a EFAULT for async read operations if the read buffer size is larger than the URB data size. Since a application does not necessarily know how much data the host side is going to send it typically supplies a buffer larger than the actual data, which will then result in a EFAULT error. This behaviour was introduced while refactoring the code to use iov_iter interface in commit c993c39b8639 ("gadget/function/f_fs.c: use put iov_iter into io_data"). The original code took the minimum over the URB size and the user buffer size and then attempted to copy that many bytes using copy_to_user(). If copy_to_user() could not copy all data a EFAULT error was generated. Restore the original behaviour by only generating a EFAULT error when the number of bytes copied is not the size of the URB and the target buffer has not been fully filled. Commit 342f39a6c8d3 ("usb: gadget: f_fs: fix check in read operation") already fixed the same problem for the synchronous read path. Fixes: c993c39b8639 ("gadget/function/f_fs.c: use put iov_iter into io_data") Acked-by: Michal Nazarewicz Signed-off-by: Lars-Peter Clausen Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin --- drivers/usb/gadget/function/f_fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c index 82240dbdf6dd..db9433eed2cc 100644 --- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -651,7 +651,7 @@ static void ffs_user_copy_worker(struct work_struct *work) if (io_data->read && ret > 0) { use_mm(io_data->mm); ret = copy_to_iter(io_data->buf, ret, &io_data->data); - if (iov_iter_count(&io_data->data)) + if (ret != io_data->req->actual && iov_iter_count(&io_data->data)) ret = -EFAULT; unuse_mm(io_data->mm); } From 87e0b5757eab598f4b55babdc8879e143a556b7a Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Mon, 21 Mar 2016 12:33:00 +0100 Subject: [PATCH 271/312] KVM: x86: mask CPUID(0xD,0x1).EAX against host value MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 316314cae15fb0e3869b76b468f59a0c83ac3d4e ] This ensures that the guest doesn't see XSAVE extensions (e.g. xgetbv1 or xsavec) that the host lacks. Cc: stable@vger.kernel.org Reviewed-by: Radim Krčmář Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin --- arch/x86/kvm/cpuid.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 1d08ad3582d0..090aa5c1d6b1 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -501,6 +501,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, do_cpuid_1_ent(&entry[i], function, idx); if (idx == 1) { entry[i].eax &= kvm_supported_word10_x86_features; + cpuid_mask(&entry[i].eax, 10); entry[i].ebx = 0; if (entry[i].eax & (F(XSAVES)|F(XSAVEC))) entry[i].ebx = From 5ec75b71be17201805f2e6b7ef9f17f2d6b412d3 Mon Sep 17 00:00:00 2001 From: Jiri Slaby Date: Tue, 3 May 2016 17:05:54 +0200 Subject: [PATCH 272/312] tty: vt, return error when con_startup fails [ Upstream commit 6798df4c5fe0a7e6d2065cf79649a794e5ba7114 ] When csw->con_startup() fails in do_register_con_driver, we return no error (i.e. 0). This was changed back in 2006 by commit 3e795de763. Before that we used to return -ENODEV. So fix the return value to be -ENODEV in that case again. Fixes: 3e795de763 ("VT binding: Add binding/unbinding support for the VT console") Signed-off-by: Jiri Slaby Reported-by: "Dan Carpenter" Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/tty/vt/vt.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 4a24eb2b0ede..ba86956ef4b5 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -3587,9 +3587,10 @@ static int do_register_con_driver(const struct consw *csw, int first, int last) goto err; desc = csw->con_startup(); - - if (!desc) + if (!desc) { + retval = -ENODEV; goto err; + } retval = -EINVAL; From 664fcc116904a7414f7bbd0831c973cbd28b59ad Mon Sep 17 00:00:00 2001 From: Chanwoo Choi Date: Thu, 21 Apr 2016 18:58:31 +0900 Subject: [PATCH 273/312] serial: samsung: Reorder the sequence of clock control when call s3c24xx_serial_set_termios() [ Upstream commit b8995f527aac143e83d3900ff39357651ea4e0f6 ] This patch fixes the broken serial log when changing the clock source of uart device. Before disabling the original clock source, this patch enables the new clock source to protect the clock off state for a split second. Signed-off-by: Chanwoo Choi Reviewed-by: Marek Szyprowski Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin --- drivers/tty/serial/samsung.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/samsung.c b/drivers/tty/serial/samsung.c index 1e0d9b8c48c9..e42cb6bdd31d 100644 --- a/drivers/tty/serial/samsung.c +++ b/drivers/tty/serial/samsung.c @@ -1288,6 +1288,8 @@ static void s3c24xx_serial_set_termios(struct uart_port *port, /* check to see if we need to change clock source */ if (ourport->baudclk != clk) { + clk_prepare_enable(clk); + s3c24xx_serial_setsource(port, clk_sel); if (!IS_ERR(ourport->baudclk)) { @@ -1295,8 +1297,6 @@ static void s3c24xx_serial_set_termios(struct uart_port *port, ourport->baudclk = ERR_PTR(-EINVAL); } - clk_prepare_enable(clk); - ourport->baudclk = clk; ourport->baudclk_rate = clk ? clk_get_rate(clk) : 0; } From e8ebd0cf882ba73a5c867bb7228dba1ae746c047 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Thu, 17 Mar 2016 20:37:10 +0800 Subject: [PATCH 274/312] MIPS: Reserve nosave data for hibernation [ Upstream commit a95d069204e178f18476f5499abab0d0d9cbc32c ] After commit 92923ca3aacef63c92d ("mm: meminit: only set page reserved in the memblock region"), the MIPS hibernation is broken. Because pages in nosave data section should be "reserved", but currently they aren't set to "reserved" at initialization. This patch makes hibernation work again. Signed-off-by: Huacai Chen Cc: Aurelien Jarno Cc: Steven J . Hill Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/12888/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin --- arch/mips/kernel/setup.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c index be73c491182b..49b52035226c 100644 --- a/arch/mips/kernel/setup.c +++ b/arch/mips/kernel/setup.c @@ -686,6 +686,9 @@ static void __init arch_mem_init(char **cmdline_p) for_each_memblock(reserved, reg) if (reg->size != 0) reserve_bootmem(reg->base, reg->size, BOOTMEM_DEFAULT); + + reserve_bootmem_region(__pa_symbol(&__nosave_begin), + __pa_symbol(&__nosave_end)); /* Reserve for hibernation */ } static void __init resource_init(void) From 2612a949cf5c2a868adee1ca6bcbf01cd4e2f01e Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Mon, 18 Jan 2016 16:36:09 +0100 Subject: [PATCH 275/312] pipe: limit the per-user amount of pages allocated in pipes [ Upstream commit 759c01142a5d0f364a462346168a56de28a80f52 ] On no-so-small systems, it is possible for a single process to cause an OOM condition by filling large pipes with data that are never read. A typical process filling 4000 pipes with 1 MB of data will use 4 GB of memory. On small systems it may be tricky to set the pipe max size to prevent this from happening. This patch makes it possible to enforce a per-user soft limit above which new pipes will be limited to a single page, effectively limiting them to 4 kB each, as well as a hard limit above which no new pipes may be created for this user. This has the effect of protecting the system against memory abuse without hurting other users, and still allowing pipes to work correctly though with less data at once. The limit are controlled by two new sysctls : pipe-user-pages-soft, and pipe-user-pages-hard. Both may be disabled by setting them to zero. The default soft limit allows the default number of FDs per process (1024) to create pipes of the default size (64kB), thus reaching a limit of 64MB before starting to create only smaller pipes. With 256 processes limited to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB = 1084 MB of memory allocated for a user. The hard limit is disabled by default to avoid breaking existing applications that make intensive use of pipes (eg: for splicing). Reported-by: socketpair@gmail.com Reported-by: Tetsuo Handa Mitigates: CVE-2013-4312 (Linux 2.0+) Suggested-by: Linus Torvalds Signed-off-by: Willy Tarreau Signed-off-by: Al Viro Signed-off-by: Sasha Levin --- Documentation/sysctl/fs.txt | 23 ++++++++++++++++++ fs/pipe.c | 47 +++++++++++++++++++++++++++++++++++-- include/linux/pipe_fs_i.h | 4 ++++ include/linux/sched.h | 1 + kernel/sysctl.c | 14 +++++++++++ 5 files changed, 87 insertions(+), 2 deletions(-) diff --git a/Documentation/sysctl/fs.txt b/Documentation/sysctl/fs.txt index 88152f214f48..302b5ed616a6 100644 --- a/Documentation/sysctl/fs.txt +++ b/Documentation/sysctl/fs.txt @@ -32,6 +32,8 @@ Currently, these files are in /proc/sys/fs: - nr_open - overflowuid - overflowgid +- pipe-user-pages-hard +- pipe-user-pages-soft - protected_hardlinks - protected_symlinks - suid_dumpable @@ -159,6 +161,27 @@ The default is 65534. ============================================================== +pipe-user-pages-hard: + +Maximum total number of pages a non-privileged user may allocate for pipes. +Once this limit is reached, no new pipes may be allocated until usage goes +below the limit again. When set to 0, no limit is applied, which is the default +setting. + +============================================================== + +pipe-user-pages-soft: + +Maximum total number of pages a non-privileged user may allocate for pipes +before the pipe size gets limited to a single page. Once this limit is reached, +new pipes will be limited to a single page in size for this user in order to +limit total memory usage, and trying to increase them using fcntl() will be +denied until usage goes below the limit again. The default value allows to +allocate up to 1024 pipes at their default size. When set to 0, no limit is +applied. + +============================================================== + protected_hardlinks: A long-standing class of security issues is the hardlink-based diff --git a/fs/pipe.c b/fs/pipe.c index 8865f7963700..5916c19dbb02 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -38,6 +38,12 @@ unsigned int pipe_max_size = 1048576; */ unsigned int pipe_min_size = PAGE_SIZE; +/* Maximum allocatable pages per user. Hard limit is unset by default, soft + * matches default values. + */ +unsigned long pipe_user_pages_hard; +unsigned long pipe_user_pages_soft = PIPE_DEF_BUFFERS * INR_OPEN_CUR; + /* * We use a start+len construction, which provides full use of the * allocated memory. @@ -584,20 +590,49 @@ pipe_fasync(int fd, struct file *filp, int on) return retval; } +static void account_pipe_buffers(struct pipe_inode_info *pipe, + unsigned long old, unsigned long new) +{ + atomic_long_add(new - old, &pipe->user->pipe_bufs); +} + +static bool too_many_pipe_buffers_soft(struct user_struct *user) +{ + return pipe_user_pages_soft && + atomic_long_read(&user->pipe_bufs) >= pipe_user_pages_soft; +} + +static bool too_many_pipe_buffers_hard(struct user_struct *user) +{ + return pipe_user_pages_hard && + atomic_long_read(&user->pipe_bufs) >= pipe_user_pages_hard; +} + struct pipe_inode_info *alloc_pipe_info(void) { struct pipe_inode_info *pipe; pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL); if (pipe) { - pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * PIPE_DEF_BUFFERS, GFP_KERNEL); + unsigned long pipe_bufs = PIPE_DEF_BUFFERS; + struct user_struct *user = get_current_user(); + + if (!too_many_pipe_buffers_hard(user)) { + if (too_many_pipe_buffers_soft(user)) + pipe_bufs = 1; + pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * pipe_bufs, GFP_KERNEL); + } + if (pipe->bufs) { init_waitqueue_head(&pipe->wait); pipe->r_counter = pipe->w_counter = 1; - pipe->buffers = PIPE_DEF_BUFFERS; + pipe->buffers = pipe_bufs; + pipe->user = user; + account_pipe_buffers(pipe, 0, pipe_bufs); mutex_init(&pipe->mutex); return pipe; } + free_uid(user); kfree(pipe); } @@ -608,6 +643,8 @@ void free_pipe_info(struct pipe_inode_info *pipe) { int i; + account_pipe_buffers(pipe, pipe->buffers, 0); + free_uid(pipe->user); for (i = 0; i < pipe->buffers; i++) { struct pipe_buffer *buf = pipe->bufs + i; if (buf->ops) @@ -996,6 +1033,7 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long nr_pages) memcpy(bufs + head, pipe->bufs, tail * sizeof(struct pipe_buffer)); } + account_pipe_buffers(pipe, pipe->buffers, nr_pages); pipe->curbuf = 0; kfree(pipe->bufs); pipe->bufs = bufs; @@ -1067,6 +1105,11 @@ long pipe_fcntl(struct file *file, unsigned int cmd, unsigned long arg) if (!capable(CAP_SYS_RESOURCE) && size > pipe_max_size) { ret = -EPERM; goto out; + } else if ((too_many_pipe_buffers_hard(pipe->user) || + too_many_pipe_buffers_soft(pipe->user)) && + !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) { + ret = -EPERM; + goto out; } ret = pipe_set_size(pipe, nr_pages); break; diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h index eb8b8ac6df3c..24f5470d3944 100644 --- a/include/linux/pipe_fs_i.h +++ b/include/linux/pipe_fs_i.h @@ -42,6 +42,7 @@ struct pipe_buffer { * @fasync_readers: reader side fasync * @fasync_writers: writer side fasync * @bufs: the circular array of pipe buffers + * @user: the user who created this pipe **/ struct pipe_inode_info { struct mutex mutex; @@ -57,6 +58,7 @@ struct pipe_inode_info { struct fasync_struct *fasync_readers; struct fasync_struct *fasync_writers; struct pipe_buffer *bufs; + struct user_struct *user; }; /* @@ -123,6 +125,8 @@ void pipe_unlock(struct pipe_inode_info *); void pipe_double_lock(struct pipe_inode_info *, struct pipe_inode_info *); extern unsigned int pipe_max_size, pipe_min_size; +extern unsigned long pipe_user_pages_hard; +extern unsigned long pipe_user_pages_soft; int pipe_proc_fn(struct ctl_table *, int, void __user *, size_t *, loff_t *); diff --git a/include/linux/sched.h b/include/linux/sched.h index 9128b4e9f541..9e39deaeddd6 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -803,6 +803,7 @@ struct user_struct { #endif unsigned long locked_shm; /* How many pages of mlocked shm ? */ unsigned long unix_inflight; /* How many files in flight in unix sockets */ + atomic_long_t pipe_bufs; /* how many pages are allocated in pipe buffers */ #ifdef CONFIG_KEYS struct key *uid_keyring; /* UID specific keyring */ diff --git a/kernel/sysctl.c b/kernel/sysctl.c index c3eee4c6d6c1..7d4900404c94 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -1695,6 +1695,20 @@ static struct ctl_table fs_table[] = { .proc_handler = &pipe_proc_fn, .extra1 = &pipe_min_size, }, + { + .procname = "pipe-user-pages-hard", + .data = &pipe_user_pages_hard, + .maxlen = sizeof(pipe_user_pages_hard), + .mode = 0644, + .proc_handler = proc_doulongvec_minmax, + }, + { + .procname = "pipe-user-pages-soft", + .data = &pipe_user_pages_soft, + .maxlen = sizeof(pipe_user_pages_soft), + .mode = 0644, + .proc_handler = proc_doulongvec_minmax, + }, { } }; From 186e7c38727f7a9fecbf238bbff9675e83842a99 Mon Sep 17 00:00:00 2001 From: Eric Sandeen Date: Mon, 4 Jan 2016 16:10:19 +1100 Subject: [PATCH 276/312] xfs: print name of verifier if it fails [ Upstream commit 233135b763db7c64d07b728a9c66745fb0376275 ] This adds a name to each buf_ops structure, so that if a verifier fails we can print the type of verifier that failed it. Should be a slight debugging aid, I hope. Signed-off-by: Eric Sandeen Reviewed-by: Brian Foster Signed-off-by: Dave Chinner Signed-off-by: Sasha Levin --- fs/xfs/libxfs/xfs_alloc.c | 2 ++ fs/xfs/libxfs/xfs_alloc_btree.c | 1 + fs/xfs/libxfs/xfs_attr_leaf.c | 1 + fs/xfs/libxfs/xfs_attr_remote.c | 1 + fs/xfs/libxfs/xfs_bmap_btree.c | 1 + fs/xfs/libxfs/xfs_da_btree.c | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 1 + fs/xfs/libxfs/xfs_dir2_data.c | 2 ++ fs/xfs/libxfs/xfs_dir2_leaf.c | 2 ++ fs/xfs/libxfs/xfs_dir2_node.c | 1 + fs/xfs/libxfs/xfs_dquot_buf.c | 1 + fs/xfs/libxfs/xfs_ialloc.c | 1 + fs/xfs/libxfs/xfs_ialloc_btree.c | 1 + fs/xfs/libxfs/xfs_inode_buf.c | 2 ++ fs/xfs/libxfs/xfs_sb.c | 2 ++ fs/xfs/libxfs/xfs_symlink_remote.c | 1 + fs/xfs/xfs_buf.h | 1 + fs/xfs/xfs_error.c | 4 ++-- 18 files changed, 24 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 516162be1398..e1b1c8278294 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -519,6 +519,7 @@ xfs_agfl_write_verify( } const struct xfs_buf_ops xfs_agfl_buf_ops = { + .name = "xfs_agfl", .verify_read = xfs_agfl_read_verify, .verify_write = xfs_agfl_write_verify, }; @@ -2276,6 +2277,7 @@ xfs_agf_write_verify( } const struct xfs_buf_ops xfs_agf_buf_ops = { + .name = "xfs_agf", .verify_read = xfs_agf_read_verify, .verify_write = xfs_agf_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c index 59d521c09a17..13629ad8a60c 100644 --- a/fs/xfs/libxfs/xfs_alloc_btree.c +++ b/fs/xfs/libxfs/xfs_alloc_btree.c @@ -379,6 +379,7 @@ xfs_allocbt_write_verify( } const struct xfs_buf_ops xfs_allocbt_buf_ops = { + .name = "xfs_allocbt", .verify_read = xfs_allocbt_read_verify, .verify_write = xfs_allocbt_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c index e9d401ce93bb..0532561a6010 100644 --- a/fs/xfs/libxfs/xfs_attr_leaf.c +++ b/fs/xfs/libxfs/xfs_attr_leaf.c @@ -325,6 +325,7 @@ xfs_attr3_leaf_read_verify( } const struct xfs_buf_ops xfs_attr3_leaf_buf_ops = { + .name = "xfs_attr3_leaf", .verify_read = xfs_attr3_leaf_read_verify, .verify_write = xfs_attr3_leaf_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c index dd714037c322..c3db53d1bdb3 100644 --- a/fs/xfs/libxfs/xfs_attr_remote.c +++ b/fs/xfs/libxfs/xfs_attr_remote.c @@ -201,6 +201,7 @@ xfs_attr3_rmt_write_verify( } const struct xfs_buf_ops xfs_attr3_rmt_buf_ops = { + .name = "xfs_attr3_rmt", .verify_read = xfs_attr3_rmt_read_verify, .verify_write = xfs_attr3_rmt_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c index 2c44c8e50782..225f2a8c0436 100644 --- a/fs/xfs/libxfs/xfs_bmap_btree.c +++ b/fs/xfs/libxfs/xfs_bmap_btree.c @@ -719,6 +719,7 @@ xfs_bmbt_write_verify( } const struct xfs_buf_ops xfs_bmbt_buf_ops = { + .name = "xfs_bmbt", .verify_read = xfs_bmbt_read_verify, .verify_write = xfs_bmbt_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index 2385f8cd08ab..5d1827056efb 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -241,6 +241,7 @@ xfs_da3_node_read_verify( } const struct xfs_buf_ops xfs_da3_node_buf_ops = { + .name = "xfs_da3_node", .verify_read = xfs_da3_node_read_verify, .verify_write = xfs_da3_node_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c index 9354e190b82e..a02ee011c8da 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -120,6 +120,7 @@ xfs_dir3_block_write_verify( } const struct xfs_buf_ops xfs_dir3_block_buf_ops = { + .name = "xfs_dir3_block", .verify_read = xfs_dir3_block_read_verify, .verify_write = xfs_dir3_block_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_dir2_data.c b/fs/xfs/libxfs/xfs_dir2_data.c index 534bbf283d6b..e020a2c3a73f 100644 --- a/fs/xfs/libxfs/xfs_dir2_data.c +++ b/fs/xfs/libxfs/xfs_dir2_data.c @@ -302,11 +302,13 @@ xfs_dir3_data_write_verify( } const struct xfs_buf_ops xfs_dir3_data_buf_ops = { + .name = "xfs_dir3_data", .verify_read = xfs_dir3_data_read_verify, .verify_write = xfs_dir3_data_write_verify, }; static const struct xfs_buf_ops xfs_dir3_data_reada_buf_ops = { + .name = "xfs_dir3_data_reada", .verify_read = xfs_dir3_data_reada_verify, .verify_write = xfs_dir3_data_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c index 106119955400..eb66ae07428a 100644 --- a/fs/xfs/libxfs/xfs_dir2_leaf.c +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c @@ -242,11 +242,13 @@ xfs_dir3_leafn_write_verify( } const struct xfs_buf_ops xfs_dir3_leaf1_buf_ops = { + .name = "xfs_dir3_leaf1", .verify_read = xfs_dir3_leaf1_read_verify, .verify_write = xfs_dir3_leaf1_write_verify, }; const struct xfs_buf_ops xfs_dir3_leafn_buf_ops = { + .name = "xfs_dir3_leafn", .verify_read = xfs_dir3_leafn_read_verify, .verify_write = xfs_dir3_leafn_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index 06bb4218b362..f6e591edbb98 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -147,6 +147,7 @@ xfs_dir3_free_write_verify( } const struct xfs_buf_ops xfs_dir3_free_buf_ops = { + .name = "xfs_dir3_free", .verify_read = xfs_dir3_free_read_verify, .verify_write = xfs_dir3_free_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_dquot_buf.c b/fs/xfs/libxfs/xfs_dquot_buf.c index 48aff071591d..f48c3040c9ce 100644 --- a/fs/xfs/libxfs/xfs_dquot_buf.c +++ b/fs/xfs/libxfs/xfs_dquot_buf.c @@ -301,6 +301,7 @@ xfs_dquot_buf_write_verify( } const struct xfs_buf_ops xfs_dquot_buf_ops = { + .name = "xfs_dquot", .verify_read = xfs_dquot_buf_read_verify, .verify_write = xfs_dquot_buf_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index 1c9e75521250..fe20c2670f6c 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -2117,6 +2117,7 @@ xfs_agi_write_verify( } const struct xfs_buf_ops xfs_agi_buf_ops = { + .name = "xfs_agi", .verify_read = xfs_agi_read_verify, .verify_write = xfs_agi_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c index 964c465ca69c..216a6f0997f6 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.c +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c @@ -295,6 +295,7 @@ xfs_inobt_write_verify( } const struct xfs_buf_ops xfs_inobt_buf_ops = { + .name = "xfs_inobt", .verify_read = xfs_inobt_read_verify, .verify_write = xfs_inobt_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 7da6d0b2c2ed..a217176fde65 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -138,11 +138,13 @@ xfs_inode_buf_write_verify( } const struct xfs_buf_ops xfs_inode_buf_ops = { + .name = "xfs_inode", .verify_read = xfs_inode_buf_read_verify, .verify_write = xfs_inode_buf_write_verify, }; const struct xfs_buf_ops xfs_inode_buf_ra_ops = { + .name = "xxfs_inode_ra", .verify_read = xfs_inode_buf_readahead_verify, .verify_write = xfs_inode_buf_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index dc4bfc5d88fc..535bd843f2f4 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -637,11 +637,13 @@ xfs_sb_write_verify( } const struct xfs_buf_ops xfs_sb_buf_ops = { + .name = "xfs_sb", .verify_read = xfs_sb_read_verify, .verify_write = xfs_sb_write_verify, }; const struct xfs_buf_ops xfs_sb_quiet_buf_ops = { + .name = "xfs_sb_quiet", .verify_read = xfs_sb_quiet_read_verify, .verify_write = xfs_sb_write_verify, }; diff --git a/fs/xfs/libxfs/xfs_symlink_remote.c b/fs/xfs/libxfs/xfs_symlink_remote.c index e7e26bd6468f..4caff91ced51 100644 --- a/fs/xfs/libxfs/xfs_symlink_remote.c +++ b/fs/xfs/libxfs/xfs_symlink_remote.c @@ -164,6 +164,7 @@ xfs_symlink_write_verify( } const struct xfs_buf_ops xfs_symlink_buf_ops = { + .name = "xfs_symlink", .verify_read = xfs_symlink_read_verify, .verify_write = xfs_symlink_write_verify, }; diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h index 75ff5d5a7d2e..110cb85e04f3 100644 --- a/fs/xfs/xfs_buf.h +++ b/fs/xfs/xfs_buf.h @@ -131,6 +131,7 @@ struct xfs_buf_map { struct xfs_buf_map (map) = { .bm_bn = (blkno), .bm_len = (numblk) }; struct xfs_buf_ops { + char *name; void (*verify_read)(struct xfs_buf *); void (*verify_write)(struct xfs_buf *); }; diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c index 338e50bbfd1e..63db1cc2091a 100644 --- a/fs/xfs/xfs_error.c +++ b/fs/xfs/xfs_error.c @@ -164,9 +164,9 @@ xfs_verifier_error( { struct xfs_mount *mp = bp->b_target->bt_mount; - xfs_alert(mp, "Metadata %s detected at %pF, block 0x%llx", + xfs_alert(mp, "Metadata %s detected at %pF, %s block 0x%llx", bp->b_error == -EFSBADCRC ? "CRC error" : "corruption", - __return_address, bp->b_bn); + __return_address, bp->b_ops->name, bp->b_bn); xfs_alert(mp, "Unmount and run xfs_repair"); From 49956430d3d55b47e4a2d2f5f777d641cae137d6 Mon Sep 17 00:00:00 2001 From: Richard Alpe Date: Mon, 16 May 2016 11:14:54 +0200 Subject: [PATCH 277/312] tipc: check nl sock before parsing nested attributes [ Upstream commit 45e093ae2830cd1264677d47ff9a95a71f5d9f9c ] Make sure the socket for which the user is listing publication exists before parsing the socket netlink attributes. Prior to this patch a call without any socket caused a NULL pointer dereference in tipc_nl_publ_dump(). Tested-and-reported-by: Baozeng Ding Signed-off-by: Richard Alpe Acked-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/tipc/socket.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 20cc6df07157..d41d424b9913 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2804,6 +2804,9 @@ int tipc_nl_publ_dump(struct sk_buff *skb, struct netlink_callback *cb) if (err) return err; + if (!attrs[TIPC_NLA_SOCK]) + return -EINVAL; + err = nla_parse_nested(sock, TIPC_NLA_SOCK_MAX, attrs[TIPC_NLA_SOCK], tipc_nl_sock_policy); From e39cd93be0009ae4548a737756a947d2030956ab Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Mon, 16 May 2016 17:28:16 +0800 Subject: [PATCH 278/312] netlink: Fix dump skb leak/double free [ Upstream commit 92964c79b357efd980812c4de5c1fd2ec8bb5520 ] When we free cb->skb after a dump, we do it after releasing the lock. This means that a new dump could have started in the time being and we'll end up freeing their skb instead of ours. This patch saves the skb and module before we unlock so we free the right memory. Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: Baozeng Ding Signed-off-by: Herbert Xu Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/netlink/af_netlink.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 0c29986ecd87..dbc32b19c574 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2683,6 +2683,7 @@ static int netlink_dump(struct sock *sk) struct netlink_callback *cb; struct sk_buff *skb = NULL; struct nlmsghdr *nlh; + struct module *module; int len, err = -ENOBUFS; int alloc_min_size; int alloc_size; @@ -2762,9 +2763,11 @@ static int netlink_dump(struct sock *sk) cb->done(cb); nlk->cb_running = false; + module = cb->module; + skb = cb->skb; mutex_unlock(nlk->cb_mutex); - module_put(cb->module); - consume_skb(cb->skb); + module_put(module); + consume_skb(skb); return 0; errout_skb: From 8454d6443c84ee3501de20b9ff2034ea4767a0da Mon Sep 17 00:00:00 2001 From: Richard Alpe Date: Tue, 17 May 2016 16:57:37 +0200 Subject: [PATCH 279/312] tipc: fix nametable publication field in nl compat [ Upstream commit 03aaaa9b941e136757b55c4cf775aab6068dfd94 ] The publication field of the old netlink API should contain the publication key and not the publication reference. Fixes: 44a8ae94fd55 (tipc: convert legacy nl name table dump to nl compat) Signed-off-by: Richard Alpe Acked-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/tipc/netlink_compat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c index ce9121e8e990..e5ec86dd8dc1 100644 --- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -712,7 +712,7 @@ static int tipc_nl_compat_name_table_dump(struct tipc_nl_compat_msg *msg, goto out; tipc_tlv_sprintf(msg->rep, "%-10u %s", - nla_get_u32(publ[TIPC_NLA_PUBL_REF]), + nla_get_u32(publ[TIPC_NLA_PUBL_KEY]), scope_str[nla_get_u32(publ[TIPC_NLA_PUBL_SCOPE])]); out: tipc_tlv_sprintf(msg->rep, "\n"); From 27b56c6154943a860c9266ec8f3dc96ba2b1aa19 Mon Sep 17 00:00:00 2001 From: Jason Wang Date: Thu, 19 May 2016 13:36:51 +0800 Subject: [PATCH 280/312] tuntap: correctly wake up process during uninit [ Upstream commit addf8fc4acb1cf79492ac64966f07178793cb3d7 ] We used to check dev->reg_state against NETREG_REGISTERED after each time we are woke up. But after commit 9e641bdcfa4e ("net-tun: restructure tun_do_read for better sleep/wakeup efficiency"), it uses skb_recv_datagram() which does not check dev->reg_state. This will result if we delete a tun/tap device after a process is blocked in the reading. The device will wait for the reference count which was held by that process for ever. Fixes this by using RCV_SHUTDOWN which will be checked during sk_recv_datagram() before trying to wake up the process during uninit. Fixes: 9e641bdcfa4e ("net-tun: restructure tun_do_read for better sleep/wakeup efficiency") Cc: Eric Dumazet Cc: Xi Wang Cc: Michael S. Tsirkin Signed-off-by: Jason Wang Acked-by: Eric Dumazet Acked-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/tun.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index e470ae59d405..01f5ff84cf6b 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -516,11 +516,13 @@ static void tun_detach_all(struct net_device *dev) for (i = 0; i < n; i++) { tfile = rtnl_dereference(tun->tfiles[i]); BUG_ON(!tfile); + tfile->socket.sk->sk_shutdown = RCV_SHUTDOWN; tfile->socket.sk->sk_data_ready(tfile->socket.sk); RCU_INIT_POINTER(tfile->tun, NULL); --tun->numqueues; } list_for_each_entry(tfile, &tun->disabled, next) { + tfile->socket.sk->sk_shutdown = RCV_SHUTDOWN; tfile->socket.sk->sk_data_ready(tfile->socket.sk); RCU_INIT_POINTER(tfile->tun, NULL); } @@ -575,6 +577,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte goto out; } tfile->queue_index = tun->numqueues; + tfile->socket.sk->sk_shutdown &= ~RCV_SHUTDOWN; rcu_assign_pointer(tfile->tun, tun); rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile); tun->numqueues++; @@ -1357,9 +1360,6 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile, if (!iov_iter_count(to)) return 0; - if (tun->dev->reg_state != NETREG_REGISTERED) - return -EIO; - /* Read frames from queue */ skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0, &peeked, &off, &err); From e2de678c191464580160b68b896d9078aca8b2fc Mon Sep 17 00:00:00 2001 From: Edward Cree Date: Tue, 24 May 2016 18:53:36 +0100 Subject: [PATCH 281/312] sfc: on MC reset, clear PIO buffer linkage in TXQs [ Upstream commit c0795bf64cba4d1b796fdc5b74b33772841ed1bb ] Otherwise, if we fail to allocate new PIO buffers, our TXQs will try to use the old ones, which aren't there any more. Fixes: 183233bec810 "sfc: Allocate and link PIO buffers; map them with write-combining" Signed-off-by: Edward Cree Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- drivers/net/ethernet/sfc/ef10.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index feca46efa12f..c642e201a45e 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -452,6 +452,17 @@ fail: return rc; } +static void efx_ef10_forget_old_piobufs(struct efx_nic *efx) +{ + struct efx_channel *channel; + struct efx_tx_queue *tx_queue; + + /* All our existing PIO buffers went away */ + efx_for_each_channel(channel, efx) + efx_for_each_channel_tx_queue(tx_queue, channel) + tx_queue->piobuf = NULL; +} + #else /* !EFX_USE_PIO */ static int efx_ef10_alloc_piobufs(struct efx_nic *efx, unsigned int n) @@ -468,6 +479,10 @@ static void efx_ef10_free_piobufs(struct efx_nic *efx) { } +static void efx_ef10_forget_old_piobufs(struct efx_nic *efx) +{ +} + #endif /* EFX_USE_PIO */ static void efx_ef10_remove(struct efx_nic *efx) @@ -699,6 +714,7 @@ static void efx_ef10_reset_mc_allocations(struct efx_nic *efx) nic_data->must_realloc_vis = true; nic_data->must_restore_filters = true; nic_data->must_restore_piobufs = true; + efx_ef10_forget_old_piobufs(efx); nic_data->rx_rss_context = EFX_EF10_RSS_CONTEXT_INVALID; } From daaaff2715757a241f178547cb39ae63882ad7da Mon Sep 17 00:00:00 2001 From: Yuchung Cheng Date: Mon, 6 Jun 2016 15:07:18 -0700 Subject: [PATCH 282/312] tcp: record TLP and ER timer stats in v6 stats [ Upstream commit ce3cf4ec0305919fc69a972f6c2b2efd35d36abc ] The v6 tcp stats scan do not provide TLP and ER timer information correctly like the v4 version . This patch fixes that. Fixes: 6ba8a3b19e76 ("tcp: Tail loss probe (TLP)") Fixes: eed530b6c676 ("tcp: early retransmit") Signed-off-by: Yuchung Cheng Signed-off-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- net/ipv6/tcp_ipv6.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index c1147acbc8c4..ac6c40d08ac5 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1691,7 +1691,9 @@ static void get_tcp6_sock(struct seq_file *seq, struct sock *sp, int i) destp = ntohs(inet->inet_dport); srcp = ntohs(inet->inet_sport); - if (icsk->icsk_pending == ICSK_TIME_RETRANS) { + if (icsk->icsk_pending == ICSK_TIME_RETRANS || + icsk->icsk_pending == ICSK_TIME_EARLY_RETRANS || + icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) { timer_active = 1; timer_expires = icsk->icsk_timeout; } else if (icsk->icsk_pending == ICSK_TIME_PROBE0) { From d2e4e89ae871295c539334f50368bb48f74c3caf Mon Sep 17 00:00:00 2001 From: Mike Frysinger Date: Mon, 18 Jan 2016 06:32:30 -0500 Subject: [PATCH 283/312] sparc: Fix system call tracing register handling. [ Upstream commit 1a40b95374f680625318ab61d81958e949e0afe3 ] A system call trace trigger on entry allows the tracing process to inspect and potentially change the traced process's registers. Account for that by reloading the %g1 (syscall number) and %i0-%i5 (syscall argument) values. We need to be careful to revalidate the range of %g1, and reload the system call table entry it corresponds to into %l7. Reported-by: Mike Frysinger Signed-off-by: David S. Miller Tested-by: Mike Frysinger Signed-off-by: Sasha Levin --- arch/sparc/kernel/entry.S | 17 +++++++++++++++++ arch/sparc/kernel/syscalls.S | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 53 insertions(+) diff --git a/arch/sparc/kernel/entry.S b/arch/sparc/kernel/entry.S index 33c02b15f478..a83707c83be8 100644 --- a/arch/sparc/kernel/entry.S +++ b/arch/sparc/kernel/entry.S @@ -948,7 +948,24 @@ linux_syscall_trace: cmp %o0, 0 bne 3f mov -ENOSYS, %o0 + + /* Syscall tracing can modify the registers. */ + ld [%sp + STACKFRAME_SZ + PT_G1], %g1 + sethi %hi(sys_call_table), %l7 + ld [%sp + STACKFRAME_SZ + PT_I0], %i0 + or %l7, %lo(sys_call_table), %l7 + ld [%sp + STACKFRAME_SZ + PT_I1], %i1 + ld [%sp + STACKFRAME_SZ + PT_I2], %i2 + ld [%sp + STACKFRAME_SZ + PT_I3], %i3 + ld [%sp + STACKFRAME_SZ + PT_I4], %i4 + ld [%sp + STACKFRAME_SZ + PT_I5], %i5 + cmp %g1, NR_syscalls + bgeu 3f + mov -ENOSYS, %o0 + + sll %g1, 2, %l4 mov %i0, %o0 + ld [%l7 + %l4], %l7 mov %i1, %o1 mov %i2, %o2 mov %i3, %o3 diff --git a/arch/sparc/kernel/syscalls.S b/arch/sparc/kernel/syscalls.S index bb0008927598..c4a1b5c40e4e 100644 --- a/arch/sparc/kernel/syscalls.S +++ b/arch/sparc/kernel/syscalls.S @@ -158,7 +158,25 @@ linux_syscall_trace32: add %sp, PTREGS_OFF, %o0 brnz,pn %o0, 3f mov -ENOSYS, %o0 + + /* Syscall tracing can modify the registers. */ + ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 + sethi %hi(sys_call_table32), %l7 + ldx [%sp + PTREGS_OFF + PT_V9_I0], %i0 + or %l7, %lo(sys_call_table32), %l7 + ldx [%sp + PTREGS_OFF + PT_V9_I1], %i1 + ldx [%sp + PTREGS_OFF + PT_V9_I2], %i2 + ldx [%sp + PTREGS_OFF + PT_V9_I3], %i3 + ldx [%sp + PTREGS_OFF + PT_V9_I4], %i4 + ldx [%sp + PTREGS_OFF + PT_V9_I5], %i5 + + cmp %g1, NR_syscalls + bgeu,pn %xcc, 3f + mov -ENOSYS, %o0 + + sll %g1, 2, %l4 srl %i0, 0, %o0 + lduw [%l7 + %l4], %l7 srl %i4, 0, %o4 srl %i1, 0, %o1 srl %i2, 0, %o2 @@ -170,7 +188,25 @@ linux_syscall_trace: add %sp, PTREGS_OFF, %o0 brnz,pn %o0, 3f mov -ENOSYS, %o0 + + /* Syscall tracing can modify the registers. */ + ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 + sethi %hi(sys_call_table64), %l7 + ldx [%sp + PTREGS_OFF + PT_V9_I0], %i0 + or %l7, %lo(sys_call_table64), %l7 + ldx [%sp + PTREGS_OFF + PT_V9_I1], %i1 + ldx [%sp + PTREGS_OFF + PT_V9_I2], %i2 + ldx [%sp + PTREGS_OFF + PT_V9_I3], %i3 + ldx [%sp + PTREGS_OFF + PT_V9_I4], %i4 + ldx [%sp + PTREGS_OFF + PT_V9_I5], %i5 + + cmp %g1, NR_syscalls + bgeu,pn %xcc, 3f + mov -ENOSYS, %o0 + + sll %g1, 2, %l4 mov %i0, %o0 + lduw [%l7 + %l4], %l7 mov %i1, %o1 mov %i2, %o2 mov %i3, %o3 From 693b8dc4b23b853c5636fc04909bff48e44758cc Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Wed, 27 Apr 2016 17:27:37 -0400 Subject: [PATCH 284/312] sparc64: Fix bootup regressions on some Kconfig combinations. [ Upstream commit 49fa5230462f9f2c4e97c81356473a6bdf06c422 ] The system call tracing bug fix mentioned in the Fixes tag below increased the amount of assembler code in the sequence of assembler files included by head_64.S This caused to total set of code to exceed 0x4000 bytes in size, which overflows the expression in head_64.S that works to place swapper_tsb at address 0x408000. When this is violated, the TSB is not properly aligned, and also the trap table is not aligned properly either. All of this together results in failed boots. So, do two things: 1) Simplify some code by using ba,a instead of ba/nop to get those bytes back. 2) Add a linker script assertion to make sure that if this happens again the build will fail. Fixes: 1a40b95374f6 ("sparc: Fix system call tracing register handling.") Reported-by: Meelis Roos Reported-by: Joerg Abraham Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/kernel/cherrs.S | 14 +++++--------- arch/sparc/kernel/fpu_traps.S | 11 +++++------ arch/sparc/kernel/head_64.S | 24 ++++++++---------------- arch/sparc/kernel/misctrap.S | 12 ++++-------- arch/sparc/kernel/spiterrs.S | 18 ++++++------------ arch/sparc/kernel/utrap.S | 3 +-- arch/sparc/kernel/vmlinux.lds.S | 4 ++++ arch/sparc/kernel/winfixup.S | 3 +-- 8 files changed, 34 insertions(+), 55 deletions(-) diff --git a/arch/sparc/kernel/cherrs.S b/arch/sparc/kernel/cherrs.S index 4ee1ad420862..655628def68e 100644 --- a/arch/sparc/kernel/cherrs.S +++ b/arch/sparc/kernel/cherrs.S @@ -214,8 +214,7 @@ do_dcpe_tl1_nonfatal: /* Ok we may use interrupt globals safely. */ subcc %g1, %g2, %g1 ! Next cacheline bge,pt %icc, 1b nop - ba,pt %xcc, dcpe_icpe_tl1_common - nop + ba,a,pt %xcc, dcpe_icpe_tl1_common do_dcpe_tl1_fatal: sethi %hi(1f), %g7 @@ -224,8 +223,7 @@ do_dcpe_tl1_fatal: mov 0x2, %o0 call cheetah_plus_parity_error add %sp, PTREGS_OFF, %o1 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size do_dcpe_tl1,.-do_dcpe_tl1 .globl do_icpe_tl1 @@ -259,8 +257,7 @@ do_icpe_tl1_nonfatal: /* Ok we may use interrupt globals safely. */ subcc %g1, %g2, %g1 bge,pt %icc, 1b nop - ba,pt %xcc, dcpe_icpe_tl1_common - nop + ba,a,pt %xcc, dcpe_icpe_tl1_common do_icpe_tl1_fatal: sethi %hi(1f), %g7 @@ -269,8 +266,7 @@ do_icpe_tl1_fatal: mov 0x3, %o0 call cheetah_plus_parity_error add %sp, PTREGS_OFF, %o1 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size do_icpe_tl1,.-do_icpe_tl1 .type dcpe_icpe_tl1_common,#function @@ -456,7 +452,7 @@ __cheetah_log_error: cmp %g2, 0x63 be c_cee nop - ba,pt %xcc, c_deferred + ba,a,pt %xcc, c_deferred .size __cheetah_log_error,.-__cheetah_log_error /* Cheetah FECC trap handling, we get here from tl{0,1}_fecc diff --git a/arch/sparc/kernel/fpu_traps.S b/arch/sparc/kernel/fpu_traps.S index a6864826a4bd..336d2750fe78 100644 --- a/arch/sparc/kernel/fpu_traps.S +++ b/arch/sparc/kernel/fpu_traps.S @@ -100,8 +100,8 @@ do_fpdis: fmuld %f0, %f2, %f26 faddd %f0, %f2, %f28 fmuld %f0, %f2, %f30 - b,pt %xcc, fpdis_exit - nop + ba,a,pt %xcc, fpdis_exit + 2: andcc %g5, FPRS_DU, %g0 bne,pt %icc, 3f fzero %f32 @@ -144,8 +144,8 @@ do_fpdis: fmuld %f32, %f34, %f58 faddd %f32, %f34, %f60 fmuld %f32, %f34, %f62 - ba,pt %xcc, fpdis_exit - nop + ba,a,pt %xcc, fpdis_exit + 3: mov SECONDARY_CONTEXT, %g3 add %g6, TI_FPREGS, %g1 @@ -197,8 +197,7 @@ fpdis_exit2: fp_other_bounce: call do_fpother add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size fp_other_bounce,.-fp_other_bounce .align 32 diff --git a/arch/sparc/kernel/head_64.S b/arch/sparc/kernel/head_64.S index 3d61fcae7ee3..8ff57630a486 100644 --- a/arch/sparc/kernel/head_64.S +++ b/arch/sparc/kernel/head_64.S @@ -461,9 +461,8 @@ sun4v_chip_type: subcc %g3, 1, %g3 bne,pt %xcc, 41b add %g1, 1, %g1 - mov SUN4V_CHIP_SPARC64X, %g4 ba,pt %xcc, 5f - nop + mov SUN4V_CHIP_SPARC64X, %g4 49: mov SUN4V_CHIP_UNKNOWN, %g4 @@ -548,8 +547,7 @@ sun4u_init: stxa %g0, [%g7] ASI_DMMU membar #Sync - ba,pt %xcc, sun4u_continue - nop + ba,a,pt %xcc, sun4u_continue sun4v_init: /* Set ctx 0 */ @@ -560,14 +558,12 @@ sun4v_init: mov SECONDARY_CONTEXT, %g7 stxa %g0, [%g7] ASI_MMU membar #Sync - ba,pt %xcc, niagara_tlb_fixup - nop + ba,a,pt %xcc, niagara_tlb_fixup sun4u_continue: BRANCH_IF_ANY_CHEETAH(g1, g7, cheetah_tlb_fixup) - ba,pt %xcc, spitfire_tlb_fixup - nop + ba,a,pt %xcc, spitfire_tlb_fixup niagara_tlb_fixup: mov 3, %g2 /* Set TLB type to hypervisor. */ @@ -639,8 +635,7 @@ niagara_patch: call hypervisor_patch_cachetlbops nop - ba,pt %xcc, tlb_fixup_done - nop + ba,a,pt %xcc, tlb_fixup_done cheetah_tlb_fixup: mov 2, %g2 /* Set TLB type to cheetah+. */ @@ -659,8 +654,7 @@ cheetah_tlb_fixup: call cheetah_patch_cachetlbops nop - ba,pt %xcc, tlb_fixup_done - nop + ba,a,pt %xcc, tlb_fixup_done spitfire_tlb_fixup: /* Set TLB type to spitfire. */ @@ -782,8 +776,7 @@ setup_trap_table: call %o1 add %sp, (2047 + 128), %o0 - ba,pt %xcc, 2f - nop + ba,a,pt %xcc, 2f 1: sethi %hi(sparc64_ttable_tl0), %o0 set prom_set_trap_table_name, %g2 @@ -822,8 +815,7 @@ setup_trap_table: BRANCH_IF_ANY_CHEETAH(o2, o3, 1f) - ba,pt %xcc, 2f - nop + ba,a,pt %xcc, 2f /* Disable STICK_INT interrupts. */ 1: diff --git a/arch/sparc/kernel/misctrap.S b/arch/sparc/kernel/misctrap.S index 753b4f031bfb..34b4933900bf 100644 --- a/arch/sparc/kernel/misctrap.S +++ b/arch/sparc/kernel/misctrap.S @@ -18,8 +18,7 @@ __do_privact: 109: or %g7, %lo(109b), %g7 call do_privact add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size __do_privact,.-__do_privact .type do_mna,#function @@ -46,8 +45,7 @@ do_mna: mov %l5, %o2 call mem_address_unaligned add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size do_mna,.-do_mna .type do_lddfmna,#function @@ -65,8 +63,7 @@ do_lddfmna: mov %l5, %o2 call handle_lddfmna add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size do_lddfmna,.-do_lddfmna .type do_stdfmna,#function @@ -84,8 +81,7 @@ do_stdfmna: mov %l5, %o2 call handle_stdfmna add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size do_stdfmna,.-do_stdfmna .type breakpoint_trap,#function diff --git a/arch/sparc/kernel/spiterrs.S b/arch/sparc/kernel/spiterrs.S index c357e40ffd01..4a73009f66a5 100644 --- a/arch/sparc/kernel/spiterrs.S +++ b/arch/sparc/kernel/spiterrs.S @@ -85,8 +85,7 @@ __spitfire_cee_trap_continue: ba,pt %xcc, etraptl1 rd %pc, %g7 - ba,pt %xcc, 2f - nop + ba,a,pt %xcc, 2f 1: ba,pt %xcc, etrap_irq rd %pc, %g7 @@ -100,8 +99,7 @@ __spitfire_cee_trap_continue: mov %l5, %o2 call spitfire_access_error add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size __spitfire_access_error,.-__spitfire_access_error /* This is the trap handler entry point for ECC correctable @@ -179,8 +177,7 @@ __spitfire_data_access_exception_tl1: mov %l5, %o2 call spitfire_data_access_exception_tl1 add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size __spitfire_data_access_exception_tl1,.-__spitfire_data_access_exception_tl1 .type __spitfire_data_access_exception,#function @@ -200,8 +197,7 @@ __spitfire_data_access_exception: mov %l5, %o2 call spitfire_data_access_exception add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size __spitfire_data_access_exception,.-__spitfire_data_access_exception .type __spitfire_insn_access_exception_tl1,#function @@ -220,8 +216,7 @@ __spitfire_insn_access_exception_tl1: mov %l5, %o2 call spitfire_insn_access_exception_tl1 add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size __spitfire_insn_access_exception_tl1,.-__spitfire_insn_access_exception_tl1 .type __spitfire_insn_access_exception,#function @@ -240,6 +235,5 @@ __spitfire_insn_access_exception: mov %l5, %o2 call spitfire_insn_access_exception add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap .size __spitfire_insn_access_exception,.-__spitfire_insn_access_exception diff --git a/arch/sparc/kernel/utrap.S b/arch/sparc/kernel/utrap.S index b7f0f3f3a909..c731e8023d3e 100644 --- a/arch/sparc/kernel/utrap.S +++ b/arch/sparc/kernel/utrap.S @@ -11,8 +11,7 @@ utrap_trap: /* %g3=handler,%g4=level */ mov %l4, %o1 call bad_trap add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap invoke_utrap: sllx %g3, 3, %g3 diff --git a/arch/sparc/kernel/vmlinux.lds.S b/arch/sparc/kernel/vmlinux.lds.S index f1a2f688b28a..4a41d412dd3d 100644 --- a/arch/sparc/kernel/vmlinux.lds.S +++ b/arch/sparc/kernel/vmlinux.lds.S @@ -33,6 +33,10 @@ ENTRY(_start) jiffies = jiffies_64; #endif +#ifdef CONFIG_SPARC64 +ASSERT((swapper_tsb == 0x0000000000408000), "Error: sparc64 early assembler too large") +#endif + SECTIONS { #ifdef CONFIG_SPARC64 diff --git a/arch/sparc/kernel/winfixup.S b/arch/sparc/kernel/winfixup.S index 1e67ce958369..855019a8590e 100644 --- a/arch/sparc/kernel/winfixup.S +++ b/arch/sparc/kernel/winfixup.S @@ -32,8 +32,7 @@ fill_fixup: rd %pc, %g7 call do_sparc64_fault add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,a,pt %xcc, rtrap /* Be very careful about usage of the trap globals here. * You cannot touch %g5 as that has the fault information. From 25f7f80eabe7a0553851990c8367aab1533b7021 Mon Sep 17 00:00:00 2001 From: Nitin Gupta Date: Tue, 5 Jan 2016 22:35:35 -0800 Subject: [PATCH 285/312] sparc64: Fix numa node distance initialization [ Upstream commit 36beca6571c941b28b0798667608239731f9bc3a ] Orabug: 22495713 Currently, NUMA node distance matrix is initialized only when a machine descriptor (MD) exists. However, sun4u machines (e.g. Sun Blade 2500) do not have an MD and thus distance values were left uninitialized. The initialization is now moved such that it happens on both sun4u and sun4v. Signed-off-by: Nitin Gupta Tested-by: Mikael Pettersson Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/mm/init_64.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 559cb744112c..b2f1e75165cb 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -1301,10 +1301,18 @@ static int __init numa_parse_sun4u(void) static int __init bootmem_init_numa(void) { + int i, j; int err = -1; numadbg("bootmem_init_numa()\n"); + /* Some sane defaults for numa latency values */ + for (i = 0; i < MAX_NUMNODES; i++) { + for (j = 0; j < MAX_NUMNODES; j++) + numa_latency[i][j] = (i == j) ? + LOCAL_DISTANCE : REMOTE_DISTANCE; + } + if (numa_enabled) { if (tlb_type == hypervisor) err = numa_parse_mdesc(); From dc316fc9cbd77586c674e92288ece8383b738dac Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 1 Mar 2016 00:25:32 -0500 Subject: [PATCH 286/312] sparc64: Fix sparc64_set_context stack handling. [ Upstream commit 397d1533b6cce0ccb5379542e2e6d079f6936c46 ] Like a signal return, we should use synchronize_user_stack() rather than flush_user_windows(). Reported-by: Ilya Malakhov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/kernel/signal_64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/sparc/kernel/signal_64.c b/arch/sparc/kernel/signal_64.c index d88beff47bab..39aaec173f66 100644 --- a/arch/sparc/kernel/signal_64.c +++ b/arch/sparc/kernel/signal_64.c @@ -52,7 +52,7 @@ asmlinkage void sparc64_set_context(struct pt_regs *regs) unsigned char fenab; int err; - flush_user_windows(); + synchronize_user_stack(); if (get_thread_wsaved() || (((unsigned long)ucp) & (sizeof(unsigned long)-1)) || (!__access_ok(ucp, sizeof(*ucp)))) From a74b2f239c2be446c4e7cdb0cdcb641e968a9ce0 Mon Sep 17 00:00:00 2001 From: Babu Moger Date: Thu, 24 Mar 2016 13:02:22 -0700 Subject: [PATCH 287/312] sparc/PCI: Fix for panic while enabling SR-IOV [ Upstream commit d0c31e02005764dae0aab130a57e9794d06b824d ] We noticed this panic while enabling SR-IOV in sparc. mlx4_core: Mellanox ConnectX core driver v2.2-1 (Jan 1 2015) mlx4_core: Initializing 0007:01:00.0 mlx4_core 0007:01:00.0: Enabling SR-IOV with 5 VFs mlx4_core: Initializing 0007:01:00.1 Unable to handle kernel NULL pointer dereference insmod(10010): Oops [#1] CPU: 391 PID: 10010 Comm: insmod Not tainted 4.1.12-32.el6uek.kdump2.sparc64 #1 TPC: I7: <__mlx4_init_one+0x324/0x500 [mlx4_core]> Call Trace: [00000000104c5ea4] __mlx4_init_one+0x324/0x500 [mlx4_core] [00000000104c613c] mlx4_init_one+0xbc/0x120 [mlx4_core] [0000000000725f14] local_pci_probe+0x34/0xa0 [0000000000726028] pci_call_probe+0xa8/0xe0 [0000000000726310] pci_device_probe+0x50/0x80 [000000000079f700] really_probe+0x140/0x420 [000000000079fa24] driver_probe_device+0x44/0xa0 [000000000079fb5c] __device_attach+0x3c/0x60 [000000000079d85c] bus_for_each_drv+0x5c/0xa0 [000000000079f588] device_attach+0x88/0xc0 [000000000071acd0] pci_bus_add_device+0x30/0x80 [0000000000736090] virtfn_add.clone.1+0x210/0x360 [00000000007364a4] sriov_enable+0x2c4/0x520 [000000000073672c] pci_enable_sriov+0x2c/0x40 [00000000104c2d58] mlx4_enable_sriov+0xf8/0x180 [mlx4_core] [00000000104c49ac] mlx4_load_one+0x42c/0xd40 [mlx4_core] Disabling lock debugging due to kernel taint Caller[00000000104c5ea4]: __mlx4_init_one+0x324/0x500 [mlx4_core] Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core] Caller[0000000000725f14]: local_pci_probe+0x34/0xa0 Caller[0000000000726028]: pci_call_probe+0xa8/0xe0 Caller[0000000000726310]: pci_device_probe+0x50/0x80 Caller[000000000079f700]: really_probe+0x140/0x420 Caller[000000000079fa24]: driver_probe_device+0x44/0xa0 Caller[000000000079fb5c]: __device_attach+0x3c/0x60 Caller[000000000079d85c]: bus_for_each_drv+0x5c/0xa0 Caller[000000000079f588]: device_attach+0x88/0xc0 Caller[000000000071acd0]: pci_bus_add_device+0x30/0x80 Caller[0000000000736090]: virtfn_add.clone.1+0x210/0x360 Caller[00000000007364a4]: sriov_enable+0x2c4/0x520 Caller[000000000073672c]: pci_enable_sriov+0x2c/0x40 Caller[00000000104c2d58]: mlx4_enable_sriov+0xf8/0x180 [mlx4_core] Caller[00000000104c49ac]: mlx4_load_one+0x42c/0xd40 [mlx4_core] Caller[00000000104c5f90]: __mlx4_init_one+0x410/0x500 [mlx4_core] Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core] Caller[0000000000725f14]: local_pci_probe+0x34/0xa0 Caller[0000000000726028]: pci_call_probe+0xa8/0xe0 Caller[0000000000726310]: pci_device_probe+0x50/0x80 Caller[000000000079f700]: really_probe+0x140/0x420 Caller[000000000079fa24]: driver_probe_device+0x44/0xa0 Caller[000000000079fb08]: __driver_attach+0x88/0xa0 Caller[000000000079d90c]: bus_for_each_dev+0x6c/0xa0 Caller[000000000079f29c]: driver_attach+0x1c/0x40 Caller[000000000079e35c]: bus_add_driver+0x17c/0x220 Caller[00000000007a02d4]: driver_register+0x74/0x120 Caller[00000000007263fc]: __pci_register_driver+0x3c/0x60 Caller[00000000104f62bc]: mlx4_init+0x60/0xcc [mlx4_core] Kernel panic - not syncing: Fatal exception Press Stop-A (L1-A) to return to the boot prom ---[ end Kernel panic - not syncing: Fatal exception Details: Here is the call sequence virtfn_add->__mlx4_init_one->dma_set_mask->dma_supported The panic happened at line 760(file arch/sparc/kernel/iommu.c) 758 int dma_supported(struct device *dev, u64 device_mask) 759 { 760 struct iommu *iommu = dev->archdata.iommu; 761 u64 dma_addr_mask = iommu->dma_addr_mask; 762 763 if (device_mask >= (1UL << 32UL)) 764 return 0; 765 766 if ((device_mask & dma_addr_mask) == dma_addr_mask) 767 return 1; 768 769 #ifdef CONFIG_PCI 770 if (dev_is_pci(dev)) 771 return pci64_dma_supported(to_pci_dev(dev), device_mask); 772 #endif 773 774 return 0; 775 } 776 EXPORT_SYMBOL(dma_supported); Same panic happened with Intel ixgbe driver also. SR-IOV code looks for arch specific data while enabling VFs. When VF device is added, driver probe function makes set of calls to initialize the pci device. Because the VF device is added different way than the normal PF device(which happens via of_create_pci_dev for sparc), some of the arch specific initialization does not happen for VF device. That causes panic when archdata is accessed. To fix this, I have used already defined weak function pcibios_setup_device to copy archdata from PF to VF. Also verified the fix. Signed-off-by: Babu Moger Signed-off-by: Sowmini Varadhan Reviewed-by: Ethan Zhao Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/kernel/pci.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c index c928bc64b4ba..991ba90beb04 100644 --- a/arch/sparc/kernel/pci.c +++ b/arch/sparc/kernel/pci.c @@ -994,6 +994,23 @@ void pcibios_set_master(struct pci_dev *dev) /* No special bus mastering setup handling */ } +#ifdef CONFIG_PCI_IOV +int pcibios_add_device(struct pci_dev *dev) +{ + struct pci_dev *pdev; + + /* Add sriov arch specific initialization here. + * Copy dev_archdata from PF to VF + */ + if (dev->is_virtfn) { + pdev = dev->physfn; + memcpy(&dev->dev.archdata, &pdev->dev.archdata, + sizeof(struct dev_archdata)); + } + return 0; +} +#endif /* CONFIG_PCI_IOV */ + static int __init pcibios_init(void) { pci_dfl_cache_line_size = 64 >> 2; From ea312fcb4efc84b28f891485d6c4aca2163c2438 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Wed, 25 May 2016 12:51:20 -0700 Subject: [PATCH 288/312] sparc64: Take ctx_alloc_lock properly in hugetlb_setup(). [ Upstream commit 9ea46abe22550e3366ff7cee2f8391b35b12f730 ] On cheetahplus chips we take the ctx_alloc_lock in order to modify the TLB lookup parameters for the indexed TLBs, which are stored in the context register. This is called with interrupts disabled, however ctx_alloc_lock is an IRQ safe lock, therefore we must take acquire/release it properly with spin_{lock,unlock}_irq(). Reported-by: Meelis Roos Tested-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/mm/init_64.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index b2f1e75165cb..71c7ace855d7 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -2770,9 +2770,10 @@ void hugetlb_setup(struct pt_regs *regs) * the Data-TLB for huge pages. */ if (tlb_type == cheetah_plus) { + bool need_context_reload = false; unsigned long ctx; - spin_lock(&ctx_alloc_lock); + spin_lock_irq(&ctx_alloc_lock); ctx = mm->context.sparc64_ctx_val; ctx &= ~CTX_PGSZ_MASK; ctx |= CTX_PGSZ_BASE << CTX_PGSZ0_SHIFT; @@ -2791,9 +2792,12 @@ void hugetlb_setup(struct pt_regs *regs) * also executing in this address space. */ mm->context.sparc64_ctx_val = ctx; - on_each_cpu(context_reload, mm, 0); + need_context_reload = true; } - spin_unlock(&ctx_alloc_lock); + spin_unlock_irq(&ctx_alloc_lock); + + if (need_context_reload) + on_each_cpu(context_reload, mm, 0); } } #endif From 6d3c93761479927258727fcdcf5e673aa5d8aaf4 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sat, 28 May 2016 21:21:31 -0700 Subject: [PATCH 289/312] sparc: Harden signal return frame checks. [ Upstream commit d11c2a0de2824395656cf8ed15811580c9dd38aa ] All signal frames must be at least 16-byte aligned, because that is the alignment we explicitly create when we build signal return stack frames. All stack pointers must be at least 8-byte aligned. Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/kernel/signal32.c | 46 ++++++++++++++++++++++------------ arch/sparc/kernel/signal_32.c | 41 +++++++++++++++++++----------- arch/sparc/kernel/signal_64.c | 31 +++++++++++++++-------- arch/sparc/kernel/sigutil_32.c | 9 ++++++- arch/sparc/kernel/sigutil_64.c | 10 ++++++-- 5 files changed, 92 insertions(+), 45 deletions(-) diff --git a/arch/sparc/kernel/signal32.c b/arch/sparc/kernel/signal32.c index 4eed773a7735..77655f0f0fc7 100644 --- a/arch/sparc/kernel/signal32.c +++ b/arch/sparc/kernel/signal32.c @@ -138,12 +138,24 @@ int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from) return 0; } +/* Checks if the fp is valid. We always build signal frames which are + * 16-byte aligned, therefore we can always enforce that the restore + * frame has that property as well. + */ +static bool invalid_frame_pointer(void __user *fp, int fplen) +{ + if ((((unsigned long) fp) & 15) || + ((unsigned long)fp) > 0x100000000ULL - fplen) + return true; + return false; +} + void do_sigreturn32(struct pt_regs *regs) { struct signal_frame32 __user *sf; compat_uptr_t fpu_save; compat_uptr_t rwin_save; - unsigned int psr; + unsigned int psr, ufp; unsigned pc, npc; sigset_t set; compat_sigset_t seta; @@ -158,11 +170,16 @@ void do_sigreturn32(struct pt_regs *regs) sf = (struct signal_frame32 __user *) regs->u_regs[UREG_FP]; /* 1. Make sure we are not getting garbage from the user */ - if (!access_ok(VERIFY_READ, sf, sizeof(*sf)) || - (((unsigned long) sf) & 3)) + if (invalid_frame_pointer(sf, sizeof(*sf))) goto segv; - if (get_user(pc, &sf->info.si_regs.pc) || + if (get_user(ufp, &sf->info.si_regs.u_regs[UREG_FP])) + goto segv; + + if (ufp & 0x7) + goto segv; + + if (__get_user(pc, &sf->info.si_regs.pc) || __get_user(npc, &sf->info.si_regs.npc)) goto segv; @@ -227,7 +244,7 @@ segv: asmlinkage void do_rt_sigreturn32(struct pt_regs *regs) { struct rt_signal_frame32 __user *sf; - unsigned int psr, pc, npc; + unsigned int psr, pc, npc, ufp; compat_uptr_t fpu_save; compat_uptr_t rwin_save; sigset_t set; @@ -242,11 +259,16 @@ asmlinkage void do_rt_sigreturn32(struct pt_regs *regs) sf = (struct rt_signal_frame32 __user *) regs->u_regs[UREG_FP]; /* 1. Make sure we are not getting garbage from the user */ - if (!access_ok(VERIFY_READ, sf, sizeof(*sf)) || - (((unsigned long) sf) & 3)) + if (invalid_frame_pointer(sf, sizeof(*sf))) goto segv; - if (get_user(pc, &sf->regs.pc) || + if (get_user(ufp, &sf->regs.u_regs[UREG_FP])) + goto segv; + + if (ufp & 0x7) + goto segv; + + if (__get_user(pc, &sf->regs.pc) || __get_user(npc, &sf->regs.npc)) goto segv; @@ -307,14 +329,6 @@ segv: force_sig(SIGSEGV, current); } -/* Checks if the fp is valid */ -static int invalid_frame_pointer(void __user *fp, int fplen) -{ - if ((((unsigned long) fp) & 7) || ((unsigned long)fp) > 0x100000000ULL - fplen) - return 1; - return 0; -} - static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, unsigned long framesize) { unsigned long sp; diff --git a/arch/sparc/kernel/signal_32.c b/arch/sparc/kernel/signal_32.c index 52aa5e4ce5e7..c3c12efe0bc0 100644 --- a/arch/sparc/kernel/signal_32.c +++ b/arch/sparc/kernel/signal_32.c @@ -60,10 +60,22 @@ struct rt_signal_frame { #define SF_ALIGNEDSZ (((sizeof(struct signal_frame) + 7) & (~7))) #define RT_ALIGNEDSZ (((sizeof(struct rt_signal_frame) + 7) & (~7))) +/* Checks if the fp is valid. We always build signal frames which are + * 16-byte aligned, therefore we can always enforce that the restore + * frame has that property as well. + */ +static inline bool invalid_frame_pointer(void __user *fp, int fplen) +{ + if ((((unsigned long) fp) & 15) || !__access_ok((unsigned long)fp, fplen)) + return true; + + return false; +} + asmlinkage void do_sigreturn(struct pt_regs *regs) { + unsigned long up_psr, pc, npc, ufp; struct signal_frame __user *sf; - unsigned long up_psr, pc, npc; sigset_t set; __siginfo_fpu_t __user *fpu_save; __siginfo_rwin_t __user *rwin_save; @@ -77,10 +89,13 @@ asmlinkage void do_sigreturn(struct pt_regs *regs) sf = (struct signal_frame __user *) regs->u_regs[UREG_FP]; /* 1. Make sure we are not getting garbage from the user */ - if (!access_ok(VERIFY_READ, sf, sizeof(*sf))) + if (!invalid_frame_pointer(sf, sizeof(*sf))) goto segv_and_exit; - if (((unsigned long) sf) & 3) + if (get_user(ufp, &sf->info.si_regs.u_regs[UREG_FP])) + goto segv_and_exit; + + if (ufp & 0x7) goto segv_and_exit; err = __get_user(pc, &sf->info.si_regs.pc); @@ -127,7 +142,7 @@ segv_and_exit: asmlinkage void do_rt_sigreturn(struct pt_regs *regs) { struct rt_signal_frame __user *sf; - unsigned int psr, pc, npc; + unsigned int psr, pc, npc, ufp; __siginfo_fpu_t __user *fpu_save; __siginfo_rwin_t __user *rwin_save; sigset_t set; @@ -135,8 +150,13 @@ asmlinkage void do_rt_sigreturn(struct pt_regs *regs) synchronize_user_stack(); sf = (struct rt_signal_frame __user *) regs->u_regs[UREG_FP]; - if (!access_ok(VERIFY_READ, sf, sizeof(*sf)) || - (((unsigned long) sf) & 0x03)) + if (!invalid_frame_pointer(sf, sizeof(*sf))) + goto segv; + + if (get_user(ufp, &sf->regs.u_regs[UREG_FP])) + goto segv; + + if (ufp & 0x7) goto segv; err = __get_user(pc, &sf->regs.pc); @@ -178,15 +198,6 @@ segv: force_sig(SIGSEGV, current); } -/* Checks if the fp is valid */ -static inline int invalid_frame_pointer(void __user *fp, int fplen) -{ - if ((((unsigned long) fp) & 7) || !__access_ok((unsigned long)fp, fplen)) - return 1; - - return 0; -} - static inline void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, unsigned long framesize) { unsigned long sp = regs->u_regs[UREG_FP]; diff --git a/arch/sparc/kernel/signal_64.c b/arch/sparc/kernel/signal_64.c index 39aaec173f66..5ee930c48f4c 100644 --- a/arch/sparc/kernel/signal_64.c +++ b/arch/sparc/kernel/signal_64.c @@ -234,6 +234,17 @@ do_sigsegv: goto out; } +/* Checks if the fp is valid. We always build rt signal frames which + * are 16-byte aligned, therefore we can always enforce that the + * restore frame has that property as well. + */ +static bool invalid_frame_pointer(void __user *fp) +{ + if (((unsigned long) fp) & 15) + return true; + return false; +} + struct rt_signal_frame { struct sparc_stackf ss; siginfo_t info; @@ -246,8 +257,8 @@ struct rt_signal_frame { void do_rt_sigreturn(struct pt_regs *regs) { + unsigned long tpc, tnpc, tstate, ufp; struct rt_signal_frame __user *sf; - unsigned long tpc, tnpc, tstate; __siginfo_fpu_t __user *fpu_save; __siginfo_rwin_t __user *rwin_save; sigset_t set; @@ -261,10 +272,16 @@ void do_rt_sigreturn(struct pt_regs *regs) (regs->u_regs [UREG_FP] + STACK_BIAS); /* 1. Make sure we are not getting garbage from the user */ - if (((unsigned long) sf) & 3) + if (invalid_frame_pointer(sf)) goto segv; - err = get_user(tpc, &sf->regs.tpc); + if (get_user(ufp, &sf->regs.u_regs[UREG_FP])) + goto segv; + + if ((ufp + STACK_BIAS) & 0x7) + goto segv; + + err = __get_user(tpc, &sf->regs.tpc); err |= __get_user(tnpc, &sf->regs.tnpc); if (test_thread_flag(TIF_32BIT)) { tpc &= 0xffffffff; @@ -308,14 +325,6 @@ segv: force_sig(SIGSEGV, current); } -/* Checks if the fp is valid */ -static int invalid_frame_pointer(void __user *fp) -{ - if (((unsigned long) fp) & 15) - return 1; - return 0; -} - static inline void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, unsigned long framesize) { unsigned long sp = regs->u_regs[UREG_FP] + STACK_BIAS; diff --git a/arch/sparc/kernel/sigutil_32.c b/arch/sparc/kernel/sigutil_32.c index 0f6eebe71e6c..e5fe8cef9a69 100644 --- a/arch/sparc/kernel/sigutil_32.c +++ b/arch/sparc/kernel/sigutil_32.c @@ -48,6 +48,10 @@ int save_fpu_state(struct pt_regs *regs, __siginfo_fpu_t __user *fpu) int restore_fpu_state(struct pt_regs *regs, __siginfo_fpu_t __user *fpu) { int err; + + if (((unsigned long) fpu) & 3) + return -EFAULT; + #ifdef CONFIG_SMP if (test_tsk_thread_flag(current, TIF_USEDFPU)) regs->psr &= ~PSR_EF; @@ -97,7 +101,10 @@ int restore_rwin_state(__siginfo_rwin_t __user *rp) struct thread_info *t = current_thread_info(); int i, wsaved, err; - __get_user(wsaved, &rp->wsaved); + if (((unsigned long) rp) & 3) + return -EFAULT; + + get_user(wsaved, &rp->wsaved); if (wsaved > NSWINS) return -EFAULT; diff --git a/arch/sparc/kernel/sigutil_64.c b/arch/sparc/kernel/sigutil_64.c index 387834a9c56a..36aadcbeac69 100644 --- a/arch/sparc/kernel/sigutil_64.c +++ b/arch/sparc/kernel/sigutil_64.c @@ -37,7 +37,10 @@ int restore_fpu_state(struct pt_regs *regs, __siginfo_fpu_t __user *fpu) unsigned long fprs; int err; - err = __get_user(fprs, &fpu->si_fprs); + if (((unsigned long) fpu) & 7) + return -EFAULT; + + err = get_user(fprs, &fpu->si_fprs); fprs_write(0); regs->tstate &= ~TSTATE_PEF; if (fprs & FPRS_DL) @@ -72,7 +75,10 @@ int restore_rwin_state(__siginfo_rwin_t __user *rp) struct thread_info *t = current_thread_info(); int i, wsaved, err; - __get_user(wsaved, &rp->wsaved); + if (((unsigned long) rp) & 7) + return -EFAULT; + + get_user(wsaved, &rp->wsaved); if (wsaved > NSWINS) return -EFAULT; From 445ff22b967197364e31d6ee7f9bdfc3955c091b Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sat, 28 May 2016 20:41:12 -0700 Subject: [PATCH 290/312] sparc64: Fix return from trap window fill crashes. [ Upstream commit 7cafc0b8bf130f038b0ec2dcdd6a9de6dc59b65a ] We must handle data access exception as well as memory address unaligned exceptions from return from trap window fill faults, not just normal TLB misses. Otherwise we can get an OOPS that looks like this: ld-linux.so.2(36808): Kernel bad sw trap 5 [#1] CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 #34 task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000 TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002 Not tainted TPC: g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001 g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001 o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358 o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c RPC: l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000 l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000 i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000 i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c I7: Call Trace: [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c The window trap handlers are slightly clever, the trap table entries for them are composed of two pieces of code. First comes the code that actually performs the window fill or spill trap handling, and then there are three instructions at the end which are for exception processing. The userland register window fill handler is: add %sp, STACK_BIAS + 0x00, %g1; \ ldxa [%g1 + %g0] ASI, %l0; \ mov 0x08, %g2; \ mov 0x10, %g3; \ ldxa [%g1 + %g2] ASI, %l1; \ mov 0x18, %g5; \ ldxa [%g1 + %g3] ASI, %l2; \ ldxa [%g1 + %g5] ASI, %l3; \ add %g1, 0x20, %g1; \ ldxa [%g1 + %g0] ASI, %l4; \ ldxa [%g1 + %g2] ASI, %l5; \ ldxa [%g1 + %g3] ASI, %l6; \ ldxa [%g1 + %g5] ASI, %l7; \ add %g1, 0x20, %g1; \ ldxa [%g1 + %g0] ASI, %i0; \ ldxa [%g1 + %g2] ASI, %i1; \ ldxa [%g1 + %g3] ASI, %i2; \ ldxa [%g1 + %g5] ASI, %i3; \ add %g1, 0x20, %g1; \ ldxa [%g1 + %g0] ASI, %i4; \ ldxa [%g1 + %g2] ASI, %i5; \ ldxa [%g1 + %g3] ASI, %i6; \ ldxa [%g1 + %g5] ASI, %i7; \ restored; \ retry; nop; nop; nop; nop; \ b,a,pt %xcc, fill_fixup_dax; \ b,a,pt %xcc, fill_fixup_mna; \ b,a,pt %xcc, fill_fixup; And the way this works is that if any of those memory accesses generate an exception, the exception handler can revector to one of those final three branch instructions depending upon which kind of exception the memory access took. In this way, the fault handler doesn't have to know if it was a spill or a fill that it's handling the fault for. It just always branches to the last instruction in the parent trap's handler. For example, for a regular fault, the code goes: winfix_trampoline: rdpr %tpc, %g3 or %g3, 0x7c, %g3 wrpr %g3, %tnpc done All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the trap time program counter, we'll get that final instruction in the trap handler. On return from trap, we have to pull the register window in but we do this by hand instead of just executing a "restore" instruction for several reasons. The largest being that from Niagara and onward we simply don't have enough levels in the trap stack to fully resolve all possible exception cases of a window fault when we are already at trap level 1 (which we enter to get ready to return from the original trap). This is executed inline via the FILL_*_RTRAP handlers. rtrap_64.S's code branches directly to these to do the window fill by hand if necessary. Now if you look at them, we'll see at the end: ba,a,pt %xcc, user_rtt_fill_fixup; ba,a,pt %xcc, user_rtt_fill_fixup; ba,a,pt %xcc, user_rtt_fill_fixup; And oops, all three cases are handled like a fault. This doesn't work because each of these trap types (data access exception, memory address unaligned, and faults) store their auxiliary info in different registers to pass on to the C handler which does the real work. So in the case where the stack was unaligned, the unaligned trap handler sets up the arg registers one way, and then we branched to the fault handler which expects them setup another way. So the FAULT_TYPE_* value ends up basically being garbage, and randomly would generate the backtrace seen above. Reported-by: Nick Alcock Signed-off-by: David S. Miller Signed-off-by: Sasha Levin --- arch/sparc/include/asm/head_64.h | 4 ++ arch/sparc/include/asm/ttable.h | 8 +-- arch/sparc/kernel/Makefile | 1 + arch/sparc/kernel/rtrap_64.S | 59 ++++--------------- arch/sparc/kernel/urtt_fill.S | 98 ++++++++++++++++++++++++++++++++ 5 files changed, 117 insertions(+), 53 deletions(-) create mode 100644 arch/sparc/kernel/urtt_fill.S diff --git a/arch/sparc/include/asm/head_64.h b/arch/sparc/include/asm/head_64.h index 10e9dabc4c41..f0700cfeedd7 100644 --- a/arch/sparc/include/asm/head_64.h +++ b/arch/sparc/include/asm/head_64.h @@ -15,6 +15,10 @@ #define PTREGS_OFF (STACK_BIAS + STACKFRAME_SZ) +#define RTRAP_PSTATE (PSTATE_TSO|PSTATE_PEF|PSTATE_PRIV|PSTATE_IE) +#define RTRAP_PSTATE_IRQOFF (PSTATE_TSO|PSTATE_PEF|PSTATE_PRIV) +#define RTRAP_PSTATE_AG_IRQOFF (PSTATE_TSO|PSTATE_PEF|PSTATE_PRIV|PSTATE_AG) + #define __CHEETAH_ID 0x003e0014 #define __JALAPENO_ID 0x003e0016 #define __SERRANO_ID 0x003e0022 diff --git a/arch/sparc/include/asm/ttable.h b/arch/sparc/include/asm/ttable.h index 71b5a67522ab..781b9f1dbdc2 100644 --- a/arch/sparc/include/asm/ttable.h +++ b/arch/sparc/include/asm/ttable.h @@ -589,8 +589,8 @@ user_rtt_fill_64bit: \ restored; \ nop; nop; nop; nop; nop; nop; \ nop; nop; nop; nop; nop; \ - ba,a,pt %xcc, user_rtt_fill_fixup; \ - ba,a,pt %xcc, user_rtt_fill_fixup; \ + ba,a,pt %xcc, user_rtt_fill_fixup_dax; \ + ba,a,pt %xcc, user_rtt_fill_fixup_mna; \ ba,a,pt %xcc, user_rtt_fill_fixup; @@ -652,8 +652,8 @@ user_rtt_fill_32bit: \ restored; \ nop; nop; nop; nop; nop; \ nop; nop; nop; \ - ba,a,pt %xcc, user_rtt_fill_fixup; \ - ba,a,pt %xcc, user_rtt_fill_fixup; \ + ba,a,pt %xcc, user_rtt_fill_fixup_dax; \ + ba,a,pt %xcc, user_rtt_fill_fixup_mna; \ ba,a,pt %xcc, user_rtt_fill_fixup; diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile index 7cf9c6ea3f1f..fdb13327fded 100644 --- a/arch/sparc/kernel/Makefile +++ b/arch/sparc/kernel/Makefile @@ -21,6 +21,7 @@ CFLAGS_REMOVE_perf_event.o := -pg CFLAGS_REMOVE_pcr.o := -pg endif +obj-$(CONFIG_SPARC64) += urtt_fill.o obj-$(CONFIG_SPARC32) += entry.o wof.o wuf.o obj-$(CONFIG_SPARC32) += etrap_32.o obj-$(CONFIG_SPARC32) += rtrap_32.o diff --git a/arch/sparc/kernel/rtrap_64.S b/arch/sparc/kernel/rtrap_64.S index 39f0c662f4c8..8de386dc8150 100644 --- a/arch/sparc/kernel/rtrap_64.S +++ b/arch/sparc/kernel/rtrap_64.S @@ -14,10 +14,6 @@ #include #include -#define RTRAP_PSTATE (PSTATE_TSO|PSTATE_PEF|PSTATE_PRIV|PSTATE_IE) -#define RTRAP_PSTATE_IRQOFF (PSTATE_TSO|PSTATE_PEF|PSTATE_PRIV) -#define RTRAP_PSTATE_AG_IRQOFF (PSTATE_TSO|PSTATE_PEF|PSTATE_PRIV|PSTATE_AG) - #ifdef CONFIG_CONTEXT_TRACKING # define SCHEDULE_USER schedule_user #else @@ -236,52 +232,17 @@ rt_continue: ldx [%sp + PTREGS_OFF + PT_V9_G1], %g1 wrpr %g1, %cwp ba,a,pt %xcc, user_rtt_fill_64bit +user_rtt_fill_fixup_dax: + ba,pt %xcc, user_rtt_fill_fixup_common + mov 1, %g3 + +user_rtt_fill_fixup_mna: + ba,pt %xcc, user_rtt_fill_fixup_common + mov 2, %g3 + user_rtt_fill_fixup: - rdpr %cwp, %g1 - add %g1, 1, %g1 - wrpr %g1, 0x0, %cwp - - rdpr %wstate, %g2 - sll %g2, 3, %g2 - wrpr %g2, 0x0, %wstate - - /* We know %canrestore and %otherwin are both zero. */ - - sethi %hi(sparc64_kern_pri_context), %g2 - ldx [%g2 + %lo(sparc64_kern_pri_context)], %g2 - mov PRIMARY_CONTEXT, %g1 - -661: stxa %g2, [%g1] ASI_DMMU - .section .sun4v_1insn_patch, "ax" - .word 661b - stxa %g2, [%g1] ASI_MMU - .previous - - sethi %hi(KERNBASE), %g1 - flush %g1 - - or %g4, FAULT_CODE_WINFIXUP, %g4 - stb %g4, [%g6 + TI_FAULT_CODE] - stx %g5, [%g6 + TI_FAULT_ADDR] - - mov %g6, %l1 - wrpr %g0, 0x0, %tl - -661: nop - .section .sun4v_1insn_patch, "ax" - .word 661b - SET_GL(0) - .previous - - wrpr %g0, RTRAP_PSTATE, %pstate - - mov %l1, %g6 - ldx [%g6 + TI_TASK], %g4 - LOAD_PER_CPU_BASE(%g5, %g6, %g1, %g2, %g3) - call do_sparc64_fault - add %sp, PTREGS_OFF, %o0 - ba,pt %xcc, rtrap - nop + ba,pt %xcc, user_rtt_fill_fixup_common + clr %g3 user_rtt_pre_restore: add %g1, 1, %g1 diff --git a/arch/sparc/kernel/urtt_fill.S b/arch/sparc/kernel/urtt_fill.S new file mode 100644 index 000000000000..5604a2b051d4 --- /dev/null +++ b/arch/sparc/kernel/urtt_fill.S @@ -0,0 +1,98 @@ +#include +#include +#include +#include +#include + + .text + .align 8 + .globl user_rtt_fill_fixup_common +user_rtt_fill_fixup_common: + rdpr %cwp, %g1 + add %g1, 1, %g1 + wrpr %g1, 0x0, %cwp + + rdpr %wstate, %g2 + sll %g2, 3, %g2 + wrpr %g2, 0x0, %wstate + + /* We know %canrestore and %otherwin are both zero. */ + + sethi %hi(sparc64_kern_pri_context), %g2 + ldx [%g2 + %lo(sparc64_kern_pri_context)], %g2 + mov PRIMARY_CONTEXT, %g1 + +661: stxa %g2, [%g1] ASI_DMMU + .section .sun4v_1insn_patch, "ax" + .word 661b + stxa %g2, [%g1] ASI_MMU + .previous + + sethi %hi(KERNBASE), %g1 + flush %g1 + + mov %g4, %l4 + mov %g5, %l5 + brnz,pn %g3, 1f + mov %g3, %l3 + + or %g4, FAULT_CODE_WINFIXUP, %g4 + stb %g4, [%g6 + TI_FAULT_CODE] + stx %g5, [%g6 + TI_FAULT_ADDR] +1: + mov %g6, %l1 + wrpr %g0, 0x0, %tl + +661: nop + .section .sun4v_1insn_patch, "ax" + .word 661b + SET_GL(0) + .previous + + wrpr %g0, RTRAP_PSTATE, %pstate + + mov %l1, %g6 + ldx [%g6 + TI_TASK], %g4 + LOAD_PER_CPU_BASE(%g5, %g6, %g1, %g2, %g3) + + brnz,pn %l3, 1f + nop + + call do_sparc64_fault + add %sp, PTREGS_OFF, %o0 + ba,pt %xcc, rtrap + nop + +1: cmp %g3, 2 + bne,pn %xcc, 2f + nop + + sethi %hi(tlb_type), %g1 + lduw [%g1 + %lo(tlb_type)], %g1 + cmp %g1, 3 + bne,pt %icc, 1f + add %sp, PTREGS_OFF, %o0 + mov %l4, %o2 + call sun4v_do_mna + mov %l5, %o1 + ba,a,pt %xcc, rtrap +1: mov %l4, %o1 + mov %l5, %o2 + call mem_address_unaligned + nop + ba,a,pt %xcc, rtrap + +2: sethi %hi(tlb_type), %g1 + mov %l4, %o1 + lduw [%g1 + %lo(tlb_type)], %g1 + mov %l5, %o2 + cmp %g1, 3 + bne,pt %icc, 1f + add %sp, PTREGS_OFF, %o0 + call sun4v_data_access_exception + nop + ba,a,pt %xcc, rtrap + +1: call spitfire_data_access_exception + nop + ba,a,pt %xcc, rtrap From 1a1f239be5b547551adc3bc871521c21aa7966ab Mon Sep 17 00:00:00 2001 From: Ralf Baechle Date: Thu, 4 Feb 2016 01:24:40 +0100 Subject: [PATCH 291/312] MIPS: Fix 64k page support for 32 bit kernels. [ Upstream commit d7de413475f443957a0c1d256e405d19b3a2cb22 ] TASK_SIZE was defined as 0x7fff8000UL which for 64k pages is not a multiple of the page size. Somewhere further down the math fails such that executing an ELF binary fails. Signed-off-by: Ralf Baechle Tested-by: Joshua Henderson Signed-off-by: Sasha Levin --- arch/mips/include/asm/processor.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h index 9b3b48e21c22..6ec90a19be70 100644 --- a/arch/mips/include/asm/processor.h +++ b/arch/mips/include/asm/processor.h @@ -51,7 +51,7 @@ extern unsigned int vced_count, vcei_count; * User space process size: 2GB. This is hardcoded into a few places, * so don't change it unless you know what you are doing. */ -#define TASK_SIZE 0x7fff8000UL +#define TASK_SIZE 0x80000000UL #endif #define STACK_TOP_MAX TASK_SIZE From 780daa25f811f1aeefb9da76ef93c32b17a5f102 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 22 Mar 2016 18:02:49 +0100 Subject: [PATCH 292/312] netfilter: x_tables: validate e->target_offset early [ Upstream commit bdf533de6968e9686df777dc178486f600c6e617 ] We should check that e->target_offset is sane before mark_source_chains gets called since it will fetch the target entry for loop detection. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 17 ++++++++--------- net/ipv4/netfilter/ip_tables.c | 17 ++++++++--------- net/ipv6/netfilter/ip6_tables.c | 17 ++++++++--------- 3 files changed, 24 insertions(+), 27 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index a61200754f4b..e500f4fcf868 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -469,14 +469,12 @@ static int mark_source_chains(const struct xt_table_info *newinfo, return 1; } -static inline int check_entry(const struct arpt_entry *e, const char *name) +static inline int check_entry(const struct arpt_entry *e) { const struct xt_entry_target *t; - if (!arp_checkentry(&e->arp)) { - duprintf("arp_tables: arp check failed %p %s.\n", e, name); + if (!arp_checkentry(&e->arp)) return -EINVAL; - } if (e->target_offset + sizeof(struct xt_entry_target) > e->next_offset) return -EINVAL; @@ -517,10 +515,6 @@ find_check_entry(struct arpt_entry *e, const char *name, unsigned int size) struct xt_target *target; int ret; - ret = check_entry(e, name); - if (ret) - return ret; - t = arpt_get_target(e); target = xt_request_find_target(NFPROTO_ARP, t->u.user.name, t->u.user.revision); @@ -565,6 +559,7 @@ static inline int check_entry_size_and_hooks(struct arpt_entry *e, unsigned int valid_hooks) { unsigned int h; + int err; if ((unsigned long)e % __alignof__(struct arpt_entry) != 0 || (unsigned char *)e + sizeof(struct arpt_entry) >= limit) { @@ -579,6 +574,10 @@ static inline int check_entry_size_and_hooks(struct arpt_entry *e, return -EINVAL; } + err = check_entry(e); + if (err) + return err; + /* Check hooks & underflows */ for (h = 0; h < NF_ARP_NUMHOOKS; h++) { if (!(valid_hooks & (1 << h))) @@ -1239,7 +1238,7 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, } /* For purposes of check_entry casting the compat entry is fine */ - ret = check_entry((struct arpt_entry *)e, name); + ret = check_entry((struct arpt_entry *)e); if (ret) return ret; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 2d0e265fef6e..a75ff8a8420f 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -564,14 +564,12 @@ static void cleanup_match(struct xt_entry_match *m, struct net *net) } static int -check_entry(const struct ipt_entry *e, const char *name) +check_entry(const struct ipt_entry *e) { const struct xt_entry_target *t; - if (!ip_checkentry(&e->ip)) { - duprintf("ip check failed %p %s.\n", e, name); + if (!ip_checkentry(&e->ip)) return -EINVAL; - } if (e->target_offset + sizeof(struct xt_entry_target) > e->next_offset) @@ -661,10 +659,6 @@ find_check_entry(struct ipt_entry *e, struct net *net, const char *name, struct xt_mtchk_param mtpar; struct xt_entry_match *ematch; - ret = check_entry(e, name); - if (ret) - return ret; - j = 0; mtpar.net = net; mtpar.table = name; @@ -728,6 +722,7 @@ check_entry_size_and_hooks(struct ipt_entry *e, unsigned int valid_hooks) { unsigned int h; + int err; if ((unsigned long)e % __alignof__(struct ipt_entry) != 0 || (unsigned char *)e + sizeof(struct ipt_entry) >= limit) { @@ -742,6 +737,10 @@ check_entry_size_and_hooks(struct ipt_entry *e, return -EINVAL; } + err = check_entry(e); + if (err) + return err; + /* Check hooks & underflows */ for (h = 0; h < NF_INET_NUMHOOKS; h++) { if (!(valid_hooks & (1 << h))) @@ -1505,7 +1504,7 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, } /* For purposes of check_entry casting the compat entry is fine */ - ret = check_entry((struct ipt_entry *)e, name); + ret = check_entry((struct ipt_entry *)e); if (ret) return ret; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 62f5b0d0bc9b..ef764a67cc64 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -577,14 +577,12 @@ static void cleanup_match(struct xt_entry_match *m, struct net *net) } static int -check_entry(const struct ip6t_entry *e, const char *name) +check_entry(const struct ip6t_entry *e) { const struct xt_entry_target *t; - if (!ip6_checkentry(&e->ipv6)) { - duprintf("ip_tables: ip check failed %p %s.\n", e, name); + if (!ip6_checkentry(&e->ipv6)) return -EINVAL; - } if (e->target_offset + sizeof(struct xt_entry_target) > e->next_offset) @@ -675,10 +673,6 @@ find_check_entry(struct ip6t_entry *e, struct net *net, const char *name, struct xt_mtchk_param mtpar; struct xt_entry_match *ematch; - ret = check_entry(e, name); - if (ret) - return ret; - j = 0; mtpar.net = net; mtpar.table = name; @@ -742,6 +736,7 @@ check_entry_size_and_hooks(struct ip6t_entry *e, unsigned int valid_hooks) { unsigned int h; + int err; if ((unsigned long)e % __alignof__(struct ip6t_entry) != 0 || (unsigned char *)e + sizeof(struct ip6t_entry) >= limit) { @@ -756,6 +751,10 @@ check_entry_size_and_hooks(struct ip6t_entry *e, return -EINVAL; } + err = check_entry(e); + if (err) + return err; + /* Check hooks & underflows */ for (h = 0; h < NF_INET_NUMHOOKS; h++) { if (!(valid_hooks & (1 << h))) @@ -1520,7 +1519,7 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, } /* For purposes of check_entry casting the compat entry is fine */ - ret = check_entry((struct ip6t_entry *)e, name); + ret = check_entry((struct ip6t_entry *)e); if (ret) return ret; From a1c49d8cf9aa2be958c34fae1f84b1e3006f0c67 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 22 Mar 2016 18:02:50 +0100 Subject: [PATCH 293/312] netfilter: x_tables: make sure e->next_offset covers remaining blob size [ Upstream commit 6e94e0cfb0887e4013b3b930fa6ab1fe6bb6ba91 ] Otherwise this function may read data beyond the ruleset blob. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 6 ++++-- net/ipv4/netfilter/ip_tables.c | 6 ++++-- net/ipv6/netfilter/ip6_tables.c | 6 ++++-- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index e500f4fcf868..835b768e10e8 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -562,7 +562,8 @@ static inline int check_entry_size_and_hooks(struct arpt_entry *e, int err; if ((unsigned long)e % __alignof__(struct arpt_entry) != 0 || - (unsigned char *)e + sizeof(struct arpt_entry) >= limit) { + (unsigned char *)e + sizeof(struct arpt_entry) >= limit || + (unsigned char *)e + e->next_offset > limit) { duprintf("Bad offset %p\n", e); return -EINVAL; } @@ -1225,7 +1226,8 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, duprintf("check_compat_entry_size_and_hooks %p\n", e); if ((unsigned long)e % __alignof__(struct compat_arpt_entry) != 0 || - (unsigned char *)e + sizeof(struct compat_arpt_entry) >= limit) { + (unsigned char *)e + sizeof(struct compat_arpt_entry) >= limit || + (unsigned char *)e + e->next_offset > limit) { duprintf("Bad offset %p, limit = %p\n", e, limit); return -EINVAL; } diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index a75ff8a8420f..4ad55767569a 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -725,7 +725,8 @@ check_entry_size_and_hooks(struct ipt_entry *e, int err; if ((unsigned long)e % __alignof__(struct ipt_entry) != 0 || - (unsigned char *)e + sizeof(struct ipt_entry) >= limit) { + (unsigned char *)e + sizeof(struct ipt_entry) >= limit || + (unsigned char *)e + e->next_offset > limit) { duprintf("Bad offset %p\n", e); return -EINVAL; } @@ -1491,7 +1492,8 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, duprintf("check_compat_entry_size_and_hooks %p\n", e); if ((unsigned long)e % __alignof__(struct compat_ipt_entry) != 0 || - (unsigned char *)e + sizeof(struct compat_ipt_entry) >= limit) { + (unsigned char *)e + sizeof(struct compat_ipt_entry) >= limit || + (unsigned char *)e + e->next_offset > limit) { duprintf("Bad offset %p, limit = %p\n", e, limit); return -EINVAL; } diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index ef764a67cc64..9a4666be412d 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -739,7 +739,8 @@ check_entry_size_and_hooks(struct ip6t_entry *e, int err; if ((unsigned long)e % __alignof__(struct ip6t_entry) != 0 || - (unsigned char *)e + sizeof(struct ip6t_entry) >= limit) { + (unsigned char *)e + sizeof(struct ip6t_entry) >= limit || + (unsigned char *)e + e->next_offset > limit) { duprintf("Bad offset %p\n", e); return -EINVAL; } @@ -1506,7 +1507,8 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, duprintf("check_compat_entry_size_and_hooks %p\n", e); if ((unsigned long)e % __alignof__(struct compat_ip6t_entry) != 0 || - (unsigned char *)e + sizeof(struct compat_ip6t_entry) >= limit) { + (unsigned char *)e + sizeof(struct compat_ip6t_entry) >= limit || + (unsigned char *)e + e->next_offset > limit) { duprintf("Bad offset %p, limit = %p\n", e, limit); return -EINVAL; } From 850c377e0e2d76723884d610ff40827d26aa21eb Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Tue, 22 Mar 2016 18:02:52 +0100 Subject: [PATCH 294/312] netfilter: x_tables: fix unconditional helper [ Upstream commit 54d83fc74aa9ec72794373cb47432c5f7fb1a309 ] Ben Hawkes says: In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it is possible for a user-supplied ipt_entry structure to have a large next_offset field. This field is not bounds checked prior to writing a counter value at the supplied offset. Problem is that mark_source_chains should not have been called -- the rule doesn't have a next entry, so its supposed to return an absolute verdict of either ACCEPT or DROP. However, the function conditional() doesn't work as the name implies. It only checks that the rule is using wildcard address matching. However, an unconditional rule must also not be using any matches (no -m args). The underflow validator only checked the addresses, therefore passing the 'unconditional absolute verdict' test, while mark_source_chains also tested for presence of matches, and thus proceeeded to the next (not-existent) rule. Unify this so that all the callers have same idea of 'unconditional rule'. Reported-by: Ben Hawkes Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 18 +++++++++--------- net/ipv4/netfilter/ip_tables.c | 23 +++++++++++------------ net/ipv6/netfilter/ip6_tables.c | 23 +++++++++++------------ 3 files changed, 31 insertions(+), 33 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 835b768e10e8..236dcd64ba06 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -354,11 +354,12 @@ unsigned int arpt_do_table(struct sk_buff *skb, } /* All zeroes == unconditional rule. */ -static inline bool unconditional(const struct arpt_arp *arp) +static inline bool unconditional(const struct arpt_entry *e) { static const struct arpt_arp uncond; - return memcmp(arp, &uncond, sizeof(uncond)) == 0; + return e->target_offset == sizeof(struct arpt_entry) && + memcmp(&e->arp, &uncond, sizeof(uncond)) == 0; } /* Figures out from what hook each rule can be called: returns 0 if @@ -397,11 +398,10 @@ static int mark_source_chains(const struct xt_table_info *newinfo, |= ((1 << hook) | (1 << NF_ARP_NUMHOOKS)); /* Unconditional return/END. */ - if ((e->target_offset == sizeof(struct arpt_entry) && + if ((unconditional(e) && (strcmp(t->target.u.user.name, XT_STANDARD_TARGET) == 0) && - t->verdict < 0 && unconditional(&e->arp)) || - visited) { + t->verdict < 0) || visited) { unsigned int oldpos, size; if ((strcmp(t->target.u.user.name, @@ -540,7 +540,7 @@ static bool check_underflow(const struct arpt_entry *e) const struct xt_entry_target *t; unsigned int verdict; - if (!unconditional(&e->arp)) + if (!unconditional(e)) return false; t = arpt_get_target_c(e); if (strcmp(t->u.user.name, XT_STANDARD_TARGET) != 0) @@ -587,9 +587,9 @@ static inline int check_entry_size_and_hooks(struct arpt_entry *e, newinfo->hook_entry[h] = hook_entries[h]; if ((unsigned char *)e - base == underflows[h]) { if (!check_underflow(e)) { - pr_err("Underflows must be unconditional and " - "use the STANDARD target with " - "ACCEPT/DROP\n"); + pr_debug("Underflows must be unconditional and " + "use the STANDARD target with " + "ACCEPT/DROP\n"); return -EINVAL; } newinfo->underflow[h] = underflows[h]; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 4ad55767569a..b653b9de18d0 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -168,11 +168,12 @@ get_entry(const void *base, unsigned int offset) /* All zeroes == unconditional rule. */ /* Mildly perf critical (only if packet tracing is on) */ -static inline bool unconditional(const struct ipt_ip *ip) +static inline bool unconditional(const struct ipt_entry *e) { static const struct ipt_ip uncond; - return memcmp(ip, &uncond, sizeof(uncond)) == 0; + return e->target_offset == sizeof(struct ipt_entry) && + memcmp(&e->ip, &uncond, sizeof(uncond)) == 0; #undef FWINV } @@ -229,11 +230,10 @@ get_chainname_rulenum(const struct ipt_entry *s, const struct ipt_entry *e, } else if (s == e) { (*rulenum)++; - if (s->target_offset == sizeof(struct ipt_entry) && + if (unconditional(s) && strcmp(t->target.u.kernel.target->name, XT_STANDARD_TARGET) == 0 && - t->verdict < 0 && - unconditional(&s->ip)) { + t->verdict < 0) { /* Tail of chains: STANDARD target (return/policy) */ *comment = *chainname == hookname ? comments[NF_IP_TRACE_COMMENT_POLICY] @@ -471,11 +471,10 @@ mark_source_chains(const struct xt_table_info *newinfo, e->comefrom |= ((1 << hook) | (1 << NF_INET_NUMHOOKS)); /* Unconditional return/END. */ - if ((e->target_offset == sizeof(struct ipt_entry) && + if ((unconditional(e) && (strcmp(t->target.u.user.name, XT_STANDARD_TARGET) == 0) && - t->verdict < 0 && unconditional(&e->ip)) || - visited) { + t->verdict < 0) || visited) { unsigned int oldpos, size; if ((strcmp(t->target.u.user.name, @@ -702,7 +701,7 @@ static bool check_underflow(const struct ipt_entry *e) const struct xt_entry_target *t; unsigned int verdict; - if (!unconditional(&e->ip)) + if (!unconditional(e)) return false; t = ipt_get_target_c(e); if (strcmp(t->u.user.name, XT_STANDARD_TARGET) != 0) @@ -750,9 +749,9 @@ check_entry_size_and_hooks(struct ipt_entry *e, newinfo->hook_entry[h] = hook_entries[h]; if ((unsigned char *)e - base == underflows[h]) { if (!check_underflow(e)) { - pr_err("Underflows must be unconditional and " - "use the STANDARD target with " - "ACCEPT/DROP\n"); + pr_debug("Underflows must be unconditional and " + "use the STANDARD target with " + "ACCEPT/DROP\n"); return -EINVAL; } newinfo->underflow[h] = underflows[h]; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 9a4666be412d..077982a6598d 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -198,11 +198,12 @@ get_entry(const void *base, unsigned int offset) /* All zeroes == unconditional rule. */ /* Mildly perf critical (only if packet tracing is on) */ -static inline bool unconditional(const struct ip6t_ip6 *ipv6) +static inline bool unconditional(const struct ip6t_entry *e) { static const struct ip6t_ip6 uncond; - return memcmp(ipv6, &uncond, sizeof(uncond)) == 0; + return e->target_offset == sizeof(struct ip6t_entry) && + memcmp(&e->ipv6, &uncond, sizeof(uncond)) == 0; } static inline const struct xt_entry_target * @@ -258,11 +259,10 @@ get_chainname_rulenum(const struct ip6t_entry *s, const struct ip6t_entry *e, } else if (s == e) { (*rulenum)++; - if (s->target_offset == sizeof(struct ip6t_entry) && + if (unconditional(s) && strcmp(t->target.u.kernel.target->name, XT_STANDARD_TARGET) == 0 && - t->verdict < 0 && - unconditional(&s->ipv6)) { + t->verdict < 0) { /* Tail of chains: STANDARD target (return/policy) */ *comment = *chainname == hookname ? comments[NF_IP6_TRACE_COMMENT_POLICY] @@ -484,11 +484,10 @@ mark_source_chains(const struct xt_table_info *newinfo, e->comefrom |= ((1 << hook) | (1 << NF_INET_NUMHOOKS)); /* Unconditional return/END. */ - if ((e->target_offset == sizeof(struct ip6t_entry) && + if ((unconditional(e) && (strcmp(t->target.u.user.name, XT_STANDARD_TARGET) == 0) && - t->verdict < 0 && - unconditional(&e->ipv6)) || visited) { + t->verdict < 0) || visited) { unsigned int oldpos, size; if ((strcmp(t->target.u.user.name, @@ -716,7 +715,7 @@ static bool check_underflow(const struct ip6t_entry *e) const struct xt_entry_target *t; unsigned int verdict; - if (!unconditional(&e->ipv6)) + if (!unconditional(e)) return false; t = ip6t_get_target_c(e); if (strcmp(t->u.user.name, XT_STANDARD_TARGET) != 0) @@ -764,9 +763,9 @@ check_entry_size_and_hooks(struct ip6t_entry *e, newinfo->hook_entry[h] = hook_entries[h]; if ((unsigned char *)e - base == underflows[h]) { if (!check_underflow(e)) { - pr_err("Underflows must be unconditional and " - "use the STANDARD target with " - "ACCEPT/DROP\n"); + pr_debug("Underflows must be unconditional and " + "use the STANDARD target with " + "ACCEPT/DROP\n"); return -EINVAL; } newinfo->underflow[h] = underflows[h]; From cf756388f8f34e02a338356b3685c46938139871 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:21 +0200 Subject: [PATCH 295/312] netfilter: x_tables: don't move to non-existent next rule [ Upstream commit f24e230d257af1ad7476c6e81a8dc3127a74204e ] Ben Hawkes says: In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it is possible for a user-supplied ipt_entry structure to have a large next_offset field. This field is not bounds checked prior to writing a counter value at the supplied offset. Base chains enforce absolute verdict. User defined chains are supposed to end with an unconditional return, xtables userspace adds them automatically. But if such return is missing we will move to non-existent next rule. Reported-by: Ben Hawkes Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 8 +++++--- net/ipv4/netfilter/ip_tables.c | 4 ++++ net/ipv6/netfilter/ip6_tables.c | 4 ++++ 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 236dcd64ba06..d53050017324 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -434,6 +434,8 @@ static int mark_source_chains(const struct xt_table_info *newinfo, size = e->next_offset; e = (struct arpt_entry *) (entry0 + pos + size); + if (pos + size >= newinfo->size) + return 0; e->counters.pcnt = pos; pos += size; } else { @@ -456,6 +458,8 @@ static int mark_source_chains(const struct xt_table_info *newinfo, } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; + if (newpos >= newinfo->size) + return 0; } e = (struct arpt_entry *) (entry0 + newpos); @@ -679,10 +683,8 @@ static int translate_table(struct xt_table_info *newinfo, void *entry0, } } - if (!mark_source_chains(newinfo, repl->valid_hooks, entry0)) { - duprintf("Looping hook\n"); + if (!mark_source_chains(newinfo, repl->valid_hooks, entry0)) return -ELOOP; - } /* Finally, each sanity check must pass */ i = 0; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index b653b9de18d0..aee80165eaeb 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -515,6 +515,8 @@ mark_source_chains(const struct xt_table_info *newinfo, size = e->next_offset; e = (struct ipt_entry *) (entry0 + pos + size); + if (pos + size >= newinfo->size) + return 0; e->counters.pcnt = pos; pos += size; } else { @@ -536,6 +538,8 @@ mark_source_chains(const struct xt_table_info *newinfo, } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; + if (newpos >= newinfo->size) + return 0; } e = (struct ipt_entry *) (entry0 + newpos); diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 077982a6598d..74d6ad0aa5b6 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -528,6 +528,8 @@ mark_source_chains(const struct xt_table_info *newinfo, size = e->next_offset; e = (struct ip6t_entry *) (entry0 + pos + size); + if (pos + size >= newinfo->size) + return 0; e->counters.pcnt = pos; pos += size; } else { @@ -549,6 +551,8 @@ mark_source_chains(const struct xt_table_info *newinfo, } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; + if (newpos >= newinfo->size) + return 0; } e = (struct ip6t_entry *) (entry0 + newpos); From 8163327a3a927c36f7c032bb67957e6c0f4ec27d Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:22 +0200 Subject: [PATCH 296/312] netfilter: x_tables: validate targets of jumps [ Upstream commit 36472341017529e2b12573093cc0f68719300997 ] When we see a jump also check that the offset gets us to beginning of a rule (an ipt_entry). The extra overhead is negible, even with absurd cases. 300k custom rules, 300k jumps to 'next' user chain: [ plus one jump from INPUT to first userchain ]: Before: real 0m24.874s user 0m7.532s sys 0m16.076s After: real 0m27.464s user 0m7.436s sys 0m18.840s Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 16 ++++++++++++++++ net/ipv4/netfilter/ip_tables.c | 16 ++++++++++++++++ net/ipv6/netfilter/ip6_tables.c | 16 ++++++++++++++++ 3 files changed, 48 insertions(+) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index d53050017324..538839daf110 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -362,6 +362,18 @@ static inline bool unconditional(const struct arpt_entry *e) memcmp(&e->arp, &uncond, sizeof(uncond)) == 0; } +static bool find_jump_target(const struct xt_table_info *t, + const struct arpt_entry *target) +{ + struct arpt_entry *iter; + + xt_entry_foreach(iter, t->entries, t->size) { + if (iter == target) + return true; + } + return false; +} + /* Figures out from what hook each rule can be called: returns 0 if * there are loops. Puts hook bitmask in comefrom. */ @@ -455,6 +467,10 @@ static int mark_source_chains(const struct xt_table_info *newinfo, /* This a jump; chase it. */ duprintf("Jump rule %u -> %u\n", pos, newpos); + e = (struct arpt_entry *) + (entry0 + newpos); + if (!find_jump_target(newinfo, e)) + return 0; } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index aee80165eaeb..ff7a5f322682 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -438,6 +438,18 @@ ipt_do_table(struct sk_buff *skb, #endif } +static bool find_jump_target(const struct xt_table_info *t, + const struct ipt_entry *target) +{ + struct ipt_entry *iter; + + xt_entry_foreach(iter, t->entries, t->size) { + if (iter == target) + return true; + } + return false; +} + /* Figures out from what hook each rule can be called: returns 0 if there are loops. Puts hook bitmask in comefrom. */ static int @@ -535,6 +547,10 @@ mark_source_chains(const struct xt_table_info *newinfo, /* This a jump; chase it. */ duprintf("Jump rule %u -> %u\n", pos, newpos); + e = (struct ipt_entry *) + (entry0 + newpos); + if (!find_jump_target(newinfo, e)) + return 0; } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 74d6ad0aa5b6..18b455bc96cc 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -451,6 +451,18 @@ ip6t_do_table(struct sk_buff *skb, #endif } +static bool find_jump_target(const struct xt_table_info *t, + const struct ip6t_entry *target) +{ + struct ip6t_entry *iter; + + xt_entry_foreach(iter, t->entries, t->size) { + if (iter == target) + return true; + } + return false; +} + /* Figures out from what hook each rule can be called: returns 0 if there are loops. Puts hook bitmask in comefrom. */ static int @@ -548,6 +560,10 @@ mark_source_chains(const struct xt_table_info *newinfo, /* This a jump; chase it. */ duprintf("Jump rule %u -> %u\n", pos, newpos); + e = (struct ip6t_entry *) + (entry0 + newpos); + if (!find_jump_target(newinfo, e)) + return 0; } else { /* ... this is a fallthru */ newpos = pos + e->next_offset; From a471ac817cf0e0d6e87779ca1fee216ba849e613 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:23 +0200 Subject: [PATCH 297/312] netfilter: x_tables: add and use xt_check_entry_offsets [ Upstream commit 7d35812c3214afa5b37a675113555259cfd67b98 ] Currently arp/ip and ip6tables each implement a short helper to check that the target offset is large enough to hold one xt_entry_target struct and that t->u.target_size fits within the current rule. Unfortunately these checks are not sufficient. To avoid adding new tests to all of ip/ip6/arptables move the current checks into a helper, then extend this helper in followup patches. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- include/linux/netfilter/x_tables.h | 4 ++++ net/ipv4/netfilter/arp_tables.c | 11 +--------- net/ipv4/netfilter/ip_tables.c | 12 +---------- net/ipv6/netfilter/ip6_tables.c | 12 +---------- net/netfilter/x_tables.c | 34 ++++++++++++++++++++++++++++++ 5 files changed, 41 insertions(+), 32 deletions(-) diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index a3e215bb0241..a2d150873721 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -239,6 +239,10 @@ void xt_unregister_match(struct xt_match *target); int xt_register_matches(struct xt_match *match, unsigned int n); void xt_unregister_matches(struct xt_match *match, unsigned int n); +int xt_check_entry_offsets(const void *base, + unsigned int target_offset, + unsigned int next_offset); + int xt_check_match(struct xt_mtchk_param *, unsigned int size, u_int8_t proto, bool inv_proto); int xt_check_target(struct xt_tgchk_param *, unsigned int size, u_int8_t proto, diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 538839daf110..8c2d13c96a82 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -491,19 +491,10 @@ static int mark_source_chains(const struct xt_table_info *newinfo, static inline int check_entry(const struct arpt_entry *e) { - const struct xt_entry_target *t; - if (!arp_checkentry(&e->arp)) return -EINVAL; - if (e->target_offset + sizeof(struct xt_entry_target) > e->next_offset) - return -EINVAL; - - t = arpt_get_target_c(e); - if (e->target_offset + t->u.target_size > e->next_offset) - return -EINVAL; - - return 0; + return xt_check_entry_offsets(e, e->target_offset, e->next_offset); } static inline int check_target(struct arpt_entry *e, const char *name) diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index ff7a5f322682..2a3f689f9ed7 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -585,20 +585,10 @@ static void cleanup_match(struct xt_entry_match *m, struct net *net) static int check_entry(const struct ipt_entry *e) { - const struct xt_entry_target *t; - if (!ip_checkentry(&e->ip)) return -EINVAL; - if (e->target_offset + sizeof(struct xt_entry_target) > - e->next_offset) - return -EINVAL; - - t = ipt_get_target_c(e); - if (e->target_offset + t->u.target_size > e->next_offset) - return -EINVAL; - - return 0; + return xt_check_entry_offsets(e, e->target_offset, e->next_offset); } static int diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 18b455bc96cc..a7de7c0e1b37 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -598,20 +598,10 @@ static void cleanup_match(struct xt_entry_match *m, struct net *net) static int check_entry(const struct ip6t_entry *e) { - const struct xt_entry_target *t; - if (!ip6_checkentry(&e->ipv6)) return -EINVAL; - if (e->target_offset + sizeof(struct xt_entry_target) > - e->next_offset) - return -EINVAL; - - t = ip6t_get_target_c(e); - if (e->target_offset + t->u.target_size > e->next_offset) - return -EINVAL; - - return 0; + return xt_check_entry_offsets(e, e->target_offset, e->next_offset); } static int check_match(struct xt_entry_match *m, struct xt_mtchk_param *par) diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 51a459c3c649..b631357686ce 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -543,6 +543,40 @@ int xt_compat_match_to_user(const struct xt_entry_match *m, EXPORT_SYMBOL_GPL(xt_compat_match_to_user); #endif /* CONFIG_COMPAT */ +/** + * xt_check_entry_offsets - validate arp/ip/ip6t_entry + * + * @base: pointer to arp/ip/ip6t_entry + * @target_offset: the arp/ip/ip6_t->target_offset + * @next_offset: the arp/ip/ip6_t->next_offset + * + * validates that target_offset and next_offset are sane. + * + * The arp/ip/ip6t_entry structure @base must have passed following tests: + * - it must point to a valid memory location + * - base to base + next_offset must be accessible, i.e. not exceed allocated + * length. + * + * Return: 0 on success, negative errno on failure. + */ +int xt_check_entry_offsets(const void *base, + unsigned int target_offset, + unsigned int next_offset) +{ + const struct xt_entry_target *t; + const char *e = base; + + if (target_offset + sizeof(*t) > next_offset) + return -EINVAL; + + t = (void *)(e + target_offset); + if (target_offset + t->u.target_size > next_offset) + return -EINVAL; + + return 0; +} +EXPORT_SYMBOL(xt_check_entry_offsets); + int xt_check_target(struct xt_tgchk_param *par, unsigned int size, u_int8_t proto, bool inv_proto) { From 801cd32774d12dccfcfc0c22b0b26d84ed995c6f Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:24 +0200 Subject: [PATCH 298/312] netfilter: x_tables: kill check_entry helper [ Upstream commit aa412ba225dd3bc36d404c28cdc3d674850d80d0 ] Once we add more sanity testing to xt_check_entry_offsets it becomes relvant if we're expecting a 32bit 'config_compat' blob or a normal one. Since we already have a lot of similar-named functions (check_entry, compat_check_entry, find_and_check_entry, etc.) and the current incarnation is short just fold its contents into the callers. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 19 ++++++++----------- net/ipv4/netfilter/ip_tables.c | 20 ++++++++------------ net/ipv6/netfilter/ip6_tables.c | 20 ++++++++------------ 3 files changed, 24 insertions(+), 35 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 8c2d13c96a82..a86aa8dba47f 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -489,14 +489,6 @@ static int mark_source_chains(const struct xt_table_info *newinfo, return 1; } -static inline int check_entry(const struct arpt_entry *e) -{ - if (!arp_checkentry(&e->arp)) - return -EINVAL; - - return xt_check_entry_offsets(e, e->target_offset, e->next_offset); -} - static inline int check_target(struct arpt_entry *e, const char *name) { struct xt_entry_target *t = arpt_get_target(e); @@ -586,7 +578,10 @@ static inline int check_entry_size_and_hooks(struct arpt_entry *e, return -EINVAL; } - err = check_entry(e); + if (!arp_checkentry(&e->arp)) + return -EINVAL; + + err = xt_check_entry_offsets(e, e->target_offset, e->next_offset); if (err) return err; @@ -1248,8 +1243,10 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, return -EINVAL; } - /* For purposes of check_entry casting the compat entry is fine */ - ret = check_entry((struct arpt_entry *)e); + if (!arp_checkentry(&e->arp)) + return -EINVAL; + + ret = xt_check_entry_offsets(e, e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 2a3f689f9ed7..077d682aff02 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -582,15 +582,6 @@ static void cleanup_match(struct xt_entry_match *m, struct net *net) module_put(par.match->me); } -static int -check_entry(const struct ipt_entry *e) -{ - if (!ip_checkentry(&e->ip)) - return -EINVAL; - - return xt_check_entry_offsets(e, e->target_offset, e->next_offset); -} - static int check_match(struct xt_entry_match *m, struct xt_mtchk_param *par) { @@ -747,7 +738,10 @@ check_entry_size_and_hooks(struct ipt_entry *e, return -EINVAL; } - err = check_entry(e); + if (!ip_checkentry(&e->ip)) + return -EINVAL; + + err = xt_check_entry_offsets(e, e->target_offset, e->next_offset); if (err) return err; @@ -1514,8 +1508,10 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, return -EINVAL; } - /* For purposes of check_entry casting the compat entry is fine */ - ret = check_entry((struct ipt_entry *)e); + if (!ip_checkentry(&e->ip)) + return -EINVAL; + + ret = xt_check_entry_offsets(e, e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index a7de7c0e1b37..ca0c3478b5aa 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -595,15 +595,6 @@ static void cleanup_match(struct xt_entry_match *m, struct net *net) module_put(par.match->me); } -static int -check_entry(const struct ip6t_entry *e) -{ - if (!ip6_checkentry(&e->ipv6)) - return -EINVAL; - - return xt_check_entry_offsets(e, e->target_offset, e->next_offset); -} - static int check_match(struct xt_entry_match *m, struct xt_mtchk_param *par) { const struct ip6t_ip6 *ipv6 = par->entryinfo; @@ -761,7 +752,10 @@ check_entry_size_and_hooks(struct ip6t_entry *e, return -EINVAL; } - err = check_entry(e); + if (!ip6_checkentry(&e->ipv6)) + return -EINVAL; + + err = xt_check_entry_offsets(e, e->target_offset, e->next_offset); if (err) return err; @@ -1529,8 +1523,10 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, return -EINVAL; } - /* For purposes of check_entry casting the compat entry is fine */ - ret = check_entry((struct ip6t_entry *)e); + if (!ip6_checkentry(&e->ipv6)) + return -EINVAL; + + ret = xt_check_entry_offsets(e, e->target_offset, e->next_offset); if (ret) return ret; From aae91919c9d6d1aa6d6390826979e6d2c89a7ba4 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:25 +0200 Subject: [PATCH 299/312] netfilter: x_tables: assert minimum target size [ Upstream commit a08e4e190b866579896c09af59b3bdca821da2cd ] The target size includes the size of the xt_entry_target struct. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/x_tables.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index b631357686ce..ce9f7b3dbd12 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -570,6 +570,9 @@ int xt_check_entry_offsets(const void *base, return -EINVAL; t = (void *)(e + target_offset); + if (t->u.target_size < sizeof(*t)) + return -EINVAL; + if (target_offset + t->u.target_size > next_offset) return -EINVAL; From acbcf85306bd563910a2afe07f07d30381b031b0 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:26 +0200 Subject: [PATCH 300/312] netfilter: x_tables: add compat version of xt_check_entry_offsets [ Upstream commit fc1221b3a163d1386d1052184202d5dc50d302d1 ] 32bit rulesets have different layout and alignment requirements, so once more integrity checks get added to xt_check_entry_offsets it will reject well-formed 32bit rulesets. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- include/linux/netfilter/x_tables.h | 3 +++ net/ipv4/netfilter/arp_tables.c | 3 ++- net/ipv4/netfilter/ip_tables.c | 3 ++- net/ipv6/netfilter/ip6_tables.c | 3 ++- net/netfilter/x_tables.c | 22 ++++++++++++++++++++++ 5 files changed, 31 insertions(+), 3 deletions(-) diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index a2d150873721..81843aa42a15 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -435,6 +435,9 @@ void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr, unsigned int *size); int xt_compat_target_to_user(const struct xt_entry_target *t, void __user **dstptr, unsigned int *size); +int xt_compat_check_entry_offsets(const void *base, + unsigned int target_offset, + unsigned int next_offset); #endif /* CONFIG_COMPAT */ #endif /* _X_TABLES_H */ diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index a86aa8dba47f..d865edf7ae65 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1246,7 +1246,8 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, if (!arp_checkentry(&e->arp)) return -EINVAL; - ret = xt_check_entry_offsets(e, e->target_offset, e->next_offset); + ret = xt_compat_check_entry_offsets(e, e->target_offset, + e->next_offset); if (ret) return ret; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 077d682aff02..a355ebbaacf2 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1511,7 +1511,8 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, if (!ip_checkentry(&e->ip)) return -EINVAL; - ret = xt_check_entry_offsets(e, e->target_offset, e->next_offset); + ret = xt_compat_check_entry_offsets(e, + e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index ca0c3478b5aa..7157bf5e5c8a 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1526,7 +1526,8 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, if (!ip6_checkentry(&e->ipv6)) return -EINVAL; - ret = xt_check_entry_offsets(e, e->target_offset, e->next_offset); + ret = xt_compat_check_entry_offsets(e, + e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index ce9f7b3dbd12..a00ea2a05334 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -541,6 +541,27 @@ int xt_compat_match_to_user(const struct xt_entry_match *m, return 0; } EXPORT_SYMBOL_GPL(xt_compat_match_to_user); + +int xt_compat_check_entry_offsets(const void *base, + unsigned int target_offset, + unsigned int next_offset) +{ + const struct compat_xt_entry_target *t; + const char *e = base; + + if (target_offset + sizeof(*t) > next_offset) + return -EINVAL; + + t = (void *)(e + target_offset); + if (t->u.target_size < sizeof(*t)) + return -EINVAL; + + if (target_offset + t->u.target_size > next_offset) + return -EINVAL; + + return 0; +} +EXPORT_SYMBOL(xt_compat_check_entry_offsets); #endif /* CONFIG_COMPAT */ /** @@ -551,6 +572,7 @@ EXPORT_SYMBOL_GPL(xt_compat_match_to_user); * @next_offset: the arp/ip/ip6_t->next_offset * * validates that target_offset and next_offset are sane. + * Also see xt_compat_check_entry_offsets for CONFIG_COMPAT version. * * The arp/ip/ip6t_entry structure @base must have passed following tests: * - it must point to a valid memory location From 73bfda1c492bef7038a87adfa887b7e6b7cd6679 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:27 +0200 Subject: [PATCH 301/312] netfilter: x_tables: check standard target size too [ Upstream commit 7ed2abddd20cf8f6bd27f65bd218f26fa5bf7f44 ] We have targets and standard targets -- the latter carries a verdict. The ip/ip6tables validation functions will access t->verdict for the standard targets to fetch the jump offset or verdict for chainloop detection, but this happens before the targets get checked/validated. Thus we also need to check for verdict presence here, else t->verdict can point right after a blob. Spotted with UBSAN while testing malformed blobs. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/x_tables.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index a00ea2a05334..6b6444143bf1 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -542,6 +542,13 @@ int xt_compat_match_to_user(const struct xt_entry_match *m, } EXPORT_SYMBOL_GPL(xt_compat_match_to_user); +/* non-compat version may have padding after verdict */ +struct compat_xt_standard_target { + struct compat_xt_entry_target t; + compat_uint_t verdict; +}; + +/* see xt_check_entry_offsets */ int xt_compat_check_entry_offsets(const void *base, unsigned int target_offset, unsigned int next_offset) @@ -559,6 +566,10 @@ int xt_compat_check_entry_offsets(const void *base, if (target_offset + t->u.target_size > next_offset) return -EINVAL; + if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 && + target_offset + sizeof(struct compat_xt_standard_target) != next_offset) + return -EINVAL; + return 0; } EXPORT_SYMBOL(xt_compat_check_entry_offsets); @@ -598,6 +609,10 @@ int xt_check_entry_offsets(const void *base, if (target_offset + t->u.target_size > next_offset) return -EINVAL; + if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 && + target_offset + sizeof(struct xt_standard_target) != next_offset) + return -EINVAL; + return 0; } EXPORT_SYMBOL(xt_check_entry_offsets); From 451e4403bc4abc51539376d4314baa739ab9e996 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:28 +0200 Subject: [PATCH 302/312] netfilter: x_tables: check for bogus target offset [ Upstream commit ce683e5f9d045e5d67d1312a42b359cb2ab2a13c ] We're currently asserting that targetoff + targetsize <= nextoff. Extend it to also check that targetoff is >= sizeof(xt_entry). Since this is generic code, add an argument pointing to the start of the match/target, we can then derive the base structure size from the delta. We also need the e->elems pointer in a followup change to validate matches. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- include/linux/netfilter/x_tables.h | 4 ++-- net/ipv4/netfilter/arp_tables.c | 5 +++-- net/ipv4/netfilter/ip_tables.c | 5 +++-- net/ipv6/netfilter/ip6_tables.c | 5 +++-- net/netfilter/x_tables.c | 17 +++++++++++++++-- 5 files changed, 26 insertions(+), 10 deletions(-) diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 81843aa42a15..726f4dd63671 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -239,7 +239,7 @@ void xt_unregister_match(struct xt_match *target); int xt_register_matches(struct xt_match *match, unsigned int n); void xt_unregister_matches(struct xt_match *match, unsigned int n); -int xt_check_entry_offsets(const void *base, +int xt_check_entry_offsets(const void *base, const char *elems, unsigned int target_offset, unsigned int next_offset); @@ -435,7 +435,7 @@ void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr, unsigned int *size); int xt_compat_target_to_user(const struct xt_entry_target *t, void __user **dstptr, unsigned int *size); -int xt_compat_check_entry_offsets(const void *base, +int xt_compat_check_entry_offsets(const void *base, const char *elems, unsigned int target_offset, unsigned int next_offset); diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index d865edf7ae65..53315ee89dcd 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -581,7 +581,8 @@ static inline int check_entry_size_and_hooks(struct arpt_entry *e, if (!arp_checkentry(&e->arp)) return -EINVAL; - err = xt_check_entry_offsets(e, e->target_offset, e->next_offset); + err = xt_check_entry_offsets(e, e->elems, e->target_offset, + e->next_offset); if (err) return err; @@ -1246,7 +1247,7 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, if (!arp_checkentry(&e->arp)) return -EINVAL; - ret = xt_compat_check_entry_offsets(e, e->target_offset, + ret = xt_compat_check_entry_offsets(e, e->elems, e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index a355ebbaacf2..fb00670ef2d8 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -741,7 +741,8 @@ check_entry_size_and_hooks(struct ipt_entry *e, if (!ip_checkentry(&e->ip)) return -EINVAL; - err = xt_check_entry_offsets(e, e->target_offset, e->next_offset); + err = xt_check_entry_offsets(e, e->elems, e->target_offset, + e->next_offset); if (err) return err; @@ -1511,7 +1512,7 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, if (!ip_checkentry(&e->ip)) return -EINVAL; - ret = xt_compat_check_entry_offsets(e, + ret = xt_compat_check_entry_offsets(e, e->elems, e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 7157bf5e5c8a..c09dbcc30f82 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -755,7 +755,8 @@ check_entry_size_and_hooks(struct ip6t_entry *e, if (!ip6_checkentry(&e->ipv6)) return -EINVAL; - err = xt_check_entry_offsets(e, e->target_offset, e->next_offset); + err = xt_check_entry_offsets(e, e->elems, e->target_offset, + e->next_offset); if (err) return err; @@ -1526,7 +1527,7 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, if (!ip6_checkentry(&e->ipv6)) return -EINVAL; - ret = xt_compat_check_entry_offsets(e, + ret = xt_compat_check_entry_offsets(e, e->elems, e->target_offset, e->next_offset); if (ret) return ret; diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 6b6444143bf1..bd7fd79132cb 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -548,14 +548,17 @@ struct compat_xt_standard_target { compat_uint_t verdict; }; -/* see xt_check_entry_offsets */ -int xt_compat_check_entry_offsets(const void *base, +int xt_compat_check_entry_offsets(const void *base, const char *elems, unsigned int target_offset, unsigned int next_offset) { + long size_of_base_struct = elems - (const char *)base; const struct compat_xt_entry_target *t; const char *e = base; + if (target_offset < size_of_base_struct) + return -EINVAL; + if (target_offset + sizeof(*t) > next_offset) return -EINVAL; @@ -579,12 +582,16 @@ EXPORT_SYMBOL(xt_compat_check_entry_offsets); * xt_check_entry_offsets - validate arp/ip/ip6t_entry * * @base: pointer to arp/ip/ip6t_entry + * @elems: pointer to first xt_entry_match, i.e. ip(6)t_entry->elems * @target_offset: the arp/ip/ip6_t->target_offset * @next_offset: the arp/ip/ip6_t->next_offset * * validates that target_offset and next_offset are sane. * Also see xt_compat_check_entry_offsets for CONFIG_COMPAT version. * + * This function does not validate the targets or matches themselves, it + * only tests that all the offsets and sizes are correct. + * * The arp/ip/ip6t_entry structure @base must have passed following tests: * - it must point to a valid memory location * - base to base + next_offset must be accessible, i.e. not exceed allocated @@ -593,12 +600,18 @@ EXPORT_SYMBOL(xt_compat_check_entry_offsets); * Return: 0 on success, negative errno on failure. */ int xt_check_entry_offsets(const void *base, + const char *elems, unsigned int target_offset, unsigned int next_offset) { + long size_of_base_struct = elems - (const char *)base; const struct xt_entry_target *t; const char *e = base; + /* target start is within the ip/ip6/arpt_entry struct */ + if (target_offset < size_of_base_struct) + return -EINVAL; + if (target_offset + sizeof(*t) > next_offset) return -EINVAL; From a605e7476c66a13312189026e6977bad6ed3050d Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:29 +0200 Subject: [PATCH 303/312] netfilter: x_tables: validate all offsets and sizes in a rule [ Upstream commit 13631bfc604161a9d69cd68991dff8603edd66f9 ] Validate that all matches (if any) add up to the beginning of the target and that each match covers at least the base structure size. The compat path should be able to safely re-use the function as the structures only differ in alignment; added a BUILD_BUG_ON just in case we have an arch that adds padding as well. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/x_tables.c | 81 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 76 insertions(+), 5 deletions(-) diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index bd7fd79132cb..467ceadc1d18 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -418,6 +418,47 @@ int xt_check_match(struct xt_mtchk_param *par, } EXPORT_SYMBOL_GPL(xt_check_match); +/** xt_check_entry_match - check that matches end before start of target + * + * @match: beginning of xt_entry_match + * @target: beginning of this rules target (alleged end of matches) + * @alignment: alignment requirement of match structures + * + * Validates that all matches add up to the beginning of the target, + * and that each match covers at least the base structure size. + * + * Return: 0 on success, negative errno on failure. + */ +static int xt_check_entry_match(const char *match, const char *target, + const size_t alignment) +{ + const struct xt_entry_match *pos; + int length = target - match; + + if (length == 0) /* no matches */ + return 0; + + pos = (struct xt_entry_match *)match; + do { + if ((unsigned long)pos % alignment) + return -EINVAL; + + if (length < (int)sizeof(struct xt_entry_match)) + return -EINVAL; + + if (pos->u.match_size < sizeof(struct xt_entry_match)) + return -EINVAL; + + if (pos->u.match_size > length) + return -EINVAL; + + length -= pos->u.match_size; + pos = ((void *)((char *)(pos) + (pos)->u.match_size)); + } while (length > 0); + + return 0; +} + #ifdef CONFIG_COMPAT int xt_compat_add_offset(u_int8_t af, unsigned int offset, int delta) { @@ -573,7 +614,14 @@ int xt_compat_check_entry_offsets(const void *base, const char *elems, target_offset + sizeof(struct compat_xt_standard_target) != next_offset) return -EINVAL; - return 0; + /* compat_xt_entry match has less strict aligment requirements, + * otherwise they are identical. In case of padding differences + * we need to add compat version of xt_check_entry_match. + */ + BUILD_BUG_ON(sizeof(struct compat_xt_entry_match) != sizeof(struct xt_entry_match)); + + return xt_check_entry_match(elems, base + target_offset, + __alignof__(struct compat_xt_entry_match)); } EXPORT_SYMBOL(xt_compat_check_entry_offsets); #endif /* CONFIG_COMPAT */ @@ -586,17 +634,39 @@ EXPORT_SYMBOL(xt_compat_check_entry_offsets); * @target_offset: the arp/ip/ip6_t->target_offset * @next_offset: the arp/ip/ip6_t->next_offset * - * validates that target_offset and next_offset are sane. - * Also see xt_compat_check_entry_offsets for CONFIG_COMPAT version. + * validates that target_offset and next_offset are sane and that all + * match sizes (if any) align with the target offset. * * This function does not validate the targets or matches themselves, it - * only tests that all the offsets and sizes are correct. + * only tests that all the offsets and sizes are correct, that all + * match structures are aligned, and that the last structure ends where + * the target structure begins. + * + * Also see xt_compat_check_entry_offsets for CONFIG_COMPAT version. * * The arp/ip/ip6t_entry structure @base must have passed following tests: * - it must point to a valid memory location * - base to base + next_offset must be accessible, i.e. not exceed allocated * length. * + * A well-formed entry looks like this: + * + * ip(6)t_entry match [mtdata] match [mtdata] target [tgdata] ip(6)t_entry + * e->elems[]-----' | | + * matchsize | | + * matchsize | | + * | | + * target_offset---------------------------------' | + * next_offset---------------------------------------------------' + * + * elems[]: flexible array member at end of ip(6)/arpt_entry struct. + * This is where matches (if any) and the target reside. + * target_offset: beginning of target. + * next_offset: start of the next rule; also: size of this rule. + * Since targets have a minimum size, target_offset + minlen <= next_offset. + * + * Every match stores its size, sum of sizes must not exceed target_offset. + * * Return: 0 on success, negative errno on failure. */ int xt_check_entry_offsets(const void *base, @@ -626,7 +696,8 @@ int xt_check_entry_offsets(const void *base, target_offset + sizeof(struct xt_standard_target) != next_offset) return -EINVAL; - return 0; + return xt_check_entry_match(elems, base + target_offset, + __alignof__(struct xt_entry_match)); } EXPORT_SYMBOL(xt_check_entry_offsets); From d97246b5d16b77b57d835b6b4d1bde8ea6566b46 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Wed, 1 Jun 2016 02:04:44 +0200 Subject: [PATCH 304/312] netfilter: x_tables: don't reject valid target size on some architectures [ Upstream commit 7b7eba0f3515fca3296b8881d583f7c1042f5226 ] Quoting John Stultz: In updating a 32bit arm device from 4.6 to Linus' current HEAD, I noticed I was having some trouble with networking, and realized that /proc/net/ip_tables_names was suddenly empty. Digging through the registration process, it seems we're catching on the: if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 && target_offset + sizeof(struct xt_standard_target) != next_offset) return -EINVAL; Where next_offset seems to be 4 bytes larger then the offset + standard_target struct size. next_offset needs to be aligned via XT_ALIGN (so we can access all members of ip(6)t_entry struct). This problem didn't show up on i686 as it only needs 4-byte alignment for u64, but iptables userspace on other 32bit arches does insert extra padding. Reported-by: John Stultz Tested-by: John Stultz Fixes: 7ed2abddd20cf ("netfilter: x_tables: check standard target size too") Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/netfilter/x_tables.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 467ceadc1d18..5e5cc459dbcb 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -611,7 +611,7 @@ int xt_compat_check_entry_offsets(const void *base, const char *elems, return -EINVAL; if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 && - target_offset + sizeof(struct compat_xt_standard_target) != next_offset) + COMPAT_XT_ALIGN(target_offset + sizeof(struct compat_xt_standard_target)) != next_offset) return -EINVAL; /* compat_xt_entry match has less strict aligment requirements, @@ -693,7 +693,7 @@ int xt_check_entry_offsets(const void *base, return -EINVAL; if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 && - target_offset + sizeof(struct xt_standard_target) != next_offset) + XT_ALIGN(target_offset + sizeof(struct xt_standard_target)) != next_offset) return -EINVAL; return xt_check_entry_match(elems, base + target_offset, From 7bdf4f49826159635b21516296be211f420eefac Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:32 +0200 Subject: [PATCH 305/312] netfilter: arp_tables: simplify translate_compat_table args [ Upstream commit 8dddd32756f6fe8e4e82a63361119b7e2384e02f ] Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 82 +++++++++++++++------------------ 1 file changed, 36 insertions(+), 46 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 53315ee89dcd..aca12b8b44ec 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1206,6 +1206,18 @@ static int do_add_counters(struct net *net, const void __user *user, } #ifdef CONFIG_COMPAT +struct compat_arpt_replace { + char name[XT_TABLE_MAXNAMELEN]; + u32 valid_hooks; + u32 num_entries; + u32 size; + u32 hook_entry[NF_ARP_NUMHOOKS]; + u32 underflow[NF_ARP_NUMHOOKS]; + u32 num_counters; + compat_uptr_t counters; + struct compat_arpt_entry entries[0]; +}; + static inline void compat_release_entry(struct compat_arpt_entry *e) { struct xt_entry_target *t; @@ -1221,8 +1233,7 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, const unsigned char *base, const unsigned char *limit, const unsigned int *hook_entries, - const unsigned int *underflows, - const char *name) + const unsigned int *underflows) { struct xt_entry_target *t; struct xt_target *target; @@ -1293,7 +1304,7 @@ out: static int compat_copy_entry_from_user(struct compat_arpt_entry *e, void **dstptr, - unsigned int *size, const char *name, + unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) { struct xt_entry_target *t; @@ -1326,14 +1337,9 @@ compat_copy_entry_from_user(struct compat_arpt_entry *e, void **dstptr, return ret; } -static int translate_compat_table(const char *name, - unsigned int valid_hooks, - struct xt_table_info **pinfo, +static int translate_compat_table(struct xt_table_info **pinfo, void **pentry0, - unsigned int total_size, - unsigned int number, - unsigned int *hook_entries, - unsigned int *underflows) + const struct compat_arpt_replace *compatr) { unsigned int i, j; struct xt_table_info *newinfo, *info; @@ -1345,8 +1351,8 @@ static int translate_compat_table(const char *name, info = *pinfo; entry0 = *pentry0; - size = total_size; - info->number = number; + size = compatr->size; + info->number = compatr->num_entries; /* Init all hooks to impossible value. */ for (i = 0; i < NF_ARP_NUMHOOKS; i++) { @@ -1357,40 +1363,39 @@ static int translate_compat_table(const char *name, duprintf("translate_compat_table: size %u\n", info->size); j = 0; xt_compat_lock(NFPROTO_ARP); - xt_compat_init_offsets(NFPROTO_ARP, number); + xt_compat_init_offsets(NFPROTO_ARP, compatr->num_entries); /* Walk through entries, checking offsets. */ - xt_entry_foreach(iter0, entry0, total_size) { + xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, - entry0 + total_size, - hook_entries, - underflows, - name); + entry0 + compatr->size, + compatr->hook_entry, + compatr->underflow); if (ret != 0) goto out_unlock; ++j; } ret = -EINVAL; - if (j != number) { + if (j != compatr->num_entries) { duprintf("translate_compat_table: %u not %u entries\n", - j, number); + j, compatr->num_entries); goto out_unlock; } /* Check hooks all assigned */ for (i = 0; i < NF_ARP_NUMHOOKS; i++) { /* Only hooks which are valid */ - if (!(valid_hooks & (1 << i))) + if (!(compatr->valid_hooks & (1 << i))) continue; if (info->hook_entry[i] == 0xFFFFFFFF) { duprintf("Invalid hook entry %u %u\n", - i, hook_entries[i]); + i, info->hook_entry[i]); goto out_unlock; } if (info->underflow[i] == 0xFFFFFFFF) { duprintf("Invalid underflow %u %u\n", - i, underflows[i]); + i, info->underflow[i]); goto out_unlock; } } @@ -1400,17 +1405,17 @@ static int translate_compat_table(const char *name, if (!newinfo) goto out_unlock; - newinfo->number = number; + newinfo->number = compatr->num_entries; for (i = 0; i < NF_ARP_NUMHOOKS; i++) { newinfo->hook_entry[i] = info->hook_entry[i]; newinfo->underflow[i] = info->underflow[i]; } entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; - size = total_size; - xt_entry_foreach(iter0, entry0, total_size) { + size = compatr->size; + xt_entry_foreach(iter0, entry0, compatr->size) { ret = compat_copy_entry_from_user(iter0, &pos, &size, - name, newinfo, entry1); + newinfo, entry1); if (ret != 0) break; } @@ -1420,12 +1425,12 @@ static int translate_compat_table(const char *name, goto free_newinfo; ret = -ELOOP; - if (!mark_source_chains(newinfo, valid_hooks, entry1)) + if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) goto free_newinfo; i = 0; xt_entry_foreach(iter1, entry1, newinfo->size) { - ret = check_target(iter1, name); + ret = check_target(iter1, compatr->name); if (ret != 0) break; ++i; @@ -1470,7 +1475,7 @@ static int translate_compat_table(const char *name, free_newinfo: xt_free_table_info(newinfo); out: - xt_entry_foreach(iter0, entry0, total_size) { + xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); @@ -1482,18 +1487,6 @@ out_unlock: goto out; } -struct compat_arpt_replace { - char name[XT_TABLE_MAXNAMELEN]; - u32 valid_hooks; - u32 num_entries; - u32 size; - u32 hook_entry[NF_ARP_NUMHOOKS]; - u32 underflow[NF_ARP_NUMHOOKS]; - u32 num_counters; - compat_uptr_t counters; - struct compat_arpt_entry entries[0]; -}; - static int compat_do_replace(struct net *net, void __user *user, unsigned int len) { @@ -1527,10 +1520,7 @@ static int compat_do_replace(struct net *net, void __user *user, goto free_newinfo; } - ret = translate_compat_table(tmp.name, tmp.valid_hooks, - &newinfo, &loc_cpu_entry, tmp.size, - tmp.num_entries, tmp.hook_entry, - tmp.underflow); + ret = translate_compat_table(&newinfo, &loc_cpu_entry, &tmp); if (ret != 0) goto free_newinfo; From cd76c8c82c48ca6c1e3da1df7c1dca7f44302c65 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:30 +0200 Subject: [PATCH 306/312] netfilter: ip_tables: simplify translate_compat_table args [ Upstream commit 7d3f843eed29222254c9feab481f55175a1afcc9 ] Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/ip_tables.c | 61 ++++++++++++++-------------------- 1 file changed, 25 insertions(+), 36 deletions(-) diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index fb00670ef2d8..e4c0a279cad6 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1446,7 +1446,6 @@ compat_copy_entry_to_user(struct ipt_entry *e, void __user **dstptr, static int compat_find_calc_match(struct xt_entry_match *m, - const char *name, const struct ipt_ip *ip, unsigned int hookmask, int *size) @@ -1484,8 +1483,7 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, const unsigned char *base, const unsigned char *limit, const unsigned int *hook_entries, - const unsigned int *underflows, - const char *name) + const unsigned int *underflows) { struct xt_entry_match *ematch; struct xt_entry_target *t; @@ -1521,8 +1519,8 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, entry_offset = (void *)e - (void *)base; j = 0; xt_ematch_foreach(ematch, e) { - ret = compat_find_calc_match(ematch, name, - &e->ip, e->comefrom, &off); + ret = compat_find_calc_match(ematch, &e->ip, e->comefrom, + &off); if (ret != 0) goto release_matches; ++j; @@ -1571,7 +1569,7 @@ release_matches: static int compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, - unsigned int *size, const char *name, + unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) { struct xt_entry_target *t; @@ -1647,14 +1645,9 @@ compat_check_entry(struct ipt_entry *e, struct net *net, const char *name) static int translate_compat_table(struct net *net, - const char *name, - unsigned int valid_hooks, struct xt_table_info **pinfo, void **pentry0, - unsigned int total_size, - unsigned int number, - unsigned int *hook_entries, - unsigned int *underflows) + const struct compat_ipt_replace *compatr) { unsigned int i, j; struct xt_table_info *newinfo, *info; @@ -1666,8 +1659,8 @@ translate_compat_table(struct net *net, info = *pinfo; entry0 = *pentry0; - size = total_size; - info->number = number; + size = compatr->size; + info->number = compatr->num_entries; /* Init all hooks to impossible value. */ for (i = 0; i < NF_INET_NUMHOOKS; i++) { @@ -1678,40 +1671,39 @@ translate_compat_table(struct net *net, duprintf("translate_compat_table: size %u\n", info->size); j = 0; xt_compat_lock(AF_INET); - xt_compat_init_offsets(AF_INET, number); + xt_compat_init_offsets(AF_INET, compatr->num_entries); /* Walk through entries, checking offsets. */ - xt_entry_foreach(iter0, entry0, total_size) { + xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, - entry0 + total_size, - hook_entries, - underflows, - name); + entry0 + compatr->size, + compatr->hook_entry, + compatr->underflow); if (ret != 0) goto out_unlock; ++j; } ret = -EINVAL; - if (j != number) { + if (j != compatr->num_entries) { duprintf("translate_compat_table: %u not %u entries\n", - j, number); + j, compatr->num_entries); goto out_unlock; } /* Check hooks all assigned */ for (i = 0; i < NF_INET_NUMHOOKS; i++) { /* Only hooks which are valid */ - if (!(valid_hooks & (1 << i))) + if (!(compatr->valid_hooks & (1 << i))) continue; if (info->hook_entry[i] == 0xFFFFFFFF) { duprintf("Invalid hook entry %u %u\n", - i, hook_entries[i]); + i, info->hook_entry[i]); goto out_unlock; } if (info->underflow[i] == 0xFFFFFFFF) { duprintf("Invalid underflow %u %u\n", - i, underflows[i]); + i, info->underflow[i]); goto out_unlock; } } @@ -1721,17 +1713,17 @@ translate_compat_table(struct net *net, if (!newinfo) goto out_unlock; - newinfo->number = number; + newinfo->number = compatr->num_entries; for (i = 0; i < NF_INET_NUMHOOKS; i++) { newinfo->hook_entry[i] = info->hook_entry[i]; newinfo->underflow[i] = info->underflow[i]; } entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; - size = total_size; - xt_entry_foreach(iter0, entry0, total_size) { + size = compatr->size; + xt_entry_foreach(iter0, entry0, compatr->size) { ret = compat_copy_entry_from_user(iter0, &pos, &size, - name, newinfo, entry1); + newinfo, entry1); if (ret != 0) break; } @@ -1741,12 +1733,12 @@ translate_compat_table(struct net *net, goto free_newinfo; ret = -ELOOP; - if (!mark_source_chains(newinfo, valid_hooks, entry1)) + if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) goto free_newinfo; i = 0; xt_entry_foreach(iter1, entry1, newinfo->size) { - ret = compat_check_entry(iter1, net, name); + ret = compat_check_entry(iter1, net, compatr->name); if (ret != 0) break; ++i; @@ -1791,7 +1783,7 @@ translate_compat_table(struct net *net, free_newinfo: xt_free_table_info(newinfo); out: - xt_entry_foreach(iter0, entry0, total_size) { + xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); @@ -1837,10 +1829,7 @@ compat_do_replace(struct net *net, void __user *user, unsigned int len) goto free_newinfo; } - ret = translate_compat_table(net, tmp.name, tmp.valid_hooks, - &newinfo, &loc_cpu_entry, tmp.size, - tmp.num_entries, tmp.hook_entry, - tmp.underflow); + ret = translate_compat_table(net, &newinfo, &loc_cpu_entry, &tmp); if (ret != 0) goto free_newinfo; From 05e089b3a29c6c965fa9c8801ea72d21a959ca46 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:31 +0200 Subject: [PATCH 307/312] netfilter: ip6_tables: simplify translate_compat_table args [ Upstream commit 329a0807124f12fe1c8032f95d8a8eb47047fb0e ] Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv6/netfilter/ip6_tables.c | 61 ++++++++++++++------------------- 1 file changed, 25 insertions(+), 36 deletions(-) diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index c09dbcc30f82..0c34c08371b9 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1461,7 +1461,6 @@ compat_copy_entry_to_user(struct ip6t_entry *e, void __user **dstptr, static int compat_find_calc_match(struct xt_entry_match *m, - const char *name, const struct ip6t_ip6 *ipv6, unsigned int hookmask, int *size) @@ -1499,8 +1498,7 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, const unsigned char *base, const unsigned char *limit, const unsigned int *hook_entries, - const unsigned int *underflows, - const char *name) + const unsigned int *underflows) { struct xt_entry_match *ematch; struct xt_entry_target *t; @@ -1536,8 +1534,8 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, entry_offset = (void *)e - (void *)base; j = 0; xt_ematch_foreach(ematch, e) { - ret = compat_find_calc_match(ematch, name, - &e->ipv6, e->comefrom, &off); + ret = compat_find_calc_match(ematch, &e->ipv6, e->comefrom, + &off); if (ret != 0) goto release_matches; ++j; @@ -1586,7 +1584,7 @@ release_matches: static int compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, - unsigned int *size, const char *name, + unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) { struct xt_entry_target *t; @@ -1660,14 +1658,9 @@ static int compat_check_entry(struct ip6t_entry *e, struct net *net, static int translate_compat_table(struct net *net, - const char *name, - unsigned int valid_hooks, struct xt_table_info **pinfo, void **pentry0, - unsigned int total_size, - unsigned int number, - unsigned int *hook_entries, - unsigned int *underflows) + const struct compat_ip6t_replace *compatr) { unsigned int i, j; struct xt_table_info *newinfo, *info; @@ -1679,8 +1672,8 @@ translate_compat_table(struct net *net, info = *pinfo; entry0 = *pentry0; - size = total_size; - info->number = number; + size = compatr->size; + info->number = compatr->num_entries; /* Init all hooks to impossible value. */ for (i = 0; i < NF_INET_NUMHOOKS; i++) { @@ -1691,40 +1684,39 @@ translate_compat_table(struct net *net, duprintf("translate_compat_table: size %u\n", info->size); j = 0; xt_compat_lock(AF_INET6); - xt_compat_init_offsets(AF_INET6, number); + xt_compat_init_offsets(AF_INET6, compatr->num_entries); /* Walk through entries, checking offsets. */ - xt_entry_foreach(iter0, entry0, total_size) { + xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, - entry0 + total_size, - hook_entries, - underflows, - name); + entry0 + compatr->size, + compatr->hook_entry, + compatr->underflow); if (ret != 0) goto out_unlock; ++j; } ret = -EINVAL; - if (j != number) { + if (j != compatr->num_entries) { duprintf("translate_compat_table: %u not %u entries\n", - j, number); + j, compatr->num_entries); goto out_unlock; } /* Check hooks all assigned */ for (i = 0; i < NF_INET_NUMHOOKS; i++) { /* Only hooks which are valid */ - if (!(valid_hooks & (1 << i))) + if (!(compatr->valid_hooks & (1 << i))) continue; if (info->hook_entry[i] == 0xFFFFFFFF) { duprintf("Invalid hook entry %u %u\n", - i, hook_entries[i]); + i, info->hook_entry[i]); goto out_unlock; } if (info->underflow[i] == 0xFFFFFFFF) { duprintf("Invalid underflow %u %u\n", - i, underflows[i]); + i, info->underflow[i]); goto out_unlock; } } @@ -1734,17 +1726,17 @@ translate_compat_table(struct net *net, if (!newinfo) goto out_unlock; - newinfo->number = number; + newinfo->number = compatr->num_entries; for (i = 0; i < NF_INET_NUMHOOKS; i++) { newinfo->hook_entry[i] = info->hook_entry[i]; newinfo->underflow[i] = info->underflow[i]; } entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; - size = total_size; - xt_entry_foreach(iter0, entry0, total_size) { + size = compatr->size; + xt_entry_foreach(iter0, entry0, compatr->size) { ret = compat_copy_entry_from_user(iter0, &pos, &size, - name, newinfo, entry1); + newinfo, entry1); if (ret != 0) break; } @@ -1754,12 +1746,12 @@ translate_compat_table(struct net *net, goto free_newinfo; ret = -ELOOP; - if (!mark_source_chains(newinfo, valid_hooks, entry1)) + if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) goto free_newinfo; i = 0; xt_entry_foreach(iter1, entry1, newinfo->size) { - ret = compat_check_entry(iter1, net, name); + ret = compat_check_entry(iter1, net, compatr->name); if (ret != 0) break; ++i; @@ -1804,7 +1796,7 @@ translate_compat_table(struct net *net, free_newinfo: xt_free_table_info(newinfo); out: - xt_entry_foreach(iter0, entry0, total_size) { + xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); @@ -1850,10 +1842,7 @@ compat_do_replace(struct net *net, void __user *user, unsigned int len) goto free_newinfo; } - ret = translate_compat_table(net, tmp.name, tmp.valid_hooks, - &newinfo, &loc_cpu_entry, tmp.size, - tmp.num_entries, tmp.hook_entry, - tmp.underflow); + ret = translate_compat_table(net, &newinfo, &loc_cpu_entry, &tmp); if (ret != 0) goto free_newinfo; From 2756b2a32469316c98a46359710ea2bfdbfe331e Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:33 +0200 Subject: [PATCH 308/312] netfilter: x_tables: xt_compat_match_from_user doesn't need a retval [ Upstream commit 0188346f21e6546498c2a0f84888797ad4063fc5 ] Always returned 0. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- include/linux/netfilter/x_tables.h | 2 +- net/ipv4/netfilter/arp_tables.c | 17 +++++------------ net/ipv4/netfilter/ip_tables.c | 26 +++++++++----------------- net/ipv6/netfilter/ip6_tables.c | 27 +++++++++------------------ net/netfilter/x_tables.c | 5 ++--- 5 files changed, 26 insertions(+), 51 deletions(-) diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 726f4dd63671..445a2d59b341 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -425,7 +425,7 @@ void xt_compat_init_offsets(u_int8_t af, unsigned int number); int xt_compat_calc_jump(u_int8_t af, unsigned int offset); int xt_compat_match_offset(const struct xt_match *match); -int xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, +void xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, unsigned int *size); int xt_compat_match_to_user(const struct xt_entry_match *m, void __user **dstptr, unsigned int *size); diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index aca12b8b44ec..8bdc267e026f 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1302,7 +1302,7 @@ out: return ret; } -static int +static void compat_copy_entry_from_user(struct compat_arpt_entry *e, void **dstptr, unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) @@ -1311,9 +1311,8 @@ compat_copy_entry_from_user(struct compat_arpt_entry *e, void **dstptr, struct xt_target *target; struct arpt_entry *de; unsigned int origsize; - int ret, h; + int h; - ret = 0; origsize = *size; de = (struct arpt_entry *)*dstptr; memcpy(de, e, sizeof(struct arpt_entry)); @@ -1334,7 +1333,6 @@ compat_copy_entry_from_user(struct compat_arpt_entry *e, void **dstptr, if ((unsigned char *)de - base < newinfo->underflow[h]) newinfo->underflow[h] -= origsize - *size; } - return ret; } static int translate_compat_table(struct xt_table_info **pinfo, @@ -1413,16 +1411,11 @@ static int translate_compat_table(struct xt_table_info **pinfo, entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; size = compatr->size; - xt_entry_foreach(iter0, entry0, compatr->size) { - ret = compat_copy_entry_from_user(iter0, &pos, &size, - newinfo, entry1); - if (ret != 0) - break; - } + xt_entry_foreach(iter0, entry0, compatr->size) + compat_copy_entry_from_user(iter0, &pos, &size, + newinfo, entry1); xt_compat_flush_offsets(NFPROTO_ARP); xt_compat_unlock(NFPROTO_ARP); - if (ret) - goto free_newinfo; ret = -ELOOP; if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index e4c0a279cad6..ac02824c4271 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1567,7 +1567,7 @@ release_matches: return ret; } -static int +static void compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) @@ -1576,10 +1576,9 @@ compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, struct xt_target *target; struct ipt_entry *de; unsigned int origsize; - int ret, h; + int h; struct xt_entry_match *ematch; - ret = 0; origsize = *size; de = (struct ipt_entry *)*dstptr; memcpy(de, e, sizeof(struct ipt_entry)); @@ -1588,11 +1587,9 @@ compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, *dstptr += sizeof(struct ipt_entry); *size += sizeof(struct ipt_entry) - sizeof(struct compat_ipt_entry); - xt_ematch_foreach(ematch, e) { - ret = xt_compat_match_from_user(ematch, dstptr, size); - if (ret != 0) - return ret; - } + xt_ematch_foreach(ematch, e) + xt_compat_match_from_user(ematch, dstptr, size); + de->target_offset = e->target_offset - (origsize - *size); t = compat_ipt_get_target(e); target = t->u.kernel.target; @@ -1605,7 +1602,6 @@ compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, if ((unsigned char *)de - base < newinfo->underflow[h]) newinfo->underflow[h] -= origsize - *size; } - return ret; } static int @@ -1721,16 +1717,12 @@ translate_compat_table(struct net *net, entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; size = compatr->size; - xt_entry_foreach(iter0, entry0, compatr->size) { - ret = compat_copy_entry_from_user(iter0, &pos, &size, - newinfo, entry1); - if (ret != 0) - break; - } + xt_entry_foreach(iter0, entry0, compatr->size) + compat_copy_entry_from_user(iter0, &pos, &size, + newinfo, entry1); + xt_compat_flush_offsets(AF_INET); xt_compat_unlock(AF_INET); - if (ret) - goto free_newinfo; ret = -ELOOP; if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 0c34c08371b9..43cc379cc7f3 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1582,7 +1582,7 @@ release_matches: return ret; } -static int +static void compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, unsigned int *size, struct xt_table_info *newinfo, unsigned char *base) @@ -1590,10 +1590,9 @@ compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, struct xt_entry_target *t; struct ip6t_entry *de; unsigned int origsize; - int ret, h; + int h; struct xt_entry_match *ematch; - ret = 0; origsize = *size; de = (struct ip6t_entry *)*dstptr; memcpy(de, e, sizeof(struct ip6t_entry)); @@ -1602,11 +1601,9 @@ compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, *dstptr += sizeof(struct ip6t_entry); *size += sizeof(struct ip6t_entry) - sizeof(struct compat_ip6t_entry); - xt_ematch_foreach(ematch, e) { - ret = xt_compat_match_from_user(ematch, dstptr, size); - if (ret != 0) - return ret; - } + xt_ematch_foreach(ematch, e) + xt_compat_match_from_user(ematch, dstptr, size); + de->target_offset = e->target_offset - (origsize - *size); t = compat_ip6t_get_target(e); xt_compat_target_from_user(t, dstptr, size); @@ -1618,7 +1615,6 @@ compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, if ((unsigned char *)de - base < newinfo->underflow[h]) newinfo->underflow[h] -= origsize - *size; } - return ret; } static int compat_check_entry(struct ip6t_entry *e, struct net *net, @@ -1733,17 +1729,12 @@ translate_compat_table(struct net *net, } entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; - size = compatr->size; - xt_entry_foreach(iter0, entry0, compatr->size) { - ret = compat_copy_entry_from_user(iter0, &pos, &size, - newinfo, entry1); - if (ret != 0) - break; - } + xt_entry_foreach(iter0, entry0, compatr->size) + compat_copy_entry_from_user(iter0, &pos, &size, + newinfo, entry1); + xt_compat_flush_offsets(AF_INET6); xt_compat_unlock(AF_INET6); - if (ret) - goto free_newinfo; ret = -ELOOP; if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 5e5cc459dbcb..a9143696e9b9 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -528,8 +528,8 @@ int xt_compat_match_offset(const struct xt_match *match) } EXPORT_SYMBOL_GPL(xt_compat_match_offset); -int xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, - unsigned int *size) +void xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, + unsigned int *size) { const struct xt_match *match = m->u.kernel.match; struct compat_xt_entry_match *cm = (struct compat_xt_entry_match *)m; @@ -551,7 +551,6 @@ int xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, *size += off; *dstptr += msize; - return 0; } EXPORT_SYMBOL_GPL(xt_compat_match_from_user); From af815d264b7ed1cdceb0e1fdf4fa7d3bd5fe9a99 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 14:17:34 +0200 Subject: [PATCH 309/312] netfilter: x_tables: do compat validation via translate_table [ Upstream commit 09d9686047dbbe1cf4faa558d3ecc4aae2046054 ] This looks like refactoring, but its also a bug fix. Problem is that the compat path (32bit iptables, 64bit kernel) lacks a few sanity tests that are done in the normal path. For example, we do not check for underflows and the base chain policies. While its possible to also add such checks to the compat path, its more copy&pastry, for instance we cannot reuse check_underflow() helper as e->target_offset differs in the compat case. Other problem is that it makes auditing for validation errors harder; two places need to be checked and kept in sync. At a high level 32 bit compat works like this: 1- initial pass over blob: validate match/entry offsets, bounds checking lookup all matches and targets do bookkeeping wrt. size delta of 32/64bit structures assign match/target.u.kernel pointer (points at kernel implementation, needed to access ->compatsize etc.) 2- allocate memory according to the total bookkeeping size to contain the translated ruleset 3- second pass over original blob: for each entry, copy the 32bit representation to the newly allocated memory. This also does any special match translations (e.g. adjust 32bit to 64bit longs, etc). 4- check if ruleset is free of loops (chase all jumps) 5-first pass over translated blob: call the checkentry function of all matches and targets. The alternative implemented by this patch is to drop steps 3&4 from the compat process, the translation is changed into an intermediate step rather than a full 1:1 translate_table replacement. In the 2nd pass (step #3), change the 64bit ruleset back to a kernel representation, i.e. put() the kernel pointer and restore ->u.user.name . This gets us a 64bit ruleset that is in the format generated by a 64bit iptables userspace -- we can then use translate_table() to get the 'native' sanity checks. This has two drawbacks: 1. we re-validate all the match and target entry structure sizes even though compat translation is supposed to never generate bogus offsets. 2. we put and then re-lookup each match and target. THe upside is that we get all sanity tests and ruleset validations provided by the normal path and can remove some duplicated compat code. iptables-restore time of autogenerated ruleset with 300k chains of form -A CHAIN0001 -m limit --limit 1/s -j CHAIN0002 -A CHAIN0002 -m limit --limit 1/s -j CHAIN0003 shows no noticeable differences in restore times: old: 0m30.796s new: 0m31.521s 64bit: 0m25.674s Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- net/ipv4/netfilter/arp_tables.c | 113 +++++------------------ net/ipv4/netfilter/ip_tables.c | 155 +++++++------------------------- net/ipv6/netfilter/ip6_tables.c | 149 +++++------------------------- net/netfilter/x_tables.c | 8 ++ 4 files changed, 86 insertions(+), 339 deletions(-) diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 8bdc267e026f..49620aad2e98 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1226,19 +1226,17 @@ static inline void compat_release_entry(struct compat_arpt_entry *e) module_put(t->u.kernel.target->me); } -static inline int +static int check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, struct xt_table_info *newinfo, unsigned int *size, const unsigned char *base, - const unsigned char *limit, - const unsigned int *hook_entries, - const unsigned int *underflows) + const unsigned char *limit) { struct xt_entry_target *t; struct xt_target *target; unsigned int entry_offset; - int ret, off, h; + int ret, off; duprintf("check_compat_entry_size_and_hooks %p\n", e); if ((unsigned long)e % __alignof__(struct compat_arpt_entry) != 0 || @@ -1283,17 +1281,6 @@ check_compat_entry_size_and_hooks(struct compat_arpt_entry *e, if (ret) goto release_target; - /* Check hooks & underflows */ - for (h = 0; h < NF_ARP_NUMHOOKS; h++) { - if ((unsigned char *)e - base == hook_entries[h]) - newinfo->hook_entry[h] = hook_entries[h]; - if ((unsigned char *)e - base == underflows[h]) - newinfo->underflow[h] = underflows[h]; - } - - /* Clear counters and comefrom */ - memset(&e->counters, 0, sizeof(e->counters)); - e->comefrom = 0; return 0; release_target: @@ -1343,7 +1330,7 @@ static int translate_compat_table(struct xt_table_info **pinfo, struct xt_table_info *newinfo, *info; void *pos, *entry0, *entry1; struct compat_arpt_entry *iter0; - struct arpt_entry *iter1; + struct arpt_replace repl; unsigned int size; int ret = 0; @@ -1352,12 +1339,6 @@ static int translate_compat_table(struct xt_table_info **pinfo, size = compatr->size; info->number = compatr->num_entries; - /* Init all hooks to impossible value. */ - for (i = 0; i < NF_ARP_NUMHOOKS; i++) { - info->hook_entry[i] = 0xFFFFFFFF; - info->underflow[i] = 0xFFFFFFFF; - } - duprintf("translate_compat_table: size %u\n", info->size); j = 0; xt_compat_lock(NFPROTO_ARP); @@ -1366,9 +1347,7 @@ static int translate_compat_table(struct xt_table_info **pinfo, xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, - entry0 + compatr->size, - compatr->hook_entry, - compatr->underflow); + entry0 + compatr->size); if (ret != 0) goto out_unlock; ++j; @@ -1381,23 +1360,6 @@ static int translate_compat_table(struct xt_table_info **pinfo, goto out_unlock; } - /* Check hooks all assigned */ - for (i = 0; i < NF_ARP_NUMHOOKS; i++) { - /* Only hooks which are valid */ - if (!(compatr->valid_hooks & (1 << i))) - continue; - if (info->hook_entry[i] == 0xFFFFFFFF) { - duprintf("Invalid hook entry %u %u\n", - i, info->hook_entry[i]); - goto out_unlock; - } - if (info->underflow[i] == 0xFFFFFFFF) { - duprintf("Invalid underflow %u %u\n", - i, info->underflow[i]); - goto out_unlock; - } - } - ret = -ENOMEM; newinfo = xt_alloc_table_info(size); if (!newinfo) @@ -1414,52 +1376,26 @@ static int translate_compat_table(struct xt_table_info **pinfo, xt_entry_foreach(iter0, entry0, compatr->size) compat_copy_entry_from_user(iter0, &pos, &size, newinfo, entry1); + + /* all module references in entry0 are now gone */ + xt_compat_flush_offsets(NFPROTO_ARP); xt_compat_unlock(NFPROTO_ARP); - ret = -ELOOP; - if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) + memcpy(&repl, compatr, sizeof(*compatr)); + + for (i = 0; i < NF_ARP_NUMHOOKS; i++) { + repl.hook_entry[i] = newinfo->hook_entry[i]; + repl.underflow[i] = newinfo->underflow[i]; + } + + repl.num_counters = 0; + repl.counters = NULL; + repl.size = newinfo->size; + ret = translate_table(newinfo, entry1, &repl); + if (ret) goto free_newinfo; - i = 0; - xt_entry_foreach(iter1, entry1, newinfo->size) { - ret = check_target(iter1, compatr->name); - if (ret != 0) - break; - ++i; - if (strcmp(arpt_get_target(iter1)->u.user.name, - XT_ERROR_TARGET) == 0) - ++newinfo->stacksize; - } - if (ret) { - /* - * The first i matches need cleanup_entry (calls ->destroy) - * because they had called ->check already. The other j-i - * entries need only release. - */ - int skip = i; - j -= i; - xt_entry_foreach(iter0, entry0, newinfo->size) { - if (skip-- > 0) - continue; - if (j-- == 0) - break; - compat_release_entry(iter0); - } - xt_entry_foreach(iter1, entry1, newinfo->size) { - if (i-- == 0) - break; - cleanup_entry(iter1); - } - xt_free_table_info(newinfo); - return ret; - } - - /* And one copy for every other CPU */ - for_each_possible_cpu(i) - if (newinfo->entries[i] && newinfo->entries[i] != entry1) - memcpy(newinfo->entries[i], entry1, newinfo->size); - *pinfo = newinfo; *pentry0 = entry1; xt_free_table_info(info); @@ -1467,17 +1403,16 @@ static int translate_compat_table(struct xt_table_info **pinfo, free_newinfo: xt_free_table_info(newinfo); -out: + return ret; +out_unlock: + xt_compat_flush_offsets(NFPROTO_ARP); + xt_compat_unlock(NFPROTO_ARP); xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); } return ret; -out_unlock: - xt_compat_flush_offsets(NFPROTO_ARP); - xt_compat_unlock(NFPROTO_ARP); - goto out; } static int compat_do_replace(struct net *net, void __user *user, diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index ac02824c4271..c2d3f0eea365 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1481,16 +1481,14 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, struct xt_table_info *newinfo, unsigned int *size, const unsigned char *base, - const unsigned char *limit, - const unsigned int *hook_entries, - const unsigned int *underflows) + const unsigned char *limit) { struct xt_entry_match *ematch; struct xt_entry_target *t; struct xt_target *target; unsigned int entry_offset; unsigned int j; - int ret, off, h; + int ret, off; duprintf("check_compat_entry_size_and_hooks %p\n", e); if ((unsigned long)e % __alignof__(struct compat_ipt_entry) != 0 || @@ -1543,17 +1541,6 @@ check_compat_entry_size_and_hooks(struct compat_ipt_entry *e, if (ret) goto out; - /* Check hooks & underflows */ - for (h = 0; h < NF_INET_NUMHOOKS; h++) { - if ((unsigned char *)e - base == hook_entries[h]) - newinfo->hook_entry[h] = hook_entries[h]; - if ((unsigned char *)e - base == underflows[h]) - newinfo->underflow[h] = underflows[h]; - } - - /* Clear counters and comefrom */ - memset(&e->counters, 0, sizeof(e->counters)); - e->comefrom = 0; return 0; out: @@ -1596,6 +1583,7 @@ compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, xt_compat_target_from_user(t, dstptr, size); de->next_offset = e->next_offset - (origsize - *size); + for (h = 0; h < NF_INET_NUMHOOKS; h++) { if ((unsigned char *)de - base < newinfo->hook_entry[h]) newinfo->hook_entry[h] -= origsize - *size; @@ -1604,41 +1592,6 @@ compat_copy_entry_from_user(struct compat_ipt_entry *e, void **dstptr, } } -static int -compat_check_entry(struct ipt_entry *e, struct net *net, const char *name) -{ - struct xt_entry_match *ematch; - struct xt_mtchk_param mtpar; - unsigned int j; - int ret = 0; - - j = 0; - mtpar.net = net; - mtpar.table = name; - mtpar.entryinfo = &e->ip; - mtpar.hook_mask = e->comefrom; - mtpar.family = NFPROTO_IPV4; - xt_ematch_foreach(ematch, e) { - ret = check_match(ematch, &mtpar); - if (ret != 0) - goto cleanup_matches; - ++j; - } - - ret = check_target(e, net, name); - if (ret) - goto cleanup_matches; - return 0; - - cleanup_matches: - xt_ematch_foreach(ematch, e) { - if (j-- == 0) - break; - cleanup_match(ematch, net); - } - return ret; -} - static int translate_compat_table(struct net *net, struct xt_table_info **pinfo, @@ -1649,7 +1602,7 @@ translate_compat_table(struct net *net, struct xt_table_info *newinfo, *info; void *pos, *entry0, *entry1; struct compat_ipt_entry *iter0; - struct ipt_entry *iter1; + struct ipt_replace repl; unsigned int size; int ret; @@ -1658,12 +1611,6 @@ translate_compat_table(struct net *net, size = compatr->size; info->number = compatr->num_entries; - /* Init all hooks to impossible value. */ - for (i = 0; i < NF_INET_NUMHOOKS; i++) { - info->hook_entry[i] = 0xFFFFFFFF; - info->underflow[i] = 0xFFFFFFFF; - } - duprintf("translate_compat_table: size %u\n", info->size); j = 0; xt_compat_lock(AF_INET); @@ -1672,9 +1619,7 @@ translate_compat_table(struct net *net, xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, - entry0 + compatr->size, - compatr->hook_entry, - compatr->underflow); + entry0 + compatr->size); if (ret != 0) goto out_unlock; ++j; @@ -1687,23 +1632,6 @@ translate_compat_table(struct net *net, goto out_unlock; } - /* Check hooks all assigned */ - for (i = 0; i < NF_INET_NUMHOOKS; i++) { - /* Only hooks which are valid */ - if (!(compatr->valid_hooks & (1 << i))) - continue; - if (info->hook_entry[i] == 0xFFFFFFFF) { - duprintf("Invalid hook entry %u %u\n", - i, info->hook_entry[i]); - goto out_unlock; - } - if (info->underflow[i] == 0xFFFFFFFF) { - duprintf("Invalid underflow %u %u\n", - i, info->underflow[i]); - goto out_unlock; - } - } - ret = -ENOMEM; newinfo = xt_alloc_table_info(size); if (!newinfo) @@ -1711,8 +1639,8 @@ translate_compat_table(struct net *net, newinfo->number = compatr->num_entries; for (i = 0; i < NF_INET_NUMHOOKS; i++) { - newinfo->hook_entry[i] = info->hook_entry[i]; - newinfo->underflow[i] = info->underflow[i]; + newinfo->hook_entry[i] = compatr->hook_entry[i]; + newinfo->underflow[i] = compatr->underflow[i]; } entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; @@ -1721,52 +1649,30 @@ translate_compat_table(struct net *net, compat_copy_entry_from_user(iter0, &pos, &size, newinfo, entry1); + /* all module references in entry0 are now gone. + * entry1/newinfo contains a 64bit ruleset that looks exactly as + * generated by 64bit userspace. + * + * Call standard translate_table() to validate all hook_entrys, + * underflows, check for loops, etc. + */ xt_compat_flush_offsets(AF_INET); xt_compat_unlock(AF_INET); - ret = -ELOOP; - if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) + memcpy(&repl, compatr, sizeof(*compatr)); + + for (i = 0; i < NF_INET_NUMHOOKS; i++) { + repl.hook_entry[i] = newinfo->hook_entry[i]; + repl.underflow[i] = newinfo->underflow[i]; + } + + repl.num_counters = 0; + repl.counters = NULL; + repl.size = newinfo->size; + ret = translate_table(net, newinfo, entry1, &repl); + if (ret) goto free_newinfo; - i = 0; - xt_entry_foreach(iter1, entry1, newinfo->size) { - ret = compat_check_entry(iter1, net, compatr->name); - if (ret != 0) - break; - ++i; - if (strcmp(ipt_get_target(iter1)->u.user.name, - XT_ERROR_TARGET) == 0) - ++newinfo->stacksize; - } - if (ret) { - /* - * The first i matches need cleanup_entry (calls ->destroy) - * because they had called ->check already. The other j-i - * entries need only release. - */ - int skip = i; - j -= i; - xt_entry_foreach(iter0, entry0, newinfo->size) { - if (skip-- > 0) - continue; - if (j-- == 0) - break; - compat_release_entry(iter0); - } - xt_entry_foreach(iter1, entry1, newinfo->size) { - if (i-- == 0) - break; - cleanup_entry(iter1, net); - } - xt_free_table_info(newinfo); - return ret; - } - - /* And one copy for every other CPU */ - for_each_possible_cpu(i) - if (newinfo->entries[i] && newinfo->entries[i] != entry1) - memcpy(newinfo->entries[i], entry1, newinfo->size); - *pinfo = newinfo; *pentry0 = entry1; xt_free_table_info(info); @@ -1774,17 +1680,16 @@ translate_compat_table(struct net *net, free_newinfo: xt_free_table_info(newinfo); -out: + return ret; +out_unlock: + xt_compat_flush_offsets(AF_INET); + xt_compat_unlock(AF_INET); xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); } return ret; -out_unlock: - xt_compat_flush_offsets(AF_INET); - xt_compat_unlock(AF_INET); - goto out; } static int diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 43cc379cc7f3..faad6073e2be 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1496,16 +1496,14 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, struct xt_table_info *newinfo, unsigned int *size, const unsigned char *base, - const unsigned char *limit, - const unsigned int *hook_entries, - const unsigned int *underflows) + const unsigned char *limit) { struct xt_entry_match *ematch; struct xt_entry_target *t; struct xt_target *target; unsigned int entry_offset; unsigned int j; - int ret, off, h; + int ret, off; duprintf("check_compat_entry_size_and_hooks %p\n", e); if ((unsigned long)e % __alignof__(struct compat_ip6t_entry) != 0 || @@ -1558,17 +1556,6 @@ check_compat_entry_size_and_hooks(struct compat_ip6t_entry *e, if (ret) goto out; - /* Check hooks & underflows */ - for (h = 0; h < NF_INET_NUMHOOKS; h++) { - if ((unsigned char *)e - base == hook_entries[h]) - newinfo->hook_entry[h] = hook_entries[h]; - if ((unsigned char *)e - base == underflows[h]) - newinfo->underflow[h] = underflows[h]; - } - - /* Clear counters and comefrom */ - memset(&e->counters, 0, sizeof(e->counters)); - e->comefrom = 0; return 0; out: @@ -1617,41 +1604,6 @@ compat_copy_entry_from_user(struct compat_ip6t_entry *e, void **dstptr, } } -static int compat_check_entry(struct ip6t_entry *e, struct net *net, - const char *name) -{ - unsigned int j; - int ret = 0; - struct xt_mtchk_param mtpar; - struct xt_entry_match *ematch; - - j = 0; - mtpar.net = net; - mtpar.table = name; - mtpar.entryinfo = &e->ipv6; - mtpar.hook_mask = e->comefrom; - mtpar.family = NFPROTO_IPV6; - xt_ematch_foreach(ematch, e) { - ret = check_match(ematch, &mtpar); - if (ret != 0) - goto cleanup_matches; - ++j; - } - - ret = check_target(e, net, name); - if (ret) - goto cleanup_matches; - return 0; - - cleanup_matches: - xt_ematch_foreach(ematch, e) { - if (j-- == 0) - break; - cleanup_match(ematch, net); - } - return ret; -} - static int translate_compat_table(struct net *net, struct xt_table_info **pinfo, @@ -1662,7 +1614,7 @@ translate_compat_table(struct net *net, struct xt_table_info *newinfo, *info; void *pos, *entry0, *entry1; struct compat_ip6t_entry *iter0; - struct ip6t_entry *iter1; + struct ip6t_replace repl; unsigned int size; int ret = 0; @@ -1671,12 +1623,6 @@ translate_compat_table(struct net *net, size = compatr->size; info->number = compatr->num_entries; - /* Init all hooks to impossible value. */ - for (i = 0; i < NF_INET_NUMHOOKS; i++) { - info->hook_entry[i] = 0xFFFFFFFF; - info->underflow[i] = 0xFFFFFFFF; - } - duprintf("translate_compat_table: size %u\n", info->size); j = 0; xt_compat_lock(AF_INET6); @@ -1685,9 +1631,7 @@ translate_compat_table(struct net *net, xt_entry_foreach(iter0, entry0, compatr->size) { ret = check_compat_entry_size_and_hooks(iter0, info, &size, entry0, - entry0 + compatr->size, - compatr->hook_entry, - compatr->underflow); + entry0 + compatr->size); if (ret != 0) goto out_unlock; ++j; @@ -1700,23 +1644,6 @@ translate_compat_table(struct net *net, goto out_unlock; } - /* Check hooks all assigned */ - for (i = 0; i < NF_INET_NUMHOOKS; i++) { - /* Only hooks which are valid */ - if (!(compatr->valid_hooks & (1 << i))) - continue; - if (info->hook_entry[i] == 0xFFFFFFFF) { - duprintf("Invalid hook entry %u %u\n", - i, info->hook_entry[i]); - goto out_unlock; - } - if (info->underflow[i] == 0xFFFFFFFF) { - duprintf("Invalid underflow %u %u\n", - i, info->underflow[i]); - goto out_unlock; - } - } - ret = -ENOMEM; newinfo = xt_alloc_table_info(size); if (!newinfo) @@ -1724,61 +1651,34 @@ translate_compat_table(struct net *net, newinfo->number = compatr->num_entries; for (i = 0; i < NF_INET_NUMHOOKS; i++) { - newinfo->hook_entry[i] = info->hook_entry[i]; - newinfo->underflow[i] = info->underflow[i]; + newinfo->hook_entry[i] = compatr->hook_entry[i]; + newinfo->underflow[i] = compatr->underflow[i]; } entry1 = newinfo->entries[raw_smp_processor_id()]; pos = entry1; + size = compatr->size; xt_entry_foreach(iter0, entry0, compatr->size) compat_copy_entry_from_user(iter0, &pos, &size, newinfo, entry1); + /* all module references in entry0 are now gone. */ xt_compat_flush_offsets(AF_INET6); xt_compat_unlock(AF_INET6); - ret = -ELOOP; - if (!mark_source_chains(newinfo, compatr->valid_hooks, entry1)) + memcpy(&repl, compatr, sizeof(*compatr)); + + for (i = 0; i < NF_INET_NUMHOOKS; i++) { + repl.hook_entry[i] = newinfo->hook_entry[i]; + repl.underflow[i] = newinfo->underflow[i]; + } + + repl.num_counters = 0; + repl.counters = NULL; + repl.size = newinfo->size; + ret = translate_table(net, newinfo, entry1, &repl); + if (ret) goto free_newinfo; - i = 0; - xt_entry_foreach(iter1, entry1, newinfo->size) { - ret = compat_check_entry(iter1, net, compatr->name); - if (ret != 0) - break; - ++i; - if (strcmp(ip6t_get_target(iter1)->u.user.name, - XT_ERROR_TARGET) == 0) - ++newinfo->stacksize; - } - if (ret) { - /* - * The first i matches need cleanup_entry (calls ->destroy) - * because they had called ->check already. The other j-i - * entries need only release. - */ - int skip = i; - j -= i; - xt_entry_foreach(iter0, entry0, newinfo->size) { - if (skip-- > 0) - continue; - if (j-- == 0) - break; - compat_release_entry(iter0); - } - xt_entry_foreach(iter1, entry1, newinfo->size) { - if (i-- == 0) - break; - cleanup_entry(iter1, net); - } - xt_free_table_info(newinfo); - return ret; - } - - /* And one copy for every other CPU */ - for_each_possible_cpu(i) - if (newinfo->entries[i] && newinfo->entries[i] != entry1) - memcpy(newinfo->entries[i], entry1, newinfo->size); - *pinfo = newinfo; *pentry0 = entry1; xt_free_table_info(info); @@ -1786,17 +1686,16 @@ translate_compat_table(struct net *net, free_newinfo: xt_free_table_info(newinfo); -out: + return ret; +out_unlock: + xt_compat_flush_offsets(AF_INET6); + xt_compat_unlock(AF_INET6); xt_entry_foreach(iter0, entry0, compatr->size) { if (j-- == 0) break; compat_release_entry(iter0); } return ret; -out_unlock: - xt_compat_flush_offsets(AF_INET6); - xt_compat_unlock(AF_INET6); - goto out; } static int diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index a9143696e9b9..141ae20fa456 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -535,6 +535,7 @@ void xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, struct compat_xt_entry_match *cm = (struct compat_xt_entry_match *)m; int pad, off = xt_compat_match_offset(match); u_int16_t msize = cm->u.user.match_size; + char name[sizeof(m->u.user.name)]; m = *dstptr; memcpy(m, cm, sizeof(*cm)); @@ -548,6 +549,9 @@ void xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, msize += off; m->u.user.match_size = msize; + strlcpy(name, match->name, sizeof(name)); + module_put(match->me); + strncpy(m->u.user.name, name, sizeof(m->u.user.name)); *size += off; *dstptr += msize; @@ -765,6 +769,7 @@ void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr, struct compat_xt_entry_target *ct = (struct compat_xt_entry_target *)t; int pad, off = xt_compat_target_offset(target); u_int16_t tsize = ct->u.user.target_size; + char name[sizeof(t->u.user.name)]; t = *dstptr; memcpy(t, ct, sizeof(*ct)); @@ -778,6 +783,9 @@ void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr, tsize += off; t->u.user.target_size = tsize; + strlcpy(name, target->name, sizeof(name)); + module_put(target->me); + strncpy(t->u.user.name, name, sizeof(t->u.user.name)); *size += off; *dstptr += tsize; From b7aa372fec2c1d4a1fcf4398ae88ba94134b9522 Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Fri, 1 Apr 2016 15:37:59 +0200 Subject: [PATCH 310/312] netfilter: x_tables: introduce and use xt_copy_counters_from_user [ Upstream commit d7591f0c41ce3e67600a982bab6989ef0f07b3ce ] The three variants use same copy&pasted code, condense this into a helper and use that. Make sure info.name is 0-terminated. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin --- include/linux/netfilter/x_tables.h | 3 ++ net/ipv4/netfilter/arp_tables.c | 48 ++----------------- net/ipv4/netfilter/ip_tables.c | 48 ++----------------- net/ipv6/netfilter/ip6_tables.c | 49 ++------------------ net/netfilter/x_tables.c | 74 ++++++++++++++++++++++++++++++ 5 files changed, 92 insertions(+), 130 deletions(-) diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 445a2d59b341..7741efa43b35 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -248,6 +248,9 @@ int xt_check_match(struct xt_mtchk_param *, unsigned int size, u_int8_t proto, int xt_check_target(struct xt_tgchk_param *, unsigned int size, u_int8_t proto, bool inv_proto); +void *xt_copy_counters_from_user(const void __user *user, unsigned int len, + struct xt_counters_info *info, bool compat); + struct xt_table *xt_register_table(struct net *net, const struct xt_table *table, struct xt_table_info *bootstrap, diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 49620aad2e98..2953ee9e5fa0 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1123,56 +1123,18 @@ static int do_add_counters(struct net *net, const void __user *user, unsigned int i, curcpu; struct xt_counters_info tmp; struct xt_counters *paddc; - unsigned int num_counters; - const char *name; - int size; - void *ptmp; struct xt_table *t; const struct xt_table_info *private; int ret = 0; void *loc_cpu_entry; struct arpt_entry *iter; unsigned int addend; -#ifdef CONFIG_COMPAT - struct compat_xt_counters_info compat_tmp; - if (compat) { - ptmp = &compat_tmp; - size = sizeof(struct compat_xt_counters_info); - } else -#endif - { - ptmp = &tmp; - size = sizeof(struct xt_counters_info); - } + paddc = xt_copy_counters_from_user(user, len, &tmp, compat); + if (IS_ERR(paddc)) + return PTR_ERR(paddc); - if (copy_from_user(ptmp, user, size) != 0) - return -EFAULT; - -#ifdef CONFIG_COMPAT - if (compat) { - num_counters = compat_tmp.num_counters; - name = compat_tmp.name; - } else -#endif - { - num_counters = tmp.num_counters; - name = tmp.name; - } - - if (len != size + num_counters * sizeof(struct xt_counters)) - return -EINVAL; - - paddc = vmalloc(len - size); - if (!paddc) - return -ENOMEM; - - if (copy_from_user(paddc, user + size, len - size) != 0) { - ret = -EFAULT; - goto free; - } - - t = xt_find_table_lock(net, NFPROTO_ARP, name); + t = xt_find_table_lock(net, NFPROTO_ARP, tmp.name); if (IS_ERR_OR_NULL(t)) { ret = t ? PTR_ERR(t) : -ENOENT; goto free; @@ -1180,7 +1142,7 @@ static int do_add_counters(struct net *net, const void __user *user, local_bh_disable(); private = t->private; - if (private->number != num_counters) { + if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; } diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index c2d3f0eea365..3bcf28bf1525 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1310,56 +1310,18 @@ do_add_counters(struct net *net, const void __user *user, unsigned int i, curcpu; struct xt_counters_info tmp; struct xt_counters *paddc; - unsigned int num_counters; - const char *name; - int size; - void *ptmp; struct xt_table *t; const struct xt_table_info *private; int ret = 0; void *loc_cpu_entry; struct ipt_entry *iter; unsigned int addend; -#ifdef CONFIG_COMPAT - struct compat_xt_counters_info compat_tmp; - if (compat) { - ptmp = &compat_tmp; - size = sizeof(struct compat_xt_counters_info); - } else -#endif - { - ptmp = &tmp; - size = sizeof(struct xt_counters_info); - } + paddc = xt_copy_counters_from_user(user, len, &tmp, compat); + if (IS_ERR(paddc)) + return PTR_ERR(paddc); - if (copy_from_user(ptmp, user, size) != 0) - return -EFAULT; - -#ifdef CONFIG_COMPAT - if (compat) { - num_counters = compat_tmp.num_counters; - name = compat_tmp.name; - } else -#endif - { - num_counters = tmp.num_counters; - name = tmp.name; - } - - if (len != size + num_counters * sizeof(struct xt_counters)) - return -EINVAL; - - paddc = vmalloc(len - size); - if (!paddc) - return -ENOMEM; - - if (copy_from_user(paddc, user + size, len - size) != 0) { - ret = -EFAULT; - goto free; - } - - t = xt_find_table_lock(net, AF_INET, name); + t = xt_find_table_lock(net, AF_INET, tmp.name); if (IS_ERR_OR_NULL(t)) { ret = t ? PTR_ERR(t) : -ENOENT; goto free; @@ -1367,7 +1329,7 @@ do_add_counters(struct net *net, const void __user *user, local_bh_disable(); private = t->private; - if (private->number != num_counters) { + if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; } diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index faad6073e2be..5254d76dfce8 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1323,56 +1323,17 @@ do_add_counters(struct net *net, const void __user *user, unsigned int len, unsigned int i, curcpu; struct xt_counters_info tmp; struct xt_counters *paddc; - unsigned int num_counters; - char *name; - int size; - void *ptmp; struct xt_table *t; const struct xt_table_info *private; int ret = 0; const void *loc_cpu_entry; struct ip6t_entry *iter; unsigned int addend; -#ifdef CONFIG_COMPAT - struct compat_xt_counters_info compat_tmp; - if (compat) { - ptmp = &compat_tmp; - size = sizeof(struct compat_xt_counters_info); - } else -#endif - { - ptmp = &tmp; - size = sizeof(struct xt_counters_info); - } - - if (copy_from_user(ptmp, user, size) != 0) - return -EFAULT; - -#ifdef CONFIG_COMPAT - if (compat) { - num_counters = compat_tmp.num_counters; - name = compat_tmp.name; - } else -#endif - { - num_counters = tmp.num_counters; - name = tmp.name; - } - - if (len != size + num_counters * sizeof(struct xt_counters)) - return -EINVAL; - - paddc = vmalloc(len - size); - if (!paddc) - return -ENOMEM; - - if (copy_from_user(paddc, user + size, len - size) != 0) { - ret = -EFAULT; - goto free; - } - - t = xt_find_table_lock(net, AF_INET6, name); + paddc = xt_copy_counters_from_user(user, len, &tmp, compat); + if (IS_ERR(paddc)) + return PTR_ERR(paddc); + t = xt_find_table_lock(net, AF_INET6, tmp.name); if (IS_ERR_OR_NULL(t)) { ret = t ? PTR_ERR(t) : -ENOENT; goto free; @@ -1381,7 +1342,7 @@ do_add_counters(struct net *net, const void __user *user, unsigned int len, local_bh_disable(); private = t->private; - if (private->number != num_counters) { + if (private->number != tmp.num_counters) { ret = -EINVAL; goto unlock_up_free; } diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 141ae20fa456..4b850c639ac5 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -754,6 +754,80 @@ int xt_check_target(struct xt_tgchk_param *par, } EXPORT_SYMBOL_GPL(xt_check_target); +/** + * xt_copy_counters_from_user - copy counters and metadata from userspace + * + * @user: src pointer to userspace memory + * @len: alleged size of userspace memory + * @info: where to store the xt_counters_info metadata + * @compat: true if we setsockopt call is done by 32bit task on 64bit kernel + * + * Copies counter meta data from @user and stores it in @info. + * + * vmallocs memory to hold the counters, then copies the counter data + * from @user to the new memory and returns a pointer to it. + * + * If @compat is true, @info gets converted automatically to the 64bit + * representation. + * + * The metadata associated with the counters is stored in @info. + * + * Return: returns pointer that caller has to test via IS_ERR(). + * If IS_ERR is false, caller has to vfree the pointer. + */ +void *xt_copy_counters_from_user(const void __user *user, unsigned int len, + struct xt_counters_info *info, bool compat) +{ + void *mem; + u64 size; + +#ifdef CONFIG_COMPAT + if (compat) { + /* structures only differ in size due to alignment */ + struct compat_xt_counters_info compat_tmp; + + if (len <= sizeof(compat_tmp)) + return ERR_PTR(-EINVAL); + + len -= sizeof(compat_tmp); + if (copy_from_user(&compat_tmp, user, sizeof(compat_tmp)) != 0) + return ERR_PTR(-EFAULT); + + strlcpy(info->name, compat_tmp.name, sizeof(info->name)); + info->num_counters = compat_tmp.num_counters; + user += sizeof(compat_tmp); + } else +#endif + { + if (len <= sizeof(*info)) + return ERR_PTR(-EINVAL); + + len -= sizeof(*info); + if (copy_from_user(info, user, sizeof(*info)) != 0) + return ERR_PTR(-EFAULT); + + info->name[sizeof(info->name) - 1] = '\0'; + user += sizeof(*info); + } + + size = sizeof(struct xt_counters); + size *= info->num_counters; + + if (size != (u64)len) + return ERR_PTR(-EINVAL); + + mem = vmalloc(len); + if (!mem) + return ERR_PTR(-ENOMEM); + + if (copy_from_user(mem, user, len) == 0) + return mem; + + vfree(mem); + return ERR_PTR(-EFAULT); +} +EXPORT_SYMBOL_GPL(xt_copy_counters_from_user); + #ifdef CONFIG_COMPAT int xt_compat_target_offset(const struct xt_target *target) { From 116c75f642ad4a6d267f399c9f1fe8b91fb822c5 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Sun, 10 Jul 2016 16:46:32 -0700 Subject: [PATCH 311/312] tmpfs: fix regression hang in fallocate undo [ Upstream commit 7f556567036cb7f89aabe2f0954b08566b4efb53 ] The well-spotted fallocate undo fix is good in most cases, but not when fallocate failed on the very first page. index 0 then passes lend -1 to shmem_undo_range(), and that has two bad effects: (a) that it will undo every fallocation throughout the file, unrestricted by the current range; but more importantly (b) it can cause the undo to hang, because lend -1 is treated as truncation, which makes it keep on retrying until every page has gone, but those already fully instantiated will never go away. Big thank you to xfstests generic/269 which demonstrates this. Fixes: b9b4bb26af01 ("tmpfs: don't undo fallocate past its last page") Cc: stable@vger.kernel.org Signed-off-by: Hugh Dickins Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/shmem.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 86c150bf8235..46511ad90bc5 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2137,9 +2137,11 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, NULL); if (error) { /* Remove the !PageUptodate pages we added */ - shmem_undo_range(inode, - (loff_t)start << PAGE_CACHE_SHIFT, - ((loff_t)index << PAGE_CACHE_SHIFT) - 1, true); + if (index > start) { + shmem_undo_range(inode, + (loff_t)start << PAGE_CACHE_SHIFT, + ((loff_t)index << PAGE_CACHE_SHIFT) - 1, true); + } goto undone; } From 5880876e94699ce010554f483ccf0009997955ca Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Wed, 13 Jul 2016 14:17:46 -0400 Subject: [PATCH 312/312] Linux 4.1.28 Signed-off-by: Sasha Levin --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 54b3d8ae8624..241237cd4ca6 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ VERSION = 4 PATCHLEVEL = 1 -SUBLEVEL = 27 +SUBLEVEL = 28 EXTRAVERSION = NAME = Series 4800