1
0
Fork 0
Commit Graph

160463 Commits (d06e4156430e7c5eb4f04dabcaa0d9e2fba335e3)

Author SHA1 Message Date
Vasily Gorbik da17767336 s390/unwind: cleanup unused READ_ONCE_TASK_STACK
Kasan instrumentation of backchain unwinder stack reads is disabled
completely and simply uses READ_ONCE_NOCHECK now.
READ_ONCE_TASK_STACK macro is unused and could be removed.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-02 16:00:27 +02:00
Vasily Gorbik 2095574632 s390/kasan: avoid false positives during stack unwind
Avoid kasan false positive when current task is interrupted in-between
stack frame allocation and backchain write instructions leaving new stack
frame backchain invalid. In particular if backchain is 0 the unwinder
tries to read pt_regs from the stack and might hit kasan poisoned bytes,
leading to kasan "stack-out-of-bounds" report.

Disable kasan instrumentation of unwinder stack reads, since this
limitation couldn't be handled otherwise with current backchain unwinder
implementation.

Fixes: 78c98f9074 ("s390/unwind: introduce stack unwind API")
Reported-by: Julian Wiedmann <jwi@linux.ibm.com>
Tested-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-02 16:00:27 +02:00
Andy Lutomirski 539bca535d x86/entry/64: Fix and clean up paranoid_exit
paranoid_exit needs to restore CR3 before GSBASE.  Doing it in the opposite
order crashes if the exception came from a context with user GSBASE and
user CR3 -- RESTORE_CR3 cannot resture user CR3 if run with user GSBASE.
This results in infinitely recursing exceptions if user code does SYSENTER
with TF set if both FSGSBASE and PTI are enabled.

The old code worked if user code just set TF without SYSENTER because #DB
from user mode is special cased in idtentry and paranoid_exit doesn't run.

Fix it by cleaning up the spaghetti code.  All that paranoid_exit needs to
do is to disable IRQs, handle IRQ tracing, then restore CR3, and restore
GSBASE.  Simply do those actions in that order.

Fixes: 708078f657 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Link: https://lkml.kernel.org/r/59725ceb08977359489fbed979716949ad45f616.1562035429.git.luto@kernel.org
2019-07-02 08:45:20 +02:00
Andy Lutomirski dffb3f9db6 x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled
It's only used if !CONFIG_IA32_EMULATION, so disable it in normal
configs.  This will save a few bytes of text and reduce confusion.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc:  "BaeChang Seok" <chang.seok.bae@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Bae, Chang Seok" <chang.seok.bae@intel.com>
Link: https://lkml.kernel.org/r/0f7dafa72fe7194689de5ee8cfe5d83509fabcf5.1562035429.git.luto@kernel.org
2019-07-02 08:45:20 +02:00
Olof Johansson 180ae50952 mvebu fixes for 5.2 (part 2)
Use the armada-38x-uart compatible strings for Armada XP 98dx3236 SoCs
 in order to not loose character anymore.
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQYqXDMF3cvSLY+g9cLBhiOFHI71QUCXRYouQAKCRALBhiOFHI7
 1XlVAJ0Qjg4gFXUrf7MUpn8LuUi6UTEf3ACdGLW5ZT78BM78TiEGG79H+WsJ/ow=
 =jxUU
 -----END PGP SIGNATURE-----

Merge tag 'mvebu-fixes-5.2-2' of git://git.infradead.org/linux-mvebu into arm/fixes

mvebu fixes for 5.2 (part 2)

Use the armada-38x-uart compatible strings for Armada XP 98dx3236 SoCs
in order to not loose character anymore.

* tag 'mvebu-fixes-5.2-2' of git://git.infradead.org/linux-mvebu:
  ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial node

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-07-01 15:14:09 -07:00
Catalin Marinas 0c61efd322 Merge branch 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux
* 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
  perf: arm_spe: Enable ACPI/Platform automatic module loading
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  MAINTAINERS: Add maintainer entry for the imx8 DDR PMU driver
  drivers/perf: imx_ddr: Add DDR performance counter support to perf
  dt-bindings: perf: imx8-ddr: add imx8qxp ddr performance monitor
2019-07-01 15:53:35 +01:00
Greg Kroah-Hartman f254e65ad6 usb: changes for v5.3 merge window
The biggest part here is a set of patches removing unnecesary variables
 from several drivers.
 
 Meson-g12a's dwc3 glue implemented IRQ-based OTG/DRD role swap.
 
 Qcom's dwc3 glue added support for ACPI, mainly for the AArch64-based
 SoCs.
 
 DWC3 also got support for Intel Elkhart Lake platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQJRBAABCAA7FiEElLzh7wn96CXwjh2IzL64meEamQYFAl0UdeMdHGZlbGlwZS5i
 YWxiaUBsaW51eC5pbnRlbC5jb20ACgkQzL64meEamQbuBxAAqMp9nwVgYu9beeXP
 1xEjfnc/OxA8oMPcbJVPiYseVowbrj5Ue3SK8XcDCeSDfEI09PNOqfpNtLXvjVie
 NxDMj1zj31Ggb0XfoweOZQHXXpq/6tlfqVJ/oXfkxQ92wuSlyKzkoA7ZuCxAy9ay
 p+E+/cSa1E5LGigI/XEyX2C9JuANd9vSM/CaA5Z2XbosThLK9svtHWlNRIPolIGB
 fUBRm3JVi1jLxAMfbu/8Ng05xYGIPnwi8JDcQ8swAdm5nENtuq9Z0eMm8EAxLdvn
 UwArRR14uI4Vgs69IH4R28tmM4MMsuUVnKv3nxOYcoqQ01u9dySiEYsT5x7RETLu
 GH7v4NMdTqTIfN8ECFLUfaE8+tLBx6MjFOBxNHIeu1tc+MrRzb7a7Z00dkpUlMkg
 jaddCfwbAx3CgJ77nDILBYnVRpaEzlKhZWrNkoSCUI1Ty0QlsnInUkhXtUuayi+R
 AjCBc1PBXPOc6FHx5ECQrA0HWBhC0MW23ncdAFxz1eqqJPYhNbPn5zPEaZ8nNvmz
 R1aUlxDi8FDyRvKbjmGoeRrLbiwzcu/9xiLZ13U4H/kPG4+1g+rx3F8ExIvWr1p+
 XrCJCDdYKN+D9KxbO/5ERg38fARsynryXp4Yll4cLR7IWCQZykkVJ+MuLDwejNF1
 itw69proXZUqZ3Voa9C5a1V/gCQ=
 =3HLl
 -----END PGP SIGNATURE-----

Merge tag 'usb-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next

Felipe writes:

usb: changes for v5.3 merge window

The biggest part here is a set of patches removing unnecesary variables
from several drivers.

Meson-g12a's dwc3 glue implemented IRQ-based OTG/DRD role swap.

Qcom's dwc3 glue added support for ACPI, mainly for the AArch64-based
SoCs.

DWC3 also got support for Intel Elkhart Lake platforms.

* tag 'usb-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (30 commits)
  usb: dwc3: remove unused @lock member of dwc3_ep struct
  usb: dwc3: pci: Add Support for Intel Elkhart Lake Devices
  usb: Replace snprintf with scnprintf in gether_get_ifname
  usb: gadget: ether: Fix race between gether_disconnect and rx_submit
  usb: gadget: storage: Remove warning message
  usb: dwc3: gadget: Add support for disabling U1 and U2 entries
  usb: gadget: send usb_gadget as an argument in get_config_params
  doc: dt: bindings: usb: dwc3: Update entries for disabling U1 and U2
  usb: dwc3: qcom: Use of_clk_get_parent_count()
  usb: dwc3: Fix core validation in probe, move after clocks are enabled
  usb: dwc3: qcom: Improve error handling
  usb: dwc3: qcom: Start USB in 'host mode' on the SDM845
  usb: dwc3: qcom: Add support for booting with ACPI
  soc: qcom: geni: Add support for ACPI
  Revert "usb: dwc2: host: Setting qtd to NULL after freeing it"
  usb: gadget: net2272: remove redundant assignments to pointer 's'
  usb: gadget: Zero ffs_io_data
  USB: omap_udc: Remove unneeded variable
  fotg210-udc: Remove unneeded variable
  usb: gadget: at91_udc: Remove unneeded variable
  ...
2019-07-01 12:01:33 +02:00
Christoph Hellwig 69878ef475 m68k: Implement arch_dma_prep_coherent()
When we remap memory as non-cached, to be used as a DMA coherent buffer,
we should writeback all cache and invalidate the cache lines so that we
make sure we have a clean slate.  Implement this using the cache_push()
helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-07-01 11:17:00 +02:00
Christoph Hellwig 34dc63a5fb m68k: Use the generic dma coherent remap allocator
This switches m68k to using common code for the DMA allocations,
including potential use of the CMA allocator if configured.
Also add a comment where the existing behavior seems to be lacking.

Switching to the generic code enables DMA allocations from atomic
context, which is required by the DMA API documentation, and also
adds various other minor features drivers start relying upon.  It
also makes sure we have a tested code base for all architectures
that require uncached pte bits for coherent DMA allocations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-07-01 11:17:00 +02:00
Linus Torvalds 39132f746e powerpc fixes for 5.2 #7
One fix for a regression in my commit adding KUAP (Kernel User Access
 Prevention) on Radix, which incorrectly touched the AMR in the early machine
 check handler.
 
 Thanks to:
   Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdF00aAAoJEFHr6jzI4aWAT2UQAJCnXrBsNJd7WikZE8NzwdmM
 G6bioGCSPgNuWDwaxgpi6RSilET3poBBt+NpttgOslZtzif/5mrLIuYqwQYgOTbL
 Oa4CnzVBHnBDFKqcqe/Sm7cKuvd7KO8RVbyfhNuQbm1y9Nqr3vPYKwQ6CTz7bth4
 AatNvjP12Ag8hDwk3VpOOiG88jKpj/N3V7PLNWOt9jn8B3rCWm5/7xZ84VSNWdRQ
 /MvdGAcFAboywZMj44u8mBpT7+EueFa/vVbpCj8gv9QhRSSGwSL1jZ5wNu2Iv6D+
 IxxZqdO3KHJVixEAC4fs5KWCuA84uhjlRMkP2BXTgKNZT3qXaLx0e8Qv9okg/xAU
 dAuZEQ0cv+gxdCblEiVZ+jjG0LQsntwXJwnsCeWjcHQr6S0umd2utFLl1N3HTqfx
 QhgatD5pTGvGU2WHO4+dhXeh0nITVfcB2E3cM0DHUgCESc1BGmK0MtS1kHYiQptt
 BMY5Y92D3vndmnoLTZzQ2DFj5of2u49+y0Cpti7RhJN9yV836bPGm1K8GnropHz8
 7HHYS4hV3HBFUlYH7zHLp4BMNg3nkdTK+WTR6HwFFSREzM59NZtVg5xJVk0j66GK
 mZIJoVOSQ0Sac03xYqwtdxdupxoulXy+khBcjC56OxxOEMIfjS66ZnawTDhI2jVf
 EI7VE3Y4hzrA4pMTw9fp
 =I22i
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "One fix for a regression in my commit adding KUAP (Kernel User Access
  Prevention) on Radix, which incorrectly touched the AMR in the early
  machine check handler.

  Thanks to Nicholas Piggin"

* tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/exception: Fix machine check early corrupting AMR
2019-06-30 11:20:52 +08:00
Linus Torvalds 728254541e Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes all over the place:

   - might_sleep() atomicity fix in the microcode loader

   - resctrl boundary condition fix

   - APIC arithmethics bug fix for frequencies >= 4.2 GHz

   - three 5-level paging crash fixes

   - two speculation fixes

   - a perf/stacktrace fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind/orc: Fall back to using frame pointers for generated code
  perf/x86: Always store regs->ip in perf_callchain_kernel()
  x86/speculation: Allow guests to use SSBD even if host does not
  x86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()
  x86/boot/64: Add missing fixup_pointer() for next_early_pgt access
  x86/boot/64: Fix crash if kernel image crosses page table boundary
  x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz
  x86/resctrl: Prevent possible overrun during bitmap operations
  x86/microcode: Fix the microcode load on CPU hotplug for real
2019-06-29 19:42:30 +08:00
Linus Torvalds 57103eb7c6 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Various fixes, most of them related to bugs perf fuzzing found in the
  x86 code"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/regs: Use PERF_REG_EXTENDED_MASK
  perf/x86: Remove pmu->pebs_no_xmm_regs
  perf/x86: Clean up PEBS_XMM_REGS
  perf/x86/regs: Check reserved bits
  perf/x86: Disable extended registers for non-supported PMUs
  perf/ioctl: Add check for the sample_period value
  perf/core: Fix perf_sample_regs_user() mm check
2019-06-29 19:39:17 +08:00
Linus Torvalds eed7d30e12 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
 "Diverse irqchip driver fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Fix command queue pointer comparison bug
  irqchip/mips-gic: Use the correct local interrupt map registers
  irqchip/ti-sci-inta: Fix kernel crash if irq_create_fwspec_mapping fail
  irqchip/irq-csky-mpintc: Support auto irq deliver to all cpus
2019-06-29 19:36:53 +08:00
Linus Torvalds a7211bc9f3 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "Four fixes:
   - fix a kexec crash on arm64
   - fix a reboot crash on some Android platforms
   - future-proof the code for upcoming ACPI 6.2 changes
   - fix a build warning on x86"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efibc: Replace variable set function in notifier call
  x86/efi: fix a -Wtype-limits compilation warning
  efi/bgrt: Drop BGRT status field reserved bits check
  efi/memreserve: deal with memreserve entries in unmapped memory
2019-06-29 19:32:09 +08:00
Thomas Gleixner c8c4076723 x86/timer: Skip PIT initialization on modern chipsets
Recent Intel chipsets including Skylake and ApolloLake have a special
ITSSPRC register which allows the 8254 PIT to be gated.  When gated, the
8254 registers can still be programmed as normal, but there are no IRQ0
timer interrupts.

Some products such as the Connex L1430 and exone go Rugged E11 use this
register to ship with the PIT gated by default. This causes Linux to fail
to boot:

  Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with
  apic=debug and send a report.

The panic happens before the framebuffer is initialized, so to the user, it
appears as an early boot hang on a black screen.

Affected products typically have a BIOS option that can be used to enable
the 8254 and make Linux work (Chipset -> South Cluster Configuration ->
Miscellaneous Configuration -> 8254 Clock Gating), however it would be best
to make Linux support the no-8254 case.

Modern sytems allow to discover the TSC and local APIC timer frequencies,
so the calibration against the PIT is not required. These systems have
always running timers and the local APIC timer works also in deep power
states.

So the setup of the PIT including the IO-APIC timer interrupt delivery
checks are a pointless exercise.

Skip the PIT setup and the IO-APIC timer interrupt checks on these systems,
which avoids the panic caused by non ticking PITs and also speeds up the
boot process.

Thanks to Daniel for providing the changelog, initial analysis of the
problem and testing against a variety of machines.

Reported-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Cc: hdegoede@redhat.com
Link: https://lkml.kernel.org/r/20190628072307.24678-1-drake@endlessm.com
2019-06-29 11:35:35 +02:00
Linus Torvalds f8b5c72227 ARC fixes for 5.2-rc7
- hsdk platform unifying apertures
 
  - build system CROSS_COMPILE prefix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJdFqxcAAoJEGnX8d3iisJebpAP/3clqurFHD7pVwqzq5TImlDN
 u5t4GqMqPAVnbXArv5iiHJIGRwkcPHMoZB5qj/h303zKmNfwMQ4CAjQlDC2YDaGg
 7dk6ovitiO+ZyH7F7viF8uU11cU2vUnuLZ1vP+KVEbu1mUConL4KYj9KMJUoO+VX
 KWdhsVEE+b/fQV1hXb/Jvqciithi3F+B7QKJPVSz39FHOpfQzCBq4yn6aMfKcSvR
 bAWog824yLApOLtfDlM2/+bR8gpxmacOBn5duutymBVoB2Tz9Pop/jcDcQJi836e
 b2iCia6vygOLv3XHFf4zf0VPIeCGmncT8P48QydUAiaYd7dalURkVETqm2YS9LR2
 Pu9D5X6xPw6/0mHNVJ6gWcuDSfN/qHX++m8IDrSJF+3/f/12PBqm4HBIFy7GumVl
 nh99DJTo8LHIDcr5ZSavb8tmtSp1oL/3QFT7ydwP60XyOjlu8ZzP7/S/ycZHMKHV
 bX4sVyvtwtejQ0Gahmt8m+MxraI183yuFRZpIEXE2YbQfbMbPdIeUgiTiDQWphxs
 UCtnucepxKrGKLqpdWn9SxlGtO84Gl7/YROYfk+jJR7IwKK7blnfqzBL7JtfZ/CN
 X5qIYkqfyaqMWuQYpPRdZNqy1K76l30rR6LWLJ9BjXJNKtxv4eflPApG3RSIEwyY
 ijAp+v3+TvmZRSP+yQM7
 =Mhn6
 -----END PGP SIGNATURE-----

Merge tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - hsdk platform unifying apertures

 - build system CROSS_COMPILE prefix

* tag 'arc-5.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-hsdk]: unify memory apertures configuration
  ARC: build: Try to guess CROSS_COMPILE with cc-cross-prefix
2019-06-29 17:05:58 +08:00
Linus Torvalds c57582adfd Minor RISC-V fixes and one defconfig update for the v5.2-rc series.
The fixes have no functional impact:
 
 - Fix some comment text in the memory management vmalloc_fault path.
 
 - Fix some warnings from the DT compiler in our newly-added DT files.
 
 - Change the newly-added DT bindings such that SoC IP blocks with
   external I/O are marked as "disabled" by default, then enable them
   explicitly in board DT files when the devices are used on the board.
   This aligns the bindings with existing upstream practice.
 
 - Add the MIT license as an option for a minor header file, at the
   request of one of the U-Boot maintainers.
 
 The RISC-V defconfig update builds the SiFive SPI driver and the
 MMC-SPI driver by default.  The intention here is to make v5.2 more
 usable for testers and users with RISC-V hardware.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl0WljgACgkQx4+xDQu9
 KksbXQ//fohS8vHCMCkqj3rtKM3fuf9DRdZZbZf0WdPL5463JxTTK8JULrrawjL5
 j57Ve/EFRQFVSELBtWd0u/4sgAcgGmyJWnfexk3LYISNMZCjBe6Zuz+7Q9Ykbhoa
 YKpjOreDeO+fbQpGqMHK2suD5WFVXsDfiI3TmHE6xGIm0sWdpANawpz2K4CzBkEO
 XOaOsmVPT8HfN2f0XodCmzo2VrGNeEutqyxc9+X1Ah0nxBecj56t9TK9wnseTWrE
 hWjnMw2KMZFTnmtOOQ8kB0EfcRDZ8AvXymAb1BHwuWwmxLFrGELsGKRWzrH+qhyT
 4mlexMjdyz69N1uYWieO6FWGMqbIm+ncR7cMwIl2hOErtJiSoUf5cwGhflXMk9ph
 b/oWmNzLGE/7ib/Uo1tfaBmdEYzlzziEtkB0DDWIf16wqMVK5zyoPknnHC7WPIBa
 7WyN+2FKA7b0440Kqfywgq9CMZ3odvhXCLAEmFBdwaa9wyKGsOR6sUZhPXGUSjyL
 oKe4oszbKmqaUboxTo/YzDYHpD4BPGoBMievY8kCO+TcewN2ARczJngQyc2FLS+B
 BUMFZmTUr85pt1pcnNqK84D5N6alldLqLbKwczYq3PvtHzIR2kFUfZGMwQ0DlEh2
 IOJMDcmHehuCmCAz4jnNykOlJPDIMIYiVLeUtGp+1IwZjcvLfxg=
 =+HL9
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
 "Minor RISC-V fixes and one defconfig update.

  The fixes have no functional impact:

   - Fix some comment text in the memory management vmalloc_fault path.

   - Fix some warnings from the DT compiler in our newly-added DT files.

   - Change the newly-added DT bindings such that SoC IP blocks with
     external I/O are marked as "disabled" by default, then enable them
     explicitly in board DT files when the devices are used on the
     board. This aligns the bindings with existing upstream practice.

   - Add the MIT license as an option for a minor header file, at the
     request of one of the U-Boot maintainers.

  The RISC-V defconfig update builds the SiFive SPI driver and the
  MMC-SPI driver by default. The intention here is to make v5.2 more
  usable for testers and users with RISC-V hardware"

* tag 'riscv-for-v5.2/fixes-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: mm: Fix code comment
  dt-bindings: clock: sifive: add MIT license as an option for the header file
  dt-bindings: riscv: resolve 'make dt_binding_check' warnings
  riscv: dts: Re-organize the DT nodes
  RISC-V: defconfig: enable MMC & SPI for RISC-V
2019-06-29 17:04:21 +08:00
Steven Rostedt (VMware) 39611265ed ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()
Taking the text_mutex in ftrace_arch_code_modify_prepare() is to fix a
race against module loading and live kernel patching that might try to
change the text permissions while ftrace has it as read/write. This
really needs to be documented in the code. Add a comment that does such.

Link: http://lkml.kernel.org/r/20190627211819.5a591f52@gandalf.local.home

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-06-28 14:21:25 -04:00
Petr Mladek d5b844a2cf ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()
The commit 9f255b632b ("module: Fix livepatch/ftrace module text
permissions race") causes a possible deadlock between register_kprobe()
and ftrace_run_update_code() when ftrace is using stop_machine().

The existing dependency chain (in reverse order) is:

-> #1 (text_mutex){+.+.}:
       validate_chain.isra.21+0xb32/0xd70
       __lock_acquire+0x4b8/0x928
       lock_acquire+0x102/0x230
       __mutex_lock+0x88/0x908
       mutex_lock_nested+0x32/0x40
       register_kprobe+0x254/0x658
       init_kprobes+0x11a/0x168
       do_one_initcall+0x70/0x318
       kernel_init_freeable+0x456/0x508
       kernel_init+0x22/0x150
       ret_from_fork+0x30/0x34
       kernel_thread_starter+0x0/0xc

-> #0 (cpu_hotplug_lock.rw_sem){++++}:
       check_prev_add+0x90c/0xde0
       validate_chain.isra.21+0xb32/0xd70
       __lock_acquire+0x4b8/0x928
       lock_acquire+0x102/0x230
       cpus_read_lock+0x62/0xd0
       stop_machine+0x2e/0x60
       arch_ftrace_update_code+0x2e/0x40
       ftrace_run_update_code+0x40/0xa0
       ftrace_startup+0xb2/0x168
       register_ftrace_function+0x64/0x88
       klp_patch_object+0x1a2/0x290
       klp_enable_patch+0x554/0x980
       do_one_initcall+0x70/0x318
       do_init_module+0x6e/0x250
       load_module+0x1782/0x1990
       __s390x_sys_finit_module+0xaa/0xf0
       system_call+0xd8/0x2d0

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(text_mutex);
                               lock(cpu_hotplug_lock.rw_sem);
                               lock(text_mutex);
  lock(cpu_hotplug_lock.rw_sem);

It is similar problem that has been solved by the commit 2d1e38f566
("kprobes: Cure hotplug lock ordering issues"). Many locks are involved.
To be on the safe side, text_mutex must become a low level lock taken
after cpu_hotplug_lock.rw_sem.

This can't be achieved easily with the current ftrace design.
For example, arm calls set_all_modules_text_rw() already in
ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c.
This functions is called:

  + outside stop_machine() from ftrace_run_update_code()
  + without stop_machine() from ftrace_module_enable()

Fortunately, the problematic fix is needed only on x86_64. It is
the only architecture that calls set_all_modules_text_rw()
in ftrace path and supports livepatching at the same time.

Therefore it is enough to move text_mutex handling from the generic
kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c:

   ftrace_arch_code_modify_prepare()
   ftrace_arch_code_modify_post_process()

This patch basically reverts the ftrace part of the problematic
commit 9f255b632b ("module: Fix livepatch/ftrace module
text permissions race"). And provides x86_64 specific-fix.

Some refactoring of the ftrace code will be needed when livepatching
is implemented for arm or nds32. These architectures call
set_all_modules_text_rw() and use stop_machine() at the same time.

Link: http://lkml.kernel.org/r/20190627081334.12793-1-pmladek@suse.com

Fixes: 9f255b632b ("module: Fix livepatch/ftrace module text permissions race")
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
[
  As reviewed by Miroslav Benes <mbenes@suse.cz>, removed return value of
  ftrace_run_update_code() as it is a void function.
]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-06-28 14:20:25 -04:00
Christian Brauner 7615d9e178
arch: wire-up pidfd_open()
This wires up the pidfd_open() syscall into all arches at once.

Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
2019-06-28 12:17:55 +02:00
Ricardo Neri fd329f276e x86/mtrr: Skip cache flushes on CPUs with cache self-snooping
Programming MTRR registers in multi-processor systems is a rather lengthy
process. Furthermore, all processors must program these registers in lock
step and with interrupts disabled; the process also involves flushing
caches and TLBs twice. As a result, the process may take a considerable
amount of time.

On some platforms, this can lead to a large skew of the refined-jiffies
clock source. Early when booting, if no other clock is available (e.g.,
booting with hpet=disabled), the refined-jiffies clock source is used to
monitor the TSC clock source. If the skew of refined-jiffies is too large,
Linux wrongly assumes that the TSC is unstable:

  clocksource: timekeeping watchdog on CPU1: Marking clocksource
               'tsc-early' as unstable because the skew is too large:
  clocksource: 'refined-jiffies' wd_now: fffedc10 wd_last:
               fffedb90 mask: ffffffff
  clocksource: 'tsc-early' cs_now: 5eccfddebc cs_last: 5e7e3303d4
               mask: ffffffffffffffff
  tsc: Marking TSC unstable due to clocksource watchdog

As per measurements, around 98% of the time needed by the procedure to
program MTRRs in multi-processor systems is spent flushing caches with
wbinvd(). As per the Section 11.11.8 of the Intel 64 and IA 32
Architectures Software Developer's Manual, it is not necessary to flush
caches if the CPU supports cache self-snooping. Thus, skipping the cache
flushes can reduce by several tens of milliseconds the time needed to
complete the programming of the MTRR registers:

Platform                      	Before	   After
104-core (208 Threads) Skylake  1437ms      28ms
  2-core (  4 Threads) Haswell   114ms       2ms

Reported-by: Mohammad Etemadi <mohammad.etemadi@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Alan Cox <alan.cox@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jordan Borgner <mail@jordan-borgner.de>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lkml.kernel.org/r/1561689337-19390-3-git-send-email-ricardo.neri-calderon@linux.intel.com
2019-06-28 07:21:00 +02:00
Ricardo Neri 1e03bff360 x86/cpu/intel: Clear cache self-snoop capability in CPUs with known errata
Processors which have self-snooping capability can handle conflicting
memory type across CPUs by snooping its own cache. However, there exists
CPU models in which having conflicting memory types still leads to
unpredictable behavior, machine check errors, or hangs.

Clear this feature on affected CPUs to prevent its use.

Suggested-by: Alan Cox <alan.cox@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jordan Borgner <mail@jordan-borgner.de>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Mohammad Etemadi <mohammad.etemadi@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lkml.kernel.org/r/1561689337-19390-2-git-send-email-ricardo.neri-calderon@linux.intel.com
2019-06-28 07:20:48 +02:00
Baoquan He 8ff80fbe7e x86/kdump/64: Restrict kdump kernel reservation to <64TB
Restrict kdump to only reserve crashkernel below 64TB.

The reaons is that the kdump may jump from a 5-level paging mode to a
4-level paging mode kernel. If a 4-level paging mode kdump kernel is put
above 64TB, then the kdump kernel cannot start.

The 1st kernel reserves the kdump kernel region during bootup. At that
point it is not known whether the kdump kernel has 5-level or 4-level
paging support.

To support both restrict the kdump kernel reservation to the lower 64TB
address space to ensure that a 4-level paging mode kdump kernel can be
loaded and successfully started.

[ tglx: Massaged changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20190524073810.24298-4-bhe@redhat.com
2019-06-28 07:14:59 +02:00
Baoquan He ee338b9ee2 x86/kexec/64: Prevent kexec from 5-level paging to a 4-level only kernel
If the running kernel has 5-level paging activated, the 5-level paging mode
is preserved across kexec. If the kexec'ed kernel does not contain support
for handling active 5-level paging mode in the decompressor, the
decompressor will crash with #GP.

Prevent this situation at load time. If 5-level paging is active, check the
xloadflags whether the kexec kernel can handle 5-level paging at least in
the decompressor. If not, reject the load attempt and print out an error
message.

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: dyoung@redhat.com
Link: https://lkml.kernel.org/r/20190524073810.24298-3-bhe@redhat.com
2019-06-28 07:14:59 +02:00
Baoquan He f2d08c5d3b x86/boot: Add xloadflags bits to check for 5-level paging support
The current kernel supports 5-level paging mode, and supports dynamically
choosing the paging mode during bootup depending on the kernel image,
hardware and kernel parameter settings. This flexibility brings several
issues to kexec/kdump:

1) Dynamic switching between paging modes requires support in the target
   kernel. This means kexec from a 5-level paging kernel into a kernel
   which does not support mode switching is not possible. So the loader
   needs to be able to analyze the supported paging modes of the kexec
   target kernel.

2) If running on a 5-level paging kernel and the kexec target kernel is a
   4-level paging kernel, the target immage cannot be loaded above the 64TB
   address space limit. But the kexec loader searches for a load area from
   top to bottom which would eventually put the target kernel above 64TB
   when the machine has large enough RAM size. So the loader needs to be
   able to analyze the paging mode of the target kernel to load it at a
   suitable spot in the address space.

Solution:

Add two bits XLF_5LEVEL and XLF_5LEVEL_ENABLED:

 - Bit XLF_5LEVEL indicates whether 5-level paging mode switching support
   is available. (Issue #1)

 - Bit XLF_5LEVEL_ENABLED indicates whether the kernel was compiled with
   full 5-level paging support (CONFIG_X86_5LEVEL=y). (Issue #2)

The loader will use these bits to verify whether the target kernel is
suitable to be kexec'ed to from a 5-level paging kernel and to determine
the constraints of the target kernel load address.

The flags will be used by the kernel kexec subsystem and the userspace
kexec tools.

[ tglx: Massaged changelog ]

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: dyoung@redhat.com
Link: https://lkml.kernel.org/r/20190524073810.24298-2-bhe@redhat.com
2019-06-28 07:14:59 +02:00
David S. Miller d96ff269a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The new route handling in ip_mc_finish_output() from 'net' overlapped
with the new support for returning congestion notifications from BPF
programs.

In order to handle this I had to take the dev_loopback_xmit() calls
out of the switch statement.

The aquantia driver conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 21:06:39 -07:00
Linus Torvalds fe2da896fd ARM: SoC fixes
A smaller batch of fixes, nothing that stands out as risky or scary.
 
 Mostly DTS tweaks for a few issues:
  - GPU fixlets for Meson
  - CPU idle fix for LS1028A
  - PWM interrupt fixes for i.MX6UL
 
 Also, enable a driver (FSL_EDMA) on arm64 defconfig, and a warning and
 two MAINTAINER tweaks.
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl0ULP8PHG9sb2ZAbGl4
 b20ubmV0AAoJEIwa5zzehBx3h3sP/AkPQ+18tw5r6eY01k7a+JtDIbzKUizc6qh5
 /IBOpynFVv28+VRVrmu1xqek+5iJ7pVkQrJO5Nf0ChbFjo6Hqdk/84tivccyozrY
 4eO/7BALoV57g6inDTLWvhYL3V8bwLYT/1XCP4cN1Di9WBqBhZoe+h8BQr3ztrep
 p3QDjs3WDSzsJ8Oy8NBDUFXtWnZznXaSRzXFKGaUEVIpnlV4OHNW5XbXkLFFHygO
 SmoJdJRPIoKki6Gq0GvZH4U/0U53sa927uwT/02DaxIzlPFfhQtyNw8ZCo//6adg
 tyUTJn7zzOTxFSJZ512EJ4OG6MG9T/3wDGPlT+KJ4Bgv19jSeksdnvCfZrtAuPfu
 j1APenXGRNImSDJOeDrxeKAbW29RpxQjYzvMvGT3iYqH93sD/lz6uIoObCcGzwXQ
 BGIvMKOs3luw6Bk3pJpfBmzMPBkrDWerDgL3qdHnQEYenmbeTCyKFQVOM7f+PQqg
 jKT7gitFq1bT4JXImcInEdY/2nFlBJUgdIwwK273uS0RmeOHmF8TNJpKeaYbO7ds
 fcG177RaLqPoIfx6GbT7kZRVSgBHJrUh6gRmuQcoJaaP4zXdX0+N3S1WYQkGMkos
 t0SU9YPsqDyCpmtCN7TTY5MwhR/jTGLmxArCGBf1+IrfFx1cdmaFPjnJvEI8fCJM
 CJRPfzKQ
 =Jr9u
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
 "A smaller batch of fixes, nothing that stands out as risky or scary.

  Mostly DTS tweaks for a few issues:

   - GPU fixlets for Meson

   - CPU idle fix for LS1028A

   - PWM interrupt fixes for i.MX6UL

  Also, enable a driver (FSL_EDMA) on arm64 defconfig, and a warning and
  two MAINTAINER tweaks"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: imx6ul: fix PWM[1-4] interrupts
  ARM: omap2: remove incorrect __init annotation
  ARM: dts: gemini Fix up DNS-313 compatible string
  ARM: dts: Blank D-Link DIR-685 console
  arm64: defconfig: Enable FSL_EDMA driver
  arm64: dts: ls1028a: Fix CPU idle fail.
  MAINTAINERS: BCM53573: Add internal Broadcom mailing list
  MAINTAINERS: BCM2835: Add internal Broadcom mailing list
  ARM: dts: meson8b: fix the operating voltage of the Mali GPU
  ARM: dts: meson8b: drop undocumented property from the Mali GPU node
  ARM: dts: meson8: fix GPU interrupts and drop an undocumented property
2019-06-28 08:37:04 +08:00
Linus Torvalds 139ca25805 arch/csky fixup for 5.2
Only 1 fixup patch for rt_sigframe in signal.c
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE2KAv+isbWR/viAKHAXH1GYaIxXsFAl0TCP4SHHJlbl9ndW9A
 Yy1za3kuY29tAAoJEAFx9RmGiMV7XRYP/izEKH5Uyb5fGt62ZW+QvEQwEsXL0Az/
 YWxCFF6DjOGsO3OVh45PTTGg+2W0nGrpm8aTqfn91qA5Ak9mr/lDlbnk/9ZQ6yRz
 MI5Uzy+kTjyWA4tDHRQlohedjzFy9gNhpMjY6LWuXqUruvf1enyBezIrKUiJhdW6
 7ztTFsSZ6rVI5CSIOhXePmN3pO80YPC+6giqcqv3Yaed4oL6NLrs6yKXmZ3JMi1X
 fpa2EwR/STmPPA3mejDTezD4lriJJUAL39KBjCHT0Ej4y69PIl/wNIdFmri7cXHk
 iPcocR4SapaJNqmwnz+rvWxAoEFpPvJbZsV7ah6ZrpT6LMMoBX5y6h3XA4NXQu9t
 yLU5TCHE4osy4uUHeYMu6AUjTnAzhoUrNNP2h57I0ZNsTSYJiH3eqxAuHM6rrwot
 YMGeiXGtyF+rI0jOroF1giFDzUy7RlDbtLR3B2gV5tP4VXI9b2Gcia/GR25QBrSV
 utEcdE96Td38bPeMhvkOMhG9CHQD32s4o1rYLNndbWoqSCCFdwiC89mvSfthHP2c
 Bx3EJs1vhuIdErN+pPEvX1sAuzmfx8q41Fg93kTYfoUQyflDdNswoRV1weRgHUu3
 CmeJDpPEQH89Ddzb7HJrPCcC+diqclxNqraEcQyoX6KesLHciTB/x2arxjEJQa3c
 tTTGreAq4/+g
 =0p/x
 -----END PGP SIGNATURE-----

Merge tag 'csky-for-linus-5.2-fixup-gcc-unwind' of git://github.com/c-sky/csky-linux

Pull arch/csky fixup from Guo Ren:
 "A fixup patch for rt_sigframe in signal.c"

* tag 'csky-for-linus-5.2-fixup-gcc-unwind' of git://github.com/c-sky/csky-linux:
  csky: Fixup libgcc unwind error
2019-06-28 08:31:57 +08:00
Thomas Gleixner e44252f4fe x86/hpet: Use channel for legacy clockevent storage
All preparations are done. Use the channel storage for the legacy
clockevent and remove the static variable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.737689919@linutronix.de
2019-06-28 00:57:27 +02:00
Thomas Gleixner 49adaa60fa x86/hpet: Use common init for legacy clockevent
Replace the static initialization of the legacy clockevent with runtime
initialization utilizing the common init function as the last preparatory
step to switch the legacy clockevent over to the channel 0 storage in
hpet_base.

This comes with a twist. The static clockevent initializer has selected
support for periodic and oneshot mode unconditionally whether the HPET
config advertised periodic mode or not. Even the pre clockevents code did
this. But....

Using the conditional in hpet_init_clockevent() makes at least Qemu and one
hardware machine fail to boot.  There are two issues which cause the boot
failure:

 #1 After the timer delivery test in IOAPIC and the IOAPIC setup the next
    interrupt is not delivered despite the HPET channel being programmed
    correctly. Reprogramming the HPET after switching to IOAPIC makes it
    work again. After fixing this, the next issue surfaces:

 #2 Due to the unconditional periodic mode 'availability' the Local APIC
    timer calibration can hijack the global clockevents event handler
    without causing damage. Using oneshot at this stage makes if hang
    because the HPET does not get reprogrammed due to the handler
    hijacking. Duh, stupid me!

Both issues require major surgery and especially the kick HPET again after
enabling IOAPIC results in really nasty hackery.  This 'assume periodic
works' magic has survived since HPET support got added, so it's
questionable whether this should be fixed. Both Qemu and the failing
hardware machine support periodic mode despite the fact that both don't
advertise it in the configuration register and both need that extra kick
after switching to IOAPIC. Seems to be a feature...

Keep the 'assume periodic works' magic around and add a big fat comment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.646565913@linutronix.de
2019-06-28 00:57:27 +02:00
Thomas Gleixner ea99110dd0 x86/hpet: Carve out shareable parts of init_one_hpet_msi_clockevent()
To finally remove the static channel0/clockevent storage and to utilize the
channel 0 storage in hpet_base, it's required to run time initialize the
clockevent. The MSI clockevents already have a run time init function.

Carve out the parts which can be shared between the legacy and the MSI
implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.552451082@linutronix.de
2019-06-28 00:57:26 +02:00
Thomas Gleixner 310b5b3eb6 x86/hpet: Consolidate clockevent functions
Now that the legacy clockevent is wrapped in a hpet_channel struct most
clockevent functions can be shared between the legacy and the MSI based
clockevents.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.461437795@linutronix.de
2019-06-28 00:57:26 +02:00
Thomas Gleixner 18e84a2dff x86/hpet: Wrap legacy clockevent in hpet_channel
For HPET channel 0 there exist two clockevent structures right now:
  - the static hpet_clockevent
  - the clockevent in channel 0 storage

The goal is to use the clockevent in the channel storage, remove the static
variable and share code with the MSI implementation.

As a first step wrap the legacy clockevent into a hpet_channel struct and
convert the users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.368141247@linutronix.de
2019-06-28 00:57:25 +02:00
Thomas Gleixner 45e0a41563 x86/hpet: Use cached info instead of extra flags
Now that HPET clockevent support is integrated into the channel data, reuse
the cached boot configuration instead of copying the same information into
a flags field.

This also allows to consolidate the reservation code into one place, which
can now solely depend on the mode information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.277510163@linutronix.de
2019-06-28 00:57:25 +02:00
Thomas Gleixner 4d5e68330d x86/hpet: Move clockevents into channels
Instead of allocating yet another data structure, move the clock event data
into the channel structure. This allows further consolidation of the
reservation code and the reuse of the cached boot config to replace the
extra flags in the clockevent data.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.185851116@linutronix.de
2019-06-28 00:57:24 +02:00
Ingo Molnar d415c75431 x86/hpet: Rename variables to prepare for switching to channels
struct hpet_dev is gone with the next change as the clockevent storage
moves into struct hpet_channel. So the variable name hdev will not make
sense anymore. Ditto for timer vs. channel and similar details.

Doing the rename in the change makes the patch harder to review. Doing it
afterward is problematic vs. tracking down issues.  Doing it upfront is the
easiest solution as it does not change functionality.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.093113681@linutronix.de
2019-06-28 00:57:24 +02:00
Thomas Gleixner af5a1dadf3 x86/hpet: Add function to select a /dev/hpet channel
If CONFIG_HPET=y is enabled the x86 specific HPET code should reserve at
least one channel for the /dev/hpet character device, so that not all
channels are absorbed for per CPU clockevent devices.

Create a function to assign HPET_MODE_DEVICE so the rework of the
clockevents allocation code can utilize the mode information instead of
reducing the number of evaluated channels by #ifdef hackery.

The function is not yet used, but provided as a separate patch for ease of
review. It will be used when the rework of the clockevent selection takes
place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132436.002758910@linutronix.de
2019-06-28 00:57:23 +02:00
Thomas Gleixner 9e16e4933e x86/hpet: Add mode information to struct hpet_channel
The usage of the individual HPET channels is not tracked in a central
place. The information is scattered in different data structures. Also the
HPET reservation in the HPET character device is split out into several
places which makes the code hard to follow.

Assigning a mode to the channel allows to consolidate the reservation code
and paves the way for further simplifications.

As a first step set the mode of the legacy channels when the HPET is in
legacy mode.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.911652981@linutronix.de
2019-06-28 00:57:23 +02:00
Thomas Gleixner 2460d5878a x86/hpet: Use cached channel data
Instead of rereading the HPET registers over and over use the information
which was cached in hpet_enable().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.821728550@linutronix.de
2019-06-28 00:57:22 +02:00
Thomas Gleixner e37f0881e9 x86/hpet: Introduce struct hpet_base and struct hpet_channel
Introduce new data structures to replace the ad hoc collection of separate
variables and pointers.

Replace the boot configuration store and restore as a first step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.728456320@linutronix.de
2019-06-28 00:57:21 +02:00
Ingo Molnar 0b5c597de6 x86/hpet: Coding style cleanup
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.637420368@linutronix.de
2019-06-28 00:57:21 +02:00
Ingo Molnar dfe36b573e x86/hpet: Clean up comments
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.545653922@linutronix.de
2019-06-28 00:57:20 +02:00
Ingo Molnar 3fe50c34dc x86/hpet: Make naming consistent
Use 'evt' for clockevents pointers and capitalize HPET in comments.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.454138339@linutronix.de
2019-06-28 00:57:20 +02:00
Ingo Molnar 9bc9e1d4c1 x86/hpet: Remove not required includes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.348089155@linutronix.de
2019-06-28 00:57:20 +02:00
Thomas Gleixner 3535aa12f7 x86/hpet: Decapitalize and rename EVT_TO_HPET_DEV
It's a function not a macro and the upcoming changes use channel for the
individual hpet timer units to allow a step by step refactoring approach.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.241032433@linutronix.de
2019-06-28 00:57:19 +02:00
Thomas Gleixner 44b5be5733 x86/hpet: Simplify counter validation
There is no point to loop for 200k TSC cycles to check afterwards whether
the HPET counter is working. Read the counter inside of the loop and break
out when the counter value changed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.149535103@linutronix.de
2019-06-28 00:57:19 +02:00
Thomas Gleixner 3222daf970 x86/hpet: Separate counter check out of clocksource register code
The init code checks whether the HPET counter works late in the init
function when the clocksource is registered. That should happen right with
the other sanity checks.

Split it into a separate validation function and move it to the other
sanity checks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132435.058540608@linutronix.de
2019-06-28 00:57:18 +02:00
Thomas Gleixner 6bdec41a0c x86/hpet: Shuffle code around for readability sake
It doesn't make sense to have init functions in the middle of other
code. Aside of that, further changes in that area create horrible diffs if
the code stays where it is.

No functional change

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.951733064@linutronix.de
2019-06-28 00:57:18 +02:00
Thomas Gleixner 8c273f2c81 x86/hpet: Move static and global variables to one place
Having static and global variables sprinkled all over the code is just
annoying to read. Move them all to the top of the file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.860549134@linutronix.de
2019-06-28 00:57:17 +02:00
Thomas Gleixner 4ce78e2094 x86/hpet: Sanitize stub functions
Mark them inline and remove the pointless 'return;' statement.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.754768274@linutronix.de
2019-06-28 00:57:17 +02:00
Thomas Gleixner 433526cc05 x86/hpet: Mark init functions __init
They are only called from init code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.645357869@linutronix.de
2019-06-28 00:57:17 +02:00
Thomas Gleixner eb8ec32c45 x86/hpet: Remove the unused hpet_msi_read() function
No users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.553729327@linutronix.de
2019-06-28 00:57:16 +02:00
Thomas Gleixner 853acaf064 x86/hpet: Remove unused parameter from hpet_next_event()
The clockevent device pointer is not used in this function.

While at it, rename the misnamed 'timer' parameter to 'channel', which makes it
clear what this parameter means.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.447880978@linutronix.de
2019-06-28 00:57:16 +02:00
Thomas Gleixner 7c4b0e0898 x86/hpet: Remove pointless x86-64 specific #include
Nothing requires asm/pgtable.h here anymore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.339011567@linutronix.de
2019-06-28 00:57:16 +02:00
Thomas Gleixner 9b0b28de83 x86/hpet: Restructure init code
As a preparatory change for further consolidation, restructure the HPET
init code so it becomes more readable. Fix up misleading and stale comments
and rename variables so they actually make sense.

No intended functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.247842972@linutronix.de
2019-06-28 00:57:15 +02:00
Thomas Gleixner 46e5b64fde x86/hpet: Replace printk(KERN...) with pr_...()
And sanitize the format strings while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.140411339@linutronix.de
2019-06-28 00:57:15 +02:00
Thomas Gleixner 36b9017f02 x86/hpet: Simplify CPU online code
The indirection via work scheduled on the upcoming CPU was necessary with the
old hotplug code because the online callback was invoked on the control CPU
not on the upcoming CPU. The rework of the CPU hotplug core guarantees that
the online callbacks are invoked on the upcoming CPU.

Remove the now pointless work redirection.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/20190623132434.047254075@linutronix.de
2019-06-28 00:57:15 +02:00
Josh Poimboeuf ae6a45a086 x86/unwind/orc: Fall back to using frame pointers for generated code
The ORC unwinder can't unwind through BPF JIT generated code because
there are no ORC entries associated with the code.

If an ORC entry isn't available, try to fall back to frame pointers.  If
BPF and other generated code always do frame pointer setup (even with
CONFIG_FRAME_POINTERS=n) then this will allow ORC to unwind through most
generated code despite there being no corresponding ORC entries.

Fixes: d15d356887 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kairui Song <kasong@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/b6f69208ddff4343d56b7bfac1fc7cfcd62689e8.1561595111.git.jpoimboe@redhat.com
2019-06-28 00:11:21 +02:00
Song Liu 83f44ae0f8 perf/x86: Always store regs->ip in perf_callchain_kernel()
The stacktrace_map_raw_tp BPF selftest is failing because the RIP saved by
perf_arch_fetch_caller_regs() isn't getting saved by perf_callchain_kernel().

This was broken by the following commit:

  d15d356887 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")

With that change, when starting with non-HW regs, the unwinder starts
with the current stack frame and unwinds until it passes up the frame
which called perf_arch_fetch_caller_regs().  So regs->ip needs to be
saved deliberately.

Fixes: d15d356887 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kairui Song <kasong@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/3975a298fa52b506fea32666d8ff6a13467eee6d.1561595111.git.jpoimboe@redhat.com
2019-06-28 00:11:20 +02:00
Andy Lutomirski 441cedab2d x86/vsyscall: Add __ro_after_init to global variables
The vDSO is only configurable by command-line options, so make its
global variables __ro_after_init.  This seems highly unlikely to
ever stop an exploit, but it's nicer anyway.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/a386925835e49d319e70c4d7404b1f6c3c2e3702.1561610354.git.luto@kernel.org
2019-06-28 00:04:40 +02:00
Andy Lutomirski 625b7b7f79 x86/vsyscall: Change the default vsyscall mode to xonly
The use case for full emulation over xonly is very esoteric, e.g. magic
instrumentation tools.

Change the default to the safer xonly mode.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/30539f8072d2376b9c9efcc07e6ed0d6bf20e882.1561610354.git.luto@kernel.org
2019-06-28 00:04:39 +02:00
Andy Lutomirski e0a446ce39 x86/vsyscall: Document odd SIGSEGV error code for vsyscalls
Even if vsyscall=none, user page faults on the vsyscall page are reported
as though the PROT bit in the error code was set.  Add a comment explaining
why this is probably okay and display the value in the test case.

While at it, explain why the behavior is correct with respect to PKRU.

Modify also the selftest to print the odd error code so that there is a
way to demonstrate the odd behaviour.

If anyone really cares about more accurate emulation, the behaviour could
be changed. But that needs a real good justification.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/75c91855fd850649ace162eec5495a1354221aaa.1561610354.git.luto@kernel.org
2019-06-28 00:04:39 +02:00
Andy Lutomirski 918ce32509 x86/vsyscall: Show something useful on a read fault
Just segfaulting the application when it tries to read the vsyscall page in
xonly mode is not helpful for those who need to debug it.

Emit a hint.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jann Horn <jannh@google.com>
Link: https://lkml.kernel.org/r/8016afffe0eab497be32017ad7f6f7030dc3ba66.1561610354.git.luto@kernel.org
2019-06-28 00:04:39 +02:00
Andy Lutomirski bd49e16e33 x86/vsyscall: Add a new vsyscall=xonly mode
With vsyscall emulation on, a readable vsyscall page is still exposed that
contains syscall instructions that validly implement the vsyscalls.

This is required because certain dynamic binary instrumentation tools
attempt to read the call targets of call instructions in the instrumented
code.  If the instrumented code uses vsyscalls, then the vsyscall page needs
to contain readable code.

Unfortunately, leaving readable memory at a deterministic address can be
used to help various ASLR bypasses, so some hardening value can be gained
by disallowing vsyscall reads.

Given how rarely the vsyscall page needs to be readable, add a mechanism to
make the vsyscall page be execute only.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/d17655777c21bc09a7af1bbcf74e6f2b69a51152.1561610354.git.luto@kernel.org
2019-06-28 00:04:38 +02:00
Dianzhang Chen 993773d11d x86/tls: Fix possible spectre-v1 in do_get_thread_area()
The index to access the threads tls array is controlled by userspace
via syscall: sys_ptrace(), hence leading to a potential exploitation
of the Spectre variant 1 vulnerability.

The index can be controlled from:
        ptrace -> arch_ptrace -> do_get_thread_area.

Fix this by sanitizing the user supplied index before using it to access
the p->thread.tls_array.

Signed-off-by: Dianzhang Chen <dianzhangchen0@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1561524630-3642-1-git-send-email-dianzhangchen0@gmail.com
2019-06-27 23:48:04 +02:00
Dianzhang Chen 31a2fbb390 x86/ptrace: Fix possible spectre-v1 in ptrace_get_debugreg()
The index to access the threads ptrace_bps is controlled by userspace via
syscall: sys_ptrace(), hence leading to a potential exploitation of the
Spectre variant 1 vulnerability.

The index can be controlled from:
    ptrace -> arch_ptrace -> ptrace_get_debugreg.

Fix this by sanitizing the user supplied index before using it access
thread->ptrace_bps.

Signed-off-by: Dianzhang Chen <dianzhangchen0@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1561476617-3759-1-git-send-email-dianzhangchen0@gmail.com
2019-06-27 23:48:04 +02:00
Jeremy Linton d24a0c7099 arm_pmu: acpi: spe: Add initial MADT/SPE probing
ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.

Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27 16:53:42 +01:00
Joshua Scott 8003136174 ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial node
Switch to the "marvell,armada-38x-uart" driver variant to empty
the UART buffer before writing to the UART_LCR register.

Signed-off-by: Joshua Scott <joshua.scott@alliedtelesis.co.nz>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>.
Cc: stable@vger.kernel.org
Fixes: 43e28ba877 ("ARM: dts: Use armada-370-xp as a base for armada-xp-98dx3236")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2019-06-27 17:34:38 +02:00
Zhenzhong Duan d97ee99bf2 x86/jailhouse: Mark jailhouse_x2apic_available() as __init
.. as it is only called at early bootup stage.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: jailhouse-dev@googlegroups.com
Link: https://lkml.kernel.org/r/1561539289-29180-1-git-send-email-zhenzhong.duan@oracle.com
2019-06-27 16:59:19 +02:00
Sudeep Holla b07d7d5c7b x86/entry: Simplify _TIF_SYSCALL_EMU handling
The usage of emulated and _TIF_SYSCALL_EMU flags in syscall_trace_enter
is more complicated than required.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-27 10:14:06 +01:00
Xiaoyao Li 2238246ff8 x86/boot: Make the GDT 8-byte aligned
The segment descriptors are loaded with an implicitly LOCK-ed instruction,
which could trigger the split lock #AC exception if the variable is not
properly aligned and crosses a cache line.

Align the GDT properly so the descriptors are all 8 byte aligned.

Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lkml.kernel.org/r/20190627045525.105266-1-xiaoyao.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-27 10:56:11 +02:00
ShihPo Hung 0db7f5cd4a riscv: mm: Fix code comment
Fix the comment since vmalloc_fault doesn't reach
flush_tlb_fix_spurious_fault.

Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-riscv@lists.infradead.org
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-06-26 15:10:30 -07:00
Yash Shah 45b03df286 riscv: dts: Re-organize the DT nodes
As per the convention for any SOC device with external connection,
define only device DT node in SOC DTSi file with status = "disabled"
and enable device in Board DTS file with status = "okay"

Reported-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-06-26 10:28:33 -07:00
Atish Patra ff8391e1b7 RISC-V: defconfig: enable MMC & SPI for RISC-V
Currently, riscv upstream defconfig doesn't let you boot
through userspace if rootfs is on the SD card.

Let's enable MMC & SPI drivers as well so that one can boot
to the user space using default config in upstream kernel.

While here, enable automatic mounting of devtmpfs to simplify
kernel testing with minimal root filesystems. (pjw)

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
[paul.walmsley@sifive.com: mention the DEVTMPFS_MOUNT change in the
 patch description]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-06-26 10:27:49 -07:00
jinho lim 7b71665603 arm64: rename dump_instr as dump_kernel_instr
In traps.c, only __die calls dump_instr.
However, this function has sub-function as __dump_instr.

dump_kernel_instr can replace those functions.
By using aarch64_insn_read, it does not have to change fs to KERNEL_DS.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: jinho lim <jordan.lim@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-26 17:59:15 +01:00
Alejandro Jimenez c1f7fec1eb x86/speculation: Allow guests to use SSBD even if host does not
The bits set in x86_spec_ctrl_mask are used to calculate the guest's value
of SPEC_CTRL that is written to the MSR before VMENTRY, and control which
mitigations the guest can enable.  In the case of SSBD, unless the host has
enabled SSBD always on mode (by passing "spec_store_bypass_disable=on" in
the kernel parameters), the SSBD bit is not set in the mask and the guest
can not properly enable the SSBD always on mitigation mode.

This has been confirmed by running the SSBD PoC on a guest using the SSBD
always on mitigation mode (booted with kernel parameter
"spec_store_bypass_disable=on"), and verifying that the guest is vulnerable
unless the host is also using SSBD always on mode. In addition, the guest
OS incorrectly reports the SSB vulnerability as mitigated.

Always set the SSBD bit in x86_spec_ctrl_mask when the host CPU supports
it, allowing the guest to use SSBD whether or not the host has chosen to
enable the mitigation in any of its modes.

Fixes: be6fcb5478 ("x86/bugs: Rework spec_ctrl base and mask logic")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: bp@alien8.de
Cc: rkrcmar@redhat.com
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1560187210-11054-1-git-send-email-alejandro.j.jimenez@oracle.com
2019-06-26 16:38:36 +02:00
Tiezhu Yang 53b7607382 x86/kexec: Make variable static and config dependent
The following sparse warning is emitted:

  arch/x86/kernel/crash.c:59:15:
  warning: symbol 'crash_zero_bytes' was not declared. Should it be static?

The variable is only used in this compilation unit, but it is also only
used when CONFIG_KEXEC_FILE is enabled. Just making it static would result
in a 'defined but not used' warning for CONFIG_KEXEC_FILE=n.

Make it static and move it into the existing CONFIG_KEXEC_FILE section.

[ tglx: Massaged changelog and moved it into the existing ifdef ]

Fixes: dd5f726076 ("kexec: support for kexec on panic using new system call")
Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: kexec@lists.infradead.org
Cc: vgoyal@redhat.com
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: https://lkml.kernel.org/r/117ef0c6.3d30.16b87c9cfbf.Coremail.kernelpatch@126.com
2019-06-26 16:02:45 +02:00
Zhenzhong Duan ab3765a050 x86/speculation/mds: Eliminate leaks by trace_hardirqs_on()
Move mds_idle_clear_cpu_buffers() after trace_hardirqs_on() to ensure
all store buffer entries are flushed.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: jgross@suse.com
Cc: ndesaulniers@google.com
Cc: gregkh@linuxfoundation.org
Link: https://lkml.kernel.org/r/1561260904-29669-2-git-send-email-zhenzhong.duan@oracle.com
2019-06-26 15:01:50 +02:00
Linus Walleij 670b004417 x86/platform/geode: Drop <linux/gpio.h> includes
These board files only use gpio_keys not gpio in general.  This include is
just surplus, delete it.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-gpio@vger.kernel.org
Cc: Andres Salomon <dilinger@queued.net>
Cc: linux-geode@lists.infradead.org
Cc: Andy Shevchenko <andy@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Cc: platform-driver-x86@vger.kernel.org
Link: https://lkml.kernel.org/r/20190626092119.3172-1-linus.walleij@linaro.org
2019-06-26 15:00:12 +02:00
Vincenzo Frascino 3acf4be235 arm64: vdso: Fix compilation with clang older than 8
clang versions older than 8 do not support -mcmodel=tiny.

Add a check to the vDSO Makefile for arm64 to remove the flag when
these versions of the compiler are detected.

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Qian Cai <cai@lca.pw>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: catalin.marinas@arm.com
Cc: will.deacon@arm.com
Cc: arnd@arndb.de
Cc: linux@armlinux.org.uk
Cc: ralf@linux-mips.org
Cc: paul.burton@mips.com
Cc: daniel.lezcano@linaro.org
Cc: salyzyn@android.com
Cc: pcc@google.com
Cc: shuah@kernel.org
Cc: 0x7f454c46@gmail.com
Cc: linux@rasmusvillemoes.dk
Cc: huw@codeweavers.com
Cc: sthotton@marvell.com
Cc: andre.przywara@arm.com
Cc: luto@kernel.org
Link: https://lkml.kernel.org/r/20190626113632.9295-1-vincenzo.frascino@arm.com
2019-06-26 14:26:55 +02:00
Vincenzo Frascino 6241c4dc6e arm64: compat: Fix __arch_get_hw_counter() implementation
Provide the following fixes for the __arch_get_hw_counter()
implementation on arm64:
- Fallback on syscall when an unstable counter is detected.
- Introduce isb()s before and after the counter read to avoid
speculation of the counter value and of the seq lock
respectively.
The second isb() is a temporary solution that will be revisited
in 5.3-rc1.

These fixes restore the semantics that __arch_counter_get_cntvct()
had on arm64.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: catalin.marinas@arm.com
Cc: will.deacon@arm.com
Cc: arnd@arndb.de
Cc: linux@armlinux.org.uk
Cc: ralf@linux-mips.org
Cc: paul.burton@mips.com
Cc: daniel.lezcano@linaro.org
Cc: salyzyn@android.com
Cc: pcc@google.com
Cc: shuah@kernel.org
Cc: 0x7f454c46@gmail.com
Cc: linux@rasmusvillemoes.dk
Cc: huw@codeweavers.com
Cc: sthotton@marvell.com
Cc: andre.przywara@arm.com
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/20190625161804.38713-3-vincenzo.frascino@arm.com
2019-06-26 14:26:54 +02:00
Vincenzo Frascino 27e11a9fe2 arm64: Fix __arch_get_hw_counter() implementation
Provide the following fixes for the __arch_get_hw_counter()
implementation on arm64:
 - Fallback on syscall when an unstable counter is detected.
 - Introduce isb()s before and after the counter read to avoid
   speculation of the counter value and of the seq lock
   respectively.
   The second isb() is a temporary solution that will be revisited
   in 5.3-rc1.

These fixes restore the semantics that __arch_counter_get_cntvct()
had on arm64.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: catalin.marinas@arm.com
Cc: will.deacon@arm.com
Cc: arnd@arndb.de
Cc: linux@armlinux.org.uk
Cc: ralf@linux-mips.org
Cc: paul.burton@mips.com
Cc: daniel.lezcano@linaro.org
Cc: salyzyn@android.com
Cc: pcc@google.com
Cc: shuah@kernel.org
Cc: 0x7f454c46@gmail.com
Cc: linux@rasmusvillemoes.dk
Cc: huw@codeweavers.com
Cc: sthotton@marvell.com
Cc: andre.przywara@arm.com
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/20190625161804.38713-2-vincenzo.frascino@arm.com
2019-06-26 14:26:54 +02:00
Thomas Gleixner 9d90b93bf3 lib/vdso: Make delta calculation work correctly
The x86 vdso implementation on which the generic vdso library is based on
has subtle (unfortunately undocumented) twists:

 1) The code assumes that the clocksource mask is U64_MAX which means that
    no bits are masked. Which is true for any valid x86 VDSO clocksource.
    Stupidly it still did the mask operation for no reason and at the wrong
    place right after reading the clocksource.

 2) It contains a sanity check to catch the case where slightly
    unsynchronized TSC values can be observed which would cause the delta
    calculation to make a huge jump. It therefore checks whether the
    current TSC value is larger than the value on which the current
    conversion is based on. If it's not larger the base value is used to
    prevent time jumps.

#1 Is not only stupid for the X86 case because it does the masking for no
reason it is also completely wrong for clocksources with a smaller mask
which can legitimately wrap around during a conversion period. The core
timekeeping code does it correct by applying the mask after the delta
calculation:

	(now - base) & mask

#2 is equally broken for clocksources which have smaller masks and can wrap
around during a conversion period because there the now > base check is
just wrong and causes stale time stamps and time going backwards issues.

Unbreak it by:

  1) Removing the mask operation from the clocksource read which makes the
     fallback detection work for all clocksources

  2) Replacing the conditional delta calculation with a overrideable inline
     function.

#2 could reuse clocksource_delta() from the timekeeping code but that
results in a significant performance hit for the x86 VSDO. The timekeeping
core code must have the non optimized version as it has to operate
correctly with clocksources which have smaller masks as well to handle the
case where TSC is discarded as timekeeper clocksource and replaced by HPET
or pmtimer. For the VDSO there is no replacement clocksource. If TSC is
unusable the syscall is enforced which does the right thing.

To accommodate to the needs of various architectures provide an
override-able inline function which defaults to the regular delta
calculation with masking:

	(now - base) & mask

Override it for x86 with the non-masking and checking version.

This unbreaks the ARM64 syscall fallback operation, allows to use
clocksources with arbitrary width and preserves the performance
optimization for x86.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: catalin.marinas@arm.com
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux@armlinux.org.uk
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: paul.burton@mips.com
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: salyzyn@android.com
Cc: pcc@google.com
Cc: shuah@kernel.org
Cc: 0x7f454c46@gmail.com
Cc: linux@rasmusvillemoes.dk
Cc: huw@codeweavers.com
Cc: sthotton@marvell.com
Cc: andre.przywara@arm.com
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906261159230.32342@nanos.tec.linutronix.de
2019-06-26 14:26:53 +02:00
Nathan Chancellor aa69fb62be arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly
After r363059 and r363928 in LLVM, a build using ld.lld as the linker
with CONFIG_RANDOMIZE_BASE enabled fails like so:

ld.lld: error: relocation R_AARCH64_ABS32 cannot be used against symbol
__efistub_stext_offset; recompile with -fPIC

Fangrui and Peter figured out that ld.lld is incorrectly considering
__efistub_stext_offset as a relative symbol because of the order in
which symbols are evaluated. _text is treated as an absolute symbol
and stext is a relative symbol, making __efistub_stext_offset a
relative symbol.

Adding ABSOLUTE will force ld.lld to evalute this expression in the
right context and does not change ld.bfd's behavior. ld.lld will
need to be fixed but the developers do not see a quick or simple fix
without some research (see the linked issue for further explanation).
Add this simple workaround so that ld.lld can continue to link kernels.

Link: https://github.com/ClangBuiltLinux/linux/issues/561
Link: 025a815d75
Link: 249fde8583
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Debugged-by: Fangrui Song <maskray@google.com>
Debugged-by: Peter Smith <peter.smith@linaro.org>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
[will: add comment]
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-26 11:40:20 +01:00
Ard Biesheuvel 6f496a555d arm64: kaslr: keep modules inside module region when KASAN is enabled
When KASLR and KASAN are both enabled, we keep the modules where they
are, and randomize the placement of the kernel so it is within 2 GB
of the module region. The reason for this is that putting modules in
the vmalloc region (like we normally do when KASLR is enabled) is not
possible in this case, given that the entire vmalloc region is already
backed by KASAN zero shadow pages, and so allocating dedicated KASAN
shadow space as required by loaded modules is not possible.

The default module allocation window is set to [_etext - 128MB, _etext]
in kaslr.c, which is appropriate for KASLR kernels booted without a
seed or with 'nokaslr' on the command line. However, as it turns out,
it is not quite correct for the KASAN case, since it still intersects
the vmalloc region at the top, where attempts to allocate shadow pages
will collide with the KASAN zero shadow pages, causing a WARN() and all
kinds of other trouble. So cap the top end to MODULES_END explicitly
when running with KASAN.

Cc: <stable@vger.kernel.org> # 4.9+
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-26 11:34:10 +01:00
Anshuman Khandual d9db691d3c arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
This was added part of the original commit which added MMU definitions.

commit 4f04d8f005 ("arm64: MMU definitions").

These symbols never got used as confirmed from a git log search.

git log -p arch/arm64/ | grep PTE_TYPE_FAULT
git log -p arch/arm64/ | grep PMD_TYPE_FAULT

These probably meant to identify non present entries which can now be
achieved with PMD_SECT_VALID or PTE_VALID bits. Hence just drop these
unused symbols which are not required anymore.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-26 11:28:10 +01:00
Guo Ren 19e5e2ae9c csky: Fixup libgcc unwind error
The struct rt_sigframe is also defined in libgcc/config/csky/linux-unwind.h
of gcc. Although there is no use for the first three word space, we must
keep them the same with linux-unwind.h for member position.

The BUG is found in glibc test with the tst-cancel02.
The BUG is from commit:bf2416829362 of linux-5.2-rc1 merge window.

Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Signed-off-by: Mao Han <han_mao@c-sky.com>
Cc: Arnd Bergmann <arnd@arndb.de>
2019-06-26 13:45:48 +08:00
Catalin Marinas 6a5b78b32d arm64: compat: No need for pre-ARMv7 barriers on an ARMv8 system
Remove the deprecated (pre-ARMv7) compat barriers as they would not be used
on an ARMv8 system.

Fixes: a7f71a2c89 ("arm64: compat: Add vDSO")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Huw Davies <huw@codeweavers.com>
Cc: Shijith Thotton <sthotton@marvell.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Link: https://lkml.kernel.org/r/20190624140018.GD29120@arrakis.emea.arm.com
2019-06-26 07:28:10 +02:00
Catalin Marinas 94fee4d437 arm64: vdso: Remove unnecessary asm-offsets.c definitions
Since the VDSO code has moved to C from assembly, there is no need to
define and maintain the corresponding asm offsets.

Fixes: 28b1a824a4 ("arm64: vdso: Substitute gettimeofday() with C implementation")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Huw Davies <huw@codeweavers.com>
Cc: Shijith Thotton <sthotton@marvell.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Link: https://lkml.kernel.org/r/20190624135812.GC29120@arrakis.emea.arm.com
2019-06-26 07:28:10 +02:00
Kirill A. Shutemov 432c833218 x86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()
Kyle has reported occasional crashes when booting a kernel in 5-level
paging mode with KASLR enabled:

  WARNING: CPU: 0 PID: 0 at arch/x86/mm/init_64.c:87 phys_p4d_init+0x1d4/0x1ea
  RIP: 0010:phys_p4d_init+0x1d4/0x1ea
  Call Trace:
   __kernel_physical_mapping_init+0x10a/0x35c
   kernel_physical_mapping_init+0xe/0x10
   init_memory_mapping+0x1aa/0x3b0
   init_range_memory_mapping+0xc8/0x116
   init_mem_mapping+0x225/0x2eb
   setup_arch+0x6ff/0xcf5
   start_kernel+0x64/0x53b
   ? copy_bootdata+0x1f/0xce
   x86_64_start_reservations+0x24/0x26
   x86_64_start_kernel+0x8a/0x8d
   secondary_startup_64+0xb6/0xc0

which causes later:

  BUG: unable to handle page fault for address: ff484d019580eff8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  BAD
  Oops: 0000 [#1] SMP NOPTI
  RIP: 0010:fill_pud+0x13/0x130
  Call Trace:
   set_pte_vaddr_p4d+0x2e/0x50
   set_pte_vaddr+0x6f/0xb0
   __native_set_fixmap+0x28/0x40
   native_set_fixmap+0x39/0x70
   register_lapic_address+0x49/0xb6
   early_acpi_boot_init+0xa5/0xde
   setup_arch+0x944/0xcf5
   start_kernel+0x64/0x53b

Kyle bisected the issue to commit b569c18434 ("x86/mm/KASLR: Reduce
randomization granularity for 5-level paging to 1GB")

Before this commit PAGE_OFFSET was always aligned to P4D_SIZE when booting
5-level paging mode. But now only PUD_SIZE alignment is guaranteed.

In the case I was able to reproduce the following vaddr/paddr values were
observed in phys_p4d_init():

Iteration     vaddr			paddr
   1 	      0xff4228027fe00000 	0x033fe00000
   2	      0xff42287f40000000	0x8000000000

'vaddr' in both cases belongs to the same p4d entry.

But due to the original assumption that PAGE_OFFSET is aligned to P4D_SIZE
this overlap cannot be handled correctly. The code assumes strictly aligned
entries and unconditionally increments the index into the P4D table, which
creates false duplicate entries. Once the index reaches the end, the last
entry in the page table is missing.

Aside of that the 'paddr >= paddr_end' condition can evaluate wrong which
causes an P4D entry to be cleared incorrectly.

Change the loop in phys_p4d_init() to walk purely based on virtual
addresses like __kernel_physical_mapping_init() does. This makes it work
correctly with unaligned virtual addresses.

Fixes: b569c18434 ("x86/mm/KASLR: Reduce randomization granularity for 5-level paging to 1GB")
Reported-by: Kyle Pelton <kyle.d.pelton@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Pelton <kyle.d.pelton@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190624123150.920-1-kirill.shutemov@linux.intel.com
2019-06-26 07:25:09 +02:00
Kirill A. Shutemov c1887159eb x86/boot/64: Add missing fixup_pointer() for next_early_pgt access
__startup_64() uses fixup_pointer() to access global variables in a
position-independent fashion. Access to next_early_pgt was wrapped into the
helper, but one instance in the 5-level paging branch was missed.

GCC generates a R_X86_64_PC32 PC-relative relocation for the access which
doesn't trigger the issue, but Clang emmits a R_X86_64_32S which leads to
an invalid memory access and system reboot.

Fixes: 187e91fe5e ("x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Link: https://lkml.kernel.org/r/20190620112422.29264-1-kirill.shutemov@linux.intel.com
2019-06-26 07:25:09 +02:00
Kirill A. Shutemov 81c7ed296d x86/boot/64: Fix crash if kernel image crosses page table boundary
A kernel which boots in 5-level paging mode crashes in a small percentage
of cases if KASLR is enabled.

This issue was tracked down to the case when the kernel image unpacks in a
way that it crosses an 1G boundary. The crash is caused by an overrun of
the PMD page table in __startup_64() and corruption of P4D page table
allocated next to it. This particular issue is not visible with 4-level
paging as P4D page tables are not used.

But the P4D and the PUD calculation have similar problems.

The PMD index calculation is wrong due to operator precedence, which fails
to confine the PMDs in the PMD array on wrap around.

The P4D calculation for 5-level paging and the PUD calculation calculate
the first index correctly, but then blindly increment it which causes the
same issue when a kernel image is located across a 512G and for 5-level
paging across a 46T boundary.

This wrap around mishandling was introduced when these parts moved from
assembly to C.

Restore it to the correct behaviour.

Fixes: c88d71508e ("x86/boot/64: Rewrite startup_64() in C")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190620112345.28833-1-kirill.shutemov@linux.intel.com
2019-06-26 07:25:09 +02:00
Andrew Murray 5a35441256 clocksource/drivers/arm_arch_timer: Extract elf_hwcap use to arch-helper
Different mechanisms are used to test and set elf_hwcaps between ARM
and ARM64, this results in the use of ifdeferry in this file when
setting/testing for the EVTSTRM hwcap.

Let's improve readability by extracting this to an arch helper.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2019-06-25 19:49:18 +02:00
Aaro Koskinen d914d4d497 arm64: Implement panic_smp_self_stop()
Currently arm64 uses the default implementation of panic_smp_self_stop()
where the CPU runs in a cpu_relax() loop unable to receive IPIs anymore.
As a result, when two CPUs panic() simultaneously we get "SMP: failed to
stop secondary CPUs" warnings and extra delays before a reset, because
smp_send_stop() still tries to stop the other paniced CPU.

Provide an implementation of panic_smp_self_stop() that is identical to
the IPI CPU stop handler, so that the online status of stopped CPUs gets
properly updated.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25 16:42:19 +01:00
Jayachandran C dccc9da22d arm64: Improve parking of stopped CPUs
The current code puts the stopped cpus in an 'yield' instruction loop.
Using a busy loop here is unnecessary, we can use the cpu_park_loop()
function here to do a wfi/wfe.

Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25 16:42:09 +01:00
Mark Brown ca9503fc9e arm64: Expose FRINT capabilities to userspace
ARMv8.5 introduces the FRINT series of instructions for rounding floating
point numbers to integers. Provide a capability to userspace in order to
allow applications to determine if the system supports these instructions.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25 14:24:00 +01:00
Mark Brown 1201937491 arm64: Expose ARMv8.5 CondM capability to userspace
ARMv8.5 adds new instructions XAFLAG and AXFLAG to translate the
representation of the results of floating point comparisons between the
native ARM format and an alternative format used by some software. Add
a hwcap allowing userspace to determine if they are present, since we
referred to earlier CondM extensions as FLAGM call these extensions
FLAGM2.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25 14:21:41 +01:00
Denis Efremov b3d5f311d3 parisc: asm: psw.h: missing header guard
The psw.h header file contains #ifndef directive of the guard,
but the complimentary #define directive is missing. The patch
adds the appropriate #define to fix the header guard.

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-25 14:52:26 +02:00
Olof Johansson e73f65930f i.MX fixes for 5.2, round 3:
- A recent testing by Sébastien discovers that the PWM interrupts of
    i.MX6UL were wrongly coded in device tree.  It's a fix for it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJdEXKFAAoJEFBXWFqHsHzOnh4H/RVNWE/NwhLlcqXHnRq4TPpi
 CdKQ9ZdXYWfi7SkFQJK8bYefklxfY7JUg4Pa/igTbkyeuwpHQugkYRK7is/jsEcd
 TZSCllyvbn4QIOCZJiRsWd3/soEryjaQmZcHTVRLTJXZKhTPB89MuTX7j1BmGd22
 SiSCTRtgVRWOY0ORF9oMtjNq1BiHEtf8M7GwvkifhlZCUh3pyqI7sqg+Zer/CEmM
 TCOpDc42x56igw36/H/TMKyRQestYNlp8AKxg1cagwNrrLNE4D3Icm5UHdd44KuT
 cS5O68BQxUldhfgdejfIZUvVguVlYQkuEZbCW/l18GdiK9IECW/xZRliBqO8RSw=
 =Gn7a
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 5.2, round 3:
 - A recent testing by Sébastien discovers that the PWM interrupts of
   i.MX6UL were wrongly coded in device tree.  It's a fix for it.

* tag 'imx-fixes-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6ul: fix PWM[1-4] interrupts

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-25 04:20:08 -07:00
Olof Johansson 4232db2e2a ARM: dts: Amlogic fixes for v5.2-rc
- fix GPU interrupts and operating voltage
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEe4dGDhaSf6n1v/EMWTcYmtP7xmUFAl0LBHEACgkQWTcYmtP7
 xmWjuw//aSfw9ezjFFhwVC2yEBb/lY85VSp5kA6R8p7TsieCFOU+qhY6zfhebu4x
 DRJ5vtrQ8s+FaRwTCoOfupSbRXwN4WyN6SJVhYEwR/Jj4ExMxtWvKc7NoSIi1xGl
 elvPEZ4wzoQ4eQIGKLwEa07vfSYKD03si9IWcp3TxEsmGMzzfhDyv2nT8ovS/t9B
 D0y2Y/s6A40q6Pr4eATzyLAVCVDScOVCtkYmX5OzPhgOD+5pBTQSt5TPzH03prqU
 dZ3cjOUmgvEt1WoZFajVJzngnN4qRTzVZnr9Aue4SS4qb0ytUrK9FtmnoU2LINbE
 cX+rciZuxeRIZiNE/2lxBz7S1R9HwWVzRdzLwaCemm9T0SlJ4CrU1sA+Mdk3eAVM
 Z6d3LYKm9Lbz99Jcy9sTbdjtrWJqYlucd9eJhjwfoQyk2gNuGQJvSWAHgLcw9N1O
 WWcf9Bocj+iQBtn1/uPVCYHgqZaAFndf8iz955Mj+edONeycUglclBmIXHpu4TNJ
 rFX3DalUsiGpHCto+9Hm9qGGXFnFE2FAimrtjeDKI7NcHQnGS1qgsWj/OTgGJVfD
 /iq7/EF27hpm1fX1bzLaWA3LIl5oW4ycksI8qHFfxONx6jXePVkI9gTG96HNbf9n
 XQ5xWcb1XaiG9Ox29sbqSkjrbLF4tGuoMtnkvZrFvLFAYWPEyIc=
 =SdFR
 -----END PGP SIGNATURE-----

Merge tag 'amlogic-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into arm/fixes

ARM: dts: Amlogic fixes for v5.2-rc
- fix GPU interrupts and operating voltage

* tag 'amlogic-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
  ARM: dts: meson8b: fix the operating voltage of the Mali GPU
  ARM: dts: meson8b: drop undocumented property from the Mali GPU node
  ARM: dts: meson8: fix GPU interrupts and drop an undocumented property

Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-25 04:19:26 -07:00
Nicholas Piggin e13e7cd4c0 powerpc/64s/exception: Fix machine check early corrupting AMR
The early machine check runs in real mode, so locking is unnecessary.
Worse, the windup does not restore AMR, so this can result in a false
KUAP fault after a recoverable machine check hits inside a user copy
operation.

Fix this similarly to HMI by just avoiding the kuap lock in the
early machine check handler (it will be set by the late handler that
runs in virtual mode if that runs). If the virtual mode handler is
reached, it will lock and restore the AMR.

Fixes: 890274c2dc ("powerpc/64s: Implement KUAP for Radix MMU")
Cc: Russell Currey <ruscur@russell.cc>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-25 21:04:27 +10:00
Nick Desaulniers 8049672bb1 arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
For testing coverage and improved defense in depth, enable KASLR by
default.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25 09:33:06 +01:00
Catalin Marinas faaa73bcec arm64: ARM64_MODULES_PLTS must depend on MODULES
Otherwise, selecting it without MODULES leads to build failures.

Fixes: 58557e486f ("arm64: Allow user selection of ARM64_MODULE_PLTS")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25 09:32:11 +01:00
Peter Zijlstra 7457c0da02 x86/alternatives: Add int3_emulate_call() selftest
Given that the entry_*.S changes for this functionality are somewhat
tricky, make sure the paths are tested every boot, instead of on the
rare occasion when we trip an INT3 while rewriting text.

Requested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:50 +02:00
Peter Zijlstra faeedb0679 x86/stackframe/32: Allow int3_emulate_push()
Now that x86_32 has an unconditional gap on the kernel stack frame,
the int3_emulate_push() thing will work without further changes.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:49 +02:00
Peter Zijlstra 3c88c692c2 x86/stackframe/32: Provide consistent pt_regs
Currently pt_regs on x86_32 has an oddity in that kernel regs
(!user_mode(regs)) are short two entries (esp/ss). This means that any
code trying to use them (typically: regs->sp) needs to jump through
some unfortunate hoops.

Change the entry code to fix this up and create a full pt_regs frame.

This then simplifies various trampolines in ftrace and kprobes, the
stack unwinder, ptrace, kdump and kgdb.

Much thanks to Josh for help with the cleanups!

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:47 +02:00
Peter Zijlstra ea1ed38dba x86/stackframe, x86/ftrace: Add pt_regs frame annotations
When CONFIG_FRAME_POINTER, we should mark pt_regs frames.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:47 +02:00
Peter Zijlstra 4201311dae x86/stackframe, x86/kprobes: Fix frame pointer annotations
The kprobe trampolines have a FRAME_POINTER annotation that makes no
sense. It marks the frame in the middle of pt_regs, at the place of
saving BP.

Change it to mark the pt_regs frame as per the ENCODE_FRAME_POINTER
from the respective entry_*.S.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:46 +02:00
Peter Zijlstra a9b3c6998d x86/stackframe: Move ENCODE_FRAME_POINTER to asm/frame.h
In preparation for wider use, move the ENCODE_FRAME_POINTER macros to
a common header and provide inline asm versions.

These macros are used to encode a pt_regs frame for the unwinder; see
unwind_frame.c:decode_frame_pointer().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:45 +02:00
Peter Zijlstra 5e1246ff2d x86/entry/32: Clean up return from interrupt preemption path
The code flow around the return from interrupt preemption point seems
needlessly complicated.

There is only one site jumping to resume_kernel, and none (outside of
resume_kernel) jumping to restore_all_kernel. Inline resume_kernel
in restore_all_kernel and avoid the CONFIG_PREEMPT dependent label.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:44 +02:00
Ingo Molnar c21ac93288 Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into x86/asm, to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-25 10:23:22 +02:00
Masahiro Yamada 87b61864d7 x86/build: Remove redundant 'clean-files += capflags.c'
All the files added to 'targets' are cleaned. Adding the same file to both
'targets' and 'clean-files' is redundant.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20190625073311.18303-1-yamada.masahiro@socionext.com
2019-06-25 09:52:06 +02:00
Masahiro Yamada bc53d3d777 x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c
Without 'set -e', shell scripts continue running even after any
error occurs. The missed 'set -e' is a typical bug in shell scripting.

For example, when a disk space shortage occurs while this script is
running, it actually ends up with generating a truncated capflags.c.

Yet, mkcapflags.sh continues running and exits with 0. So, the build
system assumes it has succeeded.

It will not be re-generated in the next invocation of Make since its
timestamp is newer than that of any of the source files.

Add 'set -e' so that any error in this script is caught and propagated
to the build system.

Since 9c2af1c737 ("kbuild: add .DELETE_ON_ERROR special target"),
make automatically deletes the target on any failure. So, the broken
capflags.c will be deleted automatically.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20190625072622.17679-1-yamada.masahiro@socionext.com
2019-06-25 09:52:05 +02:00
Reinette Chatre 2ef085bd11 x86/resctrl: Cleanup cbm_ensure_valid()
A recent fix to the cbm_ensure_valid() function left some coding style
issues that are now addressed:

- Return a value instead of using a function parameter as input and
  output
- Use if (!val) instead of if (val == 0)
- Follow reverse fir tree ordering of variable declarations

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/15ba03856f1d944468ee6f44e3fd7aa548293ede.1561408280.git.reinette.chatre@intel.com
2019-06-25 09:26:11 +02:00
Thomas Gleixner 4fedcde702 Merge branch 'x86/urgent' into x86/cache
Pick up pending upstream fixes to meet dependencies
2019-06-25 09:24:35 +02:00
YueHaibing bf10c97adb x86/jump_label: Make tp_vec_nr static
Fix sparse warning:

arch/x86/kernel/jump_label.c:106:5: warning:
 symbol 'tp_vec_nr' was not declared. Should it be static?

It's only used in jump_label.c, so make it static.

Fixes: ba54f0c3f7 ("x86/jump_label: Batch jump label updates")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <bp@alien8.de>
Cc: <hpa@zytor.com>
Cc: <peterz@infradead.org>
Cc: <bristot@redhat.com>
Cc: <namit@vmware.com>
Link: https://lkml.kernel.org/r/20190625034548.26392-1-yuehaibing@huawei.com
2019-06-25 09:22:14 +02:00
Linus Torvalds 249155c20f Merge branch 'parisc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
 "Add missing PCREL64 relocation in module loader to fix module load
  errors when the static branch and JUMP_LABEL feature is enabled on
  a 64-bit kernel"

* 'parisc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix module loading error with JUMP_LABEL feature
2019-06-25 05:52:31 +08:00
Dmitry Korotin 0b24cae4d5
MIPS: Add missing EHB in mtc0 -> mfc0 sequence.
Add a missing EHB (Execution Hazard Barrier) in mtc0 -> mfc0 sequence.
Without this execution hazard barrier it's possible for the value read
back from the KScratch register to be the value from before the mtc0.

Reproducible on P5600 & P6600.

The hazard is documented in the MIPS Architecture Reference Manual Vol.
III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
6.03 table 8.1 which includes:

   Producer | Consumer | Hazard
  ----------|----------|----------------------------
   mtc0     | mfc0     | any coprocessor 0 register

Signed-off-by: Dmitry Korotin <dkorotin@wavecomp.com>
[paul.burton@mips.com:
  - Commit message tweaks.
  - Add Fixes tags.
  - Mark for stable back to v3.15 where P5600 support was introduced.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 3d8bfdd030 ("MIPS: Use C0_KScratch (if present) to hold PGD pointer.")
Fixes: 829dcc0a95 ("MIPS: Add MIPS P5600 probe support")
Cc: linux-mips@vger.kernel.org
Cc: stable@vger.kernel.org # v3.15+
2019-06-24 13:43:12 -07:00
Jiri Olsa 637d97b53c perf/x86/rapl: Get quirk state from new probe framework
Getting the apply_quirk bool from new rapl_model_match array.

And because apply_quirk was the last remaining piece of data
in rapl_cpu_match, replacing it with rapl_model_match as device
table.

The switch to new perf_msr_probe detection API is done.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-9-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:36 +02:00
Jiri Olsa 5fc1bd8466 perf/x86/rapl: Get attributes from new probe framework
We no longer need model specific attribute arrays,
because we get all this detected in rapl_events_attrs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-8-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:35 +02:00
Jiri Olsa 122f1c51b1 perf/x86/rapl: Get MSR values from new probe framework
There's no need to have special code for getting
the bit and MSR value for given event. We can
now easily get it from rapl_msrs array.

Also getting rid of RAPL_IDX_*, which is no longer
needed and replacing INTEL_RAPL* with PERF_RAPL*
enums.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-7-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:34 +02:00
Jiri Olsa cd105aed1a perf/x86/rapl: Get rapl_cntr_mask from new probe framework
We get rapl_cntr_mask from perf_msr_probe call, as a replacement
for current intel_rapl_init_fun::cntr_mask value for each model.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-6-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:34 +02:00
Jiri Olsa 5fb5273a90 perf/x86/rapl: Use new MSR detection interface
Using perf_msr_probe function to probe for RAPL MSRs.

Adding new rapl_model_match device table, that
gathers events info for given model, following
the MSR and cstate module design.

It will replace the current rapl_cpu_match device
table and detection code in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-5-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:33 +02:00
Jiri Olsa 8f2a28c585 perf/x86/cstate: Use new probe function
Using perf_msr_probe function to probe for cstate events.

The functionality is the same, with one exception, that
perf_msr_probe checks for rdmsr to return value != 0 for
given MSR register.

Using the new attribute groups and adding the events via
pmu::attr_update.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-4-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:33 +02:00
Jiri Olsa dde5e72068 perf/x86/msr: Use new probe function
Using perf_msr_probe function to probe for msr events.

The functionality is the same, with one exception, that
perf_msr_probe checks for rdmsr to return value != 0 for
given MSR register.

Using the new attribute groups and adding the events via
pmu::attr_update.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-3-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:32 +02:00
Jiri Olsa 98253a546a perf/x86: Add MSR probe interface
Adding perf_msr_probe function to provide interface for
checking up on MSR register and set the related attribute
group visibility.

User defines following struct for each MSR register:

  struct perf_msr {
       u64                       msr;
       struct attribute_group   *grp;
       bool                    (*test)(int idx, void *data);
       bool                      no_check;
  };

Where:
  msr      - is the MSR address
  attrs    - is attribute groups array to add if the check passed
  test     - is test function pointer
  no_check - is bool that bypass the check and adds the
              attribute without any test

The array of struct perf_msr is passed into:

  perf_msr_probe(struct perf_msr *msr, int cnt, bool zero, void *data)

Together with:
  cnt  - which is the number of struct msr array elements
  data - which is user pointer passed to the test function
  zero - allow counters that returns zero on rdmsr

The perf_msr_probe will executed test code, read the MSR and
check the value is != 0. If all these tests pass, related
attribute group is kept visible.

Also adding PMU_EVENT_GROUP macro helper to define attribute
group for single attribute. It will be used in following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan <kan.liang@linux.intel.com>
Cc: Liang
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20190616140358.27799-2-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:28:31 +02:00
Ingo Molnar 9e6e87e62a Merge branch 'x86/cpu' into perf/core, to pick up dependent patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:26:39 +02:00
Ingo Molnar b9271f0c65 Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into perf/core, to refresh branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:25:52 +02:00
Vincent Guittot 8ec59c0f5f sched/topology: Remove unused 'sd' parameter from arch_scale_cpu_capacity()
The 'struct sched_domain *sd' parameter to arch_scale_cpu_capacity() is
unused since commit:

  765d0af19f ("sched/topology: Remove the ::smt_gain field from 'struct sched_domain'")

Remove it.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: gregkh@linuxfoundation.org
Cc: linux@armlinux.org.uk
Cc: quentin.perret@arm.com
Cc: rafael@kernel.org
Link: https://lkml.kernel.org/r/1560783617-5827-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:23:39 +02:00
Ingo Molnar d2abae71eb Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into sched/core, to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:19:53 +02:00
Kan Liang cd6b984f6d perf/x86: Remove pmu->pebs_no_xmm_regs
We don't need pmu->pebs_no_xmm_regs anymore, the capabilities
PERF_PMU_CAP_EXTENDED_REGS can be used to check if XMM registers
collection is supported.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-4-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:19:25 +02:00
Kan Liang dce86ac75d perf/x86: Clean up PEBS_XMM_REGS
Use generic macro PERF_REG_EXTENDED_MASK to replace PEBS_XMM_REGS to
avoid duplication.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:19:24 +02:00
Kan Liang 90d424915a perf/x86/regs: Check reserved bits
The perf fuzzer triggers a warning which map to:

        if (WARN_ON_ONCE(idx >= ARRAY_SIZE(pt_regs_offset)))
                return 0;

The bits between XMM registers and generic registers are reserved.
But perf_reg_validate() doesn't check these bits.

Add PERF_REG_X86_RESERVED for reserved bits on X86.
Check the reserved bits in perf_reg_validate().

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 878068ea27 ("perf/x86: Support outputting XMM registers")
Link: https://lkml.kernel.org/r/1559081314-9714-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:19:24 +02:00
Kan Liang e321d02db8 perf/x86: Disable extended registers for non-supported PMUs
The perf fuzzer caused Skylake machine to crash:

[ 9680.085831] Call Trace:
[ 9680.088301]  <IRQ>
[ 9680.090363]  perf_output_sample_regs+0x43/0xa0
[ 9680.094928]  perf_output_sample+0x3aa/0x7a0
[ 9680.099181]  perf_event_output_forward+0x53/0x80
[ 9680.103917]  __perf_event_overflow+0x52/0xf0
[ 9680.108266]  ? perf_trace_run_bpf_submit+0xc0/0xc0
[ 9680.113108]  perf_swevent_hrtimer+0xe2/0x150
[ 9680.117475]  ? check_preempt_wakeup+0x181/0x230
[ 9680.122091]  ? check_preempt_curr+0x62/0x90
[ 9680.126361]  ? ttwu_do_wakeup+0x19/0x140
[ 9680.130355]  ? try_to_wake_up+0x54/0x460
[ 9680.134366]  ? reweight_entity+0x15b/0x1a0
[ 9680.138559]  ? __queue_work+0x103/0x3f0
[ 9680.142472]  ? update_dl_rq_load_avg+0x1cd/0x270
[ 9680.147194]  ? timerqueue_del+0x1e/0x40
[ 9680.151092]  ? __remove_hrtimer+0x35/0x70
[ 9680.155191]  __hrtimer_run_queues+0x100/0x280
[ 9680.159658]  hrtimer_interrupt+0x100/0x220
[ 9680.163835]  smp_apic_timer_interrupt+0x6a/0x140
[ 9680.168555]  apic_timer_interrupt+0xf/0x20
[ 9680.172756]  </IRQ>

The XMM registers can only be collected by PEBS hardware events on the
platforms with PEBS baseline support, e.g. Icelake, not software/probe
events.

Add capabilities flag PERF_PMU_CAP_EXTENDED_REGS to indicate the PMU
which support extended registers. For X86, the extended registers are
XMM registers.

Add has_extended_regs() to check if extended registers are applied.

The generic code define the mask of extended registers as 0 if arch
headers haven't overridden it.

Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 878068ea27 ("perf/x86: Support outputting XMM registers")
Link: https://lkml.kernel.org/r/1559081314-9714-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-24 19:19:23 +02:00
Ard Biesheuvel 3f75070648 arm64: bpf: do not allocate executable memory
The BPF code now takes care of mapping the code pages executable
after mapping them read-only, to ensure that no RWX mapped regions
are needed, even transiently. This means we can drop the executable
permissions from the mapping at allocation time.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:39 +01:00
Ard Biesheuvel f83b4f8860 arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
In order to avoid transient inconsistencies where freed code pages
are remapped writable while stale TLB entries still exist on other
cores, mark the kprobes text pages with the VM_FLUSH_RESET_PERMS
attribute. This instructs the core vmalloc code not to defer the
TLB flush when this region is unmapped and returned to the page
allocator.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:39 +01:00
Ard Biesheuvel 4739d53fcd arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
Wire up the special helper functions to manipulate aliases of vmalloc
regions in the linear map.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:39 +01:00
Ard Biesheuvel 7dfac3c5f4 arm64: module: create module allocations without exec permissions
Now that the core code manages the executable permissions of code
regions of modules explicitly, it is no longer necessary to create
the module vmalloc regions with RWX permissions, and we can create
them with RW- permissions instead, which is preferred from a
security perspective.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:39 +01:00
Florian Fainelli 58557e486f arm64: Allow user selection of ARM64_MODULE_PLTS
Make ARM64_MODULE_PLTS a selectable Kconfig symbol, since some people
might have very big modules spilling out of the dedicated module area
into vmalloc. Help text is copied from the ARM 32-bit counterpart and
modified to a mention of KASLR and specific ARM errata workaround(s).

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:39 +01:00
Ard Biesheuvel 2af22f3ec3 acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
Some Qualcomm Snapdragon based laptops built to run Microsoft Windows
are clearly ACPI 5.1 based, given that that is the first ACPI revision
that supports ARM, and introduced the FADT 'arm_boot_flags' field,
which has a non-zero field on those systems.

So in these cases, infer from the ARM boot flags that the FADT must be
5.1 or later, and treat it as 5.1.

Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Graeme Gregory <graeme.gregory@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:38 +01:00
Linus Torvalds 26df62aaae powerpc fixes for 5.2 #6
One fix for a bug in our context id handling on 64-bit hash CPUs, which can lead
 to unrelated processes being able to read/write to each other's virtual memory.
 See the commit for full details.
 
 That is the fix for CVE-2019-12817.
 
 This also adds a kernel selftest for the bug.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdCheiAAoJEFHr6jzI4aWADMcP/3gC9mVintc5iFU+bi7O73d6
 ClHLkL7fqRsAiRthUVpRo6M8kdmKXnOy+Tqoy5dnJPmCTfjVIQzhEBwuHToaj9qs
 IaJKXrJFAg6ou2xcMjnyBk8CfPAKVPDDYKU2YcM8ODsFbketeKykRfNliw/91Z4t
 /cViOHGBY/oxlq4/MqG6n+OvYBf1c2/gqW25uG+gJzVEM/reCViHLj6Veqa6Cu0i
 9H4cNi4yE4aUsApqmNlJi4zJ0SMkwTOU1cRObQyUaK1njDUuIBp5IgGw2TxkThAq
 RXcsv14VwV+AGxkAkHEmc3rLvcL0P1E04J9HINBcVpShfGR5y3oUaxGsKhNgStLl
 Rex77/LBkVaV86pWvJTWVOcGz61EYu8/3Yh02zkzOlfMuVd6QjJhRGmnW55/Ntsz
 EOp93yXjRZycm6EZQvcITlFSUZ44htj9awK2xUvDHEPUIi+wkehjyq/F4ORCnxxH
 8kV6ZSNXsTZFYgHv8DOTortn9bGV9lEnFYn0wWCoej38gXQNb5ryYpSRuoOw5n5O
 cU+4z/Y9pHfrOzQpJxHLXQdhSGfoqNIxTHwDigxoBgGXRx/hdZWAsXP7AssFrTlJ
 V6p1VtKIdAhwmrSnTqTD0zFx0A3dunuhtNRgfzppvKVrcL4fJQyi3V0juUCigYJu
 Kv9LG+KrWZCfeQVp8kAf
 =y5oH
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "One fix for a bug in our context id handling on 64-bit hash CPUs,
  which can lead to unrelated processes being able to read/write to each
  other's virtual memory. See the commit for full details.

  That is the fix for CVE-2019-12817.

  This also adds a kernel selftest for the bug"

* tag 'powerpc-5.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Add test of fork with mapping above 512TB
  powerpc/mm/64s/hash: Reallocate context ids on fork
2019-06-24 21:20:39 +08:00
Sébastien Szymanski 3cf10132ac ARM: dts: imx6ul: fix PWM[1-4] interrupts
According to the i.MX6UL/L RM, table 3.1 "ARM Cortex A7 domain interrupt
summary", the interrupts for the PWM[1-4] go from 83 to 86.

Fixes: b9901fe84f ("ARM: dts: imx6ul: add pwm[1-4] nodes")
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2019-06-24 21:13:27 +08:00
Joerg Roedel ceedd5f74d Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into generic-dma-ops

Linux 5.2-rc6
2019-06-24 10:23:16 +02:00
Fenghua Yu bd9a0c97e5 x86/umwait: Add sysfs interface to control umwait maximum time
IA32_UMWAIT_CONTROL[31:2] determines the maximum time in TSC-quanta
that processor can stay in C0.1 or C0.2. A zero value means no maximum
time.

Each instruction sets its own deadline in the instruction's implicit
input EDX:EAX value. The instruction wakes up if the time-stamp counter
reaches or exceeds the specified deadline, or the umwait maximum time
expires, or a store happens in the monitored address range in umwait.

The administrator can write an unsigned 32-bit number to
/sys/devices/system/cpu/umwait_control/max_time to change the default
value. Note that a value of zero means there is no limit. The lower two
bits of the value must be zero.

[ tglx: Simplify the write function. Massage changelog ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Andy Lutomirski" <luto@kernel.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-5-git-send-email-fenghua.yu@intel.com
2019-06-24 01:44:20 +02:00
Fenghua Yu ff4b353f2e x86/umwait: Add sysfs interface to control umwait C0.2 state
C0.2 state in umwait and tpause instructions can be enabled or disabled
on a processor through IA32_UMWAIT_CONTROL MSR register.

By default, C0.2 is enabled and the user wait instructions results in
lower power consumption with slower wakeup time.

But in real time systems which require faster wakeup time although power
savings could be smaller, the administrator needs to disable C0.2 and all
umwait invocations from user applications use C0.1.

Create a sysfs interface which allows the administrator to control C0.2
state during run time.

Andy Lutomirski suggested to turn off local irqs before writing the MSR to
ensure the cached control value is not changed by a concurrent sysfs write
from a different CPU via IPI.

[ tglx: Simplified the update logic in the write function and got rid of
  	all the convoluted type casts. Added a shared update function and
	made the namespace consistent. Moved the sysfs create invocation.
	Massaged changelog ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Andy Lutomirski" <luto@kernel.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-4-git-send-email-fenghua.yu@intel.com
2019-06-24 01:44:20 +02:00
Fenghua Yu bd688c69b7 x86/umwait: Initialize umwait control values
umwait or tpause allows the processor to enter a light-weight
power/performance optimized state (C0.1 state) or an improved
power/performance optimized state (C0.2 state) for a period specified by
the instruction or until the system time limit or until a store to the
monitored address range in umwait.

IA32_UMWAIT_CONTROL MSR register allows the OS to enable/disable C0.2 on
the processor and to set the maximum time the processor can reside in C0.1
or C0.2.

By default C0.2 is enabled so the user wait instructions can enter the
C0.2 state to save more power with slower wakeup time.

Andy Lutomirski proposed to set the maximum umwait time to 100000 cycles by
default. A quote from Andy:

  "What I want to avoid is the case where it works dramatically differently
   on NO_HZ_FULL systems as compared to everything else. Also, UMWAIT may
   behave a bit differently if the max timeout is hit, and I'd like that
   path to get exercised widely by making it happen even on default
   configs."

A sysfs interface to adjust the time and the C0.2 enablement is provided in
a follow up change.

[ tglx: Renamed MSR_IA32_UMWAIT_CONTROL_MAX_TIME to
  	MSR_IA32_UMWAIT_CONTROL_TIME_MASK because the constant is used as
  	mask throughout the code.
	Massaged comments and changelog ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-3-git-send-email-fenghua.yu@intel.com
2019-06-24 01:44:19 +02:00
Fenghua Yu 6dbbf5ec9e x86/cpufeatures: Enumerate user wait instructions
umonitor, umwait, and tpause are a set of user wait instructions.

umonitor arms address monitoring hardware using an address. The
address range is determined by using CPUID.0x5. A store to
an address within the specified address range triggers the
monitoring hardware to wake up the processor waiting in umwait.

umwait instructs the processor to enter an implementation-dependent
optimized state while monitoring a range of addresses. The optimized
state may be either a light-weight power/performance optimized state
(C0.1 state) or an improved power/performance optimized state
(C0.2 state).

tpause instructs the processor to enter an implementation-dependent
optimized state C0.1 or C0.2 state and wake up when time-stamp counter
reaches specified timeout.

The three instructions may be executed at any privilege level.

The instructions provide power saving method while waiting in
user space. Additionally, they can allow a sibling hyperthread to
make faster progress while this thread is waiting. One example of an
application usage of umwait is when waiting for input data from another
application, such as a user level multi-threaded packet processing
engine.

Availability of the user wait instructions is indicated by the presence
of the CPUID feature flag WAITPKG CPUID.0x07.0x0:ECX[5].

Detailed information on the instructions and CPUID feature WAITPKG flag
can be found in the latest Intel Architecture Instruction Set Extensions
and Future Features Programming Reference and Intel 64 and IA-32
Architectures Software Developer's Manual.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Link: https://lkml.kernel.org/r/1560994438-235698-2-git-send-email-fenghua.yu@intel.com
2019-06-24 01:44:19 +02:00
Andy Lutomirski ecf9db3d1f x86/vdso: Give the [ph]vclock_page declarations real types
Clean up the vDSO code a bit by giving pvclock_page and hvclock_page
their actual types instead of u8[PAGE_SIZE].  This shouldn't
materially affect the generated code.

Heavily based on a patch from Linus.

[ tglx: Adapted to the unified VDSO code ]

Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/6920c5188f8658001af1fc56fd35b815706d300c.1561241273.git.luto@kernel.org
2019-06-24 01:21:31 +02:00
Christoph Hellwig ad97f9df0f riscv: add binfmt_flat support
Just use the generic definitions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00
Christoph Hellwig 6843d8aa5b binfmt_flat: remove the persistent argument from flat_get_addr_from_rp
The argument is never used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2019-06-24 09:16:47 +10:00