1
0
Fork 0
Commit Graph

152890 Commits (0d945c1f966b2bcb67bb12be749da0a7fb00201b)

Author SHA1 Message Date
KarimAllah Ahmed 22a7cdcae6 KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned
The spec only requires the posted interrupt descriptor address to be
64-bytes aligned (i.e. bits[0:5] == 0). Using page_address_valid also
forces the address to be page aligned.

Only validate that the address does not cross the maximum physical address
without enforcing a page alignment.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Fixes: 6de84e581c ("nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2")
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Krish Sadhuhan <krish.sadhukhan@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-10-24 12:47:16 +02:00
Linus Torvalds ba9f6f8954 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo updates from Eric Biederman:
 "I have been slowly sorting out siginfo and this is the culmination of
  that work.

  The primary result is in several ways the signal infrastructure has
  been made less error prone. The code has been updated so that manually
  specifying SEND_SIG_FORCED is never necessary. The conversion to the
  new siginfo sending functions is now complete, which makes it
  difficult to send a signal without filling in the proper siginfo
  fields.

  At the tail end of the patchset comes the optimization of decreasing
  the size of struct siginfo in the kernel from 128 bytes to about 48
  bytes on 64bit. The fundamental observation that enables this is by
  definition none of the known ways to use struct siginfo uses the extra
  bytes.

  This comes at the cost of a small user space observable difference.
  For the rare case of siginfo being injected into the kernel only what
  can be copied into kernel_siginfo is delivered to the destination, the
  rest of the bytes are set to 0. For cases where the signal and the
  si_code are known this is safe, because we know those bytes are not
  used. For cases where the signal and si_code combination is unknown
  the bits that won't fit into struct kernel_siginfo are tested to
  verify they are zero, and the send fails if they are not.

  I made an extensive search through userspace code and I could not find
  anything that would break because of the above change. If it turns out
  I did break something it will take just the revert of a single change
  to restore kernel_siginfo to the same size as userspace siginfo.

  Testing did reveal dependencies on preferring the signo passed to
  sigqueueinfo over si->signo, so bit the bullet and added the
  complexity necessary to handle that case.

  Testing also revealed bad things can happen if a negative signal
  number is passed into the system calls. Something no sane application
  will do but something a malicious program or a fuzzer might do. So I
  have fixed the code that performs the bounds checks to ensure negative
  signal numbers are handled"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
  signal: Guard against negative signal numbers in copy_siginfo_from_user32
  signal: Guard against negative signal numbers in copy_siginfo_from_user
  signal: In sigqueueinfo prefer sig not si_signo
  signal: Use a smaller struct siginfo in the kernel
  signal: Distinguish between kernel_siginfo and siginfo
  signal: Introduce copy_siginfo_from_user and use it's return value
  signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
  signal: Fail sigqueueinfo if si_signo != sig
  signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
  signal/unicore32: Use force_sig_fault where appropriate
  signal/unicore32: Generate siginfo in ucs32_notify_die
  signal/unicore32: Use send_sig_fault where appropriate
  signal/arc: Use force_sig_fault where appropriate
  signal/arc: Push siginfo generation into unhandled_exception
  signal/ia64: Use force_sig_fault where appropriate
  signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
  signal/ia64: Use the generic force_sigsegv in setup_frame
  signal/arm/kvm: Use send_sig_mceerr
  signal/arm: Use send_sig_fault where appropriate
  signal/arm: Use force_sig_fault where appropriate
  ...
2018-10-24 11:22:39 +01:00
Roger Pau Monne 7deecbda30 xen/pvh: increase early stack size
While booting on an AMD EPYC box the stack canary would detect stack
overflows when using the current PVH early stack size (256). Switch to
using the value defined by BOOT_STACK_SIZE, which prevents the stack
overflow.

Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2018-10-24 10:18:28 +02:00
Juergen Gross a856531951 xen: make xen_qlock_wait() nestable
xen_qlock_wait() isn't safe for nested calls due to interrupts. A call
of xen_qlock_kick() might be ignored in case a deeper nesting level
was active right before the call of xen_poll_irq():

CPU 1:                                   CPU 2:
spin_lock(lock1)
                                         spin_lock(lock1)
                                         -> xen_qlock_wait()
                                            -> xen_clear_irq_pending()
                                            Interrupt happens
spin_unlock(lock1)
-> xen_qlock_kick(CPU 2)
spin_lock_irqsave(lock2)
                                         spin_lock_irqsave(lock2)
                                         -> xen_qlock_wait()
                                            -> xen_clear_irq_pending()
                                               clears kick for lock1
                                            -> xen_poll_irq()
spin_unlock_irq_restore(lock2)
-> xen_qlock_kick(CPU 2)
                                            wakes up
                                         spin_unlock_irq_restore(lock2)
                                         IRET
                                           resumes in xen_qlock_wait()
                                           -> xen_poll_irq()
                                           never wakes up

The solution is to disable interrupts in xen_qlock_wait() and not to
poll for the irq in case xen_qlock_wait() is called in nmi context.

Cc: stable@vger.kernel.org
Cc: Waiman.Long@hp.com
Cc: peterz@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2018-10-24 10:18:04 +02:00
Juergen Gross 2ac2a7d4d9 xen: fix race in xen_qlock_wait()
In the following situation a vcpu waiting for a lock might not be
woken up from xen_poll_irq():

CPU 1:                CPU 2:                      CPU 3:
takes a spinlock
                      tries to get lock
                      -> xen_qlock_wait()
frees the lock
-> xen_qlock_kick(cpu2)
                        -> xen_clear_irq_pending()

takes lock again
                                                  tries to get lock
                                                  -> *lock = _Q_SLOW_VAL
                        -> *lock == _Q_SLOW_VAL ?
                        -> xen_poll_irq()
frees the lock
-> xen_qlock_kick(cpu3)

And cpu 2 will sleep forever.

This can be avoided easily by modifying xen_qlock_wait() to call
xen_poll_irq() only if the related irq was not pending and to call
xen_clear_irq_pending() only if it was pending.

Cc: stable@vger.kernel.org
Cc: Waiman.Long@hp.com
Cc: peterz@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2018-10-24 10:17:38 +02:00
Linus Torvalds 50b825d7e8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add VF IPSEC offload support in ixgbe, from Shannon Nelson.

 2) Add zero-copy AF_XDP support to i40e, from Björn Töpel.

 3) All in-tree drivers are converted to {g,s}et_link_ksettings() so we
    can get rid of the {g,s}et_settings ethtool callbacks, from Michal
    Kubecek.

 4) Add software timestamping to veth driver, from Michael Walle.

 5) More work to make packet classifiers and actions lockless, from Vlad
    Buslov.

 6) Support sticky FDB entries in bridge, from Nikolay Aleksandrov.

 7) Add ipv6 version of IP_MULTICAST_ALL sockopt, from Andre Naujoks.

 8) Support batching of XDP buffers in vhost_net, from Jason Wang.

 9) Add flow dissector BPF hook, from Petar Penkov.

10) i40e vf --> generic iavf conversion, from Jesse Brandeburg.

11) Add NLA_REJECT netlink attribute policy type, to signal when users
    provide attributes in situations which don't make sense. From
    Johannes Berg.

12) Switch TCP and fair-queue scheduler over to earliest departure time
    model. From Eric Dumazet.

13) Improve guest receive performance by doing rx busy polling in tx
    path of vhost networking driver, from Tonghao Zhang.

14) Add per-cgroup local storage to bpf

15) Add reference tracking to BPF, from Joe Stringer. The verifier can
    now make sure that references taken to objects are properly released
    by the program.

16) Support in-place encryption in TLS, from Vakul Garg.

17) Add new taprio packet scheduler, from Vinicius Costa Gomes.

18) Lots of selftests additions, too numerous to mention one by one here
    but all of which are very much appreciated.

19) Support offloading of eBPF programs containing BPF to BPF calls in
    nfp driver, frm Quentin Monnet.

20) Move dpaa2_ptp driver out of staging, from Yangbo Lu.

21) Lots of u32 classifier cleanups and simplifications, from Al Viro.

22) Add new strict versions of netlink message parsers, and enable them
    for some situations. From David Ahern.

23) Evict neighbour entries on carrier down, also from David Ahern.

24) Support BPF sk_msg verdict programs with kTLS, from Daniel Borkmann
    and John Fastabend.

25) Add support for filtering route dumps, from David Ahern.

26) New igc Intel driver for 2.5G parts, from Sasha Neftin et al.

27) Allow vxlan enslavement to bridges in mlxsw driver, from Ido
    Schimmel.

28) Add queue and stack map types to eBPF, from Mauricio Vasquez B.

29) Add back byte-queue-limit support to r8169, with all the bug fixes
    in other areas of the driver it works now! From Florian Westphal and
    Heiner Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2147 commits)
  tcp: add tcp_reset_xmit_timer() helper
  qed: Fix static checker warning
  Revert "be2net: remove desc field from be_eq_obj"
  Revert "net: simplify sock_poll_wait"
  net: socionext: Reset tx queue in ndo_stop
  net: socionext: Add dummy PHY register read in phy_write()
  net: socionext: Stop PHY before resetting netsec
  net: stmmac: Set OWN bit for jumbo frames
  arm64: dts: stratix10: Support Ethernet Jumbo frame
  tls: Add maintainers
  net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
  octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
  octeontx2-af: Support for setting MAC address
  octeontx2-af: Support for changing RSS algorithm
  octeontx2-af: NIX Rx flowkey configuration for RSS
  octeontx2-af: Install ucast and bcast pkt forwarding rules
  octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response
  octeontx2-af: NPC MCAM and LDATA extract minimal configuration
  octeontx2-af: Enable packet length and csum validation
  octeontx2-af: Support for VTAG strip and capture
  ...
2018-10-24 06:47:44 +01:00
Linus Torvalds a97a2d4d56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:
 "Mostly VDSO cleanups and optimizations"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Several small VDSO vclock_gettime.c improvements.
  sparc: Validate VDSO for undefined symbols.
  sparc: Really use linker with LDFLAGS.
  sparc: Improve VDSO CFLAGS.
  sparc: Set DISABLE_BRANCH_PROFILING in VDSO CFLAGS.
  sparc: Don't bother masking out TICK_PRIV_BIT in VDSO code.
  sparc: Inline VDSO gettime code aggressively.
  sparc: Improve VDSO instruction patching.
  sparc: Fix parport build warnings.
2018-10-24 06:42:00 +01:00
Linus Torvalds 44786880df Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "Lots of small fixes and enhancements, most noteably:

   - Many TLB and cache flush optimizations (Dave)

   - Fixed HPMC/crash handler on 64-bit kernel (Dave and myself)

   - Added alternative infrastructre. The kernel now live-patches itself
     for various situations, e.g. replace SMP code when running on one
     CPU only or drop cache flushes when system has no cache installed.

   - vmlinuz now contains a full copy of the compressed vmlinux file.
     This simplifies debugging the currently booted kernel.

   - Unused driver removal (Christoph)

   - Reduced warnings of Dino PCI bridge when running in qemu

   - Removed gcc version check (Masahiro)"

* 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (23 commits)
  parisc: Retrieve and display the PDC PAT capabilities
  parisc: Optimze cache flush algorithms
  parisc: Remove pte_inserted define
  parisc: Add PDC PAT cell_info() and pd_get_pdc_revisions() functions
  parisc: Drop two instructions from pte lookup code
  parisc: Use zdep for shlw macro on PA1.1 and PA2.0
  parisc: Add alternative coding infrastructure
  parisc: Include compressed vmlinux file in vmlinuz boot kernel
  extract-vmlinux: Check for uncompressed image as fallback
  parisc: Fix address in HPMC IVA
  parisc: Fix exported address of os_hpmc handler
  parisc: Fix map_pages() to not overwrite existing pte entries
  parisc: Purge TLB entries after updating page table entry and set page accessed flag in TLB handler
  parisc: Release spinlocks using ordered store
  parisc: Ratelimit dino stuck interrupt warnings
  parisc: dino: Utilize DINO_MASK_IRQ() macro
  parisc: Clean up crash header output
  parisc: Add SYSTEM_INFO and REGISTER TOC PAT functions
  parisc: Remove PTE load and fault check from L2_ptep macro
  parisc: Reorder TLB flush timing calculation
  ...
2018-10-23 20:02:03 +01:00
Linus Torvalds 07171da264 Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "The main item in this pull request are the Spectre variant 1.1 fixes
  from Julien Thierry.

  A few other patches to improve various areas, and removal of some
  obsolete mcount bits and a redundant kbuild conditional"

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8802/1: Call syscall_trace_exit even when system call skipped
  ARM: 8797/1: spectre-v1.1: harden __copy_to_user
  ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
  ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()
  ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit
  ARM: 8793/1: signal: replace __put_user_error with __put_user
  ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()
  ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state
  ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context
  ARM: 8789/1: signal: copy registers using __copy_to_user()
  ARM: 8801/1: makefile: use ARMv3M mode for RiscPC
  ARM: 8800/1: use choice for kernel unwinders
  ARM: 8798/1: remove unnecessary KBUILD_SRC ifeq conditional
  ARM: 8788/1: ftrace: remove old mcount support
  ARM: 8786/1: Debug kernel copy by printing
2018-10-23 19:32:10 +01:00
Linus Torvalds 034bda1cd5 Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
 "Two main changes:

   - Cleanups, simplifications and CLOCK_TAI support (Thomas Gleixner)

   - Improve code generation (Andy Lutomirski)"

* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Rearrange do_hres() to improve code generation
  x86/vdso: Document vgtod_ts better
  x86/vdso: Remove "memory" clobbers in the vDSO syscall fallbacks
  x66/vdso: Add CLOCK_TAI support
  x86/vdso: Move cycle_last handling into the caller
  x86/vdso: Simplify the invalid vclock case
  x86/vdso: Replace the clockid switch case
  x86/vdso: Collapse coarse functions
  x86/vdso: Collapse high resolution functions
  x86/vdso: Introduce and use vgtod_ts
  x86/vdso: Use unsigned int consistently for vsyscall_gtod_data:: Seq
  x86/vdso: Enforce 64bit clocksource
  x86/time: Implement clocksource_arch_init()
  clocksource: Provide clocksource_arch_init()
2018-10-23 19:07:25 +01:00
Linus Torvalds d82924c3b8 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti updates from Ingo Molnar:
 "The main changes:

   - Make the IBPB barrier more strict and add STIBP support (Jiri
     Kosina)

   - Micro-optimize and clean up the entry code (Andy Lutomirski)

   - ... plus misc other fixes"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Propagate information about RSB filling mitigation to sysfs
  x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
  x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
  x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
  x86/CPU: Fix unused variable warning when !CONFIG_IA32_EMULATION
  x86/pti/64: Remove the SYSCALL64 entry trampoline
  x86/entry/64: Use the TSS sp2 slot for SYSCALL/SYSRET scratch space
  x86/entry/64: Document idtentry
2018-10-23 18:43:04 +01:00
Linus Torvalds d7197a5ad8 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
 "Two minor OLPC changes: a build fix and a new quirk"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/olpc: Fix build error with CONFIG_MFD_CS5535=m
  x86/olpc: Indicate that legacy PC XO-1 platform should not register RTC
2018-10-23 18:17:11 +01:00
Linus Torvalds f682a7920b Merge branch 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Ingo Molnar:
 "Two main changes:

   - Remove no longer used parts of the paravirt infrastructure and put
     large quantities of paravirt ops under a new config option
     PARAVIRT_XXL=y, which is selected by XEN_PV only. (Joergen Gross)

   - Enable PV spinlocks on Hyperv (Yi Sun)"

* 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hyperv: Enable PV qspinlock for Hyper-V
  x86/hyperv: Add GUEST_IDLE_MSR support
  x86/paravirt: Clean up native_patch()
  x86/paravirt: Prevent redefinition of SAVE_FLAGS macro
  x86/xen: Make xen_reservation_lock static
  x86/paravirt: Remove unneeded mmu related paravirt ops bits
  x86/paravirt: Move the Xen-only pv_mmu_ops under the PARAVIRT_XXL umbrella
  x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella
  x86/paravirt: Move the Xen-only pv_cpu_ops under the PARAVIRT_XXL umbrella
  x86/paravirt: Move items in pv_info under PARAVIRT_XXL umbrella
  x86/paravirt: Introduce new config option PARAVIRT_XXL
  x86/paravirt: Remove unused paravirt bits
  x86/paravirt: Use a single ops structure
  x86/paravirt: Remove clobbers from struct paravirt_patch_site
  x86/paravirt: Remove clobbers parameter from paravirt patch functions
  x86/paravirt: Make paravirt_patch_call() and paravirt_patch_jmp() static
  x86/xen: Add SPDX identifier in arch/x86/xen files
  x86/xen: Link platform-pci-unplug.o only if CONFIG_XEN_PVHVM
  x86/xen: Move pv specific parts of arch/x86/xen/mmu.c to mmu_pv.c
  x86/xen: Move pv irq related functions under CONFIG_XEN_PV umbrella
2018-10-23 17:54:58 +01:00
Linus Torvalds 99792e0cea Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
 "Lots of changes in this cycle:

   - Lots of CPA (change page attribute) optimizations and related
     cleanups (Thomas Gleixner, Peter Zijstra)

   - Make lazy TLB mode even lazier (Rik van Riel)

   - Fault handler cleanups and improvements (Dave Hansen)

   - kdump, vmcore: Enable kdumping encrypted memory with AMD SME
     enabled (Lianbo Jiang)

   - Clean up VM layout documentation (Baoquan He, Ingo Molnar)

   - ... plus misc other fixes and enhancements"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  x86/stackprotector: Remove the call to boot_init_stack_canary() from cpu_startup_entry()
  x86/mm: Kill stray kernel fault handling comment
  x86/mm: Do not warn about PCI BIOS W+X mappings
  resource: Clean it up a bit
  resource: Fix find_next_iomem_res() iteration issue
  resource: Include resource end in walk_*() interfaces
  x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
  x86/mm: Remove spurious fault pkey check
  x86/mm/vsyscall: Consider vsyscall page part of user address space
  x86/mm: Add vsyscall address helper
  x86/mm: Fix exception table comments
  x86/mm: Add clarifying comments for user addr space
  x86/mm: Break out user address space handling
  x86/mm: Break out kernel address space handling
  x86/mm: Clarify hardware vs. software "error_code"
  x86/mm/tlb: Make lazy TLB mode lazier
  x86/mm/tlb: Add freed_tables element to flush_tlb_info
  x86/mm/tlb: Add freed_tables argument to flush_tlb_mm_range
  smp,cpumask: introduce on_each_cpu_cond_mask
  smp: use __cpumask_set_cpu in on_each_cpu_cond
  ...
2018-10-23 17:05:28 +01:00
Linus Torvalds 382d72a9aa Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 hyperv updates from Ingo Molnar:
 "Two small changes: a boot warning removal and a minor cleanup"

* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hyperv: Remove unused include
  x86/hyperv: Suppress "PCI: Fatal: No config space access function found"
2018-10-23 16:47:41 +01:00
Linus Torvalds ac73e08eda Merge branch 'x86-grub2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 grub2 updates from Ingo Molnar:
 "This extends the x86 boot protocol to include an address for the RSDP
  table - utilized by Xen currently.

  Matching Grub2 patches are pending as well. (Juergen Gross)"

* 'x86-grub2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/acpi, x86/boot: Take RSDP address for boot params if available
  x86/boot: Add ACPI RSDP address to setup_header
  x86/xen: Fix boot loader version reported for PVH guests
2018-10-23 16:31:33 +01:00
Linus Torvalds fec98069fb Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Add support for the "Dhyana" x86 CPUs by Hygon: these are licensed
     based on the AMD Zen architecture, and are built and sold in China,
     for domestic datacenter use. The code is pretty close to AMD
     support, mostly with a few quirks and enumeration differences. (Pu
     Wen)

   - Enable CPUID support on Cyrix 6x86/6x86L processors"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/cpupower: Add Hygon Dhyana support
  cpufreq: Add Hygon Dhyana support
  ACPI: Add Hygon Dhyana support
  x86/xen: Add Hygon Dhyana support to Xen
  x86/kvm: Add Hygon Dhyana support to KVM
  x86/mce: Add Hygon Dhyana support to the MCA infrastructure
  x86/bugs: Add Hygon Dhyana to the respective mitigation machinery
  x86/apic: Add Hygon Dhyana support
  x86/pci, x86/amd_nb: Add Hygon Dhyana support to PCI and northbridge
  x86/amd_nb: Check vendor in AMD-only functions
  x86/alternative: Init ideal_nops for Hygon Dhyana
  x86/events: Add Hygon Dhyana support to PMU infrastructure
  x86/smpboot: Do not use BSP INIT delay and MWAIT to idle on Dhyana
  x86/cpu/mtrr: Support TOP_MEM2 and get MTRR number
  x86/cpu: Get cache info and setup cache cpumap for Hygon Dhyana
  x86/cpu: Create Hygon Dhyana architecture support file
  x86/CPU: Change query logic so CPUID is enabled before testing
  x86/CPU: Use correct macros for Cyrix calls
2018-10-23 16:16:40 +01:00
Linus Torvalds 04ce7fae3d Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build update from Ingo Molnar:
 "A small cleanup to x86 Kconfigs"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kconfig: Remove redundant 'default n' lines from all x86 Kconfig's
2018-10-23 16:04:22 +01:00
Linus Torvalds 642116d4ac Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "Two cleanups and a bugfix for a rare boot option combination"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot/KASLR: Remove return value from handle_mem_options()
  x86/corruption-check: Use pr_*() instead of printk()
  x86/corruption-check: Fix panic in memory_corruption_check() when boot option without value is provided
2018-10-23 15:54:42 +01:00
Radim Krčmář f9dcf08e20 Revert "kvm: x86: optimize dr6 restore"
This reverts commit 0e0a53c551.

As Christian Ehrhardt noted:

  The most common case is that vcpu->arch.dr6 and the host's %dr6 value
  are not related at all because ->switch_db_regs is zero. To do this
  all correctly, we must handle the case where the guest leaves an arbitrary
  unused value in vcpu->arch.dr6 before disabling breakpoints again.

  However, this means that vcpu->arch.dr6 is not suitable to detect the
  need for a %dr6 clear.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-10-23 16:34:59 +02:00
Linus Torvalds e1d20beae7 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
 "The main changes in this cycle were the fsgsbase related preparatory
  patches from Chang S. Bae - but there's also an optimized
  memcpy_flushcache() and a cleanup for the __cmpxchg_double() assembly
  glue"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fsgsbase/64: Clean up various details
  x86/segments: Introduce the 'CPUNODE' naming to better document the segment limit CPU/node NR trick
  x86/vdso: Initialize the CPU/node NR segment descriptor earlier
  x86/vdso: Introduce helper functions for CPU and node number
  x86/segments/64: Rename the GDT PER_CPU entry to CPU_NUMBER
  x86/fsgsbase/64: Factor out FS/GS segment loading from __switch_to()
  x86/fsgsbase/64: Convert the ELF core dump code to the new FSGSBASE helpers
  x86/fsgsbase/64: Make ptrace use the new FS/GS base helpers
  x86/fsgsbase/64: Introduce FS/GS base helper functions
  x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately
  x86/asm: Use CC_SET()/CC_OUT() in __cmpxchg_double()
  x86/asm: Optimize memcpy_flushcache()
2018-10-23 15:24:22 +01:00
Linus Torvalds cbbfb0ae2c Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Ingo Molnar:
 "Improve the spreading of managed IRQs at allocation time"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irq/matrix: Spread managed interrupts on allocation
  irq/matrix: Split out the CPU selection code into a helper
2018-10-23 15:15:20 +01:00
Linus Torvalds 42f52e1c59 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes are:

   - Migrate CPU-intense 'misfit' tasks on asymmetric capacity systems,
     to better utilize (much) faster 'big core' CPUs. (Morten Rasmussen,
     Valentin Schneider)

   - Topology handling improvements, in particular when CPU capacity
     changes and related load-balancing fixes/improvements (Morten
     Rasmussen)

   - ... plus misc other improvements, fixes and updates"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
  sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions
  sched/completions/Documentation: Clean up the document some more
  sched/completions/Documentation: Fix a couple of punctuation nits
  cpu/SMT: State SMT is disabled even with nosmt and without "=force"
  sched/core: Fix comment regarding nr_iowait_cpu() and get_iowait_load()
  sched/fair: Remove setting task's se->runnable_weight during PELT update
  sched/fair: Disable LB_BIAS by default
  sched/pelt: Fix warning and clean up IRQ PELT config
  sched/topology: Make local variables static
  sched/debug: Use symbolic names for task state constants
  sched/numa: Remove unused numa_stats::nr_running field
  sched/numa: Remove unused code from update_numa_stats()
  sched/debug: Explicitly cast sched_feat() to bool
  sched/core: Disable SD_PREFER_SIBLING on asymmetric CPU capacity domains
  sched/fair: Don't move tasks to lower capacity CPUs unless necessary
  sched/fair: Set rq->rd->overload when misfit
  sched/fair: Wrap rq->rd->overload accesses with READ/WRITE_ONCE()
  sched/core: Change root_domain->overload type to int
  sched/fair: Change 'prefer_sibling' type to bool
  sched/fair: Kick nohz balance if rq->misfit_task_load
  ...
2018-10-23 15:00:03 +01:00
Linus Torvalds 0d1b82cd8a Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
 "Misc smaller fixes and cleanups"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mcelog: Remove one mce_helper definition
  x86/mce: Add macros for the corrected error count bit field
  x86/mce: Use BIT_ULL(x) for bit mask definitions
  x86/mce-inject: Reset injection struct after injection
2018-10-23 13:46:36 +01:00
Linus Torvalds c05f3642f4 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "The main updates in this cycle were:

   - Lots of perf tooling changes too voluminous to list (big perf trace
     and perf stat improvements, lots of libtraceevent reorganization,
     etc.), so I'll list the authors and refer to the changelog for
     details:

       Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter
       Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven
       Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas
       Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir
       Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa.

     ... with the bulk of the changes written by Jiri Olsa, Tzvetomir
     Stoyanov and Arnaldo Carvalho de Melo.

   - Continued intel_rdt work with a focus on playing well with perf
     events. This also imported some non-perf RDT work due to
     dependencies. (Reinette Chatre)

   - Implement counter freezing for Arch Perfmon v4 (Skylake and newer).
     This allows to speed up the PMI handler by avoiding unnecessary MSR
     writes and make it more accurate. (Andi Kleen)

   - kprobes cleanups and simplification (Masami Hiramatsu)

   - Intel Goldmont PMU updates (Kan Liang)

   - ... plus misc other fixes and updates"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits)
  kprobes/x86: Use preempt_enable() in optimized_callback()
  x86/intel_rdt: Prevent pseudo-locking from using stale pointers
  kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
  perf/x86/intel: Export mem events only if there's PEBS support
  x86/cpu: Drop pointless static qualifier in punit_dev_state_show()
  x86/intel_rdt: Fix initial allocation to consider CDP
  x86/intel_rdt: CBM overlap should also check for overlap with CDP peer
  x86/intel_rdt: Introduce utility to obtain CDP peer
  tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file
  tools lib traceevent: Separate out tep_strerror() for strerror_r() issues
  perf python: More portable way to make CFLAGS work with clang
  perf python: Make clang_has_option() work on Python 3
  perf tools: Free temporary 'sys' string in read_event_files()
  perf tools: Avoid double free in read_event_file()
  perf tools: Free 'printk' string in parse_ftrace_printk()
  perf tools: Cleanup trace-event-info 'tdata' leak
  perf strbuf: Match va_{add,copy} with va_end
  perf test: S390 does not support watchpoints in test 22
  perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG
  tools include: Adopt linux/bits.h
  ...
2018-10-23 13:32:18 +01:00
Linus Torvalds 0200fbdd43 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and misc x86 updates from Ingo Molnar:
 "Lots of changes in this cycle - in part because locking/core attracted
  a number of related x86 low level work which was easier to handle in a
  single tree:

   - Linux Kernel Memory Consistency Model updates (Alan Stern, Paul E.
     McKenney, Andrea Parri)

   - lockdep scalability improvements and micro-optimizations (Waiman
     Long)

   - rwsem improvements (Waiman Long)

   - spinlock micro-optimization (Matthew Wilcox)

   - qspinlocks: Provide a liveness guarantee (more fairness) on x86.
     (Peter Zijlstra)

   - Add support for relative references in jump tables on arm64, x86
     and s390 to optimize jump labels (Ard Biesheuvel, Heiko Carstens)

   - Be a lot less permissive on weird (kernel address) uaccess faults
     on x86: BUG() when uaccess helpers fault on kernel addresses (Jann
     Horn)

   - macrofy x86 asm statements to un-confuse the GCC inliner. (Nadav
     Amit)

   - ... and a handful of other smaller changes as well"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  locking/lockdep: Make global debug_locks* variables read-mostly
  locking/lockdep: Fix debug_locks off performance problem
  locking/pvqspinlock: Extend node size when pvqspinlock is configured
  locking/qspinlock_stat: Count instances of nested lock slowpaths
  locking/qspinlock, x86: Provide liveness guarantee
  x86/asm: 'Simplify' GEN_*_RMWcc() macros
  locking/qspinlock: Rework some comments
  locking/qspinlock: Re-order code
  locking/lockdep: Remove duplicated 'lock_class_ops' percpu array
  x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y
  futex: Replace spin_is_locked() with lockdep
  locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y
  x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs
  x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs
  x86/extable: Macrofy inline assembly code to work around GCC inlining bugs
  x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops
  x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs
  x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs
  x86/refcount: Work around GCC inlining bug
  x86/objtool: Use asm macros to work around GCC inlining bugs
  ...
2018-10-23 13:08:53 +01:00
Linus Torvalds de3fbb2aa8 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The main changes are:

   - Add support for enlisting the help of the EFI firmware to create
     memory reservations that persist across kexec.

   - Add page fault handling to the runtime services support code on x86
     so we can more gracefully recover from buggy EFI firmware.

   - Fix command line handling on x86 for the boot path that omits the
     stub's PE/COFF entry point.

   - Other assorted fixes and updates"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: boot: Fix EFI stub alignment
  efi/x86: Call efi_parse_options() from efi_main()
  efi/x86: earlyprintk - Add 64bit efi fb address support
  efi/x86: drop task_lock() from efi_switch_mm()
  efi/x86: Handle page faults occurring while running EFI runtime services
  efi: Make efi_rts_work accessible to efi page fault handler
  efi/efi_test: add exporting ResetSystem runtime service
  efi/libstub: arm: support building with clang
  efi: add API to reserve memory persistently across kexec reboot
  efi/arm: libstub: add a root memreserve config table
  efi: honour memory reservations passed via a linux specific config table
2018-10-23 13:04:03 +01:00
Ingo Molnar dda93b4538 Merge branch 'x86/cache' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-23 12:30:19 +02:00
Linus Torvalds e2b623fbe6 s390 updates for the 4.20 merge window
- Improved access control for the zcrypt driver, multiple device nodes
    can now be created with different access control lists
 
  - Extend the pkey API to provide random protected keys, this is useful
    for encrypted swap device with ephemeral protected keys
 
  - Add support for virtually mapped kernel stacks
 
  - Rework the early boot code, this moves the memory detection into the
    boot code that runs prior to decompression.
 
  - Add KASAN support
 
  - Bug fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJbzXKpAAoJEDjwexyKj9rg98YH/jZ5/kEYV44JsACroTNBC782
 6QLCvoCvSgXUAqRwnIfnxrcjrUVNW2aK6rOSsI/I8rQDsSA3boJ7FimoEI2BsUZG
 dcMy0hC47AYB7yKREQX3gdDEj8f0bn8v2ize5F6gwLkIx0A+aBUSivRQeYMaF8sn
 N/5OkSJwjCb+ZkNmDa3SHif+hC5+iL+q1hfuBdQkeCBok9pAqhyosRkgLe8CQgUV
 HGrvaWJ4FudIpg4tu2jL2OsNoZFX2pK5d+Up886+KGKQEUfiXKYtdmzX17Vd7PIk
 Vkf7EWUipzIA7UtrJ6pljoFsrNa+83jm4j5Dgy0ohadCVUBYLORte3yEl4P1EoM=
 =MMf0
 -----END PGP SIGNATURE-----

Merge tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:

 - Improved access control for the zcrypt driver, multiple device nodes
   can now be created with different access control lists

 - Extend the pkey API to provide random protected keys, this is useful
   for encrypted swap device with ephemeral protected keys

 - Add support for virtually mapped kernel stacks

 - Rework the early boot code, this moves the memory detection into the
   boot code that runs prior to decompression.

 - Add KASAN support

 - Bug fixes and cleanups

* tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
  s390/pkey: move pckmo subfunction available checks away from module init
  s390/kasan: support preemptible kernel build
  s390/pkey: Load pkey kernel module automatically
  s390/perf: Return error when debug_register fails
  s390/sthyi: Fix machine name validity indication
  s390/zcrypt: fix broken zcrypt_send_cprb in-kernel api function
  s390/vmalloc: fix VMALLOC_START calculation
  s390/mem_detect: add missing include
  s390/dumpstack: print psw mask and address again
  s390/crypto: Enhance paes cipher to accept variable length key material
  s390/pkey: Introduce new API for transforming key blobs
  s390/pkey: Introduce new API for random protected key verification
  s390/pkey: Add sysfs attributes to emit secure key blobs
  s390/pkey: Add sysfs attributes to emit protected key blobs
  s390/pkey: Define protected key blob format
  s390/pkey: Introduce new API for random protected key generation
  s390/zcrypt: add ap_adapter_mask sysfs attribute
  s390/zcrypt: provide apfs failure code on type 86 error reply
  s390/zcrypt: zcrypt device driver cleanup
  s390/kasan: add support for mem= kernel parameter
  ...
2018-10-23 11:14:47 +01:00
Linus Torvalds 70408a9987 Miscellaneous ia64 fixes from Christoph
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQQW3WBGcnu5yJnSXn0kTJLX0iGMLAUCW84v0RQcdG9ueS5sdWNr
 QGludGVsLmNvbQAKCRAkTJLX0iGMLGJGAP9fUhp7O4ef6PHxGtvmKHRqkTX6a4b5
 /oASkd4qIetgzAEA7hwUopUllbq13IRqc+1Z93wymj4vGjT+jV+2unI0ZAc=
 =3Yoq
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-next' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 updates from Tony Luck:
 "Miscellaneous ia64 fixes from Christoph"

* tag 'please-pull-next' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  intel-iommu: mark intel_dma_ops static
  ia64: remove machvec_dma_sync_{single,sg}
  ia64/sn2: remove no-ops dma sync methods
  ia64: remove the unused iommu_dma_init function
  ia64: remove the unused pci_iommu_shutdown function
  ia64: remove the unused bad_dma_address symbol
  ia64: remove iommu_dma_supported
  ia64: remove the dead iommu_sac_force variable
  ia64: remove the kern_mem_attribute export
2018-10-23 11:06:43 +01:00
Linus Torvalds 12dd08fa95 Power management updates for 4.20-rc1
- Backport hibernation bug fixes from x86-64 to x86-32 and
    consolidate hibernation handling on x86 to allow 32-bit
    systems to work in all of the cases in which 64-bit ones
    work (Zhimin Gu, Chen Yu).
 
  - Fix hibernation documentation (Vladimir D. Seleznev).
 
  - Update the menu cpuidle governor to fix a couple of issues
    with it, make it more efficient in some cases and clean it
    up (Rafael Wysocki).
 
  - Rework the cpuidle polling state implementation to make it
    more efficient (Rafael Wysocki).
 
  - Clean up the cpuidle core somewhat (Fieah Lim).
 
  - Fix the cpufreq conservative governor to take policy limits
    into account properly in some cases (Rafael Wysocki).
 
  - Add support for retrieving guaranteed performance information
    to the ACPI CPPC library and make the intel_pstate driver use
    it to expose the CPU base frequency via sysfs on systems with
    the hardware-managed P-states (HWP) feature enabled (Srinivas
    Pandruvada).
 
  - Fix clang warning in the CPPC cpufreq driver (Nathan Chancellor).
 
  - Get rid of device_node.name printing from cpufreq (Rob Herring).
 
  - Remove unnecessary unlikely() from the cpufreq core (Igor Stoppa).
 
  - Add support for the r8a7744 SoC to the cpufreq-dt driver (Biju Das).
 
  - Update the dt-platdev cpufreq driver to allow RK3399 to have
    separate tunables per cluster (Dmitry Torokhov).
 
  - Fix the dma_alloc_coherent() usage in the tegra186 cpufreq driver
    (Christoph Hellwig).
 
  - Make the imx6q cpufreq driver read OCOTP through nvmem for
    imx6ul/imx6ull (Anson Huang).
 
  - Fix several bugs in the operating performance points (OPP)
    framework and make it more stable (Viresh Kumar, Dave Gerlach).
 
  - Update the devfreq subsystem to take changes in the APIs used
    by into account, fix some issues with it and make it stop
    print device_node.name directly (Bjorn Andersson, Enric Balletbo
    i Serra, Matthias Kaehlcke, Rob Herring, Vincent Donnefort, zhong
    jiang).
 
  - Prepare the generic power domains (genpd) framework for dealing
    with domains containing CPUs (Ulf Hansson).
 
  - Prevent sysfs attributes representing low-power S0 residency
    counters from being exposed if low-power S0 support is not
    indicated in ACPI FADT (Rajneesh Bhardwaj).
 
  - Get rid of custom CPU features macros for Intel CPUs from the
    intel_idle and RAPL drivers (Andy Shevchenko).
 
  - Update the tasks freezer to list tasks that refused to freeze
    and caused a system transition to a sleep state to be aborted
    (Todd Brandt).
 
  - Update the pm-graph set of tools to v5.2 (Todd Brandt).
 
  - Fix some issues in the cpupower utility (Anders Roxell, Prarit
    Bhargava).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJbyaznAAoJEILEb/54YlRxUkoP/iOroh5pMW7PDa1g8sG26bfN
 ICln5Tt9lv1Euk3QALc5r05kLjyObfoMoDwvH2oiM0TgwSw6G64tm/ansTsvbPpc
 DCk53d0/gSqv5B1dZxV6OUYoXP0Z5hD+nW+1dg6EiGr1h24kesdEXdSB09bfTUY3
 N4zUurWDUD92havuV3PakI/d/aOdxlwt9drwxv/cx4/gSYS0q5KtB2/N8YdWrk8Q
 1UNwZkQLO8I0URfp9bwvwG3VhgKn0SKpLHlajq9KzWDPRgCl32oB0tY+3fOHW9Q+
 djgMRA7xlAzAcCCL0vYJnEja6uMenvx3hZa1m68ZWFr0C25LQ5V87IEyZ3znvJQu
 IlcY9jMbYkX8dZz1M8LZA+nOtyYM5GxvgylaQvHRn8fi0jzYJWfJbAKdyvEX94qz
 UWtY35ihXFVBkhJuSxDPzluhMwxtd5uux1zO09/KlpUg8nnhxRx5l7AF7k7YyRk9
 TZ5dVa6kp8CdmBZK6E9FNHstfvECL64oc9Ig3CB/bRXYBm60hN9pLXO2abJKV7dU
 FUe4kmWUNus5QKOzfGuPKJokw34/vxBW2CVrOeRUNcuaRhlUwuboijeLPf23XLI/
 fYDI4EiMxAZvcEZ5h0KKDS0MaLv4uy0LbAdrWx8Eg7pNeFUiovDgovYUF7HOmn6M
 BzesklDaXWUSPWxlnASg
 =WJgu
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These make hibernation on 32-bit x86 systems work in all of the cases
  in which it works on 64-bit x86 ones, update the menu cpuidle governor
  and the "polling" state to make them more efficient, add more hardware
  support to cpufreq drivers and fix issues with some of them, fix a bug
  in the conservative cpufreq governor, fix the operating performance
  points (OPP) framework and make it more stable, update the devfreq
  subsystem to take changes in the APIs used by into account and clean
  up some things all over.

  Specifics:

   - Backport hibernation bug fixes from x86-64 to x86-32 and
     consolidate hibernation handling on x86 to allow 32-bit systems to
     work in all of the cases in which 64-bit ones work (Zhimin Gu, Chen
     Yu).

   - Fix hibernation documentation (Vladimir D. Seleznev).

   - Update the menu cpuidle governor to fix a couple of issues with it,
     make it more efficient in some cases and clean it up (Rafael
     Wysocki).

   - Rework the cpuidle polling state implementation to make it more
     efficient (Rafael Wysocki).

   - Clean up the cpuidle core somewhat (Fieah Lim).

   - Fix the cpufreq conservative governor to take policy limits into
     account properly in some cases (Rafael Wysocki).

   - Add support for retrieving guaranteed performance information to
     the ACPI CPPC library and make the intel_pstate driver use it to
     expose the CPU base frequency via sysfs on systems with the
     hardware-managed P-states (HWP) feature enabled (Srinivas
     Pandruvada).

   - Fix clang warning in the CPPC cpufreq driver (Nathan Chancellor).

   - Get rid of device_node.name printing from cpufreq (Rob Herring).

   - Remove unnecessary unlikely() from the cpufreq core (Igor Stoppa).

   - Add support for the r8a7744 SoC to the cpufreq-dt driver (Biju
     Das).

   - Update the dt-platdev cpufreq driver to allow RK3399 to have
     separate tunables per cluster (Dmitry Torokhov).

   - Fix the dma_alloc_coherent() usage in the tegra186 cpufreq driver
     (Christoph Hellwig).

   - Make the imx6q cpufreq driver read OCOTP through nvmem for
     imx6ul/imx6ull (Anson Huang).

   - Fix several bugs in the operating performance points (OPP)
     framework and make it more stable (Viresh Kumar, Dave Gerlach).

   - Update the devfreq subsystem to take changes in the APIs used by
     into account, fix some issues with it and make it stop print
     device_node.name directly (Bjorn Andersson, Enric Balletbo i Serra,
     Matthias Kaehlcke, Rob Herring, Vincent Donnefort, zhong jiang).

   - Prepare the generic power domains (genpd) framework for dealing
     with domains containing CPUs (Ulf Hansson).

   - Prevent sysfs attributes representing low-power S0 residency
     counters from being exposed if low-power S0 support is not
     indicated in ACPI FADT (Rajneesh Bhardwaj).

   - Get rid of custom CPU features macros for Intel CPUs from the
     intel_idle and RAPL drivers (Andy Shevchenko).

   - Update the tasks freezer to list tasks that refused to freeze and
     caused a system transition to a sleep state to be aborted (Todd
     Brandt).

   - Update the pm-graph set of tools to v5.2 (Todd Brandt).

   - Fix some issues in the cpupower utility (Anders Roxell, Prarit
     Bhargava)"

* tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
  PM / Domains: Document flags for genpd
  PM / Domains: Deal with multiple states but no governor in genpd
  PM / Domains: Don't treat zero found compatible idle states as an error
  cpuidle: menu: Avoid computations when result will be discarded
  cpuidle: menu: Drop redundant comparison
  cpufreq: tegra186: don't pass GFP_DMA32 to dma_alloc_coherent()
  cpufreq: conservative: Take limits changes into account properly
  Documentation: intel_pstate: Add base_frequency information
  cpufreq: intel_pstate: Add base_frequency attribute
  ACPI / CPPC: Add support for guaranteed performance
  cpuidle: menu: Simplify checks related to the polling state
  PM / tools: sleepgraph and bootgraph: upgrade to v5.2
  PM / tools: sleepgraph: first batch of v5.2 changes
  cpupower: Fix coredump on VMWare
  cpupower: Fix AMD Family 0x17 msr_pstate size
  cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull
  cpufreq: dt-platdev: allow RK3399 to have separate tunables per cluster
  cpuidle: poll_state: Revise loop termination condition
  cpuidle: menu: Move the latency_req == 0 special case check
  cpuidle: menu: Avoid computations for very close timers
  ...
2018-10-23 10:28:21 +01:00
Linus Torvalds 114b5f8f7e This is the bulk of GPIO changes for the v4.20 series:
Core changes:
 
 - A patch series from Hans Verkuil to make it possible to
   enable/disable IRQs on a GPIO line at runtime and drive GPIO
   lines as output without having to put/get them from scratch.
   The irqchip callbacks have been improved so that they can
   use only the fastpatch callbacks to enable/disable irqs
   like any normal irqchip, especially the gpiod_lock_as_irq()
   has been improved to be callable in fastpath context.
   A bunch of rework had to be done to achieve this but it is
   a big win since I never liked to restrict this to slowpath.
   The only call requireing slowpath was try_module_get() and
   this is kept at the .request_resources() slowpath callback.
   In the GPIO CEC driver this is a big win sine a single
   line is used for both outgoing and incoming traffic, and
   this needs to use IRQs for incoming traffic while actively
   driving the line for outgoing traffic.
 
 - Janusz Krzysztofik improved the GPIO array API to pass a
   "cookie" (struct gpio_array) and a bitmap for setting or
   getting multiple GPIO lines at once. This improvement
   orginated in a specific need to speed up an OMAP1 driver and
   has led to a much better API and real performance gains
   when the state of the array can be used to bypass a lot
   of checks and code when we want things to go really fast.
   The previous code would minimize the number of calls
   down to the driver callbacks assuming the CPU speed was
   orders of magnitude faster than the I/O latency, but this
   assumption was wrong on several platforms: what we needed
   to do was to profile and improve the speed on the hot
   path of the array functions and this change is now
   completed.
 
 - Clean out the painful and hard to grasp BNF experiments
   from the device tree bindings. Future approaches are looking
   into using JSON schema for this purpose. (Rob Herring
   is floating a patch series.)
 
 New drivers:
 
 - The RCAR driver now supports r8a774a1 (RZ/G2M).
 
 - Synopsys GPIO via CREGs driver.
 
 Major improvements:
 
 - Modernization of the EP93xx driver to use irqdomain and
   other contemporary concepts.
 
 - The ingenic driver has been merged into the Ingenic pin
   control driver and removed from the GPIO subsystem.
 
 - Debounce support in the ftgpio010 driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbzdyOAAoJEEEQszewGV1zfYcP/0HBEAOPhHD/i5OQxfKs1msh
 mFT/t/IbTmRpCgbEv4CDx4Kc/InE0sUnQr1TL/1WvU6uObM6Ncxq5Z90MvyrgzYu
 BqQHq2k2tORvkVSNRxcfD/BAAoo1EerXts1kDhutvdKfepfS6DxpENwzvsFgkVlq
 2jj1cdZztjv8A+9cspHDpQP+jDvl1VSc10nR5fRu1TttSpUwzRJaB30NBNXJmMJc
 5KUr67lEbsQRPsBvFErU11bydPqhfT+pXmODcfIwS0EtATQ8WC5mkSb/Ooei0fvT
 oZ7uR3Os8tMf7isOKssEyFabKwhnfOEt6TBt9em0TfUtInOo0Dc7r8TfBcn57fyZ
 xg2R9DQEVRfac8bjhF/BI5KHuN9IMGDDvj6XApumQVliZbISRjMnh3jte6RpcV0A
 Ejqz8FeDY13qvEdOnW1EPpwmXdDVWiEAq0ebGLStKNls+/4gB2HmyxGUOzJf+og5
 hujsxcJzGQqjCe0moeY/1d7vsy0ZjbHoS+p5fy79U212y2O7onEzFU92AX89bxKC
 rx2eCNmiZxCUy1nqu8edO62VnH6QdnqG3o+a4DJfCSHPvFM/E/NX9zHemZubQQ4I
 rYXNy4bL4tEG9cqWMfBxWrpiDZw7H6l8kXwdZG8IMyRU9BcKu96amgZ+jBXwzoaB
 JZelAAUWB9APghJYFr7o
 =YosT
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v4.20 series:

  Core changes:

   - A patch series from Hans Verkuil to make it possible to
     enable/disable IRQs on a GPIO line at runtime and drive GPIO lines
     as output without having to put/get them from scratch.

     The irqchip callbacks have been improved so that they can use only
     the fastpatch callbacks to enable/disable irqs like any normal
     irqchip, especially the gpiod_lock_as_irq() has been improved to be
     callable in fastpath context.

     A bunch of rework had to be done to achieve this but it is a big
     win since I never liked to restrict this to slowpath. The only call
     requireing slowpath was try_module_get() and this is kept at the
     .request_resources() slowpath callback. In the GPIO CEC driver this
     is a big win sine a single line is used for both outgoing and
     incoming traffic, and this needs to use IRQs for incoming traffic
     while actively driving the line for outgoing traffic.

   - Janusz Krzysztofik improved the GPIO array API to pass a "cookie"
     (struct gpio_array) and a bitmap for setting or getting multiple
     GPIO lines at once.

     This improvement orginated in a specific need to speed up an OMAP1
     driver and has led to a much better API and real performance gains
     when the state of the array can be used to bypass a lot of checks
     and code when we want things to go really fast.

     The previous code would minimize the number of calls down to the
     driver callbacks assuming the CPU speed was orders of magnitude
     faster than the I/O latency, but this assumption was wrong on
     several platforms: what we needed to do was to profile and improve
     the speed on the hot path of the array functions and this change is
     now completed.

   - Clean out the painful and hard to grasp BNF experiments from the
     device tree bindings. Future approaches are looking into using JSON
     schema for this purpose. (Rob Herring is floating a patch series.)

  New drivers:

   - The RCAR driver now supports r8a774a1 (RZ/G2M).

   - Synopsys GPIO via CREGs driver.

  Major improvements:

   - Modernization of the EP93xx driver to use irqdomain and other
     contemporary concepts.

   - The ingenic driver has been merged into the Ingenic pin control
     driver and removed from the GPIO subsystem.

   - Debounce support in the ftgpio010 driver"

* tag 'gpio-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (116 commits)
  gpio: Clarify kerneldoc on gpiochip_set_chained_irqchip()
  gpio: Remove unused 'irqchip' argument to gpiochip_set_cascaded_irqchip()
  gpio: Drop parent irq assignment during cascade setup
  mmc: pwrseq_simple: Fix incorrect handling of GPIO bitmap
  gpio: fix SNPS_CREG kconfig dependency warning
  gpiolib: Initialize gdev field before is used
  gpio: fix kernel-doc after devres.c file rename
  gpio: fix doc string for devm_gpiochip_add_data() to not talk about irq_chip
  gpio: syscon: Fix possible NULL ptr usage
  gpiolib: Show correct direction from the beginning
  pinctrl: msm: Use init_valid_mask exported function
  gpiolib: Add init_valid_mask exported function
  GPIO: add single-register GPIO via CREG driver
  dt-bindings: Document the Synopsys GPIO via CREG bindings
  gpio: mockup: use device properties instead of platform_data
  gpio: Slightly more helpful debugfs
  gpio: omap: Remove set but not used variable 'dev'
  gpio: omap: drop omap_gpio_list
  Accept partial 'gpio-line-names' property.
  gpio: omap: get rid of the conditional PM runtime calls
  ...
2018-10-23 08:45:05 +01:00
Linus Torvalds 1650ac5306 MMC core:
- Introduce a host helper function to share re-tuning progress
 
 MMC host:
  - sdhci: Add support for v4 host mode
  - sdhci-of-arasan: Add Support for AM654 MMC and PHY
  - sdhci-sprd: Add support for Spreadtrum's host controller
  - sdhci-tegra: Add support for HS400 enhanced strobe
  - sdhci-tegra: Enable UHS/HS200 modes for Tegra186/210
  - sdhci-tegra: Add support for HS400 delay line calibration
  - sdhci-tegra: Add support for pad calibration
  - sdhci-of-dwcmshc: Address 128MB DMA boundary limitation
  - sdhci-of-esdhc: Add support for tuning erratum A008171
  - sdhci-iproc: Add ACPI support
  - mediatek: Add support for MT8183
  - mediatek: Improve the support for tuning
  - mediatek: Add bus clock control for MT2712
  - jz4740: Add support for the JZ4725B
  - mmci: Add support for the stm32 sdmmc variant
  - mmci: Add support for an optional reset control
  - mmci: Add some new variant specific properties/callbacks
  - mmci: Re-structure DMA code to prepare for new variants
  - renesas_sdhi: Add support for r8a77470, r8a7744 and r8a774a1
  - renesas_sdhi_internal_dmac: Whitelist r8a77970 and r8a774a1
  - tmio/uniphier-sd: Add new UniPhier SD/eMMC controller driver
  - tmio/renesas_sdhi: Deal properly with SCC detection during re-tune
  - tmio/renesas_sdhi: Refactor/consolidate clock management
  - omap_hsmmc: Drop cover detection and some unused platform data
  - dw_mmc-exynos: Enable tuning for more speed modes
  - sunxi: Clarify the new timing mode and enable it for the A64 controller
  - various: Convert to slot GPIO descriptors
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAlvNvV0XHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCmstxAAzI10FKr/RiEd4l1w3AFcjPZ4
 iGrsepaKg38LKeWwVXdu0FGfE+phcR7IrR6UqEzvPMFoajRiZa6K6q52r+gwDimA
 wNpTfci3yaRSJjg99DmQ5Eu1cLNbApQYzczkq7+hDna1PABm3ddT7HSlcuweALkV
 73Ozt5PeBtYDofTgXZos2pPJc3Uqfwe6ZWdL95Q/I45SragQtnBce6P8O/oOP3RL
 2Dzqy3jiXTuOCruV94Wwse4MBwec92pE4l1FKFQ505Hi8D0bs31CMLggXb49VDxG
 kl4zPqppbYdGqBpiCoKOMis1rYmSlgrskYzCGcq/TVzzQWNfj3sXtUAXDm31oy7c
 syZalIVejUpPiIfL01be92nA2dBHRuNaraCj6Mrx9NVkaKnKYhUax26gbN0PfZXt
 6/OgIK5kcJ2Imf5j9hNjPChQ4uudjzaFi9K+1WWpZ+XC/b3JxdGhKKxibWwzz4+V
 kOk4RBp+aWBnUUTBPYo3syhicU7dUfcfTSCHz7+XQnWphZvog3W05onCdsi4O1GA
 s24w3rXZC3OOBsKGt5pE1XyV+oDfv/4BpxNiIe/utnW3bS+0OagpLgV8FBU1pKNP
 /EMEvnHV2G1rEF1C8y6ZJgGTFan8pCR/k5MbadeRChmwK/6B/AhW9KIKamHCJoE4
 5rh0gyWdmJHMLj62ssw=
 =/4yi
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC updates from Ulf Hansson:
 "MMC core:
   - Introduce a host helper function to share re-tuning progress

  MMC host:
   - sdhci: Add support for v4 host mode
   - sdhci-of-arasan: Add Support for AM654 MMC and PHY
   - sdhci-sprd: Add support for Spreadtrum's host controller
   - sdhci-tegra: Add support for HS400 enhanced strobe
   - sdhci-tegra: Enable UHS/HS200 modes for Tegra186/210
   - sdhci-tegra: Add support for HS400 delay line calibration
   - sdhci-tegra: Add support for pad calibration
   - sdhci-of-dwcmshc: Address 128MB DMA boundary limitation
   - sdhci-of-esdhc: Add support for tuning erratum A008171
   - sdhci-iproc: Add ACPI support
   - mediatek: Add support for MT8183
   - mediatek: Improve the support for tuning
   - mediatek: Add bus clock control for MT2712
   - jz4740: Add support for the JZ4725B
   - mmci: Add support for the stm32 sdmmc variant
   - mmci: Add support for an optional reset control
   - mmci: Add some new variant specific properties/callbacks
   - mmci: Re-structure DMA code to prepare for new variants
   - renesas_sdhi: Add support for r8a77470, r8a7744 and r8a774a1
   - renesas_sdhi_internal_dmac: Whitelist r8a77970 and r8a774a1
   - tmio/uniphier-sd: Add new UniPhier SD/eMMC controller driver
   - tmio/renesas_sdhi: Deal properly with SCC detection during re-tune
   - tmio/renesas_sdhi: Refactor/consolidate clock management
   - omap_hsmmc: Drop cover detection and some unused platform data
   - dw_mmc-exynos: Enable tuning for more speed modes
   - sunxi: Clarify the new timing mode and enable it for the A64 controller
   - various: Convert to slot GPIO descriptors"

* tag 'mmc-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (129 commits)
  mmc: mediatek: drop too much code of tuning method
  mmc: mediatek: add MT8183 MMC driver support
  mmc: mediatek: tune CMD/DATA together
  mmc: mediatek: fix cannot receive new request when msdc_cmd_is_ready fail
  mmc: mediatek: fill the actual clock for mmc debugfs
  mmc: dt-bindings: add support for MT8183 SoC
  mmc: uniphier-sd: avoid using broken DMA RX channel
  mmc: uniphier-sd: fix DMA disabling
  mmc: tmio: simplify the DMA mode test
  mmc: tmio: remove TMIO_MMC_HAVE_HIGH_REG flag
  mmc: tmio: move MFD variant reset to a platform hook
  mmc: renesas_sdhi: Add r8a77470 SDHI1 support
  dt-bindings: mmc: renesas_sdhi: Add r8a77470 support
  mmc: mmci: add stm32 sdmmc variant
  dt-bindings: mmci: add stm32 sdmmc variant
  mmc: mmci: add stm32 sdmmc registers
  mmc: mmci: add clock divider for stm32 sdmmc
  mmc: mmci: add optional reset property
  dt-bindings: mmci: add optional reset property
  mmc: mmci: add variant property to not read datacnt
  ...
2018-10-23 08:36:15 +01:00
Thor Thayer a27460c976 arm64: dts: stratix10: Support Ethernet Jumbo frame
Properly specify the RX and TX FIFO size which is important
for Jumbo frames.
Update the max-frame-size to support Jumbo frames.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 20:22:59 -07:00
Linus Torvalds ca9eb48fe0 regulator: Regulator updates for next release
The biggest chunk of the regulator changes for this release outside of
 the new drivers is the conversion of the fixed regulator to use the GPIO
 descriptor API, there's a small addition to the GPIO API plus a bunch of
 updates to board files to implement it.  This is some really welcome
 work from Linus Walleij that's had a bunch of review and has been
 sitting in -next for a while so I'm fairly happy there's no major
 issues.
 
  - Helpers for overlapping linear ranges.
  - Display opmode and consumer requested load in the regualtor_summary
    file in debugfs, plus a fix there.
  - Support for the fun and entertaining power off mechanism that the
    pfuze100 hardware implements.
  - Conversion of the fixed regulator API to use GPIO descriptors,
    including pulling in a bunch of patches to a bunch of board files.
  - New drivers for Cirrus Logic Lochnagar, Qualcomm PMS405, Rohm
    BD71847, ST PMIC1, and TI LM363x devices.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlvNyl0THGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0DubB/4nWL/XSUb7qIm2fjjMUffelfk4/viB
 MZg3JPEMr7ahK+QC1RQ5nOmkuACSU3Uij8RE1omLp5isfCiSa+e17f9uQx4Cn/pw
 9DsIeJUEC4LvZ9gA9pDf0313B/0BIYfOMJToyLgwTNmJl+T+0e59RcS4TTCEqxwD
 PmpPakOvCTD6YuVI7HhYL/HXJK1buvrAiENSjCyfyJTDaMSzJl6WMn+eibFaZbDn
 NXwj2W+QyuiFCdl/7/4NWaqhlyOvM05ivnnLM/SPMBj+Iu4gSZ0PX81z98eZ3M66
 YSPhF2o5SkhhffFx5xjpgR3VquXDVb0oefzhvJmZYHXi7ZKKMruoYaGB
 =Ztu3
 -----END PGP SIGNATURE-----

Merge tag 'regulator-v5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "The biggest chunk of the regulator changes for this release outside of
  the new drivers is the conversion of the fixed regulator to use the
  GPIO descriptor API, there's a small addition to the GPIO API plus a
  bunch of updates to board files to implement it. This is some really
  welcome work from Linus Walleij that's had a bunch of review and has
  been sitting in -next for a while so I'm fairly happy there's no major
  issues.

   - Helpers for overlapping linear ranges.

   - Display opmode and consumer requested load in the regualtor_summary
     file in debugfs, plus a fix there.

   - Support for the fun and entertaining power off mechanism that the
     pfuze100 hardware implements.

   - Conversion of the fixed regulator API to use GPIO descriptors,
     including pulling in a bunch of patches to a bunch of board files.

   - New drivers for Cirrus Logic Lochnagar, Qualcomm PMS405, Rohm
     BD71847, ST PMIC1, and TI LM363x devices"

* tag 'regulator-v5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (36 commits)
  regulator: lochnagar: Use a consisent comment style for SPDX header
  regulator: bd718x7: Remove struct bd718xx_pmic
  regulator: Fetch enable gpiods nonexclusive
  regulator/gpio: Allow nonexclusive GPIO access
  regulator: lochnagar: Add support for the Cirrus Logic Lochnagar
  regulator: stpmic1: Return REGULATOR_MODE_INVALID for invalid mode
  regulator: stpmic1: add stpmic1 regulator driver
  dt-bindings: regulator: document stpmic1 pmic regulators
  regulator: axp20x: Mark expected switch fall-throughs
  regulator: bd718xx: fix build warning on x86_64
  regulator: fixed: Default enable high on DT regulators
  regulator: bd718xx: rename bd71837 to 718xx
  regulator: bd718XX use pickable ranges
  regulator/mfd: bd718xx: rename bd71837/bd71847 common instances
  regulator: Support regulators where voltage ranges are selectable
  mfd: dt bindings: add BD71847 device-tree binding documentation
  regulator: dt bindings: add BD71847 device-tree binding documentation
  regulator/mfd: Support ROHM BD71847 power management IC
  regulator: da905{2,5}: Remove unnecessary array check
  regulator: qcom: Add PMS405 regulators
  ...
2018-10-23 01:54:44 +01:00
David S. Miller 19832d2449 sparc: Several small VDSO vclock_gettime.c improvements.
Almost entirely borrowed from the x86 code.

Main improvement is to avoid having to initialize
ts->tv_nsec to zero before the sequence loops, by
expanding timespec_add_ns().

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 17:42:10 -07:00
Palmer Dabbelt d26c4bbf99
RISC-V: SMP cleanup and new features
This patch series now has evolved to contain several related changes.

1. Updated the assorted cleanup series by Palmer.
The original cleanup patch series can be found here.
http://lists.infradead.org/pipermail/linux-riscv/2018-August/001232.html

2. Implemented decoupling linux logical CPU ids from hart id.
Some of the work has been inspired from ARM64.
Tested on QEMU & HighFive Unleashed board with/without SMP enabled.

3. Included Anup's cleanup and IPI stat patch.

All the patch series have been combined to avoid conflicts as a lot of
common code is changed different patch sets. Atish has mostly addressed
review comments and fixed checkpatch errors from Palmer's and Anup's
series.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:41:43 -07:00
Palmer Dabbelt a6de21baf6
RISC-V: Fix some RV32 bugs and build failures
This patch set fixes up various failures in the RV32I port.  The fixes
are all nominally independent, but are really only testable together
because the RV32I port fails to build without all of them.  The patch
set includes:

* The removal of tishift on RV32I targets, as 128-bit integers are not
  supported by the toolchain.
* The removal of swiotlb from RV32I targets, since all physical
  addresses can be mapped by all hardware on all existing RV32I targets.
* The addition of ummodi3 and udivmoddi4 from an old version of GCC that
  was licensed under GPLv2 as generic code, along with their use on
  RV32I targets.
* A fix to our page alignment logic within ioremap for RV32I targets.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:39:08 -07:00
Palmer Dabbelt 4e4101cfef
riscv: Add support to no-FPU systems
This patchset adds an option, CONFIG_FPU, to enable/disable floating-
point support within the kernel.  The kernel's new behavior will be as
follows:

* with CONFIG_FPU=y
  All FPU codes are reserved.  If no FPU is found during booting, a
  global flag will be set, and those functions will be bypassed with
  condition check to that flag.

* with CONFIG_FPU=n
  No floating-point instructions in kernel and all related settings
  are excluded.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:26 -07:00
Nick Kossifidis aef53f97b5
RISC-V: Cosmetic menuconfig changes
* Move the built-in cmdline configuration on a new menu entry "Boot
  options", it doesn't make much sense to be part of the debuging menu.

* Rename "Kernel Type" menu to "Kernel features" to be more consistent with
  what other architectures are using, plus "type" is a bit misleading here.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:20 -07:00
Masahiro Yamada ee5928843a
riscv: move GCC version check for ARCH_SUPPORTS_INT128 to Kconfig
This becomes much neater in Kconfig.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:15 -07:00
Christoph Hellwig f31b8de988
RISC-V: remove the unused return_to_handler export
This export is not only not needed, but also breaks symbol versioning
due to being an undeclared assembly export.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:12 -07:00
Jim Wilson b90edb3301
RISC-V: Add futex support.
Here is an attempt to add the missing futex support.  I started with the MIPS
version of futex.h and modified it until I got it working.  I tested it on
a HiFive Unleashed running Fedora Core 29 using the fc29 4.15 version of the
kernel.  This was tested against the glibc testsuite, where it fixes 14 nptl
related testsuite failures.  That unfortunately only tests the cmpxchg support,
so I also used the testcase at the end of

    https://lwn.net/Articles/148830/

which tests the atomic_op functionality, except that it doesn't verify that
the operations are atomic, which they obviously are.  This testcase runs
successfully with the patch and fails without it.

I'm not a kernel expert, so there could be details I got wrong here.  I wasn't
sure about the memory model support, so I used aqrl which seemed safest, and
didn't add fences which seemed unnecessary.  I'm not sure about the copyright
statements, I left in Ralf Baechle's line because I started with his code.
Checkpatch reports some style problems, but it is the same style as the MIPS
futex.h, and the uses of ENOSYS appear correct even though it complains about
them.  I don't know if any of that matters.

This patch was tested on qemu with the glibc nptl/tst-cond-except
testcase, and the wake_op testcase from above.

Signed-off-by: Jim Wilson <jimw@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:08 -07:00
Jim Wilson b8c8a9590e
RISC-V: Add FP register ptrace support for gdb.
Add a variable and a macro to describe FP registers, assuming only D is
supported.  FP code is conditional on CONFIG_FPU.  The FP regs and FCSR
are copied separately to avoid copying struct padding.  Tested by hand and
with the gdb testsuite.

Signed-off-by: Jim Wilson <jimw@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:04 -07:00
Palmer Dabbelt 86e581e310
RISC-V: Mask out the F extension on systems without D
The RISC-V Linux port doesn't support systems that have the F extension
but don't have the D extension -- we actually don't support systems
without D either, but Alan's patch set is rectifying that soon.  For now
I think we can leave this in a semi-broken state and just wait for
Alan's patch set to get merged for proper non-FPU support -- the patch
set is starting to look good, so doing something in-between doesn't seem
like it's worth the work.

I don't think it's worth fretting about support for systems with F but
not D for now: our glibc ABIs are IMAC and IMAFDC so they probably won't
end up being popular.  We can always extend this in the future.

CC: Alan Kao <alankao@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:38:00 -07:00
Palmer Dabbelt 1760debb51
RISC-V: Don't set cacheinfo.{physical_line_partition,attributes}
These are just hard coded in the RISC-V port, which doesn't make any
sense.  We should probably be setting these from device tree entries
when they exist, but for now I think it's saner to just leave them all
as their default values.

Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:37:41 -07:00
Christophe Leroy b6ae3550c8 powerpc/8xx: add missing header in 8xx_mmu.c
arch/powerpc/mm/8xx_mmu.c:174:6: error: no previous prototype for ‘set_context’ [-Werror=missing-prototypes]
 void set_context(unsigned long id, pgd_t *pgd)

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2018-10-22 19:11:58 -05:00
Christophe Leroy e738c5f155 powerpc/8xx: Add DT node for using the SEC engine of the MPC885
The MPC885 has SEC engine version 1.2 with the following details:
- Number of Crypto channels: 1
- Exec Units: DEU, MDEU and AESU
- Available descriptors: 00010, 00100, 00110, 01000, 11000, 11010

It is also supposed to have descriptor 00000, but it doesn't work
properly so we keep it out for the moment.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
2018-10-22 19:11:55 -05:00
Linus Torvalds a36cf68651 SPI NOR changes:
Core changes:
   * Support non-uniform erase size
   * Support controllers with limited TX fifo size
 
  Driver changes:
   * m25p80: Re-issue a WREN command after each write access
   * cadence: Pass a proper dir value to dma_[un]map_single()
   * fsl-qspi: Check fsl_qspi_get_seqid() return val make sure 4B
     addressing opcodes are properly handled
   * intel-spi: Add a new PCI entry for Ice Lake
 
 NAND changes:
  Raw NAND core changes:
  - Two batchs of cleanups of the NAND API, including:
    * Deprecating a lot of interfaces (now replaced by ->exec_op()).
    * Moving code in separate drivers (JEDEC, ONFI), in private files
      (internals), in platform drivers, etc.
    * Functions/structures reordering.
    * Exclusive use of the nand_chip structure instead of the MTD one
      all across the subsystem.
  - Addition of the nand_wait_readrdy/rdy_op() helpers.
 
  Raw NAND controllers drivers changes:
  - Various coccinelle patches.
  - Marvell:
    * Use regmap_update_bits() for syscon access.
    * More documentation.
    * BCH failure path rework.
    * More layouts to be supported.
    * IRQ handler complete() condition fixed.
  - Fsl_ifc:
    * SRAM initialization fixed for newer controller versions.
  - Denali:
    * Fix licenses mismatch and use a SPDX tag.
    * Set SPARE_AREA_SKIP_BYTES register to 8 if unset.
  - Qualcomm:
    * Do not include dma-direct.h.
  - Docg4:
    * Removed.
  - Ams-delta:
    * Use of a GPIO lookup table
    * Internal machinery changes.
 
  Raw NAND chip drivers changes:
  - Toshiba:
    * Add support for Toshiba memory BENAND
    * Pass a single nand_chip object to the status helper.
  - ESMT:
    * New driver to retrieve the ECC requirements from the 5th ID byte.
 
 MTD changes:
  * physmap cleanups/fixe
  * gpio-addr-flash cleanups/fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCgA6FiEEKmCqpbOU668PNA69Ze02AX4ItwAFAlvNmJocHGJvcmlzLmJy
 ZXppbGxvbkBib290bGluLmNvbQAKCRBl7TYBfgi3AKr+D/4pH0gYSGDUstZsqUHW
 gZsPRU4jmOA+OCRbSxE03bOZOYR0UPgdoUYLfAKhZ5qxc7wHd3b47wykP0kUEviu
 lC8QhSs4DUA+EuMVPDVS4FlXRT0e3dMV7jh/IK6nInshD2YkaovyCqa6GvgwFEcM
 7BCizxRhtHV8+fo7GVQWuMLb9ZfbEvz42D0sYOu6hIsF1SnRDvHOvfdDUFEXpJoF
 a2mC9ove65ChEzc2iZ/r260iZ4aoJYb9XJRJKWxmYeITZSmLNmcrUyGqAaNQ7NRc
 AIuPXASbeHGjrIuEfXpKYW07AE5MV1nJFSk3v4u5FjgoohOoobPp7npk+Lz/qwe3
 y8/uW9qOQ/iEOsRnkvMGNu4Yjhw41L1a+J5wVvUvzmwHy1xMCrRHYB/8gXoZzemR
 A7XmCPwjAFVWv1WeKV2cs5MLW9yZq9QNMtGLlNs5OgFR6CccLk67/tHIkYLsJ/l9
 IDXFhd/YKhTeF361u6Iimmgb427TwM71P9N+OMpv4nk46DxurpxkgGle3nslzKNU
 DOFNFlMGiPIx3h4X96AKER6u7cxiOXnLGq+XeHa/y8tIrziy3jg03YoTh0RSwKxV
 J3EXwh1sFLaeebwWEgpE3Ag1LOxpRCqJ2ED71SPzS/DR/938HBVsEoPqUEHNf1in
 jwsUB3cRzNwf+XX68DRCVT5GFA==
 =7w4w
 -----END PGP SIGNATURE-----

Merge tag 'mtd/for-4.20' of git://git.infradead.org/linux-mtd

Pull mtd updates from Boris Brezillon:
 "SPI NOR core changes:
   - Support non-uniform erase size
   - Support controllers with limited TX fifo size

 Driver changes:
   - m25p80: Re-issue a WREN command after each write access
   - cadence: Pass a proper dir value to dma_[un]map_single()
   - fsl-qspi: Check fsl_qspi_get_seqid() return val make sure 4B
     addressing opcodes are properly handled
   - intel-spi: Add a new PCI entry for Ice Lake

 Raw NAND core changes:
   - Two batchs of cleanups of the NAND API, including:
      * Deprecating a lot of interfaces (now replaced by ->exec_op()).
      * Moving code in separate drivers (JEDEC, ONFI), in private files
        (internals), in platform drivers, etc.
      * Functions/structures reordering.
      * Exclusive use of the nand_chip structure instead of the MTD one
        all across the subsystem.
   - Addition of the nand_wait_readrdy/rdy_op() helpers.

 Raw NAND controllers drivers changes:
   - Various coccinelle patches.
   - Marvell:
      * Use regmap_update_bits() for syscon access.
      * More documentation.
      * BCH failure path rework.
      * More layouts to be supported.
      * IRQ handler complete() condition fixed.
   - Fsl_ifc:
      * SRAM initialization fixed for newer controller versions.
   - Denali:
      * Fix licenses mismatch and use a SPDX tag.
      * Set SPARE_AREA_SKIP_BYTES register to 8 if unset.
   - Qualcomm:
      * Do not include dma-direct.h.
   - Docg4:
      * Removed.
   - Ams-delta:
      * Use of a GPIO lookup table
      * Internal machinery changes.

 Raw NAND chip drivers changes:
   - Toshiba:
      * Add support for Toshiba memory BENAND
      * Pass a single nand_chip object to the status helper.
   - ESMT:
      * New driver to retrieve the ECC requirements from the 5th ID
        byte.

  MTD changes:
   - physmap cleanups/fixe
   - gpio-addr-flash cleanups/fixes"

* tag 'mtd/for-4.20' of git://git.infradead.org/linux-mtd: (93 commits)
  jffs2: free jffs2_sb_info through jffs2_kill_sb()
  mtd: spi-nor: fsl-quadspi: fix read error for flash size larger than 16MB
  mtd: spi-nor: intel-spi: Add support for Intel Ice Lake SPI serial flash
  mtd: maps: gpio-addr-flash: Convert to gpiod
  mtd: maps: gpio-addr-flash: Replace array with an integer
  mtd: maps: gpio-addr-flash: Use order instead of size
  mtd: spi-nor: fsl-quadspi: Don't let -EINVAL on the bus
  mtd: devices: m25p80: Make sure WRITE_EN is issued before each write
  mtd: spi-nor: Support controllers with limited TX FIFO size
  mtd: spi-nor: cadence-quadspi: Use proper enum for dma_[un]map_single
  mtd: spi-nor: parse SFDP Sector Map Parameter Table
  mtd: spi-nor: add support to non-uniform SFDP SPI NOR flash memories
  mtd: rawnand: marvell: fix the IRQ handler complete() condition
  mtd: rawnand: denali: set SPARE_AREA_SKIP_BYTES register to 8 if unset
  mtd: rawnand: r852: fix spelling mistake "card_registred" -> "card_registered"
  mtd: rawnand: toshiba: Pass a single nand_chip object to the status helper
  mtd: maps: gpio-addr-flash: Use devm_* functions
  mtd: maps: gpio-addr-flash: Fix ioremapped size
  mtd: maps: gpio-addr-flash: Replace custom printk
  mtd: physmap_of: Release resources on error
  ...
2018-10-23 01:09:22 +01:00
Anup Patel 8b20d2db0a
RISC-V: Show IPI stats
This patch provides arch_show_interrupts() implementation to
show IPI stats via /proc/interrupts.

Now the contents of /proc/interrupts" will look like below:
           CPU0       CPU1       CPU2       CPU3
  8:         17          7          6         14  SiFive PLIC   8  virtio0
 10:         10         10          9         11  SiFive PLIC  10  ttyS0
IPI0:       170        673        251         79  Rescheduling interrupts
IPI1:         1         12         27          1  Function call interrupts

Signed-off-by: Anup Patel <anup@brainfault.org>
[Atish - Fixed checkpatch errors]
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>

Changes since v2:
 - Remove use of IPI_CALL_WAKEUP because it's being removed

Changes since v1:
 - Add stub inline show_ipi_stats() function for !CONFIG_SMP
 - Make ipi_names[] dynamically sized at compile time
 - Minor beautification of ipi_names[] using tabs

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:37 -07:00
Anup Patel 4b26d22fdf
RISC-V: Show CPU ID and Hart ID separately in /proc/cpuinfo
Currently, /proc/cpuinfo show logical CPU ID as Hart ID which
is in-correct. This patch shows CPU ID and Hart ID separately
in /proc/cpuinfo using cpuid_to_hardid_map().

With this patch, contents of /proc/cpuinfo looks as follows:
processor	: 0
hart		: 1
isa		: rv64imafdc
mmu		: sv48

processor	: 1
hart		: 0
isa		: rv64imafdc
mmu		: sv48

processor	: 2
hart		: 2
isa		: rv64imafdc
mmu		: sv48

processor	: 3
hart		: 3
isa		: rv64imafdc
mmu		: sv48

Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:37 -07:00
Atish Patra f99fb607fb
RISC-V: Use Linux logical CPU number instead of hartid
Setup the cpu_logical_map during boot. Moreover, every SBI call
and PLIC context are based on the physical hartid. Use the logical
CPU to hartid mapping to pass correct hartid to respective functions.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:37 -07:00
Atish Patra 6825c7a80f
RISC-V: Add logical CPU indexing for RISC-V
Currently, both Linux CPU id and hart id are same.
This is not recommended as it will lead to discontinuous CPU
indexing in Linux. Moreover, kdump kernel will run from CPU0
which would be absent if we follow existing scheme.

Implement a logical mapping between Linux CPU id and hart
id to decouple these two. Always mark the boot processor as
CPU0 and all other CPUs get the logical CPU id based on their
booting order.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:37 -07:00
Atish Patra a37d56fc40
RISC-V: Use WRITE_ONCE instead of direct access
The secondary harts spin on couple of per cpu variables until both of
these are non-zero so it's not necessary to have any ordering here.
However, WRITE_ONCE should be used to avoid tearing.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:37 -07:00
Palmer Dabbelt 46373cb442
RISC-V: Use mmgrab()
commit f1f1007644 ("mm: add new mmgrab() helper") added a
helper that we missed out on.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:36 -07:00
Palmer Dabbelt 177fae4515
RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu
The old name was a bit odd.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:36 -07:00
Palmer Dabbelt b2f8cfa7ac
RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid
It's a bit confusing exactly what this function does: it actually
returns the hartid of an OF processor node, failing with -1 on invalid
nodes.  I've changed the name to _hartid() in order to make that a bit
more clear, as well as adding a comment.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
[Atish: code comment formatting update]
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:36 -07:00
Palmer Dabbelt 9639a44394
RISC-V: Provide a cleaner raw_smp_processor_id()
I'm not sure how I managed to miss this the first time, but this is much
better.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
[Atish: code comment formatting and other fixes]
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:36 -07:00
Atish Patra 6db170ff4c
RISC-V: Disable preemption before enabling interrupts
Currently, irq is enabled before preemption disabling happens.
If the scheduler fired right here and cpu is scheduled then it
may blow up.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
[Atish: Commit text and code comment formatting update]
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:36 -07:00
Palmer Dabbelt b18d6f0525
RISC-V: Comment on the TLB flush in smp_callin()
This isn't readily apparent from reading the code.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
[Atish: code comment formatting update]
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:36 -07:00
Palmer Dabbelt 19ccf29bb1
RISC-V: Filter ISA and MMU values in cpuinfo
We shouldn't be directly passing device tree values to userspace, both
because there could be mistakes in device trees and because the kernel
doesn't support arbitrary ISAs.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
[Atish: checkpatch fix and code comment formatting update]
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:35 -07:00
Palmer Dabbelt 566d6c428e
RISC-V: Don't set cacheinfo.{physical_line_partition,attributes}
These are just hard coded in the RISC-V port, which doesn't make any
sense.  We should probably be setting these from device tree entries
when they exist, but for now I think it's saner to just leave them all
as their default values.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:35 -07:00
Anup Patel 1ed4237ab6
RISC-V: No need to pass scause as arg to do_IRQ()
The scause is already part of pt_regs so no need to pass
scause as separate arg to do_IRQ().

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:03:35 -07:00
Vincent Chen 827a438156
RISC-V: Avoid corrupting the upper 32-bit of phys_addr_t in ioremap
For 32bit, the upper 32-bit of phys_addr_t will be flushed to zero
after AND with PAGE_MASK because the data type of PAGE_MASK is
unsigned long. To fix this problem, the page alignment is done by
subtracting the page offset instead of AND with PAGE_MASK.

Signed-off-by: Vincent Chen <vincentc@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:56 -07:00
Zong Li 757331db92
RISC-V: Select GENERIC_LIB_UMODDI3 on RV32
On 32-bit, it need to use __umoddi3 by some drivers.

Signed-off-by: Zong Li <zong@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:56 -07:00
Zong Li 51858aaf9b
RISC-V: Use swiotlb on RV64 only
Only RV64 supports swiotlb. On RV32, it don't select the SWIOTLB.

Signed-off-by: Zong Li <zong@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:56 -07:00
Zong Li 7f47c73b35
RISC-V: Build tishift only on 64-bit
Only RV64 supports 128 integer size.

Signed-off-by: Zong Li <zong@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:55 -07:00
Alan Kao 9411ec60c2
Auto-detect whether a FPU exists
We expect that a kernel with CONFIG_FPU=y can still support no-FPU
machines. To do so, the kernel should first examine the existence of a
FPU, then do nothing if a FPU does exist; otherwise, it should
disable/bypass all FPU-related functions.

In this patch, a new global variable, has_fpu, is created and determined
when parsing the hardware capability from device tree during booting.
This variable is used in those FPU-related functions.

Signed-off-by: Alan Kao <alankao@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Vincent Chen <vincentc@andestech.com>
Cc: Zong Li <zong@andestech.com>
Cc: Nick Hu <nickhu@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:23 -07:00
Alan Kao 9671f70614
Allow to disable FPU support
FPU codes have been separated from common part in previous patches.
This patch add the CONFIG_FPU option and some stubs, so that a no-FPU
configuration is allowed.

Signed-off-by: Alan Kao <alankao@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Vincent Chen <vincentc@andestech.com>
Cc: Zong Li <zong@andestech.com>
Cc: Nick Hu <nickhu@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:23 -07:00
Alan Kao e8be530233
Cleanup ISA string setting
This patch cleanup the MARCH string passing to both compiler and
assembler.  Note that the CFLAGS should not contain "fd" before we
have mechnisms like kernel_fpu_begin/end in other architectures.

Signed-off-by: Alan Kao <alankao@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Vincent Chen <vincentc@andestech.com>
Cc: Zong Li <zong@andestech.com>
Cc: Nick Hu <nickhu@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:23 -07:00
Alan Kao 007f5c3589
Refactor FPU code in signal setup/return procedures
FPU-related logic is separated from normal signal handling path in
this patch.  Kernel can easily be configured to exclude those procedures
for no-FPU systems.

Signed-off-by: Alan Kao <alankao@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Vincent Chen <vincentc@andestech.com>
Cc: Zong Li <zong@andestech.com>
Cc: Nick Hu <nickhu@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:23 -07:00
Alan Kao e68ad867f7
Extract FPU context operations from entry.S
We move __fstate_save and __fstate_restore to a new source
file, fpu.S.

Signed-off-by: Alan Kao <alankao@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Vincent Chen <vincentc@andestech.com>
Cc: Zong Li <zong@andestech.com>
Cc: Nick Hu <nickhu@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-10-22 17:02:22 -07:00
David S. Miller ecd4c19f3d sparc: Validate VDSO for undefined symbols.
There should be no undefined symbols in the resulting VDSO image(s).

On sparc, fixed register usage can result in undefined symbols ending
up in the image.  To combat this, we do two things:

1) Define current_thread_info() specially when BUILD_DSO.

2) Ignore "#scratch" register undefined symbols in the output.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 16:09:27 -07:00
David S. Miller 3c2b2d9408 sparc: Really use linker with LDFLAGS.
Rather than funneling through CC.

Also, use --hash-style=both just like other VDSO architectures and
glibc do.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 16:02:05 -07:00
David S. Miller 5615edcca9 sparc: Improve VDSO CFLAGS.
Do not set any special register usage options, use the default which
is exactly what we should use for userspace code.

Make sure we remove the gcc plugin options from the 64-bit build.
The 32-bit cflags got it right already.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 15:56:11 -07:00
David S. Miller 44231b7fee sparc: Set DISABLE_BRANCH_PROFILING in VDSO CFLAGS.
Not in vclock_gettime.c itself.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 15:51:45 -07:00
David S. Miller 3fe5d7e861 sparc: Don't bother masking out TICK_PRIV_BIT in VDSO code.
If the TICK_PRIV_BIT was set, we would not be able to read the tick
register in user space, which is where this code runs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 15:31:38 -07:00
David S. Miller 794b88e047 sparc: Inline VDSO gettime code aggressively.
One interesting thing we need to do is stop using
__builtin_return_address() in get_vvar_data().

Simply read the %pc register instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 15:27:49 -07:00
David S. Miller 2f6c9bf31a sparc: Improve VDSO instruction patching.
The current VDSO patch mechanism has several problems:

1) It assumes how gcc will emit a function, with a register
   window, an initial save instruction and then immediately
   the %tick read when compiling vread_tick().

   There is no such guarantees, code generation could change
   at any time, gcc could put a nop between the save and
   the %tick read, etc.

   So this is extremely fragile and would fail some day.

2) It disallows us to properly inline vread_tick() into the callers
   and thus get the best possible code sequences.

So fix this to patch properly, with location based annotations.

We have to be careful because we cannot do it the way we do
patches elsewhere in the kernel.  Those use a sequence like:

	1:
	insn
	.section	.whatever_patch, "ax"
	.word		1b
	replacement_insn
	.previous

This is a dynamic shared object, so that .word cannot be resolved at
build time, and thus cannot be used to execute the patches when the
kernel initializes the images.

Even trying to use label difference equations doesn't work in the
above kind of scheme:

	1:
	insn
	.section	.whatever_patch, "ax"
	.word		. - 1b
	replacement_insn
	.previous

The assembler complains that it cannot resolve that computation.
The issue is that this is contained in an executable section.

Borrow the sequence used by x86 alternatives, which is:

	1:
	insn
	.pushsection	.whatever_patch, "a"
	.word		. - 1b, . - 1f
	.popsection
	.pushsection	.whatever_patch_replacements, "ax"
	1:
	replacement_insn
	.previous

This works, allows us to inline vread_tick() as much as we like, and
can be used for arbitrary kinds of VDSO patching in the future.

Also, reverse the condition for patching.  Most systems are %stick
based, so if we only patch on %tick systems the patching code will
get little or no testing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 15:22:14 -07:00
Linus Torvalds cff229491a First batch of dma-mapping changes for 4.20:
- mostly more consolidation of the direct mapping code, including
    converting over hexagon, and merging the coherent and non-coherent
    code into a single dma_map_ops instance (me)
  - cleanups for the dma_configure/dma_unconfigure callchains (me)
  - better handling of dma_masks in odd setups (me, Alexander Duyck)
  - better debugging of passing vmalloc address to the DMA API
    (Stephen Boyd)
  - CMA command line parsing fix (He Zhe)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlvNg6YLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMm/Q/9FFVOH73Nc3rT40N2HdaPbzV2hXmI1//hEJcImDP5
 mLGq8XqieGuo8Pmu9+xp1tC2UnfUkhK4FjhQbWM+qKER/RNYES2BD50xVFmt6ICS
 9d8IaRcs+ceggljfdwszkkucJspBsYNxpiKjjao0OsHn6UDatu6elZs/yvb2nXci
 HCJUvs9vYm9MkAtVXEtOQtij3YRaJ/9xYY4h5Dy5vBtHPp+kjUMF0mWAwA2+Ec1V
 8iqKjUY3c8nr8Kf6WE9tzJ0wrMFijc4HJlE3W1ud8YsKdfCkCf8XiIuS6PgTzOeK
 0cn9h8dVrV1ZXJ/D/9JZDivmYvIsoKWAYVQHNzAiq7PI3uOJY1ggCxyZpWtTHZhM
 ATHF0sJGpIenkSWybYpKee8e8RsS7L9dUgu6bYpK5pVkirNYnR9IOGVJNmS63L7Q
 B0uUtqjBKDG2yNGZGY9zqBQFgxiPO0wxFLeKyHbIsC0b7FBti3rXGAimch5WiBuL
 zlDV0zEfMH0BW6gNPrjfFur84duKtGZ/0DBSxQ0E1Mvk8B1LBr78MgZt8OfJEuoe
 dx1FYU70u8PYi+hjmn386YnNNMTjd1GT5XW7AWedM2wCjRYmNy0yMGmm9cACMneN
 5eBv/SYr7X1zKNL7w7H6KQVZilTJcBoj3f/lmjL7i22m9FXYQpcUP61L8wHNM8H2
 iJo=
 =AVSD
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.20' of git://git.infradead.org/users/hch/dma-mapping

Pull dma mapping updates from Christoph Hellwig:
 "First batch of dma-mapping changes for 4.20.

  There will be a second PR as some big changes were only applied just
  before the end of the merge window, and I want to give them a few more
  days in linux-next.

  Summary:

   - mostly more consolidation of the direct mapping code, including
     converting over hexagon, and merging the coherent and non-coherent
     code into a single dma_map_ops instance (me)

   - cleanups for the dma_configure/dma_unconfigure callchains (me)

   - better handling of dma_masks in odd setups (me, Alexander Duyck)

   - better debugging of passing vmalloc address to the DMA API (Stephen
     Boyd)

   - CMA command line parsing fix (He Zhe)"

* tag 'dma-mapping-4.20' of git://git.infradead.org/users/hch/dma-mapping: (27 commits)
  dma-direct: respect DMA_ATTR_NO_WARN
  dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN
  dma-direct: document the zone selection logic
  dma-debug: Check for drivers mapping invalid addresses in dma_map_single()
  dma-direct: fix return value of dma_direct_supported
  dma-mapping: move dma_default_get_required_mask under ifdef
  dma-direct: always allow dma mask <= physiscal memory size
  dma-direct: implement complete bus_dma_mask handling
  dma-direct: refine dma_direct_alloc zone selection
  dma-direct: add an explicit dma_direct_get_required_mask
  dma-mapping: make the get_required_mask method available unconditionally
  unicore32: remove swiotlb support
  Revert "dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops"
  dma-mapping: support non-coherent devices in dma_common_get_sgtable
  dma-mapping: consolidate the dma mmap implementations
  dma-mapping: merge direct and noncoherent ops
  dma-mapping: move the dma_coherent flag to struct device
  MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT
  dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration
  dma-mapping: fix panic caused by passing empty cma command line argument
  ...
2018-10-22 18:16:03 +01:00
Linus Torvalds 6ab9e09238 for-4.20/block-20181021
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlvNQKgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgps+8D/9Iy6YIeoPwN10gYsqIh0P2fS3wKzL3kiww
 3vFsWO78PzgLxUlNmB7teLtNFc/R5mi8becZmAdvs9za5YFZk56o3Ifv1x+e+z00
 VY1/gxhiJD6suLeJ6lECnERGDaiWOZVRMo2TE17vxYGW6GGaa0Ts6PUUXmpla1u5
 WKctgt0Qv9WVNyiIdLdeHqzKJwsSSwNTt8fK7eFhy3x8e0CwJr+GtXckbbW3LFkY
 lug0npsTli3EmEPMovZhd25SjZmTk5GTM+ADZQ7Tnv5KXoDWB9jn6TcCSAi3G+5d
 5WUVwfnDyYJiH8qvlg5tRJ690muIy3xMOmpr7QBQ0YnR/LQ3EW+1CVfqD+qimgLH
 TXzlREXQpBP3YlxSDS5nddz4o5z84GZmC9B/43ujPaZKIQ6eBXYdkmQH7tPtSugm
 C6VGomR5tHotjxIiAsexh/5hAus+wW8bObKGTPTyINT0ub3XNclwCKLh26CgI9ie
 WvbS9g3j/KPvu/7s6weZpgD+cks0YdWe/XdXXxiHwsGI9h3J2aJna5RQt1rKWDm5
 wGCgbc/B8eSwiWx+GXlqdB9/Dy/bGXOnSTDnKpEVl1f5zNjeLwUKXbjvkMefWs4m
 jEIcquuDETORY+ZYEfa5YbmS4Lhskr0kzMVTVkZ++81tAWpSCU9Xh3IHrR8TNpt+
 J0oh0FHBDg==
 =LRTT
 -----END PGP SIGNATURE-----

Merge tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block

Pull block layer updates from Jens Axboe:
 "This is the main pull request for block changes for 4.20. This
  contains:

   - Series enabling runtime PM for blk-mq (Bart).

   - Two pull requests from Christoph for NVMe, with items such as;
      - Better AEN tracking
      - Multipath improvements
      - RDMA fixes
      - Rework of FC for target removal
      - Fixes for issues identified by static checkers
      - Fabric cleanups, as prep for TCP transport
      - Various cleanups and bug fixes

   - Block merging cleanups (Christoph)

   - Conversion of drivers to generic DMA mapping API (Christoph)

   - Series fixing ref count issues with blkcg (Dennis)

   - Series improving BFQ heuristics (Paolo, et al)

   - Series improving heuristics for the Kyber IO scheduler (Omar)

   - Removal of dangerous bio_rewind_iter() API (Ming)

   - Apply single queue IPI redirection logic to blk-mq (Ming)

   - Set of fixes and improvements for bcache (Coly et al)

   - Series closing a hotplug race with sysfs group attributes (Hannes)

   - Set of patches for lightnvm:
      - pblk trace support (Hans)
      - SPDX license header update (Javier)
      - Tons of refactoring patches to cleanly abstract the 1.2 and 2.0
        specs behind a common core interface. (Javier, Matias)
      - Enable pblk to use a common interface to retrieve chunk metadata
        (Matias)
      - Bug fixes (Various)

   - Set of fixes and updates to the blk IO latency target (Josef)

   - blk-mq queue number updates fixes (Jianchao)

   - Convert a bunch of drivers from the old legacy IO interface to
     blk-mq. This will conclude with the removal of the legacy IO
     interface itself in 4.21, with the rest of the drivers (me, Omar)

   - Removal of the DAC960 driver. The SCSI tree will introduce two
     replacement drivers for this (Hannes)"

* tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block: (204 commits)
  block: setup bounce bio_sets properly
  blkcg: reassociate bios when make_request() is called recursively
  blkcg: fix edge case for blk_get_rl() under memory pressure
  nvme-fabrics: move controller options matching to fabrics
  nvme-rdma: always have a valid trsvcid
  mtip32xx: fully switch to the generic DMA API
  rsxx: switch to the generic DMA API
  umem: switch to the generic DMA API
  sx8: switch to the generic DMA API
  sx8: remove dead IF_64BIT_DMA_IS_POSSIBLE code
  skd: switch to the generic DMA API
  ubd: remove use of blk_rq_map_sg
  nvme-pci: remove duplicate check
  drivers/block: Remove DAC960 driver
  nvme-pci: fix hot removal during error handling
  nvmet-fcloop: suppress a compiler warning
  nvme-core: make implicit seed truncation explicit
  nvmet-fc: fix kernel-doc headers
  nvme-fc: rework the request initialization code
  nvme-fc: introduce struct nvme_fcp_op_w_sgl
  ...
2018-10-22 17:46:08 +01:00
Linus Torvalds 5289851171 arm64 updates for 4.20:
- Core mmu_gather changes which allow tracking the levels of page-table
   being cleared together with the arm64 low-level flushing routines
 
 - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
   mitigate Spectre-v4 dynamically without trapping to EL3 firmware
 
 - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
 
 - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4
 
 - Support for Common Not Private (CnP) translations allowing threads of
   the same CPU to share the TLB entries
 
 - Accelerated crc32 routines
 
 - Move swapper_pg_dir to the rodata section
 
 - Trap WFI instruction executed in user space
 
 - ARM erratum 1188874 workaround (arch_timer)
 
 - Miscellaneous fixes and clean-ups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlvKGdEACgkQa9axLQDI
 XvGSQBAAiOH6aQABL4TB7c5KIc7C+Unjm6QCFCoaeGWoHuemnM6cFJ7RQsi0GqnP
 dVEX5V/FKfmeTWO5g24Ah+MbTm3Bt6+81gywAmi1rrHhmCaCIPjT7xDqy/WsLlvt
 7WtgegSGvQ7DIMj2dbfFav6+ra67qAiYZTc46jvuynVl6DrE3BCiyTDbXAWt2nzP
 Xf3un4AHRbg3UEMUZTLqU5q4z0tbM6rEAZru8O0UOTnD2q7uttUqW3Ab7fpuEkkj
 lEVrMWD3h8SJg+Df9CbXmCNOjh4VhwBwDb5LgO8vA/AcyV/YLEF5b2OUAk/28qwo
 0GBwjqRyI4+YQ9LPg41MhGzrlnta0HCdYoeNLgLQZiDcUkuSfGhoA+MNZNOR8B08
 sCWF7F6f8UIQm8KMMBiYYdlVyUYgHLsWE/1+CyeLV0oIoWT5k3c+Xe3pho9KpVb0
 Co04TqMlqalry0sbevHz5c55H7iWIjB1Tpo3SxM105dVJVibXRPXkz+WZ5iPO+xa
 ex2j1kjNdA/AUzrSCZ5lh22zhg0WsfwD++E5meAaJMxieim8FeZDRga43rowJ0BA
 zMbSNB/+NDFZ9EhC40VaUfKk8Tkgiug9J5swv0+v7hy1QLDyydHhbOecTuIueauM
 6taiT2Iuov5yFng1eonYj4htvouVF4WOhPGthFPJMOcrB9mLMhs=
 =3Mc8
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Apart from some new arm64 features and clean-ups, this also contains
  the core mmu_gather changes for tracking the levels of the page table
  being cleared and a minor update to the generic
  compat_sys_sigaltstack() introducing COMPAT_SIGMINSKSZ.

  Summary:

   - Core mmu_gather changes which allow tracking the levels of
     page-table being cleared together with the arm64 low-level flushing
     routines

   - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
     mitigate Spectre-v4 dynamically without trapping to EL3 firmware

   - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack

   - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4

   - Support for Common Not Private (CnP) translations allowing threads
     of the same CPU to share the TLB entries

   - Accelerated crc32 routines

   - Move swapper_pg_dir to the rodata section

   - Trap WFI instruction executed in user space

   - ARM erratum 1188874 workaround (arch_timer)

   - Miscellaneous fixes and clean-ups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
  arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work
  arm64: cpufeature: Trap CTR_EL0 access only where it is necessary
  arm64: cpufeature: Fix handling of CTR_EL0.IDC field
  arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
  Documentation/arm64: HugeTLB page implementation
  arm64: mm: Use __pa_symbol() for set_swapper_pgd()
  arm64: Add silicon-errata.txt entry for ARM erratum 1188873
  Revert "arm64: uaccess: implement unsafe accessors"
  arm64: mm: Drop the unused cpu parameter
  MAINTAINERS: fix bad sdei paths
  arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED defines
  arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
  arm64: xen: Use existing helper to check interrupt status
  arm64: Use daifflag_restore after bp_hardening
  arm64: daifflags: Use irqflags functions for daifflags
  arm64: arch_timer: avoid unused function warning
  arm64: Trap WFI executed in userspace
  arm64: docs: Document SSBS HWCAP
  arm64: docs: Fix typos in ELF hwcaps
  arm64/kprobes: remove an extra semicolon in arch_prepare_kprobe
  ...
2018-10-22 17:30:06 +01:00
Vasily Gorbik cf3dbe5dac s390/kasan: support preemptible kernel build
When the kernel is built with:
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
"stfle" function used by kasan initialization code makes additional
call to preempt_count_add/preempt_count_sub. To avoid removing kasan
instrumentation from sched code where those functions leave split stfle
function and provide __stfle variant without preemption handling to be
used by Kasan.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-22 08:37:45 +02:00
Christophe Leroy 977e4be5eb x86/stackprotector: Remove the call to boot_init_stack_canary() from cpu_startup_entry()
The following commit:

  d7880812b3 ("idle: Add the stack canary init to cpu_startup_entry()")

... added an x86 specific boot_init_stack_canary() call to the generic
cpu_startup_entry() as a temporary hack, with the intention to remove
the #ifdef CONFIG_X86 later.

More than 5 years later let's finally realize that plan! :-)

While implementing stack protector support for PowerPC, we found
that calling boot_init_stack_canary() is also needed for PowerPC
which uses per task (TLS) stack canary like the X86.

However, calling boot_init_stack_canary() would break architectures
using a global stack canary (ARM, SH, MIPS and XTENSA).

Instead of modifying the #ifdef CONFIG_X86 to an even messier:

   #if defined(CONFIG_X86) || defined(CONFIG_PPC)

PowerPC implemented the call to boot_init_stack_canary() in the function
calling cpu_startup_entry().

Let's try the same cleanup on the x86 side as well.

On x86 we have two functions calling cpu_startup_entry():

 - start_secondary()
 - cpu_bringup_and_idle()

start_secondary() already calls boot_init_stack_canary(), so
it's good, and this patch adds the call to boot_init_stack_canary()
in cpu_bringup_and_idle().

I.e. now x86 catches up to the rest of the world and the ugly init
sequence in init/main.c can be removed from cpu_startup_entry().

As a final benefit we can also remove the <linux/stackprotector.h>
dependency from <linux/sched.h>.

[ mingo: Improved the changelog a bit, added language explaining x86 borkage and sched.h change. ]
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20181020072649.5B59310483E@pc16082vm.idsi0.si.c-s.fr
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-22 04:07:24 +02:00
Masami Hiramatsu 2e62024c26 kprobes/x86: Use preempt_enable() in optimized_callback()
The following commit:

  a19b2e3d78 ("kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes”)

removed local_irq_save/restore() from optimized_callback(), the handler
might be interrupted by the rescheduling interrupt and might be
rescheduled - so we must not use the preempt_enable_no_resched() macro.

Use preempt_enable() instead, to not lose preemption events.

[ mingo: Improved the changelog. ]

Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dwmw@amazon.co.uk
Fixes: a19b2e3d78 ("kprobes/x86: Remove IRQ disabling from ftrace-based/optimized kprobes”)
Link: http://lkml.kernel.org/r/154002887331.7627.10194920925792947001.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-22 03:31:01 +02:00
David S. Miller 21ea1d36f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David Ahern's dump indexing bug fix in 'net' overlapped the
change of the function signature of inet6_fill_ifaddr() in
'net-next'.  Trivially resolved.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-21 11:54:28 -07:00
Mark Brown 4fd1f509e8
Merge branch 'regulator-4.20' into regulator-next 2018-10-21 17:00:02 +01:00
Paolo Bonzini 574c0cfbc7 Second PPC KVM update for 4.20.
Two commits; one is an optimization for PCI pass-through, and the
 other disables nested HV-KVM on early POWER9 chips that need a
 particular hardware bug workaround.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJbywJ7AAoJEJ2a6ncsY3GfZ4QH/j7rKij/OV73LATQyS0zPe66
 OPl8F93n3IoPqHouTW8c9isag5OyF14ne7IlWj54zP3r67OU2K13/Fi6ITHmokQV
 vJ2xIOqClQtV22tpjBoJK+b0r6lwHm8JPtbmnnsHfwCtX28ZIzhZn7Dt2/KD/+c1
 GemX8D1dcewHCjwWZqcFLhHAjB4pbGHOKGAlQPK9H04LFsgypQNR+vy/n++yB3tP
 HsraRrmqYS+lO+7DVzbNHg13/pml6+bgDkQ6Vs7j2DF8HzkpgGUpCOUxmquG8ODU
 Pw2O4OxYMy3Uq+pwHZnoJInfSstu63SGHgnLBqp001PKPiyMvAMugdLtxs+GjtY=
 =vQjp
 -----END PGP SIGNATURE-----

Merge tag 'kvm-ppc-next-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD

Second PPC KVM update for 4.20.

Two commits; one is an optimization for PCI pass-through, and the
other disables nested HV-KVM on early POWER9 chips that need a
particular hardware bug workaround.
2018-10-21 11:47:01 +02:00
Dave Hansen 1620414251 x86/mm: Kill stray kernel fault handling comment
I originally had matching user and kernel comments, but the kernel
one got improved.  Some errant conflict resolution kicked the commment
somewhere wrong.  Kill it.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: aa37c51b94 ("x86/mm: Break out user address space handling")
Link: http://lkml.kernel.org/r/20181019140842.12F929FA@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-21 10:58:10 +02:00
Christophe Leroy 0f99153def powerpc/msi: Fix compile error on mpc83xx
mpic_get_primary_version() is not defined when not using MPIC.
The compile error log like:

arch/powerpc/sysdev/built-in.o: In function `fsl_of_msi_probe':
fsl_msi.c:(.text+0x150c): undefined reference to `fsl_mpic_primary_get_version'

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Reported-by: Radu Rendec <radu.rendec@gmail.com>
Fixes: 807d38b73b ("powerpc/mpic: Add get_version API both for internal and external use")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-21 19:32:07 +11:00
Michael Ellerman b6aeddea74 powerpc: Fix stack protector crashes on CPU hotplug
Recently in commit 7241d26e81 ("powerpc/64: properly initialise
the stackprotector canary on SMP.") we fixed a crash with stack
protector on SMP by initialising the stack canary in
cpu_idle_thread_init().

But this can also causes crashes, when a CPU comes back online after
being offline:

  Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: pnv_smp_cpu_kill_self+0x2a0/0x2b0
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.0-rc3-gcc-7.3.1-00168-g4ffe713b7587 #94
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    panic+0x144/0x328
    __stack_chk_fail+0x2c/0x30
    pnv_smp_cpu_kill_self+0x2a0/0x2b0
    cpu_die+0x48/0x70
    arch_cpu_idle_dead+0x20/0x40
    do_idle+0x274/0x390
    cpu_startup_entry+0x38/0x50
    start_secondary+0x5e4/0x600
    start_secondary_prolog+0x10/0x14

Looking at the stack we see that the canary value in the stack frame
doesn't match the canary in the task/paca. That is because we have
reinitialised the task/paca value, but then the CPU coming online has
returned into a function using the old canary value. That causes the
comparison to fail.

Instead we can call boot_init_stack_canary() from start_secondary()
which never returns. This is essentially what the generic code does in
cpu_startup_entry() under #ifdef X86, we should make that non-x86
specific in a future patch.

Fixes: 7241d26e81 ("powerpc/64: properly initialise the stackprotector canary on SMP.")
Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
2018-10-21 19:32:00 +11:00
Camelia Groza 0400d65501 powerpc/dts/fsl: t2080rdb: reorder the Cortina PHY XFI lanes
According to the T2080RDB schematics, for the CS4315 PHY, the XFI 1 lane is
connected to SFP 2 and the XFI 2 lane is connected to SFP 1. Change the
device tree to reflect the correct PHY order and port association.

Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2018-10-20 18:23:56 -05:00
Helge Deller e543b3a620 parisc: Retrieve and display the PDC PAT capabilities
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-20 21:10:37 +02:00
John David Anglin 4c5fe5db1a parisc: Optimze cache flush algorithms
The attached patch implements three optimizations:

1) Loops in flush_user_dcache_range_asm, flush_kernel_dcache_range_asm,
purge_kernel_dcache_range_asm, flush_user_icache_range_asm, and
flush_kernel_icache_range_asm are unrolled to reduce branch overhead.

2) The static branch prediction for cmpb instructions in pacache.S have
been reviewed and the operand order adjusted where necessary.

3) For flush routines in cache.c, we purge rather flush when we have no
context.  The pdc instruction at level 0 is not required to write back
dirty lines to memory. This provides a performance improvement over the
fdc instruction if the feature is implemented.

Version 2 adds alternative patching.

The patch provides an average improvement of about 2%.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-20 21:10:26 +02:00
John David Anglin 5a23237f14 parisc: Remove pte_inserted define
The attached change removes the pte_inserted from pgtable.h.  As a
result, we always flush the TLB entry when the associated page table
entry is changed.

This change doesn't impact performance signifcantly and it may catch
some cases where the TLB needs flushing but wasn't.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-20 21:09:30 +02:00
Bjorn Helgaas 525fde0750 Merge branch 'remotes/lorenzo/pci/dwc'
- Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach)

  - Add initial power management for i.MX7 (Leonard Crestez)

  - Add PME_Turn_Off support for i.MX7 (Leonard Crestez)

  - Fix qcom runtime power management error handling (Bjorn Andersson)

  - Update TI dra7xx unaligned access errata workaround for host mode as
    well as endpoint mode (Vignesh R)

  - Fix kirin section mismatch warning (Nathan Chancellor)

* remotes/lorenzo/pci/dwc:
  PCI: imx: Add PME_Turn_Off support
  ARM: dts: imx7d: Add turnoff reset
  dt-bindings: imx6q-pcie: Add turnoff reset for imx7d
  reset: imx7: Add PCIE_CTRL_APPS_TURNOFF
  PCI: kirin: Fix section mismatch warning
  PCI: dwc: pci-dra7xx: Enable errata i870 for both EP and RC mode
  dt-bindings: PCI: dra7xx: Add bindings for unaligned access in host mode
  PCI: qcom: Fix error handling in runtime PM support
  PCI: imx: Initial imx7d pm support
  PCI: imx6: Support MPLL reconfiguration for 100MHz and 200MHz refclock
2018-10-20 11:45:49 -05:00
Bjorn Helgaas 6aa0459e75 Merge branch 'pci/host-vmd'
- Fix VMD AERSID quirk Device ID matching (Jon Derrick)

* pci/host-vmd:
  x86/PCI: Apply VMD's AERSID fixup generically
2018-10-20 11:45:44 -05:00
Bjorn Helgaas 20634dc361 Merge branch 'pci/hotplug'
- Differentiate between pciehp surprise and safe removal (Lukas Wunner)

  - Remove unnecessary pciehp includes (Lukas Wunner)

  - Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)

  - Tolerate PCIe Slot Presence Detect being hardwired to zero to
    workaround broken hardware, e.g., the Wilocity switch/wireless device
    (Lukas Wunner)

  - Unify pciehp controller & slot structs (Lukas Wunner)

  - Constify hotplug_slot_ops (Lukas Wunner)

  - Drop hotplug_slot_info (Lukas Wunner)

  - Embed hotplug_slot struct into users instead of allocating it
    separately (Lukas Wunner)

  - Initialize PCIe port service drivers directly instead of relying on
    initcall ordering (Keith Busch)

  - Restore PCI config state after a slot reset (Keith Busch)

  - Save/restore DPC config state along with other PCI config state (Keith
    Busch)

  - Reference count devices during AER handling to avoid race issue with
    concurrent hot removal (Keith Busch)

  - If an Upstream Port reports ERR_FATAL, don't try to read the Port's
    config space because it is probably unreachable (Keith Busch)

  - During error handling, use slot-specific reset instead of secondary
    bus reset to avoid link up/down issues on hotplug ports (Keith Busch)

  - Restore previous AER/DPC handling that does not remove and re-enumerate
    devices on ERR_FATAL (Keith Busch)

  - Notify all drivers that may be affected by error recovery resets (Keith
    Busch)

  - Always generate error recovery uevents, even if a driver doesn't have
    error callbacks (Keith Busch)

  - Make PCIe link active reporting detection generic (Keith Busch)

  - Support D3cold in PCIe hierarchies during system sleep and runtime,
    including hotplug and Thunderbolt ports (Mika Westerberg)

  - Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
    are empty or occupied (Jon Derrick)

  - Remove duplicated include from pci/pcie/err.c and unused variable from
    cpqphp (YueHaibing)

  - Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
    Pawandeep)

  - Uninline PCI bus accessors for better ftracing (Keith Busch)

  - Remove unused AER Root Port .error_resume method (Keith Busch)

  - Use kfifo in AER instead of a local version (Keith Busch)

  - Use threaded IRQ in AER bottom half (Keith Busch)

  - Use managed resources in AER core (Keith Busch)

  - Reuse pcie_port_find_device() for AER injection (Keith Busch)

  - Abstract AER interrupt handling to disconnect error injection (Keith
    Busch)

  - Refactor AER injection callbacks to simplify future improvments (Keith
    Busch)

* pci/hotplug:
  PCI/AER: Refactor error injection fallbacks
  PCI/AER: Abstract AER interrupt handling
  PCI/AER: Reuse existing pcie_port_find_device() interface
  PCI/AER: Use managed resource allocations
  PCI/AER: Use threaded IRQ for bottom half
  PCI/AER: Use kfifo_in_spinlocked() to insert locked elements
  PCI/AER: Use kfifo for tracking events instead of reimplementing it
  PCI/AER: Remove error source from AER struct aer_rpc
  PCI/AER: Remove unused aer_error_resume()
  PCI: Uninline PCI bus accessors for better ftracing
  PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls
  PCI: pnv_php: Use kmemdup()
  PCI: cpqphp: Remove set but not used variable 'physical_slot'
  PCI/ERR: Remove duplicated include from err.c
  PCI: Equalize hotplug memory and io for occupied and empty slots
  PCI / ACPI: Whitelist D3 for more PCIe hotplug ports
  ACPI / property: Allow multiple property compatible _DSD entries
  PCI/PME: Implement runtime PM callbacks
  PCI: pciehp: Implement runtime PM callbacks
  PCI/portdrv: Add runtime PM hooks for port service drivers
  PCI/portdrv: Resume upon exit from system suspend if left runtime suspended
  PCI: pciehp: Do not handle events if interrupts are masked
  PCI: pciehp: Disable hotplug interrupt during suspend
  PCI / ACPI: Enable wake automatically for power managed bridges
  PCI: Do not skip power-managed bridges in pci_enable_wake()
  PCI: Make link active reporting detection generic
  PCI: Unify device inaccessible
  PCI/ERR: Always report current recovery status for udev
  PCI/ERR: Simplify broadcast callouts
  PCI/ERR: Run error recovery callbacks for all affected devices
  PCI/ERR: Handle fatal error recovery
  PCI/ERR: Use slot reset if available
  PCI/AER: Don't read upstream ports below fatal errors
  PCI/AER: Take reference on error devices
  PCI/DPC: Save and restore config state
  PCI: portdrv: Restore PCI config state on slot reset
  PCI: portdrv: Initialize service drivers directly
  PCI: hotplug: Document TODOs
  PCI: hotplug: Embed hotplug_slot
  PCI: hotplug: Drop hotplug_slot_info
  PCI: hotplug: Constify hotplug_slot_ops
  PCI: pciehp: Reshuffle controller struct for clarity
  PCI: pciehp: Rename controller struct members for clarity
  PCI: pciehp: Unify controller and slot structs
  PCI: pciehp: Tolerate Presence Detect hardwired to zero
  PCI: pciehp: Drop hotplug_slot_ops wrappers
  PCI: pciehp: Drop unnecessary includes
  PCI: pciehp: Differentiate between surprise and safe removal
  PCI: Simplify disconnected marking
2018-10-20 11:45:29 -05:00
Greg Kroah-Hartman b0d04fb56b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Ingo writes:
  "x86 fixes:

   It's 4 misc fixes, 3 build warning fixes and 3 comment fixes.

   In hindsight I'd have left out the 3 comment fixes to make the pull
   request look less scary at such a late point in the cycle. :-/"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels
  x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU
  x86/fpu: Remove second definition of fpu in __fpu__restore_sig()
  x86/entry/64: Further improve paranoid_entry comments
  x86/entry/32: Clear the CS high bits
  x86/boot: Add -Wno-pointer-sign to KBUILD_CFLAGS
  x86/time: Correct the attribute on jiffies' definition
  x86/entry: Add some paranoid entry/exit CR3 handling comments
  x86/percpu: Fix this_cpu_read()
  x86/tsc: Force inlining of cyc2ns bits
2018-10-20 15:04:23 +02:00
Alexey Kardashevskiy 6e301a8e56 KVM: PPC: Optimize clearing TCEs for sparse tables
The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE
table and a table with userspace addresses. These tables are radix trees,
we allocate indirect levels when they are written to. Since
the memory allocation is problematic in real mode, we have 2 accessors
to the entries:
- for virtual mode: it allocates the memory and it is always expected
to return non-NULL;
- fr real mode: it does not allocate and can return NULL.

Also, DMA windows can span to up to 55 bits of the address space and since
we never have this much RAM, such windows are sparse. However currently
the SPAPR TCE IOMMU driver walks through all TCEs to unpin DMA memory.

Since we maintain a userspace addresses table for VFIO which is a mirror
of the hardware table, we can use it to know which parts of the DMA
window have not been mapped and skip these so does this patch.

The bare metal systems do not have this problem as they use a bypass mode
of a PHB which maps RAM directly.

This helps a lot with sparse DMA windows, reducing the shutdown time from
about 3 minutes per 1 billion TCEs to a few seconds for 32GB sparse guest.
Just skipping the last level seems to be good enough.

As non-allocating accessor is used now in virtual mode as well, rename it
from IOMMU_TABLE_USERSPACE_ENTRY_RM (real mode) to _RO (read only).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-10-20 20:47:02 +11:00
Christophe Leroy daf00ae71d powerpc/traps: restore recoverability of machine_check interrupts
commit b96672dd84 ("powerpc: Machine check interrupt is a non-
maskable interrupt") added a call to nmi_enter() at the beginning of
machine check restart exception handler. Due to that, in_interrupt()
always returns true regardless of the state before entering the
exception, and die() panics even when the system was not already in
interrupt.

This patch calls nmi_exit() before calling die() in order to restore
the interrupt state we had before calling nmi_enter()

Fixes: b96672dd84 ("powerpc: Machine check interrupt is a non-maskable interrupt")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Nicholas Piggin b851ba02a6 powerpc/64/module: REL32 relocation range check
The recent module relocation overflow crash demonstrated that we
have no range checking on REL32 relative relocations. This patch
implements a basic check, the same kernel that previously oopsed
and rebooted now continues with some of these errors when loading
the module:

  module_64: x_tables: REL32 527703503449812 out of range!

Possibly other relocations (ADDR32, REL16, TOC16, etc.) should also have
overflow checks.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Nicholas Piggin dd76ff5af3 powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman 0d923962ab powerpc/mm: Fix page table dump to work on Radix
When we're running on Book3S with the Radix MMU enabled the page table
dump currently prints the wrong addresses because it uses the wrong
start address.

Fix it to use PAGE_OFFSET rather than KERN_VIRT_START.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman afb6d0647f powerpc/mm/radix: Display if mappings are exec or not
At boot we print the ranges we've mapped for the linear mapping and
what page size we've used. Also track whether the range is mapped
executable or not and display that as well.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman 232aa40763 powerpc/mm/radix: Simplify split mapping logic
If we look closely at the logic in create_physical_mapping(), when
we're doing STRICT_KERNEL_RWX, we do the following steps:
  - determine the gap from where we are to the end of the range
  - choose an appropriate mapping_size based on the gap
  - check if that mapping_size would overlap the __init_begin
    boundary, and if not choose an appropriate mapping_size

We can simplify the logic by taking the __init_begin boundary into
account when we calculate the initial gap.

So add a next_boundary() function which tells us what the next
boundary is, either the __init_begin boundary or end. In future we can
add more boundaries.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman 57306c663d powerpc/mm/radix: Remove the retry in the split mapping logic
When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the
linear mapping at the text/data boundary so we can map the kernel
text read only.

The current logic uses a goto inside the for loop, which works, but is
hard to reason about.

When we hit the goto retry case we set max_mapping_size to PMD_SIZE
and go back to the start.

Setting max_mapping_size means we skip the PUD case and go to the PMD
case.

We know we will pass the alignment and gap checks because the only
reason we are there is we hit the goto retry, and that is guarded by
mapping_size == PUD_SIZE, which means addr is PUD aligned and gap is
greater or equal to PUD_SIZE.

So the only part of the check that can fail is the mmu_psize_defs
check for the 2M page size.

If we just duplicate that check we can avoid the goto, and we get the
same result.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman 81d1b54dec powerpc/mm/radix: Fix small page at boundary when splitting
When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the
linear mapping at the text/data boundary so we can map the kernel
text read only.

Currently we always use a small page at the text/data boundary, even
when that's not necessary:

  Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages
  Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages
  Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages

This is because the check that the mapping crosses the __init_begin
boundary is too strict, it also returns true when we map exactly up to
the boundary.

So fix it to check that the mapping would actually map past
__init_begin, and with that we see:

  Mapped 0x0000000000000000-0x0000000040000000 with 2.00 MiB pages
  Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman 3b5657ed5b powerpc/mm/radix: Fix overuse of small pages in splitting logic
When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the
linear mapping at the text/data boundary so we can map the kernel text
read only.

But the current logic uses small pages for the entire text section,
regardless of whether a larger page size would fit. eg. with the
boundary at 16M we could use 2M pages, but instead we use 64K pages up
to the 16M boundary:

  Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages
  Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages
  Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

This is because the test is checking if addr is < __init_begin
and addr + mapping_size is >= _stext. But that is true for all pages
between _stext and __init_begin.

Instead what we want to check is if we are crossing the text/data
boundary, which is at __init_begin. With that fixed we see:

  Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages
  Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages
  Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages
  Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

ie. we're correctly using 2MB pages below __init_begin, but we still
drop down to 64K pages unnecessarily at the boundary.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Michael Ellerman 5c6499b704 powerpc/mm/radix: Fix off-by-one in split mapping logic
When we have CONFIG_STRICT_KERNEL_RWX enabled, we try to split the
kernel linear (1:1) mapping so that the kernel text is in a separate
page to kernel data, so we can mark the former read-only.

We could achieve that just by always using 64K pages for the linear
mapping, but we try to be smarter. Instead we use huge pages when
possible, and only switch to smaller pages when necessary.

However we have an off-by-one bug in that logic, which causes us to
calculate the wrong boundary between text and data.

For example with the end of the kernel text at 16M we see:

  radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages
  radix-mmu: Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

ie. we mapped from 0 to 18M with 64K pages, even though the boundary
between text and data is at 16M.

With the fix we see we're correctly hitting the 16M boundary:

  radix-mmu: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages
  radix-mmu: Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Naveen N. Rao 67361cf807 powerpc/ftrace: Handle large kernel configs
Currently, we expect to be able to reach ftrace_caller() from all
ftrace-enabled functions through a single relative branch. With large
kernel configs, we see functions outside of 32MB of ftrace_caller()
causing ftrace_init() to bail.

In such configurations, gcc/ld emits two types of trampolines for mcount():
1. A long_branch, which has a single branch to mcount() for functions that
   are one hop away from mcount():
	c0000000019e8544 <00031b56.long_branch._mcount>:
	c0000000019e8544:	4a 69 3f ac 	b       c00000000007c4f0 <._mcount>

2. A plt_branch, for functions that are farther away from mcount():
	c0000000051f33f8 <0008ba04.plt_branch._mcount>:
	c0000000051f33f8:	3d 82 ff a4 	addis   r12,r2,-92
	c0000000051f33fc:	e9 8c 04 20 	ld      r12,1056(r12)
	c0000000051f3400:	7d 89 03 a6 	mtctr   r12
	c0000000051f3404:	4e 80 04 20 	bctr

We can reuse those trampolines for ftrace if we can have those
trampolines go to ftrace_caller() instead. However, with ABIv2, we
cannot depend on r2 being valid. As such, we use only the long_branch
trampolines by patching those to instead branch to ftrace_caller or
ftrace_regs_caller.

In addition, we add additional trampolines around .text and .init.text
to catch locations that are covered by the plt branches. This allows
ftrace to work with most large kernel configurations.

For now, we always patch the trampolines to go to ftrace_regs_caller,
which is slightly inefficient. This can be optimized further at a later
point.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Aneesh Kumar K.V dd0e144a63 powerpc/mm: Fix WARN_ON with THP NUMA migration
WARNING: CPU: 12 PID: 4322 at /arch/powerpc/mm/pgtable-book3s64.c:76 set_pmd_at+0x4c/0x2b0
 Modules linked in:
 CPU: 12 PID: 4322 Comm: qemu-system-ppc Tainted: G        W         4.19.0-rc3-00758-g8f0c636b0542 #36
 NIP:  c0000000000872fc LR: c000000000484eec CTR: 0000000000000000
 REGS: c000003fba876fe0 TRAP: 0700   Tainted: G        W          (4.19.0-rc3-00758-g8f0c636b0542)
 MSR:  900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 24282884  XER: 00000000
 CFAR: c000000000484ee8 IRQMASK: 0
 GPR00: c000000000484eec c000003fba877268 c000000001f0ec00 c000003fbd229f80
 GPR04: 00007c8fe8e00000 c000003f864c5a38 860300853e0000c0 0000000000000080
 GPR08: 0000000080000000 0000000000000001 0401000000000080 0000000000000001
 GPR12: 0000000000002000 c000003fffff5400 c000003fce292000 00007c9024570000
 GPR16: 0000000000000000 0000000000ffffff 0000000000000001 c000000001885950
 GPR20: 0000000000000000 001ffffc0004807c 0000000000000008 c000000001f49d05
 GPR24: 00007c8fe8e00000 c0000000020f2468 ffffffffffffffff c000003fcd33b090
 GPR28: 00007c8fe8e00000 c000003fbd229f80 c000003f864c5a38 860300853e0000c0
 NIP [c0000000000872fc] set_pmd_at+0x4c/0x2b0
 LR [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
 Call Trace:
 [c000003fba877268] [c00000000045931c] mpol_misplaced+0x1bc/0x230 (unreliable)
 [c000003fba8772c8] [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
 [c000003fba877398] [c00000000040d344] __handle_mm_fault+0x5e4/0x2300
 [c000003fba8774d8] [c00000000040f400] handle_mm_fault+0x3a0/0x420
 [c000003fba877528] [c0000000003ff6f4] __get_user_pages+0x2e4/0x560
 [c000003fba877628] [c000000000400314] get_user_pages_unlocked+0x104/0x2a0
 [c000003fba8776c8] [c000000000118f44] __gfn_to_pfn_memslot+0x284/0x6a0
 [c000003fba877748] [c0000000001463a0] kvmppc_book3s_radix_page_fault+0x360/0x12d0
 [c000003fba877838] [c000000000142228] kvmppc_book3s_hv_page_fault+0x48/0x1300
 [c000003fba877988] [c00000000013dc08] kvmppc_vcpu_run_hv+0x1808/0x1b50
 [c000003fba877af8] [c000000000126b44] kvmppc_vcpu_run+0x34/0x50
 [c000003fba877b18] [c000000000123268] kvm_arch_vcpu_ioctl_run+0x288/0x2d0
 [c000003fba877b98] [c00000000011253c] kvm_vcpu_ioctl+0x1fc/0x8c0
 [c000003fba877d08] [c0000000004e9b24] do_vfs_ioctl+0xa44/0xae0
 [c000003fba877db8] [c0000000004e9c44] ksys_ioctl+0x84/0xf0
 [c000003fba877e08] [c0000000004e9cd8] sys_ioctl+0x28/0x80

We removed the pte_protnone check earlier with the understanding that we
mark the pte invalid before the set_pte/set_pmd usage. But the huge pmd
autonuma still use the set_pmd_at directly. This is ok because a protnone pte
won't have translation cache in TLB.

Fixes: da7ad366b4 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy 51eeef9e13 powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selected
If CONFIG_PPC_SPLPAR is not selected, steal_time will always
be NUL, so accounting it is pointless

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy abcff86df2 powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64
scaled cputime is only meaningfull when the processor has
SPURR and/or PURR, which means only on PPC64.

Removing it on PPC32 significantly reduces the size of
vtime_account_system() and vtime_account_idle() on an 8xx:

Before:
00000000 l     F .text	000000a8 vtime_delta
00000280 g     F .text	0000010c vtime_account_system
0000038c g     F .text	00000048 vtime_account_idle

After:
(vtime_delta gets inlined inside the two functions)
000001d8 g     F .text	000000a0 vtime_account_system
00000278 g     F .text	00000038 vtime_account_idle

In terms of performance, we also get approximatly 7% improvement on
task switch. The following small benchmark app is run with perf stat:

void *thread(void *arg)
{
	int i;

	for (i = 0; i < atoi((char*)arg); i++)
		pthread_yield();
}

int main(int argc, char **argv)
{
	pthread_t th1, th2;

	pthread_create(&th1, NULL, thread, argv[1]);
	pthread_create(&th2, NULL, thread, argv[1]);
	pthread_join(th1, NULL);
	pthread_join(th2, NULL);

	return 0;
}

Before the patch:

 Performance counter stats for 'chrt -f 98 ./sched 100000' (50 runs):

       8228.476465      task-clock (msec)         #    0.954 CPUs utilized            ( +-  0.23% )
            200004      context-switches          #    0.024 M/sec                    ( +-  0.00% )

After the patch:

 Performance counter stats for 'chrt -f 98 ./sched 100000' (50 runs):

       7649.070444      task-clock (msec)         #    0.955 CPUs utilized            ( +-  0.27% )
            200004      context-switches          #    0.026 M/sec                    ( +-  0.00% )

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy b38a181c11 powerpc/time: isolate scaled cputime accounting in dedicated functions.
scaled cputime is only meaningfull when the processor has
SPURR and/or PURR, which means only on PPC64.

In preparation of the following patch that will remove
CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC32, this patch moves
all scaled cputing accounting logic into dedicated functions.

This patch doesn't change any functionality. It's only code
reorganisation.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy fb978ca207 powerpc/kgdb: add kgdb_arch_set/remove_breakpoint()
Generic implementation fails to remove breakpoints after init
when CONFIG_STRICT_KERNEL_RWX is selected:

[   13.251285] KGDB: BP remove failed: c001c338
[   13.259587] kgdbts: ERROR PUT: end of test buffer on 'do_fork_test' line 8 expected OK got $E14#aa
[   13.268969] KGDB: re-enter exception: ALL breakpoints killed
[   13.275099] CPU: 0 PID: 1 Comm: init Not tainted 4.18.0-g82bbb913ffd8 #860
[   13.282836] Call Trace:
[   13.285313] [c60e1ba0] [c0080ef0] kgdb_handle_exception+0x6f4/0x720 (unreliable)
[   13.292618] [c60e1c30] [c000e97c] kgdb_handle_breakpoint+0x3c/0x98
[   13.298709] [c60e1c40] [c000af54] program_check_exception+0x104/0x700
[   13.305083] [c60e1c60] [c000e45c] ret_from_except_full+0x0/0x4
[   13.310845] [c60e1d20] [c02a22ac] run_simple_test+0x2b4/0x2d4
[   13.316532] [c60e1d30] [c0081698] put_packet+0xb8/0x158
[   13.321694] [c60e1d60] [c00820b4] gdb_serial_stub+0x230/0xc4c
[   13.327374] [c60e1dc0] [c0080af8] kgdb_handle_exception+0x2fc/0x720
[   13.333573] [c60e1e50] [c000e928] kgdb_singlestep+0xb4/0xcc
[   13.339068] [c60e1e70] [c000ae1c] single_step_exception+0x90/0xac
[   13.345100] [c60e1e80] [c000e45c] ret_from_except_full+0x0/0x4
[   13.350865] [c60e1f40] [c000e11c] ret_from_syscall+0x0/0x38
[   13.356346] Kernel panic - not syncing: Recursive entry to debugger

This patch creates powerpc specific version of
kgdb_arch_set_breakpoint() and kgdb_arch_remove_breakpoint()
using patch_instruction()

Fixes: 1e0fc9d1eb ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy 6beb3381b1 powerpc/sysdev/ipic: check primary_ipic NULL pointer before using it
ipic_get_mcp_status() is used by targets implementing NMI
watchdog in target specific machine check handler in order
to known whether a machine check results from a watchdog
NMI reset.

In case of very early machine check, primary_ipic pointer
might not have been set yet, so ipic_get_mcp_status() needs
to check it for nullity before using it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy 37e9c674e7 powerpc/mm: fix always true/false warning in slice.c
This patch fixes the following warnings (obtained with make W=1).

arch/powerpc/mm/slice.c: In function 'slice_range_to_mask':
arch/powerpc/mm/slice.c:73:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
  if (start < SLICE_LOW_TOP) {
            ^
arch/powerpc/mm/slice.c:81:20: error: comparison is always false due to limited range of data type [-Werror=type-limits]
  if ((start + len) > SLICE_LOW_TOP) {
                    ^
arch/powerpc/mm/slice.c: In function 'slice_mask_for_free':
arch/powerpc/mm/slice.c:136:17: error: comparison is always true due to limited range of data type [-Werror=type-limits]
  if (high_limit <= SLICE_LOW_TOP)
                 ^
arch/powerpc/mm/slice.c: In function 'slice_check_range_fits':
arch/powerpc/mm/slice.c:185:12: error: comparison is always true due to limited range of data type [-Werror=type-limits]
  if (start < SLICE_LOW_TOP) {
            ^
arch/powerpc/mm/slice.c:195:39: error: comparison is always false due to limited range of data type [-Werror=type-limits]
  if (SLICE_NUM_HIGH && ((start + len) > SLICE_LOW_TOP)) {
                                       ^
arch/powerpc/mm/slice.c: In function 'slice_scan_available':
arch/powerpc/mm/slice.c:306:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
  if (addr < SLICE_LOW_TOP) {
           ^
arch/powerpc/mm/slice.c: In function 'get_slice_psize':
arch/powerpc/mm/slice.c:709:11: error: comparison is always true due to limited range of data type [-Werror=type-limits]
  if (addr < SLICE_LOW_TOP) {
           ^

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy aa5456abdc powerpc/mm: fix missing prototypes in slice.c
This patch fixes the following warnings (obtained with make W=1).

arch/powerpc/mm/slice.c: At top level:
arch/powerpc/mm/slice.c:682:15: error: no previous prototype for 'arch_get_unmapped_area' [-Werror=missing-prototypes]
 unsigned long arch_get_unmapped_area(struct file *filp,
               ^
arch/powerpc/mm/slice.c:692:15: error: no previous prototype for 'arch_get_unmapped_area_topdown' [-Werror=missing-prototypes]
 unsigned long arch_get_unmapped_area_topdown(struct file *filp,
               ^

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy 8114c36ea6 powerpc/mm: Trace tlbia instruction
Add a trace point for tlbia (Translation Lookaside Buffer Invalidate
All) instruction.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy cf4a608515 powerpc/mm: Add missing tracepoint for tlbie
commit 0428491cba ("powerpc/mm: Trace tlbie(l) instructions")
added tracepoints for tlbie calls, but _tlbil_va() was forgotten

Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Christophe Leroy 3ff38e1874 powerpc/book3s64: fix dump_linuxpagetables "present" flag
Since commit bd0dbb73e0 ("powerpc/mm/books3s: Add new pte bit to
mark pte temporarily invalid."), _PAGE_PRESENT doesn't mean exactly
that a page is present. A page is also considered preset when
_PAGE_INVALID is set.

This patch changes the meaning of "present" and adds a status "valid"
associated to the _PAGE_PRESENT flag.

Fixes: bd0dbb73e0 ("powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid.")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Aravinda Prasad c6c26fb55e powerpc/pseries: Export raw per-CPU VPA data via debugfs
This patch exports the raw per-CPU VPA data via debugfs.
A per-CPU file is created which exports the VPA data of
that CPU to help debug some of the VPA related issues or
to analyze the per-CPU VPA related statistics.

v3: Removed offline CPU check.

v2: Included offline CPU check and other review comments.

Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Naveen N. Rao 59fe7eaf35 powerpc64/module elfv1: Set opd addresses after module relocation
module_frob_arch_sections() is called before the module is moved to its
final location. The function descriptor section addresses we are setting
here are thus invalid. Fix this by processing opd section during
module_finalize()

Fixes: 5633e85b2c ("powerpc64: Add .opd based function descriptor dereference")
Cc: stable@vger.kernel.org # v4.16
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:47 +11:00
Naveen N. Rao 7cd01b08d3 powerpc: Add support for function error injection
We implement regs_set_return_value() and override_function_with_return()
for this purpose.

On powerpc, a return from a function (blr) just branches to the location
contained in the link register. So, we can just update pt_regs rather
than redirecting execution to a dummy function that returns.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-20 13:26:43 +11:00
Helge Deller fe8376dbbd parisc: Add PDC PAT cell_info() and pd_get_pdc_revisions() functions
Add wrappers for the PDC_PAT_CELL_GET_INFO and
PDC_PAT_PD_GET_PDC_INTERF_REV PAT PDC subfunctions.

Both provide access to the PAT capability bitfield which can guide us if
simultaneous PTLBs are allowed on the bus, and if firmware will
rendezvous all processors within PDCE_Check in case of an HPMC.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-19 22:22:07 +02:00
Helge Deller 32c1ceeabd parisc: Drop two instructions from pte lookup code
Remove two instruction from the hot path. The temporary move to %r9 is
unneccessary, and the zero-inialization of pte happens twice.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-19 22:22:06 +02:00
Helge Deller a45a01160f parisc: Use zdep for shlw macro on PA1.1 and PA2.0
The zdep and depw,z mnemonics generate the same code. The assembler will
accept the depw,z mnemonic when generating PA 1.x code. The zdep
mnemonic is okay when generating PA 2.0 code. This patch changes depw,z
to zdep in the current shlw macro, while the binary code will be the
same.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
2018-10-19 22:22:05 +02:00
David S. Miller 2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
David S. Miller 46b8306480 sparc: Fix parport build warnings.
If PARPORT_PC_FIFO is not enabled, do not provide the dma lock
macros and lock definition.  Otherwise:

./arch/sparc/include/asm/parport.h:24:24: warning: ‘dma_spin_lock’ defined but not used [-Wunused-variable]
 static DEFINE_SPINLOCK(dma_spin_lock);
                        ^~~~~~~~~~~~~
./include/linux/spinlock_types.h:81:39: note: in definition of macro ‘DEFINE_SPINLOCK’
 #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 10:52:52 -07:00
Vitaly Kuznetsov cbe3f898d1 x86/kvm/nVMX: tweak shadow fields
It seems we have some leftovers from times when 'unrestricted guest'
wasn't exposed to L1. Stop shadowing GUEST_CS_{BASE,LIMIT,AR_SELECTOR}
and GUEST_ES_BASE, shadow GUEST_SS_AR_BYTES as it was found that some
hypervisors (e.g. Hyper-V without Enlightened VMCS) access it pretty
often.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 18:45:14 +02:00
James Morse 4debef5510 arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work
enable_smccc_arch_workaround_1() passes NULL as the hyp_vecs start and
end if the HVC conduit is in use, and ARM_SMCCC_ARCH_WORKAROUND_1 is
detected.

If the guest kernel happened to be built with KVM_INDIRECT_VECTORS,
we go on to allocate a slot, memcpy() the empty workaround in and
do the appropriate cache maintenance.

This works as we always tell memcpy() the range is 0, so it never
accesses the NULL src pointer, but we still do the cache maintenance.

If hyp_vecs_start is NULL we know we're a guest, just update the fn
like the !KVM_INDIRECT_VECTORS version.

Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-19 15:37:25 +01:00
Paolo Bonzini e42b4a507e KVM/arm updates for 4.20
- Improved guest IPA space support (32 to 52 bits)
 - RAS event delivery for 32bit
 - PMU fixes
 - Guest entry hardening
 - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlvJ0HIVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDnWsP/02W6iIZUlg0SfsNq3bownJv+3VH
 BwEWTfRhWqqzSnsPwUEcOakKI8OIDJ07wIr6XoqPqq2PESS4BQv90qUTxytJXIt4
 gdTxZbNdCSzOc8Zf5URi1WtydekxsEFKgZy9iYWuILJzGW8iFbDZasgG6l8TWupN
 SsoyoGYBVwqR4xRf2f+PLf2n4U0McM8gFuKBFpnp1vCg6gZMBOvvKxQSRk9lUXEL
 C5LERL1CsGVn1Q2GxEB4yAxqrlAMMjy/S2dAf2KpCvMvviK3t05C4vY/+/mT21YE
 wCStX7W5Jfhy3hEsyHCkeulODdomIyro32/hw1qLhMXv4+wRvoiNrMVEoxUPi+by
 L89C6slwxqZOgcF2epSQgf7LBiLw+LnCGtACq2xY7p8yGuy0XW7mK9DlY5RvBHka
 aMmZ6kK/GIZFqRHDHa+ND2cAqS+Xyg2t/j2rvUPL0/xNelI1hpUUyGECTcqAXLr7
 N28+8aoHWcYb03r8YwfgWkEcwT9leAS45NBmHgnkOL4srcyW7anSW4NhZb/+U0mM
 8cLF+2BxfUo733Q5EyM2Q3JdbgaDaeanf6zzy7xAsPEywK4P5/kdqjc0N9se+LUx
 WhU3BRDU4KwV6S7bBS9ZuFK3heuwfuKWaYwwDaxrTlem++8FhoLBNV2vN8VjemD/
 AY5RvHrEhFYndijj
 =vjLz
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm updates for 4.20

- Improved guest IPA space support (32 to 52 bits)
- RAS event delivery for 32bit
- PMU fixes
- Guest entry hardening
- Various cleanups
2018-10-19 15:24:24 +02:00
Jithu Joseph b61b8bba18 x86/intel_rdt: Prevent pseudo-locking from using stale pointers
When the last CPU in an rdt_domain goes offline, its rdt_domain struct gets
freed. Current pseudo-locking code is unaware of this scenario and tries to
dereference the freed structure in a few places.

Add checks to prevent pseudo-locking code from doing this.

While further work is needed to seamlessly restore resource groups (not
just pseudo-locking) to their configuration when the domain is brought back
online, the immediate issue of invalid pointers is addressed here.

Fixes: f4e80d67a5 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information")
Fixes: 443810fe61 ("x86/intel_rdt: Create debugfs files for pseudo-locking testing")
Fixes: 746e08590b ("x86/intel_rdt: Create character device exposing pseudo-locked region")
Fixes: 33dc3e410a ("x86/intel_rdt: Make CPU information accessible for pseudo-locked regions")
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: gavin.hindman@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/231f742dbb7b00a31cc104416860e27dba6b072d.1539384145.git.reinette.chatre@intel.com
2018-10-19 14:54:28 +02:00
Christoffer Dall e4e11cc0f8 KVM: arm64: Safety check PSTATE when entering guest and handle IL
This commit adds a paranoid check when entering the guest to make sure
we don't attempt running guest code in an equally or more privilged mode
than the hypervisor.  We also catch other accidental programming of the
SPSR_EL2 which results in an illegal exception return and report this
safely back to the user.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-19 11:13:03 +01:00
Paul Mackerras 8d9fcacff9 KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips
This disables the use of the streamlined entry path for radix guests
on early POWER9 chips that need the workaround added in commit
a25bd72bad ("powerpc/mm/radix: Workaround prefetch issue with KVM",
2017-07-24), because the streamlined entry path does not include
that workaround.  This also means that we can't do nested HV-KVM
on those chips.

Since the chips that need that workaround are the same ones that can't
run both radix and HPT guests at the same time on different threads of
a core, we use the existing 'no_mixing_hpt_and_radix' variable that
identifies those chips to identify when we can't use the new guest
entry path, and when we can't do nested virtualization.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-10-19 20:44:04 +11:00
Christoph Hellwig 886643b766 arm64: use the generic swiotlb_dma_ops
Now that the generic swiotlb code supports non-coherent DMA we can switch
to it for arm64.  For that we need to refactor the existing
alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
and implement the standard arch_sync_dma_for_{device,cpu} hooks for
cache maintaincance in the streaming dma hooks, which also implies
using the generic dma_coherent flag in struct device.

Note that we need to keep the old is_device_dma_coherent function around
for now, so that the shared arm/arm64 Xen code keeps working.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-19 08:53:24 +02:00
Christoph Hellwig fafadcd165 swiotlb: don't dip into swiotlb pool for coherent allocations
All architectures that support swiotlb also have a zone that backs up
these less than full addressing allocations (usually ZONE_DMA32).

Because of that it is rather pointless to fall back to the global swiotlb
buffer if the normal dma direct allocation failed - the only thing this
will do is to eat up bounce buffers that would be more useful to serve
streaming mappings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-10-19 08:48:28 +02:00
Christoph Hellwig dff8d6c1ed swiotlb: remove the overflow buffer
Like all other dma mapping drivers just return an error code instead
of an actual memory buffer.  The reason for the overflow buffer was
that at the time swiotlb was invented there was no way to check for
dma mapping errors, but this has long been fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-10-19 08:43:46 +02:00
Thomas Richter ec0c0bb489 s390/perf: Return error when debug_register fails
Return an error when the function debug_register() fails allocating
the debug handle.
Also remove the registered debug handle when the initialization fails
later on.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-19 08:18:16 +02:00
Christoph Hellwig 485734f3fc x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels
We already build the swiotlb code for 32-bit kernels with PAE support,
but the code to actually use swiotlb has only been enabled for 64-bit
kernels for an unknown reason.

Before Linux v4.18 we paper over this fact because the networking code,
the SCSI layer and some random block drivers implemented their own
bounce buffering scheme.

[ mingo: Changelog fixes. ]

Fixes: 21e07dba9f ("scsi: reduce use of block bounce buffers")
Fixes: ab74cfebaf ("net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma")
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: konrad.wilk@oracle.com
Cc: iommu@lists.linux-foundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181014075208.2715-1-hch@lst.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-19 07:49:32 +02:00
Michael Ellerman b4d16ab58c powerpc/time: Fix clockevent_decrementer initalisation for PR KVM
In the recent commit 8b78fdb045 ("powerpc/time: Use
clockevents_register_device(), fixing an issue with large
decrementer") we changed the way we initialise the decrementer
clockevent(s).

We no longer initialise the mult & shift values of
decrementer_clockevent itself.

This has the effect of breaking PR KVM, because it uses those values
in kvmppc_emulate_dec(). The symptom is guest kernels spin forever
mid-way through boot.

For now fix it by assigning back to decrementer_clockevent the mult
and shift values.

Fixes: 8b78fdb045 ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 15:09:04 +11:00
Michael Ellerman 6ce7bff045 powerpc/aout: Fix struct user definition to use user_pt_regs
I'm pretty sure this is dead code, it's only used by the a.out core
dump code, and we don't support a.out. We should remove it.

But while it's in the tree it should be using the ABI version of
pt_regs which is called user_pt_regs in the kernel, because the whole
struct is written to the core dump and so its size shouldn't change.

Note this isn't a uapi header so we don't need an ifdef.

Fixes: 002af9391b ("powerpc: Split user/kernel definitions of struct pt_regs")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 15:09:04 +11:00
Michael Ellerman 22a3d03d69 powerpc/uapi: Fix sigcontext definition to use user_pt_regs
My recent patch to split pt_regs between user and kernel missed
the usage in struct sigcontext.

Because this is a user visible struct it should be using the user
visible definition, which when we're building for the kernel is called
struct user_pt_regs.

As far as I can see this hasn't actually caused a bug (yet), because
we don't use the sizeof() the sigcontext->regs anywhere. But we should
still fix it to avoid confusion and future bugs.

Fixes: 002af9391b ("powerpc: Split user/kernel definitions of struct pt_regs")
Reported-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 15:09:04 +11:00
Stephen Boyd 1578968f77 Merge branches 'clk-imx6-mmdc', 'clk-qcom-krait', 'clk-rockchip' and 'clk-smp2s11-match' into clk-next
- iMX6 MMDC clks
  - Qualcomm Krait CPU clk support

* clk-imx6-mmdc:
  clk: imx6q: add mmdc0 ipg clock
  clk: imx6sl: add mmdc ipg clocks
  clk: imx6sll: add mmdc1 ipg clock
  clk: imx6sx: add mmdc1 ipg clock
  clk: imx6ul: add mmdc1 ipg clock

* clk-qcom-krait:
  clk: qcom: Add safe switch hook for krait mux clocks
  dt-bindings: clock: Document qcom,krait-cc
  clk: qcom: Add Krait clock controller driver
  dt-bindings: arm: Document qcom,kpss-gcc
  clk: qcom: Add KPSS ACC/GCC driver
  clk: qcom: Add support for Krait clocks
  clk: qcom: Add IPQ806X's HFPLLs
  clk: qcom: Add MSM8960/APQ8064's HFPLLs
  dt-bindings: clock: Document qcom,hfpll
  clk: qcom: Add HFPLL driver
  clk: qcom: Add support for High-Frequency PLLs (HFPLLs)
  ARM: Add Krait L2 register accessor functions

* clk-rockchip:
  clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call
  clk: rockchip: use the newly added clock-id for hdmi on RK3066
  clk: rockchip: add clock-id for HCLK_HDMI on rk3066
  clk: rockchip: fix wrong mmc sample phase shift for rk3328
  clk: rockchip: improve rk3288 pll rates for better hdmi output

* clk-smp2s11-match:
  clk: s2mps11: Add used attribute to s2mps11_dt_match
  clk: s2mps11: Fix matching when built as module and DT node contains compatible
2018-10-18 15:44:01 -07:00
Christoph Hellwig ecb0a83e31 ubd: remove use of blk_rq_map_sg
There is no good reason to create a scatterlist in the ubd driver,
it can just iterate the request directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[rw: Folded in improvements as discussed with hch and jens]
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-18 15:13:12 -06:00
Christophe Leroy a0e102914a powerpc/io: remove old GCC version implementation
GCC 4.6 is the minimum supported now.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Michael Ellerman 23ad1a2700 powerpc: Add -Werror at arch/powerpc level
Back when I added -Werror in commit ba55bd7436 ("powerpc: Add
configurable -Werror for arch/powerpc") I did it by adding it to most
of the arch Makefiles.

At the time we excluded math-emu, because apparently it didn't build
cleanly. But that seems to have been fixed somewhere in the interim.

So move the -Werror addition to the top-level of the arch, this saves
us from repeating it in every Makefile and means we won't forget to
add it to any new sub-dirs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Michael Ellerman c47ca98d32 powerpc: Move core kernel logic into arch/powerpc/Kbuild
This is a nice cleanup, arch/powerpc/Makefile is long and messy so
moving this out helps a little.

It also allows us to do:

  $ make arch/powerpc

Which can be helpful if you just want to compile test some changes to
arch code and not link everything.

Finally it also gives us a single place to do subdir-cc-flags
assignments which affect the whole of arch/powerpc, which we will do
in a future patch.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Christophe Leroy bd03fd84a5 powerpc/traps: remove redundant in_interrupt panic in die()
do_exit() already includes a test to panic() is in_interrupt()

This patch removes powerpc one which is redundant.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt f1f208e54d powerpc/prom_init: Generate "phandle" instead of "linux, phandle"
When creating the boot-time FDT from an actual Open Firmware live
tree, let's generate "phandle" properties for the phandles instead
of the old deprecated "linux,phandle".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Unsplit warning printf()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt 2c51d97ee8 powerpc: Check prom_init for disallowed sections
prom_init.c must not modify the kernel image outside
of the .bss.prominit section. Thus make sure that
prom_init.o doesn't have anything in any of these:

	.data
	.bss
	.init.data

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt 5f69e38885 powerpc/prom_init: Move __prombss to it's own section and store it in .bss
This makes __prombss its own section, and for now store
it in .bss.

This will give us the ability later to store it elsewhere
and/or free it after boot (it's about 8KB).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt 8ca2d5151e powerpc/prom_init: Move a few remaining statics to appropriate sections
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt d00e34b92c powerpc/prom_init: Move const structures to __initconst
As they are no longer used past the end of prom_init

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt a614f52e75 powerpc/prom_init: Move ibm_arch_vec to __prombss
Make the existing initialized definition constant and copy
it to a __prombss copy

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt c886087cae powerpc/prom_init: Move prom_radix_disable to __prombss
Initialize it dynamically instead of statically

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt 11fdb30934 powerpc/prom_init: Remove support for OPAL v2
We removed support for running under any OPAL version
earlier than v3 in 2015 (they never saw the light of day
anyway), but we kept some leftovers of this support in
prom_init.c, so let's take it out.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Benjamin Herrenschmidt e63334e556 powerpc/prom_init: Replace __initdata with __prombss when applicable
This replaces all occurrences of __initdata for uninitialized
data with a new __prombss

Currently __promdata is defined to be __initdata but we'll
eventually change that.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Oliver O'Halloran b5beae5e22 powerpc/pseries: Add driver for PAPR SCM regions
Adds a driver that implements support for enabling and accessing PAPR
SCM regions. Unfortunately due to how the PAPR interface works we can't
use the existing of_pmem driver (yet) because:

 a) The guest is required to use the H_SCM_BIND_MEM h-call to add
    add the SCM region to it's physical address space, and
 b) There is currently no mechanism for relating a bare of_pmem region
    to the backing DIMM (or not-a-DIMM for our case).

Both of these are easily handled by rolling the functionality into a
seperate driver so here we are...

Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Oliver O'Halloran 4c5d87db49 powerpc/pseries: PAPR persistent memory support
This patch implements support for discovering storage class memory
devices at boot and for handling hotplug of new regions via RTAS
hotplug events.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Fix CONFIG_MEMORY_HOTPLUG=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Christophe Leroy 422123ccb9 powerpc/traps: fix machine check handlers to use pr_cont()
When printing the machine check cause, the cause appears on the
following line due to bad use of printk without \n:

[   33.663993] Machine check in kernel mode.
[   33.664011] Caused by (from SRR1=9032):
[   33.664036] Data access error at address c90c8000

This patch fixes it by using pr_cont() for the second part:

[  133.258131] Machine check in kernel mode.
[  133.258146] Caused by (from SRR1=9032): Data access error at address c90c8000

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Christophe Leroy bde1a1335c powerpc/book3e: redefine pte_mkprivileged() and pte_mkuser()
Book3e defines both _PAGE_USER and _PAGE_PRIVILEGED, so the nohash
default pte_mkprivileged() and pte_mkuser() are not usable.

This patch redefines them for book3e.

In theorie, only pte_mkprivileged() needs to be redefined because
_PAGE_USER includes _PAGE_PRIVILEGED, but it is less confusing
to redefine both.

Fixes: a0da4bc166 ("powerpc/mm: Allow platforms to redefine some helpers")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Aneesh Kumar K.V b9fb4480a3 powerpc/mm: Make pte_pgprot return all pte bits
Other archs do the same and instead of adding required pte bits (which
got masked out) in __ioremap_at(), make sure we filter only pfn bits
out.

Fixes: 26973fa5ac ("powerpc/mm: use pte helpers in generic code")
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Rafael J. Wysocki 3f858ae02c Merge branches 'acpi-pm' and 'pm-sleep'
* acpi-pm:
  ACPI / PM: LPIT: Register sysfs attributes based on FADT

* pm-sleep:
  x86-32, hibernate: Adjust in_suspend after resumed on 32bit system
  x86-32, hibernate: Set up temporary text mapping for 32bit system
  x86-32, hibernate: Switch to relocated restore code during resume on 32bit system
  x86-32, hibernate: Switch to original page table after resumed
  x86-32, hibernate: Use the page size macro instead of constant value
  x86-32, hibernate: Use temp_pgt as the temporary page table
  x86, hibernate: Rename temp_level4_pgt to temp_pgt
  x86-32, hibernate: Enable CONFIG_ARCH_HIBERNATION_HEADER on 32bit system
  x86, hibernate: Extract the common code of 64/32 bit system
  x86-32/asm/power: Create stack frames in hibernate_asm_32.S
  PM / hibernate: Check the success of generating md5 digest before hibernation
  x86, hibernate: Fix nosave_regions setup for hibernation
  PM / sleep: Show freezing tasks that caused a suspend abort
  PM / hibernate: Documentation: fix image_size default value
2018-10-18 12:27:30 +02:00
Dongjiu Geng 58bf437ff6 arm/arm64: KVM: Enable 32 bits kvm vcpu events support
The commit 539aee0edb ("KVM: arm64: Share the parts of
get/set events useful to 32bit") shares the get/set events
helper for arm64 and arm32, but forgot to share the cap
extension code.

User space will check whether KVM supports vcpu events by
checking the KVM_CAP_VCPU_EVENTS extension

Acked-by: James Morse <james.morse@arm.com>
Reviewed-by : Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-18 10:14:03 +01:00
Dongjiu Geng 375bdd3b5d arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension()
Rename kvm_arch_dev_ioctl_check_extension() to
kvm_arch_vm_ioctl_check_extension(), because it does
not have any relationship with device.

Renaming this function can make code readable.

Cc: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-10-18 10:12:53 +01:00
Steven Rostedt (VMware) c2712b8581 kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
Andy had some concerns about using regs_get_kernel_stack_nth() in a new
function regs_get_kernel_argument() as if there's any error in the stack
code, it could cause a bad memory access. To be on the safe side, call
probe_kernel_read() on the stack address to be extra careful in accessing
the memory. A helper function, regs_get_kernel_stack_nth_addr(), was added
to just return the stack address (or NULL if not on the stack), that will be
used to find the address (and could be used by other functions) and read the
address with kernel_probe_read().

Requested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181017165951.09119177@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-18 08:28:35 +02:00
Dan Carpenter 62d6f3b7b8 sparc: vDSO: Silence an uninitialized variable warning
Smatch complains that "val" would be uninitialized if kstrtoul() fails.

Fixes: 9a08862a5d ("vDSO for sparc")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 21:55:02 -07:00
David S. Miller 776ca1543b sparc: Fix syscall fallback bugs in VDSO.
First, the trap number for 32-bit syscalls is 0x10.

Also, only negate the return value when syscall error is indicated by
the carry bit being set.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17 21:29:23 -07:00
Sebastian Andrzej Siewior 711f76a328 x86/mcelog: Remove one mce_helper definition
Commit

  5de97c9f6d ("x86/mce: Factor out and deprecate the /dev/mcelog driver")

moved the old interface into one file including mce_helper definition as
static and "extern". Remove one.

Fixes: 5de97c9f6d ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Tony Luck <tony.luck@intel.com>
CC: linux-edac <linux-edac@vger.kernel.org>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20181017170554.18841-3-bigeasy@linutronix.de
2018-10-18 00:05:04 +02:00
Stephen Boyd 36d68f64c4 ARM: Add Krait L2 register accessor functions
Krait CPUs have a handful of L2 cache controller registers that
live behind a cp15 based indirection register. First you program
the indirection register (l2cpselr) to point the L2 'window'
register (l2cpdr) at what you want to read/write.  Then you
read/write the 'window' register to do what you want. The
l2cpselr register is not banked per-cpu so we must lock around
accesses to it to prevent other CPUs from re-pointing l2cpdr
underneath us.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Tested-by: Craig Tatlor <ctatlor97@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-17 13:14:33 -07:00
Paolo Bonzini 1e58e5e591 KVM: VMX: enable nested virtualization by default
With live migration support and finally a good solution for exception
event injection, nested VMX should be ready for having a stable userspace
ABI.  The results of syzkaller fuzzing are not perfect but not horrible
either (and might be partially due to running on GCE, so that effectively
we're testing three-level nesting on a fork of upstream KVM!).  Enabling
it by default seems like a nice way to conclude the 4.20 pull request. :)

Unfortunately, enabling nested SVM in 2009 (commit 4b6e4dca70) was a
bit premature.  However, until live migration support is in place we can
reasonably expect that it does not offer much in terms of ABI guarantees.
Therefore we are still in time to break things and conform as much as
possible to the interface used for VMX.

Suggested-by: Jim Mattson <jmattson@google.com>
Suggested-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Celebrated-by: Liran Alon <liran.alon@oracle.com>
Celebrated-by: Wanpeng Li <kernellwp@gmail.com>
Celebrated-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:10:09 +02:00
Uros Bizjak 43ce76ce73 KVM/x86: Use 32bit xor to clear registers in svm.c
x86_64 zero-extends 32bit xor operation to a full 64bit register.

Also add a comment and remove unnecessary instruction suffix in vmx.c

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:08:21 +02:00
Jim Mattson c4f55198c7 kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD
This is a per-VM capability which can be enabled by userspace so that
the faulting linear address will be included with the information
about a pending #PF in L2, and the "new DR6 bits" will be included
with the information about a pending #DB in L2. With this capability
enabled, the L1 hypervisor can now intercept #PF before CR2 is
modified. Under VMX, the L1 hypervisor can now intercept #DB before
DR6 and DR7 are modified.

When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should
generally provide an appropriate payload when injecting a #PF or #DB
exception via KVM_SET_VCPU_EVENTS. However, to support restoring old
checkpoints, this payload is not required.

Note that bit 16 of the "new DR6 bits" is set to indicate that a debug
exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM
region while advanced debugging of RTM transactional regions was
enabled. This is the reverse of DR6.RTM, which is cleared in this
scenario.

This capability also enables exception.pending in struct
kvm_vcpu_events, which allows userspace to distinguish between pending
and injected exceptions.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:07:44 +02:00
Jim Mattson f10c729ff9 kvm: vmx: Defer setting of DR6 until #DB delivery
When exception payloads are enabled by userspace (which is not yet
possible) and a #DB is raised in L2, defer the setting of DR6 until
later. Under VMX, this allows the L1 hypervisor to intercept the fault
before DR6 is modified. Under SVM, DR6 is modified before L1 can
intercept the fault (as has always been the case with DR7).

Note that the payload associated with a #DB exception includes only
the "new DR6 bits." When the payload is delievered, DR6.B0-B3 will be
cleared and DR6.RTM will be set prior to merging in the new DR6 bits.

Also note that bit 16 in the "new DR6 bits" is set to indicate that a
debug exception (#DB) or a breakpoint exception (#BP) occurred inside
an RTM region while advanced debugging of RTM transactional regions
was enabled. Though the reverse of DR6.RTM, this makes the #DB payload
field compatible with both the pending debug exceptions field under
VMX and the exit qualification for #DB exceptions under VMX.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:07:43 +02:00
Jim Mattson da998b46d2 kvm: x86: Defer setting of CR2 until #PF delivery
When exception payloads are enabled by userspace (which is not yet
possible) and a #PF is raised in L2, defer the setting of CR2 until
the #PF is delivered. This allows the L1 hypervisor to intercept the
fault before CR2 is modified.

For backwards compatibility, when exception payloads are not enabled
by userspace, kvm_multiple_exception modifies CR2 when the #PF
exception is raised.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:07:43 +02:00
Jim Mattson 91e86d225e kvm: x86: Add payload operands to kvm_multiple_exception
kvm_multiple_exception now takes two additional operands: has_payload
and payload, so that updates to CR2 (and DR6 under VMX) can be delayed
until the exception is delivered. This is necessary to properly
emulate VMX or SVM hardware behavior for nested virtualization.

The new behavior is triggered by
vcpu->kvm->arch.exception_payload_enabled, which will (later) be set
by a new per-VM capability, KVM_CAP_EXCEPTION_PAYLOAD.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:07:42 +02:00
Jim Mattson 59073aaf6d kvm: x86: Add exception payload fields to kvm_vcpu_events
The per-VM capability KVM_CAP_EXCEPTION_PAYLOAD (to be introduced in a
later commit) adds the following fields to struct kvm_vcpu_events:
exception_has_payload, exception_payload, and exception.pending.

With this capability set, all of the details of vcpu->arch.exception,
including the payload for a pending exception, are reported to
userspace in response to KVM_GET_VCPU_EVENTS.

With this capability clear, the original ABI is preserved, and the
exception.injected field is set for either pending or injected
exceptions.

When userspace calls KVM_SET_VCPU_EVENTS with
KVM_CAP_EXCEPTION_PAYLOAD clear, exception.injected is no longer
translated to exception.pending. KVM_SET_VCPU_EVENTS can now only
establish a pending exception when KVM_CAP_EXCEPTION_PAYLOAD is set.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:07:38 +02:00
Helge Deller 3847dab774 parisc: Add alternative coding infrastructure
This patch adds the necessary code to patch a running kernel at runtime
to improve performance.

The current implementation offers a few optimizations variants:

- When running a SMP kernel on a single UP processor, unwanted assembler
  statements like locking functions are overwritten with NOPs. When
  multiple instructions shall be skipped, one branch instruction is used
  instead of multiple nop instructions.

- In the UP case, some pdtlb and pitlb instructions are patched to
  become pdtlb,l and pitlb,l which only flushes the CPU-local tlb
  entries instead of broadcasting the flush to other CPUs in the system
  and thus may improve performance.

- fic and fdc instructions are skipped if no I- or D-caches are
  installed.  This should speed up qemu emulation and cacheless systems.

- If no cache coherence is needed for IO operations, the relevant fdc
  and sync instructions in the sba and ccio drivers are replaced by
  nops.

- On systems which share I- and D-TLBs and thus don't have a seperate
  instruction TLB, the pitlb instruction is replaced by a nop.

Live-patching is done early in the boot process, just after having run
the system inventory. No drivers are running and thus no external
interrupts should arrive. So the hope is that no TLB exceptions will
occur during the patching. If this turns out to be wrong we will
probably need to do the patching in real-mode.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 17:22:26 +02:00
Greg Kroah-Hartman c343db455e Merge branch 'parisc-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Helge writes:
   "parisc fix:

    Fix an unitialized variable usage in the parisc unwind code."

* 'parisc-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix uninitialized variable usage in unwind.c
2018-10-17 14:01:00 +02:00
Sebastian Andrzej Siewior 2224d61652 x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU
Booting an i486 with "no387 nofxsr" ends with with the following crash:

   math_emulate: 0060:c101987d
   Kernel panic - not syncing: Math emulation needed in kernel

on the first context switch in user land.

The reason is that copy_fpregs_to_fpstate() tries FNSAVE which does not work
as the FPU is turned off.

This bug was introduced in:

  f1c8cd0176 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active")

Add a check for X86_FEATURE_FPU before trying to save FPU registers (we
have such a check in switch_fpu_finish() already).

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: f1c8cd0176 ("x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active")
Link: http://lkml.kernel.org/r/20181016202525.29437-4-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-17 12:30:38 +02:00
Sebastian Andrzej Siewior 6aa676761d x86/fpu: Remove second definition of fpu in __fpu__restore_sig()
Commit:

  c5bedc6847 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active")

introduced the 'fpu' variable at top of __restore_xstate_sig(),
which now shadows the other definition:

  arch/x86/kernel/fpu/signal.c:318:28: warning: symbol 'fpu' shadows an earlier one
  arch/x86/kernel/fpu/signal.c:271:20: originally declared here

Remove the shadowed definition of 'fpu', as the two definitions are the same.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c5bedc6847 ("x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active")
Link: http://lkml.kernel.org/r/20181016202525.29437-3-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-17 12:30:31 +02:00
Andy Lutomirski ae852495be x86/entry/64: Further improve paranoid_entry comments
Commit:

  16561f27f9 ("x86/entry: Add some paranoid entry/exit CR3 handling comments")

... added some comments.  This improves them a bit:

 - When I first read the new comments, it was unclear to me whether
   they were referring to the case where paranoid_entry interrupted
   other entry code or where paranoid_entry was itself interrupted.
   Clarify it.

 - Remove the EBX comment.  We no longer use EBX as a SWAPGS
   indicator.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c47daa1888dc2298e7e1d3f82bd76b776ea33393.1539542111.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-17 12:30:27 +02:00
Jan Kiszka 04f4f954b6 x86/entry/32: Clear the CS high bits
Even if not on an entry stack, the CS's high bits must be
initialized because they are unconditionally evaluated in
PARANOID_EXIT_TO_KERNEL_MODE.

Failing to do so broke the boot on Galileo Gen2 and IOT2000 boards.

 [ bp: Make the commit message tone passive and impartial. ]

Fixes: b92a165df1 ("x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andrea Arcangeli <aarcange@redhat.com>
CC: Andy Lutomirski <luto@kernel.org>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Brian Gerst <brgerst@gmail.com>
CC: Dave Hansen <dave.hansen@intel.com>
CC: David Laight <David.Laight@aculab.com>
CC: Denys Vlasenko <dvlasenk@redhat.com>
CC: Eduardo Valentin <eduval@amazon.com>
CC: Greg KH <gregkh@linuxfoundation.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Josh Poimboeuf <jpoimboe@redhat.com>
CC: Juergen Gross <jgross@suse.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Will Deacon <will.deacon@arm.com>
CC: aliguori@amazon.com
CC: daniel.gruss@iaik.tugraz.at
CC: hughd@google.com
CC: keescook@google.com
CC: linux-mm <linux-mm@kvack.org>
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/f271c747-1714-5a5b-a71f-ae189a093b8d@siemens.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-17 12:30:20 +02:00
Bartlomiej Zolnierkiewicz b3569d3a4b x86/kconfig: Remove redundant 'default n' lines from all x86 Kconfig's
'default n' is the default value for any bool or tristate Kconfig
setting so there is no need to write it explicitly.

Also, since commit:

  f467c5640c ("kconfig: only write '# CONFIG_FOO is not set' for visible symbols")

... the Kconfig behavior is the same regardless of 'default n' being present or not:

    ...
    One side effect of (and the main motivation for) this change is making
    the following two definitions behave exactly the same:

        config FOO
                bool

        config FOO
                bool
                default n

    With this change, neither of these will generate a
    '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
    That might make it clearer to people that a bare 'default n' is
    redundant.
    ...

Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Juergen Gross <jgross@suse.co>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20181016134217eucas1p2102984488b89178a865162553369025b%7EeGpI5NlJo0851008510eucas1p2D@eucas1p2.samsung.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-17 08:39:42 +02:00
Helge Deller 34c201ae49 parisc: Include compressed vmlinux file in vmlinuz boot kernel
Change the parisc vmlinuz boot code to include and process the real
compressed vmlinux.gz ELF file instead of a compressed memory dump.
This brings parisc in sync on how it's done on x86_64.

The benefit of this change is that, e.g. for debugging purposes, one can
then extract the vmlinux file out of the vmlinuz which was booted which
wasn't possible before. This can be archieved with the existing
scripts/extract-vmlinux script, which just needs a small tweak to prefer
to extract a compressed file before trying the existing given binary.

The downside of this approach is that due to the extra round of
decompression/ELF processing we need more physical memory installed to
be able to boot a kernel.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
John David Anglin 1138b6718f parisc: Fix address in HPMC IVA
Helge noticed that the address of the os_hpmc handler was not being
correctly calculated in the hpmc macro.  As a result, PDCE_CHECK would
fail to call os_hpmc:

<Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu2> 37000f7302e00000  8040004000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> f600105e02e00000  fffffff0f0c00000  CC_MC_HPMC_MONARCH_SELECTED
<Cpu2> 140003b202e00000  000000000000000b  CC_ERR_HPMC_STATE_ENTRY
<Cpu2> 5600100b02e00000  00000000000001a0  CC_MC_OS_HPMC_LEN_ERR
<Cpu2> 5600106402e00000  fffffff0f0438e70  CC_MC_BR_TO_OS_HPMC_FAILED
<Cpu2> e800009802e00000  0000000000000000  CC_ERR_CHECK_HPMC
<Cpu2> 37000f7302e00000  8040004000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> 4000109f02e00000  0000000000000000  CC_MC_HPMC_INITIATED
<Cpu2> 4000101902e00000  0000000000000000  CC_MC_MULTIPLE_HPMCS
<Cpu2> 030010d502e00000  0000000000000000  CC_CPU_STOP

The address problem can be seen by dumping the fault vector:

0000000040159000 <fault_vector_20>:
    40159000:   63 6f 77 73     stb r15,-2447(dp)
    40159004:   20 63 61 6e     ldil L%b747000,r3
    40159008:   20 66 6c 79     ldil L%-1c3b3000,r3
        ...
    40159020:   08 00 02 40     nop
    40159024:   20 6e 60 02     ldil L%15d000,r3
    40159028:   34 63 00 00     ldo 0(r3),r3
    4015902c:   e8 60 c0 02     bv,n r0(r3)
    40159030:   08 00 02 40     nop
    40159034:   00 00 00 00     break 0,0
    40159038:   c0 00 70 00     bb,*< r0,sar,40159840 <fault_vector_20+0x840>
    4015903c:   00 00 00 00     break 0,0

Location 40159038 should contain the physical address of os_hpmc:

000000004015d000 <os_hpmc>:
    4015d000:   08 1a 02 43     copy r26,r3
    4015d004:   01 c0 08 a4     mfctl iva,r4
    4015d008:   48 85 00 68     ldw 34(r4),r5

This patch moves the address setup into initialize_ivt to resolve the
above problem.  I tested the change by dumping the HPMC entry after setup:

0000000040209020:  8000240
0000000040209024: 206a2004
0000000040209028: 34630ac0
000000004020902c: e860c002
0000000040209030:  8000240
0000000040209034: 1bdddce6
0000000040209038:   15d000
000000004020903c:      1a0

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
Helge Deller 99a3ae51d5 parisc: Fix exported address of os_hpmc handler
In the C-code we need to put the physical address of the hpmc handler in
the interrupt vector table (IVA) in order to get HPMCs working.  Since
on parisc64 function pointers are indirect (in fact they are function
descriptors) we instead export the address as variable and not as
function.

This reverts a small part of commit f39cce654f ("parisc: Add
cfi_startproc and cfi_endproc to assembly code").

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>    [4.9+]
2018-10-17 08:18:01 +02:00
Helge Deller 3c229b3f2d parisc: Fix map_pages() to not overwrite existing pte entries
Fix a long-existing small nasty bug in the map_pages() implementation which
leads to overwriting already written pte entries with zero, *if* map_pages() is
called a second time with an end address which isn't aligned on a pmd boundry.
This happens for example if we want to remap only the text segment read/write
in order to run alternative patching on the code. Exiting the loop when we
reach the end address fixes this.

Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
John David Anglin 4dd5b673fa parisc: Purge TLB entries after updating page table entry and set page accessed flag in TLB handler
This patch may resolve some races in TLB handling.  Hopefully, TLB
inserts are accesses and protected by spin lock.

If not, we may need to IPI calls and do local purges on PA 2.0.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
John David Anglin d27dfa13b9 parisc: Release spinlocks using ordered store
This patch updates the spin unlock code to use an ordered store with
release semanatics.  All prior accesses are guaranteed to be performed
before an ordered store is performed.

Using an ordered store is significantly faster than using the sync
memory barrier.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
Helge Deller e98bc5ee97 parisc: Clean up crash header output
On kernel crash, this is the current output:
Kernel Fault: Code=26 (Data memory access rights trap) regs=(ptrval) (Addr=00000004)

Drop the address of regs, it's of no use for debugging, and show the
faulty address without parenthesis.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
Helge Deller 8dbac7746e parisc: Add SYSTEM_INFO and REGISTER TOC PAT functions
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:01 +02:00
John David Anglin 32a7901f6d parisc: Remove PTE load and fault check from L2_ptep macro
This change removes the PTE load and present check from the L2_ptep
macro.  The load and check for kernel pages is now done in the tlb_lock
macro.  This avoids a double load and check for user pages.  The load
and check for user pages is now done inside the lock so the fault
handler can't be called while the entry is being updated.  This version
uses an ordered store to release the lock when the page table entry
isn't present.  It also corrects the check in the non SMP case.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:00 +02:00
John David Anglin a886c9791a parisc: Reorder TLB flush timing calculation
On boot (mostly reboot), my c8000 sometimes crashes after it prints the
TLB flush threshold.  The lockup is hard.  The front LED flashes red and
the box must be unplugged to reset the error.

I noticed that when the crash occurs the TLB flush threshold is about
one quarter what it is on a successful boot.  If I disabled the
calculation, the crash didn't occur.  There also seemed to be a timing
dependency affecting the crash.  I finally realized that the
flush_tlb_all() timing test runs just after the secondary CPUs are
started.  There seems to be a problem with running flush_tlb_all() too
soon after the CPUs are started.

The timing for the range test always seemed okay.  So, I reversed the
order of the two timing tests and I haven't had a crash at this point so
far.

I added a couple of information messages which I have left to help with
diagnosis if the problem should appear on another machine.

This version reduces the minimum TLB flush threshold to 16 KiB.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:00 +02:00
Masahiro Yamada c9dfa0c796 parisc: remove check for minimum required GCC version
Commit cafa0010cd ("Raise the minimum required gcc version to 4.6")
bumped the minimum GCC version to 4.6 for all architectures.

The version check in arch/parisc/Makefile is obsolete now.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:00 +02:00
Helge Deller cd2b852068 parisc: Use PARISC_ITLB_TRAP constant in entry.S
Fixes: 5b00ca0b80 ("parisc: Restore possibility to execute 64-bit applications")
Signed-off-by: Helge Deller <deller@gmx.de>
2018-10-17 08:18:00 +02:00
Jim Mattson c851436a34 kvm: x86: Add has_payload and payload to kvm_queued_exception
The payload associated with a #PF exception is the linear address of
the fault to be loaded into CR2 when the fault is delivered. The
payload associated with a #DB exception is a mask of the DR6 bits to
be set (or in the case of DR6.RTM, cleared) when the fault is
delivered. Add fields has_payload and payload to kvm_queued_exception
to track payloads for pending exceptions.

The new fields are introduced here, but for now, they are just cleared.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:22 +02:00
Paul Burton edbb4233e7
MIPS: Cleanup DSP ASE detection
Currently we hardcode a list of files for which we specify that the
toolchain has DSP ASE support when building for MIPSr2 only. This has a
number of problems:

  1) It doesn't actually ensure that the toolchain supports the DSP ASE
     at all.

  2) It's fragile if we try to use DSP ASE macros in other files.

  3) It makes no provision for MIPSr6 & later systems which also support
     the DSP ASE & end up using the .word directive implementation of
     the DSP macros.

Fix this by detecting assembler support for the DSP ASE globally, not
just for a small set of files, and not just for MIPSr2. This now exposes
use of toolchain DSP support to kernel builds targeting MIPSr1 and
older, so we add .set MIPS_ISA_LEVEL directives prior to all .set dsp
directives in order to prevent the assembler from complaining that the
DSP ASE is only supported with MIPSr2 & higher.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20901/
Cc: linux-mips@linux-mips.org
2018-10-16 15:30:21 -07:00
Vitaly Kuznetsov 8cab6507f6 x86/kvm/nVMX: nested state migration for Enlightened VMCS
Add support for get/set of nested state when Enlightened VMCS is in use.
A new KVM_STATE_NESTED_EVMCS flag to indicate eVMCS on the vCPU was enabled
is added.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:19 +02:00
Vitaly Kuznetsov a1b0c1c64d x86/kvm/nVMX: allow bare VMXON state migration
It is perfectly valid for a guest to do VMXON and not do VMPTRLD. This
state needs to be preserved on migration.

Cc: stable@vger.kernel.org
Fixes: 8fcc4b5923
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:18 +02:00
Vitaly Kuznetsov a7c42bb6da x86/kvm/lapic: preserve gfn_to_hva_cache len on cache reinit
vcpu->arch.pv_eoi is accessible through both HV_X64_MSR_VP_ASSIST_PAGE and
MSR_KVM_PV_EOI_EN so on migration userspace may try to restore them in any
order. Values match, however, kvm_lapic_enable_pv_eoi() uses different
length: for Hyper-V case it's the whole struct hv_vp_assist_page, for KVM
native case it is 8. In case we restore KVM-native MSR last cache will
be reinitialized with len=8 so trying to access VP assist page beyond
8 bytes with kvm_read_guest_cached() will fail.

Check if we re-initializing cache for the same address and preserve length
in case it was greater.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:17 +02:00
Vitaly Kuznetsov 12e0c6186b x86/kvm/hyperv: don't clear VP assist pages on init
VP assist pages may hold valuable data which needs to be preserved across
migration. Clean PV EOI portion of the data on init, the guest is
responsible for making sure there's no garbage in the rest.

This will be used for nVMX migration, eVMCS address needs to be preserved.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:17 +02:00
Vitaly Kuznetsov c4ebd6295a KVM: nVMX: optimize prepare_vmcs02{,_full} for Enlightened VMCS case
When Enlightened VMCS is in use by L1 hypervisor we can avoid vmwriting
VMCS fields which did not change.

Our first goal is to achieve minimal impact on traditional VMCS case so
we're not wrapping each vmwrite() with an if-changed checker. We also can't
utilize static keys as Enlightened VMCS usage is per-guest.

This patch implements the simpliest solution: checking fields in groups.
We skip single vmwrite() statements as doing the check will cost us
something even in non-evmcs case and the win is tiny. Unfortunately, this
makes prepare_vmcs02_full{,_full}() code Enlightened VMCS-dependent (and
a bit ugly).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:16 +02:00
Vitaly Kuznetsov b8bbab928f KVM: nVMX: implement enlightened VMPTRLD and VMCLEAR
Per Hyper-V TLFS 5.0b:

"The L1 hypervisor may choose to use enlightened VMCSs by writing 1 to
the corresponding field in the VP assist page (see section 7.8.7).
Another field in the VP assist page controls the currently active
enlightened VMCS. Each enlightened VMCS is exactly one page (4 KB) in
size and must be initially zeroed. No VMPTRLD instruction must be
executed to make an enlightened VMCS active or current.

After the L1 hypervisor performs a VM entry with an enlightened VMCS,
the VMCS is considered active on the processor. An enlightened VMCS
can only be active on a single processor at the same time. The L1
hypervisor can execute a VMCLEAR instruction to transition an
enlightened VMCS from the active to the non-active state. Any VMREAD
or VMWRITE instructions while an enlightened VMCS is active is
unsupported and can result in unexpected behavior."

Keep Enlightened VMCS structure for the current L2 guest permanently mapped
from struct nested_vmx instead of mapping it every time.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:16 +02:00
Vitaly Kuznetsov 945679e301 KVM: nVMX: add enlightened VMCS state
Adds hv_evmcs pointer and implement copy_enlightened_to_vmcs12() and
copy_enlightened_to_vmcs12().

prepare_vmcs02()/prepare_vmcs02_full() separation is not valid for
Enlightened VMCS, do full sync for now.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:15 +02:00
Vitaly Kuznetsov 57b119da35 KVM: nVMX: add KVM_CAP_HYPERV_ENLIGHTENED_VMCS capability
Enlightened VMCS is opt-in. The current version does not contain all
fields supported by nested VMX so we must not advertise the
corresponding VMX features if enlightened VMCS is enabled.

Userspace is given the enlightened VMCS version supported by KVM as
part of enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS. The version is to
be advertised to the nested hypervisor, currently done via a cpuid
leaf for Hyper-V.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:14 +02:00
Vitaly Kuznetsov 5d7a644336 KVM: VMX: refactor evmcs_sanitize_exec_ctrls()
Split off EVMCS1_UNSUPPORTED_* macros so we can re-use them when
enabling Enlightened VMCS for Hyper-V on KVM.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:14 +02:00
Ladi Prosek 72bbf9358c KVM: hyperv: define VP assist page helpers
The state related to the VP assist page is still managed by the LAPIC
code in the pv_eoi field.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:13 +02:00
Wei Yang e7912386ed KVM: x86: reintroduce pte_list_remove, but including mmu_spte_clear_track_bits
rmap_remove() removes the sptep after locating the correct rmap_head but,
in several cases, the caller has already known the correct rmap_head.

This patch introduces a new pte_list_remove(); because it is known that
the spte is present (or it would not have an rmap_head), it is safe
to remove the tracking bits without any previous check.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:12 +02:00
Wei Yang 8daf346226 KVM: x86: rename pte_list_remove to __pte_list_remove
This is a patch preparing for further change.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:11 +02:00
Peng Hao 39337ad1a7 kvm/x86 : fix some typo
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:10 +02:00
Lan Tianyu a5c214dad1 KVM/VMX: Change hv flush logic when ept tables are mismatched.
If ept table pointers are mismatched, flushing tlb for each vcpus via
hv flush interface still helps to reduce vmexits which are triggered
by IPI and INEPT emulation.

Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:09 +02:00
Uros Bizjak 44c2d667ce KVM/x86: Use 32bit xor to clear register
x86_64 zero-extends 32bit xor to a full 64bit register. Use %k asm
operand modifier to force 32bit register and save 268 bytes in kvm.o

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:09 +02:00
Uros Bizjak 4b1e54786e KVM/x86: Use assembly instruction mnemonics instead of .byte streams
Recently the minimum required version of binutils was changed to 2.20,
which supports all VMX instruction mnemonics. The patch removes
all .byte #defines and uses real instruction mnemonics instead.

The compiler is now able to pass memory operand to the instruction,
so there is no need for memory clobber anymore. Also, the compiler
adds CC register clobber automatically to all extended asm clauses,
so the patch also removes explicit CC clobber.

The immediate benefit of the patch is removal of many unnecesary
register moves, resulting in 1434 saved bytes in vmx.o:

   text    data     bss     dec     hex filename
 151257   18246    8500  178003   2b753 vmx.o
 152691   18246    8500  179437   2bced vmx-old.o

Some examples of improvement include removal of unneeded moves
of %rsp to %rax in front of invept and invvpid instructions:

    a57e:	b9 01 00 00 00       	mov    $0x1,%ecx
    a583:	48 89 04 24          	mov    %rax,(%rsp)
    a587:	48 89 e0             	mov    %rsp,%rax
    a58a:	48 c7 44 24 08 00 00 	movq   $0x0,0x8(%rsp)
    a591:	00 00
    a593:	66 0f 38 80 08       	invept (%rax),%rcx

to:

    a45c:	48 89 04 24          	mov    %rax,(%rsp)
    a460:	b8 01 00 00 00       	mov    $0x1,%eax
    a465:	48 c7 44 24 08 00 00 	movq   $0x0,0x8(%rsp)
    a46c:	00 00
    a46e:	66 0f 38 80 04 24    	invept (%rsp),%rax

and the ability to use more optimal registers and memory operands
in the instruction:

    8faa:	48 8b 44 24 28       	mov    0x28(%rsp),%rax
    8faf:	4c 89 c2             	mov    %r8,%rdx
    8fb2:	0f 79 d0             	vmwrite %rax,%rdx

to:

    8e7c:	44 0f 79 44 24 28    	vmwrite 0x28(%rsp),%r8

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:08 +02:00
Uros Bizjak 5ebb272b2e KVM/x86: Fix invvpid and invept register operand size in 64-bit mode
Register operand size of invvpid and invept instruction in 64-bit mode
has always 64 bits. Adjust inline function argument type to reflect
correct size.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:07 +02:00
Vitaly Kuznetsov bf627a9288 x86/kvm/mmu: check if MMU reconfiguration is needed in init_kvm_nested_mmu()
We don't use root page role for nested_mmu, however, optimizing out
re-initialization in case nothing changed is still valuable as this
is done for every nested vmentry.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:07 +02:00
Vitaly Kuznetsov 7dcd575520 x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed
MMU reconfiguration in init_kvm_tdp_mmu()/kvm_init_shadow_mmu() can be
avoided if the source data used to configure it didn't change; enhance
MMU extended role with the required fields and consolidate common code in
kvm_calc_mmu_role_common().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:06 +02:00
Vitaly Kuznetsov a336282d77 x86/kvm/nVMX: introduce source data cache for kvm_init_shadow_ept_mmu()
MMU re-initialization is expensive, in particular,
update_permission_bitmask() and update_pkru_bitmask() are.

Cache the data used to setup shadow EPT MMU and avoid full re-init when
it is unchanged.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:06 +02:00
Vitaly Kuznetsov 36d9594dfb x86/kvm/mmu: make space for source data caching in struct kvm_mmu
In preparation to MMU reconfiguration avoidance we need a space to
cache source data. As this partially intersects with kvm_mmu_page_role,
create 64bit sized union kvm_mmu_role holding both base and extended data.
No functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:05 +02:00
Paolo Bonzini e173299101 x86/kvm/mmu: get rid of redundant kvm_mmu_setup()
Just inline the contents into the sole caller, kvm_init_mmu is now
public.

Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
2018-10-17 00:30:04 +02:00
Vitaly Kuznetsov 14c07ad89f x86/kvm/mmu: introduce guest_mmu
When EPT is used for nested guest we need to re-init MMU as shadow
EPT MMU (nested_ept_init_mmu_context() does that). When we return back
from L2 to L1 kvm_mmu_reset_context() in nested_vmx_load_cr3() resets
MMU back to normal TDP mode. Add a special 'guest_mmu' so we can use
separate root caches; the improved hit rate is not very important for
single vCPU performance, but it avoids contention on the mmu_lock for
many vCPUs.

On the nested CPUID benchmark, with 16 vCPUs, an L2->L1->L2 vmexit
goes from 42k to 26k cycles.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:04 +02:00
Vitaly Kuznetsov 6a82cd1c7b x86/kvm/mmu.c: add kvm_mmu parameter to kvm_mmu_free_roots()
Add an option to specify which MMU root we want to free. This will
be used when nested and non-nested MMUs for L1 are split.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
2018-10-17 00:30:03 +02:00
Vitaly Kuznetsov 3dc773e745 x86/kvm/mmu.c: set get_pdptr hook in kvm_init_shadow_ept_mmu()
kvm_init_shadow_ept_mmu() doesn't set get_pdptr() hook and is this
not a problem just because MMU context is already initialized and this
hook points to kvm_pdptr_read(). As we're intended to use a dedicated
MMU for shadow EPT MMU set this hook explicitly.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
2018-10-17 00:30:03 +02:00
Vitaly Kuznetsov 44dd3ffa7b x86/kvm/mmu: make vcpu->mmu a pointer to the current MMU
As a preparation to full MMU split between L1 and L2 make vcpu->arch.mmu
a pointer to the currently used mmu. For now, this is always
vcpu->arch.root_mmu. No functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
2018-10-17 00:30:02 +02:00
Paolo Bonzini 0e0a53c551 kvm: x86: optimize dr6 restore
The quote from the comment almost says it all: we are currently zeroing
the guest dr6 in kvm_arch_vcpu_put, because do_debug expects it.  However,
the host %dr6 is either:

- zero because the guest hasn't run after kvm_arch_vcpu_load

- written from vcpu->arch.dr6 by vcpu_enter_guest

- written by the guest and copied to vcpu->arch.dr6 by ->sync_dirty_debug_regs().

Therefore, we can skip the write if vcpu->arch.dr6 is already zero.  We
may do extra useless writes if vcpu->arch.dr6 is nonzero but the guest
hasn't run; however that is less important for performance.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:02 +02:00
Vitaly Kuznetsov f21dd49450 KVM: x86: hyperv: optimize sparse VP set processing
Rewrite kvm_hv_flush_tlb()/send_ipi_vcpus_mask() making them cleaner and
somewhat more optimal.

hv_vcpu_in_sparse_set() is converted to sparse_set_to_vcpu_mask()
which copies sparse banks u64-at-a-time and then, depending on the
num_mismatched_vp_indexes value, returns immediately or does
vp index to vcpu index conversion by walking all vCPUs.

To support the change and make kvm_hv_send_ipi() look similar to
kvm_hv_flush_tlb() send_ipi_vcpus_mask() is introduced.

Suggested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:01 +02:00
Vitaly Kuznetsov e6b6c483eb KVM: x86: hyperv: fix 'tlb_lush' typo
Regardless of whether your TLB is lush or not it still needs flushing.

Reported-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:00 +02:00
Sean Christopherson 2768c0cc4a KVM: nVMX: WARN if nested run hits VMFail with early consistency checks enabled
When early consistency checks are enabled, all VMFail conditions
should be caught by nested_vmx_check_vmentry_hw().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:00 +02:00
Sean Christopherson 52017608da KVM: nVMX: add option to perform early consistency checks via H/W
KVM defers many VMX consistency checks to the CPU, ostensibly for
performance reasons[1], including checks that result in VMFail (as
opposed to VMExit).  This behavior may be undesirable for some users
since this means KVM detects certain classes of VMFail only after it
has processed guest state, e.g. emulated MSR load-on-entry.  Because
there is a strict ordering between checks that cause VMFail and those
that cause VMExit, i.e. all VMFail checks are performed before any
checks that cause VMExit, we can detect (almost) all VMFail conditions
via a dry run of sorts.  The almost qualifier exists because some
state in vmcs02 comes from L0, e.g. VPID, which means that hardware
will never detect an invalid VPID in vmcs12 because it never sees
said value.  Software must (continue to) explicitly check such fields.

After preparing vmcs02 with all state needed to pass the VMFail
consistency checks, optionally do a "test" VMEnter with an invalid
GUEST_RFLAGS.  If the VMEnter results in a VMExit (due to bad guest
state), then we can safely say that the nested VMEnter should not
VMFail, i.e. any VMFail encountered in nested_vmx_vmexit() must
be due to an L0 bug.  GUEST_RFLAGS is used to induce VMExit as it
is unconditionally loaded on all implementations of VMX, has an
invalid value that is writable on a 32-bit system and its consistency
check is performed relatively early in all implementations (the exact
order of consistency checks is micro-architectural).

Unfortunately, since the "passing" case causes a VMExit, KVM must
be extra diligent to ensure that host state is restored, e.g. DR7
and RFLAGS are reset on VMExit.  Failure to restore RFLAGS.IF is
particularly fatal.

And of course the extra VMEnter and VMExit impacts performance.
The raw overhead of the early consistency checks is ~6% on modern
hardware (though this could easily vary based on configuration),
while the added latency observed from the L1 VMM is ~10%.  The
early consistency checks do not occur in a vacuum, e.g. spending
more time in L0 can lead to more interrupts being serviced while
emulating VMEnter, thereby increasing the latency observed by L1.

Add a module param, early_consistency_checks, to provide control
over whether or not VMX performs the early consistency checks.
In addition to standard on/off behavior, the param accepts a value
of -1, which is essentialy an "auto" setting whereby KVM does
the early checks only when it thinks it's running on bare metal.
When running nested, doing early checks is of dubious value since
the resulting behavior is heavily dependent on L0.  In the future,
the "auto" setting could also be used to default to skipping the
early hardware checks for certain configurations/platforms if KVM
reaches a state where it has 100% coverage of VMFail conditions.

[1] To my knowledge no one has implemented and tested full software
    emulation of the VMFail consistency checks.  Until that happens,
    one can only speculate about the actual performance overhead of
    doing all VMFail consistency checks in software.  Obviously any
    code is slower than no code, but in the grand scheme of nested
    virtualization it's entirely possible the overhead is negligible.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:59 +02:00
Sean Christopherson 5a5e8a15d7 KVM: vmx: write HOST_IA32_EFER in vmx_set_constant_host_state()
EFER is constant in the host and writing it once during setup means
we can skip writing the host value in add_atomic_switch_msr_special().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:59 +02:00
Sean Christopherson 09abb5e3e5 KVM: nVMX: call kvm_skip_emulated_instruction in nested_vmx_{fail,succeed}
... as every invocation of nested_vmx_{fail,succeed} is immediately
followed by a call to kvm_skip_emulated_instruction().  This saves
a bit of code and eliminates some silly paths, e.g. nested_vmx_run()
ended up with a goto label purely used to call and return
kvm_skip_emulated_instruction().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:58 +02:00
Sean Christopherson c37a6116d8 KVM: nVMX: do not call nested_vmx_succeed() for consistency check VMExit
EFLAGS is set to a fixed value on VMExit, calling nested_vmx_succeed()
is unnecessary and wrong.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:57 +02:00
Sean Christopherson cb61de2f48 KVM: nVMX: do not skip VMEnter instruction that succeeds
A successful VMEnter is essentially a fancy indirect branch that
pulls the target RIP from the VMCS.  Skipping the instruction is
unnecessary (RIP will get overwritten by the VMExit handler) and
is problematic because it can incorrectly suppress a #DB due to
EFLAGS.TF when a VMFail is detected by hardware (happens after we
skip the instruction).

Now that vmx_nested_run() is not prematurely skipping the instr,
use the full kvm_skip_emulated_instruction() in the VMFail path
of nested_vmx_vmexit().  We also need to explicitly update the
GUEST_INTERRUPTIBILITY_INFO when loading vmcs12 host state.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:57 +02:00
Sean Christopherson 16fb9a46c5 KVM: nVMX: do early preparation of vmcs02 before check_vmentry_postreqs()
In anticipation of using vmcs02 to do early consistency checks, move
the early preparation of vmcs02 prior to checking the postreqs.  The
downside of this approach is that we'll unnecessary load vmcs02 in
the case that check_vmentry_postreqs() fails, but that is essentially
our slow path anyways (not actually slow, but it's the path we don't
really care about optimizing).

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:56 +02:00
Sean Christopherson 9d6105b2b5 KVM: nVMX: initialize vmcs02 constant exactly once (per VMCS)
Add a dedicated flag to track if vmcs02 has been initialized, i.e.
the constant state for vmcs02 has been written to the backing VMCS.
The launched flag (in struct loaded_vmcs) gets cleared on logical
CPU migration to mirror hardware behavior[1], i.e. using the launched
flag to determine whether or not vmcs02 constant state needs to be
initialized results in unnecessarily re-initializing the VMCS when
migrating between logical CPUS.

[1] The active VMCS needs to be VMCLEARed before it can be migrated
    to a different logical CPU.  Hardware's VMCS cache is per-CPU
    and is not coherent between CPUs.  VMCLEAR flushes the cache so
    that any dirty data is written back to memory.  A side effect
    of VMCLEAR is that it also clears the VMCS's internal launch
    flag, which KVM must mirror because VMRESUME must be used to
    run a previously launched VMCS.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:56 +02:00
Sean Christopherson 09abe32002 KVM: nVMX: split pieces of prepare_vmcs02() to prepare_vmcs02_early()
Add prepare_vmcs02_early() and move pieces of prepare_vmcs02() to the
new function.  prepare_vmcs02_early() writes the bits of vmcs02 that
a) must be in place to pass the VMFail consistency checks (assuming
vmcs12 is valid) and b) are needed recover from a VMExit, e.g. host
state that is loaded on VMExit.  Splitting the functionality will
enable KVM to leverage hardware to do VMFail consistency checks via
a dry run of VMEnter and recover from a potential VMExit without
having to fully initialize vmcs02.

Add prepare_vmcs02_constant_state() to handle writing vmcs02 state that
comes from vmcs01 and never changes, i.e. we don't need to rewrite any
of the vmcs02 that is effectively constant once defined.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:55 +02:00
Sean Christopherson 860ff2aa84 KVM: VMX: remove ASSERT() on vmx->pml_pg validity
vmx->pml_pg is allocated by vmx_create_vcpu() and is only nullified
when the vCPU is destroyed by vmx_free_vcpu().  Remove the ASSERTs
on vmx->pml_pg, there is no need to carry debug code that provides
no value to the current code base.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:55 +02:00
Sean Christopherson 39f9c3885c KVM: vVMX: rename label for post-enter_guest_mode consistency check
Rename 'fail' to 'vmentry_fail_vmexit_guest_mode' to make it more
obvious that it's simply a different entry point to the VMExit path,
whose purpose is unwind the updates done prior to calling
prepare_vmcs02().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:54 +02:00
Sean Christopherson a633e41e73 KVM: nVMX: assimilate nested_vmx_entry_failure() into nested_vmx_enter_non_root_mode()
Handling all VMExits due to failed consistency checks on VMEnter in
nested_vmx_enter_non_root_mode() consolidates all relevant code into
a single location, and removing nested_vmx_entry_failure() eliminates
a confusing function name and label.  For a VMEntry, "fail" and its
derivatives has a very specific meaning due to the different behavior
of a VMEnter VMFail versus VMExit, i.e. it wasn't obvious that
nested_vmx_entry_failure() handled VMExit scenarios.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:53 +02:00
Sean Christopherson 7671ce21b1 KVM: nVMX: move check_vmentry_postreqs() call to nested_vmx_enter_non_root_mode()
In preparation of supporting checkpoint/restore for nested state,
commit ca0bde28f2 ("kvm: nVMX: Split VMCS checks from nested_vmx_run()")
modified check_vmentry_postreqs() to only perform the guest EFER
consistency checks when nested_run_pending is true.  But, in the
normal nested VMEntry flow, nested_run_pending is only set after
check_vmentry_postreqs(), i.e. the consistency check is being skipped.

Alternatively, nested_run_pending could be set prior to calling
check_vmentry_postreqs() in nested_vmx_run(), but placing the
consistency checks in nested_vmx_enter_non_root_mode() allows us
to split prepare_vmcs02() and interleave the preparation with
the consistency checks without having to change the call sites
of nested_vmx_enter_non_root_mode().  In other words, the rest
of the consistency check code in nested_vmx_run() will be joining
the postreqs checks in future patches.

Fixes: ca0bde28f2 ("kvm: nVMX: Split VMCS checks from nested_vmx_run()")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jim Mattson <jmattson@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:53 +02:00
Sean Christopherson d63907dc7d KVM: nVMX: rename enter_vmx_non_root_mode to nested_vmx_enter_non_root_mode
...to be more consistent with the nested VMX nomenclature.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:52 +02:00
Sean Christopherson 3df5c37e55 KVM: nVMX: try to set EFER bits correctly when initializing controls
VM_ENTRY_IA32E_MODE and VM_{ENTRY,EXIT}_LOAD_IA32_EFER will be
explicitly set/cleared as needed by vmx_set_efer(), but attempt
to get the bits set correctly when intializing the control fields.
Setting the value correctly can avoid multiple VMWrites.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:52 +02:00
Sean Christopherson 02343cf207 KVM: vmx: do not unconditionally clear EFER switching
Do not unconditionally call clear_atomic_switch_msr() when updating
EFER.  This adds up to four unnecessary VMWrites in the case where
guest_efer != host_efer, e.g. if the load_on_{entry,exit} bits were
already set.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:51 +02:00
Sean Christopherson b7031fd40f KVM: nVMX: reset cache/shadows when switching loaded VMCS
Reset the vm_{entry,exit}_controls_shadow variables as well as the
segment cache after loading a new VMCS in vmx_switch_vmcs().  The
shadows/cache track VMCS data, i.e. they're stale every time we
switch to a new VMCS regardless of reason.

This fixes a bug where stale control shadows would be consumed after
a nested VMExit due to a failed consistency check.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:50 +02:00
Sean Christopherson 1abf23fb42 KVM: nVMX: use vm_exit_controls_init() to write exit controls for vmcs02
Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
due to an incorrect shadow optimization, which can crash L0 due to
EFER not being loaded on exit.  This does not occur with the current
code base simply because update_transition_efer() unconditionally
clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
until someone optimizes update_transition_efer()...

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:50 +02:00
Sean Christopherson 5b8ba41daf KVM: nVMX: move vmcs12 EPTP consistency check to check_vmentry_prereqs()
An invalid EPTP causes a VMFail(VMXERR_ENTRY_INVALID_CONTROL_FIELD),
not a VMExit.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:49 +02:00
Sean Christopherson 64a919f7b5 KVM: nVMX: move host EFER consistency checks to VMFail path
Invalid host state related to loading EFER on VMExit causes a
VMFail(VMXERR_ENTRY_INVALID_HOST_STATE_FIELD), not a VMExit.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:49 +02:00
Jim Mattson 3c6e099fa1 KVM: nVMX: Always reflect #NM VM-exits to L1
When bit 3 (corresponding to CR0.TS) of the VMCS12 cr0_guest_host_mask
field is clear, the VMCS12 guest_cr0 field does not necessarily hold
the current value of the L2 CR0.TS bit, so the code that checked for
L2's CR0.TS bit being set was incorrect. Moreover, I'm not sure that
the CR0.TS check was adequate. (What if L2's CR0.EM was set, for
instance?)

Fortunately, lazy FPU has gone away, so L0 has lost all interest in
intercepting #NM exceptions. See commit bd7e5b0899 ("KVM: x86:
remove code for lazy FPU handling"). Therefore, there is no longer any
question of which hypervisor gets first dibs. The #NM VM-exit should
always be reflected to L1. (Note that the corresponding bit must be
set in the VMCS12 exception_bitmap field for there to be an #NM
VM-exit at all.)

Fixes: ccf9844e5d ("kvm, vmx: Really fix lazy FPU on nested guest")
Reported-by: Abhiroop Dabral <adabral@paloaltonetworks.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Tested-by: Abhiroop Dabral <adabral@paloaltonetworks.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:47 +02:00