alistair23-linux

redonkable

Author	SHA1	Message	Date
Linus Torvalds	753c8d9b7d	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A collection of assorted fixes: - Fix for the pinned cr0/4 fallout which escaped all testing efforts because the kvm-intel module was never loaded when the kernel was compiled with CONFIG_PARAVIRT=n. The cr0/4 accessors are moved out of line and static key is now solely used in the core code and therefore can stay in the RO after init section. So the kvm-intel and other modules do not longer reference the (read only) static key which the module loader tried to update. - Prevent an infinite loop in arch_stack_walk_user() by breaking out of the loop once the return address is detected to be 0. - Prevent the int3_emulate_call() selftest from corrupting the stack when KASAN is enabled. KASASN clobbers more registers than covered by the emulated call implementation. Convert the int3_magic() selftest to a ASM function so the compiler cannot KASANify it. - Unbreak the build with old GCC versions and with the Gold linker by reverting the 'Move of _etext to the actual end of .text'. In both cases the build fails with 'Invalid absolute R_X86_64_32S relocation: _etext' - Initialize the context lock for init_mm, which was never an issue until the alternatives code started to use a temporary mm for patching. - Fix a build warning vs. the LOWMEM_PAGES constant where clang complains rightfully about a signed integer overflow in the shift operation by converting the operand to an ULL. - Adjust the misnamed ENDPROC() of common_spurious in the 32bit entry code" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/stacktrace: Prevent infinite loop in arch_stack_walk_user() x86/asm: Move native_write_cr0/4() out of line x86/pgtable/32: Fix LOWMEM_PAGES constant x86/alternatives: Fix int3_emulate_call() selftest stack corruption x86/entry/32: Fix ENDPROC of common_spurious Revert "x86/build: Move _etext to actual end of .text" x86/ldt: Initialize the context lock for init_mm	2019-07-11 13:54:00 -07:00
Atish Patra	0f327f2aaa	RISC-V: Add an Image header that boot loader can parse. Currently, the last stage boot loaders such as U-Boot can accept only uImage which is an unnecessary additional step in automating boot process. Add an image header that boot loader understands and boot Linux from flat Image directly. This header is based on ARM64 boot image header and provides an opportunity to combine both ARM64 & RISC-V image headers in future. Also make sure that PE/COFF header can co-exist in the same image so that EFI stub can be supported for RISC-V in future. EFI specification needs PE/COFF image header in the beginning of the kernel image in order to load it as an EFI application. In order to support EFI stub, code0 should be replaced with "MZ" magic string and res4(at offset 0x3c) should point to the rest of the PE/COFF header (which will be added during EFI support). Tested on both QEMU and HiFive Unleashed using OpenSBI + U-Boot + Linux. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Karsten Merker <merker@debian.org> Tested-by: Karsten Merker <merker@debian.org> (QEMU+OpenSBI+U-Boot) Tested-by: Kevin Hilman <khilman@baylibre.com> (OpenSBI + U-Boot + Linux) [paul.walmsley@sifive.com: fixed whitespace in boot-image-header.txt; converted structure comment to kernel-doc format and added some detail] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-07-11 11:58:20 -07:00
Vasily Gorbik	9a15919041	s390/unwind: avoid int overflow in outside_of_stack When current task is interrupted in-between stack frame allocation and backchain write instructions new stack frame backchain pointer is left uninitialized. That invalid backchain value is passed into outside_of_stack for sanity check. Make sure int overflow does not happen by subtracting stack_frame size from the stack "end" rather than adding it to "random" backchain value. Fixes: 41b0474c1b1c ("s390/unwind: introduce stack unwind API") Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:40:02 +02:00
Sebastian Ott	8e4708b3f8	s390/pci: add mio_enabled attribute Provide an attribute to query the usage of mio instructions. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:40:02 +02:00
Sebastian Ott	9964f396f1	s390: fix setting of mio addressing control Move enablement of mio addressing control from detect_machine_facilities to pci_base_init. detect_machine_facilities runs so early that the static branches have not been toggled yet, thus mio addressing control was always off. In pci_base_init we have to use the SMP aware ctl_set_bit though. Fixes: `833b441ec0` ("s390: enable processes for mio instructions") Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:40:02 +02:00
Philipp Rudo	1b2be2071a	s390/ipl: Fix detection of has_secure attribute Use the correct bit for detection of the machine capability associated with the has_secure attribute. It is expected that the underlying platform (including hypervisors) unsets the bit when they don't provide secure ipl for their guests. Fixes: `c9896acc78` ("s390/ipl: Provide has_secure sysfs attribute") Cc: stable@vger.kernel.org # 5.2 Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:40:02 +02:00
Thomas Richter	820bace734	s390/cpumf: Add extended counter set definitions for model 8561 and 8562 Add the extended counter set definitions for s390 machine types 8561 and 8262. They are identical with machine types 3906 and 3907. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:40:01 +02:00
Jan Höppner	91dc4a1975	s390/dasd: Add new ioctl to release space Userspace tools might have the need to release space for Extent Space Efficient (ESE) volumes when working with such a device. Provide the necessarry interface for such a task by implementing a new ioctl BIODASDRAS. The ioctl uses the format_data_t data structure for data input: typedef struct format_data_t { unsigned int start_unit; /* from track / unsigned int stop_unit; / to track / unsigned int blksize; / sectorsize */ unsigned int intensity; } format_data_t; If the intensity is set to 0x40, start_unit and stop_unit are ignored and space for the entire volume is released. Otherwise, if intensity is set to 0, the respective range is released (if possible). Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:39:54 +02:00
Jan Höppner	d7a4434d60	s390/dasd: Add missing intensity definition The definition for the bit that removes the write permission for record zero when formatting was missing. Add it to complete the list. Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:39:54 +02:00
Jan Höppner	2df4774cb4	s390/dasd: Fix whitespace Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-11 20:39:53 +02:00
Linus Torvalds	237f83dfbe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Some highlights from this development cycle: 1) Big refactoring of ipv6 route and neigh handling to support nexthop objects configurable as units from userspace. From David Ahern. 2) Convert explored_states in BPF verifier into a hash table, significantly decreased state held for programs with bpf2bpf calls, from Alexei Starovoitov. 3) Implement bpf_send_signal() helper, from Yonghong Song. 4) Various classifier enhancements to mvpp2 driver, from Maxime Chevallier. 5) Add aRFS support to hns3 driver, from Jian Shen. 6) Fix use after free in inet frags by allocating fqdirs dynamically and reworking how rhashtable dismantle occurs, from Eric Dumazet. 7) Add act_ctinfo packet classifier action, from Kevin Darbyshire-Bryant. 8) Add TFO key backup infrastructure, from Jason Baron. 9) Remove several old and unused ISDN drivers, from Arnd Bergmann. 10) Add devlink notifications for flash update status to mlxsw driver, from Jiri Pirko. 11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski. 12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes. 13) Various enhancements to ipv6 flow label handling, from Eric Dumazet and Willem de Bruijn. 14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van der Merwe, and others. 15) Various improvements to axienet driver including converting it to phylink, from Robert Hancock. 16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean. 17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana Radulescu. 18) Add devlink health reporting to mlx5, from Moshe Shemesh. 19) Convert stmmac over to phylink, from Jose Abreu. 20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from Shalom Toledo. 21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera. 22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel. 23) Track spill/fill of constants in BPF verifier, from Alexei Starovoitov. 24) Support bounded loops in BPF, from Alexei Starovoitov. 25) Various page_pool API fixes and improvements, from Jesper Dangaard Brouer. 26) Just like ipv4, support ref-countless ipv6 route handling. From Wei Wang. 27) Support VLAN offloading in aquantia driver, from Igor Russkikh. 28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy. 29) Add flower GRE encap/decap support to nfp driver, from Pieter Jansen van Vuuren. 30) Protect against stack overflow when using act_mirred, from John Hurley. 31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen. 32) Use page_pool API in netsec driver, Ilias Apalodimas. 33) Add Google gve network driver, from Catherine Sullivan. 34) More indirect call avoidance, from Paolo Abeni. 35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan. 36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek. 37) Add MPLS manipulation actions to TC, from John Hurley. 38) Add sending a packet to connection tracking from TC actions, and then allow flower classifier matching on conntrack state. From Paul Blakey. 39) Netfilter hw offload support, from Pablo Neira Ayuso" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits) net/mlx5e: Return in default case statement in tx_post_resync_params mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync(). net: dsa: add support for BRIDGE_MROUTER attribute pkt_sched: Include const.h net: netsec: remove static declaration for netsec_set_tx_de() net: netsec: remove superfluous if statement netfilter: nf_tables: add hardware offload support net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload net: flow_offload: add flow_block_cb_is_busy() and use it net: sched: remove tcf block API drivers: net: use flow block API net: sched: use flow block API net: flow_offload: add flow_block_cb_{priv, incref, decref}() net: flow_offload: add list handling functions net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free() net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND net: flow_offload: add flow_block_cb_setup_simple() net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC ...	2019-07-11 10:55:49 -07:00
Linus Torvalds	8f6ccf6159	clone3-v5.3 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSMhhgAKCRCRxhvAZXjc or7kAP9VzDcQaK/WoDd2ezh2C7Wh5hNy9z/qJVCa6Tb+N+g1UgEAxbhFUg55uGOA JNf7fGar5JF5hBMIXR+NqOi1/sb4swg= =ELWo -----END PGP SIGNATURE----- Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull clone3 system call from Christian Brauner: "This adds the clone3 syscall which is an extensible successor to clone after we snagged the last flag with CLONE_PIDFD during the 5.2 merge window for clone(). It cleanly supports all of the flags from clone() and thus all legacy workloads. There are few user visible differences between clone3 and clone. First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse this flag. Second, the CSIGNAL flag is deprecated and will cause EINVAL to be reported. It is superseeded by a dedicated "exit_signal" argument in struct clone_args thus freeing up even more flags. And third, clone3 gives CLONE_PIDFD a dedicated return argument in struct clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr argument. The clone3 uapi is designed to be easy to handle on 32- and 64 bit: /* uapi / struct clone_args { __aligned_u64 flags; __aligned_u64 pidfd; __aligned_u64 child_tid; __aligned_u64 parent_tid; __aligned_u64 exit_signal; __aligned_u64 stack; __aligned_u64 stack_size; __aligned_u64 tls; }; and a separate kernel struct is used that uses proper kernel typing: / kernel internal / struct kernel_clone_args { u64 flags; int __user pidfd; int __user child_tid; int __user parent_tid; int exit_signal; unsigned long stack; unsigned long stack_size; unsigned long tls; }; The system call comes with a size argument which enables the kernel to detect what version of clone_args userspace is passing in. clone3 validates that any additional bytes a given kernel does not know about are set to zero and that the size never exceeds a page. A nice feature is that this patchset allowed us to cleanup and simplify various core kernel codepaths in kernel/fork.c by making the internal _do_fork() function take struct kernel_clone_args even for legacy clone(). This patch also unblocks the time namespace patchset which wants to introduce a new CLONE_TIMENS flag. Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and xtensa. These were the architectures that did not require special massaging. Other architectures treat fork-like system calls individually and after some back and forth neither Arnd nor I felt confident that we dared to add clone3 unconditionally to all architectures. We agreed to leave this up to individual architecture maintainers. This is why there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3 which any architecture can set once it has implemented support for clone3. The patch also adds a cond_syscall(clone3) for architectures such as nios2 or h8300 that generate their syscall table by simply including asm-generic/unistd.h. The hope is to get rid of __ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon" * tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: arch: handle arches who do not yet define clone3 arch: wire-up clone3() syscall fork: add clone3	2019-07-11 10:09:44 -07:00
Oliver O'Halloran	3343962068	powerpc/eeh: Handle hugepages in ioremap space In commit 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap space") support for using hugepages in the vmalloc and ioremap areas was enabled for radix. Unfortunately this broke EEH MMIO error checking. Detection works by inserting a hook which checks the results of the ioreadXX() set of functions. When a read returns a 0xFFs response we need to check for an error which we do by mapping the (virtual) MMIO address back to a physical address, then mapping physical address to a PCI device via an interval tree. When translating virt -> phys we currently assume the ioremap space is only populated by PAGE_SIZE mappings. If a hugepage mapping is found we emit a WARN_ON(), but otherwise handles the check as though a normal page was found. In pathalogical cases such as copying a buffer containing a lot of 0xFFs from BAR memory this can result in the system not booting because it's too busy printing WARN_ON()s. There's no real reason to assume huge pages can't be present and we're prefectly capable of handling them, so do that. Fixes: 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap space") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190710150517.27114-1-oohall@gmail.com	2019-07-12 01:02:09 +10:00
Paolo Bonzini	a45ff5994c	KVM/arm updates for 5.3 - Add support for chained PMU counters in guests - Improve SError handling - Handle Neoverse N1 erratum #1349291 - Allow side-channel mitigation status to be migrated - Standardise most AArch64 system register accesses to msr_s/mrs_s - Fix host MPIDR corruption on 32bit -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl0kge4VHG1hcmMuenlu Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDYyQP/3XY5tFcLKkp/h9rnGaCXwAxhNzn TyF/IZEFBKFTSoDMXKLLc8KllvoPQ7aUl03heYbuayYpyKR1+LCx7lDwu1MYyEf+ aSSuOKlbG//tLUEGp09pTRCgjs2mhhZYqOj5GF2mZ7xpovFVSNOPzTazbXDNQ7tw zUAs43YNg+bUMwj+SLWpBlizjrLr7T34utIr6daKJE/GSfmIrcYXhGbZqUh0zbO0 z5LNasebws8/pHyeGI7+/yoMIKaQ8foMgywTpsRpBsx6YI+AbOLjEmCk2IBOPcEK pm9KkSIBZEO2CSxZKl3NQiEow/Qd/lnz2xLMCSfh4XrYoI2Th4gNcsbJpiBDWP5a 0eZ5jSiexxKngIbM+to7jR3m0yc9RgcuzceJg3Uly7Ya0vb5RqKwOX4Ge4XP4VDT DzIVFdQjxDKdVIf3EvGp1cj4P7dRUU3xbZcbzyuRPEmT3vgjEnbxawmPLs3QMAl1 31Wd2wIsPB86kSxzSMel27Vs5VgMhgyHE26zN91R745CvhDXaDKydIWjGjdVMHsB GuX/h2kL+ohx+N/OpZPgwsVUAGLSOQFP3pE/EcGtqc2kkfqa+bx12DKcZ3zdmJvy +cu5ixU8q5thPH/pZob/C3hKUY/eLy02emS34RK0Jh2sZHbQgAOtMsiqUxNHEjUm 6TkpdWa5SRd7CtGV =yfCs -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for 5.3 - Add support for chained PMU counters in guests - Improve SError handling - Handle Neoverse N1 erratum #1349291 - Allow side-channel mitigation status to be migrated - Standardise most AArch64 system register accesses to msr_s/mrs_s - Fix host MPIDR corruption on 32bit	2019-07-11 15:14:16 +02:00
Sean Christopherson	d7a08882a0	KVM: x86: Unconditionally enable irqs in guest context On VMX, KVM currently does not re-enable irqs until after it has exited the guest context. As a result, a tick that fires in the window between VM-Exit and guest_exit_irqoff() will be accounted as system time. While said window is relatively small, it's large enough to be problematic in some configurations, e.g. if VM-Exits are consistently occurring a hair earlier than the tick irq. Intentionally toggle irqs back off so that guest_exit_irqoff() can be used in lieu of guest_exit() in order to avoid the save/restore of flags in guest_exit(). On my Haswell system, "nop; cli; sti" is ~6 cycles, versus ~28 cycles for "pushf; pop <reg>; cli; push <reg>; popf". Fixes: `f2485b3e0c` ("KVM: x86: use guest_exit_irqoff") Reported-by: Wei Yang <w90p710@gmail.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-11 15:10:22 +02:00
Eric Hankland	66bb8a065f	KVM: x86: PMU Event Filter Some events can provide a guest with information about other guests or the host (e.g. L3 cache stats); providing the capability to restrict access to a "safe" set of events would limit the potential for the PMU to be used in any side channel attacks. This change introduces a new VM ioctl that sets an event filter. If the guest attempts to program a counter for any blacklisted or non-whitelisted event, the kernel counter won't be created, so any RDPMC/RDMSR will show 0 instances of that event. Signed-off-by: Eric Hankland <ehankland@google.com> [Lots of changes. All remaining bugs are probably mine. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-11 15:08:28 +02:00
Eiichi Tsukata	cbf5b73d16	x86/stacktrace: Prevent infinite loop in arch_stack_walk_user() arch_stack_walk_user() checks `if (fp == frame.next_fp)` to prevent a infinite loop by self reference but it's not enogh for circular reference. Once a lack of return address is found, there is no point to continue the loop, so break out. Fixes: `02b67518e2` ("tracing: add support for userspace stacktraces in tracing/iter_ctrl") Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20190711023501.963-1-devel@etsukata.com	2019-07-11 08:22:03 +02:00
Linus Torvalds	5450e8a316	pidfd-updates-v5.3 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSMhUgAKCRCRxhvAZXjc okkiAQC3Hlg/O2JoIb4PqgEvBkpHSdVxyuWagn0ksjACW9ANKQEAl5OadMhvOq16 UHGhKlpE/M8HflknIffoEGlIAWHrdwU= =7kP5 -----END PGP SIGNATURE----- Merge tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull pidfd updates from Christian Brauner: "This adds two main features. - First, it adds polling support for pidfds. This allows process managers to know when a (non-parent) process dies in a race-free way. The notification mechanism used follows the same logic that is currently used when the parent of a task is notified of a child's death. With this patchset it is possible to put pidfds in an {e}poll loop and get reliable notifications for process (i.e. thread-group) exit. - The second feature compliments the first one by making it possible to retrieve pollable pidfds for processes that were not created using CLONE_PIDFD. A lot of processes get created with traditional PID-based calls such as fork() or clone() (without CLONE_PIDFD). For these processes a caller can currently not create a pollable pidfd. This is a problem for Android's low memory killer (LMK) and service managers such as systemd. Both patchsets are accompanied by selftests. It's perhaps worth noting that the work done so far and the work done in this branch for pidfd_open() and polling support do already see some adoption: - Android is in the process of backporting this work to all their LTS kernels [1] - Service managers make use of pidfd_send_signal but will need to wait until we enable waiting on pidfds for full adoption. - And projects I maintain make use of both pidfd_send_signal and CLONE_PIDFD [2] and will use polling support and pidfd_open() too" [1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22 https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22 https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22 [2] `aab6e3eb73/src/lxc/start.c (L1753)` * tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add pidfd_open() tests arch: wire-up pidfd_open() pid: add pidfd_open() pidfd: add polling selftests pidfd: add polling support	2019-07-10 22:17:21 -07:00
Linus Torvalds	29cd581b59	m68k updates for v5.3 (take two) - Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire. -----BEGIN PGP SIGNATURE----- iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCXSXP0xUcZ2VlcnRAbGlu dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XCjtAEAiqVBOG4QAWdyk3DOUvDPZ7V5CqjP IH5WXOUMRSd21EcBAKwmPFe7vmcNjq5XkCvIVxcLBSGbyy+cZFL9Cunv4N4K =67YF -----END PGP SIGNATURE----- Merge tag 'm68k-for-v5.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fix from Geert Uytterhoeven: "Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire. This is a fix for an issue detected in next, to avoid introducing build failures when merging Christoph's dma-mapping tree later" * tag 'm68k-for-v5.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire	2019-07-10 21:44:07 -07:00
Linus Torvalds	398364a35d	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68nommu updates from Greg Ungerer: "A series of cleanups for the FLAT format binary loader, binfmt_flat, from Christoph. The end goal is to support no-MMU on RISC-V, and the last patch enables that" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: riscv: add binfmt_flat support binfmt_flat: don't offset the data start binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c binfmt_flat: remove the persistent argument from flat_get_addr_from_rp binfmt_flat: provide an asm-generic/flat.h binfmt_flat: make support for old format binaries optional binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option binfmt_flat: add endianess annotations binfmt_flat: use fixed size type for the on-disk format binfmt_flat: consolidate two version of flat_v2_reloc_t binfmt_flat: remove the unused OLD_FLAT_FLAG_RAM definition binfmt_flat: remove the uapi <linux/flat.h> header binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable binfmt_flat: remove flat_old_ram_flag binfmt_flat: provide a default version of flat_get_relocate_addr binfmt_flat: remove flat_set_persistent binfmt_flat: remove flat_reloc_valid	2019-07-10 21:42:03 -07:00
Thomas Gleixner	7652ac9201	x86/asm: Move native_write_cr0/4() out of line The pinning of sensitive CR0 and CR4 bits caused a boot crash when loading the kvm_intel module on a kernel compiled with CONFIG_PARAVIRT=n. The reason is that the static key which controls the pinning is marked RO after init. The kvm_intel module contains a CR4 write which requires to update the static key entry list. That obviously does not work when the key is in a RO section. With CONFIG_PARAVIRT enabled this does not happen because the CR4 write uses the paravirt indirection and the actual write function is built in. As the key is intended to be immutable after init, move native_write_cr0/4() out of line. While at it consolidate the update of the cr4 shadow variable and store the value right away when the pinning is initialized on a booting CPU. No point in reading it back 20 instructions later. This allows to confine the static key and the pinning variable to cpu/common and allows to mark them static. Fixes: `8dbec27a24` ("x86/asm: Pin sensitive CR0 bits") Fixes: `873d50d58f` ("x86/asm: Pin sensitive CR4 bits") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Xi Ruoyao <xry111@mengyan1223.wang> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Xi Ruoyao <xry111@mengyan1223.wang> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907102140340.1758@nanos.tec.linutronix.de	2019-07-10 22:15:05 +02:00
Arnd Bergmann	2651569986	x86/pgtable/32: Fix LOWMEM_PAGES constant clang points out that the computation of LOWMEM_PAGES causes a signed integer overflow on 32-bit x86: arch/x86/kernel/head32.c:83:20: error: signed shift result (0x100000000) requires 34 bits to represent, but 'int' only has 32 bits [-Werror,-Wshift-overflow] (PAGE_TABLE_SIZE(LOWMEM_PAGES) << PAGE_SHIFT); ^~~~~~~~~~~~ arch/x86/include/asm/pgtable_32.h:109:27: note: expanded from macro 'LOWMEM_PAGES' #define LOWMEM_PAGES ((((2<<31) - __PAGE_OFFSET) >> PAGE_SHIFT)) ~^ ~~ arch/x86/include/asm/pgtable_32.h:98:34: note: expanded from macro 'PAGE_TABLE_SIZE' #define PAGE_TABLE_SIZE(pages) ((pages) / PTRS_PER_PGD) Use the _ULL() macro to make it a 64-bit constant. Fixes: `1e620f9b23` ("x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190710130522.1802800-1-arnd@arndb.de	2019-07-10 17:19:58 +02:00
Masahiro Yamada	75dd47472b	kbuild: remove src and obj from the top Makefile Replace $(src) and $(obj) with $(srctree) and $(objtree), respectively. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-07-11 00:05:09 +09:00
Yi Wang	cdc238eb72	kvm: x86: Fix -Wmissing-prototypes warnings We get a warning when build kernel W=1: arch/x86/kvm/../../../virt/kvm/eventfd.c:48:1: warning: no previous prototype for ‘kvm_arch_irqfd_allowed’ [-Wmissing-prototypes] kvm_arch_irqfd_allowed(struct kvm kvm, struct kvm_irqfd args) ^ The reason is kvm_arch_irqfd_allowed() is declared in arch/x86/kvm/irq.h, which is not included by eventfd.c. Considering kvm_arch_irqfd_allowed() is a weakly defined function in eventfd.c, remove the declaration to kvm_host.h can fix this. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-10 16:35:58 +02:00
Christoph Hellwig	79a986721d	dma-mapping: remove dma_max_pfn These days, the DMA mapping code must bounce buffers for any unsupported address. If the driver needs to optimize for natively supported ranges, then it should use dma_get_required_mask. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2019-07-10 13:17:30 +02:00
Masahiro Yamada	4ba7f80f42	powerpc/boot: pass CONFIG options in a simpler and more robust way Commit `5e9dcb6188` ("powerpc/boot: Expose Kconfig symbols to wrapper") was wrong, but commit `e41b93a6be` ("powerpc/boot: Fix build failures with -j 1") was also wrong. The correct dependency is: $(obj)/serial.o: $(obj)/autoconf.h However, I do not see the reason why we need to copy autoconf.h to arch/power/boot/. Nor do I see consistency in the way of passing CONFIG options. decompress.c references CONFIG_KERNEL_GZIP and CONFIG_KERNEL_XZ, which are passed via the command line. serial.c includes autoconf.h to reference a couple of CONFIG options, but this is fragile because we often forget to include "autoconf.h" from source files. In fact, it is already broken. ppc_asm.h references CONFIG_PPC_8xx, but utils.S is not given any way to access CONFIG options. So, CONFIG_PPC_8xx is never defined here. Pass $(LINUXINCLUDE) to make sure CONFIG options are accessible from all .c and .S files in arch/powerpc/boot/. I also removed the -traditional flag to make include/linux/kconfig.h work. This flag makes the preprocessor imitate the behavior of the pre-standard C compiler, but I do not understand why it is necessary. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190705100144.28785-2-yamada.masahiro@socionext.com	2019-07-10 13:20:44 +10:00
Masahiro Yamada	9e005b761e	powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h The next commit will make the way of passing CONFIG options more robust. Unfortunately, it would uncover another hidden issue; without this commit, skiroot_defconfig would be broken like this: \| WRAP arch/powerpc/boot/zImage.pseries \| arch/powerpc/boot/wrapper.a(decompress.o): In function `bcj_powerpc.isra.10': \| decompress.c:(.text+0x720): undefined reference to `get_unaligned_be32' \| decompress.c:(.text+0x7a8): undefined reference to `put_unaligned_be32' \| make[1]: * [arch/powerpc/boot/Makefile;383: arch/powerpc/boot/zImage.pseries] Error 1 \| make: * [arch/powerpc/Makefile;295: zImage] Error 2 skiroot_defconfig is the only defconfig that enables CONFIG_KERNEL_XZ for ppc, which has never been correctly built before. I figured out the root cause in lib/decompress_unxz.c: \| #ifdef CONFIG_PPC \| # define XZ_DEC_POWERPC \| #endif CONFIG_PPC is undefined here in the ppc bootwrapper because autoconf.h is not included except by arch/powerpc/boot/serial.c XZ_DEC_POWERPC is not defined, therefore, bcj_powerpc() is not compiled for the bootwrapper. With the next commit passing CONFIG_PPC correctly, we would realize that {get,put}_unaligned_be32 was missing. Unlike the other decompressors, the ppc bootwrapper duplicates all the necessary helpers in arch/powerpc/boot/. The other architectures define __KERNEL__ and pull in helpers for building the decompressors. If ppc bootwrapper had defined __KERNEL__, lib/xz/xz_private.h would have included <asm/unaligned.h>: \| #ifdef __KERNEL__ \| # include <linux/xz.h> \| # include <linux/kernel.h> \| # include <asm/unaligned.h> However, doing so would cause tons of definition conflicts since the bootwrapper has duplicated everything. I just added copies of {get,put}_unaligned_be32, following the bootwrapper coding convention. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190705100144.28785-1-yamada.masahiro@socionext.com	2019-07-10 13:20:44 +10:00
Michael Ellerman	0fc12c022a	powerpc/irq: Don't WARN continuously in arch_local_irq_restore() When CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is enabled (uncommon), we have a series of WARN_ON's in arch_local_irq_restore(). These are "should never happen" conditions, but if they do happen they can flood the console and render the system unusable. So switch them to WARN_ON_ONCE(). Fixes: `e2b36d5917` ("powerpc/64: Don't trace code that runs with the soft irq mask unreconciled") Fixes: `9b81c0211c` ("powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely") Fixes: `7c0482e3d0` ("powerpc/irq: Fix another case of lazy IRQ state getting out of sync") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190708061046.7075-1-mpe@ellerman.id.au	2019-07-10 13:20:43 +10:00
Geert Uytterhoeven	61daf52c4d	sparc64: Add missing newline at end of file "git diff" says: \ No newline at end of file after modifying the file. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-09 14:53:57 -07:00
Peter Zijlstra	ecc6061038	x86/alternatives: Fix int3_emulate_call() selftest stack corruption KASAN shows the following splat during boot: BUG: KASAN: unknown-crash in unwind_next_frame+0x3f6/0x490 Read of size 8 at addr ffffffff84007db0 by task swapper/0 CPU: 0 PID: 0 Comm: swapper Tainted: G T 5.2.0-rc6-00013-g7457c0d #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: dump_stack+0x19/0x1b print_address_description+0x1b0/0x2b2 __kasan_report+0x10f/0x171 kasan_report+0x12/0x1c __asan_load8+0x54/0x81 unwind_next_frame+0x3f6/0x490 unwind_next_frame+0x1b/0x23 arch_stack_walk+0x68/0xa5 stack_trace_save+0x7b/0xa0 save_trace+0x3c/0x93 mark_lock+0x1ef/0x9b1 lock_acquire+0x122/0x221 __mutex_lock+0xb6/0x731 mutex_lock_nested+0x16/0x18 _vm_unmap_aliases+0x141/0x183 vm_unmap_aliases+0x14/0x16 change_page_attr_set_clr+0x15e/0x2f2 set_memory_4k+0x2a/0x2c check_bugs+0x11fd/0x1298 start_kernel+0x793/0x7eb x86_64_start_reservations+0x55/0x76 x86_64_start_kernel+0x87/0xaa secondary_startup_64+0xa4/0xb0 Memory state around the buggy address: ffffffff84007c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 ffffffff84007d00: f1 00 00 00 00 00 00 00 00 00 f2 f2 f2 f3 f3 f3 >ffffffff84007d80: f3 79 be 52 49 79 be 00 00 00 00 00 00 00 00 f1 It turns out that int3_selftest() is corrupting the stack. The problem is that the KASAN-ified version of int3_magic() is much less trivial than the C code appears. It clobbers several unexpected registers. So when the selftest's INT3 is converted to an emulated call to int3_magic(), the registers are clobbered and Bad Things happen when the function returns. Fix this by converting int3_magic() to the trivial ASM function it should be, avoiding all calling convention issues. Also add ASM_CALL_CONSTRAINT to the INT3 ASM, since it contains a 'CALL'. [peterz: cribbed changelog from josh] Fixes: `7457c0da02` ("x86/alternatives: Add int3_emulate_call() selftest") Reported-by: kernel test robot <rong.a.chen@intel.com> Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20190709125744.GB3402@hirez.programming.kicks-ass.net	2019-07-09 22:39:15 +02:00
Linus Torvalds	e9a83bd232	It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc. -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl0krAEPHGNvcmJldEBs d24ubmV0AAoJEBdDWhNsDH5Yg98H/AuLqO9LpOgUjF4LhyjxGPdzJkY9RExSJ7km gznyreLCZgFaJR+AY6YDsd4Jw6OJlPbu1YM/Qo3C3WrZVFVhgL/s2ebvBgCo50A8 raAFd8jTf4/mGCHnAqRotAPQ3mETJUk315B66lBJ6Oc+YdpRhwXWq8ZW2bJxInFF 3HDvoFgMf0KhLuMHUkkL0u3fxH1iA+KvDu8diPbJYFjOdOWENz/CV8wqdVkXRSEW DJxIq89h/7d+hIG3d1I7Nw+gibGsAdjSjKv4eRKauZs4Aoxd1Gpl62z0JNk6aT3m dtq4joLdwScydonXROD/Twn2jsu4xYTrPwVzChomElMowW/ZBBY= =D0eO -----END PGP SIGNATURE----- Merge tag 'docs-5.3' of git://git.lwn.net/linux Pull Documentation updates from Jonathan Corbet: "It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc" * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits) docs: automarkup.py: ignore exceptions when seeking for xrefs docs: Move binderfs to admin-guide Disable Sphinx SmartyPants in HTML output doc: RCU callback locks need only _bh, not necessarily _irq docs: format kernel-parameters -- as code Doc : doc-guide : Fix a typo platform: x86: get rid of a non-existent document Add the RCU docs to the core-api manual Documentation: RCU: Add TOC tree hooks Documentation: RCU: Rename txt files to rst Documentation: RCU: Convert RCU UP systems to reST Documentation: RCU: Convert RCU linked list to reST Documentation: RCU: Convert RCU basic concepts to reST docs: filesystems: Remove uneeded .rst extension on toctables scripts/sphinx-pre-install: fix out-of-tree build docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/ Documentation: PGP: update for newer HW devices Documentation: Add section about CPU vulnerabilities for Spectre Documentation: platform: Delete x86-laptop-drivers.txt docs: Note that :c:func: should no longer be used ...	2019-07-09 12:34:26 -07:00
Linus Torvalds	593c75463a	Merge branch 'parisc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "Dynamic ftrace support by Sven Schnelle and a header guard fix by Denis Efremov" * 'parisc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: asm: psw.h: missing header guard parisc: add dynamic ftrace compiler.h: add CC_USING_PATCHABLE_FUNCTION_ENTRY parisc: use pr_debug() in kernel/module.c parisc: add WARN_ON() to clear_fixmap parisc: add spinlock to patch function parisc: add support for patching multiple words	2019-07-09 12:08:15 -07:00
Linus Torvalds	565eb5f8c5	Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x865 kdump updates from Thomas Gleixner: "Yet more kexec/kdump updates: - Properly support kexec when AMD's memory encryption (SME) is enabled - Pass reserved e820 ranges to the kexec kernel so both PCI and SME can work" * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: fs/proc/vmcore: Enable dumping of encrypted memory when SEV was active x86/kexec: Set the C-bit in the identity map page table when SEV is active x86/kexec: Do not map kexec area as decrypted when SEV is active x86/crash: Add e820 reserved ranges to kdump kernel's e820 table x86/mm: Rework ioremap resource mapping determination x86/e820, ioport: Add a new I/O resource descriptor IORES_DESC_RESERVED x86/mm: Create a workarea in the kernel for SME early encryption x86/mm: Identify the end of the kernel area to be reserved	2019-07-09 11:52:34 -07:00
Linus Torvalds	b7d5c92398	Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Thomas Gleixner: "Assorted updates to kexec/kdump: - Proper kexec support for 4/5-level paging and jumping from a 5-level to a 4-level paging kernel. - Make the EFI support for kexec/kdump more robust - Enforce that the GDT is properly aligned instead of getting the alignment by chance" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kdump/64: Restrict kdump kernel reservation to <64TB x86/kexec/64: Prevent kexec from 5-level paging to a 4-level only kernel x86/boot: Add xloadflags bits to check for 5-level paging support x86/boot: Make the GDT 8-byte aligned x86/kexec: Add the ACPI NVS region to the ident map x86/boot: Call get_rsdp_addr() after console_init() Revert "x86/boot: Disable RSDP parsing temporarily" x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernels x86/kexec: Add the EFI system tables and ACPI tables to the ident map	2019-07-09 11:35:38 -07:00
Linus Torvalds	608745f124	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main changes in this cycle on the kernel side were: - CPU PMU and uncore driver updates to Intel Snow Ridge, IceLake, KabyLake, AmberLake and WhiskeyLake CPUs. - Rework the MSR probing infrastructure to make it more robust, make it work better on virtualized systems and to better expose it on sysfs. - Rework PMU attributes group support based on the feedback from Greg. The core sysfs patch that adds sysfs_update_groups() was acked by Greg. There's a lot of perf tooling changes as well, all around the place: - vendor updates to Intel, cs-etm (ARM), ARM64, s390, - various enhancements to Intel PT tooling support: - Improve CBR (Core to Bus Ratio) packets support. - Export power and ptwrite events to sqlite and postgresql. - Add support for decoding PEBS via PT packets. - Add support for samples to contain IPC ratio, collecting cycles information from CYC packets, showing the IPC info periodically - Allow using time ranges - lots of updates to perf pmu, perf stat, perf trace, eBPF support, perf record, perf diff, etc. - please see the shortlog and Git log for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (252 commits) tools arch x86: Sync asm/cpufeatures.h with the with the kernel tools build: Check if gettid() is available before providing helper perf jvmti: Address gcc string overflow warning for strncpy() perf python: Remove -fstack-protector-strong if clang doesn't have it perf annotate TUI browser: Do not use member from variable within its own initialization perf tests: Fix record+probe_libc_inet_pton.sh for powerpc64 perf evsel: Do not rely on errno values for precise_ip fallback perf thread: Allow references to thread objects after machine__exit() perf header: Assign proper ff->ph in perf_event__synthesize_features() tools arch kvm: Sync kvm headers with the kernel sources perf script: Allow specifying the files to process guest samples perf tools metric: Don't include duration_time in group perf list: Avoid extra : for --raw metrics perf vendor events intel: Metric fixes for SKX/CLX perf tools: Fix typos / broken sentences perf jevents: Add support for Hisi hip08 L3C PMU aliasing perf jevents: Add support for Hisi hip08 HHA PMU aliasing perf jevents: Add support for Hisi hip08 DDRC PMU aliasing perf pmu: Support more complex PMU event aliasing perf diff: Documentation -c cycles option ...	2019-07-09 11:15:52 -07:00
Linus Torvalds	cf2d213e49	Power management updates for 5.3-rc1 - Improve the handling of shared ACPI power resources in the PCI bus type layer (Mika Westerberg). - Make the PCI layer take link delays required by the PCIe spec into account as appropriate and avoid polling devices in D3cold for PME (Mika Westerberg). - Fix some corner case issues in ACPI device power management and in the PCI bus type layer, optimiza and clean up the handling of runtime-suspended PCI devices during system-wide transitions to sleep states (Rafael Wysocki). - Rework hibernation handling in the ACPI core and the PCI bus type to resume runtime-suspended devices before hibernation (which allows some functional problems to be avoided) and fix some ACPI power management issues related to hiberation (Rafael Wysocki). - Extend the operating performance points (OPP) framework to support a wider range of devices (Rajendra Nayak, Stehpen Boyd). - Fix issues related to genpd_virt_devs and issues with platforms using the set_opp() callback in the OPP framework (Viresh Kumar, Dmitry Osipenko). - Add new cpufreq driver for Raspberry Pi (Nicolas Saenz Julienne). - Add new cpufreq driver for imx8m and imx7d chips (Leonard Crestez). - Fix and clean up the pcc-cpufreq, brcmstb-avs-cpufreq, s5pv210, and armada-37xx cpufreq drivers (David Arcari, Florian Fainelli, Paweł Chmiel, YueHaibing). - Clean up and fix the cpufreq core (Viresh Kumar, Daniel Lezcano). - Fix minor issue in the ACPI system sleep support code and export one function from it (Lenny Szubowicz, Dexuan Cui). - Clean up assorted pieces of PM code and documentation (Kefeng Wang, Andy Shevchenko, Bart Van Assche, Greg Kroah-Hartman, Fuqian Huang, Geert Uytterhoeven, Mathieu Malaterre, Rafael Wysocki). - Update the pm-graph utility to v5.4 (Todd Brandt). - Fix and clean up the cpupower utility (Abhishek Goel, Nick Black). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl0jK18SHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxgEAP/RbPe71Y9ufKu64L3EtgV6mS9iuLEhux /Ad9laLNeM1b0oceT3QxGk7xQCacnZcBlcaqXVWI4NRsn4RBZp1cYZngpgJ9DP6E ONr8hzyzDOMVReba3XJIF8H+WoTKjywMYtFutjdx6dRe2ZJLutqnuZ0JbH1YSSK7 IxOt0mJVALf2M4Zz7F17d+n3yGE/4xAPBVbj/rBRcTEsGYlR/Hoxs7iF6EBau7fy R5drUH6XSrWk8adc+z7l3BTGqMMYj9deRSfAWB3wpM4YK7Fv7msX/amBoGINkdn6 xP/ZcrHvhKKzE89MS8OUGP4rGVwq+7tu6mktnYL/tpKgutJqqx5LVvrLsGDSWr+W /aJExN8Eb4Jh98C6vog3XUJoqBxkVGbU8qoCBU3jlFsaznFEWjW9IKhBHs5CIaqz MXte6AsJ8lvFzxILjvx0m2206wNpRJRXYLX3a/BSBxa4OgOESjIpBTmbPfOwbxwj 8z9hIVlDTTDtnF6BEyDQr1fjPi3Mxl7ibGnoqRrJm36VKBy9VZNNwG/0Y2oSvm6k Es8CiTWA3A/46dCZxGr18/9Vbfxn1Yvg9QZ1lCE5Fqij+0F2yRApbHZgUro+1rji 6J8OWC5r5JccdKuGHh4RH/asMFhD0cAR/gUsRzS4dz/wz2jVIN1FstdD1aKN5GBy d0lchx/AKR5H =aBN3 -----END PGP SIGNATURE----- Merge tag 'pm-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update PCI and ACPI power management (improved handling of ACPI power resources and PCIe link delays, fixes related to corner cases, hibernation handling rework), fix and extend the operating performance points (OPP) framework, add new cpufreq drivers for Raspberry Pi and imx8m chips, update some other cpufreq drivers, clean up assorted pieces of PM code and documentation and update tools. Specifics: - Improve the handling of shared ACPI power resources in the PCI bus type layer (Mika Westerberg). - Make the PCI layer take link delays required by the PCIe spec into account as appropriate and avoid polling devices in D3cold for PME (Mika Westerberg). - Fix some corner case issues in ACPI device power management and in the PCI bus type layer, optimiza and clean up the handling of runtime-suspended PCI devices during system-wide transitions to sleep states (Rafael Wysocki). - Rework hibernation handling in the ACPI core and the PCI bus type to resume runtime-suspended devices before hibernation (which allows some functional problems to be avoided) and fix some ACPI power management issues related to hiberation (Rafael Wysocki). - Extend the operating performance points (OPP) framework to support a wider range of devices (Rajendra Nayak, Stehpen Boyd). - Fix issues related to genpd_virt_devs and issues with platforms using the set_opp() callback in the OPP framework (Viresh Kumar, Dmitry Osipenko). - Add new cpufreq driver for Raspberry Pi (Nicolas Saenz Julienne). - Add new cpufreq driver for imx8m and imx7d chips (Leonard Crestez). - Fix and clean up the pcc-cpufreq, brcmstb-avs-cpufreq, s5pv210, and armada-37xx cpufreq drivers (David Arcari, Florian Fainelli, Paweł Chmiel, YueHaibing). - Clean up and fix the cpufreq core (Viresh Kumar, Daniel Lezcano). - Fix minor issue in the ACPI system sleep support code and export one function from it (Lenny Szubowicz, Dexuan Cui). - Clean up assorted pieces of PM code and documentation (Kefeng Wang, Andy Shevchenko, Bart Van Assche, Greg Kroah-Hartman, Fuqian Huang, Geert Uytterhoeven, Mathieu Malaterre, Rafael Wysocki). - Update the pm-graph utility to v5.4 (Todd Brandt). - Fix and clean up the cpupower utility (Abhishek Goel, Nick Black)" * tag 'pm-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (57 commits) ACPI: PM: Make acpi_sleep_state_supported() non-static PM: sleep: Drop dev_pm_skip_next_resume_phases() ACPI: PM: Unexport acpi_device_get_power() Documentation: ABI: power: Add missing newline at end of file ACPI: PM: Drop unused function and function header ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS ACPI: PM: Simplify and fix PM domain hibernation callbacks PCI: PM: Simplify bus-level hibernation callbacks PM: ACPI/PCI: Resume all devices during hibernation cpufreq: Avoid calling cpufreq_verify_current_freq() from handle_update() cpufreq: Consolidate cpufreq_update_current_freq() and __cpufreq_get() kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset() cpufreq: Don't skip frequency validation for has_target() drivers PCI: PM/ACPI: Refresh all stale power state data in pci_pm_complete() PCI / ACPI: Add _PR0 dependent devices ACPI / PM: Introduce concept of a _PR0 dependent device PCI / ACPI: Use cached ACPI device state to get PCI device power state ACPI: PM: Allow transitions to D0 to occur in special cases ACPI: PM: Avoid evaluating _PS3 on transitions from D3hot to D3cold cpufreq: Use has_target() instead of !setpolicy ...	2019-07-09 10:05:22 -07:00
Linus Torvalds	2d41ef5432	fbdev changes for v5.3: - remove fbdev notifier usage for fbcon (as prep work to clean up the fbcon locking), add locking checks in vt/console code and make assorted cleanups in fbdev and backlight code (Daniel Vetter) - add COMPILE_TEST support to atmel_lcdfb, da8xx-fb, gbefb, imxfb, pvr2fb and pxa168fb drivers (me) - fix DMA API abuse in au1200fb and jz4740_fb drivers (Christoph Hellwig) - add check for new BGRT status field rotation bits in efifb driver (Hans de Goede) - mark expected switch fall-throughs in s3c-fb driver (Gustavo A. R. Silva) - remove fbdev mxsfb driver in favour of the drm version (Fabio Estevam) - remove broken rfbi code from omap2fb driver (me) - misc fixes (Arnd Bergmann, Shobhit Kukreti, Wei Yongjun, me) - misc cleanups (Gustavo A. R. Silva, Colin Ian King, me) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJdJIqIAAoJEH4ztj+gR8IL3vAP/2brrEw/t42O2IPj35rgzjKX IPLw60Z3Q5jPaXbeJfeBgjesHWp0E/Cx1V4Yhh/5m4skhKg3lScJ7HV1Yra3SxbF ZDmpvFoHrdTw2V6I3IuSfmOqRCLo7ws9E3gfBWdmpg32FQbVCOAFBeGp1AAjXigL IBuc73B4jXVWU4IejMHpP5DssK2UBdmmXCGjnRR1OpPcBjDs2vCx6QtyBZtAm032 3Ol/T4f1KrAyUEfRtwxVTDdVmUoViT6914fbcRkSjjNDy1St4hngnfFGSRX/A9TC mr6VH4o/J7OAev0clnmp/8SxXXsu4CXubrPSP0TbNujHP0CudA8L/XefrMGxZlyu eLAWzQMv0qoiFQQEAlayw2aOKx77ed3Zay71SHbopbYSLXfTwHMa6CFxsvNSR/WY 67tDw6/wqqMNCjB8U8B5EFddtiRQrDeykEInS9QTIJ3lJzl77kmFnmbdq+Pi6rAs y+M5UzyWCnChIPezb0Ix2HLEjyhy0e942Hu9BtUJa9YPskxDY3RkyaZ/YcfB8bGW +wmcVnBQfTJIn+BVlvHGhRcxGLhYqNs5JyzJbFWB/UhsGleFAsjBWgkoYjgg9oWG qvcP0m6n96ozrpoExhX6osJoHZ7uRZpllP2whQpYCJ1VyQcgrQbV0dcJd2uh26CD w3ne+HXsIjiStDxbeAIS =W1AD -----END PGP SIGNATURE----- Merge tag 'fbdev-v5.3' of git://github.com/bzolnier/linux Pull fbdev updates from Bartlomiej Zolnierkiewicz: - remove fbdev notifier usage for fbcon (as prep work to clean up the fbcon locking), add locking checks in vt/console code and make assorted cleanups in fbdev and backlight code (Daniel Vetter) - add COMPILE_TEST support to atmel_lcdfb, da8xx-fb, gbefb, imxfb, pvr2fb and pxa168fb drivers (me) - fix DMA API abuse in au1200fb and jz4740_fb drivers (Christoph Hellwig) - add check for new BGRT status field rotation bits in efifb driver (Hans de Goede) - mark expected switch fall-throughs in s3c-fb driver (Gustavo A. R. Silva) - remove fbdev mxsfb driver in favour of the drm version (Fabio Estevam) - remove broken rfbi code from omap2fb driver (me) - misc fixes (Arnd Bergmann, Shobhit Kukreti, Wei Yongjun, me) - misc cleanups (Gustavo A. R. Silva, Colin Ian King, me) * tag 'fbdev-v5.3' of git://github.com/bzolnier/linux: (62 commits) video: fbdev: imxfb: fix a typo in imxfb_probe() video: fbdev: s3c-fb: Mark expected switch fall-throughs video: fbdev: s3c-fb: fix sparse warnings about using incorrect types video: fbdev: don't print error message on framebuffer_alloc() failure video: fbdev: intelfb: return -ENOMEM on framebuffer_alloc() failure video: fbdev: s3c-fb: return -ENOMEM on framebuffer_alloc() failure vga_switcheroo: Depend upon fbcon being built-in, if enabled video: fbdev: omap2: remove rfbi video: fbdev: atmel_lcdfb: remove redundant initialization to variable ret video: fbdev-MMP: Use struct_size() in devm_kzalloc() video: fbdev: controlfb: fix warnings about comparing pointer to 0 efifb: BGRT: Add check for new BGRT status field rotation bits jz4740_fb: fix DMA API abuse video: fbdev: pvr2fb: fix link error for pvr2fb_pci_exit video: fbdev: s3c-fb: add COMPILE_TEST support video: fbdev: imxfb: fix sparse warnings about using incorrect types video: fbdev: pvr2fb: fix build warning when compiling as module fbcon: Export fbcon_update_vcs backlight: simplify lcd notifier staging/olpc_dcon: Add drm conversion to TODO ...	2019-07-09 09:55:45 -07:00
Linus Torvalds	947fbd4ca9	EDAC changes for v5.3 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdI2fNAAoJEKurIx+X31iBJVoP/jSJbSY49IcJ7st8uolxJ9d9 84ol7TNBnKeKeUXxrQom2hJsqUnQzaUgw3FKfxX0hmYG5Q9xGS8c+BlW4Giei+Ur baGLO7/UEudWcaez3yOQF+R+yVfsLEATN7gSHcrG81aDyR0F6sMPVCOJOj3hqqnY pZpfxR2+52Xx+Bt8KUUQziCK8qghQYKqHUQUz7R83L0gbbx5+hTAT08h4FCxE8Vx fhntQuteJ2PfYgXlmfv+ZLE4HSHaAlokOnVXJhK+7tMdwDD2we+pL0zr5XdbkZYc If6p9LgJinMe5P5gJSvxT1idWmomKIQqazaC17ff/anLRySrzi9F2oPAGtVI2tvK NekoO3oo4s+xONXfe7Q922rIGt/4vZj6tcqBuMYCOAU7TJGQRqDeEl4+T+aIZNMB 9QBFUfKupy7XZ3H5rJTrYaXFPyYkdRGu5ODEHvwnpRiu+uD1UaTR57iyK5kjToGY mcK3nVad2X1foMOQW33jVAhxGJ+sz2YB/XgQuNnqpUFKktLQ8Es1CSgzB02GvW2b OBJiFCZybMKtBnTEVINYc/dcZ6uOUQ17BwKoR4szFGQLWrzbnfe7as5fa0uyj3ib BEbWcDM3KCDMRWAE/VwgyGzH3lg4QP1mE7uRdLfMb5JhIbQvHjo7T0bUBzUy5/Mf 4P9CWweplIi82Q0f2q7r =5hCV -----END PGP SIGNATURE----- Merge tag 'please-pull-for_5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Tony Luck: "All the bits that Boris had queued in his tree plus four patches to add support for Intel Icelake Xeon and then fix a few corner cases" * tag 'please-pull-for_5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Fix global-out-of-bounds write when setting edac_mc_poll_msec EDAC, skx, i10nm: Fix source ID register offset EDAC, i10nm: Check ECC enabling status per channel EDAC, i10nm: Add Intel additional Ice-Lake support EDAC: Make edac_debugfs_create_x*() return void EDAC/aspeed: Remove set but not used variable 'np' EDAC/ie31200: Reformat PCI device table EDAC/ie31200: Add Intel Coffee Lake CPU support EDAC/sifive: Add EDAC platform driver for SiFive SoCs EDAC/sb_edac: Remove redundant update of tad_base arm64: dts: stratix10: Add SDMMC EDAC node EDAC/altera: Add Stratix10 SDMMC support arm64: dts: stratix10: Add OCRAM EDAC node EDAC/altera: Add Stratix10 OCRAM ECC support EDAC/sysfs: Drop device references properly EDAC/sysfs: Fix memory leak when creating a csrow object	2019-07-09 09:43:20 -07:00
Linus Torvalds	6b04014f3f	IOMMU Updates for Linux v5.3 Including: - Patches to make the dma-iommu code more generic so that it can be used outside of the ARM context with other IOMMU drivers. Goal is to make use of it on x86 too. - Generic IOMMU domain support for the Intel VT-d driver. This driver now makes more use of common IOMMU code to allocate default domains for the devices it handles. - An IOMMU fault reporting API to userspace. With that the IOMMU fault handling can be done in user-space, for example to forward the faults to a VM. - Better handling for reserved regions requested by the firmware. These can be 'relaxed' now, meaning that those don't prevent a device being attached to a VM. - Suspend/Resume support for the Renesas IOMMU driver. - Added support for dumping SVA related fields of the DMAR table in the Intel VT-d driver via debugfs. - A pile of smaller fixes and cleanups. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl0jaOAACgkQK/BELZcB GuNYTRAAjXuNs1OX/ROJ4TByT3YBWj5BLZVMrpGx75MAEvO68a7rTdaCpGuHv09w 5JneXxA3H2O1q6JCGguLm//Dy5QycJQIn0WuaBGi5Vjo+cGe47sf48Hb6AhIoAFf exjhfmG6kL3ZzEkLU/7iJWqz+iiCBxRU0yMINJvzFFzliJkfyGWsjoruxobmUEhm XdEsdeAQFXoPMNVCDiorM+B+3VCl7BfwfhdlFrmX3fpNrMZ5ytlnPTgfJdXAGgHn DAoK8qGoyYtRnZnivn1y56M13qv5y3XvicKqx/GRKK8CyJpz/vYSBLktI5HO4tOF y/VzNg+ctDSjQgFyKIq+38k6qu4YWctyxZNObmPVpioNf9j9GaZWT97fxGzXMNS4 G7TqOeq4yxjotAOcd+DdH265uVBtGmLqk7XobigxCZ0xzGVVnmPy4oJtkWwjaMgN HtNSMPRRirjprK4BUoCiXwP8YyWZs0a93oJmh2orD406yZkHNuqLNaicu24WsKKf Ze5BsFbNUb0qYdyL7ofOugQrOvYYoWYuREiz8ClF3U4mCcOk4FJ9K2Vf1kSfKBX8 1Skz75NeSZdzXg9JcznNMuiqH2mN3lx/88xKJg3LJxEeFxU++VDMX3Z8kEmpj5w+ 9m/v+VRclUXL6g8s6QFNIlIenMknAEjtWw6hVWPhWLaULH6fR6Y= =Kvk5 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Make the dma-iommu code more generic so that it can be used outside of the ARM context with other IOMMU drivers. Goal is to make use of it on x86 too. - Generic IOMMU domain support for the Intel VT-d driver. This driver now makes more use of common IOMMU code to allocate default domains for the devices it handles. - An IOMMU fault reporting API to userspace. With that the IOMMU fault handling can be done in user-space, for example to forward the faults to a VM. - Better handling for reserved regions requested by the firmware. These can be 'relaxed' now, meaning that those don't prevent a device being attached to a VM. - Suspend/Resume support for the Renesas IOMMU driver. - Added support for dumping SVA related fields of the DMAR table in the Intel VT-d driver via debugfs. - A pile of smaller fixes and cleanups. * tag 'iommu-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (90 commits) iommu/omap: No need to check return value of debugfs_create functions iommu/arm-smmu-v3: Invalidate ATC when detaching a device iommu/arm-smmu-v3: Fix compilation when CONFIG_CMA=n iommu/vt-d: Cleanup unused variable iommu/amd: Flush not present cache in iommu_map_page iommu/amd: Only free resources once on init error iommu/amd: Move gart fallback to amd_iommu_init iommu/amd: Make iommu_disable safer iommu/io-pgtable: Support non-coherent page tables iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag iommu/io-pgtable-arm: Add support to use system cache iommu/arm-smmu-v3: Increase maximum size of queues iommu/vt-d: Silence a variable set but not used iommu/vt-d: Remove an unused variable "length" iommu: Fix integer truncation iommu: Add padding to struct iommu_fault iommu/vt-d: Consolidate domain_init() to avoid duplication iommu/vt-d: Cleanup after delegating DMA domain to generic iommu iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices() iommu/vt-d: Allow DMA domain attaching to rmrr locked device ...	2019-07-09 09:21:02 -07:00
Linus Torvalds	98537ee92f	regulator: Updates for v5.3 A couple of new features in the core, the most interesting one being support for complex regulator coupling configurations initially targeted at nVidia Tegra SoCs, and some new drivers but otherwise quite a quiet release. - Core support for gradual ramping of voltages for devices that can't manage large changes in hardware from Bartosz Golaszewski. - Core support for systems that have complex coupling requirements best described via code, contributed by Dmitry Osipenko. - New drivers for Dialog SLG51000, Qualcomm PM8005 and ST Microelectronics STM32-Booster. -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl0jIZwTHGJyb29uaWVA a2VybmVsLm9yZwAKCRAk1otyXVSH0PG+B/9EQnjM29THpBKM6bKdl8cYcf3Hq/FX KNtXjwaTM0DqYtpk1RkaHgecxTJwesJ0k9AYijh0ieeMhb5UES280+6B4NqPb7xr UmFBbNcdk9G+x9q1TyT8akRMkCugEMscQodyk4npzZRjGv8qUsNJUY71Bq2T/JJx QXo5fKlWICzBahF87DCB5pKC7PKfNkx3BWCrGGXOqoBX2ZEKytyWlCa0nGUZ+LqL GqXmmIjKL7H8MP3avmrrRHYpeF5DLXAzH+HEIM9Y0F1cRcdgOS1Exv2eG+90a634 yybVDBX2d5zfiEIoRnpldtB52EGXjvwbo7w1mFwh9UgJRbOCZgb+ZQXT =26r6 -----END PGP SIGNATURE----- Merge tag 'regulator-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "A couple of new features in the core, the most interesting one being support for complex regulator coupling configurations initially targeted at nVidia Tegra SoCs, and some new drivers but otherwise quite a quiet release. Summary: - Core support for gradual ramping of voltages for devices that can't manage large changes in hardware from Bartosz Golaszewski. - Core support for systems that have complex coupling requirements best described via code, contributed by Dmitry Osipenko. - New drivers for Dialog SLG51000, Qualcomm PM8005 and ST Microelectronics STM32-Booster" * tag 'regulator-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (52 commits) regulator: max77650: use vsel_step regulator: implement selector stepping regulator: max77650: add MODULE_ALIAS() regulator: max77620: remove redundant assignment to variable ret dt-bindings: regulator: add support for the stm32-booster regulator: add support for the stm32-booster regulator: s2mps11: Adjust supported buck voltages to real values regulator: s2mps11: Fix buck7 and buck8 wrong voltages gpio: Fix return value mismatch of function gpiod_get_from_of_node() regulator: core: Expose some of core functions needed by couplers regulator: core: Introduce API for regulators coupling customization regulator: s2mps11: Add support for disabling S2MPS11 regulators in suspend regulator: s2mps11: Reduce number of rdev_get_id() calls regulator: qcom_spmi: Do NULL check for lvs regulator: qcom_spmi: Fix math of spmi_regulator_set_voltage_time_sel regulator: da9061/62: Adjust LDO voltage selection minimum value regulator: s2mps11: Fix ERR_PTR dereference on GPIO lookup failure regulator: qcom_spmi: add PMS405 SPMI regulator dt-bindings: qcom_spmi: Document pms405 support arm64: dts: msm8998-mtp: Add pm8005_s1 regulator ...	2019-07-09 09:15:03 -07:00
Anup Patel	671f9a3e2e	RISC-V: Setup initial page tables in two stages Currently, the setup_vm() does initial page table setup in one-shot very early before enabling MMU. Due to this, the setup_vm() has to map all possible kernel virtual addresses since it does not know size and location of RAM. This means we have kernel mappings for non-existent RAM and any buggy driver (or kernel) code doing out-of-bound access to RAM will not fault and cause underterministic behaviour. Further, the setup_vm() creates PMD mappings (i.e. 2M mappings) for RV64 systems. This means for PAGE_OFFSET=0xffffffe000000000 (i.e. MAXPHYSMEM_128GB=y), the setup_vm() will require 129 pages (i.e. 516 KB) of memory for initial page tables which is never freed. The memory required for initial page tables will further increase if we chose a lower value of PAGE_OFFSET (e.g. 0xffffff0000000000) This patch implements two-staged initial page table setup, as follows: 1. Early (i.e. setup_vm()): This stage maps kernel image and DTB in a early page table (i.e. early_pg_dir). The early_pg_dir will be used only by boot HART so it can be freed as-part of init memory free-up. 2. Final (i.e. setup_vm_final()): This stage maps all possible RAM banks in the final page table (i.e. swapper_pg_dir). The boot HART will start using swapper_pg_dir at the end of setup_vm_final(). All non-boot HARTs directly use the swapper_pg_dir created by boot HART. We have following advantages with this new approach: 1. Kernel mappings for non-existent RAM don't exists anymore. 2. Memory consumed by initial page tables is now indpendent of the chosen PAGE_OFFSET. 3. Memory consumed by initial page tables on RV64 system is 2 pages (i.e. 8 KB) which has significantly reduced and these pages will be freed as-part of the init memory free-up. The patch also provides a foundation for implementing strict kernel mappings where we protect kernel text and rodata using PTE permissions. Suggested-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Anup Patel <anup.patel@wdc.com> [paul.walmsley@sifive.com: updated to apply; fixed a checkpatch warning] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-07-09 09:08:04 -07:00
Linus Torvalds	2ec98f5678	Bulk GPIO changes for the v5.3 kernel cycle: Core: - When a gpio_chip request GPIOs from itself, it can now fully control the line characteristics, both machine and consumer flags. This makes a lot of sense, but took some time before I figured out that this is how it has to work. - Several smallish documentation fixes. New drivers: - The PCA953x driver now supports the TI TCA9539. - The DaVinci driver now supports the K3 AM654 SoCs. Driver improvements: - Major overhaul and hardening of the OMAP driver by Russell King. - Starting to move some drivers to the new API passing irq_chip along with the gpio_chip when adding the gpio_chip instead of adding it separately. Unrelated: - Delete the FMC subsystem. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl0i7gEACgkQQRCzN7AZ XXOeUA/+JKyI2zebTWBcgtxhn6VQCufMCtFmQl2JkEcy4pT7aBJcGWqFQCBW2Szf VTtqc8nNa90SZoOzsNbkeQgRjNKGZruMbh0ARUPcW4v3ZJHtUNUEDLTo8c3iyTgS 9k/FTeaTLt4WSZujeAO0O7G4KNnOOlTKLh58dr0PmXUR+0v+fbMhcJqJ9ABueV+V qENdpkTuG1ZcvzgLhBBEXdt3Plw9ICLWmPXtwY+784ewucVPbyQX7jV4+bBZ25fL DerCuMIgL5vRWWdiFO6/Jp603rHzZpTnjLJJocXUFiD6zA5rvU2jTWxsnUttjisg 8cTLMyQspsDvBxhEhCJVTuIKotbKH900TSaz+vx20W72/A1euy4y6uVi8FGZo4Ww KDkzB7anwHyEFKGnlYgHzDrfctgZrhQoyFz808DQRYg1JseZB5oGVDvScrPBD43j nbNDd8gwG4yp3tFnDx9xjIwQy3Ax4d510rAZyUN2801IlbA1bueq4t6Z2cCucWzX XA1gCKlXe4BUeitRAoZtqZNZG1ymEysW4jXy1V8xrwtAf8+QSN+xO98akz3VpnQL ae9q+HtF76fDBY1xFSXT37Ma3+4OR2vMF9QWuo4TCb9j1cL7llf8ZxtUq9LEHbDu erKLSSnwSFmqJNGSEA5SulGOCR/tRPkClngE9x0XEM6gOD+bs6E= =8zSV -----END PGP SIGNATURE----- Merge tag 'gpio-v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the big slew of GPIO changes for the v5.3 kernel cycle. This is mostly incremental work this time. Three important things: - The FMC subsystem is deleted through my tree. This happens through GPIO as its demise was discussed in relation to a patch decoupling its GPIO implementation from the standard way of handling GPIO. As it turns out, that is not the only subsystem it reimplements and the authors think it is better do scratch it and start over using the proper kernel subsystems than try to polish the rust shiny. See the commit (ACKed by the maintainers) for details. - Arnd made a small devres patch that was ACKed by Greg and goes into the device core. - SPDX header change colissions may happen, because at times I've seen that quite a lot changed during the -rc:s in regards to SPDX. (It is good stuff, tglx has me convinced, and it is worth the occasional pain.) Apart from this is is nothing controversial or problematic. Summary: Core: - When a gpio_chip request GPIOs from itself, it can now fully control the line characteristics, both machine and consumer flags. This makes a lot of sense, but took some time before I figured out that this is how it has to work. - Several smallish documentation fixes. New drivers: - The PCA953x driver now supports the TI TCA9539. - The DaVinci driver now supports the K3 AM654 SoCs. Driver improvements: - Major overhaul and hardening of the OMAP driver by Russell King. - Starting to move some drivers to the new API passing irq_chip along with the gpio_chip when adding the gpio_chip instead of adding it separately. Unrelated: - Delete the FMC subsystem" * tag 'gpio-v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (87 commits) Revert "gpio: tegra: Clean-up debugfs initialisation" gpiolib: Use spinlock_t instead of struct spinlock gpio: stp-xway: allow compile-testing gpio: stp-xway: get rid of the #include <lantiq_soc.h> dependency gpio: stp-xway: improve module clock error handling gpio: stp-xway: simplify error handling in xway_stp_probe() gpiolib: Clarify use of non-sleeping functions gpiolib: Fix references to gpiod_[gs]et_value_cansleep() variants gpiolib: Document new gpio_chip.init_valid_mask field Documentation: gpio: Fix reference to gpiod_get_array() gpio: pl061: drop duplicate printing of device name gpio: altera: Pass irqchip when adding gpiochip gpio: siox: Use devm_ managed gpiochip gpio: siox: Add struct device dev helper variable gpio: siox: Pass irqchip when adding gpiochip drivers: gpio: amd-fch: make resource struct const devres: allow const resource arguments gpio: ath79: Pass irqchip when adding gpiochip gpio: tegra: Clean-up debugfs initialisation gpio: siox: Switch to IRQ_TYPE_NONE ...	2019-07-09 09:07:00 -07:00
Jiri Slaby	1cbec37b3f	x86/entry/32: Fix ENDPROC of common_spurious common_spurious is currently ENDed erroneously. common_interrupt is used in its ENDPROC. So fix this mistake. Found by my asm macros rewrite patchset. Fixes: `f8a8fe61fe` ("x86/irq: Seperate unused system vectors from spurious entry again") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190709063402.19847-1-jslaby@suse.cz	2019-07-09 14:41:03 +02:00
Ross Zwisler	013c66edf2	Revert "x86/build: Move _etext to actual end of .text" This reverts commit `392bef7096`. Per the discussion here: https://lkml.kernel.org/r/201906201042.3BF5CD6@keescook the above referenced commit breaks kernel compilation with old GCC toolchains as well as current versions of the Gold linker. Revert it to fix the regression and to keep the ability to compile the kernel with these tools. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Guenter Roeck <groeck@chromium.org> Cc: <stable@vger.kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kees Cook <keescook@chromium.org> Cc: Johannes Hirte <johannes.hirte@datenkhaos.de> Cc: Klaus Kusche <klaus.kusche@computerix.info> Cc: samitolvanen@google.com Cc: Guenter Roeck <groeck@google.com> Link: https://lkml.kernel.org/r/20190701155208.211815-1-zwisler@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-09 13:57:31 +02:00
Sebastian Andrzej Siewior	39ca5fb492	x86/ldt: Initialize the context lock for init_mm The mutex mm->context->lock for init_mm is not initialized for init_mm. This wasn't a problem because it remained unused. This changed however since commit `4fc19708b1` ("x86/alternatives: Initialize temporary mm for patching") Initialize the mutex for init_mm. Fixes: `4fc19708b1` ("x86/alternatives: Initialize temporary mm for patching") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20190701173354.2pe62hhliok2afea@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-09 13:57:27 +02:00
Christoph Hellwig	f28a1f1613	m68k: Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire M68k only provides the arch_dma_prep_coherent symbol when an mmu is enabled and not on the coldfire platform. Fix the Kconfig symbol selection up to match this. Fixes: `69878ef475` ("m68k: Implement arch_dma_prep_coherent()") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>	2019-07-09 09:13:24 +02:00
Linus Torvalds	5ad18b2e60	Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull force_sig() argument change from Eric Biederman: "A source of error over the years has been that force_sig has taken a task parameter when it is only safe to use force_sig with the current task. The force_sig function is built for delivering synchronous signals such as SIGSEGV where the userspace application caused a synchronous fault (such as a page fault) and the kernel responded with a signal. Because the name force_sig does not make this clear, and because the force_sig takes a task parameter the function force_sig has been abused for sending other kinds of signals over the years. Slowly those have been fixed when the oopses have been tracked down. This set of changes fixes the remaining abusers of force_sig and carefully rips out the task parameter from force_sig and friends making this kind of error almost impossible in the future" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits) signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus signal: Remove the signal number and task parameters from force_sig_info signal: Factor force_sig_info_to_task out of force_sig_info signal: Generate the siginfo in force_sig signal: Move the computation of force into send_signal and correct it. signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal signal: Remove the task parameter from force_sig_fault signal: Use force_sig_fault_to_task for the two calls that don't deliver to current signal: Explicitly call force_sig_fault on current signal/unicore32: Remove tsk parameter from __do_user_fault signal/arm: Remove tsk parameter from __do_user_fault signal/arm: Remove tsk parameter from ptrace_break signal/nds32: Remove tsk parameter from send_sigtrap signal/riscv: Remove tsk parameter from do_trap signal/sh: Remove tsk parameter from force_sig_info_fault signal/um: Remove task parameter from send_sigtrap signal/x86: Remove task parameter from send_sigtrap signal: Remove task parameter from force_sig_mceerr signal: Remove task parameter from force_sig signal: Remove task parameter from force_sigsegv ...	2019-07-08 21:48:15 -07:00
Linus Torvalds	2b49350b16	ARM updates: - Add a "cut here" to make it clearer where oops dumps should be cut from - we already have a marker for the end of the dumps. - Add logging severity to show_pte() - Drop unnecessary common-page-size linker flag - Errata workarounds for Cortex A12 857271, Cortex A17 857272 and Cortex A7 814220. - Remove some unused variables that had started to provoke a compiler warning. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAXSMct/TnkBvkraxkAQIpKBAAoE+5ZLeBanLZ6GMUpIMW+wVfVYaI4p8g kZd0Dm2EB5Y9x9kfYwCLmdUp5A/Lh3dmAkAGHSP0TVdDj34qK9KXxCA3t+k85ph5 +VDzdpAtUrbuDWaF/6KIaGMLgA4ss+VXpoqMOvZuwDQReXbXaT325WVzcTFlNLe7 sUBS8k2DxTTjiF0HM/jnxtbvVmc3sfkPij+MTEz9XEcS9K0Hilz9suoJEKZCXY6H ULX3n+8wJAxbnbnXdGLyQ095BqtaQC88mEcMUvBci7NBGiRne19vvRqdLwe1mJPc PTz1v/Pjvbfj1apSZDGVJvIu3BVEkwJNBF1NujRGUVrYnKQd/YJSKJkqjXat5Q5M H/G2l+y0vmzn4VSbn1LUEvleuou6Hq/1nucksS1DHruEB7tH/aQRtf0EduzGRsop VmYkTiIiaF28N+eHDolDtUzSg5RNzEMCQnni4y6ajGhbWiu1Q63GBR7k9zwB27Bj 3t6RdrqIaE52uKUdgwc9APcpZhdJR/ncqrI68B+wIwbuObhUSY5sa1Ec8uiELOiK g3zCp4AWRBbYQFNeQy87A811JSFU+L/lCbABs32Bj8UEKGn/qdmVwEaH9rZ8/PHk TAlcu8PtwBVQQIEgs+ur+OyfZcLt2U4DUDiL/26GCyG+pzzL7fYap5GSoBuqR02p MmoHFr87YZM= =Ru/4 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: - Add a "cut here" to make it clearer where oops dumps should be cut from - we already have a marker for the end of the dumps. - Add logging severity to show_pte() - Drop unnecessary common-page-size linker flag - Errata workarounds for Cortex A12 857271, Cortex A17 857272 and Cortex A7 814220. - Remove some unused variables that had started to provoke a compiler warning. * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8863/1: stm32: select ARM errata 814220 ARM: 8862/1: errata: 814220-B-Cache maintenance by set/way operations can execute out of order ARM: 8865/1: mm: remove unused variables ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU cores ARM: 8861/1: errata: Workaround errata A12 857271 / A17 857272 ARM: 8860/1: VDSO: Drop implicit common-page-size linker flag ARM: arrange show_pte() to issue severity-based messages ARM: add "8<--- cut here ---" to kernel dumps	2019-07-08 21:08:34 -07:00
Linus Torvalds	4d2fa8b44b	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "Here is the crypto update for 5.3: API: - Test shash interface directly in testmgr - cra_driver_name is now mandatory Algorithms: - Replace arc4 crypto_cipher with library helper - Implement 5 way interleave for ECB, CBC and CTR on arm64 - Add xxhash - Add continuous self-test on noise source to drbg - Update jitter RNG Drivers: - Add support for SHA204A random number generator - Add support for 7211 in iproc-rng200 - Fix fuzz test failures in inside-secure - Fix fuzz test failures in talitos - Fix fuzz test failures in qat" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (143 commits) crypto: stm32/hash - remove interruptible condition for dma crypto: stm32/hash - Fix hmac issue more than 256 bytes crypto: stm32/crc32 - rename driver file crypto: amcc - remove memset after dma_alloc_coherent crypto: ccp - Switch to SPDX license identifiers crypto: ccp - Validate the the error value used to index error messages crypto: doc - Fix formatting of new crypto engine content crypto: doc - Add parameter documentation crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR crypto: arm64/aes-ce - add 5 way interleave routines crypto: talitos - drop icv_ool crypto: talitos - fix hash on SEC1. crypto: talitos - move struct talitos_edesc into talitos.h lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE crypto/NX: Set receive window credits to max number of CRBs in RxFIFO crypto: asymmetric_keys - select CRYPTO_HASH where needed crypto: serpent - mark __serpent_setkey_sbox noinline crypto: testmgr - dynamically allocate crypto_shash crypto: testmgr - dynamically allocate testvec_config crypto: talitos - eliminate unneeded 'done' functions at build time ...	2019-07-08 20:57:08 -07:00
Linus Torvalds	8b68150883	Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Bug fixes, code clean up, and new features: - IMA policy rules can be defined in terms of LSM labels, making the IMA policy dependent on LSM policy label changes, in particular LSM label deletions. The new environment, in which IMA-appraisal is being used, frequently updates the LSM policy and permits LSM label deletions. - Prevent an mmap'ed shared file opened for write from also being mmap'ed execute. In the long term, making this and other similar changes at the VFS layer would be preferable. - The IMA per policy rule template format support is needed for a couple of new/proposed features (eg. kexec boot command line measurement, appended signatures, and VFS provided file hashes). - Other than the "boot-aggregate" record in the IMA measuremeent list, all other measurements are of file data. Measuring and storing the kexec boot command line in the IMA measurement list is the first buffer based measurement included in the measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: Introduce struct evm_xattr ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definition KEXEC: Call ima_kexec_cmdline to measure the boot command line args IMA: Define a new template field buf IMA: Define a new hook to measure the kexec boot command line arguments IMA: support for per policy rule template formats integrity: Fix __integrity_init_keyring() section mismatch ima: Use designated initializers for struct ima_event_data ima: use the lsm policy update notifier LSM: switch to blocking policy update notifiers x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY ima: Make arch_policy_entry static ima: prevent a file already mmap'ed write to be mmap'ed execute x86/ima: check EFI SetupMode too	2019-07-08 20:28:59 -07:00
David S. Miller	af144a9834	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two cases of overlapping changes, nothing fancy. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-08 19:48:57 -07:00
Linus Torvalds	222a21d295	Merge branch 'x86-topology-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 topology updates from Ingo Molnar: "Implement multi-die topology support on Intel CPUs and expose the die topology to user-space tooling, by Len Brown, Kan Liang and Zhang Rui. These changes should have no effect on the kernel's existing understanding of topologies, i.e. there should be no behavioral impact on cache, NUMA, scheduler, perf and other topologies and overall system performance" * 'x86-topology-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/rapl: Cosmetic rename internal variables in response to multi-die/pkg support perf/x86/intel/uncore: Cosmetic renames in response to multi-die/pkg support hwmon/coretemp: Cosmetic: Rename internal variables to zones from packages thermal/x86_pkg_temp_thermal: Cosmetic: Rename internal variables to zones from packages perf/x86/intel/cstate: Support multi-die/package perf/x86/intel/rapl: Support multi-die/package perf/x86/intel/uncore: Support multi-die/package topology: Create core_cpus and die_cpus sysfs attributes topology: Create package_cpus sysfs attribute hwmon/coretemp: Support multi-die/package powercap/intel_rapl: Update RAPL domain name and debug messages thermal/x86_pkg_temp_thermal: Support multi-die/package powercap/intel_rapl: Support multi-die/package powercap/intel_rapl: Simplify rapl_find_package() x86/topology: Define topology_logical_die_id() x86/topology: Define topology_die_id() cpu/topology: Export die_id x86/topology: Create topology_max_die_per_package() x86/topology: Add CPUID.1F multi-die/package support	2019-07-08 18:28:44 -07:00
Linus Torvalds	8faef7125d	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updayes from Ingo Molnar: "Most of the commits add ACRN hypervisor guest support, plus two cleanups" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/jailhouse: Mark jailhouse_x2apic_available() as __init x86/platform/geode: Drop <linux/gpio.h> includes x86/acrn: Use HYPERVISOR_CALLBACK_VECTOR for ACRN guest upcall vector x86: Add support for Linux guests on an ACRN hypervisor x86/Kconfig: Add new X86_HV_CALLBACK_VECTOR config symbol	2019-07-08 17:49:45 -07:00
Linus Torvalds	da17702385	Merge branch 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Ingo Molnar: "A handful of paravirt patching code enhancements to make it more robust against patching failures, and related cleanups and not so related cleanups - by Thomas Gleixner and myself" * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Rename paravirt_patch_site::instrtype to paravirt_patch_site::type x86/paravirt: Standardize 'insn_buff' variable names x86/paravirt: Match paravirt patchlet field definition ordering to initialization ordering x86/paravirt: Replace the paravirt patch asm magic x86/paravirt: Unify the 32/64 bit paravirt patching code x86/paravirt: Detect over-sized patching bugs in paravirt_patch_call() x86/paravirt: Detect over-sized patching bugs in paravirt_patch_insns() x86/paravirt: Remove bogus extern declarations	2019-07-08 17:34:44 -07:00
Linus Torvalds	3431a940bb	Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 AVX512 status update from Ingo Molnar: "This adds a new ABI that the main scheduler probably doesn't want to deal with but HPC job schedulers might want to use: the AVX512_elapsed_ms field in the new /proc/<pid>/arch_status task status file, which allows the user-space job scheduler to cluster such tasks, to avoid turbo frequency drops" * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/filesystems/proc.txt: Add arch_status file x86/process: Add AVX-512 usage elapsed time to /proc/pid/arch_status proc: Add /proc/<pid>/arch_status	2019-07-08 17:28:57 -07:00
Linus Torvalds	5b7a209523	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc small cleanups: removal of superfluous code and coding style cleanups mostly" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kexec: Make variable static and config dependent x86/defconfigs: Remove useless UEVENT_HELPER_PATH x86/amd_nb: Make hygon_nb_misc_ids static x86/tsc: Move inline keyword to the beginning of function declarations x86/io_delay: Define IO_DELAY macros in C instead of Kconfig x86/io_delay: Break instead of fallthrough in switch statement	2019-07-08 17:27:24 -07:00
Linus Torvalds	6cfcdad763	Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache resource control update from Ingo Molnar: "Two cleanup patches" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Cleanup cbm_ensure_valid() x86/resctrl: Use _ASM_BX to avoid ifdeffery	2019-07-08 17:25:53 -07:00
Linus Torvalds	c83b5d321b	Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: "Two kbuild enhancements by Masahiro Yamada" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Remove redundant 'clean-files += capflags.c' x86/build: Add 'set -e' to mkcapflags.sh to delete broken capflags.c	2019-07-08 17:24:44 -07:00
Linus Torvalds	a1aab6f3d2	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "Most of the changes relate to Peter Zijlstra's cleanup of ptregs handling, in particular the i386 part is now much simplified and standardized - no more partial ptregs stack frames via the esp/ss oddity. This simplifies ftrace, kprobes, the unwinder, ptrace, kdump and kgdb. There's also a CR4 hardening enhancements by Kees Cook, to make the generic platform functions such as native_write_cr4() less useful as ROP gadgets that disable SMEP/SMAP. Also protect the WP bit of CR0 against similar attacks. The rest is smaller cleanups/fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Add int3_emulate_call() selftest x86/stackframe/32: Allow int3_emulate_push() x86/stackframe/32: Provide consistent pt_regs x86/stackframe, x86/ftrace: Add pt_regs frame annotations x86/stackframe, x86/kprobes: Fix frame pointer annotations x86/stackframe: Move ENCODE_FRAME_POINTER to asm/frame.h x86/entry/32: Clean up return from interrupt preemption path x86/asm: Pin sensitive CR0 bits x86/asm: Pin sensitive CR4 bits Documentation/x86: Fix path to entry_32.S x86/asm: Remove unused TASK_TI_flags from asm-offsets.c	2019-07-08 16:59:34 -07:00
Linus Torvalds	dad1c12ed8	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Remove the unused per rq load array and all its infrastructure, by Dietmar Eggemann. - Add utilization clamping support by Patrick Bellasi. This is a refinement of the energy aware scheduling framework with support for boosting of interactive and capping of background workloads: to make sure critical GUI threads get maximum frequency ASAP, and to make sure background processing doesn't unnecessarily move to cpufreq governor to higher frequencies and less energy efficient CPU modes. - Add the bare minimum of tracepoints required for LISA EAS regression testing, by Qais Yousef - which allows automated testing of various power management features, including energy aware scheduling. - Restructure the former tsk_nr_cpus_allowed() facility that the -rt kernel used to modify the scheduler's CPU affinity logic such as migrate_disable() - introduce the task->cpus_ptr value instead of taking the address of &task->cpus_allowed directly - by Sebastian Andrzej Siewior. - Misc optimizations, fixes, cleanups and small enhancements - see the Git log for details. * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) sched/uclamp: Add uclamp support to energy_compute() sched/uclamp: Add uclamp_util_with() sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks sched/uclamp: Set default clamps for RT tasks sched/uclamp: Reset uclamp values on RESET_ON_FORK sched/uclamp: Extend sched_setattr() to support utilization clamping sched/core: Allow sched_setattr() to use the current policy sched/uclamp: Add system default clamps sched/uclamp: Enforce last task's UCLAMP_MAX sched/uclamp: Add bucket local max tracking sched/uclamp: Add CPU's clamp buckets refcounting sched/fair: Rename weighted_cpuload() to cpu_runnable_load() sched/debug: Export the newly added tracepoints sched/debug: Add sched_overutilized tracepoint sched/debug: Add new tracepoint to track PELT at se level sched/debug: Add new tracepoints to track PELT at rq level sched/debug: Add a new sched_trace_*() helper functions sched/autogroup: Make autogroup_path() always available sched/wait: Deduplicate code with do-while sched/topology: Remove unused 'sd' parameter from arch_scale_cpu_capacity() ...	2019-07-08 16:39:53 -07:00
Linus Torvalds	090bc5a2a9	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "Boris is on vacation so I'm sending the RAS bits this time. The main changes were: - Various RAS/CEC improvements and fixes by Borislav Petkov: - error insertion fixes - offlining latency fix - memory leak fix - additional sanity checks - cleanups - debug output improvements - More SMCA enhancements by Yazen Ghannam: - make banks truly per-CPU which they are in the hardware - don't over-cache certain registers - make the number of MCA banks per-CPU variable The long term goal with these changes is to support future heterogenous SMCA extensions. - Misc fixes and improvements" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Do not check return value of debugfs_create functions x86/MCE: Determine MCA banks' init state properly x86/MCE: Make the number of MCA banks a per-CPU variable x86/MCE/AMD: Don't cache block addresses on SMCA systems x86/MCE: Make mce_banks a per-CPU array x86/MCE: Make struct mce_banks[] static RAS/CEC: Add copyright RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there RAS/CEC: Dump the different array element sections RAS/CEC: Rename count_threshold to action_threshold RAS/CEC: Sanity-check array on every insertion RAS/CEC: Fix potential memory leak RAS/CEC: Do not set decay value on error RAS/CEC: Check count_threshold unconditionally RAS/CEC: Fix pfn insertion	2019-07-08 16:31:06 -07:00
Linus Torvalds	e192832869	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - rwsem scalability improvements, phase #2, by Waiman Long, which are rather impressive: "On a 2-socket 40-core 80-thread Skylake system with 40 reader and writer locking threads, the min/mean/max locking operations done in a 5-second testing window before the patchset were: 40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810 40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255 After the patchset, they became: 40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741 40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098" There's a lot of changes to the locking implementation that makes it similar to qrwlock, including owner handoff for more fair locking. Another microbenchmark shows how across the spectrum the improvements are: "With a locking microbenchmark running on 5.1 based kernel, the total locking rates (in kops/s) on a 2-socket Skylake system with equal numbers of readers and writers (mixed) before and after this patchset were: # of Threads Before Patch After Patch ------------ ------------ ----------- 2 2,618 4,193 4 1,202 3,726 8 802 3,622 16 729 3,359 32 319 2,826 64 102 2,744" The changes are extensive and the patch-set has been through several iterations addressing various locking workloads. There might be more regressions, but unless they are pathological I believe we want to use this new implementation as the baseline going forward. - jump-label optimizations by Daniel Bristot de Oliveira: the primary motivation was to remove IPI disturbance of isolated RT-workload CPUs, which resulted in the implementation of batched jump-label updates. Beyond the improvement of the real-time characteristics kernel, in one test this patchset improved static key update overhead from 57 msecs to just 1.4 msecs - which is a nice speedup as well. - atomic64_t cross-arch type cleanups by Mark Rutland: over the last ~10 years of atomic64_t existence the various types used by the APIs only had to be self-consistent within each architecture - which means they became wildly inconsistent across architectures. Mark puts and end to this by reworking all the atomic64 implementations to use 's64' as the base type for atomic64_t, and to ensure that this type is consistently used for parameters and return values in the API, avoiding further problems in this area. - A large set of small improvements to lockdep by Yuyang Du: type cleanups, output cleanups, function return type and othr cleanups all around the place. - A set of percpu ops cleanups and fixes by Peter Zijlstra. - Misc other changes - please see the Git log for more details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits) locking/lockdep: increase size of counters for lockdep statistics locking/atomics: Use sed(1) instead of non-standard head(1) option locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING x86/jump_label: Make tp_vec_nr static x86/percpu: Optimize raw_cpu_xchg() x86/percpu, sched/fair: Avoid local_clock() x86/percpu, x86/irq: Relax {set,get}_irq_regs() x86/percpu: Relax smp_processor_id() x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}() locking/rwsem: Guard against making count negative locking/rwsem: Adaptive disabling of reader optimistic spinning locking/rwsem: Enable time-based spinning on reader-owned rwsem locking/rwsem: Make rwsem->owner an atomic_long_t locking/rwsem: Enable readers spinning on writer locking/rwsem: Clarify usage of owner's nonspinaable bit locking/rwsem: Wake up almost all readers in wait queue locking/rwsem: More optimal RT task handling of null owner locking/rwsem: Always release wait_lock before waking up tasks locking/rwsem: Implement lock handoff to prevent lock starvation locking/rwsem: Make rwsem_spin_on_owner() return owner state ...	2019-07-08 16:12:03 -07:00
Michael Kelley	765e33f521	Drivers: hv: vmbus: Break out ISA independent parts of mshyperv.h Break out parts of mshyperv.h that are ISA independent into a separate file in include/asm-generic. This move facilitates ARM64 code reusing these definitions and avoids code duplication. No functionality or behavior is changed. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-07-08 19:06:27 -04:00
Max Filippov	775f1f7eac	xtensa: virt: add defconfig and DTS Add defconfig and DTS for a virt board. Defconfig enables PCIe host and a number of virtio devices. DTS routes legacy PCI IRQs to the first four level-triggered external IRQ lines. CPU core with edge-triggered IRQs among the first four may need a custom DTS to work correctly. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-07-08 14:32:06 -07:00
Linus Torvalds	223cea6a4f	Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti updates from Thomas Gleixner: "The speculative paranoia departement delivers a few more plugs for possible (probably theoretical) spectre/mds leaks" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tls: Fix possible spectre-v1 in do_get_thread_area() x86/ptrace: Fix possible spectre-v1 in ptrace_get_debugreg() x86/speculation/mds: Eliminate leaks by trace_hardirqs_on()	2019-07-08 12:23:00 -07:00
Linus Torvalds	2f0f6503e3	Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "A rather large series consolidating the HPET code, which was triggered by the attempt to bolt HPET NMI watchdog support on to the existing maze with the usual duct tape and super glue approach. This mainly removes two separate partially redundant storage layers and consolidates them into a single one which provides a consistent view of the different HPET channels and their usage and allows to integrate HPET NMI watchdog support (if it turns out to be feasible) in a non intrusive way" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) x86/hpet: Use channel for legacy clockevent storage x86/hpet: Use common init for legacy clockevent x86/hpet: Carve out shareable parts of init_one_hpet_msi_clockevent() x86/hpet: Consolidate clockevent functions x86/hpet: Wrap legacy clockevent in hpet_channel x86/hpet: Use cached info instead of extra flags x86/hpet: Move clockevents into channels x86/hpet: Rename variables to prepare for switching to channels x86/hpet: Add function to select a /dev/hpet channel x86/hpet: Add mode information to struct hpet_channel x86/hpet: Use cached channel data x86/hpet: Introduce struct hpet_base and struct hpet_channel x86/hpet: Coding style cleanup x86/hpet: Clean up comments x86/hpet: Make naming consistent x86/hpet: Remove not required includes x86/hpet: Decapitalize and rename EVT_TO_HPET_DEV x86/hpet: Simplify counter validation x86/hpet: Separate counter check out of clocksource register code x86/hpet: Shuffle code around for readability sake ...	2019-07-08 12:16:40 -07:00
Linus Torvalds	13324c42c1	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU feature updates from Thomas Gleixner: "Updates for x86 CPU features: - Support for UMWAIT/UMONITOR, which allows to use MWAIT and MONITOR instructions in user space to save power e.g. in HPC workloads which spin wait on synchronization points. The maximum time a MWAIT can halt in userspace is controlled by the kernel and can be adjusted by the sysadmin. - Speed up the MTRR handling code on CPUs which support cache self-snooping correctly. On those CPUs the wbinvd() invocations can be omitted which speeds up the MTRR setup by a factor of 50. - Support for the new x86 vendor Zhaoxin who develops processors based on the VIA Centaur technology. - Prevent 'cat /proc/cpuinfo' from affecting isolated NOHZ_FULL CPUs by sending IPIs to retrieve the CPU frequency and use the cached values instead. - The addition and late revert of the FSGSBASE support. The revert was required as it turned out that the code still has hard to diagnose issues. Yet another engineering trainwreck... - Small fixes, cleanups, improvements and the usual new Intel CPU family/model addons" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/fsgsbase: Revert FSGSBASE support selftests/x86/fsgsbase: Fix some test case bugs x86/entry/64: Fix and clean up paranoid_exit x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled selftests/x86: Test SYSCALL and SYSENTER manually with TF set x86/mtrr: Skip cache flushes on CPUs with cache self-snooping x86/cpu/intel: Clear cache self-snoop capability in CPUs with known errata Documentation/ABI: Document umwait control sysfs interfaces x86/umwait: Add sysfs interface to control umwait maximum time x86/umwait: Add sysfs interface to control umwait C0.2 state x86/umwait: Initialize umwait control values x86/cpufeatures: Enumerate user wait instructions x86/cpu: Disable frequency requests via aperfmperf IPI for nohz_full CPUs x86/acpi/cstate: Add Zhaoxin processors support for cache flush policy in C3 ACPI, x86: Add Zhaoxin processors support for NONSTOP TSC x86/cpu: Create Zhaoxin processors architecture support file x86/cpu: Split Tremont based Atoms from the rest Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit ...	2019-07-08 11:59:59 -07:00
Linus Torvalds	ab2486a9ee	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU updates from Thomas Gleixner: "A small set of updates for the FPU code: - Make the no387/nofxsr command line options useful by restricting them to 32bit and actually clearing all dependencies to prevent random crashes and malfunction. - Simplify and cleanup the kernel_fpu_() helpers" 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Inline fpu__xstate_clear_all_cpu_caps() x86/fpu: Make 'no387' and 'nofxsr' command line options useful x86/fpu: Remove the fpu__save() export x86/fpu: Simplify kernel_fpu_begin() x86/fpu: Simplify kernel_fpu_end()	2019-07-08 11:45:26 -07:00
Linus Torvalds	0d37dde706	Merge branch 'x86-entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vsyscall updates from Thomas Gleixner: "Further hardening of the legacy vsyscall by providing support for execute only mode and switching the default to it. This prevents a certain class of attacks which rely on the vsyscall page being accessible at a fixed address in the canonical kernel address space" * 'x86-entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86: Add a test for process_vm_readv() on the vsyscall page x86/vsyscall: Add __ro_after_init to global variables x86/vsyscall: Change the default vsyscall mode to xonly selftests/x86/vsyscall: Verify that vsyscall=none blocks execution x86/vsyscall: Document odd SIGSEGV error code for vsyscalls x86/vsyscall: Show something useful on a read fault x86/vsyscall: Add a new vsyscall=xonly mode Documentation/admin: Remove the vsyscall=native documentation	2019-07-08 11:42:09 -07:00
Linus Torvalds	0902d5011c	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x96 apic updates from Thomas Gleixner: "Updates for the x86 APIC interrupt handling and APIC timer: - Fix a long standing issue with spurious interrupts which was caused by the big vector management rework a few years ago. Robert Hodaszi provided finally enough debug data and an excellent initial failure analysis which allowed to understand the underlying issues. This contains a change to the core interrupt management code which is required to handle this correctly for the APIC/IO_APIC. The core changes are NOOPs for most architectures except ARM64. ARM64 is not impacted by the change as confirmed by Marc Zyngier. - Newer systems allow to disable the PIT clock for power saving causing panic in the timer interrupt delivery check of the IO/APIC when the HPET timer is not enabled either. While the clock could be turned on this would cause an endless whack a mole game to chase the proper register in each affected chipset. These systems provide the relevant frequencies for TSC, CPU and the local APIC timer via CPUID and/or MSRs, which allows to avoid the PIT/HPET based calibration. As the calibration code is the only usage of the legacy timers on modern systems and is skipped anyway when the frequencies are known already, there is no point in setting up the PIT and actually checking for the interrupt delivery via IO/APIC. To achieve this on a wide variety of platforms, the CPUID/MSR based frequency readout has been made more robust, which also allowed to remove quite some workarounds which turned out to be not longer required. Thanks to Daniel Drake for analysis, patches and verification" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Seperate unused system vectors from spurious entry again x86/irq: Handle spurious interrupt after shutdown gracefully x86/ioapic: Implement irq_get_irqchip_state() callback genirq: Add optional hardware synchronization for shutdown genirq: Fix misleading synchronize_irq() documentation genirq: Delay deactivation in free_irq() x86/timer: Skip PIT initialization on modern chipsets x86/apic: Use non-atomic operations when possible x86/apic: Make apic_bsp_setup() static x86/tsc: Set LAPIC timer period to crystal clock frequency x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period' x86/tsc: Use CPUID.0x16 to calculate missing crystal frequency	2019-07-08 11:22:57 -07:00
Linus Torvalds	927ba67a63	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The timer and timekeeping departement delivers: Core: - The consolidation of the VDSO code into a generic library including the conversion of x86 and ARM64. Conversion of ARM and MIPS are en route through the relevant maintainer trees and should end up in 5.4. This gets rid of the unnecessary different copies of the same code and brings all architectures on the same level of VDSO functionality. - Make the NTP user space interface more robust by restricting the TAI offset to prevent undefined behaviour. Includes a selftest. - Validate user input in the compat settimeofday() syscall to catch invalid values which would be turned into valid values by a multiplication overflow - Consolidate the time accessors - Small fixes, improvements and cleanups all over the place Drivers: - Support for the NXP system counter, TI davinci timer - Move the Microsoft HyperV clocksource/events code into the drivers/clocksource directory so it can be shared between x86 and ARM64. - Overhaul of the Tegra driver - Delay timer support for IXP4xx - Small fixes, improvements and cleanups as usual" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) time: Validate user input in compat_settimeofday() timer: Document TIMER_PINNED clocksource/drivers: Continue making Hyper-V clocksource ISA agnostic clocksource/drivers: Make Hyper-V clocksource ISA agnostic MAINTAINERS: Fix Andy's surname and the directory entries of VDSO hrtimer: Use a bullet for the returns bullet list arm64: vdso: Fix compilation with clang older than 8 arm64: compat: Fix __arch_get_hw_counter() implementation arm64: Fix __arch_get_hw_counter() implementation lib/vdso: Make delta calculation work correctly MAINTAINERS: Add entry for the generic VDSO library arm64: compat: No need for pre-ARMv7 barriers on an ARMv8 system arm64: vdso: Remove unnecessary asm-offsets.c definitions vdso: Remove superfluous #ifdef __KERNEL__ in vdso/datapage.h clocksource/drivers/davinci: Add support for clocksource clocksource/drivers/davinci: Add support for clockevents clocksource/drivers/tegra: Set up maximum-ticks limit properly clocksource/drivers/tegra: Cycles can't be 0 clocksource/drivers/tegra: Restore base address before cleanup clocksource/drivers/tegra: Add verbose definition for 1MHz constant ...	2019-07-08 11:06:29 -07:00
Linus Torvalds	e0e86b111b	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP/hotplug updates from Thomas Gleixner: "A small set of updates for SMP and CPU hotplug: - Abort disabling secondary CPUs in the freezer when a wakeup is pending instead of evaluating it only after all CPUs have been offlined. - Remove the shared annotation for the strict per CPU cfd_data in the smp function call core code. - Remove the return values of smp_call_function() and on_each_cpu() as they are unconditionally 0. Fixup the few callers which actually bothered to check the return value" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Remove smp_call_function() and on_each_cpu() return values smp: Do not mark call_function_data as shared cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending cpu/hotplug: Fix notify_cpu_starting() reference in bringup_wait_for_ap()	2019-07-08 10:39:56 -07:00
Linus Torvalds	1758feddb0	s390 updates for the 5.3 merge window - Improve stop_machine wait logic: replace cpu_relax_yield call in generic stop_machine function with a weak stop_machine_yield function. This is overridden on s390, which yields the current cpu to the neighbouring cpu after a couple of retries, instead of blindly giving up the cpu to the hipervisor. This significantly improves stop_machine performance on s390 in overcommitted scenarios. This includes common code changes which have been Acked by Peter Zijlstra and Thomas Gleixner. - Improve jump label transformation speed: transform jump labels without using stop_machine. - Refactoring of the vfio-ccw cp handling, simplifying the code and avoiding unneeded allocating/copying. - Various vfio-ccw fixes (ccw translation, state machine). - Add support for vfio-ap queue interrupt control in the guest. This includes s390 kvm changes which have been Acked by Christian Borntraeger. - Add protected virtualization support for virtio-ccw. - Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to remove some code which most likely isn't working at all, besides that s390 didn't even compile for !CONFIG_SMP. - Support for special flagged EP11 CPRBs for zcrypt. - Handle PCI devices with no support for new MIO instructions. - Avoid KASAN false positives in reworked stack unwinder. - Couple of fixes for the QDIO layer. - Convert s390 specific documentation to ReST format. - Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if hardware is missing. This way our modules behave like most other modules and which is also what systemd's systemd-modules-load.service expects. - Replace defconfig with performance_defconfig, so there is one config file less to maintain. - Remove the SCLP call home device driver, which was never useful. - Cleanups all over the place. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl0iEpcACgkQjYWKoQLX FBgtZwf8DOJ6COUG91jKP0RSDlc2YvIMBxopQ38ql1lIsTj5t6DvJ2z3X5uct1wy 6mMiF01VuyD4V4UXbTJQrihzNx7D4dUh47s2sS+diGHxJyXacVxlmjS5k+6pLIUO AyLvtCcoqDPPiThqnSTZFRm/TcfO/25fCG/IdjrFGj1MD09wHpUCh16tmRPTGFlC BWZeilDT77fVXnh7Ggn3JB0mQay5PAw2ODOxELHTUBaLmYF8RJPPVKBPmXGl9P1W 84ESm2p+iALGGWDiTOUad9eu8wyQci/V/R+hFgs0Bz/HRcjznNH5EVvfQNCD4VNF g/PET10nIQYZv2BNdi0cwRjR9jCFbw== =jp0i -----END PGP SIGNATURE----- Merge tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Improve stop_machine wait logic: replace cpu_relax_yield call in generic stop_machine function with a weak stop_machine_yield function. This is overridden on s390, which yields the current cpu to the neighbouring cpu after a couple of retries, instead of blindly giving up the cpu to the hipervisor. This significantly improves stop_machine performance on s390 in overcommitted scenarios. This includes common code changes which have been Acked by Peter Zijlstra and Thomas Gleixner. - Improve jump label transformation speed: transform jump labels without using stop_machine. - Refactoring of the vfio-ccw cp handling, simplifying the code and avoiding unneeded allocating/copying. - Various vfio-ccw fixes (ccw translation, state machine). - Add support for vfio-ap queue interrupt control in the guest. This includes s390 kvm changes which have been Acked by Christian Borntraeger. - Add protected virtualization support for virtio-ccw. - Enforce both CONFIG_SMP and CONFIG_HOTPLUG_CPU, which allows to remove some code which most likely isn't working at all, besides that s390 didn't even compile for !CONFIG_SMP. - Support for special flagged EP11 CPRBs for zcrypt. - Handle PCI devices with no support for new MIO instructions. - Avoid KASAN false positives in reworked stack unwinder. - Couple of fixes for the QDIO layer. - Convert s390 specific documentation to ReST format. - Let s390 crypto modules return -ENODEV instead of -EOPNOTSUPP if hardware is missing. This way our modules behave like most other modules and which is also what systemd's systemd-modules-load.service expects. - Replace defconfig with performance_defconfig, so there is one config file less to maintain. - Remove the SCLP call home device driver, which was never useful. - Cleanups all over the place. * tag 's390-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits) docs: s390: s390dbf: typos and formatting, update crash command docs: s390: unify and update s390dbf kdocs at debug.c docs: s390: restore important non-kdoc parts of s390dbf.rst vfio-ccw: Fix the conversion of Format-0 CCWs to Format-1 s390/pci: correctly handle MIO opt-out s390/pci: deal with devices that have no support for MIO instructions s390: ap: kvm: Enable PQAP/AQIC facility for the guest s390: ap: implement PAPQ AQIC interception in kernel vfio: ap: register IOMMU VFIO notifier s390: ap: kvm: add PQAP interception for AQIC s390/unwind: cleanup unused READ_ONCE_TASK_STACK s390/kasan: avoid false positives during stack unwind s390/qdio: don't touch the dsci in tiqdio_add_input_queues() s390/qdio: (re-)initialize tiqdio list entries s390/dasd: Fix a precision vs width bug in dasd_feature_list() s390/cio: introduce driver_override on the css bus vfio-ccw: make convert_ccw0_to_ccw1 static vfio-ccw: Remove copy_ccw_from_iova() vfio-ccw: Factor out the ccw0-to-ccw1 transition vfio-ccw: Copy CCW data outside length calculation ...	2019-07-08 10:06:12 -07:00
Max Filippov	d6d5f19e21	xtensa: abstract 'entry' and 'retw' in assembly code Provide abi_entry, abi_entry_default, abi_ret and abi_ret_default macros that allocate aligned stack frame in windowed and call0 ABIs. Provide XTENSA_SPILL_STACK_RESERVE macro that specifies required stack frame size when register spilling is involved. Replace all uses of 'entry' and 'retw' with the above macros. This makes most of the xtensa assembly code ready for XEA3 and call0 ABI. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-07-08 10:04:48 -07:00
Linus Torvalds	278ecbf027	m68k updates for v5.3 - Switch to using the generic remapping DMA allocator, - Defconfig updates. -----BEGIN PGP SIGNATURE----- iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCXSL2rhUcZ2VlcnRAbGlu dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XBcgQEAgcOMl12clyZtus960R+OpKhTyjn/ oTPrDFu/jSj2ZY0BAPg9KaQjMhInXHzRs5S9IbwKcbsWogRGg76NbR09ehsM =9T08 -----END PGP SIGNATURE----- Merge tag 'm68k-for-v5.3-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - switch to using the generic remapping DMA allocator - defconfig updates * tag 'm68k-for-v5.3-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Implement arch_dma_prep_coherent() m68k: Use the generic dma coherent remap allocator m68k: defconfig: Update defconfigs for v5.2-rc1	2019-07-08 10:00:58 -07:00
Linus Torvalds	dfd437a257	arm64 updates for 5.3: - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP} - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to manage the permissions of executable vmalloc regions more strictly - Slight performance improvement by keeping softirqs enabled while touching the FPSIMD/SVE state (kernel_neon_begin/end) - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG and AXFLAG instructions for floating point comparison flags manipulation) and FRINT (rounding floating point numbers to integers) - Re-instate ARM64_PSEUDO_NMI support which was previously marked as BROKEN due to some bugs (now fixed) - Improve parking of stopped CPUs and implement an arm64-specific panic_smp_self_stop() to avoid warning on not being able to stop secondary CPUs during panic - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI platforms - perf: DDR performance monitor support for iMX8QXP - cache_line_size() can now be set from DT or ACPI/PPTT if provided to cope with a system cache info not exposed via the CPUID registers - Avoid warning on hardware cache line size greater than ARCH_DMA_MINALIGN if the system is fully coherent - arm64 do_page_fault() and hugetlb cleanups - Refactor set_pte_at() to avoid redundant READ_ONCE(ptep) - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags' introduced in 5.1) - CONFIG_RANDOMIZE_BASE now enabled in defconfig - Allow the selection of ARM64_MODULE_PLTS, currently only done via RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill over into the vmalloc area - Make ZONE_DMA32 configurable -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl0eHqcACgkQa9axLQDI XvFyNA/+L+bnkz8m3ncydlqqfXomQn4eJJVQ8Uksb0knJz+1+3CUxxbO4ry4jXZN fMkbggYrDPRKpDbsUl0lsRipj7jW9bqan+N37c3SWqCkgb6HqDaHViwxdx6Ec/Uk gHudozDSPh/8c7hxGcSyt/CFyuW6b+8eYIQU5rtIgz8aVY2BypBvS/7YtYCbIkx0 w4CFleRTK1zXD5mJQhrc6jyDx659sVkrAvdhf6YIymOY8nBTv40vwdNo3beJMYp8 Po/+0Ixu+VkHUNtmYYZQgP/AGH96xiTcRnUqd172JdtRPpCLqnLqwFokXeVIlUKT KZFMDPzK+756Ayn4z4huEePPAOGlHbJje8JVNnFyreKhVVcCotW7YPY/oJR10bnc eo7yD+DxABTn+93G2yP436bNVa8qO1UqjOBfInWBtnNFJfANIkZweij/MQ6MjaTA o7KtviHnZFClefMPoiI7HDzwL8XSmsBDbeQ04s2Wxku1Y2xUHLx4iLmadwLQ1ZPb lZMTZP3N/T1554MoURVA1afCjAwiqU3bt1xDUGjbBVjLfSPBAn/25IacsG9Li9AF 7Rp1M9VhrfLftjFFkB2HwpbhRASOxaOSx+EI3kzEfCtM2O9I1WHgP3rvCdc3l0HU tbK0/IggQicNgz7GSZ8xDlWPwwSadXYGLys+xlMZEYd3pDIOiFc= =0TDT -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP} - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to manage the permissions of executable vmalloc regions more strictly - Slight performance improvement by keeping softirqs enabled while touching the FPSIMD/SVE state (kernel_neon_begin/end) - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG and AXFLAG instructions for floating point comparison flags manipulation) and FRINT (rounding floating point numbers to integers) - Re-instate ARM64_PSEUDO_NMI support which was previously marked as BROKEN due to some bugs (now fixed) - Improve parking of stopped CPUs and implement an arm64-specific panic_smp_self_stop() to avoid warning on not being able to stop secondary CPUs during panic - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI platforms - perf: DDR performance monitor support for iMX8QXP - cache_line_size() can now be set from DT or ACPI/PPTT if provided to cope with a system cache info not exposed via the CPUID registers - Avoid warning on hardware cache line size greater than ARCH_DMA_MINALIGN if the system is fully coherent - arm64 do_page_fault() and hugetlb cleanups - Refactor set_pte_at() to avoid redundant READ_ONCE(ptep) - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags' introduced in 5.1) - CONFIG_RANDOMIZE_BASE now enabled in defconfig - Allow the selection of ARM64_MODULE_PLTS, currently only done via RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill over into the vmalloc area - Make ZONE_DMA32 configurable * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) perf: arm_spe: Enable ACPI/Platform automatic module loading arm_pmu: acpi: spe: Add initial MADT/SPE probing ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens ACPI/PPTT: Modify node flag detection to find last IDENTICAL x86/entry: Simplify _TIF_SYSCALL_EMU handling arm64: rename dump_instr as dump_kernel_instr arm64/mm: Drop [PTE\|PMD]_TYPE_FAULT arm64: Implement panic_smp_self_stop() arm64: Improve parking of stopped CPUs arm64: Expose FRINT capabilities to userspace arm64: Expose ARMv8.5 CondM capability to userspace arm64: defconfig: enable CONFIG_RANDOMIZE_BASE arm64: ARM64_MODULES_PLTS must depend on MODULES arm64: bpf: do not allocate executable memory arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP arm64: module: create module allocations without exec permissions arm64: Allow user selection of ARM64_MODULE_PLTS acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 arm64: Allow selecting Pseudo-NMI again ...	2019-07-08 09:54:55 -07:00
Ingo Molnar	552a031ba1	Linux 5.2 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0idTweHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGZesIAJDKicw2Voyx8K8m 3pXSK+71RuO/d3Y9M51mdfTMKRP4PHR9/4wVZ9wHPwC4dV6wxgsmIYCF69a1Wety LD1MpDCP1DK5wVfPNKVX2xmj7ua6iutPtSsJHzdzM2TlscgsrFKjmUccqJ5JLwL5 c34nqwXWnzzRyI5Ga9cQSlwzAXq0vDHXyML3AnCosSsLX0lKFrHlK1zttdOPNkfj dXRN62g3q+9kVQozzhDXb8atZZ7IkBk8Q0lujpNXW83Ci1VjaVNv3SB8GZTXIlLj U15VdyuwfJDfpBgFBN6/unzVaAB6FFrEKy0jT1aeTyKarMKDKgOnJjn10aKjDNno /bXsKKc= =TVqV -----END PGP SIGNATURE----- Merge tag 'v5.2' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-08 18:04:41 +02:00
Marc Zyngier	1e0cf16cda	KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register As part of setting up the host context, we populate its MPIDR by using cpu_logical_map(). It turns out that contrary to arm64, cpu_logical_map() on 32bit ARM doesn't return the full MPIDR, but a truncated version. This leaves the host MPIDR slightly corrupted after the first run of a VM, since we won't correctly restore the MPIDR on exit. Oops. Since we cannot trust cpu_logical_map(), let's adopt a different strategy. We move the initialization of the host CPU context as part of the per-CPU initialization (which, in retrospect, makes a lot of sense), and directly read the MPIDR from the HW. This is guaranteed to work on both arm and arm64. Reported-by: Andre Przywara <Andre.Przywara@arm.com> Tested-by: Andre Przywara <Andre.Przywara@arm.com> Fixes: `32f1395519` ("arm/arm64: KVM: Statically configure the host's view of MPIDR") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-08 16:29:48 +01:00
Russell King	5ccd3bd992	Merge branches 'fixes' and 'misc' Fix up the conflict between "VDSO: Drop implicit common-page-size linker flag" and "vdso: pass --be8 to linker if necessary" Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-07-08 11:32:21 +01:00
Rafael J. Wysocki	3dbeb44854	Merge branch 'pm-sleep' * pm-sleep: PM: sleep: Drop dev_pm_skip_next_resume_phases() ACPI: PM: Drop unused function and function header ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS ACPI: PM: Simplify and fix PM domain hibernation callbacks PCI: PM: Simplify bus-level hibernation callbacks PM: ACPI/PCI: Resume all devices during hibernation kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset() PM: sleep: Update struct wakeup_source documentation drivers: base: power: remove wakeup_sources_stats_dentry variable PM: suspend: Rename pm_suspend_via_s2idle() PM: sleep: Show how long dpm_suspend_start() and dpm_suspend_end() take PM: hibernate: powerpc: Expose pfn_is_nosave() prototype	2019-07-08 10:51:25 +02:00
Eugeniy Paltsev	24a20b0a44	ARC: [plat-hsdk]: Enable AXI DW DMAC in defconfig For some reasons my previous patch "Enable AXI DW DMAC support" was applied only partially (only device tree part). So enable AXI DW DMAC in HSDK defconfig to be able to use it in verification flow. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2019-07-08 09:24:47 +01:00
Eugeniy Paltsev	aab128d006	ARC: [plat-hsdk]: enable DW SPI controller HSDK SoC has DW SPI controller. Enable it in preparation of enabling on-board SPI peripherals. Acked-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2019-07-08 09:24:46 +01:00
Arnd Bergmann	fd5de2721e	ARC: hide unused function unw_hdr_alloc As kernelci.org reports, this function is not used in vdk_hs38_defconfig: arch/arc/kernel/unwind.c:188:14: warning: 'unw_hdr_alloc' defined but not used [-Wunused-function] Fixes: `bc79c9a721` ("ARC: dw2 unwind: Reinstante unwinding out of modules") Link: https://kernelci.org/build/id/5d1cae3f59b514300340c132/logs/ Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2019-07-08 09:24:46 +01:00
Alexey Brodkin	94b8beb972	ARC: [haps] Add Virtio support As a preparation for QEMU usage for ARC let's add basic Virtio-MMIO peripherals support for the platform we're going to use. For now we add 5 Virtio slots in .dts and enable block and network devices via Virtio-MMIO. Note even though typically Virtio register set fits in 0x200 bytes we "allocate" here 0x2000 so that it matches ARC's default 8KiB page size and so remapping of that area is done clearly. We also enable DEVTMPFS automount for more convenient use of external root file-stystem. Before that we used to use built-in Initramfs which didn't automount DEVTMPFS anyways so we didn't need that option, while now it starts making sense. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2019-07-08 09:24:45 +01:00
Vineet Gupta	75370ad440	ARCv2: entry: simplify return to Delay Slot via interrupt Commit `4255b07f2c` ("ARCv2: STAR 9000793984: Handle return from intr to Delay Slot") involved a complex 2 staged trampoline. Apparently this can be greatly simplified by returning from pure kernel mode (iso interrupt) so drop to pure kernel mdoe and execute the normal exception return path. Testing this was a bit of challenge as return from interrupt is rarely executed now after commit `4de0e52867` ("ARCv2: STAR 9000814690: Really Re-enable interrupts to avoid deadlocks"). That fix is necessary evil and pct interrupts etc do exercise intr return path. Anyhow after a revert of above in my local test setup I was able to hit this case and verify the patch works. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2019-07-08 09:24:45 +01:00
Vineet Gupta	68e5c6f073	ARC: entry: EV_Trap expects r10 (vs. r9) to have exception cause avoids 1 MOV instruction in light of double load/store code Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2019-07-08 09:24:44 +01:00
Olof Johansson	35051f8434	Samsung DTS ARM changes for v5.3, third round 1. Fix imprecise abort on Exynos4210 caused by newly added Mali nodes, 2. Reorganize Mali nodes under /soc, 3. Adjust buck regulators voltages on Arndale Octa and Odroid XU3/XU4 family to sane values. -----BEGIN PGP SIGNATURE----- iQJEBAABCgAuFiEE3dJiKD0RGyM7briowTdm5oaLg9cFAl0iMskQHGtyemtAa2Vy bmVsLm9yZwAKCRDBN2bmhouD1wlMD/0Wr9gCWjL+f4axGqL9A/54mTwcaGo67Wrm iynRK9e3miII8ijQGVf6xkpfXGSCpOcYDcy8cR97G4gLRO1zc/lHO2GJBxPf/ALR rvIvrmPqPOe7t/rzruzoiykg4p1C8o4rX0/b6rxsmaSDja0zvEK2J0/E5cmZFy/A w+45dr/AuDkKnzhcy+GjuLmkyRcDTwuPPPXtFEBYvSWBwZMvLwAfckpO5q5iQCkR eTUwtpwnJa2tp1qDM20VaGcqJd6XeADS6cEX1kM5AQ291ZK5F6XmkhJ1pIyeibwp TbBLePAnwUV5/RGYdcVpaKLqESyZ0ctKz5WYMB43TILIgYq0iLq9pxujKGHHTZNX zI9t0yZ/wr4ypXe824S9v1g2rSJPGs6WtZETukZL07CjF5d0/ambJ34hFv7cpgXo AYZi9haVOb/aFzuXvFtF6nhpuggpDpesrYxo+iXmnDU/WSDOPeCljhOzk1gstyue Dt7dJZSWXIZ6i+vY5K7kCUOmpOw4KMaeTH81G+abP/TWaolafnn5tdFalf4+FL+9 f4DguaY0VQ73dhiaIjWHjnkW1zshZIfiLaG2rQjuam1O+foR24TfVfpMj299m29o ahxbN1ynh8o5Ny2ErIqcsprsDEav9fbQijOl9zV/8jMraMI2s+Ny4Z4t5wtEXaxm MZf4lsSKhA== =4yv0 -----END PGP SIGNATURE----- Merge tag 'samsung-dt-5.3-3' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/dt Samsung DTS ARM changes for v5.3, third round 1. Fix imprecise abort on Exynos4210 caused by newly added Mali nodes, 2. Reorganize Mali nodes under /soc, 3. Adjust buck regulators voltages on Arndale Octa and Odroid XU3/XU4 family to sane values. * tag 'samsung-dt-5.3-3' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: dts: exynos: Adjust buck[78] regulators to supported values on Arndale Octa ARM: dts: exynos: Adjust buck[78] regulators to supported values on Odroid XU3 family ARM: dts: exynos: Move Mali400 GPU node to "/soc" ARM: dts: exynos: Fix imprecise abort on Mali GPU probe on Exynos4210 Link: https://lore.kernel.org/r/20190707180115.5562-1-krzk@kernel.org Signed-off-by: Olof Johansson <olof@lixom.net>	2019-07-07 13:33:03 -07:00
Sebastian Andrzej Siewior	7891bc0ab7	x86/fpu: Inline fpu__xstate_clear_all_cpu_caps() All fpu__xstate_clear_all_cpu_caps() does is to invoke one simple function since commit `73e3a7d2a7` ("x86/fpu: Remove the explicit clearing of XSAVE dependent features") so invoke that function directly and remove the wrapper. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190704060743.rvew4yrjd6n33uzx@linutronix.de	2019-07-07 12:01:47 +02:00
Sebastian Andrzej Siewior	9838e3bff0	x86/fpu: Make 'no387' and 'nofxsr' command line options useful The command line option `no387' is designed to disable the FPU entirely. This only 'works' with CONFIG_MATH_EMULATION enabled. But on 64bit this cannot work because user space expects SSE to work which required basic FPU support. MATH_EMULATION does not help because SSE is not emulated. The command line option `nofxsr' should also be limited to 32bit because FXSR is part of the required flags on 64bit so turning it off is not possible. Clearing X86_FEATURE_FPU without emulation enabled will not work anyway and hang in fpu__init_system_early_generic() before the console is enabled. Setting additioal dependencies, ensures that the CPU still boots on a modern CPU. Otherwise, dropping FPU will leave FXSR enabled causing the kernel to crash early in fpu__init_system_mxcsr(). With XSAVE support it will crash in fpu__init_cpu_xstate(). The problem is that xsetbv() with XMM set and SSE cleared is not allowed. That means XSAVE has to be disabled. The XSAVE support is disabled in fpu__init_system_xstate_size_legacy() but it is too late. It can be removed, it has been added in commit `1f999ab5a1` ("x86, xsave: Disable xsave in i387 emulation mode") to use `no387' on a CPU with XSAVE support. All this happens before console output. After hat, the next possible crash is in RAID6 detect code because MMX remained enabled. With a 3DNOW enabled config it will explode in memcpy() for instance due to kernel_fpu_begin() but this is unconditionally enabled. This is enough to boot a Debian Wheezy on a 32bit qemu "host" CPU which supports everything up to XSAVES, AVX2 without 3DNOW. Later, Debian increased the minimum requirements to i686 which means it does not boot userland atleast due to CMOV. After masking the additional features it still keeps SSE4A and 3DNOW* enabled (if present on the host) but those are unused in the kernel. Restrict `no387' and `nofxsr' otions to 32bit only. Add dependencies for FPU, FXSR to additionaly mask CMOV, MMX, XSAVE if FXSR or FPU is cleared. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190703083247.57kjrmlxkai3vpw3@linutronix.de	2019-07-07 12:01:46 +02:00
Linus Torvalds	bcc0e65f47	A few more MIPS fixes: - Fix a silly typo in virt_addr_valid which led to completely bogus behavior (that happened to stop tripping up hardened usercopy despite being broken). - Fix UART parity setup on AR933x systems. - A build fix for non-Linux build machines. - Have the 'all' make target build DTBs, primarily to fit in with the behavior of scripts/package/builddeb. - Handle an execution hazard in TLB exceptions that use KScratch registers, which could inadvertently clobber the $1 register on some generally higher-end out-of-order CPUs. - A MAINTAINERS update to fix the path to the NAND driver for Ingenic systems. -----BEGIN PGP SIGNATURE----- iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXSDJfhUccGF1bC5idXJ0 b25AbWlwcy5jb20ACgkQPqefrLV1AN35ygEA30KckazfjbtmW0EqD+C19sgtbSS3 eCAiweHHwLJoyUUBAJ/HzlZ8ap2X9ilZuFdzKEf1igj5WsLIyrkl6kkauUEA =DRYO -----END PGP SIGNATURE----- Merge tag 'mips_fixes_5.2_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A few more MIPS fixes: - Fix a silly typo in virt_addr_valid which led to completely bogus behavior (that happened to stop tripping up hardened usercopy despite being broken). - Fix UART parity setup on AR933x systems. - A build fix for non-Linux build machines. - Have the 'all' make target build DTBs, primarily to fit in with the behavior of scripts/package/builddeb. - Handle an execution hazard in TLB exceptions that use KScratch registers, which could inadvertently clobber the $1 register on some generally higher-end out-of-order CPUs. - A MAINTAINERS update to fix the path to the NAND driver for Ingenic systems" * tag 'mips_fixes_5.2_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MAINTAINERS: Correct path to moved files MIPS: Add missing EHB in mtc0 -> mfc0 sequence. MIPS: have "plain" make calls build dtbs for selected platforms MIPS: fix build on non-linux hosts MIPS: ath79: fix ar933x uart parity mode MIPS: Fix bounds check virt_addr_valid	2019-07-06 10:32:12 -07:00
Linus Torvalds	9fdb86c8cf	x86 bugfix patches and one compilation fix for ARM. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJdH7H6AAoJEL/70l94x66D3XcH/0If8IAHn746Y+2QoasbmW0m lACzkzNZYzKWUgzJ9+r4eNe0tq7wUzU546Jl2GgOjIcQGnRCPAJMGodGTEq5AGo2 XWGHXkKLa9w3bSCRi9Ov7d1CGO9oDk6PtkTP4xT0oVyEtsgPdKWdEz2dDe8BnK7T BynVcz3JZOm+vE4N1GusjkdN6hbVFBdTZNSvN9uE3iNoUBUoe98Ctv2HiSca9tDF oPJpSWLSThC/cRXn1EX7uXiUTZ2bTaS+mCdwmR6QS5VBEHR+ssWzHq4ru/U0p/vO 7L9AZoyCTqW5pOMuYZFnfntnbu54RUnfc7A0jyqzM2SFeBt8h9hpsVXoWmtnL5k= =qzdP -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "x86 bugfix patches and one compilation fix for ARM" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm64/sve: Fix vq_present() macro to yield a bool KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC KVM: nVMX: Change KVM_STATE_NESTED_EVMCS to signal vmcs12 is copied from eVMCS KVM: nVMX: Allow restore nested-state to enable eVMCS when vCPU in SMM KVM: x86: degrade WARN to pr_warn_ratelimited	2019-07-05 19:13:24 -07:00
Luke Nelson	46dd3d7d28	bpf, riscv: Enable zext optimization for more RV64G ALU ops Commit `66d0d5a854` ("riscv: bpf: eliminate zero extension code-gen") added the new zero-extension optimization for some BPF ALU operations. Since then, bugs in the JIT that have been fixed in the bpf tree require this optimization to be added to other operations: commit `1e692f09e0` ("bpf, riscv: clear high 32 bits for ALU32 add/sub/neg/lsh/rsh/arsh"), and commit `fe121ee531` ("bpf, riscv: clear target register high 32-bits for and/or/xor on ALU32"). Now that these have been merged to bpf-next, the zext optimization can be enabled for the fixed operations. Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Cc: Song Liu <liu.song.a23@gmail.com> Cc: Jiong Wang <jiong.wang@netronome.com> Cc: Xi Wang <xi.wang@gmail.com> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Acked-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-05 23:55:41 +02:00
Wanpeng Li	548f7fb222	KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane which can happen sporadically in product environment. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 21:54:25 +02:00
Markus Elfring	831c4f3da8	xtensa: One function call less in bootmem_init() Avoid an extra function call by using a ternary operator instead of a conditional statement for a setting selection. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Message-Id: <495c9f2e-7880-ee9a-5c61-eee598bb24c2@web.de> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>	2019-07-05 12:02:37 -07:00
Christophe Leroy	a2b6f26c26	powerpc/module64: Use symbolic instructions names. To increase readability/maintainability, replace hard coded instructions values by symbolic names. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Fix R_PPC64_ENTRY case, the addi reads from r2 not r12] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-06 00:29:50 +10:00
Christophe Leroy	4eb4516ead	powerpc/module32: Use symbolic instructions names. To increase readability/maintainability, replace hard coded instructions values by symbolic names. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-06 00:29:50 +10:00
Christophe Leroy	7f9c929a7f	powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h PPC_HA() PPC_HI() and PPC_LO() macros are nice macros. Move them from module64.c to ppc-opcode.h in order to use them in other places. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Clean up formatting in new code, drop duplicates in ftrace.c] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-06 00:29:50 +10:00
Michael Ellerman	2fb0a2c989	powerpc/module64: Fix comment in R_PPC64_ENTRY handling The comment here is wrong, the addi reads from r2 not r12. The code is correct, 0x38420000 = addi r2,r2,0. Fixes: `a61674bdfc` ("powerpc/module: Handle R_PPC64_ENTRY relocations") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-06 00:29:50 +10:00
Paolo Bonzini	01402cf810	kvm: LAPIC: write down valid APIC registers Replace a magic 64-bit mask with a list of valid registers, computing the same mask in the end. Suggested-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 15:32:59 +02:00
Dave Martin	fdec2a9ef8	KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s Currently, the {read,write}_sysreg_el() accessors for accessing particular ELs' sysregs in the presence of VHE rely on some local hacks and define their system register encodings in a way that is inconsistent with the core definitions in <asm/sysreg.h>. As a result, it is necessary to add duplicate definitions for any system register that already needs a definition in sysreg.h for other reasons. This is a bit of a maintenance headache, and the reasons for the _el() accessors working the way they do is a bit historical. This patch gets rid of the shadow sysreg definitions in <asm/kvm_hyp.h>, converts the _el*() accessors to use the core __msr_s/__mrs_s interface, and converts all call sites to use the standard sysreg #define names (i.e., upper case, with SYS_ prefix). This patch will conflict heavily anyway, so the opportunity to clean up some bad whitespace in the context of the changes is taken. The change exposes a few system registers that have no sysreg.h definition, due to msr_s/mrs_s being used in place of msr/mrs: additions are made in order to fill in the gaps. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Link: https://www.spinics.net/lists/kvm-arm/msg31717.html [Rebased to v4.21-rc1] Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> [Rebased to v5.2-rc5, changelog updates] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:57:25 +01:00
Andre Przywara	99adb56763	KVM: arm/arm64: Add save/restore support for firmware workaround state KVM implements the firmware interface for mitigating cache speculation vulnerabilities. Guests may use this interface to ensure mitigation is active. If we want to migrate such a guest to a host with a different support level for those workarounds, migration might need to fail, to ensure that critical guests don't loose their protection. Introduce a way for userland to save and restore the workarounds state. On restoring we do checks that make sure we don't downgrade our mitigation level. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:56:27 +01:00
Andre Przywara	c118bbb527	arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests Recent commits added the explicit notion of "workaround not required" to the state of the Spectre v2 (aka. BP_HARDENING) workaround, where we just had "needed" and "unknown" before. Export this knowledge to the rest of the kernel and enhance the existing kvm_arm_harden_branch_predictor() to report this new state as well. Export this new state to guests when they use KVM's firmware interface emulation. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:56:27 +01:00
Andrew Murray	418e5ca88c	KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions The kvm_pmu_{enable/disable}_counter functions can enable/disable multiple counters at once as they operate on a bitmask. Let's make this clearer by renaming the function. Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:56:04 +01:00
Paolo Bonzini	101628ded5	KVM: LAPIC: ARBPRI is a reserved register for x2APIC kvm-unit-tests were adjusted to match bare metal behavior, but KVM itself was not doing what bare metal does; fix that. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 14:14:15 +02:00
James Morse	11b41626bd	KVM: arm64: Skip more of the SError vaxorcism During __guest_exit() we need to consume any SError left pending by the guest so it doesn't contaminate the host. With v8.2 we use the ESB-instruction. For systems without v8.2, we use dsb+isb and unmask SError. We do this on every guest exit. Use the same dsb+isr_el1 trick, this lets us know if an SError is pending after the dsb, allowing us to skip the isb and self-synchronising PSTATE write if its not. This means SError remains masked during KVM's world-switch, so any SError that occurs during this time is reported by the host, instead of causing a hyp-panic. As we're benchmarking this code lets polish the layout. If you give gcc likely()/unlikely() hints in an if() condition, it shuffles the generated assembly so that the likely case is immediately after the branch. Lets do the same here. Signed-off-by: James Morse <james.morse@arm.com> Changes since v2: * Added isb after the dsb to prevent an early read Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:34 +01:00
James Morse	dad6321ffa	KVM: arm64: Re-mask SError after the one instruction window KVM consumes any SError that were pending during guest exit with a dsb/isb and unmasking SError. It currently leaves SError unmasked for the rest of world-switch. This means any SError that occurs during this part of world-switch will cause a hyp-panic. We'd much prefer it to remain pending until we return to the host. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:34 +01:00
James Morse	3276cc2489	arm64: Update silicon-errata.txt for Neoverse-N1 #1349291 Neoverse-N1 affected by #1349291 may report an Uncontained RAS Error as Unrecoverable. The kernel's architecture code already considers Unrecoverable errors as fatal as without kernel-first support no further error-handling is possible. Now that KVM attributes SError to the host/guest more precisely the host's architecture code will always handle host errors that become pending during world-switch. Errors misclassified by this errata that affected the guest will be re-injected to the guest as an implementation-defined SError, which can be uncontained. Until kernel-first support is implemented, no workaround is needed for this issue. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:30 +01:00
James Morse	5dcd0fdbb4	KVM: arm64: Defer guest entry when an asynchronous exception is pending SError that occur during world-switch's entry to the guest will be accounted to the guest, as the exception is masked until we enter the guest... but we want to attribute the SError as precisely as possible. Reading DISR_EL1 before guest entry requires free registers, and using ESB+DISR_EL1 to consume and read back the ESR would leave KVM holding a host SError... We would rather leave the SError pending and let the host take it once we exit world-switch. To do this, we need to defer guest-entry if an SError is pending. Read the ISR to see if SError (or an IRQ) is pending. If so fake an exit. Place this check between __guest_enter()'s save of the host registers, and restore of the guest's. SError that occur between here and the eret into the guest must have affected the guest's registers, which we can naturally attribute to the guest. The dsb is needed to ensure any previous writes have been done before we read ISR_EL1. On systems without the v8.2 RAS extensions this doesn't give us anything as we can't contain errors, and the ESR bits to describe the severity are all implementation-defined. Replace this with a nop for these systems. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:30 +01:00
James Morse	0e5b9c085d	KVM: arm64: Consume pending SError as early as possible On systems with v8.2 we switch the 'vaxorcism' of guest SError with an alternative sequence that uses the ESB-instruction, then reads DISR_EL1. This saves the unmasking and remasking of asynchronous exceptions. We do this after we've saved the guest registers and restored the host's. Any SError that becomes pending due to this will be accounted to the guest, when it actually occurred during host-execution. Move the ESB-instruction as early as possible. Any guest SError will become pending due to this ESB-instruction and then consumed to DISR_EL1 before the host touches anything. This lets us account for host/guest SError precisely on the guest exit exception boundary. Because the ESB-instruction now lands in the preamble section of the vectors, we need to add it to the unpatched indirect vectors too, and to any sequence that may be patched in over the top. The ESB-instruction always lives in the head of the vectors, to be before any memory write. Whereas the register-store always lives in the tail. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:29 +01:00
James Morse	5d994374e8	KVM: arm64: Make indirect vectors preamble behaviour symmetric The KVM indirect vectors support is a little complicated. Different CPUs may use different exception vectors for KVM that are generated at boot. Adding new instructions involves checking all the possible combinations do the right thing. To make changes here easier to review lets state what we expect of the preamble: 1. The first vector run, must always run the preamble. 2. Patching the head or tail of the vector shouldn't remove preamble instructions. Today, this is easy as we only have one instruction in the preamble. Change the unpatched tail of the indirect vector so that it always runs this, regardless of patching. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:29 +01:00
James Morse	3dbf100b0b	KVM: arm64: Abstract the size of the HYP vectors pre-amble The EL2 vector hardening feature causes KVM to generate vectors for each type of CPU present in the system. The generated sequences already do some of the early guest-exit work (i.e. saving registers). To avoid duplication the generated vectors branch to the original vector just after the preamble. This size is hard coded. Adding new instructions to the HYP vector causes strange side effects, which are difficult to debug as the affected code is patched in at runtime. Add KVM_VECTOR_PREAMBLE to tell kvm_patch_vector_branch() how big the preamble is. The valid_vect macro can then validate this at build time. Reviewed-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:29 +01:00
James Morse	2b68a2a963	arm64: assembler: Switch ESB-instruction with a vanilla nop if !ARM64_HAS_RAS The ESB-instruction is a nop on CPUs that don't implement the RAS extensions. This lets us use it in places like the vectors without having to use alternatives. If someone disables CONFIG_ARM64_RAS_EXTN, this instruction still has its RAS extensions behaviour, but we no longer read DISR_EL1 as this register does depend on alternatives. This could go wrong if we want to synchronize an SError from a KVM guest. On a CPU that has the RAS extensions, but the KConfig option was disabled, we consume the pending SError with no chance of ever reading it. Hide the ESB-instruction behind the CONFIG_ARM64_RAS_EXTN option, outputting a regular nop if the feature has been disabled. Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2019-07-05 13:03:29 +01:00
Krish Sadhukhan	1ef23e1f16	KVM nVMX: Check Host Segment Registers and Descriptor Tables on vmentry of nested guests According to section "Checks on Host Segment and Descriptor-Table Registers" in Intel SDM vol 3C, the following checks are performed on vmentry of nested guests: - In the selector field for each of CS, SS, DS, ES, FS, GS and TR, the RPL (bits 1:0) and the TI flag (bit 2) must be 0. - The selector fields for CS and TR cannot be 0000H. - The selector field for SS cannot be 0000H if the "host address-space size" VM-exit control is 0. - On processors that support Intel 64 architecture, the base-address fields for FS, GS and TR must contain canonical addresses. Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 14:01:51 +02:00
Sean Christopherson	f087a02941	KVM: nVMX: Stash L1's CR3 in vmcs01.GUEST_CR3 on nested entry w/o EPT KVM does not have 100% coverage of VMX consistency checks, i.e. some checks that cause VM-Fail may only be detected by hardware during a nested VM-Entry. In such a case, KVM must restore L1's state to the pre-VM-Enter state as L2's state has already been loaded into KVM's software model. L1's CR3 and PDPTRs in particular are loaded from vmcs01.GUEST_. But when EPT is disabled, the associated fields hold KVM's shadow values, not L1's "real" values. Fortunately, when EPT is disabled the PDPTRs come from memory, i.e. are not cached in the VMCS. Which leaves CR3 as the sole anomaly. A previously applied workaround to handle CR3 was to force nested early checks if EPT is disabled: commit `2b27924bb1` ("KVM: nVMX: always use early vmcs check when EPT is disabled") Forcing nested early checks is undesirable as doing so adds hundreds of cycles to every nested VM-Entry. Rather than take this performance hit, handle CR3 by overwriting vmcs01.GUEST_CR3 with L1's CR3 during nested VM-Entry when EPT is disabled and* nested early checks are disabled. By stuffing vmcs01.GUEST_CR3, nested_vmx_restore_host_state() will naturally restore the correct vcpu->arch.cr3 from vmcs01.GUEST_CR3. These shenanigans work because nested_vmx_restore_host_state() does a full kvm_mmu_reset_context(), i.e. unloads the current MMU, which guarantees vmcs01.GUEST_CR3 will be rewritten with a new shadow CR3 prior to re-entering L1. vcpu->arch.root_mmu.root_hpa is set to INVALID_PAGE via: nested_vmx_restore_host_state() -> kvm_mmu_reset_context() -> kvm_mmu_unload() -> kvm_mmu_free_roots() kvm_mmu_unload() has WARN_ON(root_hpa != INVALID_PAGE), i.e. we can bank on 'root_hpa == INVALID_PAGE' unless the implementation of kvm_mmu_reset_context() is changed. On the way into L1, VMCS.GUEST_CR3 is guaranteed to be written (on a successful entry) via: vcpu_enter_guest() -> kvm_mmu_reload() -> kvm_mmu_load() -> kvm_mmu_load_cr3() -> vmx_set_cr3() Stuff vmcs01.GUEST_CR3 if and only if nested early checks are disabled as a "late" VM-Fail should never happen win that case (KVM WARNs), and the conditional write avoids the need to restore the correct GUEST_CR3 when nested_vmx_check_vmentry_hw() fails. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20190607185534.24368-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:57:06 +02:00
Paolo Bonzini	335e192a3f	KVM: x86: add tracepoints around __direct_map and FNAME(fetch) These are useful in debugging shadow paging. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:48 +02:00
Paolo Bonzini	e9f2a760b1	KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON Note that in such a case it is quite likely that KVM will BUG_ON in __pte_list_remove when the VM is closed. However, there is no immediate risk of memory corruption in the host so a WARN_ON is enough and it lets you gather traces for debugging. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:48 +02:00
Paolo Bonzini	d679b32611	KVM: x86: remove now unneeded hugepage gfn adjustment After the previous patch, the low bits of the gfn are masked in both FNAME(fetch) and __direct_map, so we do not need to clear them in transparent_hugepage_adjust. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:47 +02:00
Paolo Bonzini	3fcf2d1bde	KVM: x86: make FNAME(fetch) and __direct_map more similar These two functions are basically doing the same thing through kvm_mmu_get_page, link_shadow_page and mmu_set_spte; yet, for historical reasons, their code looks very different. This patch tries to take the best of each and make them very similar, so that it is easy to understand changes that apply to both of them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:46 +02:00
Junaid Shahid	43fdcda96e	kvm: x86: Do not release the page inside mmu_set_spte() Release the page at the call-site where it was originally acquired. This makes the exit code cleaner for most call sites, since they do not need to duplicate code between success and the failure label. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:46 +02:00
Paolo Bonzini	60cec433c4	KVM: cpuid: remove has_leaf_count from struct kvm_cpuid_param The has_leaf_count member was originally added for KVM's paravirtualization CPUID leaves. However, since then the leaf count _has_ been added to those leaves as well, so we can drop that special case. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:45 +02:00
Paolo Bonzini	50a9e1a4b1	KVM: cpuid: rename do_cpuid_1_ent do_cpuid_1_ent does not do the entire processing for a CPUID entry, it only retrieves the host's values. Rename it to match reality. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:45 +02:00
Paolo Bonzini	d9aadaf689	KVM: cpuid: set struct kvm_cpuid_entry2 flags in do_cpuid_1_ent do_cpuid_1_ent is typically called in two places by __do_cpuid_func for CPUID functions that have subleafs. Both places have to set the KVM_CPUID_FLAG_SIGNIFCANT_INDEX. Set that flag, and KVM_CPUID_FLAG_STATEFUL_FUNC as well, directly in do_cpuid_1_ent. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:44 +02:00
Paolo Bonzini	54d360d412	KVM: cpuid: extract do_cpuid_7_mask and support multiple subleafs CPUID function 7 has multiple subleafs. Instead of having nested switch statements, move the logic to filter supported features to a separate function, and call it for each subleaf. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:43 +02:00
Paolo Bonzini	ab8bcf6497	KVM: cpuid: do_cpuid_ent works on a whole CPUID function Rename it as well as __do_cpuid_ent and __do_cpuid_ent_emulated to have "func" in its name, and drop the index parameter which is always 0. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 13:48:43 +02:00
Steffen Maier	0328e519a7	docs: s390: unify and update s390dbf kdocs at debug.c For non-static-inlines, debug.c already had non-compliant function header docs. So move the pure prototype kdocs of ("s390: include/asm/debug.h add kerneldoc markups") from debug.h to debug.c and merge them with the old function docs. Also, I had the impression that kdoc typically is at the implementation in the compile unit rather than at the prototype in the header file. While at it, update the short kdoc description to distinguish the different functions. And a few more consistency cleanups. Added a new kdoc for debug_set_critical() since debug.h comments it as part of the API. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <1562149189-1417-3-git-send-email-maier@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-05 13:42:22 +02:00
Zhang Lei	e644fa18e2	KVM: arm64/sve: Fix vq_present() macro to yield a bool The original implementation of vq_present() relied on aggressive inlining in order for the compiler to know that the code is correct, due to some const-casting issues. This was causing sparse and clang to complain, while GCC compiled cleanly. Commit `0c529ff789` addressed this problem, but since vq_present() is no longer a function, there is now no implicit casting of the returned value to the return type (bool). In set_sve_vls(), this uncast bit value is compared against a bool, and so may spuriously compare as unequal when both are nonzero. As a result, KVM may reject valid SVE vector length configurations as invalid, and vice versa. Fix it by forcing the returned value to a bool. Signed-off-by: Zhang Lei <zhang.lei@jp.fujitsu.com> Fixes: `0c529ff789` ("KVM: arm64: Implement vq_present() as a macro") Signed-off-by: Dave Martin <Dave.Martin@arm.com> [commit message rewrite] Cc: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-05 12:07:51 +02:00
Linus Torvalds	ecbe5086ad	ARM: SoC fixes Likely our final small batch of fixes for 5.2: - Some fixes for USB on davinci, regressions were due to the recent conversion of the OCHI driver to use GPIO regulators - A fixup of kconfig dependencies for a TI irq controller - A switch of armada-38x to avoid dropped characters on uart, caused by switch of base inherited platform description earlier this year -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl0eQuQPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3hI8P/25Kx4vAcgzmZezixE3fT7TTE8ks7EpOmNRS P4DWheWehRlctaLjehWhsdOLcMyhMqa5WF3PgCw5j31c6zPLoe8ZiC8nizjVMKxu 5vzMbN3R4MN5iFN8w9qzOYP2wHqZsnLoM/Epig9xei4kWvohq89QVW0R6lWg4xv0 i40AyhLapbSCmLXJNd0N47jgjFek23aqAVbXyhq/GTlCDHAELF9gKZ+Zif3frPzH ZBCNB3pAiazOmMKp3BzKyM8Qbl/KQLEfWTXV3luGiHDJgO94guVvzrKYxsxTfJxd fsqrvqLp7S2pvHVcvF2PVeORl+5xAeP0nabuHwFHLi+E6VIzL9NS1h/h7Qoy5eAC Dj7aM+smsqAwId5s+oIhHtVrsfLK8AQ4TE53mUVTN2iikuJc04xYj72tIwB1LL2X 99vYn+bRoitkYfff3XpuvJzOptAYKOvMsWmbCQ6ChckXXvqs6kKmsVvwAesqg/7j tsXy5iDVMuZiAvWcNstAiQtJggKC7tegWFBnqDLtKd7z1x3KivsErBJdR5x8f5vZ uxGOt8sS1fRChc03V0qFsrHAMc/0FBz4OdXYy0E83y3X7SYHbkoxOu6ZIioCBSWF 9RFiKr55FNa6DrNBok6/tsqN9sBkcqwEk9kRZRmdC4zIRzIyg+SR09PEea3Nf66+ KNmOEGs/ =2Hcr -----END PGP SIGNATURE----- Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "Likely our final small batch of fixes for 5.2: - Some fixes for USB on davinci, regressions were due to the recent conversion of the OCHI driver to use GPIO regulators - A fixup of kconfig dependencies for a TI irq controller - A switch of armada-38x to avoid dropped characters on uart, caused by switch of base inherited platform description earlier this year" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: davinci: da830-evm: fix GPIO lookup for OHCI ARM: davinci: omapl138-hawk: add missing regulator constraints for OHCI ARM: davinci: da830-evm: add missing regulator constraints for OHCI soc: ti: fix irq-ti-sci link error ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial node	2019-07-05 11:35:45 +09:00
Mark Brown	65244e5b1f	Merge branch 'regulator-5.3' into regulator-next	2019-07-04 17:34:32 +01:00
Christophe Leroy	264bffad4d	powerpc/boot: Add lzo support for uImage This patch allows to generate lzo compressed uImage Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:38 +10:00
Christophe Leroy	1cc9a21b0b	powerpc/boot: Add lzma support for uImage This patch allows to generate lzma compressed uImage Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:38 +10:00
Christophe Leroy	fbded57c96	powerpc/boot: don't force gzipped uImage This patch modifies the generation of uImage by handing over the selected compression type instead of forcing gzip Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:38 +10:00
Christophe Leroy	43db76f418	powerpc/8xx: Add microcode patch to move SMC parameter RAM. Some SCC functions like the QMC requires an extended parameter RAM. On modern 8xx (ie 866 and 885), SPI area can already be relocated, allowing the use of those functions on SCC2. But SCC3 and SCC4 parameter RAM collide with SMC1 and SMC2 parameter RAMs. This patch adds microcode to allow the relocation of both SMC1 and SMC2, and relocate them at offsets 0x1ec0 and 0x1fc0. Those offsets are by default for the CPM1 DSP1 and DSP2, but there is no kernel driver using them at the moment so this area can be reused. This microcode is provided by Freescale/NXP in Engineering Bulletin EB662 ("MPC8xx I2C/SPI and SMC Relocation Microcode Packages") dated 2006. The binary code is public. The source is not available. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:38 +10:00
Christophe Leroy	c3eec5d7da	powerpc/8xx: Use IO accessors in microcode programming. Change microcode functions to use IO accessors and get rid of volatile attributes. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:38 +10:00
Christophe Leroy	647d5ed0ae	powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c Reduce #ifdef mess by using IS_ENABLED() instead. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	f5348c080e	powerpc/8xx: refactor programming of microcode CPM params. The CPM registers RCCR and CPMCR1..4 registers has to be set in accordance with the microcode patch beeing programmed. Lets define them as part of the patch set and refactor their programming from that definition. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	5cfd5d8943	powerpc/8xx: refactor printing of microcode patch name. Define patch name together with the patch code, and refactor the associated printk() while replacing it by a pr_info() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	11597ff20b	powerpc/8xx: Refactor microcode write Add empty microcode tables so that all tables are defined all the time. Regroup the writing of the 3 tables regardless of the selected microcode. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	372fba9c76	powerpc/8xx: refactor writing of CPM microcode arrays Create a function to refactor the writing of CPM microcode arrays. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	9fb7e639f6	powerpc/8xx: compact microcode arrays Compact obscure microcode arrays by putting 4 values per line in order to reduce number of lines in the file to increase readability. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	4d6d9c6db5	powerpc/8xx: drop verify_patch() verify_patch() has been opted out since many years, and the comment suggests it doesn't work. So drop it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	4128a89ac8	powerpc/8xx: move CPM1 related files from sysdev/ to platforms/8xx Only 8xx selects CPM1 and related CONFIG options are already in platforms/8xx/Kconfig Move the related C files to platforms/8xx/. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Minor formatting fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	22e9c88d48	powerpc/64: reuse PPC32 static inline flush_dcache_range() This patch drops the assembly PPC64 version of flush_dcache_range() and re-uses the PPC32 static inline version. With GCC 8.1, the following code is generated: void flush_test(unsigned long start, unsigned long stop) { flush_dcache_range(start, stop); } 0000000000000130 <.flush_test>: 130: 3d 22 00 00 addis r9,r2,0 132: R_PPC64_TOC16_HA .data+0x8 134: 81 09 00 00 lwz r8,0(r9) 136: R_PPC64_TOC16_LO .data+0x8 138: 3d 22 00 00 addis r9,r2,0 13a: R_PPC64_TOC16_HA .data+0xc 13c: 80 e9 00 00 lwz r7,0(r9) 13e: R_PPC64_TOC16_LO .data+0xc 140: 7d 48 00 d0 neg r10,r8 144: 7d 43 18 38 and r3,r10,r3 148: 7c 00 04 ac hwsync 14c: 4c 00 01 2c isync 150: 39 28 ff ff addi r9,r8,-1 154: 7c 89 22 14 add r4,r9,r4 158: 7c 83 20 50 subf r4,r3,r4 15c: 7c 89 3c 37 srd. r9,r4,r7 160: 41 82 00 1c beq 17c <.flush_test+0x4c> 164: 7d 29 03 a6 mtctr r9 168: 60 00 00 00 nop 16c: 60 00 00 00 nop 170: 7c 00 18 ac dcbf 0,r3 174: 7c 63 42 14 add r3,r3,r8 178: 42 00 ff f8 bdnz 170 <.flush_test+0x40> 17c: 7c 00 04 ac hwsync 180: 4c 00 01 2c isync 184: 4e 80 00 20 blr 188: 60 00 00 00 nop 18c: 60 00 00 00 nop Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	d98fc70fc1	powerpc/32: define helpers to get L1 cache sizes. This patch defines C helpers to retrieve the size of cache blocks and uses them in the cacheflush functions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 02:06:37 +10:00
Christophe Leroy	1cfb725fb1	powerpc/64: flush_inval_dcache_range() becomes flush_dcache_range() On most arches having function flush_dcache_range(), including PPC32, this function does a writeback and invalidation of the cache bloc. On PPC64, flush_dcache_range() only does a writeback while flush_inval_dcache_range() does the invalidation in addition. In addition it looks like within arch/powerpc/, there are no PPC64 platforms using flush_dcache_range() This patch drops the existing 64 bits version of flush_dcache_range() and renames flush_inval_dcache_range() into flush_dcache_range(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 01:35:10 +10:00
Christophe Leroy	6c5875843b	powerpc: slightly improve cache helpers Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers that are summed to obtain the target address. Using 'Z' constraint and '%y0' argument gives GCC the opportunity to use both registers instead of only one with the second being forced to 0. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 01:35:10 +10:00
Joerg Roedel	d95c388586	Merge branches 'x86/vt-d', 'x86/amd', 'arm/smmu', 'arm/omap', 'generic-dma-ops' and 'core' into next	2019-07-04 17:26:48 +02:00
Aneesh Kumar K.V	ac25ba68fa	powerpc/mm/hugetlb: Don't enable HugeTLB if we don't have a page table cache This makes sure we don't enable HugeTLB if the cache is not configured. I am still not sure about this. IMHO hugetlb support should be a hardware support derivative and any cache allocation failure should be handled as I did in the earlier patch. But then if we were not able to create hugetlb page table cache, we can as well declare hugetlb support disabled thereby avoiding calling into allocation routines. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:48:01 +10:00
Aneesh Kumar K.V	5d49275a27	powerpc/mm/hugetlb: Fix kernel crash if we fail to allocate page table caches We only check for hugetlb allocations, because with hugetlb we do conditional registration. For PGD/PUD/PMD levels we register them always in pgtable_cache_init. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:48:00 +10:00
Aneesh Kumar K.V	2230ebf6e6	powerpc/mm: Handle page table allocation failures This fixes kernel crash that arises due to not handling page table allocation failures while allocating hugetlb page table. Fixes: `e2b3d202d1` ("powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:47:59 +10:00
Aneesh Kumar K.V	57caddae6e	powerpc/mm: Remove radix dependency on HugeTLB page Now that we have switched the page table walk to use pmd_is_leaf we can now revert commit `8adddf349f` ("powerpc/mm/radix: Make Radix require HUGETLB_PAGE") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:44:53 +10:00
Aneesh Kumar K.V	1ecf2cdc74	powerpc/mm: pmd_devmap implies pmd_large(). large devmap usage is dependent on THP. Hence once check is sufficient. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:43:59 +10:00
Aneesh Kumar K.V	d6eacedd1f	powerpc/book3s: Use config independent helpers for page table walk Even when we have HugeTLB and THP disabled, kernel linear map can still be mapped with hugepages. This is only an issue with radix translation because hash MMU doesn't map kernel linear range in linux page table and other kernel map areas are not mapped using hugepage. Add config independent helpers and put WARN_ON() when we don't expect things to be mapped via hugepages. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:43:50 +10:00
Aneesh Kumar K.V	259a948c4b	powerpc/pseries/scm: Use a specific endian format for storing uuid from the device tree We used uuid_parse to convert uuid string from device tree to two u64 components. We want to make sure we look at the uuid read from device tree in an endian-neutral fashion. For now, I am picking little-endian to be format so that we don't end up doing an additional conversion. The reason to store in a specific endian format is to enable reading the namespace created with a little-endian kernel config on a big-endian kernel. We do store the device tree uuid string as a 64-bit little-endian cookie in the label area. When booting the kernel we also compare this cookie against what is read from the device tree. For this, to work we have to store and compare these values in a CPU endian config independent fashion. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:42:10 +10:00
Aneesh Kumar K.V	53e80bd042	powerpc/nvdimm: Add support for multibyte read/write for metadata SCM_READ/WRITE_MEATADATA hcall supports multibyte read/write. This patch updates the metadata read/write to use 1, 2, 4 or 8 byte read/write as mentioned in PAPR document. READ/WRITE_METADATA hcall supports the 1, 2, 4, or 8 bytes read/write. For other values hcall results H_P3. Hypervisor stores the metadata contents in big-endian format and in-order to enable read/write in different granularity, we need to switch the contents to big-endian before calling HCALL. Based on an patch from Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:37:33 +10:00
Aneesh Kumar K.V	2a0ffbd478	powerpc/pseries/scm: Mark the region volatile if cache flush not required The device tree node is documented as below: “ibm,cache-flush-required”: property name indicates Cache Flush Required for this Persistent Memory Segment to persist memory prop-encoded-array: None, this is a name only property. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:35:15 +10:00
Aneesh Kumar K.V	c0b1b23b9c	powerpc/mm/nvdimm: Add an informative message if we fail to allocate altmap block Allocation from altmap area can fail based on vmemmap page size used. Add kernel info message to indicate the failure. That allows the user to identify whether they are really using persistent memory reserved space for per-page metadata. The message looks like: [ 136.587212] altmap block allocation failed, falling back to system memory Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:32:57 +10:00
Aneesh Kumar K.V	495c2ff4c8	powerpc/mm: Consolidate numa_enable check and min_common_depth check If we fail to parse min_common_depth from device tree we boot with numa disabled. Reflect the same by updating numa_enabled variable to false. Also, switch all min_common_depth failure check to if (!numa_enabled) check. This helps us to avoid checking for both in different code paths. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:27:33 +10:00
Aneesh Kumar K.V	f52741c410	powerpc/mm: Fix node look up with numa=off boot If we boot with numa=off, we need to make sure we return NUMA_NO_NODE when looking up associativity details of resources. Without this, we hit crash like below BUG: Unable to handle kernel data access at 0x40000000008 Faulting instruction address: 0xc000000008f31704 cpu 0x1b: Vector: 380 (Data SLB Access) at [c00000000b9bb320] pc: c000000008f31704: _raw_spin_lock+0x14/0x100 lr: c0000000083f41fc: ____cache_alloc_node+0x5c/0x290 sp: c00000000b9bb5b0 msr: 800000010280b033 dar: 40000000008 current = 0xc00000000b9a2700 paca = 0xc00000000a740c00 irqmask: 0x03 irq_happened: 0x01 pid = 1, comm = swapper/27 Linux version 5.2.0-rc4-00925-g74e188c620b1 (root@linux-d8ip) (gcc version 7.4.1 20190424 [gcc-7-branch revision 270538] (SUSE Linux)) #34 SMP Sat Jun 29 00:41:02 EDT 2019 enter ? for help [link register ] c0000000083f41fc ____cache_alloc_node+0x5c/0x290 [c00000000b9bb5b0] 0000000000000dc0 (unreliable) [c00000000b9bb5f0] c0000000083f48c8 kmem_cache_alloc_node_trace+0x138/0x360 [c00000000b9bb670] c000000008aa789c devres_alloc_node+0x4c/0xa0 [c00000000b9bb6a0] c000000008337218 devm_memremap+0x58/0x130 [c00000000b9bb6f0] c000000008aed00c devm_nsio_enable+0xdc/0x170 [c00000000b9bb780] c000000008af3b6c nd_pmem_probe+0x4c/0x180 [c00000000b9bb7b0] c000000008ad84cc nvdimm_bus_probe+0xac/0x260 [c00000000b9bb840] c000000008aa0628 really_probe+0x148/0x500 [c00000000b9bb8d0] c000000008aa0d7c driver_probe_device+0x19c/0x1d0 [c00000000b9bb950] c000000008aa11bc device_driver_attach+0xcc/0x100 [c00000000b9bb990] c000000008aa12ec __driver_attach+0xfc/0x1e0 [c00000000b9bba10] c000000008a9d0a4 bus_for_each_dev+0xb4/0x130 [c00000000b9bba70] c000000008a9fc04 driver_attach+0x34/0x50 [c00000000b9bba90] c000000008a9f118 bus_add_driver+0x1d8/0x300 [c00000000b9bbb20] c000000008aa2358 driver_register+0x98/0x1a0 [c00000000b9bbb90] c000000008ad7e6c __nd_driver_register+0x5c/0x100 [c00000000b9bbbf0] c0000000093efbac nd_pmem_driver_init+0x34/0x48 [c00000000b9bbc10] c0000000080106c0 do_one_initcall+0x60/0x2d0 [c00000000b9bbce0] c00000000938463c kernel_init_freeable+0x384/0x48c [c00000000b9bbdb0] c000000008010a5c kernel_init+0x2c/0x160 [c00000000b9bbe20] c00000000800ba54 ret_from_kernel_thread+0x5c/0x68 Reported-and-debugged-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:27:32 +10:00
Aneesh Kumar K.V	ea9f5b702f	powerpc/mm/drconf: Use NUMA_NO_NODE on failures instead of node 0 If we fail to parse the associativity array we should default to NUMA_NO_NODE instead of NODE 0. Rest of the code fallback to the right default if we find the numa node value NUMA_NO_NODE. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:27:31 +10:00
Aneesh Kumar K.V	89a3496e06	powerpc/mm/radix: Use the right page size for vmemmap mapping We use mmu_vmemmap_psize to find the page size for mapping the vmmemap area. With radix translation, we are suboptimally setting this value to PAGE_SIZE. We do check for 2M page size support and update mmu_vmemap_psize to use hugepage size but we suboptimally reset the value to PAGE_SIZE in radix__early_init_mmu(). This resulted in always mapping vmemmap area with 64K page size. Fixes: `2bfd65e45e` ("powerpc/mm/radix: Add radix callbacks for early init routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:27:21 +10:00
Aneesh Kumar K.V	78c9498885	powerpc/mm/hash/4k: Don't use 64K page size for vmemmap with 4K pagesize With hash translation and 4K PAGE_SIZE config, we need to make sure we don't use 64K page size for vmemmap. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:27:20 +10:00
Aneesh Kumar K.V	b8c8a524cc	powerpc/mm: Remove unused variable declaration Since commit `0034d395f8` ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") __kernel_virt_size is not used anymore. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-05 00:20:42 +10:00
Olof Johansson	4471e44f97	Allwinner DT64 Changes for 5.3 - Round 2 One extra change wiring up the interrupt line for the external RTC chip on the Pine H64. -----BEGIN PGP SIGNATURE----- iQJCBAABCgAsFiEE2nN1m/hhnkhOWjtHOJpUIZwPJDAFAl0doYYOHHdlbnNAY3Np ZS5vcmcACgkQOJpUIZwPJDABuQ/+InlVdkY/6t4TednD1AymSacCtkGKorwE1NMF gcUlHyEouElzqCWiYQrjElCP4oXcypBFy/elpc8HtuOLwZwz0lrQ5pk1IcIg5nfq KOW9zWSHFX9M+I7Fay04YPaLQcrmXZDVSXAAwF8aAIyjSVDpg82irAQtXU+zTqOg XXHacJNmyX702YMbYYYXeOdlVzvQb6qKHzndC5fLsRFVCMgDiBSVmkB1chZ+knUe y9TduFtBBGpdFwoSQuaDBJEgh5GuRGRNoHQcKm2q29gpX8QjDHqRpBZS3afFl2D4 GcsG+BxyQTp3/rHiN386Ii03V7/fTaHTaBi7Wj5mZ0I7LZLMruC9+7oHcfRdro5p BQIS2yayXnMymC+LOekIXU5KcGWjzWDU31e6/lTz/lV7nLkUWXy8vTZ/QP+uZio9 /MgYBRbCjHTJLBghz47rawdeGCcGvxcHnLMvDrJ/DtetWVrm6yxF+EACftki379U H79Pw4yfc7RKWWB/7gZVGc2VsWhrVi0D/4Wj2pfws4KqkJkK3L2YE5OtnoL/u55X xxGEkY5TTiXLfod/PHdLjxhdhKWeecnObDwKCbMoKAp+JY9QZlQTB7Zk1aOo8hbm WvVhK/i5iIdUVEKmx9Uzt1oltFrObYHV80XFXsY9oZ0Vmg1diOGZx7yxs5Spk8IE 4/vO7A0= =cinn -----END PGP SIGNATURE----- Merge tag 'sunxi-dt64-for-5.3-round-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/dt Allwinner DT64 Changes for 5.3 - Round 2 One extra change wiring up the interrupt line for the external RTC chip on the Pine H64. * tag 'sunxi-dt64-for-5.3-round-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: h6: Pine H64: Add interrupt line for RTC Link: https://lore.kernel.org/r/20190704065326.GA19010@wens.csie.org Signed-off-by: Olof Johansson <olof@lixom.net>	2019-07-04 07:02:31 -07:00
Naveen N. Rao	18a593c8b5	powerpc/pseries: Protect against hogging the cpu while setting up the stats When enabling or disabling the vcpu dispatch statistics, we do a lot of work including allocating/deallocating memory across all possible cpus for the DTL buffer. In order to guard against hogging the cpu for too long, track the time we're taking and yield the processor if necessary. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:27:25 +10:00
Naveen N. Rao	d62c8deeb6	powerpc/pseries: Provide vcpu dispatch statistics For Shared Processor LPARs, the POWER Hypervisor maintains a relatively static mapping of the LPAR processors (vcpus) to physical processor chips (representing the "home" node) and tries to always dispatch vcpus on their associated physical processor chip. However, under certain scenarios, vcpus may be dispatched on a different processor chip (away from its home node). The actual physical processor number on which a certain vcpu is dispatched is available to the guest in the 'processor_id' field of each DTL entry. The guest can discover the home node of each vcpu through the H_HOME_NODE_ASSOCIATIVITY(flags=1) hcall. The guest can also discover the associativity of physical processors, as represented in the DTL entry, through the H_HOME_NODE_ASSOCIATIVITY(flags=2) hcall. These can then be compared to determine if the vcpu was dispatched on its home node or not. If the vcpu was not dispatched on the home node, it is possible to determine if the vcpu was dispatched in a different chip, socket or drawer. Introduce a procfs file /proc/powerpc/vcpudispatch_stats that can be used to obtain these statistics. Writing '1' to this file enables collecting the statistics, while writing '0' disables the statistics. The statistics themselves are available by reading the procfs file. By default, the DTLB log for each vcpu is processed 50 times a second so as not to miss any entries. This processing frequency can be changed through /proc/powerpc/vcpudispatch_stats_freq. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:27:09 +10:00
Naveen N. Rao	5a1ea4774d	powerpc/pseries: Move mm/book3s64/vphn.c under platforms/pseries/ hcall_vphn() is specific to pseries and will be used in a subsequent patch. So, move it to a more appropriate place under arch/powerpc/platforms/pseries. Also merge vphn.h into lppaca.h and update vphn selftest to use the new files. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:23:38 +10:00
Naveen N. Rao	ef34e0efa2	powerpc/pseries: Generalize hcall_vphn() H_HOME_NODE_ASSOCIATIVITY hcall can take two different flags and return different associativity information in each case. Generalize the existing hcall_vphn() function to take flags as an argument and to return the result. Update the only existing user to pass the proper arguments. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:23:38 +10:00
Naveen N. Rao	06220d78f2	powerpc/pseries: Introduce rwlock to gatekeep DTLB usage Since we would be introducing a new user of the DTL buffer in a subsequent patch, we need a way to gatekeep use of the DTL buffer. The current debugfs interface for DTL allows registering and opening cpu-specific DTL buffers. Cpu specific files are exposed under debugfs 'powerpc/dtl/' node, and changing 'dtl_event_mask' in the same directory enables controlling the event mask used when registering DTL buffer for a particular cpu. Subsequently, we will be introducing a user of the DTL buffers that registers access to the DTL buffers across all cpus with the same event mask. To ensure these two users do not step on each other, we introduce a rwlock to gatekeep DTL buffer access. This fits the requirement of the current debugfs interface wanting to allow multiple independent cpu-specific users (read lock), and the subsequent user wanting exclusive access (write lock). Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:23:38 +10:00
Naveen N. Rao	1c85a2a194	powerpc/pseries: Factor out DTL buffer allocation and registration routines Introduce new helpers for DTL buffer allocation and registration and have the existing code use those. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Don't split error messages across lines, for grepability] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:23:10 +10:00
Naveen N. Rao	5b3306f084	powerpc/pseries: Do not save the previous DTL mask value When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is enabled, we always initialize DTL enable mask to DTL_LOG_PREEMPT (0x2). There are no other places where the mask is changed. As such, when reading the DTL log buffer through debugfs, there is no need to save and restore the previous mask value. We don't need to save and restore the earlier mask value if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not enabled. So, remove the field from the structure as well. Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:20:48 +10:00
Naveen N. Rao	515bbc8ab4	powerpc/pseries: Use macros for referring to the DTL enable mask Introduce macros to encode the DTL enable mask fields and use those instead of hardcoding numbers. Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 22:20:47 +10:00
Sebastian Ott	6ae3483d41	s390/pci: correctly handle MIO opt-out Do not issue CLP_SET_ENABLE_MIO after opting out of MIO instruction usage. This should not fix a bug but reduce overhead within firmware. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-04 13:13:59 +02:00
Sebastian Ott	c7ff0e918a	s390/pci: deal with devices that have no support for MIO instructions Unfortunately we have to handle a class of devices that don't support the new MIO instructions. Adjust resource assignment and mapping accordingly. Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-04 13:13:57 +02:00
Satheesh Rajendran	31afa05bf9	powerpc: Enable CONFIG_IPV6 in ppc64_defconfig Enable CONFIG_IPV6 in ppc64_defconfig to enable certain network functionalities required for tests. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 21:04:06 +10:00
Christoph Hellwig	2ebca1cbb4	riscv: remove free_initrd_mem The RISC-V free_initrd_mem is identical to the default one, except that it doesn't poison the freed memory. Remove it so that the default implementations gets used instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-07-04 03:12:57 -07:00
Yash Shah	df7e9059cf	riscv: ccache: Remove unused variable Reading the count register clears the interrupt signal. Currently, the count registers are read into 'regval' variable but the variable is never used. Therefore remove it. V2 of this patch add comments to justify the readl calls without checking the return value. Signed-off-by: Yash Shah <yash.shah@sifive.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-07-04 03:12:24 -07:00
Ingo Molnar	f584dd32ed	Merge branch 'x86/cpu' into perf/core, to pick up revert perf/core has an earlier version of the x86/cpu tree merged, to avoid conflicts, and due to this we want to pick up this ABI impacting revert as well: 049331f277fe: ("x86/fsgsbase: Revert FSGSBASE support") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-07-04 10:36:20 +02:00
Geliang Tang	658829dfe7	powerpc/cell: set no_llseek in spufs_cntl_fops In spufs_cntl_fops, since we use nonseekable_open() to open, we should use no_llseek() to seek, not generic_file_llseek(). Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 18:05:09 +10:00
Johannes Berg	b482e48d29	um: fix build without CONFIG_UML_TIME_TRAVEL_SUPPORT When CONFIG_UML_TIME_TRAVEL_SUPPORT isn't set, the build was broken. Fix this. Fixes: `065038706f` ("um: Support time travel mode") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-04 09:52:18 +02:00
Geliang Tang	c197922f0a	powerpc/perf/24x7: use rb_entry To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 17:09:37 +10:00
Anton Blanchard	7505a13f85	powerpc/configs: Disable latencytop latencytop adds almost 4kB to each and every task struct and as such it doesn't deserve to be in our defconfigs. Signed-off-by: Anton Blanchard <anton@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 17:09:03 +10:00
Enrico Weigelt, metux IT consult	4f44e8aeaf	powerpc/Kconfig: Clean up formatting Formatting of Kconfig files doesn't look so pretty, so let the Great White Handkerchief come around and clean it up. Also convert "---help---" as requested. Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-04 16:55:10 +10:00
Krzysztof Kozlowski	f017da5c70	nios2: configs: Remove useless UEVENT_HELPER_PATH Remove the CONFIG_UEVENT_HELPER_PATH because: 1. It is disabled since commit `1be01d4a57` ("driver: base: Disable CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was made default to 'n', 2. It is not recommended (help message: "This should not be used today [...] creates a high system load") and was kept only for ancient userland, 3. Certain userland specifically requests it to be disabled (systemd README: "Legacy hotplug slows down the system and confuses udev"). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>	2019-07-04 13:35:38 +08:00
Masahiro Yamada	029f162ab0	nios2: remove pointless second entry for CONFIG_TRACE_IRQFLAGS_SUPPORT Strangely enough, NIOS2 defines TRACE_IRQFLAGS_SUPPORT twice with different values, which is pointless and confusing. [1] arch/nios2/Kconfig config TRACE_IRQFLAGS_SUPPORT def_bool n [2] arch/nios2/Kconfig.debug config TRACE_IRQFLAGS_SUPPORT def_bool y [1] is included before [2]. In the Kconfig syntax, the first one is effective. So, TRACE_IRQFLAGS_SUPPORT is always 'n'. The second define in arch/nios2/Kconfig.debug is dead code. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>	2019-07-04 11:32:09 +08:00
Linus Torvalds	550d1f5bda	This includes three fixes: - Fixes a deadlock from a previous fix to keep module loading and function tracing text modifications from stepping on each other. (this has a few patches to help document the issue in comments) - Fix a crash when the snapshot buffer gets out of sync with the main ring buffer. - Fix a memory leak when reading the memory logs -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXRzBCBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qnDaAP9qTFBOFtgIGCT5wVP8xjQeESxh1b8R tbaT7/U2oPpeiwEAvp1mYo5UYcc8KauBqVaLSLJVN4pv07xiZF5Qgh9C1QE= =m2IT -----END PGP SIGNATURE----- Merge tag 'trace-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "This includes three fixes: - Fix a deadlock from a previous fix to keep module loading and function tracing text modifications from stepping on each other (this has a few patches to help document the issue in comments) - Fix a crash when the snapshot buffer gets out of sync with the main ring buffer - Fix a memory leak when reading the memory logs" * tag 'trace-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare() tracing/snapshot: Resize spare buffer if size changed tracing: Fix memory leak in tracing_err_log_open() ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare() ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()	2019-07-04 10:26:17 +09:00
Christoph Hellwig	2ee7a4ef98	MIPS: only select ARCH_HAS_UNCACHED_SEGMENT for non-coherent platforms While mips might architecturally have the uncached segment all the time, the infrastructure to use it is only need on platforms where DMA is at least partially incoherent. Only select it for those configuration to fix a build failure as the arch_dma_prep_coherent symbol is also only provided for non-coherent platforms. Fixes: `2e96e04d25` ("MIPS: use the generic uncached segment support in dma-direct") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Paul Burton <paul.burton@mips.com> Tested-by: Guenter Roeck <linux@roeck-us.net>	2019-07-03 15:29:52 -07:00
Alexandre Ghiti	9e953cda5c	riscv: Introduce huge page support for 32/64bit kernel This patch implements both 4MB huge page support for 32bit kernel and 2MB/1GB huge pages support for 64bit kernel. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-07-03 15:23:38 -07:00
Alexandre Ghiti	3876d4a38a	x86, arm64: Move ARCH_WANT_HUGE_PMD_SHARE config in arch/Kconfig ARCH_WANT_HUGE_PMD_SHARE config was declared in both architectures: move this declaration in arch/Kconfig and make those architectures select it. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Palmer Dabbelt <palmer@sifive.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64 Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>	2019-07-03 15:22:50 -07:00
Alastair D'Silva	60e8523e2e	ocxl: Allow contexts to be attached with a NULL mm If an OpenCAPI context is to be used directly by a kernel driver, there may not be a suitable mm to use. The patch makes the mm parameter to ocxl_context_attach optional. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/20190620041203.12274-1-alastair@au1.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-07-03 21:29:47 +02:00
David S. Miller	c3ead2df97	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2019-07-03 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix the interpreter to properly handle BPF_ALU32 \| BPF_ARSH on BE architectures, from Jiong. 2) Fix several bugs in the x32 BPF JIT for handling shifts by 0, from Luke and Xi. 3) Fix NULL pointer deref in btf_type_is_resolve_source_only(), from Stanislav. 4) Properly handle the check that forwarding is enabled on the device in bpf_ipv6_fib_lookup() helper code, from Anton. 5) Fix UAPI bpf_prog_info fields alignment for archs that have 16 bit alignment such as m68k, from Baruch. 6) Fix kernel hanging in unregister_netdevice loop while unregistering device bound to XDP socket, from Ilya. 7) Properly terminate tail update in xskq_produce_flush_desc(), from Nathan. 8) Fix broken always_inline handling in test_lwt_seg6local, from Jiri. 9) Fix bpftool to use correct argument in cgroup errors, from Jakub. 10) Fix detaching dummy prog in XDP redirect sample code, from Prashant. 11) Add Jonathan to AF_XDP reviewers, from Björn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-07-03 12:09:00 -07:00
Thomas Gleixner	049331f277	x86/fsgsbase: Revert FSGSBASE support The FSGSBASE series turned out to have serious bugs and there is still an open issue which is not fully understood yet. The confidence in those changes has become close to zero especially as the test cases which have been shipped with that series were obviously never run before sending the final series out to LKML. ./fsgsbase_64 >/dev/null Segmentation fault As the merge window is close, the only sane decision is to revert FSGSBASE support. The revert is necessary as this branch has been merged into perf/core already and rebasing all of that a few days before the merge window is not the most brilliant idea. I could definitely slap myself for not noticing the test case fail when merging that series, but TBH my expectations weren't that low back then. Won't happen again. Revert the following commits: `539bca535d` ("x86/entry/64: Fix and clean up paranoid_exit") `2c7b5ac5d5` ("Documentation/x86/64: Add documentation for GS/FS addressing mode") `f987c955c7` ("x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2") `2032f1f96e` ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit") `5bf0cab60e` ("x86/entry/64: Document GSBASE handling in the paranoid path") `708078f657` ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit") `79e1932fa3` ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro") `1d07316b13` ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry") `f60a83df45` ("x86/process/64: Use FSGSBASE instructions on thread copy and ptrace") `1ab5f3f7fe` ("x86/process/64: Use FSBSBASE in switch_to() if available") `a86b462513` ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions") `8b71340d70` ("x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions") `b64ed19b93` ("x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Chang S. Bae <chang.seok.bae@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com>	2019-07-03 16:35:23 +02:00
Wanpeng Li	7be373b6de	KVM: LAPIC: remove the trailing newline used in the fmt parameter of TP_printk The trailing newlines will lead to extra newlines in the trace file which looks like the following output, so remove it. qemu-system-x86-15695 [002] ...1 15774.839240: kvm_hv_timer_state: vcpu_id 0 hv_timer 1 qemu-system-x86-15695 [002] ...1 15774.839309: kvm_hv_timer_state: vcpu_id 0 hv_timer 1 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-03 16:14:39 +02:00
Paolo Bonzini	d647eb63e6	KVM: svm: add nrips module parameter Allow testing code for old processors that lack the next RIP save feature, by disabling usage of the next_rip field. Nested hypervisors however get the feature unconditionally. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-03 16:14:38 +02:00
Ard Biesheuvel	7367bfeb2c	crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR This implements 5-way interleaving for ECB, CBC decryption and CTR, resulting in a speedup of ~11% on Marvell ThunderX2, which has a very deep pipeline and therefore a high issue latency for NEON instructions operating on the same registers. Note that XTS is left alone: implementing 5-way interleave there would either involve spilling of the calculated tweaks to the stack, or recalculating them after the encryption operation, and doing either of those would most likely penalize low end cores. For ECB, this is not a concern at all, given that we have plenty of spare registers. For CTR and CBC decryption, we take advantage of the fact that v16 is not used by the CE version of the code (which is the only one targeted by the optimization), and so we can reshuffle the code a bit and avoid having to spill to memory (with the exception of one extra reload in the CBC routine) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-07-03 22:13:12 +08:00
Ard Biesheuvel	e217413964	crypto: arm64/aes-ce - add 5 way interleave routines In preparation of tweaking the accelerated AES chaining mode routines to be able to use a 5-way stride, implement the core routines to support processing 5 blocks of input at a time. While at it, drop the 2 way versions, which have been unused for a while now. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-07-03 22:13:12 +08:00
Anshuman Khandual	c9093486f2	mips/kprobes: Export kprobe_fault_handler() Generic kprobe_page_fault() calls into kprobe_fault_handler() which must be available with and without CONFIG_KPROBES. There is one stub implementation for !CONFIG_KPROBES. For CONFIG_KPROBES all subscribing archs must provide a kprobe_fault_handler() definition. Currently mips has an implementation which is defined as 'static inline'. Make it available for generic kprobes to comply with the above new requirement. Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mips@vger.kernel.org Cc: linux-mm@kvack.org Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 773734b44557 ("mm, kprobes: generalize and rename notify_page_fault() as kprobe_page_fault()") Cc: linux-kernel@vger.kernel.org	2019-07-03 14:14:04 +01:00
Masahiro Yamada	6d3ca7e736	powerpc/mm: mark more tlb functions as __always_inline With CONFIG_OPTIMIZE_INLINING enabled, Laura Abbott reported error with gcc 9.1.1: arch/powerpc/mm/book3s64/radix_tlb.c: In function '_tlbiel_pid': arch/powerpc/mm/book3s64/radix_tlb.c:104:2: warning: asm operand 3 probably doesn't match constraints 104 \| asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1) \| ^~~ arch/powerpc/mm/book3s64/radix_tlb.c:104:2: error: impossible constraint in 'asm' Fixing _tlbiel_pid() is enough to address the warning above, but I inlined more functions to fix all potential issues. To meet the "i" (immediate) constraint for the asm operands, functions propagating "ric" must be always inlined. Fixes: `9012d01166` ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING") Reported-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 20:49:50 +10:00
Russell King	1f6db18fbd	Merge branch 'sa1100-for-next'; commit 'riscpc^{/ARM: riscpc: enable chained scatterlist support}' into for-arm-soc	2019-07-03 11:44:46 +01:00
Russell King	d6c8204659	ARM: sa1100: convert to common clock framework Convert sa1100 to use the common clock framework. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2019-07-03 11:44:09 +01:00
Luke Nelson	6fa632e719	bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_K shift by 0 The current x32 BPF JIT does not correctly compile shift operations when the immediate shift amount is 0. The expected behavior is for this to be a no-op. The following program demonstrates the bug. The expexceted result is 1, but the current JITed code returns 2. r0 = 1 r1 = 1 r1 <<= 0 if r1 == 1 goto end r0 = 2 end: exit This patch simplifies the code and fixes the bug. Fixes: `03f5781be2` ("bpf, x86_32: add eBPF JIT compiler for ia32") Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-03 11:14:28 +02:00
Luke Nelson	68a8357ec1	bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_X shift by 0 The current x32 BPF JIT for shift operations is not correct when the shift amount in a register is 0. The expected behavior is a no-op, whereas the current implementation changes bits in the destination register. The following example demonstrates the bug. The expected result of this program is 1, but the current JITed code returns 2. r0 = 1 r1 = 1 r2 = 0 r1 <<= r2 if r1 == 1 goto end r0 = 2 end: exit The bug is caused by an incorrect assumption by the JIT that a shift by 32 clear the register. On x32 however, shifts use the lower 5 bits of the source, making a shift by 32 equivalent to a shift by 0. This patch fixes the bug using double-precision shifts, which also simplifies the code. Fixes: `03f5781be2` ("bpf, x86_32: add eBPF JIT compiler for ia32") Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-03 11:14:28 +02:00
Michael Kelley	dd2cb34861	clocksource/drivers: Continue making Hyper-V clocksource ISA agnostic Continue consolidating Hyper-V clock and timer code into an ISA independent Hyper-V clocksource driver. Move the existing clocksource code under drivers/hv and arch/x86 to the new clocksource driver while separating out the ISA dependencies. Update Hyper-V initialization to call initialization and cleanup routines since the Hyper-V synthetic clock is not independently enumerated in ACPI. Update Hyper-V clocksource users in KVM and VDSO to get definitions from the new include file. No behavior is changed and no new functionality is added. Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: "bp@alien8.de" <bp@alien8.de> Cc: "will.deacon@arm.com" <will.deacon@arm.com> Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com> Cc: "mark.rutland@arm.com" <mark.rutland@arm.com> Cc: "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org> Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org> Cc: "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org> Cc: "olaf@aepfle.de" <olaf@aepfle.de> Cc: "apw@canonical.com" <apw@canonical.com> Cc: "jasowang@redhat.com" <jasowang@redhat.com> Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com> Cc: Sunil Muthuswamy <sunilmut@microsoft.com> Cc: KY Srinivasan <kys@microsoft.com> Cc: "sashal@kernel.org" <sashal@kernel.org> Cc: "vincenzo.frascino@arm.com" <vincenzo.frascino@arm.com> Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org> Cc: "linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org> Cc: "linux-kselftest@vger.kernel.org" <linux-kselftest@vger.kernel.org> Cc: "arnd@arndb.de" <arnd@arndb.de> Cc: "linux@armlinux.org.uk" <linux@armlinux.org.uk> Cc: "ralf@linux-mips.org" <ralf@linux-mips.org> Cc: "paul.burton@mips.com" <paul.burton@mips.com> Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org> Cc: "salyzyn@android.com" <salyzyn@android.com> Cc: "pcc@google.com" <pcc@google.com> Cc: "shuah@kernel.org" <shuah@kernel.org> Cc: "0x7f454c46@gmail.com" <0x7f454c46@gmail.com> Cc: "linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk> Cc: "huw@codeweavers.com" <huw@codeweavers.com> Cc: "sfr@canb.auug.org.au" <sfr@canb.auug.org.au> Cc: "pbonzini@redhat.com" <pbonzini@redhat.com> Cc: "rkrcmar@redhat.com" <rkrcmar@redhat.com> Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org> Link: https://lkml.kernel.org/r/1561955054-1838-3-git-send-email-mikelley@microsoft.com	2019-07-03 11:00:59 +02:00
Michael Kelley	fd1fea6834	clocksource/drivers: Make Hyper-V clocksource ISA agnostic Hyper-V clock/timer code and data structures are currently mixed in with other code in the ISA independent drivers/hv directory as well as the ISA dependent Hyper-V code under arch/x86. Consolidate this code and data structures into a Hyper-V clocksource driver to better follow the Linux model. In doing so, separate out the ISA dependent portions so the new clocksource driver works for x86 and for the in-process Hyper-V on ARM64 code. To start, move the existing clockevents code to create the new clocksource driver. Update the VMbus driver to call initialization and cleanup routines since the Hyper-V synthetic timers are not independently enumerated in ACPI. No behavior is changed and no new functionality is added. Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: "bp@alien8.de" <bp@alien8.de> Cc: "will.deacon@arm.com" <will.deacon@arm.com> Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com> Cc: "mark.rutland@arm.com" <mark.rutland@arm.com> Cc: "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org> Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org> Cc: "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org> Cc: "olaf@aepfle.de" <olaf@aepfle.de> Cc: "apw@canonical.com" <apw@canonical.com> Cc: "jasowang@redhat.com" <jasowang@redhat.com> Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com> Cc: Sunil Muthuswamy <sunilmut@microsoft.com> Cc: KY Srinivasan <kys@microsoft.com> Cc: "sashal@kernel.org" <sashal@kernel.org> Cc: "vincenzo.frascino@arm.com" <vincenzo.frascino@arm.com> Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org> Cc: "linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org> Cc: "linux-kselftest@vger.kernel.org" <linux-kselftest@vger.kernel.org> Cc: "arnd@arndb.de" <arnd@arndb.de> Cc: "linux@armlinux.org.uk" <linux@armlinux.org.uk> Cc: "ralf@linux-mips.org" <ralf@linux-mips.org> Cc: "paul.burton@mips.com" <paul.burton@mips.com> Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org> Cc: "salyzyn@android.com" <salyzyn@android.com> Cc: "pcc@google.com" <pcc@google.com> Cc: "shuah@kernel.org" <shuah@kernel.org> Cc: "0x7f454c46@gmail.com" <0x7f454c46@gmail.com> Cc: "linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk> Cc: "huw@codeweavers.com" <huw@codeweavers.com> Cc: "sfr@canb.auug.org.au" <sfr@canb.auug.org.au> Cc: "pbonzini@redhat.com" <pbonzini@redhat.com> Cc: "rkrcmar@redhat.com" <rkrcmar@redhat.com> Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org> Link: https://lkml.kernel.org/r/1561955054-1838-2-git-send-email-mikelley@microsoft.com	2019-07-03 11:00:59 +02:00
Thomas Gleixner	3419240495	Merge branch 'timers/vdso' into timers/core so the hyper-v clocksource update can be applied.	2019-07-03 10:50:21 +02:00
Thomas Gleixner	f8a8fe61fe	x86/irq: Seperate unused system vectors from spurious entry again Quite some time ago the interrupt entry stubs for unused vectors in the system vector range got removed and directly mapped to the spurious interrupt vector entry point. Sounds reasonable, but it's subtly broken. The spurious interrupt vector entry point pushes vector number 0xFF on the stack which makes the whole logic in __smp_spurious_interrupt() pointless. As a consequence any spurious interrupt which comes from a vector != 0xFF is treated as a real spurious interrupt (vector 0xFF) and not acknowledged. That subsequently stalls all interrupt vectors of equal and lower priority, which brings the system to a grinding halt. This can happen because even on 64-bit the system vector space is not guaranteed to be fully populated. A full compile time handling of the unused vectors is not possible because quite some of them are conditonally populated at runtime. Bring the entry stubs back, which wastes 160 bytes if all stubs are unused, but gains the proper handling back. There is no point to selectively spare some of the stubs which are known at compile time as the required code in the IDT management would be way larger and convoluted. Do not route the spurious entries through common_interrupt and do_IRQ() as the original code did. Route it to smp_spurious_interrupt() which evaluates the vector number and acts accordingly now that the real vector numbers are handed in. Fixup the pr_warn so the actual spurious vector (0xff) is clearly distiguished from the other vectors and also note for the vectored case whether it was pending in the ISR or not. "Spurious APIC interrupt (vector 0xFF) on CPU#0, should never happen." "Spurious interrupt vector 0xed on CPU#1. Acked." "Spurious interrupt vector 0xee on CPU#1. Not pending!." Fixes: `2414e021ac` ("x86: Avoid building unused IRQ entry stubs") Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Jan Beulich <jbeulich@suse.com> Link: https://lkml.kernel.org/r/20190628111440.550568228@linutronix.de	2019-07-03 10:12:31 +02:00
Thomas Gleixner	b7107a67f0	x86/irq: Handle spurious interrupt after shutdown gracefully Since the rework of the vector management, warnings about spurious interrupts have been reported. Robert provided some more information and did an initial analysis. The following situation leads to these warnings: CPU 0 CPU 1 IO_APIC interrupt is raised sent to CPU1 Unable to handle immediately (interrupts off, deep idle delay) mask() ... free() shutdown() synchronize_irq() clear_vector() do_IRQ() -> vector is clear Before the rework the vector entries of legacy interrupts were statically assigned and occupied precious vector space while most of them were unused. Due to that the above situation was handled silently because the vector was handled and the core handler of the assigned interrupt descriptor noticed that it is shut down and returned. While this has been usually observed with legacy interrupts, this situation is not limited to them. Any other interrupt source, e.g. MSI, can cause the same issue. After adding proper synchronization for level triggered interrupts, this can only happen for edge triggered interrupts where the IO-APIC obviously cannot provide information about interrupts in flight. While the spurious warning is actually harmless in this case it worries users and driver developers. Handle it gracefully by marking the vector entry as VECTOR_SHUTDOWN instead of VECTOR_UNUSED when the vector is freed up. If that above late handling happens the spurious detector will not complain and switch the entry to VECTOR_UNUSED. Any subsequent spurious interrupt on that line will trigger the spurious warning as before. Fixes: `464d12309e` ("x86/vector: Switch IOAPIC to global reservation mode") Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>- Tested-by: Robert Hodaszi <Robert.Hodaszi@digi.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/20190628111440.459647741@linutronix.de	2019-07-03 10:12:30 +02:00
Thomas Gleixner	dfe0cf8b51	x86/ioapic: Implement irq_get_irqchip_state() callback When an interrupt is shut down in free_irq() there might be an inflight interrupt pending in the IO-APIC remote IRR which is not yet serviced. That means the interrupt has been sent to the target CPUs local APIC, but the target CPU is in a state which delays the servicing. So free_irq() would proceed to free resources and to clear the vector because synchronize_hardirq() does not see an interrupt handler in progress. That can trigger a spurious interrupt warning, which is harmless and just confuses users, but it also can leave the remote IRR in a stale state because once the handler is invoked the interrupt resources might be freed already and therefore acknowledgement is not possible anymore. Implement the irq_get_irqchip_state() callback for the IO-APIC irq chip. The callback is invoked from free_irq() via __synchronize_hardirq(). Check the remote IRR bit of the interrupt and return 'in flight' if it is set and the interrupt is configured in level mode. For edge mode the remote IRR has no meaning. As this is only meaningful for level triggered interrupts this won't cure the potential spurious interrupt warning for edge triggered interrupts, but the edge trigger case does not result in stale hardware state. This has to be addressed at the vector/interrupt entry level seperately. Fixes: `464d12309e` ("x86/vector: Switch IOAPIC to global reservation mode") Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/20190628111440.370295517@linutronix.de	2019-07-03 10:12:30 +02:00
Linus Torvalds	4b1fe9b58e	arm64 fixes for 5.2 - Fix module allocation when running with KASLR enabled - Fix broken build due to bug in LLVM linker (ld.lld) -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl0Z9bIACgkQt6xw3ITB YzS70gf/Trw6+Yy1dHSyz5f2W9OtedFFv+rEGcvUkF6kYFffw7taNj30K6otjkK7 CYPp9kWYpFhGgE7VwAfQ9NGyAwZ62IvGhQDYdAG72Y39zX7yQ4OHWKdr8K53KYN8 CThcgXxEPoZw1pP7fwXkaBiiljW6JGF64Hv3ybA1vzGmjiv6wdjO3pQlbXkJu4kk xlsLSLOZUDawcRuVNGWwPiToxopVTcAJ3lapYBVmO2dSO00QYv1jvJgV0tK6n68q ZQMJbTdNHLIKMRdLcDBGQAwetWkkZ5LazwuiaHQcSQcRgp7IkKrIvEz8vzkdAvcR jniDc7bbKYlvlJdiquIOH2l1ElEQyQ== =Pp2j -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Fix a build failure with the LLVM linker and a module allocation failure when KASLR is active: - Fix module allocation when running with KASLR enabled - Fix broken build due to bug in LLVM linker (ld.lld)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly arm64: kaslr: keep modules inside module region when KASAN is enabled	2019-07-03 15:57:30 +08:00
Nishad Kamdar	2200bbec12	powerpc: Use the correct style for SPDX License Identifier This patch corrects the SPDX License Identifier style in the powerpc Hardware Architecture related files. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:36 +10:00
Stewart Smith	41732bdc9c	powerpc/powernv-eeh: Consisely desribe what this file does If the previous comment made sense, continue debugging or call your doctor immediately. Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:36 +10:00
Krzysztof Kozlowski	14b2f7d908	powerpc/configs: Remove useless UEVENT_HELPER_PATH Remove the CONFIG_UEVENT_HELPER_PATH because: 1. It is disabled since commit `1be01d4a57` ("driver: base: Disable CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was made default to 'n', 2. It is not recommended (help message: "This should not be used today [...] creates a high system load") and was kept only for ancient userland, 3. Certain userland specifically requests it to be disabled (systemd README: "Legacy hotplug slows down the system and confuses udev"). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:36 +10:00
Christian Lamparter	3ab3a0689e	powerpc/4xx/uic: clear pending interrupt after irq type/pol change When testing out gpio-keys with a button, a spurious interrupt (and therefore a key press or release event) gets triggered as soon as the driver enables the irq line for the first time. This patch clears any potential bogus generated interrupt that was caused by the switching of the associated irq's type and polarity. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:36 +10:00
Suraj Jitindar Singh	6fbcdd5909	powerpc: Add barrier_nospec to raw_copy_in_user() Commit `ddf35cf376` ("powerpc: Use barrier_nospec in copy_from_user()") Added barrier_nospec before loading from user-controlled pointers. The intention was to order the load from the potentially user-controlled pointer vs a previous branch based on an access_ok() check or similar. In order to achieve the same result, add a barrier_nospec to the raw_copy_in_user() function before loading from such a user-controlled pointer. Fixes: `ddf35cf376` ("powerpc: Use barrier_nospec in copy_from_user()") Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:36 +10:00
Michael Neuling	3fefd1cd95	KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The code currently sets: CR0 <- 00 \|\| MSR[TS] but according to the ISA it should be: CR0 <- 0 \|\| MSR[TS] \|\| 0 This fixes the bit shift to put the bits in the correct location. This is a data integrity issue as CR0 is corrupted. Fixes: `4bb3c7a020` ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Cc: stable@vger.kernel.org # v4.17+ Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:36 +10:00
Alexey Kardashevskiy	5636427d08	powerpc/powernv: Fix stale iommu table base after VFIO The powernv platform uses @dma_iommu_ops for non-bypass DMA. These ops need an iommu_table pointer which is stored in dev->archdata.iommu_table_base. It is initialized during pcibios_setup_device() which handles boot time devices. However when a device is taken from the system in order to pass it through, the default IOMMU table is destroyed but the pointer in a device is not updated; also when a device is returned back to the system, a new table pointer is not stored in dev->archdata.iommu_table_base either. So when a just returned device tries using IOMMU, it crashes on accessing stale iommu_table or its members. This calls set_iommu_table_base() when the default window is created. Note it used to be there before but was wrongly removed (see "fixes"). It did not appear before as these days most devices simply use bypass. This adds set_iommu_table_base(NULL) when a device is taken from the system to make it clear that IOMMU DMA cannot be used past that point. Fixes: `c4e9d3c1e6` ("powerpc/powernv/pseries: Rework device adding to IOMMU groups") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Alexey Kardashevskiy	dead1c845d	powerpc/pci/of: Parse unassigned resources The pseries platform uses the PCI_PROBE_DEVTREE method of PCI probing which reads "assigned-addresses" of every PCI device and initializes the device resources. However if the property is missing or zero sized, then there is no fallback of any kind and the PCI resources remain undiscovered, i.e. pdev->resource[] array remains empty. This adds a fallback which parses the "reg" property in pretty much same way except it marks resources as "unset" which later make Linux assign those resources proper addresses. This has an effect when: 1. a hypervisor failed to assign any resource for a device; 2. /chosen/linux,pci-probe-only=0 is in the DT so the system may try assigning a resource. Neither is likely to happen under PowerVM. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Alexey Kardashevskiy	1a047cc7e5	powerpc/pseries/dma: Enable SWIOTLB So far the pseries platforms has always been using IOMMU making SWIOTLB unnecessary. Now we want secure guests which means devices can only access certain areas of guest physical memory; we are going to use SWIOTLB for this purpose. This allows SWIOTLB for pseries. By default there is no change in behavior. This enables SWIOTLB when the "swiotlb" kernel parameter is set to "force". With the SWIOTLB enabled, the kernel creates a directly mapped DMA window (using the usual DDW mechanism) and implements SWIOTLB on top of that. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Alexey Kardashevskiy	efd176a04b	powerpc/pseries/dma: Allow SWIOTLB The commit `8617a5c5bc` ("powerpc/dma: handle iommu bypass in dma_iommu_ops") merged direct DMA ops into the IOMMU DMA ops allowing SWIOTLB as well but only for mapping; the unmapping and bouncing parts were left unmodified. This adds missing direct unmapping calls to .unmap_page() and .unmap_sg(). This adds missing sync callbacks and directs them to the direct DMA hooks. Fixes: `8617a5c5bc` ("powerpc/dma: handle iommu bypass in dma_iommu_ops") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Christoph Hellwig	24911acd64	powerpc: remove device_to_mask() Use the dma_get_mask() helper from dma-mapping.h instead, as they are functionally identical. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Michael Neuling	a278e7ea60	powerpc: Fix compile issue with force DAWR If you compile with KVM but without CONFIG_HAVE_HW_BREAKPOINT you fail at linking with: arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x708): undefined reference to `dawr_force_enable' This was caused by commit `c1fe190c06` ("powerpc: Add force enable of DAWR on P9 option"). This moves a bunch of code around to fix this. It moves a lot of the DAWR code in a new file and creates a new CONFIG_PPC_DAWR to enable compiling it. Fixes: `c1fe190c06` ("powerpc: Add force enable of DAWR on P9 option") Signed-off-by: Michael Neuling <mikey@neuling.org> [mpe: Minor formatting in set_dawr()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Mathieu Malaterre	548c54acba	powerpc: silence a -Wcast-function-type warning in dawr_write_file_bool In commit `c1fe190c06` ("powerpc: Add force enable of DAWR on P9 option") the following piece of code was added: smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0); Since GCC 8 this triggers the following warning about incompatible function types: arch/powerpc/kernel/hw_breakpoint.c:408:21: error: cast between incompatible function types from 'int ()(struct arch_hw_breakpoint )' to 'void ()(void )' [-Werror=cast-function-type] Since the warning is there for a reason, and should not be hidden behind a cast, provide an intermediate callback function to avoid the warning. Fixes: `c1fe190c06` ("powerpc: Add force enable of DAWR on P9 option") Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Nicholas Piggin	6c46fcce39	powerpc/64s/radix: keep kernel ERAT over local process/guest invalidates ISA v3.0 radix modes provide SLBIA variants which can invalidate ERAT for effPID!=0 or for effLPID!=0, which allows user and guest invalidations to retain kernel/host ERAT entries. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Nicholas Piggin	fe7946ce08	powerpc/64s: Rename PPC_INVALIDATE_ERAT to PPC_ISA_3_0_INVALIDATE_ERAT This makes it clear to the caller that it can only be used on POWER9 and later CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use "ISA_3_0" rather than "ARCH_300"] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Nicholas Piggin	293c2e27b9	powerpc/64s/exception: simplify hmi control flow Branch to the relocated 0xc000 address early (still in real mode), to simplify subsequent branches. Have the virt mode handler avoid just 'windup' and redo the exception from scratch, rather than branching back to the trampoline. Rearrange the stack setup instruction location to match the system reset handler (e.g., right before EXCEPTION_PROLOG_COMMON). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Nicholas Piggin	f34c9675ca	powerpc/64s/exception: hmi remove special case macro No code change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Nicholas Piggin	acc8da4492	powerpc/64s/exception: sreset move trampoline ahead of common code Follow convention and move tramp ahead of common. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:19:35 +10:00
Nicholas Piggin	0e10be2bb9	powerpc/64s/exception: optimise system_reset for idle, clean up non-idle case The idle wake up code in the system reset interrupt is not very optimal. There are two requirements: perform idle wake up quickly; and save everything including CFAR for non-idle interrupts, with no performance requirement. The problem with placing the idle test in the middle of the handler and using the normal handler code to save CFAR, is that it's quite costly (e.g., mfcfar is serialising, speculative workarounds get applied, SRR1 has to be reloaded, etc). It also prevents the standard interrupt handler boilerplate being used. This pain can be avoided by using a dedicated idle interrupt handler at the start of the interrupt handler, which restores all registers back to the way they were in case it was not an idle wake up. CFAR is preserved without saving it before the non-idle case by making that the fall-through, and idle is a taken branch. Performance seems to be in the noise, but possibly around 0.5% faster, the executed instructions certainly look better. The bigger benefit is being able to drop in standard interrupt handlers after the idle code, which helps with subsequent cleanup and consolidation. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fixup BE by using DOTSYM for idle_return_gpr_loss call] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-07-03 15:18:46 +10:00
Olof Johansson	2659dc8d22	This set of patches fixes regressions introduced in v5.2 kernel when DA8xx OHCI driver was converted over to use GPIO regulators. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJdG3dJAAoJEGFBu2jqvgRNUdEP/06CH3kwX/VLJjyjG8lpm504 T8Xr1akrvwjUCtdMy2XFvGfP7Mg8cABIKN8eUqbCjusw4E4mCCh1AMnLoXUtZN1M LK/MNQ4WznREu4ZNjj/nqAAfKV/mGB2Ym7GToj57p0C6DXXxXE4GKpD2Tlrpxm8g cLk7efyCU5j5/MM9fgWVViV5LEf15tR6jRY4M4FRhZUF0CP/MWsAnrsmZuywIfSi ihGrYfdkyhAjgYInCQw3ZMNtUD+0Ohf1QmKkzmHhE7UF5FklBLFY6uOPHdwcYqPL ymhToPRquYIAU95vuCXHZRSV5HaeM47yn1SegCeNvSLr0QiEI5+31l9QOAku/dbs JLlW4hXeLvsayVF3c27K0QztDTEpv+jYDEJQ2sxjkoJkdJsQZc//tWJObw8cfWlV b61gM1bcbYOVQX3uFDdznjreKmdk+6oGhzFAeGY05jIHBL/y+afFYw0U6knolJAy p9FisEcI8XdAP2CBxc9W+Yy7YM5/H6Lvx5MmFCWfzOYl+InKL1IyIjNuCXmyjLSr WBxnM9Dpry2aaZYY6XN5h1wO34jmIduyQGBQO+n2fsgnBuy4J4i1+SEodZcfeFgv FZmLxJVGi/CHs2iypDoa3hU8Ny9/hr5ErN8fSai3a5vDNKML8VEpml+IbA6Xbxe1 l/PRRsxag7uvJOfoQW6f =vwpF -----END PGP SIGNATURE----- Merge tag 'davinci-fixes-for-v5.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into arm/fixes This set of patches fixes regressions introduced in v5.2 kernel when DA8xx OHCI driver was converted over to use GPIO regulators. * tag 'davinci-fixes-for-v5.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci: ARM: davinci: da830-evm: fix GPIO lookup for OHCI ARM: davinci: omapl138-hawk: add missing regulator constraints for OHCI ARM: davinci: da830-evm: add missing regulator constraints for OHCI + Linux 5.2-rc7 Signed-off-by: Olof Johansson <olof@lixom.net>	2019-07-02 15:13:20 -07:00
Marek Majkowski	c4683cd5fb	um: Fix kcov crash during startup Kcov fails to start when compiled with kcov. Disable KCOV on arch/uml/kernel/skas. $ gdb -q -ex r ./vmlinux Program received signal SIGSEGV, Segmentation fault. check_kcov_mode (t=<>, needed_mode=<>) at kernel/kcov.c:70 70 mode = READ_ONCE(t->kcov_mode); Signed-off-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:42 +02:00
Krzysztof Kozlowski	80b81cdc66	um: configs: Remove useless UEVENT_HELPER_PATH Remove the CONFIG_UEVENT_HELPER_PATH because: 1. It is disabled since commit `1be01d4a57` ("driver: base: Disable CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was made default to 'n', 2. It is not recommended (help message: "This should not be used today [...] creates a high system load") and was kept only for ancient userland, 3. Certain userland specifically requests it to be disabled (systemd README: "Legacy hotplug slows down the system and confuses udev"). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:41 +02:00
Johannes Berg	065038706f	um: Support time travel mode Sometimes it can be useful to run with "time travel" inside the UML instance, for example for testing. For example, some tests for the wireless subsystem and userspace are based on hwsim, a virtual wireless adapter. Some tests can take a long time to run because they e.g. wait for 120 seconds to elapse for some regulatory checks. This obviously goes faster if it need not actually wait that long, but time inside the test environment just "bumps up" when there's nothing to do. Add CONFIG_UML_TIME_TRAVEL_SUPPORT to enable code to support such modes at runtime, selected on the command line: * just "time-travel", in which time inside the UML instance can move faster than real time, if there's nothing to do * "time-travel=inf-cpu" in which time also moves slower and any CPU processing takes no time at all, which allows to implement consistent behaviour regardless of host CPU load (or speed) or debug overhead. An additional "time-travel-start=<seconds>" parameter is also supported in this case to start the wall clock at this time (in unix epoch). With this enabled, the test mentioned above goes from a runtime of about 140 seconds (with startup overhead and all) to being CPU bound and finishing in 15 seconds (on my slow laptop). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:36 +02:00
Johannes Berg	c7c6f3b953	um: Pass nsecs to os timer functions This makes the code clearer and lets the time travel patch have the actual time used for these functions in just one place. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:29 +02:00
Johannes Berg	b00bdd3244	um: Remove drivers/ssl.h This file just contains two unused prototypes, remove it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:24 +02:00
Johannes Berg	c7f04e87e4	um: Don't garbage collect in deactivate_all_fds() My previous commit didn't actually address the whole issue with lockdep shutdown, I had another local modification that disabled lockdep but that wasn't sufficient alone, so had to do the other change. Another issue remained though - during kfree() we acquire locks and lockdep tries to annotate those with exactly the same issue in the other patch - we no longer have "current". So, just remove the garbage collection. There's no value in it anyway since we're going to shut down anyway and marking a slab object as free is now not very useful anymore. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:19 +02:00
Johannes Berg	80bf6ceaf9	um: Silence lockdep complaint about mmap_sem When we get into activate_mm(), lockdep complains that we're doing something strange: WARNING: possible circular locking dependency detected 5.1.0-10252-gb00152307319-dirty #121 Not tainted ------------------------------------------------------ inside.sh/366 is trying to acquire lock: (____ptrval____) (&(&p->alloc_lock)->rlock){+.+.}, at: flush_old_exec+0x703/0x8d7 but task is already holding lock: (____ptrval____) (&mm->mmap_sem){++++}, at: flush_old_exec+0x6c5/0x8d7 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mm->mmap_sem){++++}: [...] __lock_acquire+0x12ab/0x139f lock_acquire+0x155/0x18e down_write+0x3f/0x98 flush_old_exec+0x748/0x8d7 load_elf_binary+0x2ca/0xddb [...] -> #0 (&(&p->alloc_lock)->rlock){+.+.}: [...] __lock_acquire+0x12ab/0x139f lock_acquire+0x155/0x18e _raw_spin_lock+0x30/0x83 flush_old_exec+0x703/0x8d7 load_elf_binary+0x2ca/0xddb [...] other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(&(&p->alloc_lock)->rlock); lock(&mm->mmap_sem); lock(&(&p->alloc_lock)->rlock); * DEADLOCK * 2 locks held by inside.sh/366: #0: (____ptrval____) (&sig->cred_guard_mutex){+.+.}, at: __do_execve_file+0x12d/0x869 #1: (____ptrval____) (&mm->mmap_sem){++++}, at: flush_old_exec+0x6c5/0x8d7 stack backtrace: CPU: 0 PID: 366 Comm: inside.sh Not tainted 5.1.0-10252-gb00152307319-dirty #121 Stack: [...] Call Trace: [<600420de>] show_stack+0x13b/0x155 [<6048906b>] dump_stack+0x2a/0x2c [<6009ae64>] print_circular_bug+0x332/0x343 [<6009c5c6>] check_prev_add+0x669/0xdad [<600a06b4>] __lock_acquire+0x12ab/0x139f [<6009f3d0>] lock_acquire+0x155/0x18e [<604a07e0>] _raw_spin_lock+0x30/0x83 [<60151e6a>] flush_old_exec+0x703/0x8d7 [<601a8eb8>] load_elf_binary+0x2ca/0xddb [...] I think it's because in exec_mmap() we have down_read(&old_mm->mmap_sem); ... task_lock(tsk); ... activate_mm(active_mm, mm); (which does down_write(&mm->mmap_sem)) I'm not really sure why lockdep throws in the whole knowledge about the task lock, but it seems that old_mm and mm shouldn't ever be the same (and it doesn't deadlock) so tell lockdep that they're different. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:11 +02:00
Johannes Berg	8eacd6fca4	um: Remove locking in deactivate_all_fds() Not only does the locking contradict the comment, and as the comment says is pointless and actually harmful (all the actual OS threads have exited already), but it also causes crashes when lockdep is enabled, because calling into the spinlock calls into lockdep, which then tries to determine the current task, which no longer exists. Remove the locking to let UML shut down cleanly in case lockdep is enabled. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:05 +02:00
Johannes Berg	56fc187065	um: Timer code cleanup There are some unused functions, and some others that have unused arguments; clean up the timer code a bit. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:27:00 +02:00
Johannes Berg	fcd242c6c8	um: fix os_timer_one_shot() os_timer_one_shot() gets passed a value "unsigned long delta", so must not have an "int ticks" as that actually ends up being -1, and thus triggering a timer over and over again. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:26:57 +02:00
Jouni Malinen	bebe4681d0	um: Fix IRQ controller regression on console read The conversion of UML to use epoll based IRQ controller claimed that clone_one_chan() can safely call um_free_irq() while starting to ignore the delay_free_irq parameter that explicitly noted that the IRQ cannot be freed because this is being called from chan_interrupt(). This resulted in free_irq() getting called in interrupt context ("Trying to free IRQ 6 from IRQ context!"). Fix this by restoring previously used delay_free_irq processing. Fixes: `ff6a17989c` ("Epoll based IRQ controller") Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2019-07-02 23:26:52 +02:00
Jiri Kosina	074376ac0e	ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare() ftrace_arch_code_modify_prepare() is acquiring text_mutex, while the corresponding release is happening in ftrace_arch_code_modify_post_process(). This has already been documented in the code, but let's also make the fact that this is intentional clear to the semantic analysis tools such as sparse. Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1906292321170.27227@cbobk.fhfr.pm Fixes: `39611265ed` ("ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()") Fixes: `d5b844a2cf` ("ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2019-07-02 15:41:35 -04:00
Christoph Hellwig	514caf23a7	memremap: replace the altmap_valid field with a PGMAP_ALTMAP_VALID flag Add a flags field to struct dev_pagemap to replace the altmap_valid boolean to be a little more extensible. Also add a pgmap_altmap() helper to find the optional altmap and clean up the code using the altmap using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-07-02 14:32:44 -03:00
Wanpeng Li	bb34e690e9	KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC Thomas reported that: \| Background: \| \| In preparation of supporting IPI shorthands I changed the CPU offline \| code to software disable the local APIC instead of just masking it. \| That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV \| register. \| \| Failure: \| \| When the CPU comes back online the startup code triggers occasionally \| the warning in apic_pending_intr_clear(). That complains that the IRRs \| are not empty. \| \| The offending vector is the local APIC timer vector who's IRR bit is set \| and stays set. \| \| It took me quite some time to reproduce the issue locally, but now I can \| see what happens. \| \| It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1 \| (and hardware support) it behaves correctly. \| \| Here is the series of events: \| \| Guest CPU \| \| goes down \| \| native_cpu_disable() \| \| apic_soft_disable(); \| \| play_dead() \| \| .... \| \| startup() \| \| if (apic_enabled()) \| apic_pending_intr_clear() <- Not taken \| \| enable APIC \| \| apic_pending_intr_clear() <- Triggers warning because IRR is stale \| \| When this happens then the deadline timer or the regular APIC timer - \| happens with both, has fired shortly before the APIC is disabled, but the \| interrupt was not serviced because the guest CPU was in an interrupt \| disabled region at that point. \| \| The state of the timer vector ISR/IRR bits: \| \| ISR IRR \| before apic_soft_disable() 0 1 \| after apic_soft_disable() 0 1 \| \| On startup 0 1 \| \| Now one would assume that the IRR is cleared after the INIT reset, but this \| happens only on CPU0. \| \| Why? \| \| Because our CPU0 hotplug is just for testing to make sure nothing breaks \| and goes through an NMI wakeup vehicle because INIT would send it through \| the boots-trap code which is not really working if that CPU was not \| physically unplugged. \| \| Now looking at a real world APIC the situation in that case is: \| \| ISR IRR \| before apic_soft_disable() 0 1 \| after apic_soft_disable() 0 1 \| \| On startup 0 0 \| \| Why? \| \| Once the dying CPU reenables interrupts the pending interrupt gets \| delivered as a spurious interupt and then the state is clear. \| \| While that CPU0 hotplug test case is surely an esoteric issue, the APIC \| emulation is still wrong, Even if the play_dead() code would not enable \| interrupts then the pending IRR bit would turn into an ISR .. interrupt \| when the APIC is reenabled on startup. From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled * Pending interrupts in the IRR and ISR registers are held and require masking or handling by the CPU. In Thomas's testing, hardware cpu will not respect soft disable LAPIC when IRR has already been set or APICv posted-interrupt is in flight, so we can skip soft disable APIC checking when clearing IRR and set ISR, continue to respect soft disable APIC when attempting to set IRR. Reported-by: Rong Chen <rong.a.chen@intel.com> Reported-by: Feng Tang <feng.tang@intel.com> Reported-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rong Chen <rong.a.chen@intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 19:02:46 +02:00
Liran Alon	323d73a8ec	KVM: nVMX: Change KVM_STATE_NESTED_EVMCS to signal vmcs12 is copied from eVMCS Currently KVM_STATE_NESTED_EVMCS is used to signal that eVMCS capability is enabled on vCPU. As indicated by vmx->nested.enlightened_vmcs_enabled. This is quite bizarre as userspace VMM should make sure to expose same vCPU with same CPUID values in both source and destination. In case vCPU is exposed with eVMCS support on CPUID, it is also expected to enable KVM_CAP_HYPERV_ENLIGHTENED_VMCS capability. Therefore, KVM_STATE_NESTED_EVMCS is redundant. KVM_STATE_NESTED_EVMCS is currently used on restore path (vmx_set_nested_state()) only to enable eVMCS capability in KVM and to signal need_vmcs12_sync such that on next VMEntry to guest nested_sync_from_vmcs12() will be called to sync vmcs12 content into eVMCS in guest memory. However, because restore nested-state is rare enough, we could have just modified vmx_set_nested_state() to always signal need_vmcs12_sync. From all the above, it seems that we could have just removed the usage of KVM_STATE_NESTED_EVMCS. However, in order to preserve backwards migration compatibility, we cannot do that. (vmx_get_nested_state() needs to signal flag when migrating from new kernel to old kernel). Returning KVM_STATE_NESTED_EVMCS when just vCPU have eVMCS enabled have a bad side-effect of userspace VMM having to send nested-state from source to destination as part of migration stream. Even if guest have never used eVMCS as it doesn't even run a nested hypervisor workload. This requires destination userspace VMM and KVM to support setting nested-state. Which make it more difficult to migrate from new host to older host. To avoid this, change KVM_STATE_NESTED_EVMCS to signal eVMCS is not only enabled but also active. i.e. Guest have made some eVMCS active via an enlightened VMEntry. i.e. vmcs12 is copied from eVMCS and therefore should be restored into eVMCS resident in memory (by copy_vmcs12_to_enlightened()). Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Maran Wilson <maran.wilson@oracle.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 19:02:45 +02:00
Liran Alon	65b712f156	KVM: nVMX: Allow restore nested-state to enable eVMCS when vCPU in SMM As comment in code specifies, SMM temporarily disables VMX so we cannot be in guest mode, nor can VMLAUNCH/VMRESUME be pending. However, code currently assumes that these are the only flags that can be set on kvm_state->flags. This is not true as KVM_STATE_NESTED_EVMCS can also be set on this field to signal that eVMCS should be enabled. Therefore, fix code to check for guest-mode and pending VMLAUNCH/VMRESUME explicitly. Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 19:02:44 +02:00
Paolo Bonzini	3f16a5c318	KVM: x86: degrade WARN to pr_warn_ratelimited This warning can be triggered easily by userspace, so it should certainly not cause a panic if panic_on_warn is set. Reported-by: syzbot+c03f30b4f4c46bdf8575@syzkaller.appspotmail.com Suggested-by: Alexander Potapenko <glider@google.com> Acked-by: Alexander Potapenko <glider@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 19:02:44 +02:00
Jim Mattson	c550505b57	kvm: x86: Pass through AMD_STIBP_ALWAYS_ON in GET_SUPPORTED_CPUID This bit is purely advisory. Passing it through to the guest indicates that the virtual processor, like the physical processor, prefers that STIBP is only set once during boot and not changed. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 19:01:56 +02:00
Jim Mattson	b119019847	kvm: nVMX: Remove unnecessary sync_roots from handle_invept When L0 is executing handle_invept(), the TDP MMU is active. Emulating an L1 INVEPT does require synchronizing the appropriate shadow EPT root(s), but a call to kvm_mmu_sync_roots in this context won't do that. Similarly, the hardware TLB and paging-structure-cache entries associated with the appropriate shadow EPT root(s) must be flushed, but requesting a TLB_FLUSH from this context won't do that either. How did this ever work? KVM always does a sync_roots and TLB flush (in the correct context) when transitioning from L1 to L2. That isn't the best choice for nested VM performance, but it effectively papers over the mistakes here. Remove the unnecessary operations and leave a comment to try to do better in the future. Reported-by: Junaid Shahid <junaids@google.com> Fixes: `bfd0a56b90` ("nEPT: Nested INVEPT") Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Nadav Har'El <nyh@il.ibm.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Xinhao Xu <xinhao.xu@intel.com> Cc: Yang Zhang <yang.z.zhang@Intel.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by Peter Shier <pshier@google.com> Reviewed-by: Junaid Shahid <junaids@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 19:01:56 +02:00
Wanpeng Li	32b72ecc83	KVM: X86: Expose PV_SCHED_YIELD CPUID feature bit to guest Expose PV_SCHED_YIELD feature bit to guest, the guest can check this feature bit before using paravirtualized sched yield. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 18:56:05 +02:00
Wanpeng Li	715062970f	KVM: X86: Implement PV sched yield hypercall The target vCPUs are in runnable state after vcpu_kick and suitable as a yield target. This patch implements the sched yield hypercall. 17% performance increasement of ebizzy benchmark can be observed in an over-subscribe environment. (w/ kvm-pv-tlb disabled, testing TLB flush call-function IPI-many since call-function is not easy to be trigged by userspace workload). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 18:56:04 +02:00
Wanpeng Li	f85f6e7bc9	KVM: X86: Yield to IPI target if necessary When sending a call-function IPI-many to vCPUs, yield if any of the IPI target vCPUs was preempted, we just select the first preempted target vCPU which we found since the state of target vCPUs can change underneath and to avoid race conditions. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-07-02 18:56:01 +02:00

... 3 4 5 6 7 ...

161775 Commits (6d6486a0c59759681e75d1a2bd6684c501fcbd0e)