Commit graph

166225 commits

Author SHA1 Message Date
Martin KaFai Lau 85d33df357 bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS
The patch introduces BPF_MAP_TYPE_STRUCT_OPS.  The map value
is a kernel struct with its func ptr implemented in bpf prog.
This new map is the interface to register/unregister/introspect
a bpf implemented kernel struct.

The kernel struct is actually embedded inside another new struct
(or called the "value" struct in the code).  For example,
"struct tcp_congestion_ops" is embbeded in:
struct bpf_struct_ops_tcp_congestion_ops {
	refcount_t refcnt;
	enum bpf_struct_ops_state state;
	struct tcp_congestion_ops data;  /* <-- kernel subsystem struct here */
}
The map value is "struct bpf_struct_ops_tcp_congestion_ops".
The "bpftool map dump" will then be able to show the
state ("inuse"/"tobefree") and the number of subsystem's refcnt (e.g.
number of tcp_sock in the tcp_congestion_ops case).  This "value" struct
is created automatically by a macro.  Having a separate "value" struct
will also make extending "struct bpf_struct_ops_XYZ" easier (e.g. adding
"void (*init)(void)" to "struct bpf_struct_ops_XYZ" to do some
initialization works before registering the struct_ops to the kernel
subsystem).  The libbpf will take care of finding and populating the
"struct bpf_struct_ops_XYZ" from "struct XYZ".

Register a struct_ops to a kernel subsystem:
1. Load all needed BPF_PROG_TYPE_STRUCT_OPS prog(s)
2. Create a BPF_MAP_TYPE_STRUCT_OPS with attr->btf_vmlinux_value_type_id
   set to the btf id "struct bpf_struct_ops_tcp_congestion_ops" of the
   running kernel.
   Instead of reusing the attr->btf_value_type_id,
   btf_vmlinux_value_type_id s added such that attr->btf_fd can still be
   used as the "user" btf which could store other useful sysadmin/debug
   info that may be introduced in the furture,
   e.g. creation-date/compiler-details/map-creator...etc.
3. Create a "struct bpf_struct_ops_tcp_congestion_ops" object as described
   in the running kernel btf.  Populate the value of this object.
   The function ptr should be populated with the prog fds.
4. Call BPF_MAP_UPDATE with the object created in (3) as
   the map value.  The key is always "0".

During BPF_MAP_UPDATE, the code that saves the kernel-func-ptr's
args as an array of u64 is generated.  BPF_MAP_UPDATE also allows
the specific struct_ops to do some final checks in "st_ops->init_member()"
(e.g. ensure all mandatory func ptrs are implemented).
If everything looks good, it will register this kernel struct
to the kernel subsystem.  The map will not allow further update
from this point.

Unregister a struct_ops from the kernel subsystem:
BPF_MAP_DELETE with key "0".

Introspect a struct_ops:
BPF_MAP_LOOKUP_ELEM with key "0".  The map value returned will
have the prog _id_ populated as the func ptr.

The map value state (enum bpf_struct_ops_state) will transit from:
INIT (map created) =>
INUSE (map updated, i.e. reg) =>
TOBEFREE (map value deleted, i.e. unreg)

The kernel subsystem needs to call bpf_struct_ops_get() and
bpf_struct_ops_put() to manage the "refcnt" in the
"struct bpf_struct_ops_XYZ".  This patch uses a separate refcnt
for the purose of tracking the subsystem usage.  Another approach
is to reuse the map->refcnt and then "show" (i.e. during map_lookup)
the subsystem's usage by doing map->refcnt - map->usercnt to filter out
the map-fd/pinned-map usage.  However, that will also tie down the
future semantics of map->refcnt and map->usercnt.

The very first subsystem's refcnt (during reg()) holds one
count to map->refcnt.  When the very last subsystem's refcnt
is gone, it will also release the map->refcnt.  All bpf_prog will be
freed when the map->refcnt reaches 0 (i.e. during map_free()).

Here is how the bpftool map command will look like:
[root@arch-fb-vm1 bpf]# bpftool map show
6: struct_ops  name dctcp  flags 0x0
	key 4B  value 256B  max_entries 1  memlock 4096B
	btf_id 6
[root@arch-fb-vm1 bpf]# bpftool map dump id 6
[{
        "value": {
            "refcnt": {
                "refs": {
                    "counter": 1
                }
            },
            "state": 1,
            "data": {
                "list": {
                    "next": 0,
                    "prev": 0
                },
                "key": 0,
                "flags": 2,
                "init": 24,
                "release": 0,
                "ssthresh": 25,
                "cong_avoid": 30,
                "set_state": 27,
                "cwnd_event": 28,
                "in_ack_event": 26,
                "undo_cwnd": 29,
                "pkts_acked": 0,
                "min_tso_segs": 0,
                "sndbuf_expand": 0,
                "cong_control": 0,
                "get_info": 0,
                "name": [98,112,102,95,100,99,116,99,112,0,0,0,0,0,0,0
                ],
                "owner": 0
            }
        }
    }
]

Misc Notes:
* bpf_struct_ops_map_sys_lookup_elem() is added for syscall lookup.
  It does an inplace update on "*value" instead returning a pointer
  to syscall.c.  Otherwise, it needs a separate copy of "zero" value
  for the BPF_STRUCT_OPS_STATE_INIT to avoid races.

* The bpf_struct_ops_map_delete_elem() is also called without
  preempt_disable() from map_delete_elem().  It is because
  the "->unreg()" may requires sleepable context, e.g.
  the "tcp_unregister_congestion_control()".

* "const" is added to some of the existing "struct btf_func_model *"
  function arg to avoid a compiler warning caused by this patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003505.3855919-1-kafai@fb.com
2020-01-09 08:46:18 -08:00
David S. Miller 31d518f35e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Simple overlapping changes in bpf land wrt. bpf_helper_defs.h
handling.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-31 13:37:13 -08:00
Olof Johansson 1833e327a5 riscv: export flush_icache_all to modules
This is needed by LKDTM (crash dump test module), it calls
flush_icache_range(), which on RISC-V turns into flush_icache_all(). On
other architectures, the actual implementation is exported, so follow
that precedence and export it here too.

Fixes build of CONFIG_LKDTM that fails with:
ERROR: "flush_icache_all" [drivers/misc/lkdtm/lkdtm.ko] undefined!

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-27 21:51:01 -08:00
David Abdurachmanov 556f47ac60 riscv: reject invalid syscalls below -1
Running "stress-ng --enosys 4 -t 20 -v" showed a large number of kernel oops
with "Unable to handle kernel paging request at virtual address" message. This
happens when enosys stressor starts testing random non-valid syscalls.

I forgot to redirect any syscall below -1 to sys_ni_syscall.

With the patch kernel oops messages are gone while running stress-ng enosys
stressor.

Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Fixes: 5340627e3f ("riscv: add support for SECCOMP and SECCOMP_FILTER")
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-27 21:50:57 -08:00
Luc Van Oostenryck 4d47ce158e riscv: fix compile failure with EXPORT_SYMBOL() & !MMU
When support for !MMU was added, the declaration of
__asm_copy_to_user() & __asm_copy_from_user() were #ifdefed
out hence their EXPORT_SYMBOL() give an error message like:
  .../riscv_ksyms.c:13:15: error: '__asm_copy_to_user' undeclared here
  .../riscv_ksyms.c:14:15: error: '__asm_copy_from_user' undeclared here

Since these symbols are not defined with !MMU it's wrong to export them.
Same for __clear_user() (even though this one is also declared in
include/asm-generic/uaccess.h and thus doesn't give an error message).

Fix this by doing the EXPORT_SYMBOL() directly where these symbols
are defined: inside lib/uaccess.S itself.

Fixes: 6bd33e1ece ("riscv: fix compile failure with EXPORT_SYMBOL() & !MMU")
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-27 21:44:36 -08:00
David S. Miller 2bbc078f81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-12-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 127 non-merge commits during the last 17 day(s) which contain
a total of 110 files changed, 6901 insertions(+), 2721 deletions(-).

There are three merge conflicts. Conflicts and resolution looks as follows:

1) Merge conflict in net/bpf/test_run.c:

There was a tree-wide cleanup c593642c8b ("treewide: Use sizeof_field() macro")
which gets in the way with b590cb5f80 ("bpf: Switch to offsetofend in
BPF_PROG_TEST_RUN"):

  <<<<<<< HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) +
                             sizeof_field(struct __sk_buff, priority),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority),
  >>>>>>> 7c8dce4b16

There are a few occasions that look similar to this. Always take the chunk with
offsetofend(). Note that there is one where the fields differ in here:

  <<<<<<< HEAD
          if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) +
                             sizeof_field(struct __sk_buff, tstamp),
  =======
          if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs),
  >>>>>>> 7c8dce4b16

Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to
850a88cc40 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN").

2) Merge conflict in arch/riscv/net/bpf_jit_comp.c:

(I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.)

  <<<<<<< HEAD
          if (is_13b_check(off, insn))
                  return -1;
          emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx);
  =======
          emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx);
  >>>>>>> 7c8dce4b16

Result should look like:

          emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx);

3) Merge conflict in arch/riscv/include/asm/pgtable.h:

  <<<<<<< HEAD
  =======
  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  #define vmemmap         ((struct page *)VMEMMAP_START)

  >>>>>>> 7c8dce4b16

Only take the BPF_* defines from there and move them higher up in the
same file. Remove the rest from the chunk. The VMALLOC_* etc defines
got moved via 01f52e16b8 ("riscv: define vmemmap before pfn_to_page
calls"). Result:

  [...]
  #define __S101  PAGE_READ_EXEC
  #define __S110  PAGE_SHARED_EXEC
  #define __S111  PAGE_SHARED_EXEC

  #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
  #define VMALLOC_END      (PAGE_OFFSET - 1)
  #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)

  #define BPF_JIT_REGION_SIZE     (SZ_128M)
  #define BPF_JIT_REGION_START    (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
  #define BPF_JIT_REGION_END      (VMALLOC_END)

  /*
   * Roughly size the vmemmap space to be large enough to fit enough
   * struct pages to map half the virtual address space. Then
   * position vmemmap directly below the VMALLOC region.
   */
  #define VMEMMAP_SHIFT \
          (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
  #define VMEMMAP_SIZE    BIT(VMEMMAP_SHIFT)
  #define VMEMMAP_END     (VMALLOC_START - 1)
  #define VMEMMAP_START   (VMALLOC_START - VMEMMAP_SIZE)

  [...]

Let me know if there are any other issues.

Anyway, the main changes are:

1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific
   to a provided BPF object file. This provides an alternative, simplified API
   compared to standard libbpf interaction. Also, add libbpf extern variable
   resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko.

2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by
   generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also,
   add various BPF riscv JIT improvements, from Björn Töpel.

3) Extend bpftool to allow matching BPF programs and maps by name,
   from Paul Chaignon.

4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI
   flag for allowing updates without service interruption, from Andrey Ignatov.

5) Cleanup and simplification of ring access functions for AF_XDP with a
   bonus of 0-5% performance improvement, from Magnus Karlsson.

6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of
   audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa.

7) Move and extend test_select_reuseport into BPF program tests under
   BPF selftests, from Jakub Sitnicki.

8) Various BPF sample improvements for xdpsock for customizing parameters
   to set up and benchmark AF_XDP, from Jay Jayatheerthan.

9) Improve libbpf to provide a ulimit hint on permission denied errors.
   Also change XDP sample programs to attach in driver mode by default,
   from Toke Høiland-Jørgensen.

10) Extend BPF test infrastructure to allow changing skb mark from tc BPF
    programs, from Nikita V. Shirokov.

11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King.

12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after
    libbpf conversion, from Jesper Dangaard Brouer.

13) Minor misc improvements from various others.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-27 14:20:10 -08:00
David S. Miller ac80010fc9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Mere overlapping changes in the conflicts here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-22 15:15:05 -08:00
Linus Torvalds a313c8e056 PPC:
* Fix a bug where we try to do an ultracall on a system without an ultravisor.
 
 KVM:
 - Fix uninitialised sysreg accessor
 - Fix handling of demand-paged device mappings
 - Stop spamming the console on IMPDEF sysregs
 - Relax mappings of writable memslots
 - Assorted cleanups
 
 MIPS:
 - Now orphan, James Hogan is stepping down
 
 x86:
 - MAINTAINERS change, so long Radim and thanks for all the fish
 - supported CPUID fixes for AMD machines without SPEC_CTRL
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJd/1+WAAoJEL/70l94x66DFuYH/A8x/P6BuCpppdGoEw+VGy7X
 E8141dHTd7b1Wgi0kDNLRREr4QIfArvavGe0z0W8p4fGtcVjXdyhhfPd0UK6dfKG
 9P66phY4AGPjde/8q/qSdFup9yshpcFwSVYdRC0L1w86dBRlXwuqk6K5zsRyCU4b
 38v5Q3rPdMnWWB0K88/GMvAyQmPkgMOXJvhoecKeDQ+9IZ3ub6DBBNGM/xTJ9Y3z
 vUe2BoYkZ3KKn6sfP66PdprBVI1EOrrAoj/l4BSuo/yUPcQsxTihXMkh5iGl18TF
 h7TN9eq2Bn2ryh0TsaSK8opuePcotVvx7oll3ERtSV4e+89z5FDt4vVcY1VyRuc=
 =adm7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "PPC:
   - Fix a bug where we try to do an ultracall on a system without an
     ultravisor

  KVM:
   - Fix uninitialised sysreg accessor
   - Fix handling of demand-paged device mappings
   - Stop spamming the console on IMPDEF sysregs
   - Relax mappings of writable memslots
   - Assorted cleanups

  MIPS:
   - Now orphan, James Hogan is stepping down

  x86:
   - MAINTAINERS change, so long Radim and thanks for all the fish
   - supported CPUID fixes for AMD machines without SPEC_CTRL"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  MAINTAINERS: remove Radim from KVM maintainers
  MAINTAINERS: Orphan KVM for MIPS
  kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
  kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
  KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor
  KVM: arm/arm64: Properly handle faulting of device mappings
  KVM: arm64: Ensure 'params' is initialised when looking up sys register
  KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region
  KVM: arm64: Don't log IMP DEF sysreg traps
  KVM: arm64: Sanely ratelimit sysreg messages
  KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create()
  KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy()
  KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
2019-12-22 10:26:59 -08:00
Linus Torvalds 7214618c60 RISC-V updates for v5.5-rc3
Several fixes, and one cleanup, for RISC-V.
 
 Fixes:
 
 - Fix an error in a Kconfig file that resulted in an undefined Kconfig
   option "CONFIG_CONFIG_MMU"
 
 - Fix undefined Kconfig option "CONFIG_CONFIG_MMU"
 
 - Fix scratch register clearing in M-mode (affects nommu users)
 
 - Fix a mismerge on my part that broke the build for
   CONFIG_SPARSEMEM_VMEMMAP users
 
 Cleanups:
 
 - Move SiFive L2 cache-related code to drivers/soc, per request
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl3+2nAACgkQx4+xDQu9
 Kkvc7g//Qwd3/tv8SEhoGw23ThtHZJX+GOtKtY+5IrAFjrfLgs7/A0Jco0jqGH+8
 t9AMu3I1S7Qgt5aCQxMTs711T80PB21XL4bgUXBmnrY77BPj/ysIu7FHQghK9ctM
 6F+DIaPsSWmckxF7fV+xY2qE/BtiS8LwiFhyUlNoXF5Js30AtXyeEGxV9+3WTf3A
 fO8nrckDN6YNpcNT/rVsUqSNkXaYSQ1YhiUsgMRMU6tPPgds7hyPT4VaxO2ACzBN
 xSKxLXMuO3GqaQLgmn8y13JDdwBySKwtL8rJMxkWSsRTjQGf67VaZPt9seQFjGDm
 u46d03WCDIcHlr8yubUmy1iH4KuyVylCSllWlWHSk3jk1jwNDPNWa8LeNZlVGgRM
 btgw0Xid4DOBRlg/pe+PIoQvkrTeQPiNV+ABlxYGupGEJESSmOXI165y5eNapC7N
 RCnxnpUWF4lUUXDJwNOiw/YJnwqDERfrttdnzWMALatMQGtKOCrofSsUroewYx4I
 P9+4cISlk1x4LIm96Jl/t1NiNiT3tfOsXyqHMBJgcMSM4XT0wngMUV0ul1OA0gws
 iEpMVcQMQ4/C0nKZ0Nk7fk6TByOOybXH2wedApB1DbArqTFY9z3dwn6hC6x7pKCn
 X+rzwYRA/4WjSJ/IpgWhouehT/Ze1vwU+VxoOLK+4VUd0MF3PUc=
 =TpM4
 -----END PGP SIGNATURE-----

Merge tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
 "Several fixes, and one cleanup, for RISC-V.

  Fixes:

   - Fix an error in a Kconfig file that resulted in an undefined
     Kconfig option "CONFIG_CONFIG_MMU"

   - Fix undefined Kconfig option "CONFIG_CONFIG_MMU"

   - Fix scratch register clearing in M-mode (affects nommu users)

   - Fix a mismerge on my part that broke the build for
     CONFIG_SPARSEMEM_VMEMMAP users

  Cleanup:

   - Move SiFive L2 cache-related code to drivers/soc, per request"

* tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: move sifive_l2_cache.c to drivers/soc
  riscv: define vmemmap before pfn_to_page calls
  riscv: fix scratch register clearing in M-mode.
  riscv: Fix use of undefined config option CONFIG_CONFIG_MMU
2019-12-22 10:22:47 -08:00
Linus Torvalds 78bac77b52 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
    including adding a missing ipv6 match description.

 2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
    Bhat.

 3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.

 4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.

 5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.

 6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
    Chaignon.

 7) Multicast MAC limit test is off by one in qede, from Manish Chopra.

 8) Fix established socket lookup race when socket goes from
    TCP_ESTABLISHED to TCP_LISTEN, because there lacks an intervening
    RCU grace period. From Eric Dumazet.

 9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.

10) Fix active backup transition after link failure in bonding, from
    Mahesh Bandewar.

11) Avoid zero sized hash table in gtp driver, from Taehee Yoo.

12) Fix wrong interface passed to ->mac_link_up(), from Russell King.

13) Fix DSA egress flooding settings in b53, from Florian Fainelli.

14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost.

15) Fix double free in dpaa2-ptp code, from Ioana Ciornei.

16) Reject invalid MTU values in stmmac, from Jose Abreu.

17) Fix refcount leak in error path of u32 classifier, from Davide
    Caratti.

18) Fix regression causing iwlwifi firmware crashes on boot, from Anders
    Kaseorg.

19) Fix inverted return value logic in llc2 code, from Chan Shu Tak.

20) Disable hardware GRO when XDP is attached to qede, frm Manish
    Chopra.

21) Since we encode state in the low pointer bits, dst metrics must be
    at least 4 byte aligned, which is not necessarily true on m68k. Add
    annotations to fix this, from Geert Uytterhoeven.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits)
  sfc: Include XDP packet headroom in buffer step size.
  sfc: fix channel allocation with brute force
  net: dst: Force 4-byte alignment of dst_metrics
  selftests: pmtu: fix init mtu value in description
  hv_netvsc: Fix unwanted rx_table reset
  net: phy: ensure that phy IDs are correctly typed
  mod_devicetable: fix PHY module format
  qede: Disable hardware gro when xdp prog is installed
  net: ena: fix issues in setting interrupt moderation params in ethtool
  net: ena: fix default tx interrupt moderation interval
  net/smc: unregister ib devices in reboot_event
  net: stmmac: platform: Fix MDIO init for platforms without PHY
  llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
  net: hisilicon: Fix a BUG trigered by wrong bytes_compl
  net: dsa: ksz: use common define for tag len
  s390/qeth: don't return -ENOTSUPP to userspace
  s390/qeth: fix promiscuous mode after reset
  s390/qeth: handle error due to unsupported transport mode
  cxgb4: fix refcount init for TC-MQPRIO offload
  tc-testing: initial tdc selftests for cls_u32
  ...
2019-12-22 09:54:33 -08:00
Paolo Bonzini d68321dec1 PPC KVM fix for 5.5
- Fix a bug where we try to do an ultracall on a system without an
   ultravisor.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEv0VLfXa2m9eKuaRpnZrqdyxjcZ8FAl35s5kACgkQnZrqdyxj
 cZ8cwwf/UPCvZIYPSeYvzrCrlA+wlhBAh3bh47+ZXaNybOpss1xZ7QOFGkgoVBkn
 ES2Sdx3qgLvhmbR+nEKon8YCDVSwUj2ehwJu1nzAUzuVYw+m8OHGjdW07+go5KKi
 xZOndwBQGYaaWxch2O8Qw27TZU4lcVY/FNQiti5Ahg9dKK98CLyMsWnTms23ZjGD
 JMN/jCoMxa6godxWk3mSLaIwXj8P1P4pH3oiMFF8ngRTqyMgi1l02wim+DV10rD4
 5JoAF2kzSYngDlrhhQAsSOWrsWst1X2txcHA2QsoL7ZGYUQzzKyHH6zC6dS9eWk4
 ni70RLEnJj8YpsjwT52tFYokxwTPfQ==
 =kPkE
 -----END PGP SIGNATURE-----

Merge tag 'kvm-ppc-fixes-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master

PPC KVM fix for 5.5

- Fix a bug where we try to do an ultracall on a system without an
  ultravisor.
2019-12-22 13:18:15 +01:00
Linus Torvalds 60b04df6bf s390 updates for 5.5-rc3
- Fix unwinding from irq context of interrupted user process.
 
 - Add purgatory build missing symbols check. That helped to uncover and
   fix missing symbols when built with kasan support enabled.
 
 - Couple of ftrace fixes. Avoid broken stack trace and fix recursion
   loop in function_graph tracer.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3+dtoACgkQjYWKoQLX
 FBggAQf/WuAi4PPv67jbjn8L62iiVfwvpKVuOQUS4TxCn+AEIIoxAn1VwhqOwruR
 r2oVQyLw0toK2dOAbancV72tsujk3b3akoSfeQB9fmE4bo62pvmHWFf6sdupoU3C
 WWcWTFv/E4PpCV4uVDN6KLx7iohqJghm9sHllGuQzPBBdeyQSeUai1u2G+BZ8qf4
 9q4j5HafNoDUg1YcX8fWra73kZ6ggbZ4+PTwoKkM6iIaVJ3+vWOVt+DXE1RHKnbI
 7VqmI5XvO9eaTQmwJeZMDsYkS2TFVA04GRNkz97GlXFznikbSeCzKEQ6SqRixRfq
 6VezZSvKWacFuKqlqmq+lHUV56rdyw==
 =3/cw
 -----END PGP SIGNATURE-----

Merge tag 's390-5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Fix unwinding from irq context of interrupted user process.

 - Add purgatory build missing symbols check. That helped to uncover and
   fix missing symbols when built with kasan support enabled.

 - Couple of ftrace fixes. Avoid broken stack trace and fix recursion
   loop in function_graph tracer.

* tag 's390-5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/ftrace: save traced function caller
  s390/unwind: stop gracefully at user mode pt_regs in irq stack
  s390/purgatory: do not build purgatory with kcov, kasan and friends
  s390/purgatory: Make sure we fail the build if purgatory has missing symbols
  s390/ftrace: fix endless recursion in function_graph tracer
2019-12-21 12:17:14 -08:00
Linus Torvalds c4ff10efe8 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes: a BTS fix, a PT NMI handling fix, a PMU sysfs fix and an
  SRCU annotation"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Add SRCU annotation for pmus list walk
  perf/x86/intel: Fix PT PMI handling
  perf/x86/intel/bts: Fix the use of page_private()
  perf/x86: Fix potential out-of-bounds access
2019-12-21 10:51:00 -08:00
Linus Torvalds 6c1c79a5f4 Kbuild fixes for v5.5
- fix warning in out-of-tree 'make clean'
 
  - add READELF variable to the top Makefile
 
  - fix broken builds when LINUX_COMPILE_BY contains a backslash
 
  - fix build warning in kallsyms
 
  - fix NULL pointer access in expr_eq() in Kconfig
 
  - fix missing dependency on rsync in deb-pkg build
 
  - remove ---help--- from documentation
 
  - fix misleading documentation about directory descending
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl3+OssVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGCccP/ROhTMQviPaj1W8qgE5r18Yh5U7H
 8f3h2OuqX2mVn6Jnqhs83JvuAKKENczh884MCgvmZCq4eVGkxB5yigZjctpQYzpR
 hpLZRSCfv415wSTZOttsVmff4ovrJ0KpYfx6cMOTK7Hm+4UBRtDMK2xjE+x4ZNmH
 dX04q3oA5eoc1jx2cqP5nltiwFbfEk1hx8s9QwXc9JysYTlUbJZ4SQE7NGGEdk5+
 rZOA9koJOpcbqcGMjMusApD2lzl2RgP2/fbLdA68zlgpUV9xT3GiieQiSAkIi8gh
 +PcWjWVoZyuQUMeGG+3HmnUVfBaIGXxWLfERbB5LZEftUS7HRS5+i2DHvla67gPp
 alqxtz1sE5VeEmkq+msdrnZxqakNP5EjLnCa57BkVvRRphtKy1gijN9FuT8mjZUz
 UJA7CYhB2HIcwSzJIosRPhIhNudwpZidWU0nnjxmEatVgL5y2UG1UNe9Wfmh45Gf
 fNYQSMwL+LL/xiO2E1WACBqAhCfvGbCG3NEinO1IAe0SuMtF2Ha77Fo3CKqFtLyu
 ROB3ffuADIWID/tKe02BOnACcHXjHV0XEOc9CjRmn9pMw109Wy03EdfzgQyULTU7
 qKOWeE+dcnsoLwnUGrWzaqaqH3yADVy9pIKLYKtD/Veu6lLL3kJ1dOIdr6spSICO
 Ji3f8/3ADS8jJh/a
 =OKOu
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix warning in out-of-tree 'make clean'

 - add READELF variable to the top Makefile

 - fix broken builds when LINUX_COMPILE_BY contains a backslash

 - fix build warning in kallsyms

 - fix NULL pointer access in expr_eq() in Kconfig

 - fix missing dependency on rsync in deb-pkg build

 - remove ---help--- from documentation

 - fix misleading documentation about directory descending

* tag 'kbuild-fixes-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: clarify the difference between obj-y and obj-m w.r.t. descending
  kconfig: remove ---help--- from documentation
  scripts: package: mkdebian: add missing rsync dependency
  kconfig: don't crash on NULL expressions in expr_eq()
  scripts/kallsyms: fix offset overflow of kallsyms_relative_base
  mkcompile_h: use printf for LINUX_COMPILE_BY
  mkcompile_h: git rid of UTS_TRUNCATE from LINUX_COMPILE_{BY,HOST}
  x86/boot: kbuild: allow readelf executable to be specified
  kbuild: fix 'No such file or directory' warning when cleaning
2019-12-21 10:49:47 -08:00
Linus Torvalds 6210469417 Merge branch 'parisc-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pul parisc fixes from Helge Deller:
 "Two build error fixes, one for the soft_offline_page() parameter
  change and one for a specific KEXEC/KEXEC_FILE configuration, as well
  as a compiler and a linker warning fix"

* 'parisc-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix compiler warnings in debug_core.c
  parisc: soft_offline_page() now takes the pfn
  parisc: add missing __init annotation
  parisc: fix compilation when KEXEC=n and KEXEC_FILE=y
2019-12-21 06:49:41 -08:00
Linus Torvalds 6d04182dd3 powerpc fixes for 5.5 #4
Two weeks worth of accumulated fixes.
 
 A fix for a performance regression seen on PowerVM LPARs using dedicated CPUs,
 caused by our vcpu_is_preempted() returning true even for idle CPUs.
 
 One of the ultravisor support patches broke KVM on big endian hosts in v5.4.
 
 Our KUAP (Kernel User Access Prevention) code missed allowing access in
 __clear_user(), which could lead to an oops or erroneous SEGV when triggered via
 PTRACE_GETREGSET.
 
 Two fixes for the ocxl driver, an open/remove race, and a memory leak in an
 error path.
 
 A handful of other small fixes.
 
 Thanks to:
   Andrew Donnellan, Christian Zigotzky, Christophe Leroy, Christoph Hellwig,
   Daniel Axtens, David Hildenbrand, Frederic Barrat, Gautham R. Shenoy, Greg
   Kurz, Ihor Pasichnyk, Juri Lelli, Marcus Comstedt, Mike Rapoport, Parth Shah,
   Srikar Dronamraju, Vaidyanathan Srinivasan.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl39/EETHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNkxD/9o9t+e3IvmZSFNrN0krQCiQCQntNp9
 vYZSCq734ZKu1pIQxTOpOEAn11DhOcZZNULg3EC3WVFuTxz8i84yh2kd+rx94XO8
 diRwXIdowR3KkqP9spq3dlPZOJtSZxgj56uCrtdt3diTQNT3xEwT2h9Y2wQWKiTF
 GrRoDmgxlByVcOPinUobgFmho5DXWp7XLcJoV9skB/pHEPwo9a5DgN60jGDtlnkn
 Vpb2zQZDK2p6OhcZGdVbfrIcphK/uAR2a6uWWVdsRbreSj1UV03IlvipSthNPiQi
 Dv+B+gtQZJ/w1LvddE9R2ySfTeaeARd7XjwfZf/nn+oTEi+XHsRFmP+x73KDJllx
 GXGMckiNq21eii2O4xXvs/NiXJmFfbpg3tkkAJxrL45Ikv2/eWn3ugs07HHbS5uG
 NiM8b8o2PgR4XSCULc0LgJflVSqCEaujm4CukZ7Jbn4QMHe1edQFqu2W8cmY0Sxy
 C1h0DQWJXbAozGpF0+3mB8DNl6ru0CibIaXgWVSsF7nAc3w343o3HhtK5sX/UqGa
 fFPIRXuAr0fdkeb5Kv5Q5uzos3bdC5vJ77aQ+QvkyStg1fD+uf1ChUg2Uc+XFGBC
 oxN9hg8BA7HEoxzaUC5nzEOw1Abl1CssIB2vTUAkiOps1ypMk4HOR796MIQ+3EDO
 VnEfIK73yBmoLA==
 =d/F6
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Two weeks worth of accumulated fixes:

   - A fix for a performance regression seen on PowerVM LPARs using
     dedicated CPUs, caused by our vcpu_is_preempted() returning true
     even for idle CPUs.

   - One of the ultravisor support patches broke KVM on big endian hosts
     in v5.4.

   - Our KUAP (Kernel User Access Prevention) code missed allowing
     access in __clear_user(), which could lead to an oops or erroneous
     SEGV when triggered via PTRACE_GETREGSET.

   - Two fixes for the ocxl driver, an open/remove race, and a memory
     leak in an error path.

   - A handful of other small fixes.

  Thanks to: Andrew Donnellan, Christian Zigotzky, Christophe Leroy,
  Christoph Hellwig, Daniel Axtens, David Hildenbrand, Frederic Barrat,
  Gautham R. Shenoy, Greg Kurz, Ihor Pasichnyk, Juri Lelli, Marcus
  Comstedt, Mike Rapoport, Parth Shah, Srikar Dronamraju, Vaidyanathan
  Srinivasan"

* tag 'powerpc-5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  KVM: PPC: Book3S HV: Fix regression on big endian hosts
  powerpc: Fix __clear_user() with KUAP enabled
  powerpc/pseries/cmm: fix managed page counts when migrating pages between zones
  powerpc/8xx: fix bogus __init on mmu_mapin_ram_chunk()
  ocxl: Fix potential memory leak on context creation
  powerpc/irq: fix stack overflow verification
  powerpc: Ensure that swiotlb buffer is allocated from low memory
  powerpc/shared: Use static key to detect shared processor
  powerpc/vcpu: Assume dedicated processors as non-preempt
  ocxl: Fix concurrent AFU open and device removal
2019-12-21 06:17:05 -08:00
Linus Torvalds 5c741e2583 Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS fixes from Borislav Petkov:
 "Three urgent RAS fixes for the AMD side of things:

   - initialize struct mce.bank so that calculated error severity on AMD
     SMCA machines is correct

   - do not send IPIs early during bank initialization, when interrupts
     are disabled

   - a fix for when only a subset of MCA banks are enabled, which led to
     boot hangs on some new AMD CPUs"

* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix possibly incorrect severity calculation on AMD
  x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]
  x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()
2019-12-21 06:04:12 -08:00
Oleksij Rempel 4eb7ae7a30 MIPS: ath79: ar9331: add ar9331-switch node
Add switch node supported by dsa ar9331 driver.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20 17:05:47 -08:00
Linus Torvalds 3939f2c866 arm64 fixes:
- Leftover put_cpu() in the perf/smmuv3 error path.
 
 - Add Hisilicon TSV110 to spectre-v2 safe list
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl39DvwACgkQa9axLQDI
 XvFgIxAAkmYKNeK6EpnxiMKQANeD77X0qPcJQWsiG5Qu7Xc5AybnVlZGUPA9yVxL
 jZPyYtBULElZsQ6vbdmz+kXga4j51jVkvKt2ZTmnQvuEFDRY3Fby1lPWfrGGmXBX
 YLU9OpLTs7DacfOLey9KuEoKYd2fqnetY+hDH07UhhY+VZFpH7fnFz03vxoX9mvs
 dar4y+1fH5mbSfLAkprRR+LeNHUxYIkxAVZlSelHvcjr6epaniuv/uk4/bsgaoVS
 mnZ7gLDoUwJ4btV1w/iY55RRUp5lRO6igBi49DFJyOmoSltfeFEA7KGJC4K7F+hp
 qeO/9t3fCauCQCRCW6oSZmLbyiZyvLVBs0YcVrrsdfDjhwStygDhbJP/TIRZhcfC
 5PAcOt/8nbU04j5+kiGTMu9Enx1htZ6wT8ycj2hQcEH4Xenixh8T4JDEIo1fNitJ
 6GYhEBw8zukYouZO22ppcLALF1XKZ0MrDthhcLf7YSK58mtdXUxmont9mfA8v6Po
 iVty3ZmMnZGrqk1DgFKC2pEtZRCNx/vSzTzRpQffDntE0vsq+L9pqgG2mapM230w
 Wo4qZyKiRCfecD+PZeDvPaKzMAA4xMr32YFi3V/OK3OWoqkg/jRZaCi58xsDEzfG
 QmZhNlrHPZe3ObPGaUB+zer1c+ZJ3KG2s4bNk78zizcZ8IHzgYY=
 =+P0H
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Leftover put_cpu() in the perf/smmuv3 error path.

 - Add Hisilicon TSV110 to spectre-v2 safe list

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: cpu_errata: Add Hisilicon TSV110 to spectre-v2 safe list
  perf/smmuv3: Remove the leftover put_cpu() in error path
2019-12-20 13:36:49 -08:00
Helge Deller 75cf979700 parisc: Fix compiler warnings in debug_core.c
Fix this compiler warning:
kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’:
arch/parisc/include/asm/cmpxchg.h:48:3: warning: value computed is not used [-Wunused-value]
   48 |  ((__typeof__(*(ptr)))__xchg((unsigned long)(x), (ptr), sizeof(*(ptr))))
arch/parisc/include/asm/atomic.h:78:30: note: in expansion of macro ‘xchg’
   78 | #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
      |                              ^~~~
kernel/debug/debug_core.c:596:4: note: in expansion of macro ‘atomic_xchg’
  596 |    atomic_xchg(&kgdb_active, cpu);
      |    ^~~~~~~~~~~

Signed-off-by: Helge Deller <deller@gmx.de>
2019-12-20 21:01:42 +01:00
Helge Deller 36257d5580 parisc: soft_offline_page() now takes the pfn
Switch page deallocation table (pdt) driver to use pfn instead of a page
pointer in soft_offline_page().

Fixes: feec24a613 ("mm, soft-offline: convert parameter to pfn")
Signed-off-by: Helge Deller <deller@gmx.de>
2019-12-20 19:47:00 +01:00
Wei Li aa638cfe3e arm64: cpu_errata: Add Hisilicon TSV110 to spectre-v2 safe list
HiSilicon Taishan v110 CPUs didn't implement CSV2 field of the
ID_AA64PFR0_EL1, but spectre-v2 is mitigated by hardware, so
whitelist the MIDR in the safe list.

Signed-off-by: Wei Li <liwei391@huawei.com>
[hanjun: re-write the commit log]
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-12-20 17:57:22 +00:00
Christoph Hellwig 9209fb5189 riscv: move sifive_l2_cache.c to drivers/soc
The sifive_l2_cache.c is in no way related to RISC-V architecture
memory management.  It is a little stub driver working around the fact
that the EDAC maintainers prefer their drivers to be structured in a
certain way that doesn't fit the SiFive SOCs.

Move the file to drivers/soc and add a Kconfig option for it, as well
as the whole drivers/soc boilerplate for CONFIG_SOC_SIFIVE.

Fixes: a967a289f1 ("RISC-V: sifive_l2_cache: Add L2 cache controller driver for SiFive SoCs")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
[paul.walmsley@sifive.com: keep the MAINTAINERS change specific to the L2$ controller code]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20 03:40:24 -08:00
David Abdurachmanov 01f52e16b8 riscv: define vmemmap before pfn_to_page calls
pfn_to_page & page_to_pfn depend on vmemmap being available before the calls
if kernel is configured with CONFIG_SPARSEMEM_VMEMMAP=y. This was caused
by NOMMU changes which moved vmemmap definition bellow functions definitions
calling pfn_to_page & page_to_pfn.

Noticed while compiled 5.5-rc2 kernel for Fedora/RISCV.

v2:
- Add a comment for vmemmap in source

Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Fixes: 6bd33e1ece ("riscv: add nommu support")
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20 03:32:24 -08:00
Greentime Hu d411cf02ed riscv: fix scratch register clearing in M-mode.
This patch fixes that the sscratch register clearing in M-mode. It cleared
sscratch register in M-mode, but it should clear mscratch register. That will
cause kernel trap if the CPU core doesn't support S-mode when trying to access
sscratch.

Fixes: 9e80635619 ("riscv: clear the instruction cache and all registers when booting")
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20 03:32:24 -08:00
Andreas Schwab 0312a3d4b4 riscv: Fix use of undefined config option CONFIG_CONFIG_MMU
In Kconfig files, config options are written without the CONFIG_ prefix.

Fixes: 6bd33e1ece ("riscv: add nommu support")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20 03:32:24 -08:00
Björn Töpel 34bfc10a6e riscv, perf: Add arch specific perf_arch_bpf_user_pt_regs
RISC-V was missing a proper perf_arch_bpf_user_pt_regs macro for
CONFIG_PERF_EVENT builds.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-10-bjorn.topel@gmail.com
2019-12-19 16:03:31 +01:00
Björn Töpel eb9928bed0 riscv, bpf: Add missing uapi header for BPF_PROG_TYPE_PERF_EVENT programs
Add missing uapi header the BPF_PROG_TYPE_PERF_EVENT programs by
exporting struct user_regs_struct instead of struct pt_regs which is
in-kernel only.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-9-bjorn.topel@gmail.com
2019-12-19 16:03:31 +01:00
Björn Töpel e368b64f8b riscv, bpf: Optimize calls
Instead of using emit_imm() and emit_jalr() which can expand to six
instructions, start using jal or auipc+jalr.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-8-bjorn.topel@gmail.com
2019-12-19 16:03:31 +01:00
Björn Töpel 7f3631e88e riscv, bpf: Provide RISC-V specific JIT image alloc/free
This commit makes sure that the JIT images is kept close to the kernel
text, so BPF calls can use relative calling with auipc/jalr or jal
instead of loading the full 64-bit address and jalr.

The BPF JIT image region is 128 MB before the kernel text.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-7-bjorn.topel@gmail.com
2019-12-19 16:03:31 +01:00
Björn Töpel fe8322b866 riscv, bpf: Optimize BPF tail calls
Remove one addi, and instead use the offset part of jalr.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-6-bjorn.topel@gmail.com
2019-12-19 16:03:31 +01:00
Björn Töpel 33203c02f2 riscv, bpf: Add support for far jumps and exits
This commit add support for far (offset > 21b) jumps and exits.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Luke Nelson <lukenels@cs.washington.edu>
Link: https://lore.kernel.org/bpf/20191216091343.23260-5-bjorn.topel@gmail.com
2019-12-19 16:03:30 +01:00
Björn Töpel 29d92edd9e riscv, bpf: Add support for far branching when emitting tail call
Start use the emit_branch() function in the tail call emitter in order
to support far branching.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-4-bjorn.topel@gmail.com
2019-12-19 16:03:30 +01:00
Björn Töpel 7d1ef13fea riscv, bpf: Add support for far branching
This commit adds branch relaxation to the BPF JIT, and with that
support for far (offset greater than 12b) branching.

The branch relaxation requires more than two passes to converge. For
most programs it is three passes, but for larger programs it can be
more.

Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Luke Nelson <lukenels@cs.washington.edu>
Link: https://lore.kernel.org/bpf/20191216091343.23260-3-bjorn.topel@gmail.com
2019-12-19 16:03:30 +01:00
Björn Töpel f1003b787c riscv, bpf: Fix broken BPF tail calls
The BPF JIT incorrectly clobbered the a0 register, and did not flag
usage of s5 register when BPF stack was being used.

Fixes: 2353ecc6f9 ("bpf, riscv: add BPF JIT for RV64G")
Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191216091343.23260-2-bjorn.topel@gmail.com
2019-12-19 16:03:30 +01:00
Vasily Gorbik b4adfe5591 s390/ftrace: save traced function caller
A typical backtrace acquired from ftraced function currently looks like
the following (e.g. for "path_openat"):

arch_stack_walk+0x15c/0x2d8
stack_trace_save+0x50/0x68
stack_trace_call+0x15a/0x3b8
ftrace_graph_caller+0x0/0x1c
0x3e0007e3c98 <- ftraced function caller (should be do_filp_open+0x7c/0xe8)
do_open_execat+0x70/0x1b8
__do_execve_file.isra.0+0x7d8/0x860
__s390x_sys_execve+0x56/0x68
system_call+0xdc/0x2d8

Note random "0x3e0007e3c98" stack value as ftraced function caller. This
value causes either imprecise unwinder result or unwinding failure.
That "0x3e0007e3c98" comes from r14 of ftraced function stack frame, which
it haven't had a chance to initialize since the very first instruction
calls ftrace code ("ftrace_caller"). (ftraced function might never
save r14 as well). Nevertheless according to s390 ABI any function
is called with stack frame allocated for it and r14 contains return
address. "ftrace_caller" itself is called with "brasl %r0,ftrace_caller".
So, to fix this issue simply always save traced function caller onto
ftraced function stack frame.

Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18 23:29:26 +01:00
Vasily Gorbik eef06cbf67 s390/unwind: stop gracefully at user mode pt_regs in irq stack
Consider reaching user mode pt_regs at the bottom of irq stack graceful
unwinder termination. This is the case when irq/mcck/ext interrupt arrives
while in user mode.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18 23:29:26 +01:00
Christian Borntraeger c23587c92f s390/purgatory: do not build purgatory with kcov, kasan and friends
the purgatory must not rely on functions from the "old" kernel,
so we must disable kasan and friends. We also need to have a
separate copy of string.c as the default does not build memcmp
with KASAN.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18 23:29:26 +01:00
Hans de Goede cd92ac2530 s390/purgatory: Make sure we fail the build if purgatory has missing symbols
Since we link purgatory with -r aka we enable "incremental linking"
no checks for unresolved symbols are done while linking the purgatory.

This commit adds an extra check for unresolved symbols by calling ld
without -r before running objcopy to generate purgatory.ro.

This will help us catch missing symbols in the purgatory sooner.

Note this commit also removes --no-undefined from LDFLAGS_purgatory
as that has no effect.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/lkml/20191212205304.191610-1-hdegoede@redhat.com
Tested-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18 23:29:26 +01:00
Sven Schnelle 6feeee8efc s390/ftrace: fix endless recursion in function_graph tracer
The following sequence triggers a kernel stack overflow on s390x:

mount -t tracefs tracefs /sys/kernel/tracing
cd /sys/kernel/tracing
echo function_graph > current_tracer
[crash]

This is because preempt_count_{add,sub} are in the list of traced
functions, which can be demonstrated by:

echo preempt_count_add >set_ftrace_filter
echo function_graph > current_tracer
[crash]

The stack overflow happens because get_tod_clock_monotonic() gets called
by ftrace but itself calls preempt_{disable,enable}(), which leads to a
endless recursion. Fix this by using preempt_{disable,enable}_notrace().

Fixes: 011620688a ("s390/time: ensure get_clock_monotonic() returns monotonic values")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18 23:29:26 +01:00
Paolo Bonzini f5d5f5fae4 KVM/arm fixes for .5.5, take #1
- Fix uninitialised sysreg accessor
 - Fix handling of demand-paged device mappings
 - Stop spamming the console on IMPDEF sysregs
 - Relax mappings of writable memslots
 - Assorted cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl3ydlsPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDRLIQAIrwXMd5LnXS62AFmDtze1Q4uFsRf1RsUJBY
 P5ZReCm3VZWt3SncHI7Tuy2VpzvRq1foTOBDe5tTqu2hkbXC0EIvQL/tcbdDTteG
 YInVVs6YiGIQ6GMGIgJTWOVmTHGTft5sEKdYeV2S30N+4drfSUnbUrOqRLs87VJ5
 WvFbl97Jg3CVLEdZ2ZdmxpufqH24WxEm2aGjncBOhKjxj1xokxKUSmxoJ0Zj95o5
 4kkjKkyNG1Gnt07qL2jUV52g7hnSo9NhgNu3XaU6vEv3Xy4YPrbBZwL3gM3f2lKk
 F4AFUDKqaskjWWlNscEYZYhUtd7Sx2s4tqFVQXUbPkBrUorhRHid4r1BiwGYZNb0
 pKB40kVXz8q5ZJV4qKhSQ2qwnyoGBw9LKkiqYqT4Wk0JLqizWjQyfX3bO1XHB2o/
 jz1VojegSrmWg+798osXCyT6Tjvuwyp8cvnY1tM5t2rO0ppTH6EtwAqhpBARou1V
 VOpqIVHI9yx4zok+TIQAZtRSI7eFisJIjSuFrqQkJUKdp7G+W1PM51LTcroZNNyD
 LEUnEPZPbONsH0shl8qV8rt7U2sN4eHSlsDhdaHe0mW2gllFIUbaSljH+IWBnmWV
 phOKFDZnIvE6hySJzCLiCuvRH9OGdEDSFe96cWNjHTa5EEYS0v9BARUO9zpoNcq+
 1w4YLExU
 =jJwo
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for .5.5, take #1

- Fix uninitialised sysreg accessor
- Fix handling of demand-paged device mappings
- Stop spamming the console on IMPDEF sysregs
- Relax mappings of writable memslots
- Assorted cleanups
2019-12-18 17:47:38 +01:00
Jim Mattson 8715f05269 kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
The host reports support for the synthetic feature X86_FEATURE_SSBD
when any of the three following hardware features are set:
  CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
  CPUID.80000008H:EBX.AMD_SSBD[bit 24]
  CPUID.80000008H:EBX.VIRT_SSBD[bit 25]

Either of the first two hardware features implies the existence of the
IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does
not. Therefore, CPUID.80000008H:EBX.AMD_SSBD[bit 24] should only be
set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or
CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.

Fixes: 4c6903a0f9 ("KVM: x86: fix reporting of AMD speculation bug CPUID leaf")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-18 17:46:35 +01:00
Jim Mattson 396d2e878f kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
The host reports support for the synthetic feature X86_FEATURE_SSBD
when any of the three following hardware features are set:
  CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
  CPUID.80000008H:EBX.AMD_SSBD[bit 24]
  CPUID.80000008H:EBX.VIRT_SSBD[bit 25]

Either of the first two hardware features implies the existence of the
IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does
not. Therefore, CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] should only be
set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or
CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.

Fixes: 0c54914d0c ("KVM: x86: use Intel speculation bugs and features as derived in generic x86 code")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-18 17:46:23 +01:00
Paul Mackerras d89c69f42b KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor
Commit 22945688ac ("KVM: PPC: Book3S HV: Support reset of secure
guest") added a call to uv_svm_terminate, which is an ultravisor
call, without any check that the guest is a secure guest or even that
the system has an ultravisor.  On a system without an ultravisor,
the ultracall will degenerate to a hypercall, but since we are not
in KVM guest context, the hypercall will get treated as a system
call, which could have random effects depending on what happens to
be in r0, and could also corrupt the current task's kernel stack.
Hence this adds a test for the guest being a secure guest before
doing uv_svm_terminate().

Fixes: 22945688ac ("KVM: PPC: Book3S HV: Support reset of secure guest")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-12-18 15:46:34 +11:00
Linus Torvalds 9065e06360 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "Fix kexec booting with certain EFI memory map layouts"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Update e820 with reserved EFI boot services data to fix kexec breakage
2019-12-17 11:17:03 -08:00
Linus Torvalds 2abf193275 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
 "Add HPET quirks for the Intel 'Coffee Lake H' and 'Ice Lake' platforms"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/intel: Disable HPET on Intel Ice Lake platforms
  x86/intel: Disable HPET on Intel Coffee Lake H platforms
2019-12-17 11:11:08 -08:00
Alexander Shishkin 92ca7da4bd perf/x86/intel: Fix PT PMI handling
Commit:

  ccbebba4c6 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")

skips the PT/LBR exclusivity check on CPUs where PT and LBRs coexist, but
also inadvertently skips the active_events bump for PT in that case, which
is a bug. If there aren't any hardware events at the same time as PT, the
PMI handler will ignore PT PMIs, as active_events reads zero in that case,
resulting in the "Uhhuh" spurious NMI warning and PT data loss.

Fix this by always increasing active_events for PT events.

Fixes: ccbebba4c6 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
Reported-by: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lkml.kernel.org/r/20191210105101.77210-1-alexander.shishkin@linux.intel.com
2019-12-17 13:32:46 +01:00
Alexander Shishkin ff61541cc6 perf/x86/intel/bts: Fix the use of page_private()
Commit

  8062382c8d ("perf/x86/intel/bts: Add BTS PMU driver")

brought in a warning with the BTS buffer initialization
that is easily tripped with (assuming KPTI is disabled):

instantly throwing:

> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 326 at arch/x86/events/intel/bts.c:86 bts_buffer_setup_aux+0x117/0x3d0
> Modules linked in:
> CPU: 2 PID: 326 Comm: perf Not tainted 5.4.0-rc8-00291-gceb9e77324fa #904
> RIP: 0010:bts_buffer_setup_aux+0x117/0x3d0
> Call Trace:
>  rb_alloc_aux+0x339/0x550
>  perf_mmap+0x607/0xc70
>  mmap_region+0x76b/0xbd0
...

It appears to assume (for lost raisins) that PagePrivate() is set,
while later it actually tests for PagePrivate() before using
page_private().

Make it consistent and always check PagePrivate() before using
page_private().

Fixes: 8062382c8d ("perf/x86/intel/bts: Add BTS PMU driver")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lkml.kernel.org/r/20191205142853.28894-2-alexander.shishkin@linux.intel.com
2019-12-17 13:32:46 +01:00
Peter Zijlstra 1e69a0efc0 perf/x86: Fix potential out-of-bounds access
UBSAN reported out-of-bound accesses for x86_pmu.event_map(), it's
arguments should be < x86_pmu.max_events. Make sure all users observe
this constraint.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Meelis Roos <mroos@linux.ee>
2019-12-17 13:32:46 +01:00
Jan H. Schönherr a3a57ddad0 x86/mce: Fix possibly incorrect severity calculation on AMD
The function mce_severity_amd_smca() requires m->bank to be initialized
for correct operation. Fix the one case, where mce_severity() is called
without doing so.

Fixes: 6bda529ec4 ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26faa ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
2019-12-17 09:39:53 +01:00