Commit graph

4834 commits

Author SHA1 Message Date
Linus Torvalds 5bb053bef8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Support offloading wireless authentication to userspace via
    NL80211_CMD_EXTERNAL_AUTH, from Srinivas Dasari.

 2) A lot of work on network namespace setup/teardown from Kirill Tkhai.
    Setup and cleanup of namespaces now all run asynchronously and thus
    performance is significantly increased.

 3) Add rx/tx timestamping support to mv88e6xxx driver, from Brandon
    Streiff.

 4) Support zerocopy on RDS sockets, from Sowmini Varadhan.

 5) Use denser instruction encoding in x86 eBPF JIT, from Daniel
    Borkmann.

 6) Support hw offload of vlan filtering in mvpp2 dreiver, from Maxime
    Chevallier.

 7) Support grafting of child qdiscs in mlxsw driver, from Nogah
    Frankel.

 8) Add packet forwarding tests to selftests, from Ido Schimmel.

 9) Deal with sub-optimal GSO packets better in BBR congestion control,
    from Eric Dumazet.

10) Support 5-tuple hashing in ipv6 multipath routing, from David Ahern.

11) Add path MTU tests to selftests, from Stefano Brivio.

12) Various bits of IPSEC offloading support for mlx5, from Aviad
    Yehezkel, Yossi Kuperman, and Saeed Mahameed.

13) Support RSS spreading on ntuple filters in SFC driver, from Edward
    Cree.

14) Lots of sockmap work from John Fastabend. Applications can use eBPF
    to filter sendmsg and sendpage operations.

15) In-kernel receive TLS support, from Dave Watson.

16) Add XDP support to ixgbevf, this is significant because it should
    allow optimized XDP usage in various cloud environments. From Tony
    Nguyen.

17) Add new Intel E800 series "ice" ethernet driver, from Anirudh
    Venkataramanan et al.

18) IP fragmentation match offload support in nfp driver, from Pieter
    Jansen van Vuuren.

19) Support XDP redirect in i40e driver, from Björn Töpel.

20) Add BPF_RAW_TRACEPOINT program type for accessing the arguments of
    tracepoints in their raw form, from Alexei Starovoitov.

21) Lots of striding RQ improvements to mlx5 driver with many
    performance improvements, from Tariq Toukan.

22) Use rhashtable for inet frag reassembly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1678 commits)
  net: mvneta: improve suspend/resume
  net: mvneta: split rxq/txq init and txq deinit into SW and HW parts
  ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh
  net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
  net: bgmac: Correctly annotate register space
  route: check sysctl_fib_multipath_use_neigh earlier than hash
  fix typo in command value in drivers/net/phy/mdio-bitbang.
  sky2: Increase D3 delay to sky2 stops working after suspend
  net/mlx5e: Set EQE based as default TX interrupt moderation mode
  ibmvnic: Disable irqs before exiting reset from closed state
  net: sched: do not emit messages while holding spinlock
  vlan: also check phy_driver ts_info for vlan's real device
  Bluetooth: Mark expected switch fall-throughs
  Bluetooth: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for BTUSB_QCA_ROME
  Bluetooth: btrsi: remove unused including <linux/version.h>
  Bluetooth: hci_bcm: Remove DMI quirk for the MINIX Z83-4
  sh_eth: kill useless check in __sh_eth_get_regs()
  sh_eth: add sh_eth_cpu_data::no_xdfar flag
  ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()
  ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()
  ...
2018-04-03 14:04:18 -07:00
Linus Torvalds f5a8eb632b arch: remove obsolete architecture ports
This removes the entire architecture code for blackfin, cris, frv, m32r,
 metag, mn10300, score, and tile, including the associated device drivers.
 
 I have been working with the (former) maintainers for each one to ensure
 that my interpretation was right and the code is definitely unused in
 mainline kernels. Many had fond memories of working on the respective
 ports to start with and getting them included in upstream, but also saw
 no point in keeping the port alive without any users.
 
 In the end, it seems that while the eight architectures are extremely
 different, they all suffered the same fate: There was one company
 in charge of an SoC line, a CPU microarchitecture and a software
 ecosystem, which was more costly than licensing newer off-the-shelf
 CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems
 that all the SoC product lines are still around, but have not used the
 custom CPU architectures for several years at this point. In contrast,
 CPU instruction sets that remain popular and have actively maintained
 kernel ports tend to all be used across multiple licensees.
 
 The removal came out of a discussion that is now documented at
 https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
 marking any ports as deprecated but remove them all at once after I made
 sure that they are all unused. Some architectures (notably tile, mn10300,
 and blackfin) are still being shipped in products with old kernels,
 but those products will never be updated to newer kernel releases.
 
 After this series, we still have a few architectures without mainline
 gcc support:
 
 - unicore32 and hexagon both have very outdated gcc releases, but the
   maintainers promised to work on providing something newer. At least
   in case of hexagon, this will only be llvm, not gcc.
 
 - openrisc, risc-v and nds32 are still in the process of finishing their
   support or getting it added to mainline gcc in the first place.
   They all have patched gcc-7.3 ports that work to some degree, but
   complete upstream support won't happen before gcc-8.1. Csky posted
   their first kernel patch set last week, their situation will be similar.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJawdL2AAoJEGCrR//JCVInuH0P/RJAZh1nTD+TR34ZhJq2TBoo
 PgygwDU7Z2+tQVU+EZ453Gywz9/NMRFk1RWAZqrLix4ZtyIMvC6A1qfT2yH1Y7Fb
 Qh6tccQeLe4ezq5u4S/46R/fQXu3Txr92yVwzJJUuPyU0arF9rv5MmI8e6p7L1en
 yb74kSEaCe+/eMlsEj1Cc1dgthDNXGKIURHkRsILoweysCpesjiTg4qDcL+yTibV
 FP2wjVbniKESMKS6qL71tiT5sexvLsLwMNcGiHPj94qCIQuI7DLhLdBVsL5Su6gI
 sbtgv0dsq4auRYAbQdMaH1hFvu6WptsuttIbOMnz2Yegi2z28H8uVXkbk2WVLbqG
 ZESUwutGh8MzOL2RJ4jyyQq5sfo++CRGlfKjr6ImZRv03dv0pe/W85062cK5cKNs
 cgDDJjGRorOXW7dyU6jG2gRqODOQBObIv3w5efdq5OgzOWlbI4EC+Y5u1Z0JF/76
 pSwtGXA6YhwC+9LLAlnVTHG+yOwuLmAICgoKcTbzTVDKA2YQZG/cYuQfI5S1wD8e
 X6urPx3Md2GCwLXQ9mzKBzKZUpu/Tuhx0NvwF4qVxy6x1PELjn68zuP7abDHr46r
 57/09ooVN+iXXnEGMtQVS/OPvYHSa2NgTSZz6Y86lCRbZmUOOlK31RDNlMvYNA+s
 3iIVHovno/JuJnTOE8LY
 =fQ8z
 -----END PGP SIGNATURE-----

Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pul removal of obsolete architecture ports from Arnd Bergmann:
 "This removes the entire architecture code for blackfin, cris, frv,
  m32r, metag, mn10300, score, and tile, including the associated device
  drivers.

  I have been working with the (former) maintainers for each one to
  ensure that my interpretation was right and the code is definitely
  unused in mainline kernels. Many had fond memories of working on the
  respective ports to start with and getting them included in upstream,
  but also saw no point in keeping the port alive without any users.

  In the end, it seems that while the eight architectures are extremely
  different, they all suffered the same fate: There was one company in
  charge of an SoC line, a CPU microarchitecture and a software
  ecosystem, which was more costly than licensing newer off-the-shelf
  CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
  seems that all the SoC product lines are still around, but have not
  used the custom CPU architectures for several years at this point. In
  contrast, CPU instruction sets that remain popular and have actively
  maintained kernel ports tend to all be used across multiple licensees.

  [ See the new nds32 port merged in the previous commit for the next
    generation of "one company in charge of an SoC line, a CPU
    microarchitecture and a software ecosystem"   - Linus ]

  The removal came out of a discussion that is now documented at
  https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
  marking any ports as deprecated but remove them all at once after I
  made sure that they are all unused. Some architectures (notably tile,
  mn10300, and blackfin) are still being shipped in products with old
  kernels, but those products will never be updated to newer kernel
  releases.

  After this series, we still have a few architectures without mainline
  gcc support:

   - unicore32 and hexagon both have very outdated gcc releases, but the
     maintainers promised to work on providing something newer. At least
     in case of hexagon, this will only be llvm, not gcc.

   - openrisc, risc-v and nds32 are still in the process of finishing
     their support or getting it added to mainline gcc in the first
     place. They all have patched gcc-7.3 ports that work to some
     degree, but complete upstream support won't happen before gcc-8.1.
     Csky posted their first kernel patch set last week, their situation
     will be similar

  [ Palmer Dabbelt points out that RISC-V support is in mainline gcc
    since gcc-7, although gcc-7.3.0 is the recommended minimum  - Linus ]"

This really says it all:

 2498 files changed, 95 insertions(+), 467668 deletions(-)

* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
  MAINTAINERS: UNICORE32: Change email account
  staging: iio: remove iio-trig-bfin-timer driver
  tty: hvc: remove tile driver
  tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
  serial: remove tile uart driver
  serial: remove m32r_sio driver
  serial: remove blackfin drivers
  serial: remove cris/etrax uart drivers
  usb: Remove Blackfin references in USB support
  usb: isp1362: remove blackfin arch glue
  usb: musb: remove blackfin port
  usb: host: remove tilegx platform glue
  pwm: remove pwm-bfin driver
  i2c: remove bfin-twi driver
  spi: remove blackfin related host drivers
  watchdog: remove bfin_wdt driver
  can: remove bfin_can driver
  mmc: remove bfin_sdh driver
  input: misc: remove blackfin rotary driver
  input: keyboard: remove bf54x driver
  ...
2018-04-02 20:20:12 -07:00
Linus Torvalds 2fcd2b306a Merge branch 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 dma mapping updates from Ingo Molnar:
 "This tree, by Christoph Hellwig, switches over the x86 architecture to
  the generic dma-direct and swiotlb code, and also unifies more of the
  dma-direct code between architectures. The now unused x86-only
  primitives are removed"

* 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs
  swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS
  dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()
  dma/direct: Handle force decryption for DMA coherent buffers in common code
  dma/direct: Handle the memory encryption bit in common code
  dma/swiotlb: Remove swiotlb_set_mem_attributes()
  set_memory.h: Provide set_memory_{en,de}crypted() stubs
  x86/dma: Remove dma_alloc_coherent_gfp_flags()
  iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent()
  iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}()
  x86/dma/amd_gart: Use dma_direct_{alloc,free}()
  x86/dma/amd_gart: Look at dev->coherent_dma_mask instead of GFP_DMA
  x86/dma: Use generic swiotlb_ops
  x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y)
  x86/dma: Remove dma_alloc_coherent_mask()
2018-04-02 17:18:45 -07:00
Linus Torvalds 701f3b3149 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in the locking subsystem in this cycle were:

   - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem,
     which is an an array of tools in tools/memory-model/ that formally
     describe the Linux memory coherency model (a.k.a.
     Documentation/memory-barriers.txt), and also produce 'litmus tests'
     in form of kernel code which can be directly executed and tested.

     Here's a high level background article about an earlier version of
     this work on LWN.net:

        https://lwn.net/Articles/718628/

     The design principles:

      "There is reason to believe that Documentation/memory-barriers.txt
       could use some help, and a major purpose of this patch is to
       provide that help in the form of a design-time tool that can
       produce all valid executions of a small fragment of concurrent
       Linux-kernel code, which is called a "litmus test". This tool's
       functionality is roughly similar to a full state-space search.
       Please note that this is a design-time tool, not useful for
       regression testing. However, we hope that the underlying
       Linux-kernel memory model will be incorporated into other tools
       capable of analyzing large bodies of code for regression-testing
       purposes."

     [...]

      "A second tool is klitmus7, which converts litmus tests to
       loadable kernel modules for direct testing. As with herd7, the
       klitmus7 code is freely available from

         http://diy.inria.fr/sources/index.html

       (and via "git" at https://github.com/herd/herdtools7)"

     [...]

     Credits go to:

      "This patch was the result of a most excellent collaboration
       founded by Jade Alglave and also including Alan Stern, Andrea
       Parri, and Luc Maranget."

     ... and to the gents listed in the MAINTAINERS entry:

        LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
        M:      Alan Stern <stern@rowland.harvard.edu>
        M:      Andrea Parri <parri.andrea@gmail.com>
        M:      Will Deacon <will.deacon@arm.com>
        M:      Peter Zijlstra <peterz@infradead.org>
        M:      Boqun Feng <boqun.feng@gmail.com>
        M:      Nicholas Piggin <npiggin@gmail.com>
        M:      David Howells <dhowells@redhat.com>
        M:      Jade Alglave <j.alglave@ucl.ac.uk>
        M:      Luc Maranget <luc.maranget@inria.fr>
        M:      "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

     The LKMM project already found several bugs in Linux locking
     primitives and improved the understanding and the documentation of
     the Linux memory model all around.

   - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov)

   - Add RWSEM API debugging and reorganize the lock debugging Kconfig
     (Waiman Long)

   - ... misc cleanups and other smaller changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  locking/Kconfig: Restructure the lock debugging menu
  locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable
  locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches
  lockdep: Make the lock debug output more useful
  locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
  locking/atomic, asm-generic, x86: Add comments for atomic instrumentation
  locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations
  locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h
  locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h
  locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants
  tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference()
  tools/memory-model: Add documentation of new litmus test
  tools/memory-model: Remove mention of docker/gentoo image
  locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more
  locking/lockdep: Show unadorned pointers
  mutex: Drop linkage.h from mutex.h
  tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference
  tools/memory-model: Convert underscores to hyphens
  tools/memory-model: Add a S lock-based external-view litmus test
  tools/memory-model: Add required herd7 version to README file
  ...
2018-04-02 10:27:16 -07:00
Linus Torvalds 61d1757f56 Merge branch 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects updates from Ingo Molnar:
 "Misc improvements:

   - add better instrumentation/debugging

   - optimize the freeing logic improve performance"

* 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Avoid another unused variable warning
  debugobjects: Fix debug_objects_freed accounting
  debugobjects: Use global free list in __debug_check_no_obj_freed()
  debugobjects: Use global free list in free_object()
  debugobjects: Add global free list and the counter
  debugobjects: Export max loops counter
2018-04-02 09:14:41 -07:00
David S. Miller d4069fe6fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-03-31

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add raw BPF tracepoint API in order to have a BPF program type that
   can access kernel internal arguments of the tracepoints in their
   raw form similar to kprobes based BPF programs. This infrastructure
   also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which
   returns an anon-inode backed fd for the tracepoint object that allows
   for automatic detach of the BPF program resp. unregistering of the
   tracepoint probe on fd release, from Alexei.

2) Add new BPF cgroup hooks at bind() and connect() entry in order to
   allow BPF programs to reject, inspect or modify user space passed
   struct sockaddr, and as well a hook at post bind time once the port
   has been allocated. They are used in FB's container management engine
   for implementing policy, replacing fragile LD_PRELOAD wrapper
   intercepting bind() and connect() calls that only works in limited
   scenarios like glibc based apps but not for other runtimes in
   containerized applications, from Andrey.

3) BPF_F_INGRESS flag support has been added to sockmap programs for
   their redirect helper call bringing it in line with cls_bpf based
   programs. Support is added for both variants of sockmap programs,
   meaning for tx ULP hooks as well as recv skb hooks, from John.

4) Various improvements on BPF side for the nfp driver, besides others
   this work adds BPF map update and delete helper call support from
   the datapath, JITing of 32 and 64 bit XADD instructions as well as
   offload support of bpf_get_prandom_u32() call. Initial implementation
   of nfp packet cache has been tackled that optimizes memory access
   (see merge commit for further details), from Jakub and Jiong.

5) Removal of struct bpf_verifier_env argument from the print_bpf_insn()
   API has been done in order to prepare to use print_bpf_insn() soon
   out of perf tool directly. This makes the print_bpf_insn() API more
   generic and pushes the env into private data. bpftool is adjusted
   as well with the print_bpf_insn() argument removal, from Jiri.

6) Couple of cleanups and prep work for the upcoming BTF (BPF Type
   Format). The latter will reuse the current BPF verifier log as
   well, thus bpf_verifier_log() is further generalized, from Martin.

7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read
   and write support has been added in similar fashion to existing
   IPv6 IPV6_TCLASS socket option we already have, from Nikita.

8) Fixes in recent sockmap scatterlist API usage, which did not use
   sg_init_table() for initialization thus triggering a BUG_ON() in
   scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and
   uses a small helper sg_init_marker() to properly handle the affected
   cases, from Prashant.

9) Let the BPF core follow IDR code convention and therefore use the
   idr_preload() and idr_preload_end() helpers, which would also help
   idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory
   pressure, from Shaohua.

10) Last but not least, a spelling fix in an error message for the
    BPF cookie UID helper under BPF sample code, from Colin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:33:04 -04:00
Eric Dumazet ae6da1f503 rhashtable: add schedule points
Rehashing and destroying large hash table takes a lot of time,
and happens in process context. It is safe to add cond_resched()
in rhashtable_rehash_table() and rhashtable_free_and_destroy()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:25:39 -04:00
Waiman Long 19193bcad8 locking/Kconfig: Restructure the lock debugging menu
Two config options in the lock debugging menu that are probably the most
frequently used, as far as I am concerned, is the PROVE_LOCKING and
LOCK_STAT. From a UI perspective, they should be front and center. So
these two options are now moved to the top of the lock debugging menu.

The DEBUG_WW_MUTEX_SLOWPATH option is also added to the PROVE_LOCKING
umbrella.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-4-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31 07:30:51 +02:00
Waiman Long f07cbebb6d locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable
There are a couples of lock debugging Kconfig options that depends on
the following support options:

 - TRACE_IRQFLAGS_SUPPORT
 - STACKTRACE_SUPPORT
 - LOCKDEP_SUPPORT

That makes those lock debugging options harder to read and understand.
So a new LOCK_DEBUGGING_SUPPORT option is added that is equivalent to
the above three options together. That makes the Kconfig.debug file
more readable.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-3-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31 07:30:50 +02:00
Waiman Long 5149cbac42 locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches
For a rwsem, locking can either be exclusive or shared. The corresponding
exclusive or shared unlock must be used. Otherwise, the protected data
structures may get corrupted or the lock may be in an inconsistent state.

In order to detect such anomaly, a new configuration option DEBUG_RWSEMS
is added which can be enabled to look for such mismatches and print
warnings that that happens.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31 07:30:50 +02:00
Prashant Bhole f385178679 lib/scatterlist: add sg_init_marker() helper
sg_init_marker initializes sg_magic in the sg table and calls
sg_mark_end() on the last entry of the table. This can be useful to
avoid memset in sg_init_table() when scatterlist is already zeroed out

For example: when scatterlist is embedded inside other struct and that
container struct is zeroed out

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-30 22:50:15 +02:00
Dan Carpenter 99fe29d3a2 test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()
The skb_segment() function returns error pointers on error.  It never
returns NULL.

Fixes: 76db8087c4 ("net: bpf: add a test for skb_segment in test_bpf module")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29 14:33:29 -04:00
Christoph Hellwig e89f5b3701 dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs
Revert the clearing of __GFP_ZERO in dma_alloc_attrs and move it to
dma_direct_alloc for now.  While most common architectures always zero dma
cohereny allocations (and x86 did so since day one) this is not documented
and at least arc and s390 do not zero without the explicit __GFP_ZERO
argument.

Fixes: 57bf5a8963 ("dma-mapping: clear harmful GFP_* flags in common code")
Reported-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com>
Cc: iommu@lists.linux-foundation.org
Link: https://lkml.kernel.org/r/20180328133535.17302-2-hch@lst.de
2018-03-28 17:34:23 +02:00
Greg Kroah-Hartman a0306db6e5 Merge 4.16-rc7 into staging-next
We want the IIO and staging driver fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28 13:33:37 +02:00
Greg Kroah-Hartman b24d0d5b12 Merge 4.16-rc7 into char-misc-next
We want the hyperv fix in here for merging and testing.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28 12:27:35 +02:00
Kirill Tkhai 2f635ceeb2 net: Drop pernet_operations::async
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 13:18:09 -04:00
Martin Kelly 75a24b822d kfifo: fix inaccurate comment
The comment in __kfifo_alloc says we round down, but we actually round
up, so correct it.

Signed-off-by: Martin Kelly <mkelly@xevo.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-03-27 11:15:42 +02:00
Arnd Bergmann 889ce12b16 raid: remove tile specific raid6 implementation
The Tile architecture is getting removed, so we no longer need this either.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26 15:56:28 +02:00
Arnd Bergmann a687a53370 treewide: simplify Kconfig dependencies for removed archs
A lot of Kconfig symbols have architecture specific dependencies.
In those cases that depend on architectures we have already removed,
they can be omitted.

Acked-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-26 15:55:57 +02:00
Nikolay Borisov df91f56adc libcrc32c: Add crc32c_impl function
This function returns a string with the currently in-use implementation
of the crc32c algorithm, i.e crc32c-generic (for unoptimised, generic
implementation) or crc32c-intel for the sse optimised version. This
will be used by btrfs.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
[ use crypto_shash_driver_name as suggested by Herbert ]
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-26 15:09:38 +02:00
Joe Perches 447a5647c9 treewide: Align function definition open/close braces
Some functions definitions have either the initial open brace and/or
the closing brace outside of column 1.

Move those braces to column 1.

This allows various function analyzers like gnu complexity to work
properly for these modified functions.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-03-26 11:13:09 +02:00
Yonghong Song 76db8087c4 net: bpf: add a test for skb_segment in test_bpf module
Without the previous commit,
"modprobe test_bpf" will have the following errors:
...
[   98.149165] ------------[ cut here ]------------
[   98.159362] kernel BUG at net/core/skbuff.c:3667!
[   98.169756] invalid opcode: 0000 [#1] SMP PTI
[   98.179370] Modules linked in:
[   98.179371]  test_bpf(+)
...
which triggers the bug the previous commit intends to fix.

The skbs are constructed to mimic what mlx5 may generate.
The packet size/header may not mimic real cases in production. But
the processing flow is similar.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-25 16:46:05 -04:00
Masahiro Yamada dc35da16a2 lib: zstd: clean up Makefile for simpler composite object handling
Now, Kbuild nicely handles composite objects to avoid multiple
definition.

Makefiles can simply add the same objects multiple times across
composite objects.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:27 +09:00
Nicholas Piggin f49821ee32 kbuild: rename built-in.o to built-in.a
Incremental linking is gone, so rename built-in.o to built-in.a, which
is the usual extension for archive files.

This patch does two things, first is a simple search/replace:

git grep -l 'built-in\.o' | xargs sed -i 's/built-in\.o/built-in\.a/g'

The second is to invert nesting of nested text manipulations to avoid
filtering built-in.a out from libs-y2:

-libs-y2 := $(filter-out %.a, $(patsubst %/, %/built-in.a, $(libs-y)))
+libs-y2 := $(patsubst %/, %/built-in.a, $(filter-out %.a, $(libs-y)))

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26 02:01:19 +09:00
Ingo Molnar ea2301b622 Merge branch 'linus' into x86/dma, to resolve a conflict with upstream
Conflicts:
	arch/x86/mm/init_64.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-24 09:25:26 +01:00
Christoph Hellwig 0803e6051c swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS
Otherwise this causes unused symbol warnings for configs that build
swiotlb.c only for use by xen-swiotlb.c and that don't otherwise select
CONFIG_DMA_DIRECT_OPS, which is possible on arm.

Fixes: 16e73adbca ("dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: iommu@lists.linux-foundation.org
Cc: konrad.wilk@oracle.com
Link: https://lkml.kernel.org/r/20180323174930.17767-1-hch@lst.de
2018-03-23 20:15:38 +01:00
David S. Miller 03fe2debbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Fun set of conflict resolutions here...

For the mac80211 stuff, these were fortunately just parallel
adds.  Trivially resolved.

In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
function phy_disable_interrupts() earlier in the file, whilst in
'net-next' the phy_error() call from this function was removed.

In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
'rt_table_id' member of rtable collided with a bug fix in 'net' that
added a new struct member "rt_mtu_locked" which needs to be copied
over here.

The mlxsw driver conflict consisted of net-next separating
the span code and definitions into separate files, whilst
a 'net' bug fix made some changes to that moved code.

The mlx5 infiniband conflict resolution was quite non-trivial,
the RDMA tree's merge commit was used as a guide here, and
here are their notes:

====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch.  This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
            drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f95
            (IB/mlx5: Fix cleanup order on unload) added to for-rc and
            commit b5ca15ad7e (IB/mlx5: Add proper representors support)
            add as part of the devel cycle both needed to modify the
            init/de-init functions used by mlx5.  To support the new
            representors, the new functions added by the cleanup patch
            needed to be made non-static, and the init/de-init list
            added by the representors patch needed to be modified to
            match the init/de-init list changes made by the cleanup
            patch.
    Updates:
            drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
            prototypes added by representors patch to reflect new function
            names as changed by cleanup patch
            drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
            stage list to match new order from cleanup patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23 11:31:58 -04:00
Linus Torvalds f36b7534b8 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, thp: do not cause memcg oom for thp
  mm/vmscan: wake up flushers for legacy cgroups too
  Revert "mm: page_alloc: skip over regions of invalid pfns where possible"
  mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink()
  mm/thp: do not wait for lock_page() in deferred_split_scan()
  mm/khugepaged.c: convert VM_BUG_ON() to collapse fail
  x86/mm: implement free pmd/pte page interfaces
  mm/vmalloc: add interfaces to free unmapped page table
  h8300: remove extraneous __BIG_ENDIAN definition
  hugetlbfs: check for pgoff value overflow
  lockdep: fix fs_reclaim warning
  MAINTAINERS: update Mark Fasheh's e-mail
  mm/mempolicy.c: avoid use uninitialized preferred_node
2018-03-22 18:48:43 -07:00
Toshi Kani b6bdb7517c mm/vmalloc: add interfaces to free unmapped page table
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
create pud/pmd mappings.  A kernel panic was observed on arm64 systems
with Cortex-A75 in the following steps as described by Hanjun Guo.

 1. ioremap a 4K size, valid page table will build,
 2. iounmap it, pte0 will set to 0;
 3. ioremap the same address with 2M size, pgd/pmd is unchanged,
    then set the a new value for pmd;
 4. pte0 is leaked;
 5. CPU may meet exception because the old pmd is still in TLB,
    which will lead to kernel panic.

This panic is not reproducible on x86.  INVLPG, called from iounmap,
purges all levels of entries associated with purged address on x86.  x86
still has memory leak.

The patch changes the ioremap path to free unmapped page table(s) since
doing so in the unmap path has the following issues:

 - The iounmap() path is shared with vunmap(). Since vmap() only
   supports pte mappings, making vunmap() to free a pte page is an
   overhead for regular vmap users as they do not need a pte page freed
   up.

 - Checking if all entries in a pte page are cleared in the unmap path
   is racy, and serializing this check is expensive.

 - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
   Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
   purge.

Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
clear a given pud/pmd entry and free up a page for the lower level
entries.

This patch implements their stub functions on x86 and arm64, which work
as workaround.

[akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
Fixes: e61ce6ade4 ("mm: change ioremap to set up huge I/O mappings")
Reported-by: Lei Li <lious.lilei@hisilicon.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Wang Xuefeng <wxf.wang@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22 17:07:01 -07:00
Linus Torvalds c4f4d2f917 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Always validate XFRM esn replay attribute, from Florian Westphal.

 2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long.

 3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul
    Triebitz.

 4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel
    Axtens.

 5) Fix some interrupt handling issues in e1000e driver, from Benjamin
    Poitier.

 6) Use strlcpy() in several ethtool get_strings methods, from Florian
    Fainelli.

 7) Fix rhlist dup insertion, from Paul Blakey.

 8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev.

 9) Fix driver unload crash when link is up in smsc911x, from Jeremy
    Linton.

10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric
    Dumazet.

11) Need to purge the write queue when TCP connections are aborted,
    otherwise userspace using MSG_ZEROCOPY can't close the fd. From
    Soheil Hassas Yeganeh.

12) Fix double free in error path of team driver, from Arkadi
    Sharshevsky.

13) Filter fixes for hv_netvsc driver, from Stephen Hemminger.

14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo
    Bianconi.

15) Properly filter out unsupported feature flags in macvlan driver,
    from Shannon Nelson.

16) Don't request loading the diag module for a protocol if the protocol
    itself is not even registered. From Xin Long.

17) If datagram connect fails in ipv6, make sure the socket state is
    consistent afterwards. From Paolo Abeni.

18) Use after free in qed driver, from Dan Carpenter.

19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the
    entry. From Sabrina Dubroca.

20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins.

21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita.

22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel.

23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti.

24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli.

25) Memory leak in gemini driver, from Igor Pylypiv.

26) IDR leaks in error paths of tcf_*_init() functions, from Davide
    Caratti.

27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun.

28) Missing dev_put() in error path of macsec_newlink(), from Dan
    Carpenter.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits)
  macsec: missing dev_put() on error in macsec_newlink()
  net: dsa: Fix functional dsa-loop dependency on FIXED_PHY
  hv_netvsc: common detach logic
  hv_netvsc: change GPAD teardown order on older versions
  hv_netvsc: use RCU to fix concurrent rx and queue changes
  hv_netvsc: disable NAPI before channel close
  net/ipv6: Handle onlink flag with multipath routes
  ppp: avoid loop in xmit recursion detection code
  ipv6: sr: fix NULL pointer dereference when setting encap source address
  ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state
  net: aquantia: driver version bump
  net: aquantia: Implement pci shutdown callback
  net: aquantia: Allow live mac address changes
  net: aquantia: Add tx clean budget and valid budget handling logic
  net: aquantia: Change inefficient wait loop on fw data reads
  net: aquantia: Fix a regression with reset on old firmware
  net: aquantia: Fix hardware reset when SPI may rarely hangup
  s390/qeth: on channel error, reject further cmd requests
  s390/qeth: lock read device while queueing next buffer
  s390/qeth: when thread completes, wake up all waiters
  ...
2018-03-22 14:10:29 -07:00
Christian Brauner 692ec06d7c netns: send uevent messages
This patch adds a receive method to NETLINK_KOBJECT_UEVENT netlink sockets
to allow sending uevent messages into the network namespace the socket
belongs to.

Currently non-initial network namespaces are already isolated and don't
receive uevents. There are a number of cases where it is beneficial for a
sufficiently privileged userspace process to send a uevent into a network
namespace.

One such use case would be debugging and fuzzing of a piece of software
which listens and reacts to uevents. By running a copy of that software
inside a network namespace, specific uevents could then be presented to it.
More concretely, this would allow for easy testing of udevd/ueventd.

This will also allow some piece of software to run components inside a
separate network namespace and then effectively filter what that software
can receive. Some examples of software that do directly listen to uevents
and that we have in the past attempted to run inside a network namespace
are rbd (CEPH client) or the X server.

Implementation:
The implementation has been kept as simple as possible from the kernel's
perspective. Specifically, a simple input method uevent_net_rcv() is added
to NETLINK_KOBJECT_UEVENT sockets which completely reuses existing
af_netlink infrastructure and does neither add an additional netlink family
nor requires any user-visible changes.

For example, by using netlink_rcv_skb() we can make use of existing netlink
infrastructure to report back informative error messages to userspace.

Furthermore, this implementation does not introduce any overhead for
existing uevent generating codepaths. The struct netns got a new uevent
socket member that records the uevent socket associated with that network
namespace including its position in the uevent socket list. Since we record
the uevent socket for each network namespace in struct net we don't have to
walk the whole uevent socket list. Instead we can directly retrieve the
relevant uevent socket and send the message. At exit time we can now also
trivially remove the uevent socket from the uevent socket list. This keeps
the codepath very performant without introducing needless overhead and even
makes older codepaths faster.

Uevent sequence numbers are kept global. When a uevent message is sent to
another network namespace the implementation will simply increment the
global uevent sequence number and append it to the received uevent. This
has the advantage that the kernel will never need to parse the received
uevent message to replace any existing uevent sequence numbers. Instead it
is up to the userspace process to remove any existing uevent sequence
numbers in case the uevent message to be sent contains any.

Security:
In order for a caller to send uevent messages to a target network namespace
the caller must have CAP_SYS_ADMIN in the owning user namespace of the
target network namespace. Additionally, any received uevent message is
verified to not exceed size UEVENT_BUFFER_SIZE. This includes the space
needed to append the uevent sequence number.

Testing:
This patch has been tested and verified to work with the following udev
implementations:
1. CentOS 6 with udevd version 147
2. Debian Sid with systemd-udevd version 237
3. Android 7.1.1 with ueventd

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:16:43 -04:00
Christian Brauner 94e5e3087a net: add uevent socket member
This commit adds struct uevent_sock to struct net. Since struct uevent_sock
records the position of the uevent socket in the uevent socket list we can
trivially remove it from the uevent socket list during cleanup. This speeds
up the old removal codepath.
Note, list_del() will hit __list_del_entry_valid() in its call chain which
will validate that the element is a member of the list. If it isn't it will
take care that the list is not modified.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:16:42 -04:00
Zhichang Yuan 031e360186 lib: Add generic PIO mapping method
41f8bba7f5 ("of/pci: Add pci_register_io_range() and
pci_pio_to_address()") added support for PCI I/O space mapped into CPU
physical memory space.  With that support, the I/O ranges configured for
PCI/PCIe hosts on some architectures can be mapped to logical PIO and
converted easily between CPU address and the corresponding logical PIO.
Based on this, PCI I/O port space can be accessed via in/out accessors that
use memory read/write.

But on some platforms, there are bus hosts that access I/O port space with
host-local I/O port addresses rather than memory addresses.

Add a more generic I/O mapping method to support those devices.  With this
patch, both the CPU addresses and the host-local port can be mapped into
the logical PIO space with different logical/fake PIOs.  After this, all
the I/O accesses to either PCI MMIO devices or host-local I/O peripherals
can be unified into the existing I/O accessors defined in asm-generic/io.h
and be redirected to the right device-specific hooks based on the input
logical PIO.

Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Zhichang Yuan <yuanzhichang@hisilicon.com>
Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
[bhelgaas: remove -EFAULT return from logic_pio_register_range() per
https://lkml.kernel.org/r/20180403143909.GA21171@ulmo, fix NULL pointer
checking per https://lkml.kernel.org/r/20180403211505.GA29612@embeddedor.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2018-03-21 17:18:34 -05:00
Thadeu Lima de Souza Cascardo 52fda36d63 test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other arches
Function bpf_fill_maxinsns11 is designed to not be able to be JITed on
x86_64. So, it fails when CONFIG_BPF_JIT_ALWAYS_ON=y, and
commit 09584b4067 ("bpf: fix selftests/bpf test_kmod.sh failure when
CONFIG_BPF_JIT_ALWAYS_ON=y") makes sure that failure is detected on that
case.

However, it does not fail on other architectures, which have a different
JIT compiler design. So, test_bpf has started to fail to load on those.

After this fix, test_bpf loads fine on both x86_64 and ppc64el.

Fixes: 09584b4067 ("bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-20 23:04:30 +01:00
Christoph Hellwig 16e73adbca dma/swiotlb: Remove swiotlb_{alloc,free}_coherent()
Unused now that everyone uses swiotlb_{alloc,free}().

Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20180319103826.12853-15-hch@lst.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20 10:01:59 +01:00
Christoph Hellwig c10f07aa27 dma/direct: Handle force decryption for DMA coherent buffers in common code
With that in place the generic DMA-direct routines can be used to
allocate non-encrypted bounce buffers, and the x86 SEV case can use
the generic swiotlb ops including nice features such as using CMA
allocations.

Note that I'm not too happy about using sev_active() in DMA-direct, but
I couldn't come up with a good enough name for a wrapper to make it
worth adding.

Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20180319103826.12853-14-hch@lst.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20 10:01:59 +01:00
Christoph Hellwig b6e05477c1 dma/direct: Handle the memory encryption bit in common code
Give the basic phys_to_dma() and dma_to_phys() helpers a __-prefix and add
the memory encryption mask to the non-prefixed versions.  Use the
__-prefixed versions directly instead of clearing the mask again in
various places.

Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20180319103826.12853-13-hch@lst.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20 10:01:59 +01:00
Christoph Hellwig e7de6c7cc2 dma/swiotlb: Remove swiotlb_set_mem_attributes()
Now that set_memory_decrypted() is always available we can just call it
directly.

Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20180319103826.12853-12-hch@lst.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20 10:01:58 +01:00
Matt Brown aa9532d489 lib/raid6: Build proper raid6test files on powerpc
Previously the raid6 test Makefile did not build the POWER specific files
(altivec and vpermxor).
This patch fixes the bug, so that all appropriate files for powerpc are built.

This patch also fixes the missing and mismatched ifdef statements to allow the
altivec.uc file to be built correctly.

Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-20 16:47:52 +11:00
Matt Brown 751ba79cc5 lib/raid6/altivec: Add vpermxor implementation for raid6 Q syndrome
This patch uses the vpermxor instruction to optimise the raid6 Q
syndrome. This instruction was made available with POWER8, ISA version
2.07. It allows for both vperm and vxor instructions to be done in a
single instruction. This has been tested for correctness on a ppc64le
vm with a basic RAID6 setup containing 5 drives.

The performance benchmarks are from the raid6test in the
/lib/raid6/test directory. These results are from an IBM Firestone
machine with ppc64le architecture. The benchmark results show a 35%
speed increase over the best existing algorithm for powerpc (altivec).
The raid6test has also been run on a big-endian ppc64 vm to ensure it
also works for big-endian architectures.

Performance benchmarks:
  raid6: altivecx4 gen() 18773 MB/s
  raid6: altivecx8 gen() 19438 MB/s

  raid6: vpermxor4 gen() 25112 MB/s
  raid6: vpermxor8 gen() 26279 MB/s

Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
[mpe: Add VPERMXOR macro so we can build with old binutils]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-20 16:47:25 +11:00
Linus Torvalds 0d707a2f24 Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fixes from Tejun Heo:
 "Late percpu pull request for v4.16-rc6.

   - percpu allocator pool replenishing no longer triggers OOM or
     warning messages.

     Also, the alloc interface now understands __GFP_NORETRY and
     __GFP_NOWARN. This is to allow avoiding OOMs from userland
     triggered actions like bpf map creation.

     Also added cond_resched() in alloc loop.

   - perpcu allocation now can be interrupted by kill sigs to avoid
     deadlocking OOM killer.

   - Added Dennis Zhou as a co-maintainer.

     He has rewritten the area map allocator, understands most of the
     code base and has been responsive for all bug reports"

* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
  mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn()
  percpu: include linux/sched.h for cond_resched()
  percpu: add a schedule point in pcpu_balance_workfn()
  percpu: allow select gfp to be passed to underlying allocators
  percpu: add __GFP_NORETRY semantics to the percpu balancing path
  percpu: match chunk allocator declarations with definitions
  percpu: add Dennis Zhou as a percpu co-maintainer
2018-03-19 14:48:35 -07:00
Tejun Heo b3a5d11199 percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods
percpu_ref internally uses sched-RCU to implement the percpu -> atomic
mode switching and the documentation suggested that this could be
depended upon.  This doesn't seem like a good idea.

* percpu_ref uses sched-RCU which has different grace periods regular
  RCU.  Users may combine percpu_ref with regular RCU usage and
  incorrectly believe that regular RCU grace periods are performed by
  percpu_ref.  This can lead to, for example, use-after-free due to
  premature freeing.

* percpu_ref has a grace period when switching from percpu to atomic
  mode.  It doesn't have one between the last put and release.  This
  distinction is subtle and can lead to surprising bugs.

* percpu_ref allows starting in and switching to atomic mode manually
  for debugging and other purposes.  This means that there may not be
  any grace periods from kill to release.

This patch makes it clear that the grace periods are percpu_ref's
internal implementation detail and can't be depended upon by the
users.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-03-19 10:09:44 -07:00
Greg Kroah-Hartman 73709e1af5 Merge 4.16-rc6 into staging-next
We want the staging fixes in here as well to handle merge/test issues.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19 06:47:01 +01:00
Yisheng Xie 1b723413aa devres: combine function devm_ioremap*
When I tried to use devm_ioremap function and review related
code, I found devm_ioremap_* almost have the similar realize
with each other, which can be combined.

In the former version, I have tried to kill ioremap_cache to
reduce the size of devres, which can not work for ioremap is
not the same as ioremap_nocache in some ARCHs likes ia64.
Therefore, as the suggestion of Christophe, I introduce a help
function __devm_ioremap, let devm_ioremap* inline and call
__devm_ioremap with different devm_ioremap_type.

After apply the patch, the size of devres.o can be reduce from
8216 Bytes to 8052 Bytes in my compile environment.

Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Suggested-by: Christophe LEROY <christophe.leroy@c-s.fr>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reviewed-by: Christophe LEROY <christophe.leroy@c-s.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15 18:08:55 +01:00
Andy Shevchenko 82d1f1178a lib/kobject: Join string literals back
There is no need to split string literals. Moreover, it would be simpler
to grep for an actual code line, when debugging, by using almost any
part of the string literal in question.

While here, replace printk(LEVEL) by pr_lvl() macros.

No functional change intended.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15 14:38:55 +01:00
Dave Young e36df28f53 printk: move dump stack related code to lib/dump_stack.c
dump_stack related stuff should belong to lib/dump_stack.c thus move them
there. Also conditionally compile lib/dump_stack.c since dump_stack code
does not make sense if printk is disabled.

Link: http://lkml.kernel.org/r/20180213072834.GA24784@dhcp-128-65.nay.redhat.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-03-15 13:25:36 +01:00
Joern Engel 8df3aaaf9b btree: avoid variable-length allocations
geo->keylen cannot be larger than 4.  So we might as well make
fixed-size allocations.

Given the one remaining user, geo->keylen cannot even be larger than 1.
Logfs used to have 64bit and 128bit keys, tcm_qla2xxx only has 32bit
keys.  But let's not break the code if we don't have to.

Signed-off-by: Joern Engel <joern@purestorage.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-14 16:55:29 -07:00
Arnd Bergmann 163cf842f5 debugobjects: Avoid another unused variable warning
debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(),
and only read in debug_objects_maxchecked, unfortunately both of these are
optional and depend on different Kconfig symbols.

When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled this
warning is emitted:

  lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable]

Rather than trying to add more complex #ifdef protections, mark the
variable as __maybe_unused so it can be silently dropped when usused.

Fixes: bd9dcd0465 ("debugobjects: Export max loops counter")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de
2018-03-14 20:20:01 +01:00
Luis R. Rodriguez ac68b1b3b9 lib/test_kmod.c: fix limit check on number of test devices created
As reported by Dan the parentheses is in the wrong place, and since
unlikely() call returns either 0 or 1 it's never less than zero.  The
second issue is that signed integer overflows like "INT_MAX + 1" are
undefined behavior.

Since num_test_devs represents the number of devices, we want to stop
prior to hitting the max, and not rely on the wrap arround at all.  So
just cap at num_test_devs + 1, prior to assigning a new device.

Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org
Fixes: d9c6a72d6f ("kmod: add test driver to stress test the module loader")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:02 -08:00
Kees Cook 1b4cfe3c0a lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()
Commit b8347c2196 ("x86/debug: Handle warnings before the notifier
chain, to fix KGDB crash") changed the ordering of fixups, and did not
take into account the case of x86 processing non-WARN() and non-BUG()
exceptions.  This would lead to output of a false BUG line with no other
information.

In the case of a refcount exception, it would be immediately followed by
the refcount WARN(), producing very strange double-"cut here":

  lkdtm: attempting bad refcount_inc() overflow
  ------------[ cut here ]------------
  Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
  ------------[ cut here ]------------
  refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
  WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
  ...

In the prior ordering, exceptions were searched first:

   do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
   ...
                if (fixup_exception(regs, trapnr))
                        return 0;

  -               if (fixup_bug(regs, trapnr))
  -                       return 0;
  -

As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account
needing to search the exception list first, since that had already
happened.

So, instead of searching the exception list twice (once in
is_valid_bugaddr() and then again in fixup_exception()), just add a
simple sanity check to report_bug() that will immediately bail out if a
BUG() (or WARN()) entry is not found.

Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
Fixes: b8347c2196 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
Kees Cook 0862ca422b bug: use %pB in BUG and stack protector failure
The BUG and stack protector reports were still using a raw %p.  This
changes it to %pB for more meaningful output.

Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast
Fixes: ad67b74d24 ("printk: hash addresses printed with %p")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Weinberger <richard.weinberger@gmail.com>,
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09 16:40:01 -08:00
David Howells 739d875dd6 mn10300: Remove the architecture
Remove the MN10300 arch as the hardware is defunct.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Masahiro Yamada <yamada.masahiro@socionext.com>
cc: linux-am33-list@redhat.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-09 23:19:56 +01:00
Arnd Bergmann b67aea2bba Remove metag architecture
These patches remove the metag architecture and tightly dependent
 drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
 based metag toolchain we have been using is hitting compiler bugs, so
 now seems a good time to drop it altogether.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlqdcgQACgkQbAtpk944
 dno/1BAAvaiRcKcNxMrYkxG+Wn4r68odu7+E1dy99AaUnvPFT42R5XLMOv4BCu/Y
 bhMQ14lMJ9ZBKdYg9E97ulTV0YFhCBHuEWDyDnk/G3CVAEvdPuAQ6ktHDZxRQBFK
 JoTUKky53OZbWU9KhLeWpFg4F4E64FBm1kyAkqhs8pPM/LwmrxwIG2sxdTTqkhkc
 b+6ABf2NKtmQwHXWmKWCB8rmXMzulYth2ePC/r9MVj92xGKxADsiFArZk4kmoIUb
 H5eZ8FkemtUEfZp600dsGR/ffaTBwZJ3SULSkAklUnrcvdIRM+Fu8osG8O8yQKTd
 H7xnmtTJ2kCnhhuUMxt6v8WrDbKB8JdFxFOpXW93YKpKAkiGMvoUEZjlwPYIqWxL
 xtnDb9Rv+uZ4RpqZf9AtE4Td8lHTH7OZ78RDs9eMo6n1ZIr5CwcLaM2k5skAeyPr
 yt1lXePhXFqSS+OpOV6hn95ROqlkuZgvPfkcdNpCJPfM4SpfRLlUjIVqiVK0LDRk
 FAkk0VIfzjjNuyV9yr2XXuw90DerhFUgUl6ZYggkgf6umOHhZQdDTFr8gsfvaLm1
 1k1banUEF1tpDcUeShylDvqNmVSZZC6siTQMA7T0zjbjYJD25hJWLpFEcPkx/Anp
 4oGQNNoe4WgJIrJAoTJTiBVwC/xLDeZV6b5t2pOXBlH+v2eKgMg=
 =zDIl
 -----END PGP SIGNATURE-----

Merge tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag into asm-generic

Remove metag architecture

These patches remove the metag architecture and tightly dependent
drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
based metag toolchain we have been using is hitting compiler bugs, so
now seems a good time to drop it altogether.

* tag 'metag_remove_2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  i2c: img-scb: Drop METAG dependency
  media: img-ir: Drop METAG dependency
  watchdog: imgpdc: Drop METAG dependency
  MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE
  tty: Remove metag DA TTY and console driver
  clocksource: Remove metag generic timer driver
  irqchip: Remove metag irqchip drivers
  Drop a bunch of metag references
  docs: Remove remaining references to metag
  docs: Remove metag docs
  metag: Remove arch/metag/

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-07 22:18:39 +01:00
Paul Blakey 499ac3b60f test_rhashtable: add test case for rhltable with duplicate objects
Tries to insert duplicates in the middle of bucket's chain:
bucket 1:  [[val 21 (tid=1)]] -> [[ val 1 (tid=2),  val 1 (tid=0) ]]

Reuses tid to distinguish the elements insertion order.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07 10:44:03 -05:00
Paul Blakey d3dcf8eb61 rhashtable: Fix rhlist duplicates insertion
When inserting duplicate objects (those with the same key),
current rhlist implementation messes up the chain pointers by
updating the bucket pointer instead of prev next pointer to the
newly inserted node. This causes missing elements on removal and
travesal.

Fix that by properly updating pprev pointer to point to
the correct rhash_head next pointer.

Issue: 1241076
Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
Fixes: ca26893f05 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07 10:44:02 -05:00
David S. Miller 0f3e9c97eb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
All of the conflicts were cases of overlapping changes.

In net/core/devlink.c, we have to make care that the
resouce size_params have become a struct member rather
than a pointer to such an object.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-06 01:20:46 -05:00
Linus Torvalds 547046141f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Use an appropriate TSQ pacing shift in mac80211, from Toke
    Høiland-Jørgensen.

 2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk
    in ip6_route_me_harder, from Eric Dumazet.

 3) Fix several shutdown races and similar other problems in l2tp, from
    James Chapman.

 4) Handle missing XDP flush properly in tuntap, for real this time.
    From Jason Wang.

 5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel
    Borkmann.

 6) Fix phy_resume() locking, from Andrew Lunn.

 7) IFLA_MTU values are ignored on newlink for some tunnel types, fix
    from Xin Long.

 8) Revert F-RTO middle box workarounds, they only handle one dimension
    of the problem. From Yuchung Cheng.

 9) Fix socket refcounting in RDS, from Ka-Cheong Poon.

10) Don't allow ppp unit registration to an unregistered channel, from
    Guillaume Nault.

11) Various hv_netvsc fixes from Stephen Hemminger.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits)
  hv_netvsc: propagate rx filters to VF
  hv_netvsc: filter multicast/broadcast
  hv_netvsc: defer queue selection to VF
  hv_netvsc: use napi_schedule_irqoff
  hv_netvsc: fix race in napi poll when rescheduling
  hv_netvsc: cancel subchannel setup before halting device
  hv_netvsc: fix error unwind handling if vmbus_open fails
  hv_netvsc: only wake transmit queue if link is up
  hv_netvsc: avoid retry on send during shutdown
  virtio-net: re enable XDP_REDIRECT for mergeable buffer
  ppp: prevent unregistered channels from connecting to PPP units
  tc-testing: skbmod: fix match value of ethertype
  mlxsw: spectrum_switchdev: Check success of FDB add operation
  net: make skb_gso_*_seglen functions private
  net: xfrm: use skb_gso_validate_network_len() to check gso sizes
  net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
  net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len
  rds: Incorrect reference counting in TCP socket creation
  net: ethtool: don't ignore return from driver get_fecparam method
  vrf: check forwarding on the original netdevice when generating ICMP dest unreachable
  ...
2018-03-05 11:29:24 -08:00
David S. Miller a5f7b0eeb2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-02-28

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Add schedule points and reduce the number of loop iterations
   the test_bpf kernel module is performing in order to not hog
   the CPU for too long, from Eric.

2) Fix an out of bounds access in tail calls in the ppc64 BPF
   JIT compiler, from Daniel.

3) Fix a crash on arm64 on unaligned BPF xadd operations that
   could be triggered via interpreter and JIT, from Daniel.

Please not that once you merge net into net-next at some point, there
is a minor merge conflict in test_verifier.c since test cases had
been added at the end in both trees. Resolution is trivial: keep all
the test cases from both trees.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 21:42:07 -05:00
Omar Sandoval 4ace53f1ed sbitmap: use test_and_set_bit_lock()/clear_bit_unlock()
sbitmap_queue_get()/sbitmap_queue_clear() are used for
allocating/freeing a resource, so they should provide acquire/release
barrier semantics, respectively. sbitmap_get() currently contains a full
barrier, which is unnecessary, so use test_and_set_bit_lock() instead of
test_and_set_bit() (these are equivalent on x86_64). sbitmap_clear_bit()
does not imply any barriers, which is incorrect, as accesses of the
resource (e.g., request) could potentially get reordered to after the
clear_bit(). Introduce sbitmap_clear_bit_unlock() and use it for
sbitmap_queue_clear() (this only adds a compiler barrier on x86_64). The
other existing user of sbitmap_clear_bit() (the blk-mq software queue
pending map) is serialized through a spinlock and does not need this.

Reported-by: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-28 12:23:35 -07:00
Linus Torvalds b1aad6824a A single fix for a memory leak regression in the dma-debug code.
-----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlqW7wELHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYO8ew/+LaPsN4xIJAMTvga2x1ZXozUDz+TjkmvOtjjj98j5
 EhbENUKuGa2l2vDv8+iT1c76AxQeXzcpirtNmfMTSQP7BuyGNAlIuTNeDP2B/wSR
 u5AM0pO2tiycLTkpn1PIRZODOeE/+TgZwGwtDnjA2IKhlwlqIH4K02nBi89kVgfo
 s2dIqE+w/XZuAU5RgIhp9mL1FbaggTxPiNAvGlIP7ZiiQl2Fgw43cMUSi4SJv36v
 dUIZ629BZTeH2WL7+YHgCLW9+U04nEsM3WyPQktiCUL7ZyE5dOOlG1lZoCmScPPc
 wTtaSO0g8CyuYrl3E1vS6cZfFtAX9OLN5q0z+1r/IEw2Zy+C7ko9/jZ5VhXdsZdA
 GUNsRbuaTDZwi42Fhsti0dYuahV71QHmjhMjn9HjPKqRp1C3X7h9J7wbXXqSXMgN
 vrxYZAzl1DjKSi1lUkMOGJ90AkQvbvuBoeCLJVTSRb4E1VUl/8aAFjnMJwWlpfML
 aaOkxTUpbBu2CTPuCKwjRyEM2Jsx8qbD82s2T9xeTqRnvliU4tRFEeXTeGy3K8X4
 BU9U9CFoPZtkPfmWw1Q8ZEKdew+Ce9rtDYaMlE4k51SbYUNh0dohLwVS9iBYd5Ti
 7FjAr+D+6QHkba6yGUHdFfc6r3/Dfce9ZLnV5SxW+WIK2FbRWfGG/DKSrr7NbPQI
 vX8=
 =PWy1
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:
 "A single fix for a memory leak regression in the dma-debug code"

* tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: fix memory leak in debug_dma_alloc_coherent
2018-02-28 11:13:08 -08:00
Eric Dumazet 9960d7669e test_bpf: reduce MAX_TESTRUNS
For tests that are using the maximal number of BPF instruction, each
run takes 20 usec. Looping 10,000 times on them totals 200 ms, which
is bad when the loop is not preemptible.

test_bpf: #264 BPF_MAXINSNS: Call heavy transformations jited:1 19248
18548 PASS
test_bpf: #269 BPF_MAXINSNS: ld_abs+get_processor_id jited:1 20896 PASS

Lets divide by ten the number of iterations, so that max latency is
20ms. We could use need_resched() to break the loop earlier if we
believe 20 ms is too much.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-28 17:49:18 +01:00
Eric Dumazet d40bc96257 test_bpf: add a schedule point
test_bpf() is taking 1.6 seconds nowadays, it is time
to add a schedule point in it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-26 21:06:56 +01:00
Matthew Wilcox 4b0ad07653 idr: Fix handling of IDs above INT_MAX
Khalid reported that the kernel selftests are currently failing:

selftests: test_bpf.sh
========================================
test_bpf: [FAIL]
not ok 1..8 selftests:  test_bpf.sh [FAIL]

He bisected it to 6ce711f275 ("idr: Make
1-based IDRs more efficient").

The root cause is doing a signed comparison in idr_alloc_u32() instead
of an unsigned comparison.  I went looking for any similar problems and
found a couple (which would each result in the failure to warn in two
situations that aren't supposed to happen).

I knocked up a few test-cases to prove that I was right and added them
to the test-suite.

Reported-by: Khalid Aziz <khalid.aziz@oracle.com>
Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-26 14:39:30 -05:00
Greg Kroah-Hartman 36e9f7203e Merge 4.16-rc3 into staging-next
We want the IIO/Staging fixes in here, and to resolve a merge problem
with the move of the fsl-mc code.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-26 15:32:00 +01:00
David S. Miller f74290fdb3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-24 00:04:20 -05:00
Linus Torvalds 13f514bef1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk fixlet from Petr Mladek:
 "People expect to see the real pointer value for %px.

  Let's substitute '(null)' only for the other %p? format modifiers that
  need to deference the pointer"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  vsprintf: avoid misleading "(null)" for %px
2018-02-23 14:57:20 -08:00
James Hogan 5f171577b4
Drop a bunch of metag references
Now that arch/metag/ has been removed, drop a bunch of metag references
in various codes across the whole tree:
 - VM_GROWSUP and __VM_ARCH_SPECIFIC_1.
 - MT_METAG_* ELF note types.
 - METAG Kconfig dependencies (FRAME_POINTER) and ranges
   (MAX_STACK_SIZE_MB).
 - metag cases in tools (checkstack.pl, recordmcount.c, perf).

Signed-off-by: James Hogan <jhogan@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-mm@kvack.org
Cc: linux-metag@vger.kernel.org
2018-02-23 14:29:59 +00:00
Miles Chen af1da68684 dma-debug: fix memory leak in debug_dma_alloc_coherent
Marty reported a memory leakage introduced by commit 3aaabbf1c3
("lib/dma-debug.c: fix incorrect pfn calculation"). Fix it
by checking the virtual address before allocating the entry.

This patch also use virt_addr_valid() instead of virt_to_page()
to check if a virtual address is linear.

Fixes: 3aaabbf1 ("lib/dma-debug.c: fix incorrect pfn calculation")
Reported-by: Marty Faltesek <mfaltesek@google.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-22 15:02:33 -08:00
Arnd Bergmann 04148187aa debugobjects: Fix debug_objects_freed accounting
The removal of the batched object freeing has caused the debug_objects_freed
to become read-only, and the reading is inside an ifdef, so gcc warns that it
is completely unused without CONFIG_DEBUG_FS:

lib/debugobjects.c:71:14: error: 'debug_objects_freed' defined but not used [-Werror=unused-variable]

Assuming we are still interested in this number, this adds back code to
keep track of the freed objects.

Fixes: 636e1970fd ("debugobjects: Use global free list in free_object()")
Suggested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20180222155335.1647466-1-arnd@arndb.de
2018-02-22 22:00:24 +01:00
Anders Roxell 908009e832 lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
Commit d3deafaa8b ("lib/: make RUNTIME_TESTS a menuconfig to ease
disabling it all") causes a regression when using runtime tests due to
it defaults RUNTIME_TESTING_MENU to not set.

Link: http://lkml.kernel.org/r/20180214133015.10090-1-anders.roxell@linaro.org
Fixes: d3deafaa8b ("lib/: make RUNTIME_TESTS a menuconfig to easedisabling it all")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Cc: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21 15:35:43 -08:00
Rasmus Villemoes b1a8a7a700 ida: do zeroing in ida_pre_get()
As far as I can tell, the only place the per-cpu ida_bitmap is populated
is in ida_pre_get.  The pre-allocated element is stolen in two places in
ida_get_new_above, in both cases immediately followed by a memset(0).

Since ida_get_new_above is called with locks held, do the zeroing in
ida_pre_get, or rather let kmalloc() do it.  Also, apparently gcc
generates ~44 bytes of code to do a memset(, 0, 128):

  $ scripts/bloat-o-meter vmlinux.{0,1}
  add/remove: 0/0 grow/shrink: 2/1 up/down: 5/-88 (-83)
  Function                                     old     new   delta
  ida_pre_get                                  115     119      +4
  vermagic                                      27      28      +1
  ida_get_new_above                            715     627     -88

Link: http://lkml.kernel.org/r/20180108225634.15340-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21 15:35:43 -08:00
Greg Kroah-Hartman 59142f808a First round of new devices, features and cleanups for IIO in the 4.17 cycle.
Outside of IIO
 * Strongly typed 64bit int_sqrt function needed by the mlx90632
 
 New device support
 * adc081s
   - New driver supporting adc081s, adc101s and adc121s TI ADCs.
 * ad5272
   - New driver supproting the ad5272 and ad5274 ADI digital potentiometers
     with DT bindings.
 * axp20x_adc
   - support the AXP813 ADC - includes rework patches to prepare for this.
 * mlx90632
   - New driver with dt bindings for this IR temperature sensor.
 
 Features
 * axp20x_adc
   - Add DT bindings and probing.
 * dht11
   - The sensor has a wider range than advertised in the datasheet - support it.
 * st_lsm6dsx
   - Add hardware timestamp su9pport.
 
 Cleanups
 * ABI docs
   - Update email contact for Matt Ranostay
 * SPDX changes
   - Matt Ranostay has moved his drivers over to SPDX.  Currently we are making
     this an author choice in IIO.
 * ad7192
   - Disable burnout current on misconfiguration.  No actually effect as
     they simply won't work otherwise.
 * ad7476
   - Drop a license definition that was replicating information in SPDX tag.
 * ade7758
   - Expand buf_lock to cover both buffer and state protection allowing
     unintented uses of mlock in the core to be removed.
 * ade7759
   - Align parameters to opening parenthesis.
 * at91_adc
   - Depend on sysfs instead of selecting it - for try wide consistency.
 * ccs811
   - trivial naming type for a define.
 * ep93xx
   - Drop a redundant return as a result checking platform_get_resource.
 * hts221
   - Regmap conversion which simplifies the driver somewhat.
   - Clean up some restricted endian cast warnings.
   - Drop a trailing whitespace from a comment
   - Drop an unnecessary get_unaligned by changing to the right 16bit data type.
 * ms5611
   - Fix coding style in the probe function (whitespace)
 * st_accel
   - Use strlcpy instead of strncpy to avoid potentially truncating a string.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAlqL3RIRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0Foiq7w//eKBOgTy0osSkLjSXACsJ0oizUoEGDV3t
 NxQ3D3el+1ATUd2tJL33gDvCzYw5DPcNcy3Hh/tl5svnwdV0WvBwIDIgdPfPBJQu
 LpBjT6bI2MwYdi1WLHZJBeoIXxg0PMF4GNZ2DIk0GL0WJg87n6SkZCmslW8Rv8BD
 NE8m9Qv72vhhZ9QjcBPoQYzRAY16moaX+9a36TBgtgGs7c5vHYNGNxoMn6sZ3sQk
 VbUtqWv9By3gp0mUg7T2Qb+62d04MbDk3Sa/K2B0COBOCSP7kWjg9h7KP902R6rU
 WNsTK+rVRHxV5Jm6PBPcit4HeSK3/9308EKkv2axQrPqCLM50GoxoqAhNq7UpAEu
 YXLi4f0te7sEQdt2QFDCdDMk1HjI3eCA7a8enBpex3gt7bdGTMQH9qDKdsWE+z4G
 Gm9xbZn+z7wNfowf5wAE16Y+yUS4UvVXawWD8FSIpKi3HvvyYS1I0nTnzvTvS/C3
 EhubRf+nguYsFjHB+LpA3II6b/NlrkKGO8PDzCr3kvfPUVErTWPgVzx6+5vPG2Gd
 6qhoFVanV7SuHD0lSq9tKQfsh0DFjAh6DQYdLpoFaxgIJi6cLCGbwOMY49ihYfSO
 mTMz5c52mJACOpq7W5e9xzvLwx6pxiQobeun3fq7kDdq5vUweKSGK4iggEZ+sr2N
 luFxzN7VLNA=
 =CH1K
 -----END PGP SIGNATURE-----

Merge tag 'iio-for-4.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next

Jonathan writes:

First round of new devices, features and cleanups for IIO in the 4.17 cycle.

Outside of IIO
* Strongly typed 64bit int_sqrt function needed by the mlx90632

New device support
* adc081s
  - New driver supporting adc081s, adc101s and adc121s TI ADCs.
* ad5272
  - New driver supproting the ad5272 and ad5274 ADI digital potentiometers
    with DT bindings.
* axp20x_adc
  - support the AXP813 ADC - includes rework patches to prepare for this.
* mlx90632
  - New driver with dt bindings for this IR temperature sensor.

Features
* axp20x_adc
  - Add DT bindings and probing.
* dht11
  - The sensor has a wider range than advertised in the datasheet - support it.
* st_lsm6dsx
  - Add hardware timestamp su9pport.

Cleanups
* ABI docs
  - Update email contact for Matt Ranostay
* SPDX changes
  - Matt Ranostay has moved his drivers over to SPDX.  Currently we are making
    this an author choice in IIO.
* ad7192
  - Disable burnout current on misconfiguration.  No actually effect as
    they simply won't work otherwise.
* ad7476
  - Drop a license definition that was replicating information in SPDX tag.
* ade7758
  - Expand buf_lock to cover both buffer and state protection allowing
    unintented uses of mlock in the core to be removed.
* ade7759
  - Align parameters to opening parenthesis.
* at91_adc
  - Depend on sysfs instead of selecting it - for try wide consistency.
* ccs811
  - trivial naming type for a define.
* ep93xx
  - Drop a redundant return as a result checking platform_get_resource.
* hts221
  - Regmap conversion which simplifies the driver somewhat.
  - Clean up some restricted endian cast warnings.
  - Drop a trailing whitespace from a comment
  - Drop an unnecessary get_unaligned by changing to the right 16bit data type.
* ms5611
  - Fix coding style in the probe function (whitespace)
* st_accel
  - Use strlcpy instead of strncpy to avoid potentially truncating a string.
2018-02-20 10:23:17 +01:00
David S. Miller f5c0c6f429 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-19 18:46:11 -05:00
Kirill Tkhai 15898a011b net: Convert uevent_net_ops
uevent_net_init() and uevent_net_exit() create and
destroy netlink socket, and these actions serialized
in netlink code.

Parallel execution with other pernet_operations
makes the socket disappear earlier from uevent_sock_list
on ->exit. As userspace can't be interested in broadcast
messages of dying net, and, as I see, no one in kernel
listen them, we may safely make uevent_net_ops async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 10:36:07 -05:00
Yang Shi 1ea9b98b00 debugobjects: Use global free list in __debug_check_no_obj_freed()
__debug_check_no_obj_freed() iterates over the to be freed memory region in
chunks and iterates over the corresponding hash bucket list for each
chunk. This can accumulate to hundred thousands of checked objects. In the
worst case this can trigger the soft lockup detector:

NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s!
CPU: 15 PID: 110342 Comm: stress-ng-getde
Call Trace:
  [<ffffffff8141177e>] debug_check_no_obj_freed+0x13e/0x220
  [<ffffffff811f8751>] __free_pages_ok+0x1f1/0x5c0
  [<ffffffff811fa785>] __free_pages+0x25/0x40
  [<ffffffff812638db>] __free_slab+0x19b/0x270
  [<ffffffff812639e9>] discard_slab+0x39/0x50
  [<ffffffff812679f7>] __slab_free+0x207/0x270
  [<ffffffff81269966>] ___cache_free+0xa6/0xb0
  [<ffffffff8126c267>] qlist_free_all+0x47/0x80
  [<ffffffff8126c5a9>] quarantine_reduce+0x159/0x190
  [<ffffffff8126b3bf>] kasan_kmalloc+0xaf/0xc0
  [<ffffffff8126b8a2>] kasan_slab_alloc+0x12/0x20
  [<ffffffff81265e8a>] kmem_cache_alloc+0xfa/0x360
  [<ffffffff812abc8f>] ? getname_flags+0x4f/0x1f0
  [<ffffffff812abc8f>] getname_flags+0x4f/0x1f0
  [<ffffffff812abe42>] getname+0x12/0x20
  [<ffffffff81298da9>] do_sys_open+0xf9/0x210
  [<ffffffff81298ede>] SyS_open+0x1e/0x20
  [<ffffffff817d6e01>] entry_SYSCALL_64_fastpath+0x1f/0xc2

The code path might be called in either atomic or non-atomic context, but
in_atomic() can't tell if the current context is atomic or not on a
PREEMPT=n kernel, so cond_resched() can't be used to prevent the
softlockup.

Utilize the global free list to shorten the loop execution time.

[ tglx: Massaged changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-5-git-send-email-yang.shi@linux.alibaba.com
2018-02-13 10:59:18 +01:00
Yang Shi 636e1970fd debugobjects: Use global free list in free_object()
The newly added global free list allows to avoid lengthy pool_list
iterations in free_obj_work() by putting objects either into the pool list
when the fill level of the pool is below the maximum or by putting them on
the global free list immediately.

As the pool is now guaranteed to never exceed the maximum fill level this
allows to remove the batch removal from pool list in free_obj_work().

Split free_object() into two parts, so the actual queueing function can be
reused without invoking schedule_work() on every invocation.

[ tglx: Remove the batch removal from pool list and massage changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-4-git-send-email-yang.shi@linux.alibaba.com
2018-02-13 10:58:59 +01:00
Yang Shi 36c4ead6f6 debugobjects: Add global free list and the counter
free_object() adds objects to the pool list and schedules work when the
pool list is larger than the pool size.  The worker handles the actual
kfree() of the object by iterating the pool list until the pool size is
below the maximum pool size again.

To iterate the pool list, pool_lock has to be held and the objects which
should be freed() need to be put into temporary storage so pool_lock can be
dropped for the actual kmem_cache_free() invocation. That's a pointless and
expensive exercise if there is a large number of objects to free.

In such a case its better to evaulate the fill level of the pool in
free_objects() and queue the object to free either in the pool list or if
it's full on a separate global free list.

The worker can then do the following simpler operation:

  - Move objects back from the global free list to the pool list if the
    pool list is not longer full.

  - Remove the remaining objects in a single list move operation from the
    global free list and do the kmem_cache_free() operation lockless from
    the temporary list head.

In fill_pool() the global free list is checked as well to avoid real
allocations from the kmem cache.

Add the necessary list head and a counter for the number of objects on the
global free list and export that counter via sysfs:

max_chain     :79
max_loops     :8147
warnings      :0
fixups        :0
pool_free     :1697
pool_min_free :346
pool_used     :15356
pool_max_used :23933
on_free_list  :39
objs_allocated:32617
objs_freed    :16588

Nothing queues objects on the global free list yet. This happens in a
follow up change.

[ tglx: Simplified implementation and massaged changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-3-git-send-email-yang.shi@linux.alibaba.com
2018-02-13 10:58:58 +01:00
Yang Shi bd9dcd0465 debugobjects: Export max loops counter
__debug_check_no_obj_freed() can be an expensive operation depending on the
size of memory freed. It already exports the maximum chain walk length via
debugfs, but this only records the maximum of a single memory chunk.

Though there is no information about the total number of objects inspected
for a __debug_check_no_obj_freed() operation, which might be significantly
larger when a huge memory region is freed.

Aggregate the number of objects inspected for a single invocation of
__debug_check_no_obj_freed() and export it via sysfs.

The resulting output of /sys/kernel/debug/debug_objects/stats looks like:

max_chain     :121
max_checked   :543267
warnings      :0
fixups        :0
pool_free     :1764
pool_min_free :341
pool_used     :86438
pool_max_used :268887
objs_allocated:6068254
objs_freed    :5981076

[ tglx: Renamed the variable to max_checked and adjusted changelog ]

Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/1517872708-24207-2-git-send-email-yang.shi@linux.alibaba.com
2018-02-13 10:58:58 +01:00
Christoph Hellwig 42ed64524d dma-direct: comment the dma_direct_free calling convention
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-12 15:59:07 +00:00
Christoph Hellwig f25e6f6b4e dma-direct: mark as is_phys
Various PCI_DMA_BUS_IS_PHYS implementations rely on this flag to make proper
decisions for block and networking addressability.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-12 15:59:06 +00:00
Linus Torvalds 9a61df9e5f Kbuild updates for v4.16 (2nd)
Makefile changes:
 - enable unused-variable warning that was wrongly disabled for clang
 
 Kconfig changes:
 - warn blank 'help' and fix existing instances
 - fix 'choice' behavior to not write out invisible symbols
 - fix misc weirdness
 
 Coccinell changes:
 - fix false positive of free after managed memory alloc detection
 - improve performance of NULL dereference detection
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJafl86AAoJED2LAQed4NsGrBEP/23mRl+8dSYkTmlczP7stZHL
 tGkMQIOj85usPV4JIxYHgge6XhU0CBFXNGnDnxGeNEtwnBr5QQNxuS2nPh7rHJIN
 zX5rX/vwO9lzn2FFEdKrk8bito1IgnUHUvN/d0ikPWzY7GaMy6WrIWgyThllsLNE
 W7hp3cpUQOhLL9PKXwglZ/oS4iTpEs0DwN93UXU7cp7zyRa0XtFfPf7/IJ2KY+Yl
 a2TEsUuZ/slJoxLhacr6+TBAgqUyewWIs0nAGdjP2EVlSjxZJQYFEQq4KnLUO2gV
 wLHH2snsZSBDfPDp0M9OOb737HE17NRmuLjWxUZZOMFz8tvfUT1454zhVAN2OtSQ
 cP0RqVRrFiS721oxacZpAxKFrd7o4ugUHpftJMPQAq70T9JFFbapfCLvd+OblOb/
 CWmDOOR37tvop5OCuaqaSMq7a+ZQt2cO5fogiEDdnjZkk2AH5GgsAHJIrl7hH4OT
 P9UMcxaWSGbutdVkM4cMUmYMuAJjiFhx1fiD+hevB1KvemXRXrqhCb0wV+GRdcoU
 MXGvOGVw5WyF/vFdjpjkY7KeCgpU3BTWH3pFC2a5vUCDqgD8yndwFghJMDfSjl6d
 46DIqknyveq234GK/Yz5khlbY094yL8JrJU2duva/9fGV86tgOr29xgMK28Lpyh8
 AYRGO9XgmehZrEHcAQ57
 =jy5l
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:
 "Makefile changes:
   - enable unused-variable warning that was wrongly disabled for clang

  Kconfig changes:
   - warn about blank 'help' and fix existing instances
   - fix 'choice' behavior to not write out invisible symbols
   - fix misc weirdness

  Coccinell changes:
   - fix false positive of free after managed memory alloc detection
   - improve performance of NULL dereference detection"

* tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
  kconfig: remove const qualifier from sym_expand_string_value()
  kconfig: add xrealloc() helper
  kconfig: send error messages to stderr
  kconfig: echo stdin to stdout if either is redirected
  kconfig: remove check_stdin()
  kconfig: remove 'config*' pattern from .gitignnore
  kconfig: show '?' prompt even if no help text is available
  kconfig: do not write choice values when their dependency becomes n
  coccinelle: deref_null: avoid useless computation
  coccinelle: devm_free: reduce false positives
  kbuild: clang: disable unused variable warnings only when constant
  kconfig: Warn if help text is blank
  nios2: kconfig: Remove blank help text
  arm: vt8500: kconfig: Remove blank help text
  MIPS: kconfig: Remove blank help text
  MIPS: BCM63XX: kconfig: Remove blank help text
  lib/Kconfig.debug: Remove blank help text
  Staging: rtl8192e: kconfig: Remove blank help text
  Staging: rtl8192u: kconfig: Remove blank help text
  mmc: kconfig: Remove blank help text
  ...
2018-02-09 19:32:41 -08:00
Linus Torvalds c839682c71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Make allocations less aggressive in x_tables, from Minchal Hocko.

 2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.

 3) Fix connection loss problems in rtlwifi, from Larry Finger.

 4) Correct DRAM dump length for some chips in ath10k driver, from Yu
    Wang.

 5) Fix ABORT handling in rxrpc, from David Howells.

 6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.

 7) Some ipv6 onlink handling fixes, from David Ahern.

 8) Netem packet scheduler interval calcualtion fix from Md. Islam.

 9) Don't put crypto buffers on-stack in rxrpc, from David Howells.

10) Fix handling of error non-delivery status in netlink multicast
    delivery over multiple namespaces, from Nicolas Dichtel.

11) Missing xdp flush in tuntap driver, from Jason Wang.

12) Synchonize RDS protocol netns/module teardown with rds object
    management, from Sowini Varadhan.

13) Add nospec annotations to mpls, from Dan Williams.

14) Fix SKB truesize handling in TIPC, from Hoang Le.

15) Interrupt masking fixes in stammc from Niklas Cassel.

16) Don't allow ptr_ring objects to be sized outside of kmalloc's
    limits, from Jason Wang.

17) Don't allow SCTP chunks to be built which will have a length
    exceeding the chunk header's 16-bit length field, from Alexey
    Kodanev.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
  ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
  bpf: fix rlimit in reuseport net selftest
  sctp: verify size of a new chunk in _sctp_make_chunk()
  s390/qeth: fix SETIP command handling
  s390/qeth: fix underestimated count of buffer elements
  ptr_ring: try vmalloc() when kmalloc() fails
  ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
  net: stmmac: remove redundant enable of PMT irq
  net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
  net: stmmac: discard disabled flags in interrupt status register
  ibmvnic: Reset long term map ID counter
  tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
  selftests/bpf: add selftest that use test_libbpf_open
  selftests/bpf: add test program for loading BPF ELF files
  tools/libbpf: improve the pr_debug statements to contain section numbers
  bpf: Sync kernel ABI header with tooling header for bpf_common.h
  net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
  net: thunder: change q_len's type to handle max ring size
  tipc: fix skb truesize/datasize ratio control
  net/sched: cls_u32: fix cls_u32 on filter replace
  ...
2018-02-09 15:34:18 -08:00
David S. Miller 437a4db66d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-02-09

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Two fixes for BPF sockmap in order to break up circular map references
   from programs attached to sockmap, and detaching related sockets in
   case of socket close() event. For the latter we get rid of the
   smap_state_change() and plug into ULP infrastructure, which will later
   also be used for additional features anyway such as TX hooks. For the
   second issue, dependency chain is broken up via map release callback
   to free parse/verdict programs, all from John.

2) Fix a libbpf relocation issue that was found while implementing XDP
   support for Suricata project. Issue was that when clang was invoked
   with default target instead of bpf target, then various other e.g.
   debugging relevant sections are added to the ELF file that contained
   relocation entries pointing to non-BPF related sections which libbpf
   trips over instead of skipping them. Test cases for libbpf are added
   as well, from Jesper.

3) Various misc fixes for bpftool and one for libbpf: a small addition
   to libbpf to make sure it recognizes all standard section prefixes.
   Then, the Makefile in bpftool/Documentation is improved to explicitly
   check for rst2man being installed on the system as we otherwise risk
   installing empty man pages; the man page for bpftool-map is corrected
   and a set of missing bash completions added in order to avoid shipping
   bpftool where the completions are only partially working, from Quentin.

4) Fix applying the relocation to immediate load instructions in the
   nfp JIT which were missing a shift, from Jakub.

5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y
   gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL
   in this case; and explicitly delete the veth devices in the two tests
   test_xdp_{meta,redirect}.sh before dismantling the netnses as when
   selftests are run in batch mode, then workqueue to handle destruction
   might not have finished yet and thus veth creation in next test under
   same dev name would fail, from Yonghong.

6) Fix test_kmod.sh to check the test_bpf.ko module path before performing
   an insmod, and fallback to modprobe. Especially the latter is useful
   when having a device under test that has the modules installed instead,
   from Naresh.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-09 14:05:10 -05:00
Linus Torvalds 9d21874da8 Merge branch 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax
Pull idr updates from Matthew Wilcox:

 - test-suite improvements

 - replace the extended API by improving the normal API

 - performance improvement for IDRs which are 1-based rather than
   0-based

 - add documentation

* 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax:
  idr: Add documentation
  idr: Make 1-based IDRs more efficient
  idr: Warn if old iterators see large IDs
  idr: Rename idr_for_each_entry_ext
  idr: Remove idr_alloc_ext
  cls_u32: Convert to idr_alloc_u32
  cls_u32: Reinstate cyclic allocation
  cls_flower: Convert to idr_alloc_u32
  cls_bpf: Convert to use idr_alloc_u32
  cls_basic: Convert to use idr_alloc_u32
  cls_api: Convert to idr_alloc_u32
  net sched actions: Convert to use idr_alloc_u32
  idr: Add idr_alloc_u32 helper
  idr: Delete idr_find_ext function
  idr: Delete idr_replace_ext function
  idr: Delete idr_remove_ext function
  IDR test suite: Check handling negative end correctly
  idr test suite: Fix ida_test_random()
  radix tree test suite: Remove ARRAY_SIZE
2018-02-08 14:39:29 -08:00
Adam Borowski 3a129cc215 vsprintf: avoid misleading "(null)" for %px
Like %pK already does, print "00000000" instead.

This confused people -- the convention is that "(null)" means you tried to
dereference a null pointer as opposed to printing the address.

Link: http://lkml.kernel.org/r/20180204174521.21383-1-kilobyte@angband.pl
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Roberts, William C" <william.c.roberts@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-02-08 14:21:41 +01:00
Linus Torvalds a2e5790d84 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - kasan updates

 - procfs

 - lib/bitmap updates

 - other lib/ updates

 - checkpatch tweaks

 - rapidio

 - ubsan

 - pipe fixes and cleanups

 - lots of other misc bits

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  Documentation/sysctl/user.txt: fix typo
  MAINTAINERS: update ARM/QUALCOMM SUPPORT patterns
  MAINTAINERS: update various PALM patterns
  MAINTAINERS: update "ARM/OXNAS platform support" patterns
  MAINTAINERS: update Cortina/Gemini patterns
  MAINTAINERS: remove ARM/CLKDEV SUPPORT file pattern
  MAINTAINERS: remove ANDROID ION pattern
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors
  mm: docs: fix parameter names mismatch
  mm: docs: fixup punctuation
  pipe: read buffer limits atomically
  pipe: simplify round_pipe_size()
  pipe: reject F_SETPIPE_SZ with size over UINT_MAX
  pipe: fix off-by-one error when checking buffer limits
  pipe: actually allow root to exceed the pipe buffer limits
  pipe, sysctl: remove pipe_proc_fn()
  pipe, sysctl: drop 'min' parameter from pipe-max-size converter
  kasan: rework Kconfig settings
  crash_dump: is_kdump_kernel can be boolean
  kernel/mutex: mutex_is_locked can be boolean
  ...
2018-02-06 22:15:42 -08:00
Arnd Bergmann e7c52b84fb kasan: rework Kconfig settings
We get a lot of very large stack frames using gcc-7.0.1 with the default
-fsanitize-address-use-after-scope --param asan-stack=1 options, which can
easily cause an overflow of the kernel stack, e.g.

  drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
  drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
  lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
  drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
  drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
  fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes

To reduce this risk, -fsanitize-address-use-after-scope is now split out
into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack
frames that are smaller than 2 kilobytes most of the time on x86_64.  An
earlier version of this patch also prevented combining KASAN_EXTRA with
KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.

All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y
and CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can
bring back that default now.  KASAN_EXTRA=y still causes lots of
warnings but now defaults to !COMPILE_TEST to disable it in
allmodconfig, and it remains disabled in all other defconfigs since it
is a new option.  I arbitrarily raise the warning limit for KASAN_EXTRA
to 3072 to reduce the noise, but an allmodconfig kernel still has around
50 warnings on gcc-7.

I experimented a bit more with smaller stack frames and have another
follow-up series that reduces the warning limit for 64-bit architectures
to 1280 bytes (without CONFIG_KASAN).

With earlier versions of this patch series, I also had patches to address
the warnings we get with KASAN and/or KASAN_EXTRA, using a
"noinline_if_stackbloat" annotation.

That annotation now got replaced with a gcc-8 bugfix (see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
older compilers, which means that KASAN_EXTRA is now just as bad as
before and will lead to an instant stack overflow in a few extreme
cases.

This reverts parts of commit 3f181b4d86 ("lib/Kconfig.debug: disable
-Wframe-larger-than warnings with KASAN=y").  Two patches in linux-next
should be merged first to avoid introducing warnings in an allmodconfig
build:
  3cd890dbe2 ("media: dvb-frontends: fix i2c access helpers for KASAN")
  16c3ada89c ("media: r820t: fix r820t_write_reg for KASAN")

Do we really need to backport this?

I think we do: without this patch, enabling KASAN will lead to
unavoidable kernel stack overflow in certain device drivers when built
with gcc-7 or higher on linux-4.10+ or any version that contains a
backport of commit c5caf21ab0.  Most people are probably still on
older compilers, but it will get worse over time as they upgrade their
distros.

The warnings we get on kernels older than this should all be for code
that uses dangerously large stack frames, though most of them do not
cause an actual stack overflow by themselves.The asan-stack option was
added in linux-4.0, and commit 3f181b4d86 ("lib/Kconfig.debug:
disable -Wframe-larger-than warnings with KASAN=y") effectively turned
off the warning for allmodconfig kernels, so I would like to see this
fix backported to any kernels later than 4.0.

I have done dozens of fixes for individual functions with stack frames
larger than 2048 bytes with asan-stack, and I plan to make sure that
all those fixes make it into the stable kernels as well (most are
already there).

Part of the complication here is that asan-stack (from 4.0) was
originally assumed to always require much larger stacks, but that
turned out to be a combination of multiple gcc bugs that we have now
worked around and fixed, but sanitize-address-use-after-scope (from
v4.10) has a much higher inherent stack usage and also suffers from at
least three other problems that we have analyzed but not yet fixed
upstream, each of them makes the stack usage more severe than it should
be.

Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Andrey Ryabinin bac7a1fff7 lib/ubsan: remove returns-nonnull-attribute checks
Similarly to type mismatch checks, new GCC 8.x and Clang also changed for
ABI for returns_nonnull checks.  While we can update our code to conform
the new ABI it's more reasonable to just remove it.  Because it's just
dead code, we don't have any single user of returns_nonnull attribute in
the whole kernel.

And AFAIU the advantage that this attribute could bring would be mitigated
by -fno-delete-null-pointer-checks cflag that we use to build the kernel.
So it's unlikely we will have a lot of returns_nonnull attribute in
future.

So let's just remove the code, it has no use.

[aryabinin@virtuozzo.com: fix warning]
  Link: http://lkml.kernel.org/r/20180122165711.11510-1-aryabinin@virtuozzo.com
Link: http://lkml.kernel.org/r/20180119152853.16806-2-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Sodagudi Prasad <psodagud@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:46 -08:00
Andrey Ryabinin 42440c1f99 lib/ubsan: add type mismatch handler for new GCC/Clang
UBSAN=y fails to build with new GCC/clang:

    arch/x86/kernel/head64.o: In function `sanitize_boot_params':
    arch/x86/include/asm/bootparam_utils.h:37: undefined reference to `__ubsan_handle_type_mismatch_v1'

because Clang and GCC 8 slightly changed ABI for 'type mismatch' errors.
Compiler now uses new __ubsan_handle_type_mismatch_v1() function with
slightly modified 'struct type_mismatch_data'.

Let's add new 'struct type_mismatch_data_common' which is independent from
compiler's layout of 'struct type_mismatch_data'.  And make
__ubsan_handle_type_mismatch[_v1]() functions transform compiler-dependent
type mismatch data to our internal representation.  This way, we can
support both old and new compilers with minimal amount of change.

Link: http://lkml.kernel.org/r/20180119152853.16806-1-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Sodagudi Prasad <psodagud@codeaurora.org>
Cc: <stable@vger.kernel.org>	[4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:46 -08:00
Andrew Morton b8fe1120b4 lib/ubsan.c: s/missaligned/misaligned/
A vist from the spelling fairy.

Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:46 -08:00
Pravin Shedge 92fc7cb8ae lib/test_sort.c: add module unload support
test_sort.c performs array-based and linked list sort test.  Code allows
to compile either as a loadable modules or builtin into the kernel.

Current code is not allow to unload the test_sort.ko module after
successful completion.

This patch adds support to unload the "test_sort.ko" module by adding
module_exit support.

Previous patch was implemented auto unload support by returning -EAGAIN
from module_init() function on successful case, but this approach is not
ideal.

The auto-unload might seem like a nice optimization, but it encourages
inconsistent behaviour.  And behaviour that is different from all other
normal modules.

Link: http://lkml.kernel.org/r/1513967133-6843-1-git-send-email-pravin.shedge4linux@gmail.com
Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Cc: Kostenzer Felix <fkostenzer@live.at>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:45 -08:00
Vincent Legoll d3deafaa8b lib/: make RUNTIME_TESTS a menuconfig to ease disabling it all
No need to get into the submenu to disable all related config entries.

This makes it easier to disable all RUNTIME_TESTS config options without
entering the submenu.  It will also enable one to see that en/dis-abled
state from the outside menu.

This is only intended to change menuconfig UI, not change the config
dependencies.

Link: http://lkml.kernel.org/r/20171209162742.7363-1-vincent.legoll@gmail.com
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Clement Courbet 0ade34c370 lib: optimize cpumask_next_and()
We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
It's essentially a joined iteration in search for a non-zero bit, which is
currently implemented as a lookup join (find a nonzero bit on the lhs,
lookup the rhs to see if it's set there).

Implement a direct join (find a nonzero bit on the incrementally built
join).  Also add generic bitmap benchmarks in the new `test_find_bit`
module for new function (see `find_next_and_bit` in [2] and [3] below).

For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
faster with a geometric mean of 2.1 on 32 CPUs [1].  No impact on memory
usage.  Note that on Arm, the new pure-C implementation still outperforms
the old one that uses a mix of C and asm (`find_next_bit`) [3].

[1] Approximate benchmark code:

```
  unsigned long src1p[nr_cpumask_longs] = {pattern1};
  unsigned long src2p[nr_cpumask_longs] = {pattern2};
  for (/*a bunch of repetitions*/) {
    for (int n = -1; n <= nr_cpu_ids; ++n) {
      asm volatile("" : "+rm"(src1p)); // prevent any optimization
      asm volatile("" : "+rm"(src2p));
      unsigned long result = cpumask_next_and(n, src1p, src2p);
      asm volatile("" : "+rm"(result));
    }
  }
```

Results:
pattern1    pattern2     time_before/time_after
0x0000ffff  0x0000ffff   1.65
0x0000ffff  0x00005555   2.24
0x0000ffff  0x00001111   2.94
0x0000ffff  0x00000000   14.0
0x00005555  0x0000ffff   1.67
0x00005555  0x00005555   1.71
0x00005555  0x00001111   1.90
0x00005555  0x00000000   6.58
0x00001111  0x0000ffff   1.46
0x00001111  0x00005555   1.49
0x00001111  0x00001111   1.45
0x00001111  0x00000000   3.10
0x00000000  0x0000ffff   1.18
0x00000000  0x00005555   1.18
0x00000000  0x00001111   1.17
0x00000000  0x00000000   1.25
-----------------------------
               geo.mean  2.06

[2] test_find_next_bit, X86 (skylake)

 [ 3913.477422] Start testing find_bit() with random-filled bitmap
 [ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations
 [ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations
 [ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations
 [ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations
 [ 3913.480216] Start testing find_next_and_bit() with random-filled
 bitmap
 [ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations
 [ 3913.481075] Start testing find_bit() with sparse bitmap
 [ 3913.481078] find_next_bit: 2536 cycles, 66 iterations
 [ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations
 [ 3913.481255] find_last_bit: 2006 cycles, 66 iterations
 [ 3913.481265] find_first_bit: 17488 cycles, 66 iterations
 [ 3913.481266] Start testing find_next_and_bit() with sparse bitmap
 [ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations

[3] test_find_next_bit, arm (v7 odroid XU3).

[  267.206928] Start testing find_bit() with random-filled bitmap
[  267.214752] find_next_bit: 4474 cycles, 16419 iterations
[  267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations
[  267.229294] find_last_bit: 4209 cycles, 16419 iterations
[  267.279131] find_first_bit: 1032991 cycles, 16420 iterations
[  267.286265] Start testing find_next_and_bit() with random-filled
bitmap
[  267.302386] find_next_and_bit: 2290 cycles, 8140 iterations
[  267.309422] Start testing find_bit() with sparse bitmap
[  267.316054] find_next_bit: 191 cycles, 66 iterations
[  267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations
[  267.329803] find_last_bit: 84 cycles, 66 iterations
[  267.336169] find_first_bit: 4118 cycles, 66 iterations
[  267.342627] Start testing find_next_and_bit() with sparse bitmap
[  267.356919] find_next_and_bit: 91 cycles, 1 iterations

[courbet@google.com: v6]
  Link: http://lkml.kernel.org/r/20171129095715.23430-1-courbet@google.com
[geert@linux-m68k.org: m68k/bitops: always include <asm-generic/bitops/find.h>]
  Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
Signed-off-by: Clement Courbet <courbet@google.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yury Norov <ynorov@caviumnetworks.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Yury Norov 15ff67bf85 lib/find_bit_benchmark.c: improvements
As suggested in review comments:
* printk: align numbers using whitespaces instead of tabs;
* return error value from init() to avoid calling rmmod if testing again;
* use ktime_get instead of get_cycles as some arches don't support it;

The output in dmesg (on QEMU arm64):
[   38.823430] Start testing find_bit() with random-filled bitmap
[   38.845358] find_next_bit:                20138448 ns, 163968 iterations
[   38.856217] find_next_zero_bit:           10615328 ns, 163713 iterations
[   38.863564] find_last_bit:                 7111888 ns, 163967 iterations
[   40.944796] find_first_bit:             2081007216 ns, 163968 iterations
[   40.944975]
[   40.944975] Start testing find_bit() with sparse bitmap
[   40.945268] find_next_bit:                   73216 ns,    656 iterations
[   40.967858] find_next_zero_bit:           22461008 ns, 327025 iterations
[   40.968047] find_last_bit:                   62320 ns,    656 iterations
[   40.978060] find_first_bit:                9889360 ns,    656 iterations

Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Clement Courbet <courbet@google.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Yury Norov dceeb3e7fd lib/test_find_bit.c: rename to find_bit_benchmark.c
As suggested in review comments, rename test_find_bit.c to
find_bit_benchmark.c.

Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Clement Courbet <courbet@google.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Alexander Potapenko a571b272ab lib/stackdepot.c: use a non-instrumented version of memcmp()
stackdepot used to call memcmp(), which compiler tools normally
instrument, therefore every lookup used to unnecessarily call instrumented
code.  This is somewhat ok in the case of KASAN, but under KMSAN a lot of
time was spent in the instrumentation.

Link: http://lkml.kernel.org/r/20171117172149.69562-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Andy Shevchenko fe81814c3e lib/test_bitmap.c: clean up test_zero_fill_copy() test case and rename
Since we have separate explicit test cases for bitmap_zero() /
bitmap_clear() and bitmap_fill() / bitmap_set(), clean up
test_zero_fill_copy() to only test bitmap_copy() functionality and thus
rename a function to reflect the changes.

While here, replace bitmap_fill() by bitmap_set() with proper values.

Link: http://lkml.kernel.org/r/20180109172430.87452-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Yury Norov <ynorov@caviumnetworks.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Andy Shevchenko 978f369c5c lib/test_bitmap.c: add bitmap_fill()/bitmap_set() test cases
Explicitly test bitmap_fill() and bitmap_set() functions.

For bitmap_fill() we expect a consistent behaviour as in bitmap_zero(),
i.e.  the trailing bits will be set up to unsigned long boundary.

Link: http://lkml.kernel.org/r/20180109172430.87452-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Yury Norov <ynorov@caviumnetworks.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Andy Shevchenko ee3527bd5e lib/test_bitmap.c: add bitmap_zero()/bitmap_clear() test cases
Explicitly test bitmap_zero() and bitmap_clear() functions.

Link: http://lkml.kernel.org/r/20180109172430.87452-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Yury Norov <ynorov@caviumnetworks.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Yury Norov 3aa56885e5 bitmap: replace bitmap_{from,to}_u32array
with bitmap_{from,to}_arr32 over the kernel. Additionally to it:
* __check_eq_bitmap() now takes single nbits argument.
* __check_eq_u32_array is not used in new test but may be used in
  future. So I don't remove it here, but annotate as __used.

Tested on arm64 and 32-bit BE mips.

[arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
  Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
[ynorov@caviumnetworks.com: fix net/core/ethtool.c]
  Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David Decotigny <decot@googlers.com>,
Cc: David S. Miller <davem@davemloft.net>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00