Commit graph

826647 commits

Author SHA1 Message Date
Linus Torvalds b25c69b9d5 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes:
   - various tooling fixes
   - kretprobe fixes
   - kprobes annotation fixes
   - kprobes error checking fix
   - fix the default events for AMD Family 17h CPUs
   - PEBS fix
   - AUX record fix
   - address filtering fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kprobes: Avoid kretprobe recursion bug
  kprobes: Mark ftrace mcount handler functions nokprobe
  x86/kprobes: Verify stack frame on kretprobe
  perf/x86/amd: Add event map for AMD Family 17h
  perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf()
  perf tools: Fix map reference counting
  perf evlist: Fix side band thread draining
  perf tools: Check maps for bpf programs
  perf bpf: Return NULL when RB tree lookup fails in perf_env__find_bpf_prog_info()
  tools include uapi: Sync sound/asound.h copy
  perf top: Always sample time to satisfy needs of use of ordered queuing
  perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)
  tools lib traceevent: Fix missing equality check for strcmp
  perf stat: Disable DIR_FORMAT feature for 'perf stat record'
  perf scripts python: export-to-sqlite.py: Fix use of parent_id in calls_view
  perf header: Fix lock/unlock imbalances when processing BPF/BTF info
  perf/x86: Fix incorrect PEBS_REGS
  perf/ring_buffer: Fix AUX record suppression
  perf/core: Fix the address filtering fix
  kprobes: Fix error check when reusing optimized probes
2019-04-20 10:05:02 -07:00
Linus Torvalds 1fd91d719e Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes all over the place: a console spam fix, section attributes
  fixes, a KASLR fix, a TLB stack-variable alignment fix, a reboot
  quirk, boot options related warnings fix, an LTO fix, a deadlock fix
  and an RDT fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority
  x86/cpu/bugs: Use __initconst for 'const' init data
  x86/mm/KASLR: Fix the size of the direct mapping section
  x86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness"
  x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info"
  x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T
  x86/mm: Prevent bogus warnings with "noexec=off"
  x86/build/lto: Fix truncated .bss with -fdata-sections
  x86/speculation: Prevent deadlock on ssb_state::lock
  x86/resctrl: Do not repeat rdtgroup mode initialization
2019-04-20 10:01:11 -07:00
Linus Torvalds 2b4cf5850d Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "A deadline scheduler warning/race fix, and a cfs_period_us quota
  calculation workaround where the real fix looks too involved to merge
  immediately"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Correctly handle active 0-lag timers
  sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup
2019-04-20 09:53:36 -07:00
Linus Torvalds de3af9a990 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "A lockdep warning fix and a script execution fix when atomics are
  generated"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Don't assume that scripts are executable
  locking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' again
2019-04-20 09:38:01 -07:00
Linus Torvalds 371dd432ab Merge branch 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
 "A patch to fix a RCU imbalance error in the devices cgroup
  configuration error path"

* 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  device_cgroup: fix RCU imbalance in error case
2019-04-19 18:03:55 -07:00
Linus Torvalds 4c3f49ae13 Merge branch 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Pull percpu fixlet from Dennis Zhou:
 "This stops printing the base address of percpu memory on
  initialization"

* 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  percpu: stop printing kernel addresses
2019-04-19 15:37:22 -07:00
Linus Torvalds 55e3a6ba5c TTY/Serial fixes for 5.1-rc6
Here are 5 small fixes for some tty/serial/vt issues that have been
 reported.
 
 The vt one has been around for a while, it is good to finally get that
 resolved.  The others fix a build warning that showed up in 5.1-rc1, and
 resolve a problem in the sh-sci driver.
 
 Note, the second patch for build warning fix for the sc16is7xx driver
 was just applied to the tree, as it resolves a problem with the previous
 patch to try to solve the issue.  It has not shown up in linux-next yet,
 unlike all of the other patches, but it has passed 0-day testing and
 everyone seems to agree that it is correct.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXLoeMg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yll3gCfdfclnKAQt7UnXUZccT7f6oebB6gAoLEy0NEi
 lDl7inafWImRKCRq1+zw
 =Qk9s
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are five small fixes for some tty/serial/vt issues that have been
  reported.

  The vt one has been around for a while, it is good to finally get that
  resolved. The others fix a build warning that showed up in 5.1-rc1,
  and resolve a problem in the sh-sci driver.

  Note, the second patch for build warning fix for the sc16is7xx driver
  was just applied to the tree, as it resolves a problem with the
  previous patch to try to solve the issue. It has not shown up in
  linux-next yet, unlike all of the other patches, but it has passed
  0-day testing and everyone seems to agree that it is correct"

* tag 'tty-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  sc16is7xx: put err_spi and err_i2c into correct #ifdef
  vt: fix cursor when clearing the screen
  sc16is7xx: move label 'err_spi' to correct section
  serial: sh-sci: Fix HSCIF RX sampling point adjustment
  serial: sh-sci: Fix HSCIF RX sampling point calculation
2019-04-19 12:22:27 -07:00
Linus Torvalds 3ecafda911 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
  mm/kmemleak.c: fix unused-function warning
  init: initialize jump labels before command line option parsing
  kernel/watchdog_hld.c: hard lockup message should end with a newline
  kcov: improve CONFIG_ARCH_HAS_KCOV help text
  mm: fix inactive list balancing between NUMA nodes and cgroups
  mm/hotplug: treat CMA pages as unmovable
  proc: fixup proc-pid-vm test
  proc: fix map_files test on F29
  mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
  mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
  mm: swapoff: shmem_unuse() stop eviction without igrab()
  mm: swapoff: take notice of completion sooner
  mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
  mm: swapoff: shmem_find_swap_entries() filter out other types
  slab: store tagged freelist for off-slab slabmgmt
2019-04-19 11:46:51 -07:00
Linus Torvalds b222e9af0a Staging/IIO fixes for 5.1-rc6
Here is a bunch of IIO driver fixes, and some smaller staging driver
 fixes, for 5.1-rc6.  The IIO fixes were delayed due to my vacation, but
 all resolve a number of reported issues and have been in linux-next for
 a few weeks with no reported issues.
 
 The other staging driver fixes are all tiny, resolving some reported
 issues in the comedi and most drivers, as well as some erofs fixes.
 
 All of these patches have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXLm7QA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylqDgCgwt+6SUTrkHCrwfiSEIyY0gXBnZsAoKrNjd1Y
 VvbBiG9lk0ST1V/vWrkJ
 =Njex
 -----END PGP SIGNATURE-----

Merge tag 'staging-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging and IIO fixes from Greg KH:
 "Here is a bunch of IIO driver fixes, and some smaller staging driver
  fixes, for 5.1-rc6. The IIO fixes were delayed due to my vacation, but
  all resolve a number of reported issues and have been in linux-next
  for a few weeks with no reported issues.

  The other staging driver fixes are all tiny, resolving some reported
  issues in the comedi and most drivers, as well as some erofs fixes.

  All of these patches have been in linux-next with no reported issues"

* tag 'staging-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (24 commits)
  staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf
  staging: comedi: ni_usb6501: Fix use of uninitialized mutex
  staging: erofs: fix unexpected out-of-bound data access
  staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf
  staging: comedi: vmk80xx: Fix use of uninitialized semaphore
  staging: most: core: use device description as name
  iio: core: fix a possible circular locking dependency
  iio: ad_sigma_delta: select channel when reading register
  iio: pms7003: select IIO_TRIGGERED_BUFFER
  iio: cros_ec: Fix the maths for gyro scale calculation
  iio: adc: xilinx: prevent touching unclocked h/w on remove
  iio: adc: xilinx: fix potential use-after-free on probe
  iio: adc: xilinx: fix potential use-after-free on remove
  iio: dac: mcp4725: add missing powerdown bits in store eeprom
  io: accel: kxcjk1013: restore the range after resume.
  iio:chemical:bme680: Fix SPI read interface
  iio:chemical:bme680: Fix, report temperature in millidegrees
  iio: chemical: fix missing Kconfig block for sgp30
  iio: adc: at91: disable adc channel interrupt in timeout case
  iio: gyro: mpu3050: fix chip ID reading
  ...
2019-04-19 11:10:42 -07:00
Linus Torvalds f9764dd4d3 char/misc fixes for 5.1-rc6
Here are 4 small misc driver fixes for 5.1-rc6.
 
 Nothing major at all, they fix up a Kconfig issues, a SPDX invalid
 license tag, and 2 tiny bugfixes.
 
 All have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXLm6yw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykBXQCgok0WblimRO9jTDQul7JXTxwZyxMAoMU841gi
 WPjtj2t1aqbxn8IdJPP/
 =bIbF
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
 "Here are four small misc driver fixes for 5.1-rc6.

  Nothing major at all, they fix up a Kconfig issues, a SPDX invalid
  license tag, and two tiny bugfixes.

  All have been in linux-next for a while with no reported issues"

* tag 'char-misc-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  drivers: power: supply: goldfish_battery: Fix bogus SPDX identifier
  extcon: ptn5150: fix COMPILE_TEST dependencies
  misc: fastrpc: add checked value for dma_set_mask
  habanalabs: remove low credit limit of DMA #0
2019-04-19 11:08:43 -07:00
Ming Lei 6bedf00e55 block: make sure that bvec length can't be overflow
bvec->bv_offset may be bigger than PAGE_SIZE sometimes, such as,
when one bio is splitted in the middle of one bvec via bio_split(),
and bi_iter.bi_bvec_done is used to build offset of the 1st bvec of
remained bio. And the remained bio's bvec may be re-submitted to fs
layer via ITER_IBVEC, such as loop and nvme-loop.

So we have to make sure that every bvec's offset is less than
PAGE_SIZE from bio_for_each_segment_all() because some drivers(loop,
nvme-loop) passes the splitted bvec to fs layer via ITER_BVEC.

This patch fixes this issue reported by Zhang Yi When running nvme/011.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixes: 6dc4f100c1 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-19 11:32:14 -06:00
Hou Tao b40fabc05e block: kill all_q_node in request_queue
all_q_node has not been used since commit 4b855ad371 ("blk-mq: Create
hctx for each present CPU"), so remove it.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-19 11:31:42 -06:00
Linus Torvalds 240206fcab Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:

 - several new key mappings for HID

 - a host of new ACPI IDs used to identify Elan touchpads in Lenovo
   laptops

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ
  HID: input: add mapping for "Toggle Display" key
  HID: input: add mapping for "Full Screen" key
  HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys
  HID: input: add mapping for Expose/Overview key
  HID: input: fix mapping of aspect ratio key
  [media] doc-rst: switch to new names for Full Screen/Aspect keys
  Input: document meanings of KEY_SCREEN and KEY_ZOOM
  Input: elan_i2c - add hardware ID for multiple Lenovo laptops
2019-04-19 10:28:27 -07:00
Hans de Goede 2ee27796f2 x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority
The "ENERGY_PERF_BIAS: Set to 'normal', was 'performance'" message triggers
on pretty much every Intel machine. The purpose of log messages with
a warning level is to notify the user of something which potentially is
a problem, or at least somewhat unexpected.

This message clearly does not match those criteria, so lower its log
priority from warning to info.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181230172715.17469-1-hdegoede@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 19:23:13 +02:00
Ingo Molnar 7579dfc42d perf/urgent fixes:
perf top:
 
   Jiri Olsa:
 
   - Fix 'perf top --pid', it needs PERF_SAMPLE_TIME since we switched to using
     a different thread to sort the events and then even for just a single
     thread we now need timestamps.
 
 BPF:
 
   Jiri Olsa:
 
   - Fix bpf_prog and btf lookup functions failure path to to properly return
     NULL.
 
   - Fix side band thread draining, used to process PERF_RECORD_BPF_EVENT
     metadata records.
 
 core:
 
   Jiri Olsa:
 
   - Fix map lookup by name to get a refcount when the name is already in
     the tree. Found
 
   Song Liu:
 
   - Fix __map__is_kmodule() by taking into account recently added BPF
     maps.
 
 UAPI:
 
   Arnaldo Carvalho de Melo:
 
   - Sync sound/asound.h copy
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXLm+cgAKCRCyPKLppCJ+
 J7QCAQCtI0NIeLUKY/o0vITBvCeKi5R1NNkE/A8ZhTzWD9UApwEA0eqoNHsOmIUG
 tFRCYi3hPByvtgjIRXvrCm0gpUfFdAo=
 =/siq
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-5.1-20190419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

perf top:

  Jiri Olsa:

  - Fix 'perf top --pid', it needs PERF_SAMPLE_TIME since we switched to using
    a different thread to sort the events and then even for just a single
    thread we now need timestamps.

BPF:

  Jiri Olsa:

  - Fix bpf_prog and btf lookup functions failure path to to properly return
    NULL.

  - Fix side band thread draining, used to process PERF_RECORD_BPF_EVENT
    metadata records.

core:

  Jiri Olsa:

  - Fix map lookup by name to get a refcount when the name is already in
    the tree. Found

  Song Liu:

  - Fix __map__is_kmodule() by taking into account recently added BPF
    maps.

UAPI:

  Arnaldo Carvalho de Melo:

  - Sync sound/asound.h copy

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 19:10:47 +02:00
Andrea Arcangeli 04f5866e41 coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
The core dumping code has always run without holding the mmap_sem for
writing, despite that is the only way to ensure that the entire vma
layout will not change from under it.  Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier.  For example in Hugh's post from Jul 2017:

  https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils

  "Not strictly relevant here, but a related note: I was very surprised
   to discover, only quite recently, how handle_mm_fault() may be called
   without down_read(mmap_sem) - when core dumping. That seems a
   misguided optimization to me, which would also be nice to correct"

In particular because the growsdown and growsup can move the
vm_start/vm_end the various loops the core dump does around the vma will
not be consistent if page faults can happen concurrently.

Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.

Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and to replace them with get_dump_page() for all binary formats
which is not suitable as a short term fix.

For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs.  Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.

Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.

In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"
however the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2 so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.

Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm().  The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.

Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes: 86039bd3b4 ("userfaultfd: add new syscall to provide memory externalization")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Arnd Bergmann dce5b0bdee mm/kmemleak.c: fix unused-function warning
The only references outside of the #ifdef have been removed, so now we
get a warning in non-SMP configurations:

  mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function]

Add a new #ifdef around it.

Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de
Fixes: 298a32b132 ("kmemleak: powerpc: skip scanning holes in the .bss section")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Dan Williams 6041186a32 init: initialize jump labels before command line option parsing
When a module option, or core kernel argument, toggles a static-key it
requires jump labels to be initialized early.  While x86, PowerPC, and
ARM64 arrange for jump_label_init() to be called before parse_args(),
ARM does not.

  Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 console=ttyAMA0,115200 page_alloc.shuffle=1
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
  page_alloc_shuffle+0x12c/0x1ac
  static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
  before call to jump_label_init()
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted
  5.1.0-rc4-next-20190410-00003-g3367c36ce744 #1
  Hardware name: ARM Integrator/CP (Device Tree)
  [<c0011c68>] (unwind_backtrace) from [<c000ec48>] (show_stack+0x10/0x18)
  [<c000ec48>] (show_stack) from [<c07e9710>] (dump_stack+0x18/0x24)
  [<c07e9710>] (dump_stack) from [<c001bb1c>] (__warn+0xe0/0x108)
  [<c001bb1c>] (__warn) from [<c001bb88>] (warn_slowpath_fmt+0x44/0x6c)
  [<c001bb88>] (warn_slowpath_fmt) from [<c0b0c4a8>]
  (page_alloc_shuffle+0x12c/0x1ac)
  [<c0b0c4a8>] (page_alloc_shuffle) from [<c0b0c550>] (shuffle_store+0x28/0x48)
  [<c0b0c550>] (shuffle_store) from [<c003e6a0>] (parse_args+0x1f4/0x350)
  [<c003e6a0>] (parse_args) from [<c0ac3c00>] (start_kernel+0x1c0/0x488)

Move the fallback call to jump_label_init() to occur before
parse_args().

The redundant calls to jump_label_init() in other archs are left intact
in case they have static key toggling use cases that are even earlier
than option parsing.

Link: http://lkml.kernel.org/r/155544804466.1032396.13418949511615676665.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Guenter Roeck <groeck@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Sergey Senozhatsky 8f4a8c12ca kernel/watchdog_hld.c: hard lockup message should end with a newline
Separate print_modules() and hard lockup error message.

Before the patch:

  NMI watchdog: Watchdog detected hard LOCKUP on cpu 1Modules linked in: nls_cp437

Link: http://lkml.kernel.org/r/20190412062557.2700-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Mark Rutland 40453c4f9b kcov: improve CONFIG_ARCH_HAS_KCOV help text
The help text for CONFIG_ARCH_HAS_KCOV is stale, and describes the
feature as being enabled only for x86_64, when it is now enabled for
several architectures, including arm, arm64, powerpc, and s390.

Let's remove that stale help text, and update it along the lines of hat
for ARCH_HAS_FORTIFY_SOURCE, better describing when an architecture
should select CONFIG_ARCH_HAS_KCOV.

Link: http://lkml.kernel.org/r/20190412102733.5154-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Johannes Weiner 3b991208b8 mm: fix inactive list balancing between NUMA nodes and cgroups
During !CONFIG_CGROUP reclaim, we expand the inactive list size if it's
thrashing on the node that is about to be reclaimed.  But when cgroups
are enabled, we suddenly ignore the node scope and use the cgroup scope
only.  The result is that pressure bleeds between NUMA nodes depending
on whether cgroups are merely compiled into Linux.  This behavioral
difference is unexpected and undesirable.

When the refault adaptivity of the inactive list was first introduced,
there were no statistics at the lruvec level - the intersection of node
and memcg - so it was better than nothing.

But now that we have that infrastructure, use lruvec_page_state() to
make the list balancing decision always NUMA aware.

[hannes@cmpxchg.org: fix bisection hole]
  Link: http://lkml.kernel.org/r/20190417155241.GB23013@cmpxchg.org
Link: http://lkml.kernel.org/r/20190412144438.2645-1-hannes@cmpxchg.org
Fixes: 2a2e48854d ("mm: vmscan: fix IO/refault regression in cache workingset transition")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Qian Cai 1a9f219157 mm/hotplug: treat CMA pages as unmovable
has_unmovable_pages() is used by allocating CMA and gigantic pages as
well as the memory hotplug.  The later doesn't know how to offline CMA
pool properly now, but if an unused (free) CMA page is encountered, then
has_unmovable_pages() happily considers it as a free memory and
propagates this up the call chain.  Memory offlining code then frees the
page without a proper CMA tear down which leads to an accounting issues.
Moreover if the same memory range is onlined again then the memory never
gets back to the CMA pool.

State after memory offline:

 # grep cma /proc/vmstat
 nr_free_cma 205824

 # cat /sys/kernel/debug/cma/cma-kvm_cma/count
 209920

Also, kmemleak still think those memory address are reserved below but
have already been used by the buddy allocator after onlining.  This
patch fixes the situation by treating CMA pageblocks as unmovable except
when has_unmovable_pages() is called as part of CMA allocation.

  Offlined Pages 4096
  kmemleak: Cannot insert 0xc000201f7d040008 into the object search tree (overlaps existing)
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    create_object+0x344/0x380
    __kmalloc_node+0x3ec/0x860
    kvmalloc_node+0x58/0x110
    seq_read+0x41c/0x620
    __vfs_read+0x3c/0x70
    vfs_read+0xbc/0x1a0
    ksys_read+0x7c/0x140
    system_call+0x5c/0x70
  kmemleak: Kernel memory leak detector disabled
  kmemleak: Object 0xc000201cc8000000 (size 13757317120):
  kmemleak:   comm "swapper/0", pid 0, jiffies 4294937297
  kmemleak:   min_count = -1
  kmemleak:   count = 0
  kmemleak:   flags = 0x5
  kmemleak:   checksum = 0
  kmemleak:   backtrace:
       cma_declare_contiguous+0x2a4/0x3b0
       kvm_cma_reserve+0x11c/0x134
       setup_arch+0x300/0x3f8
       start_kernel+0x9c/0x6e8
       start_here_common+0x1c/0x4b0
  kmemleak: Automatic memory scanning thread ended

[cai@lca.pw: use is_migrate_cma_page() and update commit log]
  Link: http://lkml.kernel.org/r/20190416170510.20048-1-cai@lca.pw
Link: http://lkml.kernel.org/r/20190413002623.8967-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:05 -07:00
Alexey Dobriyan 68545aa1cd proc: fixup proc-pid-vm test
Silly sizeof(pointer) vs sizeof(uint8_t[]) bug.

Link: http://lkml.kernel.org/r/20190414123009.GA12971@avx2
Fixes: e483b02087 ("proc: test /proc/*/maps, smaps, smaps_rollup, statm")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Alexey Dobriyan 8cd40d1d41 proc: fix map_files test on F29
F29 bans mapping first 64KB even for root making test fail.  Iterate
from address 0 until mmap() works.

Gentoo (root):

	openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
	mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0

Gentoo (non-root):

	openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
	mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EPERM (Operation not permitted)
	mmap(0x1000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x1000

F29 (root):

	openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
	mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x1000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x2000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x3000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x4000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x5000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x6000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x7000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x8000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x9000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xa000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xb000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xc000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xd000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xe000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0xf000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = -1 EACCES (Permission denied)
	mmap(0x10000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x10000

Now all proc tests succeed on F29 if run as root, at last!

Link: http://lkml.kernel.org/r/20190414123612.GB12971@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Konstantin Khlebnikov e8277b3b52 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
Commit 58bc4c34d2 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
depends on skipping vmstat entries with empty name introduced in
7aaf772723 ("mm: don't show nr_indirectly_reclaimable in
/proc/vmstat") but reverted in b29940c1ab ("mm: rename and change
semantics of nr_indirectly_reclaimable_bytes").

So skipping no longer works and /proc/vmstat has misformatted lines " 0".

This patch simply shows debug counters "nr_tlb_remote_*" for UP.

Link: http://lkml.kernel.org/r/155481488468.467.4295519102880913454.stgit@buzz
Fixes: 58bc4c34d2 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <guro@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
zhong jiang 37803841c9 mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
When adding memory by probing a memory block in the sysfs interface,
there is an obvious issue where we will unlock the device_hotplug_lock
when we failed to takes it.

That issue was introduced in 8df1d0e4a2 ("mm/memory_hotplug: make
add_memory() take the device_hotplug_lock").

We should drop out in time when failing to take the device_hotplug_lock.

Link: http://lkml.kernel.org/r/1554696437-9593-1-git-send-email-zhongjiang@huawei.com
Fixes: 8df1d0e4a2 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reported-by: Yang yingliang <yangyingliang@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins af53d3e9e0 mm: swapoff: shmem_unuse() stop eviction without igrab()
The igrab() in shmem_unuse() looks good, but we forgot that it gives no
protection against concurrent unmounting: a point made by Konstantin
Khlebnikov eight years ago, and then fixed in 2.6.39 by 778dd893ae
("tmpfs: fix race between umount and swapoff").  The current 5.1-rc
swapoff is liable to hit "VFS: Busy inodes after unmount of tmpfs.
Self-destruct in 5 seconds.  Have a nice day..." followed by GPF.

Once again, give up on using igrab(); but don't go back to making such
heavy-handed use of shmem_swaplist_mutex as last time: that would spoil
the new design, and I expect could deadlock inside shmem_swapin_page().

Instead, shmem_unuse() just raise a "stop_eviction" count in the shmem-
specific inode, and shmem_evict_inode() wait for that to go down to 0.
Call it "stop_eviction" rather than "swapoff_busy" because it can be put
to use for others later (huge tmpfs patches expect to use it).

That simplifies shmem_unuse(), protecting it from both unlink and
unmount; and in practice lets it locate all the swap in its first try.
But do not rely on that: there's still a theoretical case, when
shmem_writepage() might have been preempted after its get_swap_page(),
before making the swap entry visible to swapoff.

[hughd@google.com: remove incorrect list_del()]
  Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904091133570.1898@eggly.anvils
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081259400.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins 64165b1aff mm: swapoff: take notice of completion sooner
The old try_to_unuse() implementation was driven by find_next_to_unuse(),
which terminated as soon as all the swap had been freed.

Add inuse_pages checks now (alongside signal_pending()) to stop scanning
mms and swap_map once finished.

The same ought to be done in shmem_unuse() too, but never was before,
and needs a different interface: so leave it as is for now.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081258200.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins dd862deb15 mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
SWAP_UNUSE_MAX_TRIES 3 appeared to work well in earlier testing, but
further testing has proved it to be a source of unnecessary swapoff
EBUSY failures (which can then be followed by unmount EBUSY failures).

When mmget_not_zero() or shmem's igrab() fails, there is an mm exiting
or inode being evicted, freeing up swap independent of try_to_unuse().
Those typically completed much sooner than the old quadratic swapoff,
but now it's more common that swapoff may need to wait for them.

It's possible to move those cases from init_mm.mmlist and shmem_swaplist
to separate "exiting" swaplists, and try_to_unuse() then wait for those
lists to be emptied; but we've not bothered with that in the past, and
don't want to risk missing some other forgotten case.  So just revert to
cycling around until the swap is gone, without any retries limit.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081256170.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Hugh Dickins 8703954654 mm: swapoff: shmem_find_swap_entries() filter out other types
Swapfile "type" was passed all the way down to shmem_unuse_inode(), but
then forgotten from shmem_find_swap_entries(): with the result that
removing one swapfile would try to free up all the swap from shmem - no
problem when only one swapfile anyway, but counter-productive when more,
causing swapoff to be unnecessarily OOM-killed when it should succeed.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081254470.1523@eggly.anvils
Fixes: b56a2d8af9 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Qian Cai 1a62b18d51 slab: store tagged freelist for off-slab slabmgmt
Commit 51dedad06b ("kasan, slab: make freelist stored without tags")
calls kasan_reset_tag() for off-slab slab management object leading to
freelist being stored non-tagged.

However, cache_grow_begin() calls alloc_slabmgmt() which calls
kmem_cache_alloc_node() assigns a tag for the address and stores it in
the shadow address.  As the result, it causes endless errors below
during boot due to drain_freelist() -> slab_destroy() ->
kasan_slab_free() which compares already untagged freelist against the
stored tag in the shadow address.

Since off-slab slab management object freelist is such a special case,
just store it tagged.  Non-off-slab management object freelist is still
stored untagged which has not been assigned a tag and should not cause
any other troubles with this inconsistency.

  BUG: KASAN: double-free or invalid-free in slab_destroy+0x84/0x88
  Pointer tag: [ff], memory tag: [99]

  CPU: 0 PID: 1376 Comm: kworker/0:4 Tainted: G        W 5.1.0-rc3+ #8
  Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.0.6 07/10/2018
  Workqueue: cgroup_destroy css_killed_work_fn
  Call trace:
   print_address_description+0x74/0x2a4
   kasan_report_invalid_free+0x80/0xc0
   __kasan_slab_free+0x204/0x208
   kasan_slab_free+0xc/0x18
   kmem_cache_free+0xe4/0x254
   slab_destroy+0x84/0x88
   drain_freelist+0xd0/0x104
   __kmem_cache_shrink+0x1ac/0x224
   __kmemcg_cache_deactivate+0x1c/0x28
   memcg_deactivate_kmem_caches+0xa0/0xe8
   memcg_offline_kmem+0x8c/0x3d4
   mem_cgroup_css_offline+0x24c/0x290
   css_killed_work_fn+0x154/0x618
   process_one_work+0x9cc/0x183c
   worker_thread+0x9b0/0xe38
   kthread+0x374/0x390
   ret_from_fork+0x10/0x18

  Allocated by task 1625:
   __kasan_kmalloc+0x168/0x240
   kasan_slab_alloc+0x18/0x20
   kmem_cache_alloc_node+0x1f8/0x3a0
   cache_grow_begin+0x4fc/0xa24
   cache_alloc_refill+0x2f8/0x3e8
   kmem_cache_alloc+0x1bc/0x3bc
   sock_alloc_inode+0x58/0x334
   alloc_inode+0xb8/0x164
   new_inode_pseudo+0x20/0xec
   sock_alloc+0x74/0x284
   __sock_create+0xb0/0x58c
   sock_create+0x98/0xb8
   __sys_socket+0x60/0x138
   __arm64_sys_socket+0xa4/0x110
   el0_svc_handler+0x2c0/0x47c
   el0_svc+0x8/0xc

  Freed by task 1625:
   __kasan_slab_free+0x114/0x208
   kasan_slab_free+0xc/0x18
   kfree+0x1a8/0x1e0
   single_release+0x7c/0x9c
   close_pdeo+0x13c/0x43c
   proc_reg_release+0xec/0x108
   __fput+0x2f8/0x784
   ____fput+0x1c/0x28
   task_work_run+0xc0/0x1b0
   do_notify_resume+0xb44/0x1278
   work_pending+0x8/0x10

  The buggy address belongs to the object at ffff809681b89e00
   which belongs to the cache kmalloc-128 of size 128
  The buggy address is located 0 bytes inside of
   128-byte region [ffff809681b89e00, ffff809681b89e80)
  The buggy address belongs to the page:
  page:ffff7fe025a06e00 count:1 mapcount:0 mapping:01ff80082000fb00
  index:0xffff809681b8fe04
  flags: 0x17ffffffc000200(slab)
  raw: 017ffffffc000200 ffff7fe025a06d08 ffff7fe022ef7b88 01ff80082000fb00
  raw: ffff809681b8fe04 ffff809681b80000 00000001000000e0 0000000000000000
  page dumped because: kasan: bad access detected
  page allocated via order 0, migratetype Unmovable, gfp_mask
  0x2420c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE)
   prep_new_page+0x4e0/0x5e0
   get_page_from_freelist+0x4ce8/0x50d4
   __alloc_pages_nodemask+0x738/0x38b8
   cache_grow_begin+0xd8/0xa24
   ____cache_alloc_node+0x14c/0x268
   __kmalloc+0x1c8/0x3fc
   ftrace_free_mem+0x408/0x1284
   ftrace_free_init_mem+0x20/0x28
   kernel_init+0x24/0x548
   ret_from_fork+0x10/0x18

  Memory state around the buggy address:
   ffff809681b89c00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
   ffff809681b89d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
  >ffff809681b89e00: 99 99 99 99 99 99 99 99 fe fe fe fe fe fe fe fe
                     ^
   ffff809681b89f00: 43 43 43 43 43 fe fe fe fe fe fe fe fe fe fe fe
   ffff809681b8a000: 6d fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe

Link: http://lkml.kernel.org/r/20190403022858.97584-1-cai@lca.pw
Fixes: 51dedad06b ("kasan, slab: make freelist stored without tags")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19 09:46:04 -07:00
Andi Kleen 1de7edbb59 x86/cpu/bugs: Use __initconst for 'const' init data
Some of the recently added const tables use __initdata which causes section
attribute conflicts.

Use __initconst instead.

Fixes: fa1202ef22 ("x86/speculation: Add command line control")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190330004743.29541-9-andi@firstfloor.org
2019-04-19 17:11:39 +02:00
Masami Hiramatsu b191fa96ea x86/kprobes: Avoid kretprobe recursion bug
Avoid kretprobe recursion loop bg by setting a dummy
kprobes to current_kprobe per-CPU variable.

This bug has been introduced with the asm-coded trampoline
code, since previously it used another kprobe for hooking
the function return placeholder (which only has a nop) and
trampoline handler was called from that kprobe.

This revives the old lost kprobe again.

With this fix, we don't see deadlock anymore.

And you can see that all inner-called kretprobe are skipped.

  event_1                                  235               0
  event_2                                19375           19612

The 1st column is recorded count and the 2nd is missed count.
Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed)
(some difference are here because the counter is racy)

Reported-by: Andrea Righi <righi.andrea@gmail.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: c9becf58d9 ("[PATCH] kretprobe: kretprobe-booster")
Link: http://lkml.kernel.org/r/155094064889.6137.972160690963039.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:26:07 +02:00
Masami Hiramatsu fabe38ab6b kprobes: Mark ftrace mcount handler functions nokprobe
Mark ftrace mcount handler functions nokprobe since
probing on these functions with kretprobe pushes
return address incorrectly on kretprobe shadow stack.

Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:26:06 +02:00
Masami Hiramatsu 3ff9c075cc x86/kprobes: Verify stack frame on kretprobe
Verify the stack frame pointer on kretprobe trampoline handler,
If the stack frame pointer does not match, it skips the wrong
entry and tries to find correct one.

This can happen if user puts the kretprobe on the function
which can be used in the path of ftrace user-function call.
Such functions should not be probed, so this adds a warning
message that reports which function should be blacklisted.

Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:26:05 +02:00
Andrew Morton b50776ae01 locking/atomics: Don't assume that scripts are executable
patch(1) doesn't set the x bit on files.  So if someone downloads and
applies patch-4.21.xz, their kernel won't build.  Fix that by executing
/bin/sh.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-19 14:21:43 +02:00
Guoqing Jiang c53051128b sc16is7xx: put err_spi and err_i2c into correct #ifdef
err_spi is only called within SERIAL_SC16IS7XX_SPI
while err_i2c is called inside SERIAL_SC16IS7XX_I2C.
So we need to put err_spi and err_i2c into each #ifdef
accordingly.

This change fixes ("sc16is7xx: move label 'err_spi'
to correct section").

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-19 14:09:23 +02:00
Greg Kroah-Hartman cef62a615d Merge tag 'misc-habanalabs-next-2019-04-19' of git://people.freedesktop.org/~gabbayo/linux into char-misc-next
Oded writes:

This tag contains many changes for kernel 5.2.

The major changes are:
- Add a new IOCTL for debug, profiling and trace operations on the device.
  This will allow the user to perform profiling and debugging of the
  deep learning topologies that are executing on the ASIC.

- Add a shadow table for the ASIC's MMU page tables to avoid doing page
  table walks on the device's DRAM during map/unmap operations.

- re-factor of ASIC-dependent code to be common code for all ASICs

In addition, there are many small fixes and changes. The notable ones are:
- Allow accessing the DRAM using virtual address through the debugFS
  interface. Until now, only physical addresses were valid, but that is
  useless for debugging when working with MMU.

- Allow the user to modify the TPC clock relaxation value to better
  control TPC power consumption during topology execution.

- Allow the user to inquire about the device's status
  (operational/Malfunction/in-reset) in the INFO IOCTL.

- Improvements to the device's removal function, to prevent crash in case
  of force removal by the OS.

- Prevent PTE read/write during hard-reset. This will improve stability of
  the device during hard-reset.

* tag 'misc-habanalabs-next-2019-04-19' of git://people.freedesktop.org/~gabbayo/linux: (31 commits)
  habanalabs: prevent device PTE read/write during hard-reset
  habanalabs: improve IOCTLs behavior when disabled or reset
  habanalabs: all FD must be closed before removing device
  habanalabs: split mmu/no-mmu code paths in memory ioctl
  habanalabs: ASIC_AUTO_DETECT enum value is redundant
  habanalabs: refactoring in goya.c
  uapi/habanalabs: fix some comments in uapi file
  habanalabs: add goya implementation for debug configuration
  habanalabs: add new IOCTL for debug, tracing and profiling
  habanalabs: remove extra semicolon
  habanalabs: prevent CPU soft lockup on Palladium
  habanalabs: remove trailing blank line from EOF
  habanalabs: improve error messages
  habanalabs: add device status option to INFO IOCTL
  habanalabs: allow user to modify TPC clock relaxation value
  habanalabs: set new golden value to tpc clock relaxation
  habanalabs: never fail hard reset of device
  habanalabs: keep track of the device's dma mask
  habanalabs: add MMU shadow mapping
  habanalabs: Allow accessing DRAM virtual addresses via debugfs
  ...
2019-04-19 11:53:42 +02:00
Christoph Hellwig 144ec97493 scsi: aic7xxx: fix EISA support
Instead of relying on the now removed NULL argument to
pci_alloc_consistent, switch to the generic DMA API, and store the struct
device so that we can pass it.

Fixes: 4167b2ad51 ("PCI: Remove NULL device handling from PCI DMA API")
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18 20:43:10 -04:00
Saurav Kashyap 0228034d8e Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"
This patch clears FC_RP_STARTED flag during logoff, because of this
re-login(flogi) didn't happen to the switch.

This reverts commit 1550ec458e.

Fixes: 1550ec458e ("scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO")
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Hannes Reinecke <hare@#suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18 20:40:00 -04:00
Linus Torvalds 6d906f9981 Avoid compiler uninitialised warning introduced by recent arm64 futex fix
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAly4sd0ACgkQa9axLQDI
 XvGEbw/+I3511KdExOVY3MSubeSOUmY6sPmTkTOp/Gl1DxFK8GkWGQ4QrZupimqg
 YREir7frY6Mufd6qknQClzBlcOfi1tjWe7hjW1bFLHdI3aG1Aui25RsyDsRZkHV7
 p5HWX48fNgBwWtx6z/4jIYjbqt5pzNnNxxhvC7wMttMDVgZUqSnGT+cucUbh4hth
 LrnsoyvRSSUQqtkls64TQIWMfd3/cMzxtxdi6BZltv6C//LKeaZiEZtf2jGecdMT
 K2FINRbxp8Cya/pZRWHSasuhlAstV5l+8Vc9C9P/n1I5Xcxr1PwQAt53mQpu5zwf
 lQlJRzUKnWhwXyedvZe17hPoiByqn07h9JkFBu+ZLBu3ZfY/sEst3pWfyRs6fokb
 8YXnD+lcywU0tlitbZSzYlH/QmFc9Df7P6XUx57zPc9cqZj9Meolqg9VWc+xs5iX
 sr2MhLr0tUIvQZ9DI3FmcgZ1Xuoq7suCqn2/tQ715pb3zoDsVPog4wQHT95CyRWP
 9N8MzGcwIRnXsNe5gw47/ipyle8DN5nKi7pklYsnx6aYLqJQbgc5lJX/Czbv5cTu
 /VgFoCC4CW2vH/o2sccuc95+3JPXvzPXQCB75fpCPWG2Z3AAjxewAIUqLZ0LgEEp
 +rqnczLPWrSXEa6VDqRCwPPvrQRzSJ0YD4Ol2yyH894xOK9CBOs=
 =Z+9W
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Avoid compiler uninitialised warning introduced by recent arm64 futex
  fix"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: futex: Restore oldval initialization to work around buggy compilers
2019-04-18 10:24:48 -07:00
Nathan Chancellor ff8acf9290 arm64: futex: Restore oldval initialization to work around buggy compilers
Commit 045afc2412 ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with
non-zero result value") removed oldval's zero initialization in
arch_futex_atomic_op_inuser because it is not necessary. Unfortunately,
Android's arm64 GCC 4.9.4 [1] does not agree:

../kernel/futex.c: In function 'do_futex':
../kernel/futex.c:1658:17: warning: 'oldval' may be used uninitialized
in this function [-Wmaybe-uninitialized]
   return oldval == cmparg;
                 ^
In file included from ../kernel/futex.c:73:0:
../arch/arm64/include/asm/futex.h:53:6: note: 'oldval' was declared here
  int oldval, ret, tmp;
      ^

GCC fails to follow that when ret is non-zero, futex_atomic_op_inuser
returns right away, avoiding the uninitialized use that it claims.
Restoring the zero initialization works around this issue.

[1]: https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/

Cc: stable@vger.kernel.org
Fixes: 045afc2412 ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-04-18 18:17:08 +01:00
Christian Brauner 738a7832d2 signal: use fdget() since we don't allow O_PATH
As stated in the original commit for pidfd_send_signal() we don't allow
to signal processes through O_PATH file descriptors since it is
semantically equivalent to a write on the pidfd.

We already correctly error out right now and return EBADF if an O_PATH
fd is passed.  This is because we use file->f_op to detect whether a
pidfd is passed and O_PATH fds have their file->f_op set to empty_fops
in do_dentry_open() and thus fail the test.

Thus, there is no regression.  It's just semantically correct to use
fdget() and return an error right from there instead of taking a
reference and returning an error later.

Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jann@thejh.net>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-18 08:35:18 -07:00
Linus Torvalds d22113a2cd s390 update with bug fixes for 5.1-rc6
- Fix overwrite of the initial ramdisk due to misuse of IS_ENABLED
 
  - Fix integer overflow in the dasd driver resulting in incorrect number
    of blocks for large devices
 
  - Fix a lockdep false positive in the 3270 driver
 
  - Fix a deadlock in the zcrypt driver
 
  - Fix incorrect debug feature entries in the pkey api
 
  - Fix inline assembly constraints fallout with CONFIG_KASAN=y
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJcuIwlAAoJEDjwexyKj9rgCJEH/Agbx2sJlxApuKC6BMAjPZCy
 dJUu7hmumgtL+QI7IqQMRgfuxiC16xpJQd14GvfNoyjX1NVajepYI2+OUZHj7F6s
 La0fq/16fRriVByLiL+pqjV2y0c5kQ8Uhe15vwEvSmr3T/NgPLqUbIe5pwY5KS59
 O6B3WpVKeIb8UOUlEAQnezTYj70jWMT783NTyzhPcYLkiN7JJ7vI21FKPRM88ElS
 Q7f9qxGPv0hGsOqBXCBWztxgx+ezEHsnlH4t+he0tLOGXuuiy4BrgGtgbJx17frj
 xNNe8gOXV83bUbYTBygITnjMjS0m/1viJ4XTBgv/HLGgcKG06n3v6U4+zefB4N4=
 =+41g
 -----END PGP SIGNATURE-----

Merge tag 's390-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 bug fixes from Martin Schwidefsky:

 - Fix overwrite of the initial ramdisk due to misuse of IS_ENABLED

 - Fix integer overflow in the dasd driver resulting in incorrect number
   of blocks for large devices

 - Fix a lockdep false positive in the 3270 driver

 - Fix a deadlock in the zcrypt driver

 - Fix incorrect debug feature entries in the pkey api

 - Fix inline assembly constraints fallout with CONFIG_KASAN=y

* tag 's390-5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: correct some inline assembly constraints
  s390/pkey: add one more argument space for debug feature entry
  s390/zcrypt: fix possible deadlock situation on ap queue remove
  s390/3270: fix lockdep false positive on view->lock
  s390/dasd: Fix capacity calculation for large volumes
  s390/mem_detect: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD)
2019-04-18 08:15:06 -07:00
Linus Torvalds 2a852fd1ac AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAXLGSOfu3V2unywtrAQKrLA/+MJfgZC9ejNsXqjnnqq48LtBf4B3FhYrO
 qdPZMKPuy5XxSWBIIGpXtgK7PspkpAD1NCe7qpvibQoa1SJqwa2gc46aN7+gzp3e
 HETkb4UEXNV5yXSjEA+YOQVa0MiLG2yL6satlqhN4KqoDB5Rj9KoaPFwT79K4SIU
 1iME9Q//tA2M7SU8B1FWqyx5kd350A38KoXmHOux7x8K1ZyjEB2DbyKqZBtn6Y/T
 nfH9ZZioep1JbPFwOuG5NYQH0hP2jUj8ogDnrzzWzn1rQQTWs6wVfigwAh8tc/Dv
 eDKA8rV9WtZT97ExOG6cQOe0UZsM3NFhnb4P/oSCpl+jtr9WHO3pDmJPLToCr0BA
 8xXW7vA+Ejx15dNz4Snnz9yj3FS0UeURSFyhn3g2AFQMK4v9Dd3ZZKOgjR9Czi98
 f1UNh768LKY+siOHmoOhgZNlrI8bgamCmXOucPayGzrQ/HVHqAwR6PiUGCzaH+vk
 +LfDC/Z/ezSqrfQiH/Ji5lLkQkhmYiUTcePn5xdi+kZGWqHwl9qGplVAFUVafs5O
 Hz+H2F7i8F6Oc4v5hzCHp5aDUP2oEKmruaXJimN+O9SOkUPi72aG8STRpBFnBqf7
 77WxNVG/XDFL6I8HukkgRMRePFDXi+2NwGOjB9k8PpAqyA1Vi6sSkCXVkwgAhLRS
 NCaOR2iDaqE=
 =cqBe
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20190413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:

 - Stop using the deprecated get_seconds().

 - Don't make tracepoint strings const as the section they go in isn't
   read-only.

 - Differentiate failure due to unmarshalling from other failure cases.
   We shouldn't abort with RXGEN_CC/SS_UNMARSHAL if it's not due to
   unmarshalling.

 - Add a missing unlock_page().

 - Fix the interaction between receiving a notification from a server
   that it has invalidated all outstanding callback promises and a
   client call that we're in the middle of making that will get a new
   promise.

* tag 'afs-fixes-20190413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix in-progess ops to ignore server-level callback invalidation
  afs: Unlock pages for __pagevec_release()
  afs: Differentiate abort due to unmarshalling from other errors
  afs: Avoid section confusion in CM_NAME
  afs: avoid deprecated get_seconds()
2019-04-18 08:10:22 -07:00
Linus Torvalds d3ce3b1879 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "Fix a bug in the implementation of the x86 accelerated version of
  poly1305"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/poly1305 - fix overflow during partial reduction
2019-04-18 08:04:10 -07:00
Linus Torvalds 95ea55291e drm tegra, amdgpu fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJct/CWAAoJEAx081l5xIa+jkAP/0uiIUunGeRIGbNzbNSaIHcH
 E87Ggq5FB3bSIq7LY13JffrTYfeZRCTjpqVhqnOuqLPxrzxnjXmX9z+OLM2tuP5e
 paPYPtJSUCEBWf5Y8nHysIE4iKjS1Cc8HACfj/3LyYBRzG7x6bYG6qe4y0GIhyVP
 xypYuLpyYDGl0Ixgmk5MstO5UCOe2fnkyrvlw2Aoguk8sZcHqIZK0DJRO0teCx7t
 dgqO5x4JlsC26aCNLy0v0lhcmEXwCDgQLvSZlrS2c7OLiTWurlQJkdOvL9bx/ji2
 mShuuTqVNs9CVn7iIqFToOLMKgX0ZYmTD3W5367/plTdbd7DT6nkfyIjKpXpzmtH
 L2NrTl+Z8yiUlU4E4hhN9eGara4p5J9wI95SAQyvCADCB2LhfPuSSPpVBkjbAgl6
 8vLGASEyzR98337MvdTXcMeVXMwpnVbPTiTANeihMKDfIBN6LqL/fCFsxpcFQuTg
 NcrkmHLHAof80M7rYbjF6ECSFI7/uWMDRpLmRszO31xaJzwvWIDmO0Q5JGx3q3i7
 gnJz2NJsmWv/+IXbY1toiF6F4x7joCfeVL3uwY0eNpiKIe2cvZcsuU5DollXS2va
 awsgfMM0x9vSSLUr/E7QTMEZX8x1BsIxePG7OrazltLG6T9xHEun+Xp7BvGovFcF
 y7qVkYrKbGvMZ1+lBlB0
 =nAcQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2019-04-18' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Since Easter is looming for me, I'm just pushing whatever is in my
  tree, I'll see what else turns up and maybe I'll send another pull
  early next week if there is anything.

  tegra:
   - stream id programming fix
   - avoid divide by 0 for bad hdmi audio setup code

  ttm:
   - Hugepages fix
   - refcount imbalance in error path fix

  amdgpu:
   - GPU VM fixes for Vega/RV
   - DC AUX fix for active DP-DVI dongles
   - DC fix for multihead regression"

* tag 'drm-fixes-2019-04-18' of git://anongit.freedesktop.org/drm/drm:
  drm/tegra: hdmi: Setup audio only if configured
  drm/amd/display: If one stream full updates, full update all planes
  drm/amdgpu/gmc9: fix VM_L2_CNTL3 programming
  drm/amdgpu: shadow in shadow_list without tbo.mem.start cause page fault in sriov TDR
  gpu: host1x: Program stream ID to bypass without SMMU
  drm/amd/display: extending AUX SW Timeout
  drm/ttm: fix dma_fence refcount imbalance on error path
  drm/ttm: fix incrementing the page pointer for huge pages
  drm/ttm: fix start page for huge page check in ttm_put_pages()
  drm/ttm: fix out-of-bounds read in ttm_put_pages() v2
2019-04-18 07:56:05 -07:00
Chang-An Chen 3f2552f7e9 timers/sched_clock: Prevent generic sched_clock wrap caused by tick_freeze()
tick_freeze() introduced by suspend-to-idle in commit 124cf9117c ("PM /
sleep: Make it possible to quiesce timers during suspend-to-idle") uses
timekeeping_suspend() instead of syscore_suspend() during
suspend-to-idle. As a consequence generic sched_clock will keep going
because sched_clock_suspend() and sched_clock_resume() are not invoked
during suspend-to-idle which can result in a generic sched_clock wrap.

On a ARM system with suspend-to-idle enabled, sched_clock is registered
as "56 bits at 13MHz, resolution 76ns, wraps every 4398046511101ns", which
means the real wrapping duration is 8796093022202ns.

[  134.551779] suspend-to-idle suspend (timekeeping_suspend())
[ 1204.912239] suspend-to-idle resume (timekeeping_resume())
......
[ 1206.912239] suspend-to-idle suspend (timekeeping_suspend())
[ 5880.502807] suspend-to-idle resume (timekeeping_resume())
......
[ 6000.403724] suspend-to-idle suspend (timekeeping_suspend())
[ 8035.753167] suspend-to-idle resume  (timekeeping_resume())
......
[ 8795.786684] (2)[321:charger_thread]......
[ 8795.788387] (2)[321:charger_thread]......
[    0.057226] (0)[0:swapper/0]......
[    0.061447] (2)[0:swapper/2]......

sched_clock was not stopped during suspend-to-idle, and sched_clock_poll
hrtimer was not expired because timekeeping_suspend() was invoked during
suspend-to-idle. It makes sched_clock wrap at kernel time 8796s.

To prevent this, invoke sched_clock_suspend() and sched_clock_resume() in
tick_freeze() together with timekeeping_suspend() and timekeeping_resume().

Fixes: 124cf9117c (PM / sleep: Make it possible to quiesce timers during suspend-to-idle)
Signed-off-by: Chang-An Chen <chang-an.chen@mediatek.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Corey Minyard <cminyard@mvista.com>
Cc: <linux-mediatek@lists.infradead.org>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: <kuohong.wang@mediatek.com>
Cc: <freddy.hsin@mediatek.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1553828349-8914-1-git-send-email-chang-an.chen@mediatek.com
2019-04-18 14:34:53 +02:00
Kim Phillips 3fe3331bb2 perf/x86/amd: Add event map for AMD Family 17h
Family 17h differs from prior families by:

 - Does not support an L2 cache miss event
 - It has re-enumerated PMC counters for:
   - L2 cache references
   - front & back end stalled cycles

So we add a new amd_f17h_perfmon_event_map[] so that the generic
perf event names will resolve to the correct h/w events on
family 17h and above processors.

Reference sections 2.1.13.3.3 (stalls) and 2.1.13.3.6 (L2):

  https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: e40ed1542d ("perf/x86: Add perf support for AMD family-17h processors")
[ Improved the formatting a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-18 14:31:54 +02:00
Baoquan He ec3937107a x86/mm/KASLR: Fix the size of the direct mapping section
kernel_randomize_memory() uses __PHYSICAL_MASK_SHIFT to calculate
the maximum amount of system RAM supported. The size of the direct
mapping section is obtained from the smaller one of the below two
values:

  (actual system RAM size + padding size) vs (max system RAM size supported)

This calculation is wrong since commit

  b83ce5ee91 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52").

In it, __PHYSICAL_MASK_SHIFT was changed to be 52, regardless of whether
the kernel is using 4-level or 5-level page tables. Thus, it will always
use 4 PB as the maximum amount of system RAM, even in 4-level paging
mode where it should actually be 64 TB.

Thus, the size of the direct mapping section will always
be the sum of the actual system RAM size plus the padding size.

Even when the amount of system RAM is 64 TB, the following layout will
still be used. Obviously KALSR will be weakened significantly.

   |____|_______actual RAM_______|_padding_|______the rest_______|
   0            64TB                                            ~120TB

Instead, it should be like this:

   |____|_______actual RAM_______|_________the rest______________|
   0            64TB                                            ~120TB

The size of padding region is controlled by
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, which is 10 TB by default.

The above issue only exists when
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING is set to a non-zero value,
which is the case when CONFIG_MEMORY_HOTPLUG is enabled. Otherwise,
using __PHYSICAL_MASK_SHIFT doesn't affect KASLR.

Fix it by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.

 [ bp: Massage commit message. ]

Fixes: b83ce5ee91 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: frank.ramsay@hpe.com
Cc: herbert@gondor.apana.org.au
Cc: kirill@shutemov.name
Cc: mike.travis@hpe.com
Cc: thgarnie@google.com
Cc: x86-ml <x86@kernel.org>
Cc: yamada.masahiro@socionext.com
Link: https://lkml.kernel.org/r/20190417083536.GE7065@MiWiFi-R3L-srv
2019-04-18 10:42:58 +02:00