Commit graph

737673 commits

Author SHA1 Message Date
Kan Liang 8872481bd0 perf mmap: Introduce perf_mmap__read_init()
The new function perf_mmap__read_init() is factored out from
perf_mmap__push().

It is to calculate the 'start' and 'end' of the available data in
ringbuffer.

No functional change.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-5-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:52:22 -03:00
Kan Liang f92c8cbe59 perf mmap: Cleanup perf_mmap__push()
The first assignment for 'start' and 'end' is redundant.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-4-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:52:05 -03:00
Kan Liang dc6c35c679 perf mmap: Recalculate size for overwrite mode
In perf_mmap__push(), the 'size' need to be recalculated, otherwise the
invalid data might be pushed to the record in overwrite mode.

The issue is introduced by commit 7fb4b407a1 ("perf mmap: Don't
discard prev in backward mode").

When the ring buffer is full in overwrite mode, backward_rb_find_range()
will be called to recalculate the 'start' and 'end'. The 'size' needs to
be recalculated accordingly.

Unconditionally recalculate the 'size', not just for full ring buffer in
overwrite mode. Because:

- There is no harmful to recalculate the 'size' for other cases.
- The code of calculating 'start' and 'end' will be factored out later.
  The new function does not need to return 'size'.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 7fb4b407a1 ("perf mmap: Don't discard prev in backward mode")
Link: http://lkml.kernel.org/r/1516310792-208685-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:51:57 -03:00
Kan Liang 6888ff66c4 perf evlist: Remove stale mmap read for backward
perf_evlist__mmap_read_catchup() and perf_evlist__mmap_read_backward()
are only for overwrite mode.

But they read the evlist->mmap buffer which is for non-overwrite mode.

It did not bring any serious problem yet, because there is no one use
it.

Remove the unused interfaces.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1516310792-208685-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:50:53 -03:00
William Cohen 0b7c1528fb perf vendor events aarch64: Add JSON metrics for ARM Cortex-A53 Processor
Add JSON metrics for ARM Cortex-A53 Processor.

Unlike the Intel processors there isn't a script that automatically
generated these files. The patch was manually generated from the
documentation and the previous oprofile ARM Cortex ac53 event file patch
I made.

The relevant documentation is in the "12.9 Events" section of the ARM
Cortex A53 MPCore Processor Revision: r0p4 Technical Reference Manual.

The ARM Cortex A53 manual is available at:

  http://infocenter.arm.com/help/topic/com.arm.doc.ddi0500g/DDI0500G_cortex_a53_trm.pdf

Use that to look for additional information about the events.

Signed-off-by: William Cohen <wcohen@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180131032813.9564-1-wcohen@redhat.com
[ Added references provided by William Cohen ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:49:44 -03:00
Rafael J. Wysocki 31a3be353f Merge branches 'acpi-ec', 'acpi-tables' and 'acpi-doc'
* acpi-ec:
  ACPI / EC: Restore polling during noirq suspend/resume phases

* acpi-tables:
  ACPI: SPCR: Mark expected switch fall-through in acpi_parse_spcr

* acpi-doc:
  ACPI: dock: document sysfs interface
  ACPI / DPTF: Document dptf_power sysfs atttributes
2018-02-15 12:02:42 +01:00
Rafael J. Wysocki 822ffaa581 Merge branches 'pm-cpuidle' and 'pm-opp'
* pm-cpuidle:
  PM: cpuidle: Fix cpuidle_poll_state_init() prototype
  Documentation/ABI: update cpuidle sysfs documentation

* pm-opp:
  opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table
2018-02-15 12:01:53 +01:00
Alexander Abrosimov c8ba9db2a7 platform/x86: dell-laptop: Removed duplicates in DMI whitelist
Fixed a mistake in which several entries were duplicated in the DMI list
from the below commit
fe486138 platform/x86: dell-laptop: Add 2-in-1 devices to the DMI whitelist

Signed-off-by: Alexander Abrosimov <alexander.n.abrosimov@gmail.com>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-02-15 12:18:33 +02:00
Laszlo Toth eca39e7f0c platform/x86: dell-laptop: fix kbd_get_state's request value
Commit 9862b43624 ("platform/x86: dell-laptop: Allocate buffer on heap
rather than globally")
broke one request, changed it back to the original value.

Tested on a Dell E6540, backlight came back.

Fixes: 9862b43624 ("platform/x86: dell-laptop: Allocate buffer on heap rather than globally")
Signed-off-by: Laszlo Toth <laszlth@gmail.com>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-02-15 12:18:33 +02:00
Aaron Ma ed5b9ba7be platform/x86: ideapad-laptop: Increase timeout to wait for EC answer
Lenovo E41-20 needs more time than 100ms to read VPC,
the funtion keys always failed responding.
Increase timeout to get the value from VPC, then
the funtion keys like mic mute key work well.

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-02-15 12:18:32 +02:00
Andrey Ryabinin 6e1d8ea909 platform/x86: wmi: fix off-by-one write in wmi_dev_probe()
wmi_dev_probe() allocates one byte less than necessary, thus
subsequent sprintf() call writes trailing zero past the end
of the 'buf':

    BUG: KASAN: slab-out-of-bounds in vsnprintf+0xda4/0x1240
    Write of size 1 at addr ffff880423529caf by task kworker/1:1/32

    Call Trace:
     dump_stack+0xb3/0x14d
     print_address_description+0xd7/0x380
     kasan_report+0x166/0x2b0
     vsnprintf+0xda4/0x1240
     sprintf+0x9b/0xd0
     wmi_dev_probe+0x1c3/0x400
     driver_probe_device+0x5d1/0x990
     bus_for_each_drv+0x109/0x190
     __device_attach+0x217/0x360
     bus_probe_device+0x1ad/0x260
     deferred_probe_work_func+0x10f/0x5d0
     process_one_work+0xa8b/0x1dc0
     worker_thread+0x20d/0x17d0
     kthread+0x311/0x3d0
     ret_from_fork+0x3a/0x50

    Allocated by task 32:
     kasan_kmalloc+0xa0/0xd0
     __kmalloc+0x14f/0x3e0
     wmi_dev_probe+0x182/0x400
     driver_probe_device+0x5d1/0x990
     bus_for_each_drv+0x109/0x190
     __device_attach+0x217/0x360
     bus_probe_device+0x1ad/0x260
     deferred_probe_work_func+0x10f/0x5d0
     process_one_work+0xa8b/0x1dc0
     worker_thread+0x20d/0x17d0
     kthread+0x311/0x3d0
     ret_from_fork+0x3a/0x50

Increment allocation size to fix this.

Fixes: 44b6b76611 ("platform/x86: wmi: create userspace interface for drivers")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-02-15 12:18:32 +02:00
Jens Axboe 7ddbc29fe4 Merge branch 'nvme-4.16-rc' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Keith:

"After syncing with Christoph and Sagi, we feel this is a good time to
 send our latest fixes across most of the nvme components for 4.16"

* 'nvme-4.16-rc' of git://git.infradead.org/nvme:
  nvme-rdma: fix sysfs invoked reset_ctrl error flow
  nvmet: Change return code of discard command if not supported
  nvme-pci: Fix timeouts in connecting state
  nvme-pci: Remap CMB SQ entries on every controller reset
  nvme: fix the deadlock in nvme_update_formats
  nvme: Don't use a stack buffer for keep-alive command
  nvme_fc: cleanup io completion
  nvme_fc: correct abort race condition on resets
  nvme: Fix discard buffer overrun
  nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition
  nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
  nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
2018-02-14 19:01:53 -07:00
Linus Torvalds e525de3ab0 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes all across the map:

   - /proc/kcore vsyscall related fixes
   - LTO fix
   - build warning fix
   - CPU hotplug fix
   - Kconfig NR_CPUS cleanups
   - cpu_has() cleanups/robustification
   - .gitignore fix
   - memory-failure unmapping fix
   - UV platform fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
  x86/error_inject: Make just_return_func() globally visible
  x86/platform/UV: Fix GAM Range Table entries less than 1GB
  x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore
  x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
  x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally
  vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page
  x86/Kconfig: Further simplify the NR_CPUS config
  x86/Kconfig: Simplify NR_CPUS config
  x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()"
  x86/cpufeature: Update _static_cpu_has() to use all named variables
  x86/cpufeature: Reindent _static_cpu_has()
2018-02-14 17:31:51 -08:00
Linus Torvalds d4667ca142 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar:
 "Here's the latest set of Spectre and PTI related fixes and updates:

  Spectre:
   - Add entry code register clearing to reduce the Spectre attack
     surface
   - Update the Spectre microcode blacklist
   - Inline the KVM Spectre helpers to get close to v4.14 performance
     again.
   - Fix indirect_branch_prediction_barrier()
   - Fix/improve Spectre related kernel messages
   - Fix array_index_nospec_mask() asm constraint
   - KVM: fix two MSR handling bugs

  PTI:
   - Fix a paranoid entry PTI CR3 handling bug
   - Fix comments

  objtool:
   - Fix paranoid_entry() frame pointer warning
   - Annotate WARN()-related UD2 as reachable
   - Various fixes
   - Add Add Peter Zijlstra as objtool co-maintainer

  Misc:
   - Various x86 entry code self-test fixes
   - Improve/simplify entry code stack frame generation and handling
     after recent heavy-handed PTI and Spectre changes. (There's two
     more WIP improvements expected here.)
   - Type fix for cache entries

  There's also some low risk non-fix changes I've included in this
  branch to reduce backporting conflicts:

   - rename a confusing x86_cpu field name
   - de-obfuscate the naming of single-TLB flushing primitives"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  x86/entry/64: Fix CR3 restore in paranoid_exit()
  x86/cpu: Change type of x86_cache_size variable to unsigned int
  x86/spectre: Fix an error message
  x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
  selftests/x86/mpx: Fix incorrect bounds with old _sigfault
  x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
  x86/speculation: Add <asm/msr-index.h> dependency
  nospec: Move array_index_nospec() parameter checking into separate macro
  x86/speculation: Fix up array_index_nospec_mask() asm constraint
  x86/debug: Use UD2 for WARN()
  x86/debug, objtool: Annotate WARN()-related UD2 as reachable
  objtool: Fix segfault in ignore_unreachable_insn()
  selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
  selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
  selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
  selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory
  selftests/x86/pkeys: Remove unused functions
  selftests/x86: Clean up and document sscanf() usage
  selftests/x86: Fix vDSO selftest segfault for vsyscall=none
  x86/entry/64: Remove the unused 'icebp' macro
  ...
2018-02-14 17:02:15 -08:00
Ingo Molnar e486575734 x86/entry/64: Fix CR3 restore in paranoid_exit()
Josh Poimboeuf noticed the following bug:

 "The paranoid exit code only restores the saved CR3 when it switches back
  to the user GS.  However, even in the kernel GS case, it's possible that
  it needs to restore a user CR3, if for example, the paranoid exception
  occurred in the syscall exit path between SWITCH_TO_USER_CR3_STACK and
  SWAPGS."

Josh also confirmed via targeted testing that it's possible to hit this bug.

Fix the bug by also restoring CR3 in the paranoid_exit_no_swapgs branch.

The reason we haven't seen this bug reported by users yet is probably because
"paranoid" entry points are limited to the following cases:

 idtentry double_fault       do_double_fault  has_error_code=1  paranoid=2
 idtentry debug              do_debug         has_error_code=0  paranoid=1 shift_ist=DEBUG_STACK
 idtentry int3               do_int3          has_error_code=0  paranoid=1 shift_ist=DEBUG_STACK
 idtentry machine_check      do_mce           has_error_code=0  paranoid=1

Amongst those entry points only machine_check is one that will interrupt an
IRQS-off critical section asynchronously - and machine check events are rare.

The other main asynchronous entries are NMI entries, which can be very high-freq
with perf profiling, but they are special: they don't use the 'idtentry' macro but
are open coded and restore user CR3 unconditionally so don't have this bug.

Reported-and-tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180214073910.boevmg65upbk3vqb@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:54 +01:00
Gustavo A. R. Silva 24dbc6000f x86/cpu: Change type of x86_cache_size variable to unsigned int
Currently, x86_cache_size is of type int, which makes no sense as we
will never have a valid cache size equal or less than 0. So instead of
initializing this variable to -1, it can perfectly be initialized to 0
and use it as an unsigned variable instead.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Addresses-Coverity-ID: 1464429
Link: http://lkml.kernel.org/r/20180213192208.GA26414@embeddedor.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:53 +01:00
Dan Carpenter 9de29eac8d x86/spectre: Fix an error message
If i == ARRAY_SIZE(mitigation_options) then we accidentally print
garbage from one space beyond the end of the mitigation_options[] array.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Fixes: 9005c6834c ("x86/spectre: Simplify spectre_v2 command line parsing")
Link: http://lkml.kernel.org/r/20180214071416.GA26677@mwanda
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:53 +01:00
Jia Zhang b399151cb4 x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
x86_mask is a confusing name which is hard to associate with the
processor's stepping.

Additionally, correct an indent issue in lib/cpu.c.

Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
[ Updated it to more recent kernels. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:52 +01:00
Rui Wang 961888b1d7 selftests/x86/mpx: Fix incorrect bounds with old _sigfault
For distributions with old userspace header files, the _sigfault
structure is different. mpx-mini-test fails with the following
error:

  [root@Purley]# mpx-mini-test_64 tabletest
  XSAVE is supported by HW & OS
  XSAVE processor supported state mask: 0x2ff
  XSAVE OS supported state mask: 0x2ff
   BNDREGS: size: 64 user: 1 supervisor: 0 aligned: 0
    BNDCSR: size: 64 user: 1 supervisor: 0 aligned: 0
  starting mpx bounds table test
  ERROR: siginfo bounds do not match shadow bounds for register 0

Fix it by using the correct offset of _lower/_upper in _sigfault.
RHEL needs this patch to work.

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@linux.intel.com
Fixes: e754aedc26 ("x86/mpx, selftests: Add MPX self test")
Link: http://lkml.kernel.org/r/1513586050-1641-1-git-send-email-rui.y.wang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:52 +01:00
Andy Lutomirski 1299ef1d88 x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
flush_tlb_single() and flush_tlb_one() sound almost identical, but
they really mean "flush one user translation" and "flush one kernel
translation".  Rename them to flush_tlb_one_user() and
flush_tlb_one_kernel() to make the semantics more obvious.

[ I was looking at some PTI-related code, and the flush-one-address code
  is unnecessarily hard to understand because the names of the helpers are
  uninformative.  This came up during PTI review, but no one got around to
  doing it. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/3303b02e3c3d049dc5235d5651e0ae6d29a34354.1517414378.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:52 +01:00
Peter Zijlstra ea00f30128 x86/speculation: Add <asm/msr-index.h> dependency
Joe Konno reported a compile failure resulting from using an MSR
without inclusion of <asm/msr-index.h>, and while the current code builds
fine (by accident) this needs fixing for future patches.

Reported-by: Joe Konno <joe.konno@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arjan@linux.intel.com
Cc: bp@alien8.de
Cc: dan.j.williams@intel.com
Cc: dave.hansen@linux.intel.com
Cc: dwmw2@infradead.org
Cc: dwmw@amazon.co.uk
Cc: gregkh@linuxfoundation.org
Cc: hpa@zytor.com
Cc: jpoimboe@redhat.com
Cc: linux-tip-commits@vger.kernel.org
Cc: luto@kernel.org
Fixes: 20ffa1caec ("x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support")
Link: http://lkml.kernel.org/r/20180213132819.GJ25201@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:51 +01:00
Will Deacon 8fa80c503b nospec: Move array_index_nospec() parameter checking into separate macro
For architectures providing their own implementation of
array_index_mask_nospec() in asm/barrier.h, attempting to use WARN_ONCE() to
complain about out-of-range parameters using WARN_ON() results in a mess
of mutually-dependent include files.

Rather than unpick the dependencies, simply have the core code in nospec.h
perform the checking for us.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1517840166-15399-1-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:51 +01:00
Dan Williams be3233fbfc x86/speculation: Fix up array_index_nospec_mask() asm constraint
Allow the compiler to handle @size as an immediate value or memory
directly rather than allocating a register.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/151797010204.1289.1510000292250184993.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:50 +01:00
Peter Zijlstra 3b3a371cc9 x86/debug: Use UD2 for WARN()
Since the Intel SDM added an ModR/M byte to UD0 and binutils followed
that specification, we now cannot disassemble our kernel anymore.

This now means Intel and AMD disagree on the encoding of UD0. And instead
of playing games with additional bytes that are valid ModR/M and single
byte instructions (0xd6 for instance), simply use UD2 for both WARN() and
BUG().

Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180208194406.GD25181@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:50 +01:00
Josh Poimboeuf 2b5db66862 x86/debug, objtool: Annotate WARN()-related UD2 as reachable
By default, objtool assumes that a UD2 is a dead end.  This is mainly
because GCC 7+ sometimes inserts a UD2 when it detects a divide-by-zero
condition.

Now that WARN() is moving back to UD2, annotate the code after it as
reachable so objtool can follow the code flow.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kbuild test robot <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/0e483379275a42626ba8898117f918e1bf661e40.1518130694.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:49 +01:00
Josh Poimboeuf fe24e27128 objtool: Fix segfault in ignore_unreachable_insn()
Peter Zijlstra's patch for converting WARN() to use UD2 triggered a
bunch of false "unreachable instruction" warnings, which then triggered
a seg fault in ignore_unreachable_insn().

The seg fault happened when it tried to dereference a NULL 'insn->func'
pointer.  Thanks to static_cpu_has(), some functions can jump to a
non-function area in the .altinstr_aux section.  That breaks
ignore_unreachable_insn()'s assumption that it's always inside the
original function.

Make sure ignore_unreachable_insn() only follows jumps within the
current function.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kbuild test robot <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/bace77a60d5af9b45eddb8f8fb9c776c8de657ef.1518130694.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:49 +01:00
Dominik Brodowski 9279ddf23c selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
The ldt_gdt and ptrace_syscall selftests, even in their 64-bit variant, use
hard-coded 32-bit syscall numbers and call "int $0x80".

This will fail on 64-bit systems with CONFIG_IA32_EMULATION=y disabled.

Therefore, do not build these tests if we cannot build 32-bit binaries
(which should be a good approximation for CONFIG_IA32_EMULATION=y being enabled).

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kselftest@vger.kernel.org
Cc: shuah@kernel.org
Link: http://lkml.kernel.org/r/20180211111013.16888-6-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:48 +01:00
Dominik Brodowski 4105c69703 selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
On 64-bit builds, we should not rely on "int $0x80" working (it only does if
CONFIG_IA32_EMULATION=y is enabled). To keep the "Set TF and check int80"
test running on 64-bit installs with CONFIG_IA32_EMULATION=y enabled, build
this test only if we can also build 32-bit binaries (which should be a
good approximation for that).

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kselftest@vger.kernel.org
Cc: shuah@kernel.org
Link: http://lkml.kernel.org/r/20180211111013.16888-5-linux@dominikbrodowski.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:48 +01:00
Corentin Labbe c1e150ceb6 powerpc/pseries: Add empty update_numa_cpu_lookup_table() for NUMA=n
When CONFIG_NUMA is not set, the build fails with:

  arch/powerpc/platforms/pseries/hotplug-cpu.c:335:4:
  error: déclaration implicite de la fonction « update_numa_cpu_lookup_table »

So we have to add update_numa_cpu_lookup_table() as an empty function
when CONFIG_NUMA is not set.

Fixes: 1d9a090783 ("powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-15 10:10:02 +11:00
Nicholas Piggin e7bde88cdb powerpc/powernv: IMC fix out of bounds memory access at shutdown
The OPAL IMC driver's shutdown handler disables nest PMU counters by
walking nodes and taking the first CPU out of their cpumask, which is
used to index into the paca (get_hard_smp_processor_id()). This does
not always do the right thing, and in particular for CPU-less nodes it
returns NR_CPUS and that overruns the paca and dereferences random
memory.

Fix it by being more careful about checking returned CPU, and only
using online CPUs. It's not clear this shutdown code makes sense after
commit 885dcd709b ("powerpc/perf: Add nest IMC PMU support"), but this
should not make things worse

Currently the bug causes us to call OPAL with a junk CPU number. A
separate patch in development to change the way pacas are allocated
escalates this bug into a crash:

  Unable to handle kernel paging request for data at address 0x2a21af1eeb000076
  Faulting instruction address: 0xc0000000000a5468
  Oops: Kernel access of bad area, sig: 11 [#1]
  ...
  NIP opal_imc_counters_shutdown+0x148/0x1d0
  LR  opal_imc_counters_shutdown+0x134/0x1d0
  Call Trace:
   opal_imc_counters_shutdown+0x134/0x1d0 (unreliable)
   platform_drv_shutdown+0x44/0x60
   device_shutdown+0x1f8/0x350
   kernel_restart_prepare+0x54/0x70
   kernel_restart+0x28/0xc0
   SyS_reboot+0x1d0/0x2c0
   system_call+0x58/0x6c

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-15 09:54:45 +11:00
Cédric Le Goater 8e036c8d30 powerpc/xive: Use hw CPU ids when configuring the CPU queues
The CPU event notification queues on sPAPR should be configured using
a hardware CPU identifier.

The problem did not show up on the Power Hypervisor because pHyp
supports 8 threads per core which keeps CPU number contiguous. This is
not the case on all sPAPR virtual machines, some use SMT=1.

Also improve error logging by adding the CPU number.

Fixes: eac1e731b5 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-15 09:54:43 +11:00
Cyril Bur c134f0d57a powerpc: Expose TSCR via sysfs only on powernv
The TSCR can only be accessed in hypervisor mode.

Fixes: 88b5e12eeb11 ("powerpc: Expose TSCR via sysfs")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-15 09:54:42 +11:00
David S. Miller bdc8587ad7 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-02-14

This patch series enables the new mqprio hardware offload mechanism
creating traffic classes on VFs for XL710 devices. The parameters
needed to configure these traffic classes/queue channels are provides
by the user via the tc tool. A maximum of four traffic classes can be
created on each VF. This patch series also enables application of cloud
filters to each of these traffic classes. The cloud filters are applied
using the tc-flower classifier.

Example:
    1. tc qdisc add dev vf0 root mqprio num_tc 4 map 0 0 0 0 1 2 2 3\
        queues 2@0 2@2 1@4 1@5 hw 1 mode channel
    2. tc qdisc add dev vf0 ingress
    3. ethtool -K vf0 hw-tc-offload on
    4. ip link set eth0 vf 0 spoofchk off
    5. tc filter add dev vf0 protocol ip parent ffff: prio 1 flower dst_ip\
        192.168.3.5/32 ip_proto udp dst_port 25 skip_sw hw_tc 2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:44:05 -05:00
Tom Herbert dff8baa261 kcm: Call strp_stop before strp_done in kcm_attach
In kcm_attach strp_done is called when sk_user_data is already
set to fail the attach. strp_done needs the strp to be stopped and
warns if it isn't. Call strp_stop in this case to eliminate the
warning message.

Reported-by: syzbot+88dfb55e4c8b770d86e3@syzkaller.appspotmail.com
Fixes: e557124023 ("kcm: Check if sk_user_data already set in kcm_attach"
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:38:39 -05:00
Wadim Egorov 0e47079b6c net: phy: dp83867: Add documentation for CLK_OUT pin muxing
Add documentation of ti,clk-output-sel which can be used to select
a specific clock for CLK_OUT.

Signed-off-by: Wadim Egorov <w.egorov@phytec.de>
Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:33:43 -05:00
Wadim Egorov 9708fb630d net: phy: dp83867: Add binding for the CLK_OUT pin muxing option
The DP83867 has a muxing option for the CLK_OUT pin. It is possible
to set CLK_OUT for different channels.
Create a binding to select a specific clock for CLK_OUT pin.

Signed-off-by: Wadim Egorov <w.egorov@phytec.de>
Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:33:43 -05:00
Jon Maloy 37c64cf63b tipc: apply bearer link tolerance on running links
Currently, the default link tolerance set in struct tipc_bearer only
has effect on links going up after that moment. I.e., a user has to
reset all the node's links across that bearer to have the new value
applied. This is too limiting and disturbing on a running cluster to
be useful.

We now change this so that also already existing links are updated
dynamically, without any need for a reset, when the bearer value is
changed. We leverage the already existing per-link functionality
for this to achieve the wanted effect.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:22:24 -05:00
Boris Pismenny c410c1966f tls: getsockopt return record sequence number
Return the TLS record sequence number in getsockopt.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:05:19 -05:00
Boris Pismenny 257082e6ae tls: reset the crypto info if copy_from_user fails
copy_from_user could copy some partial information, as a result
TLS_CRYPTO_INFO_READY(crypto_info) could be true while crypto_info is
using uninitialzed data.

This patch resets crypto_info when copy_from_user fails.

fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:05:19 -05:00
Boris Pismenny a1dfa6812b tls: retrun the correct IV in getsockopt
Current code returns four bytes of salt followed by four bytes of IV.
This patch returns all eight bytes of IV.

fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:05:19 -05:00
David S. Miller a92ac140fc Merge branch 'cxgb4-speed-up-reading-on-chip-memory'
Rahul Lakkireddy says:

====================
cxgb4: speed up reading on-chip memory

This series of patches speed up reading on-chip memory (EDC and MC)
by reading 64-bits at a time.

Patch 1 reworks logic to read EDC and MC.

Patch 2 adds logic to read EDC and MC 64-bits at a time.

v2:
- Dropped AVX CPU intrinsic instructions.
- Use readq() to read 64-bits at a time.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:01:52 -05:00
Rahul Lakkireddy 7494f980ca cxgb4: speed up on-chip memory read
Use readq() (via t4_read_reg64()) to read 64-bits at a time.
Read residual in 32-bit multiples.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:01:51 -05:00
Rahul Lakkireddy 1a4330cdbf cxgb4: rework on-chip memory read
Rework logic to read EDC and MC. Do 32-bit reads at a time.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:01:51 -05:00
David S. Miller 8ace02073e Merge branch 'net-segmentation-offload-doc-fixes'
Daniel Axtens says:

====================
Updates to segmentation-offloads.txt

I've been trying to wrap my head around GSO for a while now. This is a
set of small changes to the docs that would probably have been helpful
when I was starting out.

I realise that GSO_DODGY is still a notable omission - I'm hesitant to
write too much on it just yet as I don't understand it well and I
think it's in the process of changing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:39 -05:00
Daniel Axtens a677088922 docs: segmentation-offloads.txt: add SCTP info
Most of this is extracted from 90017accff ("sctp: Add GSO support"),
with some extra text about GSO_BY_FRAGS and the need to check for it.

Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:39 -05:00
Daniel Axtens bc3c2431d4 docs: segmentation-offloads.txt: Fix ref to SKB_GSO_TUNNEL_REMCSUM
The doc originally called it SKB_GSO_REMCSUM. Fix it.

Fixes: f7a6272bf3 ("Documentation: Add documentation for TSO and GSO features")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:39 -05:00
Daniel Axtens a65820e695 docs: segmentation-offloads.txt: update for UFO depreciation
UFO is deprecated except for tuntap and packet per 0c19f846d5,
("net: accept UFO datagrams from tuntap and packet"). Update UFO
docs to reflect this.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:38 -05:00
David S. Miller 080fe7aa18 Merge branch 'tipc-locking-fixes'
Ying Xue says:

====================
tipc: Fix missing RTNL lock protection during setting link properties

At present it's unsafe to configure link properties through netlink
as the entire setting process is not under RTNL lock protection. Now
TIPC supports two different sets of netlink APIs at the same time, and
they share the same set of backend functions to configure bearer,
media and net properties. In order to solve the missing RTNL issue,
we have to make the whole __tipc_nl_compat_doit() protected by RTNL,
which means any function called within it cannot take RTNL any more.
So in the series we first introduce the following new functions which
doesn't hold RTNl lock:

 - __tipc_nl_bearer_disable()
 - __tipc_nl_bearer_enable()
 - __tipc_nl_bearer_set()
 - __tipc_nl_media_set()
 - __tipc_nl_net_set()

Meanwhile, __tipc_nl_compat_doit() has been reconstructed to minimize
the time of holding RTNL lock.

Changes in v4:
 - Per suggestion of Kirill Tkhai, divided original big one patch into
   seven small ones so that they can be easily reviewed.

Changes in v3:
 - Optimized return method of __tipc_nl_bearer_enable() regarding
   the comments from David M and Kirill Tkhai
 - Moved the allocations of memory in __tipc_nl_compat_doit() out
   of RTNL lock to minimize the time of holding RTNL lock according
   to the suggestion of Kirill Tkhai.

Changes in v2:
 - The whole operation of setting bearer/media properties has been
   protected under RTNL, as per feedback from David M.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:33 -05:00
Ying Xue ed4ffdfec2 tipc: Fix missing RTNL lock protection during setting link properties
Currently when user changes link properties, TIPC first checks if
user's command message contains media name or bearer name through
tipc_media_find() or tipc_bearer_find() which is protected by RTNL
lock. But when tipc_nl_compat_link_set() conducts the checking with
the two functions, it doesn't hold RTNL lock at all, as a result,
the following complaints were reported:

audit: type=1400 audit(1514679888.244:9): avc:  denied  { write } for
pid=3194 comm="syzkaller021477" path="socket:[11143]" dev="sockfs"
ino=11143 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tclass=netlink_generic_socket permissive=1
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

=============================
WARNING: suspicious RCU usage
4.15.0-rc5+ #152 Not tainted
-----------------------------
net/tipc/bearer.c:177 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syzkaller021477/3194:
  #0:  (cb_lock){++++}, at: [<00000000d20133ea>] genl_rcv+0x19/0x40
net/netlink/genetlink.c:634
  #1:  (genl_mutex){+.+.}, at: [<00000000fcc5d1bc>] genl_lock
net/netlink/genetlink.c:33 [inline]
  #1:  (genl_mutex){+.+.}, at: [<00000000fcc5d1bc>] genl_rcv_msg+0x115/0x140
net/netlink/genetlink.c:622

stack backtrace:
CPU: 1 PID: 3194 Comm: syzkaller021477 Not tainted 4.15.0-rc5+ #152
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
  tipc_bearer_find+0x2b4/0x3b0 net/tipc/bearer.c:177
  tipc_nl_compat_link_set+0x329/0x9f0 net/tipc/netlink_compat.c:729
  __tipc_nl_compat_doit net/tipc/netlink_compat.c:288 [inline]
  tipc_nl_compat_doit+0x15b/0x660 net/tipc/netlink_compat.c:335
  tipc_nl_compat_handle net/tipc/netlink_compat.c:1119 [inline]
  tipc_nl_compat_recv+0x112f/0x18f0 net/tipc/netlink_compat.c:1201
  genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:599
  genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
  netlink_rcv_skb+0x21e/0x460 net/netlink/af_netlink.c:2408
  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
  netlink_unicast_kernel net/netlink/af_netlink.c:1275 [inline]
  netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1301
  netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1864
  sock_sendmsg_nosec net/socket.c:636 [inline]
  sock_sendmsg+0xca/0x110 net/socket.c:646
  sock_write_iter+0x31a/0x5d0 net/socket.c:915
  call_write_iter include/linux/fs.h:1772 [inline]
  new_sync_write fs/read_write.c:469 [inline]
  __vfs_write+0x684/0x970 fs/read_write.c:482
  vfs_write+0x189/0x510 fs/read_write.c:544
  SYSC_write fs/read_write.c:589 [inline]
  SyS_write+0xef/0x220 fs/read_write.c:581
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129

In order to correct the mistake, __tipc_nl_compat_doit() has been
protected by RTNL lock, which means the whole operation of setting
bearer/media properties is under RTNL protection.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reported-by: syzbot <syzbot+6345fd433db009b29413@syzkaller.appspotmail.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:33 -05:00
Ying Xue 5631f65dec tipc: Introduce __tipc_nl_net_set
Introduce __tipc_nl_net_set() which doesn't hold RTNL lock.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:33 -05:00