1
0
Fork 0
Commit Graph

738181 Commits (dbc7886164e2d4d3e842cd818d9f8f2b28b13d5f)

Author SHA1 Message Date
Helge Deller 0ed1fe4ad3 parisc: Check if secondary CPUs want own PDC calls
The architecture specification says (for 64-bit systems): PDC is a per
processor resource, and operating system software must be prepared to
manage separate pointers to PDCE_PROC for each processor.  The address
of PDCE_PROC for the monarch processor is stored in the Page Zero
location MEM_PDC. The address of PDCE_PROC for each non-monarch
processor is passed in gr26 when PDCE_RESET invokes OS_RENDEZ.

Currently we still use one PDC for all CPUs, but in case we face a
machine which is following the specification let's warn about it.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-03-02 10:04:46 +01:00
Helge Deller fd8d0ca256 parisc: Hide virtual kernel memory layout
For security reasons do not expose the virtual kernel memory layout to
userspace.

Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # 4.15
Reviewed-by: Kees Cook <keescook@chromium.org>
2018-03-02 10:04:35 +01:00
John David Anglin 0adb24e03a parisc: Fix ordering of cache and TLB flushes
The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
SMP stalls.  The problem is some drivers call these routines with
interrupts disabled.  Interrupts need to be enabled for flush_tlb_all()
and flush_cache_all() to work.  This version adds checks to ensure
interrupts are not disabled before calling routines that need IPI
interrupts.  When interrupts are disabled, we now drop into slower code.

The attached change fixes the ordering of cache and TLB flushes in
several cases.  When we flush the cache using the existing PTE/TLB
entries, we need to flush the TLB after doing the cache flush.  We don't
need to do this when we flush the entire instruction and data caches as
these flushes don't use the existing TLB entries.  The same is true for
tmpalias region flushes.

The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
routines have been updated.

Secondly, we added a new purge_kernel_dcache_range_asm() routine to
pacache.S and use it in invalidate_kernel_vmap_range().  Nominally,
purges are faster than flushes as the cache lines don't have to be
written back to memory.

Hopefully, this is sufficient to resolve the remaining problems due to
cache speculation.  So far, testing indicates that this is the case.  I
did work up a patch using tmpalias flushes, but there is a performance
hit because we need the physical address for each page, and we also need
to sequence access to the tmpalias flush code.  This increases the
probability of stalls.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
2018-03-02 10:03:28 +01:00
Arvind Prasanna 1a90ce36c6 kconfig: Update ncurses package names for menuconfig
The package name is ncurses-devel for Redhat based distros
and libncurses-dev for Debian based distros.

Signed-off-by: Arvind Prasanna <arvindprasanna@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 09:20:57 +09:00
Cao jin cbf7a90e30 kbuild/kallsyms: trivial typo fix
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 09:20:56 +09:00
Masahiro Yamada 0da4fabdf4 kbuild: test --build-id linker flag by ld-option instead of cc-ldoption
'--build-id' is passed to $(LD), so it should be tested by 'ld-option'.

This seems a kind of misconversion when ld-option was renamed to
cc-ldoption.

Commit f86fd30660 ("kbuild: rename ld-option to cc-ldoption") renamed
all instances of 'ld-option' to 'cc-ldoption'.

Then, commit 691ef3e7fd ("kbuild: introduce ld-option") re-added
'ld-option' as a new implementation.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 09:20:56 +09:00
Cao jin a7b151fffb kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment
GCC_PLUGINS_CFLAGS is already in the environment, so it is superfluous
to add it in commandline of final build of init/.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 09:20:56 +09:00
Masahiro Yamada bf0bbdcf10 kconfig: Don't leak choice names during parsing
The named choice is not used in the kernel tree, but if it were used,
it would not be freed.

The intention of the named choice can be seen in the log of
commit 5a1aa8a1af ("kconfig: add named choice group").

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-02 09:20:55 +09:00
Masahiro Yamada 1b1e4ee86e sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE
If CONFIG_USE_BUILTIN_DTB is enabled, but CONFIG_BUILTIN_DTB_SOURCE
is empty (for example, allmodconfig), it fails to build, like this:

  make[2]: *** No rule to make target 'arch/sh/boot/dts/.dtb.o',
  needed by 'arch/sh/boot/dts/built-in.o'.  Stop.

Surround obj-y with ifneq ... endif.

I replaced $(CONFIG_USE_BUILTIN_DTB) with 'y' since this is always
the case from the following code from arch/sh/Makefile:

  core-$(CONFIG_USE_BUILTIN_DTB)  += arch/sh/boot/dts/

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 09:20:55 +09:00
Masahiro Yamada f4bc1eefc1 kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list
The 'defconfig_list' is a weird attribute.  If the '.config' is
missing, conf_read_simple() iterates over all visible defaults,
then it uses the first one for which fopen() succeeds.

config DEFCONFIG_LIST
	string
	depends on !UML
	option defconfig_list
	default "/lib/modules/$UNAME_RELEASE/.config"
	default "/etc/kernel-config"
	default "/boot/config-$UNAME_RELEASE"
	default "$ARCH_DEFCONFIG"
	default "arch/$ARCH/defconfig"

However, like other symbols, the first visible default is always
written out to the .config file.  This might be different from what
has been actually used.

For example, on my machine, the third one "/boot/config-$UNAME_RELEASE"
is opened, like follows:

  $ rm .config
  $ make oldconfig 2>/dev/null
  scripts/kconfig/conf  --oldconfig Kconfig
  #
  # using defaults found in /boot/config-4.4.0-112-generic
  #
  *
  * Restart config...
  *
  *
  * IRQ subsystem
  *
  Expose irq internals in debugfs (GENERIC_IRQ_DEBUGFS) [N/y/?] (NEW)

However, the resulted .config file contains the first one since it is
visible:

  $ grep CONFIG_DEFCONFIG_LIST .config
  CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

In order to stop confusing people, prevent this CONFIG option from
being written to the .config file.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-02 09:20:44 +09:00
Linus Torvalds 5d60e057d1 amdgpu, i915, virtio-gpu, nouveau, sun4i fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJamImYAAoJEAx081l5xIa+kp8P/3xR78pTqrhQLjNyRTHwq5AK
 4k8/SwfymW6LU/owkMTB00flS4fCpXzpdMC1fDW6GEfssmlFIf16D3gPsL3mipCG
 m+KXqyGJfQbTd3OcXxR3NuWx5JC0CCy64imkkJLLB8LjHFZV7d+tXvxFnCHI43e0
 aIkBbnD6WGh2v4A0MuuizFHLh24WO+L9fc87moeEiE/zzQ8Ug+q6NLpRu7ehFLI4
 QWQn0SivyPcv/YdmJxHhNnUgjMiZHuOMgUioJ8LDjvT5oDDNY/gYDN36PHwkRLsw
 t2kepc+Hbyb16XAyMcIyZAXGJaHUt8ujHLxAo8oh3QulHOmhii9Y1i27IyiBAYqV
 dZxB5DNdTm0cCy2UdumDRJU0HEqGBux3Yg4RwcnVa10PRp/haX6VpzLnlqAIeT6k
 u7j9eP/0x2BGt7QmYpdvIrSx9cPbFWLsHDCX6K8qwljUOoc8V/afI1vF8eYcfj+b
 iEHWPW6/4Pq9DL2h0jFpNCTTrw+I9nz6iLj5PQPXRTeSfiAT9eC9bvbLV7W2AOsm
 CR7VqkMbvUH7FqLP+p7HEdyaR8yQevvAq8vZp9HvOIr+abXeRdG4bogCx2mcJ7px
 mHe4VJOwHY5ACBXb/xTp38b3n9NixJLrOxswzyy/wmBWxmwh1W/EPcfU6vN1bVBL
 wp/M6WKkCPueeMrZ7mm2
 =o75L
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-for-v4.16-rc4' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Pretty much run of the mill drm fixes.

  amdgpu:
   - power management fixes
   - some display fixes
   - one ppc 32-bit dma fix

  i915:
   - two display fixes
   - three gem fixes

  sun4i:
   - display regression fixes

  nouveau:
   - display regression fix

  virtio-gpu:
   - dumb airlied ioctl fix"

* tag 'drm-fixes-for-v4.16-rc4' of git://people.freedesktop.org/~airlied/linux: (25 commits)
  drm/amdgpu: skip ECC for SRIOV in gmc late_init
  drm/amd/amdgpu: Correct VRAM width for APUs with GMC9
  drm/amdgpu: fix&cleanups for wb_clear
  drm/amdgpu: Correct sdma_v4 get_wptr(v2)
  drm/amd/powerplay: fix power over limit on Fiji
  drm/amdgpu:Fixed wrong emit frame size for enc
  drm/amdgpu: move WB_FREE to correct place
  drm/amdgpu: only flush hotplug work without DC
  drm/amd/display: check for ipp before calling cursor operations
  drm/i915: Make global seqno known in i915_gem_request_execute tracepoint
  drm/i915: Clear the in-use marker on execbuf failure
  drm/i915/cnl: Fix PORT_TX_DW5/7 register address
  drm/i915/audio: fix check for av_enc_map overflow
  drm/i915: Fix rsvd2 mask when out-fence is returned
  virtio-gpu: fix ioctl and expose the fixed status to userspace.
  drm/sun4i: Protect the TCON pixel clocks
  drm/sun4i: Enable the output on the pins (tcon0)
  drm/nouveau: prefer XBGR2101010 for addfb ioctl
  drm/radeon: insist on 32-bit DMA for Cedar on PPC64/PPC64LE
  drm/amd/display: VGA black screen from s3 when attached to hook
  ...
2018-03-01 15:56:15 -08:00
Linus Torvalds 2120447b5d ARC fixes for 4.16-rc4
- MCIP aka ARconnect fixes for SMP builds [Euginey]
 
  - Preventive fix for SLC (L2 cache) flushing [Euginey]
 
  - Kconfig default fix [Ulf Magnusson]
 
  - trailing semicolon fixes [Luis de Bethencourt]
 
  - other assorted minor fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJamHh7AAoJEGnX8d3iisJeM+MQAL+cIkdRd9NJoPPL66IOgwmi
 68jUkVQ8PSgR2P/sWvwIRTionOG9siu58Q1ygXpPR9SNnSJlyIYIW4Onuoajs+Ii
 TmTwWm8jnKtuPtKMBep8XpYQQ+FRL6sDNn0QLjCnnvqAeE7rTFODSlpBR+jL6ur4
 t2h1wHQem6oas1YRwnKrxxMnfV3KP3TkCS2a/lvmeAAt8xi+ll+9OnzVcSCj7pCJ
 ZToenfH3UAbhAd5T7gWkJv2v00zbHNzKtpdluSW6WBybP1Ib2IxOxEiUlAvxQgpl
 NctvS8Q1uveOPQZHO+fNXsJvf+imP2RdWh5RaOcmm8a4tp8jsR51BScX55usjr/z
 ybg4bzHBV3x9YbZkkW9DKyF9eeZ7hWST8nNoebcHNVjboUeD+wgtV8e3Fvc6bIoo
 /Xkw28y/mZwCs7mBfu1fNOIIiiZDSgz7YeeqzOPBzEYVcHE4VGINwX+9cLXPOsgz
 rmI/Md3buSjrfXPnXTuk1R4fl0WKM5FT/2BJ+IbNU2w6VO1/iKqiBQREH5lBfYAI
 BUikxKNwrJ9+zG8EXFUAEi6gn4dxosxMKJ04CTviwGFc1y6yNsmMgizevFpPUyx7
 9o3z19hHUZLx94QsGHLI7QJe3Kawamu0tmdJvQBgB/eZ9lj6OdkzY48I2lgGezSw
 00e+JPSy6EDi8YrlMki5
 =+Bmc
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - MCIP aka ARconnect fixes for SMP builds [Euginey]

 - preventive fix for SLC (L2 cache) flushing [Euginey]

 - Kconfig default fix [Ulf Magnusson]

 - trailing semicolon fixes [Luis de Bethencourt]

 - other assorted minor fixes

* tag 'arc-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: setup cpu possible mask according to possible-cpus dts property
  ARC: mcip: update MCIP debug mask when the new cpu came online
  ARC: mcip: halt GFRC counter when ARC cores halt
  ARCv2: boot log: fix HS48 release number
  arc: dts: use 'atmel' as manufacturer for at24 in axs10x_mb
  ARC: Fix malformed ARC_EMUL_UNALIGNED default
  ARC: boot log: Fix trailing semicolon
  ARC: dw2 unwind: Fix trailing semicolon
  ARC: Enable fatal signals on boot for dev platforms
  ARCv2: Don't pretend we may set L-bit in STATUS32 with kflag instruction
  ARCv2: cache: fix slc_entire_op: flush only instead of flush-n-inv
2018-03-01 14:32:23 -08:00
Radim Krčmář b7e31be385 KVM: x86: fix vcpu initialization with userspace lapic
Moving the code around broke this rare configuration.
Use this opportunity to finally call lapic reset from vcpu reset.

Reported-by: syzbot+fb7a33a4b6c35007a72b@syzkaller.appspotmail.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes: 0b2e9904c1 ("KVM: x86: move LAPIC initialization after VMCS creation")
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-01 22:32:45 +01:00
Wanpeng Li 518e7b9481 KVM: X86: Allow userspace to define the microcode version
Linux (among the others) has checks to make sure that certain features
aren't enabled on a certain family/model/stepping if the microcode version
isn't greater than or equal to a known good version.

By exposing the real microcode version, we're preventing buggy guests that
don't check that they are running virtualized (i.e., they should trust the
hypervisor) from disabling features that are effectively not buggy.

Suggested-by: Filippo Sironi <sironi@amazon.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-01 22:32:44 +01:00
Wanpeng Li 66421c1ec3 KVM: X86: Introduce kvm_get_msr_feature()
Introduce kvm_get_msr_feature() to handle the msrs which are supported
by different vendors and sharing the same emulation logic.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-01 22:32:44 +01:00
Linus Torvalds 8da5db7dda platform-drivers-x86 for v4.16-5
Fix a regression on laptops like Dell XPS 9360 where keyboard stopped
 working.
 
 Correct sysfs wakeup attribute after removal of some drivers to reflect
 that they are not able to wake system up anymore.
 
 The following is an automated git shortlog grouped by driver:
 
 intel-hid:
  -  Reset wakeup capable flag on removal
 
 intel-vbtn:
  -  Reset wakeup capable flag on removal
  -  Only activate tablet mode switch on 2-in-1's
 
 wmi:
  -  Fix misuse of vsprintf extension %pULL
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEhiZOUlnC9oKN3n3AmT3/83c5Sy0FAlqYQ3QACgkQmT3/83c5
 Sy3gUw//TnBvy5Ts4Wu4HZgtXp/qAaMKpIcZzuB8P4x+4jKvfqIQ3z9S1XqhLbT1
 aXflqVRoJmp6zLyQG4QuoVcKfEA/wRhBtvYGhzE3/DDrkr/3vu35guRGYUtkRfJZ
 bWz+JvPFcExh7y5rNxNnddC6l/UCj+Ptb2MNeASFzFXI4ggyCjJpiCjnteEInnJa
 jy42nLaNKR0oljWvnb9NJJJtbd91JoapY4e3jbs3UXEYnRLcAuZCy8qC7ls1BQ1y
 h23lPQZMp54KHuYCyQ2Huudg7pK8oH3qOZGN0vFYq/D8W13bq8Dx6QhXRK0gAprt
 pxH4gBTGUutck08YZua+8shMRU1oxrHQAwq/z7NC+IMd/joQOlmdUcHXQtCUF33Z
 6H7OkbtSrOJkRETPgelWxQ1bTtaqxd8CwHSevjBXqAIFHGAFH8hnjmQN/3e9Ynpz
 okKqkMut0WkCXF1sO/4tQIu13oJMrEfInZujn0t99kDH7ecMBlYqhxgHIzwnKDuS
 xCWGy/c2ejC/zndLktm/EyacCeFgKEhbGPHcM+mjtGeDvHpQ0huyBTS0CLuIEBQs
 9RDe+A28bV4Qi6B8GB6AzyShzwRx/8ogO8aOeDAtrjhMEWSTh5VcrgHe7NRUt0jC
 WGoLTe0NajpGVzf4XjzgdzeyCSgdN+LyH4nPlvi8AvNbrYtMAhc=
 =ezh3
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v4.16-5' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform drivers fixes from Andy Shevchenko:

 - fix a regression on laptops like Dell XPS 9360 where keyboard stopped
   working.

 - correct sysfs wakeup attribute after removal of some drivers to
   reflect that they are not able to wake system up anymore.

* tag 'platform-drivers-x86-v4.16-5' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: wmi: Fix misuse of vsprintf extension %pULL
  platform/x86: intel-hid: Reset wakeup capable flag on removal
  platform/x86: intel-vbtn: Reset wakeup capable flag on removal
  platform/x86: intel-vbtn: Only activate tablet mode switch on 2-in-1's
2018-03-01 10:50:01 -08:00
Linus Torvalds 7e30309968 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD bugfixes from Shaohua Li:

 - fix raid5-ppl flush request handling hang from Artur

 - fix a potential deadlock in raid5/10 reshape from BingJing

 - fix a deadlock for dm-raid from Heinz

 - fix two md-cluster of raid10 from Lidong and Guoqing

 - fix a NULL deference problem in device removal from Neil

 - fix a NULL deference problem in raid1/raid10 in specific condition
   from Yufen

 - other cleanup and fixes

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid1: fix NULL pointer dereference
  md: fix a potential deadlock of raid5/raid10 reshape
  md-cluster: choose correct label when clustered layout is not supported
  md: raid5: avoid string overflow warning
  raid5-ppl: fix handling flush requests
  md raid10: fix NULL deference in handle_write_completed()
  md: only allow remove_and_add_spares when no sync_thread running.
  md: document lifetime of internal rdev pointer.
  md: fix md_write_start() deadlock w/o metadata devices
  MD: Free bioset when md_run fails
  raid10: change the size of resync window for clustered raid
  md-multipath: Use seq_putc() in multipath_status()
  md/raid1: Fix trailing semicolon
  md/raid5: simplify uninitialization of shrinker
2018-03-01 10:08:47 -08:00
Linus Torvalds 7bec4a9646 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk fix from Petr Mladek:
 "Make sure that we wake up userspace loggers. This fixes a race
  introduced by the console waiter logic during this merge window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  printk: Wake klogd when passing console_lock owner
2018-03-01 10:06:39 -08:00
Joe Perches 1cedc6385d platform/x86: wmi: Fix misuse of vsprintf extension %pULL
%pULL doesn't officially exist but %pUL does.

Miscellanea:

o Add missing newlines to a couple logging messages

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-01 10:01:39 -08:00
Tom Lendacky d1d93fa90f KVM: SVM: Add MSR-based feature support for serializing LFENCE
In order to determine if LFENCE is a serializing instruction on AMD
processors, MSR 0xc0011029 (MSR_F10H_DECFG) must be read and the state
of bit 1 checked.  This patch will add support to allow a guest to
properly make this determination.

Add the MSR feature callback operation to svm.c and add MSR 0xc0011029
to the list of MSR-based features.  If LFENCE is serializing, then the
feature is supported, allowing the hypervisor to set the value of the
MSR that guest will see.  Support is also added to write (hypervisor only)
and read the MSR value for the guest.  A write by the guest will result in
a #GP.  A read by the guest will return the value as set by the host.  In
this way, the support to expose the feature to the guest is controlled by
the hypervisor.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-01 19:00:28 +01:00
Tom Lendacky 801e459a6f KVM: x86: Add a framework for supporting MSR-based features
Provide a new KVM capability that allows bits within MSRs to be recognized
as features.  Two new ioctls are added to the /dev/kvm ioctl routine to
retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
callback is used to determine support for the listed MSR-based features.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Tweaked documentation. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-01 19:00:28 +01:00
Linus Torvalds 16453c9cf8 sound fixes for 4.16-rc4
The only core change is the fix for possible memory corruption by
 ALSA ctl API since 4.14 kernel due to a thinko.  The rest are all
 device-specific: in addition to the usual suspects (HD-audio and
 USB-audio fixups), a few LPE HDMI audio fixes came in at this time.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAlqXtNsOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE86Lw//d+e0ikj+Mjb2hovsst6drAXP3WvgtTSzSA9i
 fKUjPYUErvCUK+jM7p0p06FOxkMR4/nBEUFSzjoE52LAMxYBMFtW5aZjewc4JA2X
 nhhO0YMqm6tvaVmYFuGl6MZ4SqN+428Nmn5NaEDu8RymQVLkQLS0dXz7h8Z9wHg0
 z3EaYAFtH28gV3eip5KM3yhfvLtcuMdXm9Z+qcl0XAtF04MU/Wj8/dDnYFOjYVXn
 DhlQdrH4nFeRKPWYhZMkCnfSbIY1JLkAC1W18bYTzBaxuwTSCL3I78qaeSfjiIZy
 8ld/bPq4FcY2HndLZzw78No4qn1r1TJR0A0VDZ6RsprbmXSB5PVgFFnq5PHUg/G5
 a4e8yn7cfsNjiZisu15LrAeS1+NbjVMQbo3y7APfSNmWJJLMOXoo5YxJJawoDLlP
 NvQzP/MtUupWWxjaLTBIFU58/gLJBsNk+YEq36Ggynecb6PwwmzUbSSk6XHHgqFy
 82gXfd7VLXOCs6Y6h/l1BD2uQdxE01ziW6jbSlJ+20nJaqodpFmVQvzzTakMMipi
 WpuC+SVJsjLdTBVyyOdQHv/SHYsv1nGUoC21q/5IdhjLF23uBKzM8hoDoUJE6mCa
 CadWCwHVdGjkRtsyJAGy6kjru2bHjnboxkYxX9tF+qbdi0HmLSJRSRjq1e/TTIsf
 atVBiuw=
 =UY/c
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "The only core change is the fix for possible memory corruption by ALSA
  ctl API since 4.14 kernel due to a thinko.

  The rest are all device-specific: in addition to the usual suspects
  (HD-audio and USB-audio fixups), a few LPE HDMI audio fixes came in at
  this time"

* tag 'sound-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: x86: Fix potential crash at error path
  ALSA: x86: Fix missing spinlock and mutex initializations
  ALSA: control: Fix memory corruption risk in snd_ctl_elem_read
  ALSA: hda - Fix pincfg at resume on Lenovo T470 dock
  ALSA: usb-audio: Add a quirck for B&W PX headphones
  ALSA: hda: Add a power_save blacklist
  ALSA: x86: hdmi: Add single_port option for compatible behavior
2018-03-01 08:31:23 -08:00
Linus Torvalds 44896cd1de Pin control fixes for v4.16:
- Fix a pin group on the Meson.
 - Assign maintainers for Freescale/NXP pin controllers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJal7MkAAoJEEEQszewGV1zfW8QAJbeMR/FZCmufgSaeHi1HuTQ
 Q6naMUT+23zaBjFwHarXjvlH2YxGqGDRu22fQxn44Ruxh4utCG9dkp800Si3Rzit
 FmIiKBBRw/MnjDmmWMq+SkpaAzseCXg97us6beFk9ZoO+/t/JrZenuVgk/XQtrv9
 7Fy98OvU0nXr366Jn2ZI8q4g57XpRbGX+2x/DIs68B7wOLBQpVl/OId4n+foyVmR
 VxRzGS9f6fnXK4BxSUGPzdcCdbRDLZIYxuLXJoAzQDAAEbby7/qKyPHeIsZYWBBW
 Jsn1lun/tcPVFFhYR6Fo5w/45UW7+rjfm4MFR3zPtZfEusflMd9byBp7+PPoV1X4
 CQ4TB3yW5+0I/xjZVPgU6UcbnZWTUGeFy8LZ4NIzTb9I8hxGrkoVlHZtft5syFtD
 kZViwRG5a1txWH+9/5yJnLg0KKS3+rh2Knn/GJTV4NdJcj/18Ye4NEueXIcJzYlX
 947aSnBJpMfOX0I6tBg0VW8tb2js2ROHLg5sOMbGAaCyr3KfZkOnCze7lWSV0VmS
 v8K9zBtUjMFPddXG6O4EvwYOax0W5bnFfJ6MmcEnGxCKdEtUShaJNRiyPzGYei0O
 s/znLxQmhV4IAKt6AQ/bLlKeNpvoeC7THBBjhjKhn5j+QNhBqtUVvVrUh/tv2DEQ
 t0XrzQMKSYcbfiPKQMI9
 =jVcD
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Two smallish pin control fixes: one actual code fix for the Meson and
  a MAINTAINERS update.

  Summary:

   - fix a pin group on the Meson

   - assign maintainers for Freescale/NXP pin controllers"

* tag 'pinctrl-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  MAINTAINERS: add Freescale pin controllers
  pinctrl: meson-axg: adjust uart_ao_b pin group naming
2018-03-01 08:19:10 -08:00
Linus Torvalds f902a778df GPIO fixes for v4.16:
- Fix up device tree properties readout caused by my own
   refactorings.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJal7CpAAoJEEEQszewGV1zij8QALoTOgsX0x1RHRzJBPIVcoGL
 QBQJKBFm1dKwAHNrhgzY9rF1LNnsBN0LQ00PUiKJSPbsc8aa9WZw9XbhtXx1ICRK
 uEOytwAwkBNd//O1en3ohteg5ihszmG/mr0DwGF62AM1wstmy+4ewciLFxBIC/ue
 a9Wf1k1abDriqFjmwQS/TWN5YmaHOCKtrt6eSNE46YkKAEGFsPEfe64P08kg6Y1m
 UBxLJg8HBUMVl2lA6UQwrnrCrY4PQR6Jl3j5qaY3ENMtnr3EYFvQenwzHunDWu51
 dR43gg+isjVsANomB1+yrwB6yS7Xhrz/uQwkdeMR9Gq06uq39B82/L96jHr9Xpvk
 OjKLOP2j/hoAQ+P6eoVGZcm522F6itpavaCdV/78wm6BEetvZWavXsREyb8D8i/i
 8Xmc65Gn2+tOTBbLp5QDHabxhJnSR5xc3d2q5rDx2RjZBS4H+OZvvoYm1jgo2gOg
 y9ccAaaLMitQ3dBpuPvidn159nBC3hcZIiGXaCXJKdclxQptIcMCaO4cZuxIF1bU
 4gFo/cn4cr/TAL929UZFtgqfkAxdLaiJXQs9mzLC0EqWy89aaB7XFxI22JlYlCFH
 pLJEpEqY3b6rTpIrNaH5Agbwk+BkamMSYx7jMuK+cw/8AbKSc78KVIJfUGZomXUb
 B9qXuyzuaS3Bg8kFYByX
 =XcVM
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Fix up device tree properties readout caused by my own refactorings"

* tag 'gpio-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: Handle deferred probing in of_find_gpio() properly
  gpiolib: Keep returning EPROBE_DEFER when we should
2018-03-01 08:17:01 -08:00
Jiufei Xue 158e61865a block: fix a typo
Fix a typo in pkt_start_recovery.

Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-01 08:41:27 -07:00
Jiufei Xue 9c0fb1e313 block: display the correct diskname for bio
bio_devname use __bdevname to display the device name, and can
only show the major and minor of the part0,
Fix this by using disk_name to display the correct name.

Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-01 08:41:25 -07:00
Jiufei Xue 7c5a0dcf55 block: fix the count of PGPGOUT for WRITE_SAME
The vm counters is counted in sectors, so we should do the conversation
in submit_bio.

Fixes: 74d46992e0 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Cc: stable@vger.kernel.org
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-01 08:41:23 -07:00
Chengguang Xu 1c78924957 ceph: fix potential memory leak in init_caches()
There is lack of cache destroy operation for ceph_file_cachep
when failing from fscache register.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-03-01 16:39:47 +01:00
Damien Le Moal f3bc78d2d4 mq-deadline: Make sure to always unlock zones
In case of a failed write request (all retries failed) and when using
libata, the SCSI error handler calls scsi_finish_command(). In the
case of blk-mq this means that scsi_mq_done() does not get called,
that blk_mq_complete_request() does not get called and also that the
mq-deadline .completed_request() method is not called. This results in
the target zone of the failed write request being left in a locked
state, preventing that any new write requests are issued to the same
zone.

Fix this by replacing the .completed_request() method with the
.finish_request() method as this method is always called whether or
not a request completes successfully. Since the .finish_request()
method is only called by the blk-mq core if a .prepare_request()
method exists, add a dummy .prepare_request() method.

Fixes: 5700f69178 ("mq-deadline: Introduce zone locking support")
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
[ bvanassche: edited patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-01 08:39:24 -07:00
Masahiro Yamada cd81fc82b9 kconfig: add xstrdup() helper
We already have xmalloc(), xcalloc(), and xrealloc(().  Add xstrdup()
as well to save tedious error handling.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 00:26:47 +09:00
Luc Van Oostenryck 6c49f359ca kbuild: disable sparse warnings about unknown attributes
Currently, sparse issues warnings on code using an attribute
it doesn't know about.

One of the problem with this is that these warnings have no
value for the developer, it's just noise for him. At best these
warnings tell something about some deficiencies of sparse itself
but not about a potential problem with code analyzed.

A second problem with this is that sparse release are, alas,
less frequent than new attributes are added to GCC.

So, avoid the noise by asking sparse to not warn about
attributes it doesn't know about.

Reference: https://marc.info/?l=linux-sparse&m=151871600016790
Reference: https://marc.info/?l=linux-sparse&m=151871725417322
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 00:26:46 +09:00
Ulf Magnusson 61277981dd Makefile: Fix lying comment re. silentoldconfig
The comment above the silentoldconfig invocation is outdated.
'make oldconfig' updates just .config and doesn't touch the
include/config/ tree.

This came up in https://lkml.org/lkml/2018/2/12/415.

While fixing the comment, make it more informative by explaining the
purpose of the unfortunately named silentoldconfig.

I can't make sense of the comment re. auto.conf.cmd and a cleaned tree.
include/config/auto.conf and include/config/auto.conf.cmd are both
created simultaneously by silentoldconfig (in
scripts/kconfig/confdata.c, by conf_write_autoconf()), and nothing seems
to remove auto.conf.cmd that wouldn't remove auto.conf. Remove that part
of the comment rather than blindly copying it. It might be a leftover
from an older way of doing things.

The include/config/auto.conf.cmd prerequisite might be there to ensure
that silentoldconfig gets rerun if conf_write_autoconf() fails between
writing out auto.conf.cmd and auto.conf (a comment in the function
indicates that auto.conf is deliberately written out last to mark
completion of the operation). It seems the Makefile dependency between
include/config/auto.conf and .config would already take care of that
though, since include/config/auto.conf would still be out of date re.
.config if the operation fails.

Cop out and leave the prerequisite in for now.

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-02 00:26:46 +09:00
Filipe Manana 1f250e929a Btrfs: fix log replay failure after unlink and link combination
If we have a file with 2 (or more) hard links in the same directory,
remove one of the hard links, create a new file (or link an existing file)
in the same directory with the name of the removed hard link, and then
finally fsync the new file, we end up with a log that fails to replay,
causing a mount failure.

Example:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/testdir
  $ touch /mnt/testdir/foo
  $ ln /mnt/testdir/foo /mnt/testdir/bar

  $ sync

  $ unlink /mnt/testdir/bar
  $ touch /mnt/testdir/bar
  $ xfs_io -c "fsync" /mnt/testdir/bar

  <power failure>

  $ mount /dev/sdb /mnt
  mount: mount(2) failed: /mnt: No such file or directory

When replaying the log, for that example, we also see the following in
dmesg/syslog:

  [71813.671307] BTRFS info (device dm-0): failed to delete reference to bar, inode 258 parent 257
  [71813.674204] ------------[ cut here ]------------
  [71813.675694] BTRFS: Transaction aborted (error -2)
  [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 __btrfs_unlink_inode+0x17b/0x355 [btrfs]
  [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last unloaded: btrfs]
  [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: G        W        4.15.0-rc9-btrfs-next-56+ #1
  [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
  [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs]
  [71813.679669] RSP: 0018:ffffc90001cef738 EFLAGS: 00010286
  [71813.679669] RAX: 0000000000000025 RBX: ffff880217ce4708 RCX: 0000000000000001
  [71813.679669] RDX: 0000000000000000 RSI: ffffffff81c14bae RDI: 00000000ffffffff
  [71813.679669] RBP: ffffc90001cef7c0 R08: 0000000000000001 R09: 0000000000000001
  [71813.679669] R10: ffffc90001cef5e0 R11: ffffffff8343f007 R12: ffff880217d474c8
  [71813.679669] R13: 00000000fffffffe R14: ffff88021ccf1548 R15: 0000000000000101
  [71813.679669] FS:  00007f7cee84c480(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
  [71813.679669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [71813.679669] CR2: 00007f7cedc1abf9 CR3: 00000002354b4003 CR4: 00000000001606e0
  [71813.679669] Call Trace:
  [71813.679669]  btrfs_unlink_inode+0x17/0x41 [btrfs]
  [71813.679669]  drop_one_dir_item+0xfa/0x131 [btrfs]
  [71813.679669]  add_inode_ref+0x71e/0x851 [btrfs]
  [71813.679669]  ? __lock_is_held+0x39/0x71
  [71813.679669]  ? replay_one_buffer+0x53/0x53a [btrfs]
  [71813.679669]  replay_one_buffer+0x4a4/0x53a [btrfs]
  [71813.679669]  ? rcu_read_unlock+0x3a/0x57
  [71813.679669]  ? __lock_is_held+0x39/0x71
  [71813.679669]  walk_up_log_tree+0x101/0x1d2 [btrfs]
  [71813.679669]  walk_log_tree+0xad/0x188 [btrfs]
  [71813.679669]  btrfs_recover_log_trees+0x1fa/0x31e [btrfs]
  [71813.679669]  ? replay_one_extent+0x544/0x544 [btrfs]
  [71813.679669]  open_ctree+0x1cf6/0x2209 [btrfs]
  [71813.679669]  btrfs_mount_root+0x368/0x482 [btrfs]
  [71813.679669]  ? trace_hardirqs_on_caller+0x14c/0x1a6
  [71813.679669]  ? __lockdep_init_map+0x176/0x1c2
  [71813.679669]  ? mount_fs+0x64/0x10b
  [71813.679669]  mount_fs+0x64/0x10b
  [71813.679669]  vfs_kern_mount+0x68/0xce
  [71813.679669]  btrfs_mount+0x13e/0x772 [btrfs]
  [71813.679669]  ? trace_hardirqs_on_caller+0x14c/0x1a6
  [71813.679669]  ? __lockdep_init_map+0x176/0x1c2
  [71813.679669]  ? mount_fs+0x64/0x10b
  [71813.679669]  mount_fs+0x64/0x10b
  [71813.679669]  vfs_kern_mount+0x68/0xce
  [71813.679669]  do_mount+0x6e5/0x973
  [71813.679669]  ? memdup_user+0x3e/0x5c
  [71813.679669]  SyS_mount+0x72/0x98
  [71813.679669]  entry_SYSCALL_64_fastpath+0x1e/0x8b
  [71813.679669] RIP: 0033:0x7f7cedf150ba
  [71813.679669] RSP: 002b:00007ffca71da688 EFLAGS: 00000206
  [71813.679669] Code: 7f a0 e8 51 0c fd ff 48 8b 43 50 f0 0f ba a8 30 2c 00 00 02 72 17 41 83 fd fb 74 11 44 89 ee 48 c7 c7 7d 11 7f a0 e8 38 f5 8d e0 <0f> ff 44 89 e9 ba 20 10 00 00 eb 4d 48 8b 4d b0 48 8b 75 88 4c
  [71813.679669] ---[ end trace 83bd473fc5b4663b ]---
  [71813.854764] BTRFS: error (device dm-0) in __btrfs_unlink_inode:4128: errno=-2 No such entry
  [71813.886994] BTRFS: error (device dm-0) in btrfs_replay_log:2307: errno=-2 No such entry (Failed to recover log tree)
  [71813.903357] BTRFS error (device dm-0): cleaner transaction attach returned -30
  [71814.128078] BTRFS error (device dm-0): open_ctree failed

This happens because the log has inode reference items for both inode 258
(the first file we created) and inode 259 (the second file created), and
when processing the reference item for inode 258, we replace the
corresponding item in the subvolume tree (which has two names, "foo" and
"bar") witht he one in the log (which only has one name, "foo") without
removing the corresponding dir index keys from the parent directory.
Later, when processing the inode reference item for inode 259, which has
a name of "bar" associated to it, we notice that dir index entries exist
for that name and for a different inode, so we attempt to unlink that
name, which fails because the inode reference item for inode 258 no longer
has the name "bar" associated to it, making a call to btrfs_unlink_inode()
fail with a -ENOENT error.

Fix this by unlinking all the names in an inode reference item from a
subvolume tree that are not present in the inode reference item found in
the log tree, before overwriting it with the item from the log tree.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:18:40 +01:00
Filipe Manana 9a6509c4da Btrfs: fix log replay failure after linking special file and fsync
If in the same transaction we rename a special file (fifo, character/block
device or symbolic link), create a hard link for it having its old name
then sync the log, we will end up with a log that can not be replayed and
at when attempting to replay it, an EEXIST error is returned and mounting
the filesystem fails. Example scenario:

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt
  $ mkdir /mnt/testdir
  $ mkfifo /mnt/testdir/foo
  # Make sure everything done so far is durably persisted.
  $ sync

  # Create some unrelated file and fsync it, this is just to create a log
  # tree. The file must be in the same directory as our special file.
  $ touch /mnt/testdir/f1
  $ xfs_io -c "fsync" /mnt/testdir/f1

  # Rename our special file and then create a hard link with its old name.
  $ mv /mnt/testdir/foo /mnt/testdir/bar
  $ ln /mnt/testdir/bar /mnt/testdir/foo

  # Create some other unrelated file and fsync it, this is just to persist
  # the log tree which was modified by the previous rename and link
  # operations. Alternatively we could have modified file f1 and fsync it.
  $ touch /mnt/f2
  $ xfs_io -c "fsync" /mnt/f2

  <power failure>

  $ mount /dev/sdc /mnt
  mount: mount /dev/sdc on /mnt failed: File exists

This happens because when both the log tree and the subvolume's tree have
an entry in the directory "testdir" with the same name, that is, there
is one key (258 INODE_REF 257) in the subvolume tree and another one in
the log tree (where 258 is the inode number of our special file and 257
is the inode for directory "testdir"). Only the data of those two keys
differs, in the subvolume tree the index field for inode reference has
a value of 3 while the log tree it has a value of 5. Because the same key
exists in both trees, but have different index, the log replay fails with
an -EEXIST error when attempting to replay the inode reference from the
log tree.

Fix this by setting the last_unlink_trans field of the inode (our special
file) to the current transaction id when a hard link is created, as this
forces logging the parent directory inode, solving the conflict at log
replay time.

A new generic test case for fstests was also submitted.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:18:34 +01:00
Filipe Manana d4dfc0f4d3 Btrfs: send, fix issuing write op when processing hole in no data mode
When doing an incremental send of a filesystem with the no-holes feature
enabled, we end up issuing a write operation when using the no data mode
send flag, instead of issuing an update extent operation. Fix this by
issuing the update extent operation instead.

Trivial reproducer:

  $ mkfs.btrfs -f -O no-holes /dev/sdc
  $ mkfs.btrfs -f /dev/sdd
  $ mount /dev/sdc /mnt/sdc
  $ mount /dev/sdd /mnt/sdd

  $ xfs_io -f -c "pwrite -S 0xab 0 32K" /mnt/sdc/foobar
  $ btrfs subvolume snapshot -r /mnt/sdc /mnt/sdc/snap1

  $ xfs_io -c "fpunch 8K 8K" /mnt/sdc/foobar
  $ btrfs subvolume snapshot -r /mnt/sdc /mnt/sdc/snap2

  $ btrfs send /mnt/sdc/snap1 | btrfs receive /mnt/sdd
  $ btrfs send --no-data -p /mnt/sdc/snap1 /mnt/sdc/snap2 \
       | btrfs receive -vv /mnt/sdd

Before this change the output of the second receive command is:

  receiving snapshot snap2 uuid=f6922049-8c22-e544-9ff9-fc6755918447...
  utimes
  write foobar, offset 8192, len 8192
  utimes foobar
  BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=f6922049-8c22-e544-9ff9-...

After this change it is:

  receiving snapshot snap2 uuid=564d36a3-ebc8-7343-aec9-bf6fda278e64...
  utimes
  update_extent foobar: offset=8192, len=8192
  utimes foobar
  BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=564d36a3-ebc8-7343-aec9-bf6fda278e64...

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:18:07 +01:00
Anand Jain 3c181c12c4 btrfs: use proper endianness accessors for super_copy
The fs_info::super_copy is a byte copy of the on-disk structure and all
members must use the accessor macros/functions to obtain the right
value.  This was missing in update_super_roots and in sysfs readers.

Moving between opposite endianness hosts will report bogus numbers in
sysfs, and mount may fail as the root will not be restored correctly. If
the filesystem is always used on a same endian host, this will not be a
problem.

Fix this by using the btrfs_set_super...() functions to set
fs_info::super_copy values, and for the sysfs, use the cached
fs_info::nodesize/sectorsize values.

CC: stable@vger.kernel.org
Fixes: df93589a17 ("btrfs: export more from FS_INFO to sysfs")
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:17:27 +01:00
Hans van Kranenburg 92e222df7b btrfs: alloc_chunk: fix DUP stripe size handling
In case of using DUP, we search for enough unallocated disk space on a
device to hold two stripes.

The devices_info[ndevs-1].max_avail that holds the amount of unallocated
space found is directly assigned to stripe_size, while it's actually
twice the stripe size.

Later on in the code, an unconditional division of stripe_size by
dev_stripes corrects the value, but in the meantime there's a check to
see if the stripe_size does not exceed max_chunk_size. Since during this
check stripe_size is twice the amount as intended, the check will reduce
the stripe_size to max_chunk_size if the actual correct to be used
stripe_size is more than half the amount of max_chunk_size.

The unconditional division later tries to correct stripe_size, but will
actually make sure we can't allocate more than half the max_chunk_size.

Fix this by moving the division by dev_stripes before the max chunk size
check, so it always contains the right value, instead of putting a duct
tape division in further on to get it fixed again.

Since in all other cases than DUP, dev_stripes is 1, this change only
affects DUP.

Other attempts in the past were made to fix this:
* 37db63a400 "Btrfs: fix max chunk size check in chunk allocator" tried
to fix the same problem, but still resulted in part of the code acting
on a wrongly doubled stripe_size value.
* 86db25785a "Btrfs: fix max chunk size on raid5/6" unintentionally
broke this fix again.

The real problem was already introduced with the rest of the code in
73c5de0051.

The user visible result however will be that the max chunk size for DUP
will suddenly double, while it's actually acting according to the limits
in the code again like it was 5 years ago.

Reported-by: Naohiro Aota <naohiro.aota@wdc.com>
Link: https://www.spinics.net/lists/linux-btrfs/msg69752.html
Fixes: 73c5de0051 ("btrfs: quasi-round-robin for chunk allocation")
Fixes: 86db25785a ("Btrfs: fix max chunk size on raid5/6")
Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:16:47 +01:00
Nikolay Borisov 765f3cebff btrfs: Handle btrfs_set_extent_delalloc failure in relocate_file_extent_cluster
Essentially duplicate the error handling from the above block which
handles the !PageUptodate(page) case and additionally clear
EXTENT_BOUNDARY.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:16:12 +01:00
Nikolay Borisov ac01f26a27 btrfs: handle failure of add_pending_csums
add_pending_csums was added as part of the new data=ordered
implementation in e6dcd2dc9c ("Btrfs: New data=ordered
implementation"). Even back then it called the btrfs_csum_file_blocks
which can fail but it never bothered handling the failure. In ENOMEM
situation this could lead to the filesystem failing to write the
checksums for a particular extent and not detect this. On read this
could lead to the filesystem erroring out due to crc mismatch. Fix it by
propagating failure from add_pending_csums and handling them.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:16:00 +01:00
Jeff Mahoney a8fd1f7174 btrfs: use kvzalloc to allocate btrfs_fs_info
The srcu_struct in btrfs_fs_info scales in size with NR_CPUS.  On
kernels built with NR_CPUS=8192, this can result in kmalloc failures
that prevent mounting.

There is work in progress to try to resolve this for every user of
srcu_struct but using kvzalloc will work around the failures until
that is complete.

As an example with NR_CPUS=512 on x86_64: the overall size of
subvol_srcu is 3460 bytes, fs_info is 6496.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-01 16:15:36 +01:00
Rafael J. Wysocki 38c08aa54b platform/x86: intel-hid: Reset wakeup capable flag on removal
The intel-hid device will not be able to wake up the system any more
after removing the notify handler provided by its driver, so make
its sysfs attributes reflect that.

Fixes: ef884112e5 (platform: x86: intel-hid: Wake up the system from suspend-to-idle)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-03-01 13:08:25 +02:00
Rafael J. Wysocki b758dbd576 platform/x86: intel-vbtn: Reset wakeup capable flag on removal
The intel-vbtn device will not be able to wake up the system any more
after removing the notify handler provided by its driver, so make
its sysfs attributes reflect that.

Fixes: 91f9e850d4 (platform: x86: intel-vbtn: Wake up the system from suspend-to-idle)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-03-01 13:08:25 +02:00
Thomas Gleixner 945fd17ab6 x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table
The separation of the cpu_entry_area from the fixmap missed the fact that
on 32bit non-PAE kernels the cpu_entry_area mapping might not be covered in
initial_page_table by the previous synchronizations.

This results in suspend/resume failures because 32bit utilizes initial page
table for resume. The absence of the cpu_entry_area mapping results in a
triple fault, aka. insta reboot.

With PAE enabled this works by chance because the PGD entry which covers
the fixmap and other parts incindentally provides the cpu_entry_area
mapping as well.

Synchronize the initial page table after setting up the cpu entry
area. Instead of adding yet another copy of the same code, move it to a
function and invoke it from the various places.

It needs to be investigated if the existing calls in setup_arch() and
setup_per_cpu_areas() can be replaced by the later invocation from
setup_cpu_entry_areas(), but that's beyond the scope of this fix.

Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Cc: William Grant <william.grant@canonical.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1802282137290.1392@nanos.tec.linutronix.de
2018-03-01 09:48:27 +01:00
Stefano Stabellini d811bcee1f pvcalls-front: 64-bit align flags
We are using test_and_* operations on the status and flag fields of
struct sock_mapping. However, these functions require the operand to be
64-bit aligned on arm64. Currently, only status is 64-bit aligned.

Make status and flags explicitly 64-bit aligned.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2018-03-01 07:23:36 +01:00
Dave Airlie 93dfdf9fde Merge branch 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few misc fixes for 4.16.

* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: skip ECC for SRIOV in gmc late_init
  drm/amd/amdgpu: Correct VRAM width for APUs with GMC9
  drm/amdgpu: fix&cleanups for wb_clear
  drm/amdgpu: Correct sdma_v4 get_wptr(v2)
  drm/amd/powerplay: fix power over limit on Fiji
  drm/amdgpu:Fixed wrong emit frame size for enc
  drm/amdgpu: move WB_FREE to correct place
  drm/amdgpu: only flush hotplug work without DC
  drm/amd/display: check for ipp before calling cursor operations
2018-03-01 14:03:14 +10:00
Dave Airlie 2679b96ae4 Two regression fixes here: a fb format regression on nouveau and a 4.16-rc1
regression with on LVDS with one sun4i device. Plus a sun4i and  a virtio-gpu
 fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJalr4FAAoJEEN0HIUfOBk0kfwQAJaUiAihz3aahzRAlbN+9u94
 mPWkjr24pQg6BJ92mO/y3z81IIfXrJqNFnw3CU6CYTQxhqA5gp7LqlWfaH9Vnyut
 SMVqLVzqc+JPHoh2HKaA0XqM2lpJtWTIhDhWklN3tU3hr+mIHAnExVyX5okSu09l
 irah/n4ZWAfF6VBOFN/af9u/pS9DRhX1O619DcZchTFMVvGDScJAvyVUnJDa5BZK
 Oix2JQxyuUBxCXcKg+pTwcWaegcqKD05s4cmM2E7vmsES+LSeN6+StEmtjK6zNZC
 cEkzGsRI0KIU+Dc2tIukUK+A1ez0LDrOog6llYBA4ZqveBS261cgpTbjdzynb5cx
 xxcxt0O0+CwO24BV71ZTLRX0Hqxb6d5csTzFgFtAUU+iU7aNfYN8NKtHxIdTZxcT
 pcIAurFk+qiHVEV716hiLNBFvObAZMSJtyKJ0EWp3L70yFTXzU9eeUuk0r7CE4Jh
 mZDy2t2Z9J1uyXPv+lEfwhjwMEN/+BdPUmnAFZbrPZlqTZABMsTgEadrIXasRWcr
 3uE0q7SMWIPUlZioUWnXxnPVLqi5EjUNHtAYEzBupy2cf3eMB6WrHr7q1Blqe2JV
 r3gnJReAUQTlYVF0SWpT3EJfFgJaVytF7ZS5N4K5KRH0GMsTCDj2ivRfxIk1ujcW
 RIJV0IfBx8m8qAVK58gn
 =57h5
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2018-02-28' of git://people.freedesktop.org/drm-misc into drm-fixes

Two regression fixes here: a fb format regression on nouveau and a 4.16-rc1
regression with on LVDS with one sun4i device. Plus a sun4i and  a virtio-gpu
fixes.

* tag 'drm-misc-fixes-2018-02-28' of git://people.freedesktop.org/drm-misc:
  virtio-gpu: fix ioctl and expose the fixed status to userspace.
  drm/sun4i: Protect the TCON pixel clocks
  drm/sun4i: Enable the output on the pins (tcon0)
  drm/nouveau: prefer XBGR2101010 for addfb ioctl
2018-03-01 14:02:32 +10:00
Dave Airlie 4757d972d9 - 2 display fixes: audio av_enc_map overflow check, and Cannonlake PLL related register offset.
- 3 gem fixes: Clear for in-fence out-fence, fix for clearing exec_flags on execbuf failure, and add back global seqno to tracepoints that had been removed recently by other fence related patch.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJal1J8AAoJEPpiX2QO6xPKQyIH/2rk42oWKjLJboCErj8WkY6B
 0Pdt3WghQ8VJPBnAeCvXx+QHf1sKKYEKGNhTTuR2Gfd8V8L7CcK3XUYWkt3zhDS3
 b2KW3bpQm3k+lMS55uH7dqXpzustNoslKEWkyT/TGcubDIzWGeKujFxdfkf131+R
 /iG5h5eU8ryPaJ8Jv2DwPsWdnT+DGC01e+jtTdcAfWwyPC/glDKWpNTQTaUz0WZM
 7lK8o5ta5/CSDFgmzK50amL0gAbHg4g8IhcRFlRf6lupTMJm0GypIrDKM5HmPRuw
 RQZA2k8cLsSbFjBc0Mg6ykwyA6T2Jf1T+ymKFAIz8tbBQVeIL6+1y2PE3JDgA7Y=
 =axz/
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-fixes-2018-02-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- 2 display fixes: audio av_enc_map overflow check, and Cannonlake PLL related register offset.
- 3 gem fixes: Clear for in-fence out-fence, fix for clearing exec_flags on execbuf failure, and add back global seqno to tracepoints that had been removed recently by other fence related patch.

* tag 'drm-intel-fixes-2018-02-28' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915: Make global seqno known in i915_gem_request_execute tracepoint
  drm/i915: Clear the in-use marker on execbuf failure
  drm/i915/cnl: Fix PORT_TX_DW5/7 register address
  drm/i915/audio: fix check for av_enc_map overflow
  drm/i915: Fix rsvd2 mask when out-fence is returned
2018-03-01 13:59:21 +10:00
Linus Torvalds 97ace515f0 ARM: SoC fixes for 4.16
This is the first set of bugfixes for ARM SoCs, fixing a couple
 of stability problems, mostly on TI OMAP and Rockchips platforms:
 
 - OMAP2 hwmod clocks must be enabled in the correct order
 
 - OMAP3 Wakeup from resume through PRM IRQ was unreliable
 
 - One regression on OMAP5 caused by a kexec fix
 
 - Rockchip ethernet needs some settings for stable operation on Rock64
 
 - Rockchip based Chrombook Plus needs another clock setting for
   stable display suspend/resume
 
 - Rockchip based phyCORE-RK3288 was able to run at an invalid
   CPU clock frequency
 
 - Rockchip MMC link was sometimes unreliable
 
 - Multiple fixes to avoid crashes in the Broadcom STB DPFE driver
 
 Other minor changes include:
 
 - Devicetree fixes for incorrect hardware description (rockchip,
   omap, Gemini, amlogic)
 
 - Some MAINTAINER file updates to correct email and git addresses
 
 - Some fixes addressing 'make W=1' dtc warnings (broadcom, amlogic,
   cavium, qualcomm, hisilicon, zx)
 
 - Fixes for LTO-compilation (orion, davinci, clps711x)
 
 - One fix for an incorrect Kconfig errata selection
 
 - A memory leak in the OMAP timer driver
 
 - A kernel data leak in OMAP1 debugfs files
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJalzTaAAoJEGCrR//JCVInBeQP/3wBXCnzfCkmSSliZHoNzgYB
 XGkC+JIqw9AnHvn/ckvHMwUv8kQlbi7ImPXz1P8yafy3h2vHIdN2My0XYtRyQkNT
 NoAxIXT+NiQx9sAoLGY8gWTN4Do63q1vw5SLmOEDD2GYzo1jao4s7J0mhFZopBLw
 WkgHf8t4jRmoBDA4GEYcdJZS5shMydFDyb9CiiqNHVA4S4IL87XcPoJDpJmyVDZ4
 vZVeccyhw0Xh0NJLzRIhVDGRN2pj1ayFFVodfRNTseRGf0QRexntiIyIHa2wOi1l
 93IjJ3XgHuYEj0NNNpZiHV5OZxxRbQlTD/ji5L8j71lklVjIedJsJdWFUKiK53oh
 ufQXTRZaVMmh4xcvihABSchg8vEXMqx4cZ/hj/+LIepDJM6GC39uGipg6enORVym
 BuZpol8b1owABN461Bt2RfAVyXqJ7TRkdVy+RaP7RCsddLEcdKdI6HYi3aeDVmHQ
 krvTrLQhRsDL4IHvi6rQDqyJMf5GDP4y7aInf7YzvJlbV2uU+M0ndiSHpGhw6vbG
 brhc/n56U/waMPG8tOv9AB1+afARQOc4Fo9xg96PADA69SXn7Eq2dgf1D/ern8UQ
 6KgNZ1hmmEHzkxsAXjEcStlmhpwk4lh4T0nSDbamsMRvZRNQaqmskMbmYYepIXKC
 71k/Uwf4CQhMxe2aXIOo
 =fcv0
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "This is the first set of bugfixes for ARM SoCs, fixing a couple of
  stability problems, mostly on TI OMAP and Rockchips platforms:

   - OMAP2 hwmod clocks must be enabled in the correct order

   - OMAP3 Wakeup from resume through PRM IRQ was unreliable

   - one regression on OMAP5 caused by a kexec fix

   - Rockchip ethernet needs some settings for stable operation on
     Rock64

   - Rockchip based Chrombook Plus needs another clock setting for
     stable display suspend/resume

   - Rockchip based phyCORE-RK3288 was able to run at an invalid CPU
     clock frequency

   - Rockchip MMC link was sometimes unreliable

   - multiple fixes to avoid crashes in the Broadcom STB DPFE driver

  Other minor changes include:

   - Devicetree fixes for incorrect hardware description (rockchip,
     omap, Gemini, amlogic)

   - some MAINTAINER file updates to correct email and git addresses

   - some fixes addressing 'make W=1' dtc warnings (broadcom, amlogic,
     cavium, qualcomm, hisilicon, zx)

   - fixes for LTO-compilation (orion, davinci, clps711x)

   - one fix for an incorrect Kconfig errata selection

   - a memory leak in the OMAP timer driver

   - a kernel data leak in OMAP1 debugfs files"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
  MAINTAINERS: update entries for ARM/STM32
  ARM: dts: bcm283x: Move arm-pmu out of soc node
  ARM: dts: bcm283x: Fix unit address of local_intc
  ARM: dts: NSP: Fix amount of RAM on BCM958625HR
  ARM: dts: Set D-Link DNS-313 SATA to muxmode 0
  ARM: omap2: set CONFIG_LIRC=y in defconfig
  ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
  memory: brcmstb: dpfe: support new way of passing data from the DCPU
  memory: brcmstb: dpfe: fix type declaration of variable "ret"
  memory: brcmstb: dpfe: properly mask vendor error bits
  ARM: BCM: dts: Remove leading 0x and 0s from bindings notation
  ARM: orion: fix orion_ge00_switch_board_info initialization
  ARM: davinci: mark spi_board_info arrays as const
  ARM: clps711x: mark clps711x_compat as const
  arm: zx: dts: Remove leading 0x and 0s from bindings notation
  arm64: dts: Remove leading 0x and 0s from bindings notation
  arm64: dts: cavium: fix PCI bus dtc warnings
  MAINTAINERS: ARM: at91: update my email address
  soc: imx: gpc: de-register power domains only if initialized
  ARM: dts: rockchip: Fix DWMMC clocks
  ...
2018-02-28 16:11:04 -08:00
Linus Torvalds b5e792f11a RISC-V: smb_mb() fix for 4.16-rc4
This week we have a single fix: replacing smp_mb() with __smp_mb().  We
 were the only architecture with smp_mb() and it appears to just be
 clearly wrong, so I think this is a pretty safe patch for an RC.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlqUQEETHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQciND/wJOHzf/O7aHWlt7UYvzKsqgIdxNKiH
 IWbv4jTmJGJju10Ht5jImuIxSYJ5gu24Vj5f+/XkFRZfTGrOGgByXh37RuFP51kH
 3yFqzDoBfmBHu8NTeyAb6hHtozvcVTBYynSqKBJzsCz3mrBgqobvzoii1R2TX4lc
 +cPNffv+YRPZSHdHOpaytYIYdVxbcFmsGocftjzZ9AbR87mbT1zxDJV+YHgH86vE
 2qqaWYTJChCCQo+We57+0HheMtzjvuC9wXRg7yqDOhe5mvpxNQf/RLhr4cxJtcFN
 i96ERaRg0B5PAU+Pcql0wVL9oigd7bPv4Ncc+rjPG9efkXDKeN/ueD1az9yWRqE1
 SUjJROel5tF6Yxytn2XX+MEIPEy0AO1WlD+9o23aoaKXHgSu8kMU3Zgryvs+S6W5
 VkZeeUk/VV4YIQy8HwMujgjKzLXOoBjAHOvSyyf75ERkYbQxpEjkWRZWtjsiLGbP
 lgHX59p3C2J3w/xLA+rUE3gdPm4qY45l232RJFlFS6e+wG32/Fw+Yc+PxFHhPARK
 J+mhAVswhfrC4x9KbL83y8Zf9RvJ4y4qnpJAOyze+OgARq30SLzr6ul+5nHQAkb+
 4lcPS0ROsxpa83BnGC8bfz9TGoomdYJrT8piV+lx8wTPrZQg4gfAoStSyu89V0Vz
 LBcwO6MObWcdhw==
 =ke4H
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-4.16-rc4_smp_mb' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V fix from Palmer Dabbelt:
 "This week we have a single fix: replacing smp_mb() with __smp_mb().

  We were the only architecture with smp_mb() and it appears to just be
  clearly wrong, so I think this is a pretty safe patch for an RC"

* tag 'riscv-for-linus-4.16-rc4_smp_mb' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  riscv/barrier: Define __smp_{mb,rmb,wmb}
2018-02-28 14:55:07 -08:00
Lingutla Chandrasekhar c52232a49e timers: Forward timer base before migrating timers
On CPU hotunplug the enqueued timers of the unplugged CPU are migrated to a
live CPU. This happens from the control thread which initiated the unplug.

If the CPU on which the control thread runs came out from a longer idle
period then the base clock of that CPU might be stale because the control
thread runs prior to any event which forwards the clock.

In such a case the timers from the unplugged CPU are queued on the live CPU
based on the stale clock which can cause large delays due to increased
granularity of the outer timer wheels which are far away from base:;clock.

But there is a worse problem than that. The following sequence of events
illustrates it:

 - CPU0 timer1 is queued expires = 59969 and base->clk = 59131.

   The timer is queued at wheel level 2, with resulting expiry time = 60032
   (due to level granularity).

 - CPU1 enters idle @60007, with next timer expiry @60020.

 - CPU0 is hotplugged at @60009

 - CPU1 exits idle and runs the control thread which migrates the
   timers from CPU0

   timer1 is now queued in level 0 for immediate handling in the next
   softirq because the requested expiry time 59969 is before CPU1 base->clk
   60007

 - CPU1 runs code which forwards the base clock which succeeds because the
   next expiring timer. which was collected at idle entry time is still set
   to 60020.

   So it forwards beyond 60007 and therefore misses to expire the migrated
   timer1. That timer gets expired when the wheel wraps around again, which
   takes between 63 and 630ms depending on the HZ setting.

Address both problems by invoking forward_timer_base() for the control CPUs
timer base. All other places, which might run into a similar problem
(mod_timer()/add_timer_on()) already invoke forward_timer_base() to avoid
that.

[ tglx: Massaged comment and changelog ]

Fixes: a683f390b9 ("timers: Forward the wheel clock whenever possible")
Co-developed-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-arm-msm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180118115022.6368-1-clingutla@codeaurora.org
2018-02-28 23:34:33 +01:00