Commit graph

85630 commits

Author SHA1 Message Date
Vladimir Murzin 5a7a8426b2 arm64: KVM: Use static keys for selecting the GIC backend
Currently GIC backend is selected via alternative framework and this
is fine. We are going to introduce vgic-v3 to 32-bit world and there
we don't have patching framework in hand, so we can either check
support for GICv3 every time we need to choose which backend to use or
try to optimise it by using static keys. The later looks quite
promising because we can share logic involved in selecting GIC backend
between architectures if both uses static keys.

This patch moves arm64 from alternative to static keys framework for
selecting GIC backend. For that we embed static key into vgic_global
and enable the key during vgic initialisation based on what has
already been exposed by the host GIC driver.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-22 13:21:35 +02:00
Marc Zyngier bf8feb3964 arm64: KVM: vgic-v2: Add GICV access from HYP
Now that we have the necessary infrastructure to handle MMIO accesses
in HYP, perform the GICV access on behalf of the guest. This requires
checking that the access is strictly 32bit, properly aligned, and
falls within the expected range.

When all condition are satisfied, we perform the access and tell
the rest of the HYP code that the instruction has been correctly
emulated.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Marc Zyngier fb5ee369cc arm64: KVM: vgic-v2: Add the GICV emulation infrastructure
In order to efficiently perform the GICV access on behalf of the
guest, we need to be able to avoid going back all the way to
the host kernel.

For this, we introduce a new hook in the world switch code,
conveniently placed just after populating the fault info.
At that point, we only have saved/restored the GP registers,
and we can quickly perform all the required checks (data abort,
translation fault, valid faulting syndrome, not an external
abort, not a PTW).

Coming back from the emulation code, we need to skip the emulated
instruction. This involves an additional bit of save/restore in
order to be able to access the guest's PC (and possibly CPSR if
this is a 32bit guest).

At this stage, no emulation code is provided.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-09-08 12:53:00 +02:00
Paolo Bonzini 4174879b2e Merge branch 'x86/amd-avic' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into HEAD
Merge IOMMU bits for virtualization of interrupt injection into
virtual machines.
2016-09-05 16:00:24 +02:00
Suravee Suthikulpanit b9fc6b56f4 iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices
This patch implements irq_set_vcpu_affinity() function to set up interrupt
remapping table entry with vapic mode for pass-through devices.

In case requirements for vapic mode are not met, it falls back to set up
the IRTE in legacy mode.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-09-05 12:41:46 +02:00
Suravee Suthikulpanit 8dbea3fd7b iommu/amd: Introduce amd_iommu_update_ga()
Introduces a new IOMMU API, amd_iommu_update_ga(), which allows
KVM (SVM) to update existing posted interrupt IOMMU IRTE when
load/unload vcpu.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-09-05 12:41:46 +02:00
Suravee Suthikulpanit bd6fcefc66 iommu/amd: Adding GALOG interrupt handler
This patch adds AMD IOMMU guest virtual APIC log (GALOG) handler.
When IOMMU hardware receives an interrupt targeting a blocking vcpu,
it creates an entry in the GALOG, and generates an interrupt to notify
the AMD IOMMU driver.

At this point, the driver processes the log entry, and notify the SVM
driver via the registered iommu_ga_log_notifier function.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-09-05 12:41:46 +02:00
Paolo Bonzini 2eeb321fd2 KVM/ARM Fixes for v4.8-rc3
This tag contains the following fixes on top of v4.8-rc1:
  - ITS init issues
  - ITS error handling issues
  - ITS IRQ leakage fix
  - Plug a couple of ITS race conditions
  - An erratum workaround for timers
  - Some removal of misleading use of errors and comments
  - A fix for GICv3 on 32-bit guests
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXtLicAAoJEEtpOizt6ddyEC4H/16IngntN6Gz1WPwtBBelgyj
 ZfU970uzGOyDtDOeOX1NT+gJpkDvUMhsNlngWnMrMwqqqPVKdE4XBShPiW2v53E7
 JquDTd2kKl+OO9e9XnkLw9yUcARmJFKIjHdISlg+E78t2kcNHn+XB2jrfTLKQVl8
 tk1ztDALb4LXSGYPZQ/uHTYp9U0qei+2SbbQufRcdQ3ggyxLDwPP2aO25amctzEP
 0Y42tlnNoZj7yBBp0X9BWRrHF2AZuOp+qBJnpFiQdsgLL6G1P3DcU/t9+KDjVBVr
 LYKN8jId2r5eyGGg8aKb4I3trevayToWhDw/jzarrTNAovB1cp8G5J7ozfmeS3g=
 =4PCW
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-v4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Fixes for v4.8-rc3

This tag contains the following fixes on top of v4.8-rc1:
 - ITS init issues
 - ITS error handling issues
 - ITS IRQ leakage fix
 - Plug a couple of ITS race conditions
 - An erratum workaround for timers
 - Some removal of misleading use of errors and comments
 - A fix for GICv3 on 32-bit guests
2016-08-18 12:19:19 +02:00
Linus Torvalds 184ca82348 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
    Fietkau.

 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

 3) Fix some tg3 ethtool logic bugs, and one that would cause no
    interrupts to be generated when rx-coalescing is set to 0.  From
    Satish Baddipadige and Siva Reddy Kallam.

 4) QLCNIC mailbox corruption and napi budget handling fix from Manish
    Chopra.

 5) Fix fib_trie logic when walking the trie during /proc/net/route
    output than can access a stale node pointer.  From David Forster.

 6) Several sctp_diag fixes from Phil Sutter.

 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

 8) Checksum fixup fixes in bpf from Daniel Borkmann.

 9) Memork leaks in nfnetlink, from Liping Zhang.

10) Use after free in rxrpc, from David Howells.

11) Use after free in new skb_array code of macvtap driver, from Jason
    Wang.

12) Calipso resource leak, from Colin Ian King.

13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

15) Fix lockdep splats in macsec, from Sabrina Dubroca.

16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
    handling.

17) Various tc-action bug fixes, from CONG Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net_sched: allow flushing tc police actions
  net_sched: unify the init logic for act_police
  net_sched: convert tcf_exts from list to pointer array
  net_sched: move tc offload macros to pkt_cls.h
  net_sched: fix a typo in tc_for_each_action()
  net_sched: remove an unnecessary list_del()
  net_sched: remove the leftover cleanup_a()
  mlxsw: spectrum: Allow packets to be trapped from any PG
  mlxsw: spectrum: Unmap 802.1Q FID before destroying it
  mlxsw: spectrum: Add missing rollbacks in error path
  mlxsw: reg: Fix missing op field fill-up
  mlxsw: spectrum: Trap loop-backed packets
  mlxsw: spectrum: Add missing packet traps
  mlxsw: spectrum: Mark port as active before registering it
  mlxsw: spectrum: Create PVID vPort before registering netdevice
  mlxsw: spectrum: Remove redundant errors from the code
  mlxsw: spectrum: Don't return upon error in removal path
  i40e: check for and deal with non-contiguous TCs
  ixgbe: Re-enable ability to toggle VLAN filtering
  ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
  ...
2016-08-17 17:26:58 -07:00
WANG Cong 22dc13c837 net_sched: convert tcf_exts from list to pointer array
As pointed out by Jamal, an action could be shared by
multiple filters, so we can't use list to chain them
any more after we get rid of the original tc_action.
Instead, we could just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.

The "ugly" part is the action API still accepts list
as a parameter, I just introduce a helper function to
convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.

Fixes: a85a970af2 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 2734437ef3 net_sched: move tc offload macros to pkt_cls.h
struct tcf_exts belongs to filters, should not be visible
to plain tc actions.

Cc: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
WANG Cong 0c23c3e705 net_sched: fix a typo in tc_for_each_action()
It is harmless because all users pass 'a' to this macro.

Fixes: 00175aec94 ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
Cc: Amir Vadai <amir@vadai.me>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:27:51 -04:00
Simon Horman 3d7b332092 gre: set inner_protocol on xmit
Ensure that the inner_protocol is set on transmit so that GSO segmentation,
which relies on that field, works correctly.

This is achieved by setting the inner_protocol in gre_build_header rather
than each caller of that function. It ensures that the inner_protocol is
set when gre_fb_xmit() is used to transmit GRE which was not previously the
case.

I have observed this is not the case when OvS transmits GRE using
lwtunnel metadata (which it always does).

Fixes: 3872035241 ("gre: Use inner_proto to obtain inner header protocol")
Cc: Pravin Shelar <pshelar@ovn.org>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-15 13:37:12 -07:00
Sabrina Dubroca 952fcfd08c net: remove type_check from dev_get_nest_level()
The idea for type_check in dev_get_nest_level() was to count the number
of nested devices of the same type (currently, only macvlan or vlan
devices).
This prevented the false positive lockdep warning on configurations such
as:

eth0 <--- macvlan0 <--- vlan0 <--- macvlan1

However, this doesn't prevent a warning on a configuration such as:

eth0 <--- macvlan0 <--- vlan0
eth1 <--- vlan1 <--- macvlan1

In this case, all the locks end up with a nesting subclass of 1, so
lockdep thinks that there is still a deadlock:

- in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
  take (vlan_netdev_xmit_lock_key, 1)
- in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
  take (macvlan_netdev_addr_lock_key, 1)

By removing the linktype check in dev_get_nest_level() and always
incrementing the nesting depth, lockdep considers this configuration
valid.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:15:54 -07:00
Johannes Berg c15c0ab12f ipv6: suppress sparse warnings in IP6_ECN_set_ce()
Pass the correct type __wsum to csum_sub() and csum_add(). This doesn't
really change anything since __wsum really *is* __be32, but removes the
address space warnings from sparse.

Cc: Eric Dumazet <edumazet@google.com>
Fixes: 34ae6a1aa0 ("ipv6: update skb->csum when CE mark is propagated")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:08:00 -07:00
Linus Torvalds 329f415291 KVM locks kvm_device list to prevent corruption on device creation.
PPC splits debugfs initialization from creation of the xics device to
 unlock the newly taken kvm lock earlier.
 
 s390 prevents userspace from triggering two WARN_ON_ONCE.
 
 MIPS fixes several issues in the management of TLB faults (Cc: stable).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJXrx2ZAAoJEED/6hsPKofoo/4H/jra5NNxvpo09LWlXTwGXxBH
 cwcfDZSiOFxgvWztKJOIjPI4ETL3mnZvb9SFWBZZh1U0kfZ/TGiWouwaDNlBkPYj
 I3YHuPI7if+yUOmJlI3N2hWa0Wo0qiMqIjKT0pQVSLLdK/CVE+xGyS+qtXTNXHQn
 pFdKlYr//7OwQEY0ow1yj5VnsFrXB1JWFyB/+N5zaCfbCaQVyZAL7rj8SUbC/32W
 CiNhrvatzierKIfPerWw8DvvBKhCgWaRuLl0W+uMncrC9Qepcx9moM2beD1txK2I
 iHor1TDxUPifGQONfWMAlw87FluzHF4vQ5nN2jyTi8TT+CEfZpZ43Q+DY7okD4w=
 =NQP9
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "KVM:
   - lock kvm_device list to prevent corruption on device creation.

  PPC:
   - split debugfs initialization from creation of the xics device to
     unlock the newly taken kvm lock earlier.

  s390:
   - prevent userspace from triggering two WARN_ON_ONCE.

  MIPS:
   - fix several issues in the management of TLB faults (Cc: stable)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  MIPS: KVM: Propagate kseg0/mapped tlb fault errors
  MIPS: KVM: Fix gfn range check in kseg0 tlb faults
  MIPS: KVM: Add missing gfn range check
  MIPS: KVM: Fix mapped fault broken commpage handling
  KVM: Protect device ops->create and list_add with kvm->lock
  KVM: PPC: Move xics_debugfs_init out of create
  KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
  KVM: s390: set the prefix initially properly
2016-08-13 10:11:14 -07:00
Linus Torvalds a1e210331b Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - an NVMe fix from Gabriel, fixing a suspend/resume issue on some
   setups

 - addition of a few missing entries in the block queue sysfs
   documentation, from Joe

 - a fix for a sparse shadow warning for the bvec iterator, from
   Johannes

 - a writeback deadlock involving raid issuing barriers, and not
   flushing the plug when we wakeup the flusher threads.  From
   Konstantin

 - a set of patches for the NVMe target/loop/rdma code, from Roland and
   Sagi

* 'for-linus' of git://git.kernel.dk/linux-block:
  bvec: avoid variable shadowing warning
  doc: update block/queue-sysfs.txt entries
  nvme: Suspend all queues before deletion
  mm, writeback: flush plugged IO in wakeup_flusher_threads()
  nvme-rdma: Remove unused includes
  nvme-rdma: start async event handler after reconnecting to a controller
  nvmet: Fix controller serial number inconsistency
  nvmet-rdma: Don't use the inline buffer in order to avoid allocation for small reads
  nvmet-rdma: Correctly handle RDMA device hot removal
  nvme-rdma: Make sure to shutdown the controller if we can
  nvme-loop: Remove duplicate call to nvme_remove_namespaces
  nvme-rdma: Free the I/O tags when we delete the controller
  nvme-rdma: Remove duplicate call to nvme_remove_namespaces
  nvme-rdma: Fix device removal handling
  nvme-rdma: Queue ns scanning after a sucessful reconnection
  nvme-rdma: Don't leak uninitialized memory in connect request private data
2016-08-13 09:56:45 -07:00
Daniel Borkmann 747ea55e4f bpf: fix bpf_skb_in_cgroup helper naming
While hashing out BPF's current_task_under_cgroup helper bits, it came
to discussion that the skb_in_cgroup helper name was suboptimally chosen.

Tejun says:

  So, I think in_cgroup should mean that the object is in that
  particular cgroup while under_cgroup in the subhierarchy of that
  cgroup. Let's rename the other subhierarchy test to under too. I
  think that'd be a lot less confusing going forward.

  [...]

  It's more intuitive and gives us the room to implement the real
  "in" test if ever necessary in the future.

Since this touches uapi bits, we need to change this as long as v4.8
is not yet officially released. Thus, change the helper enum and rename
related bits.

Fixes: 4a482f34af ("cgroup: bpf: Add bpf_skb_in_cgroup_proto")
Reference: http://patchwork.ozlabs.org/patch/658500/
Suggested-by: Sargun Dhillon <sargun@sargun.me>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2016-08-12 21:53:33 -07:00
Linus Torvalds ad83242a8f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, plus two uncore-PMU fixes, an uprobes fix, a
  perf-cgroups fix and an AUX events fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Add enable_box for client MSR uncore
  perf/x86/intel/uncore: Fix uncore num_counters
  uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions
  perf/core: Set cgroup in CPU contexts for new cgroup events
  perf/core: Fix sideband list-iteration vs. event ordering NULL pointer deference crash
  perf probe ppc64le: Fix probe location when using DWARF
  perf probe: Add function to post process kernel trace events
  tools: Sync cpufeatures headers with the kernel
  toops: Sync tools/include/uapi/linux/bpf.h with the kernel
  tools: Sync cpufeatures.h and vmx.h with the kernel
  perf probe: Support signedness casting
  perf stat: Avoid skew when reading events
  perf probe: Fix module name matching
  perf probe: Adjust map->reloc offset when finding kernel symbol from map
  perf hists: Trim libtraceevent trace_seq buffers
  perf script: Add 'bpf-output' field to usage message
2016-08-12 13:21:18 -07:00
Linus Torvalds 1f8083c640 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Misc fixes: lockstat fix, futex fix on !MMU systems, big endian fix
  for qrwlocks and a race fix for pvqspinlocks"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock: Fix a bug in qstat_read()
  locking/pvqspinlock: Fix double hash race
  locking/qrwlock: Fix write unlock bug on big endian systems
  futex: Assume all mappings are private on !MMU systems
2016-08-12 12:46:37 -07:00
Linus Torvalds 25db69188e Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
 "A fix for an MSI regression"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/msi: Make sure PCI MSIs are activated early
2016-08-12 12:41:51 -07:00
Linus Torvalds 9909170065 NFS client bugfixes for Linux 4.8
Highlights include:
 
 - Stable patch from Olga to fix RPCSEC_GSS upcalls when the same user needs
   multiple different security services (e.g. krb5i and krb5p).
 - Stable patch to fix a regression introduced by the use of SO_REUSEPORT,
   and that prevented the use of multiple different NFS versions to the
   same server.
 - TCP socket reconnection timer fixes.
 - Patch from Neil to disable the use of IPv6 temporary addresses.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXrh03AAoJEGcL54qWCgDyp4EQALwZpmYCxWJE5xSHW95Fs124
 HYM8g4LznOfs3/ohInb1ja2FaQqUy0XEk3pSjNKfyYgjuwB4qJSOpnAqoIKxJFGB
 h4582leYZOZYMMCGslS2I4zcElBYO1WjnKNyb7MpZjCHmN0AdFfIcOXd2K7eL9hM
 /poImcs5KfMGIEJqmKqMUxmJ3RjxpK3LySQAes/Y5odOiHC4SGJdGUmSeuPGTbQd
 YjFWVHRFU6kVAzPd2Jl46Sgy6SpDaVz82HodXCSY+8lklmIkbIsVqJs0VWo3WkfL
 r5WLQ3PzZvloQ7o/E9tZGiB/LEi7roa51hYsG4sleN6Kap5vwyWg0QIKjqyJdFxB
 JmFanlCMfae3zNz4cusvgu1okvMnNqO4uRXJIAKfk64k775N9ebY7TXAZUK4/UbY
 4nxCHcxygamP/k/8HYFpc4964tMaimIs9JUdojad5a3dzffwXcgEC/0HPUih9R+i
 DO/cbVtWeDkmQPLrUqFfOAbmQdyAjELrv48d5BVIst49uuCULU2LlDlVLiAvaZvq
 s2YNmr7lkHowvgaH4ShL89wuyyD14Xu5/f49oFBFNKEQay9YthQ8s3XmdZBG7Zl0
 oyA1XJjWEq3p8nvPGIqFD26w75ppUbAWLTHsyoU0YfEYrZJrF9jPxowI7WlHgfVo
 Io79x1sbgTrckjG+osAf
 =UHph
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - Stable patch from Olga to fix RPCSEC_GSS upcalls when the same user
     needs multiple different security services (e.g.  krb5i and krb5p).

   - Stable patch to fix a regression introduced by the use of
     SO_REUSEPORT, and that prevented the use of multiple different NFS
     versions to the same server.

   - TCP socket reconnection timer fixes.

   - Patch from Neil to disable the use of IPv6 temporary addresses"

* tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Cap the transport reconnection timer at 1/2 lease period
  NFSv4: Cleanup the setting of the nfs4 lease period
  SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout
  SUNRPC: Fix reconnection timeouts
  NFSv4.2: LAYOUTSTATS may return NFS4ERR_ADMIN/DELEG_REVOKED
  SUNRPC: disable the use of IPv6 temporary addresses.
  SUNRPC: allow for upcalls for same uid but different gss service
  SUNRPC: Fix up socket autodisconnect
  SUNRPC: Handle EADDRNOTAVAIL on connection failures
2016-08-12 12:32:24 -07:00
Linus Torvalds 8766dc68d1 powerpc fixes for 4.8 #3
- powerpc/vdso: Fix build rules to rebuild vdsos correctly from Nicholas Piggin
  - powerpc/ptrace: Fix coredump since ptrace TM changes from Cyril Bur
  - powerpc/32: Fix csum_partial_copy_generic() from Christophe Leroy
  - cxl: Set psl_fir_cntl to production environment value from Frederic Barrat
  - powerpc/eeh: Switch to conventional PCI address output in EEH log from Guilherme G. Piccoli
  - cxl: Use fixed width predefined types in data structure. from Philippe Bergheaud
  - powerpc/vdso: Add missing include file from Guenter Roeck
  - powerpc: Fix unused function warning 'lmb_to_memblock' from Alastair D'Silva
  - powerpc/powernv/ioda: Fix TCE invalidate to work in real mode again from Alexey Kardashevskiy
  - powerpc/cell: Add missing error code in spufs_mkgang() from Dan Carpenter
  - crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading from Anton Blanchard
  - powerpc/pasemi: Fix coherent_dma_mask for dma engine from Darren Stevens
 
 Benjamin Herrenschmidt:
  - powerpc/32: Fix crash during static key init
  - powerpc: Update obsolete comment in setup_32.c about early_init()
  - powerpc: Print the kernel load address at the end of prom_init()
  - powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
  - powerpc/xics: Properly set Edge/Level type and enable resend
 
 Mahesh Salgaonkar:
  - powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
  - powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
  - powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
  - powerpc/powernv: Load correct TOC pointer while waking up from winkle.
 
 Andrew Donnellan:
  - cxl: Fix sparse warnings
  - cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests
 
 Michael Ellerman:
  - selftests/powerpc: Specify we expect to build with std=gnu99
  - powerpc/Makefile: Use cflags-y/aflags-y for setting endian options
  - powerpc/pci: Fix endian bug in fixed PHB numbering
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXrZFAAAoJEFHr6jzI4aWAxacQALPfu/kbKJFhwX8dnbzaCwHe
 1bTZHE4bkkxfS5JrghbiLZHUeoCZucDhGGlZSPOEb5VA9lkEX3OJJRQDng754Pit
 u3pwt0SLmAxBn9BgTZy/5g5U6KMGptzJcSsKVEtZs17PKpqhPNELMm5EmGhJmNHH
 Ksycw4FhVrsjDm5n7s4IqUhsh0Z9QPOOxxb5rVgdBBxmLHz5a1FJSSCFan5WW3PT
 QNiMfg58NdBBOFbDQJWLiWXrfPPUMhXfPxHGGArXPEsa+7l5yXaygCSv5KyUBJMt
 sDxn6XZMuYzzvg4j8uc9mkDWNWiyxcxBJ6+/Hm5xf9vvpxzHAM1M8j9xqpaCHjeg
 b0fsWqVeLD+DuAVqh6rUgUERbsfUtuKXRSB+NR0hHWd7GLx707FIr3i1AAvjDODC
 qwcZg9mkcAbKAIOAmsk9aAB60jl7aENiz+bTvLYMHDhIbb+st94jajdaG7MSVn5z
 M9FFbRKmRHTW0Qoop1VuseyO9C+Lmb+ksIhBHeYaNDaJ5lzk0NwJltCNd4ybnL6h
 i+AFxuhN0uyT6OJOPqTR07+9p+k04LOSYPZR34rclKQ3Z+sQiYQAmwLMHasN6uBk
 dZxJUxmeio5J/0BXLGKLYFnaNpHnq3EQm9vdt6spn1kidmm+bOeICB8UW8AairqC
 8HasF1QrjZihmoBoXgul
 =gw2z
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Some powerpc fixes for 4.8:

  Misc:
   - powerpc/vdso: Fix build rules to rebuild vdsos correctly from Nicholas Piggin
   - powerpc/ptrace: Fix coredump since ptrace TM changes from Cyril Bur
   - powerpc/32: Fix csum_partial_copy_generic() from Christophe Leroy
   - cxl: Set psl_fir_cntl to production environment value from Frederic Barrat
   - powerpc/eeh: Switch to conventional PCI address output in EEH log from Guilherme G. Piccoli
   - cxl: Use fixed width predefined types in data structure. from Philippe Bergheaud
   - powerpc/vdso: Add missing include file from Guenter Roeck
   - powerpc: Fix unused function warning 'lmb_to_memblock' from Alastair D'Silva
   - powerpc/powernv/ioda: Fix TCE invalidate to work in real mode again from Alexey Kardashevskiy
   - powerpc/cell: Add missing error code in spufs_mkgang() from Dan Carpenter
   - crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading from Anton Blanchard
   - powerpc/pasemi: Fix coherent_dma_mask for dma engine from Darren Stevens

  Benjamin Herrenschmidt:
   - powerpc/32: Fix crash during static key init
   - powerpc: Update obsolete comment in setup_32.c about early_init()
   - powerpc: Print the kernel load address at the end of prom_init()
   - powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
   - powerpc/xics: Properly set Edge/Level type and enable resend

  Mahesh Salgaonkar:
   - powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
   - powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
   - powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
   - powerpc/powernv: Load correct TOC pointer while waking up from winkle.

  Andrew Donnellan:
   - cxl: Fix sparse warnings
   - cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests

  Michael Ellerman:
   - selftests/powerpc: Specify we expect to build with std=gnu99
   - powerpc/Makefile: Use cflags-y/aflags-y for setting endian options
   - powerpc/pci: Fix endian bug in fixed PHB numbering"

* tag 'powerpc-4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (26 commits)
  selftests/powerpc: Specify we expect to build with std=gnu99
  powerpc/vdso: Fix build rules to rebuild vdsos correctly
  powerpc/Makefile: Use cflags-y/aflags-y for setting endian options
  powerpc/32: Fix crash during static key init
  powerpc: Update obsolete comment in setup_32.c about early_init()
  powerpc: Print the kernel load address at the end of prom_init()
  powerpc/ptrace: Fix coredump since ptrace TM changes
  powerpc/32: Fix csum_partial_copy_generic()
  cxl: Set psl_fir_cntl to production environment value
  powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
  powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
  powerpc/pci: Fix endian bug in fixed PHB numbering
  powerpc/eeh: Switch to conventional PCI address output in EEH log
  cxl: Fix sparse warnings
  cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests
  cxl: Use fixed width predefined types in data structure.
  powerpc/vdso: Add missing include file
  powerpc: Fix unused function warning 'lmb_to_memblock'
  powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
  powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
  ...
2016-08-12 12:09:44 -07:00
Christoffer Dall a28ebea2ad KVM: Protect device ops->create and list_add with kvm->lock
KVM devices were manipulating list data structures without any form of
synchronization, and some implementations of the create operations also
suffered from a lack of synchronization.

Now when we've split the xics create operation into create and init, we
can hold the kvm->lock mutex while calling the create operation and when
manipulating the devices list.

The error path in the generic code gets slightly ugly because we have to
take the mutex again and delete the device from the list, but holding
the mutex during anon_inode_getfd or releasing/locking the mutex in the
common non-error path seemed wrong.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-08-12 12:01:27 +02:00
Christoffer Dall 023e9fddc3 KVM: PPC: Move xics_debugfs_init out of create
As we are about to hold the kvm->lock during the create operation on KVM
devices, we should move the call to xics_debugfs_init into its own
function, since holding a mutex over extended amounts of time might not
be a good idea.

Introduce an init operation on the kvm_device_ops struct which cannot
fail and call this, if configured, after the device has been created.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-08-12 12:01:26 +02:00
Linus Torvalds 6da7e95326 virtio/vhost: fixes and cleanups for 4.8
- Misc fixes and cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXq0ruAAoJECgfDbjSjVRp5P8H/2OlDJdSS1l+TwOXbY95ntQ1
 vxUX4vGCX5IujC+Rbt7sQV2prE3b6IktFNagpbRoWn21JkpoDMvPtYJrn5BhLtoh
 fvDkZE6Wo3QztFSjaUBZWEABBt03KPX0yrAIZplu8ne/Z8KAT3zK57BPnKfmxwv+
 dpxt+1wlnqAvYsoUUQZBFT4Gmk2oDiTofiIbQq7W9W/fooznLtLB+ArYtdfNJizC
 JnI/vJuWceEXfjT26HexCRhA2OZskrA4ZadDhOjAqkTPN5DHfweLDuHh7IsVfDd1
 wXqjc4ks3cYG0CloJ2qY2K7RpDOFIxIizixeDIuAbn9aX4sPOYYfqRm+4iRwmqQ=
 =9aUO
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost fixes and cleanups from Michael Tsirkin:
 "Misc fixes and cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio/s390: deprecate old transport
  virtio/s390: keep early_put_chars
  virtio_blk: Fix a slient kernel panic
  virtio-vsock: fix include guard typo
  vhost/vsock: fix vhost virtio_vsock_pkt use-after-free
  9p/trans_virtio: use kvfree() for iov_iter_get_pages_alloc()
  virtio: fix error handling for debug builds
  virtio: fix memory leak in virtqueue_add()
2016-08-11 14:10:23 -07:00
Johannes Berg 1ea049b2de bvec: avoid variable shadowing warning
Due to the (indirect) nesting of min(..., min(...)), sparse will
show a variable shadowing warning whenever bvec.h is included.

Avoid that by assigning the inner min() to a temporary variable first.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-11 09:41:35 -06:00
David S. Miller 293fddff20 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Use mod_timer_pending() to avoid reactivating a dead expectation in
   the h323 conntrack helper, from Liping Zhang.

2) Oneliner to fix a type in the register name defined in the nf_tables
   header.

3) Don't try to look further when we find an inactive elements with no
   descendants in the rbtree set implementation, otherwise we crash.

4) Handle valid zero CSeq in the SIP conntrack helper, from
   Christophe Leroy.

5) Don't display a trailing slash in conntrack helper with no classes
   via /proc/net/nf_conntrack_expect, from Liping Zhang.

6) Fix an expectation leak during creation from the nfqueue path, again
   from Liping Zhang.

7) Validate netlink port ID in verdict message from nfqueue, otherwise
   an injection can be possible. Again from Zhang.

8) Reject conntrack tuples with different transport protocol on
   original and reply tuples, also from Zhang.

9) Validate offset and length in nft_exthdr, make sure they are under
   sizeof(u8), from Laura Garcia Liebana.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 14:54:27 -07:00
pan xinhui 2db34e8bf9 locking/qrwlock: Fix write unlock bug on big endian systems
This patch aims to get rid of endianness in queued_write_unlock(). We
want to set  __qrwlock->wmode to NULL, however the address is not
&lock->cnts in big endian machine. That causes queued_write_unlock()
write NULL to the wrong field of __qrwlock.

So implement __qrwlock_write_byte() which returns the correct
__qrwlock->wmode address.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman.Long@hpe.com
Cc: arnd@arndb.de
Cc: boqun.feng@gmail.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1468835259-4486-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 14:13:27 +02:00
David Carrillo-Cisneros db4a835601 perf/core: Set cgroup in CPU contexts for new cgroup events
There's a perf stat bug easy to observer on a machine with only one cgroup:

  $ perf stat -e cycles -I 1000 -C 0 -G /
  #          time             counts unit events
      1.000161699      <not counted>      cycles                    /
      2.000355591      <not counted>      cycles                    /
      3.000565154      <not counted>      cycles                    /
      4.000951350      <not counted>      cycles                    /

We'd expect some output there.

The underlying problem is that there is an optimization in
perf_cgroup_sched_{in,out}() that skips the switch of cgroup events
if the old and new cgroups in a task switch are the same.

This optimization interacts with the current code in two ways
that cause a CPU context's cgroup (cpuctx->cgrp) to be NULL even if a
cgroup event matches the current task. These are:

  1. On creation of the first cgroup event in a CPU: In current code,
  cpuctx->cpu is only set in perf_cgroup_sched_in, but due to the
  aforesaid optimization, perf_cgroup_sched_in will run until the next
  cgroup switches in that CPU. This may happen late or never happen,
  depending on system's number of cgroups, CPU load, etc.

  2. On deletion of the last cgroup event in a cpuctx: In list_del_event,
  cpuctx->cgrp is set NULL. Any new cgroup event will not be sched in
  because cpuctx->cgrp == NULL until a cgroup switch occurs and
  perf_cgroup_sched_in is executed (updating cpuctx->cgrp).

This patch fixes both problems by setting cpuctx->cgrp in list_add_event,
mirroring what list_del_event does when removing a cgroup event from CPU
context, as introduced in:

  commit 68cacd2916 ("perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()")

With this patch, cpuctx->cgrp is always set/clear when installing/removing
the first/last cgroup event in/from the CPU context. With cpuctx->cgrp
correctly set, event_filter_match works as intended when events are
sched in/out.

After the fix, the output is as expected:

  $ perf stat -e cycles -I 1000 -a -G /
  #         time             counts unit events
     1.004699159          627342882      cycles                    /
     2.007397156          615272690      cycles                    /
     3.010019057          616726074      cycles                    /

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1470124092-113192-1-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 13:05:52 +02:00
Linus Torvalds a0cba2179e Revert "printk: create pr_<level> functions"
This reverts commit 874f9c7da9.

Geert Uytterhoeven reports:
 "This change seems to have an (unintendent?) side-effect.

  Before, pr_*() calls without a trailing newline characters would be
  printed with a newline character appended, both on the console and in
  the output of the dmesg command.

  After this commit, no new line character is appended, and the output
  of the next pr_*() call of the same type may be appended, like in:

    - Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
    - Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
    + Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"

Joe Perches says:
 "No, that is not intentional.

  The newline handling code inside vprintk_emit is a bit involved and
  for now I suggest a revert until this has all the same behavior as
  earlier"

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Requested-by: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-09 10:48:18 -07:00
Linus Torvalds 84bd8d33a9 Luiz Capitulino noticed that the tick_stop tracepoint wasn't being parsed
properly by the tracing user space tools. This was due to the
 TRACE_DEFINE_ENUM() being set to a define, when it should have been set
 to the enum itself. The define was of the MASK that used the BIT to shift.
 The BIT was the enum and by adding that, everything gets converted nicely.
 The MASK is still kept just in case it gets converted to an enum in the
 future.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXqeA/AAoJEKKk/i67LK/8wLAH/0nD8L5pxtn+pZi3mQnWNwbn
 qEBtfKK8cvnt0IWH2HlKmRKLAiIJfp8UrkGoPiLT7Nb83PlKaw3UT868t7eDmknu
 b29SsZroMgvJ1MeeNH9Yzk7cK3/K213VO02P4ce8EWSELYCqFlxJDE3dhl52K9Lj
 clJSoZbIGTLlx4pk6zZnPksTt3Z9WZXcVJITwxEiz/Cr+CKAWZpLoPPUaqJOmA4j
 9oS6d++0DYZdz3cWCmYBBkphmc5IQkBNZWGMYLcAR+M+m5fsN+baWlP3Dhq4j7he
 WknHwu4WDFMk6a2Kh0Ggi6yUWVUIkLNY2Z4QUF2gNmJT1g/FH5lZka4/kIxjvKw=
 =rgBA
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix tick_stop tracepoint symbols for user export.

  Luiz Capitulino noticed that the tick_stop tracepoint wasn't being
  parsed properly by the tracing user space tools.

  This was due to the TRACE_DEFINE_ENUM() being set to a define, when it
  should have been set to the enum itself.  The define was of the MASK
  that used the BIT to shift.  The BIT was the enum and by adding that,
  everything gets converted nicely.  The MASK is still kept just in case
  it gets converted to an enum in the future"

* tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix tick_stop tracepoint symbols for user export
2016-08-09 10:34:09 -07:00
Linus Torvalds cb0d93aaf0 Merge tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "This contains a bunch of amdgpu fixes, and some i915 regression fixes.

  It also contains some fixes for an older regression with some EDID
  changes and some 6bpc panels.

  Then there are the lockdep, cirrus and rcar-du regression fixes from
  this window"

* tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux:
  drm/cirrus: Fix NULL pointer dereference when registering the fbdev
  drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS".
  drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
  drm/edid: Add 6 bpc quirk for display AEO model 0.
  drm: Paper over locking inversion after registration rework
  drm: rcar-du: Link HDMI encoder with bridge
  drm/ttm: Wait for a BO to become idle before unbinding it from GTT
  drm/i915/fbdev: Check for the framebuffer before use
  drm/amdgpu: update golden setting of polaris10
  drm/amdgpu: update golden setting of stoney
  drm/amdgpu: update golden setting of polaris11
  drm/amdgpu: update golden setting of carrizo
  drm/amdgpu: update golden setting of iceland
  drm/amd/amdgpu: change pptable output format from ASCII to binary
  drm/amdgpu/ci: add mullins to default case for smc ucode
  drm/amdgpu/gmc7: add missing mullins case
  drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
  drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
2016-08-09 10:20:21 -07:00
Andre Przywara fd837b08d9 KVM: arm64: ITS: return 1 on successful MSI injection
According to the KVM API documentation a successful MSI injection
should return a value > 0 on success.
Return possible errors in vgic_its_trigger_msi() and report a
successful injection back to userland, while also reporting the
case where the MSI could not be delivered due to the guest not
having the LPI mapped, for instance.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-09 16:43:23 +02:00
Steven Rostedt (Red Hat) c87edb3611 tracing: Fix tick_stop tracepoint symbols for user export
The symbols used in the tick_stop tracepoint were not being converted
properly into integers in the trace_stop format file. Instead we had this:

print fmt: "success=%d dependency=%s", REC->success,
    __print_symbolic(REC->dependency, { 0, "NONE" },
     { (1 << TICK_DEP_BIT_POSIX_TIMER), "POSIX_TIMER" },
     { (1 << TICK_DEP_BIT_PERF_EVENTS), "PERF_EVENTS" },
     { (1 << TICK_DEP_BIT_SCHED), "SCHED" },
     { (1 << TICK_DEP_BIT_CLOCK_UNSTABLE), "CLOCK_UNSTABLE" })

User space tools have no idea how to parse "TICK_DEP_BIT_SCHED" or the other
symbols used to do the bit shifting. The reason is that the conversion was
done with using the TICK_DEP_MASK_* symbols which are just macros that
convert to the BIT shift itself (with the exception of NONE, which was
converted properly, because it doesn't use bits, and is defined as zero).

The TICK_DEP_BIT_* needs to be denoted by TRACE_DEFINE_ENUM() in order to
have this properly converted for user space tools to parse this event.

Cc: stable@vger.kernel.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Fixes: e6e6cc22e0 ("nohz: Use enum code for tick stop failure tracing message")
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-08-09 09:51:23 -04:00
Stefan Hajnoczi 28ad55578b virtio-vsock: fix include guard typo
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-09 13:42:38 +03:00
Marc Zyngier f3b0946d62 genirq/msi: Make sure PCI MSIs are activated early
Bharat Kumar Gogada reported issues with the generic MSI code, where the
end-point ended up with garbage in its MSI configuration (both for the vector
and the message).

It turns out that the two MSI paths in the kernel are doing slightly different
things:

generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI

And it turns out that end-points are allowed to latch the content of the MSI
configuration registers as soon as MSIs are enabled.  In Bharat's case, the
end-point ends up using whatever was there already, which is not what you
want.

In order to make things converge, we introduce a new MSI domain flag
(MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for PCI/MSI. When set,
this flag forces the programming of the end-point as soon as the MSIs are
allocated.

A consequence of this is that we have an extra activate in irq_startup, but
that should be without much consequence.

tglx: 

 - Several people reported a VMWare regression with PCI/MSI-X passthrough. It
   turns out that the patch also cures that issue.

 - We need to have a look at the MSI disable interrupt path, where we write
   the msg to all zeros without disabling MSI in the PCI device. Is that
   correct?

Fixes: 52f518a3a7 "x86/MSI: Use hierarchical irqdomains to manage MSI interrupts"
Reported-and-tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Reported-and-tested-by: Foster Snowhill <forst@forstwoof.ru>
Reported-by: Matthias Prager <linux@matthiasprager.de>
Reported-by: Jason Taylor <jason.taylor@simplivity.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1468426713-31431-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-08-09 09:19:32 +02:00
Philippe Bergheaud cbd74e1bc8 cxl: Use fixed width predefined types in data structure.
This patch fixes a regression introduced by commit b810253bd9 ("cxl:
Add mechanism for delivering AFU driver specific events").

It changes the type u8 to __u8 in the uapi header cxl.h, because the
former is a kernel internal type, and may not be defined in userland
build environments, in particular when cross-compiling libcxl on x86_64
linux machines (RHEL6.7 and Ubuntu 16.04).

This patch also changes the size of the field data_size, and makes it
constant, to support 32-bit userland applications running on big-endian
ppc64 kernels transparently.

mpe: This is an ABI change, however the ABI was only added during the
4.8 merge window so has never been part of a released kernel - therefore
we give ourselves permission to change it.

Fixes: b810253bd9 ("cxl: Add mechanism for delivering AFU driver specific events")
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:52:01 +10:00
Sudarsana Reddy Kalluru 59bcb7972f qed: Add dcbx app support for IEEE Selection Field.
MFW now supports the Selection field for IEEE mode. Add driver changes to
use the newer MFW masks to read/write the port-id value.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 22:22:20 -07:00
Linus Torvalds 1eccfa090e Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user
bounds checking for most architectures on SLAB and SLUB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXl9tlAAoJEIly9N/cbcAm5BoP/ikTtDp2bFw1sn92yHTnIWzl
 O+dcKVAeRgjfnSvPfb1JITpaM58exQSaDsPBeR0DbVzU1zDdhLcwHHiQupFh98Ka
 vBZthbrlL/u4NB26enEEW0iyA32BsxYBMnIu0z5ux9RbZflmQwGQ0c0rvy3dJ7/b
 FzB5ayVST5y/a0m6/sImeeExh78GU9rsMb1XmJRMwlJAy6miDz/F9TP0LnuW6PhG
 J5XC99ygNJS1pQBLACRsrZw6ImgBxXnWCok6tWPMxFfD+rJBU2//wqS+HozyMWHL
 iYP7+ytVo/ZVok4114X/V4Oof3a6wqgpBuYrivJ228QO+UsLYbYLo6sZ8kRK7VFm
 9GgHo/8rWB1T9lBbSaa7UL5r0dVNNLjFGS42vwV+YlgUMQ1A35VRojO0jUnJSIQU
 Ug1IxKmylLd0nEcwD8/l3DXeQABsfL8GsoKW0OtdTZtW4RND4gzq34LK6t7hvayF
 kUkLg1OLNdUJwOi16M/rhugwYFZIMfoxQtjkRXKWN4RZ2QgSHnx2lhqNmRGPAXBG
 uy21wlzUTfLTqTpoeOyHzJwyF2qf2y4nsziBMhvmlrUvIzW1LIrYUKCNT4HR8Sh5
 lC2WMGYuIqaiu+NOF3v6CgvKd9UW+mxMRyPEybH8mEgfm+FLZlWABiBjIUpSEZuB
 JFfuMv1zlljj/okIQRg8
 =USIR
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy protection from Kees Cook:
 "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
  copy_from_user bounds checking for most architectures on SLAB and
  SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
2016-08-08 14:48:14 -07:00
Daniel Borkmann 479ffcccef bpf: fix checksum fixups on bpf_skb_store_bytes
bpf_skb_store_bytes() invocations above L2 header need BPF_F_RECOMPUTE_CSUM
flag for updates, so that CHECKSUM_COMPLETE will be fixed up along the way.
Where we ran into an issue with bpf_skb_store_bytes() is when we did a
single-byte update on the IPv6 hoplimit despite using BPF_F_RECOMPUTE_CSUM
flag; simple ping via ICMPv6 triggered a hw csum failure as a result. The
underlying issue has been tracked down to a buffer alignment issue.

Meaning, that csum_partial() computations via skb_postpull_rcsum() and
skb_postpush_rcsum() pair invoked had a wrong result since they operated on
an odd address for the hoplimit, while other computations were done on an
even address. This mix doesn't work as-is with skb_postpull_rcsum(),
skb_postpush_rcsum() pair as it always expects at least half-word alignment
of input buffers, which is normally the case. Thus, instead of these helpers
using csum_sub() and (implicitly) csum_add(), we need to use csum_block_sub(),
csum_block_add(), respectively. For unaligned offsets, they rotate the sum
to align it to a half-word boundary again, otherwise they work the same as
csum_sub() and csum_add().

Adding __skb_postpull_rcsum(), __skb_postpush_rcsum() variants that take the
offset as an input and adapting bpf_skb_store_bytes() to them fixes the hw
csum failures again. The skb_postpull_rcsum(), skb_postpush_rcsum() helpers
use a 0 constant for offset so that the compiler optimizes the offset & 1
test away and generates the same code as with csum_sub()/_add().

Fixes: 608cd71a9c ("tc: bpf: generalize pedit action")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 13:11:43 -07:00
Linus Torvalds 1bd4403d86 unsafe_[get|put]_user: change interface to use a error target label
When I initially added the unsafe_[get|put]_user() helpers in commit
5b24a7a2aa ("Add 'unsafe' user access functions for batched
accesses"), I made the mistake of modeling the interface on our
traditional __[get|put]_user() functions, which return zero on success,
or -EFAULT on failure.

That interface is fairly easy to use, but it's actually fairly nasty for
good code generation, since it essentially forces the caller to check
the error value for each access.

In particular, since the error handling is already internally
implemented with an exception handler, and we already use "asm goto" for
various other things, we could fairly easily make the error cases just
jump directly to an error label instead, and avoid the need for explicit
checking after each operation.

So switch the interface to pass in an error label, rather than checking
the error value in the caller.  Best do it now before we start growing
more users (the signal handling code in particular would be a good place
to use the new interface).

So rather than

	if (unsafe_get_user(x, ptr))
		... handle error ..

the interface is now

	unsafe_get_user(x, ptr, label);

where an error during the user mode fetch will now just cause a jump to
'label' in the caller.

Right now the actual _implementation_ of this all still ends up being a
"if (err) goto label", and does not take advantage of any exception
label tricks, but for "unsafe_put_user()" in particular it should be
fairly straightforward to convert to using the exception table model.

Note that "unsafe_get_user()" is much harder to convert to a clever
exception table model, because current versions of gcc do not allow the
use of "asm goto" (for the exception) with output values (for the actual
value to be fetched).  But that is hopefully not a limitation in the
long term.

[ Also note that it might be a good idea to switch unsafe_get_user() to
  actually _return_ the value it fetches from user space, but this
  commit only changes the error handling semantics ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-08 13:02:01 -07:00
Phil Sutter dca3f53c02 sctp: Export struct sctp_info to userspace
This is required to correctly interpret INET_DIAG_INFO messages exported
by sctp_diag module.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 12:51:58 -07:00
Pablo Neira Ayuso 2c86943c20 netfilter: nf_tables: s/MFT_REG32_01/NFT_REG32_01
MFT_REG32_01 is a typo, rename this to NFT_REG32_01.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-08 09:35:05 +02:00
Dave Airlie 4872850a19 Merge branch 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-next
A few fixes for amdgpu and ttm for 4.8
- fix a ttm regression caused by the new pipelining code
- fixes for mullins on amdgpu
- updated golden settings for amdgpu

* 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux:
  drm/ttm: Wait for a BO to become idle before unbinding it from GTT
  drm/amdgpu: update golden setting of polaris10
  drm/amdgpu: update golden setting of stoney
  drm/amdgpu: update golden setting of polaris11
  drm/amdgpu: update golden setting of carrizo
  drm/amdgpu: update golden setting of iceland
  drm/amd/amdgpu: change pptable output format from ASCII to binary
  drm/amdgpu/ci: add mullins to default case for smc ucode
  drm/amdgpu/gmc7: add missing mullins case
2016-08-08 16:45:33 +10:00
Linus Torvalds 857953d72f Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block fixes from Jens Axboe:
 "As mentioned in the pull the other day, a few more fixes for this
  round, all related to the bio op changes in this series.

  Two fixes, and then a cleanup, renaming bio->bi_rw to bio->bi_opf.  I
  wanted to do that change right after or right before -rc1, so that
  risk of conflict was reduced.  I just rebased the series on top of
  current master, and no new ->bi_rw usage has snuck in"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: rename bio bi_rw to bi_opf
  target: iblock_execute_sync_cache() should use bio_set_op_attrs()
  mm: make __swap_writepage() use bio_set_op_attrs()
  block/mm: make bdev_ops->rw_page() take a bool for read/write
2016-08-07 16:38:45 -07:00
Linus Torvalds 635a4ba111 Merge tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux
Pull drm zpos property support from Dave Airlie:
 "This tree was waiting on some media stuff I hadn't had time to get a
  stable branchpoint off, so I just waited until it was all in your tree
  first.

  It's been around a bit on the list and shouldn't affect anything
  outside adding the generic API and moving some ARM drivers to using
  it"

* tag 'drm-for-v4.8-zpos' of git://people.freedesktop.org/~airlied/linux:
  drm: rcar: use generic code for managing zpos plane property
  drm/exynos: use generic code for managing zpos plane property
  drm: sti: use generic zpos for plane
  drm: add generic zpos property
2016-08-07 16:35:08 -07:00
Jens Axboe 1eff9d322a block: rename bio bi_rw to bi_opf
Since commit 63a4cc2486, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokeness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.

No intended functional changes in this commit.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-07 14:41:02 -06:00
Jens Axboe c11f0c0b5b block/mm: make bdev_ops->rw_page() take a bool for read/write
Commit abf545484d changed it from an 'rw' flags type to the
newer ops based interface, but now we're effectively leaking
some bdev internals to the rest of the kernel. Since we only
care about whether it's a read or a write at that level, just
pass in a bool 'is_write' parameter instead.

Then we can also move op_is_write() and friends back under
CONFIG_BLOCK protection.

Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-07 14:41:02 -06:00
Linus Torvalds fe64f3283f Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  In the "trivial API change" department - ->d_compare() losing 'parent'
  argument"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cachefiles: Fix race between inactivating and culling a cache object
  9p: use clone_fid()
  9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
  vfs: make dentry_needs_remove_privs() internal
  vfs: remove file_needs_remove_privs()
  vfs: fix deadlock in file_remove_privs() on overlayfs
  get rid of 'parent' argument of ->d_compare()
  cifs, msdos, vfat, hfs+: don't bother with parent in ->d_compare()
  affs ->d_compare(): don't bother with ->d_inode
  fold _d_rehash() and __d_rehash() together
  fold dentry_rcuwalk_invalidate() into its only remaining caller
2016-08-07 10:01:14 -04:00