Commit graph

50155 commits

Author SHA1 Message Date
Christian Borntraeger 3491caf275 KVM: halt_polling: provide a way to qualify wakeups during poll
Some wakeups should not be considered a sucessful poll. For example on
s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
would be considered runnable - letting all vCPUs poll all the time for
transactional like workload, even if one vCPU would be enough.
This can result in huge CPU usage for large guests.
This patch lets architectures provide a way to qualify wakeups if they
should be considered a good/bad wakeups in regard to polls.

For s390 the implementation will fence of halt polling for anything but
known good, single vCPU events. The s390 implementation for floating
interrupts does a wakeup for one vCPU, but the interrupt will be delivered
by whatever CPU checks first for a pending interrupt. We prefer the
woken up CPU by marking the poll of this CPU as "good" poll.
This code will also mark several other wakeup reasons like IPI or
expired timers as "good". This will of course also mark some events as
not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
we prefer to not poll, unless we are really sure, though.

This patch successfully limits the CPU usage for cases like uperf 1byte
transactional ping pong workload or wakeup heavy workload like OLTP
while still providing a proper speedup.

This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
wakeups that are considered not good for polling.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
Cc: David Matlack <dmatlack@google.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
[Rename config symbol. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-13 17:29:23 +02:00
Mark Brown 180bc41ad1 Merge remote-tracking branches 'asoc/topic/es8328', 'asoc/topic/find-dai', 'asoc/topic/fsl', 'asoc/topic/fsl-sai' and 'asoc/topic/fsl-ssi' into asoc-next 2016-05-13 14:27:01 +01:00
Mark Brown 39d652e066 Merge remote-tracking branches 'regulator/topic/pwm', 'regulator/topic/qcom-spmi', 'regulator/topic/rk808' and 'regulator/topic/s2mps11' into regulator-next 2016-05-13 14:23:46 +01:00
Mark Brown eb76d8407c Merge remote-tracking branches 'regulator/topic/max77686', 'regulator/topic/max8973', 'regulator/topic/maxim', 'regulator/topic/palmas' and 'regulator/topic/pv88080' into regulator-next 2016-05-13 14:23:38 +01:00
Mark Brown 78d5501cf4 Merge remote-tracking branches 'regulator/topic/can-change', 'regulator/topic/constrain', 'regulator/topic/debugfs' and 'regulator/topic/doc' into regulator-next 2016-05-13 14:23:27 +01:00
Mark Brown 8595bb27ce Merge remote-tracking branches 'regulator/topic/abb', 'regulator/topic/act8865', 'regulator/topic/as3722' and 'regulator/topic/axp20x' into regulator-next 2016-05-13 14:23:08 +01:00
Mark Brown 170b649e40 Merge remote-tracking branch 'regulator/topic/core' into regulator-next 2016-05-13 14:22:57 +01:00
James Hogan ca9eb49aa9 SIGNAL: Move generic copy_siginfo() to signal.h
The generic copy_siginfo() is currently defined in
asm-generic/siginfo.h, after including uapi/asm-generic/siginfo.h which
defines the generic struct siginfo. However this makes it awkward for an
architecture to use it if it has to define its own struct siginfo (e.g.
MIPS and potentially IA64), since it means that asm-generic/siginfo.h
can only be included after defining the arch-specific siginfo, which may
be problematic if the arch-specific definition needs definitions from
uapi/asm-generic/siginfo.h.

It is possible to work around this by first including
uapi/asm-generic/siginfo.h to get the constants before defining the
arch-specific siginfo, and include asm-generic/siginfo.h after. However
uapi headers can't be included by other uapi headers, so that first
include has to be in an ifdef __kernel__, with the non __kernel__ case
including the non-UAPI header instead.

Instead of that mess, move the generic copy_siginfo() definition into
linux/signal.h, which allows an arch-specific uapi/asm/siginfo.h to
include asm-generic/siginfo.h and define the arch-specific siginfo, and
for the generic copy_siginfo() to see that arch-specific definition.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Petr Malat <oss@malat.biz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Christopher Ferris <cferris@google.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 4.0-
Patchwork: https://patchwork.linux-mips.org/patch/12478/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-13 14:02:10 +02:00
Daniel Wagner e55d531244 crash_dump: Add vmcore_elf32_check_arch
parse_crash_elf{32|64}_headers will check the headers via the
elf_check_arch respectively vmcore_elf64_check_arch macro.

The MIPS architecture implements those two macros differently.
In order to make the differentiation more explicit, let's introduce
an vmcore_elf32_check_arch to allow the archs to overwrite it.

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Suggested-by: Maciej W. Rozycki <macro@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12535/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-13 14:01:59 +02:00
Paul Burton 835d2b4529 irqchip: mips-gic: Provide VP ID accessor
Provide a gic_read_local_vp_id() function to read the VCNUM field of the
GICs local VP_IDENT register. This will be used by a further patch to
check that the value reported by the GIC matches up with the kernels
calculation.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Bresticker <abrestic@chromium.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12334/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-13 14:01:49 +02:00
Rafał Miłecki 2ab71a02c5 MIPS: BCM47xx: Move SPROM driver to drivers/firmware/
Broadcom ARM home routers store SPROM content in NVRAM just like MIPS
ones. To share SPROM code we need to move it out of arch/mips/ to some
common place. We already have bcm47xx_nvram in firmware path and SPROM
should fit there as well.
This driver is responsible for parsing SoC configuration data into a
struct shared between ssb and bcma buses.
This was tested with BCM4706 & BCM5357C0 (BCM47XX) and BCM4708A0
(ARCH_BCM_5301X).

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12210/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-13 14:01:43 +02:00
Mark Brown 9689dab30a Merge remote-tracking branches 'regulator/fix/axp20x', 'regulator/fix/da9063', 'regulator/fix/gpio' and 'regulator/fix/s2mps11' into regulator-linus 2016-05-13 11:11:08 +01:00
Al Viro ae05327a00 ext4: switch to ->iterate_shared()
Note that we need relax_dir() equivalent for directories
locked shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-12 20:36:01 -04:00
Andrea Arcangeli 6d0a07edd1 mm: thp: calculate the mapcount correctly for THP pages during WP faults
This will provide fully accuracy to the mapcount calculation in the
write protect faults, so page pinning will not get broken by false
positive copy-on-writes.

total_mapcount() isn't the right calculation needed in
reuse_swap_page(), so this introduces a page_trans_huge_mapcount()
that is effectively the full accurate return value for page_mapcount()
if dealing with Transparent Hugepages, however we only use the
page_trans_huge_mapcount() during COW faults where it strictly needed,
due to its higher runtime cost.

This also provide at practical zero cost the total_mapcount
information which is needed to know if we can still relocate the page
anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we
can reuse the page no matter if it's a pte or a pmd_trans_huge
triggering the fault, but we can only relocate the page anon_vma to
the local vma->anon_vma if we're sure it's only this "vma" mapping the
whole THP physical range.

Kirill A. Shutemov discovered the problem with moving the page
anon_vma to the local vma->anon_vma in a previous version of this
patch and another problem in the way page_move_anon_rmap() was called.

Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a
previous version, because reuse_swap_page must be a macro to call
page_trans_huge_mapcount from swap.h, so this uses a macro again
instead of an inline function. With this change at least it's a less
dangerous usage than it was before, because "page" is used only once
now, while with the previous code reuse_swap_page(page++) would have
called page_mapcount on page+1 and it would have increased page twice
instead of just once.

Dean Luick noticed an uninitialized variable that could result in a
rmap inefficiency for the non-THP case in a previous version.

Mike Marciniszyn said:

: Our RDMA tests are seeing an issue with memory locking that bisects to
: commit 61f5d698cc ("mm: re-enable THP")
:
: The test program registers two rather large MRs (512M) and RDMA
: writes data to a passive peer using the first and RDMA reads it back
: into the second MR and compares that data.  The sizes are chosen randomly
: between 0 and 1024 bytes.
:
: The test will get through a few (<= 4 iterations) and then gets a
: compare error.
:
: Tracing indicates the kernel logical addresses associated with the individual
: pages at registration ARE correct , the data in the "RDMA read response only"
: packets ARE correct.
:
: The "corruption" occurs when the packet crosse two pages that are not physically
: contiguous.   The second page reads back as zero in the program.
:
: It looks like the user VA at the point of the compare error no longer points to
: the same physical address as was registered.
:
: This patch totally resolves the issue!

Link: http://lkml.kernel.org/r/1462547040-1737-2-git-send-email-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Tested-by: Josh Collier <josh.d.collier@intel.com>
Cc: Marc Haber <mh+linux-kernel@zugschlus.de>
Cc: <stable@vger.kernel.org>	[4.5]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-12 15:52:50 -07:00
Bjorn Andersson b3d39032d7 remoteproc: Add additional crash reasons
The Qualcomm WCNSS can crash by watchdog or a fatal software error. Add
these types to the list of remoteproc crash reasons.

Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-05-12 15:50:19 -07:00
Tomasz Nowicki c5076cfe76 PCI, of: Move PCI I/O space management to PCI core code
No functional changes in this patch.

PCI I/O space mapping code does not depend on OF; therefore it can be moved
to PCI core code.  This way we will be able to use it, e.g., in ACPI PCI
code.

Suggested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Liviu Dudau <Liviu.Dudau@arm.com>
2016-05-12 07:07:42 -05:00
Thomas Gleixner 50605ffbda sched/core: Provide a tsk_nr_cpus_allowed() helper
tsk_nr_cpus_allowed() is an accessor for task->nr_cpus_allowed which allows
us to change the representation of ->nr_cpus_allowed if required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1462969411-17735-2-git-send-email-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12 09:55:36 +02:00
Ingo Molnar 4eb8676517 Merge branch 'smp/hotplug' into sched/core, to resolve conflicts
Conflicts:
	kernel/sched/core.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12 09:51:36 +02:00
Ingo Molnar eb60b3e5e8 Merge branch 'sched/urgent' into sched/core to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12 09:18:13 +02:00
Kedareswara rao Appana 07b0e7d49c dmaengine: vdma: Add Support for Xilinx AXI Central Direct Memory Access Engine
This patch adds support for the AXI Central Direct Memory Access
(AXI CDMA) core to the existing vdma driver, AXI CDMA is a
soft Xilinx IP core that provides high-bandwidth
Direct Memory Access(DMA) between a memory-mapped
source address and a memory-mapped destination address.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-05-12 11:58:58 +05:30
Kedareswara rao Appana c0bba3a99f dmaengine: vdma: Add Support for Xilinx AXI Direct Memory Access Engine
This patch adds support for the AXI Direct Memory Access (AXI DMA)
core in the existing vdma driver, AXI DMA Core is a
soft Xilinx IP core that provides high-bandwidth
direct memory access between memory and AXI4-Stream
type target peripherals.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-05-12 11:58:30 +05:30
Yuval Mintz 831bfb0e88 qed*: Tx-switching configuration
Device should be configured by default to VEB once VFs are active.
This changes the configuration of both PFs' and VFs' vports into enabling
tx-switching once sriov is enabled.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:08 -04:00
Yuval Mintz 73390ac9d8 qed*: support ndo_get_vf_config
Allows the user to view the VF configuration by observing the PF's
device.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:08 -04:00
Yuval Mintz 6ddc760825 qed*: IOV support spoof-checking
Add support in `ndo_set_vf_spoofchk' for allowing PF control over
its VF spoof-checking configuration.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:08 -04:00
Yuval Mintz 733def6a04 qed*: IOV link control
This adds support in 2 ndo that allow PF to tweak the VF's view of the
link - `ndo_set_vf_link_state' to allow it a view independent of the PF's,
and `ndo_set_vf_rate' which would allow the PF to limit the VF speed.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:08 -04:00
Yuval Mintz eff169608c qed*: Support forced MAC
Allows the PF to enforce the VF's mac.
i.e., by using `ip link ... vf <x> mac <value>'.

While a MAC is forced, PF would prevent the VF from configuring any other
MAC.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:08 -04:00
Yuval Mintz 08feecd7fc qed*: Support PVID configuration
This adds support for PF control over the VF vlan configuration.
I.e., `ip link ... vf <x> vlan <vid>' should now be supported.

 1. <vid> != 0 => VF receives [unknowingly] only traffic tagged by
    <vid> and tags all outgoing traffic sent by VF with <vid>.
 2. <vid> == 0 ==> Remove the pvid configuration, reverting to previous.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:07 -04:00
Yuval Mintz 0b55e27d56 qed: IOV configure and FLR
While previous patches have already added the necessary logic to probe
VFs as well as enabling them in the HW, this patch adds the ability to
support VF FLR & SRIOV disable.

It then wraps both flows together into the first IOV callback to be
provided to the protocol driver - `configure'. This would later to be used
to enable and disable SRIOV in the adapter.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:07 -04:00
Yuval Mintz 1408cc1fa4 qed: Introduce VFs
This adds the qed VFs for the first time -
The vfs are limited functions, with a very different PCI bar structure
[when compared with PFs] to better impose the related security demands
associated with them.

This patch includes the logic neccesary to allow VFs to successfully probe
[without actually adding the ability to enable iov].
This includes diverging all the flows that would occur as part of the pci
probe of the driver, preventing VF from accessing registers/memories it
can't and instead utilize the VF->PF channel to query the PF for needed
information.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:07 -04:00
Yuval Mintz 37bff2b9c6 qed: Add VF->PF channel infrastructure
Communication between VF and PF is based on a dedicated HW channel;
VF will prepare a messge, and by signaling the HW the PF would get a
notification of that message existance. The PF would then copy the
message, process it and DMA an answer back to the VF as a response.

The messages themselves are TLV-based - allowing easier backward/forward
compatibility.

This patch adds the infrastructure of the channel on the PF side -
starting with the arrival of the notification and ending with DMAing
the response back to the VF.

It also adds a dummy-response as reference, as it only lays the
groundwork of the communication; it doesn't really add support of any
actual messages.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-12 00:04:07 -04:00
Tariq Toukan 7219ab34f1 net/mlx5e: CQE compression
CQE compression feature is meant to save PCIe bandwidth by
compressing few CQEs into smaller amount of bytes on PCIe.
CQE compression can be selectively enabled per CQ.  By default
is disabled for now and will be enabled later on.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-11 19:42:39 -04:00
David Ahern 74b20582ac net: l3mdev: Add hook in ip and ipv6
Currently the VRF driver uses the rx_handler to switch the skb device
to the VRF device. Switching the dev prior to the ip / ipv6 layer
means the VRF driver has to duplicate IP/IPv6 processing which adds
overhead and makes features such as retaining the ingress device index
more complicated than necessary.

This patch moves the hook to the L3 layer just after the first NF_HOOK
for PRE_ROUTING. This location makes exposing the original ingress device
trivial (next patch) and allows adding other NF_HOOKs to the VRF driver
in the future.

dev_queue_xmit_nit is exported so that the VRF driver can cycle the skb
with the switched device through the packet taps to maintain current
behavior (tcpdump can be used on either the vrf device or the enslaved
devices).

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-11 19:31:40 -04:00
Alex Williamson 14717e2031 kvm: Conditionally register IRQ bypass consumer
If we don't support a mechanism for bypassing IRQs, don't register as
a consumer.  This eliminates meaningless dev_info()s when the connect
fails between producer and consumer, such as on AMD systems where
kvm_x86_ops->update_pi_irte is not implemented

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:55 +02:00
Alex Williamson b52f3ed022 irqbypass: Disallow NULL token
A NULL token is meaningless and can only lead to unintended problems.
Error on registration with a NULL token, ignore de-registrations with
a NULL token.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:54 +02:00
Greg Kurz 0b1b1dfd52 kvm: introduce KVM_MAX_VCPU_ID
The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
also the upper limit for vCPU ids. This is okay for all archs except PowerPC
which can have higher ids, depending on the cpu/core/thread topology. In the
worst case (single threaded guest, host with 8 threads per core), it limits
the maximum number of vCPUS to KVM_MAX_VCPUS / 8.

This patch separates the vCPU numbering from the total number of vCPUs, with
the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
plus one.

The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
before passing them to KVM_CREATE_VCPU.

This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
Other archs continue to return KVM_MAX_VCPUS instead.

Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:54 +02:00
Greg Kurz 9b9e3fc4d5 KVM: remove NULL return path for vcpu ids >= KVM_MAX_VCPUS
Commit c896939f7c ("KVM: use heuristic for fast VCPU lookup by id") added
a return path that prevents vcpu ids to exceed KVM_MAX_VCPUS. This is a
problem for powerpc where vcpu ids can grow up to 8*KVM_MAX_VCPUS.

This patch simply reverses the logic so that we only try fast path if the
vcpu id can be tried as an index in kvm->vcpus[]. The slow path is not
affected by the change.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-11 22:37:53 +02:00
Paolo Bonzini bdb4094eb5 KVM/ARM Changes for Linux v4.7
Reworks our stage 2 page table handling to have page table manipulation
 macros separate from those of the host systems as the underlying
 hardware page tables can be configured to be noticably different in
 layout from the stage 1 page tables used by the host.
 
 Adds 16K page size support based on the above.
 
 Adds a generic firmware probing layer for the timer and GIC so that KVM
 initializes using the same logic based on both ACPI and FDT.
 
 Finally adds support for hardware updating of the access flag.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXMzKGAAoJEEtpOizt6ddyQ0AH/RS1tn3obEnujzD1yFeiSNS7
 kBTwDEDdsAE7RJ12c43knfGaO4JO9+V2o1F5/16+/mjWDUJwAsm0yzSxvlxiS/+o
 Q3QIfAbXxj/xia+sDFEtuSpRLL7Kl9oYeKBc5BijobvIQ5PKWm41kxehS8phMovQ
 hiC2WQ5Wm1ww9L6AcI3gf8jqj4GJ/v+RSWzMTmPA7Wm7l03VGFn+G6AOnO0Rx6Fp
 aRhI3dgvMAeMV8DXKTCdZggPrZ/ipLU+LgU+FwUXx2Ru9VfjU94MBEJG3FsTjNM0
 UT1NLQ3kZaiYjlW/tN8WCXQDK1AUFDWUCHW7p77mq9w3cSylQBUlKbjRzTtjBsM=
 =8PZF
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM Changes for Linux v4.7

Reworks our stage 2 page table handling to have page table manipulation
macros separate from those of the host systems as the underlying
hardware page tables can be configured to be noticably different in
layout from the stage 1 page tables used by the host.

Adds 16K page size support based on the above.

Adds a generic firmware probing layer for the timer and GIC so that KVM
initializes using the same logic based on both ACPI and FDT.

Finally adds support for hardware updating of the access flag.
2016-05-11 22:37:37 +02:00
Ingo Molnar d2950158d0 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-11 16:56:38 +02:00
Marc Zyngier 074f23b675 irqchip/gic-v3: Remove inexistant register definition
The GICv3 include file defines GICR_ISACTIVER and GICR_ICACTIVER
in the RD_base page. News flash, they do not exist (probably
a copy/paste brain fart). Just drop them.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-11 10:12:12 +01:00
Al Viro e4d35be584 Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
Miklos Szeredi 3c9fe8cdff vfs: add lookup_hash() helper
Overlayfs needs lookup without inode_permission() and already has the name
hash (in form of dentry->d_name on overlayfs dentry).  It also doesn't
support filesystems with d_op->d_hash() so basically it only needs
the actual hashed lookup from lookup_one_len_unlocked()

So add a new helper that does unlocked lookup of a hashed name.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-05-10 23:56:28 -04:00
Miklos Szeredi 54d5ca871e vfs: add vfs_select_inode() helper
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
2016-05-10 23:55:01 -04:00
Nicolas Dichtel 1dee3f59a8 block/drbd: align properly u64 in nl messages
The attribute 0 is never used in drbd, so let's use it as pad attribute
in netlink messages. This minimizes the patch.

Note that this patch is only compile-tested.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10 15:43:09 -04:00
Philippe Reynes 9d9a77cee1 net: phy: add phy_ethtool_{get|set}_link_ksettings
Ethtool callbacks {get|set}_link_ksettings are often the same, so
we add two generics functions phy_ethtool_{get|set}_link_ksettings
to avoid writing severals times the same function.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-By: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10 15:06:19 -04:00
David S. Miller 7a27de7810 linux-can-next-for-4.7-20160509
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCgAGBQJXMGAcAAoJED07qiWsqSVqR7AH/RTuW5SeDFQGI1YK4U6ekrbg
 +22EDLyUh+MD/eBKf74C9jciaTnd84PAYCOEBa6rXi/2P1gHMnyEIJOxse/cfgKz
 Hf26avGjaTCPS7VFHJeLTSrOlR/Hogl5gp+SEjA4WD1cpr480lS3sgGjax8YTY20
 sNl2xJqnFVjkJAa0f7AsmaZRHsyytvPbS5c8z7RuihhX1yamTPm8BKqY7s4oJ83n
 Rg2/fXV6O1Dg+p/2qra7kyMGj6wIIXOI9wXPjLNXuR6nqT3vWhGaKy+pkl/Ok2JY
 UvwDeb7UvgXcypv5FO3LW9R7vqF5L9ZpqS2XCrlTwoFct7bCOCH1xJFGaXV/Cbo=
 =Eipf
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-4.7-20160509' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2016-05-09

this is a pull request of 12 patches for net-next/master.

Alexander Gerasiov and Nikita Edward Baruzdin each contribute a patch
improving the sja1000 driver. Amitoj Kaur Chawla's patch converts the
mcp251x driver to alloc_workqueue(). A patch by Oliver Hartkopp fixes
the handling of CAN config options. Andreas Gröger improves the error
handling in the janz-ican3 driver. The patch by Maximilian Schneider
for the gs_usb improves probing of the USB driver. Finally there are 6
improvement patches by Marek Vasut for the ifi CAN driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10 14:19:06 -04:00
Shaohua Li 59fa0224cf blk-throttle: don't parse cgroup path if trace isn't enabled
if trace isn't enabled, parsing cgroup path just wastes cpu

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-10 08:41:37 -06:00
David S. Miller e800072c18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
In netdevice.h we removed the structure in net-next that is being
changes in 'net'.  In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.

The mlx5 conflicts have to do with vxlan support dependencies.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 15:59:24 -04:00
Linus Torvalds 26acc792c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Check klogctl failure correctly, from Colin Ian King.

 2) Prevent OOM when under memory pressure in flowcache, from Steffen
    Klassert.

 3) Fix info leak in llc and rtnetlink ifmap code, from Kangjie Lu.

 4) Memory barrier and multicast handling fixes in bnxt_en, from Michael
    Chan.

 5) Endianness bug in mlx5, from Daniel Jurgens.

 6) Fix disconnect handling in VSOCK, from Ian Campbell.

 7) Fix locking of netdev list walking in get_bridge_ifindices(), from
    Nikolay Aleksandrov.

 8) Bridge multicast MLD parser can look at wrong packet offsets, fix
    from Linus Lüssing.

 9) Fix chip hang in qede driver, from Sudarsana Reddy Kalluru.

10) Fix missing setting of encapsulation before inner handling completes
    in udp_offload code, from Jarno Rajahalme.

11) Missing rollbacks during LAG join and flood configuration failures
    in mlxsw driver, from Ido Schimmel.

12) Fix error code checks in netxen driver, from Dan Carpenter.

13) Fix key size in new macsec driver, from Sabrina Dubroca.

14) Fix mlx5/VXLAN dependencies, from Arnd Bergmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
  net/mlx5e: make VXLAN support conditional
  Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
  macsec: key identifier is 128 bits, not 64
  Documentation/networking: more accurate LCO explanation
  macvtap: segmented packet is consumed
  tools: bpf_jit_disasm: check for klogctl failure
  qede: uninitialized variable in qede_start_xmit()
  netxen: netxen_rom_fast_read() doesn't return -1
  netxen: reversed condition in netxen_nic_set_link_parameters()
  netxen: fix error handling in netxen_get_flash_block()
  mlxsw: spectrum: Add missing rollback in flood configuration
  mlxsw: spectrum: Fix rollback order in LAG join failure
  udp_offload: Set encapsulation before inner completes.
  udp_tunnel: Remove redundant udp_tunnel_gro_complete().
  qede: prevent chip hang when increasing channels
  net: ipv6: tcp reset, icmp need to consider L3 domain
  bridge: fix igmp / mld query parsing
  net: bridge: fix old ioctl unlocked net device walk
  VSOCK: do not disconnect socket when peer has shutdown SEND only
  net/mlx4_en: Fix endianness bug in IPV6 csum calculation
  ...
2016-05-09 12:11:37 -07:00
David S. Miller e8ed77dfa9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following large patchset contains Netfilter updates for your
net-next tree. My initial intention was to send you this in two goes but
when I looked back twice I already had this burden on top of me.

Several updates for IPVS from Marco Angaroni:

1) Allow SIP connections originating from real-servers to be load
   balanced by the SIP persistence engine as is already implemented
   in the other direction.

2) Release connections immediately for One-packet-scheduling (OPS)
   in IPVS, instead of making it via timer and rcu callback.

3) Skip deleting conntracks for each one packet in OPS, and don't call
   nf_conntrack_alter_reply() since no reply is expected.

4) Enable drop on exhaustion for OPS + SIP persistence.

Miscelaneous conntrack updates from Florian Westphal, including fix for
hash resize:

5) Move conntrack generation counter out of conntrack pernet structure
   since this is only used by the init_ns to allow hash resizing.

6) Use get_random_once() from packet path to collect hash random seed
    instead of our compound.

7) Don't disable BH from ____nf_conntrack_find() for statistics,
   use NF_CT_STAT_INC_ATOMIC() instead.

8) Fix lookup race during conntrack hash resizing.

9) Introduce clash resolution on conntrack insertion for connectionless
   protocol.

Then, Florian's netns rework to get rid of per-netns conntrack table,
thus we use one single table for them all. There was consensus on this
change during the NFWS 2015 and, on top of that, it has recently been
pointed as a source of multiple problems from unpriviledged netns:

11) Use a single conntrack hashtable for all namespaces. Include netns
    in object comparisons and make it part of the hash calculation.
    Adapt early_drop() to consider netns.

12) Use single expectation and NAT hashtable for all namespaces.

13) Use a single slab cache for all namespaces for conntrack objects.

14) Skip full table scanning from nf_ct_iterate_cleanup() if the pernet
    conntrack counter tells us the table is empty (ie. equals zero).

Fixes for nf_tables interval set element handling, support to set
conntrack connlabels and allow set names up to 32 bytes.

15) Parse element flags from element deletion path and pass it up to the
    backend set implementation.

16) Allow adjacent intervals in the rbtree set type for dynamic interval
    updates.

17) Add support to set connlabel from nf_tables, from Florian Westphal.

18) Allow set names up to 32 bytes in nf_tables.

Several x_tables fixes and updates:

19) Fix incorrect use of IS_ERR_VALUE() in x_tables, original patch
    from Andrzej Hajda.

And finally, miscelaneous netfilter updates such as:

20) Disable automatic helper assignment by default. Note this proc knob
    was introduced by a900689264 ("netfilter: nf_ct_helper: allow to
    disable automatic helper assignment") 4 years ago to start moving
    towards explicit conntrack helper configuration via iptables CT
    target.

21) Get rid of obsolete and inconsistent debugging instrumentation
    in x_tables.

22) Remove unnecessary check for null after ip6_route_output().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 15:02:58 -04:00
Josh Poimboeuf 8634de6d25 compiler-gcc: require gcc 4.8 for powerpc __builtin_bswap16()
gcc support for __builtin_bswap16() was supposedly added for powerpc in
gcc 4.6, and was then later added for other architectures in gcc 4.8.

However, Stephen Rothwell reported that attempting to use it on powerpc
in gcc 4.6 fails with:

  lib/vsprintf.c:160:2: error: initializer element is not constant
  lib/vsprintf.c:160:2: error: (near initialization for 'decpair[0]')
  lib/vsprintf.c:160:2: error: initializer element is not constant
  lib/vsprintf.c:160:2: error: (near initialization for 'decpair[1]')
  ...

I'm not entirely sure what those errors mean, but I don't see them on
gcc 4.8.  So let's consider gcc 4.8 to be the official starting point
for __builtin_bswap16().

Arnd Bergmann adds:
 "I found the commit in gcc-4.8 that replaced the powerpc-specific
  implementation of __builtin_bswap16 with an architecture-independent
  one.  Apparently the powerpc version (gcc-4.6 and 4.7) just mapped to
  the lhbrx/sthbrx instructions, so it ended up not being a constant,
  though the intent of the patch was mainly to add support for the
  builtin to x86:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52624

  has the patch that went into gcc-4.8 and more information."

Fixes: 7322dd755e ("byteswap: try to avoid __builtin_constant_p gcc bug")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-09 11:54:29 -07:00
Joerg Roedel 6c0b43df74 Merge branches 'arm/io-pgtable', 'arm/rockchip', 'arm/omap', 'x86/vt-d', 'ppc/pamu', 'core' and 'x86/amd' into next 2016-05-09 19:39:17 +02:00
Guoqing Jiang 85ad1d13ee md: set MD_CHANGE_PENDING in a atomic region
Some code waits for a metadata update by:

1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN)
2. setting MD_CHANGE_PENDING and waking the management thread
3. waiting for MD_CHANGE_PENDING to be cleared

If the first two are done without locking, the code in md_update_sb()
which checks if it needs to repeat might test if an update is needed
before step 1, then clear MD_CHANGE_PENDING after step 2, resulting
in the wait returning early.

So make sure all places that set MD_CHANGE_PENDING are atomicial, and
bit_clear_unless (suggested by Neil) is introduced for the purpose.

Cc: Martin Kepplinger <martink@posteo.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: <linux-kernel@vger.kernel.org>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-05-09 09:24:02 -07:00
Serge E. Hallyn 4f41fc5962 cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces
Patch summary:

When showing a cgroupfs entry in mountinfo, show the path of the mount
root dentry relative to the reader's cgroup namespace root.

Short explanation (courtesy of mkerrisk):

If we create a new cgroup namespace, then we want both /proc/self/cgroup
and /proc/self/mountinfo to show cgroup paths that are correctly
virtualized with respect to the cgroup mount point.  Previous to this
patch, /proc/self/cgroup shows the right info, but /proc/self/mountinfo
does not.

Long version:

When a uid 0 task which is in freezer cgroup /a/b, unshares a new cgroup
namespace, and then mounts a new instance of the freezer cgroup, the new
mount will be rooted at /a/b.  The root dentry field of the mountinfo
entry will show '/a/b'.

 cat > /tmp/do1 << EOF
 mount -t cgroup -o freezer freezer /mnt
 grep freezer /proc/self/mountinfo
 EOF

 unshare -Gm  bash /tmp/do1
 > 330 160 0:34 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer
 > 355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer

The task's freezer cgroup entry in /proc/self/cgroup will simply show
'/':

 grep freezer /proc/self/cgroup
 9:freezer:/

If instead the same task simply bind mounts the /a/b cgroup directory,
the resulting mountinfo entry will again show /a/b for the dentry root.
However in this case the task will find its own cgroup at /mnt/a/b,
not at /mnt:

 mount --bind /sys/fs/cgroup/freezer/a/b /mnt
 130 25 0:34 /a/b /mnt rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer

In other words, there is no way for the task to know, based on what is
in mountinfo, which cgroup directory is its own.

Example (by mkerrisk):

First, a little script to save some typing and verbiage:

echo -e "\t/proc/self/cgroup:\t$(cat /proc/self/cgroup | grep freezer)"
cat /proc/self/mountinfo | grep freezer |
        awk '{print "\tmountinfo:\t\t" $4 "\t" $5}'

Create cgroup, place this shell into the cgroup, and look at the state
of the /proc files:

2653
2653                         # Our shell
14254                        # cat(1)
        /proc/self/cgroup:      10:freezer:/a/b
        mountinfo:              /       /sys/fs/cgroup/freezer

Create a shell in new cgroup and mount namespaces. The act of creating
a new cgroup namespace causes the process's current cgroups directories
to become its cgroup root directories. (Here, I'm using my own version
of the "unshare" utility, which takes the same options as the util-linux
version):

Look at the state of the /proc files:

        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /       /sys/fs/cgroup/freezer

The third entry in /proc/self/cgroup (the pathname of the cgroup inside
the hierarchy) is correctly virtualized w.r.t. the cgroup namespace, which
is rooted at /a/b in the outer namespace.

However, the info in /proc/self/mountinfo is not for this cgroup
namespace, since we are seeing a duplicate of the mount from the
old mount namespace, and the info there does not correspond to the
new cgroup namespace. However, trying to create a new mount still
doesn't show us the right information in mountinfo:

                                      # propagating to other mountns
        /proc/self/cgroup:      7:freezer:/
        mountinfo:              /a/b    /mnt/freezer

The act of creating a new cgroup namespace caused the process's
current freezer directory, "/a/b", to become its cgroup freezer root
directory. In other words, the pathname directory of the directory
within the newly mounted cgroup filesystem should be "/",
but mountinfo wrongly shows us "/a/b". The consequence of this is
that the process in the cgroup namespace cannot correctly construct
the pathname of its cgroup root directory from the information in
/proc/PID/mountinfo.

With this patch, the dentry root field in mountinfo is shown relative
to the reader's cgroup namespace.  So the same steps as above:

        /proc/self/cgroup:      10:freezer:/a/b
        mountinfo:              /       /sys/fs/cgroup/freezer
        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /../..  /sys/fs/cgroup/freezer
        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /       /mnt/freezer

cgroup.clone_children  freezer.parent_freezing  freezer.state      tasks
cgroup.procs           freezer.self_freezing    notify_on_release
3164
2653                   # First shell that placed in this cgroup
3164                   # Shell started by 'unshare'
14197                  # cat(1)

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09 12:15:03 -04:00
Al Viro 884be17535 nfs: per-name sillyunlink exclusion
use d_alloc_parallel() for sillyunlink/lookup exclusion and
explicit rwsem (nfs_rmdir() being a writer and nfs_call_unlink() -
a reader) for rmdir/sillyunlink one.

That ought to make lookup/readdir/!O_CREAT atomic_open really
parallel on NFS.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-09 11:39:45 -04:00
Laxman Dewangan 327156c593 mfd: max77620: Add core driver for MAX77620/MAX20024
MAX77620/MAX20024 are Power Management IC from the MAXIM.
It supports RTC, multiple GPIOs, multiple DCDC and LDOs,
watchdog, clock etc.

Add MFD drier to provides common support for accessing the
device; additional drivers is developed on respected subsystem
in order to use the functionality of the device.

Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mallikarjun Kasoju <mkasoju@nvidia.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-05-09 15:42:00 +01:00
Arnd Bergmann 11c3aead82 Legacy booting vs device tree booting fixes for omaps for
v4.7 merge window. These are not considered urgent fixes enough
 for the v4.6-rc cycle, but we need them in v4.7 in order to drop
 the last remaining board-*.c files for omap3 for v4.8 merge window.
 
 On Nokia N900, we need to pass the MMC slot names for the legacy
 user space to work. Let's do that using auxdata as the driver is
 setting up things already with the pdata for legacy booting. Then
 we can later on discuss if we may want to have some generic binding
 describing where the MMC slots are on the device.
 
 N900 also has had the ir-rx51 device driver unusable with multiarch
 for a long time. Let's pass the dmtimer data in pdata for the driver
 to get it going again. Then once things are working, we can eventually
 change the driver to use just hrtimer and PWM framework. The driver
 changes will be queued separately.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXKi9zAAoJEBvUPslcq6VzfNcP+wRJTBKja7wxwxo8R2DKbYgw
 oJXcldqdqJZIaHnJUR7uFDHZu1Jt6ZUM6WZQAEQTozPtMPnufM3VNgREz/LUqcCa
 /vNQUBbvGwiaP3KOk3jNX1HuzSFhoAIQxTkE07FEBziZc0Soq7Cs//mUP1HZOrG2
 a2+5MRM1TsVqIf0vSHUSZ9ez6iJpbpq2brdFb0upJwV9DFUcDouBBjd1atOPBJqW
 mZ4myDqgbSSY7fL2cPEGnqVR0tqNTwrToWNxZFBP5T9IrI1oOe0MfI2vDgdHFf9h
 mrVVITSRvwte751Xheze069ktzI1qJoIUDcyXN1+kTAC7jp3q7Kjo8fzqMVPfcqE
 zn+Sj7C1l+1a0WjXjzhJhZgJBQkHwJubyJv3cGhb3HdkHbjIp7WURbcMF6gElEH7
 J9eA+aHHyd3PW0mfdjuLY0gI5XEoBvRLiboZIewjQ9hIMgD+bKC0FZ/h313xo6eG
 uPgobT9JBDmcv/EQw2eCH9kVvbf/gudNpCNgdoHcBf3V/EiSPl5PKxhgHtLM4WFj
 S4rVg5X3MPglQay436SwSwmFYxk3WDo03qLDhn7DWPxxeCVuJDscvgzPhRuVKH4A
 ox1a17UI1510M6bxoQLGFKK//Q2rNFEQb0YDImlwfsB7ZO/WV3mxn/KX4q2wDWhB
 WlvMCPGZPi4K38utIQuW
 =klmr
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.7/legacy-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/fixes-non-critical

Merge "omap legacy boot vs dt boot fixes for v4.7 merge window"
from Tony Lindgren:

Legacy booting vs device tree booting fixes for omaps for
v4.7 merge window. These are not considered urgent fixes enough
for the v4.6-rc cycle, but we need them in v4.7 in order to drop
the last remaining board-*.c files for omap3 for v4.8 merge window.

On Nokia N900, we need to pass the MMC slot names for the legacy
user space to work. Let's do that using auxdata as the driver is
setting up things already with the pdata for legacy booting. Then
we can later on discuss if we may want to have some generic binding
describing where the MMC slots are on the device.

N900 also has had the ir-rx51 device driver unusable with multiarch
for a long time. Let's pass the dmtimer data in pdata for the driver
to get it going again. Then once things are working, we can eventually
change the driver to use just hrtimer and PWM framework. The driver
changes will be queued separately.

* tag 'omap-for-v4.7/legacy-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: n900 needs MMC slot names for legacy user space
  ARM: OMAP2+: Add more functions to pwm pdata for ir-rx51
2016-05-09 16:39:14 +02:00
Arnd Bergmann 1644de03cb Reset controller changes for v4.7
- add missing stub for device_reset
 - add support for OXNAS SoCs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXKFfkAAoJEFDCiBxwnmDrLHkP/2jdM9kuoUmGMHP/ikaV3TaP
 sE0sfyC4LNFC1pglAqtGtGCXlG3jHj304dqgfpJ7NzMNNC3Ch1scMlLC27+ocM+q
 Wi/vPSAZIb7myH4nTuc8539bD9G31xBt3enh6VnsZ/vjtSJBc35LtkJxmJ3F77xu
 VdRJfWSbY9f1gQ76AJmeVNtGLf7wX4rCNM+qNvtwciWz1sFuWc0rVoKYBA6qfXAX
 RRK5W7y6+57u6Gui7VE+Yu4gDz4sdDElVGp0yD1QuZbL2i6hJriJU/b0/SlK05p+
 6qAU1pQ5i6/1SIkD0F5ISdXLAJqooVP0FPVVclyG7lU3l9XmdQSj5oT0KONBE0cd
 gIXsNygk5sfBTPFy0imrZk3bSmsfBDmA5BOJn1br+06tgt6DDvidLWtOC5h1r9Hb
 8llCpXspfC5HIHj8UNitUfaOBYt9dlPTLhON1/8jZZU8xACBEt2aG9xehZT1JDSE
 nZIccLrCbMAeqsivTgaJPjDJEP88gWa9LawFM+I9bGqxDc+Hv/CPaaqnEgP+ba6S
 FfaRwOU3UKVv7RZyBJD6JNUC71thwWBOZ9xoMdC4Q80xRWi4TiMijFwz84Rny5GY
 OO5dhlMXMFdUqnuKXBo1/BwkhBidqeKj30dJ95rX5Hse+YTm7a+fR4g3UlzMyOHI
 Nyy9HDEXCOP6SBonA91J
 =mbVz
 -----END PGP SIGNATURE-----

Merge tag 'reset-for-4.7-2' of git://git.pengutronix.de/git/pza/linux into next/drivers

Merge "Reset controller changes for v4.7" from Philipp Zabel:

- add missing stub for device_reset
- add support for OXNAS SoCs

* tag 'reset-for-4.7-2' of git://git.pengutronix.de/git/pza/linux:
  reset: Add missing function stub for device_reset
  dt-bindings: Add Oxford Semiconductor Reset Controller bindings
  reset: Add Oxford Semiconductor Reset Controller driver
2016-05-09 16:32:29 +02:00
Arnd Bergmann 4ace926172 phy: tegra: Changes for v4.7-rc1
This set of patches adds support for the Tegra XUSB pad controller. The
 controller provides a set of pads (lanes) that are used for I/O by other
 IP blocks within Tegra SoCs (PCIe, SATA and XUSB).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXI3hiAAoJEN0jrNd/PrOhSeIP/0ziDYSAgTbB30L2CJNLnyp3
 xnbn6YHLsZVcwD4Pxlu9TW2z4iqCz3BU7Glbv0zc8tGGW9OzaPPH2M/+Vl8neM5O
 fRaP/hd2FlW9leiPU/xNU4gdGycRuk94clxObNtS8g3qKxl2KsQeZBWiMfIPsJRE
 IvK57SmLtznDgigtV2xJjH90OkwycAWQBi6r7pcttnLWB5qAEkMl0EnVz35q6EOM
 8EXOSjMATdrxRwE3FjRDzzSWPUpRHG61DC4krMpo8VgHXyqUdR1o5VwxEPIBcL3W
 td/oPZrUNAa7z/IoSdH9SD9IEc1OIwZOmcwFkOsFjFRn118gx+RE7pd8QkvKaCGU
 CDfqHS76pgOrnHOLWCtuYMagrPtwI2H8KOqx2VOLdKQghez1ykk0gCpKqp5CLGC1
 G4VQp1jK7y4dB97K3C5/8WfntNSczE+kb61B3Q3gHKaP8GfvBtdhGfP+POXwp4bC
 rrw8kienv3sws11GTZXMHhQWVGgbWxenPh+Fjj43fnI8YSoweC9xErWmxQyUYcov
 xv+ryi+BoGyr36IbX2dng7peRzgxGadMRFwOJ0EVBw27nGZNuULwgtFSgYcCTZRY
 6wJKxGX8EWpI034LVyhbrPy34cIFTSkzZCeYvENWzPNCw7y+bC7hjBewPMPDNF7U
 63BEtuIczTyzgUjcav+s
 =JRT5
 -----END PGP SIGNATURE-----

Merge tag 'tegra-for-4.7-phy' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into next/drivers

Merge "phy: tegra: Changes for v4.7-rc1" from Thierry Reding:

This set of patches adds support for the Tegra XUSB pad controller. The
controller provides a set of pads (lanes) that are used for I/O by other
IP blocks within Tegra SoCs (PCIe, SATA and XUSB).

* tag 'tegra-for-4.7-phy' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  phy: tegra: Add Tegra210 support
  phy: Add Tegra XUSB pad controller support
  dt-bindings: phy: tegra-xusb-padctl: Add Tegra210 support
  dt-bindings: phy: Add NVIDIA Tegra XUSB pad controller binding
  phy: core: Allow children node to be overridden
  clk: tegra: Add interface to enable hardware control of SATA/XUSB PLLs
2016-05-09 16:18:37 +02:00
Robin Murphy 3b6b7e19e3 iommu/dma: Finish optimising higher-order allocations
Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09 15:33:29 +02:00
Robin Murphy d16e0faab9 iommu: Allow selecting page sizes per domain
Many IOMMUs support multiple page table formats, meaning that any given
domain may only support a subset of the hardware page sizes presented in
iommu_ops->pgsize_bitmap. There are also certain use-cases where the
creator of a domain may want to control which page sizes are used, for
example to force the use of hugepage mappings to reduce pagetable walk
depth.

To this end, add a per-domain pgsize_bitmap to represent the subset of
page sizes actually in use, to make it possible for domains with
different requirements to coexist.

Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: hijacked and rebased original patch with new commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09 15:33:29 +02:00
Robin Murphy 53c92d7933 iommu: of: enforce const-ness of struct iommu_ops
As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09 15:33:29 +02:00
Will Deacon 3c3e8943ac iommu: remove unused priv field from struct iommu_ops
The priv field from iommu_ops is a hangover from the of_dma_configure
series and isn't actually used. Remove it before it has chance to
spread.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09 15:33:29 +02:00
Joerg Roedel 8801561ce0 Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu 2016-05-09 12:03:37 +02:00
Oliver Hartkopp bb208f144c can: fix handling of unmodifiable configuration options
As described in 'can: m_can: tag current CAN FD controllers as non-ISO'
(6cfda7fbeb) it is possible to define fixed configuration options by
setting the according bit in 'ctrlmode' and clear it in 'ctrlmode_supported'.
This leads to the incovenience that the fixed configuration bits can not be
passed by netlink even when they have the correct values (e.g. non-ISO, FD).

This patch fixes that issue and not only allows fixed set bit values to be set
again but now requires(!) to provide these fixed values at configuration time.
A valid CAN FD configuration consists of a nominal/arbitration bittiming, a
data bittiming and a control mode with CAN_CTRLMODE_FD set - which is now
enforced by a new can_validate() function. This fix additionally removed the
inconsistency that was prohibiting the support of 'CANFD-only' controller
drivers, like the RCar CAN FD.

For this reason a new helper can_set_static_ctrlmode() has been introduced to
provide a proper interface to handle static enabled CAN controller options.

Reported-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Reviewed-by: Ramesh Shanmugasundaram  <ramesh.shanmugasundaram@bp.renesas.com>
Cc: <stable@vger.kernel.org> # >= 3.18
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-05-09 11:07:28 +02:00
Dan Carpenter 0e10d549f6 mfd: wm8400-core: Delete wm8400_reg_read()
There was a static checker warning in wm8400_reg_read() because we were
returning u16 and that can't hold the negative error codes.  The
function isn't used, so let's just delete it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-05-09 08:20:45 +01:00
Courtney Cavin bdabad3e36 net: Add Qualcomm IPC router
Add an implementation of Qualcomm's IPC router protocol, used to
communicate with service providing remote processors.

Signed-off-by: Courtney Cavin <courtney.cavin@sonymobile.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
[bjorn: Cope with 0 being a valid node id and implement RTM_NEWADDR]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-08 23:46:14 -04:00
Bjorn Andersson 43315f31ad soc: qcom: smd: Introduce compile stubs
Introduce compile stubs for the SMD API, allowing consumers to be
compile tested.

Acked-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-08 23:46:14 -04:00
Julia Lawall 1cfd63166c efi: Merge boolean flag arguments
The parameters atomic and duplicates of efivar_init always have opposite
values.  Drop the parameter atomic, replace the uses of !atomic with
duplicates, and update the call sites accordingly.

The code using duplicates is slightly reorganized with an 'else', to avoid
duplicating the lock code.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Saurabh Sengar <saurabh.truth@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462570771-13324-5-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-07 07:06:13 +02:00
Ingo Molnar 35dc9ec107 Merge branch 'linus' into efi/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-07 07:00:07 +02:00
Jarno Rajahalme 229740c631 udp_offload: Set encapsulation before inner completes.
UDP tunnel segmentation code relies on the inner offsets being set for
an UDP tunnel GSO packet, but the inner *_complete() functions will
set the inner offsets only if 'encapsulation' is set before calling
them.  Currently, udp_gro_complete() sets 'encapsulation' only after
the inner *_complete() functions are done.  This causes the inner
offsets having invalid values after udp_gro_complete() returns, which
in turn will make it impossible to properly segment the packet in case
it needs to be forwarded, which would be visible to the user either as
invalid packets being sent or as packet loss.

This patch fixes this by setting skb's 'encapsulation' in
udp_gro_complete() before calling into the inner complete functions,
and by making each possible UDP tunnel gro_complete() callback set the
inner_mac_header to the beginning of the tunnel payload.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Reviewed-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 18:25:26 -04:00
Rafael J. Wysocki dab2e29402 Merge back new device properties material for v4.7. 2016-05-06 22:07:33 +02:00
Rafael J. Wysocki c8541203a6 Merge back new material for v4.7. 2016-05-06 22:05:16 +02:00
Alexei Starovoitov db58ba4592 bpf: wire in data and data_end for cls_act_bpf
allow cls_bpf and act_bpf programs access skb->data and skb->data_end pointers.
The bpf helpers that change skb->data need to update data_end pointer as well.
The verifier checks that programs always reload data, data_end pointers
after calls to such bpf helpers.
We cannot add 'data_end' pointer to struct qdisc_skb_cb directly,
since it's embedded as-is by infiniband ipoib, so wrapper struct is needed.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 16:01:54 -04:00
Linus Torvalds 01ec716761 Power management and ACPI fixes for v4.6-rc7
- Fix for a recent regression in the intel_pstate driver causing
    it to fail to restore the HWP (HW-managed P-states) configuration
    of the boot CPU after suspend-to-RAM (Rafael Wysocki).
 
  - Fix for two recent regressions in the intel_pstate driver, one
    that can trigger a divide by zero if the driver is accessed via
    sysfs before it manages to take the first sample and one causing
    it to fail to update a structure field used in a trace point, so
    the information coming from it is less useful (Rafael Wysocki).
 
  - Fix for a problem in the sti-cpufreq driver introduced during
    the 4.5 cycle that causes it to break CPU PM in multi-platform
    kernels by registering cpufreq-dt (which subsequently doesn't
    work) unconditionally and preventing the driver that would
    actually work from registering (Sudeep Holla).
 
  - Stable-candidate fix for an ARM64 cpuidle issue causing idle
    state usage counters to be incorrectly updated for idle states
    that were not entered due to errors (James Morse).
 
  - Fix for a recently introduced issue in the OPP (Operating
    Performance Points) framework causing it to print bogus error
    messages for missing optional regulators (Viresh Kumar).
 
  - Fix for a recently introduced issue in the generic device
    properties framework that may cause it to attempt to dereferece
    and invalid pointer in some cases (Heikki Krogerus).
 
  - Fix for a deadlock in the ACPICA core that may be triggered
    by device (eg. Thunderbolt) hotplug (Prarit Bhargava).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXLIwPAAoJEILEb/54YlRxT+wP/ROEo/r5IaRZ2k8cphWjsiKk
 k9eDuWBL2KZ29ikghXs/vVY2fMbtQkaDT5h57imsUEKoEzI3MlYA3OkQyffFOcsY
 dz/9EnG6K9Efi6VS1dS1tNCgl45aIeHLCqlVPOBCZ9TwSoAERdNJGqItJdS2YKIA
 +C1LGrWl4UiJ95AOof9PHfKfnWxrnRbpIsB2PbxD0Swe5vfskrHoRWGOAMLJIwpF
 7NvEJ15fryDIvlMR/ggNrg2L2piOu1fJl2kVZYWZTb/u+qAO3utxTQN4y++zTSNb
 LAN78Hq/nJu156SSioO9fLa0wPaU+k2OChfWXtlMsTDK+L5EQz4G3pJwi5FA8QTD
 nfeZNC9VgqfP4LtqWw05h/AOw4A0XUeuwB8Edbc+WG5twzULqDhS57jew4A4xX8d
 jOsvK5syygnR+/rExWc0NWSmCH0g1u6mCUWXQuocfSb/oOEcUGq5RSixRNRfmJUq
 9XNF3hbp7W/Vnp9GWT30Md+CenrEtQXFK8ZQtg0ckBl+b5bEqKYs6FXGqCkUmjZy
 Qgt5sqxgdLWtslS3vSu1/mdryeaLmXNO6c6wueSPMmLyYODEoIHSSka9N9O0Inwv
 d106p7gUy3/ETamC3lbnyHkUrAru74Qh8rErKpqaRLkKfcIq7YCB073fxbqlamzz
 X4n8a1H37LefLqmKwIbF
 =pU+A
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "Fixes for problems introduced or discovered recently (intel_pstate,
  sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
  generic device properties framework) and one fix for a hotplug-related
  deadlock in ACPICA that's been there forever, but is nasty enough.

  Specifics:

   - Fix for a recent regression in the intel_pstate driver causing it
     to fail to restore the HWP (HW-managed P-states) configuration of
     the boot CPU after suspend-to-RAM (Rafael Wysocki).

   - Fix for two recent regressions in the intel_pstate driver, one that
     can trigger a divide by zero if the driver is accessed via sysfs
     before it manages to take the first sample and one causing it to
     fail to update a structure field used in a trace point, so the
     information coming from it is less useful (Rafael Wysocki).

   - Fix for a problem in the sti-cpufreq driver introduced during the
     4.5 cycle that causes it to break CPU PM in multi-platform kernels
     by registering cpufreq-dt (which subsequently doesn't work)
     unconditionally and preventing the driver that would actually work
     from registering (Sudeep Holla).

   - Stable-candidate fix for an ARM64 cpuidle issue causing idle state
     usage counters to be incorrectly updated for idle states that were
     not entered due to errors (James Morse).

   - Fix for a recently introduced issue in the OPP (Operating
     Performance Points) framework causing it to print bogus error
     messages for missing optional regulators (Viresh Kumar).

   - Fix for a recently introduced issue in the generic device
     properties framework that may cause it to attempt to dereferece and
     invalid pointer in some cases (Heikki Krogerus).

   - Fix for a deadlock in the ACPICA core that may be triggered by
     device (eg Thunderbolt) hotplug (Prarit Bhargava)"

* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / OPP: Remove useless check
  ACPICA: Dispatcher: Update thread ID for recursive method calls
  intel_pstate: Fix intel_pstate_get()
  cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
  cpufreq: st: enable selective initialization based on the platform
  ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
  device property: Avoid potential dereferences of invalid pointers
2016-05-06 11:58:45 -07:00
Javier González 116f7d4a21 lightnvm: reserved space calculation incorrect
The nvm_dev->max_pages_per_blk variable was removed in favor of the new
nvm->sec_per_blk variable. The ->max_pages_per_blk variable was still
used in rrpc_capacity, reporting the reserved capacity to zero. Replace
with ->sec_per_blk to calculate the reserved area again.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated patch description. Was "lightnvm: eliminate redundant variable"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González 6d5be9590b lightnvm: rename nr_pages to nr_ppas on nvm_rq
The number of ppas contained on a request is not necessarily the number
of pages that it maps to neither on the target nor on the device side.
In order to avoid confusion, rename nr_pages to nr_ppas since it is what
the variable actually contains.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling df414b33bb lightnvm: add is_cached entry to struct ppa_addr
A target requires a method to identify PPAs that are either cached in
memory or on disk. This can efficiently be maintained within the PPA.
The target host-side translation table can then lookup a PPA and know
from the PPA if it is cached or on disk. In the case it is cached, it is
the responsibility of the target to maintain this cache.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 04a8aa173b lightnvm: expose gennvm_mark_blk to targets
Targets can update a block state when having a reference to an
in-memory virtual block. In the case that a target does not keep the
block metadata in memory, it does not have a way to update this
structure.

Therefore, expose gennvm_mark_blk() through the media managers
->mark_blk() callback and let targets update the state structure through
this callback.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 976bdfcae3 lightnvm: remove mgt targets on mgt removal
Targets associated with a device manager are not freed on device
removal. They have to be manually removed before shutdown. Make sure
any outstanding targets are freed upon shutdown.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González 75b8564932 lightnvm: rename dma helper functions
Until now, the dma pool have been exclusively used to allocate the ppa
list being sent to the device. In pblk (upcoming), we use these pools to
allocate metadata too. Thus, we generalize the names of some variables
on the dma helper functions to make the code more readable.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González 003fad376b lightnvm: enable metadata to be sent to device
Enable metadata buffer to be sent to the device through the metadata
field on the physical rw nvme command. The size of the metadata buffer
must follow dev->oob_size * # of PPAs.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated description.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 00ee6cc3b7 lightnvm: refactor set_bb_tbl for accepting ppa list
The set_bb_tbl takes struct nvm_rq and only uses its ppa_list and
nr_pages internally. Instead, make these two variables explicit.
This allows a user to call it without initializing a struct nvm_rq
first.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 5ebc7d9fe1 lightnvm: make nvm_set_rqd_ppalist() aware of vblks
A virtual block enables a block to identify multiple physical blocks.
This is useful for metadata where a device media supports multiple
planes. In that case, a block, with multiple planes can be managed
as a single vblk. Reducing the metadata required by one forth.

nvm_set_rqd_ppalist() takes care of expanding a ppa_list with vblks
automatically. However, for some use-cases, where only a single physical
block is required, the ppa_list should not be expanded.

Therefore, add a vblk parameter to nvm_set_rqd_ppalist(), and only
expand the ppa_list if vblk is set.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling e11903f5df lightnvm: refactor device ops->get_bb_tbl()
The device ops->get_bb_tbl() takes a callback, that allows the caller
to use its own callback function to update its data structures in the
returning function.

This makes it difficult to send parameters to the callback, and usually
is circumvented by small private structures, that both carry the callers
state and any flags needed to fulfill the update.

Refactor ops->get_bb_tbl() to fill a data buffer with the status of the
blocks returned, and let the user call the callback function manually.
That will provide the necessary flags and data structures and simplify
the logic around ops->get_bb_tbl().

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 5136061ce7 lightnvm: introduce nvm_for_each_lun_ppa() macro
Users that wish to iterate all luns on a device. Must create a
struct ppa_addr and separate iterators for channels and luns. To set the
iterators, two loops are required, one to iterate channels, and another
to iterate luns. This leads to decrease in readability.

Introduce nvm_for_each_lun_ppa, which implements the nested loop and
sets ppa, channel, and lun variable for each loop body, eliminating
the boilerplate code.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Simon A. F. Lund 6f8645cba5 lightnvm: refactor dev->online_target to global nvm_targets
A target name must be unique. However, a per-device registration of
targets is maintained on a dev->online_targets list, with a per-device
search for targets upon registration.

This results in a name collision when two targets, with the same name,
are created on two different targets, where the per-device list is not
shared.

Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Simon A. F. Lund 6063fe399d lightnvm: rename nvm_targets to nvm_tgt_type
The functions nvm_register_target(), nvm_unregister_target() and
associated list refers to a target type that is being registered by a
target type module. Rename nvm_*_targets() to nvm_*_tgt_type(), so that
the intension is clear.

This enables target instances to use the _nvm_*_targets() naming.

Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 22e8c9766a lightnvm: move block fold outside of get_bb_tbl()
The get block table command returns a list of blocks and planes
with their associated state. Users, such as gennvm and sysblk,
manages all planes as a single virtual block.

It was therefore  natural to fold the bad block list before it is
returned. However, to allow users, which manages on a per-plane
block level, to also use the interface, the get_bb_tbl interface is
changed to not fold by default and instead let the caller fold if
necessary.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 4891d120b9 lightnvm: add fpg_size and pfpg_size to struct nvm_dev
The flash page size (fpg) and size across planes (pfpg) are convenient
to know when allocating buffer sizes. This has previously been a
calculated in various places. Replace with the pre-calculated values.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling 1145e6351a lightnvm: implement nvm_submit_ppa_list
The nvm_submit_ppa function assumes that users manage all plane
blocks as a single block. Extend the API with nvm_submit_ppa_list
to allow the user to send its own ppa list. If the user submits more
than a single PPA, the user must take care to allocate and free
the corresponding ppa list.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Andrew F. Davis f3d9f1ce07 rpmsg: add helper macro module_rpmsg_driver
This patch introduces the module_rpmsg_driver macro which is a
convenience macro for rpmsg driver modules similar to
module_platform_driver. It is intended to be used by drivers which
init/exit section does nothing but register/unregister the rpmsg driver.
By using this macro it is possible to eliminate a few lines of
boilerplate code per rpmsg driver.

Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-05-06 11:09:00 -07:00
Andrew F. Davis bc3c57c132 rpmsg: add THIS_MODULE to rpmsg_driver in rpmsg core
Add register_rpmsg_driver helper macro that adds THIS_MODULE to
rpmsg_driver for the registering driver. We rename and modify
the existing register_rpmsg_driver to enable this.

Signed-off-by: Andrew F. Davis <afd@ti.com>
Acked-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-05-06 11:08:58 -07:00
Thomas Gleixner aaddd7d1c7 sched/hotplug: Make activate() the last hotplug step
The scheduler can handle per cpu threads before the cpu is set to active and
it does not allow user space threads on the cpu before active is
set. Attaching to the scheduling domains is also not required before user
space threads can be handled.

Move the activation to the end of the hotplug state space. That also means
that deactivation is the first action when a cpu is shut down.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160310120025.597477199@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:25 +02:00
Thomas Gleixner f2785ddb53 sched/hotplug: Move migration CPU_DYING to sched_cpu_dying()
Remove the hotplug notifier and make it an explicit state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160310120025.502222097@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:25 +02:00
Thomas Gleixner 40190a78f8 sched/hotplug: Convert cpu_[in]active notifiers to state machine
Now that we reduced everything into single notifiers, it's simple to move them
into the hotplug state machine space.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:24 +02:00
Thomas Gleixner 135fb3e197 sched: Consolidate the notifier maze
We can maintain the ordering of the scheduler cpu hotplug functionality nicely
in one notifer. Get rid of the maze.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:23 +02:00
Thomas Gleixner 9cf7243d5d sched: Make set_cpu_rq_start_time() a built in hotplug state
Start distangling the maze of hotplug notifiers in the scheduler.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:23 +02:00
Peter Zijlstra (Intel) e9d867a67f sched: Allow per-cpu kernel threads to run on online && !active
In order to enable symmetric hotplug, we must mirror the online &&
!active state of cpu-down on the cpu-up side.

However, to retain sanity, limit this state to per-cpu kthreads.

Aside from the change to set_cpus_allowed_ptr(), which allow moving
the per-cpu kthreads on, the other critical piece is the cpu selection
for pinned tasks in select_task_rq(). This avoids dropping into
select_fallback_rq().

select_fallback_rq() cannot be allowed to select !active cpus because
its used to migrate user tasks away. And we do not want to move user
tasks onto cpus that are in transition.

Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160301152303.GV6356@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-06 14:58:22 +02:00
Rafael J. Wysocki 7c21b38ca9 Merge branches 'acpica-fixes' and 'device-properties-fixes'
* acpica-fixes:
  ACPICA: Dispatcher: Update thread ID for recursive method calls

* device-properties-fixes:
  device property: Avoid potential dereferences of invalid pointers
2016-05-06 13:15:52 +02:00
Ezequiel Garcia 80d6737b27 leds: gpio: Support the "panic-indicator" firmware property
Calling a GPIO LEDs is quite likely to work even if the kernel
has paniced, so they are ideal to blink in this situation.
This commit adds support for the new "panic-indicator"
firmware property, allowing to mark a given LED to blink on
a kernel panic.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-05-06 10:26:07 +02:00
Ezequiel Garcia ba93cdce5b leds: triggers: Allow to switch the trigger to "panic" on a kernel panic
This commit adds a new led_cdev flag LED_PANIC_INDICATOR, which
allows to mark a specific LED to be switched to the "panic"
trigger, on a kernel panic.

This is useful to allow the user to assign a regular trigger
to a given LED, and still blink that LED on a kernel panic.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
2016-05-06 10:22:09 +02:00
Haggai Abramovsky 73898db043 net/mlx4: Avoid wrong virtual mappings
The dma_alloc_coherent() function returns a virtual address which can
be used for coherent access to the underlying memory.  On some
architectures, like arm64, undefined behavior results if this memory is
also accessed via virtual mappings that are not coherent.  Because of
their undefined nature, operations like virt_to_page() return garbage
when passed virtual addresses obtained from dma_alloc_coherent().  Any
subsequent mappings via vmap() of the garbage page values are unusable
and result in bad things like bus errors (synchronous aborts in ARM64
speak).

The mlx4 driver contains code that does the equivalent of:
vmap(virt_to_page(dma_alloc_coherent)), this results in an OOPs when the
device is opened.

Prevent Ethernet driver to run this problematic code by forcing it to
allocate contiguous memory. As for the Infiniband driver, at first we
are trying to allocate contiguous memory, but in case of failure roll
back to work with fragmented memory.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reported-by: David Daney <david.daney@cavium.com>
Tested-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:23:05 -04:00
Andrea Arcangeli 127393fbe5 mm: thp: kvm: fix memory corruption in KVM with THP enabled
After the THP refcounting change, obtaining a compound pages from
get_user_pages() no longer allows us to assume the entire compound page
is immediately mappable from a secondary MMU.

A secondary MMU doesn't want to call get_user_pages() more than once for
each compound page, in order to know if it can map the whole compound
page.  So a secondary MMU needs to know from a single get_user_pages()
invocation when it can map immediately the entire compound page to avoid
a flood of unnecessary secondary MMU faults and spurious
atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
users).

Ideally instead of the page->_mapcount < 1 check, get_user_pages()
should return the granularity of the "page" mapping in the "mm" passed
to get_user_pages().  However it's non trivial change to pass the "pmd"
status belonging to the "mm" walked by get_user_pages up the stack (up
to the caller of get_user_pages).  So the fix just checks if there is
not a single pte mapping on the page returned by get_user_pages, and in
turn if the caller can assume that the whole compound page is mapped in
the current "mm" (in a pmd_trans_huge()).  In such case the entire
compound page is safe to map into the secondary MMU without additional
get_user_pages() calls on the surrounding tail/head pages.  In addition
of being faster, not having to run other get_user_pages() calls also
reduces the memory footprint of the secondary MMU fault in case the pmd
split happened as result of memory pressure.

Without this fix after a MADV_DONTNEED (like invoked by QEMU during
postcopy live migration or balloning) or after generic swapping (with a
failure in split_huge_page() that would only result in pmd splitting and
not a physical page split), KVM would map the whole compound page into
the shadow pagetables, despite regular faults or userfaults (like
UFFDIO_COPY) may map regular pages into the primary MMU as result of the
pte faults, leading to the guest mode and userland mode going out of
sync and not working on the same memory at all times.

Any other secondary MMU notifier manager (KVM is just one of the many
MMU notifier users) will need the same information if it doesn't want to
run a flood of get_user_pages_fast and it can support multiple
granularity in the secondary MMU mappings, so I think it is justified to
be exposed not just to KVM.

The other option would be to move transparent_hugepage_adjust to
mm/huge_memory.c but that currently has all kind of KVM data structures
in it, so it's definitely not a cut-and-paste work, so I couldn't do a
fix as cleaner as this one for 4.6.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Li, Liang Z" <liang.z.li@intel.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
Alexandre Bounine 4e1016dac1 rapidio/mport_cdev: fix uapi type definitions
Fix problems in uapi definitions reported by Gabriel Laskar: (see
https://lkml.org/lkml/2016/4/5/205 for details)

 - move public header file rio_mport_cdev.h to include/uapi/linux directory
 - change types in data structures passed as IOCTL parameters
 - improve parameter checking in some IOCTL service routines

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reported-by: Gabriel Laskar <gabriel@lse.epita.fr>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Gabriel Laskar <gabriel@lse.epita.fr>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
Johannes Weiner 4550c4e157 mm: memcontrol: let v2 cgroups follow changes in system swappiness
Cgroup2 currently doesn't have a per-cgroup swappiness setting.  We
might want to add one later - that's a different discussion - but until
we do, the cgroups should always follow the system setting.  Otherwise
it will be unchangeably set to whatever the ancestor inherited from the
system setting at the time of cgroup creation.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: <stable@vger.kernel.org>	[4.5]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-05 17:38:53 -07:00
James Morris 0250abcd72 Merge tag 'keys-next-20160505' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2016-05-06 09:29:00 +10:00
Mike Snitzer 0ef5a50c16 block: make bio_inc_remaining() interface accessible again
Commit 326e1dbb57 ("block: remove management of bi_remaining when
restoring original bi_end_io") made bio_inc_remaining() private to bio.c
because the only use-case that made sense was confined to the
bio_chain() interface.

Since that time DM thinp went on to use bio_chain() in its relatively
complex implementation of async discard support.  That implementation,
even when converted over to use the new async __blkdev_issue_discard()
interface, depends on deferred completion of the original discard bio --
which is most appropriately implemented using bio_inc_remaining().

DM thinp foolishly duplicated bio_inc_remaining(), local to dm-thin.c as
__bio_inc_remaining(), so re-exporting bio_inc_remaining() allows us to
put an end to that foolishness.

All said, bio_inc_remaining() should really only be used in conjunction
with bio_chain().  It isn't intended for generic bio reference counting.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-05 13:03:29 -06:00
Fabio Estevam 4d2458507d ASoC: fsl_sai: Allow setting the SAI MCLK direction
On mx6ul the General Purpose Register 1 (GPR1) contains the following
bits for configuring the direction of the SAI MCLKs:
SAI1_MCLK_DIR, SAI2_MCLK_DIR, SAI3_MCLK_DIR

Introduce  the "fsl,sai-mclk-direction-output" optional property to allow
configuring the SAI_MCLK outputs.

Tested on a imx6ul-evk board.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-05-05 16:44:22 +01:00
Mark Rutland 5101ef20f0 perf/arm: Special-case hetereogeneous CPUs
Commit:

  2665784850 ("perf/core: Verify we have a single perf_hw_context PMU")

forcefully prevents multiple PMUs from sharing perf_hw_context, as this
generally doesn't make sense. It is a common bug for uncore PMUs to
use perf_hw_context rather than perf_invalid_context, which this detects.

However, systems exist with heterogeneous CPUs (and hence heterogeneous
HW PMUs), for which sharing perf_hw_context is necessary, and possible
in some limited cases.

To make this work we have to perform some gymnastics, as we did in these
commits:

  66eb579e66 ("perf: allow for PMU-specific event filtering")
  c904e32a69 ("arm: perf: filter unschedulable events")

To allow those systems to work, we must allow PMUs for heterogeneous
CPUs to share perf_hw_context, though we must still disallow sharing
otherwise to detect the common misuse of perf_hw_context.

This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates
the core logic to account for this, and makes use of it in the arm_pmu
code that is used for systems with heterogeneous CPUs. Comments are
added to make the rationale clear and hopefully avoid accidental abuse.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostej
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:59 +02:00
Alexander Shishkin 375637bc52 perf/core: Introduce address range filtering
Many instruction tracing PMUs out there support address range-based
filtering, which would, for example, generate trace data only for a
given range of instruction addresses, which is useful for tracing
individual functions, modules or libraries. Other PMUs may also
utilize this functionality to allow filtering to or filtering out
code at certain address ranges.

This patch introduces the interface for userspace to specify these
filters and for the PMU drivers to apply these filters to hardware
configuration.

The user interface is an ASCII string that is passed via an ioctl()
and specifies (in the form of an ASCII string) address ranges within
certain object files or within kernel. There is no special treatment
for kernel modules yet, but it might be a worthy pursuit.

The PMU driver interface basically adds two extra callbacks to the
PMU driver structure, one of which validates the filter configuration
proposed by the user against what the hardware is actually capable of
doing and the other one translates hardware-independent filter
configuration into something that can be programmed into the
hardware.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1461771888-10409-6-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:13:57 +02:00
Ingo Molnar 1a618c2cfe Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 10:12:37 +02:00
Peter Zijlstra a1cc5bcfcf locking/atomics: Flip atomic_fetch_or() arguments
All the atomic operations have their arguments the wrong way around;
make atomic_fetch_or() consistent and flip them.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 09:58:52 +02:00
Yuyang Du 7b5953345e sched/fair: Add detailed description to the sched load avg metrics
These sched metrics have become complex enough, so describe them
in detail at their definition.

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Fixed the text to improve its spelling and typography. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1459829551-21625-4-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 09:41:08 +02:00
Yuyang Du 6ecdd74962 sched/fair: Generalize the load/util averages resolution definition
Integer metric needs fixed point arithmetic. In sched/fair, a few
metrics, e.g., weight, load, load_avg, util_avg, freq, and capacity,
may have different fixed point ranges, which makes their update and
usage error-prone.

In order to avoid the errors relating to the fixed point range, we
definie a basic fixed point range, and then formalize all metrics to
base on the basic range.

The basic range is 1024 or (1 << 10). Further, one can recursively
apply the basic range to have larger range.

Pointed out by Ben Segall, weight (visible to user, e.g., NICE-0 has
1024) and load (e.g., NICE_0_LOAD) have independent ranges, but they
must be well calibrated.

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1459829551-21625-2-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 09:24:00 +02:00
Peter Zijlstra e7904a28f5 locking/lockdep, sched/core: Implement a better lock pinning scheme
The problem with the existing lock pinning is that each pin is of
value 1; this mean you can simply unpin if you know its pinned,
without having any extra information.

This scheme generates a random (16 bit) cookie for each pin and
requires this same cookie to unpin. This means you have to keep the
cookie in context.

No objsize difference for !LOCKDEP kernels.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 09:23:59 +02:00
Ingo Molnar 64b7aad579 Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05 09:01:49 +02:00
Rafael J. Wysocki 28ed05732a Merge branch 'pm-opp' into pm-cpufreq 2016-05-05 01:38:56 +02:00
Sudeep Holla 411466c508 PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table
Functions dev_pm_opp_of_{cpumask_,}remove_table removes/frees all the
static OPP entries associated with the device and/or all cpus(in case
of cpumask) that are created from DT.

However the OPP entries are populated reading from the firmware or some
different method using dev_pm_opp_add are marked dynamic and can't be
removed using above functions.

This patch adds non DT/OF versions of dev_pm_opp_{cpumask_,}remove_table
to support the above mentioned usecase.

This is in preparation to make use of the same in scpi-cpufreq.c

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 01:38:44 +02:00
Arnd Bergmann ddbb74bc70 PM / OPP: pass cpumask by reference
The new use of dev_pm_opp_set_sharing_cpus resulted in a harmless compiler
warning with CONFIG_CPUMASK_OFFSTACK=y:

drivers/cpufreq/mvebu-cpufreq.c: In function 'armada_xp_pmsu_cpufreq_init':
include/linux/cpumask.h:550:25: error: passing argument 2 of 'dev_pm_opp_set_sharing_cpus' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]

The problem here is that cpumask_var_t gets passed by reference, but
by declaring a 'const cpumask_var_t' argument, only the pointer is
constant, not the actual mask. This is harmless because the function
does not actually modify the mask.

This patch changes the function prototypes for all of the related functions
to pass a 'struct cpumask *' instead of 'cpumask_var_t', matching what
most other such functions do in the kernel. This lets us mark all the
other similar functions as taking a 'const' mask where possible,
and it avoids the warning without any change in object code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 947bd567f7 (mvebu: Use dev_pm_opp_set_sharing_cpus() to mark OPP tables as shared)
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 01:34:20 +02:00
Sinan Kaya 9e5ed6d1fb ACPI,PCI,IRQ: remove SCI penalize function
Removing the SCI penalize function as the penalty is now calculated on the
fly.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 01:10:32 +02:00
Lv Zheng e5f660ebef ACPI / osi: Collect _OSI handling into one single file
_OSI handling code grows giant and it's time to move them into one file.

This patch collects all _OSI handling code into one single file.
So that we only have the following functions to be used externally:

 early_acpi_osi_init(): Used by DMI detections;
 acpi_osi_init(): Used to initialize OSI command line settings and install
                  Linux specific _OSI handler;
 acpi_osi_setup(): The API that should be used by the external quirks.
 acpi_osi_is_win8(): The API is used by the external drivers to determine
                     if BIOS supports Win8.

CONFIG_DMI is not useful as stub dmi_check_system() can make everything
stub because of strip.

No functional changes.

Tested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 00:13:53 +02:00
Lv Zheng d5a91d74c6 ACPI / osi: Cleanup coding style issues before creating a separate OSI source file
This patch performs necessary cleanups before moving OSI support to
another file.

 1. Change printk into pr_xxx
 2. Do not initialize values to 0
 3. Do not append additional "return" at the end of the function
 4. Remove useless comments which may easily break line breaking rule

After fixing the coding style issues, rename functions to make them looking
like acpi_osi_xxx.

No functional changes.

Tested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 00:13:52 +02:00
Lv Zheng dc45eb20a8 ACPI / osi: Cleanup OSI handling code to use bool
This patch changes "int/unsigned int" to "bool" to simplify the code.

Tested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 00:13:52 +02:00
Chen Yu e10cfdc33a ACPI / osi: Fix default _OSI(Darwin) support
The following commit always reports positive value when Apple hardware
queries _OSI("Darwin"):

 Commit: 7bc5a2bad0
 Subject: ACPI: Support _OSI("Darwin") correctly

However since this implementation places the judgement in runtime, it
breaks acpi_osi=!Darwin and cannot return unsupported for _OSI("WinXXX")
invoked before invoking _OSI("Darwin").

This patch fixes the issues by reverting the wrong support and implementing
the default behavior of _OSI("Darwin")/_OSI("WinXXX") on Apple hardware via
DMI matching.

Fixes: 7bc5a2bad0 (ACPI: Support _OSI("Darwin") correctly)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=92111
Reported-and-tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-05 00:13:52 +02:00
Aaron Lu 01c3664de6 video / backlight: remove the backlight_device_registered API
Since we will need the backlight_device_get_by_type API, we can use it
instead of the backlight_device_registered API whenever necessary so
remove the backlight_device_registered API.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-04 23:41:14 +02:00
Aaron Lu f6a4790a54 video / backlight: add two APIs for drivers to use
It is useful to get the backlight device's pointer and use it to set
backlight in some cases(the following patch will make use of it) so add
the two APIs and export them.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-04 23:41:14 +02:00
Eric Dumazet 46cc6e4976 tcp: fix lockdep splat in tcp_snd_una_update()
tcp_snd_una_update() and tcp_rcv_nxt_update() call
u64_stats_update_begin() either from process context or BH handler.

This triggers a lockdep splat on 32bit & SMP builds.

We could add u64_stats_update_begin_bh() variant but this would
slow down 32bit builds with useless local_disable_bh() and
local_enable_bh() pairs, since we own the socket lock at this point.

I add sock_owned_by_me() helper to have proper lockdep support
even on 64bit builds, and new u64_stats_update_begin_raw()
and u64_stats_update_end_raw methods.

Fixes: c10d9310ed ("tcp: do not assume TCP code is non preemptible")
Reported-by: Fabio Estevam <festevam@gmail.com>
Diagnosed-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 16:55:11 -04:00
Peter Rosin 6ef91fcca8 i2c: mux: relax locking of the top i2c adapter during mux-locked muxing
With a i2c topology like the following

                       GPIO ---|  ------ BAT1
                        |      v /
   I2C  -----+----------+---- MUX
             |                   \
           EEPROM                 ------ BAT2

there is a locking problem with the GPIO controller since it is a client
on the same i2c bus that it muxes. Transfers to the mux clients (e.g. BAT1)
will lock the whole i2c bus prior to attempting to switch the mux to the
correct i2c segment. In the above case, the GPIO device is an I/O expander
with an i2c interface, and since the GPIO subsystem knows nothing (and
rightfully so) about the lockless needs of the i2c mux code, this results
in a deadlock when the GPIO driver issues i2c transfers to modify the
mux.

So, observing that while it is needed to have the i2c bus locked during the
actual MUX update in order to avoid random garbage on the slave side, it
is not strictly a must to have it locked over the whole sequence of a full
select-transfer-deselect mux client operation. The mux itself needs to be
locked, so transfers to clients behind the mux are serialized, and the mux
needs to be stable during all i2c traffic (otherwise individual mux slave
segments might see garbage, or worse).

Introduce this new locking concept as "mux-locked" muxes, and call the
pre-existing mux locking scheme "parent-locked".

Modify the i2c mux locking so that muxes that are "mux-locked" locks only
the muxes on the parent adapter instead of the whole i2c bus when there is
a transfer to the slave side of the mux. This lock serializes transfers to
the slave side of the muxes on the parent adapter.

Add code to i2c-mux-gpio and i2c-mux-pinctrl that checks if all involved
gpio/pinctrl devices have a parent that is an i2c adapter in the same
adapter tree that is muxed, and request a "mux-locked mux" if that is the
case.

Modify the select-transfer-deselect code for "mux-locked" muxes so
that each of the select-transfer-deselect ops locks the mux parent
adapter individually.

Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-05-04 22:39:17 +02:00
Peter Rosin 8320f495cf i2c: allow adapter drivers to override the adapter locking
Add i2c_lock_bus() and i2c_unlock_bus(), which call the new lock_bus and
unlock_bus ops in the adapter. These funcs/ops take an additional flags
argument that indicates for what purpose the adapter is locked.

There are two flags, I2C_LOCK_ROOT_ADAPTER and I2C_LOCK_SEGMENT, but they
are both implemented the same. For now. Locking the root adapter means
that the whole bus is locked, locking the segment means that only the
current bus segment is locked (i.e. i2c traffic on the parent side of
a mux is still allowed even if the child side of the mux is locked).

Also support a trylock_bus op (but no function to call it, as it is not
expected to be needed outside of the i2c core).

Implement i2c_lock_adapter/i2c_unlock_adapter in terms of the new locking
scheme (i.e. lock with the I2C_LOCK_ROOT_ADAPTER flag).

Locking the root adapter and locking the segment is the same thing for
all root adapters (e.g. in the normal case of a simple topology with no
i2c muxes). The two locking variants are also the same for traditional
muxes (aka parent-locked muxes). These muxes traverse the tree, locking
each level as they go until they reach the root. This patch is preparatory
for a later patch in the series introducing mux-locked muxes, which behave
differently depending on the requested locking. Since all current users
are using i2c_lock_adapter, which is a wrapper for I2C_LOCK_ROOT_ADAPTER,
we only need to annotate the calls that will not need to lock the root
adapter for mux-locked muxes. I.e. the instances that needs to use
I2C_LOCK_SEGMENT instead of i2c_lock_adapter/I2C_LOCK_ROOT_ADAPTER. Those
instances are in the i2c_transfer and i2c_smbus_xfer functions, so that
mux-locked muxes can single out normal i2c accesses to its slave side
and adjust the locking for those accesses.

Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-05-04 22:28:31 +02:00
Florian Westphal 9b36627ace net: remove dev->trans_start
previous patches removed all direct accesses to dev->trans_start,
so change the netif_trans_update helper to update trans_start of
netdev queue 0 instead and then remove trans_start from struct net_device.

AFAICS a lot of the netif_trans_update() invocations are now useless
because they occur in ndo_start_xmit and driver doesn't set LLTX
(i.e. stack already took care of the update).

As I can't test any of them it seems better to just leave them alone.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:16:50 -04:00
Florian Westphal ba162f8eed netdevice: add helper to update trans_start
trans_start exists twice:
- as member of net_device (legacy)
- as member of netdev_queue

In order to get rid of the legacy case, add a helper for the
dev->trans_update (this patch), then convert spots that do

dev->trans_start = jiffies

to use this helper (next patch).

This would then allow us to change the helper so that it updates the
trans_stamp of netdev queue 0 instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:16:48 -04:00
Mohamad Haj Yahia efdc810ba3 net/mlx5: Flow steering, Add vport ACL support
Update the relevant flow steering device structs and commands to
support vport.
Update the flow steering core API to receive vport number.
Add ingress and egress ACL flow table name spaces.
Add ACL flow table support:
* ACL (Access Control List) flow table is a table that contains
only allow/drop steering rules.

* We have two types of ACL flow tables - ingress and egress.

* ACLs handle traffic sent from/to E-Switch FDB table, Ingress refers to
traffic sent from Vport to E-Switch and Egress refers to traffic sent
from E-Switch to vport.

* Ingress ACL flow table allow/drop rules is checked against traffic
sent from VF.

* Egress ACL flow table allow/drop rules is checked against traffic sent
to VF.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:46 -04:00
David Howells d55201ce08 Merge branch 'keys-trust' into keys-next
Here's a set of patches that changes how certificates/keys are determined
to be trusted.  That's currently a two-step process:

 (1) Up until recently, when an X.509 certificate was parsed - no matter
     the source - it was judged against the keys in .system_keyring,
     assuming those keys to be trusted if they have KEY_FLAG_TRUSTED set
     upon them.

     This has just been changed such that any key in the .ima_mok keyring,
     if configured, may also be used to judge the trustworthiness of a new
     certificate, whether or not the .ima_mok keyring is meant to be
     consulted for whatever process is being undertaken.

     If a certificate is determined to be trustworthy, KEY_FLAG_TRUSTED
     will be set upon a key it is loaded into (if it is loaded into one),
     no matter what the key is going to be loaded for.

 (2) If an X.509 certificate is loaded into a key, then that key - if
     KEY_FLAG_TRUSTED gets set upon it - can be linked into any keyring
     with KEY_FLAG_TRUSTED_ONLY set upon it.  This was meant to be the
     system keyring only, but has been extended to various IMA keyrings.
     A user can at will link any key marked KEY_FLAG_TRUSTED into any
     keyring marked KEY_FLAG_TRUSTED_ONLY if the relevant permissions masks
     permit it.

These patches change that:

 (1) Trust becomes a matter of consulting the ring of trusted keys supplied
     when the trust is evaluated only.

 (2) Every keyring can be supplied with its own manager function to
     restrict what may be added to that keyring.  This is called whenever a
     key is to be linked into the keyring to guard against a key being
     created in one keyring and then linked across.

     This function is supplied with the keyring and the key type and
     payload[*] of the key being linked in for use in its evaluation.  It
     is permitted to use other data also, such as the contents of other
     keyrings such as the system keyrings.

     [*] The type and payload are supplied instead of a key because as an
         optimisation this function may be called whilst creating a key and
         so may reject the proposed key between preparse and allocation.

 (3) A default manager function is provided that permits keys to be
     restricted to only asymmetric keys that are vouched for by the
     contents of the system keyring.

     A second manager function is provided that just rejects with EPERM.

 (4) A key allocation flag, KEY_ALLOC_BYPASS_RESTRICTION, is made available
     so that the kernel can initialise keyrings with keys that form the
     root of the trust relationship.

 (5) KEY_FLAG_TRUSTED and KEY_FLAG_TRUSTED_ONLY are removed, along with
     key_preparsed_payload::trusted.

This change also makes it possible in future for userspace to create a private
set of trusted keys and then to have it sealed by setting a manager function
where the private set is wholly independent of the kernel's trust
relationships.

Further changes in the set involve extracting certain IMA special keyrings
and making them generally global:

 (*) .system_keyring is renamed to .builtin_trusted_keys and remains read
     only.  It carries only keys built in to the kernel.  It may be where
     UEFI keys should be loaded - though that could better be the new
     secondary keyring (see below) or a separate UEFI keyring.

 (*) An optional secondary system keyring (called .secondary_trusted_keys)
     is added to replace the IMA MOK keyring.

     (*) Keys can be added to the secondary keyring by root if the keys can
         be vouched for by either ring of system keys.

 (*) Module signing and kexec only use .builtin_trusted_keys and do not use
     the new secondary keyring.

 (*) Config option SYSTEM_TRUSTED_KEYS now depends on ASYMMETRIC_KEY_TYPE as
     that's the only type currently permitted on the system keyrings.

 (*) A new config option, IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY,
     is provided to allow keys to be added to IMA keyrings, subject to the
     restriction that such keys are validly signed by a key already in the
     system keyrings.

     If this option is enabled, but secondary keyrings aren't, additions to
     the IMA keyrings will be restricted to signatures verifiable by keys in
     the builtin system keyring only.

Signed-off-by: David Howells <dhowells@redhat.com>
2016-05-04 17:20:20 +01:00
Wolfram Sang 3d376fb2ea mmc: tmio/sdhi: introduce flag for RCar 2+ specific features
RCar Gen2 and later implementations of TMIO/SDHI have their own set of
features and additions. FAST_CLK_CHG is just one of them and I see a few
others being added soon. Some may work on older chipsets but this needs
to be tested case by case. Instead of adding a bunch of flags for each
feature, add a global RCar2+ one for now. We can still break out
features if the need arises.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-04 09:28:04 +02:00
Andy Lutomirski c876eeab64 signals/sigaltstack: If SS_AUTODISARM, bypass on_sig_stack()
If a signal stack is set up with SS_AUTODISARM, then the kernel
inherently avoids incorrectly resetting the signal stack if signals
recurse: the signal stack will be reset on the first signal
delivery.  This means that we don't need check the stack pointer
when delivering signals if SS_AUTODISARM is set.

This will make segmented x86 programs more robust: currently there's
a hole that could be triggered if ESP/RSP appears to point to the
signal stack but actually doesn't due to a nonzero SS base.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Jason Low <jason.low2@hp.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/c46bee4654ca9e68c498462fd11746e2bd0d98c8.1462296606.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-04 08:34:13 +02:00
David S. Miller cba6532100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/ip_gre.c

Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 00:52:29 -04:00
Linus Torvalds 7391daf2ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Some straggler bug fixes:

   1) Batman-adv DAT must consider VLAN IDs when choosing candidate
      nodes, from Antonio Quartulli.

   2) Fix botched reference counting of vlan objects and neigh nodes in
      batman-adv, from Sven Eckelmann.

   3) netem can crash when it sees GSO packets, the fix is to segment
      then upon ->enqueue.  Fix from Neil Horman with help from Eric
      Dumazet.

   4) Fix VXLAN dependencies in mlx5 driver Kconfig, from Matthew
      Finlay.

   5) Handle VXLAN ops outside of rcu lock, via a workqueue, in mlx5,
      since it can sleep.  Fix also from Matthew Finlay.

   6) Check mdiobus_scan() return values properly in pxa168_eth and macb
      drivers.  From Sergei Shtylyov.

   7) If the netdevice doesn't support checksumming, disable
      segmentation.  From Alexandery Duyck.

   8) Fix races between RDS tcp accept and sending, from Sowmini
      Varadhan.

   9) In macb driver, probe MDIO bus before we register the netdev,
      otherwise we can try to open the device before it is really ready
      for that.  Fix from Florian Fainelli.

  10) Netlink attribute size for ILA "tunnels" not calculated properly,
      fix from Nicolas Dichtel"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6/ila: fix nlsize calculation for lwtunnel
  net: macb: Probe MDIO bus before registering netdev
  RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
  RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
  vxlan: Add checksum check to the features check function
  net: Disable segmentation if checksumming is not supported
  net: mvneta: Remove superfluous SMP function call
  macb: fix mdiobus_scan() error check
  pxa168_eth: fix mdiobus_scan() error check
  net/mlx5e: Use workqueue for vxlan ops
  net/mlx5e: Implement a mlx5e workqueue
  net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
  net/mlx5: Unmap only the relevant IO memory mapping
  netem: Segment GSO packets on enqueue
  batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node
  batman-adv: Fix reference counting of vlan object for tt_local_entry
  batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP event
  batman-adv: fix DAT candidate selection (must use vid)
2016-05-03 15:07:50 -07:00
Alexander Duyck af67eb9e7e vxlan: Add checksum check to the features check function
We need to perform an additional check on the inner headers to determine if
we can offload the checksum for them.  Previously this check didn't occur
so we would generate an invalid frame in the case of an IPv6 header
encapsulated inside of an IPv4 tunnel.  To fix this I added a secondary
check to vxlan_features_check so that we can verify that we can offload the
inner checksum.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 16:00:54 -04:00
Robin Murphy e511267bc2 io-64-nonatomic: Add relaxed accessor variants
Whilst commit 9439eb3ab9 ("asm-generic: io: implement relaxed
accessor macros as conditional wrappers") makes the *_relaxed forms of
I/O accessors universally available to drivers, in cases where writeq()
is implemented via the io-64-nonatomic helpers, writeq_relaxed() will
end up falling back to writel() regardless of whether writel_relaxed()
is available (identically for s/write/read/).

Add corresponding relaxed forms of the nonatomic helpers to delegate
to the equivalent 32-bit accessors as appropriate. We also need to fix
io.h to avoid defining default relaxed variants if the basic accessors
themselves don't exist.

CC: Christoph Hellwig <hch@lst.de>
CC: Darren Hart <dvhart@linux.intel.com>
CC: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-05-03 18:23:02 +01:00
Bjorn Helgaas d9322d226f Merge branches 'pci/dpc', 'pci/resource' and 'pci/thunderbolt' into next
* pci/dpc:
  PCI: Add Downstream Port Containment driver
  PCI: Add Downstream Port Containment portdrv service type
  PCI: Widen portdrv service type from 4 bits to 8 bits

* pci/resource:
  alpha/PCI: Call iomem_is_exclusive() for IORESOURCE_MEM, but not IORESOURCE_IO
  PCI: Supply CPU physical address (not bus address) to iomem_is_exclusive()

* pci/thunderbolt:
  thunderbolt: Fix double free of drom buffer
2016-05-03 11:49:21 -05:00
Bjorn Helgaas 58f8b094e9 Merge branches 'pci/host-armada', 'pci/host-designware', 'pci/host-hv', 'pci/host-imx6', 'pci/host-keystone', 'pci/host-mvebu', 'pci/host-rcar', 'pci/host-thunder' and 'pci/host-vmd' into next
* pci/host-armada:
  PCI: armada: Add driver for Marvell Armada 7K/8K PCIe controller
  dt-bindings: pci: add DT binding for Marvell Armada 7K/8K PCIe controller

* pci/host-designware:
  PCI: designware: Remove incorrect RC memory base/limit configuration
  PCI: designware: Move Root Complex setup code to dw_pcie_setup_rc()

* pci/host-hv:
  PCI: hv: Report resources release after stopping the bus

* pci/host-imx6:
  ARM: dts: imx6qp: Specify imx6qp version of PCIe core
  PCI: imx6: Implement reset sequence for i.MX6+
  PCI: imx6: Use enum instead of bool for variant indicator
  PCI: imx6: Add DT property for link gen, default to Gen1
  PCI: imx6: Add reset-gpio-active-high boolean property to DT
  ARM: dts: imx6: Fix PCIe reset GPIO polarity on Toradex Apalis Ixora
  PCI: imx6: Add initial imx6sx support
  PCI: imx6: Factor out ref clock enable
  Revert "PCI: imx6: Add support for active-low reset GPIO"

* pci/host-keystone:
  PCI: keystone: Remove unnecessary goto statement
  PCI: keystone: Add error IRQ handler

* pci/host-mvebu:
  PCI: mvebu: Use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS for mvebu_pcie_pm_ops
  PCI: mvebu: Constify mvebu_pcie_pm_ops structure

* pci/host-rcar:
  PCI: rcar: Select PCI_MSI_IRQ_DOMAIN

* pci/host-thunder:
  PCI: thunder: Don't clobber read-only bits in bridge config registers

* pci/host-vmd:
  PCI: Remove return values from pcie_port_platform_notify() and relatives
  PCI/ACPI: Allow all PCIe services on non-ACPI host bridges
2016-05-03 11:42:30 -05:00
Keith Busch 10126ac14d PCI: Add Downstream Port Containment portdrv service type
Add the Downstream Port Containment (PCIE_PORT_SERVICE_DPC) portdrv service
type, available if the device has the DPC extended capability.

[bhelgaas: split to separate patch, changelog]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-05-03 10:35:49 -05:00
Boris Brezillon e39c0df1be pwm: Introduce the pwm_args concept
Currently the PWM core mixes the current PWM state with the per-platform
reference config (specified through the PWM lookup table, DT definition
or directly hardcoded in PWM drivers).

Create a struct pwm_args to store this reference configuration, so that
PWM users can differentiate between the current and reference
configurations.

Patch all places where pwm->args should be initialized. We keep the
pwm_set_polarity/period() calls until all PWM users are patched to use
pwm_args instead of pwm_get_period/polarity().

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
[thierry.reding@gmail.com: reword kerneldoc comments]
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2016-05-03 13:44:37 +02:00
Julien Grall 1839e57696 irqchip/gic-v3: Parse and export virtual GIC information
Fill up the recently introduced gic_kvm_info with the hardware
information used for virtualization.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Julien Grall 502d6df11a irqchip/gic-v2: Parse and export virtual GIC information
For now, the firmware tables are parsed 2 times: once in the GIC
drivers, the other timer when initializing the vGIC. It means code
duplication and make more tedious to add the support for another
firmware table (like ACPI).

Introduce a new structure and set of helpers to get/set the virtual GIC
information. Also fill up the structure for GICv2.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-03 12:54:21 +02:00
Stas Sergeev 2a74213838 signals/sigaltstack: Implement SS_AUTODISARM flag
This patch implements the SS_AUTODISARM flag that can be OR-ed with
SS_ONSTACK when forming ss_flags.

When this flag is set, sigaltstack will be disabled when entering
the signal handler; more precisely, after saving sas to uc_stack.
When leaving the signal handler, the sigaltstack is restored by
uc_stack.

When this flag is used, it is safe to switch from sighandler with
swapcontext(). Without this flag, the subsequent signal will corrupt
the state of the switched-away sighandler.

To detect the support of this functionality, one can do:

  err = sigaltstack(SS_DISABLE | SS_AUTODISARM);
  if (err && errno == EINVAL)
	unsupported();

Signed-off-by: Stas Sergeev <stsp@list.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Amanieu d'Antras <amanieu@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Jason Low <jason.low2@hp.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Moore <pmoore@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1460665206-13646-4-git-send-email-stsp@list.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-03 08:37:59 +02:00
Eric Dumazet 9580bf2edb net: relax expensive skb_unclone() in iptunnel_handle_offloads()
Locally generated TCP GSO packets having to go through a GRE/SIT/IPIP
tunnel have to go through an expensive skb_unclone()

Reallocating skb->head is a lot of work.

Test should really check if a 'real clone' of the packet was done.

TCP does not care if the original gso_type is changed while the packet
travels in the stack.

This adds skb_header_unclone() which is a variant of skb_clone()
using skb_header_cloned() check instead of skb_cloned().

This variant can probably be used from other points.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 00:22:19 -04:00
Florian Westphal c0ef079ca7 netdevice: shrink size of struct netdev_queue
- trans_timeout is incremented when tx queue timed out (tx watchdog).
- tx_maxrate is set via sysfs

Moving tx_maxrate to read-mostly part shrinks the struct by 64 bytes.
While at it, also move trans_timeout (it is out-of-place in the
'write-mostly' part).

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 22:51:41 -04:00
Chanwoo Choi 996133119f PM / devfreq: Add new passive governor
This patch adds the new passive governor for DEVFREQ framework. The following
governors are already present and used for DVFS (Dynamic Voltage and Frequency
Scaling) drivers. The following governors are independently used for one device
driver which don't give the influence to other device drviers and also don't
receive the effect from other device drivers.
- ondemand / performance / powersave / userspace

The passive governor depends on operation of parent driver with specific
governos extremely and is not able to decide the new frequency by oneself.
According to the decided new frequency of parent driver with governor,
the passive governor uses it to decide the appropriate frequency for own
device driver. The passive governor must need the following information
from device tree:
- the source clock and OPP tables
- the instance of parent device

For exameple,
there are one more devfreq device drivers which need to change their source
clock according to their utilization on runtime. But, they share the same
power line (e.g., regulator). So, specific device driver is operated as parent
with ondemand governor and then the rest device driver with passive governor
is influenced by parent device.

Suggested-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
[tjakobi: Reported RCU locking issue and cw00.choi fix it]
Reported-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
[linux.amoon: Reported possible recursive locking and cw00.choi fix it]
Reported-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Acked-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2016-05-03 11:20:07 +09:00
Chanwoo Choi 0fe3a66410 PM / devfreq: Add new DEVFREQ_TRANSITION_NOTIFIER notifier
This patch adds the new DEVFREQ_TRANSITION_NOTIFIER notifier to send
the notification when the frequency of device is changed.
This notifier has two state as following:
- DEVFREQ_PRECHANGE  : Notify it before chaning the frequency of device
- DEVFREQ_POSTCHANGE : Notify it after changed the frequency of device

And this patch adds the resourced-managed function to release the resource
automatically when error happen.

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
[m.reichl and linux.amoon: Tested it on exynos4412-odroidu3 board]
Tested-by: Markus Reichl <m.reichl@fivetechno.de>
Tested-by: Anand Moon <linux.amoon@gmail.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Acked-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2016-05-03 11:20:07 +09:00
Chanwoo Choi 8f510aeb22 PM / devfreq: Add devfreq_get_devfreq_by_phandle()
This patch adds the new devfreq_get_devfreq_by_phandle() OF helper function
which can find the instance of devfreq device by using phandle ("devfreq").

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
[m.reichl and linux.amoon: Tested it on exynos4412-odroidu3 board]
Tested-by: Markus Reichl <m.reichl@fivetechno.de>
Tested-by: Anand Moon <linux.amoon@gmail.com>
Acked-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2016-05-03 11:20:06 +09:00
Steven Rostedt (Red Hat) dcb0b5575d tracing: Remove TRACE_EVENT_FL_USE_CALL_FILTER logic
Nothing sets TRACE_EVENT_FL_USE_CALL_FILTER anymore. Remove it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-05-02 21:30:04 -04:00
Al Viro df889b3631 Merge branch 'for-linus' into work.lookups 2016-05-02 19:49:46 -04:00
Al Viro 6192269444 introduce a parallel variant of ->iterate()
New method: ->iterate_shared().  Same arguments as in ->iterate(),
called with the directory locked only shared.  Once all filesystems
switch, the old one will be gone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:29 -04:00
Al Viro 63b6df1413 give readdir(2)/getdents(2)/etc. uniform exclusion with lseek()
same as read() on regular files has, and for the same reason.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:28 -04:00
Al Viro 9902af79c0 parallel lookups: actual switch to rwsem
ta-da!

The main issue is the lack of down_write_killable(), so the places
like readdir.c switched to plain inode_lock(); once killable
variants of rwsem primitives appear, that'll be dealt with.

lockdep side also might need more work

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:28 -04:00
Al Viro d9171b9345 parallel lookups machinery, part 4 (and last)
If we *do* run into an in-lookup match, we need to wait for it to
cease being in-lookup.  Fortunately, we do have unused space in
in-lookup dentries - d_lru is never looked at until it stops being
in-lookup.

So we can stash a pointer to wait_queue_head from stack frame of
the caller of ->lookup().  Some precautions are needed while
waiting, but it's not that hard - we do hold a reference to dentry
we are waiting for, so it can't go away.  If it's found to be
in-lookup the wait_queue_head is still alive and will remain so
at least while ->d_lock is held.  Moreover, the condition we
are waiting for becomes true at the same point where everything
on that wq gets woken up, so we can just add ourselves to the
queue once.

d_alloc_parallel() gets a pointer to wait_queue_head_t from its
caller; lookup_slow() adjusted, d_add_ci() taught to use
d_alloc_parallel() if the dentry passed to it happens to be
in-lookup one (i.e. if it's been called from the parallel lookup).

That's pretty much it - all that remains is to switch ->i_mutex
to rwsem and have lookup_slow() take it shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:27 -04:00
Al Viro 94bdd655ca parallel lookups machinery, part 3
We will need to be able to check if there is an in-lookup
dentry with matching parent/name.  Right now it's impossible,
but as soon as start locking directories shared such beasts
will appear.

Add a secondary hash for locating those.  Hash chains go through
the same space where d_alias will be once it's not in-lookup anymore.
Search is done under the same bitlock we use for modifications -
with the primary hash we can rely on d_rehash() into the wrong
chain being the worst that could happen, but here the pointers are
buggered once it's removed from the chain.  On the other hand,
the chains are not going to be long and normally we'll end up
adding to the chain anyway.  That allows us to avoid bothering with
->d_lock when doing the comparisons - everything is stable until
removed from chain.

New helper: d_alloc_parallel().  Right now it allocates, verifies
that no hashed and in-lookup matches exist and adds to in-lookup
hash.

Returns ERR_PTR() for error, hashed match (in the unlikely case it's
been found) or new dentry.  In-lookup matches trigger BUG() for
now; that will change in the next commit when we introduce waiting
for ongoing lookup to finish.  Note that in-lookup matches won't be
possible until we actually go for shared locking.

lookup_slow() switched to use of d_alloc_parallel().

Again, these commits are separated only for making it easier to
review.  All this machinery will start doing something useful only
when we go for shared locking; it's just that the combination is
too large for my taste.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:27 -04:00
Al Viro 84e710da2a parallel lookups machinery, part 2
We'll need to verify that there's neither a hashed nor in-lookup
dentry with desired parent/name before adding to in-lookup set.

One possible solution would be to hold the parent's ->d_lock through
both checks, but while the in-lookup set is relatively small at any
time, dcache is not.  And holding the parent's ->d_lock through
something like __d_lookup_rcu() would suck too badly.

So we leave the parent's ->d_lock alone, which means that we watch
out for the following scenario:
	* we verify that there's no hashed match
	* existing in-lookup match gets hashed by another process
	* we verify that there's no in-lookup matches and decide
that everything's fine.

Solution: per-directory kinda-sorta seqlock, bumped around the times
we hash something that used to be in-lookup or move (and hash)
something in place of in-lookup.  Then the above would turn into
	* read the counter
	* do dcache lookup
	* if no matches found, check for in-lookup matches
	* if there had been none of those either, check if the
counter has changed; repeat if it has.

The "kinda-sorta" part is due to the fact that we don't have much spare
space in inode.  There is a spare word (shared with i_bdev/i_cdev/i_pipe),
so the counter part is not a problem, but spinlock is a different story.

We could use the parent's ->d_lock, and it would be less painful in
terms of contention, for __d_add() it would be rather inconvenient to
grab; we could do that (using lock_parent()), but...

Fortunately, we can get serialization on the counter itself, and it
might be a good idea in general; we can use cmpxchg() in a loop to
get from even to odd and smp_store_release() from odd to even.

This commit adds the counter and updating logics; the readers will be
added in the next commit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:26 -04:00
Al Viro 85c7f81041 beginning of transition to parallel lookups - marking in-lookup dentries
marked as such when (would be) parallel lookup is about to pass them
to actual ->lookup(); unmarked when
	* __d_add() is about to make it hashed, positive or not.
	* __d_move() (from d_splice_alias(), directly or via
__d_unalias()) puts a preexisting dentry in its place
	* in caller of ->lookup() if it has escaped all of the
above.  Bug (WARN_ON, actually) if it reaches the final dput()
or d_instantiate() while still marked such.

As the result, we are guaranteed that for as long as the flag is
set, dentry will
	* remain negative unhashed with positive refcount
	* never have its ->d_alias looked at
	* never have its ->d_lru looked at
	* never have its ->d_parent and ->d_name changed

Right now we have at most one such for any given parent directory.
With parallel lookups that restriction will weaken to
	* only exist when parent is locked shared
	* at most one with given (parent,name) pair (comparison of
names is according to ->d_compare())
	* only exist when there's no hashed dentry with the same
(parent,name)

Transition will take the next several commits; unfortunately, we'll
only be able to switch to rwsem at the end of this series.  The
reason for not making it a single patch is to simplify review.

New primitives: d_in_lookup() (a predicate checking if dentry is in
the in-lookup state) and d_lookup_done() (tells the system that
we are done with lookup and if it's still marked as in-lookup, it
should cease to be such).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:47:51 -04:00
Al Viro 84695ffee7 Merge getxattr prototype change into work.lookups
The rest of work.xattr stuff isn't needed for this branch
2016-05-02 19:45:47 -04:00
Linus Torvalds 689de1d6ca Minimal fix-up of bad hashing behavior of hash_64()
This is a fairly minimal fixup to the horribly bad behavior of hash_64()
with certain input patterns.

In particular, because the multiplicative value used for the 64-bit hash
was intentionally bit-sparse (so that the multiply could be done with
shifts and adds on architectures without hardware multipliers), some
bits did not get spread out very much.  In particular, certain fairly
common bit ranges in the input (roughly bits 12-20: commonly with the
most information in them when you hash things like byte offsets in files
or memory that have block factors that mean that the low bits are often
zero) would not necessarily show up much in the result.

There's a bigger patch-series brewing to fix up things more completely,
but this is the fairly minimal fix for the 64-bit hashing problem.  It
simply picks a much better constant multiplier, spreading the bits out a
lot better.

NOTE! For 32-bit architectures, the bad old hash_64() remains the same
for now, since 64-bit multiplies are expensive.  The bigger hashing
cleanup will replace the 32-bit case with something better.

The new constants were picked by George Spelvin who wrote that bigger
cleanup series.  I just picked out the constants and part of the comment
from that series.

Cc: stable@vger.kernel.org
Cc: George Spelvin <linux@horizon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-02 13:01:51 -07:00
Andrey Smirnov 4d31c6109a PCI: imx6: Implement reset sequence for i.MX6+
I.MX6+ has a dedicated bit for resetting PCIe core, which should be used
instead of a regular reset sequence since using the latter will hang the
SoC.

This commit is based on c34068d48273e24d392d9a49a38be807954420ed from
http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git

Tested-by: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
2016-05-02 14:33:17 -05:00
Linus Torvalds 9c5d1bc2b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) MODULE_FIRMWARE firmware string not correct for iwlwifi 8000 chips,
    from Sara Sharon.

 2) Fix SKB size checks in batman-adv stack on receive, from Sven
    Eckelmann.

 3) Leak fix on mac80211 interface add error paths, from Johannes Berg.

 4) Cannot invoke napi_disable() with BH disabled in myri10ge driver,
    fix from Stanislaw Gruszka.

 5) Fix sign extension problem when computing feature masks in
    net_gso_ok(), from Marcelo Ricardo Leitner.

 6) lan78xx driver doesn't count packets and packet lengths in its
    statistics properly, fix from Woojung Huh.

 7) Fix the buffer allocation sizes in pegasus USB driver, from Petko
    Manolov.

 8) Fix refcount overflows in bpf, from Alexei Starovoitov.

 9) Unified dst cache handling introduced a preempt warning in
    ip_tunnel, fix by resetting rather then setting the cached route.
    From Paolo Abeni.

10) Listener hash collision test fix in soreuseport, from Craig Gallak

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  gre: do not pull header in ICMP error processing
  net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  tipc: only process unicast on intended node
  cxgb3: fix out of bounds read
  net/smscx5xx: use the device tree for mac address
  soreuseport: Fix TCP listener hash collision
  net: l2tp: fix reversed udp6 checksum flags
  ip_tunnel: fix preempt warning in ip tunnel creation/updating
  samples/bpf: fix trace_output example
  bpf: fix check_map_func_compatibility logic
  bpf: fix refcnt overflow
  drivers: net: cpsw: use of_phy_connect() in fixed-link case
  dt: cpsw: phy-handle, phy_id, and fixed-link are mutually exclusive
  drivers: net: cpsw: don't ignore phy-mode if phy-handle is used
  drivers: net: cpsw: fix segfault in case of bad phy-handle
  drivers: net: cpsw: fix parsing of phy-handle DT property in dual_emac config
  MAINTAINERS: net: Change maintainer for GRETH 10/100/1G Ethernet MAC device driver
  gre: reject GUE and FOU in collect metadata mode
  pegasus: fixes reported packet length
  pegasus: fixes URB buffer allocation size;
  ...
2016-05-02 09:40:42 -07:00
Christoph Hellwig 38f2525533 block: add __blkdev_issue_discard
This is a version of blkdev_issue_discard which doesn't wait for
the I/O to complete, but instead allows the caller to submit
the final bio and/or chain it to others.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Sagi Grimberg <sagig@grimberg.me>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-02 09:19:46 -06:00
Wang Sheng-Hui a5b714ad39 NVMe: correct comment for offset enum of controller registers in nvme.h
Section 3.1 gives the comment for the offset of controller registers
in the specification 1.2a.

Some are mis-copied in the header file nvme.h. Correct them.

Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-02 09:13:35 -06:00
Marc Zyngier 9e2c986cb4 irqchip: Add per-cpu interrupt partitioning library
We've unfortunately started seeing a situation where percpu interrupts
are partitioned in the system: one arbitrary set of CPUs has an
interrupt connected to a type of device, while another disjoint
set of CPUs has the same interrupt connected to another type of device.

This makes it impossible to have a device driver requesting this interrupt
using the current percpu-interrupt abstraction, as the same interrupt number
is now potentially claimed by at least two drivers, and we forbid interrupt
sharing on per-cpu interrupt.

A solution to this is to turn things upside down. Let's assume that our
system describes all the possible partitions for a given interrupt, and
give each of them a unique identifier. It is then possible to create
a namespace where the affinity identifier itself is a form of interrupt
number. At this point, it becomes easy to implement a set of partitions
as a cascaded irqchip, each affinity identifier being the HW irq.

This allows us to keep a number of nice properties:
- Each partition results in a separate percpu-interrupt (with a restrictied
  affinity), which keeps drivers happy.
- Because the underlying interrupt is still per-cpu, the overhead of
  the indirection can be kept pretty minimal.
- The core code can ignore most of that crap.

For that purpose, we implement a small library that deals with some of
the boilerplate code, relying on platform-specific drivers to provide
a description of the affinity sets and a set of callbacks.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-4-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-02 13:42:51 +02:00
Marc Zyngier 222df54fd8 genirq: Allow the affinity of a percpu interrupt to be set/retrieved
In order to prepare the genirq layer for the concept of partitionned
percpu interrupts, let's allow an affinity to be associated with
such an interrupt. We introduce:

- irq_set_percpu_devid_partition: flag an interrupt as a percpu-devid
  interrupt, and associate it with an affinity
- irq_get_percpu_devid_partition: allow the affinity of that interrupt
  to be retrieved.

This will allow a driver to discover which CPUs the per-cpu interrupt
can actually fire on.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-02 13:42:51 +02:00
Marc Zyngier 651e8b54ab irqdomain: Allow domain matching on irq_fwspec
When iterating over the irq domain list, we try to match a domain
either by calling a match() function or by comparing a number
of fields passed as parameters.

Both approaches are a bit restrictive:
- match() is DT specific and only takes a device node
- the fallback case only deals with the fwnode_handle

It would be useful if we had a per-domain function that would
actually perform the matching check on the whole of the
irq_fwspec structure. This would allow for a domain to triage
matching attempts that need to extend beyond the fwnode.

Let's introduce irq_find_matching_fwspec(), which takes a full
blown irq_fwspec structure, and call into a select() function
implemented by the irqdomain. irq_find_matching_fwnode() is
made a wrapper around irq_find_matching_fwspec in order to
preserve compatibility.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-02 13:42:50 +02:00
Matt Redfearn 7cec18a390 genirq: Add error code reporting to irq_{reserve,destroy}_ipi
Make these functions return appropriate error codes when something goes
wrong.

Previously irq_destroy_ipi returned void making it impossible to notify
the caller if the request could not be fulfilled. Patch 1 in the series
added another condition in which this could fail in addition to the
existing ones. irq_reserve_ipi returned an unsigned int meaning it could
only return 0 on failure and give the caller no indication as to why the
request failed.

As time goes on there are likely to be further conditions added in which
these functions can fail. These APIs and the IPI IRQ domain are new in
4.6 and the number of existing call sites are low, changing the API now
has little impact on the code, while making it easier for these
functions to grow over time.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: ralf@linux-mips.org
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: lisa.parratt@imgtec.com
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1461568464-31701-2-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-02 13:42:50 +02:00
Matt Redfearn 01292cea0d genirq: Make irq_destroy_ipi take a cpumask of IPIs to destroy
Previously irq_destroy_ipi() would destroy IPIs to all CPUs that were
configured by irq_reserve_ipi(). This change makes it possible to
destroy just a subset of the IPIs. This may be useful to remove IPIs to
CPUs that have been hot removed so that the IRQ numbers allocated within
the IPI domain can be re-used.

The original behaviour is restored by passing the complete mask that the
IPI was created with.

There are currently no users of this function that would break from the
API change.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: ralf@linux-mips.org
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: lisa.parratt@imgtec.com
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1461568464-31701-1-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-02 13:42:50 +02:00
Andy Shevchenko 3a14c66d43 dmaengine: dw: pass platform data via struct dw_dma_chip
We pass struct dw_dma_chip to dw_dma_probe() anyway, thus we may use it to
pass a platform data as well.

While here, constify the source of the platform data.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-05-02 15:31:05 +05:30
Andy Shevchenko 161c3d04ae dmaengine: dw: keep entire platform data in struct dw_dma
Keep the entire platform data in the struct dw_dma.
It makes the driver a bit cleaner.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-05-02 15:31:05 +05:30
Andy Shevchenko 2e65060e80 dmaengine: dw: revisit data_width property
There several changes are done here:

- Convert the property to be in bytes

  Besides that this is a common practice for such property, the use of a value
  in bytes much more convenient than handling the encoded one.

- Rename data_width to data-width in the device tree bindings

  The change leaves the support for the old format as well just in case someone
  will use a newer kernel with an old device tree blob.

- While here, replace dwc_fast_ffs() by __ffs()

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-05-02 15:30:47 +05:30
Wolfram Sang 010629436d mmc: sh_mobile_sdhi: remove obsolete include file
A few SH boards include the file but don't make use of it (no named
interrupts). The SDHI code removed support for this feature as well.
So, drop the references and ultimately remove the unneeded file.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Rich Felker <dalias@libc.org>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-02 10:36:04 +02:00
Wolfram Sang ac86045ee9 mmc: tmio: merge distributed include files
There is no reason to have a public and private header file. Merge them
into a private one, so looking up symbols is less confusing.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-02 10:33:40 +02:00
Shawn Lin 6929eeec2a mmc: dw_mmc: remove unused EVENT_XFER_ERROR
EVENT_XFER_ERROR isn't been used now, so it can be removed.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-02 10:33:16 +02:00
Shawn Lin 49b17858c1 mmc: dw_mmc: fix warning reported by kernel-doc
Try to fix the warning reported by:
scripts/kernel-doc -man -v include/linux/mmc/dw_mmc.h > /dev/null

warning: No description found for parameter 'irq_lock'
warning: No description found for parameter 'stop_abort'
warning: No description found for parameter 'prev_blksz'
warning: No description found for parameter 'timing'
warning: No description found for parameter 'ring_size'
warning: No description found for parameter 'dms'
warning: No description found for parameter 'phy_regs'
warning: No description found for parameter 'fifoth_val'
warning: No description found for parameter 'vqmmc_enabled'
warning: No description found for parameter 'cmd11_timer'
warning: Excess struct/union/enum/typedef member 'card_tasklet'
description in 'dw_mci'

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-02 10:33:15 +02:00
Wolfram Sang 93b6911ac1 mmc: host: add note that set_ios needs to handle 0Hz properly
While here, refactor the comments so that they are before the
declaration they are referring to.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-02 10:33:14 +02:00
Wolfram Sang 452e5eef6d mmc: tmio: Add UHS-I mode support
Based on work by Shinobu Uehara and Ben Dooks. This adds the voltage
switch operation needed for all UHS-I modes, but not the tuning needed
for SDR-104 which will come later.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-05-02 10:33:12 +02:00
Sudarsana Reddy Kalluru 03dc76ca1e qed: add infrastructure for device self tests.
This patch adds the functionality and APIs needed for selftests.
It adds the ability to configure the link-mode which is required for the
implementation of loopback tests. It adds the APIs for clock test,
register test, interrupt test and memory test.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:16:39 -04:00
Tim Bingham 2c94b53738 net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
Prior to commit d92cff89a0 ("net_dbg_ratelimited: turn into no-op
when !DEBUG") the implementation of net_dbg_ratelimited() was buggy
for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases.

The bug was that net_ratelimit() was being called and, despite
returning true, nothing was being printed to the console. This
resulted in messages like the following -

"net_ratelimit: %d callbacks suppressed"

with no other output nearby.

After commit d92cff89a0 ("net_dbg_ratelimited: turn into no-op when
!DEBUG") the bug is fixed for the DEBUG case. However, there's no
output at all for CONFIG_DYNAMIC_DEBUG case.

This patch restores debug output (if enabled) for the
CONFIG_DYNAMIC_DEBUG case.

Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG
case. The implementation takes care to check that dynamic debugging is
enabled before calling net_ratelimit().

Fixes: d92cff89a0 ("net_dbg_ratelimited: turn into no-op when !DEBUG")
Signed-off-by: Tim Bingham <tbingham@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 21:34:01 -04:00
Christoph Hellwig e259221763 fs: simplify the generic_write_sync prototype
The kiocb already has the new position, so use that.  The only interesting
case is AIO, where we currently don't bother updating ki_pos.  We're about
to free the kiocb after we're done, so we might as well update it to make
everyone's life simpler.

While we're at it also return the bytes written argument passed in if
we were successful so that the boilerplate error switch code in the
callers can go away.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-01 19:58:39 -04:00
Christoph Hellwig dde0c2e798 fs: add IOCB_SYNC and IOCB_DSYNC
This will allow us to do per-I/O sync file writes, as required by a lot
of fileservers or storage targets.

XXX: Will need a few additional audits for O_DSYNC

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-01 19:58:39 -04:00
Christoph Hellwig c8b8e32d70 direct-io: eliminate the offset argument to ->direct_IO
Including blkdev_direct_IO and dax_do_io.  It has to be ki_pos to actually
work, so eliminate the superflous argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-01 19:58:39 -04:00
Christoph Hellwig 1af5bb491f filemap: remove the pos argument to generic_file_direct_write
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-01 19:58:39 -04:00
Mimi Zohar 05d1a717ec ima: add support for creating files using the mknodat syscall
Commit 3034a14 "ima: pass 'opened' flag to identify newly created files"
stopped identifying empty files as new files.  However new empty files
can be created using the mknodat syscall.  On systems with IMA-appraisal
enabled, these empty files are not labeled with security.ima extended
attributes properly, preventing them from subsequently being opened in
order to write the file data contents.  This patch defines a new hook
named ima_post_path_mknod() to mark these empty files, created using
mknodat, as new in order to allow the file data contents to be written.

In addition, files with security.ima xattrs containing a file signature
are considered "immutable" and can not be modified.  The file contents
need to be written, before signing the file.  This patch relaxes this
requirement for new files, allowing the file signature to be written
before the file contents.

Changelog:
- defer identifying files with signatures stored as security.ima
  (based on Dmitry Rozhkov's comments)
- removing tests (eg. dentry, dentry->d_inode, inode->i_size == 0)
  (based on Al's review)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Al Viro <<viro@zeniv.linux.org.uk>
Tested-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>
2016-05-01 09:23:52 -04:00
Linus Torvalds 925d96a0c9 Final set of -rc fixes for 4.6
- A number of collected fixes for oopses, memory corruptions, deadlocks,
   etc.  All of these fixes are small (many only 5-10 lines), obvious,
   and tested.
 - Fix for the security issue related to the use of write for
   bi-directional communications.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXIrZVAAoJELgmozMOVy/d/XEP/1A4Ohm7WiZMN09wvlFGSgLe
 2z2tY9ILvFuiAF++VZRfYyRmHorVKHYB1tk0JTsW1Ts1DrkjExgr4LS1/YDLOC42
 q8YlBZw2x7pdnD5W1MJm+HK6oNj7aZVVjEHG7QnfLUIXr57a2rBQIeeWLx24M+OS
 j1yvaY/v39qvf7dwHwVjs07rh2WW9QCZn2c/552G4xz1YDdTkYBTc2WNnl0eng1f
 1NqqMXhnajmNyR+Q+0+Vbcp4YWv551l5E6j9M+5nebehNtPSRb0GEIjxT3KnSGEg
 AjFev3XwnRF9EkQOwgbsg7a784+UHXe15vbr1MvbzGygcQeq4NVzLl04WEHExQe1
 Om0ES/i8zfRs6d5XYB5zMY8pJbdjSVM/20d+h21SQs//4JXXJrN35WVAyy8lgwrX
 M3oY4t21eBQlV7oezfEZQgEEbdtccr8LILfZZmRUWPHd2ymaTWg6e4pZwtn45rlD
 O/Gb11G/UT7SXgw+XiPLBj5xlQk7nRn0kGuaStR7PonkLQ9Zy1wSSptvJeGj0VWE
 W6TEJnIqtv0aiJLhIQn8Ee1pCxE/ds7UPW6wT5O9R8ccEdDeIB3BDgskTNg1xPuS
 I8e1o7iA9752YS3wMDhLA8PifwbmVbkGHxJUQecOBPiDcdukkM0q4/YoNYqXKLCc
 nLDv3fMztpinG1L1T2LM
 =MFdC
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 "Final set of -rc fixes for 4.6.

  I've collected up a number of patches that are all pretty small with
  the exception of only a couple.  The hfi1 driver has a number of
  important patches, and it is what really drives the line count of this
  pull request up.  These are all small and I've got this kernel built
  and running in the test lab (I have most of the hardware, I think nes
  is the only thing in this patch set that I can't say I've personally
  tested and have up and running).

  Summary:

   - A number of collected fixes for oopses, memory corruptions,
     deadlocks, etc.  All of these fixes are small (many only 5-10
     lines), obvious, and tested.

   - Fix for the security issue related to the use of write for
     bi-directional communications"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  RDMA/nes: don't leak skb if carrier down
  IB/security: Restrict use of the write() interface
  IB/hfi1: Use kernel default llseek for ui device
  IB/hfi1: Don't attempt to free resources if initialization failed
  IB/hfi1: Fix missing lock/unlock in verbs drain callback
  IB/rdmavt: Fix send scheduling
  IB/hfi1: Prevent unpinning of wrong pages
  IB/hfi1: Fix deadlock caused by locking with wrong scope
  IB/hfi1: Prevent NULL pointer deferences in caching code
  MAINTAINERS: Update iser/isert maintainer contact info
  IB/mlx5: Expose correct max_sge_rd limit
  RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
  iw_cxgb4: handle draining an idle qp
  iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping
  iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping
  IB/core: Don't drain non-existent rq queue-pair
  IB/core: Fix oops in ib_cache_gid_set_default_gid
2016-04-29 17:07:54 -07:00
Steven Rostedt (Red Hat) 904d1857ad tracing: Remove unused function trace_current_buffer_lock_reserve()
trace_current_buffer_lock_reserve() has no more users. Remove it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 18:11:54 -04:00
Miroslav Benes f09d90864e livepatch: make object/func-walking helpers more robust
Current object-walking helper checks the presence of obj->funcs to
determine the end of objs array in klp_object structure. This is
somewhat fragile because one can easily forget about funcs definition
during livepatch creation. In such a case the livepatch module is
successfully loaded and all objects after the incorrect one are omitted.
This is very confusing. Let's make the helper more robust and check also
for the other external member, name. Thus the helper correctly stops on
an empty item of the array. We need to have a check for obj->funcs in
klp_init_object() to make it work.

The same applies to a func-walking helper.

As a benefit we'll check for new_func member definition during the
livepatch initialization. There is no such check anywhere in the code
now.

[jkosina@suse.cz: fix shortlog]
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-30 00:04:08 +02:00
Maor Gottlieb 5a7b27eb9c net/mlx5: Initializing CPU reverse mapping
Allocating CPU rmap and add entry for each IRQ.
CPU rmap is used in aRFS to get the RX queue number
of the RX completion interrupts.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:11 -04:00
Maor Gottlieb d63cd28608 net/mlx5: Add user chosen levels when allocating flow tables
Currently, consumers of the flow steering infrastructure can't
choose their own flow table levels and are limited to one
flow table per level. This just waste levels.
Instead, we introduce here the possibility to use multiple
flow tables in a level. The user is free to connect these
flow tables, while following the rule (FTEs in FT of level x
could only point to FTs of level y where y > x).

In addition this patch switch the order of the create/destroy
flow tables of the NIC(vlan and main).

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:09 -04:00
Maor Gottlieb d745098ced net/mlx5: Introduce modify flow rule destination
This API is used for modifying the flow rule destination.
This is needed for modifying the pointed flow table by the
traffic type classifier rules to point on the aRFS tables.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:08 -04:00
Steven Rostedt (Red Hat) a9fe48dcde tracing: Remove unused function trace_current_buffer_discard_commit()
The function trace_current_buffer_discard_commit() has no callers, remove
it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 16:14:13 -04:00
Steven Rostedt (Red Hat) fa66ddb870 tracing: Move trace_buffer_unlock_commit{_regs}() to local header
The functions trace_buffer_unlock_commit() and the _regs() version are only
used within the kernel/trace directory. Move them to the local header and
remove the export as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-04-29 16:14:12 -04:00
Nikolay Aleksandrov f4b05d27ec net: constify is_skb_forwardable's arguments
is_skb_forwardable is not supposed to change anything so constify its
arguments

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:13:36 -04:00
Linus Torvalds 1d003af2ef Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "20 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  Documentation/sysctl/vm.txt: update numa_zonelist_order description
  lib/stackdepot.c: allow the stack trace hash to be zero
  rapidio: fix potential NULL pointer dereference
  mm/memory-failure: fix race with compound page split/merge
  ocfs2/dlm: return zero if deref_done message is successfully handled
  Ananth has moved
  kcov: don't profile branches in kcov
  kcov: don't trace the code coverage code
  mm: wake kcompactd before kswapd's short sleep
  .mailmap: add Frank Rowand
  mm/hwpoison: fix wrong num_poisoned_pages accounting
  mm: call swap_slot_free_notify() with page lock held
  mm: vmscan: reclaim highmem zone if buffer_heads is over limit
  numa: fix /proc/<pid>/numa_maps for THP
  mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
  mailmap: fix Krzysztof Kozlowski's misspelled name
  thp: keep huge zero page pinned until tlb flush
  mm: exclude HugeTLB pages from THP page_mapped() logic
  kexec: export OFFSET(page.compound_head) to find out compound tail page
  kexec: update VMCOREINFO for compound_order/dtor
2016-04-29 11:21:22 -07:00
Thierry Reding 53d2a715c2 phy: Add Tegra XUSB pad controller support
Add a new driver for the XUSB pad controller found on NVIDIA Tegra SoCs.
This hardware block used to be exposed as a pin controller, but it turns
out that this isn't a good fit. The new driver and DT binding much more
accurately describe the hardware and are more flexible in supporting new
SoC generations.

Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-04-29 16:44:47 +02:00
Thierry Reding 1140f7c899 phy: core: Allow children node to be overridden
In order to more flexibly support device tree bindings, allow drivers to
override the container of the child nodes. By default the device node of
the PHY provider is assumed to be the parent for children, but bindings
may decide to add additional levels for better organization.

Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2016-04-29 16:39:39 +02:00
Jiang Qiu 4ba8cfa79f gpio: dwapb: convert device node to fwnode
This patch converts device node to fwnode for dwapb driver, so
as to provide a unified fwnode for DT and ACPI bindings.

Tested-by: Alan Tull <delicious.quinoa@gmail.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Jiang Qiu <qiujiang@huawei.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-04-29 11:23:53 +02:00