Commit graph

533489 commits

Author SHA1 Message Date
Thomas Gleixner a899418167 hotplug: Prevent alloc/free of irq descriptors during cpu up/down
When a cpu goes up some architectures (e.g. x86) have to walk the irq
space to set up the vector space for the cpu. While this needs extra
protection at the architecture level we can avoid a few race
conditions by preventing the concurrent allocation/free of irq
descriptors and the associated data.

When a cpu goes down it moves the interrupts which are targeted to
this cpu away by reassigning the affinities. While this happens
interrupts can be allocated and freed, which opens a can of race
conditions in the code which reassignes the affinities because
interrupt descriptors might be freed underneath.

Example:

CPU1				CPU2
cpu_up/down
 irq_desc = irq_to_desc(irq);
				remove_from_radix_tree(desc);
 raw_spin_lock(&desc->lock);
				free(desc);

We could protect the irq descriptors with RCU, but that would require
a full tree change of all accesses to interrupt descriptors. But
fortunately these kind of race conditions are rather limited to a few
things like cpu hotplug. The normal setup/teardown is very well
serialized. So the simpler and obvious solution is:

Prevent allocation and freeing of interrupt descriptors accross cpu
hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de
2015-07-08 11:32:25 +02:00
Tvrtko Ursulin 2c3d99845e drm/i915: Restore all GGTT VMAs on resume
When rotated and partial views were added no one spotted the resume
path which assumes only one GGTT VMA per object and hence is now
skipping rebind of alternative views.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-08 11:28:47 +02:00
Sébastien Hinderer 69711ca19b x86/kconfig: Fix typo in the CONFIG_CMDLINE_BOOL help text
Signed-off-by: Sébastien Hinderer <Sebastien.Hinderer@ens-lyon.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samuel Thibault <Samuel.Thibault@ens-lyon.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-08 11:10:56 +02:00
Julien Grall 86e8971800 netfilter: bridge: Use __in6_dev_get rather than in6_dev_get in br_validate_ipv6
The commit efb6de9b4b "netfilter: bridge:
forward IPv6 fragmented packets" introduced a new function
br_validate_ipv6 which take a reference on the inet6 device. Although,
the reference is not released at the end.

This will result to the impossibility to destroy any netdevice using
ipv6 and bridge.

It's possible to directly retrieve the inet6 device without taking a
reference as all netfilter hooks are protected by rcu_read_lock via
nf_hook_slow.

Spotted while trying to destroy a Xen guest on the upstream Linux:
"unregister_netdevice: waiting for vif1.0 to become free. Usage count = 1"

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Bernhard Thaler <bernhard.thaler@wvnet.at>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: fw@strlen.de
Cc: ian.campbell@citrix.com
Cc: wei.liu2@citrix.com
Cc: Bob Liu <bob.liu@oracle.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-08 11:02:16 +02:00
Pablo Neira Ayuso 91c269a0d3 MAINTAINER: add bridge netfilter
So scripts/get_maintainer.pl shows the Netfilter mailing lists.

Reported-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-07-08 11:01:36 +02:00
Herbert Xu 030f4e9687 crypto: nx - Fix reentrancy bugs
This patch fixes a host of reentrancy bugs in the nx driver.  The
following algorithms are affected:

* CCM
* GCM
* CTR
* XCBC
* SHA256
* SHA512

The crypto API allows a single transform to be used by multiple
threads simultaneously.  For example, IPsec will use a single tfm
to process packets for a given SA.  As packets may arrive on
multiple CPUs that tfm must be reentrant.

The nx driver does try to deal with this by using a spin lock.
Unfortunately only the basic AES/CBC/ECB algorithms do this in
the correct way.

The symptom of these bugs may range from the generation of incorrect
output to memory corruption.

Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-07-08 15:14:13 +08:00
Michael Ellerman 9958084a52 powerpc: Update MAINTAINERS to point at shared tree
Now that we have a shared powerpc tree on kernel.org, point folks at that
as the primary place to look for powerpc stuff.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08 15:18:21 +10:00
Sukadev Bhattiprolu 442053e57a powerpc/perf/24x7: Fix lockdep warning
The sysfs attributes for the 24x7 counters are dynamically allocated.
Initialize the attributes using sysfs_attr_init() to fix following
warning which occurs when CONFIG_DEBUG_LOCK_VMALLOC=y.

[    0.346249] audit: initializing netlink subsys (disabled)
[    0.346284] audit: type=2000 audit(1436295254.340:1): initialized
[    0.346489] BUG: key c0000000efe90198 not in .data!
[    0.346491] DEBUG_LOCKS_WARN_ON(1)
[    0.346502] ------------[ cut here ]------------
[    0.346504] WARNING: at ../kernel/locking/lockdep.c:3002
[    0.346506] Modules linked in:

Reported-by: Gustavo Luiz Duarte <gustavold@linux.vnet.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Tested-by: Gustavo Luiz Duarte <gustavold@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08 15:18:04 +10:00
Ian Munsie 10a5894f2d cxl: Fix off by one error allowing subsequent mmap page to be accessed
It was discovered that if a process mmaped their problem state area they
were able to access one page more than expected, potentially allowing
them to access the problem state area of an unrelated process.

This was due to a simple off by one error in the mmap fault handler
introduced in 0712dc7e73 ("cxl: Fix issues
when unmapping contexts"), which is fixed in this patch.

Cc: stable@vger.kernel.org
Fixes: 0712dc7e73 ("cxl: Fix issues when unmapping contexts")
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08 15:17:47 +10:00
Ian Munsie 5caaf53468 cxl: Fail mmap if requested mapping is larger than assigned problem state area
This patch makes the mmap call fail outright if the requested region is
larger than the problem state area assigned to the context so the error
is reported immediately rather than waiting for an attempt to access an
address out of bounds.

Although we never expect users to map more than the assigned problem
state area and are not aware of anyone doing this (other than for
testing), this does have the potential to break users if someone has
used a larger range regardless. I'm submitting it for consideration, but
if this change is not considered acceptable the previous patch is
sufficient to prevent access out of bounds without breaking anyone.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08 15:17:46 +10:00
Ralf Baechle 7928eb0370 MIPS: O32: Do not handle require 32 bytes from the stack to be readable.
Commit 46e12c07b3 (MIPS: O32 / 32-bit:
Always copy 4 stack arguments.) change the O32 syscall handler to always
load four arguments from the userspace stack even for syscalls that
require fewer or no arguments to be copied.  This removes a large table
from kernel space and need to maintain it.  It appeared that it was ok
the implementation chosen requires 16 bytes of readable stack space
above the user stack pointer.

Turned out a few threading implementations munmap the user stack before
the thread exits resulting in errors due to the unreadable stack.

We now treat any failed load as a if the loaded value was zero and let
the actual syscall deal with the situation.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-07-08 05:03:30 +02:00
Pankaj Dev 56551da925 drivers: clk: st: Incorrect register offset used for lock_status
Incorrect register offset used for sthi407 clockgenC

Signed-off-by: Pankaj Dev <pankaj.dev@st.com>
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@linaro.org>
Fixes: 51306d56ba ("clk: st: STiH407: Support for clockgenC0")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2015-07-07 16:05:08 -07:00
Linus Torvalds d6ac4ffc61 Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "These are late by a week; they should have been merged during the
  merge window, but unfortunately, the ARM kernel build/boot farms were
  indicating random failures, and it wasn't clear whether the cause was
  something in these changes or something during the merge window.

  This is a set of merge window fixes with some documentation additions"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: avoid unwanted GCC memset()/memcpy() optimisations for IO variants
  ARM: pgtable: document mapping types
  ARM: io: convert ioremap*() to functions
  ARM: io: fix ioremap_wt() implementation
  ARM: io: document ARM specific behaviour of ioremap*() implementations
  ARM: fix lockdep unannotated irqs-off warning
  ARM: 8397/1: fix vdsomunge not to depend on glibc specific error.h
  ARM: add helpful message when truncating physical memory
  ARM: add help text for HIGHPTE configuration entry
  ARM: fix DEBUG_SET_MODULE_RONX build dependencies
  ARM: 8396/1: use phys_addr_t in pfn_to_kaddr()
  ARM: 8394/1: update memblock limit after mapping lowmem
  ARM: 8393/1: smp: Fix suspicious RCU usage with ipi tracepoints
2015-07-07 15:19:09 -07:00
Tomas Winkler 4f273959b8 mei: nfc: fix deadlock on shutdown/suspend path
In function mei_nfc_host_exit mei_cl_remove_device cannot be called
under the device mutex as device removing flow invokes the device driver
remove handler that calls in turn to mei_cl_disable_device which
naturally acquires the device mutex.

Also remove mei_cl_bus_remove_devices which has the same issue, but is
never executed as currently the only device on the mei client bus is NFC
and a new device cannot be easily added till the bus revamp is
completed.

This fixes regression caused by commit be9b720a0c ("mei_phy: move all
nfc logic from mei driver to nfc")

Prior to this change the nfc driver remove handler called to no-op
disable function while actual nfc device was disabled directly from the
mei driver.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-07 15:04:12 -07:00
Rafael J. Wysocki 8076ca480f Merge branch 'acpi-scan'
* acpi-scan:
  ata: ahci_platform: Add ACPI _CLS matching
  ACPI / scan: Add support for ACPI _CLS device matching
2015-07-07 22:48:25 +02:00
Rafael J. Wysocki d0aee67fa1 Merge branches 'acpi-pnp', 'acpi-soc', 'pm-domains' and 'pm-sleep'
* acpi-pnp:
  ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage

* acpi-soc:
  ACPI / LPSS: Fix up acpi_lpss_create_device()

* pm-domains:
  PM / Domains: Avoid infinite loops in attach/detach code

* pm-sleep:
  PM / hibernate: clarify resume documentation
2015-07-07 22:48:14 +02:00
Rafael J. Wysocki 3fc7aeeb08 Merge branch 'pm-wakeirq'
* pm-wakeirq:
  PM / wakeirq: Avoid setting power.wakeirq too hastily
2015-07-07 22:47:43 +02:00
Thomas Gleixner 37b64a4206 tick/broadcast: Unbreak CONFIG_GENERIC_CLOCKEVENTS=n build
Making tick_broadcast_oneshot_control() independent from
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST broke the build for
CONFIG_GENERIC_CLOCKEVENTS=n because the function is not defined
there.

Provide a proper stub inline.

Fixes: f32dd11705 'tick/broadcast: Make idle check independent from mode and config'
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-07 21:56:34 +02:00
Ralf Baechle 0bb383a2d8 MIPS, CPUFREQ: Fix spelling of Institute.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-07-07 20:59:42 +02:00
Ralf Baechle 71eeedcf51 MIPS: Lemote 2F: Fix build caused by recent mass rename.
CC      arch/mips/loongson64/lemote-2f/clock.o
/home/ralf/src/linux/linux-mips/arch/mips/loongson64/lemote-2f/clock.c:18:40: fatal error: asm/mach-loongson/loongson.h: No such file or directory
 #include <asm/mach-loongson/loongson.h>
                                        ^
compilation terminated.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-07-07 20:59:34 +02:00
duson 539c4b8814 Input: elan_i2c - change the hover event from MT to ST
The firmware only reports hover condition while the very first contact is
approaching the surface; the hover is not reported for the subsequent
contacts. Therefore we should not be using ABS_MT_DISTANCE to report hover
but rather its single-touch counterpart ABS_DISTANCE.

Signed-off-by: Duson Lin <dusonlin@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2015-07-07 11:28:31 -07:00
Thomas Gleixner c428833481 tick/broadcast: Handle spurious interrupts gracefully
Andriy reported that on a virtual machine the warning about negative
expiry time in the clock events programming code triggered:

hpet: hpet0 irq 40 for MSI
hpet: hpet1 irq 41 for MSI
Switching to clocksource hpet
WARNING: at kernel/time/clockevents.c:239

[<ffffffff810ce6eb>] clockevents_program_event+0xdb/0xf0
[<ffffffff810cf211>] tick_handle_periodic_broadcast+0x41/0x50
[<ffffffff81016525>] timer_interrupt+0x15/0x20

When the second hpet is installed as a per cpu timer the broadcast
event is not longer required and stopped, which sets the next_evt of
the broadcast device to KTIME_MAX.

If after that a spurious interrupt happens on the broadcast device,
then the current code blindly handles it and tries to reprogram the
broadcast device afterwards, which adds the period to
next_evt. KTIME_MAX + period results in a negative expiry value
causing the WARN_ON in the clockevents code to trigger.

Add a proper check for the state of the broadcast device into the
interrupt handler and return if the interrupt is spurious.

[ Folded in pointer fix from Sudeep ]

Reported-by: Andriy Gapon <avg@FreeBSD.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20150705205221.802094647@linutronix.de
2015-07-07 18:46:48 +02:00
Thomas Gleixner d5113e13a5 tick/broadcast: Check for hrtimer broadcast active early
If the current cpu is the one which has the hrtimer based broadcast
queued then we better return busy immediately instead of going through
loops and hoops to figure that out.

[ Split out from a larger combo patch ]

Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:48 +02:00
Thomas Gleixner 0cc5281aa5 tick/broadcast: Return busy when IPI is pending
Tell the idle code not to go deep if the broadcast IPI is about to
arrive.

[ Split out from a larger combo patch ]

Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:48 +02:00
Thomas Gleixner d33257264b tick/broadcast: Return busy if periodic mode and hrtimer broadcast
If the system is in periodic mode and the broadcast device is hrtimer
based, return busy as we have no proper handling for this.

[ Split out from a larger combo patch ]

Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:48 +02:00
Thomas Gleixner e3ac79e087 tick/broadcast: Move the check for periodic mode inside state handling
We need to check more than the periodic mode for proper operation in
all runtime combinations. To avoid code duplication move the check
into the enter state handling.

No functional change.

[ Split out from a larger combo patch ]

Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:47 +02:00
Thomas Gleixner b78f3f3c89 tick/broadcast: Prevent deep idle if no broadcast device available
Add a check for a installed broadcast device to the oneshot control
function and return busy if not.

[ Split out from a larger combo patch ]

Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:47 +02:00
Thomas Gleixner f32dd11705 tick/broadcast: Make idle check independent from mode and config
Currently the broadcast busy check, which prevents the idle code from
going into deep idle, works only in one shot mode.

If NOHZ and HIGHRES are off (config or command line) there is no
sanity check at all, so under certain conditions cpus are allowed to
go into deep idle, where the local timer stops, and are not woken up
again because there is no broadcast timer installed or a hrtimer based
broadcast device is not evaluated.

Move tick_broadcast_oneshot_control() into the common code and provide
proper subfunctions for the various config combinations.

The common check in tick_broadcast_oneshot_control() is for the C3STOP
misfeature flag of the local clock event device. If its not set, idle
can proceed. If set, further checks are necessary.

Provide checks for the trivial cases:

 - If broadcast is disabled in the config, then return busy

 - If oneshot mode (NOHZ/HIGHES) is disabled in the config, return
   busy if the broadcast device is hrtimer based.

 - If oneshot mode is enabled in the config call the original
   tick_broadcast_oneshot_control() function. That function needs
   extra checks which will be implemented in seperate patches.

[ Split out from a larger combo patch ]

Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:47 +02:00
Thomas Gleixner e045431190 tick/broadcast: Sanity check the shutdown of the local clock_event
The broadcast code shuts down the local clock event unconditionally
even if no broadcast device is installed or if the broadcast device is
hrtimer based.

Add proper sanity checks.

[ Split out from a larger combo patch ]

Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:47 +02:00
Thomas Gleixner 8eb231261f tick/broadcast: Prevent hrtimer recursion
The hrtimer based broadcast vehicle can cause a hrtimer recursion
which went unnoticed until we changed the hrtimer expiry code to keep
track of the currently running timer.

local_timer_interrupt()
  local_handler()
    hrtimer_interrupt()
      expire_hrtimers()
        broadcast_hrtimer()
	  send_ipis()
	  local_handler()
	    hrtimer_interrupt()
	     ....

Solution is simple: Prevent the local handler call from the broadcast
code when the broadcast 'device' is hrtimer based.

[ Split out from a larger combo patch ]

Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos
2015-07-07 18:46:47 +02:00
Catalin Marinas ef37566cf8 arm64: Keep the ARM64 Kconfig selects sorted
Move EDAC_SUPPORT to the right place.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-07 17:15:39 +01:00
Al Stone 99e3e3ae33 ACPI / ARM64 : use the new BAD_MADT_GICC_ENTRY macro
For those parts of the arm64 ACPI code that need to check GICC subtables
in the MADT, use the new BAD_MADT_GICC_ENTRY macro instead of the previous
BAD_MADT_ENTRY.  The new macro takes into account differences in the size
of the GICC subtable that the old macro did not; this caused failures even
though the subtable entries are valid.

Fixes: aeb823bbac ("ACPICA: ACPI 6.0: Add changes for FADT table.")
Signed-off-by: Al Stone <al.stone@linaro.org>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-07 14:55:04 +01:00
Al Stone b6cfb27737 ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro
The BAD_MADT_ENTRY() macro is designed to work for all of the subtables
of the MADT.  In the ACPI 5.1 version of the spec, the struct for the
GICC subtable (struct acpi_madt_generic_interrupt) is 76 bytes long; in
ACPI 6.0, the struct is 80 bytes long.  But, there is only one definition
in ACPICA for this struct -- and that is the 6.0 version.  Hence, when
BAD_MADT_ENTRY() compares the struct size to the length in the GICC
subtable, it fails if 5.1 structs are in use, and there are systems in
the wild that have them.

This patch adds the BAD_MADT_GICC_ENTRY() that checks the GICC subtable
only, accounting for the difference in specification versions that are
possible.  The BAD_MADT_ENTRY() will continue to work as is for all other
MADT subtables.

This code is being added to an arm64 header file since that is currently
the only architecture using the GICC subtable of the MADT.  As a GIC is
specific to ARM, it is also unlikely the subtable will be used elsewhere.

Fixes: aeb823bbac ("ACPICA: ACPI 6.0: Add changes for FADT table.")
Signed-off-by: Al Stone <al.stone@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
[catalin.marinas@arm.com: extra brackets around macro arguments]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-07 14:54:59 +01:00
Russell King 06be5eefe1 Merge branches 'fixes' and 'ioremap' into for-linus 2015-07-07 12:35:33 +01:00
Rafael J. Wysocki 6d3dab7d84 PM / wakeirq: Avoid setting power.wakeirq too hastily
If dev_pm_attach_wake_irq() fails, the device's power.wakeirq field
should not be set to point to the struct wake_irq passed to that
function, as that object will be freed going forward.

For this reason, make dev_pm_attach_wake_irq() first call
device_wakeup_attach_irq() and only set the device's power.wakeirq
field if that's successful.

That requires device_wakeup_attach_irq() to be called under the
device's power.lock lock, but since dev_pm_attach_wake_irq() is
the only caller of it, the requisite changes are easy to make.

Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07 13:08:39 +02:00
Thomas Gleixner 09cf92b784 x86/irq: Retrieve irq data after locking irq_desc
irq_data is protected by irq_desc->lock, so retrieving the irq chip
from irq_data outside the lock is racy vs. an concurrent update. Move
it into the lock held region.

While at it add a comment why the vector walk does not require
vector_lock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.331320612@linutronix.de
2015-07-07 11:54:04 +02:00
Thomas Gleixner cbb24dc761 x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()
It's unsafe to examine fields in the irq descriptor w/o holding the
descriptor lock. Add proper locking.

While at it add a comment why the vector check can run lock less

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.236544164@linutronix.de
2015-07-07 11:54:04 +02:00
Thomas Gleixner 5a3f75e3f0 x86/irq: Plug irq vector hotplug race
Jin debugged a nasty cpu hotplug race which results in leaking a irq
vector on the newly hotplugged cpu.

cpu N				cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu			  free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       setup_vector_irq                  arch_teardown_msi_irq
        __setup_vector_irq		   native_teardown_msi_irq
          lock(vector_lock)		     destroy_irq 
          install vectors
          unlock(vector_lock)
					       lock(vector_lock)
--->                                  	       __clear_irq_vector
                                    	       unlock(vector_lock)
    lock(vector_lock)
    set_cpu_online
    unlock(vector_lock)

This leaves the irq vector(s) which are torn down on CPU M stale in
the vector array of CPU N, because CPU M does not see CPU N online
yet. There is a similar issue with concurrent newly setup interrupts.

The alloc/free protection of irq descriptors does not prevent the
above race, because it merily prevents interrupt descriptors from
going away or changing concurrently.

Prevent this by moving the call to setup_vector_irq() into the
vector_lock held region which protects set_cpu_online():

cpu N				cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu			  free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       lock(vector_lock)                arch_teardown_msi_irq
       setup_vector_irq()
        __setup_vector_irq		   native_teardown_msi_irq
          install vectors		     destroy_irq 
       set_cpu_online
       unlock(vector_lock)
					       lock(vector_lock)
                                  	       __clear_irq_vector
                                    	       unlock(vector_lock)

So cpu M either sees the cpu N online before clearing the vector or
cpu N installs the vectors after cpu M has cleared it.

Reported-by: xiao jin <jin.xiao@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.141898931@linutronix.de
2015-07-07 11:54:04 +02:00
Michael Neuling 3f8dc44d88 cxl: Fix refcounting in kernel API
Currently the kernel API AFU dev refcounting is done on context start and stop.
This patch moves this refcounting to context init and release, bringing it
inline with how the userspace API does it.

Without this we've seen the refcounting on the AFU get out of whack between the
user and kernel API usage.  This causes the AFU structures to be freed when
they are actually still in use.

This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07 19:38:37 +10:00
Viresh Kumar 7c4a976cd5 clockevents: Allow set-state callbacks to be optional
Its mandatory for the drivers to provide set_state_{oneshot|periodic}()
(only if related modes are supported) and set_state_shutdown() callbacks
today, if they are implementing the new set-state interface.

But this leads to unnecessary noop callbacks for drivers which don't
want to implement them. Over that, it will lead to a full function call
for nothing really useful.

Lets make all set-state callbacks optional.

Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1436256875-15562-1-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-07 10:44:45 +02:00
Philippe Reynes 747d34e731 clocksource/imx: Define clocksource for mx27
The rework of the imx clocksource driver missed to add an entry for
imx27 which results in a boot failure on those machines.

Add the proper CLOCKSOURCE_OF_DECLARE() entry for imx27 and map it to
the imx21 init.

Fixes: bef11c881b 'ARM: imx: initialize gpt device type for DT boot'
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: daniel.lezcano@linaro.org
Cc: fabio.estevam@freescale.com
Cc: shawnguo@kernel.org
Cc: kernel@pengutronix.de
Link: http://lkml.kernel.org/r/1435439504-406-1-git-send-email-tremyfr@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-07 10:44:45 +02:00
Shreyas B. Prabhu b32aadc1a8 powerpc/powernv: Fix race in updating core_idle_state
core_idle_state is maintained for each core. It uses 0-7 bits to track
whether a thread in the core has entered fastsleep or winkle. 8th bit is
used as a lock bit.
The lock bit is set in these 2 scenarios-
 - The thread is first in subcore to wakeup from sleep/winkle.
 - If its the last thread in the core about to enter sleep/winkle

While the lock bit is set, if any other thread in the core wakes up, it
loops until the lock bit is cleared before proceeding in the wakeup
path. This helps prevent race conditions w.r.t fastsleep workaround and
prevents threads from switching to process context before core/subcore
resources are restored.

But, in the path to sleep/winkle entry, we currently don't check for
lock-bit. This exposes us to following race when running with subcore
on-

First thread in the subcorea		Another thread in the same
waking up		   		core entering sleep/winkle

lwarx   r15,0,r14
ori     r15,r15,PNV_CORE_IDLE_LOCK_BIT
stwcx.  r15,0,r14
[Code to restore subcore state]

						lwarx   r15,0,r14
						[clear thread bit]
						stwcx.  r15,0,r14

andi.   r15,r15,PNV_CORE_IDLE_THREAD_BITS
stw     r15,0(r14)

Here, after the thread entering sleep clears its thread bit in
core_idle_state, the value is overwritten by the thread waking up.
In such cases when the core enters fastsleep, code mistakes an idle
thread as running. Because of this, the first thread waking up from
fastsleep which is supposed to resync timebase skips it. So we can
end up having a core with stale timebase value.

This patch fixes the above race by looping on the lock bit even while
entering the idle states.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Fixes: 7b54e9f213f76 'powernv/powerpc: Add winkle support for offline cpus'
Cc: stable@vger.kernel.org # 3.19+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07 10:16:52 +10:00
Linus Torvalds c7e9ad7da2 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:

 - fix the perf build, by fixing the rbtree.c sharing bug between kernel
   and tools/perf by creating a local copy of rbtree.c (more will be
   done for v4.3)

 - fix an AUX buffer (Intel-PT support) refcounting bug

 - fix copy_from_user_nmi() return value"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix copy_from_user_nmi() return if range is not ok
  perf: Fix AUX buffer refcounting
  tools: Copy rbtree_augmented.h from the kernel
  tools: Move rbtree.h from tools/perf/
  tools: Copy lib/rbtree.c to tools/lib/
  perf tools: Copy rbtree.h from the kernel
  tools: Adopt {READ,WRITE_ONCE} from the kernel
2015-07-06 17:07:56 -07:00
Suthikulpanit, Suravee 2051e92486 ata: ahci_platform: Add ACPI _CLS matching
This patch adds ACPI supports for AHCI platform driver, which uses _CLS
method to match the device.

The following is an example of ASL structure in DSDT for a SATA controller,
which contains _CLS package to be matched by the ahci_platform driver:

  Device (AHC0) // AHCI Controller
  {
    Name(_HID, "AMDI0600")
    Name (_CCA, 1)
    Name (_CLS, Package (3)
    {
      0x01, // Base Class: Mass Storage
      0x06, // Sub-Class: serial ATA
      0x01, // Interface: AHCI
    })
    Name (_CRS, ResourceTemplate ()
    {
      Memory32Fixed (ReadWrite, 0xE0300000, 0x00010000)
      Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive,,,) { 387 }
    })
  }

Also, since ATA driver should not require PCI support for ATA_ACPI,
this patch removes dependency in the driver/ata/Kconfig.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07 01:55:21 +02:00
Suthikulpanit, Suravee 26095a01d3 ACPI / scan: Add support for ACPI _CLS device matching
Device drivers typically use ACPI _HIDs/_CIDs listed in struct device_driver
acpi_match_table to match devices. However, for generic drivers, we do not
want to list _HID for all supported devices. Also, certain classes of devices
do not have _CID (e.g. SATA, USB). Instead, we can leverage ACPI _CLS,
which specifies PCI-defined class code (i.e. base-class, subclass and
programming interface). This patch adds support for matching ACPI devices using
the _CLS method.

To support loadable module, current design uses _HID or _CID to match device's
modalias. With the new way of matching with _CLS this would requires modification
to the current ACPI modalias key to include _CLS. This patch appends PCI-defined
class-code to the existing ACPI modalias as following.

    acpi:<HID>:<CID1>:<CID2>:..:<CIDn>:<bbsspp>:
E.g:
    # cat /sys/devices/platform/AMDI0600:00/modalias
    acpi:AMDI0600:010601:

where bb is th base-class code, ss is te sub-class code, and pp is the
programming interface code

Since there would not be _HID/_CID in the ACPI matching table of the driver,
this patch adds a field to acpi_device_id to specify the matching _CLS.

    static const struct acpi_device_id ahci_acpi_match[] = {
        { ACPI_DEVICE_CLASS(PCI_CLASS_STORAGE_SATA_AHCI, 0xffffff) },
        {},
    };

In this case, the corresponded entry in modules.alias file would be:

    alias acpi*:010601:* ahci_platform

Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07 01:55:20 +02:00
Uwe Geuder b51f9b103f PM / hibernate: clarify resume documentation
it was not the whole truth that kernel mode cannot be used with swap on LVM

Signed-off-by: Uwe Geuder <linuxkernel2015-ugeuder@snkmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07 01:18:11 +02:00
Geert Uytterhoeven 93af5e9354 PM / Domains: Avoid infinite loops in attach/detach code
If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end
up with an infinite loop in genpd_dev_pm_{at,de}tach().

This may happen due to a genpd.prepared_count imbalance.  This is a bug
elsewhere, but it will result in a system lock up, possibly during
reboot of an otherwise functioning system.

To avoid this, put a limit on the maximum number of loop iterations,
using an exponential back-off mechanism.  If the limit is reached, the
operation will just fail.  An error message is already printed.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07 01:13:07 +02:00
Sascha Hauer 7b2a4635b8 clk: mediatek: mt8173: Fix enabling of critical clocks
On the MT8173 the clocks are provided by different units. To enable
the critical clocks we must be sure that all parent clocks are already
registered, otherwise the parents of the critical clocks end up being
unused and get disabled later. To find a place where all parents are
registered we try each time after we've registered some clocks if
all known providers are present now and only then we enable the critical
clocks

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: James Liao <jamesjj.liao@mediatek.com>
[sboyd@codeaurora.org: Marked function and data __init]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2015-07-06 15:54:13 -07:00
Rafael J. Wysocki d3e13ff3c1 ACPI / LPSS: Fix up acpi_lpss_create_device()
Fix a return value (which should be a negative error code) and a
memory leak (the list allocated by acpi_dev_get_resources() needs
to be freed on ioremap() errors too) in acpi_lpss_create_device()
introduced by commit 4483d59e29 'ACPI / LPSS: check the result
of ioremap()'.

Fixes: 4483d59e29 'ACPI / LPSS: check the result of ioremap()'
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07 00:31:47 +02:00
Rafael J. Wysocki 0294112ee3 ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage
This effectively reverts the following three commits:

 7bc10388cc ACPI / resources: free memory on error in add_region_before()
 0f1b414d19 ACPI / PNP: Avoid conflicting resource reservations
 b9a5e5e18f ACPI / init: Fix the ordering of acpi_reserve_resources()

(commit b9a5e5e18f introduced regressions some of which, but not
all, were addressed by commit 0f1b414d19 and commit 7bc10388cc
was a fixup on top of the latter) and causes ACPI fixed hardware
resources to be reserved at the fs_initcall_sync stage of system
initialization.

The story is as follows.  First, a boot regression was reported due
to an apparent resource reservation ordering change after a commit
that shouldn't lead to such changes.  Investigation led to the
conclusion that the problem happened because acpi_reserve_resources()
was executed at the device_initcall() stage of system initialization
which wasn't strictly ordered with respect to driver initialization
(and with respect to the initialization of the pcieport driver in
particular), so a random change causing the device initcalls to be
run in a different order might break things.

The response to that was to attempt to run acpi_reserve_resources()
as soon as we knew that ACPI would be in use (commit b9a5e5e18f).
However, that turned out to be too early, because it caused resource
reservations made by the PNP system driver to fail on at least one
system and that failure was addressed by commit 0f1b414d19.

That fix still turned out to be insufficient, though, because
calling acpi_reserve_resources() before the fs_initcall stage of
system initialization caused a boot regression to happen on the
eCAFE EC-800-H20G/S netbook.  That meant that we only could call
acpi_reserve_resources() at the fs_initcall initialization stage
or later, but then we might just as well call it after the PNP
initalization in which case commit 0f1b414d19 wouldn't be
necessary any more.

For this reason, the changes made by commit 0f1b414d19 are reverted
(along with a memory leak fixup on top of that commit), the changes
made by commit b9a5e5e18f that went too far are reverted too and
acpi_reserve_resources() is changed into fs_initcall_sync, which
will cause it to be executed after the PNP subsystem initialization
(which is an fs_initcall) and before device initcalls (including
the pcieport driver initialization) which should avoid the initial
issue.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581
Link: http://marc.info/?t=143092384600002&r=1&w=2
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
Link: http://marc.info/?t=143389402600001&r=1&w=2
Fixes: b9a5e5e18f "ACPI / init: Fix the ordering of acpi_reserve_resources()"
Reported-by: Roland Dreier <roland@purestorage.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-06 23:52:21 +02:00