Commit graph

10455 commits

Author SHA1 Message Date
Ingo Molnar 0967160ad6 Merge branch 'x86/asm' into perf/x86, to avoid conflicts with upcoming patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-04 09:01:12 +01:00
Ingo Molnar b57c0b5175 x86: Entry cleanups and a bugfix for 3.20
This fixes a bug in the RCU code I added in ist_enter.  It also includes
 the sysret stuff discussed here:
 
 http://lkml.kernel.org/g/cover.1421453410.git.luto%40amacapital.net
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUzhZ0AAoJEK9N98ZeDfrksUEH/j7wkUlMGan5h1AQIZQW6gKk
 OjlE1a4rfcgKocgkc0ix6UMc8Ks/NAUWKpeHR08eqR+Xi6Yk29cqLkboTEmAdYJ3
 jQvKjGu51kiprNjAGqF5wdqxvCT3oBSdm7CWdtY4zHkEr+2W93Ht9PM7xZhj4r+P
 ekUC8mIKQrhyhlC7g7VpXLAi3Bk4mO+f499T7XBVsVoywWpgVpOMYMhtUobV1reW
 V7/zul/dMerzNLB0t3amvdgCLphHBQTQ0fHBAN62RY78UvSDt36EZFyS65isirsR
 LhO4FpWzF5YNMRk8Dep/fB8jYlhsCi40ZIlOtGSE6kNJyLhPt+oLnkpgOwWAMQc=
 =uiRw
 -----END PGP SIGNATURE-----

Merge tag 'pr-20150201-x86-entry' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux into x86/asm

Pull "x86: Entry cleanups and a bugfix for 3.20" from Andy Lutomirski:

 " This fixes a bug in the RCU code I added in ist_enter.  It also includes
   the sysret stuff discussed here:

     http://lkml.kernel.org/g/cover.1421453410.git.luto%40amacapital.net "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-03 12:24:08 +01:00
Ingo Molnar 8dbcb8737c Linux 3.19-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUzvgKAAoJEHm+PkMAQRiG8XQH/1qVbHI4pP0KcnzfZUHq/mXq
 RuS4aJMwLm/Y6cXFraXBDaPde1A3CPtwtpob2C6giKcfu2zXGunY65haOEeJWNpX
 lCbBsLkNC3oDNkygBpVr5Zd6yibaw63WBjjLnpAi7pn2G2Zm2zB8DfILWWWMb7yz
 MH8ZXV+/xIYCTkjNWGWA1iMjmdYqu0PQHPeOgLsYQ+u7rxfM1zb/wHEkjqUZS6iu
 IaaZv7PV2PnFYnqib/iIPYjAEDvSQ4vN/7b82zlFd2Culm9j/568KCCWUPhJTb2l
 X0u4QYs49GnMTWVRa3bgYxS/nTUaE/6DeWs2y2WzqTt0/XDntVUnok0blUeDxGk=
 =o2kS
 -----END PGP SIGNATURE-----

Merge tag 'v3.19-rc7' into x86/asm, to refresh the branch before pulling in new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-03 12:22:18 +01:00
Andy Lutomirski 96b6352c12 x86_64, entry: Remove the syscall exit audit and schedule optimizations
We used to optimize rescheduling and audit on syscall exit.  Now
that the full slow path is reasonably fast, remove these
optimizations.  Syscall exit auditing is now handled exclusively by
syscall_trace_leave.

This adds something like 10ns to the previously optimized paths on
my computer, presumably due mostly to SAVE_REST / RESTORE_REST.

I think that we should eventually replace both the syscall and
non-paranoid interrupt exit slow paths with a pair of C functions
along the lines of the syscall entry hooks.

Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.net
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-02-01 04:03:02 -08:00
Andy Lutomirski 2a23c6b8a9 x86_64, entry: Use sysret to return to userspace when possible
The x86_64 entry code currently jumps through complex and
inconsistent hoops to try to minimize the impact of syscall exit
work.  For a true fast-path syscall, almost nothing needs to be
done, so returning is just a check for exit work and sysret.  For a
full slow-path return from a syscall, the C exit hook is invoked if
needed and we join the iret path.

Using iret to return to userspace is very slow, so the entry code
has accumulated various special cases to try to do certain forms of
exit work without invoking iret.  This is error-prone, since it
duplicates assembly code paths, and it's dangerous, since sysret
can malfunction in interesting ways if used carelessly.  It's
also inefficient, since a lot of useful cases aren't optimized
and therefore force an iret out of a combination of paranoia and
the fact that no one has bothered to write even more asm code
to avoid it.

I would argue that this approach is backwards.  Rather than trying
to avoid the iret path, we should instead try to make the iret path
fast.  Under a specific set of conditions, iret is unnecessary.  In
particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not
set, and both SS and CS are as expected, then
movq 32(%rsp),%rsp;sysret does the same thing as iret.  This set of
conditions is nearly always satisfied on return from syscalls, and
it can even occasionally be satisfied on return from an irq.

Even with the careful checks for sysret applicability, this cuts
nearly 80ns off of the overhead from syscalls with unoptimized exit
work.  This includes tracing and context tracking, and any return
that invokes KVM's user return notifier.  For example, the cost of
getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to
~280ns on my computer.

This may allow the removal and even eventual conversion to C
of a respectable amount of exit asm.

This may require further tweaking to give the full benefit on Xen.

It may be worthwhile to adjust signal delivery and exec to try hit
the sysret path.

This does not optimize returns to 32-bit userspace.  Making the same
optimization for CS == __USER32_CS is conceptually straightforward,
but it will require some tedious code to handle the differences
between sysretl and sysexitl.

Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.net
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-02-01 04:03:01 -08:00
Andy Lutomirski b926e6f61a x86, traps: Fix ist_enter from userspace
context_tracking_user_exit() has no effect if in_interrupt() returns true,
so ist_enter() didn't work.  Fix it by calling exception_enter(), and thus
context_tracking_user_exit(), before incrementing the preempt count.

This also adds an assertion that will catch the problem reliably if
CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced.

Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net
Fixes: 9592747538 x86, traps: Track entry into and exit from IST context
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-02-01 04:02:53 -08:00
Ingo Molnar b3890e4704 Merge branch 'perf/hw_breakpoints' into perf/core
The new hw_breakpoint bits are now ready for v3.20, merge them
into the main branch, to avoid conflicts.

Conflicts:
	tools/perf/Documentation/perf-record.txt

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 15:48:59 +01:00
Ingo Molnar 772a9aca12 This is my accumulated x86 entry work, part 1, for 3.20. The meat
of this is an IST rework.  When an IST exception interrupts user
 space, we will handle it on the per-thread kernel stack instead of
 on the IST stack.  This sounds messy, but it actually simplifies the
 IST entry/exit code, because it eliminates some ugly games we used
 to play in order to handle rescheduling, signal delivery, etc on the
 way out of an IST exception.
 
 The IST rework introduces proper context tracking to IST exception
 handlers.  I haven't seen any bug reports, but the old code could
 have incorrectly treated an IST exception handler as an RCU extended
 quiescent state.
 
 The memory failure change (included in this pull request with
 Borislav and Tony's permission) eliminates a bunch of code that
 is no longer needed now that user memory failure handlers are
 called in process context.
 
 Finally, this includes a few on Denys' uncontroversial and Obviously
 Correct (tm) cleanups.
 
 The IST and memory failure changes have been in -next for a while.
 
 LKML references:
 
 IST rework:
 http://lkml.kernel.org/r/cover.1416604491.git.luto@amacapital.net
 
 Memory failure change:
 http://lkml.kernel.org/r/54ab2ffa301102cd6e@agluck-desk.sc.intel.com
 
 Denys' cleanups:
 http://lkml.kernel.org/r/1420927210-19738-1-git-send-email-dvlasenk@redhat.com
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUtvkFAAoJEK9N98ZeDfrkcfsIAJxZ0UBUCEDvulbqgk/iPGOa
 fIpKLMowS7CpKtw6Wdc/YvAIkeHXWm1vU44Hj0TrjSrXCgVF8yCngs/xlXtOjoa1
 dosXQqgqVJJ+hyui7chAEWyalLW7bEO8raq/6snhiMrhiuEkVKpEr7Fer4FVVCZL
 4VALmNQQsbV+Qq4pXIhuagZC0Nt/XKi/+/cKvhS4p//q1F/TbHTz0FpDUrh0jPMh
 18WFy0jWgxdkMRnSp/wJhekvdXX6PwUy5BdES9fjw8LQJZxxFpqN3Fe1kgfyzV0k
 yuvEHw1hPt2aBGj3q69wQvDVyyn4OqMpRDBhk4S+GJYmVh7mFyFMN4BDMEy/EY8=
 =LXVl
 -----END PGP SIGNATURE-----

Merge tag 'pr-20150114-x86-entry' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux into x86/asm

Pull x86/entry enhancements from Andy Lutomirski:

" This is my accumulated x86 entry work, part 1, for 3.20.  The meat
  of this is an IST rework.  When an IST exception interrupts user
  space, we will handle it on the per-thread kernel stack instead of
  on the IST stack.  This sounds messy, but it actually simplifies the
  IST entry/exit code, because it eliminates some ugly games we used
  to play in order to handle rescheduling, signal delivery, etc on the
  way out of an IST exception.

  The IST rework introduces proper context tracking to IST exception
  handlers.  I haven't seen any bug reports, but the old code could
  have incorrectly treated an IST exception handler as an RCU extended
  quiescent state.

  The memory failure change (included in this pull request with
  Borislav and Tony's permission) eliminates a bunch of code that
  is no longer needed now that user memory failure handlers are
  called in process context.

  Finally, this includes a few on Denys' uncontroversial and Obviously
  Correct (tm) cleanups.

  The IST and memory failure changes have been in -next for a while.

  LKML references:

  IST rework:
  http://lkml.kernel.org/r/cover.1416604491.git.luto@amacapital.net

  Memory failure change:
  http://lkml.kernel.org/r/54ab2ffa301102cd6e@agluck-desk.sc.intel.com

  Denys' cleanups:
  http://lkml.kernel.org/r/1420927210-19738-1-git-send-email-dvlasenk@redhat.com
"

This tree semantically depends on and is based on the following RCU commit:

  734d168013 ("rcu: Make rcu_nmi_enter() handle nesting")

... and for that reason won't be pushed upstream before the RCU bits hit Linus's tree.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 15:33:26 +01:00
Ingo Molnar 41ca5d4e9b Merge commit 3669ef9fa7 ("x86, tls: Interpret an all-zero struct user_desc as 'no segment'") into x86/asm
Pick up the latestest asm fixes before advancing it any further.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 15:30:32 +01:00
Kan Liang ef454caeb7 perf/x86/intel: Add model number for Airmont
Intel Airmont supports the same architectural and non-architectural
performance monitoring events as Silvermont.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1421913053-99803-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 13:17:32 +01:00
Stephane Eranian 98b008dff8 perf/rapl: Fix crash in rapl_scale()
This patch fixes a systematic crash in rapl_scale()
due to an invalid pointer.

The bug was introduced by commit:

  89cbc76768 ("x86: Replace __get_cpu_var uses")

The fix is simple. Just put the parenthesis where it needs
to be, i.e., around rapl_pmu. To my surprise, the compiler
was not complaining about passing an integer instead of a
pointer.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 89cbc76768 ("x86: Replace __get_cpu_var uses")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: cl@linux.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150122203834.GA10228@thinkpad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 13:04:35 +01:00
Kan Liang c05199e5a5 perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization
There were some issues about the uncore driver tried to access
non-existing boxes, which caused boot crashes. These issues have
been all fixed. But we should avoid boot failures if that ever
happens again.

This patch intends to prevent this kind of potential issues.
It moves uncore_box_init out of driver initialization. The box
will be initialized when it's first enabled.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421729665-5912-1-git-send-email-kan.liang@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 13:04:34 +01:00
Linus Torvalds 14746306af Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Hopefully the last round of fixes for 3.19

   - regression fix for the LDT changes
   - regression fix for XEN interrupt handling caused by the APIC
     changes
   - regression fixes for the PAT changes
   - last minute fixes for new the MPX support
   - regression fix for 32bit UP
   - fix for a long standing relocation issue on 64bit tagged for stable
   - functional fix for the Hyper-V clocksource tagged for stable
   - downgrade of a pr_err which tends to confuse users

  Looks a bit on the large side, but almost half of it are valuable
  comments"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Change Fast TSC calibration failed from error to info
  x86/apic: Re-enable PCI_MSI support for non-SMP X86_32
  x86, mm: Change cachemode exports to non-gpl
  x86, tls: Interpret an all-zero struct user_desc as "no segment"
  x86, tls, ldt: Stop checking lm in LDT_empty
  x86, mpx: Strictly enforce empty prctl() args
  x86, mpx: Fix potential performance issue on unmaps
  x86, mpx: Explicitly disable 32-bit MPX support on 64-bit kernels
  x86, hyperv: Mark the Hyper-V clocksource as being continuous
  x86: Don't rely on VMWare emulating PAT MSR correctly
  x86, irq: Properly tag virtualization entry in /proc/interrupts
  x86, boot: Skip relocs when load address unchanged
  x86/xen: Override ACPI IRQ management callback __acpi_unregister_gsi
  ACPI: pci: Do not clear pci_dev->irq in acpi_pci_irq_disable()
  x86/xen: Treat SCI interrupt as normal GSI interrupt
2015-01-25 18:11:17 -08:00
Alexandre Demers 520452172e x86/tsc: Change Fast TSC calibration failed from error to info
Many users see this message when booting without knowning that it is
of no importance and that TSC calibration may have succeeded by
another way.

As explained by Paul Bolle in
http://lkml.kernel.org/r/1348488259.1436.22.camel@x61.thuisdomein

  "Fast TSC calibration failed" should not be considered as an error
  since other calibration methods are being tried afterward. At most,
  those send a warning if they fail (not an error). So let's change
  the message from error to warning.

[ tglx: Make if pr_info. It's really not important at all ]

Fixes: c767a54ba0 x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1418106470-6906-1-git-send-email-alexandre.f.demers@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-23 10:53:52 +01:00
Andy Lutomirski 3669ef9fa7 x86, tls: Interpret an all-zero struct user_desc as "no segment"
The Witcher 2 did something like this to allocate a TLS segment index:

        struct user_desc u_info;
        bzero(&u_info, sizeof(u_info));
        u_info.entry_number = (uint32_t)-1;

        syscall(SYS_set_thread_area, &u_info);

Strictly speaking, this code was never correct.  It should have set
read_exec_only and seg_not_present to 1 to indicate that it wanted
to find a free slot without putting anything there, or it should
have put something sensible in the TLS slot if it wanted to allocate
a TLS entry for real.  The actual effect of this code was to
allocate a bogus segment that could be used to exploit espfix.

The set_thread_area hardening patches changed the behavior, causing
set_thread_area to return -EINVAL and crashing the game.

This changes set_thread_area to interpret this as a request to find
a free slot and to leave it empty, which isn't *quite* what the game
expects but should be close enough to keep it working.  In
particular, using the code above to allocate two segments will
allocate the same segment both times.

According to FrostbittenKing on Github, this fixes The Witcher 2.

If this somehow still causes problems, we could instead allocate
a limit==0 32-bit data segment, but that seems rather ugly to me.

Fixes: 41bdc78544 x86/tls: Validate TLS entries to protect espfix
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: stable@vger.kernel.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/0cb251abe1ff0958b8e468a9a9a905b80ae3a746.1421954363.git.luto@amacapital.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-22 21:45:07 +01:00
Linus Torvalds 193934123c Surprising number of fixes this merge window :(
First two are minor fallout from the param rework which went in this merge
 window.
 
 Next three are a series which fixes a longstanding (but never previously
 reported and unlikely , so no CC stable) race between kallsyms and freeing
 the init section.
 
 Finally, a minor cleanup as our module refcount will now be -1 during
 unload.
 
 Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUwEmwAAoJENkgDmzRrbjx77kP/1cNQR2eG2sBwokg3q0tvHnQ
 IKqEXErW7NvxRa+RAMEmy2uQoGt6+uNklAbtyJEYM9oR1NieFbPi2yrt9Xn5SAXS
 Brp1S8WYBMilA3W3o6I0trFDRWHdpdtkKIQwLWgJNSEWjbTXh8bSwp/2X1rlOPyI
 ZmphCMOQMU2/uFEyJhTz1WMEV8eVXiRLN8OxSkPxToxdZoGln2U8IBCCCJC9OG+f
 Cf3eMgEcNdEXNcPKqr11NIcHkAx6M6qI/eMDOqk151PslHa8lbis6di9Z87aE0ps
 i8PyrkJGTmgM9cCjXwE8deNseeCmuKYlbPIF+NoxcqtvZstfaMrISwTIEuzV4JHi
 p13YhDxy4XiC3H6pKHub/jo7UCl+wWtFh9SqpqGgduFX/p6FtUHQJm0S0X/DFFZt
 C+2MFVSe6HRHE8B7bFz86+619Qd/rU7+806CLCE+NbYlYAKIBYKzWt/bml6VH3RJ
 OjwXhQqmznWhJjsfD3BUUUpZpHijmylI9gAe2F1oErb8YjRU6gIm7P8hlkOzD7AS
 TfGHPFq2raQcfAiGdVmvkbvvhvYZXnB3WVsAexrYoqrT9I8eEfRI+7SkL75MLR2E
 ikzhJS3SHkAUAd7fUVMt7xMwh0jmhsPjWCCqc13m6UUFoXhTaDgKgPGftltN0bI2
 g85+enZ3/eca6xh/KxvW
 =Kf9b
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module and param fixes from Rusty Russell:
 "Surprising number of fixes this merge window :(

  The first two are minor fallout from the param rework which went in
  this merge window.

  The next three are a series which fixes a longstanding (but never
  previously reported and unlikely , so no CC stable) race between
  kallsyms and freeing the init section.

  Finally, a minor cleanup as our module refcount will now be -1 during
  unload"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: make module_refcount() a signed integer.
  module: fix race in kallsyms resolution during module load success.
  module: remove mod arg from module_free, rename module_memfree().
  module_arch_freeing_init(): new hook for archs before module->module_init freed.
  param: fix uninitialized read with CONFIG_DEBUG_LOCK_ALLOC
  param: initialize store function to NULL if not available.
2015-01-23 06:40:36 +12:00
K. Y. Srinivasan 32c6590d12 x86, hyperv: Mark the Hyper-V clocksource as being continuous
The Hyper-V clocksource is continuous; mark it accordingly.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: jasowang@redhat.com
Cc: gregkh@linuxfoundation.org
Cc: devel@linuxdriverproject.org
Cc: olaf@aepfle.de
Cc: apw@canonical.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1421108762-3331-1-git-send-email-kys@microsoft.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20 14:36:25 +01:00
Jan Beulich 4a0d3107d6 x86, irq: Properly tag virtualization entry in /proc/interrupts
The mis-naming likely was a copy-and-paste effect.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/54B9408B0200007800055E8B@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20 12:37:23 +01:00
Jiang Liu b568b8601f x86/xen: Treat SCI interrupt as normal GSI interrupt
Currently Xen Domain0 has special treatment for ACPI SCI interrupt,
that is initialize irq for ACPI SCI at early stage in a special way as:
xen_init_IRQ()
	->pci_xen_initial_domain()
		->xen_setup_acpi_sci()
			Allocate and initialize irq for ACPI SCI

Function xen_setup_acpi_sci() calls acpi_gsi_to_irq() to get an irq
number for ACPI SCI. But unfortunately acpi_gsi_to_irq() depends on
IOAPIC irqdomains through following path
acpi_gsi_to_irq()
	->mp_map_gsi_to_irq()
		->mp_map_pin_to_irq()
			->check IOAPIC irqdomain

For PV domains, it uses Xen event based interrupt manangement and
doesn't make uses of native IOAPIC, so no irqdomains created for IOAPIC.
This causes Xen domain0 fail to install interrupt handler for ACPI SCI
and all ACPI events will be lost. Please refer to:
https://lkml.org/lkml/2014/12/19/178

So the fix is to get rid of special treatment for ACPI SCI, just treat
ACPI SCI as normal GSI interrupt as:
acpi_gsi_to_irq()
	->acpi_register_gsi()
		->acpi_register_gsi_xen()
			->xen_register_gsi()

With above change, there's no need for xen_setup_acpi_sci() anymore.
The above change also works with bare metal kernel too.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Tony Luck <tony.luck@intel.com>
Cc: xen-devel@lists.xenproject.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1421720467-7709-2-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20 11:44:40 +01:00
Rusty Russell be1f221c04 module: remove mod arg from module_free, rename module_memfree().
Nothing needs the module pointer any more, and the next patch will
call it from RCU, where the module itself might no longer exist.
Removing the arg is the safest approach.

This just codifies the use of the module_alloc/module_free pattern
which ftrace and bpf use.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org
2015-01-20 11:38:33 +10:30
Linus Torvalds 59b2858f57 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also two PMU driver fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools powerpc: Use dwfl_report_elf() instead of offline.
  perf tools: Fix segfault for symbol annotation on TUI
  perf test: Fix dwarf unwind using libunwind.
  perf tools: Avoid build splat for syscall numbers with uclibc
  perf tools: Elide strlcpy warning with uclibc
  perf tools: Fix statfs.f_type data type mismatch build error with uclibc
  tools: Remove bitops/hweight usage of bits in tools/perf
  perf machine: Fix __machine__findnew_thread() error path
  perf tools: Fix building error in x86_64 when dwarf unwind is on
  perf probe: Propagate error code when write(2) failed
  perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM
  perf/rapl: Fix sysfs_show() initialization for RAPL PMU
2015-01-18 06:24:30 +12:00
Andy Lutomirski 0fcedc8631 x86_64 entry: Fix RCX for ptraced syscalls
The int_ret_from_sys_call and syscall tracing code disagrees
with the sysret path as to the value of RCX.

The Intel SDM, the AMD APM, and my laptop all agree that sysret
returns with RCX == RIP.  The syscall tracing code does not
respect this property.

For example, this program:

int main()
{
	extern const char syscall_rip[];
	unsigned long rcx = 1;
	unsigned long orig_rcx = rcx;
	asm ("mov $-1, %%eax\n\t"
	     "syscall\n\t"
	     "syscall_rip:"
	     : "+c" (rcx) : : "r11");
	printf("syscall: RCX = %lX  RIP = %lX  orig RCX = %lx\n",
	       rcx, (unsigned long)syscall_rip, orig_rcx);
	return 0;
}

prints:

  syscall: RCX = 400556  RIP = 400556  orig RCX = 1

Running it under strace gives this instead:

  syscall: RCX = FFFFFFFFFFFFFFFF  RIP = 400556  orig RCX = 1

This changes FIXUP_TOP_OF_STACK to match sysret, causing the
test to show RCX == RIP even under strace.

It looks like this is a partial revert of:
88e4bc32686e ("[PATCH] x86-64 architecture specific sync for 2.5.8")
from the historic git tree.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/c9a418c3dc3993cb88bb7773800225fd318a4c67.1421453410.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-17 11:02:53 +01:00
Linus Torvalds 23aa4b416a This holds a few fixes to the ftrace infrastructure as well as
the mixture of function graph tracing and kprobes.
 
 When jprobes and function graph tracing is enabled at the same time
 it will crash the system.
 
   # modprobe jprobe_example
   # echo function_graph > /sys/kernel/debug/tracing/current_tracer
 
 After the first fork (jprobe_example probes it), the system will crash.
 This is due to the way jprobes copies the stack frame and does not
 do a normal function return. This messes up with the function graph
 tracing accounting which hijacks the return address from the stack
 and replaces it with a hook function. It saves the return addresses in
 a separate stack to put back the correct return address when done.
 But because the jprobe functions do not do a normal return, their
 stack addresses are not put back until the function they probe is called,
 which means that the probed function will get the return address of
 the jprobe handler instead of its own.
 
 The simple fix here was to disable function graph tracing while the
 jprobe handler is being called.
 
 While debugging this I found two minor bugs with the function graph
 tracing.
 
 The first was about the function graph tracer sharing its function hash
 with the function tracer (they both get filtered by the same input).
 The changing of the set_ftrace_filter would not sync the function recording
 records after a change if the function tracer was disabled but the
 function graph tracer was enabled. This was due to the update only checking
 one of the ops instead of the shared ops to see if they were enabled and
 should perform the sync. This caused the ftrace accounting to break and
 a ftrace_bug() would be triggered, disabling ftrace until a reboot.
 
 The second was that the check to update records only checked one of the
 filter hashes. It needs to test both the "filter" and "notrace" hashes.
 The "filter" hash determines what functions to trace where as the "notrace"
 hash determines what functions not to trace (trace all but these).
 Both hashes need to be passed to the update code to find out what change
 is being done during the update. This also broke the ftrace record
 accounting and triggered a ftrace_bug().
 
 This patch set also include two more fixes that were reported separately
 from the kprobe issue.
 
 One was that init_ftrace_syscalls() was called twice at boot up.
 This is not a major bug, but that call performed a rather large kmalloc
 (NR_syscalls * sizeof(*syscalls_metadata)). The second call made the first
 one a memory leak, and wastes memory.
 
 The other fix is a regression caused by an update in the v3.19 merge window.
 The moving to enable events early, moved the enabling before PID 1 was
 created. The syscall events require setting the TIF_SYSCALL_TRACEPOINT
 for all tasks. But for_each_process_thread() does not include the swapper
 task (PID 0), and ended up being a nop. A suggested fix was to add
 the init_task() to have its flag set, but I didn't really want to mess
 with PID 0 for this minor bug. Instead I disable and re-enable events again
 at early_initcall() where it use to be enabled. This also handles any other
 event that might have its own reg function that could break at early
 boot up.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUt9vmAAoJEEjnJuOKh9ldLHEIAJ9XrPW2xMIY5yI69jT1F7pv
 PkSRqENnOK0l4UulD52SvIBecQTTBcEEjao4yVGkc7DCJBOws/1LZ5gW8OfNlKjq
 rMB8yaosL1tXJ1ARVPMjcQVy+228zkgTXznwEZCjku1g7LuScQ28qyXsXO7B6yiK
 xKoHqKjygmM/a2aVn+8tdiVKiDp6jdmkbYicbaFT4xP7XB5DaMmIiXRHxdvW6xdR
 azKrVfYiMyJqTZNt/EVSWUk2WjeaYhoXyNtvgPx515wTo/llCnzhjcsocXBtH2P/
 YOtwl+1L7Z89ukV9oXqrtrUJZ6Ps7+g7I1flJuL7/1FlNGnklcP9JojD+t6HeT8=
 =vkec
 -----END PGP SIGNATURE-----

Merge tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace fixes from Steven Rostedt:
 "This holds a few fixes to the ftrace infrastructure as well as the
  mixture of function graph tracing and kprobes.

  When jprobes and function graph tracing is enabled at the same time it
  will crash the system:

      # modprobe jprobe_example
      # echo function_graph > /sys/kernel/debug/tracing/current_tracer

  After the first fork (jprobe_example probes it), the system will
  crash.

  This is due to the way jprobes copies the stack frame and does not do
  a normal function return.  This messes up with the function graph
  tracing accounting which hijacks the return address from the stack and
  replaces it with a hook function.  It saves the return addresses in a
  separate stack to put back the correct return address when done.  But
  because the jprobe functions do not do a normal return, their stack
  addresses are not put back until the function they probe is called,
  which means that the probed function will get the return address of
  the jprobe handler instead of its own.

  The simple fix here was to disable function graph tracing while the
  jprobe handler is being called.

  While debugging this I found two minor bugs with the function graph
  tracing.

  The first was about the function graph tracer sharing its function
  hash with the function tracer (they both get filtered by the same
  input).  The changing of the set_ftrace_filter would not sync the
  function recording records after a change if the function tracer was
  disabled but the function graph tracer was enabled.  This was due to
  the update only checking one of the ops instead of the shared ops to
  see if they were enabled and should perform the sync.  This caused the
  ftrace accounting to break and a ftrace_bug() would be triggered,
  disabling ftrace until a reboot.

  The second was that the check to update records only checked one of
  the filter hashes.  It needs to test both the "filter" and "notrace"
  hashes.  The "filter" hash determines what functions to trace where as
  the "notrace" hash determines what functions not to trace (trace all
  but these).  Both hashes need to be passed to the update code to find
  out what change is being done during the update.  This also broke the
  ftrace record accounting and triggered a ftrace_bug().

  This patch set also include two more fixes that were reported
  separately from the kprobe issue.

  One was that init_ftrace_syscalls() was called twice at boot up.  This
  is not a major bug, but that call performed a rather large kmalloc
  (NR_syscalls * sizeof(*syscalls_metadata)).  The second call made the
  first one a memory leak, and wastes memory.

  The other fix is a regression caused by an update in the v3.19 merge
  window.  The moving to enable events early, moved the enabling before
  PID 1 was created.  The syscall events require setting the
  TIF_SYSCALL_TRACEPOINT for all tasks.  But for_each_process_thread()
  does not include the swapper task (PID 0), and ended up being a nop.

  A suggested fix was to add the init_task() to have its flag set, but I
  didn't really want to mess with PID 0 for this minor bug.  Instead I
  disable and re-enable events again at early_initcall() where it use to
  be enabled.  This also handles any other event that might have its own
  reg function that could break at early boot up"

* tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix enabling of syscall events on the command line
  tracing: Remove extra call to init_ftrace_syscalls()
  ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  ftrace: Check both notrace and filter for old hash
  ftrace: Fix updating of filters for shared global_ops filters
2015-01-17 07:55:52 +13:00
Kan Liang 33636732dc perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM
cycles:p and cycles:pp do not work on SLM since commit:

   86a04461a9 ("perf/x86: Revamp PEBS event selection")

UOPS_RETIRED.ALL is not a PEBS capable event, so it should not be used
to count cycle number.

Actually SLM calls intel_pebs_aliases_core2() which uses INST_RETIRED.ANY_P
to count the number of cycles. It's a PEBS capable event. But inv and
cmask must be set to count cycles.

Considering SLM allows all events as PEBS with no flags, only
INST_RETIRED.ANY_P, inv=1, cmask=16 needs to handled specially.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421084541-31639-1-git-send-email-kan.liang@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-16 09:06:59 +01:00
Stephane Eranian 433678bdc6 perf/rapl: Fix sysfs_show() initialization for RAPL PMU
This patch fixes a problem with the initialization of the
sysfs_show() routine for the RAPL PMU.

The current code was wrongly relying on the EVENT_ATTR_STR()
macro which uses the events_sysfs_show() function in the x86
PMU code. That function itself was relying on the x86_pmu data
structure. Yet RAPL and the core PMU (x86_pmu) have nothing to
do with each other. They should therefore not interact with
each other.

The x86_pmu structure is initialized at boot time based on
the host CPU model. When the host CPU is not supported, the
x86_pmu remains uninitialized and some of the callbacks it
contains are NULL.

The false dependency with x86_pmu could potentially cause crashes
in case the x86_pmu is not initialized while the RAPL PMU is. This
may, for instance, be the case in virtualized environments.

This patch fixes the problem by using a private sysfs_show()
routine for exporting the RAPL PMU events.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150113225953.GA21525@thinkpad
Cc: vincent.weaver@maine.edu
Cc: jolsa@redhat.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-16 09:06:58 +01:00
Steven Rostedt (Red Hat) 237d28db03 ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
If the function graph tracer traces a jprobe callback, the system will
crash. This can easily be demonstrated by compiling the jprobe
sample module that is in the kernel tree, loading it and running the
function graph tracer.

 # modprobe jprobe_example.ko
 # echo function_graph > /sys/kernel/debug/tracing/current_tracer
 # ls

The first two commands end up in a nice crash after the first fork.
(do_fork has a jprobe attached to it, so "ls" just triggers that fork)

The problem is caused by the jprobe_return() that all jprobe callbacks
must end with. The way jprobes works is that the function a jprobe
is attached to has a breakpoint placed at the start of it (or it uses
ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
will copy the stack frame and change the ip address to return to the
jprobe handler instead of the function. The jprobe handler must end
with jprobe_return() which swaps the stack and does an int3 (breakpoint).
This breakpoint handler will then put back the saved stack frame,
simulate the instruction at the beginning of the function it added
a breakpoint to, and then continue on.

For function tracing to work, it hijakes the return address from the
stack frame, and replaces it with a hook function that will trace
the end of the call. This hook function will restore the return
address of the function call.

If the function tracer traces the jprobe handler, the hook function
for that handler will not be called, and its saved return address
will be used for the next function. This will result in a kernel crash.

To solve this, pause function tracing before the jprobe handler is called
and unpause it before it returns back to the function it probed.

Some other updates:

Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
code look a bit cleaner and easier to understand (various tries to fix
this bug required this change).

Note, if fentry is being used, jprobes will change the ip address before
the function graph tracer runs and it will not be able to trace the
function that the jprobe is probing.

Link: http://lkml.kernel.org/r/20150114154329.552437962@goodmis.org

Cc: stable@vger.kernel.org # 2.6.30+
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-01-15 09:39:18 -05:00
Denys Vlasenko f6f64681d9 x86: entry_64.S: fold SAVE_ARGS_IRQ macro into its sole user
No code changes.

This is a preparatory patch for change in "struct pt_regs" handling.

CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: X86 ML <x86@kernel.org>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-13 14:18:08 -08:00
Denys Vlasenko af9cfe270d x86: entry_64.S: delete unused code
A define, two macros and an unreferenced bit of assembly are gone.

Acked-by: Borislav Petkov <bp@suse.de>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Oleg Nesterov <oleg@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: X86 ML <x86@kernel.org>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-13 14:00:33 -08:00
Linus Torvalds 505569d208 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes: two vdso fixes, two kbuild fixes and a boot failure fix
  with certain odd memory mappings"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, vdso: Use asm volatile in __getcpu
  x86/build: Clean auto-generated processor feature files
  x86: Fix mkcapflags.sh bash-ism
  x86: Fix step size adjustment during initial memory mapping
  x86_64, vdso: Fix the vdso address randomization algorithm
2015-01-11 11:53:46 -08:00
Linus Torvalds ddb321a8dd Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also some kernel side fixes: uncore PMU
  driver fix, user regs sampling fix and an instruction decoder fix that
  unbreaks PEBS precise sampling"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes
  perf/x86_64: Improve user regs sampling
  perf: Move task_pt_regs sampling into arch code
  x86: Fix off-by-one in instruction decoder
  perf hists browser: Fix segfault when showing callchain
  perf callchain: Free callchains when hist entries are deleted
  perf hists: Fix children sort key behavior
  perf diff: Fix to sort by baseline field by default
  perf list: Fix --raw-dump option
  perf probe: Fix crash in dwarf_getcfi_elf
  perf probe: Fix to fall back to find probe point in symbols
  perf callchain: Append callchains only when requested
  perf ui/tui: Print backtrace symbols when segfault occurs
  perf report: Show progress bar for output resorting
2015-01-11 11:47:45 -08:00
Andi Kleen 5306c31c57 perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes
There was another report of a boot failure with a #GP fault in the
uncore SBOX initialization. The earlier work around was not enough
for this system.

The boot was failing while trying to initialize the third SBOX.

This patch detects parts with only two SBOXes and limits the number
of SBOX units to two there.

Stable material, as it affects boot problems on 3.18.

Tested-by: Andreas Oehler <andreas@oehler-net.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1420583675-9163-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-09 11:12:30 +01:00
Andy Lutomirski 86c269fea3 perf/x86_64: Improve user regs sampling
Perf reports user regs for kernel-mode samples so that samples can
be backtraced through user code.  The old code was very broken in
syscall context, resulting in useless backtraces.

The new code, in contrast, is still dangerously racy, but it should
at least work most of the time.

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: chenggang.qcg@taobao.com
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/243560c26ff0f739978e2459e203f6515367634d.1420396372.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-09 11:12:29 +01:00
Andy Lutomirski 88a7c26af8 perf: Move task_pt_regs sampling into arch code
On x86_64, at least, task_pt_regs may be only partially initialized
in many contexts, so x86_64 should not use it without extra care
from interrupt context, let alone NMI context.

This will allow x86_64 to override the logic and will supply some
scratch space to use to make a cleaner copy of user regs.

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: chenggang.qcg@taobao.com
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/e431cd4c18c2e1c44c774f10758527fb2d1025c4.1420396372.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-09 11:12:28 +01:00
Luck, Tony d4812e169d x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks
We now switch to the kernel stack when a machine check interrupts
during user mode.  This means that we can perform recovery actions
in the tail of do_machine_check()

Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-07 07:47:42 -08:00
Hanjun Guo d02dc27db0 ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()
acpi_map_lsapic() will allocate a logical CPU number and map it to
physical CPU id (such as APIC id) for the hot-added CPU, it will also
do some mapping for NUMA node id and etc, acpi_unmap_lsapic() will
do the reverse.

We can see that the name of the function is a little bit confusing and
arch (IA64) dependent so rename them as acpi_(un)map_cpu() to make arch
agnostic and explicit.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-05 23:34:26 +01:00
Andy Lutomirski bced35b65a x86, traps: Add ist_begin_non_atomic and ist_end_non_atomic
In some IST handlers, if the interrupt came from user mode,
we can safely enable preemption.  Add helpers to do it safely.

This is intended to be used my the memory failure code in
do_machine_check.

Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-02 10:22:46 -08:00
Andy Lutomirski 83653c16da x86: Clean up current_stack_pointer
There's no good reason for it to be a macro, and x86_64 will want to
use it, so it should be in a header.

Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-02 10:22:46 -08:00
Andy Lutomirski 9592747538 x86, traps: Track entry into and exit from IST context
We currently pretend that IST context is like standard exception
context, but this is incorrect.  IST entries from userspace are like
standard exceptions except that they use per-cpu stacks, so they are
atomic.  IST entries from kernel space are like NMIs from RCU's
perspective -- they are not quiescent states even if they
interrupted the kernel during a quiescent state.

Add and use ist_enter and ist_exit to track IST context.  Even
though x86_32 has no IST stacks, we track these interrupts the same
way.

This fixes two issues:

 - Scheduling from an IST interrupt handler will now warn.  It would
   previously appear to work as long as we got lucky and nothing
   overwrote the stack frame.  (I don't know of any bugs in this
   that would trigger the warning, but it's good to be on the safe
   side.)

 - RCU handling in IST context was dangerous.  As far as I know,
   only machine checks were likely to trigger this, but it's good to
   be on the safe side.

Note that the machine check handlers appears to have been missing
any context tracking at all before this patch.

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-02 10:22:46 -08:00
Andy Lutomirski 48e08d0fb2 x86, entry: Switch stacks on a paranoid entry from userspace
This causes all non-NMI, non-double-fault kernel entries from
userspace to run on the normal kernel stack.  Double-fault is
exempt to minimize confusion if we double-fault directly from
userspace due to a bad kernel stack.

This is, suprisingly, simpler and shorter than the current code.  It
removes the IMO rather frightening paranoid_userspace path, and it
make sync_regs much simpler.

There is no risk of stack overflow due to this change -- the kernel
stack that we switch to is empty.

This will also enable us to create non-atomic sections within
machine checks from userspace, which will simplify memory failure
handling.  It will also allow the upcoming fsgsbase code to be
simplified, because it doesn't need to worry about usergs when
scheduling in paranoid_exit, as that code no longer exists.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-02 10:22:45 -08:00
Bjørn Mork 280dbc5723 x86/build: Clean auto-generated processor feature files
Commit 9def39be4e ("x86: Support compiling out human-friendly
processor feature names") made two source file targets
conditional. Such conditional targets will not be cleaned
automatically by make mrproper.

Fix by adding explicit clean-files targets for the two files.

Fixes: 9def39be4e ("x86: Support compiling out human-friendly processor feature names")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Cc: Josh Triplett <josh@joshtriplett.org>
Link: http://lkml.kernel.org/r/1419335863-10608-1-git-send-email-bjorn@mork.no
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-23 15:37:06 +01:00
Sylvain BERTRAND ea174f4c4f x86: Fix mkcapflags.sh bash-ism
Chocked while compiling linux with dash shell instead of bash
shell. See:

   http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html

Signed-off-by: Sylvain BERTRAND <sylvain.bertrand@gmail.com>
Link: http://lkml.kernel.org/r/20141223123912.GA1386@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-23 15:34:57 +01:00
Linus Torvalds e589c9e13a Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner:
 "After stopping the full x86/apic branch, I took some time to go
  through the first block of patches again, which are mostly cleanups
  and preparatory work for the irqdomain conversion and ioapic hotplug
  support.

  Unfortunaly one of the real problematic commits was right at the
  beginning, so I rebased this portion of the pending patches without
  the offenders.

  It would be great to get this into 3.19.  That makes reworking the
  problematic parts simpler.  The usual tip testing did not unearth any
  issues and it is fully bisectible now.

  I'm pretty confident that this wont affect the calmness of the xmas
  season.

  Changes:
   - Split the convoluted io_apic.c code into domain specific parts
     (vector, ioapic, msi, htirq)
   - Introduce proper helper functions to retrieve irq specific data
     instead of open coded dereferencing of pointers
   - Preparatory work for ioapic hotplug and irqdomain conversion
   - Removal of the non functional pci-ioapic driver
   - Removal of unused irq entry stubs
   - Make native_smp_prepare_cpus() preemtible to avoid GFP_ATOMIC
     allocations for everything which is called from there.
   - Small cleanups and fixes"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  iommu/amd: Use helpers to access irq_cfg data structure associated with IRQ
  iommu/vt-d: Use helpers to access irq_cfg data structure associated with IRQ
  x86: irq_remapping: Use helpers to access irq_cfg data structure associated with IRQ
  x86, irq: Use helpers to access irq_cfg data structure associated with IRQ
  x86, irq: Make MSI and HT_IRQ indepenent of X86_IO_APIC
  x86, irq: Move IRQ initialization routines from io_apic.c into vector.c
  x86, irq: Move IOAPIC related declarations from hw_irq.h into io_apic.h
  x86, irq: Move HT IRQ related code from io_apic.c into htirq.c
  x86, irq: Move PCI MSI related code from io_apic.c into msi.c
  x86, irq: Replace printk(KERN_LVL) with pr_lvl() utilities
  x86, irq: Make UP version of irq_complete_move() an inline stub
  x86, irq: Move local APIC related code from io_apic.c into vector.c
  x86, irq: Introduce helpers to access struct irq_cfg
  x86, irq: Protect __clear_irq_vector() with vector_lock
  x86, irq: Rename local APIC related functions in io_apic.c as apic_xxx()
  x86, irq: Refine hw_irq.h to prepare for irqdomain support
  x86, irq: Convert irq_2_pin list to generic list
  x86, irq: Kill useless parameter 'irq_attr' of IO_APIC_get_PCI_irq_vector()
  x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI
  x86, irq: Introduce helper to check whether an IOAPIC has been registered
  ...
2014-12-19 14:02:02 -08:00
Linus Torvalds a54455766b Merge branch 'x86-mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MPX fixes from Thomas Gleixner:
 "Three updates for the new MPX infrastructure:
   - Use the proper error check in the trap handler
   - Add a proper config option for it
   - Bring documentation up to date"

* 'x86-mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, mpx: Give MPX a real config option prompt
  x86, mpx: Update documentation
  x86_64/traps: Fix always true condition
2014-12-19 13:22:42 -08:00
Linus Torvalds 1092b596a5 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "This contains a single TLS ABI validation fix from Andy Lutomirski"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tls: Don't validate lm in set_thread_area() after all
2014-12-19 13:18:31 -08:00
Linus Torvalds 88a57667f2 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes and cleanups from Ingo Molnar:
 "A kernel fix plus mostly tooling fixes, but also some tooling
  restructuring and cleanups"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  perf: Fix building warning on ARM 32
  perf symbols: Fix use after free in filename__read_build_id
  perf evlist: Use roundup_pow_of_two
  tools: Adopt roundup_pow_of_two
  perf tools: Make the mmap length autotuning more robust
  tools: Adopt rounddown_pow_of_two and deps
  tools: Adopt fls_long and deps
  tools: Move bitops.h from tools/perf/util to tools/
  tools: Introduce asm-generic/bitops.h
  tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib
  tools: Whitespace prep patches for moving bitops.h
  tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/
  tools: Move code originally from linux/log2.h to tools/include/linux/
  tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h
  perf evlist: Do not use hard coded value for a mmap_pages default
  perf trace: Let the perf_evlist__mmap autosize the number of pages to use
  perf evlist: Improve the strerror_mmap method
  perf evlist: Clarify sterror_mmap variable names
  perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg
  perf trace: Provide a better explanation when mmap fails
  ...
2014-12-19 13:15:24 -08:00
Linus Torvalds c0f486fde3 More ACPI and power management updates for 3.19-rc1
- Fix a regression in leds-gpio introduced by a recent commit that
    inadvertently changed the name of one of the properties used by
    the driver (Fabio Estevam).
 
  - Fix a regression in the ACPI backlight driver introduced by a
    recent fix that missed one special case that had to be taken
    into account (Aaron Lu).
 
  - Drop the level of some new kernel messages from the ACPI core
    introduced by a recent commit to KERN_DEBUG which they should
    have used from the start and drop some other unuseful KERN_ERR
    messages printed by ACPI (Rafael J Wysocki).
 
  - Revert an incorrect commit modifying the cpupower tool
    (Prarit Bhargava).
 
  - Fix two regressions introduced by recent commits in the OPP
    library and clean up some existing minor issues in that code
    (Viresh Kumar).
 
  - Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout
    the tree (or drop it where that can be done) in order to make
    it possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki,
    Ulf Hansson, Ludovic Desroches).  There will be one more
    "CONFIG_PM_RUNTIME removal" batch after this one, because some
    new uses of it have been introduced during the current merge
    window, but that should be sufficient to finally get rid of it.
 
  - Make the ACPI EC driver more robust against race conditions
    related to GPE handler installation failures (Lv Zheng).
 
  - Prevent the ACPI device PM core code from attempting to
    disable GPEs that it has not enabled which confuses ACPICA
    and makes it report errors unnecessarily (Rafael J Wysocki).
 
  - Add a "force" command line switch to the intel_pstate driver
    to make it possible to override the blacklisting of some
    systems in that driver if needed (Ethan Zhao).
 
  - Improve intel_pstate code documentation and add a MAINTAINERS
    entry for it (Kristen Carlson Accardi).
 
  - Make the ACPI fan driver create cooling device interfaces
    witn names that reflect the IDs of the ACPI device objects
    they are associated with, except for "generic" ACPI fans
    (PNP ID "PNP0C0B").  That's necessary for user space thermal
    management tools to be able to connect the fans with the
    parts of the system they are supposed to be cooling properly.
    From Srinivas Pandruvada.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJUk0IDAAoJEILEb/54YlRx7fgP/3+yF/0TnEW93j2ALDAQFiLF
 tSv2A2vQC8vtMJjjWx0z/HqPh86gfaReEFZmUJD/Q/e2LXEnxNZJ+QMjcekPVkDM
 mTvcIMc2MR8vOA/oMkgxeaKregrrx7RkCfojd+NWZhVukkjl+mvBHgAnYjXRL+NZ
 unDWGlbHG97vq/3kGjPYhDS00nxHblw8NHFBu5HL5RxwABdWoeZJITwqxXWyuPLw
 nlqNWlOxmwvtSbw2VMKz0uof1nFHyQLykYsMG0ZsyayCRdWUZYkEqmE7GGpCLkLu
 D6yfmlpen6ccIOsEAae0eXBt50IFY9Tihk5lovx1mZmci2SNRg29BqMI105wIn0u
 8b8Ej7MNHp7yMxRpB5WfU90p/y7ioJns9guFZxY0CKaRnrI2+BLt3RscMi3MPI06
 Cu2/WkSSa09fhDPA+pk+VDYsmWgyVawigesNmMP5/cvYO/yYywVRjOuO1k77qQGp
 4dSpFYEHfpxinejZnVZOk2V9MkvSLoSMux6wPV0xM0IE1iD0ulVpHjTJrwp80ph4
 +bfUFVr/vrD1y7EKbf1PD363ZKvJhWhvQWDgETsM1vgLf21PfWO7C2kflIAsWsdQ
 1ukD5nCBRlP4K73hG7bdM6kRztXhUdR0SHg85/t0KB/ExiVqtcXIzB60D0G1lENd
 QlKbq3O4lim1WGuhazQY
 =5fo2
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI and power management updates from Rafael Wysocki:
 "These are regression fixes (leds-gpio, ACPI backlight driver,
  operating performance points library, ACPI device enumeration
  messages, cpupower tool), other bug fixes (ACPI EC driver, ACPI device
  PM), some cleanups in the operating performance points (OPP)
  framework, continuation of CONFIG_PM_RUNTIME elimination, a couple of
  minor intel_pstate driver changes, a new MAINTAINERS entry for it and
  an ACPI fan driver change needed for better support of thermal
  management in user space.

  Specifics:

   - Fix a regression in leds-gpio introduced by a recent commit that
     inadvertently changed the name of one of the properties used by the
     driver (Fabio Estevam).

   - Fix a regression in the ACPI backlight driver introduced by a
     recent fix that missed one special case that had to be taken into
     account (Aaron Lu).

   - Drop the level of some new kernel messages from the ACPI core
     introduced by a recent commit to KERN_DEBUG which they should have
     used from the start and drop some other unuseful KERN_ERR messages
     printed by ACPI (Rafael J Wysocki).

   - Revert an incorrect commit modifying the cpupower tool (Prarit
     Bhargava).

   - Fix two regressions introduced by recent commits in the OPP library
     and clean up some existing minor issues in that code (Viresh
     Kumar).

   - Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout the
     tree (or drop it where that can be done) in order to make it
     possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki, Ulf
     Hansson, Ludovic Desroches).

     There will be one more "CONFIG_PM_RUNTIME removal" batch after this
     one, because some new uses of it have been introduced during the
     current merge window, but that should be sufficient to finally get
     rid of it.

   - Make the ACPI EC driver more robust against race conditions related
     to GPE handler installation failures (Lv Zheng).

   - Prevent the ACPI device PM core code from attempting to disable
     GPEs that it has not enabled which confuses ACPICA and makes it
     report errors unnecessarily (Rafael J Wysocki).

   - Add a "force" command line switch to the intel_pstate driver to
     make it possible to override the blacklisting of some systems in
     that driver if needed (Ethan Zhao).

   - Improve intel_pstate code documentation and add a MAINTAINERS entry
     for it (Kristen Carlson Accardi).

   - Make the ACPI fan driver create cooling device interfaces witn
     names that reflect the IDs of the ACPI device objects they are
     associated with, except for "generic" ACPI fans (PNP ID "PNP0C0B").

     That's necessary for user space thermal management tools to be able
     to connect the fans with the parts of the system they are supposed
     to be cooling properly.  From Srinivas Pandruvada"

* tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
  MAINTAINERS: add entry for intel_pstate
  ACPI / video: update the skip case for acpi_video_device_in_dod()
  power / PM: Eliminate CONFIG_PM_RUNTIME
  NFC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  SCSI / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  ACPI / EC: Fix unexpected ec_remove_handlers() invocations
  Revert "tools: cpupower: fix return checks for sysfs_get_idlestate_count()"
  tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  x86 / PM: Replace CONFIG_PM_RUNTIME in io_apic.c
  PM: Remove the SET_PM_RUNTIME_PM_OPS() macro
  mmc: atmel-mci: use SET_RUNTIME_PM_OPS() macro
  PM / Kconfig: Replace PM_RUNTIME with PM in dependencies
  ARM / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  sound / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  phy / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  video / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  tty / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  ACPI / PM: Do not disable wakeup GPEs that have not been enabled
  ACPI / utils: Drop error messages from acpi_evaluate_reference()
  ...
2014-12-18 20:28:33 -08:00
Linus Torvalds 66dcff86ba 3.19 changes for KVM:
- spring cleaning: removed support for IA64, and for hardware-assisted
 virtualization on the PPC970
 - ARM, PPC, s390 all had only small fixes
 
 For x86:
 - small performance improvements (though only on weird guests)
 - usual round of hardware-compliancy fixes from Nadav
 - APICv fixes
 - XSAVES support for hosts and guests.  XSAVES hosts were broken because
 the (non-KVM) XSAVES patches inadvertently changed the KVM userspace
 ABI whenever XSAVES was enabled; hence, this part is going to stable.
 Guest support is just a matter of exposing the feature and CPUID leaves
 support.
 
 Right now KVM is broken for PPC BookE in your tree (doesn't compile).
 I'll reply to the pull request with a patch, please apply it either
 before the pull request or in the merge commit, in order to preserve
 bisectability somewhat.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJUkpg+AAoJEL/70l94x66DUmoH/jzXYkptSW9NGgm79KqxGJlD
 lzLnLBkitVvx++Mz5YBhdJEhKKLUlCtifFT1zPJQ/pthQhIRSaaAwZyNGgUs5w5x
 yMGKHiPQFyZRbmQtZhCInW0BftJoYHHciO3nUfHCZnp34My9MP2D55W7/z+fYFfQ
 DuqBSE9ThyZJtZ4zh8NRA9fCOeuqwVYRyoBs820Wbsh4cpIBoIK63Dg7k+CLE+ZV
 MZa/mRL6bAfsn9W5bnOUAgHJ3SPznnWbO3/g0aV+roL/5pffblprJx9lKNR08xUM
 6hDFLop2gDehDJesDkY/o8Ckp1hEouvfsVpSShry4vcgtn0hgh2O5/6Orbmj6vE=
 =Zwq1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM update from Paolo Bonzini:
 "3.19 changes for KVM:

   - spring cleaning: removed support for IA64, and for hardware-
     assisted virtualization on the PPC970

   - ARM, PPC, s390 all had only small fixes

  For x86:
   - small performance improvements (though only on weird guests)
   - usual round of hardware-compliancy fixes from Nadav
   - APICv fixes
   - XSAVES support for hosts and guests.  XSAVES hosts were broken
     because the (non-KVM) XSAVES patches inadvertently changed the KVM
     userspace ABI whenever XSAVES was enabled; hence, this part is
     going to stable.  Guest support is just a matter of exposing the
     feature and CPUID leaves support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (179 commits)
  KVM: move APIC types to arch/x86/
  KVM: PPC: Book3S: Enable in-kernel XICS emulation by default
  KVM: PPC: Book3S HV: Improve H_CONFER implementation
  KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR register
  KVM: PPC: Book3S HV: Remove code for PPC970 processors
  KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions
  KVM: PPC: Book3S HV: Simplify locking around stolen time calculations
  arch: powerpc: kvm: book3s_paired_singles.c: Remove unused function
  arch: powerpc: kvm: book3s_pr.c: Remove unused function
  arch: powerpc: kvm: book3s.c: Remove some unused functions
  arch: powerpc: kvm: book3s_32_mmu.c: Remove unused function
  KVM: PPC: Book3S HV: Check wait conditions before sleeping in kvmppc_vcore_blocked
  KVM: PPC: Book3S HV: ptes are big endian
  KVM: PPC: Book3S HV: Fix inaccuracies in ICP emulation for H_IPI
  KVM: PPC: Book3S HV: Fix KSM memory corruption
  KVM: PPC: Book3S HV: Fix an issue where guest is paused on receiving HMI
  KVM: PPC: Book3S HV: Fix computation of tlbie operand
  KVM: PPC: Book3S HV: Add missing HPTE unlock
  KVM: PPC: BookE: Improve irq inject tracepoint
  arm/arm64: KVM: Require in-kernel vgic for the arch timers
  ...
2014-12-18 16:05:28 -08:00
Andy Lutomirski 3fb2f4237b x86/tls: Don't validate lm in set_thread_area() after all
It turns out that there's a lurking ABI issue.  GCC, when
compiling this in a 32-bit program:

struct user_desc desc = {
	.entry_number    = idx,
	.base_addr       = base,
	.limit           = 0xfffff,
	.seg_32bit       = 1,
	.contents        = 0, /* Data, grow-up */
	.read_exec_only  = 0,
	.limit_in_pages  = 1,
	.seg_not_present = 0,
	.useable         = 0,
};

will leave .lm uninitialized.  This means that anything in the
kernel that reads user_desc.lm for 32-bit tasks is unreliable.

Revert the .lm check in set_thread_area().  The value never did
anything in the first place.

Fixes: 0e58af4e1d ("x86/tls: Disallow unusual TLS segments")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # Only if 0e58af4e1d is backported
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-18 12:12:26 +01:00
Jiang Liu a978609112 x86, irq: Use helpers to access irq_cfg data structure associated with IRQ
Use helpers to access irq_cfg data structure associated with IRQ,
instead of accessing irq_data->chip_data directly. Later we can
rewrite those helpers to support hierarchy irqdomain.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1414397531-28254-17-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-16 14:08:17 +01:00
Jiang Liu 11d686e956 x86, irq: Move IRQ initialization routines from io_apic.c into vector.c
Move IRQ initialization routines from io_apic.c into vector.c,
preparing for enabling hierarchy irqdomain.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1414397531-28254-15-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-16 14:08:17 +01:00