Commit graph

22766 commits

Author SHA1 Message Date
Andrea Arcangeli 1380fca084 userfaultfd: activate syscall
This activates the userfaultfd syscall.

[sfr@canb.auug.org.au: activate syscall fix]
[akpm@linux-foundation.org: don't enable userfaultfd on powerpc]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Ulrich Obergfell ec6a90661a watchdog: rename watchdog_suspend() and watchdog_resume()
Rename watchdog_suspend() to lockup_detector_suspend() and
watchdog_resume() to lockup_detector_resume() to avoid confusion with the
watchdog subsystem and to be consistent with the existing name
lockup_detector_init().

Also provide comment blocks to explain the watchdog_running and
watchdog_suspended variables and their relationship.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Ulrich Obergfell 999bbe49ea watchdog: use suspend/resume interface in fixup_ht_bug()
Remove watchdog_nmi_disable_all() and watchdog_nmi_enable_all() since
these functions are no longer needed.  If a subsystem has a need to
deactivate the watchdog temporarily, it should utilize the
watchdog_suspend() and watchdog_resume() functions.

[akpm@linux-foundation.org: fix build with CONFIG_LOCKUP_DETECTOR=m]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Guenter Roeck aacfbe6a97 kernel/watchdog: move NMI function header declarations from watchdog.h to nmi.h
The kernel's NMI watchdog has nothing to do with the watchdog subsystem.
Its header declarations should be in linux/nmi.h, not linux/watchdog.h.

The code provided two sets of dummy functions if HARDLOCKUP_DETECTOR is
not configured, one in the include file and one in kernel/watchdog.c.
Remove the dummy functions from kernel/watchdog.c and use those from the
include file.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Linus Torvalds f377ea88b8 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main pull request for the drm for 4.3.  Nouveau is
  probably the biggest amount of changes in here, since it missed 4.2.
  Highlights below, along with the usual bunch of fixes.

  All stuff outside drm should have applicable acks.

  Highlights:

   - new drivers:
        freescale dcu kms driver

   - core:
        more atomic fixes
        disable some dri1 interfaces on kms drivers
        drop fb panic handling, this was just getting more broken, as more locking was required.
        new core fbdev Kconfig support - instead of each driver enable/disabling it
        struct_mutex cleanups

   - panel:
        more new panels
        cleanup Kconfig

   - i915:
        Skylake support enabled by default
        legacy modesetting using atomic infrastructure
        Skylake fixes
        GEN9 workarounds

   - amdgpu:
        Fiji support
        CGS support for amdgpu
        Initial GPU scheduler - off by default
        Lots of bug fixes and optimisations.

   - radeon:
        DP fixes
        misc fixes

   - amdkfd:
        Add Carrizo support for amdkfd using amdgpu.

   - nouveau:
        long pending cleanup to complete driver,
        fully bisectable which makes it larger,
        perfmon work
        more reclocking improvements
        maxwell displayport fixes

   - vmwgfx:
        new DX device support, supports OpenGL 3.3
        screen targets support

   - mgag200:
        G200eW support
        G200e new revision support

   - msm:
        dragonboard 410c support, msm8x94 support, msm8x74v1 support
        yuv format support
        dma plane support
        mdp5 rotation
        initial hdcp

   - sti:
        atomic support

   - exynos:
        lots of cleanups
        atomic modesetting/pageflipping support
        render node support

   - tegra:
        tegra210 support (dc, dsi, dp/hdmi)
        dpms with atomic modesetting support

   - atmel:
        support for 3 more atmel SoCs
        new input formats, PRIME support.

   - dwhdmi:
        preparing to add audio support

   - rockchip:
        yuv plane support"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1369 commits)
  drm/amdgpu: rename gmc_v8_0_init_compute_vmid
  drm/amdgpu: fix vce3 instance handling
  drm/amdgpu: remove ib test for the second VCE Ring
  drm/amdgpu: properly enable VM fault interrupts
  drm/amdgpu: fix warning in scheduler
  drm/amdgpu: fix buffer placement under memory pressure
  drm/amdgpu/cz: fix cz_dpm_update_low_memory_pstate logic
  drm/amdgpu: fix typo in dce11 watermark setup
  drm/amdgpu: fix typo in dce10 watermark setup
  drm/amdgpu: use top down allocation for non-CPU accessible vram
  drm/amdgpu: be explicit about cpu vram access for driver BOs (v2)
  drm/amdgpu: set MEC doorbell range for Fiji
  drm/amdgpu: implement burst NOP for SDMA
  drm/amdgpu: add insert_nop ring func and default implementation
  drm/amdgpu: add amdgpu_get_sdma_instance helper function
  drm/amdgpu: add AMDGPU_MAX_SDMA_INSTANCES
  drm/amdgpu: add burst_nop flag for sdma
  drm/amdgpu: add count field for the SDMA NOP packet v2
  drm/amdgpu: use PT for VM sync on unmap
  drm/amdgpu: make wait_event uninterruptible in push_job
  ...
2015-09-04 15:49:32 -07:00
Andrey Ryabinin 71c6da846b crypto: ghash-clmulni: specify context size for ghash async algorithm
Currently context size (cra_ctxsize) doesn't specified for
ghash_async_alg. Which means it's zero. Thus crypto_create_tfm()
doesn't allocate needed space for ghash_async_ctx, so any
read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid.

Cc: stable@vger.kernel.org
Signed-off-by: Andrey Ryabinin <aryabinin@odin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-09-04 23:21:07 +08:00
Linus Torvalds ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Thomas Gleixner 66c117d7fa x86/alternatives: Make optimize_nops() interrupt safe and synced
Richard reported the following crash:

[    0.036000] BUG: unable to handle kernel paging request at 55501e06
[    0.036000] IP: [<c0aae48b>] common_interrupt+0xb/0x38
[    0.036000] Call Trace:
[    0.036000]  [<c0409c80>] ? add_nops+0x90/0xa0
[    0.036000]  [<c040a054>] apply_alternatives+0x274/0x630

Chuck decoded:

 "  0:   8d 90 90 83 04 24       lea    0x24048390(%eax),%edx
    6:   80 fc 0f                cmp    $0xf,%ah
    9:   a8 0f                   test   $0xf,%al
 >> b:   a0 06 1e 50 55          mov    0x55501e06,%al
   10:   57                      push   %edi
   11:   56                      push   %esi

 Interrupt 0x30 occurred while the alternatives code was replacing the
 initial 0x90,0x90,0x90 NOPs (from the ASM_CLAC macro) with the
 optimized version, 0x8d,0x76,0x00. Only the first byte has been
 replaced so far, and it makes a mess out of the insn decoding."

optimize_nops() is buggy in two aspects:

- It's not disabling interrupts across the modification
- It's lacking a sync_core() call

Add both.

Fixes: 4fd4b6e553 'x86/alternatives: Use optimized NOPs for padding'
Reported-and-tested-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Chuck Ebbert <cebbert.lkml@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1509031232340.15006@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-03 21:27:47 +02:00
Linus Torvalds dd5cdb48ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Another merge window, another set of networking changes.  I've heard
  rumblings that the lightweight tunnels infrastructure has been voted
  networking change of the year.  But what do I know?

   1) Add conntrack support to openvswitch, from Joe Stringer.

   2) Initial support for VRF (Virtual Routing and Forwarding), which
      allows the segmentation of routing paths without using multiple
      devices.  There are some semantic kinks to work out still, but
      this is a reasonably strong foundation.  From David Ahern.

   3) Remove spinlock fro act_bpf fast path, from Alexei Starovoitov.

   4) Ignore route nexthops with a link down state in ipv6, just like
      ipv4.  From Andy Gospodarek.

   5) Remove spinlock from fast path of act_gact and act_mirred, from
      Eric Dumazet.

   6) Document the DSA layer, from Florian Fainelli.

   7) Add netconsole support to bcmgenet, systemport, and DSA.  Also
      from Florian Fainelli.

   8) Add Mellanox Switch Driver and core infrastructure, from Jiri
      Pirko.

   9) Add support for "light weight tunnels", which allow for
      encapsulation and decapsulation without bearing the overhead of a
      full blown netdevice.  From Thomas Graf, Jiri Benc, and a cast of
      others.

  10) Add Identifier Locator Addressing support for ipv6, from Tom
      Herbert.

  11) Support fragmented SKBs in iwlwifi, from Johannes Berg.

  12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.

  13) Add BQL support to 3c59x driver, from Loganaden Velvindron.

  14) Stop using a zero TX queue length to mean that a device shouldn't
      have a qdisc attached, use an explicit flag instead.  From Phil
      Sutter.

  15) Use generic geneve netdevice infrastructure in openvswitch, from
      Pravin B Shelar.

  16) Add infrastructure to avoid re-forwarding a packet in software
      that was already forwarded by a hardware switch.  From Scott
      Feldman.

  17) Allow AF_PACKET fanout function to be implemented in a bpf
      program, from Willem de Bruijn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
  netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
  netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
  net: fec: clear receive interrupts before processing a packet
  ipv6: fix exthdrs offload registration in out_rt path
  xen-netback: add support for multicast control
  bgmac: Update fixed_phy_register()
  sock, diag: fix panic in sock_diag_put_filterinfo
  flow_dissector: Use 'const' where possible.
  flow_dissector: Fix function argument ordering dependency
  ixgbe: Resolve "initialized field overwritten" warnings
  ixgbe: Remove bimodal SR-IOV disabling
  ixgbe: Add support for reporting 2.5G link speed
  ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
  ixgbe: support for ethtool set_rxfh
  ixgbe: Avoid needless PHY access on copper phys
  ixgbe: cleanup to use cached mask value
  ixgbe: Remove second instance of lan_id variable
  ixgbe: use kzalloc for allocating one thing
  flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
  ixgbe: Remove unused PCI bus types
  ...
2015-09-03 08:08:17 -07:00
Linda Knippers 31e09b18c8 x86/mm/srat: Print non-volatile flag in SRAT
With the addition of NVDIMM support, a question came up as to
whether NVDIMM ranges should be in the SRAT with this bit set.
I think the consensus was no because the ranges are in the NFIT
with proximity domain information there.

ACPI is not clear on the meaning of this bit in the SRAT.
If someone is setting it, we might want to ask them what they
expect to happen with it.

Right now this bit is only printed if all the ACPI debug
information is turned on.

Signed-off-by: Linda Knippers <linda.knippers@hp.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150901194154.GA4939@ljkz400
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-02 09:33:25 +02:00
Linus Torvalds ae98207309 Power management and ACPI material for v4.3-rc1
- ACPICA update to upstream revision 20150818 including method
    tracing extensions to allow more in-depth AML debugging in the
    kernel and a number of assorted fixes and cleanups (Bob Moore,
    Lv Zheng, Markus Elfring).
 
  - ACPI sysfs code updates and a documentation update related to
    AML method tracing (Lv Zheng).
 
  - ACPI EC driver fix related to serialized evaluations of _Qxx
    methods and ACPI tools updates allowing the EC userspace tool
    to be built from the kernel source (Lv Zheng).
 
  - ACPI processor driver updates preparing it for future
    introduction of CPPC support and ACPI PCC mailbox driver
    updates (Ashwin Chaugule).
 
  - ACPI interrupts enumeration fix for a regression related
    to the handling of IRQ attribute conflicts between MADT
    and the ACPI namespace (Jiang Liu).
 
  - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar).
 
  - ACPI device registration code reorganization to separate the
    sysfs-related code and bus type operations from the rest (Rafael
    J Wysocki).
 
  - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
    Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
 
  - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups
    (Pan Xinhui, Rafael J Wysocki).
 
  - cpufreq core cleanups on top of the previous changes allowing it
    to preseve its sysfs directories over system suspend/resume (Viresh
    Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
 
  - cpufreq fixes and cleanups related to governors (Viresh Kumar).
 
  - cpufreq updates (core and the cpufreq-dt driver) related to the
    turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
 
  - New DT bindings for Operating Performance Points (OPP), support
    for them in the OPP framework and in the cpufreq-dt driver plus
    related OPP framework fixes and cleanups (Viresh Kumar).
 
  - cpufreq powernv driver updates (Shilpasri G Bhat).
 
  - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
 
  - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
    and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
 
  - intel_pstate driver updates including Skylake-S support, support
    for enabling HW P-states per CPU and an additional vendor bypass
    list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
 
  - cpuidle core fixes related to the handling of coupled idle states
    (Xunlei Pang).
 
  - intel_idle driver updates including Skylake Client support and
    support for freeze-mode-specific idle states (Len Brown).
 
  - Driver core updates related to power management (Andy Shevchenko,
    Rafael J Wysocki).
 
  - Generic power domains framework fixes and cleanups (Jon Hunter,
    Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
 
  - Device PM QoS framework update to allow the latency tolerance
    setting to be exposed to user space via sysfs (Mika Westerberg).
 
  - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
    exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
 
  - System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
 
  - rockchip-io AVS support updates (Heiko Stuebner).
 
  - PM core clocks support fixup (Colin Ian King).
 
  - Power capping RAPL driver update including support for Skylake H/S
    and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
 
  - Generic device properties framework fixes related to the handling
    of static (driver-provided) property sets (Andy Shevchenko).
 
  - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
    Shreyas B Prabhu).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJV5hhGAAoJEILEb/54YlRxs+EQAK51iFk48+IbpHYaZZ50Yo4m
 ZZc2zBcbwRcBlU9vKERrhG+jieSl8J/JJNxT8vBjKqyvNw038mCjewQh02ol0HuC
 R7nlDiVJkmZ50sLO4xwE/1UBZr/XqbddwCUnYzvFMkMTA0ePzFtf8BrJ1FXpT8S/
 fkwSXQty6hvJDwxkfrbMSaA730wMju9lahx8D6MlmUAedWYZOJDMQKB4WKa/St5X
 9uckBPHUBB2KiKlXxdbFPwKLNxHvLROq5SpDLc6cM/7XZB+QfNFy85CUjCUtYo1O
 1W8k0qnztvZ6UEv27qz5dejGyAGOarMWGGNsmL9evoeGeHRpQL+dom7HcTnbAfUZ
 walyhYSm/zKkdy7Vl3xWUUQkMG48+PviMI6K0YhHXb3Rm5wlR/yBNZTwNIty9SX/
 fKCHEa8QynWwLxgm53c3xRkiitJxMsHNK03moLD9zQMjshTyTNvpNbZoahyKQzk6
 H+9M1DBRHhkkREDWSwGutukxfEMtWe2vcZcyERrFiY7l5k1j58DwDBMPqjPhRv6q
 P/1NlCzr0XYf83Y86J18LbDuPGDhTjjIEn6CqbtI2mmWqTg3+rF7zvS2ux+FzMnA
 gisv8l6GT9JiWhxKFqqL/rrVpwtyHebWLYE/RpNUW6fEzLziRNj1qyYO9dqI/GGi
 I3rfxlXoc/5xJWCgNB8f
 =fTgI
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "From the number of commits perspective, the biggest items are ACPICA
  and cpufreq changes with the latter taking the lead (over 50 commits).

  On the cpufreq front, there are many cleanups and minor fixes in the
  core and governors, driver updates etc.  We also have a new cpufreq
  driver for Mediatek MT8173 chips.

  ACPICA mostly updates its debug infrastructure and adds a number of
  fixes and cleanups for a good measure.

  The Operating Performance Points (OPP) framework is updated with new
  DT bindings and support for them among other things.

  We have a few updates of the generic power domains framework and a
  reorganization of the ACPI device enumeration code and bus type
  operations.

  And a lot of fixes and cleanups all over.

  Included is one branch from the MFD tree as it contains some
  PM-related driver core and ACPI PM changes a few other commits are
  based on.

  Specifics:

   - ACPICA update to upstream revision 20150818 including method
     tracing extensions to allow more in-depth AML debugging in the
     kernel and a number of assorted fixes and cleanups (Bob Moore, Lv
     Zheng, Markus Elfring).

   - ACPI sysfs code updates and a documentation update related to AML
     method tracing (Lv Zheng).

   - ACPI EC driver fix related to serialized evaluations of _Qxx
     methods and ACPI tools updates allowing the EC userspace tool to be
     built from the kernel source (Lv Zheng).

   - ACPI processor driver updates preparing it for future introduction
     of CPPC support and ACPI PCC mailbox driver updates (Ashwin
     Chaugule).

   - ACPI interrupts enumeration fix for a regression related to the
     handling of IRQ attribute conflicts between MADT and the ACPI
     namespace (Jiang Liu).

   - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi
     Kasagar).

   - ACPI device registration code reorganization to separate the
     sysfs-related code and bus type operations from the rest (Rafael J
     Wysocki).

   - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
     Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).

   - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan
     Xinhui, Rafael J Wysocki).

   - cpufreq core cleanups on top of the previous changes allowing it to
     preseve its sysfs directories over system suspend/resume (Viresh
     Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).

   - cpufreq fixes and cleanups related to governors (Viresh Kumar).

   - cpufreq updates (core and the cpufreq-dt driver) related to the
     turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).

   - New DT bindings for Operating Performance Points (OPP), support for
     them in the OPP framework and in the cpufreq-dt driver plus related
     OPP framework fixes and cleanups (Viresh Kumar).

   - cpufreq powernv driver updates (Shilpasri G Bhat).

   - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).

   - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
     and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).

   - intel_pstate driver updates including Skylake-S support, support
     for enabling HW P-states per CPU and an additional vendor bypass
     list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).

   - cpuidle core fixes related to the handling of coupled idle states
     (Xunlei Pang).

   - intel_idle driver updates including Skylake Client support and
     support for freeze-mode-specific idle states (Len Brown).

   - Driver core updates related to power management (Andy Shevchenko,
     Rafael J Wysocki).

   - Generic power domains framework fixes and cleanups (Jon Hunter,
     Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).

   - Device PM QoS framework update to allow the latency tolerance
     setting to be exposed to user space via sysfs (Mika Westerberg).

   - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
     exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).

   - System sleep support updates (Alan Stern, Len Brown, SungEun Kim).

   - rockchip-io AVS support updates (Heiko Stuebner).

   - PM core clocks support fixup (Colin Ian King).

   - Power capping RAPL driver update including support for Skylake H/S
     and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).

   - Generic device properties framework fixes related to the handling
     of static (driver-provided) property sets (Andy Shevchenko).

   - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
     Shreyas B Prabhu)"

* tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits)
  cpufreq: speedstep-lib: Use monotonic clock
  cpufreq: powernv: Increase the verbosity of OCC console messages
  cpufreq: sfi: use kmemdup rather than duplicating its implementation
  cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor()
  cpufreq: rename cpufreq_real_policy as cpufreq_user_policy
  cpufreq: remove redundant 'policy' field from user_policy
  cpufreq: remove redundant 'governor' field from user_policy
  cpufreq: update user_policy.* on success
  cpufreq: use memcpy() to copy policy
  cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event
  cpufreq: mediatek: Add MT8173 cpufreq driver
  dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings
  PM / Domains: Fix typo in description of genpd_dev_pm_detach()
  PM / Domains: Remove unusable governor dummies
  PM / Domains: Make pm_genpd_init() available to modules
  PM / domains: Align column headers and data in pm_genpd_summary output
  powercap / RAPL: disable the 2nd power limit properly
  tools: cpupower: Fix error when running cpupower monitor
  PM / OPP: Drop unlikely before IS_ERR(_OR_NULL)
  PM / OPP: Fix static checker warning (broken 64bit big endian systems)
  ...
2015-09-01 19:45:46 -07:00
Linus Torvalds e713c80a4e Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 clockevent update from Thomas Gleixner:
 "A single commit, which converts HPET clockevents driver to the new
  callbacks"

* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hpet: Migrate to new set_state interface
2015-09-01 15:42:28 -07:00
Linus Torvalds 43af9872f5 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner:
 "This udpate contains:

   - rework the irq vector array to store a pointer to the irq
     descriptor instead of the irq number to avoid a lookup of the irq
     descriptor in the irq entry path

   - lguest interrupt handling cleanups

   - conversion of the local apic timer to the new clockevent callbacks

   - preparatory changes for the irq argument removal of interrupt flow
     handlers"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Do not dereference irq descriptor before checking it
  tools/lguest: Clean up include dir
  tools/lguest: Fix redefinition of struct virtio_pci_cfg_cap
  x86/irq: Store irq descriptor in vector array
  genirq: Provide irq_desc_has_action
  x86/irq: Get rid of an indentation level
  x86/irq: Rename VECTOR_UNDEFINED to VECTOR_UNUSED
  x86/irq: Replace numeric constant
  x86/irq: Protect smp_cleanup_move
  x86/lguest: Do not setup unused irq vectors
  x86/lguest: Clean up lguest_setup_irq
  x86/apic: Drop local_irq_save/restore in timer callbacks
  x86/apic: Migrate apic timer to new set_state interface
  x86/irq: Use access helper irq_data_get_affinity_mask()
  x86/irq: Use accessor irq_data_get_irq_handler_data()
  x86/irq: Use accessor irq_data_get_node()
2015-09-01 15:20:51 -07:00
Linus Torvalds 17e6b00ac4 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "This updated pull request does not contain the last few GIC related
  patches which were reported to cause a regression.  There is a fix
  available, but I let it breed for a couple of days first.

  The irq departement provides:

   - new infrastructure to support non PCI based MSI interrupts
   - a couple of new irq chip drivers
   - the usual pile of fixlets and updates to irq chip drivers
   - preparatory changes for removal of the irq argument from interrupt
     flow handlers
   - preparatory changes to remove IRQF_VALID"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
  irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
  irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
  irqchip: Add documentation for the bcm2836 interrupt controller
  irqchip/bcm2835: Add support for being used as a second level controller
  irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
  PCI: xilinx: Fix typo in function name
  irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
  irqchip/gic: Only allow the primary GIC to set the CPU map
  PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
  unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
  tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
  m68k/irq: Prepare irq handlers for irq argument removal
  C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
  blackfin: Prepare irq handlers for irq argument removal
  arc/irq: Prepare idu_cascade_isr for irq argument removal
  sparc/irq: Use access helper irq_data_get_affinity_mask()
  sparc/irq: Use helper irq_data_get_irq_handler_data()
  parisc/irq: Use access helper irq_data_get_affinity_mask()
  mn10300/irq: Use access helper irq_data_get_affinity_mask()
  irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
  ...
2015-09-01 14:33:35 -07:00
Linus Torvalds 5e359bf221 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Rather large, but nothing exiting:

   - new range check for settimeofday() to prevent that boot time
     becomes negative.
   - fix for file time rounding
   - a few simplifications of the hrtimer code
   - fix for the proc/timerlist code so the output of clock realtime
     timers is accurate
   - more y2038 work
   - tree wide conversion of clockevent drivers to the new callbacks"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits)
  hrtimer: Handle failure of tick_init_highres() gracefully
  hrtimer: Unconfuse switch_hrtimer_base() a bit
  hrtimer: Simplify get_target_base() by returning current base
  hrtimer: Drop return code of hrtimer_switch_to_hres()
  time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()
  time: Introduce current_kernel_time64()
  time: Introduce struct itimerspec64
  time: Add the common weak version of update_persistent_clock()
  time: Always make sure wall_to_monotonic isn't positive
  time: Fix nanosecond file time rounding in timespec_trunc()
  timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers
  cris/time: Migrate to new 'set-state' interface
  kernel: broadcast-hrtimer: Migrate to new 'set-state' interface
  xtensa/time: Migrate to new 'set-state' interface
  unicore/time: Migrate to new 'set-state' interface
  um/time: Migrate to new 'set-state' interface
  sparc/time: Migrate to new 'set-state' interface
  sh/localtimer: Migrate to new 'set-state' interface
  score/time: Migrate to new 'set-state' interface
  s390/time: Migrate to new 'set-state' interface
  ...
2015-09-01 14:04:50 -07:00
Linus Torvalds 361f7d1757 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core platform updates from Ingo Molnar:
 "The main changes are:

   - Intel Atom platform updates.  (Andy Shevchenko)

   - modularity fixlets.  (Paul Gortmaker)

   - x86 platform clockevents driver updates for lguest, uv and Xen.
     (Viresh Kumar)

   - Microsoft Hyper-V TSC fixlet.  (Vitaly Kuznetsov)"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform: Make atom/pmc_atom.c explicitly non-modular
  x86/hyperv: Mark the Hyper-V TSC as unstable
  x86/xen/time: Migrate to new set-state interface
  x86/uv/time: Migrate to new set-state interface
  x86/lguest/timer: Migrate to new set-state interface
  x86/pci/intel_mid_pci: Use proper constants for irq polarity
  x86/pci/intel_mid_pci: Make intel_mid_pci_ops static
  x86/pci/intel_mid_pci: Propagate actual return code
  x86/pci/intel_mid_pci: Work around for IRQ0 assignment
  x86/platform/iosf_mbi: Add Intel Tangier PCI id
  x86/platform/iosf_mbi: Source cleanup
  x86/platform/iosf_mbi: Remove NULL pointer checks for pci_dev_put()
  x86/platform/iosf_mbi: Check return value of debugfs_create properly
  x86/platform/iosf_mbi: Move to dedicated folder
  x86/platform/intel/pmc_atom: Move the PMC-Atom code to arch/x86/platform/atom
  x86/platform/intel/pmc_atom: Add Cherrytrail PMC interface
  x86/platform/intel/pmc_atom: Supply register mappings via PMC object
  x86/platform/intel/pmc_atom: Print index of device in loop
  x86/platform/intel/pmc_atom: Export accessors to PMC registers
2015-09-01 10:33:31 -07:00
Linus Torvalds 25525bea46 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
 "The dominant change in this cycle was the continued work to isolate
  kernel drivers from MTRR legacies: this tree gets rid of all kernel
  internal driver interfaces to MTRRs (mostly by rewriting it to proper
  PAT interfaces), the only access left is the /proc/mtrr ABI.

  This work was done by Luis R Rodriguez.

  There's also some related PCI interface additions for which I've
  Cc:-ed Bjorn"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del()
  s390/io: Add pci_iomap_wc() and pci_iomap_wc_range()
  drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style
  drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc()
  drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc()
  drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc()
  PCI: Add pci_iomap_wc() variants
  drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer
  drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
  drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
  PCI: Add pci_ioremap_wc_bar()
  x86/mm: Make kernel/check.c explicitly non-modular
  x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular
  x86/mm/pat: Add comments to cachemode translation tables
  arch/*/io.h: Add ioremap_uc() to all architectures
  drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc()
  drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC
  drivers/video/fbdev/atyfb: Clarify ioremap() base and length used
  drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper
  x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default
  ...
2015-09-01 10:07:40 -07:00
Linus Torvalds 2962156d5c Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 irq fixlet from Ingo Molnar:
 "A single change that hides the 'HYP:' line in /proc/interrupts when
  it's unused"

* 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Hide 'HYP:' line in /proc/interrupts when not on Xen/Hyper-V
2015-09-01 10:05:44 -07:00
Linus Torvalds 6b2282aa37 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Ingo Molnar:
 "Two changes: a suspend/resume quirk and a new CPUID bit definition"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpufeature: Add feature bit for Intel's Silicon Debug CPUID bit
  x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume
2015-09-01 09:41:03 -07:00
Linus Torvalds 0c0fee018d Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 init code fixlet from Ingo Molnar:
 "A single change: fix obsolete init code annotations"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Drop bogus __ref / __refdata annotations
2015-09-01 09:33:26 -07:00
Linus Torvalds a0c0d985de Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build fixlet from Ingo Molnar:
 "A single change propagating CONFIG_JUMP_LABEL into the x86 defconfigs"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kconfig: Enable CONFIG_JUMP_LABEL in the defconfigs
2015-09-01 09:32:37 -07:00
Linus Torvalds 11e612ddb4 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "The main x86 bootup related changes in this cycle were:

   - more boot time optimizations.  (Len Brown)

   - implement hex output to allow the debugging of early bootup
     parameters.  (Kees Cook)

   - remove obsolete MCA leftovers.  (Paolo Pisati)"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted
  x86/smpboot: Remove SIPI delays from cpu_up()
  x86/smpboot: Remove udelay(100) when polling cpu_callin_map
  x86/smpboot: Remove udelay(100) when polling cpu_initialized_map
  x86/boot: Obsolete the MCA sys_desc_table
  x86/boot: Add hex output for debugging
2015-09-01 09:04:31 -07:00
Linus Torvalds 5778077d03 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm changes from Ingo Molnar:
 "The biggest changes in this cycle were:

   - Revamp, simplify (and in some cases fix) Time Stamp Counter (TSC)
     primitives.  (Andy Lutomirski)

   - Add new, comprehensible entry and exit handlers written in C.
     (Andy Lutomirski)

   - vm86 mode cleanups and fixes.  (Brian Gerst)

   - 32-bit compat code cleanups.  (Brian Gerst)

  The amount of simplification in low level assembly code is already
  palpable:

     arch/x86/entry/entry_32.S                          | 130 +----
     arch/x86/entry/entry_64.S                          | 197 ++-----

  but more simplifications are planned.

  There's also the usual laudry mix of low level changes - see the
  changelog for details"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (83 commits)
  x86/asm: Drop repeated macro of X86_EFLAGS_AC definition
  x86/asm/msr: Make wrmsrl() a function
  x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer
  x86/asm: Add MONITORX/MWAITX instruction support
  x86/traps: Weaken context tracking entry assertions
  x86/asm/tsc: Add rdtscll() merge helper
  selftests/x86: Add syscall_nt selftest
  selftests/x86: Disable sigreturn_64
  x86/vdso: Emit a GNU hash
  x86/entry: Remove do_notify_resume(), syscall_trace_leave(), and their TIF masks
  x86/entry/32: Migrate to C exit path
  x86/entry/32: Remove 32-bit syscall audit optimizations
  x86/vm86: Rename vm86->v86flags and v86mask
  x86/vm86: Rename vm86->vm86_info to user_vm86
  x86/vm86: Clean up vm86.h includes
  x86/vm86: Move the vm86 IRQ definitions to vm86.h
  x86/vm86: Use the normal pt_regs area for vm86
  x86/vm86: Eliminate 'struct kernel_vm86_struct'
  x86/vm86: Move fields from 'struct kernel_vm86_struct' to 'struct vm86'
  x86/vm86: Move vm86 fields out of 'thread_struct'
  ...
2015-09-01 08:40:25 -07:00
Rafael J. Wysocki e625ccec1f Merge branches 'pm-tools' and 'powercap'
* pm-tools:
  tools: cpupower: Fix error when running cpupower monitor
  tools/power turbostat: fix typo on DRAM column in Joules-mode
  cpupower: Do not change the frequency of offline cpu
  tools/power turbostat: fix parameter passing for forked command
  tools/power turbostat: dump CONFIG_TDP
  tools/power turbostat: cpu0 is no longer hard-coded, so  update output
  tools/power turbostat: update turbostat(8)

* powercap:
  powercap / RAPL: disable the 2nd power limit properly
  powercap / RAPL: Add support for Broadwell-H
  powercap / RAPL: Add support for Skylake H/S
2015-09-01 15:54:30 +02:00
Linus Torvalds 65a99597f0 Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull NOHZ updates from Ingo Molnar:
 "The main changes, mostly written by Frederic Weisbecker, include:

   - Fix some jiffies based cputime assumptions.  (No real harm because
     the concerned code isn't used by full dynticks.)

   - Simplify jiffies <-> usecs conversions.  Remove dead code.

   - Remove early hacks on nohz full code that avoided messing up idle
     nohz internals.  Now nohz integrates well full and idle and such
     hack have become needless.

   - Restart nohz full tick from irq exit.  (A simplification and a
     preparation for future optimization on scheduler kick to nohz
     full)

   - Code cleanups.

   - Tile driver isolation enhancement on top of nohz.  (Chris Metcalf)"

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  nohz: Remove useless argument on tick_nohz_task_switch()
  nohz: Move tick_nohz_restart_sched_tick() above its users
  nohz: Restart nohz full tick from irq exit
  nohz: Remove idle task special case
  nohz: Prevent tilegx network driver interrupts
  alpha: Fix jiffies based cputime assumption
  apm32: Fix cputime == jiffies assumption
  jiffies: Remove HZ > USEC_PER_SEC special case
2015-08-31 21:04:24 -07:00
Linus Torvalds a1d8561172 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The biggest change in this cycle is the rewrite of the main SMP load
  balancing metric: the CPU load/utilization.  The main goal was to make
  the metric more precise and more representative - see the changelog of
  this commit for the gory details:

    9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")

  It is done in a way that significantly reduces complexity of the code:

    5 files changed, 249 insertions(+), 494 deletions(-)

  and the performance testing results are encouraging.  Nevertheless we
  need to keep an eye on potential regressions, since this potentially
  affects every SMP workload in existence.

  This work comes from Yuyang Du.

  Other changes:

   - SCHED_DL updates.  (Andrea Parri)

   - Simplify architecture callbacks by removing finish_arch_switch().
     (Peter Zijlstra et al)

   - cputime accounting: guarantee stime + utime == rtime.  (Peter
     Zijlstra)

   - optimize idle CPU wakeups some more - inspired by Facebook server
     loads.  (Mike Galbraith)

   - stop_machine fixes and updates.  (Oleg Nesterov)

   - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)

   - sched/numa tweaks.  (Srikar Dronamraju)

   - misc fixes and small cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  sched/deadline: Fix comment in enqueue_task_dl()
  sched/deadline: Fix comment in push_dl_tasks()
  sched: Change the sched_class::set_cpus_allowed() calling context
  sched: Make sched_class::set_cpus_allowed() unconditional
  sched: Fix a race between __kthread_bind() and sched_setaffinity()
  sched: Ensure a task has a non-normalized vruntime when returning back to CFS
  sched/numa: Fix NUMA_DIRECT topology identification
  tile: Reorganize _switch_to()
  sched, sparc32: Update scheduler comments in copy_thread()
  sched: Remove finish_arch_switch()
  sched, tile: Remove finish_arch_switch
  sched, sh: Fold finish_arch_switch() into switch_to()
  sched, score: Remove finish_arch_switch()
  sched, avr32: Remove finish_arch_switch()
  sched, MIPS: Get rid of finish_arch_switch()
  sched, arm: Remove finish_arch_switch()
  sched/fair: Clean up load average references
  sched/fair: Provide runnable_load_avg back to cfs_rq
  sched/fair: Remove task and group entity load when they are dead
  sched/fair: Init cfs_rq's sched_entity load average
  ...
2015-08-31 20:26:22 -07:00
Linus Torvalds 3959df1dfb Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
 "MCE handling updates, but also some generic drivers/edac/ changes to
  better organize the Kconfig space"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ras: Move AMD MCE injector to arch/x86/ras/
  x86/mce: Add a wrapper around mce_log() for injection
  x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check()
  RAS: Add a menuconfig option with descriptive text
  x86/mce: Reenable CMCI banks when swiching back to interrupt mode
  x86/mce: Clear Local MCE opt-in before kexec
  x86/mce: Remove unused function declarations
  x86/mce: Kill drain_mcelog_buffer()
  x86/mce: Avoid potential deadlock due to printk() in MCE context
  x86/mce: Remove the MCE ring for Action Optional errors
  x86/mce: Don't use percpu workqueues
  x86/mce: Provide a lockless memory pool to save error records
  x86/mce: Reuse one of the u16 padding fields in 'struct mce'
2015-08-31 20:20:30 -07:00
Linus Torvalds 41d859a83c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Main perf kernel side changes:

   - uprobes updates/fixes.  (Oleg Nesterov)

   - Add PERF_RECORD_SWITCH to indicate context switches and use it in
     tooling.  (Adrian Hunter)

   - Support BPF programs attached to uprobes and first steps for BPF
     tooling support.  (Wang Nan)

   - x86 generic x86 MSR-to-perf PMU driver.  (Andy Lutomirski)

   - x86 Intel PT, LBR and BTS updates.  (Alexander Shishkin)

   - x86 Intel Skylake support.  (Andi Kleen)

   - x86 Intel Knights Landing (KNL) RAPL support.  (Dasaratharaman
     Chandramouli)

   - x86 Intel Broadwell-DE uncore support.  (Kan Liang)

   - x86 hw breakpoints robustization (Andy Lutomirski)

  Main perf tooling side changes:

   - Support Intel PT in several tools, enabling the use of the
     processor trace feature introduced in Intel Broadwell processors:
     (Adrian Hunter)

       # dmesg | grep Performance
       # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
       # perf record -e intel_pt//u -a sleep 1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.216 MB perf.data ]
       # perf script # then navigate in the tool output to some area, like this one:
       184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
       185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
       186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
       187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
       188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
       189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
       190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
       191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
       192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
       193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
       194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
       195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
       196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
       197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
       198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
       199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
       200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) =>  7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
       201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)

   - Add support for using several Intel PT features (CYC, MTC packets),
     the relevant documentation was updated in:
         tools/perf/Documentation/intel-pt.txt
     briefly describing those packets, its purposes, how to configure
     them in the event config terms and relevant external documentation
     for further reading.  (Adrian Hunter)

   - Introduce support for probing at an absolute address, for user and
     kernel 'perf probe's, useful when one have the symbol maps on a
     developer machine but not on an embedded system.  (Wang Nan)

   - Add Intel BTS support, with a call-graph script to show it and PT
     in use in a GUI using 'perf script' python scripting with
     postgresql and Qt.  (Adrian Hunter)

   - Allow selecting the type of callchains per event, including
     disabling callchains in all but one entry in an event list, to save
     space, and also to ask for the callchains collected in one event to
     be used in other events.  (Kan Liang)

   - Beautify more syscall arguments in 'perf trace': (Arnaldo Carvalho
     de Melo)
       * A bunch more translate file/pathnames from pointers to strings.
       * Convert numbers to strings for the 'keyctl' syscall 'option'
         arg.
       * Add missing 'clockid' entries.

   - Introduce 'srcfile' sort key: (Andi Kleen)

       # perf record -F 10000 usleep 1
       # perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
       <SNIP>
       # Overhead  Source File
          26.49%  copy_page_64.S
           5.49%  signal.c
           0.51%  msr.h
       #

     It can be combined with other fields, for instance, experiment with
     '-s srcfile,symbol'.

     There are some oddities in some distros and with some specific
     DSOs, being investigated, so your mileage may vary.

   - Support per-event 'freq' term: (Namhyung Kim)

       $ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
       $ perf evlist -F
       cpu/instructions,freq=1234/: sample_freq=1234
       cycles: sample_period=1000
       $

   - Deref sys_enter pointer args with contents from probe:vfs_getname,
     showing pathnames instead of pointers in many syscalls in 'perf
     trace'.  (Arnaldo Carvalho de Melo)

   - Stop collecting /proc/kallsyms in perf.data files, saving about
     4.5MB on a typical x86-64 system, use the the symbol resolution
     routines used in all the other tools (report, top, etc) now that we
     can ask libtraceevent to use perf's symbol resolution code.
     (Arnaldo Carvalho de Melo)

   - Allow filtering out of perf's PID via 'perf record --exclude-perf'.
     (Wang Nan)

   - 'perf trace' now supports syscall groups, like strace, i.e:

       $ trace -e file touch file

     Will expand 'file' into multiple, file related, syscalls.  More
     work needed to add extra groups for other syscall groups, and also
     to complement what was added for the 'file' group, included as a
     proof of concept.  (Arnaldo Carvalho de Melo)

   - Add lock_pi stresser to 'perf bench futex', to test the kernel code
     related to FUTEX_(UN)LOCK_PI.  (Davidlohr Bueso)

   - Let user have timestamps with per-thread recording in 'perf record'
     (Adrian Hunter)

   - ... and tons of other changes, see the shortlog and the Git log for
     details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (240 commits)
  perf evlist: Add backpointer for perf_env to evlist
  perf tools: Rename perf_session_env to perf_env
  perf tools: Do not change lib/api/fs/debugfs directly
  perf tools: Add tracing_path and remove unneeded functions
  perf buildid: Introduce sysfs/filename__sprintf_build_id
  perf evsel: Add a backpointer to the evlist a evsel is in
  perf trace: Add header with copyright and background info
  perf scripts python: Add new compaction-times script
  perf stat: Get correct cpu id for print_aggr
  tools lib traceeveent: Allow for negative numbers in print format
  perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
  tracing/uprobes: Do not print '0x (null)' when offset is 0
  perf probe: Support probing at absolute address
  perf probe: Fix error reported when offset without function
  perf probe: Fix list result when address is zero
  perf probe: Fix list result when symbol can't be found
  tools build: Allow duplicate objects in the object list
  perf tools: Remove export.h from MANIFEST
  perf probe: Prevent segfault when reading probe point with absolute address
  perf tools: Update Intel PT documentation
  ...
2015-08-31 19:49:05 -07:00
Linus Torvalds 4658000955 Merge branch 'mm-kasan-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/kasan changes from Ingo Molnar:
 "These are two KASAN changes that factor out (and generalize) x86
  specific KASAN code from x86 to mm"

* 'mm-kasan-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kasan, mm: Introduce generic kasan_populate_zero_shadow()
  x86/kasan: Define KASAN_SHADOW_OFFSET per architecture
2015-08-31 19:16:36 -07:00
Rafael J. Wysocki 5d2a1a927d Merge branches 'acpi-pci', 'acpi-soc', 'acpi-ec' and 'acpi-osl'
* acpi-pci:
  ACPI, PCI: Penalize legacy IRQ used by ACPI SCI

* acpi-soc:
  ACPI / LPSS: Ignore 10ms delay for Braswell

* acpi-ec:
  ACPI / EC: Fix an issue caused by the serialized _Qxx evaluations

* acpi-osl:
  ACPI / osl: replace custom implementation of readq / writeq
2015-09-01 03:41:19 +02:00
Linus Torvalds 5757bd6157 Merge branch 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull inlining tuning from Ingo Molnar:
 "A handful of inlining optimizations inspired by x86 work but
  applicable in general"

* 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  jiffies: Force inlining of {m,u}msecs_to_jiffies()
  x86/hweight: Force inlining of __arch_hweight{32,64}()
  linux/bitmap: Force inlining of bitmap weight functions
2015-08-31 18:37:33 -07:00
Linus Torvalds 7073bc6612 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle are:

   - the combination of tree geometry-initialization simplifications and
     OS-jitter-reduction changes to expedited grace periods.  These two
     are stacked due to the large number of conflicts that would
     otherwise result.

   - privatize smp_mb__after_unlock_lock().

     This commit moves the definition of smp_mb__after_unlock_lock() to
     kernel/rcu/tree.h, in recognition of the fact that RCU is the only
     thing using this, that nothing else is likely to use it, and that
     it is likely to go away completely.

   - documentation updates.

   - torture-test updates.

   - misc fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  rcu,locking: Privatize smp_mb__after_unlock_lock()
  rcu: Silence lockdep false positive for expedited grace periods
  rcu: Don't disable CPU hotplug during OOM notifiers
  scripts: Make checkpatch.pl warn on expedited RCU grace periods
  rcu: Update MAINTAINERS entry
  rcu: Clarify CONFIG_RCU_EQS_DEBUG help text
  rcu: Fix backwards RCU_LOCKDEP_WARN() in synchronize_rcu_tasks()
  rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()
  rcu: Make rcu_is_watching() really notrace
  cpu: Wait for RCU grace periods concurrently
  rcu: Create a synchronize_rcu_mult()
  rcu: Fix obsolete priority-boosting comment
  rcu: Use WRITE_ONCE in RCU_INIT_POINTER
  rcu: Hide RCU_NOCB_CPU behind RCU_EXPERT
  rcu: Add RCU-sched flavors of get-state and cond-sync
  rcu: Add fastpath bypassing funnel locking
  rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS
  rcu: Pull out wait_event*() condition into helper function
  documentation: Describe new expedited stall warnings
  rcu: Add stall warnings to synchronize_sched_expedited()
  ...
2015-08-31 18:12:07 -07:00
Linus Torvalds d4c90396ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.3:

  API:

   - the AEAD interface transition is now complete.
   - add top-level skcipher interface.

  Drivers:

   - x86-64 acceleration for chacha20/poly1305.
   - add sunxi-ss Allwinner Security System crypto accelerator.
   - add RSA algorithm to qat driver.
   - add SRIOV support to qat driver.
   - add LS1021A support to caam.
   - add i.MX6 support to caam"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (163 commits)
  crypto: algif_aead - fix for multiple operations on AF_ALG sockets
  crypto: qat - enable legacy VFs
  MPI: Fix mpi_read_buffer
  crypto: qat - silence a static checker warning
  crypto: vmx - Fixing opcode issue
  crypto: caam - Use the preferred style for memory allocations
  crypto: caam - Propagate the real error code in caam_probe
  crypto: caam - Fix the error handling in caam_probe
  crypto: caam - fix writing to JQCR_MS when using service interface
  crypto: hash - Add AHASH_REQUEST_ON_STACK
  crypto: testmgr - Use new skcipher interface
  crypto: skcipher - Add top-level skcipher interface
  crypto: cmac - allow usage in FIPS mode
  crypto: sahara - Use dmam_alloc_coherent
  crypto: caam - Add support for LS1021A
  crypto: qat - Don't move data inside output buffer
  crypto: vmx - Fixing GHASH Key issue on little endian
  crypto: vmx - Fixing AES-CTR counter bug
  crypto: null - Add missing Kconfig tristate for NULL2
  crypto: nx - Add forward declaration for struct crypto_aead
  ...
2015-08-31 17:38:39 -07:00
Linus Torvalds 26f8b7edc9 PCI changes for the v4.3 merge window:
Enumeration
     Allocate ATS struct during enumeration (Bjorn Helgaas)
     Embed ATS info directly into struct pci_dev (Bjorn Helgaas)
     Reduce size of ATS structure elements (Bjorn Helgaas)
     Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas)
     iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas)
     Move MPS configuration check to pci_configure_device() (Bjorn Helgaas)
     Set MPS to match upstream bridge (Keith Busch)
     ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri)
     Add pci_scan_root_bus_msi() (Lorenzo Pieralisi)
     ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi)
 
   Resource management
     Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi)
 
   PCI device hotplug
     pciehp: Remove unused interrupt events (Bjorn Helgaas)
     pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas)
     pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson)
     pciehp: Simplify pcie_poll_cmd() (Yijing Wang)
     Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang)
     Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang)
     Hold pci_slot_mutex while searching bus->slots list (Yijing Wang)
 
   Power management
     Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui)
 
   Virtualization
     Add ACS quirks for Intel I219-LM/V (Alex Williamson)
     Restore ACS configuration as part of pci_restore_state() (Alexander Duyck)
 
   MSI
     Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
     x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
     Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu)
     Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu)
     ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi)
     Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi)
 
   Generic host bridge driver
     Remove dependency on ARM-specific struct hw_pci (Jayachandran C)
     Build setup-irq.o for arm64 (Jayachandran C)
     Add arm64 support (Jayachandran C)
 
   APM X-Gene host bridge driver
     Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang)
     Add support for a 64-bit prefetchable memory window (Duc Dang)
     Drop owner assignment from platform_driver (Krzysztof Kozlowski)
 
   Broadcom iProc host bridge driver
     Allow BCMA bus driver to be built as module (Hauke Mehrtens)
     Delete unnecessary checks before phy calls (Markus Elfring)
     Add arm64 support (Ray Jui)
 
   Synopsys DesignWare host bridge driver
     Don't complain missing *config* reg space if va_cfg0 is set (Murali Karicheri)
 
   TI DRA7xx host bridge driver
     Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I)
     Add PM support (Kishon Vijay Abraham I)
     Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I)
     Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I)
 
   Xilinx AXI host bridge driver
     Check for MSI interrupt flag before handling as INTx (Russell Joyce)
 
   Miscellaneous
     Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa)
     Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas)
     Fix generic NCR 53c810 class code quirk (Bjorn Helgaas)
     Fix TI816X class code quirk (Bjorn Helgaas)
     Remove unused "pci_probe" flags (Bjorn Helgaas)
     Host bridge driver code simplifications (Fabio Estevam)
     Add dev_flags bit to access VPD through function 0 (Mark Rustad)
     Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad)
     Kill off set_irq_flags() usage (Rob Herring)
     Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar)
     Clean up pci_find_capability() (Wei Yang)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5FE/AAoJEFmIoMA60/r8I2QP/R9b9MrvH2i9tN98/lTDl7g3
 czE58ZM1d4kMYtW3Pm/DrYI6y6RprAaB4ZEp5rHxlFLqBPZEQwWodA19NkjECcb6
 g5qKWOdIWA4T6Jaab6a/yCmAFa0jni7iAmmTYqca9o3Xj7tFovxDxqPSYkh+rer0
 v+1sAr/4HXSiN339KR6teEF3VZqLFp6ewMydQlVS+R7kAOHHYQDqoo9WF6JnIoL5
 PO3Kbmr1WN3fZY3s98yLq1x6XmLrLlmGdJI+2r+KewO4r/05CL6wTVP/oTMi+Eti
 dueseeISlOTcTAUhk87Vap23uJPeB/rJbYoFdCr7+0AkZGe/U/E2dpZm2wyMcCvq
 OrATuFymgzIuJm5uUPsdH4lzsX97U9BcDccracfC38rYnP5u3bqHCjw8HJzANR7p
 VYbFBzc5ZCCUYtQAjyrKt2820AvTFo+Bu+z75IsJO8LQQgv/zGtQQ8grIQeAjH+l
 sAe3xOTwzZnq6Obl4qb/GElHmIGUbQ1X4Dx1mliiijKMKkhYHOA0iFnB/OBILmEZ
 wHzKU8chWcI9lip0aaX8q9i/qovdVUt2+rdo/N40l7YY66x4jkNgQQXZX+FSKk6H
 stTvEBQgK28EKCHDxMsgzTGIqllSyk4DnRMA7ij1hRWqdUbGk7wOPTvm9QSwNDWe
 SokuWzAQD9YeMRGdsYjZ
 =DX1r
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v4.3 merge window:

  Enumeration:
   - Allocate ATS struct during enumeration (Bjorn Helgaas)
   - Embed ATS info directly into struct pci_dev (Bjorn Helgaas)
   - Reduce size of ATS structure elements (Bjorn Helgaas)
   - Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas)
   - iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas)
   - Move MPS configuration check to pci_configure_device() (Bjorn Helgaas)
   - Set MPS to match upstream bridge (Keith Busch)
   - ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri)
   - Add pci_scan_root_bus_msi() (Lorenzo Pieralisi)
   - ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi)

  Resource management:
   - Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi)

  PCI device hotplug:
   - pciehp: Remove unused interrupt events (Bjorn Helgaas)
   - pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas)
   - pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson)
   - pciehp: Simplify pcie_poll_cmd() (Yijing Wang)
   - Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang)
   - Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang)
   - Hold pci_slot_mutex while searching bus->slots list (Yijing Wang)

  Power management:
   - Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui)

  Virtualization:
   - Add ACS quirks for Intel I219-LM/V (Alex Williamson)
   - Restore ACS configuration as part of pci_restore_state() (Alexander Duyck)

  MSI:
   - Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
   - x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu)
   - Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu)
   - Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu)
   - ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi)
   - Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi)

  Generic host bridge driver:
   - Remove dependency on ARM-specific struct hw_pci (Jayachandran C)
   - Build setup-irq.o for arm64 (Jayachandran C)
   - Add arm64 support (Jayachandran C)

  APM X-Gene host bridge driver:
   - Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang)
   - Add support for a 64-bit prefetchable memory window (Duc Dang)
   - Drop owner assignment from platform_driver (Krzysztof Kozlowski)

  Broadcom iProc host bridge driver:
   - Allow BCMA bus driver to be built as module (Hauke Mehrtens)
   - Delete unnecessary checks before phy calls (Markus Elfring)
   - Add arm64 support (Ray Jui)

  Synopsys DesignWare host bridge driver:
   - Don't complain missing *config* reg space if va_cfg0 is set (Murali Karicheri)

  TI DRA7xx host bridge driver:
   - Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I)
   - Add PM support (Kishon Vijay Abraham I)
   - Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I)
   - Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I)

  Xilinx AXI host bridge driver:
   - Check for MSI interrupt flag before handling as INTx (Russell Joyce)

  Miscellaneous:
   - Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa)
   - Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas)
   - Fix generic NCR 53c810 class code quirk (Bjorn Helgaas)
   - Fix TI816X class code quirk (Bjorn Helgaas)
   - Remove unused "pci_probe" flags (Bjorn Helgaas)
   - Host bridge driver code simplifications (Fabio Estevam)
   - Add dev_flags bit to access VPD through function 0 (Mark Rustad)
   - Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad)
   - Kill off set_irq_flags() usage (Rob Herring)
   - Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar)
   - Clean up pci_find_capability() (Wei Yang)"

* tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (72 commits)
  PCI: Disable async suspend/resume for JMicron multi-function SATA/AHCI
  PCI: Set MPS to match upstream bridge
  PCI: Move MPS configuration check to pci_configure_device()
  PCI: Drop references acquired by of_parse_phandle()
  PCI/MSI: Remove unused pcibios_msi_controller() hook
  ARM/PCI: Remove msi_controller from struct pci_sys_data
  ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi()
  PCI: Add pci_scan_root_bus_msi()
  ARM/PCI: Replace panic with WARN messages on failures
  PCI: generic: Add arm64 support
  PCI: Build setup-irq.o for arm64
  PCI: generic: Remove dependency on ARM-specific struct hw_pci
  PCI: imx6: Simplify a trivial if-return sequence
  PCI: spear: Use BUG_ON() instead of condition followed by BUG()
  PCI: dra7xx: Remove unneeded use of IS_ERR_VALUE()
  PCI: Remove pci_ats_enabled()
  PCI: Stop caching ATS Invalidate Queue Depth
  PCI: Move ATS declarations to linux/pci.h so they're all together
  PCI: Clean up ATS error handling
  PCI: Use pci_physfn() rather than looking up physfn by hand
  ...
2015-08-31 17:14:39 -07:00
Ingo Molnar 0050ae57cd Merge branch 'x86/cpufeature' into x86/urgent, because it's ready
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-31 19:47:03 +02:00
Linus Torvalds 1af115d675 Driver core patches for 4.3-rc1
Here is the new patches for the driver core / sysfs for 4.3-rc1.
 
 Very small number of changes here, all the details are in the shortlog,
 nothing major happening at all this kernel release, which is nice to
 see.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlXV9EwACgkQMUfUDdst+ylv1ACgj7srYyvumehX1zfRVzEWNuez
 chQAoKHnSpDMME/WmhQQRxzQ5pfd1Pni
 =uGHg
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the new patches for the driver core / sysfs for 4.3-rc1.

  Very small number of changes here, all the details are in the
  shortlog, nothing major happening at all this kernel release, which is
  nice to see"

* tag 'driver-core-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  bus: subsys: update return type of ->remove_dev() to void
  driver core: correct device's shutdown order
  driver core: fix docbook for device_private.device
  selftests: firmware: skip timeout checks for kernels without user mode helper
  kernel, cpu: Remove bogus __ref annotations
  cpu: Remove bogus __ref annotation of cpu_subsys_online()
  firmware: fix wrong memory deallocation in fw_add_devm_name()
  sysfs.txt: update show method notes about sprintf/snprintf/scnprintf usage
  devres: fix devres_get()
2015-08-31 08:47:40 -07:00
Linus Torvalds 1c00038c76 Char/Misc driver patches for 4.3-rc1
Here's the "big" char/misc driver update for 4.3-rc1.
 
 Not much really interesting here, just a number of little changes all
 over the place, and some nice consolidation of the nvmem drivers to a
 common framework.  As usual, the mei drivers stand out as the largest
 "churn" to handle new devices and features in their hardware.
 
 All have been in linux-next for a while with no issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlXV844ACgkQMUfUDdst+ymYfQCgmDKjq3fsVHCxNZPxnukFYzvb
 xZkAnRb8fuub5gVQFP29A+rhyiuWD13v
 =Bq9K
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver patches from Greg KH:
 "Here's the "big" char/misc driver update for 4.3-rc1.

  Not much really interesting here, just a number of little changes all
  over the place, and some nice consolidation of the nvmem drivers to a
  common framework.  As usual, the mei drivers stand out as the largest
  "churn" to handle new devices and features in their hardware.

  All have been in linux-next for a while with no issues"

* tag 'char-misc-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits)
  auxdisplay: ks0108: initialize local parport variable
  extcon: palmas: Fix build break due to devm_gpiod_get_optional API change
  extcon: palmas: Support GPIO based USB ID detection
  extcon: Fix signedness bugs about break error handling
  extcon: Drop owner assignment from i2c_driver
  extcon: arizona: Simplify pdata symantics for micd_dbtime
  extcon: arizona: Declare 3-pole jack if we detect open circuit on mic
  extcon: Add exception handling to prevent the NULL pointer access
  extcon: arizona: Ensure variables are set for headphone detection
  extcon: arizona: Use gpiod inteface to handle micd_pol_gpio gpio
  extcon: arizona: Add basic microphone detection DT/ACPI bindings
  extcon: arizona: Update to use the new device properties API
  extcon: palmas: Remove the mutually_exclusive array
  extcon: Remove optional print_state() function pointer of struct extcon_dev
  extcon: Remove duplicate header file in extcon.h
  extcon: max77843: Clear IRQ bits state before request IRQ
  toshiba laptop: replace ioremap_cache with ioremap
  misc: eeprom: max6875: clean up max6875_read()
  misc: eeprom: clean up eeprom_read()
  misc: eeprom: 93xx46: clean up eeprom_93xx46_bin_read/write
  ...
2015-08-31 08:34:13 -07:00
Linus Torvalds 44e98edcd1 A very small release for x86 and s390 KVM.
s390: timekeeping changes, cleanups and fixes
 
 x86: support for Hyper-V MSRs to report crashes, and a bunch of cleanups.
 
 One interesting feature that was planned for 4.3 (emulating the local
 APIC in kernel while keeping the IOAPIC and 8254 in userspace) had to
 be delayed because Intel complained about my reading of the manual.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJVznW4AAoJEL/70l94x66Dt+gH/3vydhh6kv+mKhnR+kADaGfM
 gaunw0CUpJLU6gkOkYOm5M32WGhsT9Hd3WtRTJO6PhSo7cQ88hMx24u4XAffoewo
 Os5tDwAaHeV2enVSTri6xX8e2F2mgPDghGcYJPUBwnmMjRzZ8tj2VHUcbxqVT6Pb
 pX3V8ZxOZ81+ACZU2tdNRzLUd2H1v4d74gtVS7ove1Vb0CvPOBdHf1KQuUCUa2Pi
 73fvnaEuSaFYtSWZIP1PYxLnsQHpApH3Kco/5kHeqUPpYaGa/g2bnfncHRw20Svr
 gb3opwbfyiq91xfGbRVR3+E63Cw4G6aTl5MDNv9UFJ+xFKuj8WJ72xXXTSwzUi4=
 =HgT+
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.3-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "A very small release for x86 and s390 KVM.

   - s390: timekeeping changes, cleanups and fixes

   - x86: support for Hyper-V MSRs to report crashes, and a bunch of
     cleanups.

  One interesting feature that was planned for 4.3 (emulating the local
  APIC in kernel while keeping the IOAPIC and 8254 in userspace) had to
  be delayed because Intel complained about my reading of the manual"

* tag 'kvm-4.3-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
  x86/kvm: Rename VMX's segment access rights defines
  KVM: x86/vPMU: Fix unnecessary signed extension for AMD PERFCTRn
  kvm: x86: Fix error handling in the function kvm_lapic_sync_from_vapic
  KVM: s390: Fix assumption that kvm_set_irq_routing is always run successfully
  KVM: VMX: drop ept misconfig check
  KVM: MMU: fully check zero bits for sptes
  KVM: MMU: introduce is_shadow_zero_bits_set()
  KVM: MMU: introduce the framework to check zero bits on sptes
  KVM: MMU: split reset_rsvds_bits_mask_ept
  KVM: MMU: split reset_rsvds_bits_mask
  KVM: MMU: introduce rsvd_bits_validate
  KVM: MMU: move FNAME(is_rsvd_bits_set) to mmu.c
  KVM: MMU: fix validation of mmio page fault
  KVM: MTRR: Use default type for non-MTRR-covered gfn before WARN_ON
  KVM: s390: host STP toleration for VMs
  KVM: x86: clean/fix memory barriers in irqchip_in_kernel
  KVM: document memory barriers for kvm->vcpus/kvm->online_vcpus
  KVM: x86: remove unnecessary memory barriers for shared MSRs
  KVM: move code related to KVM_SET_BOOT_CPU_ID to x86
  KVM: s390: log capability enablement and vm attribute changes
  ...
2015-08-31 08:27:44 -07:00
Ingo Molnar 02b643b643 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-31 10:25:26 +02:00
Thomas Gleixner a47d4576cd x86/irq: Do not dereference irq descriptor before checking it
Having the IS_NULL_OR_ERR() check after dereferencing the pointer is
not really working well.

Move the dereference after the check.

Fixes: a782a7e46b 'x86/irq: Store irq descriptor in vector array'
Reported-and-tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-08-28 10:30:15 +02:00
Luis R. Rodriguez 2baa891e42 x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del()
The effort to replace mtrr_add() with architecture agnostic
arch_phys_wc_add() is complete, this will ensure write-combining
implementations (PAT on x86) is taken advantage instead of using
MTRR. With the effort done now, hide direct MTRR access for
drivers.

The legacy user-space /proc/mtrr ABI is not affected.

Update x86 documentation on MTRR to reflect the completion of
the phasing out of direct access to MTRR, also add a note on
platform firmware code use of MTRRs based on the obituary
discussion of MTRRs on Linux [0].

  [0] http://lkml.kernel.org/r/1438991330.3109.196.camel@hp.com

Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Cc: <syrjala@sci.fi>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Walls <awalls@md.metrocast.net>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Doug Ledford <dledford@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Ville Syrjälä <syrjala@sci.fi>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: airlied@linux.ie
Cc: benh@kernel.crashing.org
Cc: bhelgaas@google.com
Cc: dan.j.williams@intel.com
Cc: konrad.wilk@oracle.com
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: mst@redhat.com
Cc: netdev@vger.kernel.org
Cc: vinod.koul@intel.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1440443613-13696-12-git-send-email-mcgrof@do-not-panic.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-28 10:09:28 +02:00
Huang Rui 7e01ebffff x86/asm: Drop repeated macro of X86_EFLAGS_AC definition
We just need one macro of X86_EFLAGS_AC_BIT and X86_EFLAGS_AC.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Li <tony.li@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1440669844-21535-1-git-send-email-ray.huang@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-28 07:33:34 +02:00
David S. Miller 0d36938bb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-27 21:45:31 -07:00
Dan Williams 96601adb74 x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
Given that a write-back (WB) mapping plus non-temporal stores is
expected to be the most efficient way to access PMEM, update the
definition of ARCH_HAS_PMEM_API to imply arch support for
WB-mapped-PMEM.  This is needed as a pre-requisite for adding PMEM to
the direct map and mapping it with struct page.

The above clarification for X86_64 means that memcpy_to_pmem() is
permitted to use the non-temporal arch_memcpy_to_pmem() rather than
needlessly fall back to default_memcpy_to_pmem() when the pcommit
instruction is not available.  When arch_memcpy_to_pmem() is not
guaranteed to flush writes out of cache, i.e. on older X86_32
implementations where non-temporal stores may just dirty cache,
ARCH_HAS_PMEM_API is simply disabled.

The default fall back for persistent memory handling remains.  Namely,
map it with the WT (write-through) cache-type and hope for the best.

arch_has_pmem_api() is updated to only indicate whether the arch
provides the proper helpers to meet the minimum "writes are visible
outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()".  Code
that cares whether wmb_pmem() actually flushes writes to pmem must now
call arch_has_wmb_pmem() directly.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
[hch: set ARCH_HAS_PMEM_API=n on x86_32]
Reviewed-by: Christoph Hellwig <hch@lst.de>
[toshi: x86_32 compile fixes]
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:40:59 -04:00
Dan Williams 033fbae988 mm: ZONE_DEVICE for "device memory"
While pmem is usable as a block device or via DAX mappings to userspace
there are several usage scenarios that can not target pmem due to its
lack of struct page coverage. In preparation for "hot plugging" pmem
into the vmemmap add ZONE_DEVICE as a new zone to tag these pages
separately from the ones that are subject to standard page allocations.
Importantly "device memory" can be removed at will by userspace
unbinding the driver of the device.

Having a separate zone prevents allocation and otherwise marks these
pages that are distinct from typical uniform memory.  Device memory has
different lifetime and performance characteristics than RAM.  However,
since we have run out of ZONES_SHIFT bits this functionality currently
depends on sacrificing ZONE_DMA.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Jerome Glisse <j.glisse@gmail.com>
[hch: various simplifications in the arch interface]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:40:58 -04:00
Dan Williams 4a9bf88a5c Merge branch 'pmem-api' into libnvdimm-for-next 2015-08-27 19:40:26 -04:00
Ross Zwisler 67a3e8fe90 nd_blk: change aperture mapping from WC to WB
This should result in a pretty sizeable performance gain for reads.  For
rough comparison I did some simple read testing using PMEM to compare
reads of write combining (WC) mappings vs write-back (WB).  This was
done on a random lab machine.

PMEM reads from a write combining mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
	100000+0 records in
	100000+0 records out
	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s

PMEM reads from a write-back mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
	1000000+0 records in
	1000000+0 records out
	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s

To be able to safely support a write-back aperture I needed to add
support for the "read flush" _DSM flag, as outlined in the DSM spec:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read.  This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM.  We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.

In order to add support for the "read flush" flag I needed to add a
generic routine to invalidate cache lines, mmio_flush_range().  This is
protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
only supported on x86.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:38:28 -04:00
Jiang Liu 5d0ddfebb9 ACPI, PCI: Penalize legacy IRQ used by ACPI SCI
Nick Meier reported a regression with HyperV that "
  After rebooting the VM, the following messages are logged in syslog
  when trying to load the tulip driver:
    tulip: Linux Tulip drivers version 1.1.15 (Feb 27, 2007)
    tulip: 0000:00:0a.0: PCI INT A: failed to register GSI
    tulip: Cannot enable tulip board #0, aborting
    tulip: probe of 0000:00:0a.0 failed with error -16
  Errors occur in 3.19.0 kernel
  Works in 3.17 kernel.
"

According to the ACPI dump file posted by Nick at
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072

The ACPI MADT table includes an interrupt source overridden entry for
ACPI SCI:
[236h 0566  1]                Subtable Type : 02 <Interrupt Source Override>
[237h 0567  1]                       Length : 0A
[238h 0568  1]                          Bus : 00
[239h 0569  1]                       Source : 09
[23Ah 0570  4]                    Interrupt : 00000009
[23Eh 0574  2]        Flags (decoded below) : 000D
                                   Polarity : 1
                               Trigger Mode : 3

And in DSDT table, we have _PRT method to define PCI interrupts, which
eventually goes to:
        Name (PRSA, ResourceTemplate ()
        {
            IRQ (Level, ActiveLow, Shared, )
                {3,4,5,7,9,10,11,12,14,15}
        })
        Name (PRSB, ResourceTemplate ()
        {
            IRQ (Level, ActiveLow, Shared, )
                {3,4,5,7,9,10,11,12,14,15}
        })
        Name (PRSC, ResourceTemplate ()
        {
            IRQ (Level, ActiveLow, Shared, )
                {3,4,5,7,9,10,11,12,14,15}
        })
        Name (PRSD, ResourceTemplate ()
        {
            IRQ (Level, ActiveLow, Shared, )
                {3,4,5,7,9,10,11,12,14,15}
        })

According to the MADT and DSDT tables, IRQ 9 may be used for:
 1) ACPI SCI in level, high mode
 2) PCI legacy IRQ in level, low mode
So there's a conflict in polarity setting for IRQ 9.

Prior to commit cd68f6bd53 ("x86, irq, acpi: Get rid of special
handling of GSI for ACPI SCI"), ACPI SCI is handled specially and
there's no check for conflicts between ACPI SCI and PCI legagy IRQ.
And it seems that the HyperV hypervisor doesn't make use of the
polarity configuration in IOAPIC entry, so it just works.

Commit cd68f6bd53 gets rid of the specially handling of ACPI SCI,
and then the pin attribute checking code discloses the conflicts
between ACPI SCI and PCI legacy IRQ on HyperV virtual machine,
and rejects the request to assign IRQ9 to PCI devices.

So penalize legacy IRQ used by ACPI SCI and mark it unusable if ACPI
SCI attributes conflict with PCI IRQ attributes.

Please refer to following links for more information:
https://bugzilla.kernel.org/show_bug.cgi?id=101301
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072

Fixes: cd68f6bd53 ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI")
Reported-and-tested-by: Nick Meier <nmeier@microsoft.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: 3.19+ <stable@vger.kernel.org> # 3.19+
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-08-27 01:12:23 +02:00
Linus Torvalds b1713b135f Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single fix for a APIC regression introduced in 4.0 which went
  undetected until now.

  I screwed up the x2apic cleanup in a subtle way.  The screwup is only
  visible on systems which have x2apic preenabled in the BIOS and need
  to disable it during boot"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Fix fallout from x2apic cleanup
2015-08-25 09:01:05 -07:00
Ingo Molnar 8d58b66ed2 Linux 4.2-rc8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV2pUkAAoJEHm+PkMAQRiGCIoH/Rb29ZjdCoZJp9OtnjAG+qRc
 bG3YuomIdib86x7xHRKKaLWBa7din7IYjuwT/X4S4duO5a1R5Lp1sRG3IlGfhT0W
 nBNbjFl4q4bOyiTPtTRTYyh4g5UQv4IuyCnCmZyCTJyVi/O6HVM9TWKUzm68P2dJ
 30LwLUcQJ+mHueGJwFBAXe2BaojEpvYCdSX6tvbrQ/8X3FrVExZXuJl4uMYNFYNK
 ZwG/v5t7tYOiAe76JGbrEuVFPZWLPEW7amHOWR0T4Ye4nWTlBgx7fENiNRlfgcvI
 CM16l/xkyrZQ3Q5jZy1qYDfdHYF++dyEDysX4w1ae/X0aaLZn7l+u5VQD6WpkQQ=
 =IF6I
 -----END PGP SIGNATURE-----

Merge tag 'v4.2-rc8' into x86/mm, before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-25 09:59:19 +02:00
Paul Gortmaker 13fe86f465 x86/mm: Make kernel/check.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

  arch/x86/Kconfig:config X86_CHECK_BIOS_CORRUPTION
  arch/x86/Kconfig:       bool "Check for low memory corruption"

...meaning that it currently is not being built as a module by
anyone.

Lets remove the couple traces of modularity so that when reading
the code there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the
non-modular case, the init ordering remains unchanged with this
commit.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1440459295-21814-4-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-25 09:48:38 +02:00
Paul Gortmaker 8f45fe441a x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular
The file pageattr.c is obj-y and it includes pageattr-test.c
based on CPA_DEBUG (a bool), meaning that no code here is
currently being built as a module by anyone.

Lets remove the couple traces of modularity so that when reading
the code there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the
non-modular case, the init ordering remains unchanged with this
commit.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1440459295-21814-3-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-25 09:48:38 +02:00
Paul Gortmaker e971aa2cba x86/platform: Make atom/pmc_atom.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

config PMC_ATOM
        def_bool y

...meaning that it currently is not being built as a module by
anyone.

Lets remove the couple traces of modularity so that when reading
the driver there is no doubt it is builtin-only.

Since module_init() translates to device_initcall() in the
non-modular case, the init ordering remains unchanged with this
commit.

We leave some tags like MODULE_AUTHOR() for documentation
purposes.

Also note that MODULE_DEVICE_TABLE() is a no-op for non-modular
code. We correct a comment that indicates the data was only used
by that macro, as it actually is used by the code directly.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1440459295-21814-2-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-25 09:47:50 +02:00
Rafael J. Wysocki 82bb70c599 Merge branch 'turbostat' of https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-tools
Pull turbostat changes for v4.3 from Len Brown.

* 'turbostat' of https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: fix typo on DRAM column in Joules-mode
  tools/power turbostat: fix parameter passing for forked command
  tools/power turbostat: dump CONFIG_TDP
  tools/power turbostat: cpu0 is no longer hard-coded, so  update output
  tools/power turbostat: update turbostat(8)
2015-08-24 23:10:02 +02:00
Dave Airlie 3732ce72b4 Linux 4.2-rc8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV2pUkAAoJEHm+PkMAQRiGCIoH/Rb29ZjdCoZJp9OtnjAG+qRc
 bG3YuomIdib86x7xHRKKaLWBa7din7IYjuwT/X4S4duO5a1R5Lp1sRG3IlGfhT0W
 nBNbjFl4q4bOyiTPtTRTYyh4g5UQv4IuyCnCmZyCTJyVi/O6HVM9TWKUzm68P2dJ
 30LwLUcQJ+mHueGJwFBAXe2BaojEpvYCdSX6tvbrQ/8X3FrVExZXuJl4uMYNFYNK
 ZwG/v5t7tYOiAe76JGbrEuVFPZWLPEW7amHOWR0T4Ye4nWTlBgx7fENiNRlfgcvI
 CM16l/xkyrZQ3Q5jZy1qYDfdHYF++dyEDysX4w1ae/X0aaLZn7l+u5VQD6WpkQQ=
 =IF6I
 -----END PGP SIGNATURE-----

Merge tag 'v4.2-rc8' into drm-next

Linux 4.2-rc8

Backmerge required for Intel so they can fix their -next tree up properly.
2015-08-24 16:36:42 +10:00
Andy Lutomirski 47edb65178 x86/asm/msr: Make wrmsrl() a function
As of cf991de2f6 ("x86/asm/msr: Make wrmsrl_safe() a
function"), wrmsrl_safe is a function, but wrmsrl is still a
macro.  The wrmsrl macro performs invalid shifts if the value
argument is 32 bits. This makes it unnecessarily awkward to
write code that puts an unsigned long into an MSR.

To make this work, syscall_init needs tweaking to stop passing
a function pointer to wrmsrl.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Willy Tarreau <w@1wt.eu>
Link: http://lkml.kernel.org/r/690f0c629a1085d054e2d1ef3da073cfb3f7db92.1437678821.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-23 13:25:38 +02:00
Linus Torvalds d0b89bd548 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Various low level fixes: fix more fallout from the FPU rework and the
  asm entry code rework, plus an MSI rework fix, and an idle-tracing fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/math-emu: Fix crash in fork()
  x86/fpu/math-emu: Fix math-emu boot crash
  x86/idle: Restore trace_cpu_idle to mwait_idle() calls
  x86/irq: Build correct vector mapping for multiple MSI interrupts
  Revert "sched/x86_64: Don't save flags on context switch"
2015-08-22 08:15:36 -07:00
Thomas Gleixner a57e456a7b x86/apic: Fix fallout from x2apic cleanup
In the recent x2apic cleanup I got two things really wrong:
1) The safety check in __disable_x2apic which allows the function to
   be called unconditionally is backwards. The check is there to
   prevent access to the apic MSR in case that the machine has no
   apic. Though right now it returns if the machine has an apic and
   therefor the disabling of x2apic is never invoked.

2) x2apic_disable() sets x2apic_mode to 0 after registering the local
   apic. That's wrong, because register_lapic_address() checks x2apic
   mode and therefor takes the wrong code path.

This results in boot failures on machines with x2apic preenabled by
BIOS and can also lead to an fatal MSR access on machines without
apic.

The solutions are simple:
1) Correct the sanity check for apic availability
2) Clear x2apic_mode _before_ calling register_lapic_address()

Fixes: 659006bf3a 'x86/x2apic: Split enable and setup function'
Reported-and-tested-by: Javier Monteagudo <javiermon@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1224764
Cc: stable@vger.kernel.org # 4.0+
Cc: Laura Abbott <labbott@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
2015-08-22 17:01:48 +02:00
Andrey Ryabinin 69786cdb37 x86/kasan, mm: Introduce generic kasan_populate_zero_shadow()
Introduce generic kasan_populate_zero_shadow(shadow_start,
shadow_end). This function maps kasan_zero_page to the
[shadow_start, shadow_end] addresses.

This replaces x86_64 specific populate_zero_shadow() and will
be used for ARM64 in follow on patches.

The main changes from original version are:

 * Use p?d_populate*() instead of set_p?d()
 * Use memblock allocator directly instead of vmemmap_alloc_block()
 * __pa() instead of __pa_nodebug(). __pa() causes troubles
   iff we use it before kasan_early_init(). kasan_populate_zero_shadow()
   will be used later, so we ok with __pa() here.

Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexey Klimov <klimov.linux@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Keitel <dkeitel@codeaurora.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yury <yury.norov@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1439444244-26057-3-git-send-email-ryabinin.a.a@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 14:54:55 +02:00
Andrey Ryabinin 920e277e17 x86/kasan: Define KASAN_SHADOW_OFFSET per architecture
Current definition of  KASAN_SHADOW_OFFSET in
include/linux/kasan.h will not work for upcomming arm64, so move
it to the arch header.

Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexey Klimov <klimov.linux@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Keitel <dkeitel@codeaurora.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yury <yury.norov@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1439444244-26057-2-git-send-email-ryabinin.a.a@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 14:54:55 +02:00
Huang Rui b466bdb614 x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer
MWAITX can enable a timer and a corresponding timer value
specified in SW P0 clocks. The SW P0 frequency is the same as
TSC. The timer provides an upper bound on how long the
instruction waits before exiting.

This way, a delay function in the kernel can leverage that
MWAITX timer of MWAITX.

When a CPU core executes MWAITX, it will be quiesced in a
waiting phase, diminishing its power consumption. This way, we
can save power in comparison to our default TSC-based delays.

A simple test shows that:

	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc
	$ sleep 10000s
	$ cat /sys/bus/pci/devices/0000\:00\:18.4/hwmon/hwmon0/power1_acc

Results:

	* TSC-based default delay:      485115 uWatts average power
	* MWAITX-based delay:           252738 uWatts average power

Thus, that's about 240 milliWatts less power consumption. The
test method relies on the support of AMD CPU accumulated power
algorithm in fam15h_power for which patches are forthcoming.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Suggested-by: Borislav Petkov <bp@suse.de>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Huang Rui <ray.huang@amd.com>
[ Fix delay truncation. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Li <tony.li@amd.com>
Link: http://lkml.kernel.org/r/1438744732-1459-3-git-send-email-ray.huang@amd.com
Link: http://lkml.kernel.org/r/1439201994-28067-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 14:52:16 +02:00
Huang Rui f96756746c x86/asm: Add MONITORX/MWAITX instruction support
AMD Carrizo processors (Family 15h, Models 60h-6fh) added a new
feature called MWAITX (MWAIT with extensions) as an extension to
MONITOR/MWAIT.

This new instruction controls a configurable timer which causes
the core to exit wait state on timer expiration, in addition to
"normal" MWAIT condition of reading from a monitored VA.

Compared to MONITOR/MWAIT, there are minor differences in opcode
and input parameters:

MWAITX ECX[1]: enable timer if set
MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks ==
TSC. The software P0 frequency is the same as the TSC frequency.

                MWAIT                           MWAITX
opcode          0f 01 c9           |            0f 01 fb
ECX[0]                  value of RFLAGS.IF seen by instruction
ECX[1]          unused/#GP if set  |            enable timer if set
ECX[31:2]                     unused/#GP if set
EAX                           unused (reserve for hint)
EBX[31:0]       unused             |            max wait time (SW P0 == TSC)

                MONITOR                         MONITORX
opcode          0f 01 c8           |            0f 01 fa
EAX                     (logical) address to monitor
ECX                     #GP if not zero

Max timeout = EBX/(TSC frequency)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andreas Herrmann <herrmann.der.user@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dirk Brandewie <dirk.j.brandewie@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Li <tony.li@amd.com>
Link: http://lkml.kernel.org/r/1439201994-28067-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 14:52:16 +02:00
Tim Chen 488ca7d72d x86/cpufeatures: Enable cpuid for Intel SHA extensions
Add Intel CPUID for Intel Secure Hash Algorithm Extensions. This feature
provides new instructions for accelerated computation of SHA-1 and SHA-256.
This allows the feature to be shown in the /proc/cpuinfo for cpus that
support it.

Refer to SHA extension programming guide in chapter 8.2 of the Intel
Architecture Instruction Set Extensions Programming reference
for definition of this feature's cpuid: CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29] = 1
https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf

Originally-by: Chandramouli Narayanan <mouli_7982@yahoo.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Link: http://lkml.kernel.org/r/1440194206.3940.6.camel@schen9-mobl2
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-08-22 11:17:31 +02:00
Andy Lutomirski f0a97af83f x86/traps: Weaken context tracking entry assertions
We were asserting that we were all the way in CONTEXT_KERNEL
when exception handlers were called.  While having this be true
is, I think, a nice goal (or maybe a variant in which we assert
that we're in CONTEXT_KERNEL or some new IRQ context), we're not
quite there.

In particular, if an IRQ interrupts the SYSCALL prologue and the
IRQ handler in turn causes an exception, the exception entry
will be called in RCU IRQ mode but with CONTEXT_USER.

This is okay (nothing goes wrong), but until we fix up the
SYSCALL prologue, we need to avoid warning.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c81faf3916346c0e04346c441392974f49cd7184.1440133286.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 11:12:10 +02:00
Ingo Molnar 827409b2f5 x86/fpu/math-emu: Fix crash in fork()
During later stages of math-emu bootup the following crash triggers:

	 math_emulate: 0060:c100d0a8
	 Kernel panic - not syncing: Math emulation needed in kernel
	 CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012
	 [...]
	 Call Trace:
	  [<c181d50d>] dump_stack+0x41/0x52
	  [<c181c918>] panic+0x77/0x189
	  [<c1003530>] ? math_error+0x140/0x140
	  [<c164c2d7>] math_emulate+0xba7/0xbd0
	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
	  [<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870
	  [<c136ac20>] ? proc_clear_tty+0x40/0x70
	  [<c136ac6e>] ? session_clear_tty+0x1e/0x30
	  [<c1003530>] ? math_error+0x140/0x140
	  [<c1003575>] do_device_not_available+0x45/0x70
	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
	  [<c18258e6>] error_code+0x5a/0x60
	  [<c1003530>] ? math_error+0x140/0x140
	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
	  [<c100c205>] arch_dup_task_struct+0x25/0x30
	  [<c1048cea>] copy_process.part.51+0xea/0x1480
	  [<c115a8e5>] ? dput+0x175/0x200
	  [<c136af70>] ? no_tty+0x30/0x30
	  [<c1157242>] ? do_vfs_ioctl+0x322/0x540
	  [<c104a21a>] _do_fork+0xca/0x340
	  [<c1057b06>] ? SyS_rt_sigaction+0x66/0x90
	  [<c104a557>] SyS_clone+0x27/0x30
	  [<c1824a80>] sysenter_do_call+0x12/0x12

The reason is the incorrect assumption in fpu_copy(), that FNSAVE
can be executed from math-emu kernels as well.

Don't try to copy the registers, the soft state will be copied
by fork anyway, so the child task inherits the parent task's
soft math state.

With this fix applied math-emu kernels boot up fine on modern
hardware and the 'no387 nofxsr' boot options.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 10:23:03 +02:00
Ingo Molnar 5fc960380e x86/fpu/math-emu: Fix math-emu boot crash
On a math-emu bootup the following crash occurs:

	Initializing CPU#0
	------------[ cut here ]------------
	kernel BUG at arch/x86/kernel/traps.c:779!
	invalid opcode: 0000 [#1] SMP
	[...]
	EIP is at do_device_not_available+0xe/0x70
	[...]
	Call Trace:
	 [<c18238e6>] error_code+0x5a/0x60
	 [<c1002bd0>] ? math_error+0x140/0x140
	 [<c100bbd9>] ? fpu__init_cpu+0x59/0xa0
	 [<c1012322>] cpu_init+0x202/0x330
	 [<c104509f>] ? __native_set_fixmap+0x1f/0x30
	 [<c1b56ab0>] trap_init+0x305/0x346
	 [<c1b548af>] start_kernel+0x1a5/0x35d
	 [<c1b542b4>] i386_start_kernel+0x82/0x86

The reason is that in the following commit:

  b1276c48e9 ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()")

I failed to consider math-emu's limitation that it cannot execute the
FNINIT instruction in kernel mode.

The long term fix might be to allow math-emu to execute (certain) kernel
mode FPU instructions, but for now apply the safe (albeit somewhat ugly)
fix: initialize the emulation state explicitly without trapping out to
the FPU emulator.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 10:02:04 +02:00
David S. Miller dc25b25897 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/qmi_wwan.c

Overlapping additions of new device IDs to qmi_wwan.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-21 11:44:04 -07:00
Vitaly Kuznetsov 88c9281a9f x86/hyperv: Mark the Hyper-V TSC as unstable
The Hyper-V top-level functional specification states, that
"algorithms should be resilient to sudden jumps forward or
backward in the TSC value", this means that we should consider
TSC as unstable. In some cases tsc tests are able to detect the
instability, it was detected in 543 out of 646 boots in my
testing:

 Measured 6277 cycles TSC warp between CPUs, turning off TSC clock.
 tsc: Marking TSC unstable due to check_tsc_sync_source failed

This is, however, just a heuristic. On Hyper-V platform there
are two good clocksources: MSR-based hyperv_clocksource and
recently introduced TSC page.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@linuxdriverproject.org
Link: http://lkml.kernel.org/r/1440003264-9949-1-git-send-email-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-21 08:44:38 +02:00
Ingo Molnar 99770737ca x86/asm/tsc: Add rdtscll() merge helper
Some in-flight code makes use of the old rdtscll() (now removed), provide a wrapper
for a kernel cycle to smooth the transition to rdtsc().

( We use the safest variant, rdtsc_ordered(), which has barriers - this adds another
  incentive to remove the wrapper in the future. )

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/dddbf98a2af53312e9aa73a5a2b1622fe5d6f52b.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-21 08:35:42 +02:00
Ingo Molnar 82819ffb42 perf/x86/msr: Fix the MSR driver build
The new MSR PMU driver made use of rdtsc() which does not exist (yet) in
this tree:

  arch/x86/kernel/cpu/perf_event_msr.c:91:3: error: implicit declaration of function 'rdtsc'

Use the old rdtscll() primitive for now.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-21 08:17:01 +02:00
Jisheng Zhang e43d0189ac x86/idle: Restore trace_cpu_idle to mwait_idle() calls
Commit b253149b84 ("sched/idle/x86: Restore mwait_idle() to fix boot
hangs, to improve power savings and to improve performance") restores
mwait_idle(), but the trace_cpu_idle related calls are missing. This
causes powertop on my old desktop powered by Intel Core2 E6550 to
report zero wakeups and zero events.

Add them back to restore the proper behaviour.

Fixes: b253149b84 ("sched/idle/x86: Restore mwait_idle() to ...")
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Cc: <len.brown@intel.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1440046479-4262-1-git-send-email-jszhang@marvell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-08-20 21:37:45 +02:00
Toshi Kani d5dc861bd6 x86/mm/pat: Add comments to cachemode translation tables
Add comments to the cachemode translation tables to clarify that
the default values are set as minimal supported mode, which are
necessary to handle WC and WT fallback to UC- when they are not
enabled.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1437588371-28223-1-git-send-email-toshi.kani@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-08-20 21:26:12 +02:00
Linus Torvalds 3d3e66ba2c xen: build fix for 4.2-rc7
- Fix i386 build with an (uncommon) configuration
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV1bZtAAoJEFxbo/MsZsTR3sYH/2Q+wabqeFZotSZsJjYjSh+q
 6hCRB/tD+LbmReYuFBsqStHUDEL0Ljh9kw6YQvUrEVLv6CIH/pCVhj2U+/INlUur
 aScKQe1ttKaMzEAB2opLQnYMw5Q/C/pHAtq88MYMWnYBb9fM/puMyI0iXu8FhoOP
 +QYdaDt7+hRfID3PWZ7JxLGu+AqVgis5OAh/rt/Y4aC/WaNF7ifrE4qIJlgaR9x4
 IDglRBc4cPhLjwb2yhykiRHREhydVvRqEPsgji20T7pXVduj5DEqqVpU1XMqKBNU
 0EmZ5wLnELvWxPWv3zCScAtvmH30+i4QQcGB/3igJCeYN0gBxzeoGbTKWf4P0HM=
 =lRue
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen build fix from David Vrabel:
 "Fix i386 build with an (uncommon) configuration"

* tag 'for-linus-4.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: make CONFIG_XEN depend on CONFIG_X86_LOCAL_APIC
2015-08-20 12:21:26 -07:00
Ross Zwisler 5de490daec pmem: add copy_from_iter_pmem() and clear_pmem()
Add support for two new PMEM APIs, copy_from_iter_pmem() and
clear_pmem().  copy_from_iter_pmem() is used to copy data from an
iterator into a PMEM buffer.  clear_pmem() zeros a PMEM memory range.

Both of these new APIs must be explicitly ordered using a wmb_pmem()
function call and are implemented in such a way that the wmb_pmem()
will make the stores to PMEM durable.  Because both APIs are unordered
they can be called as needed without introducing any unwanted memory
barriers.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-20 14:07:23 -04:00
Ross Zwisler 4a370df553 pmem, x86: clean up conditional pmem includes
Prior to this change x86_64 used the pmem defines in
arch/x86/include/asm/pmem.h, and UM used the default ones at the
top of include/linux/pmem.h.  The inclusion or exclusion in linux/pmem.h
was controlled by CONFIG_ARCH_HAS_PMEM_API, but the ones in asm/pmem.h
were controlled by ARCH_HAS_NOCACHE_UACCESS.

Instead, control them both with CONFIG_ARCH_HAS_PMEM_API so that it's
clear that they are related and we don't run into the possibility where
they are both included or excluded.  Also remove a bunch of stale
function prototypes meant for UM in asm/pmem.h - these just conflicted
with the inline defaults in linux/pmem.h and gave compile errors.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-20 14:07:23 -04:00
Ross Zwisler 18279b467a pmem: remove layer when calling arch_has_wmb_pmem()
Prior to this change arch_has_wmb_pmem() was only called by
arch_has_pmem_api().  Both arch_has_wmb_pmem() and arch_has_pmem_api()
checked to make sure that CONFIG_ARCH_HAS_PMEM_API was enabled.

Instead, remove the old arch_has_wmb_pmem() wrapper to be rid of one
extra layer of indirection and the redundant CONFIG_ARCH_HAS_PMEM_API
check. Rename __arch_has_wmb_pmem() to arch_has_wmb_pmem() since we no
longer have a wrapper, and just have arch_has_pmem_api() call the
architecture specific arch_has_wmb_pmem() directly.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-20 14:07:23 -04:00
Ross Zwisler 4060352656 pmem, x86: move x86 PMEM API to new pmem.h header
Move the x86 PMEM API implementation out of asm/cacheflush.h and into
its own header asm/pmem.h.  This will allow members of the PMEM API to
be more easily identified on this and other architectures.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-20 14:07:23 -04:00
Boris Ostrovsky 3375d8284d xen/x86: Don't try to set PCE bit in CR4
Since VPMU code emulates RDPMC instruction with RDMSR and because hypervisor
does not emulate it there is no reason to try setting CR4's PCE bit (and the
hypervisor will warn on seeing it set).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:25:26 +01:00
Boris Ostrovsky bf6dfb154d xen/PMU: PMU emulation code
Add PMU emulation code that runs when we are processing a PMU interrupt.
This code will allow us not to trap to hypervisor on each MSR/LVTPC access
(of which there may be quite a few in the handler).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:25:26 +01:00
Boris Ostrovsky 6b08cd6328 xen/PMU: Intercept PMU-related MSR and APIC accesses
Provide interfaces for recognizing accesses to PMU-related MSRs and
LVTPC APIC and process these accesses in Xen PMU code.

(The interrupt handler performs XENPMU_flush right away in the beginning
since no PMU emulation is available. It will be added with a later patch).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:25:25 +01:00
Boris Ostrovsky e27b72df01 xen/PMU: Describe vendor-specific PMU registers
AMD and Intel PMU register initialization and helpers that determine
whether a register belongs to PMU.

This and some of subsequent PMU emulation code is somewhat similar to
Xen's PMU implementation.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:25:25 +01:00
Boris Ostrovsky 65d0cf0be7 xen/PMU: Initialization code for Xen PMU
Map shared data structure that will hold CPU registers, VPMU context,
V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills
this information in its handler and passes it to the guest for further
processing.

Set up PMU VIRQ.

Now that perf infrastructure will assume that PMU is available on a PV
guest we need to be careful and make sure that accesses via RDPMC
instruction don't cause fatal traps by the hypervisor. Provide a nop
RDPMC handler.

For the same reason avoid issuing a warning on a write to APIC's LVTPC.

Both of these will be made functional in later patches.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:25:20 +01:00
Boris Ostrovsky 5f14154882 xen/PMU: Sysfs interface for setting Xen PMU mode
Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:26 +01:00
Juergen Gross cb3eb85013 xen: remove no longer needed p2m.h
Cleanup by removing arch/x86/xen/p2m.h as it isn't needed any more.

Most definitions in this file are used in p2m.c only. Move those into
p2m.c.

set_phys_range_identity() is already declared in
arch/x86/include/asm/xen/page.h, add __init annotation there.

MAX_REMAP_RANGES isn't used at all, just delete it.

The only define left is P2M_PER_PAGE which is moved to page.h as well.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:25 +01:00
Juergen Gross c70727a5bc xen: allow more than 512 GB of RAM for 64 bit pv-domains
64 bit pv-domains under Xen are limited to 512 GB of RAM today. The
main reason has been the 3 level p2m tree, which was replaced by the
virtual mapped linear p2m list. Parallel to the p2m list which is
being used by the kernel itself there is a 3 level mfn tree for usage
by the Xen tools and eventually for crash dump analysis. For this tree
the linear p2m list can serve as a replacement, too. As the kernel
can't know whether the tools are capable of dealing with the p2m list
instead of the mfn tree, the limit of 512 GB can't be dropped in all
cases.

This patch replaces the hard limit by a kernel parameter which tells
the kernel to obey the 512 GB limit or not. The default is selected by
a configuration parameter which specifies whether the 512 GB limit
should be active per default for domUs (domain save/restore/migration
and crash dump analysis are affected).

Memory above the domain limit is returned to the hypervisor instead of
being identity mapped, which was wrong anyway.

The kernel configuration parameter to specify the maximum size of a
domain can be deleted, as it is not relevant any more.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:24 +01:00
Juergen Gross 70e6119955 xen: move p2m list if conflicting with e820 map
Check whether the hypervisor supplied p2m list is placed at a location
which is conflicting with the target E820 map. If this is the case
relocate it to a new area unused up to now and compliant to the E820
map.

As the p2m list might by huge (up to several GB) and is required to be
mapped virtually, set up a temporary mapping for the copied list.

For pvh domains just delete the p2m related information from start
info instead of reserving the p2m memory, as we don't need it at all.

For 32 bit kernels adjust the memblock_reserve() parameters in order
to cover the page tables only. This requires to memblock_reserve() the
start_info page on it's own.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:24 +01:00
Juergen Gross 6c2681c863 xen: add explicit memblock_reserve() calls for special pages
Some special pages containing interfaces to xen are being reserved
implicitly only today. The memblock_reserve() call to reserve them is
meant to reserve the p2m list supplied by xen. It is just reserving
not only the p2m list itself, but some more pages up to the start of
the xen built page tables.

To be able to move the p2m list to another pfn range, which is needed
for support of huge RAM, this memblock_reserve() must be split up to
cover all affected reserved pages explicitly.

The affected pages are:
- start_info page
- xenstore ring (might be missing, mfn is 0 in this case)
- console ring (not for initial domain)

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:23 +01:00
Juergen Gross 4b9c15377f xen: check for initrd conflicting with e820 map
Check whether the initrd is placed at a location which is conflicting
with the target E820 map. If this is the case relocate it to a new
area unused up to now and compliant to the E820 map.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:22 +01:00
Juergen Gross 04414baab5 xen: check pre-allocated page tables for conflict with memory map
Check whether the page tables built by the domain builder are at
memory addresses which are in conflict with the target memory map.
If this is the case just panic instead of running into problems
later.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:21 +01:00
Juergen Gross 808fdb7193 xen: check for kernel memory conflicting with memory layout
Checks whether the pre-allocated memory of the loaded kernel is in
conflict with the target memory map. If this is the case, just panic
instead of run into problems later, as there is nothing we can do
to repair this situation.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:21 +01:00
Juergen Gross 9ddac5b724 xen: find unused contiguous memory area
For being able to relocate pre-allocated data areas like initrd or
p2m list it is mandatory to find a contiguous memory area which is
not yet in use and doesn't conflict with the memory map we want to
be in effect.

In case such an area is found reserve it at once as this will be
required to be done in any case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:20 +01:00
Juergen Gross e612b4a7db xen: check memory area against e820 map
Provide a service routine to check a physical memory area against the
E820 map. The routine will return false if the complete area is RAM
according to the E820 map and true otherwise.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:20 +01:00
Juergen Gross 5097cdf6ce xen: split counting of extra memory pages from remapping
Memory pages in the initial memory setup done by the Xen hypervisor
conflicting with the target E820 map are remapped. In order to do this
those pages are counted and remapped in xen_set_identity_and_remap().

Split the counting from the remapping operation to be able to setup
the needed memory sizes in time but doing the remap operation at a
later time. This enables us to simplify the interface to
xen_set_identity_and_remap() as the number of remapped and released
pages is no longer needed here.

Finally move the remapping further down to prepare relocating
conflicting memory contents before the memory might be clobbered by
xen_set_identity_and_remap(). This requires to not destroy the Xen
E820 map when the one for the system is being constructed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:19 +01:00
Juergen Gross 69632ecfcd xen: move static e820 map to global scope
Instead of using a function local static e820 map in xen_memory_setup()
and calling various functions in the same source with the map as a
parameter use a map directly accessible by all functions in the source.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:18 +01:00
Juergen Gross 8f5b0c6398 xen: eliminate scalability issues from initial mapping setup
Direct Xen to place the initial P->M table outside of the initial
mapping, as otherwise the 1G (implementation) / 2G (theoretical)
restriction on the size of the initial mapping limits the amount
of memory a domain can be handed initially.

As the initial P->M table is copied rather early during boot to
domain private memory and it's initial virtual mapping is dropped,
the easiest way to avoid virtual address conflicts with other
addresses in the kernel is to use a user address area for the
virtual address of the initial P->M table. This allows us to just
throw away the page tables of the initial mapping after the copy
without having to care about address invalidation.

It should be noted that this patch won't enable a pv-domain to USE
more than 512 GB of RAM. It just enables it to be started with a
P->M table covering more memory. This is especially important for
being able to boot a Dom0 on a system with more than 512 GB memory.

Signed-off-by: Juergen Gross <jgross@suse.com>
Based-on-patch-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:18 +01:00
Juergen Gross d51e8b3e85 xen: don't build mfn tree if tools don't need it
In case the Xen tools indicate they don't need the p2m 3 level tree
as they support the virtual mapped linear p2m list, just omit building
the tree.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:17 +01:00
Juergen Gross 4b9c9a1180 xen: save linear p2m list address in shared info structure
The virtual address of the linear p2m list should be stored in the
shared info structure read by the Xen tools to be able to support
64 bit pv-domains larger than 512 GB. Additionally the linear p2m
list interface includes a generation count which is changed prior
to and after each mapping change of the p2m list. Reading the
generation count the Xen tools can detect changes of the mappings
and re-read the p2m list eventually.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:17 +01:00
Juergen Gross 17fb46b119 xen: sync with xen headers
Use the newest headers from the xen tree to get some new structure
layouts.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:16 +01:00
Julien Grall 4a5b69464e xen/events: Support event channel rebind on ARM
Currently, the event channel rebind code is gated with the presence of
the vector callback.

The virtual interrupt controller on ARM has the concept of per-CPU
interrupt (PPI) which allow us to support per-VCPU event channel.
Therefore there is no need of vector callback for ARM.

Xen is already using a free PPI to notify the guest VCPU of an event.
Furthermore, the xen code initialization in Linux (see
arch/arm/xen/enlighten.c) is requesting correctly a per-CPU IRQ.

Introduce new helper xen_support_evtchn_rebind to allow architecture
decide whether rebind an event is support or not. It will always return
true on ARM and keep the same behavior on x86.

This is also allow us to drop the usage of xen_have_vector_callback
entirely in the ARM code.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:15 +01:00
Colin Ian King 772f95e3b9 x86/xen: fix non-ANSI declaration of xen_has_pv_devices()
xen_has_pv_devices() has no parameters, so use the normal void
parameter convention to make it match the prototype in the header file
include/xen/platform_pci.h.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20 12:24:13 +01:00
David Vrabel 87ffd2b9bb x86/xen: make CONFIG_XEN depend on CONFIG_X86_LOCAL_APIC
Since commit feb44f1f7a (x86/xen:
Provide a "Xen PV" APIC driver to support >255 VCPUs) Xen guests need
a full APIC driver and thus should depend on X86_LOCAL_APIC.

This fixes an i386 build failure with !SMP && !CONFIG_X86_UP_APIC by
disabling Xen support in this configuration.

Users needing Xen support in a non-SMP i386 kernel will need to enable
CONFIG_X86_UP_APIC.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org>
2015-08-20 11:45:43 +01:00
Ingo Molnar 40a2ea1bd9 Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding more changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-20 11:48:56 +02:00
Ingo Molnar b5be5b7fff Merge branch 'x86/asm/urgent' to pick up an entry code fix
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-19 09:05:15 +02:00
Dan Williams 7a67832c7e libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
We currently register a platform device for e820 type-12 memory and
register a nvdimm bus beneath it.  Registering the platform device
triggers the device-core machinery to probe for a driver, but that
search currently comes up empty.  Building the nvdimm-bus registration
into the e820_pmem platform device registration in this way forces
libnvdimm to be built-in.  Instead, convert the built-in portion of
CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the
rest of the logic to the driver for e820_pmem, for the following
reasons:

1/ Letting e820_pmem support be a module allows building and testing
   libnvdimm.ko changes without rebooting

2/ All the normal policy around modules can be applied to e820_pmem
   (unbind to disable and/or blacklisting the module from loading by
   default)

3/ Moving the driver to a generic location and converting it to scan
   "iomem_resource" rather than "e820.map" means any other architecture can
   take advantage of this simple nvdimm resource discovery mechanism by
   registering a resource named "Persistent Memory (legacy)"

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-19 00:34:34 -04:00
Jiang Liu 527f0a91e9 x86/irq: Build correct vector mapping for multiple MSI interrupts
Alex Deucher, Mark Rustad and Alexander Holler reported a regression
with the latest v4.2-rc4 kernel, which breaks some SATA controllers.
With multi-MSI capable SATA controllers, only the first port works,
all other ports time out when executing SATA commands.

This happens because the first argument to assign_irq_vector_policy()
is always the base linux irq number of the multi MSI interrupt block,
so all subsequent vector assignments operate on the base linux irq
number, so all MSI irqs are handled as the first irq number. Therefor
the other MSI irqs of a device are never set up correctly and never
fire.

Add the loop iterator to the base irq number so all vectors are
assigned correctly.

Fixes: b5dc8e6c21 "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
Reported-and-tested-by: Alex Deucher <alexdeucher@gmail.com>
Reported-and-tested-by: Mark Rustad <mrustad@gmail.com>
Reported-and-tested-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439911228-9880-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-08-18 18:18:55 +02:00
Ingo Molnar a5dd192496 Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes
Conflicts:
	arch/x86/entry/entry_64_compat.S
	arch/x86/math-emu/get_address.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-18 09:39:47 +02:00
Andy Lutomirski 512255a2ad Revert "sched/x86_64: Don't save flags on context switch"
This reverts commit:

  2c7577a758 ("sched/x86_64: Don't save flags on context switch")

It was a nice speedup.  It's also not quite correct: SYSENTER
enables interrupts too early.

We can re-add this optimization once the SYSENTER code is beaten
into shape, which should happen in 4.3 or 4.4.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v3.19
Link: http://lkml.kernel.org/r/85f56651f59f76624e80785a8fd3bdfdd089a818.1439838962.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-18 09:39:26 +02:00
Herbert Xu 5e4b8c1fcc crypto: aead - Remove CRYPTO_ALG_AEAD_NEW flag
This patch removes the CRYPTO_ALG_AEAD_NEW flag now that everyone
has been converted.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-08-17 16:53:53 +08:00
Len Brown 656bba3068 x86/smpboot: Remove APIC.wait_for_init_deassert and atomic init_deasserted
Both the per-APIC flag ".wait_for_init_deassert",
and the global atomic_t "init_deasserted"
are dead code -- remove them.

For all APIC types, "wait_for_master()"
prevents an AP from proceeding until the BSP has set
cpu_callout_mask, making "init_deasserted" {unnecessary}:

	BSP: <de-assert INIT>
	...
	BSP: {set init_deasserted}
	AP: wait_for_master()
		set cpu_initialized_mask
		wait for cpu_callout_mask
	BSP: test cpu_initialized_mask
	BSP: set cpu_callout_mask
	AP: test cpu_callout_mask
	AP: {wait for init_deasserted}
	...
	AP: <touch APIC>

Deleting the {dead code} above is necessary to enable
some parallelism in a future patch.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/de4b3a9bab894735e285870b5296da25ee6a8a5a.1439739165.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-17 10:42:28 +02:00
Len Brown a9bcaa02a5 x86/smpboot: Remove SIPI delays from cpu_up()
MPS 1.4 example code shows the following required delays during processor
on-lining:

	INIT
	 udelay(10,000)
	SIPI
	 udelay(200)
	SIPI
	 udelay(200) /* Linux actually implements this as udelay(300) */

Linux skips the udelay(10,000) on modern processors.
This patch removes the udelay(200) after each SIPI
on those same processors.

All three legacy delays can be restored by the cmdline
"cpu_init_udelay=10000".

As measured by analyze_suspend.py, this patch speeds
processor resume time on my desktop from 2.4ms to 1.8ms, per AP.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/a5dfdbc8fbfdd813784da204aad5677fe459ac37.1439739165.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-17 10:42:27 +02:00
Len Brown 2d99af8e8f x86/smpboot: Remove udelay(100) when polling cpu_callin_map
After the BSP sends INIT/SIPI/SIP to the AP and sees the AP
in the cpu_initialized_map, it sets the AP loose via the
cpu_callout_map, and waits for it via the cpu_callin_map.

The BSP polls the cpu_callin_map with a udelay(100)
and a schedule() in each iteration.

The udelay(100) adds no value.

For example, on my 4-CPU dekstop, the AP finishes
cpu_callin() in under 70 usec and sets the cpu_callin_mask.
The BSP, however, doesn't see that setting until over 30 usec
later, because it was still running its udelay(100)
when the AP finished.

Deleting the udelay(100) in the cpu_callin_mask polling loop,
saves from 0 to 100 usec per Application Processor.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/0aade12eabeb89a688c929fe80856eaea0544bb7.1439739165.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-17 10:42:27 +02:00
Len Brown 6e38f1e79d x86/smpboot: Remove udelay(100) when polling cpu_initialized_map
After the BSP sends the APIC INIT/SIPI/SIPI to the AP,
it waits for the AP to come up and indicate that it is alive
by setting its own bit in the cpu_initialized_mask.

Linux polls for up to 10 seconds for this to happen.
Each polling loop has a udelay(100) and a call to schedule().

The udelay(100) adds no value.

For example, on my desktop, the BSP waits for the
other 3 CPUs to come on line at boot for 305, 404, 405 usec.
For resume from S3, it waits 317, 404, 405 usec.

But when the udelay(100) is removed, the BSP waits
305, 310, 306 for boot, and 305, 307, 306 for resume.

So for both boot and resume, removing the udelay(100)
speeds online by about 100us in 2 of 3 cases.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/33ef746c67d2489cad0a9b1958cf71167232ff2b.1439739165.git.len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-17 10:42:27 +02:00
Ingo Molnar 5461bd81bf Linux 4.2-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV0R4AAAoJEHm+PkMAQRiG8xIH/AmiRd+JDrs0qqEy46p6X8Gn
 0lB5/KsGycvIGIBTiy2nZzcT0Ly6LeFUKUjzPytlOhIZPMrxMVMShDaQKCXXIMUr
 1mN6hkvpkLNnUhvL2fR6mm0zkjbz3zZEazFY+Jic8wQrtSkHgfH0DXqSAo8le0f8
 kNrd5BPPhIwvpHGaNGFdTpbgpPcalXyQk/fHyvDGidbyXzY/d7l05QfYJ6XCD4Zm
 IAy48iK5BFts2+z3aOYrOeuuCcm1qFX8YArqzE1rfPp+U/LQpfUfij4cmOqDLn/F
 qnv9E7bRRVovvrgKe4I3Trta8kT53VLJvqpdw2Usqo8zvhs4VyrYpHC+gEE6YUY=
 =9Rd4
 -----END PGP SIGNATURE-----

Merge tag 'v4.2-rc7' into x86/boot, to refresh the branch before merging new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-17 10:41:59 +02:00
Dave Airlie 4eebf60b74 Linux 4.2-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV0R4AAAoJEHm+PkMAQRiG8xIH/AmiRd+JDrs0qqEy46p6X8Gn
 0lB5/KsGycvIGIBTiy2nZzcT0Ly6LeFUKUjzPytlOhIZPMrxMVMShDaQKCXXIMUr
 1mN6hkvpkLNnUhvL2fR6mm0zkjbz3zZEazFY+Jic8wQrtSkHgfH0DXqSAo8le0f8
 kNrd5BPPhIwvpHGaNGFdTpbgpPcalXyQk/fHyvDGidbyXzY/d7l05QfYJ6XCD4Zm
 IAy48iK5BFts2+z3aOYrOeuuCcm1qFX8YArqzE1rfPp+U/LQpfUfij4cmOqDLn/F
 qnv9E7bRRVovvrgKe4I3Trta8kT53VLJvqpdw2Usqo8zvhs4VyrYpHC+gEE6YUY=
 =9Rd4
 -----END PGP SIGNATURE-----

Merge tag 'v4.2-rc7' into drm-next

Linux 4.2-rc7

Backmerge master for i915 fixes
2015-08-17 14:13:53 +10:00
Linus Torvalds 01565479e9 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Merge x86 fixes from Ingo Molnar:
 "Two followup fixes related to the previous LDT fix"

Also applied a further FPU emulation fix from Andy Lutomirski to the
branch before actually merging it.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
  x86/ldt: Further fix FPU emulation
  x86/ldt: Correct FPU emulation access to LDT
  x86/ldt: Correct LDT access in single stepping logic
2015-08-16 15:11:25 -07:00
Andy Lutomirski 12e244f4b5 x86/ldt: Further fix FPU emulation
The previous fix confused a selector with a segment prefix.  Fix it.

Compile-tested only.

Cc: stable@vger.kernel.org
Cc: Juergen Gross <jgross@suse.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 4809146b86 ("x86/ldt: Correct FPU emulation access to LDT")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-16 15:11:05 -07:00
Linus Torvalds 45e38cff4f Just two very small & simple patches.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJVzmylAAoJEL/70l94x66D7r0IAKd8oclVTdbo8RxR1Hg2zZev
 ytTm2Mjd0kgSqhTaBBUgyE900/cznYpT1xJq1/5Wwc+FP1J1QBzsDemtrQlEZIBh
 Zi4b7zm37K1ai7xWs6oLaXieVjiyX8vuUGO6saBw1n/ZLURgPjVzTmQMxdnYtyFX
 yf37rPvksnyzyctv+D9ZvdhrpD7Xd3NFNoCOSiukkeZkjb97JabDRrzpTlVmj4wu
 KNReYCN+iA6jZe5tEZHzCGplVrEMfHdAcoRc3GVz3oecPVZojX/NLzwlw97iN/2z
 mm5SVOlxbvCO7sqEQXo/db91xlP3E6Q1QGuDE21NboClbNeinC/uFJMFzpVInSI=
 =AD69
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Just two very small & simple patches"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST
  KVM: x86: zero IDT limit on entry to SMM
2015-08-14 17:27:52 -07:00
Andy Lutomirski 4d283ec908 x86/kvm: Rename VMX's segment access rights defines
VMX encodes access rights differently from LAR, and the latter is
most likely what x86 people think of when they think of "access
rights".

Rename them to avoid confusion.

Cc: kvm@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-15 00:47:13 +02:00
Linus Torvalds b25c6cee55 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes: PMU driver corner cases, tooling fixes, and an 'AUX'
  (Intel PT) race related core fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
  perf/x86/intel: Fix memory leak on hot-plug allocation fail
  perf: Fix PERF_EVENT_IOC_PERIOD migration race
  perf: Fix double-free of the AUX buffer
  perf: Fix fasync handling on inherited events
  perf tools: Fix test build error when bindir contains double slash
  perf stat: Fix transaction lenght metrics
  perf: Fix running time accounting
2015-08-14 10:57:16 -07:00
Dan Williams e836a256e8 pmem: convert to generic memremap
Kill arch_memremap_pmem() and just let the architecture specify the
flags to be passed to memremap().  Default to writethrough by default.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-14 13:23:28 -04:00
David S. Miller 182ad468e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cavium/Kconfig

The cavium conflict was overlapping dependency
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:23:11 -07:00
Linus Torvalds cd88ec2317 x86: fix error handling for 32-bit compat out-of-range system call numbers
Commit 3f5159a922 ("x86/asm/entry/32: Update -ENOSYS handling to match
the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
The proper error return value was never loaded into %rax, except if
things just happened to go through the audit paths, which ended up
reloading the return value.

This moves the loading or %rax into the normal system call path, just to
make sure the error case triggers it.  It's kind of sad, since it adds a
useless instruction to reload the register to the fast path, but it's
not like that single load from the stack is going to be noticeable.

Reported-by: David Drysdale <drysdale@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-13 16:19:44 -07:00
Linus Torvalds 6b476e1140 xen: bug fixes for 4.2-rc6
- Revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
 - Fix a memory leak affecting backends in HVM guests.
 - Fix PV domU hang with certain configurations.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVzHsAAAoJEFxbo/MsZsTR9Y0H/2j1PHt29RPcNdgGQ84AH0Wh
 tw1emL8rMcdhWQnsO7bNmywNNvRNQnU3ZJ8dzoq+5GPikNsbfQzYc7U2pIL4A+gB
 AAJsNDNzecuq4srk8vNxcmZ7ySvm9w6dccDUex2ge3sNWaq6gzSQvz6FSWiL0Sxg
 k3JcnemEg6JrYOTWdxKInAORMcRO6rgx9eIsdPUPOpgC5XLg6/mZOqBAWXIksDvs
 V9uCMqQicaUgBgKFIOSllqH6fcCNooRu3aDwNNj/2mMcJmEvMeBkHmNlQgEm2j5L
 ubdDyrC5y48TUPJm8i3+W2/AY+kgWzhThcqyVy6LRAAj5RItJFxMf0nMXzIEqlQ=
 =UgMy
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.

 - fix a memory leak affecting backends in HVM guests.

 - fix PV domU hang with certain configurations.

* tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
  Revert "xen/events/fifo: Handle linked events when closing a port"
  x86/xen: build "Xen PV" APIC driver for domU as well
2015-08-13 13:36:22 -07:00
Linus Torvalds ed596cde94 Revert x86 sigcontext cleanups
This reverts commits 9a036b93a3 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext") and c6f2062935 ("x86/signal/64: Fix SS handling for
signals delivered to 64-bit programs").

They were cleanups, but they break dosemu by changing the signal return
behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
not actually changing any behavior - causes build problems).

Reported-and-tested-by: Stas Sergeev <stsp@list.ru>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-13 12:42:22 -07:00
Borislav Petkov 6c36dfe949 x86/ras: Move AMD MCE injector to arch/x86/ras/
This is an x86-specific module and would benefit from being
closer to the arch code. Move it there. Update copyright while
at it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439396985-12812-14-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:54 +02:00
Borislav Petkov a79da38494 x86/mce: Add a wrapper around mce_log() for injection
Will be used by an injector module in a following patch.

Additionally, add a missing module export reported by 0-DAY
kernel test.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439396985-12812-13-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:53 +02:00
Borislav Petkov 9a7783d021 x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check()
The "rcu_" prefix misleads for it being a proper RCU interface
which is not. It basically checks whether we're preemptible or
holding the chrdev_read mutex.

Rename it accordingly.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439396985-12812-12-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:53 +02:00
Xie XiuQi 1b48465500 x86/mce: Reenable CMCI banks when swiching back to interrupt mode
Zhang Liguang reported the following issue:

1) System detects a CMCI storm on the current CPU.

2) Kernel disables the CMCI interrupt on banks owned by the
   current CPU and switches to poll mode

3) After the CMCI storm subsides, kernel switches back to
   interrupt mode

4) We expect the system to reenable the CMCI interrupt on banks
   owned by the current CPU

   mce_intel_adjust_timer
   |-> cmci_reenable
       |-> cmci_discover     # owned banks are ignored here

  static void cmci_discover(int banks)
	...
	for (i = 0; i < banks; i++) {
		...
		if (test_bit(i, owned))	# ownd banks is ignore here
			continue;

So convert cmci_storm_disable_banks() to
cmci_toggle_interrupt_mode() which controls whether to enable or
disable CMCI interrupts with its argument.

NB: We cannot clear the owned bit because the banks won't be
polled, otherwise. See:

  27f6c573e0 ("x86, CMCI: Add proper detection of end of CMCI storms")

for more info.

Reported-by: Zhang Liguang <zhangliguang@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v3.15+
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: huawei.libin@huawei.com
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: rui.xiang@huawei.com
Link: http://lkml.kernel.org/r/1439396985-12812-10-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:52 +02:00
Ashok Raj 8838eb6c0b x86/mce: Clear Local MCE opt-in before kexec
kexec could boot a kernel that could be legacy with no knowledge
of LMCE. Hence we should make sure we clear LMCE optin before
kexec reboot.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1439396985-12812-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:52 +02:00
Ashok Raj 4d1d5cdc34 x86/mce: Remove unused function declarations
Remove unused function declarations.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1439396985-12812-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:52 +02:00
Borislav Petkov eef4dfa0cb x86/mce: Kill drain_mcelog_buffer()
This used to flush out MCEs logged during early boot and which
were in the MCA registers from a previous system run. No need
for that now, since we've moved to a genpool.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:52 +02:00
Chen, Gong f29a7aff4b x86/mce: Avoid potential deadlock due to printk() in MCE context
Printing in MCE context is a no-no, currently, as printk() is
not NMI-safe. If some of the notifiers on the MCE chain call do
so, we may deadlock. In order to avoid that, delay printk() to
process context where it is safe.

Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Fold in subsequent patch from Boris for early boot logging. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Kick irq_work in mce_log() directly. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:51 +02:00
Chen, Gong fd4cf79fcc x86/mce: Remove the MCE ring for Action Optional errors
Use unified genpool to save Action Optional error events and put
Action Optional error handling in the same notification chain as
MCE error decoding.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Fold in subsequent patch from Boris for early boot logging. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Correct a lot. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:51 +02:00
Chen, Gong 061120aed7 x86/mce: Don't use percpu workqueues
An MCE is a rare event. Therefore, there's no need to have
per-CPU instances of both normal and IRQ workqueues. Make them
both global.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Fold in subsequent patch from Rui/Boris/Tony for early boot logging. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Massage commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:51 +02:00
Chen, Gong 648ed94038 x86/mce: Provide a lockless memory pool to save error records
printk() is not safe to use in MCE context. Add a lockless
memory allocator pool to save error records in MCE context.
Those records will be issued later, in a printk-safe context.
The idea is inspired by the APEI/GHES driver.

We're very conservative and allocate only two pages for it but
since we're going to use those pages throughout the system's
lifetime, we allocate them statically to avoid early boot time
allocation woes.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Rewrite. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439396985-12812-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:50 +02:00
Borislav Petkov 20d51a426f x86/mce: Reuse one of the u16 padding fields in 'struct mce'
... to save the error severity of the MCE and whether the
reported address of the error is usable.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439396985-12812-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:50 +02:00
Peter Zijlstra d420acd816 jump_label/x86: Work around asm build bug on older/backported GCCs
Boris reported that gcc version 4.4.4 20100503 (Red Hat
4.4.4-2) fails to build linux-next kernels that have
this fresh commit via the locking tree:

  11276d5306 ("locking/static_keys: Add a new static_key interface")

The problem appears to be that even though @key and @branch are
compile time constants, it doesn't see the following expression
as an immediate value:

   &((char *)key)[branch]

More recent GCCs don't appear to have this problem.

In particular, Red Hat backported the 'asm goto' feature into 4.4,
'normal' 4.4 compilers will not have this feature and thus not
run into this asm.

The workaround is to supply both values to the asm as immediates
and do the addition in asm.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 08:44:43 +02:00
David Howells 99db443506 PKCS#7: Appropriately restrict authenticated attributes and content type
A PKCS#7 or CMS message can have per-signature authenticated attributes
that are digested as a lump and signed by the authorising key for that
signature.  If such attributes exist, the content digest isn't itself
signed, but rather it is included in a special authattr which then
contributes to the signature.

Further, we already require the master message content type to be
pkcs7_signedData - but there's also a separate content type for the data
itself within the SignedData object and this must be repeated inside the
authattrs for each signer [RFC2315 9.2, RFC5652 11.1].

We should really validate the authattrs if they exist or forbid them
entirely as appropriate.  To this end:

 (1) Alter the PKCS#7 parser to reject any message that has more than one
     signature where at least one signature has authattrs and at least one
     that does not.

 (2) Validate authattrs if they are present and strongly restrict them.
     Only the following authattrs are permitted and all others are
     rejected:

     (a) contentType.  This is checked to be an OID that matches the
     	 content type in the SignedData object.

     (b) messageDigest.  This must match the crypto digest of the data.

     (c) signingTime.  If present, we check that this is a valid, parseable
     	 UTCTime or GeneralTime and that the date it encodes fits within
     	 the validity window of the matching X.509 cert.

     (d) S/MIME capabilities.  We don't check the contents.

     (e) Authenticode SP Opus Info.  We don't check the contents.

     (f) Authenticode Statement Type.  We don't check the contents.

     The message is rejected if (a) or (b) are missing.  If the message is
     an Authenticode type, the message is rejected if (e) is missing; if
     not Authenticode, the message is rejected if (d) - (f) are present.

     The S/MIME capabilities authattr (d) unfortunately has to be allowed
     to support kernels already signed by the pesign program.  This only
     affects kexec.  sign-file suppresses them (CMS_NOSMIMECAP).

     The message is also rejected if an authattr is given more than once or
     if it contains more than one element in its set of values.

 (3) Add a parameter to pkcs7_verify() to select one of the following
     restrictions and pass in the appropriate option from the callers:

     (*) VERIFYING_MODULE_SIGNATURE

	 This requires that the SignedData content type be pkcs7-data and
	 forbids authattrs.  sign-file sets CMS_NOATTR.  We could be more
	 flexible and permit authattrs optionally, but only permit minimal
	 content.

     (*) VERIFYING_FIRMWARE_SIGNATURE

	 This requires that the SignedData content type be pkcs7-data and
	 requires authattrs.  In future, this will require an attribute
	 holding the target firmware name in addition to the minimal set.

     (*) VERIFYING_UNSPECIFIED_SIGNATURE

	 This requires that the SignedData content type be pkcs7-data but
	 allows either no authattrs or only permits the minimal set.

     (*) VERIFYING_KEXEC_PE_SIGNATURE

	 This only supports the Authenticode SPC_INDIRECT_DATA content type
	 and requires at least an SpcSpOpusInfo authattr in addition to the
	 minimal set.  It also permits an SPC_STATEMENT_TYPE authattr (and
	 an S/MIME capabilities authattr because the pesign program doesn't
	 remove these).

     (*) VERIFYING_KEY_SIGNATURE
     (*) VERIFYING_KEY_SELF_SIGNATURE

	 These are invalid in this context but are included for later use
	 when limiting the use of X.509 certs.

 (4) The pkcs7_test key type is given a module parameter to select between
     the above options for testing purposes.  For example:

	echo 1 >/sys/module/pkcs7_test_key/parameters/usage
	keyctl padd pkcs7_test foo @s </tmp/stuff.pkcs7

     will attempt to check the signature on stuff.pkcs7 as if it contains a
     firmware blob (1 being VERIFYING_FIRMWARE_SIGNATURE).

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: David Woodhouse <David.Woodhouse@intel.com>
2015-08-12 17:01:01 +01:00
Ingo Molnar 9b9412dc70 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU changes from Paul E. McKenney:

  - The combination of tree geometry-initialization simplifications
    and OS-jitter-reduction changes to expedited grace periods.
    These two are stacked due to the large number of conflicts
    that would otherwise result.

    [ With one addition, a temporary commit to silence a lockdep false
      positive. Additional changes to the expedited grace-period
      primitives (queued for 4.4) remove the cause of this false
      positive, and therefore include a revert of this temporary commit. ]

  - Documentation updates.

  - Torture-test updates.

  - Miscellaneous fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 12:12:12 +02:00
Peter Zijlstra 16eefbd1ed x86/kconfig: Enable CONFIG_JUMP_LABEL in the defconfigs
Enable CONFIG_JUMP_LABEL in the defconfigs, the feature already deals with
GCC not having the asm-goto feature so will not break the build on
older compilers.

Having it enabled generates a faster kernel at very little extra cost
since we already include all the code patching code by having KPROBES
enabled.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 12:00:39 +02:00
Will Deacon 2b2a85a4d3 locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
Since the following commit:

  536fa40222 ("compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()")

smp_store_release() supports byte accesses, so use that in writer unlock
and remove the conditional macro override.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Waiman Long <Waiman.Long@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1438880084-18856-6-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:59:05 +02:00
Ingo Molnar f52609fdab Merge branch 'locking/arch-atomic' into locking/core, because it's ready for upstream
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:44:30 +02:00
Takao Indoh 709bc87192 perf/x86/intel/pt: Clean up files of Intel Processor Trace
This patch just cleans up some files of Intel Processor Trace, does not
change its behavior. This patch removes unused definitions and replaces a
constant value with a macro.

Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin<alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: H.Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438681015-5124-1-git-send-email-indou.takao@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:43:22 +02:00
Peter Zijlstra 19b3340cf5 perf/x86: Fix MSR PMU driver
Currently we only update the sysfs event files per available MSR, we
didn't actually disallow creating unlisted events.

Rework things such that the dectection, sysfs listing and event
creation are better coordinated.

Sadly it appears it's impossible to probe R/O MSRs under virt. This
means we have to do the full model table to avoid listing all MSRs all
the time.

Tested-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:43:20 +02:00
Ingo Molnar 3d325bf0da Merge branch 'perf/urgent' into perf/core, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:39:19 +02:00
Matt Fleming d7a702f0b1 perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
Tony reports that booting his 144-cpu machine with maxcpus=10 triggers
the following WARN_ON():

[   21.045727] WARNING: CPU: 8 PID: 647 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1267 intel_cqm_cpu_prepare+0x75/0x90()
[   21.045744] CPU: 8 PID: 647 Comm: systemd-udevd Not tainted 4.2.0-rc4 #1
[   21.045745] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0066.R00.1506021730 06/02/2015
[   21.045747]  0000000000000000 0000000082771b09 ffff880856333ba8 ffffffff81669b67
[   21.045748]  0000000000000000 0000000000000000 ffff880856333be8 ffffffff8107b02a
[   21.045750]  ffff88085b789800 ffff88085f68a020 ffffffff819e2470 000000000000000a
[   21.045750] Call Trace:
[   21.045757]  [<ffffffff81669b67>] dump_stack+0x45/0x57
[   21.045759]  [<ffffffff8107b02a>] warn_slowpath_common+0x8a/0xc0
[   21.045761]  [<ffffffff8107b15a>] warn_slowpath_null+0x1a/0x20
[   21.045762]  [<ffffffff81036725>] intel_cqm_cpu_prepare+0x75/0x90
[   21.045764]  [<ffffffff81036872>] intel_cqm_cpu_notifier+0x42/0x160
[   21.045767]  [<ffffffff8109a33d>] notifier_call_chain+0x4d/0x80
[   21.045769]  [<ffffffff8109a44e>] __raw_notifier_call_chain+0xe/0x10
[   21.045770]  [<ffffffff8107b538>] _cpu_up+0xe8/0x190
[   21.045771]  [<ffffffff8107b65a>] cpu_up+0x7a/0xa0
[   21.045774]  [<ffffffff8165e920>] cpu_subsys_online+0x40/0x90
[   21.045777]  [<ffffffff81433b37>] device_online+0x67/0x90
[   21.045778]  [<ffffffff81433bea>] online_store+0x8a/0xa0
[   21.045782]  [<ffffffff81430e78>] dev_attr_store+0x18/0x30
[   21.045785]  [<ffffffff8126b6ba>] sysfs_kf_write+0x3a/0x50
[   21.045786]  [<ffffffff8126ad40>] kernfs_fop_write+0x120/0x170
[   21.045789]  [<ffffffff811f0b77>] __vfs_write+0x37/0x100
[   21.045791]  [<ffffffff811f38b8>] ? __sb_start_write+0x58/0x110
[   21.045795]  [<ffffffff81296d2d>] ? security_file_permission+0x3d/0xc0
[   21.045796]  [<ffffffff811f1279>] vfs_write+0xa9/0x190
[   21.045797]  [<ffffffff811f2075>] SyS_write+0x55/0xc0
[   21.045800]  [<ffffffff81067300>] ? do_page_fault+0x30/0x80
[   21.045804]  [<ffffffff816709ae>] entry_SYSCALL_64_fastpath+0x12/0x71
[   21.045805] ---[ end trace fe228b836d8af405 ]---

The root cause is that CPU_UP_PREPARE is completely the wrong notifier
action from which to access cpu_data(), because smp_store_cpu_info()
won't have been executed by the target CPU at that point, which in turn
means that ->x86_cache_max_rmid and ->x86_cache_occ_scale haven't been
filled out.

Instead let's invoke our handler from CPU_STARTING and rename it
appropriately.

Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Link: http://lkml.kernel.org/r/1438863163-14083-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:37:23 +02:00
Peter Zijlstra dbc72b7a0c perf/x86/intel: Fix memory leak on hot-plug allocation fail
We fail to free the shared_regs allocation if the constraint_list
allocation fails.

Cure this and be more consistent in NULL-ing the pointers after free.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-12 11:37:22 +02:00
Wei Huang b6bb424b40 KVM: x86/vPMU: Fix unnecessary signed extension for AMD PERFCTRn
According to AMD programmer's manual, AMD PERFCTRn is 64-bit MSR which,
unlike Intel perf counters, doesn't require signed extension. This
patch removes the unnecessary conversion in SVM vPMU code when PERFCTRn
is being updated.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-11 15:19:41 +02:00
Nicholas Krause 603242a88a kvm: x86: Fix error handling in the function kvm_lapic_sync_from_vapic
This fixes error handling in the function kvm_lapic_sync_from_vapic
by checking if the call to kvm_read_guest_cached has returned a
error code to signal to its caller the call to this function has
failed and due to this we must immediately return to the caller
of kvm_lapic_sync_from_vapic to avoid incorrectly call apic_set_tpc
if a error has occurred here.

Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-08-11 15:11:05 +02:00
Jason A. Donenfeld fc5fee86bd x86/xen: build "Xen PV" APIC driver for domU as well
It turns out that a PV domU also requires the "Xen PV" APIC
driver. Otherwise, the flat driver is used and we get stuck in busy
loops that never exit, such as in this stack trace:

(gdb) target remote localhost:9999
Remote debugging using localhost:9999
__xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
56              while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY)
(gdb) bt
 #0  __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
 #1  __default_send_IPI_shortcut (shortcut=<optimized out>,
dest=<optimized out>, vector=<optimized out>) at
./arch/x86/include/asm/ipi.h:75
 #2  apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54
 #3  0xffffffff81011336 in arch_irq_work_raise () at
arch/x86/kernel/irq_work.c:47
 #4  0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at
kernel/irq_work.c:100
 #5  0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633
 #6  0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized
out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>,
fmt=<optimized out>, args=<optimized out>)
    at kernel/printk/printk.c:1778
 #7  0xffffffff816010c8 in printk (fmt=<optimized out>) at
kernel/printk/printk.c:1868
 #8  0xffffffffc00013ea in ?? ()
 #9  0x0000000000000000 in ?? ()

Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-10 15:33:10 +01:00