Commit graph

70039 commits

Author SHA1 Message Date
Yan, Zheng e4339d28f6 libceph: reference counting pagelist
this allow pagelist to present data that may be sent multiple times.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-10-14 12:56:48 -07:00
Tilman Schmidt 9ea8aa8d50 isdn/capi: correct capi20_manufacturer argument type mismatch
Function capi20_manufacturer() is declared with unsigned int cmd
argument but called with unsigned long.
Fix by correcting the function prototype since the actual argument
is part of the user visible API.

Spotted with Coverity.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 15:05:34 -04:00
Li RongQing 02ea80741a ipv6: remove aca_lock spinlock from struct ifacaddr6
no user uses this lock.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 13:15:15 -04:00
Alexander Aring 31eff81e94 skbuff: fix ftrace handling in skb_unshare
If the skb is not dropped afterwards we should run consume_skb instead
kfree_skb. Inside of function skb_unshare we do always a kfree_skb,
doesn't depend if skb_copy failed or was successful.

This patch switch this behaviour like skb_share_check, if allocation of
sk_buff failed we use kfree_skb otherwise consume_skb.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 13:10:31 -04:00
Daniel Borkmann b69040d8e3 net: sctp: fix panic on duplicate ASCONF chunks
When receiving a e.g. semi-good formed connection scan in the
form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---------------- ASCONF_a; ASCONF_b ----------------->

... where ASCONF_a equals ASCONF_b chunk (at least both serials
need to be equal), we panic an SCTP server!

The problem is that good-formed ASCONF chunks that we reply with
ASCONF_ACK chunks are cached per serial. Thus, when we receive a
same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process them again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk instead. So far, so good.

Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():

While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd54
changed the precedence, so that as long as we get bundled, incoming
chunks we try possible bundling on outgoing queue as well. Before
this commit, we would just flush the output queue.

Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the same ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet and we enforce an uncork, and thus flush, thus
crashing the kernel.

Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 12:46:22 -04:00
Daniel Borkmann 9de7922bc7 net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks
Commit 6f4c618ddb ("SCTP : Add paramters validity check for
ASCONF chunk") added basic verification of ASCONF chunks, however,
it is still possible to remotely crash a server by sending a
special crafted ASCONF chunk, even up to pre 2.6.12 kernels:

skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
 head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
 end:0x440 dev:<NULL>
 ------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
 <IRQ>
 [<ffffffff8144fb1c>] skb_put+0x5c/0x70
 [<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp]
 [<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp]
 [<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
 [<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp]
 [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
 [<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0
 [<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
 [<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
 [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
 [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
 [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497078>] ip_local_deliver+0x98/0xa0
 [<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440
 [<ffffffff81496ac5>] ip_rcv+0x275/0x350
 [<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750
 [<ffffffff81460588>] netif_receive_skb+0x58/0x60

This can be triggered e.g., through a simple scripted nmap
connection scan injecting the chunk after the handshake, for
example, ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ------------------ ASCONF; UNKNOWN ------------------>

... where ASCONF chunk of length 280 contains 2 parameters ...

  1) Add IP address parameter (param length: 16)
  2) Add/del IP address parameter (param length: 255)

... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
Address Parameter in the ASCONF chunk is even missing, too.
This is just an example and similarly-crafted ASCONF chunks
could be used just as well.

The ASCONF chunk passes through sctp_verify_asconf() as all
parameters passed sanity checks, and after walking, we ended
up successfully at the chunk end boundary, and thus may invoke
sctp_process_asconf(). Parameter walking is done with
WORD_ROUND() to take padding into account.

In sctp_process_asconf()'s TLV processing, we may fail in
sctp_process_asconf_param() e.g., due to removal of the IP
address that is also the source address of the packet containing
the ASCONF chunk, and thus we need to add all TLVs after the
failure to our ASCONF response to remote via helper function
sctp_add_asconf_response(), which basically invokes a
sctp_addto_chunk() adding the error parameters to the given
skb.

When walking to the next parameter this time, we proceed
with ...

  length = ntohs(asconf_param->param_hdr.length);
  asconf_param = (void *)asconf_param + length;

... instead of the WORD_ROUND()'ed length, thus resulting here
in an off-by-one that leads to reading the follow-up garbage
parameter length of 12336, and thus throwing an skb_over_panic
for the reply when trying to sctp_addto_chunk() next time,
which implicitly calls the skb_put() with that length.

Fix it by using sctp_walk_params() [ which is also used in
INIT parameter processing ] macro in the verification *and*
in ASCONF processing: it will make sure we don't spill over,
that we walk parameters WORD_ROUND()'ed. Moreover, we're being
more defensive and guard against unknown parameter types and
missized addresses.

Joint work with Vlad Yasevich.

Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-14 12:46:22 -04:00
Martin K. Petersen e19a8a0ad2 block: Remove REQ_KERNEL
REQ_KERNEL is no longer used. Remove it and drop the redundant uio
argument to nfs_file_direct_{read,write}.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-10-14 09:00:44 -06:00
Behan Webster a0a77af141 crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code
Add a macro which replaces the use of a Variable Length Array In Struct (VLAIS)
with a C99 compliant equivalent. This macro instead allocates the appropriate
amount of memory using an char array.

The new code can be compiled with both gcc and clang.

struct shash_desc contains a flexible array member member ctx declared with
CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning
of the array declared after struct shash_desc with long long.

No trailing padding is required because it is not a struct type that can
be used in an array.

The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long long
as would be the case for a struct containing a member with
CRYPTO_MINALIGN_ATTR.

If you want to get to the ctx at the end of the shash_desc as before you can do
so using shash_desc_ctx(shash)

Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Mark Charlebois <charlebm@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michał Mirosław <mirqus@gmail.com>
2014-10-14 10:51:22 +02:00
Linus Torvalds 2d65a9f48f Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main git pull for the drm,

  I pretty much froze major pulls at -rc5/6 time, and haven't had much
  fallout, so will probably continue doing that.

  Lots of changes all over, big internal header cleanup to make it clear
  drm features are legacy things and what are things that modern KMS
  drivers should be using.  Also big move to use the new generic fences
  in all the TTM drivers.

  core:
        atomic prep work,
        vblank rework changes, allows immediate vblank disables
        major header reworking and cleanups to better delinate legacy
        interfaces from what KMS drivers should be using.
        cursor planes locking fixes

  ttm:
        move to generic fences (affects all TTM drivers)
        ppc64 caching fixes

  radeon:
        userptr support,
        uvd for old asics,
        reset rework for fence changes
        better buffer placement changes,
        dpm feature enablement
        hdmi audio support fixes

  intel:
        Cherryview work,
        180 degree rotation,
        skylake prep work,
        execlist command submission
        full ppgtt prep work
        cursor improvements
        edid caching,
        vdd handling improvements

  nouveau:
        fence reworking
        kepler memory clock work
        gt21x clock work
        fan control improvements
        hdmi infoframe fixes
        DP audio

  ast:
        ppc64 fixes
        caching fix

  rcar:
        rcar-du DT support

  ipuv3:
        prep work for capture support

  msm:
        LVDS support for mdp4, new panel, gpu refactoring

  exynos:
        exynos3250 SoC support, drop bad mmap interface,
        mipi dsi changes, and component match support"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (640 commits)
  drm/mst: rework payload table allocation to conform better.
  drm/ast: Fix HW cursor image
  drm/radeon/kv: add uvd/vce info to dpm debugfs output
  drm/radeon/ci: add uvd/vce info to dpm debugfs output
  drm/radeon: export reservation_object from dmabuf to ttm
  drm/radeon: cope with foreign fences inside the reservation object
  drm/radeon: cope with foreign fences inside display
  drm/core: use helper to check driver features
  drm/radeon/cik: write gfx ucode version to ucode addr reg
  drm/radeon/si: print full CS when we hit a packet 0
  drm/radeon: remove unecessary includes
  drm/radeon/combios: declare legacy_connector_convert as static
  drm/radeon/atombios: declare connector convert tables as static
  drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table
  drm/radeon/dpm: drop clk/voltage dependency filters for BTC
  drm/radeon/dpm: drop clk/voltage dependency filters for CI
  drm/radeon/dpm: drop clk/voltage dependency filters for SI
  drm/radeon/dpm: drop clk/voltage dependency filters for NI
  drm/radeon: disable audio when we disable hdmi (v2)
  drm/radeon: split audio enable between eg and r600 (v2)
  ...
2014-10-14 09:39:08 +02:00
NeilBrown 2cbbca5e7c md: discard PRINT_RAID_DEBUG ioctl
All the interesting information printed by this ioctl
is provided in /proc/mdstat and/or sysfs.
So it isn't needed and isn't used and would be best if it didn't
exist.

Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-14 13:08:29 +11:00
Linus Torvalds dfe2c6dcc8 Merge branch 'akpm' (patches from Andrew Morton)
Merge second patch-bomb from Andrew Morton:
 - a few hotfixes
 - drivers/dma updates
 - MAINTAINERS updates
 - Quite a lot of lib/ updates
 - checkpatch updates
 - binfmt updates
 - autofs4
 - drivers/rtc/
 - various small tweaks to less used filesystems
 - ipc/ updates
 - kernel/watchdog.c changes

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (135 commits)
  mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared
  kernel/param: consolidate __{start,stop}___param[] in <linux/moduleparam.h>
  ia64: remove duplicate declarations of __per_cpu_start[] and __per_cpu_end[]
  frv: remove unused declarations of __start___ex_table and __stop___ex_table
  kvm: ensure hard lockup detection is disabled by default
  kernel/watchdog.c: control hard lockup detection default
  staging: rtl8192u: use %*pEn to escape buffer
  staging: rtl8192e: use %*pEn to escape buffer
  staging: wlan-ng: use %*pEhp to print SN
  lib80211: remove unused print_ssid()
  wireless: hostap: proc: print properly escaped SSID
  wireless: ipw2x00: print SSID via %*pE
  wireless: libertas: print esaped string via %*pE
  lib/vsprintf: add %*pE[achnops] format specifier
  lib / string_helpers: introduce string_escape_mem()
  lib / string_helpers: refactoring the test suite
  lib / string_helpers: move documentation to c-file
  include/linux: remove strict_strto* definitions
  arch/x86/mm/numa.c: fix boot failure when all nodes are hotpluggable
  fs: check bh blocknr earlier when searching lru
  ...
2014-10-14 03:54:50 +02:00
Linus Torvalds 1ee07ef6b5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "This patch set contains the main portion of the changes for 3.18 in
  regard to the s390 architecture.  It is a bit bigger than usual,
  mainly because of a new driver and the vector extension patches.

  The interesting bits are:
   - Quite a bit of work on the tracing front.  Uprobes is enabled and
     the ftrace code is reworked to get some of the lost performance
     back if CONFIG_FTRACE is enabled.
   - To improve boot time with CONFIG_DEBIG_PAGEALLOC, support for the
     IPTE range facility is added.
   - The rwlock code is re-factored to improve writer fairness and to be
     able to use the interlocked-access instructions.
   - The kernel part for the support of the vector extension is added.
   - The device driver to access the CD/DVD on the HMC is added, this
     will hopefully come in handy to improve the installation process.
   - Add support for control-unit initiated reconfiguration.
   - The crypto device driver is enhanced to enable the additional AP
     domains and to allow the new crypto hardware to be used.
   - Bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits)
  s390/ftrace: simplify enabling/disabling of ftrace_graph_caller
  s390/ftrace: remove 31 bit ftrace support
  s390/kdump: add support for vector extension
  s390/disassembler: add vector instructions
  s390: add support for vector extension
  s390/zcrypt: Toleration of new crypto hardware
  s390/idle: consolidate idle functions and definitions
  s390/nohz: use a per-cpu flag for arch_needs_cpu
  s390/vtime: do not reset idle data on CPU hotplug
  s390/dasd: add support for control unit initiated reconfiguration
  s390/dasd: fix infinite loop during format
  s390/mm: make use of ipte range facility
  s390/setup: correct 4-level kernel page table detection
  s390/topology: call set_sched_topology early
  s390/uprobes: architecture backend for uprobes
  s390/uprobes: common library for kprobes and uprobes
  s390/rwlock: use the interlocked-access facility 1 instructions
  s390/rwlock: improve writer fairness
  s390/rwlock: remove interrupt-enabling rwlock variant.
  s390/mm: remove change bit override support
  ...
2014-10-14 03:47:00 +02:00
Linus Torvalds ba1a96fc7d Merge branch 'x86-seccomp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 seccomp changes from Ingo Molnar:
 "This tree includes x86 seccomp filter speedups and related preparatory
  work, which touches core seccomp facilities as well.

  The main idea is to split seccomp into two phases, to be able to enter
  a simple fast path for syscalls with ptrace side effects.

  There's no substantial user-visible (and ABI) effects expected from
  this, except a change in how we emit a better audit record for
  SECCOMP_RET_TRACE events"

* 'x86-seccomp-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86_64, entry: Use split-phase syscall_trace_enter for 64-bit syscalls
  x86_64, entry: Treat regs->ax the same in fastpath and slowpath syscalls
  x86: Split syscall_trace_enter into two phases
  x86, entry: Only call user_exit if TIF_NOHZ
  x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit
  seccomp: Document two-phase seccomp and arch-provided seccomp_data
  seccomp: Allow arch code to provide seccomp_data
  seccomp: Refactor the filter callback and the API
  seccomp,x86,arm,mips,s390: Remove nr parameter from secure_computing
2014-10-14 02:27:06 +02:00
Linus Torvalds f1bfbd984b Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
 "The main changes in this tree are:

   - fix and update Intel Quark [Galileo] SoC platform support

   - update IOSF chipset side band interface and make it available via
     debugfs

   - enable HPETs on Soekris net6501 and other e6xx based systems"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Add cpu_detect_cache_sizes to init_intel() add Quark legacy_cache()
  x86: Quark: Comment setup_arch() to document TLB/PGE bug
  x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead
  x86/platform/intel/iosf: Add debugfs config option for IOSF
  x86/platform/intel/iosf: Add better description of IOSF driver in config
  x86/platform/intel/iosf: Add Braswell PCI ID
  x86/platform/pmc_atom: Fix warning when CONFIG_DEBUG_FS=n
  x86: HPET force enable for e6xx based systems
  x86/iosf: Add debugfs support
  x86/iosf: Add Kconfig prompt for IOSF_MBI selection
2014-10-14 02:23:55 +02:00
Peter Feiner 64e455079e mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared
For VMAs that don't want write notifications, PTEs created for read faults
have their write bit set.  If the read fault happens after VM_SOFTDIRTY is
cleared, then the PTE's softdirty bit will remain clear after subsequent
writes.

Here's a simple code snippet to demonstrate the bug:

  char* m = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
                 MAP_ANONYMOUS | MAP_SHARED, -1, 0);
  system("echo 4 > /proc/$PPID/clear_refs"); /* clear VM_SOFTDIRTY */
  assert(*m == '\0');     /* new PTE allows write access */
  assert(!soft_dirty(x));
  *m = 'x';               /* should dirty the page */
  assert(soft_dirty(x));  /* fails */

With this patch, write notifications are enabled when VM_SOFTDIRTY is
cleared.  Furthermore, to avoid unnecessary faults, write notifications
are disabled when VM_SOFTDIRTY is set.

As a side effect of enabling and disabling write notifications with
care, this patch fixes a bug in mprotect where vm_page_prot bits set by
drivers were zapped on mprotect.  An analogous bug was fixed in mmap by
commit c9d0bf2414 ("mm: uncached vma support with writenotify").

Signed-off-by: Peter Feiner <pfeiner@google.com>
Reported-by: Peter Feiner <pfeiner@google.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Jamie Liu <jamieliu@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:28 +02:00
Geert Uytterhoeven 63a12d9d01 kernel/param: consolidate __{start,stop}___param[] in <linux/moduleparam.h>
Consolidate the various external const and non-const declarations of
__start___param[] and __stop___param in <linux/moduleparam.h>.  This
requires making a few struct kernel_param pointers in kernel/params.c
const.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:28 +02:00
Ulrich Obergfell 6e7458a6f0 kernel/watchdog.c: control hard lockup detection default
In some cases we don't want hard lockup detection enabled by default.
An example is when running as a guest.  Introduce

  watchdog_enable_hardlockup_detector(bool)

allowing those cases to disable hard lockup detection.  This must be
executed early by the boot processor from e.g.  smp_prepare_boot_cpu, in
order to allow kernel command line arguments to override it, as well as
to avoid hard lockup detection being enabled before we've had a chance
to indicate that it's unwanted.  In summary,

  initial boot:					default=enabled
  smp_prepare_boot_cpu
    watchdog_enable_hardlockup_detector(false):	default=disabled
  cmdline has 'nmi_watchdog=1':			default=enabled

The running kernel still has the ability to enable/disable at any time
with /proc/sys/kernel/nmi_watchdog us usual.  However even when the
default has been overridden /proc/sys/kernel/nmi_watchdog will initially
show '1'.  To truly turn it on one must disable/enable it, i.e.

  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 1 > /proc/sys/kernel/nmi_watchdog

This patch will be immediately useful for KVM with the next patch of this
series.  Other hypervisor guest types may find it useful as well.

[akpm@linux-foundation.org: fix build]
[dzickus@redhat.com: fix compile issues on sparc]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:27 +02:00
Andy Shevchenko 5df1415aee lib80211: remove unused print_ssid()
In kernel we have %*pE specifier to print an escaped buffer.  All users
now switched to that approach.

This fixes a bug as well.  The current implementation wrongly prints
octal numbers: only two first digits are used in case when 3 are
required and the rest of the string ends up cut off.

Additionally by default the \f, \v, \a, and \e are escaped to their
alphabetic representation.  It's safe to do since it is currently used
for messaging only.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: "John W . Linville" <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:27 +02:00
Andy Shevchenko c8250381c8 lib / string_helpers: introduce string_escape_mem()
This is almost the opposite function to string_unescape().  Nevertheless
it handles \0 and could be used for any byte buffer.

The documentation is supplied together with the function prototype.

The test cases covers most of the scenarios and would be expanded later
on.

[akpm@linux-foundation.org: avoid 1k stack consumption]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: "John W . Linville" <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:26 +02:00
Andy Shevchenko d295634e96 lib / string_helpers: move documentation to c-file
The introduced function string_escape_mem() is a kind of opposite to
string_unescape.  We have several users of such functionality each of
them created custom implementation.  The series contains clean up of
test suite, adding new call, and switching few users to use it via %*pE
specifier.

Test suite covers all of existing and most of potential use cases.

This patch (of 11):

The documentation of API belongs to c-file.  This patch moves it
accordingly.

There is no functional change.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: "John W . Linville" <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:26 +02:00
Daniel Walter 3db2e9cdc0 include/linux: remove strict_strto* definitions
Remove obsolete and unused strict_strto* functions

Signed-off-by: Daniel Walter <dwalter@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:26 +02:00
Rasmus Villemoes b0bfb63118 lib: string: Make all calls to strnicmp into calls to strncasecmp
The previous patch made strnicmp into a wrapper for strncasecmp.

This patch makes all in-tree users of strnicmp call strncasecmp
directly, while still making sure that the strnicmp symbol can be used
by out-of-tree modules.  It should be considered a temporary hack until
all in-tree callers have been converted.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:23 +02:00
Mike Travis 67cf13ceed x86: optimize resource lookups for ioremap
We have a large university system in the UK that is experiencing very long
delays modprobing the driver for a specific I/O device.  The delay is from
8-10 minutes per device and there are 31 devices in the system.  This 4 to
5 hour delay in starting up those I/O devices is very much a burden on the
customer.

There are two causes for requiring a restart/reload of the drivers.  First
is periodic preventive maintenance (PM) and the second is if any of the
devices experience a fatal error.  Both of these trigger this excessively
long delay in bringing the system back up to full capability.

The problem was tracked down to a very slow IOREMAP operation and the
excessively long ioresource lookup to insure that the user is not
attempting to ioremap RAM.  These patches provide a speed up to that
function.

The modprobe time appears to be affected quite a bit by previous activity
on the ioresource list, which I suspect is due to cache preloading.  While
the overall improvement is impacted by other overhead of starting the
devices, this drastically improves the modprobe time.

Also our system is considerably smaller so the percentages gained will not
be the same.  Best case improvement with the modprobe on our 20 device
smallish system was from 'real 5m51.913s' to 'real 0m18.275s'.

This patch (of 2):

Since the ioremap operation is verifying that the specified address range
is NOT RAM, it will search the entire ioresource list if the condition is
true.  To make matters worse, it does this one 4k page at a time.  For a
128M BAR region this is 32 passes to determine the entire region does not
contain any RAM addresses.

This patch provides another resource lookup function, region_is_ram, that
searches for the entire region specified, verifying that it is completely
contained within the resource region.  If it is found, then it is checked
to be RAM or not, within a single pass.

The return result reflects if it was found or not (-1), and whether it is
RAM (1) or not (0).  This allows the caller to fallback to the previous
page by page search if it was not found.

[akpm@linux-foundation.org: fix spellos and typos in comment]
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Reviewed-by: Cliff Wickman <cpw@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:22 +02:00
Lai Jiangshan a841b65921 rbtree: add comment to rb_insert_augmented()
The comment is copied from Documentation/rbtree.txt, but this comment is
so important that it should also be in the code.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:21 +02:00
Baoquan He 669280a152 kexec: take the segment adding out of locate_mem_hole functions
In locate_mem_hole functions, a memory hole is located and added as
kexec_segment.  But from the name of locate_mem_hole, it should only take
responsibility of searching a available memory hole to contain data of a
specified size.

So in this patch add a new field 'mem' into kexec_buf, then take that
kexec segment adding code out of locate_mem_hole_top_down and
locate_mem_hole_bottom_up.  This make clear of the functionality of
locate_mem_hole just like it declars to do.  And by this
locate_mem_hole_callback chould be used later if anyone want to locate a
memory hole for other use.

Meanwhile Vivek suggested opening code function __kexec_add_segment(),
that way we have to retreive ksegment pointer once and it is easy to read.
 So just do it in this patch and remove __kexec_add_segment() since no one
use it anymore.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:21 +02:00
Oleg Nesterov 1c3bea0e71 signal: use BUILD_BUG() instead of _NSIG_WORDS_is_unsupported_size()
Kill _NSIG_WORDS_is_unsupported_size(), use BUILD_BUG() instead.  This
simplifies the code, avoids the nested-externs warnings, and this way we
do not defer the problem to linker.

Also, fix the indentation in _SIG_SET_BINOP() and _SIG_SET_OP().

Note: this patch assumes that the code like "if (0) BUILD_BUG();" is
valid.  If not (say __compiletime_error() is not defined and thus
__compiletime_error_fallback() uses a negative array) we should fix
BUILD_BUG() and/or BUILD_BUG_ON_MSG().  This code should be fine by
definition, this is the documented purpose of BUILD_BUG().

[sfr@canb.auug.org.au: fix powerpc build failures]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:20 +02:00
Lai Jiangshan 6de8ab68bc lib: remove prio_heap
The prio_heap code is unused since commit 889ed9ceaa ("cgroup: remove
css_scan_tasks()").  It should be compiled out to shrink the binary
kernel size which can be done via introducing CONFIG_PRIO_HEAD or by
removing the code.

We can simply recover the code from git when needed, so it would be
better to remove it IMO.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: George Spelvin <linux@horizon.com>
Cc: Mark Salter <msalter@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:14 +02:00
Masahiro Yamada 8b21d9ca17 list: include linux/kernel.h
linux/list.h uses container_of, therefore it depends on linux/kernel.h.

Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:13 +02:00
Marek Szyprowski de9e14eebf drivers: dma-contiguous: add initialization from device tree
Add a function to create CMA region from previously reserved memory and
add support for handling 'shared-dma-pool' reserved-memory device tree
nodes.

Based on previous code provided by Josh Cartwright <joshc@codeaurora.org>

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Josh Cartwright <joshc@codeaurora.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:12 +02:00
Sasha Levin 71458cfc78 kernel: add support for gcc 5
We're missing include/linux/compiler-gcc5.h which is required now
because gcc branched off to v5 in trunk.

Just copy the relevant bits out of include/linux/compiler-gcc4.h,
no new code is added as of now.

This fixes a build error when using gcc 5.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:12 +02:00
Linus Torvalds faafcba3b5 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
     Hansen)

   - Various sched/idle refinements for better idle handling (Nicolas
     Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)

   - sched/numa updates and optimizations (Rik van Riel)

   - sysbench speedup (Vincent Guittot)

   - capacity calculation cleanups/refactoring (Vincent Guittot)

   - Various cleanups to thread group iteration (Oleg Nesterov)

   - Double-rq-lock removal optimization and various refactorings
     (Kirill Tkhai)

   - various sched/deadline fixes

  ... and lots of other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
  sched/fair: Delete resched_cpu() from idle_balance()
  sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
  sched: Improve sysbench performance by fixing spurious active migration
  sched/x86: Fix up typo in topology detection
  x86, sched: Add new topology for multi-NUMA-node CPUs
  sched/rt: Use resched_curr() in task_tick_rt()
  sched: Use rq->rd in sched_setaffinity() under RCU read lock
  sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
  sched: Use dl_bw_of() under RCU read lock
  sched/fair: Remove duplicate code from can_migrate_task()
  sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
  sched: print_rq(): Don't use tasklist_lock
  sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
  sched: Fix the task-group check in tg_has_rt_tasks()
  sched/fair: Leverage the idle state info when choosing the "idlest" cpu
  sched: Let the scheduler see CPU idle states
  sched/deadline: Fix inter- exclusive cpusets migrations
  sched/deadline: Clear dl_entity params when setscheduling to different class
  sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
  ...
2014-10-13 16:23:15 +02:00
Linus Torvalds 9d9420f120 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side updates:

   - Fix and enhance poll support (Jiri Olsa)

   - Re-enable inheritance optimization (Jiri Olsa)

   - Enhance Intel memory events support (Stephane Eranian)

   - Refactor the Intel uncore driver to be more maintainable (Zheng
     Yan)

   - Enhance and fix Intel CPU and uncore PMU drivers (Peter Zijlstra,
     Andi Kleen)

   - [ plus various smaller fixes/cleanups ]

  User visible tooling updates:

   - Add +field argument support for --field option, so that one can add
     fields to the default list of fields to show, ie now one can just
     do:

         perf report --fields +pid

     And the pid will appear in addition to the default fields (Jiri
     Olsa)

   - Add +field argument support for --sort option (Jiri Olsa)

   - Honour -w in the report tools (report, top), allowing to specify
     the widths for the histogram entries columns (Namhyung Kim)

   - Properly show submicrosecond times in 'perf kvm stat' (Christian
     Borntraeger)

   - Add beautifier for mremap flags param in 'trace' (Alex Snast)

   - perf script: Allow callchains if any event samples them

   - Don't truncate Intel style addresses in 'annotate' (Alex Converse)

   - Allow profiling when kptr_restrict == 1 for non root users, kernel
     samples will just remain unresolved (Andi Kleen)

   - Allow configuring default options for callchains in config file
     (Namhyung Kim)

   - Support operations for shared futexes.  (Davidlohr Bueso)

   - "perf kvm stat report" improvements by Alexander Yarygin:
       -  Save pid string in opts.target.pid
       -  Enable the target.system_wide flag
       -  Unify the title bar output

   - [ plus lots of other fixes and small improvements.  ]

  Tooling infrastructure changes:

   - Refactor unit and scale function parameters for PMU parsing
     routines (Matt Fleming)

   - Improve DSO long names lookup with rbtree, resulting in great
     speedup for workloads with lots of DSOs (Waiman Long)

   - We were not handling POLLHUP notifications for event file
     descriptors

     Fix it by filtering entries in the events file descriptor array
     after poll() returns, refcounting mmaps so that when the last fd
     pointing to a perf mmap goes away we do the unmap (Arnaldo Carvalho
     de Melo)

   - Intel PT prep work, from Adrian Hunter, including:
       - Let a user specify a PMU event without any config terms
       - Add perf-with-kcore script
       - Let default config be defined for a PMU
       - Add perf_pmu__scan_file()
       - Add a 'perf test' for tracking with sched_switch
       - Add 'flush' callback to scripting API

   - Use ring buffer consume method to look like other tools (Arnaldo
     Carvalho de Melo)

   - hists browser (used in top and report) refactorings, getting rid of
     unused variables and reducing source code size by handling similar
     cases in a fewer functions (Namhyung Kim).

   - Replace thread unsafe strerror() with strerror_r() accross the
     whole tools/perf/ tree (Masami Hiramatsu)

   - Rename ordered_samples to ordered_events and allow setting a queue
     size for ordering events (Jiri Olsa)

   - [ plus lots of fixes, cleanups and other improvements ]"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (198 commits)
  perf/x86: Tone down kernel messages when the PMU check fails in a virtual environment
  perf/x86/intel/uncore: Fix minor race in box set up
  perf record: Fix error message for --filter option not coming after tracepoint
  perf tools: Fix build breakage on arm64 targets
  perf symbols: Improve DSO long names lookup speed with rbtree
  perf symbols: Encapsulate dsos list head into struct dsos
  perf bench futex: Sanitize -q option in requeue
  perf bench futex: Support operations for shared futexes
  perf trace: Fix mmap return address truncation to 32-bit
  perf tools: Refactor unit and scale function parameters
  perf tools: Fix line number in the config file error message
  perf tools: Convert {record,top}.call-graph option to call-graph.record-mode
  perf tools: Introduce perf_callchain_config()
  perf callchain: Move some parser functions to callchain.c
  perf tools: Move callchain config from record_opts to callchain_param
  perf hists browser: Fix callchain print bug on TUI
  perf tools: Use ACCESS_ONCE() instead of volatile cast
  perf tools: Modify error code for when perf_session__new() fails
  perf tools: Fix perf record as non root with kptr_restrict == 1
  perf stat: Fix --per-core on multi socket systems
  ...
2014-10-13 15:58:15 +02:00
Linus Torvalds 6d5f0ebfc0 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The main updates in this cycle were:

   - mutex MCS refactoring finishing touches: improve comments, refactor
     and clean up code, reduce debug data structure footprint, etc.

   - qrwlock finishing touches: remove old code, self-test updates.

   - small rwsem optimization

   - various smaller fixes/cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Revert qrwlock recusive stuff
  locking/rwsem: Avoid double checking before try acquiring write lock
  locking/rwsem: Move EXPORT_SYMBOL() lines to follow function definition
  locking/rwlock, x86: Delete unused asm/rwlock.h and rwlock.S
  locking/rwlock, x86: Clean up asm/spinlock*.h to remove old rwlock code
  locking/semaphore: Resolve some shadow warnings
  locking/selftest: Support queued rwlock
  locking/lockdep: Restrict the use of recursive read_lock() with qrwlock
  locking/spinlocks: Always evaluate the second argument of spin_lock_nested()
  locking/Documentation: Update locking/mutex-design.txt disadvantages
  locking/Documentation: Move locking related docs into Documentation/locking/
  locking/mutexes: Use MUTEX_SPIN_ON_OWNER when appropriate
  locking/mutexes: Refactor optimistic spinning code
  locking/mcs: Remove obsolete comment
  locking/mutexes: Document quick lock release when unlocking
  locking/mutexes: Standardize arguments in lock/unlock slowpaths
  locking: Remove deprecated smp_mb__() barriers
2014-10-13 15:51:40 +02:00
Linus Torvalds dbb885fecc Merge branch 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull arch atomic cleanups from Ingo Molnar:
 "This is a series kept separate from the main locking tree, which
  cleans up and improves various details in the atomics type handling:

   - Remove the unused atomic_or_long() method

   - Consolidate and compress atomic ops implementations between
     architectures, to reduce linecount and to make it easier to add new
     ops.

   - Rewrite generic atomic support to only require cmpxchg() from an
     architecture - generate all other methods from that"

* 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  locking,arch: Use ACCESS_ONCE() instead of cast to volatile in atomic_read()
  locking, mips: Fix atomics
  locking, sparc64: Fix atomics
  locking,arch: Rewrite generic atomic support
  locking,arch,xtensa: Fold atomic_ops
  locking,arch,sparc: Fold atomic_ops
  locking,arch,sh: Fold atomic_ops
  locking,arch,powerpc: Fold atomic_ops
  locking,arch,parisc: Fold atomic_ops
  locking,arch,mn10300: Fold atomic_ops
  locking,arch,mips: Fold atomic_ops
  locking,arch,metag: Fold atomic_ops
  locking,arch,m68k: Fold atomic_ops
  locking,arch,m32r: Fold atomic_ops
  locking,arch,ia64: Fold atomic_ops
  locking,arch,hexagon: Fold atomic_ops
  locking,arch,cris: Fold atomic_ops
  locking,arch,avr32: Fold atomic_ops
  locking,arch,arm64: Fold atomic_ops
  locking,arch,arm: Fold atomic_ops
  ...
2014-10-13 15:48:00 +02:00
Linus Torvalds d6dd50e07c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle were:

   - changes related to No-CBs CPUs and NO_HZ_FULL

   - RCU-tasks implementation

   - torture-test updates

   - miscellaneous fixes

   - locktorture updates

   - RCU documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
  workqueue: Use cond_resched_rcu_qs macro
  workqueue: Add quiescent state between work items
  locktorture: Cleanup header usage
  locktorture: Cannot hold read and write lock
  locktorture: Fix __acquire annotation for spinlock irq
  locktorture: Support rwlocks
  rcu: Eliminate deadlock between CPU hotplug and expedited grace periods
  locktorture: Document boot/module parameters
  rcutorture: Rename rcutorture_runnable parameter
  locktorture: Add test scenario for rwsem_lock
  locktorture: Add test scenario for mutex_lock
  locktorture: Make torture scripting account for new _runnable name
  locktorture: Introduce torture context
  locktorture: Support rwsems
  locktorture: Add infrastructure for torturing read locks
  torture: Address race in module cleanup
  locktorture: Make statistics generic
  locktorture: Teach about lock debugging
  locktorture: Support mutexes
  locktorture: Add documentation
  ...
2014-10-13 15:44:12 +02:00
Linus Torvalds 77c688ac87 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The big thing in this pile is Eric's unmount-on-rmdir series; we
  finally have everything we need for that.  The final piece of prereqs
  is delayed mntput() - now filesystem shutdown always happens on
  shallow stack.

  Other than that, we have several new primitives for iov_iter (Matt
  Wilcox, culled from his XIP-related series) pushing the conversion to
  ->read_iter()/ ->write_iter() a bit more, a bunch of fs/dcache.c
  cleanups and fixes (including the external name refcounting, which
  gives consistent behaviour of d_move() wrt procfs symlinks for long
  and short names alike) and assorted cleanups and fixes all over the
  place.

  This is just the first pile; there's a lot of stuff from various
  people that ought to go in this window.  Starting with
  unionmount/overlayfs mess...  ;-/"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (60 commits)
  fs/file_table.c: Update alloc_file() comment
  vfs: Deduplicate code shared by xattr system calls operating on paths
  reiserfs: remove pointless forward declaration of struct nameidata
  don't need that forward declaration of struct nameidata in dcache.h anymore
  take dname_external() into fs/dcache.c
  let path_init() failures treated the same way as subsequent link_path_walk()
  fix misuses of f_count() in ppp and netlink
  ncpfs: use list_for_each_entry() for d_subdirs walk
  vfs: move getname() from callers to do_mount()
  gfs2_atomic_open(): skip lookups on hashed dentry
  [infiniband] remove pointless assignments
  gadgetfs: saner API for gadgetfs_create_file()
  f_fs: saner API for ffs_sb_create_file()
  jfs: don't hash direct inode
  [s390] remove pointless assignment of ->f_op in vmlogrdr ->open()
  ecryptfs: ->f_op is never NULL
  android: ->f_op is never NULL
  nouveau: __iomem misannotations
  missing annotation in fs/file.c
  fs: namespace: suppress 'may be used uninitialized' warnings
  ...
2014-10-13 11:28:42 +02:00
Dave Airlie dfda0df342 drm/mst: rework payload table allocation to conform better.
The old code has problems with the Dell MST monitors due to some
assumptions I made that weren't true.

I initially thought the Virtual Channel Payload IDs had to be in
the DPCD table in ascending order, however it appears that assumption
is bogus.

The old code also assumed it was possible to insert a member
into the table and it would move other members up, like it does
when you remove table entries, however reality has shown this
isn't true.

So the new code allocates VCPIs separate from entries in the payload
tracking table, and when we remove an entry from the DPCD table,
I shuffle the tracking payload entries around in the struct.

This appears to make VT switch more robust (still not perfect)
with an MST enabled Dell monitor.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-10-13 14:40:53 +10:00
Al Viro 7b600f2abb don't need that forward declaration of struct nameidata in dcache.h anymore
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-12 17:09:06 -04:00
Al Viro 810bb17267 take dname_external() into fs/dcache.c
never used outside and it's too low-level for legitimate uses outside
of fs/dcache.c anyway

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-12 17:09:05 -04:00
Linus Torvalds 5e40d331bd Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris.

Mostly ima, selinux, smack and key handling updates.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
  integrity: do zero padding of the key id
  KEYS: output last portion of fingerprint in /proc/keys
  KEYS: strip 'id:' from ca_keyid
  KEYS: use swapped SKID for performing partial matching
  KEYS: Restore partial ID matching functionality for asymmetric keys
  X.509: If available, use the raw subjKeyId to form the key description
  KEYS: handle error code encoded in pointer
  selinux: normalize audit log formatting
  selinux: cleanup error reporting in selinux_nlmsg_perm()
  KEYS: Check hex2bin()'s return when generating an asymmetric key ID
  ima: detect violations for mmaped files
  ima: fix race condition on ima_rdwr_violation_check and process_measurement
  ima: added ima_policy_flag variable
  ima: return an error code from ima_add_boot_aggregate()
  ima: provide 'ima_appraise=log' kernel option
  ima: move keyring initialization to ima_init()
  PKCS#7: Handle PKCS#7 messages that contain no X.509 certs
  PKCS#7: Better handling of unsupported crypto
  KEYS: Overhaul key identification when searching for asymmetric keys
  KEYS: Implement binary asymmetric key ID handling
  ...
2014-10-12 10:13:55 -04:00
Linus Torvalds 9837acff77 This set has a few minor updates, but the big change is the redesign
of the trampoline logic.
 
 The trampoline logic of 3.17 required a descriptor for every function
 that is registered to be traced and uses a trampoline. Currently,
 only the function graph tracer uses a trampoline, but if you were
 to trace all 32,000 (give or take a few thousand) functions with the
 function graph tracer, it would create 32,000 descriptors to let us
 know that there's a trampoline associated with it. This takes up a bit
 of memory when there's a better way to do it.
 
 The redesign now reuses the ftrace_ops' (what registers the function graph
 tracer) hash tables. The hash tables tell ftrace what the tracer
 wants to trace or doesn't want to trace. There's two of them: one
 that tells us what to trace, the other tells us what not to trace.
 If the first one is empty, it means all functions should be traced,
 otherwise only the ones that are listed should be. The second hash table
 tells us what not to trace, and if it is empty, all functions may be
 traced, and if there's any listed, then those should not be traced
 even if they exist in the first hash table.
 
 It took a bit of massaging, but now these hashes can be used to
 keep track of what has a trampoline and what does not, and allows
 the ftrace accounting to work. Now we can trace all functions when using
 the function graph trampoline, and avoid needing to create any special
 descriptors to hold all the functions that are being traced.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUMwp6AAoJEKQekfcNnQGuIoAIAIsqvTYAnULyzKKCweEZYUfb
 XJzz6cN5FPGSXkoeda1ZvnfOlHjFRrWNXzXMB0jZYR2hU++pe3xjtghaNzvbRcyV
 wlwDUTsnz235OcOuFEspIwBamhtah96Coiwf/2z/2q6srXlHd/1TrqXB+Fpj1tEK
 BkAViGDUEdq/eLZX7nGen36cTb5gpNqV9NjY1CVAK6bSkU/xXk/ArqFy1qy0MPnc
 z/9bXdIf+Z6VnG/IzwRc2rwiMFuD1+lpjLuHEqagoHp1D4teCjWPSJl1EKCVAS40
 GaCOTUZi92zIVgx8Bb28TglSla9MN65CO3E8dw6hlXUIsNz1p0eavpctnC6ac/Y=
 =vDpP
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "This set has a few minor updates, but the big change is the redesign
  of the trampoline logic.

  The trampoline logic of 3.17 required a descriptor for every function
  that is registered to be traced and uses a trampoline.  Currently,
  only the function graph tracer uses a trampoline, but if you were to
  trace all 32,000 (give or take a few thousand) functions with the
  function graph tracer, it would create 32,000 descriptors to let us
  know that there's a trampoline associated with it.  This takes up a
  bit of memory when there's a better way to do it.

  The redesign now reuses the ftrace_ops' (what registers the function
  graph tracer) hash tables.  The hash tables tell ftrace what the
  tracer wants to trace or doesn't want to trace.  There's two of them:
  one that tells us what to trace, the other tells us what not to trace.
  If the first one is empty, it means all functions should be traced,
  otherwise only the ones that are listed should be.  The second hash
  table tells us what not to trace, and if it is empty, all functions
  may be traced, and if there's any listed, then those should not be
  traced even if they exist in the first hash table.

  It took a bit of massaging, but now these hashes can be used to keep
  track of what has a trampoline and what does not, and allows the
  ftrace accounting to work.  Now we can trace all functions when using
  the function graph trampoline, and avoid needing to create any special
  descriptors to hold all the functions that are being traced"

* tag 'trace-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Only disable ftrace_enabled to test buffer in selftest
  ftrace: Add sanity check when unregistering last ftrace_ops
  kernel: trace_syscalls: Replace rcu_assign_pointer() with RCU_INIT_POINTER()
  tracing: generate RCU warnings even when tracepoints are disabled
  ftrace: Replace tramp_hash with old_*_hash to save space
  ftrace: Annotate the ops operation on update
  ftrace: Grab any ops for a rec for enabled_functions output
  ftrace: Remove freeing of old_hash from ftrace_hash_move()
  ftrace: Set callback to ftrace_stub when no ops are registered
  ftrace: Add helper function ftrace_ops_get_func()
  ftrace: Add separate function for non recursive callbacks
2014-10-12 07:27:19 -04:00
Linus Torvalds ca321885b0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "This set fixes a bunch of fallout from the changes that went in during
  this merge window, particularly:

   - Fix fsl_pq_mdio (Claudiu Manoil) and fm10k (Pranith Kumar) build
     failures.

   - Several networking drivers do atomic_set() on page counts where
     that's not exactly legal.  From Eric Dumazet.

   - Make __skb_flow_get_ports() work cleanly with unaligned data, from
     Alexander Duyck.

   - Fix some kernel-doc buglets in rfkill and netlabel, from Fabian
     Frederick.

   - Unbalanced enable_irq_wake usage in bcmgenet and systemport
     drivers, from Florian Fainelli.

   - pxa168_eth needs to depend on HAS_DMA, from Geert Uytterhoeven.

   - Multi-dequeue in the qdisc layer severely bypasses the fairness
     limits the previous code used to enforce, reintroduce in a way that
     at the same time doesn't compromise bulk dequeue opportunities.
     From Jesper Dangaard Brouer.

   - macvlan receive path unnecessarily hops through a softirq by using
     netif_rx() instead of netif_receive_skb().  From Jason Baron"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits)
  net: systemport: avoid unbalanced enable_irq_wake calls
  net: bcmgenet: avoid unbalanced enable_irq_wake calls
  net: bcmgenet: fix off-by-one in incrementing read pointer
  net: fix races in page->_count manipulation
  mlx4: fix race accessing page->_count
  ixgbe: fix race accessing page->_count
  igb: fix race accessing page->_count
  fm10k: fix race accessing page->_count
  net/phy: micrel: Add clock support for KSZ8021/KSZ8031
  flow-dissector: Fix alignment issue in __skb_flow_get_ports
  net: filter: fix the comments
  Documentation: replace __sk_run_filter with __bpf_prog_run
  macvlan: optimize the receive path
  macvlan: pass 'bool' type to macvlan_count_rx()
  drivers: net: xgene: Add 10GbE ethtool support
  drivers: net: xgene: Add 10GbE support
  drivers: net: xgene: Preparing for adding 10GbE support
  dtb: Add 10GbE node to APM X-Gene SoC device tree
  Documentation: dts: Update section header for APM X-Gene
  MAINTAINERS: Update APM X-Gene section
  ...
2014-10-11 21:19:00 -04:00
Linus Torvalds fd9879b9bb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc updates from Michael Ellerman:
 "Here's a first pull request for powerpc updates for 3.18.

  The bulk of the additions are for the "cxl" driver, for IBM's Coherent
  Accelerator Processor Interface (CAPI).  Most of it's in drivers/misc,
  which Greg & Arnd maintain, Greg said he was happy for us to take it
  through our tree.

  There's the usual minor cleanups and fixes, including a bit of noise
  in drivers from some of those.  A bunch of updates to our EEH code,
  which has been getting more testing.  Several nice speedups from
  Anton, including 20% in clear_page().

  And a bunch of updates for freescale from Scott"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits)
  cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking
  cxl: Add documentation for userspace APIs
  cxl: Add driver to Kbuild and Makefiles
  cxl: Add userspace header file
  cxl: Driver code for powernv PCIe based cards for userspace access
  cxl: Add base builtin support
  powerpc/mm: Add hooks for cxl
  powerpc/opal: Add PHB to cxl mode call
  powerpc/mm: Add new hash_page_mm()
  powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts
  cxl: Add new header for call backs and structs
  powerpc/powernv: Split out set MSI IRQ chip code
  powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize
  powerpc/msi: Improve IRQ bitmap allocator
  powerpc/cell: Make spu_flush_all_slbs() generic
  powerpc/cell: Move data segment faulting code out of cell platform
  powerpc/cell: Move spu_handle_mm_fault() out of cell platform
  powerpc/pseries: Use new defines when calling H_SET_MODE
  powerpc: Update contact info in Documentation files
  powerpc/perf/hv-24x7: Simplify catalog_read()
  ...
2014-10-11 20:34:00 -04:00
Linus Torvalds 81ae31d782 xen: features and fixes for 3.18-rc0
- Add pvscsi frontend and backend drivers.
 - Remove _PAGE_IOMAP PTE flag, freeing it for alternate uses.
 - Try and keep memory contiguous during PV memory setup (reduces
   SWIOTLB usage).
 - Allow front/back drivers to use threaded irqs.
 - Support large initrds in PV guests.
 - Fix PVH guests in preparation for Xen 4.5
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJUNonmAAoJEFxbo/MsZsTRHAQH/inCjpCT+pkvTB0YAVfVvgMI
 gUogT8G+iB2MuCNpMffGIt8TAVXwcVtnOLH9ABH3IBVehzgipIbIiVEM9YhjrYvU
 1rgIKBpmZqSpjDHoIHpdHeCH67cVnRzA/PyoxZWLxPNmQ0t6bNf9yeAcCXK9PfUc
 7EAblUDmPGSx9x/EUnOKNNaZSEiUJZHDBXbMBLllk1+5H1vfKnpFCRGMG0IrfI44
 KVP2NX9Gfa05edMZYtH887FYyjFe2KNV6LJvE7+w7h2Dy0yIzf7y86t0l4n8gETb
 plvEUJ/lu9RYzTiZY/RxgBFYVTV59EqT45brSUtoe2Jcp8GSwiHslTHdfyFBwSo=
 =gw4d
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.18-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen updates from David Vrabel:
 "Features and fixes:

   - Add pvscsi frontend and backend drivers.
   - Remove _PAGE_IOMAP PTE flag, freeing it for alternate uses.
   - Try and keep memory contiguous during PV memory setup (reduces
     SWIOTLB usage).
   - Allow front/back drivers to use threaded irqs.
   - Support large initrds in PV guests.
   - Fix PVH guests in preparation for Xen 4.5"

* tag 'stable/for-linus-3.18-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (22 commits)
  xen: remove DEFINE_XENBUS_DRIVER() macro
  xen/xenbus: Remove BUG_ON() when error string trucated
  xen/xenbus: Correct the comments for xenbus_grant_ring()
  x86/xen: Set EFER.NX and EFER.SCE in PVH guests
  xen: eliminate scalability issues from initrd handling
  xen: sync some headers with xen tree
  xen: make pvscsi frontend dependant on xenbus frontend
  arm{,64}/xen: Remove "EXPERIMENTAL" in the description of the Xen options
  xen-scsifront: don't deadlock if the ring becomes full
  x86: remove the Xen-specific _PAGE_IOMAP PTE flag
  x86/xen: do not use _PAGE_IOMAP PTE flag for I/O mappings
  x86: skip check for spurious faults for non-present faults
  xen/efi: Directly include needed headers
  xen-scsiback: clean up a type issue in scsiback_make_tpg()
  xen-scsifront: use GFP_ATOMIC under spin_lock
  MAINTAINERS: Add xen pvscsi maintainer
  xen-scsiback: Add Xen PV SCSI backend driver
  xen-scsifront: Add Xen PV SCSI frontend driver
  xen: Add Xen pvSCSI protocol description
  xen/events: support threaded irqs for interdomain event channels
  ...
2014-10-11 20:29:01 -04:00
Linus Torvalds ef4a48c513 File locking related changes for v3.18 (pile #1)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNZK4AAoJEAAOaEEZVoIVI08P/iM7eaIVRnqaqtWw/JBzxiba
 EMDlJYUBSlv6lYk9s8RJT4bMmcmGAKSYzVAHSoPahzNcqTDdFLeDTLGxJ8uKBbjf
 d1qRRdH1yZHGUzCvJq3mEendjfXn435Y3YburUxjLfmzrzW7EbMvndiQsS5dhAm9
 PEZ+wrKF/zFL7LuXa1YznYrbqOD/GRsJAXGEWc3kNwfS9avephVG/RI3GtpI2PJj
 RY1mf8P7+WOlrShYoEuUo5aqs01MnU70LbqGHzY8/QKH+Cb0SOkCHZPZyClpiA+G
 MMJ+o2XWcif3BZYz+dobwz/FpNZ0Bar102xvm2E8fqByr/T20JFjzooTKsQ+PtCk
 DetQptrU2gtyZDKtInJUQSDPrs4cvA13TW+OEB1tT8rKBnmyEbY3/TxBpBTB9E6j
 eb/V3iuWnywR3iE+yyvx24Qe7Pov6deM31s46+Vj+GQDuWmAUJXemhfzPtZiYpMT
 exMXTyDS3j+W+kKqHblfU5f+Bh1eYGpG2m43wJVMLXKV7NwDf8nVV+Wea962ga+w
 BAM3ia4JRVgRWJBPsnre3lvGT5kKPyfTZsoG+kOfRxiorus2OABoK+SIZBZ+c65V
 Xh8VH5p3qyCUBOynXlHJWFqYWe2wH0LfbPrwe9dQwTwON51WF082EMG5zxTG0Ymf
 J2z9Shz68zu0ok8cuSlo
 =Hhee
 -----END PGP SIGNATURE-----

Merge tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux

Pull file locking related changes from Jeff Layton:
 "This release is a little more busy for file locking changes than the
  last:

   - a set of patches from Kinglong Mee to fix the lockowner handling in
     knfsd
   - a pile of cleanups to the internal file lease API.  This should get
     us a bit closer to allowing for setlease methods that can block.

  There are some dependencies between mine and Bruce's trees this cycle,
  and I based my tree on top of the requisite patches in Bruce's tree"

* tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux: (26 commits)
  locks: fix fcntl_setlease/getlease return when !CONFIG_FILE_LOCKING
  locks: flock_make_lock should return a struct file_lock (or PTR_ERR)
  locks: set fl_owner for leases to filp instead of current->files
  locks: give lm_break a return value
  locks: __break_lease cleanup in preparation of allowing direct removal of leases
  locks: remove i_have_this_lease check from __break_lease
  locks: move freeing of leases outside of i_lock
  locks: move i_lock acquisition into generic_*_lease handlers
  locks: define a lm_setup handler for leases
  locks: plumb a "priv" pointer into the setlease routines
  nfsd: don't keep a pointer to the lease in nfs4_file
  locks: clean up vfs_setlease kerneldoc comments
  locks: generic_delete_lease doesn't need a file_lock at all
  nfsd: fix potential lease memory leak in nfs4_setlease
  locks: close potential race in lease_get_mtime
  security: make security_file_set_fowner, f_setown and __f_setown void return
  locks: consolidate "nolease" routines
  locks: remove lock_may_read and lock_may_write
  lockd: rip out deferred lock handling from testlock codepath
  NFSD: Get reference of lockowner when coping file_lock
  ...
2014-10-11 13:21:34 -04:00
Linus Torvalds 90d0c376f5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
 "The largest set of changes here come from Miao Xie.  He's cleaning up
  and improving read recovery/repair for raid, and has a number of
  related fixes.

  I've merged another set of fsync fixes from Filipe, and he's also
  improved the way we handle metadata write errors to make sure we force
  the FS readonly if things go wrong.

  Otherwise we have a collection of fixes and cleanups.  Dave Sterba
  gets a cookie for removing the most lines (thanks Dave)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (139 commits)
  btrfs: Fix compile error when CONFIG_SECURITY is not set.
  Btrfs: fix compiles when CONFIG_BTRFS_FS_RUN_SANITY_TESTS is off
  btrfs: Make btrfs handle security mount options internally to avoid losing security label.
  Btrfs: send, don't delay dir move if there's a new parent inode
  btrfs: add more superblock checks
  Btrfs: fix race in WAIT_SYNC ioctl
  Btrfs: be aware of btree inode write errors to avoid fs corruption
  Btrfs: remove redundant btrfs_verify_qgroup_counts declaration.
  btrfs: fix shadow warning on cmp
  Btrfs: fix compilation errors under DEBUG
  Btrfs: fix crash of btrfs_release_extent_buffer_page
  Btrfs: add missing end_page_writeback on submit_extent_page failure
  btrfs: Fix the wrong condition judgment about subset extent map
  Btrfs: fix build_backref_tree issue with multiple shared blocks
  Btrfs: cleanup error handling in build_backref_tree
  btrfs: move checks for DUMMY_ROOT into a helper
  btrfs: new define for the inline extent data start
  btrfs: kill extent_buffer_page helper
  btrfs: drop constant param from btrfs_release_extent_buffer_page
  btrfs: hide typecast to definition of BTRFS_SEND_TRANS_STUB
  ...
2014-10-11 08:03:52 -04:00
Linus Torvalds 27a9716bc8 VFIO updates for v3.18-rc1
- Nested IOMMU extension to type1 (Will Deacon)
  - Restore MSIx message before enabling (Gavin Shan)
  - Fix remove path locking (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUOOETAAoJECObm247sIsihDQP/jADEe9KFu4ymWu7rqi24w1L
 81hGNLXlfx2PPomluN3jENpyueo7vWdP5yZ8q/bi6oF6UbShL8Po01UKHOJzJJwW
 8GW86YcNsmPz/jl8Jcdbkex3dKvT1OzrDjFjCiKTJBHxE9nEdtWlRV8mO1pwd00t
 YFiXF8xFbkpHExMiQNU36rq/fzZCTOu4ZpCK9kDT7Sy+lsKAnGoXuM1IZK+7DGJo
 jcsMF32DVDmji6riy3uHHPc0qprP24QNVy6FfOmLEUvuOEIUOxMAYM9je9mmsHeS
 CeR/NHexr4RgYQE33jL1w8A1saT0rbu7DSKSa7OQebnY2Zte+oncLtqFZR2/Wylh
 jBU5r7P3PdxM6ykqEeC/3ytx7iFX6c7jc0SU4I5m8bFexmUQXqOko28gGIt0OL3n
 R8CmNF/MDs3gqYprhW6MvSJI1diY1+pX7pX0e7k7lDAoZ1QOjPNSGv+YOfF3H1YB
 AggIVxIKXW0T0bQ/hKcQiDKkxQ88vi1hld2LknbiBW9nMNLjNkxl2RZSGunFvWWN
 LzOYkBgR6rrTbhTvsWApsfYguYtGkgAGGJZSR1oev0BJnx4UHOfL1bykJRyUHdUd
 KDSBEni5TY65087IKD93nkyRhassszOa9XHmRDwQLxQeJCKRZi6bQRSzFZVheXIO
 O3XINOo2wNF1bIrfD/vR
 =s2+/
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v3.18-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:
 - Nested IOMMU extension to type1 (Will Deacon)
 - Restore MSIx message before enabling (Gavin Shan)
 - Fix remove path locking (Alex Williamson)

* tag 'vfio-v3.18-rc1' of git://github.com/awilliam/linux-vfio:
  vfio-pci: Fix remove path locking
  drivers/vfio: Export vfio_spapr_iommu_eeh_ioctl() with GPL
  vfio/pci: Restore MSIx message prior to enabling
  PCI: Export MSI message relevant functions
  vfio/iommu_type1: add new VFIO_TYPE1_NESTING_IOMMU IOMMU type
  iommu: introduce domain attribute for nesting IOMMUs
2014-10-11 06:49:24 -04:00
Linus Torvalds e98d6e7f76 Devicetree changes for v3.18
This branch contains bug fixes and new features for the devicetree code.
 Most of the changes are either new testcases for the selftest code or
 documentation changes. The most notable change is the addition of a
 phandle resolver for use when grafting in a second device tree blob into
 the core tree. The resolver isn't currently used by anything other than
 the selftest module, but it will be used to support device tree
 overlays; probably in the v3.19 timeframe.
 
 Also note that I've moved my normal tree from git.secretlab.ca to
 git.kernel.org.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUOEytAAoJEMWQL496c2LNq9AP/RPbMr7rPLX3flHJbWOeHCF7
 U1TzAZDoBdl7xHFCrsLF5yQlt5rGleJinMxphf8XK9Aui7L18NO4LDqoYGeGOC/u
 hYPcfpLuyRJiBr2xVyt+e0zivPe62P5618wUP4DmqZq7rQ3IYR71bR/g2K4N33VG
 LLD4HmQUCfAUpsF9ruijSShM9ez21oloURSR02xD+yvCfqpXjaysp5XLDJJQLfql
 O2E084QOk0d5LI+buTdmenMzuOAa8TrmDwdEKpbL4maf4Frr7H5QQnQ7xrIkUR0w
 Lu9XxjGhmNG4iLSQcH4lmWpzf6N9nHvfVmjhCZ3UdpYc651v6sb0Lyi8rYWMne2E
 rUoOSpfmUgQ1WlAsFp5R6USUyrJd1Xe0hlqwCwVl97psNLBcZrYmi7YEYWugAAep
 IBHrJk80exBVASErUXr4dgRI257AuHMhIiJxlyaec+mSGJBIzjdjrJbZDtdKVPWw
 liY0PthfFPJUWTjmWiiDK00m3dtpoxnw/ugTAYAKuQGCyXdgcMKNJxwJtpts8kxY
 jDCaNpr1Jf69b0nn1HSlmI40QVgjOnPfNvXGVbQBMxzHorxb1GEiv/uGFavw2bzo
 aEZjxq1/uKMWyvkbJSsGQjGQXuKKwyj5iJ0sSd6U2JfD4Pze+1o+FaWMGo6Bcz7o
 tpbR+vQRBIV2f6pc4PzR
 =VYfI
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux

Pull devicetree changes from Grant Likely:
 "This branch contains bug fixes and new features for the devicetree
  code.

  Most of the changes are either new testcases for the selftest code or
  documentation changes.  The most notable change is the addition of a
  phandle resolver for use when grafting in a second device tree blob
  into the core tree.  The resolver isn't currently used by anything
  other than the selftest module, but it will be used to support device
  tree overlays; probably in the v3.19 timeframe.

  Also note that I've moved my normal tree from git.secretlab.ca to
  git.kernel.org"

* tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux:
  of/selftest: Move hash table off stack to fix large frame size
  To remove non-ascii characters in of_selftest.txt
  of/selftest: Use the resolver to fixup phandles
  of: Introduce Device Tree resolve support.
  of/selftest: Add a test for duplicate phandles
  of: Don't try to search when phandle == 0
  of/selftest: Test structure of device tree
  of: Fix NULL dereference in selftest removal code
  of: add vendor prefix for Chipidea
  of: Add vendor prefix for Innolux Corporation
  of: Add vendor prefix for Sitronix
  devicetree: bindings: Document Gateworks vendor prefix
  of: Add vendor prefix for Energy Micro
  dt/documentation: add specification of dma bus information
2014-10-11 06:47:50 -04:00
Linus Torvalds f43b179bbd MMC core:
- Fix SDIO IRQ bug.
  - MMC regulator improvements.
  - Fix slot-gpio card detect bug.
  - Add support for Driver Stage Register.
  - Convert the common MMC OF parser to use GPIO descriptors.
  - Convert MMC_CAP2_NO_MULTI_READ into a callback, ->multi_io_quirk().
  - Some additional minor fixes.
 
 MMC host:
  - mmci: Support Qualcomm specific DML layer for DMA.
  - dw_mmc: Use common MMC regulators.
  - dw_mmc: Add support for Rock-chips RK3288.
  - tmio: Enable runtime PM support.
  - tmio: Add support for R-Car Gen2 SoCs.
  - tmio: Several fixes and improvements.
  - omap_hsmmc: Removed Balaji from MAINTAINERS.
  - jz4740: add DMA and pre/post support.
  - sdhci: Add support for Intel Braswell.
  - sdhci: Several fixes and improvements.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNoFRAAoJEP4mhCVzWIwp+oQP/3a9Rs85+lKwnaDtCotCnvps
 LF2R1qiFbeTgQ4XwJvOctuX0VX3/9/XTRhXq+/txA8phlXzqL5BarbXv8WfLILJJ
 DgXDt/lTeW1NzJ9WYjrmV/rsH7qlbyIq6I+7kXVT15M86Qqx40DF0hSx/idDKDc4
 1ly4trLh0ZeqsM10AR9nu6h/ykVBblHOLSnMZXbBhtmIVshvNg+5KRQkSmwtvTKy
 /DswgxmuM1H1Z0T+qNejh4AZSCvxYPlwN06eqYzpYrGuoPH+SafJVws5o1G9z9SX
 t/A9i1QDxFtvDP0u1twEAYv0R4e3H24OPit3R8p2tgMUw683576DPYkF2A13Yzxj
 c3mYiTAPK8UfRc9kWxCRSkaI38URna1+t7hHRuT/Ha6DBlAvHpRL+wIu+/25XVh+
 vNwOmECtT9DzmL2UP+SHLQtyyy3guAFSsFP5RJzuA5wcYeLpNYobcJJCGuziLNYi
 PZ55O+2HRtd7my4A7NiXAib+CXTPs4VY0XY1tBgaWHl2sxFj/mULILaf+3zxpiWg
 Jc8rWkUMpy1nP1OXUrCRBKbgr/loghUOEM6hozggeisDwpjh7Rm5OXZRj6JdO4QT
 DLCl8NQKL8Ex33XoS45LoF2uuTfLt/E52CT0Sic4JdpwvIDTwlhxQR/Yo5gWuCnQ
 L+J+zbclHjORG5EuIUsw
 =VFRY
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v3.18-1' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC updates from Ulf Hansson:
 "MMC core:
   - Fix SDIO IRQ bug
   - MMC regulator improvements
   - Fix slot-gpio card detect bug
   - Add support for Driver Stage Register
   - Convert the common MMC OF parser to use GPIO descriptors
   - Convert MMC_CAP2_NO_MULTI_READ into a callback, ->multi_io_quirk()
   - Some additional minor fixes

  MMC host:
   - mmci: Support Qualcomm specific DML layer for DMA
   - dw_mmc: Use common MMC regulators
   - dw_mmc: Add support for Rock-chips RK3288
   - tmio: Enable runtime PM support
   - tmio: Add support for R-Car Gen2 SoCs
   - tmio: Several fixes and improvements
   - omap_hsmmc: Removed Balaji from MAINTAINERS
   - jz4740: add DMA and pre/post support
   - sdhci: Add support for Intel Braswell
   - sdhci: Several fixes and improvements"

* tag 'mmc-v3.18-1' of git://git.linaro.org/people/ulf.hansson/mmc: (119 commits)
  ARM: dts: fix MMC2 regulators for Exynos5420 Arndale Octa board
  mmc: sdhci-acpi: Fix Braswell eMMC timeout clock frequency
  mmc: sdhci-acpi: Pass HID and UID to probe_slot
  mmc: sdhci-acpi: Get UID directly from acpi_device
  mmc, sdhci, bcm-kona, LLVMLinux: Remove use of __initconst
  mmc: sdhci-pci: Fix Braswell eMMC timeout clock frequency
  mmc: sdhci: Let a driver override timeout clock frequency
  mmc: sdhci-pci: Add Bay Trail and Braswell SD card detect
  mmc: sdhci-pci: Set SDHCI_QUIRK2_STOP_WITH_TC for Intel BYT host controllers
  mmc: sdhci-acpi: Add a HID and UID for a SD Card host controller
  mmc: sdhci-acpi: Set SDHCI_QUIRK2_STOP_WITH_TC for Intel host controllers
  mmc: sdhci: Add quirk for always getting TC with stop cmd
  mmc: core: restore detect line inversion semantics
  mmc: Fix incorrect warning when setting 0 Hz via debugfs
  mmc: Fix use of wrong device in mmc_gpiod_free_cd()
  mmc: atmel-mci: fix mismatched section on atmci_cleanup_slot
  mmc: rtsx_pci: Set power related cap2 macros
  mmc: core: Add new power_mode MMC_POWER_UNDEFINED
  mmc: sdhci: execute tuning when device is not busy
  mmc: atmel-mci: Release mmc resources on failure in probe
  ..
2014-10-11 06:34:22 -04:00
Linus Torvalds a2ce35273c sound updates for 3.18-rc1
This time it's a relatively calm update batch, but the amount isn't
 too small in the end.  Here we go over some highlights:
 
 - ALSA core
   - One major change is the support of nonatomic PCM operations.
     This allows the trigger and other callbacks to call schedule(),
     which would be useful for mailbox type communications.  Already
     some drivers (Digigram ones) have been converted to use together
     with threaded irqs as an example.
   - Improvement / fixes of DSD PCM format support
 
 - HD-audio
   - Large volume of rewrites are found in Realtek codec driver for
     converting Dell and HP quirks to generic forms.
   - Inverted dmic code cleanup from David.
   - Realtek COEF access has been optimized.
   - Now HD-audio jack infrastructure allows multiple callbacks, which
     fixes / simplifies the jack-dependent power controls on STAC/IDT
     and VIA codecs.
   - Many additional device-specific fixups as usual
   - A few deadcode cleanups, CA0132 code cleanup, etc.
 
 - ASoC
   - More componentization work from Lars-Peter, this time mainly
     cleaning up the suspend and bias level transition callbacks.
   - Real system support for the Intel drivers and a bunch of fixes
     and enhancements for the associated CODEC drivers, this is going
     to need a lot quirks over time due to the lack of any firmware
     description of the boards.
   - Jack detect support for simple card from Dylan Reid.
   - A bunch of small fixes and enhancements for the Freescale
     drivers.
   - New drivers for Analog Devices SSM4567, Cirrus Logic CS35L32,
     Everest Semiconductor ES8328 and Freescale cards using the ASRC
     in newer i.MX processors.
   - A few simple-card fixes, mostly cleanups but also a fix for
   - interaction between GPIO 0 and simple-card.
 
 - Misc
   - Virtuoso / Oxygen updates by Clemens
   - USB-audio: Yamaha MOTIF XF MIDI port name fixes
   - Conversion of kernel messages to standard dev_*() in ctxfi
     driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJUNrU8AAoJEGwxgFQ9KSmkxZYQAI7DgkrCx2S1dIHij99jtJGz
 FjhFSO/x8Jj0lgXkoCLRHXFgtq3iYjbyS9s0aokIpvAewD9SreVE977DsMqqZVJz
 9FPOkv4keuxyJZ46mxJpYswDeazCjEYNFVbkYHhwsCiiyce8HyWMpe38tWrQfwSV
 loJYbnEfjpTxFc4JPaQK3pIICRofQCZJonWv20K25pm7L8yG29jtqFsMQWjDCONb
 ZVNwnvW61gl6ouuHincGGqVtj8pmkgKlU0l0bMgRNflRqRusrpQdobW56OEoM13H
 Tq7xMp5Yxzg7j9sM/QzL+VAksHc1u1aBzg8XZKXjk9PsmH26h1gq98W2BDKQkMzF
 U7MQaUks4x+apJcVVDoi5+15AOsyGoxNq9ahc0fe4ADTMSe94or78GaKptWMR+NK
 pA2pX2zwvool4TYj+AtcK8SNwfVeBjSua9eNnNpaNTKuwPIX6Vch0O6jaEbQZSaC
 92JYhqiC6HsW5tbhN3afTmeHxelBCpQfWPLVtgEl/eIhY3B72/1ZXWCCqwY+Ur8E
 D3OCtuAjFnzvzr/gdHZWEnMu3HGt/xqOMVE0EHTQWokQpX2E3IF724YcttAzQakw
 wS1ppeWSO5l+TkplqcqurEA7Bq1mN6bO/q9UK+iduIiYmvtNI3fDPTlXXy2SxRUz
 QuIEpsIKuZFFumFksQd9
 =S4IQ
 -----END PGP SIGNATURE-----

Merge tag 'sound-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "This time it's a relatively calm update batch, but the amount isn't
  too small in the end.  Here we go over some highlights:

  ALSA core:
   - One major change is the support of nonatomic PCM operations.  This
     allows the trigger and other callbacks to call schedule(), which
     would be useful for mailbox type communications.  Already some
     drivers (Digigram ones) have been converted to use together with
     threaded irqs as an example.
   - Improvement / fixes of DSD PCM format support

  HD-audio:
   - Large volume of rewrites are found in Realtek codec driver for
     converting Dell and HP quirks to generic forms.
   - Inverted dmic code cleanup from David.
   - Realtek COEF access has been optimized.
   - Now HD-audio jack infrastructure allows multiple callbacks, which
     fixes / simplifies the jack-dependent power controls on STAC/IDT
     and VIA codecs.
   - Many additional device-specific fixups as usual
   - A few deadcode cleanups, CA0132 code cleanup, etc.

  ASoC:
   - More componentization work from Lars-Peter, this time mainly
     cleaning up the suspend and bias level transition callbacks.
   - Real system support for the Intel drivers and a bunch of fixes and
     enhancements for the associated CODEC drivers, this is going to
     need a lot quirks over time due to the lack of any firmware
     description of the boards.
   - Jack detect support for simple card from Dylan Reid.
   - A bunch of small fixes and enhancements for the Freescale drivers.
   - New drivers for Analog Devices SSM4567, Cirrus Logic CS35L32,
     Everest Semiconductor ES8328 and Freescale cards using the ASRC in
     newer i.MX processors.
   - A few simple-card fixes, mostly cleanups but also a fix for
     interaction between GPIO 0 and simple-card.

  Misc:
   - Virtuoso / Oxygen updates by Clemens
   - USB-audio: Yamaha MOTIF XF MIDI port name fixes
   - Conversion of kernel messages to standard dev_*() in ctxfi driver"

* tag 'sound-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (251 commits)
  ASoC: mc13783: Ensure we only try to dereference valid of_nodes
  ASoC: rockchip-i2s: fix infinite loop in rockchip_snd_txctrl
  ALSA: hda - Add dock port support to Thinkpad L440 (71aa:501e)
  ALSA: Allow pass NULL dev for snd_pci_quirk_lookup()
  ASoC: imx-es8328: Fix of_node_put() call with uninitialized object
  ASoC: soc-pcm: fix sig_bits determination in soc_pcm_apply_msb()
  ASoC: simple-card: Initialize headphone and mic GPIO numbers
  ASoC: imx-es8328: Fix missing return code in imx_es8328_probe()
  ALSA: hda - Add dock support for Thinkpad T440 (17aa:2212)
  ALSA: usb: caiaq: check for cdev->n_streams > 1
  ASoC: 88pm860x-codec: Fix possibly missing string termination
  ASoC: core: fix use after free in snd_soc_remove_platform()
  ASoC: soc-dapm: fix use after free
  ALSA: hda - Make the inv dmic handling for Realtek use generic parser
  ALSA: hda - Add Inverted Internal mic for Samsung Ativ book 9 (NP900X3G)
  ALSA: hda - Add inverted internal mic for Asus Aspire 4830T
  ASoC: Intel: byt-rt5640: fix coccinelle warnings
  ASoC: fsl_esai doc: Add "fsl,vf610-esai" as compatible string
  ASoC: da732x: Remove unnecessary KERN_ERR in pr_err()
  ASoC: simple-card: Fix detect gpio documentation.
  ...
2014-10-10 22:13:25 -04:00
Linus Torvalds bf65dea87e edac updates for v3.18-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNsUKAAoJEAhfPr2O5OEVXSMQAJ953vtyqmMEi02NMf24NpA+
 OuTmoe5c8RT7/+YD2edHzG7VgYIE1L4evVOm71XLoPwI3mmGwuAIYZAFvVXsYPtB
 U7xfWHpC5r29DlTQen7L0doD1NnRTQfOWFUtsHnd5ygdrzmIHToXqqN1IjoQudpK
 i8ttU5zshSMt1IPwFh+CXMHkp8wA11hGX4LyonmJ0WD/bCeu8kreilRrK/ehZtAU
 sBjzEif2+xtqAWBaxyZ0IzWBJdYBjo6u68jyifK0liM8oBZ8vov11i6cCZBuGGAy
 eNG6lNBmak77U187yUeeyqjbdhnPy7NPLEvvfDN/C5voGHqPkrKEjH64j54ayeaR
 TQ6u6VlPJ+3RXeqtRfYbMOgHAsxNMpJZLkKx0NNty073RQc2qgX8Xxs4t33+9B37
 TfkoL8fnkh7Mq9x8czRw4n5X4qiw2ZeDDNX4TZPdGP2QQwP3JVDfMOzyr6x+BhMp
 4TwCp+Sr9PeohyGYZ8InjRdxkA3yTrc1CbqSC00lXBe0lt9Pw4aQ9HGFQWSXxYH3
 uZ8PoqBpxy3C5xvcwDQ8kjmu6y0GPtsRHAdb0G3HW3bjMiLuQbVohTxILEvg9HWr
 2tSyKSfUZXJaTckXfWOVelyexkh3IHvQ0AoQFtG6qXQh2HF12S1OV7EoGxTua2Ye
 /XDAjKzdi12tRLlY172L
 =FRmr
 -----END PGP SIGNATURE-----

Merge tag 'edac/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac

Pull edac updates from Mauro Carvalho Chehab:
 "Nothing really exiting here: just one bug fix at sb_edac, and some
  changes to allow other drivers to use some shared PCI addresses"

* tag 'edac/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  sb_edac: Claim a different PCI device
  Move Intel SNB device ids from sb_edac to pci_ids.h
  sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel
2014-10-10 22:07:55 -04:00
Linus Torvalds 4d9708ea5e media updates for v3.18-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNr9iAAoJEAhfPr2O5OEVSygP/iVpHK7JZCFSvy1ly67gUcIw
 zeO2q0Exm3WwApchaCNX0b9qB9A6jeaRiJtuqOgR7L8ksYorku7k12g0IrveK8e4
 UhwscWw1HkYvTR3JG4Z2a8LoYiUatQCgcknICgjJ12fo2fCg2SnzbGp9jKiLqJew
 dx1zOgn5Hslqy+PWQULtkLo/XxdlAX8YNUhXU5q5gxCfhciaJ7Kq+tvM9NodobHG
 u94b10fmOclLug37b+Vpg01pxjqe+X+HbrHzbOsL7dvxW84igqzpyb9+WNH8FGZZ
 +oSu66faokH8rVxzkPyODT8TSwHuqafVF1IFafsFFJpYYfRWiY0SttMACVMuuB3z
 m6kVM9pTApmh736xvzB4JP4i/+aIu2qQftYTybQkTpn1AIy2kw8b09pOWbhEgdjl
 5CfI7I2iSkSviZXMrIe51znIhdxohF7gEN8PyaPto3N1LHVnHAd7/J43nolSSnke
 DE0lQGk+NaGFv/MiESiKC8lSiEGzqpMkrxpOIeDZAsKxQ3ihxKai3kqAYYiPt2+n
 2HVhLsmfMqdd23DGSf7LjhhLqjXKhEC/+LDsLl105keRYLN/TYZuQxieJEDikRF/
 NLJcuuXUQkcsdgrAChAonu1K3roAsgZ8E6BP+814CWZ5LM4xW0kQqqKN6S88eKx2
 HcIz2xwveR6sZBNZE7Kl
 =DUbD
 -----END PGP SIGNATURE-----

Merge tag 'media/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - new IR driver: hix5hd2-ir

 - the virtual test driver (vivi) was replaced by vivid, with has an
   almost complete set of features to emulate most v4l2 devices and
   properly test all sorts of userspace apps

 - the as102 driver had several bugs fixed and was properly split into a
   frontend and a core driver.  With that, it got promoted from staging
   into mainstream

 - one new CI driver got added for CIMaX SP2/SP2HF (sp2 driver)

 - one new frontend driver for Toshiba ISDB-T/ISDB-S demod (tc90522)

 - one new PCI driver for ISDB-T/ISDB-S (pt3 driver)

 - saa7134 driver got support for go7007-based devices

 - added a new PCI driver for Techwell 68xx chipsets (tw68)

 - a new platform driver was added (coda)

 - new tuner drivers: mxl301rf and qm1d1c0042

 - a new DVB USB driver was added for DVBSky S860 & similar devices

 - added a new SDR driver (hackrf)

 - usbtv got audio support

 - several platform drivers are now compiled with COMPILE_TEST

 - a series of compiler fixup patches, making sparse/spatch happier with
   the media stuff and removing several warnings, especially on those
   platform drivers that didn't use to compile on x86

 - Support for several new modern devices got added

 - lots of other fixes, improvements and cleanups

* tag 'media/v3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (544 commits)
  [media] ir-hix5hd2: fix build on c6x arch
  [media] pt3: fix DTV FE I2C driver load error paths
  Revert "[media] media: em28xx - remove reset_resume interface"
  [media] exynos4-is: fix some warnings when compiling on arm64
  [media] usb drivers: use %zu instead of %zd
  [media] pci drivers: use %zu instead of %zd
  [media] dvb-frontends: use %zu instead of %zd
  [media] s5p-mfc: Fix several printk warnings
  [media] s5p_mfc_opr: Fix warnings
  [media] ti-vpe: Fix typecast
  [media] s3c-camif: fix dma_addr_t printks
  [media] s5p_mfc_opr_v6: get rid of warnings when compiled with 64 bits
  [media] s5p_mfc_opr_v5: Fix lots of warnings on x86_64
  [media] em28xx: Fix identation
  [media] drxd: remove a dead code
  [media] saa7146: remove return after BUG()
  [media] cx88: remove return after BUG()
  [media] cx88: fix cards table CodingStyle
  [media] radio-sf16fmr2: declare some structs as static
  [media] radio-sf16fmi: declare pnp_attached as static
  ...
2014-10-10 22:04:49 -04:00
Linus Torvalds 754c780953 Merge branch 'for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull dma-mapping update from Marek Szyprowski:
 "Provide the dma write coherent api (available previously on ARM
  architecture) for all other architectures, which use dma_ops-based dma
  mapping implementation.

  This lets one to use the same code in the device drivers regardless of
  the selected architecture"

* 'for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  dma-mapping: Provide write-combine allocations
  s390: Implement dma_{alloc,free}_attrs()
2014-10-10 16:56:08 -04:00
Linus Torvalds 93834c6419 Immutable branch with restart handler patches for v3.18
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUJQ8/AAoJEMsfJm/On5mBMNgP+QEUHpRKJaOGU3jX/ftHH/t3
 EoNUx7lZt6Q0c9MB2ySAxILYpWUujc9N0tDkRDyW7mTWunF8gEGiRN+iKaSbzcUN
 Y4VffRAbxBasIaBqRtpDl08ycODh6Xu1t8sAao03DdhnMNLGNNO79s3UFHsubdTC
 cXx9mfYR/2SHV/0BXiFvKi8ovdqUspdp9cyZO/qc0PVFGbsADx3MNGGzkvWfgvcE
 6vXnKnUkZrNl5JPiG77kTKZnDsjEMXggmA9DGWKijFCJjGIbuLiuIDf63Zp+eQ52
 mJMRA+ViP/dDgAxY1dkWBcF5nOBT1vTYwLfy69jEoQeHzcomiHVoDKmCSBOpeAEH
 G8VoasWKWYpYnlcOJb+XgkA3QTe6mOPgAPzNsbYr0Ep7hMFw66mOQgKbgi6k4Qts
 HHimG9pnBYpPlBUfvNh+6K4dHAm0C2IyoZyMhKWsyFH6hkhS8TVM8j0gPR8rTTmk
 0a9/e2vxcFnfBe3UAJaqzWRVFsBkOHrTNpG1hvID3Oq8IeywSBXw2VMSR93+mwaB
 sa/GCZKlqHGpOfmtILlhiXQX0E/tTHmcrI2VqyCpX0J2CW+MiGvkcGOwKHOJciSA
 Cj9D68y837QU/DCpMQ6ec/5wqWqZKz8yQb8kxb6vJcL19JcVKdAiPzbuOI49C3Ux
 YxDWoUutzDfVoUD5RhcJ
 =cP1w
 -----END PGP SIGNATURE-----

Merge tag 'restart-handler-for-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull restart handler infrastructure from Guenter Roeck:
 "This series was supposed to be pulled through various trees using it,
  and I did not plan to send a separate pull request.  As it turns out,
  the pinctrl tree did not merge with it, is now upstream, and uses it,
  meaning there are now build failures.

  Please pull this series directly to fix those build failures"

* tag 'restart-handler-for-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  arm/arm64: unexport restart handlers
  watchdog: sunxi: register restart handler with kernel restart handler
  watchdog: alim7101: register restart handler with kernel restart handler
  watchdog: moxart: register restart handler with kernel restart handler
  arm: support restart through restart handler call chain
  arm64: support restart through restart handler call chain
  power/restart: call machine_restart instead of arm_pm_restart
  kernel: add support for kernel restart handler call chain
2014-10-10 16:38:02 -04:00
Sascha Hauer 1fadee0c36 net/phy: micrel: Add clock support for KSZ8021/KSZ8031
The KSZ8021 and KSZ8031 support RMII reference input clocks of 25MHz
and 50MHz. Both PHYs differ in the default frequency they expect
after reset. If this differs from the actual input clock, then
register 0x1f bit 7 must be changed.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10 15:35:13 -04:00
David S. Miller 7b6fa1eef6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter fixes for net-next

This batch contains two fixes for what you have in your net-next,
they are:

1) Remove nf_send_reset6() from header file. This function now resides
   in the nf_reject_ipv6 module. Reported by Eric Dumazet.

2) Fix wrong NFT_REJECT_ICMPX_MAX definition and adjust code to fix
   errors reported by Dan Carpenter's static analysis tools.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10 15:01:09 -04:00
Linus Torvalds c798360cd1 Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:
 "A lot of activities on percpu front.  Notable changes are...

   - percpu allocator now can take @gfp.  If @gfp doesn't contain
     GFP_KERNEL, it tries to allocate from what's already available to
     the allocator and a work item tries to keep the reserve around
     certain level so that these atomic allocations usually succeed.

     This will replace the ad-hoc percpu memory pool used by
     blk-throttle and also be used by the planned blkcg support for
     writeback IOs.

     Please note that I noticed a bug in how @gfp is interpreted while
     preparing this pull request and applied the fix 6ae833c7fe
     ("percpu: fix how @gfp is interpreted by the percpu allocator")
     just now.

   - percpu_ref now uses longs for percpu and global counters instead of
     ints.  It leads to more sparse packing of the percpu counters on
     64bit machines but the overhead should be negligible and this
     allows using percpu_ref for refcnting pages and in-memory objects
     directly.

   - The switching between percpu and single counter modes of a
     percpu_ref is made independent of putting the base ref and a
     percpu_ref can now optionally be initialized in single or killed
     mode.  This allows avoiding percpu shutdown latency for cases where
     the refcounted objects may be synchronously created and destroyed
     in rapid succession with only a fraction of them reaching fully
     operational status (SCSI probing does this when combined with
     blk-mq support).  It's also planned to be used to implement forced
     single mode to detect underflow more timely for debugging.

  There's a separate branch percpu/for-3.18-consistent-ops which cleans
  up the duplicate percpu accessors.  That branch causes a number of
  conflicts with s390 and other trees.  I'll send a separate pull
  request w/ resolutions once other branches are merged"

* 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (33 commits)
  percpu: fix how @gfp is interpreted by the percpu allocator
  blk-mq, percpu_ref: start q->mq_usage_counter in atomic mode
  percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky
  percpu_ref: add PERCPU_REF_INIT_* flags
  percpu_ref: decouple switching to percpu mode and reinit
  percpu_ref: decouple switching to atomic mode and killing
  percpu_ref: add PCPU_REF_DEAD
  percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch
  percpu_ref: replace pcpu_ prefix with percpu_
  percpu_ref: minor code and comment updates
  percpu_ref: relocate percpu_ref_reinit()
  Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe"
  Revert "percpu: free percpu allocation info for uniprocessor system"
  percpu-refcount: make percpu_ref based on longs instead of ints
  percpu-refcount: improve WARN messages
  percpu: fix locking regression in the failure path of pcpu_alloc()
  percpu-refcount: add @gfp to percpu_ref_init()
  proportions: add @gfp to init functions
  percpu_counter: add @gfp to percpu_counter_init()
  percpu_counter: make percpu_counters_lock irq-safe
  ...
2014-10-10 07:26:02 -04:00
Linus Torvalds b211e9d7c8 Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "Nothing too interesting.  Just a handful of cleanup patches"

* 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  Revert "cgroup: remove redundant variable in cgroup_mount()"
  cgroup: remove redundant variable in cgroup_mount()
  cgroup: fix missing unlock in cgroup_release_agent()
  cgroup: remove CGRP_RELEASABLE flag
  perf/cgroup: Remove perf_put_cgroup()
  cgroup: remove redundant check in cgroup_ino()
  cpuset: simplify proc_cpuset_show()
  cgroup: simplify proc_cgroup_show()
  cgroup: use a per-cgroup work for release agent
  cgroup: remove bogus comments
  cgroup: remove redundant code in cgroup_rmdir()
  cgroup: remove some useless forward declarations
  cgroup: fix a typo in comment.
2014-10-10 07:24:40 -04:00
Linus Torvalds d9428f0976 Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata update from Tejun Heo:
 "AHCI is getting per-port irq handling and locks for better
  scalability.  The gain is not huge but measureable with multiple high
  iops devices connected to the same host; however, the value of
  threaded IRQ handling seems negligible for AHCI and it likely will
  revert to non-threaded handling soon.

  Another noteworthy change is George Spelvin's "libata: Un-break ATA
  blacklist".  During 3.17 devel cycle, the libata blacklist glob
  matching got generalized and rewritten; unfortunately, the patch
  forgot to swap arguments to match the new match function and ended up
  breaking blacklist matching completely.  It got noticed only a couple
  days ago so it couldn't make for-3.17-fixes either.  :(

  Other than the above two, nothing too interesting - the usual cleanup
  churns and device-specific changes"

* 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
  pata_serverworks: disable 64-KB DMA transfers on Broadcom OSB4 IDE Controller
  libata: Un-break ATA blacklist
  AHCI: Do not acquire ata_host::lock from single IRQ handler
  AHCI: Optimize single IRQ interrupt processing
  AHCI: Do not read HOST_IRQ_STAT reg in multi-MSI mode
  AHCI: Make few function names more descriptive
  AHCI: Move host activation code into ahci_host_activate()
  AHCI: Move ahci_host_activate() function to libahci.c
  AHCI: Pass SCSI host template as arg to ahci_host_activate()
  ata: pata_imx: Use the SIMPLE_DEV_PM_OPS() macro
  AHCI: Cleanup checking of multiple MSIs/SLM modes
  libata-sff: Fix controllers with no ctl port
  ahci_xgene: Fix the error print invalid resource for APM X-Gene SoC AHCI SATA Host Controller driver.
  libata: change ata_<foo>_printk routines to return void
  ata: qcom: Add device tree bindings information
  ahci-platform: Bump max number of clocks to 5
  ahci: ahci_p5wdh_workaround - constify DMI table
  libahci_platform: Staticize ahci_platform_<en/dis>able_phys()
  pata_platform: Remove useless irq_flags field
  pata_of_platform: Remove "electra-ide" quirk
  ...
2014-10-10 07:23:11 -04:00
Linus Torvalds 0cf744bc7a Merge branch 'akpm' (fixes from Andrew Morton)
Merge patch-bomb from Andrew Morton:
 - part of OCFS2 (review is laggy again)
 - procfs
 - slab
 - all of MM
 - zram, zbud
 - various other random things: arch, filesystems.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  nosave: consolidate __nosave_{begin,end} in <asm/sections.h>
  include/linux/screen_info.h: remove unused ORIG_* macros
  kernel/sys.c: compat sysinfo syscall: fix undefined behavior
  kernel/sys.c: whitespace fixes
  acct: eliminate compile warning
  kernel/async.c: switch to pr_foo()
  include/linux/blkdev.h: use NULL instead of zero
  include/linux/kernel.h: deduplicate code implementing clamp* macros
  include/linux/kernel.h: rewrite min3, max3 and clamp using min and max
  alpha: use Kbuild logic to include <asm-generic/sections.h>
  frv: remove deprecated IRQF_DISABLED
  frv: remove unused cpuinfo_frv and friends to fix future build error
  zbud: avoid accessing last unused freelist
  zsmalloc: simplify init_zspage free obj linking
  mm/zsmalloc.c: correct comment for fullness group computation
  zram: use notify_free to account all free notifications
  zram: report maximum used memory
  zram: zram memory size limitation
  zsmalloc: change return value unit of zs_get_total_size_bytes
  zsmalloc: move pages_allocated to zs_pool
  ...
2014-10-09 22:26:14 -04:00
Geert Uytterhoeven 7f8998c7ae nosave: consolidate __nosave_{begin,end} in <asm/sections.h>
The different architectures used their own (and different) declarations:

    extern __visible const void __nosave_begin, __nosave_end;
    extern const void __nosave_begin, __nosave_end;
    extern long __nosave_begin, __nosave_end;

Consolidate them using the first variant in <asm/sections.h>.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:04 -04:00
Geert Uytterhoeven 578b25dfce include/linux/screen_info.h: remove unused ORIG_* macros
The ORIG_* macros definitions to access struct screen_info members and all
of their users were removed 7 years ago by commit 3ea3351000
("Remove magic macros for screen_info structure members"), but (only) the
definitions reappeared a few days later in commit ee8e7cfe9d ("Make
asm-x86/bootparam.h includable from userspace.").

Remove them for good. Amen.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:04 -04:00
Michele Curti 61a04e5b30 include/linux/blkdev.h: use NULL instead of zero
Quite useless but it shuts up some warnings.

Signed-off-by: Michele Curti <michele.curti@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:04 -04:00
Michal Nazarewicz c185b07fc9 include/linux/kernel.h: deduplicate code implementing clamp* macros
Instead of open-coding clamp_t macro min_t and max_t the way clamp macro
does and instead of open-coding clamp_val simply use clamp_t.
Furthermore, normalise argument naming in the macros to be lo and hi.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: "Kirsher, Jeffrey T" <jeffrey.t.kirsher@intel.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:03 -04:00
Michal Nazarewicz 2e1d06e1c0 include/linux/kernel.h: rewrite min3, max3 and clamp using min and max
It appears that gcc is better at optimising a double call to min and max
rather than open coded min3 and max3.  This can be observed here:

    $ cat min-max.c
    #define min(x, y) ({				\
    	typeof(x) _min1 = (x);			\
    	typeof(y) _min2 = (y);			\
    	(void) (&_min1 == &_min2);		\
    	_min1 < _min2 ? _min1 : _min2; })
    #define min3(x, y, z) ({			\
    	typeof(x) _min1 = (x);			\
    	typeof(y) _min2 = (y);			\
    	typeof(z) _min3 = (z);			\
    	(void) (&_min1 == &_min2);		\
    	(void) (&_min1 == &_min3);		\
    	_min1 < _min2 ? (_min1 < _min3 ? _min1 : _min3) : \
    		(_min2 < _min3 ? _min2 : _min3); })

    int fmin3(int x, int y, int z) { return min3(x, y, z); }
    int fmin2(int x, int y, int z) { return min(min(x, y), z); }

    $ gcc -O2 -o min-max.s -S min-max.c; cat min-max.s
    	.file	"min-max.c"
    	.text
    	.p2align 4,,15
    	.globl	fmin3
    	.type	fmin3, @function
    fmin3:
    .LFB0:
    	.cfi_startproc
    	cmpl	%esi, %edi
    	jl	.L5
    	cmpl	%esi, %edx
    	movl	%esi, %eax
    	cmovle	%edx, %eax
    	ret
    	.p2align 4,,10
    	.p2align 3
    .L5:
    	cmpl	%edi, %edx
    	movl	%edi, %eax
    	cmovle	%edx, %eax
    	ret
    	.cfi_endproc
    .LFE0:
    	.size	fmin3, .-fmin3
    	.p2align 4,,15
    	.globl	fmin2
    	.type	fmin2, @function
    fmin2:
    .LFB1:
    	.cfi_startproc
    	cmpl	%edi, %esi
    	movl	%edx, %eax
    	cmovle	%esi, %edi
    	cmpl	%edx, %edi
    	cmovle	%edi, %eax
    	ret
    	.cfi_endproc
    .LFE1:
    	.size	fmin2, .-fmin2
    	.ident	"GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
    	.section	.note.GNU-stack,"",@progbits

fmin3 function, which uses open-coded min3 macro, is compiled into total
of ten instructions including a conditional branch, whereas fmin2
function, which uses two calls to min2 macro, is compiled into six
instructions with no branches.

Similarly, open-coded clamp produces the same code as clamp using min and
max macros, but the latter is much shorter:

    $ cat clamp.c
    #define clamp(val, min, max) ({			\
    	typeof(val) __val = (val);		\
    	typeof(min) __min = (min);		\
    	typeof(max) __max = (max);		\
    	(void) (&__val == &__min);		\
    	(void) (&__val == &__max);		\
    	__val = __val < __min ? __min: __val;	\
    	__val > __max ? __max: __val; })
    #define min(x, y) ({				\
    	typeof(x) _min1 = (x);			\
    	typeof(y) _min2 = (y);			\
    	(void) (&_min1 == &_min2);		\
    	_min1 < _min2 ? _min1 : _min2; })
    #define max(x, y) ({				\
    	typeof(x) _max1 = (x);			\
    	typeof(y) _max2 = (y);			\
    	(void) (&_max1 == &_max2);		\
    	_max1 > _max2 ? _max1 : _max2; })

    int fclamp(int v, int min, int max) { return clamp(v, min, max); }
    int fclampmm(int v, int min, int max) { return min(max(v, min), max); }

    $ gcc -O2 -o clamp.s -S clamp.c; cat clamp.s
    	.file	"clamp.c"
    	.text
    	.p2align 4,,15
    	.globl	fclamp
    	.type	fclamp, @function
    fclamp:
    .LFB0:
    	.cfi_startproc
    	cmpl	%edi, %esi
    	movl	%edx, %eax
    	cmovge	%esi, %edi
    	cmpl	%edx, %edi
    	cmovle	%edi, %eax
    	ret
    	.cfi_endproc
    .LFE0:
    	.size	fclamp, .-fclamp
    	.p2align 4,,15
    	.globl	fclampmm
    	.type	fclampmm, @function
    fclampmm:
    .LFB1:
    	.cfi_startproc
    	cmpl	%edi, %esi
    	cmovge	%esi, %edi
    	cmpl	%edi, %edx
    	movl	%edi, %eax
    	cmovle	%edx, %eax
    	ret
    	.cfi_endproc
    .LFE1:
    	.size	fclampmm, .-fclampmm
    	.ident	"GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
    	.section	.note.GNU-stack,"",@progbits

    Linux mpn-glaptop 3.13.0-29-generic #53~precise1-Ubuntu SMP Wed Jun 4 22:06:25 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
    gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
    Copyright (C) 2011 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    -rwx------ 1 mpn eng 51224656 Jun 17 14:15 vmlinux.before
    -rwx------ 1 mpn eng 51224608 Jun 17 13:57 vmlinux.after

48 bytes reduction.  The do_fault_around was a few instruction shorter
and as far as I can tell saved 12 bytes on the stack, i.e.:

    $ grep -e rsp -e pop -e push do_fault_around.*
    do_fault_around.before.s:push   %rbp
    do_fault_around.before.s:mov    %rsp,%rbp
    do_fault_around.before.s:push   %r13
    do_fault_around.before.s:push   %r12
    do_fault_around.before.s:push   %rbx
    do_fault_around.before.s:sub    $0x38,%rsp
    do_fault_around.before.s:add    $0x38,%rsp
    do_fault_around.before.s:pop    %rbx
    do_fault_around.before.s:pop    %r12
    do_fault_around.before.s:pop    %r13
    do_fault_around.before.s:pop    %rbp

    do_fault_around.after.s:push   %rbp
    do_fault_around.after.s:mov    %rsp,%rbp
    do_fault_around.after.s:push   %r12
    do_fault_around.after.s:push   %rbx
    do_fault_around.after.s:sub    $0x30,%rsp
    do_fault_around.after.s:add    $0x30,%rsp
    do_fault_around.after.s:pop    %rbx
    do_fault_around.after.s:pop    %r12
    do_fault_around.after.s:pop    %rbp

or here side-by-side:

    Before                    After
    push   %rbp               push   %rbp
    mov    %rsp,%rbp          mov    %rsp,%rbp
    push   %r13
    push   %r12               push   %r12
    push   %rbx               push   %rbx
    sub    $0x38,%rsp         sub    $0x30,%rsp
    add    $0x38,%rsp         add    $0x30,%rsp
    pop    %rbx               pop    %rbx
    pop    %r12               pop    %r12
    pop    %r13
    pop    %rbp               pop    %rbp

There are also fewer branches:

    $ grep ^j do_fault_around.*
    do_fault_around.before.s:jae    ffffffff812079b7
    do_fault_around.before.s:jmp    ffffffff812079c5
    do_fault_around.before.s:jmp    ffffffff81207a14
    do_fault_around.before.s:ja     ffffffff812079f9
    do_fault_around.before.s:jb     ffffffff81207a10
    do_fault_around.before.s:jmp    ffffffff81207a63
    do_fault_around.before.s:jne    ffffffff812079df

    do_fault_around.after.s:jmp    ffffffff812079fd
    do_fault_around.after.s:ja     ffffffff812079e2
    do_fault_around.after.s:jb     ffffffff812079f9
    do_fault_around.after.s:jmp    ffffffff81207a4c
    do_fault_around.after.s:jne    ffffffff812079c8

And here's with allyesconfig on a different machine:

    $ uname -a; gcc --version; ls -l vmlinux.*
    Linux erwin 3.14.7-mn #54 SMP Sun Jun 15 11:25:08 CEST 2014 x86_64 AMD Phenom(tm) II X3 710 Processor AuthenticAMD GNU/Linux
    gcc (GCC) 4.8.3
    Copyright (C) 2013 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    -rwx------ 1 mpn eng 437027411 Jun 20 16:04 vmlinux.before
    -rwx------ 1 mpn eng 437026881 Jun 20 15:30 vmlinux.after

530 bytes reduction.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: David Rientjes <rientjes@google.com>
Cc: "Rustad, Mark D" <mark.d.rustad@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:03 -04:00
Minchan Kim 722cdc1723 zsmalloc: change return value unit of zs_get_total_size_bytes
zs_get_total_size_bytes returns a amount of memory zsmalloc consumed with
*byte unit* but zsmalloc operates *page unit* rather than byte unit so
let's change the API so benefit we could get is that reduce unnecessary
overhead (ie, change page unit with byte unit) in zsmalloc.

Since return type is pages, "zs_get_total_pages" is better than
"zs_get_total_size_bytes".

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Dan Streetman <ddstreet@ieee.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <juno.choi@lge.com>
Cc: <seungho1.park@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: David Horner <ds2horner@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:02 -04:00
Konstantin Khlebnikov 09316c09dd mm/balloon_compaction: add vmstat counters and kpageflags bit
Always mark pages with PageBalloon even if balloon compaction is disabled
and expose this mark in /proc/kpageflags as KPF_BALLOON.

Also this patch adds three counters into /proc/vmstat: "balloon_inflate",
"balloon_deflate" and "balloon_migrate".  They accumulate balloon
activity.  Current size of balloon is (balloon_inflate - balloon_deflate)
pages.

All generic balloon code now gathered under option CONFIG_MEMORY_BALLOON.
It should be selected by ballooning driver which wants use this feature.
Currently virtio-balloon is the only user.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Konstantin Khlebnikov 9d1ba80564 mm/balloon_compaction: remove balloon mapping and flag AS_BALLOON_MAP
Now ballooned pages are detected using PageBalloon().  Fake mapping is no
longer required.  This patch links ballooned pages to balloon device using
field page->private instead of page->mapping.  Also this patch embeds
balloon_dev_info directly into struct virtio_balloon.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Konstantin Khlebnikov d6d86c0a7f mm/balloon_compaction: redesign ballooned pages management
Sasha Levin reported KASAN splash inside isolate_migratepages_range().
Problem is in the function __is_movable_balloon_page() which tests
AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
against anonymous pages.  As result it tried to check address space flags
inside struct anon_vma.

Further investigation shows more problems in current implementation:

* Special branch in __unmap_and_move() never works:
  balloon_page_movable() checks page flags and page_count.  In
  __unmap_and_move() page is locked, reference counter is elevated, thus
  balloon_page_movable() always fails.  As a result execution goes to the
  normal migration path.  virtballoon_migratepage() returns
  MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
  move_to_new_page() thinks this is an error code and assigns
  newpage->mapping to NULL.  Newly migrated page lose connectivity with
  balloon an all ability for further migration.

* lru_lock erroneously required in isolate_migratepages_range() for
  isolation ballooned page.  This function releases lru_lock periodically,
  this makes migration mostly impossible for some pages.

* balloon_page_dequeue have a tight race with balloon_page_isolate:
  balloon_page_isolate could be executed in parallel with dequeue between
  picking page from list and locking page_lock.  Race is rare because they
  use trylock_page() for locking.

This patch fixes all of them.

Instead of fake mapping with special flag this patch uses special state of
page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
directly in struct page makes everything safer and easier.

PagePrivate is used to mark pages present in page list (i.e.  not
isolated, like PageLRU for normal pages).  It replaces special rules for
reference counter and makes balloon migration similar to migration of
normal pages.  This flag is protected by page_lock together with link to
the balloon device.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: <stable@vger.kernel.org>	[3.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Johannes Weiner b70a2a21dc mm: memcontrol: fix transparent huge page allocations under pressure
In a memcg with even just moderate cache pressure, success rates for
transparent huge page allocations drop to zero, wasting a lot of effort
that the allocator puts into assembling these pages.

The reason for this is that the memcg reclaim code was never designed for
higher-order charges.  It reclaims in small batches until there is room
for at least one page.  Huge page charges only succeed when these batches
add up over a series of huge faults, which is unlikely under any
significant load involving order-0 allocations in the group.

Remove that loop on the memcg side in favor of passing the actual reclaim
goal to direct reclaim, which is already set up and optimized to meet
higher-order goals efficiently.

This brings memcg's THP policy in line with the system policy: if the
allocator painstakingly assembles a hugepage, memcg will at least make an
honest effort to charge it.  As a result, transparent hugepage allocation
rates amid cache activity are drastically improved:

                                      vanilla                 patched
pgalloc                 4717530.80 (  +0.00%)   4451376.40 (  -5.64%)
pgfault                  491370.60 (  +0.00%)    225477.40 ( -54.11%)
pgmajfault                    2.00 (  +0.00%)         1.80 (  -6.67%)
thp_fault_alloc               0.00 (  +0.00%)       531.60 (+100.00%)
thp_fault_fallback          749.00 (  +0.00%)       217.40 ( -70.88%)

[ Note: this may in turn increase memory consumption from internal
  fragmentation, which is an inherent risk of transparent hugepages.
  Some setups may have to adjust the memcg limits accordingly to
  accomodate this - or, if the machine is already packed to capacity,
  disable the transparent huge page feature. ]

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@sr71.net>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:59 -04:00
Vladimir Davydov 6f817f4cda memcg: move memcg_update_cache_size() to slab_common.c
`While growing per memcg caches arrays, we jump between memcontrol.c and
slab_common.c in a weird way:

  memcg_alloc_cache_id - memcontrol.c
    memcg_update_all_caches - slab_common.c
      memcg_update_cache_size - memcontrol.c

There's absolutely no reason why memcg_update_cache_size can't live on the
slab's side though.  So let's move it there and settle it comfortably amid
per-memcg cache allocation functions.

Besides, this patch cleans this function up a bit, removing all the
useless comments from it, and renames it to memcg_update_cache_params to
conform to memcg_alloc/free_cache_params, which we already have in
slab_common.c.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:59 -04:00
Vladimir Davydov 33a690c45b memcg: move memcg_{alloc,free}_cache_params to slab_common.c
The only reason why they live in memcontrol.c is that we get/put css
reference to the owner memory cgroup in them.  However, we can do that in
memcg_{un,}register_cache.  OTOH, there are several reasons to move them
to slab_common.c.

First, I think that the less public interface functions we have in
memcontrol.h the better.  Since the functions I move don't depend on
memcontrol, I think it's worth making them private to slab, especially
taking into account that the arrays are defined on the slab's side too.

Second, the way how per-memcg arrays are updated looks rather awkward: it
proceeds from memcontrol.c (__memcg_activate_kmem) to slab_common.c
(memcg_update_all_caches) and back to memcontrol.c again
(memcg_update_array_size).  In the following patches I move the function
relocating the arrays (memcg_update_array_size) to slab_common.c and
therefore get rid this circular call path.  I think we should have the
cache allocation stuff in the same place where we have relocation, because
it's easier to follow the code then.  So I move arrays alloc/free
functions to slab_common.c too.

The third point isn't obvious.  I'm going to make the list_lru structure
per-memcg to allow targeted kmem reclaim.  That means we will have
per-memcg arrays in list_lrus too.  It turns out that it's much easier to
update these arrays in list_lru.c rather than in memcontrol.c, because all
the stuff we need is defined there.  This patch makes memcg caches arrays
allocation path conform that of the upcoming list_lru.

So let's move these functions to slab_common.c and make them static.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:59 -04:00
Sasha Levin 31c9afa6db mm: introduce VM_BUG_ON_MM
Very similar to VM_BUG_ON_PAGE and VM_BUG_ON_VMA, dump struct_mm when the
bug is hit.

[akpm@linux-foundation.org: coding-style fixes]
[mhocko@suse.cz: fix build]
[mhocko@suse.cz: fix build some more]
[akpm@linux-foundation.org: do strange things to avoid doing strange things for the comma separators]
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:58 -04:00
Junxiao Bi 934f3072c1 mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set
commit 21caf2fc19 ("mm: teach mm by current context info to not do I/O
during memory allocation") introduces PF_MEMALLOC_NOIO flag to avoid doing
I/O inside memory allocation, __GFP_IO is cleared when this flag is set,
but __GFP_FS implies __GFP_IO, it should also be cleared.  Or it may still
run into I/O, like in superblock shrinker.  And this will make the kernel
run into the deadlock case described in that commit.

See Dave Chinner's comment about io in superblock shrinker:

Filesystem shrinkers do indeed perform IO from the superblock shrinker and
have for years.  Even clean inodes can require IO before they can be freed
- e.g.  on an orphan list, need truncation of post-eof blocks, need to
wait for ordered operations to complete before it can be freed, etc.

IOWs, Ext4, btrfs and XFS all can issue and/or block on arbitrary amounts
of IO in the superblock shrinker context.  XFS, in particular, has been
doing transactions and IO from the VFS inode cache shrinker since it was
first introduced....

Fix this by clearing __GFP_FS in memalloc_noio_flags(), this function has
masked all the gfp_mask that will be passed into fs for the processes
setting PF_MEMALLOC_NOIO in the direct reclaim path.

v1 thread at: https://lkml.org/lkml/2014/9/3/32

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: joyce.xue <xuejiufei@huawei.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:58 -04:00
Johannes Weiner 5705465174 mm: clean up zone flags
Page reclaim tests zone_is_reclaim_dirty(), but the site that actually
sets this state does zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY), sending the
reader through layers indirection just to track down a simple bit.

Remove all zone flag wrappers and just use bitops against zone->flags
directly.  It's just as readable and the lines are barely any longer.

Also rename ZONE_TAIL_LRU_DIRTY to ZONE_DIRTY to match ZONE_WRITEBACK, and
remove the zone_flags_t typedef.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:57 -04:00
Sasha Levin 81d1b09c6b mm: convert a few VM_BUG_ON callers to VM_BUG_ON_VMA
Trivially convert a few VM_BUG_ON calls to VM_BUG_ON_VMA to extract
more information when they trigger.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:57 -04:00
Sasha Levin fa3759ccd5 mm: introduce VM_BUG_ON_VMA
Very similar to VM_BUG_ON_PAGE but dumps VMA information instead.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:57 -04:00
Sasha Levin 0bf5513978 mm: introduce dump_vma
Introduce a helper to dump information about a VMA, this also makes
dump_page_flags more generic and re-uses that so the output looks very
similar to dump_page:

[   61.903437] vma ffff88070f88be00 start 00007fff25970000 end 00007fff25992000
[   61.903437] next ffff88070facd600 prev ffff88070face400 mm ffff88070fade000
[   61.903437] prot 8000000000000025 anon_vma ffff88070fa1e200 vm_ops           (null)
[   61.903437] pgoff 7ffffffdd file           (null) private_data           (null)
[   61.909129] flags: 0x100173(read|write|mayread|maywrite|mayexec|growsdown|account)

[akpm@linux-foundation.org: make dump_vma() require CONFIG_DEBUG_VM]
[swarren@nvidia.com: fix dump_vma() compilation]
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:57 -04:00
Andrew Morton 1c93923cc2 include/linux/migrate.h: remove migrate_page #define
This is designed to avoid a few ifdefs in .c files but it's obnoxious
because it can cause unsuspecting "migrate_page" symbols to get turned into
"NULL".

Just nuke it and use the ifdefs.

Cc: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:56 -04:00
Oleg Nesterov dd6eecb917 mempolicy: unexport get_vma_policy() and remove its "task" arg
- get_vma_policy(task) is not safe if task != current, remove this
  argument.

- get_vma_policy() no longer has callers outside of mempolicy.c,
  make it static.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:56 -04:00
Oleg Nesterov 74d2c3a05c mempolicy: introduce __get_vma_policy(), export get_task_policy()
Extract the code which looks for vma's policy from get_vma_policy()
into the new helper, __get_vma_policy(). Export get_task_policy().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:56 -04:00
Oleg Nesterov 6b6482bbf6 mempolicy: remove the "task" arg of vma_policy_mof() and simplify it
1. vma_policy_mof(task) is simply not safe unless task == current,
   it can race with do_exit()->mpol_put(). Remove this arg and update
   its single caller.

2. vma can not be NULL, remove this check and simplify the code.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:56 -04:00
Johannes Weiner 1f13ae399c mm: remove noisy remainder of the scan_unevictable interface
The deprecation warnings for the scan_unevictable interface triggers by
scripts doing `sysctl -a | grep something else'.  This is annoying and not
helpful.

The interface has been defunct since 264e56d824 ("mm: disable user
interface to manually rescue unevictable pages"), which was in 2011, and
there haven't been any reports of usecases for it, only reports that the
deprecation warnings are annying.  It's unlikely that anybody is using
this interface specifically at this point, so remove it.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:55 -04:00
Cyrill Gorcunov f606b77f1a prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation
During development of c/r we've noticed that in case if we need to support
user namespaces we face a problem with capabilities in prctl(PR_SET_MM,
...) call, in particular once new user namespace is created
capable(CAP_SYS_RESOURCE) no longer passes.

A approach is to eliminate CAP_SYS_RESOURCE check but pass all new values
in one bundle, which would allow the kernel to make more intensive test
for sanity of values and same time allow us to support checkpoint/restore
of user namespaces.

Thus a new command PR_SET_MM_MAP introduced. It takes a pointer of
prctl_mm_map structure which carries all the members to be updated.

	prctl(PR_SET_MM, PR_SET_MM_MAP, struct prctl_mm_map *, size)

	struct prctl_mm_map {
		__u64	start_code;
		__u64	end_code;
		__u64	start_data;
		__u64	end_data;
		__u64	start_brk;
		__u64	brk;
		__u64	start_stack;
		__u64	arg_start;
		__u64	arg_end;
		__u64	env_start;
		__u64	env_end;
		__u64	*auxv;
		__u32	auxv_size;
		__u32	exe_fd;
	};

All members except @exe_fd correspond ones of struct mm_struct.  To figure
out which available values these members may take here are meanings of the
members.

 - start_code, end_code: represent bounds of executable code area
 - start_data, end_data: represent bounds of data area
 - start_brk, brk: used to calculate bounds for brk() syscall
 - start_stack: used when accounting space needed for command
   line arguments, environment and shmat() syscall
 - arg_start, arg_end, env_start, env_end: represent memory area
   supplied for command line arguments and environment variables
 - auxv, auxv_size: carries auxiliary vector, Elf format specifics
 - exe_fd: file descriptor number for executable link (/proc/self/exe)

Thus we apply the following requirements to the values

1) Any member except @auxv, @auxv_size, @exe_fd is rather an address
   in user space thus it must be laying inside [mmap_min_addr, mmap_max_addr)
   interval.

2) While @[start|end]_code and @[start|end]_data may point to an nonexisting
   VMAs (say a program maps own new .text and .data segments during execution)
   the rest of members should belong to VMA which must exist.

3) Addresses must be ordered, ie @start_ member must not be greater or
   equal to appropriate @end_ member.

4) As in regular Elf loading procedure we require that @start_brk and
   @brk be greater than @end_data.

5) If RLIMIT_DATA rlimit is set to non-infinity new values should not
   exceed existing limit. Same applies to RLIMIT_STACK.

6) Auxiliary vector size must not exceed existing one (which is
   predefined as AT_VECTOR_SIZE and depends on architecture).

7) File descriptor passed in @exe_file should be pointing
   to executable file (because we use existing prctl_set_mm_exe_file_locked
   helper it ensures that the file we are going to use as exe link has all
   required permission granted).

Now about where these members are involved inside kernel code:

 - @start_code and @end_code are used in /proc/$pid/[stat|statm] output;

 - @start_data and @end_data are used in /proc/$pid/[stat|statm] output,
   also they are considered if there enough space for brk() syscall
   result if RLIMIT_DATA is set;

 - @start_brk shown in /proc/$pid/stat output and accounted in brk()
   syscall if RLIMIT_DATA is set; also this member is tested to
   find a symbolic name of mmap event for perf system (we choose
   if event is generated for "heap" area); one more aplication is
   selinux -- we test if a process has PROCESS__EXECHEAP permission
   if trying to make heap area being executable with mprotect() syscall;

 - @brk is a current value for brk() syscall which lays inside heap
   area, it's shown in /proc/$pid/stat. When syscall brk() succesfully
   provides new memory area to a user space upon brk() completion the
   mm::brk is updated to carry new value;

   Both @start_brk and @brk are actively used in /proc/$pid/maps
   and /proc/$pid/smaps output to find a symbolic name "heap" for
   VMA being scanned;

 - @start_stack is printed out in /proc/$pid/stat and used to
   find a symbolic name "stack" for task and threads in
   /proc/$pid/maps and /proc/$pid/smaps output, and as the same
   as with @start_brk -- perf system uses it for event naming.
   Also kernel treat this member as a start address of where
   to map vDSO pages and to check if there is enough space
   for shmat() syscall;

 - @arg_start, @arg_end, @env_start and @env_end are printed out
   in /proc/$pid/stat. Another access to the data these members
   represent is to read /proc/$pid/environ or /proc/$pid/cmdline.
   Any attempt to read these areas kernel tests with access_process_vm
   helper so a user must have enough rights for this action;

 - @auxv and @auxv_size may be read from /proc/$pid/auxv. Strictly
   speaking kernel doesn't care much about which exactly data is
   sitting there because it is solely for userspace;

 - @exe_fd is referred from /proc/$pid/exe and when generating
   coredump. We uses prctl_set_mm_exe_file_locked helper to update
   this member, so exe-file link modification remains one-shot
   action.

Still note that updating exe-file link now doesn't require sys-resource
capability anymore, after all there is no much profit in preventing setup
own file link (there are a number of ways to execute own code -- ptrace,
ld-preload, so that the only reliable way to find which exactly code is
executed is to inspect running program memory).  Still we require the
caller to be at least user-namespace root user.

I believe the old interface should be deprecated and ripped off in a
couple of kernel releases if no one against.

To test if new interface is implemented in the kernel one can pass
PR_SET_MM_MAP_SIZE opcode and the kernel returns the size of currently
supported struct prctl_mm_map.

[akpm@linux-foundation.org: fix 80-col wordwrap in macro definitions]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Andrew Vagin <avagin@openvz.org>
Tested-by: Andrew Vagin <avagin@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Julien Tinnes <jln@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:55 -04:00
Cyrill Gorcunov 9c5990240e mm: introduce check_data_rlimit helper
To eliminate code duplication lets introduce check_data_rlimit helper
which we will use in brk() and prctl() syscalls.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Julien Tinnes <jln@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:55 -04:00
David Rientjes 43e7a34d26 mm: rename allocflags_to_migratetype for clarity
The page allocator has gfp flags (like __GFP_WAIT) and alloc flags (like
ALLOC_CPUSET) that have separate semantics.

The function allocflags_to_migratetype() actually takes gfp flags, not
alloc flags, and returns a migratetype.  Rename it to
gfpflags_to_migratetype().

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:55 -04:00
Vlastimil Babka 1f9efdef4f mm, compaction: khugepaged should not give up due to need_resched()
Async compaction aborts when it detects zone lock contention or
need_resched() is true.  David Rientjes has reported that in practice,
most direct async compactions for THP allocation abort due to
need_resched().  This means that a second direct compaction is never
attempted, which might be OK for a page fault, but khugepaged is intended
to attempt a sync compaction in such case and in these cases it won't.

This patch replaces "bool contended" in compact_control with an int that
distinguishes between aborting due to need_resched() and aborting due to
lock contention.  This allows propagating the abort through all compaction
functions as before, but passing the abort reason up to
__alloc_pages_slowpath() which decides when to continue with direct
reclaim and another compaction attempt.

Another problem is that try_to_compact_pages() did not act upon the
reported contention (both need_resched() or lock contention) immediately
and would proceed with another zone from the zonelist.  When
need_resched() is true, that means initializing another zone compaction,
only to check again need_resched() in isolate_migratepages() and aborting.
 For zone lock contention, the unintended consequence is that the lock
contended status reported back to the allocator is detrmined from the last
zone where compaction was attempted, which is rather arbitrary.

This patch fixes the problem in the following way:
- async compaction of a zone aborting due to need_resched() or fatal signal
  pending means that further zones should not be tried. We report
  COMPACT_CONTENDED_SCHED to the allocator.
- aborting zone compaction due to lock contention means we can still try
  another zone, since it has different set of locks. We report back
  COMPACT_CONTENDED_LOCK only if *all* zones where compaction was attempted,
  it was aborted due to lock contention.

As a result of these fixes, khugepaged will proceed with second sync
compaction as intended, when the preceding async compaction aborted due to
need_resched().  Page fault compactions aborting due to need_resched()
will spare some cycles previously wasted by initializing another zone
compaction only to abort again.  Lock contention will be reported only
when compaction in all zones aborted due to lock contention, and therefore
it's not a good idea to try again after reclaim.

In stress-highalloc from mmtests configured to use __GFP_NO_KSWAPD, this
has improved number of THP collapse allocations by 10%, which shows
positive effect on khugepaged.  The benchmark's success rates are
unchanged as it is not recognized as khugepaged.  Numbers of compact_stall
and compact_fail events have however decreased by 20%, with
compact_success still a bit improved, which is good.  With benchmark
configured not to use __GFP_NO_KSWAPD, there is 6% improvement in THP
collapse allocations, and only slight improvement in stalls and failures.

[akpm@linux-foundation.org: fix warnings]
Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:54 -04:00
Vlastimil Babka 53853e2d2b mm, compaction: defer each zone individually instead of preferred zone
When direct sync compaction is often unsuccessful, it may become deferred
for some time to avoid further useless attempts, both sync and async.
Successful high-order allocations un-defer compaction, while further
unsuccessful compaction attempts prolong the compaction deferred period.

Currently the checking and setting deferred status is performed only on
the preferred zone of the allocation that invoked direct compaction.  But
compaction itself is attempted on all eligible zones in the zonelist, so
the behavior is suboptimal and may lead both to scenarios where 1)
compaction is attempted uselessly, or 2) where it's not attempted despite
good chances of succeeding, as shown on the examples below:

1) A direct compaction with Normal preferred zone failed and set
   deferred compaction for the Normal zone.  Another unrelated direct
   compaction with DMA32 as preferred zone will attempt to compact DMA32
   zone even though the first compaction attempt also included DMA32 zone.

   In another scenario, compaction with Normal preferred zone failed to
   compact Normal zone, but succeeded in the DMA32 zone, so it will not
   defer compaction.  In the next attempt, it will try Normal zone which
   will fail again, instead of skipping Normal zone and trying DMA32
   directly.

2) Kswapd will balance DMA32 zone and reset defer status based on
   watermarks looking good.  A direct compaction with preferred Normal
   zone will skip compaction of all zones including DMA32 because Normal
   was still deferred.  The allocation might have succeeded in DMA32, but
   won't.

This patch makes compaction deferring work on individual zone basis
instead of preferred zone.  For each zone, it checks compaction_deferred()
to decide if the zone should be skipped.  If watermarks fail after
compacting the zone, defer_compaction() is called.  The zone where
watermarks passed can still be deferred when the allocation attempt is
unsuccessful.  When allocation is successful, compaction_defer_reset() is
called for the zone containing the allocated page.  This approach should
approximate calling defer_compaction() only on zones where compaction was
attempted and did not yield allocated page.  There might be corner cases
but that is inevitable as long as the decision to stop compacting dues not
guarantee that a page will be allocated.

Due to a new COMPACT_DEFERRED return value, some functions relying
implicitly on COMPACT_SKIPPED = 0 had to be updated, with comments made
more accurate.  The did_some_progress output parameter of
__alloc_pages_direct_compact() is removed completely, as the caller
actually does not use it after compaction sets it - it is only considered
when direct reclaim sets it.

During testing on a two-node machine with a single very small Normal zone
on node 1, this patch has improved success rates in stress-highalloc
mmtests benchmark.  The success here were previously made worse by commit
3a025760fc ("mm: page_alloc: spill to remote nodes before waking
kswapd") as kswapd was no longer resetting often enough the deferred
compaction for the Normal zone, and DMA32 zones on both nodes were thus
not considered for compaction.  On different machine, success rates were
improved with __GFP_NO_KSWAPD allocations.

[akpm@linux-foundation.org: fix CONFIG_COMPACTION=n build]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:53 -04:00
Laura Abbott 513510ddba common: dma-mapping: introduce common remapping functions
For architectures without coherent DMA, memory for DMA may need to be
remapped with coherent attributes.  Factor out the the remapping code from
arm and put it in a common location to reduce code duplication.

As part of this, the arm APIs are now migrated away from
ioremap_page_range to the common APIs which use map_vm_area for remapping.
 This should be an equivalent change and using map_vm_area is more correct
as ioremap_page_range is intended to bring in io addresses into the cpu
space and not regular kernel managed memory.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Riley <davidriley@chromium.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Mitchel Humpherys <mitchelh@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Laura Abbott 9efb3a421d lib/genalloc.c: add genpool range check function
After allocating an address from a particular genpool, there is no good
way to verify if that address actually belongs to a genpool.  Introduce
addr_in_gen_pool which will return if an address plus size falls
completely within the genpool range.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Riley <davidriley@chromium.org>
Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Laura Abbott 505e3be6c0 lib/genalloc.c: add power aligned algorithm
One of the more common algorithms used for allocation is to align the
start address of the allocation to the order of size requested.  Add this
as an algorithm option for genalloc.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Riley <davidriley@chromium.org>
Cc: Ritesh Harjain <ritesh.harjani@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Mel Gorman 6a33979d5b mm: remove misleading ARCH_USES_NUMA_PROT_NONE
ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented
_PAGE_NUMA using _PROT_NONE.  This saved using an additional PTE bit and
relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting
fault scanner.  This was found to be conceptually confusing with a lot of
implicit assumptions and it was asked that an alternative be found.

Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the
PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE
bits and shrunk the maximum possible swap size but it did not go far
enough.  There are no architectures that reuse _PROT_NONE as _PROT_NUMA
but the relics still exist.

This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary
duplication in powerpc vs the generic implementation by defining the types
the core NUMA helpers expected to exist from x86 with their ppc64
equivalent.  This necessitated that a PTE bit mask be created that
identified the bits that distinguish present from NUMA pte entries but it
is expected this will only differ between arches based on _PAGE_PROTNONE.
The naming for the generic helpers was taken from x86 originally but ppc64
has types that are equivalent for the purposes of the helper so they are
mapped instead of duplicating code.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Zhang Zhen ed2f240094 memory-hotplug: add sysfs valid_zones attribute
Currently memory-hotplug has two limits:

1. If the memory block is in ZONE_NORMAL, you can change it to
   ZONE_MOVABLE, but this memory block must be adjacent to ZONE_MOVABLE.

2. If the memory block is in ZONE_MOVABLE, you can change it to
   ZONE_NORMAL, but this memory block must be adjacent to ZONE_NORMAL.

With this patch, we can easy to know a memory block can be onlined to
which zone, and don't need to know the above two limits.

Updated the related Documentation.

[akpm@linux-foundation.org: use conventional comment layout]
[akpm@linux-foundation.org: fix build with CONFIG_MEMORY_HOTREMOVE=n]
[akpm@linux-foundation.org: remove unused local zone_prev]
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Joonsoo Kim bf0dea23a9 mm/slab: use percpu allocator for cpu cache
Because of chicken and egg problem, initialization of SLAB is really
complicated.  We need to allocate cpu cache through SLAB to make the
kmem_cache work, but before initialization of kmem_cache, allocation
through SLAB is impossible.

On the other hand, SLUB does initialization in a more simple way.  It uses
percpu allocator to allocate cpu cache so there is no chicken and egg
problem.

So, this patch try to use percpu allocator in SLAB.  This simplifies the
initialization step in SLAB so that we could maintain SLAB code more
easily.

In my testing there is no performance difference.

This implementation relies on percpu allocator.  Because percpu allocator
uses vmalloc address space, vmalloc address space could be exhausted by
this change on many cpu system with *32 bit* kernel.  This implementation
can cover 1024 cpus in worst case by following calculation.

Worst: 1024 cpus * 4 bytes for pointer * 300 kmem_caches *
	120 objects per cpu_cache = 140 MB
Normal: 1024 cpus * 4 bytes for pointer * 150 kmem_caches(slab merge) *
	80 objects per cpu_cache = 46 MB

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:51 -04:00
Joonsoo Kim ad2c814441 topology: add support for node_to_mem_node() to determine the fallback node
Anton noticed (http://www.spinics.net/lists/linux-mm/msg67489.html) that
on ppc LPARs with memoryless nodes, a large amount of memory was consumed
by slabs and was marked unreclaimable.  He tracked it down to slab
deactivations in the SLUB core when we allocate remotely, leading to poor
efficiency always when memoryless nodes are present.

After much discussion, Joonsoo provided a few patches that help
significantly.  They don't resolve the problem altogether:

 - memory hotplug still needs testing, that is when a memoryless node
   becomes memory-ful, we want to dtrt
 - there are other reasons for going off-node than memoryless nodes,
   e.g., fully exhausted local nodes

Neither case is resolved with this series, but I don't think that should
block their acceptance, as they can be explored/resolved with follow-on
patches.

The series consists of:

[1/3] topology: add support for node_to_mem_node() to determine the
      fallback node

[2/3] slub: fallback to node_to_mem_node() node if allocating on
      memoryless node

      - Joonsoo's patches to cache the nearest node with memory for each
        NUMA node

[3/3] Partial revert of 81c98869fa (""kthread: ensure locality of
      task_struct allocations")

 - At Tejun's request, keep the knowledge of memoryless node fallback
   to the allocator core.

This patch (of 3):

We need to determine the fallback node in slub allocator if the allocation
target node is memoryless node.  Without it, the SLUB wrongly select the
node which has no memory and can't use a partial slab, because of node
mismatch.  Introduced function, node_to_mem_node(X), will return a node Y
with memory that has the nearest distance.  If X is memoryless node, it
will return nearest distance node, but, if X is normal node, it will
return itself.

We will use this function in following patch to determine the fallback
node.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Han Pingtian <hanpt@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:51 -04:00
Joonsoo Kim 61f47105a2 mm/sl[ao]b: always track caller in kmalloc_(node_)track_caller()
Now, we track caller if tracing or slab debugging is enabled.  If they are
disabled, we could save one argument passing overhead by calling
__kmalloc(_node)().  But, I think that it would be marginal.  Furthermore,
default slab allocator, SLUB, doesn't use this technique so I think that
it's okay to change this situation.

After this change, we can turn on/off CONFIG_DEBUG_SLAB without full
kernel build and remove some complicated '#if' defintion.  It looks more
benefitial to me.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:50 -04:00
Joonsoo Kim 07f361b2be mm/slab_common: move kmem_cache definition to internal header
We don't need to keep kmem_cache definition in include/linux/slab.h if we
don't need to inline kmem_cache_size().  According to my code inspection,
this function is only called at lc_create() in lib/lru_cache.c which may
be called at initialization phase of something, so we don't need to inline
it.  Therfore, move it to slab_common.c and move kmem_cache definition to
internal header.

After this change, we can change kmem_cache definition easily without full
kernel build.  For instance, we can turn on/off CONFIG_SLUB_STATS without
full kernel build.

[akpm@linux-foundation.org: export kmem_cache_size() to modules]
[rdunlap@infradead.org: add header files to fix kmemcheck.c build errors]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:50 -04:00
Oleg Nesterov 58cb65487e proc/maps: make vm_is_stack() logic namespace-friendly
- Rename vm_is_stack() to task_of_stack() and change it to return
  "struct task_struct *" rather than the global (and thus wrong in
  general) pid_t.

- Add the new pid_of_stack() helper which calls task_of_stack() and
  uses the right namespace to report the correct pid_t.

  Unfortunately we need to define this helper twice, in task_mmu.c
  and in task_nommu.c. perhaps it makes sense to add fs/proc/util.c
  and move at least pid_of_stack/task_of_stack there to avoid the
  code duplication.

- Change show_map_vma() and show_numa_map() to use the new helper.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:50 -04:00
Linus Torvalds b528392669 ACPI and power management updates for 3.18-rc1
- Rework the handling of wakeup IRQs by the IRQ core such that
    all of them will be switched over to "wakeup" mode in
    suspend_device_irqs() and in that mode the first interrupt
    will abort system suspend in progress or wake up the system
    if already in suspend-to-idle (or equivalent) without executing
    any interrupt handlers.  Among other things that eliminates the
    wakeup-related motivation to use the IRQF_NO_SUSPEND interrupt
    flag with interrupts which don't really need it and should not
    use it (Thomas Gleixner and Rafael J Wysocki).
 
  - Switch over ACPI to handling wakeup interrupts with the help
    of the new mechanism introduced by the above IRQ core rework
    (Rafael J Wysocki).
 
  - Rework the core generic PM domains code to eliminate code that's
    not used, add DT support and add a generic mechanism by which
    devices can be added to PM domains automatically during
    enumeration (Ulf Hansson, Geert Uytterhoeven and Tomasz Figa).
 
  - Add debugfs-based mechanics for debugging generic PM domains
    (Maciej Matraszek).
 
  - ACPICA update to upstream version 20140828.  Included are updates
    related to the SRAT and GTDT tables and the _PSx methods are in
    the METHOD_NAME list now (Bob Moore and Hanjun Guo).
 
  - Add _OSI("Darwin") support to the ACPI core (unfortunately, that
    can't really be done in a straightforward way) to prevent
    Thunderbolt from being turned off on Apple systems after boot
    (or after resume from system suspend) and rework the ACPI Smart
    Battery Subsystem (SBS) driver to work correctly with Apple
    platforms (Matthew Garrett and Andreas Noever).
 
  - ACPI LPSS (Low-Power Subsystem) driver update cleaning up the
    code, adding support for 133MHz I2C source clock on Intel Baytrail
    to it and making it avoid using UART RTS override with Auto Flow
    Control (Heikki Krogerus).
 
  - ACPI backlight updates removing the video_set_use_native_backlight
    quirk which is not necessary any more, making the code check the
    list of output devices returned by the _DOD method to avoid
    creating acpi_video interfaces that won't work and adding a quirk
    for Lenovo Ideapad Z570 (Hans de Goede, Aaron Lu and Stepan Bujnak).
 
  - New Win8 ACPI OSI quirks for some Dell laptops (Edward Lin).
 
  - Assorted ACPI code cleanups (Fabian Frederick, Rasmus Villemoes,
    Sudip Mukherjee, Yijing Wang, and Zhang Rui).
 
  - cpufreq core updates and cleanups (Viresh Kumar, Preeti U Murthy,
    Rasmus Villemoes).
 
  - cpufreq driver updates: cpufreq-cpu0/cpufreq-dt (driver name
    change among other things), ppc-corenet, powernv (Viresh Kumar,
    Preeti U Murthy, Shilpasri G Bhat, Lucas Stach).
 
  - cpuidle support for DT-based idle states infrastructure, new
    ARM64 cpuidle driver, cpuidle core cleanups (Lorenzo Pieralisi,
    Rasmus Villemoes).
 
  - ARM big.LITTLE cpuidle driver updates: support for DT-based
    initialization and Exynos5800 compatible string (Lorenzo Pieralisi,
    Kevin Hilman).
 
  - Rework of the test_suspend kernel command line argument and
    a new trace event for console resume (Srinivas Pandruvada,
    Todd E Brandt).
 
  - Second attempt to optimize swsusp_free() (hibernation core) to
    make it avoid going through all PFNs which may be way too slow on
    some systems (Joerg Roedel).
 
  - devfreq updates (Paul Bolle, Punit Agrawal, Ãrjan Eide).
 
  - rockchip-io Adaptive Voltage Scaling (AVS) driver and AVS
    entry update in MAINTAINERS (Heiko Stübner, Kevin Hilman).
 
  - PM core fix related to clock management (Geert Uytterhoeven).
 
  - PM core's sysfs code cleanup (Johannes Berg).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJUNbJoAAoJEILEb/54YlRxRp8QAJyGIPdx+f03oBir+7vvEwhY
 svxd+V9xXK0UgWNGkCvlMk/1RIVy0qqtXliUrDaE+9tcHACA9+iAxMmNmDsjLOiO
 gpazuz5kgeznrmp1eNwQnYTt+OCReQIcyCsj4q4fNo9bbETTyr2bRz226LEuZekC
 TAiKdphYoOszFBgTVg5gfu+lqjHyXjgXPnwMTlRYn1y4YL2adDIgxj9cFedykTTW
 Eu593TY2dH6ovERJ6q3qxZbRuWuxtww95J07b3t2/2Eb3e/R/zlX0/XJ/C88f/m2
 DkqngbOYqCdw+zJeN6k8631foyfUwAcTd0sJ1+5nsm5H4NE5NqObjbxOk5/yNht6
 HgvgISGHWLerEw+A/Dk6o0oZOtR1G/TAQ5qQk5nUfKT/sSoU+9/USsXtWhXwZCia
 XccnJgW6ZtPrJJP3zDnkrxe3gndmLic11QXArw2IhWTsq0sZlAyMgtauBXLdDiQa
 H/AMiYrUNmIABef1cirBLTtgXN4Zbsai9vIrxMmV7OgBrclrh52NTjzr05P5Hnl2
 fRK56mb6mP59LymI7n8fyXL8tHnbNwFvTaxuvrZmzcYbzL0l9DuPocJrrTHRSfhm
 GFfzfvLj0R66ZM4PthRSwz4H2v1FnlRcCkj5k/QjtBPlyzxtOnJveqve5umbrnb9
 T5mRmlAs4iYwLuKCVVNT
 =sIv/
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:
 "Features-wise, to me the most important this time is a rework of
  wakeup interrupts handling in the core that makes them work
  consistently across all of the available sleep states, including
  suspend-to-idle.  Many thanks to Thomas Gleixner for his help with
  this work.

  Second is an update of the generic PM domains code that has been in
  need of some care for quite a while.  Unused code is being removed, DT
  support is being added and domains are now going to be attached to
  devices in bus type code in analogy with the ACPI PM domain.  The
  majority of work here was done by Ulf Hansson who also has been the
  most active developer this time.

  Apart from this we have a traditional ACPICA update, this time to
  upstream version 20140828 and a few ACPI wakeup interrupts handling
  patches on top of the general rework mentioned above.  There also are
  several cpufreq commits including renaming the cpufreq-cpu0 driver to
  cpufreq-dt, as this is what implements generic DT-based cpufreq
  support, and a new DT-based idle states infrastructure for cpuidle.

  In addition to that, the ACPI LPSS driver is updated, ACPI support for
  Apple machines is improved, a few bugs are fixed and a few cleanups
  are made all over.

  Finally, the Adaptive Voltage Scaling (AVS) subsystem now has a tree
  maintained by Kevin Hilman that will be merged through the PM tree.

  Numbers-wise, the generic PM domains update takes the lead this time
  with 32 non-merge commits, second is cpufreq (15 commits) and the 3rd
  place goes to the wakeup interrupts handling rework (13 commits).

  Specifics:

   - Rework the handling of wakeup IRQs by the IRQ core such that all of
     them will be switched over to "wakeup" mode in suspend_device_irqs()
     and in that mode the first interrupt will abort system suspend in
     progress or wake up the system if already in suspend-to-idle (or
     equivalent) without executing any interrupt handlers.  Among other
     things that eliminates the wakeup-related motivation to use the
     IRQF_NO_SUSPEND interrupt flag with interrupts which don't really
     need it and should not use it (Thomas Gleixner and Rafael Wysocki)

   - Switch over ACPI to handling wakeup interrupts with the help of the
     new mechanism introduced by the above IRQ core rework (Rafael Wysocki)

   - Rework the core generic PM domains code to eliminate code that's
     not used, add DT support and add a generic mechanism by which
     devices can be added to PM domains automatically during enumeration
     (Ulf Hansson, Geert Uytterhoeven and Tomasz Figa).

   - Add debugfs-based mechanics for debugging generic PM domains
     (Maciej Matraszek).

   - ACPICA update to upstream version 20140828.  Included are updates
     related to the SRAT and GTDT tables and the _PSx methods are in the
     METHOD_NAME list now (Bob Moore and Hanjun Guo).

   - Add _OSI("Darwin") support to the ACPI core (unfortunately, that
     can't really be done in a straightforward way) to prevent
     Thunderbolt from being turned off on Apple systems after boot (or
     after resume from system suspend) and rework the ACPI Smart Battery
     Subsystem (SBS) driver to work correctly with Apple platforms
     (Matthew Garrett and Andreas Noever).

   - ACPI LPSS (Low-Power Subsystem) driver update cleaning up the code,
     adding support for 133MHz I2C source clock on Intel Baytrail to it
     and making it avoid using UART RTS override with Auto Flow Control
     (Heikki Krogerus).

   - ACPI backlight updates removing the video_set_use_native_backlight
     quirk which is not necessary any more, making the code check the
     list of output devices returned by the _DOD method to avoid
     creating acpi_video interfaces that won't work and adding a quirk
     for Lenovo Ideapad Z570 (Hans de Goede, Aaron Lu and Stepan Bujnak)

   - New Win8 ACPI OSI quirks for some Dell laptops (Edward Lin)

   - Assorted ACPI code cleanups (Fabian Frederick, Rasmus Villemoes,
     Sudip Mukherjee, Yijing Wang, and Zhang Rui)

   - cpufreq core updates and cleanups (Viresh Kumar, Preeti U Murthy,
     Rasmus Villemoes)

   - cpufreq driver updates: cpufreq-cpu0/cpufreq-dt (driver name change
     among other things), ppc-corenet, powernv (Viresh Kumar, Preeti U
     Murthy, Shilpasri G Bhat, Lucas Stach)

   - cpuidle support for DT-based idle states infrastructure, new ARM64
     cpuidle driver, cpuidle core cleanups (Lorenzo Pieralisi, Rasmus
     Villemoes)

   - ARM big.LITTLE cpuidle driver updates: support for DT-based
     initialization and Exynos5800 compatible string (Lorenzo Pieralisi,
     Kevin Hilman)

   - Rework of the test_suspend kernel command line argument and a new
     trace event for console resume (Srinivas Pandruvada, Todd E Brandt)

   - Second attempt to optimize swsusp_free() (hibernation core) to make
     it avoid going through all PFNs which may be way too slow on some
     systems (Joerg Roedel)

   - devfreq updates (Paul Bolle, Punit Agrawal, Ãrjan Eide).

   - rockchip-io Adaptive Voltage Scaling (AVS) driver and AVS entry
     update in MAINTAINERS (Heiko Stübner, Kevin Hilman)

   - PM core fix related to clock management (Geert Uytterhoeven)

   - PM core's sysfs code cleanup (Johannes Berg)"

* tag 'pm+acpi-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (105 commits)
  ACPI / fan: printk replacement
  PM / clk: Fix crash in clocks management code if !CONFIG_PM_RUNTIME
  PM / Domains: Rename cpu_data to cpuidle_data
  cpufreq: cpufreq-dt: fix potential double put of cpu OF node
  cpufreq: cpu0: rename driver and internals to 'cpufreq_dt'
  PM / hibernate: Iterate over set bits instead of PFNs in swsusp_free()
  cpufreq: ppc-corenet: remove duplicate update of cpu_data
  ACPI / sleep: Rework the handling of ACPI GPE wakeup from suspend-to-idle
  PM / sleep: Rename platform suspend/resume functions in suspend.c
  PM / sleep: Export dpm_suspend_late/noirq() and dpm_resume_early/noirq()
  ACPICA: Introduce acpi_enable_all_wakeup_gpes()
  ACPICA: Clear all non-wakeup GPEs in acpi_hw_enable_wakeup_gpe_block()
  ACPI / video: check _DOD list when creating backlight devices
  PM / Domains: Move dev_pm_domain_attach|detach() to pm_domain.h
  cpufreq: Replace strnicmp with strncasecmp
  cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec
  cpufreq: powernv: Set the pstate of the last hotplugged out cpu in policy->cpus to minimum
  cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
  PM / devfreq: exynos: Enable building exynos PPMU as module
  PM / devfreq: Export helper functions for drivers
  ...
2014-10-09 16:07:43 -04:00
Linus Torvalds 80213c03c4 PCI changes for the v3.18 merge window:
Enumeration
     - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
     - Enable Config Request Retry Status when supported (Rajat Jain)
     - Add generic domain handling (Catalin Marinas)
     - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)
 
   Resource management
     - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
     - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)
 
   PCI device hotplug
     - Prevent NULL dereference during pciehp probe (Andreas Noever)
     - Move _HPP & _HPX handling into core (Bjorn Helgaas)
     - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
     - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
     - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
     - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
     - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
     - Fix wait time in pciehp timeout message (Yinghai Lu)
     - Add more pciehp Slot Control debug output (Yinghai Lu)
     - Stop disabling pciehp notifications during init (Yinghai Lu)
 
   MSI
     - Remove arch_msi_check_device() (Alexander Gordeev)
     - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
     - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
     - Remove unused kobject from struct msi_desc (Yijing Wang)
     - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
     - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
     - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
     - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
     - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)
 
   Power management
     - Drop unused runtime PM support code for PCIe ports (Rafael J.  Wysocki)
     - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)
 
   AER
     - Add additional AER error strings (Gong Chen)
     - Make <linux/aer.h> standalone includable (Thierry Reding)
 
   Virtualization
     - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
     - Add ACS quirk for Intel 10G NICs (Alex Williamson)
     - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
     - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
     - Add device flag helpers (Ethan Zhao)
     - Assume all Mellanox devices have broken INTx masking (Gavin Shan)
 
   Generic host bridge driver
     - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
     - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
     - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
     - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
     - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
     - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
     - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
     - Add arm64 architectural support for PCI (Liviu Dudau)
 
   APM X-Gene
     - Add APM X-Gene PCIe driver (Tanmay Inamdar)
     - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)
 
   Freescale i.MX6
     - Probe in module_init(), not fs_initcall() (Lucas Stach)
     - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)
 
   Marvell MVEBU
     - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)
 
   NVIDIA Tegra
     - Make sure the PCIe PLL is really reset (Eric Yuen)
     - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
     - Fix extended configuration space mapping (Peter Daifuku)
     - Implement resource hierarchy (Thierry Reding)
     - Clear CLKREQ# enable on port disable (Thierry Reding)
     - Add Tegra124 support (Thierry Reding)
 
   ST Microelectronics SPEAr13xx
     - Pass config resource through reg property (Pratyush Anand)
 
   Synopsys DesignWare
     - Use NULL instead of false (Fabio Estevam)
     - Parse bus-range property from devicetree (Lucas Stach)
     - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
     - Remove pci_assign_unassigned_resources() (Lucas Stach)
     - Check private_data validity in single place (Lucas Stach)
     - Setup and clear exactly one MSI at a time (Lucas Stach)
     - Remove open-coded bitmap operations (Lucas Stach)
     - Fix configuration base address when using 'reg' (Minghuan Lian)
     - Fix IO resource end address calculation (Minghuan Lian)
     - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
     - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
     - Add support for v3.65 hardware (Murali Karicheri)
     - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)
 
   TI Keystone
     - Add TI Keystone PCIe driver (Murali Karicheri)
     - Limit MRSS for all downstream devices (Murali Karicheri)
     - Assume controller is already in RC mode (Murali Karicheri)
     - Set device ID based on SoC to support multiple ports (Murali Karicheri)
 
   Xilinx AXI
     - Add Xilinx AXI PCIe driver (Srikanth Thokala)
     - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)
 
   Miscellaneous
     - Clean up whitespace (Quentin Lambert)
     - Remove assignments from "if" conditions (Quentin Lambert)
     - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
     - x86: Mark DMI tables as initialization data (Mathias Krause)
     - x86: Move __init annotation to the correct place (Mathias Krause)
     - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
     - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
     - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
     - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
     - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNWmJAAoJEFmIoMA60/r8GncP/3uHRoBrnaF6pv+S1l1p3Fs/
 l1kKH91/IuAAU7VJX8pkNybFqx02topWmiVVXAzqvD01PcRLGCLjPbWl5h+y5/Ja
 CHZH33AwHAmm0kt4BrOSOeHTLJhAigly2zV3P4F8jRIgyaeMoGZ6Ko4tkQUpm21k
 +ohrOd4cxYkmzzCjKwsZZhKnyRNpae8FmTk3VQBPuN8DbhvFPrqo5/+GeAdSZTdS
 HZHpfl2HL4095aY7uBVsZqNkjQyl6SnWwjkjLnuI8q3qA3BLgDZE/Jr8F/MNuW1V
 y01JIjerFWMDFyBIkpg7moYnODy6oP3KvczwYdKGmqsJja+0MQvYhLTwD+R/yTQS
 SewJA0mL3T3EJEfnFYkCiaIX27xIwk/FxHfaKPN91xgx/QM7xCVZNrU2/dXjhoX1
 GqLKxOEaFHhWWTyT5Dj27I0ZcElzFZ3tIwvrHfs8y22oAuAlsAypaUgvUwRfL4CO
 hOj4ITZa0t041sYWqxCoGAA9Fdp8HMzNKKS5F4mhADz4Ad9v6uPCNv/s/RoxVsbm
 jhZOtPYJ0/iCA+kNVX563S8Z3VpfPI+7bBjcj2WKdzW+IlICvOKT+kvwL2Tv/rE7
 w0hrNsbkgGsYbPldMx7LwCavsUtYFuNj0zoU6vkhP2jk6O2Tn5VXDmjrXH0v3iHI
 v03vlUtre0bQ26fzDyLQ
 =4Zv1
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "The interesting things here are:

   - Turn on Config Request Retry Status Software Visibility.  This
     caused hangs last time, but we included a fix this time.
   - Rework PCI device configuration to use _HPP/_HPX more aggressively
   - Allow PCI devices to be put into D3cold during system suspend
   - Add arm64 PCI support
   - Add APM X-Gene host bridge driver
   - Add TI Keystone host bridge driver
   - Add Xilinx AXI host bridge driver

  More detailed summary:

  Enumeration
    - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
    - Enable Config Request Retry Status when supported (Rajat Jain)
    - Add generic domain handling (Catalin Marinas)
    - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)

  Resource management
    - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
    - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)

  PCI device hotplug
    - Prevent NULL dereference during pciehp probe (Andreas Noever)
    - Move _HPP & _HPX handling into core (Bjorn Helgaas)
    - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
    - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
    - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
    - Fix wait time in pciehp timeout message (Yinghai Lu)
    - Add more pciehp Slot Control debug output (Yinghai Lu)
    - Stop disabling pciehp notifications during init (Yinghai Lu)

  MSI
    - Remove arch_msi_check_device() (Alexander Gordeev)
    - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
    - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
    - Remove unused kobject from struct msi_desc (Yijing Wang)
    - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
    - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
    - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
    - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
    - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)

  Power management
    - Drop unused runtime PM support code for PCIe ports (Rafael J.  Wysocki)
    - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)

  AER
    - Add additional AER error strings (Gong Chen)
    - Make <linux/aer.h> standalone includable (Thierry Reding)

  Virtualization
    - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
    - Add ACS quirk for Intel 10G NICs (Alex Williamson)
    - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
    - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
    - Add device flag helpers (Ethan Zhao)
    - Assume all Mellanox devices have broken INTx masking (Gavin Shan)

  Generic host bridge driver
    - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
    - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
    - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
    - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
    - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
    - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
    - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
    - Add arm64 architectural support for PCI (Liviu Dudau)

  APM X-Gene
    - Add APM X-Gene PCIe driver (Tanmay Inamdar)
    - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)

  Freescale i.MX6
    - Probe in module_init(), not fs_initcall() (Lucas Stach)
    - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)

  Marvell MVEBU
    - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)

  NVIDIA Tegra
    - Make sure the PCIe PLL is really reset (Eric Yuen)
    - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
    - Fix extended configuration space mapping (Peter Daifuku)
    - Implement resource hierarchy (Thierry Reding)
    - Clear CLKREQ# enable on port disable (Thierry Reding)
    - Add Tegra124 support (Thierry Reding)

  ST Microelectronics SPEAr13xx
    - Pass config resource through reg property (Pratyush Anand)

  Synopsys DesignWare
    - Use NULL instead of false (Fabio Estevam)
    - Parse bus-range property from devicetree (Lucas Stach)
    - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
    - Remove pci_assign_unassigned_resources() (Lucas Stach)
    - Check private_data validity in single place (Lucas Stach)
    - Setup and clear exactly one MSI at a time (Lucas Stach)
    - Remove open-coded bitmap operations (Lucas Stach)
    - Fix configuration base address when using 'reg' (Minghuan Lian)
    - Fix IO resource end address calculation (Minghuan Lian)
    - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
    - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
    - Add support for v3.65 hardware (Murali Karicheri)
    - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)

  TI Keystone
    - Add TI Keystone PCIe driver (Murali Karicheri)
    - Limit MRSS for all downstream devices (Murali Karicheri)
    - Assume controller is already in RC mode (Murali Karicheri)
    - Set device ID based on SoC to support multiple ports (Murali Karicheri)

  Xilinx AXI
    - Add Xilinx AXI PCIe driver (Srikanth Thokala)
    - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)

  Miscellaneous
    - Clean up whitespace (Quentin Lambert)
    - Remove assignments from "if" conditions (Quentin Lambert)
    - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
    - x86: Mark DMI tables as initialization data (Mathias Krause)
    - x86: Move __init annotation to the correct place (Mathias Krause)
    - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
    - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
    - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
    - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
    - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)"

* tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (109 commits)
  arm64: dts: Add APM X-Gene PCIe device tree nodes
  PCI: Add ACS quirk for AMD A88X southbridge devices
  PCI: xgene: Add APM X-Gene PCIe driver
  PCI: designware: Remove open-coded bitmap operations
  PCI/MSI: Remove unnecessary temporary variable
  PCI/MSI: Use __write_msi_msg() instead of write_msi_msg()
  MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
  PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg()
  PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
  PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib
  PCI/MSI: Remove unused kobject from struct msi_desc
  PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported()
  PCI/MSI: Move D0 check into pci_msi_check_device()
  PCI/MSI: Remove arch_msi_check_device()
  irqchip: armada-370-xp: Remove arch_msi_check_device()
  PCI/MSI/PPC: Remove arch_msi_check_device()
  arm64: Add architectural support for PCI
  PCI: Add pci_remap_iospace() to map bus I/O resources
  of/pci: Add support for parsing PCI host bridge resources from DT
  of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()
  ...

Conflicts:
	arch/arm64/boot/dts/apm-storm.dtsi
2014-10-09 15:03:49 -04:00