Commit graph

77901 commits

Author SHA1 Message Date
Simon Fels 6b3cc1db68 Bluetooth: close HCI device when user channel socket gets closed
With 9380f9eacf the order of unsetting
the HCI_USER_CHANNEL flag of the HCI device was reverted to ensure
the device is first closed before making it available again.

Due to hci_dev_close checking for HCI_USER_CHANNEL being set on the
device it was never really closed and was kept opened. We're now
calling hci_dev_do_close directly to make sure the device is correctly
closed and we keep the correct order to unset the flag on our device
object.

Signed-off-by: Simon Fels <simon.fels@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17 13:20:02 +02:00
Loic Poulain 6f558b70fb Bluetooth: Add bt_dev logging macros
Add specific bluetooth device logging macros since hci device name is
repeatedly referred in bluetooth subsystem logs.

Signed-off-by: Loic Poulain <loic.poulain@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17 13:20:00 +02:00
Alex Gartrell 4e478098ac ipvs: add sysctl to ignore tunneled packets
This is a way to avoid nasty routing loops when multiple ipvs instances can
forward to eachother.

Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2015-09-17 11:50:02 +09:00
Geliang Tang 0243ed44ad spi: fix kernel-doc warnings in spi.h
Fix the following 'make htmldocs' warnings:

  .//include/linux/spi/spi.h:71: warning: No description found for parameter 'lock'
  .//include/linux/spi/spi.h:71: warning: Excess struct/union/enum/typedef member 'clock' description in 'spi_statistics'

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2015-09-16 20:44:47 +01:00
Tejun Heo 0c986253b9 Revert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem"
This reverts commit d59cfc09c3.

d59cfc09c3 ("sched, cgroup: replace signal_struct->group_rwsem with
a global percpu_rwsem") and b5ba75b5fc ("cgroup: simplify
threadgroup locking") changed how cgroup synchronizes against task
fork and exits so that it uses global percpu_rwsem instead of
per-process rwsem; unfortunately, the write [un]lock paths of
percpu_rwsem always involve synchronize_rcu_expedited() which turned
out to be too expensive.

Improvements for percpu_rwsem are scheduled to be merged in the coming
v4.4-rc1 merge window which alleviates this issue.  For now, revert
the two commits to restore per-process rwsem.  They will be re-applied
for the v4.4-rc1 merge window.

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/g/55F8097A.7000206@de.ibm.com
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org # v4.2+
2015-09-16 11:51:12 -04:00
Thomas Gleixner bd0b9ac405 genirq: Remove irq argument from irq flow handlers
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.

Remove the argument.

Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
2015-09-16 15:47:51 +02:00
Jiang Liu b237721c5d genirq: Move field 'msi_desc' from irq_data into irq_common_data
MSI descriptors are per-irq instead of per irqchip, so move it into
struct irq_common_data.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-35-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 15:46:49 +02:00
Jiang Liu 9df872faa7 genirq: Move field 'affinity' from irq_data into irq_common_data
Irq affinity mask is per-irq instead of per irqchip, so move it into
struct irq_common_data.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433303281-27688-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 15:46:49 +02:00
Jiang Liu af7080e040 genirq: Move field 'handler_data' from irq_data into irq_common_data
Handler data (handler_data) is per-irq instead of per irqchip, so move
it into struct irq_common_data.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-13-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 15:46:49 +02:00
Jiang Liu 449e9cae58 genirq: Move field 'node' from irq_data into irq_common_data
NUMA node information is per-irq instead of per-irqchip, so move it into
struct irq_common_data. Also use CONFIG_NUMA to guard irq_common_data.node.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433145945-789-8-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 15:46:49 +02:00
Thomas Gleixner fc5697126a genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
Provide a irq data flag to mark an irq forwarded to a VCPU along with
the accessor functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-16 15:46:48 +02:00
Thomas Gleixner 755d119a62 genirq: Simplify irq_data_to_desc()
Avoid the lookup of irq_desc and use the same mechanism for
hierarchical and flat irqdomains.

Based-on-a-patch-from: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 15:43:11 +02:00
Thomas Gleixner 123236ccac genirq: Remove __irq_set_handler_locked()
All users converted to irq_set_handler_locked()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 15:43:11 +02:00
Thomas Gleixner e902e14549 genirq: Remove __irq_set_chip_handler_name_locked()
All users converted to irq_set_chip_handler_name_locked()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16 12:55:15 +02:00
Stephen Rothwell b84ee0d7f3 cdc: add header guards
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 21:53:24 -07:00
Rafael J. Wysocki 1f0bd44e93 cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
cpufreq_cpu_get() called by get_cur_freq_on_cpu() is overkill,
because the ->get() callback is always invoked in a context in
which all of the conditions checked by cpufreq_cpu_get() are
guaranteed to be satisfied.

Use cpufreq_cpu_get_raw() instead of it and drop the
corresponding cpufreq_cpu_put() from get_cur_freq_on_cpu().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-09-16 02:17:49 +02:00
Sowmini Varadhan d5566fd72e rtnetlink: RTEXT_FILTER_SKIP_STATS support to avoid dumping inet/inet6 stats
Many commonly used functions like getifaddrs() invoke RTM_GETLINK
to dump the interface information, and do not need the
the AF_INET6 statististics that are always returned by default
from rtnl_fill_ifinfo().

Computing the statistics can be an expensive operation that impacts
scaling, so it is desirable to avoid this if the information is
not needed.

This patch adds a the RTEXT_FILTER_SKIP_STATS extended info flag that
can be passed with netlink_request() to avoid statistics computation
for the ifinfo path.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 15:25:02 -07:00
Martin KaFai Lau 70da5b5c53 ipv6: Replace spinlock with seqlock and rcu in ip6_tunnel
This patch uses a seqlock to ensure consistency between idst->dst and
idst->cookie.  It also makes dst freeing from fib tree to undergo a
rcu grace period.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:05 -07:00
Martin KaFai Lau cdf3464e6c ipv6: Fix dst_entry refcnt bugs in ip6_tunnel
Problems in the current dst_entry cache in the ip6_tunnel:

1. ip6_tnl_dst_set is racy.  There is no lock to protect it:
   - One major problem is that the dst refcnt gets messed up. F.e.
     the same dst_cache can be released multiple times and then
     triggering the infamous dst refcnt < 0 warning message.
   - Another issue is the inconsistency between dst_cache and
     dst_cookie.

   It can be reproduced by adding and removing the ip6gre tunnel
   while running a super_netperf TCP_CRR test.

2. ip6_tnl_dst_get does not take the dst refcnt before returning
   the dst.

This patch:
1. Create a percpu dst_entry cache in ip6_tnl
2. Use a spinlock to protect the dst_cache operations
3. ip6_tnl_dst_get always takes the dst refcnt before returning

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:05 -07:00
Martin KaFai Lau f230d1e891 ipv6: Rename the dst_cache helper functions in ip6_tunnel
It is a prep work to fix the dst_entry refcnt bugs in
ip6_tunnel.

This patch rename:
1. ip6_tnl_dst_check() to ip6_tnl_dst_get() to better
   reflect that it will take a dst refcnt in the next patch.
2. ip6_tnl_dst_store() to ip6_tnl_dst_set() to have a more
   conventional name matching with ip6_tnl_dst_get().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:04 -07:00
David S. Miller ad1e7b97b3 cdc: Fix build warning.
In file included from drivers/usb/gadget/function/u_serial.h:16:0,
                    from drivers/usb/gadget/function/f_acm.c:23:
>> include/linux/usb/cdc.h:47:5: warning: 'struct usb_interface' declared inside parameter list
        int buflen);
        ^
>> include/linux/usb/cdc.h:47:5: warning: its scope is only this definition or declaration, which is probably not what you want

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 13:25:03 -07:00
Oliver Neukum c40a2c8817 CDC: common parser for extra headers
CDC drivers all implement their own parser for the extra headers.
This patch fixes the code duplication introducing a single common
parser in usbnet.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 12:43:11 -07:00
David Ahern c36ba6603a net: Allow user to get table id from route lookup
rt_fill_info which is called for 'route get' requests hardcodes the
table id as RT_TABLE_MAIN which is not correct when multiple tables
are used. Use the newly added table id in the rtable to send back
the correct table similar to what is done for IPv6.

To maintain current ABI a new request flag, RTM_F_LOOKUP_TABLE, is
added to indicate the actual table is wanted versus the hardcoded
response.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 12:01:41 -07:00
David Ahern b7503e0cdb net: Add FIB table id to rtable
Add the FIB table id to rtable to make the information available for
IPv4 as it is for IPv6.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 12:01:41 -07:00
Huang Shijie 6584d84c3e genirq: Update the comment for generic_handle_irq_desc
__do_IRQ() was removed by commit 1c77ff2 "genirq: Remove __do_IRQ",
but the comment referring to __do_IRQ() was left.

Update the comment for generic_handle_irq_desc().

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Cc: jiang.liu@linux.intel.com
Cc: peterz@infradead.org
Cc: rafael.j.wysocki@intel.com
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Link: http://lkml.kernel.org/r/1441074950-3893-1-git-send-email-shijie.huang@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-15 17:06:30 +02:00
Thomas Gleixner 3829c664b1 genirq: Remove stale comment
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-15 17:06:29 +02:00
Jonathan Corbet 1975dbc276 locking/static_keys: Fix up the static keys documentation
Fix a few small mistakes in the static key documentation and
delete an unneeded sentence.

Suggested-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150914171105.511e1e21@lwn.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-15 07:12:06 +02:00
Sudeep Holla bcb2b0b2ba ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()
This commit removes all CONFIG_.*{,_MODULE} in ACPI code, replacing it
with IS_ENABLED().

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-15 03:05:45 +02:00
Rafael J. Wysocki 4184a8fc57 Merge branch 'for-rafael' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq into pm-devfreq
Pull devfreq updates for v4.3 from MyungJoo Ham.

* 'for-rafael' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq:
  PM / devfreq: Fix incorrect type issue.
  PM / devfreq: tegra: Update governor to use devfreq_update_stats()
  PM / devfreq: comments for get_dev_status usage updated
  PM / devfreq: drop comment about thermal setting max_freq
  PM / devfreq: cache the last call to get_dev_status()
  PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
  PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
  PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
  PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
  PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding

Conflicts:
	drivers/devfreq/event/exynos-ppmu.c
2015-09-15 01:29:43 +02:00
Florian Westphal 63cdbc06b3 netfilter: bridge: fix routing of bridge frames with call-iptables=1
We can't re-use the physoutdev storage area.

1.  When using NFQUEUE in PREROUTING, we attempt to bump a bogus
refcnt since nf_bridge->physoutdev is garbage (ipv4/ipv6 address)

2. for same reason, we crash in physdev match in FORWARD or later if
skb is routed instead of bridged.

This increases nf_bridge_info to 40 bytes, but we have no other choice.

Fixes: 72b1e5e4ca ("netfilter: bridge: reduce nf_bridge_info to 32 bytes again")
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-14 18:08:33 +02:00
Javi Merino c973c3bcec thermal: Add a function to get the minimum power
The thermal core already has a function to get the maximum power of a
cooling device: power_actor_get_max_power().  Add a function to get the
minimum power of a cooling device.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-09-14 07:39:46 -07:00
Viresh Kumar eef7635a22 clockevents: Remove unused set_mode() callback
All users are migrated to the per-state callbacks, get rid of the
unused interface and the core support code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linaro-kernel@lists.linaro.org
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-14 11:00:55 +02:00
Punit Agrawal cd33dc9ac2 thermal: Fix thermal_zone_of_sensor_register to match documentation
thermal_zone_of_sensor_register is documented as returning a pointer
to either a valid thermal_zone_device on success, or a corresponding
ERR_PTR() value.

In contrast, the function returns NULL when THERMAL_OF is configured
off. Fix this.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-09-13 19:33:07 -07:00
Linus Torvalds 10fbd36e36 blk: rq_data_dir() should not return a boolean
rq_data_dir() returns either READ or WRITE (0 == READ, 1 == WRITE), not
a boolean value.

Now, admittedly the "!= 0" doesn't really change the value (0 stays as
zero, 1 stays as one), but it's not only redundant, it confuses gcc, and
causes gcc to warn about the construct

    switch (rq_data_dir(req)) {
        case READ:
            ...
        case WRITE:
            ...

that we have in a few drivers.

Now, the gcc warning is silly and stupid (it seems to warn not about the
switch value having a different type from the case statements, but about
_any_ boolean switch value), but in this case the code itself is silly
and stupid too, so let's just change it, and get rid of warnings like
this:

  drivers/block/hd.c: In function ‘hd_request’:
  drivers/block/hd.c:630:11: warning: switch condition has boolean value [-Wswitch-bool]
     switch (rq_data_dir(req)) {

The odd '!= 0' came in when "cmd_flags" got turned into a "u64" in
commit 5953316dbf ("block: make rq->cmd_flags be 64-bit") and is
presumably because the old code (that just did a logical 'and' with 1)
would then end up making the type of rq_data_dir() be u64 too.

But if we want to retain the old regular integer type, let's just cast
the result to 'int' rather than use that rather odd '!= 0'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-12 12:03:30 -07:00
Linus Torvalds 01b0c014ee Merge branch 'akpm' (patches from Andrew)
Merge fourth patch-bomb from Andrew Morton:

 - sys_membarier syscall

 - seq_file interface changes

 - a few misc fixups

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  revert "ocfs2/dlm: use list_for_each_entry instead of list_for_each"
  mm/early_ioremap: add explicit #include of asm/early_ioremap.h
  fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void
  selftests: enhance membarrier syscall test
  selftests: add membarrier syscall test
  sys_membarrier(): system-wide memory barrier (generic, x86)
  MODSIGN: fix a compilation warning in extract-cert
2015-09-11 19:34:09 -07:00
Linus Torvalds ded0e250b5 NTB bug and documentation fixes, new device IDs, performance
improvements, and adding a mailing list to MAINTAINERS for NTB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJV84vnAAoJEG5mS6x6i9IjbNkQAJsaG804NrU1hu4lpXX0nC2d
 LlmN6o9HveAJLZf0hX7WE5cjd0jkgzYevgXoFDOnTs9OpMATeQJ65pq/6fUWzrjd
 P/wnZFeJoGP/SWAm1M6r4mhOJe9L9R574FNKypopmm1yWuijDvqtAskhJX8wR++7
 BWBRlL7lryCsvaNrZBMKWBWBih3TSAD1g2l4F/1TxSU25aBz3LU2u5fCNvtDbhJH
 FY9DKnymI91c4Do24iV8uTzGKUX+gXe0COsN5E57zCn2yYRKflz7H2reLXLNWXSv
 X2sNHscnix26GqyqAxvFCz0Lja0cLr1nT5LxcaZmLWgxY5Y52VcT14TKUu36i1L9
 Ppt52QA56ZQaxQ1luLxBTfEuw/QdAVz1JKPf8aqKqzVKrOAcFp4566tmsfCmulq7
 GhLZVzb80c5BEPNwOdJhRVxfcPQ/eSe4CMRa1gkyM4+vLbFf3bDpsZIUO/Bx7ug8
 zyg0hBYNIKvhSMXb3vZs30xz2pVI13yxNnURXtP4P4Ifig8WYkcvxqYPvsPFjLmd
 yYm89ulBjhubTCNu3+40Jd+QWqnOE3egFobUis2BNOu5adZV7bQwu79DGYx3zysK
 OqPtS1qOpXR1ZA+AaRODk115ZFxa/Ebos6Lr0HnaZYwRAAF9HWPiS4ilLrH+Ls8R
 JN229+2kgvred9kBBwYB
 =CASz
 -----END PGP SIGNATURE-----

Merge tag 'ntb-4.3' of git://github.com/jonmason/ntb

Pull NTB fixes from Jon Mason:
 "NTB bug and documentation fixes, new device IDs, performance
  improvements, and adding a mailing list to MAINTAINERS for NTB"

* tag 'ntb-4.3' of git://github.com/jonmason/ntb:
  NTB: Fix range check on memory window index
  NTB: Improve index handling in B2B MW workaround
  NTB: Fix documentation for ntb_peer_db_clear.
  NTB: Fix documentation for ntb_link_is_up
  NTB: Use unique DMA channels for TX and RX
  NTB: Remove dma_sync_wait from ntb_async_rx
  NTB: Clean up QP stats info
  NTB: Make the transport list in order of discovery
  NTB: Add PCI Device IDs for Broadwell Xeon
  NTB: Add flow control to the ntb_netdev
  NTB: Add list to MAINTAINERS
2015-09-11 19:29:00 -07:00
Linus Torvalds fa9a67ef9d Additional power management and ACPI material for v4.3-rc1
- Build fix for the new Mediatek MT8173 cpufreq driver (Guenter Roeck).
 
  - Generic power domains framework fixes (power on error code
    path, subdomain removal) and cleanup of a deprecated API user
    (Geert Uytterhoeven, Jon Hunter, Ulf Hansson).
 
  - cpufreq-dt driver fixes including two fixes for bugs related to
    the new Operating Performance Points Device Tree bindings
    introduced recently (Viresh Kumar).
 
  - Suspend frequency support for the cpufreq-dt driver
    (Bartlomiej Zolnierkiewicz, Viresh Kumar).
 
  - cpufreq core cleanups (Viresh Kumar).
 
  - intel_pstate driver fixes (Chen Yu, Kristen Carlson Accardi).
 
  - Additional sanity check in the cpuidle core (Xunlei Pang).
 
  - Fix for a comment related to CPU power management (Lina Iyer).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJV80XwAAoJEILEb/54YlRxIAkP/RrT4mDO9QZkocO+XYDBB739
 oi1aaSDmmVVKlFtFiyX4njS7S+U5UC8PYn5o/fFtVnP1kD9yuZkoak65U3t7bsKK
 +rjun1SzJeooaLXW4ZAJKyPldi/RrsfRJ2JOaEt2vK03ppjx2mouK1VpjNj+qmYd
 wwFb4q5S5Rcpph0Sx+xxhpZKoOgBmXk7nWIe5bWDxuORdx4yoxp4lus/P223wFJy
 r0kJnNyY55mnY4Yx5e3L8W1rMC/Sf6mt8pyMvqRpsKdBHaEbhmGsiERSi0eCZJwr
 AmL666TVgPk6zNM/yc3mFOTRZeAF3soSWF5R6+n2TswWvWFH+ksM14g9ndb7fTv3
 FUuOU3jyfSUj2T4xr0qUZLaQUFB3xNrnWBNnm/CzPwjJtH1u1SK8uBSRqdZnEdGq
 EeqwnwDLFeSAbG6lQVsKVVAPZHolDCnJPsvZebKdnJH09kjgUdm60BLOnif4QDgw
 ULPZ6NToFZkrQSMAX1Uqm42iv/qWcHIgS5x43jAmblY3TsuByqRrI7gBCy+hbGSN
 cQfeCxTdVPi/tUPYlJFmWn43dgEsnceSFKvlQRXGrZJM29BFvhVzJQjTFC8Oj79q
 OP/Rv28+VeV5J531BQu+w8cvg73GJ3DDab+i91ZkIsd7ZNjRYQKXBfR9TImTSD4l
 BPjvaPXwDg6EPhwjcpK/
 =C7+4
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management and ACPI updates from Rafael Wysocki:
 "These are mostly fixes and cleanups on top of the previous PM+ACPI
  pull request (cpufreq core and drivers, cpuidle, generic power domains
  framework).  Some of them didn't make to that pull request and some
  fix issues introduced by it.

  The only really new thing is the support for suspend frequency in the
  cpufreq-dt driver, but it is needed to fix an issue with Exynos
  platforms.

  Specifics:

   - build fix for the new Mediatek MT8173 cpufreq driver (Guenter
     Roeck).

   - generic power domains framework fixes (power on error code path,
     subdomain removal) and cleanup of a deprecated API user (Geert
     Uytterhoeven, Jon Hunter, Ulf Hansson).

   - cpufreq-dt driver fixes including two fixes for bugs related to the
     new Operating Performance Points Device Tree bindings introduced
     recently (Viresh Kumar).

   - suspend frequency support for the cpufreq-dt driver (Bartlomiej
     Zolnierkiewicz, Viresh Kumar).

   - cpufreq core cleanups (Viresh Kumar).

   - intel_pstate driver fixes (Chen Yu, Kristen Carlson Accardi).

   - additional sanity check in the cpuidle core (Xunlei Pang).

   - fix for a comment related to CPU power management (Lina Iyer)"

* tag 'pm+acpi-4.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_pstate: fix PCT_TO_HWP macro
  intel_pstate: Fix user input of min/max to legal policy region
  PM / OPP: Return suspend_opp only if it is enabled
  cpufreq-dt: add suspend frequency support
  cpufreq: allow cpufreq_generic_suspend() to work without suspend frequency
  PM / OPP: add dev_pm_opp_get_suspend_opp() helper
  staging: board: Migrate away from __pm_genpd_name_add_device()
  cpufreq: Use __func__ to print function's name
  cpufreq: staticize cpufreq_cpu_get_raw()
  PM / Domains: Ensure subdomain is not in use before removing
  cpufreq: Add ARM_MT8173_CPUFREQ dependency on THERMAL
  cpuidle/coupled: Add sanity check for safe_state_index
  PM / Domains: Try power off masters in error path of __pm_genpd_poweron()
  cpufreq: dt: Tolerance applies on both sides of target voltage
  cpufreq: dt: Print error on failing to mark OPPs as shared
  cpufreq: dt: Check OPP count before marking them shared
  kernel/cpu_pm: fix cpu_cluster_pm_exit comment
2015-09-11 19:11:06 -07:00
Linus Torvalds 05c78081d2 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the outstanding target-pending updates for v4.3-rc1.

  Mostly bug-fixes and minor changes this round.  The fallout from the
  big v4.2-rc1 RCU conversion have (thus far) been minimal.

  The highlights this round include:

   - Move sense handling routines into scsi_common code (Sagi)

   - Return ABORTED_COMMAND sense key for PI errors (Sagi)

   - Add tpg_enabled_sendtargets attribute for disabled iscsi-target
     discovery (David)

   - Shrink target struct se_cmd by rearranging fields (Roland)

   - Drop iSCSI use of mutex around max_cmd_sn increment (Roland)

   - Replace iSCSI __kernel_sockaddr_storage with sockaddr_storage (Andy +
     Chris)

   - Honor fabric max_data_sg_nents I/O transfer limit (Arun + Himanshu +
     nab)

   - Fix EXTENDED_COPY >= v4.1 regression OOPsen (Alex + nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (37 commits)
  target: use stringify.h instead of own definition
  target/user: Fix UFLAG_UNKNOWN_OP handling
  target: Remove no-op conditional
  target/user: Remove unused variable
  target: Fix max_cmd_sn increment w/o cmdsn mutex regressions
  target: Attach EXTENDED_COPY local I/O descriptors to xcopy_pt_sess
  target/qla2xxx: Honor max_data_sg_nents I/O transfer limit
  target/iscsi: Replace __kernel_sockaddr_storage with sockaddr_storage
  target/iscsi: Replace conn->login_ip with login_sockaddr
  target/iscsi: Keep local_ip as the actual sockaddr
  target/iscsi: Fix np_ip bracket issue by removing np_ip
  target: Drop iSCSI use of mutex around max_cmd_sn increment
  qla2xxx: Update tcm_qla2xxx module description to 24xx+
  iscsi-target: Add tpg_enabled_sendtargets for disabled discovery
  drivers: target: Drop unlikely before IS_ERR(_OR_NULL)
  target: check DPO/FUA usage for COMPARE AND WRITE
  target: Shrink struct se_cmd by rearranging fields
  target: Remove cmd->se_ordered_id (unused except debug log lines)
  target: add support for START_STOP_UNIT SCSI opcode
  target: improve unsupported opcode message
  ...
2015-09-11 19:00:42 -07:00
Linus Torvalds 8e78b7dc93 SCSI misc on 20150911
The major pieces of this patch are a set patches facilitating better
 integration between scsi and scsi_dh (the device handling layer used by
 multi-path; all the dm parts are acked by Mike Snitzer).  It also includes
 driver updates for mp3sas, scsi_debug and an assortment of bug fixes.
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJV8yt5AAoJEDeqqVYsXL0MBsQH+wXvlx3o0BGuz5ZXfIs/RxzI
 MwGnu1J0LSA9FPakkMUVOBtsxIG+pCV+4eKorQMkfGCKAZ8daaYsyYvSEM2mcqIX
 1Y/srEnbzfE94JHbsI2pbiMPkB7QdtW27WjTSjQGgD9igAyVmmITiQJrXbpAlSLF
 F6n++9avng+GhjXQ5TF8/y13OYgabIoAPM1j4B/ut/Ok8ReruBvMBnOla5w5RMKR
 rBZKTZfUwvX5S0cuREwj8tFsRVUgdBNSrcGswFJrZo5x9WAsSHLC6+SOLZuUy1vC
 ua0tNtEiyXiuR0/jSP9qv7hJ/j0BW+EGdnW6GZEzKpeMK5PxfVspOsbNunUDRsY=
 =Y9G1
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull second round of SCSI updates from James Bottomley:
 "There's one late arriving patch here (added today), fixing a build
  issue which the scsi_dh patch set in here uncovered.  Other than that,
  everything has been incubated in -next and the checkers for a week.

  The major pieces of this patch are a set patches facilitating better
  integration between scsi and scsi_dh (the device handling layer used
  by multi-path; all the dm parts are acked by Mike Snitzer).

  This also includes driver updates for mp3sas, scsi_debug and an
  assortment of bug fixes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (50 commits)
  scsi_dh: fix randconfig build error
  scsi: fix scsi_error_handler vs. scsi_host_dev_release race
  fcoe: Convert use of __constant_htons to htons
  mpt2sas: setpci reset kernel oops fix
  pm80xx: Don't override ts->stat on IO_OPEN_CNX_ERROR_HW_RESOURCE_BUSY
  lpfc: Fix possible use-after-free and double free in lpfc_mbx_cmpl_rdp_page_a2()
  bfa: Fix incorrect de-reference of pointer
  bfa: Fix indentation
  scsi_transport_sas: Remove check for SAS expander when querying bay/enclosure IDs.
  scsi_debug: resp_request: remove unused variable
  scsi_debug: fix REPORT LUNS Well Known LU
  scsi_debug: schedule_resp fix input variable check
  scsi_debug: make dump_sector static
  scsi_debug: vfree is null safe so drop the check
  scsi_debug: use SCSI_W_LUN_REPORT_LUNS instead of SAM2_WLUN_REPORT_LUNS;
  scsi_debug: define pr_fmt() for consistent logging
  mpt2sas: Refcount fw_events and fix unsafe list usage
  mpt2sas: Refcount sas_device objects and fix unsafe list usage
  scsi_dh: return SCSI_DH_NOTCONN in scsi_dh_activate()
  scsi_dh: don't allow to detach device handlers at runtime
  ...
2015-09-11 18:15:18 -07:00
Linus Torvalds 06a660ada2 media updates for v4.3-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV8WvjAAoJEAhfPr2O5OEV5wIP/AjmqOau99ms4FvOQ932sO57
 kKDM4CYeTBkYY2Xz2eGStgxhcEj538JTf6SXdrceEEYJHb/GNCb2iBM1TnB4YciF
 rqhFv+n3R8h4Yn5KmhEhYzEfO7HUoyHPrOhcmTLzDoTO5wyrhAlPZxDWHohmfU84
 uQ8WyGPYLxwm8hdZ+/NkB8PXsGbWN65EoKzN6tt2kA6HUP52UxE0Cw7Qu7Iu5zmO
 y/x03mMbjhCBFFE41EeM76J+xKBhuaS4cyf8g08DJy5Zpf6ic8bKFmVg1tAFOZRD
 mCETLrUlPYhglHqOoVS25bCI5kCw9xTAyjPZdQnwCTwgHl5gG3E4oJYKASrmZlps
 igMSmLJEpQilsLy1Ze+K+Ci8EILmZzwbi21X0sbjq74Jd+tJZ+C8ZuWHVmPEF9j7
 iHtZNIRzkzufNBJZn3DsmlGBb/Xc/UqfZVnJAB9gu3Ktav6dmtEIHrGRPpL19iYH
 WtJWLt/Bpyb318K+fnxL8SzUqUxZJ4+8DrMtlgTqHmIRwVQ4CczyeWi0utQmBXEF
 CaNp00S2V9N1hn8OIc+gaf7LTYJn0LkHFsskoiUZ5aZQd9ai0ql0IT1xLe0r8lMi
 +ieB0Vp4wJtaodWIXOPeFugDqQXIb0Mh2M8J9FIJ116FLIai6btzO2iyVCtlR9Bg
 1uPztCfJ/nusPPHnE26R
 =TEFw
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:
 "A series of patches that move part of the code used to allocate memory
  from the media subsystem to the mm subsystem"

[ The mm parts have been acked by VM people, and the series was
  apparently in -mm for a while   - Linus ]

* tag 'media/v4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] drm/exynos: Convert g2d_userptr_get_dma_addr() to use get_vaddr_frames()
  [media] media: vb2: Remove unused functions
  [media] media: vb2: Convert vb2_dc_get_userptr() to use frame vector
  [media] media: vb2: Convert vb2_vmalloc_get_userptr() to use frame vector
  [media] media: vb2: Convert vb2_dma_sg_get_userptr() to use frame vector
  [media] vb2: Provide helpers for mapping virtual addresses
  [media] media: omap_vout: Convert omap_vout_uservirt_to_phys() to use get_vaddr_pfns()
  [media] mm: Provide new get_vaddr_frames() helper
  [media] vb2: Push mmap_sem down to memops
2015-09-11 16:42:39 -07:00
Linus Torvalds 9ebd051a7d Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal updates from Zhang Rui:

 - use int instead of unsigned long to represent temperature to avoid
   bogus overheat detection when negative temperature reported.  From
   Sascha Hauer.

 - export available thermal governors information to user space via
   sysfs.  From Wei Ni.

 - introduce new thermal driver for Wildcat Point platform controller
   hub, which uses PCH thermal sensor and associated critical and hot
   trip points.  From Tushar Dave.

 - add suuport for Intel Skylake and Denlow platforms in powerclamp
   driver.

 - some small cleanups in thermal core.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: Add Intel PCH thermal driver
  thermal: Add comment explaining test for critical temperature
  thermal: Use IS_ENABLED instead of #ifdef
  thermal: remove unnecessary call to thermal_zone_device_set_polling
  thermal: trivial: fix typo in comment
  thermal: consistently use int for temperatures
  thermal: add available policies sysfs attribute
  thermal/powerclamp: add cpu id for denlow platform
  thermal/powerclamp: add cpu id for Skylake u/y
  thermal/powerclamp: add cpu id for skylake h/s
2015-09-11 16:13:47 -07:00
Joe Perches 6798a8caaf fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void
The seq_<foo> function return values were frequently misused.

See: commit 1f33c41c03 ("seq_file: Rename seq_overflow() to
     seq_has_overflowed() and make public")

All uses of these return values have been removed, so convert the
return types to void.

Miscellanea:

o Move seq_put_decimal_<type> and seq_escape prototypes closer the
  other seq_vprintf prototypes
o Reorder seq_putc and seq_puts to return early on overflow
o Add argument names to seq_vprintf and seq_printf
o Update the seq_escape kernel-doc
o Convert a couple of leading spaces to tabs in seq_escape

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-11 15:21:34 -07:00
Mathieu Desnoyers 5b25b13ab0 sys_membarrier(): system-wide memory barrier (generic, x86)
Here is an implementation of a new system call, sys_membarrier(), which
executes a memory barrier on all threads running on the system.  It is
implemented by calling synchronize_sched().  It can be used to
distribute the cost of user-space memory barriers asymmetrically by
transforming pairs of memory barriers into pairs consisting of
sys_membarrier() and a compiler barrier.  For synchronization primitives
that distinguish between read-side and write-side (e.g.  userspace RCU
[1], rwlocks), the read-side can be accelerated significantly by moving
the bulk of the memory barrier overhead to the write-side.

The existing applications of which I am aware that would be improved by
this system call are as follows:

* Through Userspace RCU library (http://urcu.so)
  - DNS server (Knot DNS) https://www.knot-dns.cz/
  - Network sniffer (http://netsniff-ng.org/)
  - Distributed object storage (https://sheepdog.github.io/sheepdog/)
  - User-space tracing (http://lttng.org)
  - Network storage system (https://www.gluster.org/)
  - Virtual routers (https://events.linuxfoundation.org/sites/events/files/slides/DPDK_RCU_0MQ.pdf)
  - Financial software (https://lkml.org/lkml/2015/3/23/189)

Those projects use RCU in userspace to increase read-side speed and
scalability compared to locking.  Especially in the case of RCU used by
libraries, sys_membarrier can speed up the read-side by moving the bulk of
the memory barrier cost to synchronize_rcu().

* Direct users of sys_membarrier
  - core dotnet garbage collector (https://github.com/dotnet/coreclr/issues/198)

Microsoft core dotnet GC developers are planning to use the mprotect()
side-effect of issuing memory barriers through IPIs as a way to implement
Windows FlushProcessWriteBuffers() on Linux.  They are referring to
sys_membarrier in their github thread, specifically stating that
sys_membarrier() is what they are looking for.

To explain the benefit of this scheme, let's introduce two example threads:

Thread A (non-frequent, e.g. executing liburcu synchronize_rcu())
Thread B (frequent, e.g. executing liburcu
rcu_read_lock()/rcu_read_unlock())

In a scheme where all smp_mb() in thread A are ordering memory accesses
with respect to smp_mb() present in Thread B, we can change each
smp_mb() within Thread A into calls to sys_membarrier() and each
smp_mb() within Thread B into compiler barriers "barrier()".

Before the change, we had, for each smp_mb() pairs:

Thread A                    Thread B
previous mem accesses       previous mem accesses
smp_mb()                    smp_mb()
following mem accesses      following mem accesses

After the change, these pairs become:

Thread A                    Thread B
prev mem accesses           prev mem accesses
sys_membarrier()            barrier()
follow mem accesses         follow mem accesses

As we can see, there are two possible scenarios: either Thread B memory
accesses do not happen concurrently with Thread A accesses (1), or they
do (2).

1) Non-concurrent Thread A vs Thread B accesses:

Thread A                    Thread B
prev mem accesses
sys_membarrier()
follow mem accesses
                            prev mem accesses
                            barrier()
                            follow mem accesses

In this case, thread B accesses will be weakly ordered. This is OK,
because at that point, thread A is not particularly interested in
ordering them with respect to its own accesses.

2) Concurrent Thread A vs Thread B accesses

Thread A                    Thread B
prev mem accesses           prev mem accesses
sys_membarrier()            barrier()
follow mem accesses         follow mem accesses

In this case, thread B accesses, which are ensured to be in program
order thanks to the compiler barrier, will be "upgraded" to full
smp_mb() by synchronize_sched().

* Benchmarks

On Intel Xeon E5405 (8 cores)
(one thread is calling sys_membarrier, the other 7 threads are busy
looping)

1000 non-expedited sys_membarrier calls in 33s =3D 33 milliseconds/call.

* User-space user of this system call: Userspace RCU library

Both the signal-based and the sys_membarrier userspace RCU schemes
permit us to remove the memory barrier from the userspace RCU
rcu_read_lock() and rcu_read_unlock() primitives, thus significantly
accelerating them. These memory barriers are replaced by compiler
barriers on the read-side, and all matching memory barriers on the
write-side are turned into an invocation of a memory barrier on all
active threads in the process. By letting the kernel perform this
synchronization rather than dumbly sending a signal to every process
threads (as we currently do), we diminish the number of unnecessary wake
ups and only issue the memory barriers on active threads. Non-running
threads do not need to execute such barrier anyway, because these are
implied by the scheduler context switches.

Results in liburcu:

Operations in 10s, 6 readers, 2 writers:

memory barriers in reader:    1701557485 reads, 2202847 writes
signal-based scheme:          9830061167 reads,    6700 writes
sys_membarrier:               9952759104 reads,     425 writes
sys_membarrier (dyn. check):  7970328887 reads,     425 writes

The dynamic sys_membarrier availability check adds some overhead to
the read-side compared to the signal-based scheme, but besides that,
sys_membarrier slightly outperforms the signal-based scheme. However,
this non-expedited sys_membarrier implementation has a much slower grace
period than signal and memory barrier schemes.

Besides diminishing the number of wake-ups, one major advantage of the
membarrier system call over the signal-based scheme is that it does not
need to reserve a signal. This plays much more nicely with libraries,
and with processes injected into for tracing purposes, for which we
cannot expect that signals will be unused by the application.

An expedited version of this system call can be added later on to speed
up the grace period. Its implementation will likely depend on reading
the cpu_curr()->mm without holding each CPU's rq lock.

This patch adds the system call to x86 and to asm-generic.

[1] http://urcu.so

membarrier(2) man page:

MEMBARRIER(2)              Linux Programmer's Manual             MEMBARRIER(2)

NAME
       membarrier - issue memory barriers on a set of threads

SYNOPSIS
       #include <linux/membarrier.h>

       int membarrier(int cmd, int flags);

DESCRIPTION
       The cmd argument is one of the following:

       MEMBARRIER_CMD_QUERY
              Query  the  set  of  supported commands. It returns a bitmask of
              supported commands.

       MEMBARRIER_CMD_SHARED
              Execute a memory barrier on all threads running on  the  system.
              Upon  return from system call, the caller thread is ensured that
              all running threads have passed through a state where all memory
              accesses  to  user-space  addresses  match program order between
              entry to and return from the system  call  (non-running  threads
              are de facto in such a state). This covers threads from all pro=E2=80=90
              cesses running on the system.  This command returns 0.

       The flags argument needs to be 0. For future extensions.

       All memory accesses performed  in  program  order  from  each  targeted
       thread is guaranteed to be ordered with respect to sys_membarrier(). If
       we use the semantic "barrier()" to represent a compiler barrier forcing
       memory  accesses  to  be performed in program order across the barrier,
       and smp_mb() to represent explicit memory barriers forcing full  memory
       ordering  across  the barrier, we have the following ordering table for
       each pair of barrier(), sys_membarrier() and smp_mb():

       The pair ordering is detailed as (O: ordered, X: not ordered):

                              barrier()   smp_mb() sys_membarrier()
              barrier()          X           X            O
              smp_mb()           X           O            O
              sys_membarrier()   O           O            O

RETURN VALUE
       On success, these system calls return zero.  On error, -1 is  returned,
       and errno is set appropriately. For a given command, with flags
       argument set to 0, this system call is guaranteed to always return the
       same value until reboot.

ERRORS
       ENOSYS System call is not implemented.

       EINVAL Invalid arguments.

Linux                             2015-04-15                     MEMBARRIER(2)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nicholas Miell <nmiell@comcast.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Pranith Kumar <bobby.prani@gmail.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-11 15:21:34 -07:00
Linus Torvalds e013f74b60 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph update from Sage Weil:
 "There are a few fixes for snapshot behavior with CephFS and support
  for the new keepalive protocol from Zheng, a libceph fix that affects
  both RBD and CephFS, a few bug fixes and cleanups for RBD from Ilya,
  and several small fixes and cleanups from Jianpeng and others"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: improve readahead for file holes
  ceph: get inode size for each append write
  libceph: check data_len in ->alloc_msg()
  libceph: use keepalive2 to verify the mon session is alive
  rbd: plug rbd_dev->header.object_prefix memory leak
  rbd: fix double free on rbd_dev->header_name
  libceph: set 'exists' flag for newly up osd
  ceph: cleanup use of ceph_msg_get
  ceph: no need to get parent inode in ceph_open
  ceph: remove the useless judgement
  ceph: remove redundant test of head->safe and silence static analysis warnings
  ceph: fix queuing inode to mdsdir's snaprealm
  libceph: rename con_work() to ceph_con_workfn()
  libceph: Avoid holding the zero page on ceph_msgr_slab_init errors
  libceph: remove the unused macro AES_KEY_SIZE
  ceph: invalidate dirty pages after forced umount
  ceph: EIO all operations after forced umount
2015-09-11 12:33:03 -07:00
Linus Torvalds 04d78e39ee Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Just a bunch of fixes to squeeze in before -rc1:

   - three nouveau regression fixes

   - one qxl regression fix

   - a bunch of i915 fixes

  ... and some core displayport/atomic fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau/device: enable c800 quirk for tecra w50
  drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x
  drm/nouveau/gr/nv04: fix big endian setting on gr context
  drm/qxl: validate monitors config modes
  drm/i915: Allow DSI dual link to be configured on any pipe
  drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS
  drm/i915: Fix CSR MMIO address check
  drm/i915: Limit the number of loops for reading a split 64bit register
  drm/i915: Fix broken mst get_hw_state.
  drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x
  uapi/drm/i915_drm.h: fix userspace compilation.
  drm/i915: Always mark the object as dirty when used by the GPU
  drm/dp: Add dp_aux_i2c_speed_khz module param to set the assume i2c bus speed
  drm/dp: Adjust i2c-over-aux retry count based on message size and i2c bus speed
  drm/dp: Define AUX_RETRY_INTERVAL as 500 us
  drm/atomic: Fix bookkeeping with TEST_ONLY, v3.
2015-09-11 09:35:56 -07:00
Sagi Grimberg 7f39add3b0 block: Refuse request/bio merges with gaps in the integrity payload
If a driver sets the block queue virtual boundary mask, it means that
it cannot handle gaps so we must not allow those in the integrity
payload as well.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>

Fixed up by me to have duplicate integrity merge functions, depending
on whether block integrity is enabled or not. Fixes a compilations
issue with CONFIG_BLK_DEV_INTEGRITY unset.

Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-11 09:03:04 -06:00
Rafael J. Wysocki 7c976664d5 Merge branch 'pm-opp'
* pm-opp:
  PM / OPP: Return suspend_opp only if it is enabled
  PM / OPP: add dev_pm_opp_get_suspend_opp() helper
2015-09-11 15:37:17 +02:00
David Disseldorp ac64a2ce50 target: use stringify.h instead of own definition
Signed-off-by: David Disseldorp <ddiss@suse.de>
Acked-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-11 00:32:36 -07:00
Nicholas Bellinger 8f9b565482 target/qla2xxx: Honor max_data_sg_nents I/O transfer limit
This patch adds an optional fabric driver provided SGL limit
that target-core will honor as it's own internal I/O maximum
transfer length limit, as exposed by EVPD=0xb0 block limits
parameters.

This is required for handling cases when host I/O transfer
length exceeds the requested EVPD block limits maximum
transfer length. The initial user of this logic is qla2xxx,
so that we can avoid having to reject I/Os from some legacy
FC hosts where EVPD=0xb0 parameters are not honored.

When se_cmd payload length exceeds the provided limit in
target_check_max_data_sg_nents() code, se_cmd->data_length +
se_cmd->prot_length are reset with se_cmd->residual_count
plus underflow bit for outgoing TFO response callbacks.
It also checks for existing CDB level underflow + overflow
and recalculates final residual_count as necessary.

Note this patch currently assumes 1:1 mapping of PAGE_SIZE
per struct scatterlist entry.

Reported-by: Craig Watson <craig.watson@vanguard-rugged.com>
Cc: Craig Watson <craig.watson@vanguard-rugged.com>
Tested-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Arun Easi <arun.easi@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-11 00:31:39 -07:00
Peter Zijlstra 43b3f02899 locking/qspinlock/x86: Fix performance regression under unaccelerated VMs
Dave ran into horrible performance on a VM without PARAVIRT_SPINLOCKS
set and Linus noted that the test-and-set implementation was retarded.

One should spin on the variable with a load, not a RMW.

While there, remove 'queued' from the name, as the lock isn't queued
at all, but a simple test-and-set.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20150904152523.GR18673@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-11 07:49:42 +02:00
MyungJoo Ham d54cdf3fc9 PM / devfreq: comments for get_dev_status usage updated
With the introduction of devfreq_update_stats(), governors
are not recommended to use get_dev_status() directly.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
2015-09-11 14:23:29 +09:00
Javi Merino 08e75e754a PM / devfreq: cache the last call to get_dev_status()
The return value of get_dev_status() can be reused.  Cache it so that
other parts of the kernel can reuse it instead of having to call the
same function again.

Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
2015-09-11 14:23:28 +09:00
Linus Torvalds b0a1ea51bd Merge branch 'for-4.3/blkcg' of git://git.kernel.dk/linux-block
Pull blk-cg updates from Jens Axboe:
 "A bit later in the cycle, but this has been in the block tree for a a
  while.  This is basically four patchsets from Tejun, that improve our
  buffered cgroup writeback.  It was dependent on the other cgroup
  changes, but they went in earlier in this cycle.

  Series 1 is set of 5 patches that has cgroup writeback updates:

   - bdi_writeback iteration fix which could lead to some wb's being
     skipped or repeated during e.g. sync under memory pressure.

   - Simplification of wb work wait mechanism.

   - Writeback tracepoints updated to report cgroup.

  Series 2 is is a set of updates for the CFQ cgroup writeback handling:

     cfq has always charged all async IOs to the root cgroup.  It didn't
     have much choice as writeback didn't know about cgroups and there
     was no way to tell who to blame for a given writeback IO.
     writeback finally grew support for cgroups and now tags each
     writeback IO with the appropriate cgroup to charge it against.

     This patchset updates cfq so that it follows the blkcg each bio is
     tagged with.  Async cfq_queues are now shared across cfq_group,
     which is per-cgroup, instead of per-request_queue cfq_data.  This
     makes all IOs follow the weight based IO resource distribution
     implemented by cfq.

     - Switched from GFP_ATOMIC to GFP_NOWAIT as suggested by Jeff.

     - Other misc review points addressed, acks added and rebased.

  Series 3 is the blkcg policy cleanup patches:

     This patchset contains assorted cleanups for blkcg_policy methods
     and blk[c]g_policy_data handling.

     - alloc/free added for blkg_policy_data.  exit dropped.

     - alloc/free added for blkcg_policy_data.

     - blk-throttle's async percpu allocation is replaced with direct
       allocation.

     - all methods now take blk[c]g_policy_data instead of blkcg_gq or
       blkcg.

  And finally, series 4 is a set of patches cleaning up the blkcg stats
  handling:

    blkcg's stats have always been somwhat of a mess.  This patchset
    tries to improve the situation a bit.

     - The following patches added to consolidate blkcg entry point and
       blkg creation.  This is in itself is an improvement and helps
       colllecting common stats on bio issue.

     - per-blkg stats now accounted on bio issue rather than request
       completion so that bio based and request based drivers can behave
       the same way.  The issue was spotted by Vivek.

     - cfq-iosched implements custom recursive stats and blk-throttle
       implements custom per-cpu stats.  This patchset make blkcg core
       support both by default.

     - cfq-iosched and blk-throttle keep track of the same stats
       multiple times.  Unify them"

* 'for-4.3/blkcg' of git://git.kernel.dk/linux-block: (45 commits)
  blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchy
  blkcg: s/CFQ_WEIGHT_*/CFQ_WEIGHT_LEGACY_*/
  blkcg: implement interface for the unified hierarchy
  blkcg: misc preparations for unified hierarchy interface
  blkcg: separate out tg_conf_updated() from tg_set_conf()
  blkcg: move body parsing from blkg_conf_prep() to its callers
  blkcg: mark existing cftypes as legacy
  blkcg: rename subsystem name from blkio to io
  blkcg: refine error codes returned during blkcg configuration
  blkcg: remove unnecessary NULL checks from __cfqg_set_weight_device()
  blkcg: reduce stack usage of blkg_rwstat_recursive_sum()
  blkcg: remove cfqg_stats->sectors
  blkcg: move io_service_bytes and io_serviced stats into blkcg_gq
  blkcg: make blkg_[rw]stat_recursive_sum() to be able to index into blkcg_gq
  blkcg: make blkcg_[rw]stat per-cpu
  blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it
  blkcg: consolidate blkg creation in blkcg_bio_issue_check()
  blk-throttle: improve queue bypass handling
  blkcg: move root blkg lookup optimization from throtl_lookup_tg() to __blkg_lookup()
  blkcg: inline [__]blkg_lookup()
  ...
2015-09-10 18:56:14 -07:00
Linus Torvalds 33e247c7e5 Merge branch 'akpm' (patches from Andrew)
Merge third patch-bomb from Andrew Morton:

 - even more of the rest of MM

 - lib/ updates

 - checkpatch updates

 - small changes to a few scruffy filesystems

 - kmod fixes/cleanups

 - kexec updates

 - a dma-mapping cleanup series from hch

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits)
  dma-mapping: consolidate dma_set_mask
  dma-mapping: consolidate dma_supported
  dma-mapping: cosolidate dma_mapping_error
  dma-mapping: consolidate dma_{alloc,free}_noncoherent
  dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}
  mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd()
  mm: make sure all file VMAs have ->vm_ops set
  mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff()
  mm: mark most vm_operations_struct const
  namei: fix warning while make xmldocs caused by namei.c
  ipc: convert invalid scenarios to use WARN_ON
  zlib_deflate/deftree: remove bi_reverse()
  lib/decompress_unlzma: Do a NULL check for pointer
  lib/decompressors: use real out buf size for gunzip with kernel
  fs/affs: make root lookup from blkdev logical size
  sysctl: fix int -> unsigned long assignments in INT_MIN case
  kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo
  kexec: align crash_notes allocation to make it be inside one physical page
  kexec: remove unnecessary test in kimage_alloc_crash_control_pages()
  kexec: split kexec_load syscall from kexec core code
  ...
2015-09-10 18:19:42 -07:00
Linus Torvalds d71fc239b6 ARM: SoC: late fixes and dependencies
This is a collection of a few late fixes and other misc. stuff that
 had dependencies on things being merged from other trees.
 
 The bulk of the changes are for samsung/exynos SoCs for some changes
 that needed a few minor reworks so ended up a bit late.  The others
 are mainly for qcom SoCs: a couple fixes and some DTS updates.
 
 There's one conflict with drivers/cpufreq/exynos-cpufreq.c because
 it's now been completely removed, but there were some fixes that hit
 mainline in the meantime.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIbBAABAgAGBQJV8fkAAAoJEFk3GJrT+8Zllf8P9jj3+TnvbJS/8bWoQoB7BRUZ
 LZPgi2+sBXylrBV60uQdyodiTHQUMZhbL7GvgEVG0z6yyin7nyijqNkulTbQbWmg
 WhumLNCNcs8vlZegA/corbwgcVC7FkjOP97HveTe2mgwZ+GaXj9qMRQzBsMqSXEo
 4890ZeP1nWBTP42oXOQHkNyKWFBjuERK0dTw2MXj7WE0/Ag8i7ERp76uJQdQ7V5O
 BpNRwxp3vSCky8rxbpD/avWdlspv1yZGBQyLeIreVq2YQFojvT36K8wHcf6iWBT/
 pzGGV/uZM7MnrGZdqSfVEMDHl7Z2s7Ls+sv5F6Md7ErnVDerHGRjw/6lJDjbeH7u
 trucpsuhz5yhTjpZssGHH2NT8pWxxh8M+AaNOiiH8PuYESAbPAmWLpWkn+648bIn
 P++Z90DIyfNEqnNSMHkcQYpVt8zc4g75gZfTIIsXLB+DLgzgSK8ergrfyRN/O0zj
 oFY35g3wHdgisnGve+BAW30zTZtP19TMT36OltWjIkjuRZC29PS2vkH8eTETdVXx
 01f/qcpbB1L1rXfIBjNkjx81j89XPd68JIBfctTF5QBOAGm/Dix6tj2mU/N7TNu5
 TrBmD3CXdOQbCPaoK/qWX7/b11IN/XOlGL06hhYwbSdvCVy7rNccApXvfnrDWziK
 Ly923FP1OB7h0Kk1cmo=
 =85Ck
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull late ARM SoC updates from Kevin Hilman:
 "This is a collection of a few late fixes and other misc stuff that had
  dependencies on things being merged from other trees.

  The bulk of the changes are for samsung/exynos SoCs for some changes
  that needed a few minor reworks so ended up a bit late.  The others
  are mainly for qcom SoCs: a couple fixes and some DTS updates"

* tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (37 commits)
  ARM: multi_v7_defconfig: Enable PBIAS regulator
  soc: qcom: smd: Correct fBLOCKREADINTR handling
  soc: qcom: smd: Use correct remote processor ID
  soc: qcom: smem: Fix errant private access
  ARM: dts: qcom: msm8974-sony-xperia-honami: Use stdout-path
  ARM: dts: qcom: msm8960-cdp: Use stdout-path
  ARM: dts: qcom: msm8660-surf: Use stdout-path
  ARM: dts: qcom: ipq8064-ap148: Use stdout-path
  ARM: dts: qcom: apq8084-mtp: Use stdout-path
  ARM: dts: qcom: apq8084-ifc6540: Use stdout-path
  ARM: dts: qcom: apq8074-dragonboard: Use stdout-path
  ARM: dts: qcom: apq8064-ifc6410: Use stdout-path
  ARM: dts: qcom: apq8064-cm-qs600: Use stdout-path
  ARM: dts: qcom: Label serial nodes for aliasing and stdout-path
  reset: ath79: Fix missing spin_lock_init
  reset: Add (devm_)reset_control_get stub functions
  ARM: EXYNOS: switch to using generic cpufreq driver for exynos4x12
  cpufreq: exynos: Remove unselectable rule for arm-exynos-cpufreq.o
  ARM: dts: add iommu property to JPEG device for exynos4
  ARM: dts: enable SPI1 for exynos4412-odroidu3
  ...
2015-09-10 17:59:04 -07:00
Dave Airlie 91b6fc02a2 Merge tag 'drm-intel-next-fixes-2015-09-10' of git://anongit.freedesktop.org/drm-intel into drm-next
Fixes headed for v4.3-rc1, including Maarten's DP MST state checker fix
you requested.

* tag 'drm-intel-next-fixes-2015-09-10' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Allow DSI dual link to be configured on any pipe
  drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS
  drm/i915: Fix CSR MMIO address check
  drm/i915: Limit the number of loops for reading a split 64bit register
  drm/i915: Fix broken mst get_hw_state.
  drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x
  uapi/drm/i915_drm.h: fix userspace compilation.
  drm/i915: Always mark the object as dirty when used by the GPU
2015-09-11 10:52:08 +10:00
Linus Torvalds 519f526d39 ARM:
- Full debug support for arm64
 - Active state switching for timer interrupts
 - Lazy FP/SIMD save/restore for arm64
 - Generic ARMv8 target
 
 PPC:
 - Book3S: A few bug fixes
 - Book3S: Allow micro-threading on POWER8
 
 x86:
 - Compiler warnings
 
 Generic:
 - Adaptive polling for guest halt
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJV7qd/AAoJEL/70l94x66DDBcH/2OLomKHjDOGXqJ/dpkqf4UU
 FYI1pVjs2zP4z3L7RYV/DeuEsD6XaWzS7EXQOS3mcb9d8GWahPrdofeVmpmhg/8y
 jmkuUEFHl2Ut6imk8qDlG3m42c86Mk8/1k38l1bp8S3lL0/Q7IyADyYAlHdwzpOx
 yEyOAE4VU4n+VyQH5dbnzc12QRTeHfRQc/dI3eQq238gf37SF/1qzOzeLIdbEa+N
 DCzqQ8SExbctiRaLzCY5Ogan+unZBQbFfhrDrUSryywrzo/8WRFVmbjuf5O5Ucxa
 +UTLMvmm1YgxvBvWhlcmA+HSzSVeWNvaHQ9illgE5+74G5CzaD2ukurmoz/+r+A=
 =XtrL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "ARM:
   - Full debug support for arm64
   - Active state switching for timer interrupts
   - Lazy FP/SIMD save/restore for arm64
   - Generic ARMv8 target

  PPC:
   - Book3S: A few bug fixes
   - Book3S: Allow micro-threading on POWER8

  x86:
   - Compiler warnings

  Generic:
   - Adaptive polling for guest halt"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits)
  kvm: irqchip: fix memory leak
  kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF
  KVM: trace kvm_halt_poll_ns grow/shrink
  KVM: dynamic halt-polling
  KVM: make halt_poll_ns per-vCPU
  Silence compiler warning in arch/x86/kvm/emulate.c
  kvm: compile process_smi_save_seg_64() only for x86_64
  KVM: x86: avoid uninitialized variable warning
  KVM: PPC: Book3S: Fix typo in top comment about locking
  KVM: PPC: Book3S: Fix size of the PSPB register
  KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set
  KVM: PPC: Book3S HV: Fix race in starting secondary threads
  KVM: PPC: Book3S: correct width in XER handling
  KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation
  KVM: PPC: Book3S HV: Fix preempted vcore list locking
  KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  KVM: PPC: Book3S HV: Fix bug in dirty page tracking
  KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8
  KVM: PPC: Book3S HV: Make use of unused threads when running guests
  ...
2015-09-10 16:42:49 -07:00
Linus Torvalds 06ab838c20 xen: MFN/GFN/BFN terminology changes for 4.3-rc0
- Use the correct GFN/BFN terms more consistently.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV8VRMAAoJEFxbo/MsZsTRiGQH/i/jrAJUJfrFC2PINaA2gDwe
 O0dlrkCiSgAYChGmxxxXZQSPM5Po5+EbT/dLjZ/uvSooeorM9RYY/mFo7ut/qLep
 4pyQUuwGtebWGBZTrj9sygUVXVhgJnyoZxskNUbhj9zvP7hb9++IiI78mzne6cpj
 lCh/7Z2dgpfRcKlNRu+qpzP79Uc7OqIfDK+IZLrQKlXa7IQDJTQYoRjbKpfCtmMV
 BEG3kN9ESx5tLzYiAfxvaxVXl9WQFEoktqe9V8IgOQlVRLgJ2DQWS6vmraGrokWM
 3HDOCHtRCXlPhu1Vnrp0R9OgqWbz8FJnmVAndXT8r3Nsjjmd0aLwhJx7YAReO/4=
 =JDia
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen terminology fixes from David Vrabel:
 "Use the correct GFN/BFN terms more consistently"

* tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn
  xen/privcmd: Further s/MFN/GFN/ clean-up
  hvc/xen: Further s/MFN/GFN clean-up
  video/xen-fbfront: Further s/MFN/GFN clean-up
  xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn
  xen: Use correctly the Xen memory terminologies
  arm/xen: implement correctly pfn_to_mfn
  xen: Make clear that swiotlb and biomerge are dealing with DMA address
2015-09-10 16:21:11 -07:00
Linus Torvalds 573c577af0 Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
Pull microblaze update from Michal Simek.

* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  elf-em.h: move EM_MICROBLAZE to the common header
2015-09-10 16:20:00 -07:00
Linus Torvalds 65c61bc5db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix out-of-bounds array access in netfilter ipset, from Jozsef
    Kadlecsik.

 2) Use correct free operation on netfilter conntrack templates, from
    Daniel Borkmann.

 3) Fix route leak in SCTP, from Marcelo Ricardo Leitner.

 4) Fix sizeof(pointer) in mac80211, from Thierry Reding.

 5) Fix cache pointer comparison in ip6mr leading to missed unlock of
    mrt_lock.  From Richard Laing.

 6) rds_conn_lookup() needs to consider network namespace in key
    comparison, from Sowmini Varadhan.

 7) Fix deadlock in TIPC code wrt broadcast link wakeups, from Kolmakov
    Dmitriy.

 8) Fix fd leaks in bpf syscall, from Daniel Borkmann.

 9) Fix error recovery when installing ipv6 multipath routes, we would
    delete the old route before we would know if we could fully commit
    to the new set of nexthops.  Fix from Roopa Prabhu.

10) Fix run-time suspend problems in r8152, from Hayes Wang.

11) In fec, don't program the MAC address into the chip when the clocks
    are gated off.  From Fugang Duan.

12) Fix poll behavior for netlink sockets when using rx ring mmap, from
    Daniel Borkmann.

13) Don't allocate memory with GFP_KERNEL from get_stats64 in r8169
    driver, from Corinna Vinschen.

14) In TCP Cubic congestion control, handle idle periods better where we
    are application limited, in order to keep cwnd from growing out of
    control.  From Eric Dumzet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  tcp_cubic: better follow cubic curve after idle period
  tcp: generate CA_EVENT_TX_START on data frames
  xen-netfront: respect user provided max_queues
  xen-netback: respect user provided max_queues
  r8169: Fix sleeping function called during get_stats64, v2
  ether: add IEEE 1722 ethertype - TSN
  netlink, mmap: fix edge-case leakages in nf queue zero-copy
  netlink, mmap: don't walk rx ring on poll if receive queue non-empty
  cxgb4: changes for new firmware 1.14.4.0
  net: fec: add netif status check before set mac address
  r8152: fix the runtime suspend issues
  r8152: split DRIVER_VERSION
  ipv6: fix ifnullfree.cocci warnings
  add microchip LAN88xx phy driver
  stmmac: fix check for phydev being open
  net: qlcnic: delete redundant memsets
  net: mv643xx_eth: use kzalloc
  net: jme: use kzalloc() instead of kmalloc+memset
  net: cavium: liquidio: use kzalloc in setup_glist()
  net: ipv6: use common fib_default_rule_pref
  ...
2015-09-10 13:53:15 -07:00
Christoph Hellwig 452e06af1f dma-mapping: consolidate dma_set_mask
Almost everyone implements dma_set_mask the same way, although some time
that's hidden in ->set_dma_mask methods.

This patch consolidates those into a common implementation that either
calls ->set_dma_mask if present or otherwise uses the default
implementation.  Some architectures used to only call ->set_dma_mask
after the initial checks, and those instance have been fixed to do the
full work.  h8300 implemented dma_set_mask bogusly as a no-ops and has
been fixed.

Unfortunately some architectures overload unrelated semantics like changing
the dma_ops into it so we still need to allow for an architecture override
for now.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Christoph Hellwig ee196371d5 dma-mapping: consolidate dma_supported
Most architectures just call into ->dma_supported, but some also return 1
if the method is not present, or 0 if no dma ops are present (although
that should never happeb). Consolidate this more broad version into
common code.

Also fix h8300 which inorrectly always returned 0, which would have been
a problem if it's dma_set_mask implementation wasn't a similarly buggy
noop.

As a few architectures have much more elaborate implementations, we
still allow for arch overrides.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Christoph Hellwig efa21e432c dma-mapping: cosolidate dma_mapping_error
Currently there are three valid implementations of dma_mapping_error:

 (1) call ->mapping_error
 (2) check for a hardcoded error code
 (3) always return 0

This patch provides a common implementation that calls ->mapping_error
if present, then checks for DMA_ERROR_CODE if defined or otherwise
returns 0.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Christoph Hellwig 1e8937526e dma-mapping: consolidate dma_{alloc,free}_noncoherent
Most architectures do not support non-coherent allocations and either
define dma_{alloc,free}_noncoherent to their coherent versions or stub
them out.

Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips
implements them directly.

This patch moves the Openrisc version to common code, and handles the
DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.

Note that actual non-coherent allocations require a dma_cache_sync
implementation, so if non-coherent allocations didn't work on
an architecture before this patch they still won't work after it.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Christoph Hellwig 6894258eda dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}
Since 2009 we have a nice asm-generic header implementing lots of DMA API
functions for architectures using struct dma_map_ops, but unfortunately
it's still missing a lot of APIs that all architectures still have to
duplicate.

This series consolidates the remaining functions, although we still need
arch opt outs for two of them as a few architectures have very
non-standard implementations.

This patch (of 5):

The coherent DMA allocator works the same over all architectures supporting
dma_map operations.

This patch consolidates them and converges the minor differences:

 - the debug_dma helpers are now called from all architectures, including
   those that were previously missing them
 - dma_alloc_from_coherent and dma_release_from_coherent are now always
   called from the generic alloc/free routines instead of the ops
   dma-mapping-common.h always includes dma-coherent.h to get the defintions
   for them, or the stubs if the architecture doesn't support this feature
 - checks for ->alloc / ->free presence are removed.  There is only one
   magic instead of dma_map_ops without them (mic_dma_ops) and that one
   is x86 only anyway.

Besides that only x86 needs special treatment to replace a default devices
if none is passed and tweak the gfp_flags.  An optional arch hook is provided
for that.

[linux@roeck-us.net: fix build]
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Oleg Nesterov 1fcfd8db7f mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff()
Add the additional "vm_flags_t vm_flags" argument to do_mmap_pgoff(),
rename it to do_mmap(), and re-introduce do_mmap_pgoff() as a simple
wrapper on top of do_mmap().  Perhaps we should update the callers of
do_mmap_pgoff() and kill it later.

This way mpx_mmap() can simply call do_mmap(vm_flags => VM_MPX) and do not
play with vm internals.

After this change mmap_region() has a single user outside of mmap.c,
arch/tile/mm/elf.c:arch_setup_additional_pages().  It would be nice to
change arch/tile/ and unexport mmap_region().

[kirill@shutemov.name: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Dave Young 2965faa5e0 kexec: split kexec_load syscall from kexec core code
There are two kexec load syscalls, kexec_load another and kexec_file_load.
 kexec_file_load has been splited as kernel/kexec_file.c.  In this patch I
split kexec_load syscall code to kernel/kexec.c.

And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
use kexec_file_load only, or vice verse.

The original requirement is from Ted Ts'o, he want kexec kernel signature
being checked with CONFIG_KEXEC_VERIFY_SIG enabled.  But kexec-tools use
kexec_load syscall can bypass the checking.

Vivek Goyal proposed to create a common kconfig option so user can compile
in only one syscall for loading kexec kernel.  KEXEC/KEXEC_FILE selects
KEXEC_CORE so that old config files still work.

Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
KEXEC_CORE in arch Kconfig.  Also updated general kernel code with to
kexec_load syscall.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Dave Young a43cac0d9d kexec: split kexec_file syscall code to kexec_file.c
Split kexec_file syscall related code to another file kernel/kexec_file.c
so that the #ifdef CONFIG_KEXEC_FILE in kexec.c can be dropped.

Sharing variables and functions are moved to kernel/kexec_internal.h per
suggestion from Vivek and Petr.

[akpm@linux-foundation.org: fix bisectability]
[akpm@linux-foundation.org: declare the various arch_kexec functions]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Andy Shevchenko 37607102c4 seq_file: provide an analogue of print_hex_dump()
This introduces a new helper and switches current users to use it.  All
patches are compiled tested. kmemleak is tested via its own test suite.

This patch (of 6):

The new seq_hex_dump() is a complete analogue of print_hex_dump().

We have few users of this functionality already. It allows to reduce their
codebase.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Cc: Tadeusz Struk <tadeusz.struk@intel.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Frederic Weisbecker 90f023030e kmod: use system_unbound_wq instead of khelper
We need to launch the usermodehelper kernel threads with the widest
affinity and this is partly why we use khelper.  This workqueue has
unbound properties and thus a wide affinity inherited by all its children.

Now khelper also has special properties that we aren't much interested in:
ordered and singlethread.  There is really no need about ordering as all
we do is creating kernel threads.  This can be done concurrently.  And
singlethread is a useless limitation as well.

The workqueue engine already proposes generic unbound workqueues that
don't share these useless properties and handle well parallel jobs.

The only worrysome specific is their affinity to the node of the current
CPU.  It's fine for creating the usermodehelper kernel threads but those
inherit this affinity for longer jobs such as requesting modules.

This patch proposes to use these node affine unbound workqueues assuming
that a node is sufficient to handle several parallel usermodehelper
requests.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Kees Cook b40bdb7fb2 lib/string_helpers: rename "esc" arg to "only"
To further clarify the purpose of the "esc" argument, rename it to "only"
to reflect that it is a limit, not a list of additional characters to
escape.

Signed-off-by: Kees Cook <keescook@chromium.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Linus Walleij cdf17449af hexdump: do not print debug dumps for !CONFIG_DEBUG
print_hex_dump_debug() is likely supposed to be analogous to pr_debug() or
dev_dbg() & friends.  Currently it will adhere to dynamic debug, but will
not stub out prints if CONFIG_DEBUG is not set.  Let's make it do the
right thing, because I am tired of having my dmesg buffer full of hex
dumps on production systems.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Jason A. Donenfeld 515a9adce0 include/linux/printk.h: include pr_fmt in pr_debug_ratelimited
The other two implementations of pr_debug_ratelimited include pr_fmt,
along with every other pr_* function.  But pr_debug_ratelimited forgot to
add it with the CONFIG_DYNAMIC_DEBUG implementation.

This patch unifies the behavior.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vasily Kulikov 8b839635e7 include/linux/poison.h: remove not-used poison pointer macros
Signed-off-by: Vasily Kulikov <segoon@openwall.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vasily Kulikov 8a5e5e02fc include/linux/poison.h: fix LIST_POISON{1,2} offset
Poison pointer values should be small enough to find a room in
non-mmap'able/hardly-mmap'able space.  E.g.  on x86 "poison pointer space"
is located starting from 0x0.  Given unprivileged users cannot mmap
anything below mmap_min_addr, it should be safe to use poison pointers
lower than mmap_min_addr.

The current poison pointer values of LIST_POISON{1,2} might be too big for
mmap_min_addr values equal or less than 1 MB (common case, e.g.  Ubuntu
uses only 0x10000).  There is little point to use such a big value given
the "poison pointer space" below 1 MB is not yet exhausted.  Changing it
to a smaller value solves the problem for small mmap_min_addr setups.

The values are suggested by Solar Designer:
http://www.openwall.com/lists/oss-security/2015/05/02/6

Signed-off-by: Vasily Kulikov <segoon@openwall.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vladimir Davydov f074a8f49e proc: export idle flag via kpageflags
As noted by Minchan, a benefit of reading idle flag from /proc/kpageflags
is that one can easily filter dirty and/or unevictable pages while
estimating the size of unused memory.

Note that idle flag read from /proc/kpageflags may be stale in case the
page was accessed via a PTE, because it would be too costly to iterate
over all page mappings on each /proc/kpageflags read to provide an
up-to-date value.  To make sure the flag is up-to-date one has to read
/sys/kernel/mm/page_idle/bitmap first.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vladimir Davydov 33c3fc71c8 mm: introduce idle page tracking
Knowing the portion of memory that is not used by a certain application or
memory cgroup (idle memory) can be useful for partitioning the system
efficiently, e.g.  by setting memory cgroup limits appropriately.
Currently, the only means to estimate the amount of idle memory provided
by the kernel is /proc/PID/{clear_refs,smaps}: the user can clear the
access bit for all pages mapped to a particular process by writing 1 to
clear_refs, wait for some time, and then count smaps:Referenced.  However,
this method has two serious shortcomings:

 - it does not count unmapped file pages
 - it affects the reclaimer logic

To overcome these drawbacks, this patch introduces two new page flags,
Idle and Young, and a new sysfs file, /sys/kernel/mm/page_idle/bitmap.
A page's Idle flag can only be set from userspace by setting bit in
/sys/kernel/mm/page_idle/bitmap at the offset corresponding to the page,
and it is cleared whenever the page is accessed either through page tables
(it is cleared in page_referenced() in this case) or using the read(2)
system call (mark_page_accessed()). Thus by setting the Idle flag for
pages of a particular workload, which can be found e.g.  by reading
/proc/PID/pagemap, waiting for some time to let the workload access its
working set, and then reading the bitmap file, one can estimate the amount
of pages that are not used by the workload.

The Young page flag is used to avoid interference with the memory
reclaimer.  A page's Young flag is set whenever the Access bit of a page
table entry pointing to the page is cleared by writing to the bitmap file.
If page_referenced() is called on a Young page, it will add 1 to its
return value, therefore concealing the fact that the Access bit was
cleared.

Note, since there is no room for extra page flags on 32 bit, this feature
uses extended page flags when compiled on 32 bit.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: kpageidle requires an MMU]
[akpm@linux-foundation.org: decouple from page-flags rework]
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vladimir Davydov 1d7715c676 mmu-notifier: add clear_young callback
In the scope of the idle memory tracking feature, which is introduced by
the following patch, we need to clear the referenced/accessed bit not only
in primary, but also in secondary ptes.  The latter is required in order
to estimate wss of KVM VMs.  At the same time we want to avoid flushing
tlb, because it is quite expensive and it won't really affect the final
result.

Currently, there is no function for clearing pte young bit that would meet
our requirements, so this patch introduces one.  To achieve that we have
to add a new mmu-notifier callback, clear_young, since there is no method
for testing-and-clearing a secondary pte w/o flushing tlb.  The new method
is not mandatory and currently only implemented by KVM.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vladimir Davydov e993d905c8 memcg: zap try_get_mem_cgroup_from_page
It is only used in mem_cgroup_try_charge, so fold it in and zap it.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Vladimir Davydov 2fc0452470 memcg: add page_cgroup_ino helper
This patchset introduces a new user API for tracking user memory pages
that have not been used for a given period of time.  The purpose of this
is to provide the userspace with the means of tracking a workload's
working set, i.e.  the set of pages that are actively used by the
workload.  Knowing the working set size can be useful for partitioning the
system more efficiently, e.g.  by tuning memory cgroup limits
appropriately, or for job placement within a compute cluster.

==== USE CASES ====

The unified cgroup hierarchy has memory.low and memory.high knobs, which
are defined as the low and high boundaries for the workload working set
size.  However, the working set size of a workload may be unknown or
change in time.  With this patch set, one can periodically estimate the
amount of memory unused by each cgroup and tune their memory.low and
memory.high parameters accordingly, therefore optimizing the overall
memory utilization.

Another use case is balancing workloads within a compute cluster.  Knowing
how much memory is not really used by a workload unit may help take a more
optimal decision when considering migrating the unit to another node
within the cluster.

Also, as noted by Minchan, this would be useful for per-process reclaim
(https://lwn.net/Articles/545668/). With idle tracking, we could reclaim idle
pages only by smart user memory manager.

==== USER API ====

The user API consists of two new files:

 * /sys/kernel/mm/page_idle/bitmap.  This file implements a bitmap where each
   bit corresponds to a page, indexed by PFN. When the bit is set, the
   corresponding page is idle. A page is considered idle if it has not been
   accessed since it was marked idle. To mark a page idle one should set the
   bit corresponding to the page by writing to the file. A value written to the
   file is OR-ed with the current bitmap value. Only user memory pages can be
   marked idle, for other page types input is silently ignored. Writing to this
   file beyond max PFN results in the ENXIO error. Only available when
   CONFIG_IDLE_PAGE_TRACKING is set.

   This file can be used to estimate the amount of pages that are not
   used by a particular workload as follows:

   1. mark all pages of interest idle by setting corresponding bits in the
      /sys/kernel/mm/page_idle/bitmap
   2. wait until the workload accesses its working set
   3. read /sys/kernel/mm/page_idle/bitmap and count the number of bits set

 * /proc/kpagecgroup.  This file contains a 64-bit inode number of the
   memory cgroup each page is charged to, indexed by PFN. Only available when
   CONFIG_MEMCG is set.

   This file can be used to find all pages (including unmapped file pages)
   accounted to a particular cgroup. Using /sys/kernel/mm/page_idle/bitmap, one
   can then estimate the cgroup working set size.

For an example of using these files for estimating the amount of unused
memory pages per each memory cgroup, please see the script attached
below.

==== REASONING ====

The reason to introduce the new user API instead of using
/proc/PID/{clear_refs,smaps} is that the latter has two serious
drawbacks:

 - it does not count unmapped file pages
 - it affects the reclaimer logic

The new API attempts to overcome them both. For more details on how it
is achieved, please see the comment to patch 6.

==== PATCHSET STRUCTURE ====

The patch set is organized as follows:

 - patch 1 adds page_cgroup_ino() helper for the sake of
   /proc/kpagecgroup and patches 2-3 do related cleanup
 - patch 4 adds /proc/kpagecgroup, which reports cgroup ino each page is
   charged to
 - patch 5 introduces a new mmu notifier callback, clear_young, which is
   a lightweight version of clear_flush_young; it is used in patch 6
 - patch 6 implements the idle page tracking feature, including the
   userspace API, /sys/kernel/mm/page_idle/bitmap
 - patch 7 exports idle flag via /proc/kpageflags

==== SIMILAR WORKS ====

Originally, the patch for tracking idle memory was proposed back in 2011
by Michel Lespinasse (see http://lwn.net/Articles/459269/).  The main
difference between Michel's patch and this one is that Michel implemented
a kernel space daemon for estimating idle memory size per cgroup while
this patch only provides the userspace with the minimal API for doing the
job, leaving the rest up to the userspace.  However, they both share the
same idea of Idle/Young page flags to avoid affecting the reclaimer logic.

==== PERFORMANCE EVALUATION ====

SPECjvm2008 (https://www.spec.org/jvm2008/) was used to evaluate the
performance impact introduced by this patch set.  Three runs were carried
out:

 - base: kernel without the patch
 - patched: patched kernel, the feature is not used
 - patched-active: patched kernel, 1 minute-period daemon is used for
   tracking idle memory

For tracking idle memory, idlememstat utility was used:
https://github.com/locker/idlememstat

testcase            base            patched        patched-active

compiler       537.40 ( 0.00)%   532.26 (-0.96)%   538.31 ( 0.17)%
compress       305.47 ( 0.00)%   301.08 (-1.44)%   300.71 (-1.56)%
crypto         284.32 ( 0.00)%   282.21 (-0.74)%   284.87 ( 0.19)%
derby          411.05 ( 0.00)%   413.44 ( 0.58)%   412.07 ( 0.25)%
mpegaudio      189.96 ( 0.00)%   190.87 ( 0.48)%   189.42 (-0.28)%
scimark.large   46.85 ( 0.00)%    46.41 (-0.94)%    47.83 ( 2.09)%
scimark.small  412.91 ( 0.00)%   415.41 ( 0.61)%   421.17 ( 2.00)%
serial         204.23 ( 0.00)%   213.46 ( 4.52)%   203.17 (-0.52)%
startup         36.76 ( 0.00)%    35.49 (-3.45)%    35.64 (-3.05)%
sunflow        115.34 ( 0.00)%   115.08 (-0.23)%   117.37 ( 1.76)%
xml            620.55 ( 0.00)%   619.95 (-0.10)%   620.39 (-0.03)%

composite      211.50 ( 0.00)%   211.15 (-0.17)%   211.67 ( 0.08)%

time idlememstat:

17.20user 65.16system 2:15:23elapsed 1%CPU (0avgtext+0avgdata 8476maxresident)k
448inputs+40outputs (1major+36052minor)pagefaults 0swaps

==== SCRIPT FOR COUNTING IDLE PAGES PER CGROUP ====
#! /usr/bin/python
#

import os
import stat
import errno
import struct

CGROUP_MOUNT = "/sys/fs/cgroup/memory"
BUFSIZE = 8 * 1024  # must be multiple of 8

def get_hugepage_size():
    with open("/proc/meminfo", "r") as f:
        for s in f:
            k, v = s.split(":")
            if k == "Hugepagesize":
                return int(v.split()[0]) * 1024

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
HUGEPAGE_SIZE = get_hugepage_size()

def set_idle():
    f = open("/sys/kernel/mm/page_idle/bitmap", "wb", BUFSIZE)
    while True:
        try:
            f.write(struct.pack("Q", pow(2, 64) - 1))
        except IOError as err:
            if err.errno == errno.ENXIO:
                break
            raise
    f.close()

def count_idle():
    f_flags = open("/proc/kpageflags", "rb", BUFSIZE)
    f_cgroup = open("/proc/kpagecgroup", "rb", BUFSIZE)

    with open("/sys/kernel/mm/page_idle/bitmap", "rb", BUFSIZE) as f:
        while f.read(BUFSIZE): pass  # update idle flag

    idlememsz = {}
    while True:
        s1, s2 = f_flags.read(8), f_cgroup.read(8)
        if not s1 or not s2:
            break

        flags, = struct.unpack('Q', s1)
        cgino, = struct.unpack('Q', s2)

        unevictable = (flags >> 18) & 1
        huge = (flags >> 22) & 1
        idle = (flags >> 25) & 1

        if idle and not unevictable:
            idlememsz[cgino] = idlememsz.get(cgino, 0) + \
                (HUGEPAGE_SIZE if huge else PAGE_SIZE)

    f_flags.close()
    f_cgroup.close()
    return idlememsz

if __name__ == "__main__":
    print "Setting the idle flag for each page..."
    set_idle()

    raw_input("Wait until the workload accesses its working set, "
              "then press Enter")

    print "Counting idle pages..."
    idlememsz = count_idle()

    for dir, subdirs, files in os.walk(CGROUP_MOUNT):
        ino = os.stat(dir)[stat.ST_INO]
        print dir + ": " + str(idlememsz.get(ino, 0) / 1024) + " kB"
==== END SCRIPT ====

This patch (of 8):

Add page_cgroup_ino() helper to memcg.

This function returns the inode number of the closest online ancestor of
the memory cgroup a page is charged to.  It is required for exporting
information about which page is charged to which cgroup to userspace,
which will be introduced by a following patch.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Dan Streetman 3f0e131221 zpool: add zpool_has_pool()
This series makes creation of the zpool and compressor dynamic, so that
they can be changed at runtime.  This makes using/configuring zswap
easier, as before this zswap had to be configured at boot time, using boot
params.

This uses a single list to track both the zpool and compressor together,
although Seth had mentioned an alternative which is to track the zpools
and compressors using separate lists.  In the most common case, only a
single zpool and single compressor, using one list is slightly simpler
than using two lists, and for the uncommon case of multiple zpools and/or
compressors, using one list is slightly less simple (and uses slightly
more memory, probably) than using two lists.

This patch (of 4):

Add zpool_has_pool() function, indicating if the specified type of zpool
is available (i.e.  zsmalloc or zbud).  This allows checking if a pool is
available, without actually trying to allocate it, similar to
crypto_has_alg().

This is used by a following patch to zswap that enables the dynamic
runtime creation of zswap zpools.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Acked-by: Seth Jennings <sjennings@variantweb.net>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10 13:29:01 -07:00
Henrik Austad 1ab1e89549 ether: add IEEE 1722 ethertype - TSN
IEEE 1722 describes AVB (later renamed to TSN - Time Sensitive
Networking), a protocol, encapsualtion and synchronization to utilize
standard networks for audio/video (and later other time-sensitive)
streams.

This standard uses ethertype 0x22F0.

http://standards.ieee.org/develop/regauth/ethertype/eth.txt

This is a respin of a previous patch ("ether: add AVB frame type
ETH_P_AVB")

CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
CC: linux-api@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Henrik Austad <henrik@austad.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-09 22:06:29 -07:00
Mike Frysinger b14132797d elf-em.h: move EM_MICROBLAZE to the common header
The linux/audit.h header uses EM_MICROBLAZE in order to define
AUDIT_ARCH_MICROBLAZE, but it's only available in the microblaze
asm headers.  Move it to the common elf-em.h header so that the
define can be used on non-microblaze systems.  Otherwise we get
build errors that EM_MICROBLAZE isn't defined when we try to use
the AUDIT_ARCH_MICROBLAZE symbol.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2015-09-10 06:54:15 +02:00
Daniel Borkmann 6bb0fef489 netlink, mmap: fix edge-case leakages in nf queue zero-copy
When netlink mmap on receive side is the consumer of nf queue data,
it can happen that in some edge cases, we write skb shared info into
the user space mmap buffer:

Assume a possible rx ring frame size of only 4096, and the network skb,
which is being zero-copied into the netlink skb, contains page frags
with an overall skb->len larger than the linear part of the netlink
skb.

skb_zerocopy(), which is generic and thus not aware of the fact that
shared info cannot be accessed for such skbs then tries to write and
fill frags, thus leaking kernel data/pointers and in some corner cases
possibly writing out of bounds of the mmap area (when filling the
last slot in the ring buffer this way).

I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
an advertised length larger than 4096, where the linear part is visible
at the slot beginning, and the leaked sizeof(struct skb_shared_info)
has been written to the beginning of the next slot (also corrupting
the struct nl_mmap_hdr slot header incl. status etc), since skb->end
points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.

The fix adds and lets __netlink_alloc_skb() take the actual needed
linear room for the network skb + meta data into account. It's completely
irrelevant for non-mmaped netlink sockets, but in case mmap sockets
are used, it can be decided whether the available skb_tailroom() is
really large enough for the buffer, or whether it needs to internally
fallback to a normal alloc_skb().

>From nf queue side, the information whether the destination port is
an mmap RX ring is not really available without extra port-to-socket
lookup, thus it can only be determined in lower layers i.e. when
__netlink_alloc_skb() is called that checks internally for this. I
chose to add the extra ldiff parameter as mmap will then still work:
We have data_len and hlen in nfqnl_build_packet_message(), data_len
is the full length (capped at queue->copy_range) for skb_zerocopy()
and hlen some possible part of data_len that needs to be copied; the
rem_len variable indicates the needed remaining linear mmap space.

The only other workaround in nf queue internally would be after
allocation time by f.e. cap'ing the data_len to the skb_tailroom()
iff we deal with an mmap skb, but that would 1) expose the fact that
we use a mmap skb to upper layers, and 2) trim the skb where we
otherwise could just have moved the full skb into the normal receive
queue.

After the patch, in my test case the ring slot doesn't fit and therefore
shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
thus needs to be picked up via recv().

Fixes: 3ab1f683bf ("nfnetlink: add support for memory mapped netlink")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-09 21:43:22 -07:00
Woojung.Huh@microchip.com 792aec47d5 add microchip LAN88xx phy driver
Add Microchip LAN88XX phy driver for phylib.

Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-09 17:19:14 -07:00
Kevin Hilman c6e59bdac9 Qualcomm ARM Based SoC Updates for 4.3-rc2
* Fix errant private access in SMEM
 * Fix use of correct remote processor ID in SMD transactions
 * Correct SMD fBLOCKREADINTR handling
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJV8KPBAAoJEFKiBbHx2RXVWUwP/jn3Tth9UOBOem7/nLIPctQx
 O7knZBUJ3UtnC+xwkKFy/G6WYZ29ovCjzZ+3ExNQFFZNMkmE04c4x78J/XaO3gAl
 kwHLEsStmbAXyIi+gqDYFw7BS00yjB159gUg3zCZhfao5ZiaPDgI1tZAWgaU7ywp
 7r6+VaJcN3/fhiNoV1PKSNy38w9yeS22MNdudkqeJm6Uumf30UXxsy8eMbZWYjgJ
 kpm7PGav/8mYEvrSWNe0nAuvkOTMNzTGjtwp9cQSZ+J0myZ4mOgHopNU2xAE0eaN
 TVrkxpyfiTkpa+x2Y3SGvIxNOepLT7BzL3V/NG3hBgHA/OApMjbBlQ+vDL6vsBbh
 UrRkpPCnwrH3q2GWOdrxlVKNxgfOmVEpQc1woj6Ykr1aFPSGK7OnqxKCQtD/BIkg
 zWZUVkbLyRYIcHql7tG9Q3TcjBG/nIV2XM76NJ1H06IrBRhpVA7mwDm11DfPma2Q
 y4641evunqJAZNTDtGpX/fyiqZeV5m9Md5PtOZKWuh83pqbk+X0tjZ0rsEt1K2wa
 szItTS896fKzP8DaGx0a0+1N4jsyJmWj1Dd6gR/Cz9wD+OasvBdIDvixYv7RPSXW
 kuAv2yzh8OPuCb5gWjVv0pOr/cMdvrJefvXhodcOQcjMEfSNCcBSc7StuKVg4qHw
 bEnGfjV5jj0NVr2hvTAw
 =w16T
 -----END PGP SIGNATURE-----

Merge tag 'qcom-soc-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm into next/late

Qualcomm ARM Based SoC Updates for 4.3-rc2

* Fix errant private access in SMEM
* Fix use of correct remote processor ID in SMD transactions
* Correct SMD fBLOCKREADINTR handling

* tag 'qcom-soc-for-4.3-rc2' of git://codeaurora.org/quic/kernel/agross-msm:
  soc: qcom: smd: Correct fBLOCKREADINTR handling
  soc: qcom: smd: Use correct remote processor ID
  soc: qcom: smem: Fix errant private access
  devicetree: soc: Add Qualcomm SMD based RPM DT binding
  soc: qcom: Driver for the Qualcomm RPM over SMD
  soc: qcom: Add Shared Memory Driver
  soc: qcom: Add device tree binding for Shared Memory Device
  drivers: qcom: Select QCOM_SCM unconditionally for QCOM_PM
  soc: qcom: Add Shared Memory Manager driver
2015-09-09 16:15:34 -07:00
Kevin Hilman 71d440499b Merge branch 'drivers/reset' into next/late
* drivers/reset:
  reset: ath79: Fix missing spin_lock_init
  reset: Add (devm_)reset_control_get stub functions
  reset: reset-zynq: Adding support for Xilinx Zynq reset controller.
  docs: dts: Added documentation for Xilinx Zynq Reset Controller bindings.
  MIPS: ath79: Add the reset controller to the AR9132 dtsi
  reset: Add a driver for the reset controller on the AR71XX/AR9XXX
  devicetree: Add bindings for the ATH79 reset controller
  reset: socfpga: Update reset-socfpga to read the altr,modrst-offset property
  doc: dt: add documentation for lpc1850-rgu reset driver
  reset: add driver for lpc18xx rgu
  reset: sti: constify of_device_id array
  ARM: STi: DT: Move reset controller constants into common location
  MAINTAINERS: add include/dt-bindings/reset path to reset controller entry
2015-09-09 15:42:45 -07:00
Phil Sutter f53de1e9a4 net: ipv6: use common fib_default_rule_pref
This switches IPv6 policy routing to use the shared
fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
multicast routing for IPv4 as well as IPv6.

The motivation for this patch is a complaint about iproute2 behaving
inconsistent between IPv4 and IPv6 when adding policy rules: Formerly,
IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
assigned priority value was decreased with each rule added.

Since then all users of the default_pref field have been converted to
assign the generic function fib_default_rule_pref(), fib_nl_newrule()
may just use it directly instead. Therefore get rid of the function
pointer altogether and make fib_default_rule_pref() static, as it's not
used outside fib_rules.c anymore.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-09 14:19:50 -07:00
Andy Gross 61e19ba99e Qualcomm ARM Based SoC Updates for 4.3
* Add SMEM driver
 * Add SMD driver
 * Add RPM over SMD driver
 * Select QCOM_SCM by default
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJVuSZxAAoJEFKiBbHx2RXVtmsP/i7em4uhfDnj7cwoJ0pYbCRx
 ZkbuBFPC6OdUut9Lq7Wj9b9/cGtZIKhDptyvVic0eWE26u39o2JctnqKQ2bw/uu/
 MwhTWBll6jPguY4Xb8kM0dbccPkq2vKWLu1VWwqeW4VhlB7QYYLOKaNELhW3nfPv
 y5q9qp+ykko0I7VUXNRSxI3kzbIhDKNaKlS4KupTdLRA6d7xl8PAc9O6/qEZAcK0
 S7vBlpvsAyLiGXW35bg6Oi+mGOqgfFFi1z4L9oW1UgXeMcslJ5EBWg4CvJMjb48y
 PTSlMZ+2QZTbJj7fcc7Sj/FgKrWXiFpcsxfXaSdeywlIQvvZcPAf6vuXTVHcHx32
 0AoWpsm9vW0N1tWtstoj0Lvs2A8ewQxkQyflYX+J1G0+JCZ22PbPiG7lUAVLV17u
 fTOAPp2Vcw/5v5jnpAzECp5CH9SGUuv/3yfwoN7spb+ODOOWqzYGFRCnycTTeuxo
 QYkjrUr4i/fbPsJiK3TZalZzm7X6ZI3Y1FZyiCGN/1b8XmZm7PfSB7Uf/YLFbQXM
 WuEepT3GoR17SMyApN09uV0kJXA+TSFBpY5A89QgfzjkBbOyK+dsZ3GnUuoWyjof
 G/swXEVzk7SX8LBXwBg7/BIHVWVARiedTLzvn383C/7BShO9ME/pdGHaYAjVHq8r
 BCTszTpelZ9p1qkI5hlN
 =ic0y
 -----END PGP SIGNATURE-----

Merge tag 'qcom-soc-for-4.3' into v4.2-rc2

Qualcomm ARM Based SoC Updates for 4.3

* Add SMEM driver
* Add SMD driver
* Add RPM over SMD driver
* Select QCOM_SCM by default
2015-09-09 15:56:35 -05:00
Linus Torvalds b8889c4fc6 TTY driver revert for 4.3-rc1
Here are some reverts for some tty patches (specifically the pl011
 driver) that ended up breaking a bunch of machines (i.e. almost all of
 the ones with this chip.)  People are working on a fix for this, but in
 the meantime, it's best to just revert all 5 patches to restore people's
 serial consoles.
 
 These reverts have been in linux-next for many days now.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlXvtvkACgkQMUfUDdst+ynEVACgvMxKazgRqCnYcqbQs9DwARds
 0r4AoL51IVIQe976ZYSjUdSFL+Q0IE1t
 =LOn1
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty driver reverts from Greg KH:
 "Here are some reverts for some tty patches (specifically the pl011
  driver) that ended up breaking a bunch of machines (i.e. almost all
  of the ones with this chip).

  People are working on a fix for this, but in the meantime, it's best
  to just revert all 5 patches to restore people's serial consoles.

  These reverts have been in linux-next for many days now"

* tag 'tty-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  Revert "uart: pl011: Rename regs with enumeration"
  Revert "uart: pl011: Introduce register accessor"
  Revert "uart: pl011: Introduce register look up table"
  Revert "uart: pl011: Improve LCRH register access decision"
  Revert "uart: pl011: Add support to ZTE ZX296702 uart"
2015-09-09 11:27:01 -07:00
Linus Torvalds 82278fc079 pwm: Changes for v4.3-rc1
This set of changes introduces the beginnings of a new API that's based
 around the concept of states that can be atomically applied. Drivers go
 to various lengths to implement something similar, which indicates that
 the core should really be providing the necessary framework.
 
 On top of that, there is a bit of cleanup as well as improved kerneldoc
 and integration into the device-drivers DocBook.
 
 Regarding drivers there is a new one for the NXP LPC18xx family of SoCs
 and a couple of fixes for existing drivers (pca9685, Broadcom Kona and
 Atmel HLCDC).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJV8DLwAAoJEN0jrNd/PrOhSf4P/RYwdDGB/xvI0ifXS5p9znNA
 hr6+MUH87+f9NY1jcDye0RrxIkRsckQ3x3fe/nTmODjCSM/cyrWrVAEKbkqE5bZH
 LTRT6M6jolSAUGXFTBDuHktYdMU9yaD0PJvm0RgBEC7z2BeBmCgCXTFc4paxw2GA
 NvikouXZX9XDlXp598pHPFoL92vTLevQQMBHYW9myY8cjAOG0X8Dn05xQaC2kRvJ
 5q+OD1D/kIEMJxONrpef7bwzPgCkuGTHqqIIm6gLVoPBPGWz43kaZ5+x/j2E98E9
 fkfcP+21GvZ/tdAUEkrRnpy5Jnn+Nk0m+yzPtRkvj4TLwPd93EjumiPg5sb53pys
 ZfoHgnOAArVxEnTYLQk4IyBWcHyyzIxZBKZZpXkMXjY4QVq6wZLAP+vyCAzcm4tV
 vHqUN5BYAkwYPUNU58fgFWGHg6rFlb2QSN1bIj5opgygWmRbPgmwVj3trvS2Dkoe
 NOv47pp0OyWvTvx+wLql8BUt1zK1M6QldfPF6pR4FB18SLyKYH89ZbjHTyGdsuCv
 SiaWrqNB4eRmEKW1MA20Kc4wlF5c2peXVp2EHeddh/oHPES8IVWBPMPwSkJZrpQb
 bIUMStKlJ/Y1GPo6Dd7njK/kA3GwwsbYXYHmyPm0IRuAeZKrzGzVSIIljbfpL6Vf
 LBuKRMQHY4iWtF8A5SmY
 =ymH8
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
 "This set of changes introduces the beginnings of a new API that's
  based around the concept of states that can be atomically applied.
  Drivers go to various lengths to implement something similar, which
  indicates that the core should really be providing the necessary
  framework.

  On top of that, there is a bit of cleanup as well as improved
  kerneldoc and integration into the device-drivers DocBook.

  Regarding drivers there is a new one for the NXP LPC18xx family of
  SoCs and a couple of fixes for existing drivers (pca9685, Broadcom
  Kona and Atmel HLCDC)"

* tag 'pwm/for-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  ARM: at91: pwm: atmel-hlcdc: Add at91sam9n12 errata
  pwm: Add NXP LPC18xx PWM/SCT DT binding documentation
  pwm: NXP LPC18xx PWM/SCT driver
  pwm-pca9685: Support changing the output frequency
  pwm-pca9685: Fix several driver bugs
  pwm: kona: Modify settings application sequence
  pwm: pca9685: Drop owner assignment
  pwm: Add to device-drivers documentation
  pwm: Clean up kerneldoc
  pwm: Remove useless whitespace
  pwm: sysfs: Remove unnecessary padding
  pwm: sysfs: Properly convert from enum to string
  pwm: Make use of pwm_get_xxx() helpers where appropriate
  pwm: Add pwm_get_polarity() helper function
  pwm: Constify PWM device where possible
  pwm: Add the pwm_is_enabled() helper
2015-09-09 10:55:32 -07:00
Linus Torvalds 26d2177e97 Changes for 4.3
- Create drivers/staging/rdma
 - Move amso1100 driver to staging/rdma and schedule for deletion
 - Move ipath driver to staging/rdma and schedule for deletion
 - Add hfi1 driver to staging/rdma and set TODO for move to regular tree
 - Initial support for namespaces to be used on RDMA devices
 - Add RoCE GID table handling to the RDMA core caching code
 - Infrastructure to support handling of devices with differing
   read and write scatter gather capabilities
 - Various iSER updates
 - Kill off unsafe usage of global mr registrations
 - Update SRP driver
 - Misc. mlx4 driver updates
 - Support for the mr_alloc verb
 - Support for a netlink interface between kernel and user space cache
   daemon to speed path record queries and route resolution
 - Ininitial support for safe hot removal of verbs devices
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV7v8wAAoJELgmozMOVy/d2dcP/3PXnGFPgFGJODKE6VCZtTvj
 nooNXRKXjxv470UT5DiAX7SNcBxzzS7Zl/Lj+831H9iNXUyzuH31KtBOAZ3W03vZ
 yXwCB2caOStSldTRSUUvPe2aIFPnyNmSpC4i6XcJLJMCFijKmxin5pAo8qE44BQU
 yjhT+wC9P6LL5wZXsn/nFIMLjOFfu0WBFHNp3gs5j59paxlx5VeIAZk16aQZH135
 m7YCyicwrS8iyWQl2bEXRMon2vlCHlX2RHmOJ4f/P5I0quNcGF2+d8Yxa+K1VyC5
 zcb3OBezz+wZtvh16yhsDfSPqHWirljwID2VzOgRSzTJWvQjju8VkwHtkq6bYoBW
 egIxGCHcGWsD0R5iBXLYr/tB+BmjbDObSm0AsR4+JvSShkeVA1IpeoO+19162ixE
 n6CQnk2jCee8KXeIN4PoIKsjRSbIECM0JliWPLoIpuTuEhhpajftlSLgL5hf1dzp
 HrSy6fXmmoRj7wlTa7DnYIC3X+ffwckB8/t1zMAm2sKnIFUTjtQXF7upNiiyWk4L
 /T1QEzJ2bLQckQ9yY4v528SvBQwA4Dy1amIQB7SU8+2S//bYdUvhysWPkdKC4oOT
 WlqS5PFDCI31MvNbbM3rUbMAD8eBAR8ACw9ZpGI/Rffm5FEX5W3LoxA8gfEBRuqt
 30ZYFuW8evTL+YQcaV65
 =EHLg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull inifiniband/rdma updates from Doug Ledford:
 "This is a fairly sizeable set of changes.  I've put them through a
  decent amount of testing prior to sending the pull request due to
  that.

  There are still a few fixups that I know are coming, but I wanted to
  go ahead and get the big, sizable chunk into your hands sooner rather
  than waiting for those last few fixups.

  Of note is the fact that this creates what is intended to be a
  temporary area in the drivers/staging tree specifically for some
  cleanups and additions that are coming for the RDMA stack.  We
  deprecated two drivers (ipath and amso1100) and are waiting to hear
  back if we can deprecate another one (ehca).  We also put Intel's new
  hfi1 driver into this area because it needs to be refactored and a
  transfer library created out of the factored out code, and then it and
  the qib driver and the soft-roce driver should all be modified to use
  that library.

  I expect drivers/staging/rdma to be around for three or four kernel
  releases and then to go away as all of the work is completed and final
  deletions of deprecated drivers are done.

  Summary of changes for 4.3:

   - Create drivers/staging/rdma
   - Move amso1100 driver to staging/rdma and schedule for deletion
   - Move ipath driver to staging/rdma and schedule for deletion
   - Add hfi1 driver to staging/rdma and set TODO for move to regular
     tree
   - Initial support for namespaces to be used on RDMA devices
   - Add RoCE GID table handling to the RDMA core caching code
   - Infrastructure to support handling of devices with differing read
     and write scatter gather capabilities
   - Various iSER updates
   - Kill off unsafe usage of global mr registrations
   - Update SRP driver
   - Misc  mlx4 driver updates
   - Support for the mr_alloc verb
   - Support for a netlink interface between kernel and user space cache
     daemon to speed path record queries and route resolution
   - Ininitial support for safe hot removal of verbs devices"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (136 commits)
  IB/ipoib: Suppress warning for send only join failures
  IB/ipoib: Clean up send-only multicast joins
  IB/srp: Fix possible protection fault
  IB/core: Move SM class defines from ib_mad.h to ib_smi.h
  IB/core: Remove unnecessary defines from ib_mad.h
  IB/hfi1: Add PSM2 user space header to header_install
  IB/hfi1: Add CSRs for CONFIG_SDMA_VERBOSITY
  mlx5: Fix incorrect wc pkey_index assignment for GSI messages
  IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow
  IB/uverbs: reject invalid or unknown opcodes
  IB/cxgb4: Fix if statement in pick_local_ip6adddrs
  IB/sa: Fix rdma netlink message flags
  IB/ucma: HW Device hot-removal support
  IB/mlx4_ib: Disassociate support
  IB/uverbs: Enable device removal when there are active user space applications
  IB/uverbs: Explicitly pass ib_dev to uverbs commands
  IB/uverbs: Fix race between ib_uverbs_open and remove_one
  IB/uverbs: Fix reference counting usage of event files
  IB/core: Make ib_dealloc_pd return void
  IB/srp: Create an insecure all physical rkey only if needed
  ...
2015-09-09 08:33:31 -07:00
Linus Torvalds a794b4f329 IPMI updates for 4.3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iEYEABECAAYFAlXoqLgACgkQIXnXXONXEReNCwCghq2EtqGYTvLhupB4bJFCdtA5
 ZJMAoIJ+1WIbvGJmBZ/RxehiCl/FjxTl
 =+N3y
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.3' of git://git.code.sf.net/p/openipmi/linux-ipmi

Pull IPMI updates from Corey Minyard:
 "Most of these have been sitting in linux-next for more than a release,
  particularly commit 0fbcf4af7c ("ipmi: Convert the IPMI SI ACPI
  handling to a platform device") which is probably the most complex
  patch.

  That is also the one that changes drivers/acpi/acpi_pnp.c.  The change
  in that file is only removing IPMI from a "special platform devices"
  list, since I convert it to the standard PNP interface.  I posted this
  one to the ACPI list twice and got no response, and it seems to work
  well in my testing, so I'm hoping it's good.

  Hidehiro Kawai posted a set of changes that improves the panic time
  handling in the IPMI driver.

  The rest of the changes are minor bug fixes or cleanups and some
  documentation"

* tag 'for-linus-4.3' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi:ssif: Add a module parm to specify that SMBus alerts don't work
  ipmi: add of_device_id in MODULE_DEVICE_TABLE
  ipmi: Compensate for BMCs that wont set the irq enable bit
  ipmi: Don't call receive handler in the panic context
  ipmi: Avoid touching possible corrupted lists in the panic context
  ipmi: Don't flush messages in sender() in run-to-completion mode
  ipmi: Factor out message flushing procedure
  ipmi: Remove unneeded set_run_to_completion call
  ipmi: Make some data const that was only read
  ipmi: constify SSIF ACPI device ids
  ipmi: Delete an unnecessary check before the function call "cleanup_one_si"
  char:ipmi - Change 1 to true for bool type variables during initialization.
  impi:Remove unneeded setting of module owner to THIS_MODULE in the platform structure, powernv_ipmi_driver
  ipmi: Add a comment in how messages are delivered from the lower layer
  ipmi/powernv: Fix potential invalid pointer dereference
  ipmi: Convert the IPMI SI ACPI handling to a platform device
  ipmi: Add device tree bindings information
2015-09-08 18:19:17 -07:00
Linus Torvalds f6f7a63692 Merge branch 'akpm' (patches from Andrew)
Merge second patch-bomb from Andrew Morton:
 "Almost all of the rest of MM.  There was an unusually large amount of
  MM material this time"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (141 commits)
  zpool: remove no-op module init/exit
  mm: zbud: constify the zbud_ops
  mm: zpool: constify the zpool_ops
  mm: swap: zswap: maybe_preload & refactoring
  zram: unify error reporting
  zsmalloc: remove null check from destroy_handle_cache()
  zsmalloc: do not take class lock in zs_shrinker_count()
  zsmalloc: use class->pages_per_zspage
  zsmalloc: consider ZS_ALMOST_FULL as migrate source
  zsmalloc: partial page ordering within a fullness_list
  zsmalloc: use shrinker to trigger auto-compaction
  zsmalloc: account the number of compacted pages
  zsmalloc/zram: introduce zs_pool_stats api
  zsmalloc: cosmetic compaction code adjustments
  zsmalloc: introduce zs_can_compact() function
  zsmalloc: always keep per-class stats
  zsmalloc: drop unused variable `nr_to_migrate'
  mm/memblock.c: fix comment in __next_mem_range()
  mm/page_alloc.c: fix type information of memoryless node
  memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node()
  ...
2015-09-08 17:52:23 -07:00
Linus Torvalds 9a9952bbd7 IOMMU Updates for Linux v4.3
This time the IOMMU updates are mostly cleanups or fixes. No big new
 features or drivers this time. In particular the changes include:
 
 	* Bigger cleanup of the Domain<->IOMMU data structures and the
 	  code that manages them in the Intel VT-d driver. This makes
 	  the code easier to understand and maintain, and also easier to
 	  keep the data structures in sync. It is also a preparation
 	  step to make use of default domains from the IOMMU core in the
 	  Intel VT-d driver.
 
 	* Fixes for a couple of DMA-API misuses in ARM IOMMU drivers,
 	  namely in the ARM and Tegra SMMU drivers.
 
 	* Fix for a potential buffer overflow in the OMAP iommu driver's
 	  debug code
 
 	* A couple of smaller fixes and cleanups in various drivers
 
 	* One small new feature: Report domain-id usage in the Intel
 	  VT-d driver to easier detect bugs where these are leaked.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJV7sCEAAoJECvwRC2XARrjz3YP/Au4IIfqykfPvmI0cmPhVnAV
 Q72tltwkbK2u2iP+pHheveaMngJtAshsZrnhBon4KJRIt/KTLZQvsFplHDaRhPfY
 yw3LIxhC5kLG/S6irY9Ozb0+uTMdQ3BU2uS23pyoFVfCz+RngBrAwDBcTKqZDCDG
 8dNd+T21XlzxuyeGr58h9upz2VFtq6feoGFhLU5PNxTlf4JWZe77D7NlbSvx6Nwy
 7Ai8dVRgpV9ciUP7w8FXrCUvbMZQDIoTMiWGNSlogVMgA0dllGES91UZYhWf3pil
 abuX6DeFul/cOhEOnH2xa+j5zz2O/upe9stU4wAFw6IhPiAELTHc2NKlWAhwb0SY
 bpDRf7dgLnUfqpmZLpWjTwN4jllc0qS2MIHj+eUu0uhdFi4Z0BuH2wSCdbR7xkqk
 u5u0Jq7hDNKs5FmQTSsWSiAdjakMsRjIN7jMrBbOeZnBSmUnLx74KGPLTb63ncR3
 WIOi4Iyu+LSXBIvZDiLu3lIIh7Atzd+y7IDnb8KXdyqfy+h53OZZOJNbP/qTWHgT
 ZUdm/qrqjIQpTQfleOEadC7vY/y3fR5sBtOQHUamfntni3oYCc4AMRlNdf3eV9lb
 Tyss6F699mU7d/vennTaIToBgVwaXdLYtmvGWjnoT/kqOMclyDf3cIUtZGtp2rJR
 ddmzDA3vBUC5pGj8Hd8R
 =yoGE
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates for from Joerg Roedel:
 "This time the IOMMU updates are mostly cleanups or fixes.  No big new
  features or drivers this time.  In particular the changes include:

   - Bigger cleanup of the Domain<->IOMMU data structures and the code
     that manages them in the Intel VT-d driver.  This makes the code
     easier to understand and maintain, and also easier to keep the data
     structures in sync.  It is also a preparation step to make use of
     default domains from the IOMMU core in the Intel VT-d driver.

   - Fixes for a couple of DMA-API misuses in ARM IOMMU drivers, namely
     in the ARM and Tegra SMMU drivers.

   - Fix for a potential buffer overflow in the OMAP iommu driver's
     debug code

   - A couple of smaller fixes and cleanups in various drivers

   - One small new feature: Report domain-id usage in the Intel VT-d
     driver to easier detect bugs where these are leaked"

* tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (83 commits)
  iommu/vt-d: Really use upper context table when necessary
  x86/vt-d: Fix documentation of DRHD
  iommu/fsl: Really fix init section(s) content
  iommu/io-pgtable-arm: Unmap and free table when overwriting with block
  iommu/io-pgtable-arm: Move init-fn declarations to io-pgtable.h
  iommu/msm: Use BUG_ON instead of if () BUG()
  iommu/vt-d: Access iomem correctly
  iommu/vt-d: Make two functions static
  iommu/vt-d: Use BUG_ON instead of if () BUG()
  iommu/vt-d: Return false instead of 0 in irq_remapping_cap()
  iommu/amd: Use BUG_ON instead of if () BUG()
  iommu/amd: Make a symbol static
  iommu/amd: Simplify allocation in irq_remapping_alloc()
  iommu/tegra-smmu: Parameterize number of TLB lines
  iommu/tegra-smmu: Factor out tegra_smmu_set_pde()
  iommu/tegra-smmu: Extract tegra_smmu_pte_get_use()
  iommu/tegra-smmu: Use __GFP_ZERO to allocate zeroed pages
  iommu/tegra-smmu: Remove PageReserved manipulation
  iommu/tegra-smmu: Convert to use DMA API
  iommu/tegra-smmu: smmu_flush_ptc() wants device addresses
  ...
2015-09-08 17:22:35 -07:00
Bartlomiej Zolnierkiewicz 4eafbd15b6 PM / OPP: add dev_pm_opp_get_suspend_opp() helper
Add dev_pm_opp_get_suspend_opp() helper to obtain suspend opp.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-09 02:20:39 +02:00
Linus Torvalds e81b594cda regmap: Changes for v4.3
This has been a busy release for regmap.  By far the biggest set of
 changes here are those from Markus Pargmann which implement support for
 block transfers in smbus devices.  This required quite a bit of
 refactoring but leaves us better able to handle odd restrictions that
 controllers may have and with better performance on smbus.
 
 Other new features include:
 
  - Fix interactions with lockdep for nested regmaps (eg, when a device
    using regmap is connected to a bus where the bus controller has a
    separate regmap).  Lockdep's default class identification is too
    crude to work without help.
  - Support for must write bitfield operations, useful for operations
    which require writing a bit to trigger them from Kuniori Morimoto.
  - Support for delaying during register patch application from Nariman
    Poushin.
  - Support for overriding cache state via the debugfs implementation
    from Richard Fitzgerald.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV6cZyAAoJECTWi3JdVIfQM3sH/RSygzRIOoOuvro0U3qd4+nM
 qLzpuZNtuAP7xNc5yJZiixz1S6PqUNl+pK/u58s6x10GWDGsWZY6E0Lg94lYmfhA
 26jqWSzrMHp42x+ZN7btLExzPTVnvRerrjtj4mqz6t4yun9SzqVymaZTUZ1hqPHB
 qxSNHs3rHPiqiEWpQKJXb+5dazYYJCbOUrlivxJCk60+ns1N88cA71aY+5/zq5uy
 36e0iYe3s+lv0lptedarFCf9p7o1E6EQzhvEMfyquGXtvq8fl7Qdeo7riRFQ8Iiy
 sygCb3AYuZIiqnOC7+90xkr2Oq0iwdJUD91A9sl/SRiwgItG9jW03bWNHYIhQyk=
 =CbGt
 -----END PGP SIGNATURE-----

Merge tag 'regmap-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
 "This has been a busy release for regmap.

  By far the biggest set of changes here are those from Markus Pargmann
  which implement support for block transfers in smbus devices.  This
  required quite a bit of refactoring but leaves us better able to
  handle odd restrictions that controllers may have and with better
  performance on smbus.

  Other new features include:

   - Fix interactions with lockdep for nested regmaps (eg, when a device
     using regmap is connected to a bus where the bus controller has a
     separate regmap).  Lockdep's default class identification is too
     crude to work without help.

   - Support for must write bitfield operations, useful for operations
     which require writing a bit to trigger them from Kuniori Morimoto.

   - Support for delaying during register patch application from Nariman
     Poushin.

   - Support for overriding cache state via the debugfs implementation
     from Richard Fitzgerald"

* tag 'regmap-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (25 commits)
  regmap: fix a NULL pointer dereference in __regmap_init
  regmap: Support bulk reads for devices without raw formatting
  regmap-i2c: Add smbus i2c block support
  regmap: Add raw_write/read checks for max_raw_write/read sizes
  regmap: regmap max_raw_read/write getter functions
  regmap: Introduce max_raw_read/write for regmap_bulk_read/write
  regmap: Add missing comments about struct regmap_bus
  regmap: No multi_write support if bus->write does not exist
  regmap: Split use_single_rw internally into use_single_read/write
  regmap: Fix regmap_bulk_write for bus writes
  regmap: regmap_raw_read return error on !bus->read
  regulator: core: Print at debug level on debugfs creation failure
  regmap: Fix regmap_can_raw_write check
  regmap: fix typos in regmap.c
  regmap: Fix integertypes for register address and value
  regmap: Move documentation to regmap.h
  regmap: Use different lockdep class for each regmap init call
  thermal: sti: Add parentheses around bridge->ops->regmap_init call
  mfd: vexpress: Add parentheses around bridge->ops->regmap_init call
  regmap: debugfs: Fix misuse of IS_ENABLED
  ...
2015-09-08 16:48:55 -07:00
Linus Torvalds fa815580fb fbdev changes for 4.3
* Minor fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV7X+aAAoJEPo9qoy8lh71yvAP/3WoFdOzodSI973iBJbnXtBz
 LXFY2MaA5dIaAUsGGd8vLNiUQQRdX3QDT50sEoRhr2yXt2S0GRIEhYtjhy6/cueA
 fMVCL25FHJfZ/g2+rCwTBquD28Xw8eCWyXZtecNxoeqyvuoUvaCcvugCOoYzAbWc
 jH8e0iAmvIy9KdcnIXPvlzU3Jjef7Ci6S0Eh4m62X5sL+4x53c7LM8UPEVbeoj8l
 lwu63nJq9K9G9XtG8PPLzTVag9ST93YIU/G3dQYhF+pPSwJcsz7Cc/cLw1DuJKRI
 9N00fWxEgNnqGA5TKq7kwi7N25kxgjZeVmCYG1C4uYLjeL4H4QJ75qXrXZSbkB6H
 oqklJrlMN5XKgMyI5hgHuhEw6olpz97B62/L+c/0ilpR0lzFtuJLH6Uj6UvWcQE2
 MZj9WuEAKLUE5fDH4voeOqwykq5FJlEzwuyWs9w140pCkHI0MT+ZC0bRXAotETyV
 qwjKOXuUDjiqCBaDWJ5BB0amfMf7r6s0RK/robkgWYCnT1tf5BFDODjuIHvSG0oj
 tKtiRa8c5Ca5zsPBTEW9Jyl5wyzKKrvNjSDZWl3BLxJMBY0+eWFCokDt3c48nlYV
 2X8GW6fCx6lT/PU7uTimMS8hq7U2m/PCFRi/NveWLf8iCXzyCQBmJJY32CqbnOGZ
 r/8RAfVnWusV8R4OBVAZ
 =j9Sz
 -----END PGP SIGNATURE-----

Merge tag 'fbdev-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux

Pull fbdev updates from Tomi Valkeinen:
 "Minor fixes and cleanups"

* tag 'fbdev-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
  video: fbdev: atmel_lcdfb: remove useless include
  video: fbdev: pxa168fb: Use devm_clk_get
  fbdev: ssd1307fb: fix error return code
  fbdev: fix snprintf() limit in show_bl_curve()
  video: fbdev: s3c-fb: Constify platform_device_id
  video: fbdev: atmel: fix warning for const return value
  video: fbdev: Drop owner assignment from platform_driver
  video: fbdev: Drop owner assignment from i2c_driver
  fbdev: remove unnecessary memset in vfb
  framebuffer: disable vgacon on microblaze arch
  fbdev: udlfb: remove unneeded initialization in few places
  fbdev: Allow compile test of GPIO consumers if !GPIOLIB
  fbdev: fix cea_modes array size
2015-09-08 16:42:55 -07:00
Linus Torvalds 85579ad7f1 MMC core:
- Fix a race condition in the request handling
  - Skip trim commands for some buggy kingston eMMCs
  - An optimization and a correction for erase groups
  - Set CMD23 quirk for some Sandisk cards
 
 MMC host:
  - sdhci: Give GPIO CD higher precedence and don't poll when it's used
  - sdhci: Fix DMA memory leakage
  - sdhci: Some updates for clock management
  - sdhci-of-at91: introduce driver for the Atmel SDMMC
  - sdhci-of-arasan: Add support for sdhci-5.1
  - sdhci-esdhc-imx: Add support for imx7d which also supports HS400
  - sdhci: A collection of fixes and improvements for various sdhci hosts
  - omap_hsmmc: Modernization of the regulator code
  - dw_mmc: A couple of fixes for DMA and PIO mode
  - usdhi6rol0: A few fixes and support probe deferral for regulators
  - pxamci: Convert to use dmaengine
  - sh_mmcif: Fix the suspend process in a short term solution
  - tmio: Adjust timeout for commands
  - sunxi: Fix timeout while gating/ungating clock
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV7WBfAAoJEP4mhCVzWIwpzMMQAISg3SvN116lxgPJsFmx2Oxd
 fiSdksUTrGB4/Ya/gTGyFQx5wOep1Y/yLhuQr+2rDbIPlIFiKQuQCEVXERXy5VJl
 rmsekgIrMPamiGGYBM5b6n/UhWYktLqoGc88irKYPIaIjLQeH3iqB83y04R6BqBR
 tOC4PKnnHaJyJMt1oR/xHi1PjFKX4/WQtnCUlvzUMu8XcrXRwqgWb/hw1QXLUUZa
 3npFLxlSqJVXJxtjRSeYZp5juvt+l03rK9h/jNmoDfPl7WtdwxkHtShsnS8b74lF
 ZMhrl3phpyMdPNZ0pCwrQJxhplg0MB0Ey8TErRLBsp90Fre0tedf2w6RElkSXek8
 tApr6I/Fkydu5ILnNhnhswsbp4rc9tC7kfr5RkPKQF0icEcZFxNblrDtbD/jFYSu
 csIZO3xsJTvo4Z7kjYymTn3UpsV8l/hZDNOWHzt7LCTt8X2otEvsLvmabK9pMayI
 FipP3pbsM4/H70eOOwpWmzM4J4FGRtKGEXqJGfODqlGHlrXz06xtlvBDzDvW86BA
 hFr3cuDPqpMSm9UUPIVYZrm4OuO6p1JpVpj7/cmg48mJEqVNQZ+hyHC0zU2F/saL
 4pogQC1vL4SM4xJB+rY4Fo+E7gNEkYjEymEJ4yCbwdqF9Nxjs7m2rCulty4JCVyY
 vwMt5mo0d2IsixclFghh
 =wR1C
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.3' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC updates from Ulf Hansson:
 "MMC core:
   - Fix a race condition in the request handling
   - Skip trim commands for some buggy kingston eMMCs
   - An optimization and a correction for erase groups
   - Set CMD23 quirk for some Sandisk cards

  MMC host:
   - sdhci: Give GPIO CD higher precedence and don't poll when it's used
   - sdhci: Fix DMA memory leakage
   - sdhci: Some updates for clock management
   - sdhci-of-at91: introduce driver for the Atmel SDMMC
   - sdhci-of-arasan: Add support for sdhci-5.1
   - sdhci-esdhc-imx: Add support for imx7d which also supports HS400
   - sdhci: A collection of fixes and improvements for various sdhci hosts
   - omap_hsmmc: Modernization of the regulator code
   - dw_mmc: A couple of fixes for DMA and PIO mode
   - usdhi6rol0: A few fixes and support probe deferral for regulators
   - pxamci: Convert to use dmaengine
   - sh_mmcif: Fix the suspend process in a short term solution
   - tmio: Adjust timeout for commands
   - sunxi: Fix timeout while gating/ungating clock"

* tag 'mmc-v4.3' of git://git.linaro.org/people/ulf.hansson/mmc: (67 commits)
  mmc: android-goldfish: remove incorrect __iomem annotation
  mmc: core: fix race condition in mmc_wait_data_done
  mmc: host: omap_hsmmc: remove CONFIG_REGULATOR check
  mmc: host: omap_hsmmc: use ios->vdd for setting vmmc voltage
  mmc: host: omap_hsmmc: use regulator_is_enabled to find pbias status
  mmc: host: omap_hsmmc: enable/disable vmmc_aux regulator based on previous state
  mmc: host: omap_hsmmc: don't use ->set_power to set initial regulator state
  mmc: host: omap_hsmmc: avoid pbias regulator enable on power off
  mmc: host: omap_hsmmc: add separate function to set pbias
  mmc: host: omap_hsmmc: add separate functions for enable/disable supply
  mmc: host: omap_hsmmc: return error if any of the regulator APIs fail
  mmc: host: omap_hsmmc: remove unnecessary pbias set_voltage
  mmc: host: omap_hsmmc: use mmc_host's vmmc and vqmmc
  mmc: host: omap_hsmmc: use the ocrmask provided by the vmmc regulator
  mmc: host: omap_hsmmc: cleanup omap_hsmmc_reg_get()
  mmc: host: omap_hsmmc: return on fatal errors from omap_hsmmc_reg_get
  mmc: host: omap_hsmmc: use devm_regulator_get_optional() for vmmc
  mmc: sdhci-of-at91: fix platform_no_drv_owner.cocci warnings
  mmc: sh_mmcif: Fix suspend process
  mmc: usdhi6rol0: fix error return code
  ...
2015-09-08 16:33:16 -07:00
Linus Torvalds 3af6e98f25 platform-drivers-x86 for 4.3-1
Significant work on toshiba_acpi, including new hardware support,
 refactoring, and cleanups. Extend device support for asus, ideapad, and
 acer systems. New surface pro 3 buttons driver. Misc. minor cleanups for
 thinkpad and hp-wireless.
 
 acer-wmi:
  - No rfkill on HP Omen 15 wifi
 
 thinkpad_acpi
  - Remove side effects from vdbg_printk -> no_printk macro
 
 surface pro 3
  - Add support driver for Surface Pro 3 buttons
 
 hp-wireless
  - remove unneeded goto/label in hpwl_init
 
 ideapad-laptop
  - add alternative representation for Yoga 2 to DMI table
  - Add Lenovo Yoga 3 14 to no_hw_rfkill dmi list
 
 asus-laptop
  - Add key found on Asus F3M
 
 MAINTAINERS
  - Remove Toshiba Linux mailing list address
 
 toshiba_acpi:
  - Bump driver version to 0.23
  - Remove unnecessary checks and returns in HCI/SCI functions
  - Refactor *{get, set} functions return value
  - Remove "*not supported" feature prints
  - Change *available functions return type
  - Add set_fan_status function
  - Change some variables to avoid warnings from ninja-check
  - Reorder toshiba_acpi_alt_keymap entries
  - Remove unused wireless defines
  - Transflective backlight updates
  - Avoid registering input device on WMI event laptops
  - Add /dev/toshiba_acpi device
  - Adapt /proc/acpi/toshiba/keys to TOS1900 devices
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV6K71AAoJEKbMaAwKp364lzAIAIKeesYU4VevKg2wL6Em/dVP
 4ftDgzhvFRQhM0CEe+uR8NWFNB/CVKVSphKD5mFtc0NAs0S0Eo/tmAlERfeq0qkt
 HFSpP80deU+CwjURLo4wgMbPBPudHA3bNDZfSY6vq+/JhakdtheLyAODOda7uegz
 PHV0zrgDntwVDAuBPTB2h6KigFXSc3soGCWcHTjD0PLNvTGvWOiNYO7otsclNzFc
 qC0uerR4ryt6OFtul900/U/x1bfM9OAFNSnqWpfkkhQ4nskmkzDsyXutXm3CtXso
 Ol3YS0Saj3gBGUmYua1smSU5u4IU9cQNeq6EEgR4jdHt2QQ8fJfmT0v5UgeiHHw=
 =UZPt
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v4.3-1' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull x86 platform driver updates from Darren Hart:
 "Significant work on toshiba_acpi, including new hardware support,
  refactoring, and cleanups.  Extend device support for asus, ideapad,
  and acer systems.  New surface pro 3 buttons driver.  Misc minor
  cleanups for thinkpad and hp-wireless.

  acer-wmi:
   - No rfkill on HP Omen 15 wifi

  thinkpad_acpi:
   - Remove side effects from vdbg_printk -> no_printk macro

  surface pro 3:
   - Add support driver for Surface Pro 3 buttons

  hp-wireless:
   - remove unneeded goto/label in hpwl_init

  ideapad-laptop:
   - add alternative representation for Yoga 2 to DMI table
   - Add Lenovo Yoga 3 14 to no_hw_rfkill dmi list

  asus-laptop:
   - Add key found on Asus F3M

  MAINTAINERS:
   - Remove Toshiba Linux mailing list address

  toshiba_acpi:
   - Bump driver version to 0.23
   - Remove unnecessary checks and returns in HCI/SCI functions
   - Refactor *{get, set} functions return value
   - Remove "*not supported" feature prints
   - Change *available functions return type
   - Add set_fan_status function
   - Change some variables to avoid warnings from ninja-check
   - Reorder toshiba_acpi_alt_keymap entries
   - Remove unused wireless defines
   - Transflective backlight updates
   - Avoid registering input device on WMI event laptops
   - Add /dev/toshiba_acpi device
   - Adapt /proc/acpi/toshiba/keys to TOS1900 devices"

* tag 'platform-drivers-x86-v4.3-1' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: (21 commits)
  acer-wmi: No rfkill on HP Omen 15 wifi
  thinkpad_acpi: Remove side effects from vdbg_printk -> no_printk macro
  surface pro 3: Add support driver for Surface Pro 3 buttons
  hp-wireless: remove unneeded goto/label in hpwl_init
  ideapad-laptop: add alternative representation for Yoga 2 to DMI table
  asus-laptop: Add key found on Asus F3M
  MAINTAINERS: Remove Toshiba Linux mailing list address
  ideapad-laptop: Add Lenovo Yoga 3 14 to no_hw_rfkill dmi list
  toshiba_acpi: Bump driver version to 0.23
  toshiba_acpi: Remove unnecessary checks and returns in HCI/SCI functions
  toshiba_acpi: Refactor *{get, set} functions return value
  toshiba_acpi: Remove "*not supported" feature prints
  toshiba_acpi: Change *available functions return type
  toshiba_acpi: Add set_fan_status function
  toshiba_acpi: Change some variables to avoid warnings from ninja-check
  toshiba_acpi: Reorder toshiba_acpi_alt_keymap entries
  toshiba_acpi: Remove unused wireless defines
  toshiba_acpi: Transflective backlight updates
  toshiba_acpi: Avoid registering input device on WMI event laptops
  toshiba_acpi: Add /dev/toshiba_acpi device
  ...
2015-09-08 16:26:18 -07:00
Linus Torvalds acceba598e Merge branch 'i2c/for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "Features:

   - new drivers: Renesas EMEV2, register based MUX, NXP LPC2xxx
   - core: scans DT and assigns wakeup interrupts.  no driver changes needed.
   - core: some refcouting issues fixed and better API for that
   - core: new helper function for best effort block read emulation
   - slave framework: proper DT bindings and userspace instantiation
   - some bigger work for xiic, pxa, omap drivers

  .. and quite a number of smaller driver fixes, cleanups, improvements"

* 'i2c/for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (65 commits)
  i2c: mux: reg Change ioread endianness for readback
  i2c: mux: reg: fix compilation warnings
  i2c: mux: reg: simplify register size checking
  i2c: muxes: fix leaked i2c adapter device node references
  i2c: allow specifying separate wakeup interrupt in device tree
  of/irq: export of_get_irq_byname()
  i2c: xgene-slimpro: dma_mapping_error() doesn't return an error code
  i2c: Replace I2C_CROS_EC_TUNNEL dependency
  eeprom: at24: use i2c_smbus_read_i2c_block_data_or_emulated
  i2c: core: Add support for best effort block read emulation
  i2c: lpc2k: add driver
  i2c: mux: Add register-based mux i2c-mux-reg
  i2c: dt: describe generic bindings
  i2c: slave: print warning if slave flag not set
  i2c: support 10 bit and slave addresses in sysfs 'new_device'
  i2c: take address space into account when checking for used addresses
  i2c: apply DT flags when probing
  i2c: make address check indpendent from client struct
  i2c: rename address check functions
  i2c: apply address offset for slaves, too
  ...
2015-09-08 16:16:26 -07:00
Linus Torvalds c19176154b RTC for 4.3
Core:
  - use is_visible() to control sysfs attributes
  - switch wakealarm attribute to DEVICE_ATTR_RW
  - make rtc_does_wakealarm() return boolean
  - properly manage lifetime of dev and cdev in rtc device
  - remove unnecessary device_get() in rtc_device_unregister
  - fix double free in rtc_register_device() error path
 
 New drivers:
  - NXP LPC24xx
  - Xilinx Zynq MP
  - Dialog DA9062
 
 Subsystem wide cleanups:
  - fix drivers that consider 0 as a valid IRQ in client->irq
  - Drop (un)likely before IS_ERR(_OR_NULL)
  - drop the remaining owner assignment for i2c_driver and platform_driver
  - module autoload fixes
 
 Drivers:
  - 88pm80x: add device tree support
  - abx80x: fix RTC write bit
  - ab8500: Add a sentinel to ab85xx_rtc_ids[]
  - armada38x: Align RTC set time procedure with the official errata
  - as3722: correct month value
  - at91sam9: cleanups
  - at91rm9200: get and use slow clock and cleanups
  - bq32k: remove redundant check
  - cmos: century support, proper fix for the spurious wakeup
  - ds1307: cleanups and wakeup irq support
  - ds1374: Remove unused variable
  - ds1685: Use module_platform_driver
  - ds3232: fix WARNING trace in resume function
  - gemini: fix ptr_ret.cocci warnings
  - mt6397: implement suspend/resume
  - omap: support internal and external clock enabling
  - opal: Enable alarms only when opal supports tpo
  - pcf2127: use OFS flag to detect unreliable date and warn the user
  - pl031: fix typo for author email
  - rx8025: huge cleanup and fixes
  - sa1100/pxa: share common code
  - s5m: fix to update ctrl register
  - s3c: fix clocks and wakeup, cleanup
  - sirfsoc: use regmap
  - nvram_read()/nvram_write() functions for cmos, ds1305, ds1307, ds1343,
  ds1511, ds1553, ds1742, m48t59, rp5c01, stk17ta8, tx4939
  - use rtc_valid_tm() error code when reading date/time instead of 0 for
  isl12022, pcf2123, pcf2127
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJV7ARpAAoJEKbNnwlvZCyzo40P/09fca3C9HBEI5hgJO7PmQX8
 +3wkacMjEi58yMTzWoDCFh2H1Vvajm1d6bJ6H//5/3h/KPKrjIsCTFlMfuv8gfbe
 xFoeXPMdiojjPxFjbjnpM+0GD9SKTIjwpP4d8HgoUDwGRJ6dxaTOVuJXRmVGcmas
 e4Ih6hDy2YBiRXLzLHSvSNcTJDGznwMvEEg1AfqUjAtNm4BUzhUnvq0CYlSdwMzy
 xK6w0LmOr67l2QpjiQNCvIn04OWdDQ35f0GcRHYEY66cZPkvPp3OCC/UbeChU6uK
 He/v1n+QdBTbkKYeYmNnBSr42KPTLONNaOFXf79PpAz/WtiYxLxJDbf8DYW9gGQj
 mC3rMNWXf4rEFHkHYH4KvbvxplYd6/pCByMFJa76pKtcE2FG7pVAQZgLaLGemct4
 q0MW73n7yBR1wDlTHqqKu1pAnz1CW3Eui96mDJJptG3CtNqe9a8G48Yzy5OuYAcI
 wxBWCbxeT2z+EuePOBGlQvPreyqyTjHXL87NqMEDrzFrdQd02rn90xqIPlCbcr/P
 qyRwIrDGFyV8PX9I4ewrtt9WfgttayHgZ4juaSi9F479oT6006z6vArrTg2Q/saA
 b0X+hG8oLGSZuSobASjjCtjTE+jYhr+4hzGuHJw0ky0CzLQOqf1APMKDwLN4m5Pz
 Ex+JKalozL+LElGMCS3q
 =cKuu
 -----END PGP SIGNATURE-----

Merge tag 'rtc-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "Core:
   - use is_visible() to control sysfs attributes
   - switch wakealarm attribute to DEVICE_ATTR_RW
   - make rtc_does_wakealarm() return boolean
   - properly manage lifetime of dev and cdev in rtc device
   - remove unnecessary device_get() in rtc_device_unregister
   - fix double free in rtc_register_device() error path

  New drivers:
   - NXP LPC24xx
   - Xilinx Zynq MP
   - Dialog DA9062

  Subsystem wide cleanups:
   - fix drivers that consider 0 as a valid IRQ in client->irq
   - Drop (un)likely before IS_ERR(_OR_NULL)
   - drop the remaining owner assignment for i2c_driver and
     platform_driver
   - module autoload fixes

  Drivers:
   - 88pm80x: add device tree support
   - abx80x: fix RTC write bit
   - ab8500: Add a sentinel to ab85xx_rtc_ids[]
   - armada38x: Align RTC set time procedure with the official errata
   - as3722: correct month value
   - at91sam9: cleanups
   - at91rm9200: get and use slow clock and cleanups
   - bq32k: remove redundant check
   - cmos: century support, proper fix for the spurious wakeup
   - ds1307: cleanups and wakeup irq support
   - ds1374: Remove unused variable
   - ds1685: Use module_platform_driver
   - ds3232: fix WARNING trace in resume function
   - gemini: fix ptr_ret.cocci warnings
   - mt6397: implement suspend/resume
   - omap: support internal and external clock enabling
   - opal: Enable alarms only when opal supports tpo
   - pcf2127: use OFS flag to detect unreliable date and warn the user
   - pl031: fix typo for author email
   - rx8025: huge cleanup and fixes
   - sa1100/pxa: share common code
   - s5m: fix to update ctrl register
   - s3c: fix clocks and wakeup, cleanup
   - sirfsoc: use regmap
   - nvram_read()/nvram_write() functions for cmos, ds1305, ds1307,
     ds1343, ds1511, ds1553, ds1742, m48t59, rp5c01, stk17ta8, tx4939
   - use rtc_valid_tm() error code when reading date/time instead of 0
     for isl12022, pcf2123, pcf2127"

* tag 'rtc-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (90 commits)
  rtc: abx80x: fix RTC write bit
  rtc: ab8500: Add a sentinel to ab85xx_rtc_ids[]
  rtc: ds1374: Remove unused variable
  rtc: Fix module autoload for OF platform drivers
  rtc: Fix module autoload for rtc-{ab8500,max8997,s5m} drivers
  rtc: omap: Add external clock enabling support
  rtc: omap: Add internal clock enabling support
  ARM: dts: AM437x: Add the internal and external clock nodes for rtc
  rtc: s5m: fix to update ctrl register
  rtc: add xilinx zynqmp rtc driver
  devicetree: bindings: rtc: add bindings for xilinx zynqmp rtc
  rtc: as3722: correct month value
  ARM: config: Switch PXA27x platforms to use PXA RTC driver
  ARM: mmp: remove unused RTC register definitions
  ARM: sa1100: remove unused RTC register definitions
  rtc: sa1100/pxa: convert to run-time register mapping
  ARM: pxa: add memory resource to SA1100 RTC device
  rtc: pxa: convert to use shared sa1100 functions
  rtc: sa1100: prepare to share sa1100_rtc_ops
  rtc: ds3232: fix WARNING trace in resume function
  ...
2015-09-08 15:46:31 -07:00
Krzysztof Kozlowski c83db4f419 mm: zbud: constify the zbud_ops
The structure zbud_ops is not modified so make the pointer to it a
pointer to const.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Krzysztof Kozlowski 786727799a mm: zpool: constify the zpool_ops
The structure zpool_ops is not modified so make the pointer to it a
pointer to const.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Dmitry Safonov 5b999aadba mm: swap: zswap: maybe_preload & refactoring
zswap_get_swap_cache_page and read_swap_cache_async have pretty much the
same code with only significant difference in return value and usage of
swap_readpage.

I a helper __read_swap_cache_async() with the common code.  Behavior
change: now zswap_get_swap_cache_page will use radix_tree_maybe_preload
instead radix_tree_preload.  Looks like, this wasn't changed only by the
reason of code duplication.

Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Sergey Senozhatsky 860c707dca zsmalloc: account the number of compacted pages
Compaction returns back to zram the number of migrated objects, which is
quite uninformative -- we have objects of different sizes so user space
cannot obtain any valuable data from that number.  Change compaction to
operate in terms of pages and return back to compaction issuer the
number of pages that were freed during compaction.  So from now on we
will export more meaningful value in zram<id>/mm_stat -- the number of
freed (compacted) pages.

This requires:
 (a) a rename of `num_migrated' to 'pages_compacted'
 (b) a internal API change -- return first_page's fullness_group from
     putback_zspage(), so we know when putback_zspage() did
     free_zspage().  It helps us to account compaction stats correctly.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Sergey Senozhatsky 7d3f393823 zsmalloc/zram: introduce zs_pool_stats api
`zs_compact_control' accounts the number of migrated objects but it has
a limited lifespan -- we lose it as soon as zs_compaction() returns back
to zram.  It worked fine, because (a) zram had it's own counter of
migrated objects and (b) only zram could trigger compaction.  However,
this does not work for automatic pool compaction (not issued by zram).
To account objects migrated during auto-compaction (issued by the
shrinker) we need to store this number in zs_pool.

Define a new `struct zs_pool_stats' structure to keep zs_pool's stats
there.  It provides only `num_migrated', as of this writing, but it
surely can be extended.

A new zsmalloc zs_pool_stats() symbol exports zs_pool's stats back to
caller.

Use zs_pool_stats() in zram and remove `num_migrated' from zram_stats.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Vlastimil Babka 82c1fc7147 mm: use numa_mem_id() in alloc_pages_node()
alloc_pages_node() might fail when called with NUMA_NO_NODE and
__GFP_THISNODE on a CPU belonging to a memoryless node.  To make the
local-node fallback more robust and prevent such situations, use
numa_mem_id(), which was introduced for similar scenarios in the slab
context.

Suggested-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Vlastimil Babka 0bc35a970c mm: unify checks in alloc_pages_node() and __alloc_pages_node()
Perform the same debug checks in alloc_pages_node() as are done in
__alloc_pages_node(), by making the former function a wrapper of the
latter one.

In addition to better diagnostics in DEBUG_VM builds for situations
which have been already fatal (e.g.  out-of-bounds node id), there are
two visible changes for potential existing buggy callers of
alloc_pages_node():

- calling alloc_pages_node() with any negative nid (e.g. due to arithmetic
  overflow) was treated as passing NUMA_NO_NODE and fallback to local node was
  applied. This will now be fatal.
- calling alloc_pages_node() with an offline node will now be checked for
  DEBUG_VM builds. Since it's not fatal if the node has been previously online,
  and this patch may expose some existing buggy callers, change the VM_BUG_ON
  in __alloc_pages_node() to VM_WARN_ON.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Vlastimil Babka 96db800f5d mm: rename alloc_pages_exact_node() to __alloc_pages_node()
alloc_pages_exact_node() was introduced in commit 6484eb3e2a ("page
allocator: do not check NUMA node ID when the caller knows the node is
valid") as an optimized variant of alloc_pages_node(), that doesn't
fallback to current node for nid == NUMA_NO_NODE.  Unfortunately the
name of the function can easily suggest that the allocation is
restricted to the given node and fails otherwise.  In truth, the node is
only preferred, unless __GFP_THISNODE is passed among the gfp flags.

The misleading name has lead to mistakes in the past, see for example
commits 5265047ac3 ("mm, thp: really limit transparent hugepage
allocation to local node") and b360edb43f ("mm, mempolicy:
migrate_to_node should only migrate to node").

Another issue with the name is that there's a family of
alloc_pages_exact*() functions where 'exact' means exact size (instead
of page order), which leads to more confusion.

To prevent further mistakes, this patch effectively renames
alloc_pages_exact_node() to __alloc_pages_node() to better convey that
it's an optimized variant of alloc_pages_node() not intended for general
usage.  Both functions get described in comments.

It has been also considered to really provide a convenience function for
allocations restricted to a node, but the major opinion seems to be that
__GFP_THISNODE already provides that functionality and we shouldn't
duplicate the API needlessly.  The number of users would be small
anyway.

Existing callers of alloc_pages_exact_node() are simply converted to
call __alloc_pages_node(), with the exception of sba_alloc_coherent()
which open-codes the check for NUMA_NO_NODE, so it is converted to use
alloc_pages_node() instead.  This means it no longer performs some
VM_BUG_ON checks, and since the current check for nid in
alloc_pages_node() uses a 'nid < 0' comparison (which includes
NUMA_NO_NODE), it may hide wrong values which would be previously
exposed.

Both differences will be rectified by the next patch.

To sum up, this patch makes no functional changes, except temporarily
hiding potentially buggy callers.  Restricting the checks in
alloc_pages_node() is left for the next patch which can in turn expose
more existing buggy callers.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Robin Holt <robinmholt@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cliff Whickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Wanpeng Li da1b13ccfb mm/hwpoison: fix race between soft_offline_page and unpoison_memory
Wanpeng Li reported a race between soft_offline_page() and
unpoison_memory(), which causes the following kernel panic:

   BUG: Bad page state in process bash  pfn:97000
   page:ffffea00025c0000 count:0 mapcount:1 mapping:          (null) index:0x7f4fdbe00
   flags: 0x1fffff80080048(uptodate|active|swapbacked)
   page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
   bad because of flags:
   flags: 0x40(active)
   Modules linked in: snd_hda_codec_hdmi i915 rpcsec_gss_krb5 nfsv4 dns_resolver bnep rfcomm nfsd bluetooth auth_rpcgss nfs_acl nfs rfkill lockd grace sunrpc i2c_algo_bit drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic drm snd_hda_intel fscache snd_hda_codec x86_pkg_temp_thermal coretemp kvm_intel snd_hda_core snd_hwdep kvm snd_pcm snd_seq_dummy snd_seq_oss crct10dif_pclmul snd_seq_midi crc32_pclmul snd_seq_midi_event ghash_clmulni_intel snd_rawmidi aesni_intel lrw gf128mul snd_seq glue_helper ablk_helper snd_seq_device cryptd fuse snd_timer dcdbas serio_raw mei_me parport_pc snd mei ppdev i2c_core video lp soundcore parport lpc_ich shpchp mfd_core ext4 mbcache jbd2 sd_mod e1000e ahci ptp libahci crc32c_intel libata pps_core
   CPU: 3 PID: 2211 Comm: bash Not tainted 4.2.0-rc5-mm1+ #45
   Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
   Call Trace:
     dump_stack+0x48/0x5c
     bad_page+0xe6/0x140
     free_pages_prepare+0x2f9/0x320
     ? uncharge_list+0xdd/0x100
     free_hot_cold_page+0x40/0x170
     __put_single_page+0x20/0x30
     put_page+0x25/0x40
     unmap_and_move+0x1a6/0x1f0
     migrate_pages+0x100/0x1d0
     ? kill_procs+0x100/0x100
     ? unlock_page+0x6f/0x90
     __soft_offline_page+0x127/0x2a0
     soft_offline_page+0xa6/0x200

This race is explained like below:

  CPU0                    CPU1

  soft_offline_page
  __soft_offline_page
  TestSetPageHWPoison
                        unpoison_memory
                        PageHWPoison check (true)
                        TestClearPageHWPoison
                        put_page    -> release refcount held by get_hwpoison_page in unpoison_memory
                        put_page    -> release refcount held by isolate_lru_page in __soft_offline_page
  migrate_pages

The second put_page() releases refcount held by isolate_lru_page() which
will lead to unmap_and_move() releases the last refcount of page and w/
mapcount still 1 since try_to_unmap() is not called if there is only one
user map the page.  Anyway, the page refcount and mapcount will still
mess if the page is mapped by multiple users.

This race was introduced by commit 4491f71260 ("mm/memory-failure: set
PageHWPoison before migrate_pages()"), which focuses on preventing the
reuse of successfully migrated page.  Before this commit we prevent the
reuse by changing the migratetype to MIGRATE_ISOLATE during soft
offlining, which has the following problems, so simply reverting the
commit is not a best option:

  1) it doesn't eliminate the reuse completely, because
     set_migratetype_isolate() can fail to set MIGRATE_ISOLATE to the
     target page if the pageblock of the page contains one or more
     unmovable pages (i.e.  has_unmovable_pages() returns true).

  2) the original code changes migratetype to MIGRATE_ISOLATE
     forcibly, and sets it to MIGRATE_MOVABLE forcibly after soft offline,
     regardless of the original migratetype state, which could impact
     other subsystems like memory hotplug or compaction.

This patch moves PageSetHWPoison just after put_page() in
unmap_and_move(), which closes up the reported race window and minimizes
another race window b/w SetPageHWPoison and reallocation (which causes
the reuse of soft-offlined page.) The latter race window still exists
but it's acceptable, because it's rare and effectively the same as
ordinary "containment failure" case even if it happens, so keep the
window open is acceptable.

Fixes: 4491f71260 ("mm/memory-failure: set PageHWPoison before migrate_pages()")
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Tested-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Naoya Horiguchi 8e30456b6c mm/hwpoison: introduce num_poisoned_pages wrappers
num_poisoned_pages counter will be changed outside mm/memory-failure.c
by a subsequent patch, so this patch prepares wrappers to manipulate it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Wanpeng Li 94bf4ec84a mm/hwpoison: introduce put_hwpoison_page to put refcount for memory error handling
Introduce put_hwpoison_page to put refcount for memory error handling.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Mark Salter 6b0f68e32e mm: add utility for early copy from unmapped ram
When booting an arm64 kernel w/initrd using UEFI/grub, use of mem= will
likely cut off part or all of the initrd.  This leaves it outside the
kernel linear map which leads to failure when unpacking.  The x86 code
has a similar need to relocate an initrd outside of mapped memory in
some cases.

The current x86 code uses early_memremap() to copy the original initrd
from unmapped to mapped RAM.  This patchset creates a generic
copy_from_early_mem() utility based on that x86 code and has arm64 and
x86 share it in their respective initrd relocation code.

This patch (of 3):

In some early boot circumstances, it may be necessary to copy from RAM
outside the kernel linear mapping to mapped RAM.  The need to relocate
an initrd is one example in the x86 code.  This patch creates a helper
function based on current x86 code.

Signed-off-by: Mark Salter <msalter@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Sean O. Stalley 01a7fd337b pci: mm: add pci_pool_zalloc() call
Add a wrapper function for pci_pool_alloc() to get zeroed memory.

Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Gilles Muller <Gilles.Muller@lip6.fr>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Sean O. Stalley ad82362b2d mm: add dma_pool_zalloc() call to DMA API
Add a wrapper function for dma_pool_alloc() to get zeroed memory.

Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Gilles Muller <Gilles.Muller@lip6.fr>
Cc: Nicolas Palix <nicolas.palix@imag.fr>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Kirill A. Shutemov 64b990d295 mm: drop __nocast from vm_flags_t definition
__nocast does no good for vm_flags_t. It only produces useless sparse
warnings.

Let's drop it.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Naoya Horiguchi c5b4e1b02f mm, page_isolation: make set/unset_migratetype_isolate() file-local
Nowaday, set/unset_migratetype_isolate() is defined and used only in
mm/page_isolation, so let's limit the scope within the file.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Tang Chen 95cf82ecc1 mem-hotplug: handle node hole when initializing numa_meminfo.
When parsing SRAT, all memory ranges are added into numa_meminfo.  In
numa_init(), before entering numa_cleanup_meminfo(), all possible memory
ranges are in numa_meminfo.  And numa_cleanup_meminfo() removes all
ranges over max_pfn or empty.

But, this only works if the nodes are continuous.  Let's have a look at
the following example:

We have an SRAT like this:
SRAT: Node 0 PXM 0 [mem 0x00000000-0x5fffffff]
SRAT: Node 0 PXM 0 [mem 0x100000000-0x1ffffffffff]
SRAT: Node 1 PXM 1 [mem 0x20000000000-0x3ffffffffff]
SRAT: Node 4 PXM 2 [mem 0x40000000000-0x5ffffffffff] hotplug
SRAT: Node 5 PXM 3 [mem 0x60000000000-0x7ffffffffff] hotplug
SRAT: Node 2 PXM 4 [mem 0x80000000000-0x9ffffffffff] hotplug
SRAT: Node 3 PXM 5 [mem 0xa0000000000-0xbffffffffff] hotplug
SRAT: Node 6 PXM 6 [mem 0xc0000000000-0xdffffffffff] hotplug
SRAT: Node 7 PXM 7 [mem 0xe0000000000-0xfffffffffff] hotplug

On boot, only node 0,1,2,3 exist.

And the numa_meminfo will look like this:
numa_meminfo.nr_blks = 9
1. on node 0: [0, 60000000]
2. on node 0: [100000000, 20000000000]
3. on node 1: [20000000000, 40000000000]
4. on node 4: [40000000000, 60000000000]
5. on node 5: [60000000000, 80000000000]
6. on node 2: [80000000000, a0000000000]
7. on node 3: [a0000000000, a0800000000]
8. on node 6: [c0000000000, a0800000000]
9. on node 7: [e0000000000, a0800000000]

And numa_cleanup_meminfo() will merge 1 and 2, and remove 8,9 because the
end address is over max_pfn, which is a0800000000.  But 4 and 5 are not
removed because their end addresses are less then max_pfn.  But in fact,
node 4 and 5 don't exist.

In a word, numa_cleanup_meminfo() is not able to handle holes between nodes.

Since memory ranges in node 4 and 5 are in numa_meminfo, in
numa_register_memblks(), node 4 and 5 will be mistakenly set to online.

If you run lscpu, it will show:
NUMA node0 CPU(s):     0-14,128-142
NUMA node1 CPU(s):     15-29,143-157
NUMA node2 CPU(s):
NUMA node3 CPU(s):
NUMA node4 CPU(s):     62-76,190-204
NUMA node5 CPU(s):     78-92,206-220

In this patch, we use memblock_overlaps_region() to check if ranges in
numa_meminfo overlap with ranges in memory_block.  Since memory_block
contains all available memory at boot time, if they overlap, it means the
ranges exist.  If not, then remove them from numa_meminfo.

After this patch, lscpu will show:
NUMA node0 CPU(s):     0-14,128-142
NUMA node1 CPU(s):     15-29,143-157
NUMA node4 CPU(s):     62-76,190-204
NUMA node5 CPU(s):     78-92,206-220

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Tang Chen c5c5c9d100 mm/memblock.c: make memblock_overlaps_region() return bool.
memblock_overlaps_region() checks if the given memblock region
intersects a region in memblock.  If so, it returns the index of the
intersected region.

But its only caller is memblock_is_region_reserved(), and it returns 0
if false, non-zero if true.

Both of these should return bool.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Mike Kravetz 70c3547e36 hugetlbfs: add hugetlbfs_fallocate()
This is based on the shmem version, but it has diverged quite a bit.  We
have no swap to worry about, nor the new file sealing.  Add
synchronication via the fault mutex table to coordinate page faults,
fallocate allocation and fallocate hole punch.

What this allows us to do is move physical memory in and out of a
hugetlbfs file without having it mapped.  This also gives us the ability
to support MADV_REMOVE since it is currently implemented using
fallocate().  MADV_REMOVE lets madvise() remove pages from the middle of
a hugetlbfs file, which wasn't possible before.

hugetlbfs fallocate only operates on whole huge pages.

Based on code by Dave Hansen.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Mike Kravetz ab76ad540a hugetlbfs: New huge_add_to_page_cache helper routine
Currently, there is only a single place where hugetlbfs pages are added
to the page cache.  The new fallocate code be adding a second one, so
break the functionality out into its own helper.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Mike Kravetz b5cec28d36 hugetlbfs: truncate_hugepages() takes a range of pages
Modify truncate_hugepages() to take a range of pages (start, end)
instead of simply start.  If an end value of LLONG_MAX is passed, the
current "truncate" functionality is maintained.  Existing callers are
modified to pass LLONG_MAX as end of range.  By keying off end ==
LLONG_MAX, the routine behaves differently for truncate and hole punch.
Page removal is now synchronized with page allocation via faults by
using the fault mutex table.  The hole punch case can experience the
rare region_del error and must handle accordingly.

Add the routine hugetlb_fix_reserve_counts to fix up reserve counts in
the case where region_del returns an error.

Since the routine handles more than just the truncate case, it is
renamed to remove_inode_hugepages().  To be consistent, the routine
truncate_huge_page() is renamed remove_huge_page().

Downstream of remove_inode_hugepages(), the routine
hugetlb_unreserve_pages() is also modified to take a range of pages.
hugetlb_unreserve_pages is modified to detect an error from region_del and
pass it back to the caller.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Mike Kravetz c672c7f29f mm/hugetlb: expose hugetlb fault mutex for use by fallocate
hugetlb page faults are currently synchronized by the table of mutexes
(htlb_fault_mutex_table).  fallocate code will need to synchronize with
the page fault code when it allocates or deletes pages.  Expose
interfaces so that fallocate operations can be synchronized with page
faults.  Minor name changes to be more consistent with other global
hugetlb symbols.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Mike Kravetz 5e9113731a mm/hugetlb: add cache of descriptors to resv_map for region_add
hugetlbfs is used today by applications that want a high degree of
control over huge page usage.  Often, large hugetlbfs files are used to
map a large number huge pages into the application processes.  The
applications know when page ranges within these large files will no
longer be used, and ideally would like to release them back to the
subpool or global pools for other uses.  The fallocate() system call
provides an interface for preallocation and hole punching within files.
This patch set adds fallocate functionality to hugetlbfs.

fallocate hole punch will want to remove a specific range of pages.
When pages are removed, their associated entries in the region/reserve
map will also be removed.  This will break an assumption in the
region_chg/region_add calling sequence.  If a new region descriptor must
be allocated, it is done as part of the region_chg processing.  In this
way, region_add can not fail because it does not need to attempt an
allocation.

To prepare for fallocate hole punch, create a "cache" of descriptors
that can be used by region_add if necessary.  region_chg will ensure
there are sufficient entries in the cache.  It will be necessary to
track the number of in progress add operations to know a sufficient
number of descriptors reside in the cache.  A new routine region_abort
is added to adjust this in progress count when add operations are
aborted.  vma_abort_reservation is also added for callers creating
reservations with vma_needs_reservation/vma_commit_reservation.

[akpm@linux-foundation.org: fix typo in comment, use more cols]
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Vlastimil Babka bb14c2c75d mm: rename and move get/set_freepage_migratetype
The pair of get/set_freepage_migratetype() functions are used to cache
pageblock migratetype for a page put on a pcplist, so that it does not
have to be retrieved again when the page is put on a free list (e.g.
when pcplists become full).  Historically it was also assumed that the
value is accurate for pages on freelists (as the functions' names
unfortunately suggest), but that cannot be guaranteed without affecting
various allocator fast paths.  It is in fact not needed and all such
uses have been removed.

The last remaining (but pointless) usage related to pages of freelists
is in move_freepages(), which this patch removes.

To prevent further confusion, rename the functions to
get/set_pcppage_migratetype() and expand their description.  Since all
the users are now in mm/page_alloc.c, move the functions there from the
shared header.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Seungho Park <seungho1.park@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Michal Hocko e752eb6881 memcg: move memcg_proto_active from sock.h
The only user is sock_update_memcg which is living in memcontrol.c so it
doesn't make much sense to pollute sock.h by this inline helper.  Move it
to memcontrol.c and open code it into its only caller.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Michal Hocko 6421999489 memcg: get rid of extern for functions in memcontrol.h
Most of the exported functions in this header are not marked extern so
change the rest to follow the same style.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Michal Hocko fabc3fdde0 memcg: get rid of mem_cgroup_root_css for !CONFIG_MEMCG
The only user is cgwb_bdi_init and that one depends on
CONFIG_CGROUP_WRITEBACK which in turn depends on CONFIG_MEMCG so it
doesn't make much sense to definte an empty stub for !CONFIG_MEMCG.
Moreover ERR_PTR(-EINVAL) is ugly and would lead to runtime crashes if
used in unguarded code paths.  Better fail during compilation.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Michal Hocko 33398cf2f3 memcg: export struct mem_cgroup
mem_cgroup structure is defined in mm/memcontrol.c currently which means
that the code outside of this file has to use external API even for
trivial access stuff.

This patch exports mm_struct with its dependencies and makes some of the
exported functions inlines.  This even helps to reduce the code size a bit
(make defconfig + CONFIG_MEMCG=y)

  text		data    bss     dec     	 hex 	filename
  12355346        1823792 1089536 15268674         e8fb42 vmlinux.before
  12354970        1823792 1089536 15268298         e8f9ca vmlinux.after

This is not much (370B) but better than nothing.

We also save a function call in some hot paths like callers of
mem_cgroup_count_vm_event which is used for accounting.

The patch doesn't introduce any functional changes.

[vdavykov@parallels.com: inline memcg_kmem_is_active]
[vdavykov@parallels.com: do not expose type outside of CONFIG_MEMCG]
[akpm@linux-foundation.org: memcontrol.h needs eventfd.h for eventfd_ctx]
[akpm@linux-foundation.org: export mem_cgroup_from_task() to modules]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
David Rientjes 8989e4c7d4 mm, oom: add description of struct oom_control
Describe the purpose of struct oom_control and what each member does.

Also make gfp_mask and order const since they are never manipulated or
passed to functions that discard the qualifier.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
David Rientjes 54e9e29132 mm, oom: pass an oom order of -1 when triggered by sysrq
The force_kill member of struct oom_control isn't needed if an order of -1
is used instead.  This is the same as order == -1 in struct
compact_control which requires full memory compaction.

This patch introduces no functional change.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
David Rientjes 6e0fc46dc2 mm, oom: organize oom context into struct
There are essential elements to an oom context that are passed around to
multiple functions.

Organize these elements into a new struct, struct oom_control, that
specifies the context for an oom condition.

This patch introduces no functional change.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
David Rientjes 28c015d075 mm: improve __GFP_NORETRY comment based on implementation
Explicitly state that __GFP_NORETRY will attempt direct reclaim and
memory compaction before returning NULL and that the oom killer is not
called in the current implementation of the page allocator.

[akpm@linux-foundation.org: s/has/have/]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Minchan Kim 8334b96221 mm: /proc/pid/smaps:: show proportional swap share of the mapping
We want to know per-process workingset size for smart memory management
on userland and we use swap(ex, zram) heavily to maximize memory
efficiency so workingset includes swap as well as RSS.

On such system, if there are lots of shared anonymous pages, it's really
hard to figure out exactly how many each process consumes memory(ie, rss
+ wap) if the system has lots of shared anonymous memory(e.g, android).

This patch introduces SwapPss field on /proc/<pid>/smaps so we can get
more exact workingset size per process.

Bongkyu tested it. Result is below.

1. 50M used swap
SwapTotal: 461976 kB
SwapFree: 411192 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
48236
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
141184

2. 240M used swap
SwapTotal: 461976 kB
SwapFree: 216808 kB

$ adb shell cat /proc/*/smaps | grep "SwapPss:" | awk '{sum += $2} END {print sum}';
230315
$ adb shell cat /proc/*/smaps | grep "Swap:" | awk '{sum += $2} END {print sum}';
1387744

[akpm@linux-foundation.org: simplify kunmap_atomic() call]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Bongkyu Kim <bongkyu.kim@lge.com>
Tested-by: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Vineet Gupta b5e3aa0a4d mm: remove put_page_unless_one()
It has no callers.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Kirill A. Shutemov d295e3415a dax: don't use set_huge_zero_page()
This is another place where DAX assumed that pgtable_t was a pointer.
Open code the important parts of set_huge_zero_page() in DAX and make
set_huge_zero_page() static again.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox 844f35db10 dax: add huge page fault support
This is the support code for DAX-enabled filesystems to allow them to
provide huge pages in response to faults.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox 5cad465d7f mm: add vmf_insert_pfn_pmd()
Similar to vm_insert_pfn(), but for PMDs rather than PTEs.  The 'vmf_'
prefix instead of 'vm_' prefix is intended to indicate that it returns a
VMF_ value rather than an errno (which would only have to be converted
into a VMF_ value anyway).

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox fc43704437 mm: export various functions for the benefit of DAX
To use the huge zero page in DAX, we need these functions exported.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox b96375f74a mm: add a pmd_fault handler
Allow non-anonymous VMAs to provide huge pages in response to a page fault.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox 4897c7655d thp: prepare for DAX huge pages
Add a vma_is_dax() helper macro to test whether the VMA is DAX, and use it
in zap_huge_pmd() and __split_huge_page_pmd().

[akpm@linux-foundation.org: fix build]
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Matthew Wilcox c94c2acf84 dax: move DAX-related functions to a new header
In order to handle the !CONFIG_TRANSPARENT_HUGEPAGES case, we need to
return VM_FAULT_FALLBACK from the inlined dax_pmd_fault(), which is
defined in linux/mm.h.  Given that we don't want to include <linux/mm.h>
in <linux/fs.h>, the easiest solution is to move the DAX-related
functions to a new header, <linux/dax.h>.  We could also have moved
VM_FAULT_* definitions to a new header, or a different header that isn't
quite such a boil-the-ocean header as <linux/mm.h>, but this felt like
the best option.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Kirill A. Shutemov e1b9996b85 thp: vma_adjust_trans_huge(): adjust file-backed VMA too
This series of patches adds support for using PMD page table entries to
map DAX files.  We expect NV-DIMMs to start showing up that are many
gigabytes in size and the memory consumption of 4kB PTEs will be
astronomical.

The patch series leverages much of the Transparant Huge Pages
infrastructure, going so far as to borrow one of Kirill's patches from
his THP page cache series.

This patch (of 10):

Since we're going to have huge pages in page cache, we need to call adjust
file-backed VMA, which potentially can contain huge pages.

For now we call it for all VMAs.

Probably later we will need to introduce a flag to indicate that the VMA
has huge pages.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Oleg Nesterov b533062854 mm: introduce vma_is_anonymous(vma) helper
special_mapping_fault() is absolutely broken.  It seems it was always
wrong, but this didn't matter until vdso/vvar started to use more than
one page.

And after this change vma_is_anonymous() becomes really trivial, it
simply checks vm_ops == NULL.  However, I do think the helper makes
sense.  There are a lot of ->vm_ops != NULL checks, the helper makes the
caller's code more understandable (self-documented) and this is more
grep-friendly.

This patch (of 3):

Preparation.  Add the new simple helper, vma_is_anonymous(vma), and change
handle_pte_fault() to use it.  It will have more users.

The name is not accurate, say a hpet_mmap()'ed vma is not anonymous.
Perhaps it should be named vma_has_fault() instead.  But it matches the
logic in mmap.c/memory.c (see next changes).  "True" just means that a
page fault will use do_anonymous_page().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
Linus Torvalds 12f03ee606 libnvdimm for 4.3:
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.  This facility is used by the pmem driver to
    enable pfn_to_page() operations on the page frames returned by DAX
    ('direct_access' in 'struct block_device_operations'). For now, the
    'memmap' allocation for these "device" pages comes from "System
    RAM".  Support for allocating the memmap from device memory will
    arrive in a later kernel.
 
 2/ Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt().  memremap() drops the __iomem annotation for these
    mappings to memory that do not have i/o side effects.  The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.  Completion of
    the conversion is targeted for v4.4.
 
 3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.
 
 4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.
 
 5/ Miscellaneous updates and fixes to libnvdimm including support
    for issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
 JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
 OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
 nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
 NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
 zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
 1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
 sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
 bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
 o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
 dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
 slsw6DkrWT60CRE42nbK
 =o57/
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This update has successfully completed a 0day-kbuild run and has
  appeared in a linux-next release.  The changes outside of the typical
  drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
  removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
  the introduction of ZONE_DEVICE + devm_memremap_pages().

  Summary:

   - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
     mechanism for adding device-driver-discovered memory regions to the
     kernel's direct map.

     This facility is used by the pmem driver to enable pfn_to_page()
     operations on the page frames returned by DAX ('direct_access' in
     'struct block_device_operations').

     For now, the 'memmap' allocation for these "device" pages comes
     from "System RAM".  Support for allocating the memmap from device
     memory will arrive in a later kernel.

   - Introduce memremap() to replace usages of ioremap_cache() and
     ioremap_wt().  memremap() drops the __iomem annotation for these
     mappings to memory that do not have i/o side effects.  The
     replacement of ioremap_cache() with memremap() is limited to the
     pmem driver to ease merging the api change in v4.3.

     Completion of the conversion is targeted for v4.4.

   - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
     driver, update the VFS DAX implementation and PMEM api to provide
     persistence guarantees for kernel operations on a DAX mapping.

   - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
     cacheable to improve performance.

   - Miscellaneous updates and fixes to libnvdimm including support for
     issuing "address range scrub" commands, clarifying the optimal
     'sector size' of pmem devices, a clarification of the usage of the
     ACPI '_STA' (status) property for DIMM devices, and other minor
     fixes"

* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
  libnvdimm, pmem: direct map legacy pmem by default
  libnvdimm, pmem: 'struct page' for pmem
  libnvdimm, pfn: 'struct page' provider infrastructure
  x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
  add devm_memremap_pages
  mm: ZONE_DEVICE for "device memory"
  mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
  dax: drop size parameter to ->direct_access()
  nd_blk: change aperture mapping from WC to WB
  nvdimm: change to use generic kvfree()
  pmem, dax: have direct_access use __pmem annotation
  dax: update I/O path to do proper PMEM flushing
  pmem: add copy_from_iter_pmem() and clear_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: move x86 PMEM API to new pmem.h header
  libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
  pmem: switch to devm_ allocations
  devres: add devm_memremap
  libnvdimm, btt: write and validate parent_uuid
  ...
2015-09-08 14:35:59 -07:00
Linus Torvalds dab3c3cc4f Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull core kbuild updates from Michal Marek:
 - modpost portability fix
 - linker script fix
 - genksyms segfault fix
 - fixdep cleanup
 - fix for clang detection

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: Fix clang detection
  kbuild: fixdep: drop meaningless hash table initialization
  kbuild: fixdep: optimize code slightly
  genksyms: Regenerate parser
  genksyms: Duplicate function pointer type definitions segfault
  kbuild: Fix .text.unlikely placement
  Avoid conflict with host definitions when cross-compiling
2015-09-08 14:12:19 -07:00
Linus Torvalds 59a47fff02 Mostly this is just clean ups and micro optimizations.
The changes with more meat are:
 
  o Allowing the trace event filters to filter on CPU number and process ids
 
  o Two new markers for trace output latency were added
     (10 and 100 msec latencies)
 
  o Have tracing_thresh filter function profiling time
 
 I also worked on modifying the ring buffer code for some future
 work, and moved the adding of the timestamp around. One of my changes
 caused a regression, and since other changes were built on top of it
 and already tested, I had to operate a revert of that change. Instead
 of rebasing, this change set has the code that caused a regression
 as well as the code to revert that change without touching the other
 changes that were made on top of it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV6aZEAAoJEEjnJuOKh9ldrR4H/A1RcQf1prLLoUibPP4w3lat
 dmQcdpS1NY+cqyiKuKPAOkFDGQL7qWzRqZ8whcPSJIsHq57ufqNSLf+0bbQYPzg9
 g3CgGL7OApmGi5ulj0sNxhadvc9TFm/SAN0nVJlNuUWdm8e1UWHLsrJZaMfopu2r
 RDEtkOhg619mhDL4rktNdS6rk0B92Fhu2o2PwLZPVlUl1NNEt4WJU+ejitXUVO1A
 Nb70/rTGGJKtyHbW+74on4LnEN5Uu0Viu6rMwGfYyIgRmC2otdBDvE4xfKMiTUKr
 SzBjzrhIoMIRn4Vl0vElfulkpYaw7pcC2BdpZ4d9VpIOiLSlZs0x/TgCtpFEv5M=
 =baZ3
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing update from Steven Rostedt:
 "Mostly this is just clean ups and micro optimizations.

  The changes with more meat are:

   - Allowing the trace event filters to filter on CPU number and
     process ids

   - Two new markers for trace output latency were added (10 and 100
     msec latencies)

   - Have tracing_thresh filter function profiling time

  I also worked on modifying the ring buffer code for some future work,
  and moved the adding of the timestamp around.  One of my changes
  caused a regression, and since other changes were built on top of it
  and already tested, I had to operate a revert of that change.  Instead
  of rebasing, this change set has the code that caused a regression as
  well as the code to revert that change without touching the other
  changes that were made on top of it"

* tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Revert "ring-buffer: Get timestamp after event is allocated"
  tracing: Don't make assumptions about length of string on task rename
  tracing: Allow triggers to filter for CPU ids and process names
  ftrace: Format MCOUNT_ADDR address as type unsigned long
  tracing: Introduce two additional marks for delay
  ftrace: Fix function_graph duration spacing with 7-digits
  ftrace: add tracing_thresh to function profile
  tracing: Clean up stack tracing and fix fentry updates
  ring-buffer: Reorganize function locations
  ring-buffer: Make sure event has enough room for extend and padding
  ring-buffer: Get timestamp after event is allocated
  ring-buffer: Move the adding of the extended timestamp out of line
  ring-buffer: Add event descriptor to simplify passing data
  ftrace: correct the counter increment for trace_buffer data
  tracing: Fix for non-continuous cpu ids
  tracing: Prefer kcalloc over kzalloc with multiply
2015-09-08 14:04:14 -07:00
Linus Torvalds 425afcff13 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit
Pull audit update from Paul Moore:
 "This is one of the larger audit patchsets in recent history,
  consisting of eight patches and almost 400 lines of changes.

  The bulk of the patchset is the new "audit by executable"
  functionality which allows admins to set an audit watch based on the
  executable on disk.  Prior to this, admins could only track an
  application by PID, which has some obvious limitations.

  Beyond the new functionality we also have some refcnt fixes and a few
  minor cleanups"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
  fixup: audit: implement audit by executable
  audit: implement audit by executable
  audit: clean simple fsnotify implementation
  audit: use macros for unset inode and device values
  audit: make audit_del_rule() more robust
  audit: fix uninitialized variable in audit_add_rule()
  audit: eliminate unnecessary extra layer of watch parent references
  audit: eliminate unnecessary extra layer of watch references
2015-09-08 13:34:59 -07:00
Yan, Zheng 8b9558aab8 libceph: use keepalive2 to verify the mon session is alive
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-09-08 23:14:30 +03:00
Linus Torvalds b793c005ce Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Highlights:

   - PKCS#7 support added to support signed kexec, also utilized for
     module signing.  See comments in 3f1e1bea.

     ** NOTE: this requires linking against the OpenSSL library, which
        must be installed, e.g.  the openssl-devel on Fedora **

   - Smack
      - add IPv6 host labeling; ignore labels on kernel threads
      - support smack labeling mounts which use binary mount data

   - SELinux:
      - add ioctl whitelisting (see
        http://kernsec.org/files/lss2015/vanderstoep.pdf)
      - fix mprotect PROT_EXEC regression caused by mm change

   - Seccomp:
      - add ptrace options for suspend/resume"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (57 commits)
  PKCS#7: Add OIDs for sha224, sha284 and sha512 hash algos and use them
  Documentation/Changes: Now need OpenSSL devel packages for module signing
  scripts: add extract-cert and sign-file to .gitignore
  modsign: Handle signing key in source tree
  modsign: Use if_changed rule for extracting cert from module signing key
  Move certificate handling to its own directory
  sign-file: Fix warning about BIO_reset() return value
  PKCS#7: Add MODULE_LICENSE() to test module
  Smack - Fix build error with bringup unconfigured
  sign-file: Document dependency on OpenSSL devel libraries
  PKCS#7: Appropriately restrict authenticated attributes and content type
  KEYS: Add a name for PKEY_ID_PKCS7
  PKCS#7: Improve and export the X.509 ASN.1 time object decoder
  modsign: Use extract-cert to process CONFIG_SYSTEM_TRUSTED_KEYS
  extract-cert: Cope with multiple X.509 certificates in a single file
  sign-file: Generate CMS message as signature instead of PKCS#7
  PKCS#7: Support CMS messages also [RFC5652]
  X.509: Change recorded SKID & AKID to not include Subject or Issuer
  PKCS#7: Check content type and versions
  MAINTAINERS: The keyrings mailing list has moved
  ...
2015-09-08 12:41:25 -07:00
Linus Torvalds 6f0a2fc1fe Merge branch 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull NMI backtrace update from Russell King:
 "These changes convert the x86 NMI handling to be a library
  implementation which other architectures can make use of.  Thomas
  Gleixner has reviewed and tested these changes, and wishes me to send
  these rather than taking them through the tip tree.

  The final patch in the set adds an initial implementation using this
  infrastructure to ARM, even though it doesn't send the IPI at "NMI"
  level.  Patches are in progress to add the ARM equivalent of NMI, but
  we still need the IRQ-level fallback for systems where the "NMI" isn't
  available due to secure firmware denying access to it"

* 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: add basic support for on-demand backtrace of other CPUs
  nmi: x86: convert to generic nmi handler
  nmi: create generic NMI backtrace implementation
2015-09-08 12:28:10 -07:00
Linus Torvalds 752240e74d xen: features and fixes for 4.3-rc0
- Convert xen-blkfront to the multiqueue API
 - [arm] Support binding event channels to different VCPUs.
 - [x86] Support > 512 GiB in a PV guests (off by default as such a
   guest cannot be migrated with the current toolstack).
 - [x86] PMU support for PV dom0 (limited support for using perf with
   Xen and other guests).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV7wIdAAoJEFxbo/MsZsTR0hEH/04HTKLKGnSJpZ5WbMPxqZxE
 UqGlvhvVWNAmFocZmbPcEi9T1qtcFrX5pM55JQr6UmAp3ovYsT2q1Q1kKaOaawks
 pSfc/YEH3oQW5VUQ9Lm9Ru5Z8Btox0WrzRREO92OF36UOgUOBOLkGsUfOwDinNIM
 lSk2djbYwDYAsoeC3PHB32wwMI//Lz6B/9ZVXcyL6ULynt1ULdspETjGnptRPZa7
 JTB5L4/soioKOn18HDwwOhKmvaFUPQv9Odnv7dc85XwZreajhM/KMu3qFbMDaF/d
 WVB1NMeCBdQYgjOrUjrmpyr5uTMySiQEG54cplrEKinfeZgKlEyjKvjcAfJfiac=
 =Ktjl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
 "Xen features and fixes for 4.3:

   - Convert xen-blkfront to the multiqueue API
   - [arm] Support binding event channels to different VCPUs.
   - [x86] Support > 512 GiB in a PV guests (off by default as such a
     guest cannot be migrated with the current toolstack).
   - [x86] PMU support for PV dom0 (limited support for using perf with
     Xen and other guests)"

* tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits)
  xen: switch extra memory accounting to use pfns
  xen: limit memory to architectural maximum
  xen: avoid another early crash of memory limited dom0
  xen: avoid early crash of memory limited dom0
  arm/xen: Remove helpers which are PV specific
  xen/x86: Don't try to set PCE bit in CR4
  xen/PMU: PMU emulation code
  xen/PMU: Intercept PMU-related MSR and APIC accesses
  xen/PMU: Describe vendor-specific PMU registers
  xen/PMU: Initialization code for Xen PMU
  xen/PMU: Sysfs interface for setting Xen PMU mode
  xen: xensyms support
  xen: remove no longer needed p2m.h
  xen: allow more than 512 GB of RAM for 64 bit pv-domains
  xen: move p2m list if conflicting with e820 map
  xen: add explicit memblock_reserve() calls for special pages
  mm: provide early_memremap_ro to establish read-only mapping
  xen: check for initrd conflicting with e820 map
  xen: check pre-allocated page tables for conflict with memory map
  xen: check for kernel memory conflicting with memory layout
  ...
2015-09-08 11:46:48 -07:00
Linus Torvalds b8cb642af9 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more irq updates from Thomas Gleixner:
 "The second part of irq related updates:

   - Provide EOImode for GIC[V3] irq chips, which is a prerequisite for
     direct interrupt handling in [KVM] guests"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/GIC: Fix EOImode setting for non-DT/ACPI systems
  irqchip/GIC: Don't deactivate interrupts forwarded to a guest
  irqchip/GIC: Convert to EOImode == 1
  irqchip/GICv3: Don't deactivate interrupts forwarded to a guest
  irqchip/GICv3: Convert to EOImode == 1
2015-09-08 11:36:56 -07:00
Julien Grall a13d7201d7 xen/privcmd: Further s/MFN/GFN/ clean-up
The privcmd code is mixing the usage of GFN and MFN within the same
functions which make the code difficult to understand when you only work
with auto-translated guests.

The privcmd driver is only dealing with GFN so replace all the mention
of MFN into GFN.

The ioctl structure used to map foreign change has been left unchanged
given that the userspace is using it. Nonetheless, add a comment to
explain the expected value within the "mfn" field.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08 18:03:54 +01:00
Julien Grall 0df4f266b3 xen: Use correctly the Xen memory terminologies
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This resulted in some misimplementation of helpers on ARM and
confused developers about the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.

For clarity and avoid new confusion, replace any reference to mfn with
gfn in any helpers used by PV drivers. The x86 code will still keep some
reference of pfn_to_mfn which may be used by all kind of guests
No changes as been made in the hypercall field, even
though they may be invalid, in order to keep the same as the defintion
in xen repo.

Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name to close to the KVM function gfn_to_page.

Take also the opportunity to simplify simple construction such
as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up
will come in follow-up patches.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08 18:03:49 +01:00
Juergen Gross 626d750866 xen: switch extra memory accounting to use pfns
Instead of using physical addresses for accounting of extra memory
areas available for ballooning switch to pfns as this is much less
error prone regarding partial pages.

Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08 16:28:06 +01:00
Wanpeng Li 3dfe6a5073 kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF
Fixes compilation with ppc64_defconfig.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-08 11:16:40 +02:00
Jonathan Corbet edcd591c77 locking/static_keys: Fix a silly typo
Commit:

  412758cb26 ("jump label, locking/static_keys: Update docs")

introduced a typo that might as well get fixed.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150907131803.54c027e1@lwn.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-08 09:02:17 +02:00
Linus Torvalds 4e4adb2f46 NFS client updates for Linux 4.3
Highlights include:
 
 Stable patches:
 - Fix atomicity of pNFS commit list updates
 - Fix NFSv4 handling of open(O_CREAT|O_EXCL|O_RDONLY)
 - nfs_set_pgio_error sometimes misses errors
 - Fix a thinko in xs_connect()
 - Fix borkage in _same_data_server_addrs_locked()
 - Fix a NULL pointer dereference of migration recovery ops for v4.2 client
 - Don't let the ctime override attribute barriers.
 - Revert "NFSv4: Remove incorrect check in can_open_delegated()"
 - Ensure flexfiles pNFS driver updates the inode after write finishes
 - flexfiles must not pollute the attribute cache with attrbutes from the DS
 - Fix a protocol error in layoutreturn
 - Fix a protocol issue with NFSv4.1 CLOSE stateids
 
 Bugfixes + cleanups
 - pNFS blocks bugfixes from Christoph
 - Various cleanups from Anna
 - More fixes for delegation corner cases
 - Don't fsync twice for O_SYNC/IS_SYNC files
 - Fix pNFS and flexfiles layoutstats bugs
 - pnfs/flexfiles: avoid duplicate tracking of mirror data
 - pnfs: Fix layoutget/layoutreturn/return-on-close serialisation issues.
 - pnfs/flexfiles: error handling retries a layoutget before fallback to MDS
 
 Features:
 - Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from Kinglong
 - More RDMA client transport improvements from Chuck
 - Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr() verbs
   from the SUNRPC, Lustre and core infiniband tree.
 - Optimise away the close-to-open getattr if there is no cached data
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV7chgAAoJEGcL54qWCgDyqJQP/3kto9VXnXcatC382jF9Pfj5
 F55XeSnviOXH7CyiKA4nSBhnxg/sLuWOTpbkVI/4Y+VyWhLby9h+mtcKURHOlBnj
 d5BFoPwaBVDnUiKlHFQDkRjIyxjj2Sb6/uEb2V/u3v+3znR5AZZ4lzFx4cD85oaz
 mcru7yGiSxaQCIH6lHExcCEKXaDP5YdvS9YFsyQfv2976JSaQHM9ZG04E0v6MzTo
 E5wwC4CLMKmhuX9kmQMj85jzs1ASAKZ3N7b4cApTIo6F8DCDH0vKQphq/nEQC497
 ECjEs5/fpxtNJUpSBu0gT7G4LCiW3PzE7pHa+8bhbaAn9OzxIR5+qWujKsfGYQhO
 Oomp3K9zO6omshAc5w4MkknPpbImjoZjGAj/q/6DbtrDpnD7DzOTirwYY2yX0CA8
 qcL81uJUb8+j4jJj4RTO+lTUBItrM1XTqTSd/3eSMr5DDRVZj+ERZxh17TaxRBZL
 YrbrLHxCHrcbdSbPlovyvY+BwjJUUFJRcOxGQXLmNYR9u92fF59rb53bzVyzcRRO
 wBozzrNRCFL+fPgfNPLEapIb6VtExdM3rl2HYsJGckHj4DPQdnoB3ytIT9iEFZEN
 +/rE14XEZte7kuH3OP4el2UsP/hVsm7A49mtwrkdbd7rxMWD6XfQUp8DAggWUEtI
 1H6T7RC1Y6wsu0X1fnVz
 =knJA
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable patches:
   - Fix atomicity of pNFS commit list updates
   - Fix NFSv4 handling of open(O_CREAT|O_EXCL|O_RDONLY)
   - nfs_set_pgio_error sometimes misses errors
   - Fix a thinko in xs_connect()
   - Fix borkage in _same_data_server_addrs_locked()
   - Fix a NULL pointer dereference of migration recovery ops for v4.2
     client
   - Don't let the ctime override attribute barriers.
   - Revert "NFSv4: Remove incorrect check in can_open_delegated()"
   - Ensure flexfiles pNFS driver updates the inode after write finishes
   - flexfiles must not pollute the attribute cache with attrbutes from
     the DS
   - Fix a protocol error in layoutreturn
   - Fix a protocol issue with NFSv4.1 CLOSE stateids

  Bugfixes + cleanups
   - pNFS blocks bugfixes from Christoph
   - Various cleanups from Anna
   - More fixes for delegation corner cases
   - Don't fsync twice for O_SYNC/IS_SYNC files
   - Fix pNFS and flexfiles layoutstats bugs
   - pnfs/flexfiles: avoid duplicate tracking of mirror data
   - pnfs: Fix layoutget/layoutreturn/return-on-close serialisation
     issues
   - pnfs/flexfiles: error handling retries a layoutget before fallback
     to MDS

  Features:
   - Full support for the OPEN NFS4_CREATE_EXCLUSIVE4_1 mode from
     Kinglong
   - More RDMA client transport improvements from Chuck
   - Removal of the deprecated ib_reg_phys_mr() and ib_rereg_phys_mr()
     verbs from the SUNRPC, Lustre and core infiniband tree.
   - Optimise away the close-to-open getattr if there is no cached data"

* tag 'nfs-for-4.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (108 commits)
  NFSv4: Respect the server imposed limit on how many changes we may cache
  NFSv4: Express delegation limit in units of pages
  Revert "NFS: Make close(2) asynchronous when closing NFS O_DIRECT files"
  NFS: Optimise away the close-to-open getattr if there is no cached data
  NFSv4.1/flexfiles: Clean up ff_layout_write_done_cb/ff_layout_commit_done_cb
  NFSv4.1/flexfiles: Mark the layout for return in ff_layout_io_track_ds_error()
  nfs: Remove unneeded checking of the return value from scnprintf
  nfs: Fix truncated client owner id without proto type
  NFSv4.1/flexfiles: Mark layout for return if the mirrors are invalid
  NFSv4.1/flexfiles: RW layouts are valid only if all mirrors are valid
  NFSv4.1/flexfiles: Fix incorrect usage of pnfs_generic_mark_devid_invalid()
  NFSv4.1/flexfiles: Fix freeing of mirrors
  NFSv4.1/pNFS: Don't request a minimal read layout beyond the end of file
  NFSv4.1/pnfs: Handle LAYOUTGET return values correctly
  NFSv4.1/pnfs: Don't ask for a read layout for an empty file.
  NFSv4.1: Fix a protocol issue with CLOSE stateids
  NFSv4.1/flexfiles: Don't mark the entire deviceid as bad for file errors
  SUNRPC: Prevent SYN+SYNACK+RST storms
  SUNRPC: xs_reset_transport must mark the connection as disconnected
  NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payload
  ...
2015-09-07 14:02:24 -07:00
Allen Hubbe 86663c9186 NTB: Fix documentation for ntb_peer_db_clear.
The documentation should say "peer" not "local" when referring to the
peer doorbell register.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:09 -04:00
Allen Hubbe a7c2323748 NTB: Fix documentation for ntb_link_is_up
There was a copy and paste error in the documentation for
ntb_link_is_up.  The long description was mistakenly copied from
ntb_link_set_trans.

This adds the appropriate long description for ntb_link_is_up.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:09 -04:00
Dave Jiang e74bfeedad NTB: Add flow control to the ntb_netdev
Right now if we push the NTB really hard, we start dropping packets due
to not able to process the packets fast enough. We need to st:qop the
upper layer from flooding us when that happens.

A timer is necessary in order to restart the queue once the resource has
been processed on the receive side. Due to the way NTB is setup, the
resources on the tx side are tied to the processing of the rx side and
there's no async way to know when the rx side has released those
resources.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Trond Myklebust 7d160a6c46 NFSv4: Express delegation limit in units of pages
Since we're tracking modifications to the page cache on a per-page
basis, it makes sense to express the limit to how much we may cache
in units of pages.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-07 12:36:13 -04:00
David S. Miller 080fff50a3 For the first round of fixes, we have this:
* fix for the sizeof() pointer type issue
  * a fix for regulatory getting into a restore loop
  * a fix for rfkill global 'all' state, it needs to be stored
    everywhere to apply correctly to new rfkill instances
  * properly refuse CQM RSSI when it cannot actually be used
  * protect HT TDLS traffic properly in non-HT networks
  * don't incorrectly advertise 80 MHz support when not allowed
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJV6av+AAoJEDBSmw7B7bqrNYYP/1kZjdk9s4wPHpQndfSWK/sj
 yTjysh/VXZ5Jyzkcmg337AjEuuJ/SntDXjMXEKTxHHrt5fmHFBvmPErPUY7p7I/t
 MJqxVnYGypGW4LImyXYRJbejabmPrxtHC+jcWibw24yBCzJGCXF/MeKCotksnvIn
 IoJDvUHrVFZhfzFRepdozJi1Qy2Y04Q7zXywplW6XFYWxdfDWaWyno/xh+rM8oO7
 xSgpkuj4FDfimDBjzJOHwBH3xOlt/uHoB97G0TJbJ6k0R6a8aTz8eK5OWB4rFcn+
 BsuQfaWcrGg9fJsoqz4mYlWf/F4Qj/gU9Gr5sBUWWa0xQ7HctcgJ/t27XjUWP7oV
 VnQ9YQt/SukG3BVf8GuNqaSo7/7oprWHdDYRH0tnvImXCWsUlHM8gv8MrG0UOHGC
 4I4qgPLHon6Ui3lNHfivl8iSDFT+jfsMDMWiB6kKZRpUOdEicRrepMoDARVbTYck
 h6uY4Tx6OQ39evLSXjyrCWb0aOr2WoDCypxEkpE+fgDsWr8yTV0Ti0QZfVqatO3d
 d4GQb6U5Zkd14ymKtzULa5qjOw1TmnKpxVzSSdrONaPBVdUDsl4QH38gIppFjGsV
 61Gn3PXRRgy+VAib/YyIYqilvdhZ2TdXaSVAEyKv6WjKavOFA/8KsDPmJsyKoCSn
 1dDOwaiqgfu2bK1PTU9A
 =xw8r
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2015-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
For the first round of fixes, we have this:
 * fix for the sizeof() pointer type issue
 * a fix for regulatory getting into a restore loop
 * a fix for rfkill global 'all' state, it needs to be stored
   everywhere to apply correctly to new rfkill instances
 * properly refuse CQM RSSI when it cannot actually be used
 * protect HT TDLS traffic properly in non-HT networks
 * don't incorrectly advertise 80 MHz support when not allowed
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-06 19:49:55 -07:00
Wanpeng Li 2cbd78244f KVM: trace kvm_halt_poll_ns grow/shrink
Tracepoint for dynamic halt_pool_ns, fired on every potential change.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-06 16:33:14 +02:00
Wanpeng Li 19020f8ab8 KVM: make halt_poll_ns per-vCPU
Change halt_poll_ns into per-VCPU variable, seeded from module parameter,
to allow greater flexibility.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-06 16:27:10 +02:00
David S. Miller 53cfd053e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Conflicts:
	include/net/netfilter/nf_conntrack.h

The conflict was an overlap between changing the type of the zone
argument to nf_ct_tmpl_alloc() whilst exporting nf_ct_tmpl_free.

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net, they are:

1) Oneliner to restore maps in nf_tables since we support addressing registers
   at 32 bits level.

2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
   oneliner from Bernhard Thaler.

3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
   KASan utility, patch from Jozsef Kadlecsik.

4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
   unnamed unions, patch from Elad Raz.

5) Add a workaround to address inconsistent endianess in the res_id field of
   nfnetlink batch messages, reported by Florian Westphal.

6) Fix error paths of CT/synproxy since the conntrack template was moved to use
   kmalloc, patch from Daniel Borkmann.

All of them look good to me to reach 4.2, I can route this to -stable myself
too, just let me know what you prefer.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-05 21:57:42 -07:00
Linus Torvalds 7d9071a095 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "In this one:

   - d_move fixes (Eric Biederman)

   - UFS fixes (me; locking is mostly sane now, a bunch of bugs in error
     handling ought to be fixed)

   - switch of sb_writers to percpu rwsem (Oleg Nesterov)

   - superblock scalability (Josef Bacik and Dave Chinner)

   - swapon(2) race fix (Hugh Dickins)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits)
  vfs: Test for and handle paths that are unreachable from their mnt_root
  dcache: Reduce the scope of i_lock in d_splice_alias
  dcache: Handle escaped paths in prepend_path
  mm: fix potential data race in SyS_swapon
  inode: don't softlockup when evicting inodes
  inode: rename i_wb_list to i_io_list
  sync: serialise per-superblock sync operations
  inode: convert inode_sb_list_lock to per-sb
  inode: add hlist_fake to avoid the inode hash lock in evict
  writeback: plug writeback at a high level
  change sb_writers to use percpu_rw_semaphore
  shift percpu_counter_destroy() into destroy_super_work()
  percpu-rwsem: kill CONFIG_PERCPU_RWSEM
  percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
  percpu-rwsem: introduce percpu_down_read_trylock()
  document rwsem_release() in sb_wait_write()
  fix the broken lockdep logic in __sb_start_write()
  introduce __sb_writers_{acquired,release}() helpers
  ufs_inode_get{frag,block}(): get rid of 'phys' argument
  ufs_getfrag_block(): tidy up a bit
  ...
2015-09-05 20:34:28 -07:00
Linus Torvalds 9cfcc658da media updates for v4.3-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6ifEAAoJEAhfPr2O5OEVn5kP/i2jM1tWcmV/ZEBKGAN0jpRk
 5Y/Q+rnXvOpIJSQC3dEkweoBymVMclSgSB/wFSWCZtp5MaB8KrH4/2uc3UvolF91
 7bqXt+fCUacMbDQyaabMCR83mz9tdOJLd5sf0ABqBgXGfwh5uXmBPaYBzmcYvKcW
 4D89MFUpaFDPARTs9rdpVyr0aPRU4GcN0R3snRO9Ly+cQnyV/RxPf9NqCgnI+yPq
 +NvA9ScUBcBt62piSIGR4egcAR8boxYC+0r57340S21/JVMvsHQ3ok9b1aT8/rtd
 Yl24FkcKrRV0ShN5S1RmW5DLH/HRGabuMjkiEz9xq52FGD2sQQda0At58dWivsa4
 XYdxS9UUfb9Z+qyeMdmCl1MUFRrV2G4H6VItP+GKyT3UZLEDcLl6hBg3SkyWxWB4
 CSO5WuRThiIB86OVcIaREftzqDy5HdvH3ZKRD7QrW0DItGVjQwV5j6gvwqO9OEXs
 99BnSohyKwUBonumE2ZtFGGhIwIomllrMSqg991bPH9+13bg/rPxUqntkPrVap/9
 cV3qKO8ZFrz5UInBnR1U83l60ZK7rV4G6AVMSMKpM9XVK9TDKryAUN9Mhj5XWRH8
 hbma89TQVdhdrITtt27uzj8F622cvZvxd1BqDBR8DjKVvtv/E2GPzJrAj7GHe3/o
 NgzP5fF6X2Si32GNb7J8
 =cIed
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:
 - new DVB frontend drivers: ascot2e, cxd2841er, horus3a, lnbh25
 - new HDMI capture driver: tc358743
 - new driver for NetUP DVB new boards (netup_unidvb)
 - IR support for DVBSky cards (smipcie-ir)
 - Coda driver has gain macroblock tiling support
 - Renesas R-Car gains JPEG codec driver
 - new DVB platform driver for STi boards: c8sectpfe
 - added documentation for the media core kABI to device-drivers DocBook
 - lots of driver fixups, cleanups and improvements

* tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (297 commits)
  [media] c8sectpfe: Remove select on undefined LIBELF_32
  [media] i2c: fix platform_no_drv_owner.cocci warnings
  [media] cx231xx: Use wake_up_interruptible() instead of wake_up_interruptible_nr()
  [media] tc358743: only queue subdev notifications if devnode is set
  [media] tc358743: add missing Kconfig dependency/select
  [media] c8sectpfe: Use %pad to print 'dma_addr_t'
  [media] DocBook media: Fix typo "the the" in xml files
  [media] tc358743: make reset gpio optional
  [media] tc358743: set direction of reset gpio using devm_gpiod_get
  [media] dvbdev: document most of the functions/data structs
  [media] dvb_frontend.h: document the struct dvb_frontend
  [media] dvb-frontend.h: document struct dtv_frontend_properties
  [media] dvb-frontend.h: document struct dvb_frontend_ops
  [media] dvb: Use DVBFE_ALGO_HW where applicable
  [media] dvb_frontend.h: document struct analog_demod_ops
  [media] dvb_frontend.h: Document struct dvb_tuner_ops
  [media] Docbook: Document struct analog_parameters
  [media] dvb_frontend.h: get rid of dvbfe_modcod
  [media] add documentation for struct dvb_tuner_info
  [media] dvb_frontend: document dvb_frontend_tune_settings
  ...
2015-09-05 18:21:14 -07:00
Linus Torvalds e3a98ac476 Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
 "Mainly we move from jiffy based timer to HRTIMER for finer control
  over polling.  Then a controller reduces its polling period from 10 to
  1ms"

* 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: arm_mhu: reduce txpoll_period from 10ms to 1 ms
  mailbox: switch to hrtimer for tx_complete polling
  mailbox: Drop owner assignment from platform_driver
2015-09-05 18:11:04 -07:00
Linus Torvalds 17447717a3 Nothing major, but:
- Add Jeff Layton as an nfsd co-maintainer: no change to
           existing practice, just an acknowledgement of the status quo.
         - Two patches ("nfsd: ensure that...") for a race overlooked by
           the state locking rewrite, causing a crash noticed by multiple
           users.
         - Lots of smaller bugfixes all over from Kinglong Mee.
         - From Jeff, some cleanup of server rpc code in preparation for
           possible shift of nfsd threads to workqueues.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6fbLAAoJECebzXlCjuG+qGkP/j2YnZynwqCa4uz1+FU7qfYI
 kZWNGFFQ7O7e1i9Wznp7BkSA020rvM5d1HPwZhtstURM3i52XWRtbppwKF2+IuEU
 tpNdPKb28BPCZO29Z8mQk9IS2sX5jmBiibXRqBk0VK7e43PXrIwg1LJJ9HOfOpLh
 b1MvxdEB7vqK+fAVIYyhlg0UDd5AHAkQ+vS8YuohRXbDcsdhhE4vmusLlUl5UKp8
 5Yunz+b+pXfXPYaKidmpar6U2KoRSTPP1uO3bNfN6URO1W1nchPadLs0DnsBKlhb
 U8II5RZEmc+YfiIMoeptkJHoNhWT6Zu7CNJR6B0USTKv4L6TmFQVpxptVutzYVwx
 sGJ65lvCiXXOPz8JJwvBty//HTmbyOiCm64/vMbhQRlSNLSmcmTXEpw/uT5Huaxx
 bX9lnznoVVCd3eRoXPwMdZTbg/uEKqREZsQWVoqA6gexYqeyp79kvGbttLoUJ27Z
 IjtNb9W6akxfPKrHMgan6j7dy866o6TdSfWRayHwUoswbNnVOnMYKHjApOtF0oev
 k2pdLuy9tjl2a9Ow9sSwHZDbNsXgJO76E0aYnSTBP/YvctlG7KoZ+E0oxa6DWTC+
 0dE+g1xhIuUtW5WRL4pfWWk1G7jnf16J91bKkn91VveDn666RncAbLBtePmpIcIu
 5Ah6KxztTVCW++i5pmHh
 =aecc
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Nothing major, but:

   - Add Jeff Layton as an nfsd co-maintainer: no change to existing
     practice, just an acknowledgement of the status quo.

   - Two patches ("nfsd: ensure that...") for a race overlooked by the
     state locking rewrite, causing a crash noticed by multiple users.

   - Lots of smaller bugfixes all over from Kinglong Mee.

   - From Jeff, some cleanup of server rpc code in preparation for
     possible shift of nfsd threads to workqueues"

* tag 'nfsd-4.3' of git://linux-nfs.org/~bfields/linux: (52 commits)
  nfsd: deal with DELEGRETURN racing with CB_RECALL
  nfsd: return CLID_INUSE for unexpected SETCLIENTID_CONFIRM case
  nfsd: ensure that delegation stateid hash references are only put once
  nfsd: ensure that the ol stateid hash reference is only put once
  net: sunrpc: fix tracepoint Warning: unknown op '->'
  nfsd: allow more than one laundry job to run at a time
  nfsd: don't WARN/backtrace for invalid container deployment.
  fs: fix fs/locks.c kernel-doc warning
  nfsd: Add Jeff Layton as co-maintainer
  NFSD: Return word2 bitmask if setting security label in OPEN/CREATE
  NFSD: Set the attributes used to store the verifier for EXCLUSIVE4_1
  nfsd: SUPPATTR_EXCLCREAT must be encoded before SECURITY_LABEL.
  nfsd: Fix an FS_LAYOUT_TYPES/LAYOUT_TYPES encode bug
  NFSD: Store parent's stat in a separate value
  nfsd: Fix two typos in comments
  lockd: NLM grace period shouldn't block NFSv4 opens
  nfsd: include linux/nfs4.h in export.h
  sunrpc: Switch to using hash list instead single list
  sunrpc/nfsd: Remove redundant code by exports seq_operations functions
  sunrpc: Store cache_detail in seq_file's private directly
  ...
2015-09-05 17:26:24 -07:00
Linus Torvalds 6c0f568e84 Merge branch 'akpm' (patches from Andrew)
Merge patch-bomb from Andrew Morton:

 - a few misc things

 - Andy's "ambient capabilities"

 - fs/nofity updates

 - the ocfs2 queue

 - kernel/watchdog.c updates and feature work.

 - some of MM.  Includes Andrea's userfaultfd feature.

[ Hadn't noticed that userfaultfd was 'default y' when applying the
  patches, so that got fixed in this merge instead.  We do _not_ mark
  new features that nobody uses yet 'default y'   - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
  mm/hugetlb.c: make vma_has_reserves() return bool
  mm/madvise.c: make madvise_behaviour_valid() return bool
  mm/memory.c: make tlb_next_batch() return bool
  mm/dmapool.c: change is_page_busy() return from int to bool
  mm: remove struct node_active_region
  mremap: simplify the "overlap" check in mremap_to()
  mremap: don't do uneccesary checks if new_len == old_len
  mremap: don't do mm_populate(new_addr) on failure
  mm: move ->mremap() from file_operations to vm_operations_struct
  mremap: don't leak new_vma if f_op->mremap() fails
  mm/hugetlb.c: make vma_shareable() return bool
  mm: make GUP handle pfn mapping unless FOLL_GET is requested
  mm: fix status code which move_pages() returns for zero page
  mm: memcontrol: bring back the VM_BUG_ON() in mem_cgroup_swapout()
  genalloc: add support of multiple gen_pools per device
  genalloc: add name arg to gen_pool_get() and devm_gen_pool_create()
  mm/memblock: WARN_ON when nid differs from overlap region
  Documentation/features/vm: add feature description and arch support status for batched TLB flush after unmap
  mm: defer flush of writable TLB entries
  mm: send one IPI per CPU to TLB flush all entries after unmapping pages
  ...
2015-09-05 14:27:38 -07:00
Sylvain Chouleur 3c217e51d8 rtc: cmos: century support
If century field is supported by the RTC CMOS device, then we should use
it and then do not consider years greater that 169 as an error.

For information, the year field of the rtc_time structure contains the
value to add to 1970 to obtain the current year.

This was a hack to be able to support years for 1970 to 2069.
This patch remains compatible with this implementation.

Signed-off-by: Sylvain Chouleur <sylvain.chouleur@intel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2015-09-05 13:19:09 +02:00
minkyung88.kim 4e6dab4233 mm: remove struct node_active_region
struct node_active_region is not used anymore.  Remove it.

Signed-off-by: minkyung88.kim <minkyung88.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Oleg Nesterov 5477e70a64 mm: move ->mremap() from file_operations to vm_operations_struct
vma->vm_ops->mremap() looks more natural and clean in move_vma(), and this
way ->mremap() can have more users.  Say, vdso.

While at it, s/aio_ring_remap/aio_ring_mremap/.

Note: this is the minimal change before ->mremap() finds another user in
file_operations; this method should have more arguments, and it can be
used to kill arch_remap().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Vladimir Zapolskiy c98c36355d genalloc: add support of multiple gen_pools per device
This change fills devm_gen_pool_create()/gen_pool_get() "name" argument
stub with contents and extends of_gen_pool_get() functionality on this
basis.

If there is no associated platform device with a device node passed to
of_gen_pool_get(), the function attempts to get a label property or device
node name (= repeats MTD OF partition standard) and seeks for a named
gen_pool registered by device of the parent device node.

The main idea of the change is to allow registration of independent
gen_pools under the same umbrella device, say "partitions" on "storage
device", the original functionality of one "partition" per "storage
device" is untouched.

[akpm@linux-foundation.org: fix constness in devres_find()]
[dan.carpenter@oracle.com: freeing const data pointers]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Vladimir Zapolskiy 7385817359 genalloc: add name arg to gen_pool_get() and devm_gen_pool_create()
This change modifies gen_pool_get() and devm_gen_pool_create() client
interfaces adding one more argument "name" of a gen_pool object.

Due to implementation gen_pool_get() is capable to retrieve only one
gen_pool associated with a device even if multiple gen_pools are created,
fortunately right at the moment it is sufficient for the clients, hence
provide NULL as a valid argument on both producer devm_gen_pool_create()
and consumer gen_pool_get() sides.

Because only one created gen_pool per device is addressable, explicitly
add a restriction to devm_gen_pool_create() to create only one gen_pool
per device, this implies two possible error codes returned by the
function, account it on client side (only misc/sram).  This completes
client side changes related to genalloc updates.

[akpm@linux-foundation.org: gen_pool_get() cleanup]
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Mel Gorman d950c9477d mm: defer flush of writable TLB entries
If a PTE is unmapped and it's dirty then it was writable recently.  Due to
deferred TLB flushing, it's best to assume a writable TLB cache entry
exists.  With that assumption, the TLB must be flushed before any IO can
start or the page is freed to avoid lost writes or data corruption.  This
patch defers flushing of potentially writable TLBs as long as possible.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Mel Gorman 72b252aed5 mm: send one IPI per CPU to TLB flush all entries after unmapping pages
An IPI is sent to flush remote TLBs when a page is unmapped that was
potentially accesssed by other CPUs.  There are many circumstances where
this happens but the obvious one is kswapd reclaiming pages belonging to a
running process as kswapd and the task are likely running on separate
CPUs.

On small machines, this is not a significant problem but as machine gets
larger with more cores and more memory, the cost of these IPIs can be
high.  This patch uses a simple structure that tracks CPUs that
potentially have TLB entries for pages being unmapped.  When the unmapping
is complete, the full TLB is flushed on the assumption that a refill cost
is lower than flushing individual entries.

Architectures wishing to do this must give the following guarantee.

        If a clean page is unmapped and not immediately flushed, the
        architecture must guarantee that a write to that linear address
        from a CPU with a cached TLB entry will trap a page fault.

This is essentially what the kernel already depends on but the window is
much larger with this patch applied and is worth highlighting.  The
architecture should consider whether the cost of the full TLB flush is
higher than sending an IPI to flush each individual entry.  An additional
architecture helper called flush_tlb_local is required.  It's a trivial
wrapper with some accounting in the x86 case.

The impact of this patch depends on the workload as measuring any benefit
requires both mapped pages co-located on the LRU and memory pressure.  The
case with the biggest impact is multiple processes reading mapped pages
taken from the vm-scalability test suite.  The test case uses NR_CPU
readers of mapped files that consume 10*RAM.

Linear mapped reader on a 4-node machine with 64G RAM and 48 CPUs

                                           4.2.0-rc1          4.2.0-rc1
                                             vanilla       flushfull-v7
Ops lru-file-mmap-read-elapsed      159.62 (  0.00%)   120.68 ( 24.40%)
Ops lru-file-mmap-read-time_range    30.59 (  0.00%)     2.80 ( 90.85%)
Ops lru-file-mmap-read-time_stddv     6.70 (  0.00%)     0.64 ( 90.38%)

           4.2.0-rc1    4.2.0-rc1
             vanilla flushfull-v7
User          581.00       611.43
System       5804.93      4111.76
Elapsed       161.03       122.12

This is showing that the readers completed 24.40% faster with 29% less
system CPU time.  From vmstats, it is known that the vanilla kernel was
interrupted roughly 900K times per second during the steady phase of the
test and the patched kernel was interrupts 180K times per second.

The impact is lower on a single socket machine.

                                           4.2.0-rc1          4.2.0-rc1
                                             vanilla       flushfull-v7
Ops lru-file-mmap-read-elapsed       25.33 (  0.00%)    20.38 ( 19.54%)
Ops lru-file-mmap-read-time_range     0.91 (  0.00%)     1.44 (-58.24%)
Ops lru-file-mmap-read-time_stddv     0.28 (  0.00%)     0.47 (-65.34%)

           4.2.0-rc1    4.2.0-rc1
             vanilla flushfull-v7
User           58.09        57.64
System        111.82        76.56
Elapsed        27.29        22.55

It's still a noticeable improvement with vmstat showing interrupts went
from roughly 500K per second to 45K per second.

The patch will have no impact on workloads with no memory pressure or have
relatively few mapped pages.  It will have an unpredictable impact on the
workload running on the CPU being flushed as it'll depend on how many TLB
entries need to be refilled and how long that takes.  Worst case, the TLB
will be completely cleared of active entries when the target PFNs were not
resident at all.

[sasha.levin@oracle.com: trace tlb flush after disabling preemption in try_to_unmap_flush]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Mel Gorman 5b74283ab2 x86, mm: trace when an IPI is about to be sent
When unmapping pages it is necessary to flush the TLB.  If that page was
accessed by another CPU then an IPI is used to flush the remote CPU.  That
is a lot of IPIs if kswapd is scanning and unmapping >100K pages per
second.

There already is a window between when a page is unmapped and when it is
TLB flushed.  This series increases the window so multiple pages can be
flushed using a single IPI.  This should be safe or the kernel is hosed
already.

Patch 1 simply made the rest of the series easier to write as ftrace
        could identify all the senders of TLB flush IPIS.

Patch 2 tracks what CPUs potentially map a PFN and then sends an IPI
        to flush the entire TLB.

Patch 3 tracks when there potentially are writable TLB entries that
        need to be batched differently

Patch 4 increases SWAP_CLUSTER_MAX to further batch flushes

The performance impact is documented in the changelogs but in the optimistic
case on a 4-socket machine the full series reduces interrupts from 900K
interrupts/second to 60K interrupts/second.

This patch (of 4):

It is easy to trace when an IPI is received to flush a TLB but harder to
detect what event sent it.  This patch makes it easy to identify the
source of IPIs being transmitted for TLB flushes on x86.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli c1a4de99fa userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation
This implements mcopy_atomic and mfill_zeropage that are the lowlevel
VM methods that are invoked respectively by the UFFDIO_COPY and
UFFDIO_ZEROPAGE userfaultfd commands.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 1f1c6f0759 userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 1380fca084 userfaultfd: activate syscall
This activates the userfaultfd syscall.

[sfr@canb.auug.org.au: activate syscall fix]
[akpm@linux-foundation.org: don't enable userfaultfd on powerpc]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli a9b85f9415 userfaultfd: change the read API to return a uffd_msg
I had requests to return the full address (not the page aligned one) to
userland.

It's not entirely clear how the page offset could be relevant because
userfaults aren't like SIGBUS that can sigjump to a different place and it
actually skip resolving the fault depending on a page offset.  There's
currently no real way to skip the fault especially because after a
UFFDIO_COPY|ZEROPAGE, the fault is optimized to be retried within the
kernel without having to return to userland first (not even self modifying
code replacing the .text that touched the faulting address would prevent
the fault to be repeated).  Userland cannot skip repeating the fault even
more so if the fault was triggered by a KVM secondary page fault or any
get_user_pages or any copy-user inside some syscall which will return to
kernel code.  The second time FAULT_FLAG_RETRY_NOWAIT won't be set leading
to a SIGBUS being raised because the userfault can't wait if it cannot
release the mmap_map first (and FAULT_FLAG_RETRY_NOWAIT is required for
that).

Still returning userland a proper structure during the read() on the uffd,
can allow to use the current UFFD_API for the future non-cooperative
extensions too and it looks cleaner as well.  Once we get additional
fields there's no point to return the fault address page aligned anymore
to reuse the bits below PAGE_SHIFT.

The only downside is that the read() syscall will read 32bytes instead of
8bytes but that's not going to be measurable overhead.

The total number of new events that can be extended or of new future bits
for already shipped events, is limited to 64 by the features field of the
uffdio_api structure.  If more will be needed a bump of UFFD_API will be
required.

[akpm@linux-foundation.org: use __packed]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Pavel Emelyanov 3f602d2724 userfaultfd: Rename uffd_api.bits into .features
This is (seems to be) the minimal thing that is required to unblock
standard uffd usage from the non-cooperative one.  Now more bits can be
added to the features field indicating e.g.  UFFD_FEATURE_FORK and others
needed for the latter use-case.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 19a809afe2 userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx
vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge
must be aware about so that we can merge vmas back like they were
originally before arming the userfaultfd on some memory range.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 16ba6f811d userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP
These two flags gets set in vma->vm_flags to tell the VM common code
if the userfaultfd is armed and in which mode (only tracking missing
faults, only tracking wrprotect faults or both). If neither flags is
set it means the userfaultfd is not armed on the vma.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 745f234be1 userfaultfd: add vm_userfaultfd_ctx to the vm_area_struct
This adds the vm_userfaultfd_ctx to the vm_area_struct.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 932b18e0ae userfaultfd: linux/userfaultfd_k.h
Kernel header defining the methods needed by the VM common code to
interact with the userfaultfd.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 1038628d80 userfaultfd: uAPI
Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrea Arcangeli 51360155ec userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key
userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead
of the current hardcoded 1 (that would wake just the first waitqueue in
the head list).

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Christoph Lameter 484748f0b6 slab: infrastructure for bulk object allocation and freeing
Add the basic infrastructure for alloc/free operations on pointer arrays.
It includes a generic function in the common slab code that is used in
this infrastructure patch to create the unoptimized functionality for slab
bulk operations.

Allocators can then provide optimized allocation functions for situations
in which large numbers of objects are needed.  These optimization may
avoid taking locks repeatedly and bypass metadata creation if all objects
in slab pages can be used to provide the objects required.

Allocators can extend the skeletons provided and add their own code to the
bulk alloc and free functions.  They can keep the generic allocation and
freeing and just fall back to those if optimizations would not work (like
for example when debugging is on).

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Ulrich Obergfell ec6a90661a watchdog: rename watchdog_suspend() and watchdog_resume()
Rename watchdog_suspend() to lockup_detector_suspend() and
watchdog_resume() to lockup_detector_resume() to avoid confusion with the
watchdog subsystem and to be consistent with the existing name
lockup_detector_init().

Also provide comment blocks to explain the watchdog_running and
watchdog_suspended variables and their relationship.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Ulrich Obergfell 999bbe49ea watchdog: use suspend/resume interface in fixup_ht_bug()
Remove watchdog_nmi_disable_all() and watchdog_nmi_enable_all() since
these functions are no longer needed.  If a subsystem has a need to
deactivate the watchdog temporarily, it should utilize the
watchdog_suspend() and watchdog_resume() functions.

[akpm@linux-foundation.org: fix build with CONFIG_LOCKUP_DETECTOR=m]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Ulrich Obergfell 8c073d27d7 watchdog: introduce watchdog_suspend() and watchdog_resume()
This interface can be utilized to deactivate the hard and soft lockup
detector temporarily.  Callers are expected to minimize the duration of
deactivation.  Multiple deactivations are allowed to occur in parallel but
should be rare in practice.

[akpm@linux-foundation.org: remove unneeded static initialization]
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Stephane Eranian <eranian@google.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Guenter Roeck aacfbe6a97 kernel/watchdog: move NMI function header declarations from watchdog.h to nmi.h
The kernel's NMI watchdog has nothing to do with the watchdog subsystem.
Its header declarations should be in linux/nmi.h, not linux/watchdog.h.

The code provided two sets of dummy functions if HARDLOCKUP_DETECTOR is
not configured, one in the include file and one in kernel/watchdog.c.
Remove the dummy functions from kernel/watchdog.c and use those from the
include file.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Frederic Weisbecker 230ec93909 smpboot: allow passing the cpumask on per-cpu thread registration
It makes the registration cheaper and simpler for the smpboot per-cpu
kthread users that don't need to always update the cpumask after threads
creation.

[sfr@canb.auug.org.au: fix for allow passing the cpumask on per-cpu thread registration]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Kees Cook a068acf2ee fs: create and use seq_show_option for escaping
Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g.  new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else.  This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
of "sudo" is something more sneaky:

  $ BASE="ovl"
  $ MNT="$BASE/mnt"
  $ LOW="$BASE/lower"
  $ UP="$BASE/upper"
  $ WORK="$BASE/work/ 0 0
  none /proc fuse.pwn user_id=1000"
  $ mkdir -p "$LOW" "$UP" "$WORK"
  $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
  $ cat /proc/mounts
  none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
  none /proc fuse.pwn user_id=1000 0 0
  $ fusermount -u /proc
  $ cat /proc/mounts
  cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed.  Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Jan Kara 4712e722f9 fsnotify: get rid of fsnotify_destroy_mark_locked()
fsnotify_destroy_mark_locked() is subtle to use because it temporarily
releases group->mark_mutex.  To avoid future problems with this
function, split it into two.

fsnotify_detach_mark() is the part that needs group->mark_mutex and
fsnotify_free_mark() is the part that must be called outside of
group->mark_mutex.  This way it's much clearer what's going on and we
also avoid some pointless acquisitions of group->mark_mutex.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Jan Kara 925d1132a0 fsnotify: remove mark->free_list
Free list is used when all marks on given inode / mount should be
destroyed when inode / mount is going away.  However we can free all of
the marks without using a special list with some care.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Jan Kara 1e39fc0183 fsnotify: document mark locking
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andy Lutomirski 746bf6d642 capabilities: add a securebit to disable PR_CAP_AMBIENT_RAISE
Per Andrew Morgan's request, add a securebit to allow admins to disable
PR_CAP_AMBIENT_RAISE.  This securebit will prevent processes from adding
capabilities to their ambient set.

For simplicity, this disables PR_CAP_AMBIENT_RAISE entirely rather than
just disabling setting previously cleared bits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andy Lutomirski 58319057b7 capabilities: ambient capabilities
Credit where credit is due: this idea comes from Christoph Lameter with
a lot of valuable input from Serge Hallyn.  This patch is heavily based
on Christoph's patch.

===== The status quo =====

On Linux, there are a number of capabilities defined by the kernel.  To
perform various privileged tasks, processes can wield capabilities that
they hold.

Each task has four capability masks: effective (pE), permitted (pP),
inheritable (pI), and a bounding set (X).  When the kernel checks for a
capability, it checks pE.  The other capability masks serve to modify
what capabilities can be in pE.

Any task can remove capabilities from pE, pP, or pI at any time.  If a
task has a capability in pP, it can add that capability to pE and/or pI.
If a task has CAP_SETPCAP, then it can add any capability to pI, and it
can remove capabilities from X.

Tasks are not the only things that can have capabilities; files can also
have capabilities.  A file can have no capabilty information at all [1].
If a file has capability information, then it has a permitted mask (fP)
and an inheritable mask (fI) as well as a single effective bit (fE) [2].
File capabilities modify the capabilities of tasks that execve(2) them.

A task that successfully calls execve has its capabilities modified for
the file ultimately being excecuted (i.e.  the binary itself if that
binary is ELF or for the interpreter if the binary is a script.) [3] In
the capability evolution rules, for each mask Z, pZ represents the old
value and pZ' represents the new value.  The rules are:

  pP' = (X & fP) | (pI & fI)
  pI' = pI
  pE' = (fE ? pP' : 0)
  X is unchanged

For setuid binaries, fP, fI, and fE are modified by a moderately
complicated set of rules that emulate POSIX behavior.  Similarly, if
euid == 0 or ruid == 0, then fP, fI, and fE are modified differently
(primary, fP and fI usually end up being the full set).  For nonroot
users executing binaries with neither setuid nor file caps, fI and fP
are empty and fE is false.

As an extra complication, if you execute a process as nonroot and fE is
set, then the "secure exec" rules are in effect: AT_SECURE gets set,
LD_PRELOAD doesn't work, etc.

This is rather messy.  We've learned that making any changes is
dangerous, though: if a new kernel version allows an unprivileged
program to change its security state in a way that persists cross
execution of a setuid program or a program with file caps, this
persistent state is surprisingly likely to allow setuid or file-capped
programs to be exploited for privilege escalation.

===== The problem =====

Capability inheritance is basically useless.

If you aren't root and you execute an ordinary binary, fI is zero, so
your capabilities have no effect whatsoever on pP'.  This means that you
can't usefully execute a helper process or a shell command with elevated
capabilities if you aren't root.

On current kernels, you can sort of work around this by setting fI to
the full set for most or all non-setuid executable files.  This causes
pP' = pI for nonroot, and inheritance works.  No one does this because
it's a PITA and it isn't even supported on most filesystems.

If you try this, you'll discover that every nonroot program ends up with
secure exec rules, breaking many things.

This is a problem that has bitten many people who have tried to use
capabilities for anything useful.

===== The proposed change =====

This patch adds a fifth capability mask called the ambient mask (pA).
pA does what most people expect pI to do.

pA obeys the invariant that no bit can ever be set in pA if it is not
set in both pP and pI.  Dropping a bit from pP or pI drops that bit from
pA.  This ensures that existing programs that try to drop capabilities
still do so, with a complication.  Because capability inheritance is so
broken, setting KEEPCAPS, using setresuid to switch to nonroot uids, and
then calling execve effectively drops capabilities.  Therefore,
setresuid from root to nonroot conditionally clears pA unless
SECBIT_NO_SETUID_FIXUP is set.  Processes that don't like this can
re-add bits to pA afterwards.

The capability evolution rules are changed:

  pA' = (file caps or setuid or setgid ? 0 : pA)
  pP' = (X & fP) | (pI & fI) | pA'
  pI' = pI
  pE' = (fE ? pP' : pA')
  X is unchanged

If you are nonroot but you have a capability, you can add it to pA.  If
you do so, your children get that capability in pA, pP, and pE.  For
example, you can set pA = CAP_NET_BIND_SERVICE, and your children can
automatically bind low-numbered ports.  Hallelujah!

Unprivileged users can create user namespaces, map themselves to a
nonzero uid, and create both privileged (relative to their namespace)
and unprivileged process trees.  This is currently more or less
impossible.  Hallelujah!

You cannot use pA to try to subvert a setuid, setgid, or file-capped
program: if you execute any such program, pA gets cleared and the
resulting evolution rules are unchanged by this patch.

Users with nonzero pA are unlikely to unintentionally leak that
capability.  If they run programs that try to drop privileges, dropping
privileges will still work.

It's worth noting that the degree of paranoia in this patch could
possibly be reduced without causing serious problems.  Specifically, if
we allowed pA to persist across executing non-pA-aware setuid binaries
and across setresuid, then, naively, the only capabilities that could
leak as a result would be the capabilities in pA, and any attacker
*already* has those capabilities.  This would make me nervous, though --
setuid binaries that tried to privilege-separate might fail to do so,
and putting CAP_DAC_READ_SEARCH or CAP_DAC_OVERRIDE into pA could have
unexpected side effects.  (Whether these unexpected side effects would
be exploitable is an open question.) I've therefore taken the more
paranoid route.  We can revisit this later.

An alternative would be to require PR_SET_NO_NEW_PRIVS before setting
ambient capabilities.  I think that this would be annoying and would
make granting otherwise unprivileged users minor ambient capabilities
(CAP_NET_BIND_SERVICE or CAP_NET_RAW for example) much less useful than
it is with this patch.

===== Footnotes =====

[1] Files that are missing the "security.capability" xattr or that have
unrecognized values for that xattr end up with has_cap set to false.
The code that does that appears to be complicated for no good reason.

[2] The libcap capability mask parsers and formatters are dangerously
misleading and the documentation is flat-out wrong.  fE is *not* a mask;
it's a single bit.  This has probably confused every single person who
has tried to use file capabilities.

[3] Linux very confusingly processes both the script and the interpreter
if applicable, for reasons that elude me.  The results from thinking
about a script's file capabilities and/or setuid bits are mostly
discarded.

Preliminary userspace code is here, but it needs updating:
https://git.kernel.org/cgit/linux/kernel/git/luto/util-linux-playground.git/commit/?h=cap_ambient&id=7f5afbd175d2

Here is a test program that can be used to verify the functionality
(from Christoph):

/*
 * Test program for the ambient capabilities. This program spawns a shell
 * that allows running processes with a defined set of capabilities.
 *
 * (C) 2015 Christoph Lameter <cl@linux.com>
 * Released under: GPL v3 or later.
 *
 *
 * Compile using:
 *
 *	gcc -o ambient_test ambient_test.o -lcap-ng
 *
 * This program must have the following capabilities to run properly:
 * Permissions for CAP_NET_RAW, CAP_NET_ADMIN, CAP_SYS_NICE
 *
 * A command to equip the binary with the right caps is:
 *
 *	setcap cap_net_raw,cap_net_admin,cap_sys_nice+p ambient_test
 *
 *
 * To get a shell with additional caps that can be inherited by other processes:
 *
 *	./ambient_test /bin/bash
 *
 *
 * Verifying that it works:
 *
 * From the bash spawed by ambient_test run
 *
 *	cat /proc/$$/status
 *
 * and have a look at the capabilities.
 */

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <cap-ng.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Definitions from the kernel header files. These are going to be removed
 * when the /usr/include files have these defined.
 */
#define PR_CAP_AMBIENT 47
#define PR_CAP_AMBIENT_IS_SET 1
#define PR_CAP_AMBIENT_RAISE 2
#define PR_CAP_AMBIENT_LOWER 3
#define PR_CAP_AMBIENT_CLEAR_ALL 4

static void set_ambient_cap(int cap)
{
	int rc;

	capng_get_caps_process();
	rc = capng_update(CAPNG_ADD, CAPNG_INHERITABLE, cap);
	if (rc) {
		printf("Cannot add inheritable cap\n");
		exit(2);
	}
	capng_apply(CAPNG_SELECT_CAPS);

	/* Note the two 0s at the end. Kernel checks for these */
	if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0, 0)) {
		perror("Cannot set cap");
		exit(1);
	}
}

int main(int argc, char **argv)
{
	int rc;

	set_ambient_cap(CAP_NET_RAW);
	set_ambient_cap(CAP_NET_ADMIN);
	set_ambient_cap(CAP_SYS_NICE);

	printf("Ambient_test forking shell\n");
	if (execv(argv[1], argv + 1))
		perror("Cannot exec");

	return 0;
}

Signed-off-by: Christoph Lameter <cl@linux.com> # Original author
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Aaron Jones <aaronmdjones@gmail.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>
Cc: Markku Savela <msa@moth.iki.fi>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Andrew Morton e9f069868d kernel/kthread.c:kthread_create_on_node(): clarify documentation
- Make it clear that the `node' arg refers to memory allocations only:
  kthread_create_on_node() does not pin the new thread to that node's
  CPUs.

- Encourage the use of NUMA_NO_NODE.

[nzimmer@sgi.com: use NUMA_NO_NODE in kthread_create() also]
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-04 16:54:41 -07:00
Linus Torvalds f377ea88b8 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is the main pull request for the drm for 4.3.  Nouveau is
  probably the biggest amount of changes in here, since it missed 4.2.
  Highlights below, along with the usual bunch of fixes.

  All stuff outside drm should have applicable acks.

  Highlights:

   - new drivers:
        freescale dcu kms driver

   - core:
        more atomic fixes
        disable some dri1 interfaces on kms drivers
        drop fb panic handling, this was just getting more broken, as more locking was required.
        new core fbdev Kconfig support - instead of each driver enable/disabling it
        struct_mutex cleanups

   - panel:
        more new panels
        cleanup Kconfig

   - i915:
        Skylake support enabled by default
        legacy modesetting using atomic infrastructure
        Skylake fixes
        GEN9 workarounds

   - amdgpu:
        Fiji support
        CGS support for amdgpu
        Initial GPU scheduler - off by default
        Lots of bug fixes and optimisations.

   - radeon:
        DP fixes
        misc fixes

   - amdkfd:
        Add Carrizo support for amdkfd using amdgpu.

   - nouveau:
        long pending cleanup to complete driver,
        fully bisectable which makes it larger,
        perfmon work
        more reclocking improvements
        maxwell displayport fixes

   - vmwgfx:
        new DX device support, supports OpenGL 3.3
        screen targets support

   - mgag200:
        G200eW support
        G200e new revision support

   - msm:
        dragonboard 410c support, msm8x94 support, msm8x74v1 support
        yuv format support
        dma plane support
        mdp5 rotation
        initial hdcp

   - sti:
        atomic support

   - exynos:
        lots of cleanups
        atomic modesetting/pageflipping support
        render node support

   - tegra:
        tegra210 support (dc, dsi, dp/hdmi)
        dpms with atomic modesetting support

   - atmel:
        support for 3 more atmel SoCs
        new input formats, PRIME support.

   - dwhdmi:
        preparing to add audio support

   - rockchip:
        yuv plane support"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1369 commits)
  drm/amdgpu: rename gmc_v8_0_init_compute_vmid
  drm/amdgpu: fix vce3 instance handling
  drm/amdgpu: remove ib test for the second VCE Ring
  drm/amdgpu: properly enable VM fault interrupts
  drm/amdgpu: fix warning in scheduler
  drm/amdgpu: fix buffer placement under memory pressure
  drm/amdgpu/cz: fix cz_dpm_update_low_memory_pstate logic
  drm/amdgpu: fix typo in dce11 watermark setup
  drm/amdgpu: fix typo in dce10 watermark setup
  drm/amdgpu: use top down allocation for non-CPU accessible vram
  drm/amdgpu: be explicit about cpu vram access for driver BOs (v2)
  drm/amdgpu: set MEC doorbell range for Fiji
  drm/amdgpu: implement burst NOP for SDMA
  drm/amdgpu: add insert_nop ring func and default implementation
  drm/amdgpu: add amdgpu_get_sdma_instance helper function
  drm/amdgpu: add AMDGPU_MAX_SDMA_INSTANCES
  drm/amdgpu: add burst_nop flag for sdma
  drm/amdgpu: add count field for the SDMA NOP packet v2
  drm/amdgpu: use PT for VM sync on unmap
  drm/amdgpu: make wait_event uninterruptible in push_job
  ...
2015-09-04 15:49:32 -07:00
Linus Torvalds 51e771c0d2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov:
 "Drivers, drivers, drivers...  No interesting input core changes this
  time"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (74 commits)
  Input: elan_i2c - use iap_version to get firmware information
  Input: max8997_haptic - fix module alias
  Input: elan_i2c - fix typos for validpage_count
  Input: psmouse - add small delay for IBM trackpoint pass-through mode
  Input: synaptics - fix handling of disabling gesture mode
  Input: elan_i2c - enable ELAN0100 acpi panels
  Input: gpio-keys - report error when disabling unsupported key
  Input: sur40 - fix error return code
  Input: sentelic - silence some underflow warnings
  Input: zhenhua - switch to using bitrev8()
  Input: cros_ec_keyb - replace KEYBOARD_CROS_EC dependency
  Input: cap11xx - add LED support
  Input: elants_i2c - fix for devm_gpiod_get API change
  Input: elan_i2c - enable asynchronous probing
  Input: elants_i2c - enable asynchronous probing
  Input: elants_i2c - wire up regulator support
  Input: do not emit unneeded EV_SYN when suspending
  Input: elants_i2c - disable idle mode before updating firmware
  MAINTAINERS: Add maintainer for atmel_mxt_ts
  Input: atmel_mxt_ts - remove warning on zero T44 count
  ...
2015-09-04 12:02:11 -07:00
Linus Torvalds abebcdfb64 sound updates for 4.3-rc1
There are little changes in core part, but lots of development are
 found in drivers, especially ASoC.  The diffstat shows regmap-
 related changes for a slight API additions / changes, and that's all.
 
 Looking at the code size statistics, the most significant addition
 is for Intel Skylake.  (Note that SKL support is still underway, the
 codec driver is missing.)  Also STI controller driver is a major
 addition as well as a few new codec drivers.
 
 In HD-audio side, there are fewer changes than the past.  The
 noticeable change is the support of ELD notification from i915
 graphics driver.  Thus this pull request carries a few changes in
 drm/i915.
 
 Other than that, USB-audio got a rewrite of runtime PM code.  It
 was initiated by lockdep warning, but resulted in a good cleanup in
 the end.
 
 Below are the highlights:
 
 Common:
 - Factoring out of AC'97 reset code from ASoC into the core helper
 - A few regmap API extensions (in case it's not pulled yet)
 
 ASoC:
 - New drivers for Cirrus CS4349, GTM601, InvenSense ICS43432, Realtek
   RT298 and ST STI controllers
 - Machine drivers for Rockchip systems with MAX98090 and RT5645 and
   RT5650
 - Initial driver support for Intel Skylake devices
 - Lots of rsnd cleanup and enhancements
 - A few DAPM fixes and cleanups
 - A large number of cleanups in various drivers (conversion and
   standardized to regmap, component) mostly by Lars-Peter and Axel
 
 HD-audio:
 - Extended HD-audio core for Intel Skylake controller support
 - Quirks for Dell headsets, Alienware 15
 - Clean up of pin-based quirk tables for Realtek codecs
 - ELD notifier implenetation for Intel HDMI/DP
 
 USB-audio:
 - Refactor runtime PM code to make lockdep happier
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJV6TwJAAoJEGwxgFQ9KSmkZoEP/06GrsGlfgIfBbnlAKcsZ0t0
 RDDCbxmwD8IsjTk180Gs3qBuhVPurhmPxq6Leow5fBktkEK5bIN3eAQkO9aIMroW
 xxU1UF6Q9XE2j97e/PhhUld7/NP0IQK/YTMuwX74G2kfEkA9Lktl4UjNMw9mKJX2
 8OIwz8ZuqSG60znmGlgiqRE4M3Svs1L/jVP1wrPg2DXQfe+ptAJpUTsyVGOMRWm3
 IaJ9h5OelPg8Jm61zcg6/pgsdYx4oquCV5wLwMz8rzIUfHb7ox8F7YKOzB+sXtYI
 zcsTfF2CqifoBcQAh9c+XE4+gMamAdheA+uc8ScUkcskucTj4Fr5tXLiPSN9QMt4
 QGOOVjqcpWv5rWwAgzUJvl1/PT4HyQfkXn5tEQVGdg9Ab1SIcQBzD1+nHUV94vKZ
 N7/grMdqJ56zUGK2fEcBS6BEDlaSToOIHDrQ1iPFNBvmW8qjBq9tYaufTGC6Vtj2
 0YKJukzIbyqLIgQtQf44aqLouFIz2lq437PqRQ4W+9C3FwGN9FKCYJ/JzvOGDIJa
 sSjEwQkJ9vnmZ3E2B30NKb24TG8pPq9WPIN2Rqe5EbHctU3gEnMScwvmG7SmCSG5
 LtDVr6Q5XKFM56cVb7tdZl6Jv97BvGu6EERM+zN+8YyMver206rC8upWOev6R2q3
 asvLDEchv7Qm3upx+PYg
 =/sXs
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "There are little changes in core part, but lots of development are
  found in drivers, especially ASoC.  The diffstat shows regmap-related
  changes for a slight API additions / changes, and that's all.

  Looking at the code size statistics, the most significant addition is
  for Intel Skylake.  (Note that SKL support is still underway, the
  codec driver is missing.) Also STI controller driver is a major
  addition as well as a few new codec drivers.

  In HD-audio side, there are fewer changes than the past.  The
  noticeable change is the support of ELD notification from i915
  graphics driver.  Thus this pull request carries a few changes in
  drm/i915.

  Other than that, USB-audio got a rewrite of runtime PM code.  It was
  initiated by lockdep warning, but resulted in a good cleanup in the
  end.

  Below are the highlights:

  Common:
   - Factoring out of AC'97 reset code from ASoC into the core helper
   - A few regmap API extensions (in case it's not pulled yet)

  ASoC:
   - New drivers for Cirrus CS4349, GTM601, InvenSense ICS43432, Realtek
     RT298 and ST STI controllers
   - Machine drivers for Rockchip systems with MAX98090 and RT5645 and
     RT5650
   - Initial driver support for Intel Skylake devices
   - Lots of rsnd cleanup and enhancements
   - A few DAPM fixes and cleanups
   - A large number of cleanups in various drivers (conversion and
     standardized to regmap, component) mostly by Lars-Peter and Axel

  HD-audio:
   - Extended HD-audio core for Intel Skylake controller support
   - Quirks for Dell headsets, Alienware 15
   - Clean up of pin-based quirk tables for Realtek codecs
   - ELD notifier implenetation for Intel HDMI/DP

  USB-audio:
   - Refactor runtime PM code to make lockdep happier"

* tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (411 commits)
  drm/i915: Add locks around audio component bind/unbind
  drm/i915: Drop port_mst_index parameter from pin/eld callback
  ALSA: hda - Fix missing inline for dummy snd_hdac_set_codec_wakeup()
  ALSA: hda - Wake the codec up on pin/ELD notify events
  ALSA: hda - allow codecs to access the i915 pin/ELD callback
  drm/i915: Call audio pin/ELD notify function
  drm/i915: Add audio pin sense / ELD callback
  ASoC: zx296702-i2s: Fix resource leak when unload module
  ASoC: sti_uniperif: Ensure component is unregistered when unload module
  ASoC: au1x: psc-i2s: Convert to use devm_ioremap_resource
  ASoC: sh: dma-sh7760: Convert to devm_snd_soc_register_platform
  ASoC: spear_pcm: Use devm_snd_dmaengine_pcm_register to fix resource leak
  ALSA: fireworks/bebob/dice/oxfw: fix substreams counting at vmalloc failure
  ASoC: Clean up docbook warnings
  ASoC: txx9: Convert to devm_snd_soc_register_platform
  ASoC: pxa: Convert to devm_snd_soc_register_platform
  ASoC: nuc900: Convert to devm_snd_soc_register_platform
  ASoC: blackfin: Convert to devm_snd_soc_register_platform
  ASoC: au1x: Convert to devm_snd_soc_register_platform
  ASoC: qcom: Constify asoc_qcom_lpass_cpu_dai_ops
  ...
2015-09-04 11:46:02 -07:00
Linus Torvalds 670c039dee - Stop using LP855X Platform Data to control regulators
- Move PWM8941 WLED driver into Backlight
  - Remove invalid use of IS_ERR_VALUE() macro
  - Remove duplicate check for NULL data before unregistering
  - Export I2C Device ID structure
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6FPVAAoJEFGvii+H/Hdh1NIP/2mcBwvmF8zgmTUALbRLKyAp
 3LXx0/Nh64Pjz2rDsZPclo4NLlq43d9xq/A3CBI4b00VgiofgRG/Cihhzo8nxhTu
 XN73TgVQ2CiNrQ0yn85dedoVEbv0cAlZYdyuHrp1SUUSPOUhaiRU2w6k7BkY0gc2
 8lkzmVnMOkQjT+Elj6wVBcPiojm333Tu5wE125W8t6XDOwDfpLoVkzcxcv/V7eIo
 pStB7u3MsXZ9kmG2S9aMSZX1PS6C772OeN1I7+3lFy67qiIDj3tQ+8MyaMFvWGdQ
 /w8+f4K8k0t1m1OJVLgsT4UKR/7qprTOvEXYGrsVPPwZqSOypRZOooPOQbAVtkV6
 qYsVYEnbsJerkCuAyWETL+H5eH+P56Zl66t6a9hgnzw4wKYarufo+xUe0b+ntmoO
 dtegXgUVryY1P9teXsnnWx24r4pfaCWunO1RUXSZABCK+0qEeXa2SM/PFtkCWwEd
 AGBp8Rcv9a2BHtY2oaii+rs6WFUH+CGeZ8rt1cKBZvT2gCjlTPb5WiGvCEnN3OBi
 BCWgTLILldSRG7ERW6P6W6WkpP9u6gz8BChLNe0PsoOoNLXElHj6VCAlusOc1kHU
 6WwvMpKcQ+SVMVNouFLv3Ur5K8sF7VkyaHDRoipeKufu0n3b3+AXEaRr/Ch1BsIh
 cdUHgYJSujzcD/Zx8crp
 =URbg
 -----END PGP SIGNATURE-----

Merge tag 'backlight-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 - Stop using LP855X Platform Data to control regulators
 - Move PWM8941 WLED driver into Backlight
 - Remove invalid use of IS_ERR_VALUE() macro
 - Remove duplicate check for NULL data before unregistering
 - Export I2C Device ID structure

* tag 'backlight-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: tosa: Export I2C module alias information
  backlight: lp8788_bl: Delete a check before backlight_device_unregister()
  backlight: sky81452: Remove unneeded use of IS_ERR_VALUE() macro
  backlight: pm8941-wled: Move PM8941 WLED driver to backlight
  backlight: lp855x: Use private data for regulator control
2015-09-04 11:40:40 -07:00
Linus Torvalds 8bd8fd0a29 - New Device Support
- New Clocksource driver from ST
    - New MFD/ACPI/DMA drivers for Intel's Sunrisepoint PCH based platforms
    - Add support for Arizona WM8998 and WM1814
    - Add support for Dialog Semi DA9062 and DA9063
    - Add support for Kontron COMe-bBL6 and COMe-cBW6
    - Add support for X-Powers AXP152
    - Add support for Atmel, many
    - Add support for STMPE, many
    - Add support for USB in X-Powers AXP22X
 
  - Core Frameworks
    - New Base API to traverse devices and their children in reverse order
 
  - Bug Fixes
    - Fix race between runtime-suspend and IRQs
    - Obtain platform data form more reliable source
 
  - Fix-ups
    - Constifying things
    - Variable signage changes
    - Kconfig depends|selects changes
    - Make use of BIT() macro
    - Do not supply .owner attribute in *_driver structures
    - MAINTAINERS entries
    - Stop using set_irq_flags()
    - Start using irq_set_chained_handler_and_data()
    - Export DT device ID structures
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6FCRAAoJEFGvii+H/HdhO2MP/1GxcbbywMXU4goj40gJaYfx
 kk0zH0S7i8+A8hD7SoCIQNkWN5o7i6sNYUA6sCTnPqixbyrkduWCyid1XsATu+41
 iiKEGyiCRyEHhCwnwCXvaFhpAZBzDi7FKj6hhf6nnRMHSEqwrs2aBqWgzNrOZTs0
 u66i/JHccnDdfHHm9Y7XcKMA8pWVqRMnwwaHreuYTFqfrEB0UGCYpmEeEBynGVKh
 MUGC0lCUrEKp59aOexZRtBUla/5BeALJd//vMQtf/+D0YPvE8lppDNwkgCe4buXN
 ZlNHDQooIWIiZfTj7wbHaTWjrBK7MsOEHWBUjNsk2nyDvDOJoGhTrSdJwPeyhUSh
 d2eUyW6sPEQY21XPwuD0DhfRKYKLOzVRhIcxvjlRAq9QHDWVXKyIlf3M70fculK8
 5FN1Wb6Sc2h0OvMC5RemPpxMwZSq6Ks3XANa718Ju802TGK/xk6iRqhZrEut/qrN
 rLYsU84TLUz6YindozTiI5FrGo+zSp9OlUU4z7HUh+4t3H5/opdsRjRp0ICwgIbY
 NxAmsk2d/vJ7xX7FAAjwMY2rPIC0zIksbGEe1AJweWV455EcDMaBM1/e9zDzHciI
 TXVxbzs3DFBadtQWlLv/VkwZmt43MTI8g6ozTTJJkPQNCKThtz4bvDSu8rQWqFua
 bkbRyQroraX5fM0Q3HIs
 =2blS
 -----END PGP SIGNATURE-----

Merge tag 'mfd-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Device Support:
   - New Clocksource driver from ST
   - New MFD/ACPI/DMA drivers for Intel's Sunrisepoint PCH based platforms
   - Add support for Arizona WM8998 and WM1814
   - Add support for Dialog Semi DA9062 and DA9063
   - Add support for Kontron COMe-bBL6 and COMe-cBW6
   - Add support for X-Powers AXP152
   - Add support for Atmel, many
   - Add support for STMPE, many
   - Add support for USB in X-Powers AXP22X

  Core Frameworks:
   - New Base API to traverse devices and their children in reverse order

  Bug Fixes:
   - Fix race between runtime-suspend and IRQs
   - Obtain platform data form more reliable source

  Fix-ups:
   - Constifying things
   - Variable signage changes
   - Kconfig depends|selects changes
   - Make use of BIT() macro
   - Do not supply .owner attribute in *_driver structures
   - MAINTAINERS entries
   - Stop using set_irq_flags()
   - Start using irq_set_chained_handler_and_data()
   - Export DT device ID structures"

* tag 'mfd-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (69 commits)
  mfd: jz4740-adc: Init mask cache in generic IRQ chip
  mfd: cros_ec: spi: Add OF match table
  mfd: stmpe: Add OF match table
  mfd: max77686: Split out regulator part from the DT binding
  mfd: Add DT binding for Maxim MAX77802 IC
  mfd: max77686: Use a generic name for the PMIC node in the example
  mfd: max77686: Don't suggest in binding to use a deprecated property
  mfd: Add MFD_CROS_EC dependencies
  mfd: cros_ec: Remove CROS_EC_PROTO dependency for SPI and I2C drivers
  mfd: axp20x: Add a cell for the usb power_supply part of the axp20x PMICs
  mfd: axp20x: Add missing registers, and mark more registers volatile
  mfd: arizona: Fixup some formatting/white space errors
  mfd: wm8994: Fix NULL pointer exception on missing pdata
  of: Add vendor prefix for Nuvoton
  mfd: mt6397: Implement wake handler and suspend/resume to handle wake up event
  mfd: atmel-hlcdc: Add support for new SoCs
  mfd: Export OF module alias information in missing drivers
  mfd: stw481x: Export I2C module alias information
  mfd: da9062: Support for the DA9063 OnKey in the DA9062 core
  mfd: max899x: Avoid redundant irq_data lookup
  ...
2015-09-04 11:35:03 -07:00
Linus Torvalds 3527122745 dmaengine updates for 4.3-rc1
This time we have aded a new capability for  scatter-gathered memset using
 dmaengine APIs. This is supported in xdmac & hdmac drivers
 
 We have added support for reusing descriptors for examples like video
 buffers etc. Driver will follow
 
 The behaviour of descriptor ack has been clarified and documented
 
 New devices added are:
 - dma controller in sun[457]i SoCs
 - lpc18xx dmamux
 - ZTE ZX296702 dma controller
 - Analog Devices AXI-DMAC DMA controller
 - eDMA support for dma-crossbar
 - imx6sx support in imx-sdma driver
 - imx-sdma device to device support
 
 Others
 - jz4780 fixes
 - ioatdma large refactor and cleanup for removal of ioat v1 and v2 which is
   deprecated and fixes
 - ACPI support in X-Gene DMA engine driver
 - ipu irq fixes
 - mvxor fixes
 - minor fixes spread thru drivers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5+nSAAoJEHwUBw8lI4NHiXQQAI/++7PmUGZ6BDZGu0B9Bj7U
 JalNijm43p858nka1zVhDea8pi7Cq3zJdE8EAB7FPQGESvCODWr62oZBr+mSaQ1C
 oU1RTIRTSiU2HPE4EFeGUvVGrnmTbHR2b1apI1SU41gKn+oQ5RJRRoQwEVwO6uuZ
 1VYcUqhurIAZs1FrMIAUa2vg7KTcK9UotfwR2gGBmSvXMf1aJ/dNZC7i/pBJjoyt
 v6KrLuYjEBAJvY7l368+NhLY/MS+2xdCMQo84B+HNEG7eA7y2MFOcRPXQA3a7dzr
 NwNuAZcTYDU11r2jiAPcnBM5sPo4bokX6Td0oDbYH6Rn2uIWlof7jGIceUaWLQQq
 QGZc4QPI4KdjTGNedRN8g9zqv0irFVfDr5v1A+B7N7ehvlubnB4jV8LmLpqN6UQH
 B38VnDJ3hqdZ6j9RHQTyUoQskSYMPbOAUYbL0qQLkyx8AnLc8TRv7DgtSvZjnz5W
 oF6So2A5SWZ7UmXKupd6TKtdyG3xtFAh+/MGVQ1RS9bCmnyhaIxJRiJwfftCBTBx
 IZePOsqlwl2dojM62BDlGS4CLRZve2VgiUEJaPINsdm/On3tQs9+iDbNY3cpvLQS
 P9u4po1TQPZnKG732vPAxEqdlq709kta7Fj5KIEvNjuWBBGKfypNP8BHKRvTLFlR
 kcbO03NzwSO6PZpmiUsx
 =gQZ6
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-4.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine updates from Vinod Koul:
 "This time we have aded a new capability for scatter-gathered memset
  using dmaengine APIs.  This is supported in xdmac & hdmac drivers

  We have added support for reusing descriptors for examples like video
  buffers etc.  Driver will follow

  The behaviour of descriptor ack has been clarified and documented

  New devices added are:
   - dma controller in sun[457]i SoCs
   - lpc18xx dmamux
   - ZTE ZX296702 dma controller
   - Analog Devices AXI-DMAC DMA controller
   - eDMA support for dma-crossbar
   - imx6sx support in imx-sdma driver
   - imx-sdma device to device support

  Other:
   - jz4780 fixes
   - ioatdma large refactor and cleanup for removal of ioat v1 and v2
     which is deprecated and fixes
   - ACPI support in X-Gene DMA engine driver
   - ipu irq fixes
   - mvxor fixes
   - minor fixes spread thru drivers"

[ The Kconfig and Makefile entries got re-sorted alphabetically, and I
  handled the conflict with the new Intel integrated IDMA driver by
  slightly mis-sorting it on purpose: "IDMA64" got sorted after "IMX" in
  order to keep the Intel entries together.  I think it might be a good
  idea to just rename the IDMA64 config entry to INTEL_IDMA64 to make
  the sorting be a true sort, not this mismash.

  Also, this merge disables the COMPILE_TEST for the sun4i DMA
  controller, because it does not compile cleanly at all.     - Linus ]

* tag 'dmaengine-4.3-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (89 commits)
  dmaengine: ioatdma: add Broadwell EP ioatdma PCI dev IDs
  dmaengine :ipu: change ipu_irq_handler() to remove compile warning
  dmaengine: ioatdma: Fix variable array length
  dmaengine: ioatdma: fix sparse "error" with prep lock
  dmaengine: hdmac: Add memset capabilities
  dmaengine: sort the sh Makefile
  dmaengine: sort the sh Kconfig
  dmaengine: sort the dw Kconfig
  dmaengine: sort the Kconfig
  dmaengine: sort the makefile
  drivers/dma: make mv_xor.c driver explicitly non-modular
  dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller
  devicetree: Add bindings documentation for Analog Devices AXI-DMAC
  dmaengine: xgene-dma: Fix the lock to allow client for further submission of requests
  dmaengine: ioatdma: fix coccinelle warning
  dmaengine: ioatdma: fix zero day warning on incompatible pointer type
  dmaengine: tegra-apb: Simplify locking for device using global pause
  dmaengine: tegra-apb: Remove unnecessary return statements and variables
  dmaengine: tegra-apb: Avoid unnecessary channel base address calculation
  dmaengine: tegra-apb: Remove unused variables
  ...
2015-09-04 11:10:18 -07:00
Linus Torvalds 88a99886c2 This is the bulk of pin control changes for the v4.3 development
cycle
 
 Core changes:
 
 - It is possible configure groups in debugfs.
 
 - Consolidation of chained IRQ handler install/remove replacing
   all call sites where irq_set_handler_data() and
   irq_set_chained_handler() were done in succession with a
   combined call to irq_set_chained_handler_and_data(). This
   series was created by Thomas Gleixner after the problem was
   observed by Russell King.
 
 - Tglx also made another series of patches switching
   __irq_set_handler_locked() for irq_set_handler_locked() which
   is way cleaner.
 
 - Tglx also wrote a good bunch of patches to make use of
   irq_desc_get_xxx() accessors and avoid looking up irq_descs
   from IRQ numbers. The goal is to get rid of the irq number
   from the handlers in the IRQ flow which is nice.
 
 Driver feature enhancements:
 
 - Power management support for the SiRF SoC Atlas 7.
 
 - Power down support for the Qualcomm driver.
 
 - Intel Cherryview and Baytrail: switch drivers to use raw
   spinlocks in IRQ handlers to play nice with the realtime
   patch set.
 
 - Rework and new modes handling for Qualcomm SPMI-MPP.
 
 - Pinconf power source config for SH PFC.
 
 New drivers and subdrivers:
 
 - A new driver for Conexant Digicolor CX92755.
 
 - A new driver for UniPhier PH1-LD4, PH1-Pro4, PH1-sLD8,
   PH1-Pro5, ProXtream2 and PH1-LD6b SoC pin control support.
 
 - Reverse-egineered the S/PDIF settings for the Allwinner
   sun4i driver.
 
 - Support for Qualcomm Technologies QDF2xxx ARM64 SoCs
 
 - A new Freescale i.mx6ul subdriver.
 
 Cleanup:
 
 - Remove platform data support in a number of SH PFC
   subdrivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV6YzgAAoJEEEQszewGV1zbIAQAILzMrzWkxsy7bhvL4QdP5/K
 OG3EodE//AE0G5gKugUDjg5t2lftdiIJVhjDA17ruETCSciuAxZSLThlMy1sQgyN
 LPxy9LlCrmsqrYt9+fmJ9js8j52RBJikKK0RUyUVz0VojTBplRpElyEx/KxwM5sG
 Hy3+hU61uKO0j9AyIcsa/RKP6SGavwZdHytJBsHNw+pODyE3UZCf52ChAVBsTPfE
 MV70g3Qzfqur7ZFqcNgtUV7qCyYvlF12ooiihrGFDOsTL3sSq4/OXB7z1z1mGGHL
 Dgq8pXJ6EIZlCbk+jFMTzPRSzy46dxNai0eErjTUVEldH1tOphzGMvKmOdm/nczH
 4M/UOWOKBE1aOYZNPtnUgDy2MRt5K9VJStCNSHEQCB2lGdojNAtmj2cmr8flBN5m
 gM9FDpIS1/C+OYYTkOY9ftPsH5zOk7sCLEHSH5USYRGJHihzLnkV90eiN6a7vlF1
 hyTGrIyl6e//E5JBgamjnR3+fYuxQGr6WeAZEP/gXZRm7BCKCaPwCarq+kPZVG4A
 nolZ/QQN6XYPSlveSPU97VYvLYEUvXaKN0Hf2DTbwkqvNFp7JORD65QLESPtQoIp
 x95iHMdB/1+0OfgOqMmlOtKpOKREeQ/R+KWACxsrr5Rfv3/7CP4BMRGypIZ/iPmz
 HWoyDI4lIebBR+JnjMjK
 =4QFX
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "This is the bulk of pin control changes for the v4.3 development
  cycle.

  Like with GPIO it's a lot of stuff.  If my subsystems are any sign of
  the overall tempo of the kernel v4.3 will be a gigantic diff.

[ It looks like 4.3 is calmer than 4.2 in most other subsystems, but
  we'll see - Linus ]

  Core changes:

   - It is possible configure groups in debugfs.

   - Consolidation of chained IRQ handler install/remove replacing all
     call sites where irq_set_handler_data() and
     irq_set_chained_handler() were done in succession with a combined
     call to irq_set_chained_handler_and_data().  This series was
     created by Thomas Gleixner after the problem was observed by
     Russell King.

   - Tglx also made another series of patches switching
     __irq_set_handler_locked() for irq_set_handler_locked() which is
     way cleaner.

   - Tglx also wrote a good bunch of patches to make use of
     irq_desc_get_xxx() accessors and avoid looking up irq_descs from
     IRQ numbers.  The goal is to get rid of the irq number from the
     handlers in the IRQ flow which is nice.

  Driver feature enhancements:

   - Power management support for the SiRF SoC Atlas 7.

   - Power down support for the Qualcomm driver.

   - Intel Cherryview and Baytrail: switch drivers to use raw spinlocks
     in IRQ handlers to play nice with the realtime patch set.

   - Rework and new modes handling for Qualcomm SPMI-MPP.

   - Pinconf power source config for SH PFC.

  New drivers and subdrivers:

   - A new driver for Conexant Digicolor CX92755.

   - A new driver for UniPhier PH1-LD4, PH1-Pro4, PH1-sLD8, PH1-Pro5,
     ProXtream2 and PH1-LD6b SoC pin control support.

   - Reverse-egineered the S/PDIF settings for the Allwinner sun4i
     driver.

   - Support for Qualcomm Technologies QDF2xxx ARM64 SoCs

   - A new Freescale i.mx6ul subdriver.

  Cleanup:

   - Remove platform data support in a number of SH PFC subdrivers"

* tag 'pinctrl-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (95 commits)
  pinctrl: at91: fix null pointer dereference
  pinctrl: mediatek: Implement wake handler and suspend resume
  pinctrl: mediatek: Fix multiple registration issue.
  pinctrl: sh-pfc: r8a7794: add USB pin groups
  pinctrl: at91: Use generic irq_{request,release}_resources()
  pinctrl: cherryview: Use raw_spinlock for locking
  pinctrl: baytrail: Use raw_spinlock for locking
  pinctrl: imx6ul: Remove .owner field
  pinctrl: zynq: Fix typos in smc0_nand_grp and smc0_nor_grp
  pinctrl: sh-pfc: Implement pinconf power-source param for voltage switching
  clk: rockchip: add pclk_pd_pmu to the list of rk3288 critical clocks
  pinctrl: sun4i: add spdif to pin description.
  pinctrl: atlas7: clear ugly branch statements for pull and drivestrength
  pinctrl: baytrail: Serialize all register access
  pinctrl: baytrail: Drop FSF mailing address
  pinctrl: rockchip: only enable gpio clock when it setting
  pinctrl/mediatek: fix spelling mistake in dev_err error message
  pinctrl: cherryview: Serialize all register access
  pinctrl: UniPhier: PH1-Pro5: add I2C ch6 pin-mux setting
  pinctrl: nomadik: reflect current input value
  ...
2015-09-04 10:22:09 -07:00
Linus Torvalds 8d2faea672 This is the bulk of GPIO changes for the v4.3 kernel cycle:
Core changes:
 
 - Root out the wrapper devm_gpiod_get() and gpiod_get() etc
   versions of the descriptor calls that did not use the flags
   argument on the end. This was around for too long and eventually
   Uwe Kleine-König took the time to clean it out and the last
   users are removed along with the macros in this tag. In several
   cases the use of flags simplifies the code. For this reason we
   have (ACKed) patches hitting in DRM, IIO, media, NFC, USB+PHY
   up until we hammer in the nail with removing the macros.
 
 - Add a fat document describing how much ready-made GPIO stuff
   we have i the kernel to discourage people from reinventing
   a square wheel in userspace, as so often happens.
 
 - Create a separate lockdep class for each instance of a GPIO
   IRQ chip instead of using one class for all chips, as the current
   code will not work with systems with several GPIO chips doing
   lockdep debugging.
 
 - Protect against driver unloading also when a GPIO line is only
   used as IRQ for the GPIOLIB_IRQCHIP helpers.
 
 - If the GPIO chip has no designated owner, assign the parent
   device driver owner as owner.
 
 - Consolidation of chained IRQ handler install/remove replacing
   all call sites where irq_set_handler_data() and
   irq_set_chained_handler() were done in succession with a
   combined call to irq_set_chained_handler_and_data(). This
   series was created by Thomas Gleixner after the problem was
   observed by Russell King.
 
 - Tglx also made another series of patches switching
   __irq_set_handler_locked() for irq_set_handler_locked() which
   is way cleaner.
 
 - Tglx and Jiang Liu wrote a good bunch of patches to make use of
   irq_desc_get_xxx() accessors and avoid looking up irq_descs
   from IRQ numbers. The goal is to get rid of the irq number
   from the handlers in the IRQ flow which is nice.
 
 - Rob Herring killed off the set_irq_flags() for all GPIO
   drivers. This was an ARM specific function that is replaced
   with the generic irq_modify_status() where special flags
   are actually needed.
 
 - When an OF node has a pin range for its GPIOs, return
   -EPROBE_DEFER if the pin controller isn't available.
   Pretty logical, yet needed to be fixed.
 
 - If a driver using GPIOLIB_IRQCHIP has its own
   irq_*_resources call back, then call these instead of the
   defaults provided by the GPIOLIB.
 
 - Fix an undocumented ABI hole: named GPIOs were not
   properly documented.
 
 Driver improvements:
 
 - Add get_direction() support to the generic GPIO driver, it's
   strange that we didn't have that before.
 
 - Make it possible to have input-only GPIO chips using the
   generic GPIO driver.
 
 - Clean out platform data support from the Emma Mobile (EM)
   driver
 
 - Finegrained runtime PM support for the RCAR driver.
 
 - Support r8a7795 (R-car H3) in the RCAR driver.
 
 - Support interrupts on GPIOs 16 thru 31 in the DaVinci driver.
 
 - Some consolidation and new support in the MPC8xxx driver,
   we now support MPC5125.
 
 - Preempt-RT-friendly patches: the OMAP, MPC8xxx, drivers uses raw
   spinlocks making it work better with the realime patches.
 
 - Interrupt support for the EXTRAXFS GPIO driver.
 
 - Make the ETRAXFS GPIO driver support also ARTPEC-3.
 
 - Interrupt and wakeup support for the BRCMSTB driver, also for
   wakeup from S5 cold boot.
 
 - Mask MXC IRQs during suspend.
 
 - Improve OMAP2 GPIO set_debounce() to work according to spec.
 
 - The VF610 driver handles IRQs properly.
 
 New drivers:
 
 - ZTE ZX GPIO driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV52lvAAoJEEEQszewGV1zOVcP/2ZgkfRgl119LZnWShfrEJWq
 UXaNzSaPNgvDzvGaqqi62SZQuhrdIWRKfMPtAuMGbEn5aJx0JC5UAOYjjfkKBqpO
 toqc1w2DScc0JTorY8qgczIBDO1A3ZBAcIvXXpOduy/JaKPoQteRN8WYTynPw48/
 0+97ZODwhyOkfeqmvUClkc9gW4XT68dudb0Lv1nQjsZmd1dHF2PZlwH3aL9sV68j
 GJAqf09xNqZaWWQBhs+J3ptsYjaJfYjo9NOOUf0Y/UgqXO3vB+2S4EmRATaRHS2F
 aHdj03sNZCNSDEa35WwetbLRGxPzSWmfxmBzQQ1baGdcJICn7Yv58EklPKRvwtMo
 ZpUsgiOV4OUIEClPJohs4xbl2HRsOYB3VbcihkXjVAxS6i2/jgA3Tn8ATvUSZ8wq
 TX8D6BfciigRCkT2G+B0TQBmcX1IQsMd1DBUNfw7Dk1TK/vxH4UYWbke422RjKGz
 ORJ+0DfShMCdYjrCVlt7UbFcqE3L5CnrztLQ3oFt0om2JsSWztV9V579G+Dqo9CI
 fE4G3xlsF33UCvXcmnOp6PuU+ZYBodLggkmK4REy2D3LCOnkcKq0U8Fj5RssApZ9
 FdqVYck555ZpcBiN8ihB97WsmU+0XhBjblCbgzr6GxUw8EJ4x8H9nlraA6bluFoP
 9c2qgPxjCq/VWA/F0YOU
 =iQ2P
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v4.3 kernel cycle.

  There is quite a lot going on in the GPIO subsystem this merge window,
  so the main matter is decribed below.

  The hits in other subsystems when making the GPIO flags optional are
  all ACKed by their respective subsystem maintainers.

  Core changes:

   - Root out the wrapper devm_gpiod_get() and gpiod_get() etc versions
     of the descriptor calls that did not use the flags argument on the
     end.  This was around for too long and eventually Uwe Kleine-König
     took the time to clean it out and the last users are removed along
     with the macros in this tag.  In several cases the use of flags
     simplifies the code.  For this reason we have (ACKed) patches
     hitting in DRM, IIO, media, NFC, USB+PHY up until we hammer in the
     nail with removing the macros.

   - Add a fat document describing how much ready-made GPIO stuff we
     have i the kernel to discourage people from reinventing a square
     wheel in userspace, as so often happens.

   - Create a separate lockdep class for each instance of a GPIO IRQ
     chip instead of using one class for all chips, as the current code
     will not work with systems with several GPIO chips doing lockdep
     debugging.

   - Protect against driver unloading also when a GPIO line is only used
     as IRQ for the GPIOLIB_IRQCHIP helpers.

   - If the GPIO chip has no designated owner, assign the parent device
     driver owner as owner.

   - Consolidation of chained IRQ handler install/remove replacing all
     call sites where irq_set_handler_data() and
     irq_set_chained_handler() were done in succession with a combined
     call to irq_set_chained_handler_and_data().

     This series was created by Thomas Gleixner after the problem was
     observed by Russell King.

   - Tglx also made another series of patches switching
     __irq_set_handler_locked() for irq_set_handler_locked() which is
     way cleaner.

   - Tglx and Jiang Liu wrote a good bunch of patches to make use of
     irq_desc_get_xxx() accessors and avoid looking up irq_descs from
     IRQ numbers.  The goal is to get rid of the irq number from the
     handlers in the IRQ flow which is nice.

   - Rob Herring killed off the set_irq_flags() for all GPIO drivers.
     This was an ARM specific function that is replaced with the generic
     irq_modify_status() where special flags are actually needed.

   - When an OF node has a pin range for its GPIOs, return -EPROBE_DEFER
     if the pin controller isn't available.  Pretty logical, yet needed
     to be fixed.

   - If a driver using GPIOLIB_IRQCHIP has its own irq_*_resources call
     back, then call these instead of the defaults provided by the
     GPIOLIB.

   - Fix an undocumented ABI hole: named GPIOs were not properly
     documented.

  Driver improvements:

   - Add get_direction() support to the generic GPIO driver, it's
     strange that we didn't have that before.

   - Make it possible to have input-only GPIO chips using the generic
     GPIO driver.

   - Clean out platform data support from the Emma Mobile (EM) driver

   - Finegrained runtime PM support for the RCAR driver.

   - Support r8a7795 (R-car H3) in the RCAR driver.

   - Support interrupts on GPIOs 16 thru 31 in the DaVinci driver.

   - Some consolidation and new support in the MPC8xxx driver, we now
     support MPC5125.

   - Preempt-RT-friendly patches: the OMAP, MPC8xxx, drivers uses raw
     spinlocks making it work better with the realime patches.

   - Interrupt support for the EXTRAXFS GPIO driver.

   - Make the ETRAXFS GPIO driver support also ARTPEC-3.

   - Interrupt and wakeup support for the BRCMSTB driver, also for
     wakeup from S5 cold boot.

   - Mask MXC IRQs during suspend.

   - Improve OMAP2 GPIO set_debounce() to work according to spec.

   - The VF610 driver handles IRQs properly.

  New drivers:

   - ZTE ZX GPIO driver"

* tag 'gpio-v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (87 commits)
  Revert "gpio: extraxfs: fix returnvar.cocci warnings"
  gpio: tc3589x: use static container helper
  gpio: xlp: fix error return code
  gpio: vf610: handle level IRQ's properly
  gpio: max732x: Fix error handling in probe()
  gpio: omap: fix clk_prepare/unprepare usage
  gpio: omap: protect regs access in omap_gpio_irq_handler
  gpio: omap: fix omap2_set_gpio_debounce
  gpio: omap: switch to use platform_get_irq
  gpio: omap: remove wrong irq_domain_remove usage in probe
  gpiolib: add description for gpio irqchip fields in struct gpio_chip
  gpio: extraxfs: fix returnvar.cocci warnings
  gpiolib: irqchip: use different lockdep class for each gpio irqchip
  gpio/grgpio: fix deadlock in grgpio_irq_unmap()
  Documentation: gpio: consumer: describe active low property
  gpio: mxc: fix section mismatch warning
  gpio/mxc: mask gpio interrupts in suspend
  gpio: omap: Fix missing raw locks conversion
  gpio: brcmstb: support wakeup from S5 cold boot
  gpio: brcmstb: Add interrupt and wakeup source support
  ...
2015-09-04 10:07:45 -07:00
Mark Brown 072502a67c Merge remote-tracking branches 'regmap/topic/lockdep' and 'regmap/topic/seq-delay' into regmap-next 2015-09-04 17:22:10 +01:00
Mark Brown 84fb9015d2 Merge remote-tracking branches 'regmap/topic/debugfs' and 'regmap/topic/force-update' into regmap-next 2015-09-04 17:22:09 +01:00
Mark Brown a458a6d411 Merge remote-tracking branch 'regmap/topic/core' into regmap-next 2015-09-04 17:22:08 +01:00
Greg Kroah-Hartman 062a68a5e0 Revert "uart: pl011: Add support to ZTE ZX296702 uart"
This reverts commit 8cd90e50d1 as with
this patch the serial console is broken on lots of platforms.

Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jun Nie <jun.nie@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-04 09:14:20 -07:00
Linus Torvalds 02cf1da254 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull tile updates from Chris Metcalf:
 "This includes secure computing support as well as miscellaneous minor
  improvements"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: correct some typos in opcode type names
  tile/vdso: emit a GNU hash as well
  tile: Remove finish_arch_switch
  tile: enable full SECCOMP support
  tile/time: Migrate to new 'set-state' interface
2015-09-04 08:59:53 -07:00
Avri Altman 22f66895e6 mac80211: protect non-HT BSS when HT TDLS traffic exists
HT TDLS traffic should be protected in a non-HT BSS to avoid
collisions. Therefore, when TDLS peers join/leave, check if
protection is (now) needed and set the ht_operation_mode of
the virtual interface according to the HT capabilities of the
TDLS peer(s).

This works because a non-HT BSS connection never sets (or
otherwise uses) the ht_operation_mode; it just means that
drivers must be aware that this field applies to all HT
traffic for this virtual interface, not just the traffic
within the BSS. Document that.

Signed-off-by: Avri Altman <avri.altman@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-09-04 14:25:46 +02:00
Linus Torvalds 807249d3ad Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "This is the main pull request for 4.3 for MIPS.  Here's the summary:

  Three fixes that didn't make 4.2-stable:

   - a -Os build might compile the kernel using the MIPS16 instruction
     set but the R2 optimized inline functions in <uapi/asm/swab.h> are
     implemented using 32-bit wide instructions which is invalid.

   - a build error in pgtable-bits.h for a particular kernel
     configuration.

   - accessing registers of the CM GCR might have been compiled to use
     64 bit accesses but these registers are onl 32 bit wide.

  And also a few new bits:

   - move the ATH79 GPIO driver to drivers/gpio

   - the definition of IRQCHIP_DECLARE has moved to linux/irqchip.h,
     change ATH79 accordingly.

   - fix definition of pgprot_writecombine

   - add an implementation of dma_map_ops.mmap

   - fix alignment of quiet build output for vmlinuz link

   - BCM47xx: Use kmemdup rather than duplicating its implementation

   - Netlogic: Fix 0x0x prefixes of constants.

   - merge Bjorn Helgaas' series to remove most of the weak keywords
     from function declarations.

   - CP0 and CP1 registers are best considered treated as unsigned
     values to avoid large values from becoming negative values.

   - improve support for the MIPS GIC timer.

   - enable common clock framework for Malta and SEAD3.

   - a number of improvments and fixes to dump_tlb().

   - document the MIPS TLB dump functionality in Magic SysRq.

   - Cavium Octeon CN68XX improvments.

   - NetLogic improvments.

   - irq: Use access helper irq_data_get_affinity_mask.

   - handle MSA unaligned accesses.

   - a number of R6-related math-emu fixes.

   - support for I6400.

   - improvments to MSA support.

   - add uprobes support.

   - move from deprecated __initcall to arch_initcall.

   - remove finish_arch_switch().

   - IRQ cleanups by Thomas Gleixner.

   - migrate to new 'set-state' interface.

   - random small cleanups"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (148 commits)
  MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
  MIPS: Fix alignment of quiet build output for vmlinuz link
  MIPS: math-emu: Remove unused handle_dsemul function declaration
  MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 CLASS FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 RINT FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 SELNEZ FPU instruction
  MIPS: math-emu: Add support for the MIPS R6 SELEQZ FPU instruction
  MIPS: math-emu: Add support for the CMP.condn.fmt R6 instruction
  MIPS: inst.h: Add new MIPS R6 FPU opcodes
  MIPS: Octeon: Fix management port MII address on Kontron S1901
  MIPS: BCM47xx: Use kmemdup rather than duplicating its implementation
  STAGING: Octeon: Use common helpers for determining interface and port
  MIPS: Octeon: Support interfaces 4 and 5
  MIPS: Octeon: Set up 1:1 mapping between CN68XX PKO queues and ports
  MIPS: Octeon: Initialize CN68XX PKO
  STAGING: Octeon: Support CN68XX style WQE
  ...
2015-09-03 16:55:55 -07:00
Linus Torvalds ff474e8ca8 powerpc updates for 4.3
- Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt
  - EEH fixes for SRIOV from Gavin
  - Introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth
  - Use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras
  - Seccomp filter support from Michael Ellerman
  - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar
  - Add powerpc timebase as a trace clock source from Naveen N. Rao
  - Misc cleanups in the xmon, signal & SLB code from Anshuman Khandual
  - Add an inline function to update POWER8 HID0 from Gautham R. Shenoy
  - Fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman
  - Drop support for 64K local store on 4K kernels from Michael Ellerman
  - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan
  - Initialize distance lookup table from drconf path from Nikunj A Dadhania
  - Enable RTC class support from Vaibhav Jain
  - Disable automatically blocked PCI config from Gavin Shan
  - Add LEDs driver for PowerNV platform from Vasant Hegde
  - Fix endianness issues in the HVSI driver from Laurent Dufour
  - Kexec endian fixes from Samuel Mendoza-Jonas
  - Fix corrupted pdn list from Gavin Shan
  - Fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan
 
  - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
    optimizations, checksum optimizations, 85xx config fragments and updates,
    device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor
    fixes.
 
  - A ton of cxl updates & fixes:
   - Add explicit precision specifiers from Rasmus Villemoes
   - use more common format specifier from Rasmus Villemoes
   - Destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
   - Destroy afu->contexts_idr on release of an afu from Johannes Thumshirn
   - Compile with -Werror from Daniel Axtens
   - EEH support from Daniel Axtens
   - Plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
   - Add alternate MMIO error handling from Ian Munsie
   - Allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan
   - Remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
   - Release irqs if memory allocation fails from Vaibhav Jain
   - Remove racy attempt to force EEH invocation in reset from Daniel Axtens
   - Fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
   - Fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie
   - Set up and enable PSL Timebase from Philippe Bergheaud
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5+GzAAoJEFHr6jzI4aWA0iAP/jcd0kNaNBzLgcDKKygKdgz4
 xn4EWu81vfMfZYWesb0ATrjlH0hLsRxSXoFUqUMhtJTa5kNAoCIaz/M8WBALS50h
 aT+i7br4WEU2j2FcaMyP3iAZx/2hl+2utODJSHPRWPkec1fUDBfEyBf++e520RWM
 HUQGIGZXh8yq7KMA96Pwhsvls9vOB8hS2UdU/NS8ff3J5jFvXC1/WmF2qfzJBS1V
 8iHyz26Jl8+dJ+et7iC2oD5XQAjIH1oJgOyPVPBzAQttfi8RjuVzRA30TfPBAUwI
 lC9nlmPy6bCe4kiQYWVB1z7GegHyW/9vkeuMj/u8mZbqpaayMEMZmd2C3iNDXNHx
 i2NSvdln539t4qWYsV2v6lVCfa/ayDHD73Wackj5Dk394tzXnpCPhxNzc2yKEd5v
 h7vwYc9jBhsbfSCSogaM+gSHJ1APgCidggHJMYYNA2nN2u6V62RpsMB7zp/1+Q2v
 yqYdD8oYF4Dm21x/ujaNFrlizROD46WS0UqdJ3yP6HAqRYIpRXtibmpECJgt1n5h
 HjADEci4hQ2UQxdMdp/Q5KZnPTJebBtrZrmkW5r6cZBUaTB5TVkFaEWN44CT/Loh
 tMNeA3qOBN06CaQS2WL3UUUWpbZq9fSbWuUZ5lWZDb5AOyRxe5eWVYNLkiyIXozY
 L24l1bYdBhXahnjoS/kc
 =n9+X
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask
   from Benjamin Herrenschmidt

 - EEH fixes for SRIOV from Gavin

 - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth

 - use hardware RNG for arch_get_random_seed_* not arch_get_random_*
   from Paul Mackerras

 - seccomp filter support from Michael Ellerman

 - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh
   Salgaonkar

 - add powerpc timebase as a trace clock source from Naveen N.  Rao

 - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual

 - add an inline function to update POWER8 HID0 from Gautham R.  Shenoy

 - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman

 - drop support for 64K local store on 4K kernels from Michael Ellerman

 - move dma_get_required_mask() from pnv_phb to pci_controller_ops from
   Andrew Donnellan

 - initialize distance lookup table from drconf path from Nikunj A
   Dadhania

 - enable RTC class support from Vaibhav Jain

 - disable automatically blocked PCI config from Gavin Shan

 - add LEDs driver for PowerNV platform from Vasant Hegde

 - fix endianness issues in the HVSI driver from Laurent Dufour

 - kexec endian fixes from Samuel Mendoza-Jonas

 - fix corrupted pdn list from Gavin Shan

 - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan

 - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
   optimizations, checksum optimizations, 85xx config fragments and
   updates, device tree updates, e6500 fixes for non-SMP, and misc
   cleanup and minor fixes.

 - a ton of cxl updates & fixes:
    - add explicit precision specifiers from Rasmus Villemoes
    - use more common format specifier from Rasmus Villemoes
    - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
    - destroy afu->contexts_idr on release of an afu from Johannes
      Thumshirn
    - compile with -Werror from Daniel Axtens
    - EEH support from Daniel Axtens
    - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
    - add alternate MMIO error handling from Ian Munsie
    - allow release of contexts which have been OPENED but not STARTED
      from Andrew Donnellan
    - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
    - release irqs if memory allocation fails from Vaibhav Jain
    - remove racy attempt to force EEH invocation in reset from Daniel
      Axtens
    - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
    - fix force unmapping mmaps of contexts allocated through the kernel
      api from Ian Munsie
    - set up and enable PSL Timebase from Philippe Bergheaud

* tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits)
  cxl: Set up and enable PSL Timebase
  cxl: Fix force unmapping mmaps of contexts allocated through the kernel api
  cxl: Fix + cleanup error paths in cxl_dev_context_init
  powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
  powerpc/pseries: Cleanup on pci_dn_reconfig_notifier()
  powerpc/pseries: Fix corrupted pdn list
  powerpc/powernv: Enable LEDS support
  powerpc/iommu: Set default DMA offset in dma_dev_setup
  cxl: Remove racy attempt to force EEH invocation in reset
  cxl: Release irqs if memory allocation fails
  cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE
  powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver
  powerpc/powernv: Reset HILE before kexec_sequence()
  powerpc/kexec: Reset secondary cpu endianness before kexec
  powerpc/hvsi: Fix endianness issues in the HVSI driver
  leds/powernv: Add driver for PowerNV platform
  powerpc/powernv: Create LED platform device
  powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states
  powerpc/powernv: Fix the log message when disabling VF
  cxl: Allow release of contexts which have been OPENED but not STARTED
  ...
2015-09-03 16:41:38 -07:00
Linus Torvalds c706c7eb0d Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM development updates from Russell King:
 "Included in this update:

   - moving PSCI code from ARM64/ARM to drivers/

   - removal of some architecture internals from global kernel view

   - addition of software based "privileged no access" support using the
     old domains register to turn off the ability for kernel
     loads/stores to access userspace.  Only the proper accessors will
     be usable.

   - addition of early fixup support for early console

   - re-addition (and reimplementation) of OMAP special interconnect
     barrier

   - removal of finish_arch_switch()

   - only expose cpuX/online in sysfs if hotpluggable

   - a number of code cleanups"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits)
  ARM: software-based priviledged-no-access support
  ARM: entry: provide uaccess assembly macro hooks
  ARM: entry: get rid of multiple macro definitions
  ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die()
  ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore()
  ARM: mm: improve do_ldrd_abort macro
  ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
  ARM: entry: efficiency cleanups
  ARM: entry: get rid of asm_trace_hardirqs_on_cond
  ARM: uaccess: simplify user access assembly
  ARM: domains: remove DOMAIN_TABLE
  ARM: domains: keep vectors in separate domain
  ARM: domains: get rid of manager mode for user domain
  ARM: domains: move initial domain setting value to asm/domains.h
  ARM: domains: provide domain_mask()
  ARM: domains: switch to keeping domain value in register
  ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE
  ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD()
  ARM: 8416/1: Feroceon: use of_iomap() to map register base
  ARM: 8415/1: early fixmap support for earlycon
  ...
2015-09-03 16:27:01 -07:00
Linus Torvalds ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Linus Torvalds 4c12ab7e5e Merge tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "The major work includes fixing and enhancing the existing extent_cache
  feature, which has been well settling down so far and now it becomes a
  default mount option accordingly.

  Also, this version newly registers a f2fs memory shrinker to reclaim
  several objects consumed by a couple of data structures in order to
  avoid memory pressures.

  Another new feature is to add ioctl(F2FS_GARBAGE_COLLECT) which
  triggers a cleaning job explicitly by users.

  Most of the other patches are to fix bugs occurred in the corner cases
  across the whole code area"

* tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (85 commits)
  f2fs: upset segment_info repair
  f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent
  f2fs: update extent tree in batches
  f2fs: fix to release inode correctly
  f2fs: handle f2fs_truncate error correctly
  f2fs: avoid unneeded initializing when converting inline dentry
  f2fs: atomically set inode->i_flags
  f2fs: fix wrong pointer access during try_to_free_nids
  f2fs: use __GFP_NOFAIL to avoid infinite loop
  f2fs: lookup neighbor extent nodes for merging later
  f2fs: split __insert_extent_tree_ret for readability
  f2fs: kill dead code in __insert_extent_tree
  f2fs: adjust showing of extent cache stat
  f2fs: add largest/cached stat in extent cache
  f2fs: fix incorrect mapping for bmap
  f2fs: add annotation for space utilization of regular/inline dentry
  f2fs: fix to update cached_en of extent tree properly
  f2fs: fix typo
  f2fs: check the node block address of newly allocated nid
  f2fs: go out for insert_inode_locked failure
  ...
2015-09-03 13:10:22 -07:00
Hidehiro Kawai 82802f968b ipmi: Don't flush messages in sender() in run-to-completion mode
When flushing queued messages in run-to-completion mode,
smi_event_handler() is recursively called.

flush_messages()
 smi_event_handler()
  handle_transaction_done()
   deliver_recv_msg()
    ipmi_smi_msg_received()
     smi_recv_tasklet()
      sender()
       flush_messages()
        smi_event_handler()
         ...

The depth of the recursive call depends on the number of queued
messages, so it can cause a stack overflow if many messages have
been queued.

To solve this problem, this patch removes flush_messages()
from sender()@ipmi_si_intf.c.  Instead, add flush_messages() to
caller side of sender() if needed.  Additionally, to implement this,
add new handler flush_messages to struct ipmi_smi_handlers.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>

Fixed up a comment and some spacing issues.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-09-03 15:02:28 -05:00
Corey Minyard 81d02b7f8c ipmi: Make some data const that was only read
Several data structures were only used for reading, so make them
const.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-09-03 15:02:27 -05:00
Linus Torvalds 9cbf22b37a dlm for 4.3
This set mainly includes a change to the way the
 dlm uses the SCTP API in the kernel, removing the
 direct dependency on the sctp module.  Other odd
 SCTP-related fixes are also included.  The other
 notable fix is for a long standing regression in
 the behavior of lock value blocks for user space
 locks.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5HwZAAoJEDgbc8f8gGmqoaQP/iz5zgKSjX0mOC3fz8BqXISk
 85cKLPfsf0avDmGx6nkKp5wsmVDYkfrObkocvf7bOcemAuycuOmr9y22ZscNaAWM
 vKLhTJQ0koAlZqhJmJx45w318BFY03RdDQmVKUnQHza9Ed7Uoa0CyR6jyuwBTuMP
 gA9O6i6CezodtB8CLPySJa2znlt50CptLaJKj1V9/xCpBh7orwpihv4pBz8oH1lR
 JXRj9hNEFy2+vk8Pce14fKmHgUROg5+y1V7jZeetpCbTxAAFOeFOL6EH28eWssbQ
 YoWofcPugmOs9BDbnVZHf6+Y5xIaoiIylb2Q4/me4rjQfSmaiDbTZyqB4TtFrldF
 BngaAJipmLQu8ELqQmwEMhZTAc/GsB60x1EcjrPVTKbW7pwsfVp2fPVV92a7koQe
 prmz5rh8HCenrWuy3d4/EP7K+E4+W98ZXsDuym4pBNaoYwCPyvtWLa8kSqAdx47J
 MNk/ak9ktP2NxsCs+EjCmP2hn2r+RTio6R2uCtKB2pdclfqOupIsYZkVdZERK5Ch
 5+ALeVjHfxswFVRxGjbPQRs9x8ZclBydceAHgYbLQ2xDGRvTpQhnIyNLRXsZnkrD
 t4mTokZG/GGgmWOscZ5nXOOGZt8SpX+UkICWWWbuy3dxuOK6al3lVeBcC0KW5Pki
 KNHzcKrlGJJnCVr0nWTU
 =iYRu
 -----END PGP SIGNATURE-----

Merge tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:
 "This set mainly includes a change to the way the dlm uses the SCTP API
  in the kernel, removing the direct dependency on the sctp module.
  Other odd SCTP-related fixes are also included.

  The other notable fix is for a long standing regression in the
  behavior of lock value blocks for user space locks"

* tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: print error from kernel_sendpage
  dlm: fix lvb copy for user locks
  dlm: sctp_accept_from_sock() can be static
  dlm: fix reconnecting but not sending data
  dlm: replace BUG_ON with a less severe handling
  dlm: use sctp 1-to-1 API
  dlm: fix not reconnecting on connecting error handling
  dlm: fix race while closing connections
  dlm: fix connection stealing if using SCTP
2015-09-03 12:57:48 -07:00
Linus Torvalds ea814ab9aa Pretty much all bug fixes and clean ups for 4.3, after a lot of
features and other churn going into 4.2.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJV55TlAAoJEPL5WVaVDYGjyzYH/1WtZpIzRjp7o+3H4/vFqONg
 R1Fsw785C1w8WX2QuIK/m31u4XO+VeCV4jWA9DuqnSzWm9w9C/4kTqITd4Hwp416
 /9gJvYoZHHaDikxpWWADptDi8IoLohTlcFVCHIvvf53cGehVEEsc2WOijUZo7Cgv
 O454Nm3tK0CQ3yrCIlf5SyvkUZSMTiawLLJJzd4GCyvU13C1SnABNQj8UxKisBA5
 cP8q4O2nPg/S9rkYxnFAifQyZppd3jMvorUaq9eHiWMjl95o6e/6+wYGnHhoFUvr
 /P1dNjJYbzk+TUzlsDkq2zANK2UsB3iNNi8YwLFOpfFcuYopmUAYRIWOgIZWYUQ=
 =ijuI
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "Pretty much all bug fixes and clean ups for 4.3, after a lot of
  features and other churn going into 4.2"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  Revert "ext4: remove block_device_ejected"
  ext4: ratelimit the file system mounted message
  ext4: silence a format string false positive
  ext4: simplify some code in read_mmp_block()
  ext4: don't manipulate recovery flag when freezing no-journal fs
  jbd2: limit number of reserved credits
  ext4 crypto: remove duplicate header file
  ext4: update c/mtime on truncate up
  jbd2: avoid infinite loop when destroying aborted journal
  ext4, jbd2: add REQ_FUA flag when recording an error in the superblock
  ext4 crypto: fix spelling typo in comment
  ext4 crypto: exit cleanly if ext4_derive_key_aes() fails
  ext4: reject journal options for ext2 mounts
  ext4: implement cgroup writeback support
  ext4: replace ext4_io_submit->io_op with ->io_wbc
  ext4 crypto: check for too-short encrypted file names
  ext4 crypto: use a jbd2 transaction when adding a crypto policy
  jbd2: speedup jbd2_journal_dirty_metadata()
2015-09-03 12:52:19 -07:00
Ira Weiny 0629cb06cd IB/core: Move SM class defines from ib_mad.h to ib_smi.h
When the hfi1 driver was added these definitions were moved from the qib driver
to ib_mad.h to be used by both qib and hfi1.  They should have been moved to
ib_smi.h instead.

Fixes: d4ab347005 ("IB/core: Add core header changes needed for OPA")
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 15:50:32 -04:00
Ira Weiny ce755c9b01 IB/core: Remove unnecessary defines from ib_mad.h
Remove the unused IB_NOTICE_REPRESS_* defines.

When the hfi1 driver was added these definitions were moved from the qib driver
to ib_mad.h.  They should have been removed instead.

Fixes: d4ab347005 ("IB/core: Add core header changes needed for OPA")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 15:49:37 -04:00
Ira Weiny 6f876ce4b5 IB/hfi1: Add PSM2 user space header to header_install
When the hfi1 driver was added a user space header file (hfi1_user.h) was added
to be shared between PSM2 and the driver.  However, the file was not added to
the header install.  Add it now.

Fixes: d4ab347005 ("IB/core: Add core header changes needed for OPA")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 15:29:52 -04:00
Linus Torvalds e31fb9e005 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext3 removal, quota & udf fixes from Jan Kara:
 "The biggest change in the pull is the removal of ext3 filesystem
  driver (~28k lines removed).  Ext4 driver is a full featured
  replacement these days and both RH and SUSE use it for several years
  without issues.  Also there are some workarounds in VM & block layer
  mainly for ext3 which we could eventually get rid of.

  Other larger change is addition of proper error handling for
  dquot_initialize().  The rest is small fixes and cleanups"

[ I wasn't convinced about the ext3 removal and worried about things
  falling through the cracks for legacy users, but ext4 maintainers
  piped up and were all unanimously in favor of removal, and maintaining
  all legacy ext3 support inside ext4.   - Linus ]

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Don't modify filesystem for read-only mounts
  quota: remove an unneeded condition
  ext4: memory leak on error in ext4_symlink()
  mm/Kconfig: NEED_BOUNCE_POOL: clean-up condition
  ext4: Improve ext4 Kconfig test
  block: Remove forced page bouncing under IO
  fs: Remove ext3 filesystem driver
  doc: Update doc about journalling layer
  jfs: Handle error from dquot_initialize()
  reiserfs: Handle error from dquot_initialize()
  ocfs2: Handle error from dquot_initialize()
  ext4: Handle error from dquot_initialize()
  ext2: Handle error from dquot_initalize()
  quota: Propagate error from ->acquire_dquot()
2015-09-03 12:28:30 -07:00
Jens Axboe 5e7c4274a7 block: Check for gaps on front and back merges
We are checking for gaps to previous bio_vec, which can
only detect back merges gaps. Moreover, at the point where
we check for a gap, we don't know if we will attempt a back
or a front merge. Thus, check for gap to prev in a back merge
attempt and check for a gap to next in a front merge attempt.

Signed-off-by: Jens Axboe <axboe@fb.com>
[sagig: Minor rename change]
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
2015-09-03 10:33:09 -06:00
Linus Torvalds dd5cdb48ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Another merge window, another set of networking changes.  I've heard
  rumblings that the lightweight tunnels infrastructure has been voted
  networking change of the year.  But what do I know?

   1) Add conntrack support to openvswitch, from Joe Stringer.

   2) Initial support for VRF (Virtual Routing and Forwarding), which
      allows the segmentation of routing paths without using multiple
      devices.  There are some semantic kinks to work out still, but
      this is a reasonably strong foundation.  From David Ahern.

   3) Remove spinlock fro act_bpf fast path, from Alexei Starovoitov.

   4) Ignore route nexthops with a link down state in ipv6, just like
      ipv4.  From Andy Gospodarek.

   5) Remove spinlock from fast path of act_gact and act_mirred, from
      Eric Dumazet.

   6) Document the DSA layer, from Florian Fainelli.

   7) Add netconsole support to bcmgenet, systemport, and DSA.  Also
      from Florian Fainelli.

   8) Add Mellanox Switch Driver and core infrastructure, from Jiri
      Pirko.

   9) Add support for "light weight tunnels", which allow for
      encapsulation and decapsulation without bearing the overhead of a
      full blown netdevice.  From Thomas Graf, Jiri Benc, and a cast of
      others.

  10) Add Identifier Locator Addressing support for ipv6, from Tom
      Herbert.

  11) Support fragmented SKBs in iwlwifi, from Johannes Berg.

  12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.

  13) Add BQL support to 3c59x driver, from Loganaden Velvindron.

  14) Stop using a zero TX queue length to mean that a device shouldn't
      have a qdisc attached, use an explicit flag instead.  From Phil
      Sutter.

  15) Use generic geneve netdevice infrastructure in openvswitch, from
      Pravin B Shelar.

  16) Add infrastructure to avoid re-forwarding a packet in software
      that was already forwarded by a hardware switch.  From Scott
      Feldman.

  17) Allow AF_PACKET fanout function to be implemented in a bpf
      program, from Willem de Bruijn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
  netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
  netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
  net: fec: clear receive interrupts before processing a packet
  ipv6: fix exthdrs offload registration in out_rt path
  xen-netback: add support for multicast control
  bgmac: Update fixed_phy_register()
  sock, diag: fix panic in sock_diag_put_filterinfo
  flow_dissector: Use 'const' where possible.
  flow_dissector: Fix function argument ordering dependency
  ixgbe: Resolve "initialized field overwritten" warnings
  ixgbe: Remove bimodal SR-IOV disabling
  ixgbe: Add support for reporting 2.5G link speed
  ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
  ixgbe: support for ethtool set_rxfh
  ixgbe: Avoid needless PHY access on copper phys
  ixgbe: cleanup to use cached mask value
  ixgbe: Remove second instance of lan_id variable
  ixgbe: use kzalloc for allocating one thing
  flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
  ixgbe: Remove unused PCI bus types
  ...
2015-09-03 08:08:17 -07:00
David Henningsson f0675d4a8e drm/i915: Drop port_mst_index parameter from pin/eld callback
The port_mst_index parameter was reserved for future use, but
maintainers prefer to add it later when it is actually used.

[Note: this is an update patch to commit [51e1d83cab: drm/i915: Call
 audio pin/ELD notify function] where I mistakenly applied the older
 version.  Jani and Daniel's review tags were to the latest version,
 so I add them below, too -- tiwai]

Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-09-03 12:09:03 +02:00
Linus Torvalds 1e1a4e8f43 Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper update from Mike Snitzer:

 - a couple small cleanups in dm-cache, dm-verity, persistent-data's
   dm-btree, and DM core.

 - a 4.1-stable fix for dm-cache that fixes the leaking of deferred bio
   prison cells

 - a 4.2-stable fix that adds feature reporting for the dm-stats
   features added in 4.2

 - improve DM-snapshot to not invalidate the on-disk snapshot if
   snapshot device write overflow occurs; but a write overflow triggered
   through the origin device will still invalidate the snapshot.

 - optimize DM-thinp's async discard submission a bit now that late bio
   splitting has been included in block core.

 - switch DM-cache's SMQ policy lock from using a mutex to a spinlock;
   improves performance on very low latency devices (eg. NVMe SSD).

 - document DM RAID 4/5/6's discard support

[ I did not pull the slab changes, which weren't appropriate for this
  tree, and weren't obviously the right thing to do anyway.  At the very
  least they need some discussion and explanation before getting merged.

  Because not pulling the actual tagged commit but doing a partial pull
  instead, this merge commit thus also obviously is missing the git
  signature from the original tag ]

* tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix use after freeing migrations
  dm cache: small cleanups related to deferred prison cell cleanup
  dm cache: fix leaking of deferred bio prison cells
  dm raid: document RAID 4/5/6 discard support
  dm stats: report precise_timestamps and histogram in @stats_list output
  dm thin: optimize async discard submission
  dm snapshot: don't invalidate on-disk image on snapshot write overflow
  dm: remove unlikely() before IS_ERR()
  dm: do not override error code returned from dm_get_device()
  dm: test return value for DM_MAPIO_SUBMITTED
  dm verity: remove unused mempool
  dm cache: move wake_waker() from free_migrations() to where it is needed
  dm btree remove: remove unused function get_nr_entries()
  dm btree: remove unused "dm_block_t root" parameter in btree_split_sibling()
  dm cache policy smq: change the mutex to a spinlock
2015-09-02 16:35:26 -07:00
Daniel Borkmann 62da98656b netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
Fengguang reported, that some randconfig generated the following linker
issue with nf_ct_zone_dflt object involved:

  [...]
  CC      init/version.o
  LD      init/built-in.o
  net/built-in.o: In function `ipv4_conntrack_defrag':
  nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
  net/built-in.o: In function `ipv6_defrag':
  nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
  make: *** [vmlinux] Error 1

Given that configurations exist where we have a built-in part, which is
accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
area when netfilter is configured in general.

Therefore, split the more generic parts into a common header under
include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
section that already holds parts related to CONFIG_NF_CONNTRACK in the
netfilter core. This fixes the issue on my side.

Fixes: 308ac9143e ("netfilter: nf_conntrack: push zone object into functions")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-02 16:32:56 -07:00
Linus Torvalds d975f309a8 Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-block
Pull SG updates from Jens Axboe:
 "This contains a set of scatter-gather related changes/fixes for 4.3:

   - Add support for limited chaining of sg tables even for
     architectures that do not set ARCH_HAS_SG_CHAIN.  From Christoph.

   - Add sg chain support to target_rd.  From Christoph.

   - Fixup open coded sg->page_link in crypto/omap-sham.  From
     Christoph.

   - Fixup open coded crypto ->page_link manipulation.  From Dan.

   - Also from Dan, automated fixup of manual sg_unmark_end()
     manipulations.

   - Also from Dan, automated fixup of open coded sg_phys()
     implementations.

   - From Robert Jarzmik, addition of an sg table splitting helper that
     drivers can use"

* 'for-4.3/sg' of git://git.kernel.dk/linux-block:
  lib: scatterlist: add sg splitting function
  scatterlist: use sg_phys()
  crypto/omap-sham: remove an open coded access to ->page_link
  scatterlist: remove open coded sg_unmark_end instances
  crypto: replace scatterwalk_sg_chain with sg_chain
  target/rd: always chain S/G list
  scatterlist: allow limited chaining without ARCH_HAS_SG_CHAIN
2015-09-02 13:22:38 -07:00
Linus Torvalds 52b084d31c Merge branch 'for-4.3/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "On top of the 4.3 core block IO changes, here are the driver related
  changes for 4.3.  Basically just NVMe and nbd this time around:

   - NVMe:
      - PRACT PI improvement from Alok Pandey.
      - Cleanups and improvements on submission queue doorbell and
        writing, using CMB if available.  From Jon Derrick.
      - From Keith, support for setting queue maximum segments, and
        reset support.
      - Also from Jon, fixup of u64 division issue on 32-bit archs and
        wiring up of the reset support through and ioctl.
      - Two small cleanups from Matias and Sunad

  - Various code cleanups and fixes from Markus Pargmann"

* 'for-4.3/drivers' of git://git.kernel.dk/linux-block:
  NVMe: Using PRACT bit to generate and verify PI by controller
  NVMe:Remove unreachable code in nvme_abort_req
  NVMe: Add nvme subsystem reset IOCTL
  NVMe: Add nvme subsystem reset support
  NVMe: removed unused nn var from nvme_dev_add
  NVMe: Set queue max segments
  nbd: flags is a u32 variable
  nbd: Rename functions for clearness of recv/send path
  nbd: Change 'disconnect' to be boolean
  nbd: Add debugfs entries
  nbd: Remove variable 'pid'
  nbd: Move clear queue debug message
  nbd: Remove 'harderror' and propagate error properly
  nbd: restructure sock_shutdown
  nbd: sock_shutdown, remove conditional lock
  nbd: Fix timeout detection
  nvme: Fixes u64 division which breaks i386 builds
  NVMe: Use CMB for the IO SQes if available
  NVMe: Unify SQ entry writing and doorbell ringing
2015-09-02 13:14:58 -07:00
Linus Torvalds 1081230b74 Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block
Pull core block updates from Jens Axboe:
 "This first core part of the block IO changes contains:

   - Cleanup of the bio IO error signaling from Christoph.  We used to
     rely on the uptodate bit and passing around of an error, now we
     store the error in the bio itself.

   - Improvement of the above from myself, by shrinking the bio size
     down again to fit in two cachelines on x86-64.

   - Revert of the max_hw_sectors cap removal from a revision again,
     from Jeff Moyer.  This caused performance regressions in various
     tests.  Reinstate the limit, bump it to a more reasonable size
     instead.

   - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
     Most devices have huge trim limits, which can cause nasty latencies
     when deleting files.  Enable the admin to configure the size down.
     We will look into having a more sane default instead of UINT_MAX
     sectors.

   - Improvement of the SGP gaps logic from Keith Busch.

   - Enable the block core to handle arbitrarily sized bios, which
     enables a nice simplification of bio_add_page() (which is an IO hot
     path).  From Kent.

   - Improvements to the partition io stats accounting, making it
     faster.  From Ming Lei.

   - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
     file in blk-mq, as well as a fix for a blk-mq timeout race
     condition.

   - Ming Lin has been carrying Kents above mentioned patches forward
     for a while, and testing them.  Ming also did a few fixes around
     that.

   - Sasha Levin found and fixed a use-after-free problem introduced by
     the bio->bi_error changes from Christoph.

   - Small blk cgroup cleanup from Viresh Kumar"

* 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
  blk: Fix bio_io_vec index when checking bvec gaps
  block: Replace SG_GAPS with new queue limits mask
  block: bump BLK_DEF_MAX_SECTORS to 2560
  Revert "block: remove artifical max_hw_sectors cap"
  blk-mq: fix race between timeout and freeing request
  blk-mq: fix buffer overflow when reading sysfs file of 'pending'
  Documentation: update notes in biovecs about arbitrarily sized bios
  block: remove bio_get_nr_vecs()
  fs: use helper bio_add_page() instead of open coding on bi_io_vec
  block: kill merge_bvec_fn() completely
  md/raid5: get rid of bio_fits_rdev()
  md/raid5: split bio for chunk_aligned_read
  block: remove split code in blkdev_issue_{discard,write_same}
  btrfs: remove bio splitting and merge_bvec_fn() calls
  bcache: remove driver private bio splitting code
  block: simplify bio_add_page()
  block: make generic_make_request handle arbitrarily sized bios
  blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
  block: don't access bio->bi_error after bio_put()
  block: shrink struct bio down to 2 cache lines again
  ...
2015-09-02 13:10:25 -07:00
Linus Torvalds df910390e2 SCSI misc on 20150901
This includes one new driver: cxlflash plus the usual grab bag of updates for
 the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop, plus a few assorted
 fixes.
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJV5eGiAAoJEDeqqVYsXL0MCpwH/AoneZOPeCx0YdTlyiojasi4
 Kc7ECmV9IJJoMoCbP8grwStvynYyCHDSphYmqopZPRlD021eG8ota2uRTHEGJI+q
 SoiZUlq8ti8xgnD55mubwO+UNF+zoELMyHUok2pGzBoZN5alA6nvKuNY7Hif3P3b
 YMT490oWQLjWmJkMW8TbpMn9nHpW0dfbP323uaggWsMy3CSI707+x36FLi1/ICg6
 MZRyv4aESAcauZGUI5EG+SrIl3OBQX7snsYXyuqD3biGqzbGc3p3L9uWG1qXHDbM
 OSGXhN+our0WYHCV1/UrGz7/IAWW1UU0W2qgCBwkXkDjkXJ4jqd36zLJxeuhSpE=
 =KOmP
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull first round of SCSI updates from James Bottomley:
 "This includes one new driver: cxlflash plus the usual grab bag of
  updates for the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop,
  plus a few assorted fixes.

  There's another tranch coming, but I want to incubate it another few
  days in the checkers, plus it includes a mpt2sas separated lifetime
  fix, which Avago won't get done testing until Friday"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (85 commits)
  aic94xx: set an error code on failure
  storvsc: Set the error code correctly in failure conditions
  storvsc: Allow write_same when host is windows 10
  storvsc: use storage protocol version to determine storage capabilities
  storvsc: use correct defaults for values determined by protocol negotiation
  storvsc: Untangle the storage protocol negotiation from the vmbus protocol negotiation.
  storvsc: Use a single value to track protocol versions
  storvsc: Rather than look for sets of specific protocol versions, make decisions based on ranges.
  cxlflash: Remove unused variable from queuecommand
  cxlflash: shift wrapping bug in afu_link_reset()
  cxlflash: off by one bug in cxlflash_show_port_status()
  cxlflash: Virtual LUN support
  cxlflash: Superpipe support
  cxlflash: Base error recovery support
  qla2xxx: Update driver version to 8.07.00.26-k
  qla2xxx: Add pci device id 0x2261.
  qla2xxx: Fix missing device login retries.
  qla2xxx: do not clear slot in outstanding cmd array
  qla2xxx: Remove decrement of sp reference count in abort handler.
  qla2xxx: Add support to show MPI and PEP FW version for ISP27xx.
  ...
2015-09-02 12:22:54 -07:00
Paul Durrant 210c34dcd8 xen-netback: add support for multicast control
Xen's PV network protocol includes messages to add/remove ethernet
multicast addresses to/from a filter list in the backend. This allows
the frontend to request the backend only forward multicast packets
which are of interest thus preventing unnecessary noise on the shared
ring.

The canonical netif header in git://xenbits.xen.org/xen.git specifies
the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal
necessary changes have been pulled into include/xen/interface/io/netif.h.

To prevent the frontend from extending the multicast filter list
arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries.
This limit is not specified by the protocol and so may change in future.
If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD
sent by the frontend will be failed with NETIF_RSP_ERROR.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-02 11:45:00 -07:00
Linus Torvalds 8bdc69b764 Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - a new PIDs controller is added.  It turns out that PIDs are actually
   an independent resource from kmem due to the limited PID space.

 - more core preparations for the v2 interface.  Once cpu side interface
   is settled, it should be ready for lifting the devel mask.
   for-4.3-unified-base was temporarily branched so that other trees
   (block) can pull cgroup core changes that blkcg changes depend on.

 - a non-critical idr_preload usage bug fix.

* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: pids: fix invalid get/put usage
  cgroup: introduce cgroup_subsys->legacy_name
  cgroup: don't print subsystems for the default hierarchy
  cgroup: make cftype->private a unsigned long
  cgroup: export cgrp_dfl_root
  cgroup: define controller file conventions
  cgroup: fix idr_preload usage
  cgroup: add documentation for the PIDs controller
  cgroup: implement the PIDs subsystem
  cgroup: allow a cgroup subsystem to reject a fork
2015-09-02 08:04:23 -07:00
Linus Torvalds 76ec51ef5e Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:
 "Minor cleanups"

* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: clean up of schunk->map[] assignment in pcpu_setup_first_chunk
  percpu: update incorrect comment for this_cpu_*() operations
2015-09-02 08:03:25 -07:00
Linus Torvalds 7d3e2eb178 Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
 "Only three trivial changes for workqueue this time - doc, MAINTAINERS
  and EXPORT_SYMBOL updates"

* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix some docbook warnings
  workqueue: Make flush_workqueue() available again to non GPL modules
  workqueue: add myself as a dedicated reviwer
2015-09-02 08:02:20 -07:00
Artem Savkov 16f7249ddf uapi/drm/i915_drm.h: fix userspace compilation.
commit 346add7834
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jul 14 18:07:30 2015 +0200

    drm/i915: Use expcitly fixed type in compat32 structs

changed the type of param field in drm_i915_getparam from int to
s32. This header is exported to userspace and needs to use userspace
type __s32 instead.

This fixes userspace compilation errors like the following:
include/drm/i915_drm.h:361:2: error: unknown type name 's32'
  s32 param;

Signed-off-by: Artem Savkov <asavkov@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-09-02 16:26:58 +03:00
Takashi Iwai 6869de380e ALSA: hda - Fix missing inline for dummy snd_hdac_set_codec_wakeup()
This seems overlooked.

Fixes: 98d8fc6c5d ('ALSA: hda - Move hda_i915.c from sound/pci/hda to sound/hda')
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-09-02 12:24:55 +02:00
David Henningsson 45c053df5b ALSA: hda - allow codecs to access the i915 pin/ELD callback
This lets the interested codec be notified when an i915 pin/ELD
event happens.

[tiwai: Fixed a trivial build error for CONFIG_SND_HDA_I915=n]

Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-09-02 12:23:55 +02:00
David Henningsson 2a8ceedf78 drm/i915: Add audio pin sense / ELD callback
This callback will be called by the i915 driver to notify the hda
driver that its HDMI information needs to be refreshed, i e,
that audio output is now available (or unavailable) - usually as a
result of a monitor being plugged in (or unplugged).

Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-09-02 11:30:44 +02:00
Vatika Harlalka 9642d18eee nohz: Affine unpinned timers to housekeepers
The problem addressed in this patch is about affining unpinned
timers. Adaptive or Full Dynticks CPUs are currently disturbed
by unnecessary jitter due to firing of such timers on them.

This patch will affine timers to online CPUs which are not full
dynticks in NOHZ_FULL configured systems. It should not
introduce overhead in nohz full off case due to static keys.

Signed-off-by: Vatika Harlalka <vatikaharlalka@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1441119060-2230-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-02 10:33:22 +02:00