Commit graph

56951 commits

Author SHA1 Message Date
Frederic Weisbecker d1e43fa5f8 nohz: Ensure full dynticks CPUs are RCU nocbs
We need full dynticks CPU to also be RCU nocb so
that we don't have to keep the tick to handle RCU
callbacks.

Make sure the range passed to nohz_full= boot
parameter is a subset of rcu_nocbs=

The CPUs that fail to meet this requirement will be
excluded from the nohz_full range. This is checked
early in boot time, before any CPU has the opportunity
to stop its tick.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-19 13:54:04 +02:00
Frederic Weisbecker c5bfece2d6 nohz: Switch from "extended nohz" to "full nohz" based naming
"Extended nohz" was used as a naming base for the full dynticks
API and Kconfig symbols. It reflects the fact the system tries
to stop the tick in more places than just idle.

But that "extended" name is a bit opaque and vague. Rename it to
"full" makes it clearer what the system tries to do under this
config: try to shutdown the tick anytime it can. The various
constraints that prevent that to happen shouldn't be considered
as fundamental properties of this feature but rather technical
issues that may be solved in the future.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-15 19:58:17 +02:00
Frederic Weisbecker 3451d0243c nohz: Rename CONFIG_NO_HZ to CONFIG_NO_HZ_COMMON
We are planning to convert the dynticks Kconfig options layout
into a choice menu. The user must be able to easily pick
any of the following implementations: constant periodic tick,
idle dynticks, full dynticks.

As this implies a mutual exclusion, the two dynticks implementions
need to converge on the selection of a common Kconfig option in order
to ease the sharing of a common infrastructure.

It would thus seem pretty natural to reuse CONFIG_NO_HZ to
that end. It already implements all the idle dynticks code
and the full dynticks depends on all that code for now.
So ideally the choice menu would propose CONFIG_NO_HZ_IDLE and
CONFIG_NO_HZ_EXTENDED then both would select CONFIG_NO_HZ.

On the other hand we want to stay backward compatible: if
CONFIG_NO_HZ is set in an older config file, we want to
enable CONFIG_NO_HZ_IDLE by default.

But we can't afford both at the same time or we run into
a circular dependency:

1) CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED both select
   CONFIG_NO_HZ
2) If CONFIG_NO_HZ is set, we default to CONFIG_NO_HZ_IDLE

We might be able to support that from Kconfig/Kbuild but it
may not be wise to introduce such a confusing behaviour.

So to solve this, create a new CONFIG_NO_HZ_COMMON option
which gathers the common code between idle and full dynticks
(that common code for now is simply the idle dynticks code)
and select it from their referring Kconfig.

Then we'll later create CONFIG_NO_HZ_IDLE and map CONFIG_NO_HZ
to it for backward compatibility.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-03 13:56:03 +02:00
Frederic Weisbecker 1c20091e77 nohz: Wake up full dynticks CPUs when a timer gets enqueued
Wake up a CPU when a timer list timer is enqueued there and
the target is part of the full dynticks range. Sending an IPI
to it makes it reconsidering the next timer to program on top
of recent updates.

This may later be improved by checking if the tick is really
stopped on the target. This would need some careful
synchronization though. So deal with such optimization later
and start simple.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-03-21 15:55:59 +01:00
Frederic Weisbecker a831881be2 nohz: Basic full dynticks interface
For extreme usecases such as Real Time or HPC, having
the ability to shutdown the tick when a single task runs
on a CPU is a desired feature:

* Reducing the amount of interrupts improves throughput
for CPU-bound tasks. The CPU is less distracted from its
real job, from an execution time and from the cache point
of views.

* This also improve latency response as we have less critical
sections.

Start with introducing a very simple interface to define
full dynticks CPU: use a boot time option defined cpumask
through the "nohz_extended=" kernel parameter. CPUs that
are part of this range will have their tick shutdown
whenever possible: provided they run a single task and
they don't do kernel activity that require the periodic
tick. These details will be later documented in
Documentation/*

An online CPU must be kept outside this range to handle the
timekeeping.

Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-03-21 15:38:33 +01:00
Frederic Weisbecker f792685006 math64: New div64_u64_rem helper
Provide an extended version of div64_u64() that
also returns the remainder of the division.

We are going to need this to refine the cputime
scaling code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
2013-03-13 18:03:27 +01:00
Ingo Molnar 4e3da46797 Merge branch 'sched/cputime' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core
Pull cputime changes from Frederic Weisbecker:

  * Generalize exception handling

  * Fix race in context tracking state restore on return from exception
    and irq exit kernel preemption

  * Fix cputime scaling in full dynticks accounting dynamic off-case

  * Fix default Kconfig value

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-08 16:41:22 +01:00
Frederic Weisbecker 9fbc42eac1 cputime: Dynamically scale cputime for full dynticks accounting
The full dynticks cputime accounting is able to account either
using the tick or the context tracking subsystem. This way
the housekeeping CPU can keep the low overhead tick based
solution.

This latter mode has a low jiffies resolution granularity and
need to be scaled against CFS precise runtime accounting to
improve its result. We are doing this for CONFIG_TICK_CPU_ACCOUNTING,
now we also need to expand it to full dynticks accounting dynamic
off-case as well.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-07 17:10:32 +01:00
Frederic Weisbecker 6c1e0256fa context_tracking: Restore correct previous context state on exception exit
On exception exit, we restore the previous context tracking state based on
the regs of the interrupted frame. Iff that frame is in user mode as
stated by user_mode() helper, we restore the context tracking user mode.

However there is a tiny chunck of low level arch code after we pass through
user_enter() and until the CPU eventually resumes userspace.
If an exception happens in this tiny area, exception_enter() correctly
exits the context tracking user mode but exception_exit() won't restore
it because of the value returned by user_mode(regs).

As a result we may return to userspace with the wrong context tracking
state.

To fix this, change exception_enter() to return the context tracking state
prior to its call and pass this saved state to exception_exit(). This restores
the real context tracking state of the interrupted frame.

(May be this patch was suggested to me, I don't recall exactly. If so,
sorry for the missing credit).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-07 17:10:11 +01:00
Frederic Weisbecker 56dd9470d7 context_tracking: Move exception handling to generic code
Exceptions handling on context tracking should share common
treatment: on entry we exit user mode if the exception triggered
in that context. Then on exception exit we return to that previous
context.

Generalize this to avoid duplication across archs.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Mats Liljegren <mats.liljegren@enea.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-07 17:09:25 +01:00
Li Zefan 25cc7da7e6 sched: Move group scheduling functions out of include/linux/sched.h
- Make sched_group_{set_,}runtime(), sched_group_{set_,}period()
and sched_rt_can_attach() static.

- Move sched_{create,destroy,online,offline}_group() to
kernel/sched/sched.h.

- Remove declaration of sched_group_shares().

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A7C5.3000708@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:34 +01:00
Li Zefan 15f803c94b sched: Make default_scale_freq_power() static
As default_scale_{freq,smt}_power() and update_rt_power() are
used in kernel/sched/fair.c only, annotate them as static
functions.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A7AF.8010900@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:34 +01:00
Li Zefan c82ba9fa75 sched: Move struct sched_class to kernel/sched/sched.h
It's used internally only.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A79F.8090502@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:33 +01:00
Li Zefan b13095f07f sched: Move wake flags to kernel/sched/sched.h
They are used internally only.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A78E.7040609@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:32 +01:00
Li Zefan 5e6521eaa1 sched: Move struct sched_group to kernel/sched/sched.h
Move struct sched_group_power and sched_group and related inline
functions to kernel/sched/sched.h, as they are used internally
only.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A77F.2010705@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:31 +01:00
Li Zefan cc1f4b1f3f sched: Move SCHED_LOAD_SHIFT macros to kernel/sched/sched.h
They are used internally only.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A771.4070104@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:30 +01:00
Li Zefan 090b582f27 sched: Remove test_sd_parent()
It's unused.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A75F.4070202@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:29 +01:00
Li Zefan 19a37d1cd5 sched: Remove some dummy functions
No one will call those functions if CONFIG_SCHED_DEBUG=n.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5135A748.3050206@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:24:28 +01:00
Linus Torvalds ea882c2ece UAPI disintegration 2012-12-20
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAUNNNlBOxKuMESys7AQKLmw//TQ8NRe7PGpsKpApuCVePmmIicGv9f9C2
 5Upovfl1Um144jDzpNDuvM+txzrZvh9riaLYbqvoWGUlP4kbX9VsMA6fJcLonUEk
 G9/RtdijZzhlNL59mhoLxUWJi5/jK8TAgStJmJ7TkZFFc8r0QXlHcx5+lClrYLAv
 jTkPlIqm9YhzeayOOPmbrI1lxgPJrf6MZ9BJh4cz93VPqKmYvA69/dqJAO/1JBHW
 YmWfzHy7NTchdWbw0+vGPlsX0xe5ymTl5reY94H23AxGhOB1CqiXqXyy4CyjaQ8g
 sxkhDJFmgA6+OyJQeG98lmcVV5heY0h4AizHvIuwnUZzIgXF2ipWMp1zhJf8yX9/
 wQQBtN0VZ9QGdo6lMzFd2jzqkVtYMyYuDasMb4OGVGtX/0w95j7uvRXjPEkRl21e
 B4In9VqYp8yKusPwGgHd4NkSUJS/iqg7FknkiOL/OI6rjqfF5/Ot2yL2Mhp7ozh+
 I8hflc2BaKv29UNRPAnI6Z2GngylLF0E4fvxU4E/L8EUwi7O3UxYqerT9fUDodrM
 xSdAlWK+iNBURo9G8/Au3XNpJwSAMYTBMJf0RevI7QpxqxYvjdygX+jiqfzH4x3A
 SMDeeSIsipbVJxZgCzRngQl/yPD/Cu3bRBBTV3Y8vVxt2Fv5vJBQ8pXhBBrcyx8M
 Wuoyk58CGaw=
 =JMlx
 -----END PGP SIGNATURE-----

Merge tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers

Pull fbdev UAPI disintegration from David Howells:
 "You'll be glad to here that the end is nigh for the UAPI patches.
  Only the fbdev/framebuffer piece remains now that the SCSI stuff has
  gone in.

  Here are the UAPI disintegration bits for the fbdev drivers.  It
  appears that Florian hasn't had time to deal with my patch, but back
  in December he did say he didn't mind if I pushed it forward."

Yay.  No more uapi movement.  And hopefully no more big header file
cleanups coming up either, it just tends to be very painful.

* tag 'disintegrate-fbdev-20121220' of git://git.infradead.org/users/dhowells/linux-headers:
  UAPI: (Scripted) Disintegrate include/video
2013-03-03 14:24:59 -08:00
Linus Torvalds 56a79b7b02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull  more VFS bits from Al Viro:
 "Unfortunately, it looks like xattr series will have to wait until the
  next cycle ;-/

  This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
  etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
  more file_inode() work"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify path_get/path_put and fs_struct.c stuff
  fix nommu breakage in shmem.c
  cache the value of file_inode() in struct file
  9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
  9p: make sure ->lookup() adds fid to the right dentry
  9p: untangle ->lookup() a bit
  9p: double iput() in ->lookup() if d_materialise_unique() fails
  9p: v9fs_fid_add() can't fail now
  v9fs: get rid of v9fs_dentry
  9p: turn fid->dlist into hlist
  9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
  more file_inode() open-coded instances
  selinux: opened file can't have NULL or negative ->f_path.dentry

(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
2013-03-03 13:23:03 -08:00
Linus Torvalds 8fd5e7a2d9 ImgTec Meta architecture changes for v3.9-rc1
This adds core architecture support for Imagination's Meta processor
 cores, followed by some later miscellaneous arch/metag cleanups and
 fixes which I kept separate to ease review:
 
  - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
  - A few fixes all over, particularly for symbol prefixes
  - A few privilege protection fixes
  - Several cleanups (setup.c includes, split out a lot of metag_ksyms.c)
  - Fix some missing exports
  - Convert hugetlb to use vm_unmapped_area()
  - Copy device tree to non-init memory
  - Provide dma_get_sgtable()
 
 Signed-off-by: James Hogan <james.hogan@imgtec.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMmVXAAoJEKHZs+irPybfivgP/inEXqJyfw59omQdjwvYcU/a
 /u0MJ3UKSNS3U+HknfaFCy/Nwk1dqPLjqqyVC1V6AbUPBXlaEwGcimlNRx2uRjdq
 Uh96upMLHsNuF/xiiR477g3RwY0egIJdM1R1bGi3mZ3vVrNQGF+wbni6f61xCWGz
 M/4rDglpQvE79oLhYdgj6tidZtHQT0YWtERA9W90zkQWXGYmpFPKBKbfZAi5+rKQ
 U6Gpg26orUugzXNaxltJEYKE8gjLTppEabx8DARnItZ4zCMy4dw5RBJ35RFvQw6e
 eSmfgTy9w9WqBMY2+QMSgU0KQt1IITCzX7OlOXC0jALQJXoU0WWbOELlBVQLCwF1
 T0OcR/5ZP/hIlOk5Dh+e9U3AtbASXdMtqA0ZUe78woH1CBf7Nc/0c0vRg23EdMh8
 lnHDJxT/UqskoOYLI4kgWbEdLDy4uTh19U2pVi7VCo7ksLB9Bj9Xc8VSKgscSXTl
 OwzN+c4Jgtu8FDFTp+Af4AT8pYGJ08j8L2ErsV2sOv3Q44U5WXdrMz3GSgwXj8+4
 wZk3HvdkQVkMD5sJCUZgAswaN6BnbB0pHdCz4wMQ8jR/Ogs015Ipk64Ecym9S/4n
 uES7PnDtt/4lb5EyX2ScbvdnZTAFTaaP7OOhC77BOQvbQjIW1tkAcxWJqRry86uS
 iM0BFgK6Ohx3geqa5Ft0
 =65cR
 -----END PGP SIGNATURE-----

Merge tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull new ImgTec Meta architecture from James Hogan:
 "This adds core architecture support for Imagination's Meta processor
  cores, followed by some later miscellaneous arch/metag cleanups and
  fixes which I kept separate to ease review:

   - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
   - A few fixes all over, particularly for symbol prefixes
   - A few privilege protection fixes
   - Several cleanups (setup.c includes, split out a lot of
     metag_ksyms.c)
   - Fix some missing exports
   - Convert hugetlb to use vm_unmapped_area()
   - Copy device tree to non-init memory
   - Provide dma_get_sgtable()"

* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
  metag: Provide dma_get_sgtable()
  metag: prom.h: remove declaration of metag_dt_memblock_reserve()
  metag: copy devicetree to non-init memory
  metag: cleanup metag_ksyms.c includes
  metag: move mm/init.c exports out of metag_ksyms.c
  metag: move usercopy.c exports out of metag_ksyms.c
  metag: move setup.c exports out of metag_ksyms.c
  metag: move kick.c exports out of metag_ksyms.c
  metag: move traps.c exports out of metag_ksyms.c
  metag: move irq enable out of irqflags.h on SMP
  genksyms: fix metag symbol prefix on crc symbols
  metag: hugetlb: convert to vm_unmapped_area()
  metag: export clear_page and copy_page
  metag: export metag_code_cache_flush_all
  metag: protect more non-MMU memory regions
  metag: make TXPRIVEXT bits explicit
  metag: kernel/setup.c: sort includes
  perf: Enable building perf tools for Meta
  metag: add boot time LNKGET/LNKSET check
  metag: add __init to metag_cache_probe()
  ...
2013-03-03 12:06:09 -08:00
Linus Torvalds 68b86a2522 Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
 "This contains:
   - fixes and improvements
   - devicetree bindings
   - conversion to watchdog generic framework of the following drivers:
        - booke_wdt
        - bcm47xx_wdt.c
        - at91sam9_wdt
   - Removal of old STMP3xxx driver
   - Addition of following new drivers:
        - new driver for STMP3xxx and i.MX23/28
        - Retu watchdog driver"

* git://www.linux-watchdog.org/linux-watchdog: (30 commits)
  watchdog: sp805_wdt depends on ARM
  watchdog: davinci_wdt: update to devm_* API
  watchdog: davinci_wdt: use devm managed clk get
  watchdog: at91rm9200: add DT support
  watchdog: add timeout-sec property binding
  watchdog: at91sam9_wdt: Convert to use the watchdog framework
  watchdog: omap_wdt: Add option nowayout
  watchdog: core: dt: add support for the timeout-sec dt property
  watchdog: bcm47xx_wdt.c: add hard timer
  watchdog: bcm47xx_wdt.c: rename wdt_time to timeout
  watchdog: bcm47xx_wdt.c: rename ops methods
  watchdog: bcm47xx_wdt.c: use platform device
  watchdog: bcm47xx_wdt.c: convert to watchdog core api
  watchdog: Convert BookE watchdog driver to watchdog infrastructure
  watchdog: s3c2410_wdt: Use devm_* functions
  watchdog: remove old STMP3xxx driver
  watchdog: add new driver for STMP3xxx and i.MX23/28
  rtc: stmp3xxx: add wdt-accessor function
  watchdog: introduce retu_wdt driver
  watchdog: intel_scu_watchdog: fix Kconfig dependency
  ...
2013-03-03 10:23:29 -08:00
Linus Torvalds 527c680f7c Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
Pull second set of slave-dmaengine updates from Vinod Koul:
 "Arnd's patch moves the dw_dmac to use generic DMA binding.  I agreed
  to merge this late as it will avoid the conflicts between trees.

  The second patch from Matt adding a dma_request_slave_channel_compat
  API was supposed to be picked up, but somehow never got picked up.
  Some patches dependent on this are already in -next :("

* 'next' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: dw_dmac: move to generic DMA binding
  dmaengine: add dma_request_slave_channel_compat()
2013-03-03 10:20:22 -08:00
Linus Torvalds a7c1120d2d Various bug fixes for ext4. The most important is a fix for the new
extent cache's slab shrinker which can cause significant, user-visible
 pauses when the system is under memory pressure.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRMpM3AAoJENNvdpvBGATwiXgP/3eSg3C+M0ZUeL6lH3aXRxQO
 PHxUL/Di5cfFs3GX4DksVzsD1KkTIz8B424AhdahrrgGh1jTx4/J23OrEdu9nK24
 JGU5hmowoCyG8PZG1kGMbX6EYcblYTx+O2tX/RInnRExm9ajkfxb0S1g0Vl340qw
 58WTSWfl2+J/3RnJ9TYX/qNVeCJdxLH3GkpFbvQbLGyylfM9hsUD5MZMAR1bpOJF
 U2vNdK3n65W0AtKhLo7TYnoJ4ll2PoFRvffS0rqhEpIAcRxpVsNThFJLBcOQ1a79
 6cCN5uhrJOlL5jLN/fYCViU1+03y7itCMJmtSpuyV8DtUGjf4r1tzlvWGeiSmpB9
 NprZ/MgO1ROnzO/gzPM2s4nWWeGZiGaf7vMDyScIDtqF1ckfHN17jqazuSJcybN8
 U83O9+KyhHkvr/+zqlySXiBX2MUSUdSE37CsMC7R+mAz7C46yjXEPuG8pLkLCWiG
 gjMD30D1f6+h+K646WN497+Crxl1CurEH+ON7k158cNvVNlX1FfFHUprRHeNUXkV
 tEKjiCUCf5WjNeFEc93nC/nDi4OIISD25N9LyHzp2CcV/XXRjpsrNPBFDAZjwgiK
 YVUQIwocVUVlRaACzrM9sDFtSELqNzy/GLuERITu1Mb2R4sMXIyvvJkjc+EuQS0F
 XVQ3BU5ypWyxJGrSGCPd
 =+vcC
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bug fixes from Ted Ts'o:
 "Various bug fixes for ext4.  The most important is a fix for the new
  extent cache's slab shrinker which can cause significant, user-visible
  pauses when the system is under memory pressure."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: enable quotas before orphan cleanup
  ext4: don't allow quota mount options when quota feature enabled
  ext4: fix a warning from sparse check for ext4_dir_llseek
  ext4: convert number of blocks to clusters properly
  ext4: fix possible memory leak in ext4_remount()
  jbd2: fix ERR_PTR dereference in jbd2__journal_start
  ext4: use percpu counter for extent cache count
  ext4: optimize ext4_es_shrink()
2013-03-02 19:33:21 -08:00
Linus Torvalds 8d05b3771d NFS client bugfixes for Linux 3.9
- Don't allow NFS silly-renamed files to be deleted
 - Don't start the retransmission timer when out of socket space
 - Fix a couple of pnfs-related Oopses.
 - Fix one more NFSv4 state recovery deadlock
 - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMpMhAAoJEGcL54qWCgDy4BMP/0Zl7Ei7x9bJSb1C1lpPSo5p
 Lr9XoHLYqhPcAwKUXQfgM5IkC69bE62bD5esmdDqkgZYqnmGE0E4LG6MsbsMmvzk
 yug5WOxmjOFee7Bdpd8B86Z0qsa4l2TkQu2h9G3zE36P2rPKQaNzpteIjhis5UEQ
 EfNyLoBdFcuUSh4ztMVZOzbAyDcbNfsyl03XVmlv+Qn/o0l42Zjth0qwOP60bjuM
 zJF1CkHi5NLbXEhmOev9mA6UYz6zWRbiA/Yu92pomtXVDtOtzWpUniBIcf/S1ZH/
 V8Gj6bWdHHyFCa2PjhY1/QdLBOPRPdxpAAJk+q48AKmzyiOU6g3lIHBp5ai1WZNI
 1C+SYxABE/EJgq9SoQYGqq6SUiolrFulqnFHXF0jHF+ifdjoHjSRmpGQAoyoZ0k1
 aSl+Ojqx7QHibJd8GZBavWc3upRDzhHDRRB3tkQCENi+hryBZxEyeS2Z54NmBRUN
 tsOuyac6rtknZdD8Do4DMt9uc9u1DWicaiZbLfkP1VL1Angh6NKSA7qbmH6giLBS
 9Y+DPcIk5e34uKQ21WTxFydGD+SMg0EMnOmfr6EYXWEHBhKNYVR+cHyH0mAF6RzX
 enU2g0H2m+3vUQqajPUP0DV/eLGtdsvWvMjiskc3KX90CWfHmV2C8GFSxjV2OkT1
 vG1KFrICO6DR2943Udit
 =FMtb
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "We've just concluded another Connectathon interoperability testing
  week, and so here are the fixes for the bugs that were discovered:

   - Don't allow NFS silly-renamed files to be deleted
   - Don't start the retransmission timer when out of socket space
   - Fix a couple of pnfs-related Oopses.
   - Fix one more NFSv4 state recovery deadlock
   - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"

* tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: One line comment fix
  NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
  SUNRPC: add call to get configured timeout
  PNFS: set the default DS timeout to 60 seconds
  NFSv4: Fix another open/open_recovery deadlock
  nfs: don't allow nfs_find_actor to match inodes of the wrong type
  NFSv4.1: Hold reference to layout hdr in layoutget
  pnfs: fix resend_to_mds for directio
  SUNRPC: Don't start the retransmission timer when out of socket space
  NFS: Don't allow NFS silly-renamed files to be deleted, no signal
2013-03-02 16:46:07 -08:00
Linus Torvalds b695188dd3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
 "The biggest feature in the pull is the new (and still experimental)
  raid56 code that David Woodhouse started long ago.  I'm still working
  on the parity logging setup that will avoid inconsistent parity after
  a crash, so this is only for testing right now.  But, I'd really like
  to get it out to a broader audience to hammer out any performance
  issues or other problems.

  scrub does not yet correct errors on raid5/6 either.

  Josef has another pass at fsync performance.  The big change here is
  to combine waiting for metadata with waiting for data, which is a big
  latency win.  It is also step one toward using atomics from the
  hardware during a commit.

  Mark Fasheh has a new way to use btrfs send/receive to send only the
  metadata changes.  SUSE is using this to make snapper more efficient
  at finding changes between snapshosts.

  Snapshot-aware defrag is also included.

  Otherwise we have a large number of fixes and cleanups.  Eric Sandeen
  wins the award for removing the most lines, and I'm hoping we steal
  this idea from XFS over and over again."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (118 commits)
  btrfs: fixup/remove module.h usage as required
  Btrfs: delete inline extents when we find them during logging
  btrfs: try harder to allocate raid56 stripe cache
  Btrfs: cleanup to make the function btrfs_delalloc_reserve_metadata more logic
  Btrfs: don't call btrfs_qgroup_free if just btrfs_qgroup_reserve fails
  Btrfs: remove reduplicate check about root in the function btrfs_clean_quota_tree
  Btrfs: return ENOMEM rather than use BUG_ON when btrfs_alloc_path fails
  Btrfs: fix missing deleted items in btrfs_clean_quota_tree
  btrfs: use only inline_pages from extent buffer
  Btrfs: fix wrong reserved space when deleting a snapshot/subvolume
  Btrfs: fix wrong reserved space in qgroup during snap/subv creation
  Btrfs: remove unnecessary dget_parent/dput when creating the pending snapshot
  btrfs: remove a printk from scan_one_device
  Btrfs: fix NULL pointer after aborting a transaction
  Btrfs: fix memory leak of log roots
  Btrfs: copy everything if we've created an inline extent
  btrfs: cleanup for open-coded alignment
  Btrfs: do not change inode flags in rename
  Btrfs: use reserved space for creating a snapshot
  clear chunk_alloc flag on retryable failure
  ...
2013-03-02 16:41:54 -08:00
Linus Torvalds 48476df998 Fairly unexciting MTD merge for 3.9:
* misc clean-ups in the MTD command-line partitioning parser (cmdlinepart)
  * add flash locking support for STmicro chips serial flash chips, as well as
    for CFI command set 2 chips.
  * new driver for the ELM error correction HW module found in various TI chips,
    enable the OMAP NAND driver to use the ELM HW error correction
  * added number of new serial flash IDs
  * various fixes and improvements in the gpmi NAND driver
  * bcm47xx NAND driver improvements
  * make the mtdpart module actually removable
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iEYEABECAAYFAlExEs8ACgkQdwG7hYl686Oa+gCgiBNt+I0MiixDeN+MGuE1uw9s
 i2wAniD9sR2HDy6y5SkbdXK/aPr8ez/p
 =xV1l
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20130301' of git://git.infradead.org/linux-mtd

Pull MTD update from David Woodhouse:
 "Fairly unexciting MTD merge for 3.9:

   - misc clean-ups in the MTD command-line partitioning parser
     (cmdlinepart)
   - add flash locking support for STmicro chips serial flash chips, as
     well as for CFI command set 2 chips.
   - new driver for the ELM error correction HW module found in various
     TI chips, enable the OMAP NAND driver to use the ELM HW error
     correction
   - added number of new serial flash IDs
   - various fixes and improvements in the gpmi NAND driver
   - bcm47xx NAND driver improvements
   - make the mtdpart module actually removable"

* tag 'for-linus-20130301' of git://git.infradead.org/linux-mtd: (45 commits)
  mtd: map: BUG() in non handled cases
  mtd: bcm47xxnflash: use pr_fmt for module prefix in messages
  mtd: davinci_nand: Use managed resources
  mtd: mtd_torturetest can cause stack overflows
  mtd: physmap_of: Convert device allocation to managed devm_kzalloc()
  mtd: at91: atmel_nand: for PMECC, add code to check the ONFI parameter ECC requirement.
  mtd: atmel_nand: make pmecc-cap, pmecc-sector-size in dts is optional.
  mtd: atmel_nand: avoid to report an error when lookup table offset is 0.
  mtd: bcm47xxsflash: adjust names of bus-specific functions
  mtd: bcm47xxpart: improve probing of nvram partition
  mtd: bcm47xxpart: add support for other erase sizes
  mtd: bcm47xxnflash: register this as normal driver
  mtd: bcm47xxnflash: fix message
  mtd: bcm47xxsflash: register this as normal driver
  mtd: bcm47xxsflash: write number of written bytes
  mtd: gpmi: add sanity check for the ECC
  mtd: gpmi: set the Golois Field bit for mx6q's BCH
  mtd: devices: elm: Removes <xx> literals in elm DT node
  mtd: gpmi: fix a dereferencing freed memory error
  mtd: fix the wrong timeo for panic_nand_wait()
  ...
2013-03-02 16:33:54 -08:00
James Hogan 9ca52ed979 mm: define VM_GROWSUP for CONFIG_METAG
Commit cc2383ec06 ("mm: introduce
arch-specific vma flag VM_ARCH_1") merged in v3.7-rc1.

The above commit combined several arch-specific vma flags into one, and
in the process it changed the VM_GROWSUP definition to depend on
specific architectures rather than CONFIG_STACK_GROWSUP. Therefore add
an ifdef for CONFIG_METAG to also set VM_GROWSUP.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-mm@kvack.org
2013-03-02 20:09:53 +00:00
James Hogan 5698c50d9d metag: Internal and external irqchips
Meta core internal interrupts (from HWSTATMETA and friends) are vectored
onto the TR1 core trigger for the current thread. This is demultiplexed
in irq-metag.c to individual Linux IRQs for each internal interrupt.

External SoC interrupts (from HWSTATEXT and friends) are vectored onto
the TR2 core trigger for the current thread. This is demultiplexed in
irq-metag-ext.c to individual Linux IRQs for each external SoC interrupt.
The external irqchip has devicetree bindings for configuring the number
of irq banks and the type of masking available.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Rob Landley <rob@landley.net>
Cc: Dom Cobley <popcornmix@gmail.com>
Cc: Simon Arlott <simon@fire.lp0.eu>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: linux-doc@vger.kernel.org
2013-03-02 20:09:48 +00:00
James Hogan a2c5d4ed92 metag: Time keeping
Add time keeping code for metag. Meta hardware threads have 2 timers.
The background timer (TXTIMER) is used as a free-running time base, and
the interrupt timer (TXTIMERI) is used for the timer interrupt. Both
counters traditionally count at approximately 1MHz.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-03-02 20:09:22 +00:00
James Hogan bc3966bf15 metag: ptrace
The ptrace interface for metag provides access to some core register
sets using the PTRACE_GETREGSET and PTRACE_SETREGSET operations. The
details of the internal context structures is abstracted into user API
structures to both ease use and allow flexibility to change the internal
context layouts. Copyin and copyout functions for these register sets
are exposed to allow signal handling code to use them to copy to and
from the signal context.

struct user_gp_regs (NT_PRSTATUS) provides access to the core general
purpose register context.

struct user_cb_regs (NT_METAG_CBUF) provides access to the TXCATCH*
registers which contains information abuot a memory fault, unaligned
access error or watchpoint. This can be modified to alter the way the
fault is replayed on resume ("catch replay"), or to prevent the replay
taking place.

struct user_rp_state (NT_METAG_RPIPE) provides access to the state of
the Meta read pipeline which can be used to hide memory latencies in
hand optimised data loops.

Extended DSP register state, DSP RAM, and hardware breakpoint registers
aren't yet exposed through ptrace.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
2013-03-02 20:09:22 +00:00
James Hogan 4dd3c95940 asm-generic/unistd.h: handle symbol prefixes in cond_syscall
Some architectures have symbol prefixes and set CONFIG_SYMBOL_PREFIX,
but this wasn't taken into account by the generic cond_syscall. It's
easy enough to fix in a generic fashion, so add the symbol prefix to
symbol names in cond_syscall when CONFIG_SYMBOL_PREFIX is set.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
2013-03-02 20:09:14 +00:00
James Hogan c93d031231 asm-generic/io.h: check CONFIG_VIRT_TO_BUS
Make asm-generic/io.h check CONFIG_VIRT_TO_BUS before defining
virt_to_bus() and bus_to_virt(), otherwise it's easy to accidentally
have a silently failing incorrect direct mapped definition rather then
no definition at all.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2013-03-02 20:09:14 +00:00
Linus Torvalds 37cae6ad4c The main addition here is a long-desired target framework to allow
an SSD to be used as a cache in front of a slower device.  Cache
 tuning is delegated to interchangeable policy modules so these can
 be developed independently of the mechanics needed to shuffle the
 data around.
 
 Other than that, kcopyd users acquire a throttling parameter, ioctl
 buffer usage gets streamlined, more mempool reliance is reduced
 and there are a few other bug fixes and tidy-ups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMTADAAoJEK2W1qbAHj1nAiUP/1Az6kkg+pYQAkD2f3aQLHi8
 b3SfkZ5mMSZkBYRUuMzZ5oMuSVZuyoRQi8+qy5PsUSDYO8jH2ja825jOa584OqVW
 XYBbNi6Te/1L+CMWWCYf5ShyZuu4o5TwIV7IWbaPsVWIwflaTF9WRyEjxG3WRM3X
 XglrGJqb321Gx2ZI3OwxIPGyXRQWFTCmZ5pshAVAedspE86tKRh2kx3dgLw4qcu/
 3YYMV8DBga+xgNVM/EuoXQLsPpuxdrsSfdYZo7cTvKevgEuO+0vgWZORjCWFfICd
 ypwco9FHAU5dSjEiQ0rXO5Lhbv3Dx5vfm+vrzPSKUp4WVFhZKLHOZEFDylluEidv
 jX7a5RohUj/mDOTp+tgZ1GbpTWXtjX9uhoQWbeqZmz189TYX+twnVaNR47Q+YjcL
 VtYP7SgsUXTzdGJJ/RtGM6zhlgALI5dC6smNGbHLcyFcmIKIAa6LtyE86iPw14DY
 axKfNwL/c5PmtcmAYxIBx6CDK+M+MunK50/BtX4pOrQyzgbyxtylvrgq/BzvI5m3
 FwMkcziDMA0Dhwi1PpXz+MyOnVvGhEb6CrzYrylB98aFyDw2YvwUytBvisTmwSoJ
 YNn4vwCwJIWetu0ls5SZERLHPOh3rjwaufq9nVXew/FXF8gNjrOTZLHz3XtZbtw5
 TMntpytLQ5N9A7bRPuDz
 =yXuO
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper update from Alasdair G Kergon:
 "The main addition here is a long-desired target framework to allow an
  SSD to be used as a cache in front of a slower device.  Cache tuning
  is delegated to interchangeable policy modules so these can be
  developed independently of the mechanics needed to shuffle the data
  around.

  Other than that, kcopyd users acquire a throttling parameter, ioctl
  buffer usage gets streamlined, more mempool reliance is reduced and
  there are a few other bug fixes and tidy-ups."

* tag 'dm-3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (30 commits)
  dm cache: add cleaner policy
  dm cache: add mq policy
  dm: add cache target
  dm persistent data: add bitset
  dm persistent data: add transactional array
  dm thin: remove cells from stack
  dm bio prison: pass cell memory in
  dm persistent data: add btree_walk
  dm: add target num_write_bios fn
  dm kcopyd: introduce configurable throttling
  dm ioctl: allow message to return data
  dm ioctl: optimize functions without variable params
  dm ioctl: introduce ioctl_flags
  dm: merge io_pool and tio_pool
  dm: remove unused _rq_bio_info_cache
  dm: fix limits initialization when there are no data devices
  dm snapshot: add missing module aliases
  dm persistent data: set some btree fn parms const
  dm: refactor bio cloning
  dm: rename bio cloning functions
  ...
2013-03-02 11:44:27 -08:00
Linus Torvalds 426d266c12 SCSI for-linus on 20130301
This is an assorted set of stragglers into the merge window with driver
 updates for qla2xxx, megaraid_sas, storvsc and ufs.  It also includes pulls of
 the uapi tree (all the remaining SCSI pieces) and the fcoe tree (updates to
 fcoe and libfc)
 
 Signed-off-by: James Bottomley <JBottomley@Parallels.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJRMHJHAAoJEDeqqVYsXL0M9tAH/2YG3TCfy0RFAejGgLfX9OGH
 6eFe71m7Z8nfIEneAnm5BuKjCx3QFRp5UFjJZdFHLP1Qv0TbpKs4FnZyeSGTxLQp
 S1fZc5sTWmsb5qYxLaukKopC6sFx+hNI2dvB+rgKcd+nWy1RzG7lGqbS4CRNE76q
 UNByqlfqJxn5cfQw7dg2zOUKlGaGL2jSyFf0QFXR2IZzO33PeyBPfKDFeJC6b+oc
 XTy9MK9V5u6ne3XimDTU2hP4lPAsZaJtcqsv1Gvv2y+BHalQiPqfL6bZMvN3Zbfq
 hfT+i2xnYy85858gxtyIhzHwU14zF7I0HEWiVpddsF9NDK7iNKvK8aWHaTs7qis=
 =hvGQ
 -----END PGP SIGNATURE-----

Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is an assorted set of stragglers into the merge window with
  driver updates for qla2xxx, megaraid_sas, storvsc and ufs.

  It also includes pulls of the uapi tree (all the remaining SCSI
  pieces) and the fcoe tree (updates to fcoe and libfc)"

* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (81 commits)
  [SCSI] ufs: Separate PCI code into glue driver
  [SCSI] ufs: Segregate PCI Specific Code
  [SCSI] scsi: fix lpfc build when wmb() is defined as mb()
  [SCSI] storvsc: Handle dynamic resizing of the device
  [SCSI] storvsc: Restructure error handling code on command completion
  [SCSI] storvsc: avoid usage of WRITE_SAME
  [SCSI] aacraid: suppress two GCC warnings
  [SCSI] hpsa: check for dma_mapping_error in hpsa_passthru ioctls
  [SCSI] hpsa: reorganize error handling in hpsa_passthru_ioctl
  [SCSI] hpsa: check for dma_mapping_error in hpsa_map_sg_chain_block
  [SCSI] hpsa: Check for dma_mapping_error for all code paths using fill_cmd
  [SCSI] hpsa: Check for dma_mapping_error in hpsa_map_one
  [SCSI] dc395x: uninitialized variable in device_alloc()
  [SCSI] Fix range check in scsi_host_dif_capable()
  [SCSI] storvsc: Initialize the sglist
  [SCSI] mpt2sas: Add support for OEM specific controller
  [SCSI] ipr: Fix oops while resetting an ipr adapter
  [SCSI] fnic: Fnic Trace Utility
  [SCSI] fnic: New debug flags and debug log messages
  [SCSI] fnic: fnic driver may hit BUG_ON on device reset
  ...
2013-03-02 11:42:16 -08:00
Yinghai Lu 20e6926dcb x86, ACPI, mm: Revert movablemem_map support
Tim found:

  WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
  Hardware name: S2600CP
  sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
  smpboot: Booting Node   1, Processors  #1
  Modules linked in:
  Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
  Call Trace:
    set_cpu_sibling_map+0x279/0x449
    start_secondary+0x11d/0x1e5

Don Morris reproduced on a HP z620 workstation, and bisected it to
commit e8d1955258 ("acpi, memory-hotplug: parse SRAT before memblock
is ready")

It turns out movable_map has some problems, and it breaks several things

1. numa_init is called several times, NOT just for srat. so those
	nodes_clear(numa_nodes_parsed)
	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
   can not be just removed.  Need to consider sequence is: numaq, srat, amd, dummy.
   and make fall back path working.

2. simply split acpi_numa_init to early_parse_srat.
   a. that early_parse_srat is NOT called for ia64, so you break ia64.
   b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
	     set_apicid_to_node(i, NUMA_NO_NODE)
     still left in numa_init. So it will just clear result from early_parse_srat.
     it should be moved before that....
   c.  it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
       early before override from INITRD is settled.

3. that patch TITLE is total misleading, there is NO x86 in the title,
   but it changes critical x86 code. It caused x86 guys did not
   pay attention to find the problem early. Those patches really should
   be routed via tip/x86/mm.

4. after that commit, following range can not use movable ram:
  a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
  b. initrd... it will be freed after booting, so it could be on movable...
  c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
	anymore.
  d. init_mem_mapping: can not put page table high anymore.
  e. initmem_init: vmemmap can not be high local node anymore. That is
     not good.

If node is hotplugable, the mem related range like page table and
vmemmap could be on the that node without problem and should be on that
node.

We have workaround patch that could fix some problems, but some can not
be fixed.

So just remove that offending commit and related ones including:

 f7210e6c4a ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
    protect movablecore_map in memblock_overlaps_region().")

 01a178a94e ("acpi, memory-hotplug: support getting hotplug info from
    SRAT")

 27168d38fa ("acpi, memory-hotplug: extend movablemem_map ranges to
    the end of node")

 e8d1955258 ("acpi, memory-hotplug: parse SRAT before memblock is
    ready")

 fb06bc8e5f ("page_alloc: bootmem limit with movablecore_map")

 42f47e27e7 ("page_alloc: make movablemem_map have higher priority")

 6981ec3114 ("page_alloc: introduce zone_movable_limit[] to keep
    movable limit for nodes")

 34b71f1e04 ("page_alloc: add movable_memmap kernel parameter")

 4d59a75125 ("x86: get pg_data_t's memory from other node")

Later we should have patches that will make sure kernel put page table
and vmemmap on local node ram instead of push them down to node0.  Also
need to find way to put other kernel used ram to local node ram.

Reported-by: Tim Gardner <tim.gardner@canonical.com>
Reported-by: Don Morris <don.morris@hp.com>
Bisected-by: Don Morris <don.morris@hp.com>
Tested-by: Don Morris <don.morris@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-02 09:34:39 -08:00
Linus Torvalds 14cc0b55b7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal/compat fixes from Al Viro:
 "Fixes for several regressions introduced in the last signal.git pile,
  along with fixing bugs in truncate and ftruncate compat (on just about
  anything biarch at least one of those two had been done wrong)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  compat: restore timerfd settime and gettime compat syscalls
  [regression] braino in "sparc: convert to ksignal"
  fix compat truncate/ftruncate
  switch lseek to COMPAT_SYSCALL_DEFINE
  lseek() and truncate() on sparc really need sign extension
2013-03-02 08:34:06 -08:00
Linus Torvalds e23b62256a Initial ARC Linux port with some fixes on top for 3.9-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMZYbAAoJEGnX8d3iisJeFj8P/R1hWohDDUc8pG3+ov9Y2Brt
 g7oIVw1udlKIk3HhVwsyT14/UHunfcTCONKKKGmmbfRrLJSMMsSlXvYoAQLozokf
 TuaO3Xt5IfERROqTrCDSwdNaAmwIZGsIuI9jWKo4qsXovAL0nc3sR527qMI1o7OE
 9X5eIqOJ/rPvOjXYrPqXmvO/DZKYG8PNTH4PYqePes31CsAmDXWBIlgYmgvF3th3
 ptwKPgRU/c2wKNpDDJXVQg/bcg9NI2cCnndNrjgXZgyUQrC37ZTdt/IOF5w6FgIW
 6i6UbDKXn8MgQAhrXx0Ns/+0kSJZ7eBWmj8hLyrxUzOYlF4rCs/il6ofDRaMO6fv
 9LmbNZXYnGICzm1YAxZRK7dm13IbDnltmMc81vISBpJSMTBgqzLWobHnq5/67Wh4
 2oUkoc2Tfaw70FnRCewX0x4Qop2YXmXl1KBwdecvzdcKi6Yg+rRH08ur/0yyCyx7
 +vAQpPVIuVqCc916qwmCPFaf1UMNnmMStxNH7D1AQHvi1G372NxfXizdYyKFRY9N
 f5Q+6DTo1xh2AxuGieSZxBoeK0Rlp4DWTOBD4MMz29y7BRX7LK1U2iS+nW0g8uir
 3RdYeAqyCxlJtjJNQX9U8ZT54jUPZgvJWU0udesRN1CBdOSQMjM9OyZsLLRtLeX1
 ww2tc7zqhBUyjBej6Itg
 =NKkW
 -----END PGP SIGNATURE-----

Merge tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull new ARC architecture from Vineet Gupta:
 "Initial ARC Linux port with some fixes on top for 3.9-rc1:

  I would like to introduce the Linux port to ARC Processors (from
  Synopsys) for 3.9-rc1.  The patch-set has been discussed on the public
  lists since Nov and has received a fair bit of review, specially from
  Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb...

  The arch bits are in arch/arc, some asm-generic changes (acked by
  Arnd), a minor change to PARISC (acked by Helge).

  The series is a touch bigger for a new port for 2 main reasons:

   1. It enables a basic kernel in first sub-series and adds
      ptrace/kgdb/.. later

   2. Some of the fallout of review (DeviceTree support, multi-platform-
      image support) were added on top of orig series, primarily to
      record the revision history.

  This updated pull request additionally contains

   - fixes due to our GNU tools catching up with the new syscall/ptrace
     ABI

   - some (minor) cross-arch Kconfig updates."

* tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits)
  ARC: split elf.h into uapi and export it for userspace
  ARC: Fixup the current ABI version
  ARC: gdbserver using regset interface possibly broken
  ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window
  ARC: make a copy of flat DT
  ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed"
  ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled
  ARC: Fix pt_orig_r8 access
  ARC: [3.9] Fallout of hlist iterator update
  ARC: 64bit RTSC timestamp hardware issue
  ARC: Don't fiddle with non-existent caches
  ARC: Add self to MAINTAINERS
  ARC: Provide a default serial.h for uart drivers needing BASE_BAUD
  ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux
  ARC: [Review] Multi-platform image #8: platform registers SMP callbacks
  ARC: [Review] Multi-platform image #7: SMP common code to use callbacks
  ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional
  ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core
  ARC: [Review] Multi-platform image #4: Isolate platform headers
  ARC: [Review] Multi-platform image #3: switch to board callback
  ...
2013-03-02 07:58:56 -08:00
Al Viro dcf787f391 constify path_get/path_put and fs_struct.c stuff
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-01 23:51:07 -05:00
Al Viro dd37978c50 cache the value of file_inode() in struct file
Note that this thing does *not* contribute to inode refcount;
it's pinned down by dentry.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-01 19:48:30 -05:00
Alasdair G Kergon b0d8ed4d96 dm: add target num_write_bios fn
Add a num_write_bios function to struct target.

If an instance of a target sets this, it will be queried before the
target's mapping function is called on a write bio, and the response
controls the number of copies of the write bio that the target will
receive.

This provides a convenient way for a target to send the same data to
more than one device.  The new cache target uses this in writethrough
mode, to send the data both to the cache and the backing device.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01 22:45:49 +00:00
Mikulas Patocka df5d2e9089 dm kcopyd: introduce configurable throttling
This patch allows the administrator to reduce the rate at which kcopyd
issues I/O.

Each module that uses kcopyd acquires a throttle parameter that can be
set in /sys/module/*/parameters.

We maintain a history of kcopyd usage by each module in the variables
io_period and total_period in struct dm_kcopyd_throttle. The actual
kcopyd activity is calculated as a percentage of time equal to
"(100 * io_period / total_period)".  This is compared with the user-defined
throttle percentage threshold and if it is exceeded, we sleep.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01 22:45:49 +00:00
Mikulas Patocka a26062416e dm ioctl: allow message to return data
This patch introduces enhanced message support that allows the
device-mapper core to recognise messages that are common to all devices,
and for messages to return data to userspace.

Core messages are processed by the function "message_for_md".  If the
device mapper doesn't support the message, it is passed to the target
driver.

If the message returns data, the kernel sets the flag
DM_MESSAGE_OUT_FLAG.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01 22:45:49 +00:00
Mikulas Patocka 02cde50b7e dm ioctl: optimize functions without variable params
Device-mapper ioctls receive and send data in a buffer supplied
by userspace.  The buffer has two parts.  The first part contains
a 'struct dm_ioctl' and has a fixed size.  The second part depends
on the ioctl and has a variable size.

This patch recognises the specific ioctls that do not use the variable
part of the buffer and skips allocating memory for it.

In particular, when a device is suspended and a resume ioctl is sent,
this now avoid memory allocation completely.

The variable "struct dm_ioctl tmp" is moved from the function
copy_params to its caller ctl_ioctl and renamed to param_kernel.
It is used directly when the ioctl function doesn't need any arguments.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01 22:45:49 +00:00
Alasdair G Kergon 55a62eef8d dm: rename request variables to bios
Use 'bio' in the name of variables and functions that deal with
bios rather than 'request' to avoid confusion with the normal
block layer use of 'request'.

No functional changes.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01 22:45:47 +00:00
Mikulas Patocka fd7c092e71 dm: fix truncated status strings
Avoid returning a truncated table or status string instead of setting
the DM_BUFFER_FULL_FLAG when the last target of a table fills the
buffer.

When processing a table or status request, the function retrieve_status
calls ti->type->status. If ti->type->status returns non-zero,
retrieve_status assumes that the buffer overflowed and sets
DM_BUFFER_FULL_FLAG.

However, targets don't return non-zero values from their status method
on overflow. Most targets returns always zero.

If a buffer overflow happens in a target that is not the last in the
table, it gets noticed during the next iteration of the loop in
retrieve_status; but if a buffer overflow happens in the last target, it
goes unnoticed and erroneously truncated data is returned.

In the current code, the targets behave in the following way:
* dm-crypt returns -ENOMEM if there is not enough space to store the
  key, but it returns 0 on all other overflows.
* dm-thin returns errors from the status method if a disk error happened.
  This is incorrect because retrieve_status doesn't check the error
  code, it assumes that all non-zero values mean buffer overflow.
* all the other targets always return 0.

This patch changes the ti->type->status function to return void (because
most targets don't use the return code). Overflow is detected in
retrieve_status: if the status method fills up the remaining space
completely, it is assumed that buffer overflow happened.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01 22:45:44 +00:00
Randy Dunlap 8eae508b7c hsi: fix kernel-doc warnings
Fix kernel-doc warnings in hsi files:

  Warning(include/linux/hsi/hsi.h:136): Excess struct/union/enum/typedef member 'e_handler' description in 'hsi_client'
  Warning(include/linux/hsi/hsi.h:136): Excess struct/union/enum/typedef member 'pclaimed' description in 'hsi_client'
  Warning(include/linux/hsi/hsi.h:136): Excess struct/union/enum/typedef member 'nb' description in 'hsi_client'
  Warning(drivers/hsi/hsi.c:434): No description found for parameter 'handler'
  Warning(drivers/hsi/hsi.c:434): Excess function parameter 'cb' description in 'hsi_register_port_event'

Don't document "private:" fields with kernel-doc notation.
If you want to leave them fully documented, that's OK, but
then don't mark them as "private:".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Carlos Chinea <carlos.chinea@nokia.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-01 13:39:00 -08:00
Fabio Porcedda 3048253ed9 watchdog: core: dt: add support for the timeout-sec dt property
Add support for watchdog drivers to initialize/set the timeout field
of the watchdog_device structure. The timeout field is initialised
either with the module timeout parameter value (if valid) or with the
timeout-sec dt property (if valid). If both are invalid the initial
value is unchanged.

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-03-01 12:48:36 +01:00
Hauke Mehrtens f82dedf812 watchdog: bcm47xx_wdt.c: use platform device
Instead of accessing the function to set the watchdog timer directly,
register a platform driver the platform could register to use this
watchdog driver.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-03-01 12:47:16 +01:00
Wolfram Sang 1a71fb84fd rtc: stmp3xxx: add wdt-accessor function
This RTC also includes a watchdog timer. Provide an accessor function
for setting the watchdog timeout value which will be picked up by a
watchdog driver. Also register the platform_device for the watchdog here
to get the boot-time dependencies right.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-03-01 12:40:36 +01:00