1
0
Fork 0
Commit Graph

1589 Commits (578bd4dfcda63d2ef15f025f1d5d55c0e56b9660)

Author SHA1 Message Date
Ingo Molnar a6b9b4d50f Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-08-23 11:32:34 +02:00
Linus Torvalds 9ee47476d6 Merge branch 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev
* 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
  radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
  radix-tree: clear all tags in radix_tree_node_rcu_free
2010-08-22 19:55:14 -07:00
Dave Chinner 144dcfc012 radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
Commit ebf8aa44be ("radix-tree:
omplement function radix_tree_range_tag_if_tagged") does not safely
set tags on on intermediate tree nodes. The code walks down the tree
setting tags before it has fully resolved the path to the leaf under
the assumption there will be a leaf slot with the tag set in the
range it is searching.

Unfortunately, this is not a valid assumption - we can abort after
setting a tag on an intermediate node if we overrun the number of
tags we are allowed to set in a batch, or stop scanning because we
we have passed the last scan index before we reach a leaf slot with
the tag we are searching for set.

As a result, we can leave the function with tags set on intemediate
nodes which can be tripped over later by tag-based lookups. The
result of these stale tags is that lookup may end prematurely or
livelock because the lookup cannot make progress.

The fix for the problem involves reocrding the traversal path we
take to the leaf nodes, and only propagating the tags back up the
tree once the tag is set in the leaf node slot. We are already
recording the path for efficient traversal, so there is no
additional overhead to do the intermediately node tag setting in
this manner.

This fixes a radix tree lookup livelock triggered by the new
writeback sync livelock avoidance code introduced in commit
f446daaea9 ("mm: implement writeback
livelock avoidance using page tagging").

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
2010-08-23 10:33:53 +10:00
Dave Chinner b6dd08652e radix-tree: clear all tags in radix_tree_node_rcu_free
Commit f446daaea9 ("mm: implement
writeback livelock avoidance using page tagging") introduced a new
radix tree tag, increasing the number of tags in each node from 2 to
3. It did not, however, fix up the code in
radix_tree_node_rcu_free() that cleans up after radix_tree_shrink()
and hence could leave stray tags set in the new tag array.

The result is that the livelock avoidance code added in the the
above commit would hit stale tags when doing tag based lookups,
resulting in livelocks when trying to traverse the tree.

Fix this problem in radix_tree_node_rcu_free() so it doesn't happen
again in the future by using a loop to walk all the tags up to
RADIX_TREE_MAX_TAGS to clear the stray tags radix_tree_shrink()
leaves behind.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Nick Piggin <npiggin@kernel.dk>
Acked-by: Jan Kara <jack@suse.cz>
2010-08-23 10:33:19 +10:00
Jan Kara d5ed3a4af7 lib/radix-tree.c: fix overflow in radix_tree_range_tag_if_tagged()
When radix_tree_maxindex() is ~0UL, it can happen that scanning overflows
index and tree traversal code goes astray reading memory until it hits
unreadable memory.  Check for overflow and exit in that case.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-20 09:34:55 -07:00
Paul E. McKenney 910b1b7e19 rcu: Allow RCU CPU stall warnings to be off at boot, but manually enablable
Currently, if RCU CPU stall warnings are enabled, they are enabled
immediately upon boot.  They can be manually disabled via /sys (and
also re-enabled via /sys), and are automatically disabled upon panic.
However, some users need RCU CPU stalls to be disabled at boot time,
but to be enabled without rebuilding/rebooting.  For example, someone
running a real-time application in production might not want the
additional latency of RCU CPU stall detection in normal operation, but
might need to enable it at any point for fault isolation purposes.

This commit therefore provides a new CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel configuration parameter that maintains the current behavior
(enable at boot) by default, but allows a kernel to be configured
with RCU CPU stall detection built into the kernel, but disabled at
boot time.

Requested-by: Clark Williams <williams@redhat.com>
Requested-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-08-19 17:18:04 -07:00
Arnd Bergmann a1115570b3 radix-tree: __rcu annotations
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:18:03 -07:00
Paul E. McKenney b163760e37 rcu: make CPU stall warning timeout configurable
Also set the default to 60 seconds, up from the previous hard-coded timeout
of 10 seconds.  This allows people who care to set short timeouts, while
avoiding people with unusual configurations (make randconfig!!!) from being
bothered with spurious CPU stall warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:18:02 -07:00
Paul E. McKenney ca5ecddfa8 rcu: define __rcu address space modifier for sparse
This commit provides definitions for the __rcu annotation defined earlier.
This annotation permits sparse to check for correct use of RCU-protected
pointers.  If a pointer that is annotated with __rcu is accessed
directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
or one of their variants), sparse can be made to complain.  To enable
such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
kernel configuration option.  Please note that these sparse complaints are
intended to be a debugging aid, -not- a code-style-enforcement mechanism.

There are special rcu_dereference_protected() and rcu_access_pointer()
accessors for use when RCU read-side protection is not required, for
example, when no other CPU has access to the data structure in question
or while the current CPU hold the update-side lock.

This patch also updates a number of docbook comments that were showing
their age.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Christopher Li <sparse@chrisli.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2010-08-19 17:17:59 -07:00
Randy Dunlap 625fdcaa6d latencytop: Fix kconfig dependency warnings
warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHED_DEBUG which has unmet direct dependencies (DEBUG_KERNEL &&
PROC_FS) warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHEDSTATS which has unmet direct dependencies (DEBUG_KERNEL && PROC_FS)

Add depends on STACKTRACE_SUPPORT for 'select STACKTRACE'.
Add depends on PROC_FS since that is where the output goes.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20100812123121.a7c99cde.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-17 09:09:03 +02:00
Linus Torvalds 7367f5b013 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  Further tidyup of raid6 naming in lib/raid6
  Make lib/raid6/test build correctly.
  Rename raid6 files now they're in a 'raid6' directory.
2010-08-12 10:08:10 -07:00
David Howells 1490cf5f0c MN10300: Don't try and #include <linux/slab.h> in lib/inflate.c from bootloader
Don't try and #include <linux/slab.h> in lib/inflate.c from the bootloader code
as linux/slab.h hauls in function defs that aren't available in the bootloader
code and may also haul in conflicting functions.

To fix this, make the inclusion of linux/slab.h contingent on NO_INFLATE_MALLOC
as are the usages of kmalloc() and kfree().

In MN10300, this causes the following errors:

In file included from include/linux/string.h:21,
                 from include/linux/bitmap.h:8,
                 from include/linux/nodemask.h:93,
                 from include/linux/mmzone.h:16,
                 from include/linux/gfp.h:4,
                 from include/linux/slab.h:12,
                 from arch/mn10300/boot/compressed/../../../../lib/inflate.c:106,
                 from arch/mn10300/boot/compressed/misc.c:170:
/warthog/am33/linux-2.6-mn10300/arch/mn10300/include/asm/string.h:19: error: conflicting types for 'memset'
arch/mn10300/boot/compressed/misc.c:59: error: previous definition of 'memset' was here

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-12 09:51:35 -07:00
NeilBrown a8e026c785 Further tidyup of raid6 naming in lib/raid6
Rename raid6/raid6x86.h to raid6/x86.h
and modify some comments.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-12 06:44:54 +10:00
NeilBrown d5302fe41f Make lib/raid6/test build correctly.
Some bit-rot needs to be cleaned out.

Signed-off-by: NeilBrown <neilb@suse.de>
2010-08-12 06:38:24 +10:00
Prarit Bhargava dd21e9bdff lib/decompress_bunzip2.c: fix checkstack warning
Fix checkstack error:

lib/decompress_bunzip2.c: In function `get_next_block':
lib/decompress_bunzip2.c:511: warning: the frame size of 1932 bytes is larger than 1024 bytes

byteCount, symToByte, and mtfSymbol cannot be declared static or allocated
dynamically so place them in the bunzip_data struct.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:23 -07:00
Anton Blanchard 863a604920 lib/bug.c: add oops end marker to WARN implementation
We are missing the oops end marker for the exception based WARN implementation
in lib/bug.c. This is useful for logfile analysis tools.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:22 -07:00
Anton Blanchard e2e7e09325 lib/bug.c: make WARN implementation match the kernel/panic.c one
There are a few issues with the exception based WARN implementation in
lib/bug.c:

- Inconsistent printk flags. The "cut here" line is printed at KERN_EMERG, so
  the console and all logged in users see the single line:

------------[ cut here ]------------

  for each WARN. Fix this so we print everything at KERN_WARNING to match the
  kernel/panic.c version.

- The lib/bug.c WARN would print "Badness at". Change it to match the
  kernel/panic.c version which prints "WARNING: at".

- Print the list of modules, similar to kernel/panic.c of modules, similar to
  kernel/panic.c

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:22 -07:00
David Woodhouse cc4589ebfa Rename raid6 files now they're in a 'raid6' directory.
Linus asks 'why "raid6" twice?'. No reason.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-08-11 00:19:05 +01:00
Linus Torvalds 3d30701b58 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md: (24 commits)
  md: clean up do_md_stop
  md: fix another deadlock with removing sysfs attributes.
  md: move revalidate_disk() back outside open_mutex
  md/raid10: fix deadlock with unaligned read during resync
  md/bitmap:  separate out loading a bitmap from initialising the structures.
  md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.
  md/bitmap: optimise scanning of empty bitmaps.
  md/bitmap: clean up plugging calls.
  md/bitmap: reduce dependence on sysfs.
  md/bitmap: white space clean up and similar.
  md/raid5: export raid5 unplugging interface.
  md/plug: optionally use plugger to unplug an array during resync/recovery.
  md/raid5: add simple plugging infrastructure.
  md/raid5: export is_congested test
  raid5: Don't set read-ahead when there is no queue
  md: add support for raising dm events.
  md: export various start/stop interfaces
  md: split out md_rdev_init
  md: be more careful setting MD_CHANGE_CLEAN
  md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk
  ...
2010-08-10 15:38:19 -07:00
Linus Torvalds 4c619407b0 Merge branch 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm
* 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm:
  kmemleak: Fix typo in the comment
  lib/scatterlist: Hook sg_kmalloc into kmemleak (v2)
  kmemleak: Add DocBook style comments to kmemleak.c
  kmemleak: Introduce a default off mode for kmemleak
  kmemleak: Show more information for objects found by alias
2010-08-10 13:58:11 -07:00
Michel Lespinasse a8618a0e8a rwsem: smaller wrappers around rwsem_down_failed_common
More code can be pushed from rwsem_down_read_failed and
rwsem_down_write_failed into rwsem_down_failed_common.

Following change adding down_read_critical infrastructure support also
enjoys having flags available in a register rather than having to fish it
out in the struct rwsem_waiter...

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:11 -07:00
Michel Lespinasse 424acaaeb3 rwsem: wake queued readers when writer blocks on active read lock
This change addresses the following situation:

- Thread A acquires the rwsem for read
- Thread B tries to acquire the rwsem for write, notices there is already
  an active owner for the rwsem.
- Thread C tries to acquire the rwsem for read, notices that thread B already
  tried to acquire it.
- Thread C grabs the spinlock and queues itself on the wait queue.
- Thread B grabs the spinlock and queues itself behind C. At this point A is
  the only remaining active owner on the rwsem.

In this situation thread B could notice that it was the last active writer
on the rwsem, and decide to wake C to let it proceed in parallel with A
since they both only want the rwsem for read.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:11 -07:00
Michel Lespinasse fd41b33435 rwsem: let RWSEM_WAITING_BIAS represent any number of waiting threads
Previously each waiting thread added a bias of RWSEM_WAITING_BIAS.  With
this change, the bias is added only once to indicate that the wait list is
non-empty.

This has a few nice properties which will be used in following changes:
- when the spinlock is held and the waiter list is known to be non-empty,
  count < RWSEM_WAITING_BIAS  <=>  there is an active writer on that sem
- count == RWSEM_WAITING_BIAS  <=>  there are waiting threads and no
                                     active readers/writers on that sem

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:11 -07:00
Michel Lespinasse 70bdc6e064 rwsem: lighter active count checks when waking up readers
In __rwsem_do_wake(), we can skip the active count check unless we come
there from up_xxxx().  Also when checking the active count, it is not
actually necessary to increment it; this allows us to get rid of the read
side undo code and simplify the calculation of the final rwsem count
adjustment once we've counted the reader threads to wake.

The basic observation is the following.  When there are waiter threads on
a rwsem and the spinlock is held, other threads can only increment the
active count by trying to grab the rwsem in down_xxxx().  However
down_xxxx() will notice there are waiter threads and take the down_failed
path, blocking to acquire the spinlock on the way there.  Therefore, a
thread observing an active count of zero with waiters queued and the
spinlock held, is protected against other threads acquiring the rwsem
until it wakes the last waiter or releases the spinlock.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:10 -07:00
Michel Lespinasse 345af7bf33 rwsem: fully separate code paths to wake writers vs readers
This is in preparation for later changes in the series.

In __rwsem_do_wake(), the first queued waiter is checked first in order to
determine whether it's a writer or a reader.  The code paths diverge at
this point.  The code that checks and increments the rwsem active count is
duplicated on both sides - the point is that later changes in the series
will be able to independently modify both sides.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Ying Han <yinghan@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:10 -07:00
Eric Paris ea98eed9bc flex_array: add helpers to get and put to make pointers easy to use
Getting and putting arrays of pointers with flex arrays is a PITA.  You
have to remember to pass &ptr to the _put and you have to do weird and
wacky casting to get the ptr back from the _get.  Add two functions
flex_array_get_ptr() and flex_array_put_ptr() to handle all of the magic.

[akpm@linux-foundation.org: simplification suggested by Joe]
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Joe Perches <joe@perches.com>
Cc: James Morris <jmorris@namei.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:09 -07:00
Michal Nazarewicz 559b140a36 lib: vsprintf: useless strlen() removed
The strict_strtoul() and strict_strtoull() functions used strlen() to
check argument's length in a situation where it wasn't strictly necessary

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: "Yi Yang" <yi.y.yang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:09 -07:00
Baruch Siach e3f76e3386 list debugging: warn when deleting a deleted entry
Use the magic LIST_POISON* values to detect an incorrect use of list_del
on a deleted entry.  This DEBUG_LIST specific warning is easier to
understand than the generic Oops message caused by LIST_POISON
dereference.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:08 -07:00
Anton Blanchard e269b08517 iommu: inline iommu_num_pages
A profile of a network benchmark showed iommu_num_pages rather high up:

     0.52%  iommu_num_pages

Looking at the profile, an integer divide is taking almost all of the time:

      %
         :      c000000000376ea4 <.iommu_num_pages>:
    1.93 :      c000000000376ea4:       fb e1 ff f8     std     r31,-8(r1)
    0.00 :      c000000000376ea8:       f8 21 ff c1     stdu    r1,-64(r1)
    0.00 :      c000000000376eac:       7c 3f 0b 78     mr      r31,r1
    3.86 :      c000000000376eb0:       38 84 ff ff     addi    r4,r4,-1
    0.00 :      c000000000376eb4:       38 05 ff ff     addi    r0,r5,-1
    0.00 :      c000000000376eb8:       7c 84 2a 14     add     r4,r4,r5
   46.95 :      c000000000376ebc:       7c 00 18 38     and     r0,r0,r3
   45.66 :      c000000000376ec0:       7c 84 02 14     add     r4,r4,r0
    0.00 :      c000000000376ec4:       7c 64 2b 92     divdu   r3,r4,r5
    0.00 :      c000000000376ec8:       38 3f 00 40     addi    r1,r31,64
    0.00 :      c000000000376ecc:       eb e1 ff f8     ld      r31,-8(r1)
    1.61 :      c000000000376ed0:       4e 80 00 20     blr

Since every caller of iommu_num_pages passes in a constant power of two
we can inline this such that the divide is replaced by a shift. The
entire function is only a few instructions once optimised, so it is
a good candidate for inlining overall.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:05 -07:00
Jan Kara ebf8aa44be radix-tree: omplement function radix_tree_range_tag_if_tagged
Implement function for setting one tag if another tag is set for each item
in given range.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:44:59 -07:00
Tim Chen 27f5e0f694 tmpfs: add accurate compare function to percpu_counter library
Add percpu_counter_compare that allows for a quick but accurate comparison
of percpu_counter with a given value.

A rough count is provided by the count field in percpu_counter structure,
without accounting for the other values stored in individual cpu counters.

The actual count is a sum of count and the cpu counters.  However, count
field is never different from the actual value by a factor of
batch*num_online_cpu.  We do not need to get actual count for comparison
if count is different from the given value by this factor and allows for
quick comparison without summing up all the per cpu counters.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:44:58 -07:00
David Woodhouse 2144381da4 Merge branch 'async' of macbook:git/btrfs-unstable
Conflicts:
	drivers/md/Makefile
	lib/raid6/unroll.pl
2010-08-09 10:36:44 +01:00
Linus Torvalds ab69bcd66f Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (28 commits)
  driver core: device_rename's new_name can be const
  sysfs: Remove owner field from sysfs struct attribute
  powerpc/pci: Remove owner field from attribute initialization in PCI bridge init
  regulator: Remove owner field from attribute initialization in regulator core driver
  leds: Remove owner field from attribute initialization in bd2802 driver
  scsi: Remove owner field from attribute initialization in ARCMSR driver
  scsi: Remove owner field from attribute initialization in LPFC driver
  cgroupfs: create /sys/fs/cgroup to mount cgroupfs on
  Driver core: Add BUS_NOTIFY_BIND_DRIVER
  driver core: fix memory leak on one error path in bus_register()
  debugfs: no longer needs to depend on SYSFS
  sysfs: Fix one more signature discrepancy between sysfs implementation and docs.
  sysfs: fix discrepancies between implementation and documentation
  dcdbas: remove a redundant smi_data_buf_free in dcdbas_exit
  dmi-id: fix a memory leak in dmi_id_init error path
  sysfs: sysfs_chmod_file's attr can be const
  firmware: Update hotplug script
  Driver core: move platform device creation helpers to .init.text (if MODULE=n)
  Driver core: reduce duplicated code for platform_device creation
  Driver core: use kmemdup in platform_device_add_resources
  ...
2010-08-06 11:36:30 -07:00
Linus Torvalds 9faa1e5942 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Ioremap: fix wrong physical address handling in PAT code
  x86, tlb: Clean up and correct used type
  x86, iomap: Fix wrong page aligned size calculation in ioremapping code
  x86, mm: Create symbolic index into address_markers array
  x86, ioremap: Fix normal ram range check
  x86, ioremap: Fix incorrect physical address handling in PAE mode
  x86-64, mm: Initialize VDSO earlier on 64 bits
  x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages
2010-08-06 10:17:52 -07:00
Linus Torvalds 4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Linus Torvalds 3a3527b646 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "net: Make accesses to ->br_port safe for sparse RCU"
  mce: convert to rcu_dereference_index_check()
  net: Make accesses to ->br_port safe for sparse RCU
  vfs: add fs.h to define struct file
  lockdep: Add an in_workqueue_context() lockdep-based test function
  rcu: add __rcu API for later sparse checking
  rcu: add an rcu_dereference_index_check()
  tree/tiny rcu: Add debug RCU head objects
  mm: remove all rcu head initializations
  fs: remove all rcu head initializations, except on_stack initializations
  powerpc: remove all rcu head initializations
2010-08-06 09:23:07 -07:00
Linus Torvalds da9e82b3b8 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  modpost: support objects with more than 64k sections
  trivial: fix a typo in a filename
  frv: clean up arch/frv/Makefile
  kbuild: allow assignment to {A,C}FLAGS_KERNEL on the command line
  kbuild: allow assignment to {A,C,LD}FLAGS_MODULE on the command line
  Kbuild: Add option to set -femit-struct-debug-baseonly
  Makefile: "make kernelrelease" should show the correct full kernel version
  Makefile.build: make KBUILD_SYMTYPES work again
2010-08-05 14:10:07 -07:00
Randy Dunlap c462e8cd57 debugfs: no longer needs to depend on SYSFS
debugfs no longer uses 'kernel_subsys' (which is gone), and other
kernel/ksysfs.c code is always built, so DEBUG_FS does not need
to depend on SYSFS.

Fixes this kconfig warning:

warning: (TREE_RCU_TRACE || AMD_IOMMU_STATS && AMD_IOMMU || MTD_UBI_DEBUG && MTD && SYSFS && MTD_UBI || UBIFS_FS_DEBUG && MISC_FILESYSTEMS && UBIFS_FS || DEBUG_KMEMLEAK && DEBUG_KERNEL && EXPERIMENTAL && !MEMORY_HOTPLUG && (X86 || ARM || PPC || S390 || SPARC64 || SUPERH || MICROBLAZE) && SYSFS || TRACING || X86_PTDUMP && DEBUG_KERNEL || BLK_DEV_IO_TRACE && TRACING_SUPPORT && FTRACE && SYSFS && BLOCK) selects DEBUG_FS which has unmet direct dependencies (SYSFS)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-05 13:53:34 -07:00
Linus Torvalds bbc4fd12a6 Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: (49 commits)
  microblaze: Add KGDB support
  microblaze: Support brki rX, 0x18 for user application debugging
  microblaze: Remove nop after MSRCLR/SET, MTS, MFS instructions
  microblaze: Simplify syscall rutine
  microblaze: Move PT_MODE saving to delay slot
  microblaze: Fix _interrupt function
  microblaze: Fix _user_exception function
  microblaze: Put together addik instructions
  microblaze: Use delay slot in syscall macros
  microblaze: Save kernel mode in delay slot
  microblaze: Do not mix register saving and mode setting
  microblaze: Move SAVE_STATE upward
  microblaze: entry.S: Macro optimization
  microblaze: Optimize hw exception rutine
  microblaze: Implement clear_ums macro and fix SAVE_STATE macro
  microblaze: Remove additional setup for kernel_mode
  microblaze: Optimize SAVE_STATE macro
  microblaze: Remove additional loading
  microblaze: Completely remove working with R11 register
  microblaze: Do not setup BIP in _debug_exception
  ...
2010-08-05 08:59:22 -07:00
Ingo Molnar 61be7fdec2 Merge branch 'perf/nmi' into perf/core
Conflicts:
	kernel/Makefile

Merge reason: Add the now complete topic, fix the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-05 08:45:05 +02:00
Linus Torvalds 3cfc2c42c1 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04 15:31:02 -07:00
Linus Torvalds 6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
Linus Torvalds 4a35cee066 Merge branch 'stable/swiotlb-0.8.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6
* 'stable/swiotlb-0.8.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
  swiotlb: Make swiotlb bookkeeping functions visible in the header file.
  swiotlb: search and replace "int dir" with "enum dma_data_direction dir"
  swiotlb: Make internal bookkeeping functions have 'swiotlb_tbl' prefix.
  swiotlb: add the swiotlb initialization function with iotlb memory
  swiotlb: add swiotlb_tbl_map_single library function
2010-08-04 10:36:39 -07:00
Jiri Kosina d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Michal Marek 772320e845 Merge commit 'v2.6.35' into kbuild/kbuild
Conflicts:
	arch/powerpc/Makefile
2010-08-04 13:59:13 +02:00
Michal Simek 79aac88903 microblaze: Disable FRAME_POINTER selection
Microblaze doesn't support frame pointers. Ftrace code
uses CALLER_ADDR1 which is defined in linux/ftrace.h. For Microblaze
is 0.

Signed-off-by: Michal Simek <monstr@monstr.eu>
2010-08-04 10:22:32 +02:00
Chris Wilson b94de9bb75 lib/scatterlist: Hook sg_kmalloc into kmemleak (v2)
kmemleak ignores page_alloc() and so believes the final sub-page
allocation using the plain kmalloc is decoupled and lost. This leads to
lots of false-positives with code that uses scatterlists.

The options seem to be either to tell kmemleak that the kmalloc is not
leaked or to notify kmemleak of the page allocations. The danger of the
first approach is that we may hide a real leak, so choose the latter
approach (of which I am not sure of the downsides).

v2: Added comments on the suggestion of Catalin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2010-07-28 22:59:02 +01:00
Will Deacon c58bbd39f8 ARM: 6213/1: atomic64_test: add ARM as supported architecture
ARM has support for the atomic64_dec_if_positive operation
so ensure that it is tested by the atomic64_test routine.

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-07-27 10:43:46 +01:00
Takuya Yoshikawa f4d0143951 Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
The last 't' of 'fault' is missing in the description of FAIL_IO_TIMEOUT.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-21 10:36:50 +02:00
Jason Baron ab0155a22a kmemleak: Introduce a default off mode for kmemleak
Introduce a new DEBUG_KMEMLEAK_DEFAULT_OFF config parameter that allows
kmemleak to be disabled by default, but enabled on the command line
via: kmemleak=on. Although a reboot is required to turn it on, its still
useful to not require a re-compile.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-07-19 11:54:17 +01:00
Andi Kleen d6f4ceb796 Kbuild: Add option to set -femit-struct-debug-baseonly
Newer gcc has a -femit-struct-debug-baseonly option that dramatically
reduces the size of object files with debug info. What it does
is to only emit type information for structures when the structures
are defined in the same file or in a header file.

This means the type information for most headers are not included.
This is not good when the type information is actually
needed (e.g. with kgdb or systemtap)

But often kernel hackers only care about line numbers and don't
need all the type information anyways. In this case setting
the option can be a big win:

A build dir for a specific x86-64 configuration with gcc 4.5
shrunk from 2.3G to 1.2G. The compilation was also nearly a minute
faster.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
[mmarek: reformatted help text]
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-07-14 17:21:28 +02:00
Yinghai Lu 95f72d1ed4 lmb: rename to memblock
via following scripts

      FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

      sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

      for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
      done

and remove some wrong change like lmbench and dlmb etc.

also move memblock.c from lib/ to mm/

Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14 17:14:00 +10:00
Kulikov Vasiliy 4d45ada36b lib/devres.c: fix comment typo
'Unamp' should be 'Unmap'.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-11 22:16:32 +02:00
Kenji Kaneshige ffa71f33a8 x86, ioremap: Fix incorrect physical address handling in PAE mode
Current x86 ioremap() doesn't handle physical address higher than
32-bit properly in X86_32 PAE mode. When physical address higher than
32-bit is passed to ioremap(), higher 32-bits in physical address is
cleared wrongly. Due to this bug, ioremap() can map wrong address to
linear address space.

In my case, 64-bit MMIO region was assigned to a PCI device (ioat
device) on my system. Because of the ioremap()'s bug, wrong physical
address (instead of MMIO region) was mapped to linear address space.
Because of this, loading ioatdma driver caused unexpected behavior
(kernel panic, kernel hangup, ...).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <4C1AE680.7090408@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-07-09 11:42:03 -07:00
Linus Torvalds 47a716cf0c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rbtree: Undo augmented trees performance damage and regression
  x86, Calgary: Limit the max PHB number to 256
2010-07-06 17:16:09 -07:00
Peter Zijlstra b945d6b255 rbtree: Undo augmented trees performance damage and regression
Reimplement augmented RB-trees without sprinkling extra branches
all over the RB-tree code (which lives in the scheduler hot path).

This approach is 'borrowed' from Fabio's BFQ implementation and
relies on traversing the rebalance path after the RB-tree-op to
correct the heap property for insertion/removal and make up for
the damage done by the tree rotations.

For insertion the rebalance path is trivially that from the new
node upwards to the root, for removal it is that from the deepest
node in the path from the to be removed node that will still
be around after the removal.

[ This patch also fixes a video driver regression reported by
  Ali Gholami Rudi - the memtype->subtree_max_end was updated
  incorrectly. ]

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Fabio Checconi <fabio@gandalf.sssup.it>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1275414172.27810.27961.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 14:43:50 +02:00
Yehuda Sadeh ff49d74ad3 module: initialize module dynamic debug later
We should initialize the module dynamic debug datastructures
only after determining that the module is not loaded yet. This
fixes a bug that introduced in 2.6.35-rc2, where when a trying
to load a module twice, we also load it's dynamic printing data
twice which causes all sorts of nasty issues. Also handle
the dynamic debug cleanup later on failure.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed a #ifdef)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-04 20:17:22 -07:00
Joe Perches 7db6f5fb65 vsprintf: Recursive vsnprintf: Add "%pV", struct va_format
Add the ability to print a format and va_list from a structure pointer

Allows __dev_printk to be implemented as a single printk while
minimizing string space duplication.

%pV should not be used without some mechanism to verify the
format and argument use ala __attribute__(format (printf(...))).

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 10:40:17 -07:00
Ingo Molnar 0a54cec0c2 Merge branch 'linus' into core/rcu
Conflicts:
	fs/fs-writeback.c

Merge reason: Resolve the conflict

Note, i picked the version from Linus's tree, which effectively reverts
the fs-writeback.c bits of:

  b97181f: fs: remove all rcu head initializations, except on_stack initializations

As the upstream changes to this file changed this code heavily and the
first attempt to resolve the conflict resulted in a non-booting kernel.
It's safer to re-try this portion of the commit cleanly.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-01 09:31:25 +02:00
Imre Deak e621ba9932 genalloc: fix allocation from end of pool
bitmap_find_next_zero_area requires the size of the bitmap, we instead
passed the last suitable position.  This made it impossible to allocate
from the end of the pool.

Fixes a regression introduced by 243797f59b
("genalloc: use bitmap_find_next_zero_area").

Signed-off-by: Imre Deak <imre.deak@nokia.com>
Cc: Zygo Blaxell <zygo.blaxell@xandros.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-29 15:29:30 -07:00
Paul E. McKenney 94bfa3b669 idr: fix RCU lockdep splat in idr_get_next()
Convert to rcu_dereference_raw() given that many callers may have many
different locking models.

Located-by: Miles Lane <miles.lane@gmail.com>
Tested-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-06-23 06:50:45 -07:00
Jiri Kosina f1bbbb6912 Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
Uwe Kleine-König 421f91d21a fix typos concerning "initiali[zs]e"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:05:05 +02:00
Mathieu Desnoyers 551d55a944 tree/tiny rcu: Add debug RCU head objects
Helps finding racy users of call_rcu(), which results in hangs because list
entries are overwritten and/or skipped.

Changelog since v4:
- Bissectability is now OK
- Now generate a WARN_ON_ONCE() for non-initialized rcu_head passed to
  call_rcu(). Statically initialized objects are detected with
  object_is_static().
- Rename rcu_head_init_on_stack to init_rcu_head_on_stack.
- Remove init_rcu_head() completely.

Changelog since v3:
- Include comments from Lai Jiangshan

This new patch version is based on the debugobjects with the newly introduced
"active state" tracker.

Non-initialized entries are all considered as "statically initialized". An
activation fixup (triggered by call_rcu()) takes care of performing the debug
object initialization without issuing any warning. Since we cannot increase the
size of struct rcu_head, I don't see much room to put an identifier for
statically initialized rcu_head structures. So for now, we have to live without
"activation without explicit init" detection. But the main purpose of this debug
option is to detect double-activations (double call_rcu() use of a rcu_head
before the callback is executed), which is correctly addressed here.

This also detects potential internal RCU callback corruption, which would cause
the callbacks to be executed twice.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: David S. Miller <davem@davemloft.net>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: akpm@linux-foundation.org
CC: mingo@elte.hu
CC: laijs@cn.fujitsu.com
CC: dipankar@in.ibm.com
CC: josh@joshtriplett.org
CC: dvhltc@us.ibm.com
CC: niv@us.ibm.com
CC: tglx@linutronix.de
CC: peterz@infradead.org
CC: rostedt@goodmis.org
CC: Valdis.Kletnieks@vt.edu
CC: dhowells@redhat.com
CC: eric.dumazet@gmail.com
CC: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2010-06-14 16:37:26 -07:00
Konrad Rzeszutek Wilk d7ef1533a9 swiotlb: Make swiotlb bookkeeping functions visible in the header file.
We put the functions dealing with the operations on
the SWIOTLB buffer in the header and make those functions non-static.
And also make the functions exported via EXPORT_SYMBOL_GPL.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

[v2: swiotlb_sync_single_range_for_* no more. Remove usage.]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:27 -04:00
Konrad Rzeszutek Wilk 22d4826998 swiotlb: search and replace "int dir" with "enum dma_data_direction dir"
.. to catch anybody doing something funky.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

[v2: swiotlb_sync_single_range_* no more - removed usage]
[v3: enum dma_data_direction direction -> enum dma_data_direction dir]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:26 -04:00
Konrad Rzeszutek Wilk bfc5501f6d swiotlb: Make internal bookkeeping functions have 'swiotlb_tbl' prefix.
The functions that operate on io_tlb_list/io_tlb_start/io_tlb_orig_addr
have the prefix 'swiotlb_tbl' now.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:26 -04:00
FUJITA Tomonori abbceff7d7 swiotlb: add the swiotlb initialization function with iotlb memory
This enables the caller to initialize swiotlb with its own iotlb
memory.

See "swiotlb: swiotlb: add swiotlb_tbl_map_single library function" for
full description of patchset.

[v2: changed ..with_tlb to ..with_tbl]

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:25 -04:00
FUJITA Tomonori eb605a5754 swiotlb: add swiotlb_tbl_map_single library function
swiotlb_tbl_map_single() takes the dma address of iotlb instead of
using swiotlb_virt_to_bus().

[v2: changed swiotlb_tlb to swiotlb_tbl]
[v3: changed u64 to dma_addr_t]

This patch:

This is a set of patches that separate the address translation
(virt_to_phys, virt_to_bus, etc) and allocation of the SWIOTLB buffer
from the SWIOTLB library.

The idea behind this set of patches is to make it possible to have separate
mechanisms for translating virtual to physical or virtual to DMA addresses
on platforms which need an SWIOTLB, and where physical != PCI bus address
and also to allocate the core IOTLB memory outside SWIOTLB.

One customers of this is the pv-ops project, which can switch between
different modes of operation depending on the environment it is running in:
bare-metal or virtualized (Xen for now). Another is the Wii DMA - used to
implement the MEM2 DMA facility needed by its EHCI controller (for details:
http://lkml.org/lkml/2010/5/18/303)

On bare-metal SWIOTLB is used when there are no hardware IOMMU. In virtualized
environment it used when PCI pass-through is enabled for the guest. The problems
with PCI pass-through is that the guest's idea of PFN's is not the real thing.
To fix that, there is translation layer for PFN->machine frame number and vice-versa.
To bubble that up to the SWIOTLB layer there are two possible solutions.

One solution has been to wholesale copy the SWIOTLB, stick it in
arch/x86/xen/swiotlb.c and modify the virt_to_phys, phys_to_virt and others
to use the Xen address translation functions. Unfortunately, since the kernel can
run on bare-metal, there would be big code overlap with the real SWIOTLB.
(git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git xen/dom0/swiotlb-new)

Another approach, which this set of patches explores, is to abstract the
address translation and address determination functions away from the
SWIOTLB book-keeping functions. This way the core SWIOTLB library functions
are present in one place, while the address related functions are in
a separate library that can be loaded when running under non-bare-metal platform.

Changelog:
Since the last posting [v8.2] Konrad has done:
 - Added this changelog in the patch and referenced in the other patches
   this description.
 - 'enum dma_data_direction direction' to 'enum dma.. dir' so to be
   unified.
[v8-v8.2 changes:]
 - Rolled-up the last two patches in one.
 - Rebased against linus latest. That meant dealing with swiotlb_sync_single_range_* changes.
 - added Acked-by: Fujita Tomonori and Tested-by: Albert Herranz
[v7-v8 changes:]
 - Minimized the list of exported functions.
 - Integrated Fujita's patches and changed "swiotlb_tlb" to "swiotlb_tbl" in them.
[v6-v7 changes:]
 - Minimized the amount of exported functions/variable with a prefix of: "swiotbl_tbl".
 - Made the usage of 'int dir' to be 'enum dma_data_direction'.
[v5-v6 changes:]
 - Made the exported functions/variables have the 'swiotlb_bk' prefix.
 - dropped the checkpatches/other reworks

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
2010-06-07 11:59:13 -04:00
Linus Torvalds f9196e7c03 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  fix setattr error handling in sysfs, configfs
  kobject: free memory if netlink_kernel_create() fails
  lib/kobject_uevent.c: fix CONIG_NET=n warning
2010-06-04 15:27:27 -07:00
Heiko Carstens 007d08678e lib: add s390 to atomic64_dec_if_positive archs
Add s390 to list of architectures that have atomic64_dec_if_positive
implemented so we get rid of this warning:

lib/atomic64_test.c:129:2: warning: #warning Please implement
atomic64_dec_if_positive for your architecture, and add it to the IF above

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-04 15:21:45 -07:00
Dan Carpenter 743db2d903 kobject: free memory if netlink_kernel_create() fails
There is a kfree(ue_sk) missing on the error path if
netlink_kernel_create() fails.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-06-04 13:27:52 -07:00
Andrew Morton c842128607 lib/kobject_uevent.c: fix CONIG_NET=n warning
lib/kobject_uevent.c:87: warning: 'kobj_bcast_filter' defined but not used

Repairs "hotplug: netns aware uevent_helper"

Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-06-04 13:27:52 -07:00
Linus Torvalds 021fad8b70 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, cpufeature: Unbreak compile with gcc 3.x
  x86, pat: Fix memory leak in free_memtype
  x86, k8: Fix section mismatch for powernowk8_exit()
  lib/atomic64_test: fix missing include of linux/kernel.h
  x86: remove last traces of quicklist usage
  x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012
  x86: "nosmp" command line option should force the system into UP mode
  arch/x86/pci: use kasprintf
  x86, apic: ack all pending irqs when crashed/on kexec
2010-05-30 09:06:13 -07:00
Linus Torvalds 35926ff5fb Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()"
This reverts commit 0ac0c0d0f8, which
caused cross-architecture build problems for all the wrong reasons.
IA64 already added its own version of __node_random(), but the fact is,
there is nothing architectural about the function, and the original
commit was just badly done. Revert it, since no fix is forthcoming.

Requested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-30 09:00:03 -07:00
Linus Torvalds 9a90e09854 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits)
  ACPI: Don't let acpi_pad needlessly mark TSC unstable
  drivers/acpi/sleep.h: Checkpatch cleanup
  ACPI: Minor cleanup eliminating redundant PMTIMER_TICKS to NS conversion
  ACPI: delete unused c-state promotion/demotion data strucutures
  ACPI: video: fix acpi_backlight=video
  ACPI: EC: Use kmemdup
  drivers/acpi: use kasprintf
  ACPI, APEI, EINJ injection parameters support
  Add x64 support to debugfs
  ACPI, APEI, Use ERST for persistent storage of MCE
  ACPI, APEI, Error Record Serialization Table (ERST) support
  ACPI, APEI, Generic Hardware Error Source memory error support
  ACPI, APEI, UEFI Common Platform Error Record (CPER) header
  Unified UUID/GUID definition
  ACPI Hardware Error Device (PNP0C33) support
  ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup
  ACPI, APEI, Document for APEI
  ACPI, APEI, EINJ support
  ACPI, APEI, HEST table parsing
  ACPI, APEI, APEI supporting infrastructure
  ...
2010-05-28 14:42:18 -07:00
Cesar Eduardo Barros edcd1d843a radix-tree: fix radix_tree_prev_hole() underflow case
radix_tree_prev_hole() used LONG_MAX to detect underflow; however,
ULONG_MAX is clearly what was intended, both here and by its only user
(count_history_pages at mm/readahead.c).

Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:53 -07:00
FUJITA Tomonori 38388301b7 swiotlb: remove unnecessary swiotlb_sync_single_range_*
swiotlb_sync_single_range_for_cpu and swiotlb_sync_single_range_for_device
are unnecessary because swiotlb_sync_single_for_cpu and
swiotlb_sync_single_for_device can be used instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:52 -07:00
Joe Eykholt 5960164fde lib/random32: export pseudo-random number generator for modules
This patch moves the definition of struct rnd_state and the inline
__seed() function to linux/random.h.  It renames the static __random32()
function to prandom32() and exports it for use in modules.

prandom32() is useful as a privately-seeded pseudo random number generator
that can give the same result every time it is initialized.

For FCoE FC-BB-6 VN2VN mode self-selected unique FC address generation, we
need an pseudo-random number generator seeded with the 64-bit world-wide
port name.  A truly random generator or one seeded with randomness won't
do because the same sequence of numbers should be generated each time we
boot or the link comes up.

A prandom32_seed() inline function is added to the header file.  It is
inlined not for speed, but so the function won't be expanded in the base
kernel, but only in the module that uses it.

Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:52 -07:00
Imre Deak 2dcb22b346 idr: fix backtrack logic in idr_remove_all
Currently idr_remove_all will fail with a use after free error if
idr::layers is bigger than 2, which on 32 bit systems corresponds to items
more than 1024.  This is due to stepping back too many levels during
backtracking.  For simplicity let's assume that IDR_BITS=1 -> we have 2
nodes at each level below the root node and each leaf node stores two IDs.
 (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root
level and 32 IDs in each leaf node).  The sequence of freeing the nodes at
the moment is as follows:

layer
1 ->                       a(7)
2 ->            b(3)                  c(5)
3 ->        d(1)   e(2)           f(4)    g(6)

Until step 4 things go fine, but then node c is freed, whereas node g
should be freed first.  Since node c contains the pointer to node g we'll
have a use after free error at step 6.

How many levels we step back after visiting the leaf nodes is currently
determined by the msb of the id we are currently visiting:

Step
1.          node d with IDs 0,1 is freed, current ID is advanced to 2.
            msb of the current ID bit 1. This means we need to step back
            1 level to node b and take the next sibling, node e.
2-3.        node e with IDs 2,3 is freed, current ID is 4, msb is bit 2.
            This means we need to step back 2 levels to node a, freeing
            node b on the way.
4-5.        node f with IDs 4,5 is freed, current ID is 6, msb is still
            bit 2. This means we again need to step back 2 levels to node
            a and free c on the way.
6.          We should visit node g, but its pointer is not available as
            node c was freed.

The fix changes how we determine the number of levels to step back.
Instead of deducting this merely from the msb of the current ID, we should
really check if advancing the ID causes an overflow to a bit position
corresponding to a given layer.  In the above example overflow from bit 0
to bit 1 should mean stepping back 1 level.  Overflow from bit 1 to bit 2
should mean stepping back 2 levels and so on.

The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers
on 32 bit systems.

Signed-off-by: Imre Deak <imre.deak@nokia.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@kernel.org>		[2.6.34.1]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Akinobu Mita c9d221f86e fault-injection: add CPU notifier error injection module
I used this module to test the series of modification to the cpu notifiers
code.

Example1: inject CPU offline error (-1 == -EPERM)

	# modprobe cpu-notifier-error-inject cpu_down_prepare_error=-1
	# echo 0 > /sys/devices/system/cpu/cpu1/online
	bash: echo: write error: Operation not permitted

Example2: inject CPU online error (-2 == -ENOENT)

	# modprobe cpu-notifier-error-inject cpu_up_prepare_error=-2
	# echo 1 > /sys/devices/system/cpu/cpu1/online
	bash: echo: write error: No such file or directory

[akpm@linux-foundation.org: fix Kconfig help text]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:48 -07:00
Jack Steiner 0ac0c0d0f8 cpusets: randomize node rotor used in cpuset_mem_spread_node()
Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems).  Part of the reason is that
the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at
node 0 for newly created tasks.

This patch changes the rotor to be initialized to a random node number of
the cpuset.

[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:44 -07:00
Andrew Morton 0d2daf5cc8 revert "crc32: use __BYTE_ORDER macro for endian detection"
It doesn't work on big-endian - those architectures don't define
__LITTLE_ENDIAN.

Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-26 08:19:23 -07:00
Joakim Tjernlund 4762bbc1a3 crc32: use __BYTE_ORDER macro for endian detection.
Since crc32.c contains a nifty test program that can be executed in user
space, make sure endian detection works reliably in user space too.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:06 -07:00
Joakim Tjernlund 836e2af925 crc32: major optimization
Precompute more crc32 values(0xcc00, 0xcc0000 and 0xcc000000) into tables.
 This increases the table size from 1KB to 4KB but the performance benfit
makes it worth it:

28% faster on MPC8321, 266 MHz
2x faster on Core 2 Duo, 3.1GHz

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:06 -07:00
Andy Shevchenko 903788892e lib: introduce common method to convert hex digits
hex_to_bin() is a little method which converts hex digit to its actual
value.  There are plenty of places where such functionality is needed.

[akpm@linux-foundation.org: use tolower(), saving 3 bytes, test the more common case first - it's quicker]
[akpm@linux-foundation.org: relocate tolower to make it even faster! (Joe)]
Signed-off-by: Andy Shevchenko <ext-andriy.shevchenko@nokia.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: Duncan Sands <duncan.sands@free.fr>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Joe Perches db0fd97c27 lib/hexdump.c: reduce stack variable size and cleanups
Reduce char linebuf[200] to the actual size required., which is 32 * 3 + 2
+ 32 + 1, ie: linebuf[131].

Change examples to use bool true not int 1.

Align multiline argument indentation to open parenthesis.

Use temporary for ptr[j] so trigraph fits on single line.

Convert printk ptr from %*p, (int)(2 * sizeof(void *)) to %p as %p uses
the same calculation for size.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Florian Ragwitz 2b2f68b538 DYNAMIC_DEBUG: fix documentation errors
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Florian Ragwitz <rafl@debian.org>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Dan Carpenter ea46c8f774 dynamic_debug: small cleanup in ddebug_proc_write()
This doesn't change behavior at all.  In the original code, if nwords was
zero then ddebug_parse_query() would return -EINVAL, now we just do it
earlier.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:05 -07:00
Joe Perches cf3b429b03 vsprintf.c: use noinline_for_stack
Mark static functions with noinline_for_stack

Before:

  akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl
  0x00000e82 pointer [vsprintf.o]:                        344
  0x0000198c pointer [vsprintf.o]:                        344
  0x000025d6 scnprintf [vsprintf.o]:                      216
  0x00002648 scnprintf [vsprintf.o]:                      216
  0x00002565 snprintf [vsprintf.o]:                       208
  0x0000267c sprintf [vsprintf.o]:                        208
  0x000030a3 bprintf [vsprintf.o]:                        208
  0x00003b1e sscanf [vsprintf.o]:                         208
  0x00000608 number [vsprintf.o]:                         136
  0x00000937 number [vsprintf.o]:                         136

After:

  akpm:/usr/src/25> objdump -d lib/vsprintf.o | perl scripts/checkstack.pl
  0x00000a7c symbol_string [vsprintf.o]:                  248
  0x00000ae8 symbol_string [vsprintf.o]:                  248
  0x00002310 scnprintf [vsprintf.o]:                      216
  0x00002382 scnprintf [vsprintf.o]:                      216
  0x0000229f snprintf [vsprintf.o]:                       208
  0x000023b6 sprintf [vsprintf.o]:                        208
  0x00002ddd bprintf [vsprintf.o]:                        208
  0x00003858 sscanf [vsprintf.o]:                         208
  0x00000625 number [vsprintf.o]:                         136
  0x00000954 number [vsprintf.o]:                         136

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:04 -07:00
Alexey Dobriyan 4be929be34 kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN
- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
  USHORT_MAX/SHORT_MAX/SHORT_MIN.

- Make SHRT_MIN of type s16, not int, for consistency.

[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:02 -07:00
Peter Huewe 0dbdd1bfe0 lib/atomic64_test: fix missing include of linux/kernel.h
Fix a build-failure
(http://kisskb.ellerman.id.au/kisskb/buildresult/2601239/) by adding the
missing include file (linux/kernel.h) for printk and KERN_INFO.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
LKML-Reference: <201005241913.o4OJDKdf010884@imap1.linux-foundation.org>
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-05-24 13:33:32 -07:00
Linus Torvalds 0961d6581c Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables
  intel-iommu: Combine the BIOS DMAR table warning messages
  panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')
  panic: Allow warnings to set different taint flags
  intel-iommu: intel_iommu_map_range failed at very end of address space
  intel-iommu: errors with smaller iommu widths
  intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled
  intel-iommu: use physfn to search drhd for VF
  intel-iommu: Print out iommu seq_id
  intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported
  intel-iommu: Avoid global flushes with caching mode.
  intel-iommu: Use correct domain ID when caching mode is enabled
  intel-iommu mistakenly uses offset_pfn when caching mode is enabled
  intel-iommu: use for_each_set_bit()
  intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
2010-05-21 17:25:01 -07:00
Linus Torvalds 90b9a32d8f Merge branch 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: (25 commits)
  kdb,debug_core: Allow the debug core to receive a panic notification
  MAINTAINERS: update kgdb, kdb, and debug_core info
  debug_core,kdb: Allow the debug core to process a recursive debug entry
  printk,kdb: capture printk() when in kdb shell
  kgdboc,kdb: Allow kdb to work on a non open console port
  kgdb: Add the ability to schedule a breakpoint via a tasklet
  mips,kgdb: kdb low level trap catch and stack trace
  powerpc,kgdb: Introduce low level trap catching
  x86,kgdb: Add low level debug hook
  kgdb: remove post_primary_code references
  kgdb,docs: Update the kgdb docs to include kdb
  kgdboc,keyboard: Keyboard driver for kdb with kgdb
  kgdb: gdb "monitor" -> kdb passthrough
  sparc,sunzilog: Add console polling support for sunzilog serial driver
  sh,sh-sci: Use NO_POLL_CHAR in the SCIF polled console code
  kgdb,8250,pl011: Return immediately from console poll
  kgdb: core changes to support kdb
  kdb: core for kgdb back end (2 of 2)
  kdb: core for kgdb back end (1 of 2)
  kgdb,blackfin: Add in kgdb_arch_set_pc for blackfin
  ...
2010-05-21 11:08:05 -07:00
Eric W. Biederman 417daa1e8f hotplug: netns aware uevent_helper
It only makes sense for uevent_helper to get events
in the intial namespaces.  It's invocation is not
per namespace and it is not clear how we could make
it's invocation namespace aware.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:33 -07:00
Eric W. Biederman 5f71a29629 kobj: Send hotplug events in the proper namespace.
Utilize netlink_broacast_filtered to allow sending hotplug events
in the proper namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:32 -07:00
Eric W. Biederman 07e98962fa kobject: Send hotplug events in all network namespaces
Open a copy of the uevent kernel socket in each network
namespace so we can send uevents in all network namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:32 -07:00
Serge E. Hallyn be867b194a sysfs: Comment sysfs directory tagging logic
Add some in-line comments to explain the new infrastructure, which
was introduced to support sysfs directory tagging with namespaces.
I think an overall description someplace might be good too, but it
didn't really seem to fit into Documentation/filesystems/sysfs.txt,
which appears more geared toward users, rather than maintainers, of
sysfs.

(Tejun, please let me know if I can make anything clearer or failed
altogether to comment something that should be commented.)

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00
Eric W. Biederman 3ff195b011 sysfs: Implement sysfs tagged directory support.
The problem.  When implementing a network namespace I need to be able
to have multiple network devices with the same name.  Currently this
is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
potentially a few other directories of the form /sys/ ... /net/*.

What this patch does is to add an additional tag field to the
sysfs dirent structure.  For directories that should show different
contents depending on the context such as /sys/class/net/, and
/sys/devices/virtual/net/ this tag field is used to specify the
context in which those directories should be visible.  Effectively
this is the same as creating multiple distinct directories with
the same name but internally to sysfs the result is nicer.

I am calling the concept of a single directory that looks like multiple
directories all at the same path in the filesystem tagged directories.

For the networking namespace the set of directories whose contents I need
to filter with tags can depend on the presence or absence of hotplug
hardware or which modules are currently loaded.  Which means I need
a simple race free way to setup those directories as tagged.

To achieve a reace free design all tagged directories are created
and managed by sysfs itself.

Users of this interface:
- define a type in the sysfs_tag_type enumeration.
- call sysfs_register_ns_types with the type and it's operations
- sysfs_exit_ns when an individual tag is no longer valid

- Implement mount_ns() which returns the ns of the calling process
  so we can attach it to a sysfs superblock.
- Implement ktype.namespace() which returns the ns of a syfs kobject.

Everything else is left up to sysfs and the driver layer.

For the network namespace mount_ns and namespace() are essentially
one line functions, and look to remain that.

Tags are currently represented a const void * pointers as that is
both generic, prevides enough information for equality comparisons,
and is trivial to create for current users, as it is just the
existing namespace pointer.

The work needed in sysfs is more extensive.  At each directory
or symlink creating I need to check if the directory it is being
created in is a tagged directory and if so generate the appropriate
tag to place on the sysfs_dirent.  Likewise at each symlink or
directory removal I need to check if the sysfs directory it is
being removed from is a tagged directory and if so figure out
which tag goes along with the name I am deleting.

Currently only directories which hold kobjects, and
symlinks are supported.  There is not enough information
in the current file attribute interfaces to give us anything
to discriminate on which makes it useless, and there are
no potential users which makes it an uninteresting problem
to solve.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00
Eric W. Biederman bc451f2058 kobj: Add basic infrastructure for dealing with namespaces.
Move complete knowledge of namespaces into the kobject layer
so we can use that information when reporting kobjects to
userspace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00