1
0
Fork 0
Commit Graph

465 Commits (601d39d00c2af206d10d1252132a85316d95499a)

Author SHA1 Message Date
Joonsoo Kim 601d39d00c slub: fix a possible memory leak
Memory allocated by kstrdup should be freed,
when kmalloc(kmem_size, GFP_KERNEL) is failed.

Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-05-16 09:37:25 +03:00
Joonsoo Kim de3ec03562 slub: fix incorrect return type of get_any_partial()
Commit 497b66f2ec ('slub: return object pointer
from get_partial() / new_slab().') changed return type of some functions.
This updates missing part.

Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-05-08 08:31:57 +03:00
Linus Torvalds 532bfc851a Merge branch 'akpm' (Andrew's patch-bomb)
Merge third batch of patches from Andrew Morton:
 - Some MM stragglers
 - core SMP library cleanups (on_each_cpu_mask)
 - Some IPI optimisations
 - kexec
 - kdump
 - IPMI
 - the radix-tree iterator work
 - various other misc bits.

 "That'll do for -rc1.  I still have ~10 patches for 3.4, will send
  those along when they've baked a little more."

* emailed from Andrew Morton <akpm@linux-foundation.org>: (35 commits)
  backlight: fix typo in tosa_lcd.c
  crc32: add help text for the algorithm select option
  mm: move hugepage test examples to tools/testing/selftests/vm
  mm: move slabinfo.c to tools/vm
  mm: move page-types.c from Documentation to tools/vm
  selftests/Makefile: make `run_tests' depend on `all'
  selftests: launch individual selftests from the main Makefile
  radix-tree: use iterators in find_get_pages* functions
  radix-tree: rewrite gang lookup using iterator
  radix-tree: introduce bit-optimized iterator
  fs/proc/namespaces.c: prevent crash when ns_entries[] is empty
  nbd: rename the nbd_device variable from lo to nbd
  pidns: add reboot_pid_ns() to handle the reboot syscall
  sysctl: use bitmap library functions
  ipmi: use locks on watchdog timeout set on reboot
  ipmi: simplify locking
  ipmi: fix message handling during panics
  ipmi: use a tasklet for handling received messages
  ipmi: increase KCS timeouts
  ipmi: decrease the IPMI message transaction time in interrupt mode
  ...
2012-03-28 17:19:28 -07:00
Gilad Ben-Yossef a8364d5555 slub: only IPI CPUs that have per cpu obj to flush
flush_all() is called for each kmem_cache_destroy().  So every cache being
destroyed dynamically ends up sending an IPI to each CPU in the system,
regardless if the cache has ever been used there.

For example, if you close the Infinband ipath driver char device file, the
close file ops calls kmem_cache_destroy().  So running some infiniband
config tool on one a single CPU dedicated to system tasks might interrupt
the rest of the 127 CPUs dedicated to some CPU intensive or latency
sensitive task.

I suspect there is a good chance that every line in the output of "git
grep kmem_cache_destroy linux/ | grep '\->'" has a similar scenario.

This patch attempts to rectify this issue by sending an IPI to flush the
per cpu objects back to the free lists only to CPUs that seem to have such
objects.

The check which CPU to IPI is racy but we don't care since asking a CPU
without per cpu objects to flush does no damage and as far as I can tell
the flush_all by itself is racy against allocs on remote CPUs anyway, so
if you required the flush_all to be determinstic, you had to arrange for
locking regardless.

Without this patch the following artificial test case:

$ cd /sys/kernel/slab
$ for DIR in *; do cat $DIR/alloc_calls > /dev/null; done

produces 166 IPIs on an cpuset isolated CPU. With it it produces none.

The code path of memory allocation failure for CPUMASK_OFFSTACK=y
config was tested using fault injection framework.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Avi Kivity <avi@redhat.com>
Cc: Michal Nazarewicz <mina86@mina86.org>
Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
Cc: Milton Miller <miltonm@bga.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28 17:14:35 -07:00
Linus Torvalds 0c9aac0826 Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
Pull SLAB changes from Pekka Enberg:
 "There's the new kmalloc_array() API, minor fixes and performance
  improvements, but quite honestly, nothing terribly exciting."

* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  mm: SLAB Out-of-memory diagnostics
  slab: introduce kmalloc_array()
  slub: per cpu partial statistics change
  slub: include include for prefetch
  slub: Do not hold slub_lock when calling sysfs_slab_add()
  slub: prefetch next freelist pointer in slab_alloc()
  slab, cleanup: remove unneeded return
2012-03-28 15:04:26 -07:00
Mel Gorman cc9a6c8776 cpuset: mm: reduce large amounts of memory barrier related damage v3
Commit c0ff7453bb ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") wins a super prize for the largest number of
memory barriers entered into fast paths for one commit.

[get|put]_mems_allowed is incredibly heavy with pairs of full memory
barriers inserted into a number of hot paths.  This was detected while
investigating at large page allocator slowdown introduced some time
after 2.6.32.  The largest portion of this overhead was shown by
oprofile to be at an mfence introduced by this commit into the page
allocator hot path.

For extra style points, the commit introduced the use of yield() in an
implementation of what looks like a spinning mutex.

This patch replaces the full memory barriers on both read and write
sides with a sequence counter with just read barriers on the fast path
side.  This is much cheaper on some architectures, including x86.  The
main bulk of the patch is the retry logic if the nodemask changes in a
manner that can cause a false failure.

While updating the nodemask, a check is made to see if a false failure
is a risk.  If it is, the sequence number gets bumped and parallel
allocators will briefly stall while the nodemask update takes place.

In a page fault test microbenchmark, oprofile samples from
__alloc_pages_nodemask went from 4.53% of all samples to 1.15%.  The
actual results were

                             3.3.0-rc3          3.3.0-rc3
                             rc3-vanilla        nobarrier-v2r1
    Clients   1 UserTime       0.07 (  0.00%)   0.08 (-14.19%)
    Clients   2 UserTime       0.07 (  0.00%)   0.07 (  2.72%)
    Clients   4 UserTime       0.08 (  0.00%)   0.07 (  3.29%)
    Clients   1 SysTime        0.70 (  0.00%)   0.65 (  6.65%)
    Clients   2 SysTime        0.85 (  0.00%)   0.82 (  3.65%)
    Clients   4 SysTime        1.41 (  0.00%)   1.41 (  0.32%)
    Clients   1 WallTime       0.77 (  0.00%)   0.74 (  4.19%)
    Clients   2 WallTime       0.47 (  0.00%)   0.45 (  3.73%)
    Clients   4 WallTime       0.38 (  0.00%)   0.37 (  1.58%)
    Clients   1 Flt/sec/cpu  497620.28 (  0.00%) 520294.53 (  4.56%)
    Clients   2 Flt/sec/cpu  414639.05 (  0.00%) 429882.01 (  3.68%)
    Clients   4 Flt/sec/cpu  257959.16 (  0.00%) 258761.48 (  0.31%)
    Clients   1 Flt/sec      495161.39 (  0.00%) 517292.87 (  4.47%)
    Clients   2 Flt/sec      820325.95 (  0.00%) 850289.77 (  3.65%)
    Clients   4 Flt/sec      1020068.93 (  0.00%) 1022674.06 (  0.26%)
    MMTests Statistics: duration
    Sys Time Running Test (seconds)             135.68    132.17
    User+Sys Time Running Test (seconds)         164.2    160.13
    Total Elapsed Time (seconds)                123.46    120.87

The overall improvement is small but the System CPU time is much
improved and roughly in correlation to what oprofile reported (these
performance figures are without profiling so skew is expected).  The
actual number of page faults is noticeably improved.

For benchmarks like kernel builds, the overall benefit is marginal but
the system CPU time is slightly reduced.

To test the actual bug the commit fixed I opened two terminals.  The
first ran within a cpuset and continually ran a small program that
faulted 100M of anonymous data.  In a second window, the nodemask of the
cpuset was continually randomised in a loop.

Without the commit, the program would fail every so often (usually
within 10 seconds) and obviously with the commit everything worked fine.
With this patch applied, it also worked fine so the fix should be
functionally equivalent.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-21 17:54:59 -07:00
Alex Shi 8028dcea8a slub: per cpu partial statistics change
This patch split the cpu_partial_free into 2 parts: cpu_partial_node, PCP refilling
times from node partial; and same name cpu_partial_free, PCP refilling times in
slab_free slow path. A new statistic 'cpu_partial_drain' is added to get PCP
drain to node partial times. These info are useful when do PCP tunning.

The slabinfo.c code is unchanged, since cpu_partial_node is not on slow path.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-02-18 11:00:09 +02:00
Christoph Lameter 4de900b4d6 slub: include include for prefetch
Otherwise m68k breaks:

On Mon, 30 Jan 2012, Geert Uytterhoeven wrote:
> m68k/allmodconfig at http://kisskb.ellerman.id.au/kisskb/buildresult/5527349/
>
> mm/slub.c:274: error: implicit declaration of function 'prefetch'
>
> Sorry, didn't notice it earlier due to other build breakage in -next.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-02-10 14:47:39 +02:00
Christoph Lameter 66c4c35c6b slub: Do not hold slub_lock when calling sysfs_slab_add()
sysfs_slab_add() calls various sysfs functions that actually may
end up in userspace doing all sorts of things.

Release the slub_lock after adding the kmem_cache structure to the list.
At that point the address of the kmem_cache is not known so we are
guaranteed exlusive access to the following modifications to the
kmem_cache structure.

If the sysfs_slab_add fails then reacquire the slub_lock to
remove the kmem_cache structure from the list.

Cc: <stable@vger.kernel.org>	# 3.3+
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-02-06 12:24:13 +02:00
Eric Dumazet 0ad9500e16 slub: prefetch next freelist pointer in slab_alloc()
Recycling a page is a problem, since freelist link chain is hot on
cpu(s) which freed objects, and possibly very cold on cpu currently
owning slab.

Adding a prefetch of cache line containing the pointer to next object in
slab_alloc() helps a lot in many workloads, in particular on assymetric
ones (allocations done on one cpu, frees on another cpus). Added cost is
three machine instructions only.

Examples on my dual socket quad core ht machine (Intel CPU E5540
@2.53GHz) (16 logical cpus, 2 memory nodes), 64bit kernel.

Before patch :

# perf stat -r 32 hackbench 50 process 4000 >/dev/null

 Performance counter stats for 'hackbench 50 process 4000' (32 runs):

     327577,471718 task-clock                #   15,821 CPUs utilized            ( +-  0,64% )
        28 866 491 context-switches          #    0,088 M/sec                    ( +-  1,80% )
         1 506 929 CPU-migrations            #    0,005 M/sec                    ( +-  3,24% )
           127 151 page-faults               #    0,000 M/sec                    ( +-  0,16% )
   829 399 813 448 cycles                    #    2,532 GHz                      ( +-  0,64% )
   580 664 691 740 stalled-cycles-frontend   #   70,01% frontend cycles idle     ( +-  0,71% )
   197 431 700 448 stalled-cycles-backend    #   23,80% backend  cycles idle     ( +-  1,03% )
   503 548 648 975 instructions              #    0,61  insns per cycle
                                             #    1,15  stalled cycles per insn  ( +-  0,46% )
    95 780 068 471 branches                  #  292,389 M/sec                    ( +-  0,48% )
     1 426 407 916 branch-misses             #    1,49% of all branches          ( +-  1,35% )

      20,705679994 seconds time elapsed                                          ( +-  0,64% )

After patch :

# perf stat -r 32 hackbench 50 process 4000 >/dev/null

 Performance counter stats for 'hackbench 50 process 4000' (32 runs):

     286236,542804 task-clock                #   15,786 CPUs utilized            ( +-  1,32% )
        19 703 372 context-switches          #    0,069 M/sec                    ( +-  4,99% )
         1 658 249 CPU-migrations            #    0,006 M/sec                    ( +-  6,62% )
           126 776 page-faults               #    0,000 M/sec                    ( +-  0,12% )
   724 636 593 213 cycles                    #    2,532 GHz                      ( +-  1,32% )
   499 320 714 837 stalled-cycles-frontend   #   68,91% frontend cycles idle     ( +-  1,47% )
   156 555 126 809 stalled-cycles-backend    #   21,60% backend  cycles idle     ( +-  2,22% )
   463 897 792 661 instructions              #    0,64  insns per cycle
                                             #    1,08  stalled cycles per insn  ( +-  0,94% )
    87 717 352 563 branches                  #  306,451 M/sec                    ( +-  0,99% )
       941 738 280 branch-misses             #    1,07% of all branches          ( +-  3,35% )

      18,132070670 seconds time elapsed                                          ( +-  1,30% )

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
CC: Matt Mackall <mpm@selenic.com>
CC: David Rientjes <rientjes@google.com>
CC: "Alex,Shi" <alex.shi@intel.com>
CC: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-01-24 21:53:57 +02:00
Heiko Carstens 2565409fc0 mm,x86,um: move CMPXCHG_DOUBLE config option
Move CMPXCHG_DOUBLE and rename it to HAVE_CMPXCHG_DOUBLE so architectures
can simply select the option if it is supported.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-12 20:13:03 -08:00
Heiko Carstens 43570fd2f4 mm,slub,x86: decouple size of struct page from CONFIG_CMPXCHG_LOCAL
While implementing cmpxchg_double() on s390 I realized that we don't set
CONFIG_CMPXCHG_LOCAL despite the fact that we have support for it.

However setting that option will increase the size of struct page by
eight bytes on 64 bit, which we certainly do not want.  Also, it doesn't
make sense that a present cpu feature should increase the size of struct
page.

Besides that it looks like the dependency to CMPXCHG_LOCAL is wrong and
that it should depend on CMPXCHG_DOUBLE instead.

This patch:

If an architecture supports CMPXCHG_LOCAL this shouldn't result
automatically in larger struct pages if the SLUB allocator is used.
Instead introduce a new config option "HAVE_ALIGNED_STRUCT_PAGE" which
can be selected if a double word aligned struct page is required.  Also
update x86 Kconfig so that it should work as before.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-12 20:13:03 -08:00
Linus Torvalds 6296e5d3c0 Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  slub: disallow changing cpu_partial from userspace for debug caches
  slub: add missed accounting
  slub: Extract get_freelist from __slab_alloc
  slub: Switch per cpu partial page support off for debugging
  slub: fix a possible memleak in __slab_alloc()
  slub: fix slub_max_order Documentation
  slub: add missed accounting
  slab: add taint flag outputting to debug paths.
  slub: add taint flag outputting to debug paths
  slab: introduce slab_max_order kernel parameter
  slab: rename slab_break_gfp_order to slab_max_order
2012-01-11 18:52:23 -08:00
Pekka Enberg 5878cf431c Merge branch 'slab/urgent' into slab/for-linus 2012-01-11 21:11:29 +02:00
Stanislaw Gruszka fc8d8620d3 slub: min order when debug_guardpage_minorder > 0
Disable slub debug facilities and allocate slabs at minimal order when
debug_guardpage_minorder > 0 to increase probability to catch random
memory corruption by cpu exception.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:43 -08:00
David Rientjes 74ee4ef1f9 slub: disallow changing cpu_partial from userspace for debug caches
For caches with debugging enabled, "slub: Switch per cpu partial page
support off for debugging" changes cpu_partial to 0.  It shouldn't be
tunable from userspace for such caches, otherwise the same accounting
issues arise during validation.

This patch disallows tuning /sys/kernel/slab/cache/cpu_partial to be non-
zero for caches with debugging enabled.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-01-10 21:31:09 +02:00
Linus Torvalds 6b3da11b3c Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: Remove irqsafe_cpu_xxx variants

Fix up conflict in arch/x86/include/asm/percpu.h due to clash with
cebef5beed ("x86: Fix and improve percpu_cmpxchg{8,16}b_double()")
which edited the (now removed) irqsafe_cpu_cmpxchg*_double code.
2012-01-09 13:08:28 -08:00
Jan Beulich cdcd629869 x86: Fix and improve cmpxchg_double{,_local}()
Just like the per-CPU ones they had several
problems/shortcomings:

Only the first memory operand was mentioned in the asm()
operands, and the 2x64-bit version didn't have a memory clobber
while the 2x32-bit one did. The former allowed the compiler to
not recognize the need to re-load the data in case it had it
cached in some register, while the latter was overly
destructive.

The types of the local copies of the old and new values were
incorrect (the types of the pointed-to variables should be used
here, to make sure the respective old/new variable types are
compatible).

The __dummy/__junk variables were pointless, given that local
copies of the inputs already existed (and can hence be used for
discarded outputs).

The 32-bit variant of cmpxchg_double_local() referenced
cmpxchg16b_local().

At once also:

 - change the return value type to what it really is: 'bool'
 - unify 32- and 64-bit variants
 - abstract out the common part of the 'normal' and 'local' variants

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F01F12A020000780006A19B@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-04 15:01:54 +01:00
Christoph Lameter 933393f58f percpu: Remove irqsafe_cpu_xxx variants
We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state.  That has no material change for x86
and s390 implementations of this_cpu operations.  However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.

-tj: This is part of on-going percpu API cleanup.  For detailed
     discussion of the subject, please refer to the following thread.

     http://thread.gmane.org/gmane.linux.kernel/1222078

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <alpine.DEB.2.00.1112221154380.11787@router.home>
2011-12-22 10:40:20 -08:00
Shaohua Li b13683d1cc slub: add missed accounting
With per-cpu partial list, slab is added to partial list first and then moved
to node list. The __slab_free() code path for add/remove_partial is almost
deprecated(except for slub debug). But we forget to account add/remove_partial
when move per-cpu partial pages to node list, so the statistics for such events
are always 0. Add corresponding accounting.

This is against the patch "slub: use correct parameter to add a page to
partial list tail"

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-12-13 22:27:09 +02:00
Christoph Lameter 213eeb9fd9 slub: Extract get_freelist from __slab_alloc
get_freelist retrieves free objects from the page freelist (put there by remote
frees) or deactivates a slab page if no more objects are available.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-12-13 22:17:10 +02:00
Christoph Lameter 8f1e33daed slub: Switch per cpu partial page support off for debugging
Eric saw an issue with accounting of slabs during validation. Its not
possible to determine accurately how many per cpu partial slabs exist at
any time so this switches off per cpu partial pages during debug.

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-12-13 22:14:02 +02:00
Eric Dumazet 73736e0387 slub: fix a possible memleak in __slab_alloc()
Zhihua Che reported a possible memleak in slub allocator on
CONFIG_PREEMPT=y builds.

It is possible current thread migrates right before disabling irqs in
__slab_alloc(). We must check again c->freelist, and perform a normal
allocation instead of scratching c->freelist.

Many thanks to Zhihua Che for spotting this bug, introduced in 2.6.39

V2: Its also possible an IRQ freed one (or several) object(s) and
populated c->freelist, so its not a CONFIG_PREEMPT only problem.

Cc: <stable@vger.kernel.org>        [2.6.39+]
Reported-by: Zhihua Che <zhihua.che@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-12-13 22:11:21 +02:00
Shaohua Li 4c493a5a5c slub: add missed accounting
With per-cpu partial list, slab is added to partial list first and then moved
to node list. The __slab_free() code path for add/remove_partial is almost
deprecated(except for slub debug). But we forget to account add/remove_partial
when move per-cpu partial pages to node list, so the statistics for such events
are always 0. Add corresponding accounting.

This is against the patch "slub: use correct parameter to add a page to
partial list tail"

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-11-27 22:08:15 +02:00
Pekka Enberg 42616cacf8 Merge branch 'slab/urgent' into slab/next 2011-11-27 22:08:03 +02:00
Eric Dumazet bc6697d8a5 slub: avoid potential NULL dereference or corruption
show_slab_objects() can trigger NULL dereferences or memory corruption.

Another cpu can change its c->page to NULL or c->node to NUMA_NO_NODE
while we use them.

Use ACCESS_ONCE(c->page) and ACCESS_ONCE(c->node) to make sure this
cannot happen.

Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-11-24 08:44:19 +02:00
Christoph Lameter 42d623a8cd slub: use irqsafe_cpu_cmpxchg for put_cpu_partial
The cmpxchg must be irq safe. The fallback for this_cpu_cmpxchg only
disables preemption which results in per cpu partial page operation
potentially failing on non x86 platforms.

This patch fixes the following problem reported by Christian Kujau:

  I seem to hit it with heavy disk & cpu IO is in progress on this
  PowerBook
  G4. Full dmesg & .config: http://nerdbynature.de/bits/3.2.0-rc1/oops/

  I've enabled some debug options and now it really points to slub.c:2166

    http://nerdbynature.de/bits/3.2.0-rc1/oops/oops4m.jpg

  With debug options enabled I'm currently in the xmon debugger, not sure
  what to make of it yet, I'll try to get something useful out of it :)

Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-11-24 08:44:14 +02:00
Dave Jones 265d47e711 slub: add taint flag outputting to debug paths
When we get corruption reports, it's useful to see if the kernel was
tainted, to rule out problems we can't do anything about.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-11-16 21:14:40 +02:00
Shaohua Li 9ada19342b slub: move discard_slab out of node lock
Lockdep reports there is potential deadlock for slub node list_lock.
discard_slab() is called with the lock hold in unfreeze_partials(),
which could trigger a slab allocation, which could hold the lock again.

discard_slab() doesn't need hold the lock actually, if the slab is
already removed from partial list.

Acked-by: Christoph Lameter <cl@linux.com>
Reported-and-tested-by: Yong Zhang <yong.zhang0@gmail.com>
Reported-and-tested-by: Julie Sullivan <kernelmail.jms@gmail.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-11-15 20:41:00 +02:00
Shaohua Li f64ae042d9 slub: use correct parameter to add a page to partial list tail
unfreeze_partials() needs add the page to partial list tail, since such page
hasn't too many free objects. We now explictly use DEACTIVATE_TO_TAIL for this,
while DEACTIVATE_TO_TAIL != 1. This will cause performance regression (eg, more
lock contention in node->list_lock) without below fix.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-11-15 20:37:15 +02:00
Akinobu Mita 798248206b lib/string.c: introduce memchr_inv()
memchr_inv() is mainly used to check whether the whole buffer is filled
with just a specified byte.

The function name and prototype are stolen from logfs and the
implementation is from SLUB.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Acked-by: Joern Engel <joern@logfs.org>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:47 -07:00
Pekka Enberg e182a345d4 Merge branches 'slab/next' and 'slub/partial' into slab/for-linus 2011-10-26 18:09:12 +03:00
Alex Shi dcc3be6a54 slub: Discard slab page when node partial > minimum partial number
Discarding slab should be done when node partial > min_partial.  Otherwise,
node partial slab may eat up all memory.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-09-27 23:03:31 +03:00
Alex Shi 9f26490412 slub: correct comments error for per cpu partial
Correct comment errors, that mistake cpu partial objects number as pages
number, may make reader misunderstand.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-09-27 23:03:30 +03:00
Vasiliy Kulikov ab067e99d2 mm: restrict access to slab files under procfs and sysfs
Historically /proc/slabinfo and files under /sys/kernel/slab/* have
world read permissions and are accessible to the world.  slabinfo
contains rather private information related both to the kernel and
userspace tasks.  Depending on the situation, it might reveal either
private information per se or information useful to make another
targeted attack.  Some examples of what can be learned by
reading/watching for /proc/slabinfo entries:

1) dentry (and different *inode*) number might reveal other processes fs
activity.  The number of dentry "active objects" doesn't strictly show
file count opened/touched by a process, however, there is a good
correlation between them.  The patch "proc: force dcache drop on
unauthorized access" relies on the privacy of dentry count.

2) different inode entries might reveal the same information as (1), but
these are more fine granted counters.  If a filesystem is mounted in a
private mount point (or even a private namespace) and fs type differs from
other mounted fs types, fs activity in this mount point/namespace is
revealed.  If there is a single ecryptfs mount point, the whole fs
activity of a single user is revealed.  Number of files in ecryptfs
mount point is a private information per se.

3) fuse_* reveals number of files / fs activity of a user in a user
private mount point.  It is approx. the same severity as ecryptfs
infoleak in (2).

4) sysfs_dir_cache similar to (2) reveals devices' addition/removal,
which can be otherwise hidden by "chmod 0700 /sys/".  With 0444 slabinfo
the precise number of sysfs files is known to the world.

5) buffer_head might reveal some kernel activity.  With other
information leaks an attacker might identify what specific kernel
routines generate buffer_head activity.

6) *kmalloc* infoleaks are very situational.  Attacker should watch for
the specific kmalloc size entry and filter the noise related to the unrelated
kernel activity.  If an attacker has relatively silent victim system, he
might get rather precise counters.

Additional information sources might significantly increase the slabinfo
infoleak benefits.  E.g. if an attacker knows that the processes
activity on the system is very low (only core daemons like syslog and
cron), he may run setxid binaries / trigger local daemon activity /
trigger network services activity / await sporadic cron jobs activity
/ etc. and get rather precise counters for fs and network activity of
these privileged tasks, which is unknown otherwise.

Also hiding slabinfo and /sys/kernel/slab/* is a one step to complicate
exploitation of kernel heap overflows (and possibly, other bugs).  The
related discussion:

http://thread.gmane.org/gmane.linux.kernel/1108378

To keep compatibility with old permission model where non-root
monitoring daemon could watch for kernel memleaks though slabinfo one
should do:

    groupadd slabinfo
    usermod -a -G slabinfo $MONITOR_USER

And add the following commands to init scripts (to mountall.conf in
Ubuntu's upstart case):

    chmod g+r /proc/slabinfo /sys/kernel/slab/*/*
    chgrp slabinfo /proc/slabinfo /sys/kernel/slab/*/*

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Reviewed-by: Kees Cook <kees@ubuntu.com>
Reviewed-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Christoph Lameter <cl@gentwo.org>
Acked-by: David Rientjes <rientjes@google.com>
CC: Valdis.Kletnieks@vt.edu
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Alan Cox <alan@linux.intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-09-27 22:59:27 +03:00
Pekka Enberg d20bbfab01 Merge branch 'slab/urgent' into slab/next 2011-09-19 17:46:07 +03:00
Alex,Shi 12d79634f8 slub: Code optimization in get_partial_node()
I find a way to reduce a variable in get_partial_node(). That is also helpful
for code understanding.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-09-13 20:41:25 +03:00
Shaohua Li 136333d104 slub: explicitly document position of inserting slab to partial list
Adding slab to partial list head/tail is sensitive to performance.
So explicitly uses DEACTIVATE_TO_TAIL/DEACTIVATE_TO_HEAD to document
it to avoid we get it wrong.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Shaohua Li <shli@kernel.org>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-27 11:59:00 +03:00
Shaohua Li 130655ef09 slub: add slab with one free object to partial list tail
The slab has just one free object, adding it to partial list head doesn't make
sense. And it can cause lock contentation. For example,
1. CPU takes the slab from partial list
2. fetch an object
3. switch to another slab
4. free an object, then the slab is added to partial list again
In this way n->list_lock will be heavily contended.
In fact, Alex had a hackbench regression. 3.1-rc1 performance drops about 70%
against 3.0. This patch fixes it.

Acked-by: Christoph Lameter <cl@linux.com>
Reported-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Shaohua Li <shli@kernel.org>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-27 11:58:59 +03:00
Christoph Lameter 49e2258586 slub: per cpu cache for partial pages
Allow filling out the rest of the kmem_cache_cpu cacheline with pointers to
partial pages. The partial page list is used in slab_free() to avoid
per node lock taking.

In __slab_alloc() we can then take multiple partial pages off the per
node partial list in one go reducing node lock pressure.

We can also use the per cpu partial list in slab_alloc() to avoid scanning
partial lists for pages with free objects.

The main effect of a per cpu partial list is that the per node list_lock
is taken for batches of partial pages instead of individual ones.

Potential future enhancements:

1. The pickup from the partial list could be perhaps be done without disabling
   interrupts with some work. The free path already puts the page into the
   per cpu partial list without disabling interrupts.

2. __slab_free() may have some code paths that could use optimization.

Performance:

				Before		After
./hackbench 100 process 200000
				Time: 1953.047	1564.614
./hackbench 100 process 20000
				Time: 207.176   156.940
./hackbench 100 process 20000
				Time: 204.468	156.940
./hackbench 100 process 20000
				Time: 204.879	158.772
./hackbench 10 process 20000
				Time: 20.153	15.853
./hackbench 10 process 20000
				Time: 20.153	15.986
./hackbench 10 process 20000
				Time: 19.363	16.111
./hackbench 1 process 20000
				Time: 2.518	2.307
./hackbench 1 process 20000
				Time: 2.258	2.339
./hackbench 1 process 20000
				Time: 2.864	2.163

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:27 +03:00
Christoph Lameter 497b66f2ec slub: return object pointer from get_partial() / new_slab().
There is no need anymore to return the pointer to a slab page from get_partial()
since the page reference can be stored in the kmem_cache_cpu structures "page" field.

Return an object pointer instead.

That in turn allows a simplification of the spaghetti code in __slab_alloc().

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:27 +03:00
Christoph Lameter acd19fd1a7 slub: pass kmem_cache_cpu pointer to get_partial()
Pass the kmem_cache_cpu pointer to get_partial(). That way
we can avoid the this_cpu_write() statements.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:26 +03:00
Christoph Lameter e6e82ea112 slub: Prepare inuse field in new_slab()
inuse will always be set to page->objects. There is no point in
initializing the field to zero in new_slab() and then overwriting
the value in __slab_alloc().

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:26 +03:00
Christoph Lameter 7db0d70540 slub: Remove useless statements in __slab_alloc
Two statements in __slab_alloc() do not have any effect.

1. c->page is already set to NULL by deactivate_slab() called right before.

2. gfpflags are masked in new_slab() before being passed to the page
   allocator. There is no need to mask gfpflags in __slab_alloc in particular
   since most frequent processing in __slab_alloc does not require the use of a
   gfpmask.

Cc: torvalds@linux-foundation.org
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:25 +03:00
Christoph Lameter 69cb8e6b7c slub: free slabs without holding locks
There are two situations in which slub holds a lock while releasing
pages:

	A. During kmem_cache_shrink()
	B. During kmem_cache_close()

For A build a list while holding the lock and then release the pages
later. In case of B we are the last remaining user of the slab so
there is no need to take the listlock.

After this patch all calls to the page allocator to free pages are
done without holding any spinlocks. kmem_cache_destroy() will still
hold the slub_lock semaphore.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-19 19:34:25 +03:00
Christoph Lameter 81107188f1 slub: Fix partial count comparison confusion
deactivate_slab() has the comparison if more than the minimum number of
partial pages are in the partial list wrong. An effect of this may be that
empty pages are not freed from deactivate_slab(). The result could be an
OOM due to growth of the partial slabs per node. Frees mostly occur from
__slab_free which is okay so this would only affect use cases where a lot
of switching around of per cpu slabs occur.

Switching per cpu slabs occurs with high frequency if debugging options are
enabled.

Reported-and-tested-by: Xiaotian Feng <xtfeng@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-09 21:12:31 +03:00
Akinobu Mita ef62fb32b7 slub: fix check_bytes() for slub debugging
The check_bytes() function is used by slub debugging.  It returns a pointer
to the first unmatching byte for a character in the given memory area.

If the character for matching byte is greater than 0x80, check_bytes()
doesn't work.  Becuase 64-bit pattern is generated as below.

	value64 = value | value << 8 | value << 16 | value << 24;
	value64 = value64 | value64 << 32;

The integer promotions are performed and sign-extended as the type of value
is u8.  The upper 32 bits of value64 is 0xffffffff in the first line, and
the second line has no effect.

This fixes the 64-bit pattern generation.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Reviewed-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-09 16:37:48 +03:00
Christoph Lameter 6fbabb20fa slub: Fix full list corruption if debugging is on
When a slab is freed by __slab_free() and the slab can only contain a
single object ever then it was full (and therefore not on the partial
lists but on the full list in the debug case) before we reached
slab_empty.

This caused the following full list corruption when SLUB debugging was enabled:

  [ 5913.233035] ------------[ cut here ]------------
  [ 5913.233097] WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98()
  [ 5913.233101] Hardware name: Adamo 13
  [ 5913.233105] list_del corruption. prev->next should be ffffea000434fd20, but was ffffea0004199520
  [ 5913.233108] Modules linked in: nfs fscache fuse ebtable_nat ebtables ppdev parport_pc lp parport ipt_MASQUERADE iptable_nat nf_nat nfsd lockd nfs_acl auth_rpcgss xt_CHECKSUM sunrpc iptable_mangle bridge stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables rfcomm bnep arc4 iwlagn snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_intel btusb mac80211 snd_hda_codec bluetooth snd_hwdep snd_seq snd_seq_device snd_pcm usb_debug dell_wmi sparse_keymap cdc_ether usbnet cdc_acm uvcvideo cdc_wdm mii cfg80211 snd_timer dell_laptop videodev dcdbas snd microcode v4l2_compat_ioctl32 soundcore joydev tg3 pcspkr snd_page_alloc iTCO_wdt i2c_i801 rfkill iTCO_vendor_support wmi virtio_net kvm_intel kvm ipv6 xts gf128mul dm_crypt i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
  [ 5913.233213] Pid: 0, comm: swapper Not tainted 3.0.0+ #127
  [ 5913.233213] Call Trace:
  [ 5913.233213]  <IRQ>  [<ffffffff8105df18>] warn_slowpath_common+0x83/0x9b
  [ 5913.233213]  [<ffffffff8105dfd3>] warn_slowpath_fmt+0x46/0x48
  [ 5913.233213]  [<ffffffff8127e7c1>] __list_del_entry+0x8d/0x98
  [ 5913.233213]  [<ffffffff8127e7da>] list_del+0xe/0x2d
  [ 5913.233213]  [<ffffffff814e0430>] __slab_free+0x1db/0x235
  [ 5913.233213]  [<ffffffff811706ab>] ? bvec_free_bs+0x35/0x37
  [ 5913.233213]  [<ffffffff811706ab>] ? bvec_free_bs+0x35/0x37
  [ 5913.233213]  [<ffffffff811706ab>] ? bvec_free_bs+0x35/0x37
  [ 5913.233213]  [<ffffffff81133085>] kmem_cache_free+0x88/0x102
  [ 5913.233213]  [<ffffffff811706ab>] bvec_free_bs+0x35/0x37
  [ 5913.233213]  [<ffffffff811706e1>] bio_free+0x34/0x64
  [ 5913.233213]  [<ffffffff813dc390>] dm_bio_destructor+0x12/0x14
  [ 5913.233213]  [<ffffffff8116fef6>] bio_put+0x2b/0x2d
  [ 5913.233213]  [<ffffffff813dccab>] clone_endio+0x9e/0xb4
  [ 5913.233213]  [<ffffffff8116f7dd>] bio_endio+0x2d/0x2f
  [ 5913.233213]  [<ffffffffa00148da>] crypt_dec_pending+0x5c/0x8b [dm_crypt]
  [ 5913.233213]  [<ffffffffa00150a9>] crypt_endio+0x78/0x81 [dm_crypt]

[ Full discussion here: https://lkml.org/lkml/2011/8/4/375 ]

Make sure that we remove such a slab also from the full lists.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Xiaotian Feng <xtfeng@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-08-09 16:36:02 +03:00
Sebastian Andrzej Siewior ffc79d2880 slub: use print_hex_dump
Less code and same functionality. The output would be:

| Object c7428000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
| Object c7428010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
| Object c7428020: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
| Object c7428030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5              kkkkkkkkkkk.
| Redzone c742803c: bb bb bb bb                                      ....
| Padding c7428064: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
| Padding c7428074: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a              ZZZZZZZZZZZZ

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-31 19:16:48 +03:00
Linus Torvalds c11abbbaa3 Merge branch 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (21 commits)
  slub: When allocating a new slab also prep the first object
  slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock
  Avoid duplicate _count variables in page_struct
  Revert "SLUB: Fix build breakage in linux/mm_types.h"
  SLUB: Fix build breakage in linux/mm_types.h
  slub: slabinfo update for cmpxchg handling
  slub: Not necessary to check for empty slab on load_freelist
  slub: fast release on full slab
  slub: Add statistics for the case that the current slab does not match the node
  slub: Get rid of the another_slab label
  slub: Avoid disabling interrupts in free slowpath
  slub: Disable interrupts in free_debug processing
  slub: Invert locking and avoid slab lock
  slub: Rework allocator fastpaths
  slub: Pass kmem_cache struct to lock and freeze slab
  slub: explicit list_lock taking
  slub: Add cmpxchg_double_slab()
  mm: Rearrange struct page
  slub: Move page->frozen handling near where the page->freelist handling occurs
  slub: Do not use frozen page flag but a bit in the page counters
  ...
2011-07-30 08:21:48 -10:00