remarkable-linux/include
Michel Lespinasse fff3fd8a12 rbtree: add prio tree and interval tree tests
Patch 1 implements support for interval trees, on top of the augmented
rbtree API. It also adds synthetic tests to compare the performance of
interval trees vs prio trees. Short answers is that interval trees are
slightly faster (~25%) on insert/erase, and much faster (~2.4 - 3x)
on search. It is debatable how realistic the synthetic test is, and I have
not made such measurements yet, but my impression is that interval trees
would still come out faster.

Patch 2 uses a preprocessor template to make the interval tree generic,
and uses it as a replacement for the vma prio_tree.

Patch 3 takes the other prio_tree user, kmemleak, and converts it to use
a basic rbtree. We don't actually need the augmented rbtree support here
because the intervals are always non-overlapping.

Patch 4 removes the now-unused prio tree library.

Patch 5 proposes an additional optimization to rb_erase_augmented, now
providing it as an inline function so that the augmented callbacks can be
inlined in. This provides an additional 5-10% performance improvement
for the interval tree insert/erase benchmark. There is a maintainance cost
as it exposes augmented rbtree users to some of the rbtree library internals;
however I think this cost shouldn't be too high as I expect the augmented
rbtree will always have much less users than the base rbtree.

I should probably add a quick summary of why I think it makes sense to
replace prio trees with augmented rbtree based interval trees now.  One of
the drivers is that we need augmented rbtrees for Rik's vma gap finding
code, and once you have them, it just makes sense to use them for interval
trees as well, as this is the simpler and more well known algorithm.  prio
trees, in comparison, seem *too* clever: they impose an additional 'heap'
constraint on the tree, which they use to guarantee a faster worst-case
complexity of O(k+log N) for stabbing queries in a well-balanced prio
tree, vs O(k*log N) for interval trees (where k=number of matches,
N=number of intervals).  Now this sounds great, but in practice prio trees
don't realize this theorical benefit.  First, the additional constraint
makes them harder to update, so that the kernel implementation has to
simplify things by balancing them like a radix tree, which is not always
ideal.  Second, the fact that there are both index and heap properties
makes both tree manipulation and search more complex, which results in a
higher multiplicative time constant.  As it turns out, the simple interval
tree algorithm ends up running faster than the more clever prio tree.

This patch:

Add two test modules:

- prio_tree_test measures the performance of lib/prio_tree.c, both for
  insertion/removal and for stabbing searches

- interval_tree_test measures the performance of a library of equivalent
  functionality, built using the augmented rbtree support.

In order to support the second test module, lib/interval_tree.c is
introduced. It is kept separate from the interval_tree_test main file
for two reasons: first we don't want to provide an unfair advantage
over prio_tree_test by having everything in a single compilation unit,
and second there is the possibility that the interval tree functionality
could get some non-test users in kernel over time.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:39 +09:00
..
acpi Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux 2012-10-08 07:14:06 +09:00
asm-generic thp: introduce pmdp_invalidate() 2012-10-09 16:22:29 +09:00
clocksource arm64: Generic timers support 2012-09-17 13:42:20 +01:00
crypto crypto: cast6 - fix sparse warnings (symbol was not declared, should be static?) 2012-09-07 04:17:06 +08:00
drm Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2012-10-07 21:13:54 +10:00
keys
linux rbtree: add prio tree and interval tree tests 2012-10-09 16:22:39 +09:00
math-emu
media Merge branch 'exynos-drm-next' of git://git.infradead.org/users/kmpark/linux-samsung into drm-next 2012-10-07 21:06:33 +10:00
memory
misc
mtd UBI: add max_beb_per1024 to attach ioctl 2012-09-04 09:39:01 +03:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-10-06 03:11:59 +09:00
pcmcia
ras
rdma IB/core: Add ib_find_exact_cached_pkey() 2012-09-30 20:33:30 -07:00
rxrpc
scsi Prepared for main script 2012-10-03 13:45:43 -07:00
sound Sound updates for 3.7-rc1 2012-10-09 07:07:14 +09:00
target target: support zero allocation length in REQUEST SENSE 2012-09-07 11:14:21 -07:00
trace mm: remove __GFP_NO_KSWAPD 2012-10-09 16:22:15 +09:00
uapi UAPI: (Scripted) Disintegrate include/drm 2012-10-04 18:21:50 +01:00
video ARM: clps711x: Remove board support for CEIVA 2012-09-28 21:14:08 +02:00
xen Features: 2012-10-07 07:13:01 +09:00
Kbuild