remarkable-linux/lib
Peter Zijlstra f0f1d32f93 llist: Remove cpu_relax() usage in cmpxchg loops
Initial benchmarks show they're a net loss:

 $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done
 $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
 $ ./sembench -t 2048 -w 1900 -o 0

Pre:

 run time 30 seconds 778936 worker burns per second
 run time 30 seconds 912190 worker burns per second
 run time 30 seconds 817506 worker burns per second
 run time 30 seconds 830870 worker burns per second
 run time 30 seconds 845056 worker burns per second

Post:

 run time 30 seconds 905920 worker burns per second
 run time 30 seconds 849046 worker burns per second
 run time 30 seconds 886286 worker burns per second
 run time 30 seconds 822320 worker burns per second
 run time 30 seconds 900283 worker burns per second

So about 4% faster. (!)
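
The "about 4%" figure can be checked by averaging the five runs on each side. This is only a sketch: the array and function names are made up for illustration, and the benchmark itself reports nothing beyond the per-run numbers quoted above.

```c
#include <stddef.h>

/* Worker burns per second, copied from the runs listed above. */
static const double pre_runs[]  = { 778936, 912190, 817506, 830870, 845056 };
static const double post_runs[] = { 905920, 849046, 886286, 822320, 900283 };

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change of the mean, in percent (positive means faster). */
static double speedup_percent(const double *before, const double *after, size_t n)
{
	double b = mean(before, n);
	return (mean(after, n) - b) / b * 100.0;
}
```

With these inputs the means are 836911.6 (pre) and 872771.0 (post), a gain of roughly 4.3%. The run-to-run spread is of similar size, so the number is best read as indicative.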

cpu_relax() stalls the pipeline; when used in a tight loop, it
therefore has the following benefits:

 - allows SMT siblings to have a go;
 - reduces pressure on the CPU interconnect.

However, cmpxchg loops are unfair and thus have unbounded completion
time, so we should avoid getting into the heavily contended
situations where the above benefits would make any difference.

A typical cmpxchg loop should not go round more than a handful of
times at worst, therefore adding extra delays just slows things down.
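
For illustration, a cmpxchg retry loop without any pause might look like the following user-space sketch. It uses C11 atomics rather than the kernel's cmpxchg(), and the type and function names (sketch_node, sketch_head, sketch_push, sketch_del_all) are hypothetical; it is not the code in llist.c.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_node {
	struct sketch_node *next;
	int value;
};

struct sketch_head {
	_Atomic(struct sketch_node *) first;
};

/* Push one node; on failure retry immediately, with no delay in the loop. */
static void sketch_push(struct sketch_head *head, struct sketch_node *n)
{
	struct sketch_node *old =
		atomic_load_explicit(&head->first, memory_order_relaxed);

	do {
		n->next = old;
		/* A failed compare-exchange reloads 'old' with the current head. */
	} while (!atomic_compare_exchange_weak_explicit(&head->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
}

/* Detach the whole list with a single atomic exchange. */
static struct sketch_node *sketch_del_all(struct sketch_head *head)
{
	return atomic_exchange_explicit(&head->first,
					(struct sketch_node *)NULL,
					memory_order_acquire);
}
```

Under light contention the loop body normally runs once or twice, which is the case the argument above assumes.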

Since the llist primitives are new, there aren't any bad users yet,
and we should avoid growing them. Heavily contended sites should
generally be better off using the ticket locks for serialization since
they provide bounded completion times (fifo-fair over the cpus).
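
The FIFO property of a ticket lock can be sketched in user-space C11 as below. This is an assumption-laden illustration, not the kernel's arch-specific ticket spinlock; the names sketch_ticket_lock, sketch_lock and sketch_unlock are hypothetical.

```c
#include <stdatomic.h>

struct sketch_ticket_lock {
	atomic_uint next;   /* next ticket to hand out */
	atomic_uint owner;  /* ticket currently allowed in */
};

/* Take a ticket, then wait until it is served; waiters enter in ticket order. */
static void sketch_lock(struct sketch_ticket_lock *l)
{
	unsigned int me =
		atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		; /* spin; this waiting loop is where a pause hint would belong */
}

/* Serve the next ticket. */
static void sketch_unlock(struct sketch_ticket_lock *l)
{
	atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

Each waiter gets in after the holders ahead of it have released, so the wait is bounded by the number of CPUs in the queue rather than by luck in a retry race.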

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836358.26517.43.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 12:44:03 +02:00
..
lzo
raid6 Move .gitignore from drivers/md to lib/raid6 2010-08-30 17:35:52 +10:00
reed_solomon
xz XZ: Fix incorrect XZ_BUF_ERROR 2011-09-21 13:39:59 -07:00
zlib_deflate zlib: slim down zlib_deflate() workspace when possible 2011-03-22 17:44:17 -07:00
zlib_inflate
.gitignore
argv_split.c
atomic64.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
atomic64_test.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
audit.c audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
average.c lib: Improve EWMA efficiency by using bitshifts 2010-12-06 15:58:43 -05:00
bcd.c
bch.c lib: add shared BCH ECC library 2011-03-11 14:25:50 +00:00
bitmap.c Merge branch 'apei' into apei-release 2011-08-03 11:30:42 -04:00
bitrev.c
bsearch.c lib: Add generic binary search function to the kernel. 2011-05-19 16:55:27 +09:30
btree.c Fix common misspellings 2011-03-31 11:26:23 -03:00
bug.c modules: Fix module_bug_list list corruption race 2010-10-05 11:29:27 -07:00
bust_spinlocks.c
check_signature.c
checksum.c lib/checksum.c: optimize do_csum a bit 2011-07-07 04:52:24 -07:00
cmdline.c
cordic.c lib: cordic: add library module providing cordic angle calculation 2011-06-03 15:01:07 -04:00
cpu-notifier-error-inject.c fault-injection: add CPU notifier error injection module 2010-05-27 09:12:48 -07:00
cpu_rmap.c lib: cpu_rmap: CPU affinity reverse-mapping 2011-01-24 14:51:56 -08:00
cpumask.c cpumask: alloc_cpumask_var() use NUMA_NO_NODE 2011-07-26 16:49:44 -07:00
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
crc7.c
crc8.c lib: crc8: add new library module providing crc8 algorithm 2011-06-03 15:01:06 -04:00
crc16.c
crc32.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
crc32defs.h
ctype.c
debug_locks.c Revert "debug_locks: set oops_in_progress if we will log messages." 2010-11-29 15:18:28 -08:00
debugobjects.c debugobjects: Fix boot crash when kmemleak and debugobjects enabled 2011-06-20 14:38:43 +02:00
dec_and_lock.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
decompress.c decompressors: add boot-time XZ support 2011-01-13 08:03:25 -08:00
decompress_bunzip2.c Decompressors: include <linux/slab.h> in <linux/decompress/mm.h> 2011-01-13 08:03:23 -08:00
decompress_inflate.c decompressors: check input size in decompress_inflate.c 2011-01-13 08:03:25 -08:00
decompress_unlzma.c Decompressors: validate match distance in decompress_unlzma.c 2011-01-13 08:03:24 -08:00
decompress_unlzo.c Decompressors: fix callback-to-callback mode in decompress_unlzo.c 2011-01-13 08:03:24 -08:00
decompress_unxz.c Fix common misspellings 2011-03-31 11:26:23 -03:00
devres.c devres: fix possible use after free 2011-07-25 20:57:14 -07:00
div64.c div64_u64(): improve precision on 32bit platforms 2010-10-26 16:52:19 -07:00
dma-debug.c dma-debug: print information about leaked entry 2011-04-07 16:31:19 +02:00
dump_stack.c
dynamic_debug.c dynamic_debug: add #include <linux/sched.h> 2011-02-03 15:59:58 -08:00
extable.c
fault-inject.c fault-injection: add ability to export fault_attr in arbitrary directory 2011-08-03 14:25:20 -10:00
find_last_bit.c bitops: add #ifndef for each of find bitops 2011-05-26 17:12:38 -07:00
find_next_bit.c arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT} 2011-05-26 17:12:38 -07:00
flex_array.c flex_array: avoid divisions when accessing elements 2011-05-26 17:12:33 -07:00
gcd.c
gen_crc32table.c crc32: major optimization 2010-05-25 08:07:06 -07:00
genalloc.c lib, Make gen_pool memory allocator lockless 2011-08-03 11:15:57 -04:00
halfmd4.c
hexdump.c include/linux/printk.h lib/hexdump.c: neatening and add CONFIG_PRINTK guard 2011-01-13 08:03:10 -08:00
hweight.c
idr.c ida: simplified functions for id allocation 2011-08-03 14:25:20 -10:00
inflate.c MN10300: Don't try and #include <linux/slab.h> in lib/inflate.c from bootloader 2010-08-12 09:51:35 -07:00
int_sqrt.c
iomap.c iomap: make IOPORT/PCI mapping functions conditional 2011-07-22 18:46:26 +02:00
iomap_copy.c
iommu-helper.c iommu: inline iommu_num_pages 2010-08-09 20:45:05 -07:00
ioremap.c ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support 2011-01-12 03:06:19 -05:00
irq_regs.c
is_single_threaded.c
kasprintf.c
Kconfig llist: Make some llist functions inline 2011-10-04 11:30:53 +02:00
Kconfig.debug Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-07-22 16:45:02 -07:00
Kconfig.kgdb
Kconfig.kmemcheck
klist.c
kobject.c Delay struct net freeing while there's a sysfs instance refering to it 2011-06-12 17:45:41 -04:00
kobject_uevent.c kobject_uevent: fix typo in comments 2010-08-23 18:12:46 -07:00
kref.c kref: Add a kref_sub function 2010-11-22 13:25:13 +10:00
kstrtox.c lib: make _tolower() public 2011-07-25 20:57:16 -07:00
lcm.c lib/lcm.c: quiet sparse noise 2011-07-25 20:57:15 -07:00
libcrc32c.c
list_debug.c Expand CONFIG_DEBUG_LIST to several other list operations 2011-02-18 11:32:28 -08:00
list_sort.c lib/list_sort: test: check element addresses 2010-10-26 16:52:19 -07:00
llist.c llist: Remove cpu_relax() usage in cmpxchg loops 2011-10-04 12:44:03 +02:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c rcu: Fix unpaired rcu_irq_enter() from locking selftests 2011-05-26 09:42:19 -07:00
lru_cache.c lru_cache: use correct type in sizeof for allocation 2011-05-25 08:39:52 -07:00
Makefile llist: Make some llist functions inline 2011-10-04 11:30:53 +02:00
md5.c crypto: Move md5_transform to lib/md5.c 2011-08-06 18:32:45 -07:00
nlattr.c net: fix nla_policy_len to actually _iterate_ over the policy 2011-02-28 12:38:25 -08:00
parser.c Fix common misspellings 2011-03-31 11:26:23 -03:00
percpu_counter.c percpucounter: Optimize __percpu_counter_add a bit through the use of this_cpu() options. 2010-12-17 15:07:18 +01:00
plist.c plist: Remove the need to supply locks to plist heads 2011-07-08 14:02:53 +02:00
prio_heap.c
prio_tree.c
proportions.c
radix-tree.c tmpfs radix_tree: locate_item to speed up swapoff 2011-08-03 14:25:24 -10:00
random32.c Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
ratelimit.c
rational.c
rbtree.c Export the augmented rbtree helper functions 2011-01-28 12:16:59 +10:00
reciprocal_div.c
rwsem-spinlock.c
rwsem.c rwsem: Remove redundant asmregparm annotation 2011-01-27 12:30:40 +01:00
scatterlist.c scatterlist: prevent invalid free when alloc fails 2010-08-30 19:55:09 +02:00
sha1.c lib/sha1.c: quiet sparse noise about symbol not declared 2011-09-13 16:09:41 -07:00
show_mem.c arch, mm: filter disallowed nodes from arch specific show_mem functions 2011-05-25 08:39:03 -07:00
smp_processor_id.c
sort.c
spinlock_debug.c
string.c Add a strtobool function matching semantics of existing in kernel equivalents 2011-05-19 16:55:28 +09:30
string_helpers.c
swiotlb.c swiotlb: Export swioltb_nr_tbl and utilize it as appropiate. 2011-06-06 15:41:16 -04:00
syscall.c
test-kstrtox.c kstrtox: fix compile warnings in test 2011-04-14 16:06:54 -07:00
textsearch.c textsearch: doc - fix spelling in lib/textsearch.c. 2011-01-24 23:33:30 -08:00
timerqueue.c Fix common misspellings 2011-03-31 11:26:23 -03:00
ts_bm.c
ts_fsm.c
ts_kmp.c
uuid.c
vsprintf.c Merge 'akpm' patch series 2011-07-25 21:00:19 -07:00