remarkable-linux/kernel
Christian Borntraeger 3401a61e16 stop_machine: make stop_machine_run more virtualization friendly
On kvm I have seen some rare hangs in stop_machine when I used more guest
cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the
hang quite often. I could also reproduce the problem on a 4 way z/VM host with
a 64 way guest.

It turned out that the guest was consuming all available cpus mostly for
spinning on scheduler locks like rq->lock. This is expected as the threads are
calling yield all the time.
The problem is now, that the host scheduling decisings together with the guest
scheduling decisions and spinlocks not being fair managed to create an
interesting scenario similar to a live lock. (Sometimes the hang resolved
itself after some minutes)

Changing stop_machine to yield the cpu to the hypervisor when yielding inside
the guest fixed the problem for me. While I am not completely happy with this
patch, I think it causes no harm and it really improves the situation for me.

I used cpu_relax for yielding to the hypervisor, does that work on all
architectures?

p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use
stop_machine_run and both triggered the problem after some retries.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-23 13:09:34 +10:00
..
irq genirq: reenable a nobody cared disabled irq when a new driver arrives 2008-05-02 13:40:34 +02:00
power Merge branches 'release', 'acpica', 'bugzilla-10224', 'bugzilla-9772', 'bugzilla-9916', 'ec', 'eeepc', 'idle', 'misc', 'pm-legacy', 'sysfs-links-2.6.26', 'thermal', 'thinkpad' and 'video' into release 2008-04-30 13:58:00 -04:00
time clocksource: allow read access to available/current_clocksource 2008-05-03 18:11:48 +02:00
.gitignore Update kernel/.gitignore with new auto-generated files 2008-02-09 23:27:01 -08:00
acct.c bsd_acct: using task_struct->tgid is not right in pid-namespaces 2008-03-24 19:22:20 -07:00
audit.c [patch 1/1] audit_send_reply(): fix error-path memory leak 2008-05-17 03:30:22 -04:00
audit.h [PATCH 1/2] audit: move extern declarations to audit.h 2008-04-28 06:28:04 -04:00
audit_tree.c [PATCH] list_for_each_rcu must die: audit 2008-05-17 03:30:23 -04:00
auditfilter.c Merge branch 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current 2008-04-29 11:41:22 -07:00
auditsc.c [PATCH] new predicate - AUDIT_FILETYPE 2008-04-28 06:28:37 -04:00
backtracetest.c x86: add a simple backtrace test module 2008-01-30 13:33:08 +01:00
bounds.c Add kbuild.h that contains common definitions for kbuild users 2008-04-29 08:06:29 -07:00
capability.c Add 64-bit capability support to the kernel 2008-02-05 09:44:20 -08:00
cgroup.c mm: bdi: add separate writeback accounting capability 2008-04-30 08:29:50 -07:00
cgroup_debug.c CGroup API files: move "releasable" to cgroup_debug subsystem 2008-04-29 08:06:09 -07:00
compat.c ntp: support for TAI 2008-05-01 08:03:59 -07:00
configs.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
cpu.c kernel: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cpuset.c Fix cpuset sched_relax_domain_level control file 2008-05-08 10:46:56 -07:00
delayacct.c Add scaled time to taskstats based process accounting 2007-10-18 14:37:28 -07:00
dma.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
exec_domain.c whitespace fixes: execution domains 2007-10-18 14:37:26 -07:00
exit.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
extable.c module: Don't report discarded init pages as kernel text. 2008-01-29 17:13:18 +11:00
fork.c [PATCH] dup_fd() fixes, part 1 2008-05-16 17:22:26 -04:00
futex.c Removal of FUTEX_FD 2008-05-05 08:18:45 -07:00
futex_compat.c futex_compat __user annotation 2008-03-30 14:18:41 -07:00
hrtimer.c hrtimer: remove duplicate helper function 2008-05-03 18:11:48 +02:00
itimer.c ITIMER_REAL: convert to use struct pid 2008-02-08 09:22:29 -08:00
kallsyms.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
Kconfig.hz sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
Kconfig.preempt rcu: move PREEMPT_RCU config option back under PREEMPT 2008-03-10 18:01:20 -07:00
kexec.c kexec: make extended crashkernel= syntax less confusing 2008-05-01 08:04:00 -07:00
kfifo.c is_power_of_2: kernel/kfifo.c 2007-07-16 09:05:50 -07:00
kgdb.c lib: create common ascii hex array 2008-05-14 19:11:14 -07:00
kmod.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
kprobes.c kprobes: add (un)register_jprobes for batch registration 2008-04-28 08:58:32 -07:00
ksysfs.c Kobject: convert remaining kobject_unregister() to kobject_put() 2008-01-24 20:40:40 -08:00
kthread.c Deprecate find_task_by_pid() 2008-04-30 08:29:48 -07:00
latencytop.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
lockdep.c Subject: lockdep: include all lock classes in all_lock_classes 2008-02-25 23:03:02 +01:00
lockdep_internals.h [PATCH] lockdep: more chains 2006-12-07 08:39:43 -08:00
lockdep_proc.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
Makefile sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
marker.c make marker_debug static 2008-04-30 08:29:49 -07:00
module.c modules: proper cleanup of kobject without CONFIG_SYSFS 2008-05-23 13:09:33 +10:00
mutex-debug.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex-debug.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
mutex.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
mutex.h [PATCH] lockdep: prove mutex locking correctness 2006-07-03 15:27:04 -07:00
notifier.c ipc: re-enable msgmni automatic recomputing msgmni if set to negative 2008-04-29 08:06:13 -07:00
ns_cgroup.c cgroups: kernel/ns_cgroup.c should #include <linux/nsproxy.h> 2008-04-29 08:06:07 -07:00
nsproxy.c ipc: sysvsem: refuse clone(CLONE_SYSVSEM|CLONE_NEWIPC) 2008-04-29 08:06:14 -07:00
panic.c Taint kernel after WARN_ON(condition) 2008-04-29 08:05:59 -07:00
params.c Add new string functions strict_strto* and convert kernel params to use them 2008-02-08 09:22:41 -08:00
pid.c pids: introduce change_pid() helper 2008-04-30 08:29:48 -07:00
pid_namespace.c pidns: make pid->level and pid_ns->level unsigned 2008-04-30 08:29:49 -07:00
pm_qos_params.c pm qos infrastructure and interface 2008-02-05 09:44:22 -08:00
posix-cpu-timers.c remove div_long_long_rem 2008-05-01 08:03:58 -07:00
posix-timers.c signals: join send_sigqueue() with send_group_sigqueue() 2008-04-30 08:29:36 -07:00
printk.c printk: don't read beyond string arguments' terminating zero 2008-04-30 08:29:52 -07:00
profile.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
ptrace.c make generic sys_ptrace unconditional 2008-05-01 10:21:54 -07:00
rcuclassic.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcupdate.c rcupdate: fix comment 2008-02-13 16:21:18 -08:00
rcupreempt.c generic: reduce stack pressure in sched_affinity 2008-04-19 19:44:59 +02:00
rcupreempt_trace.c Preempt-RCU: implementation 2008-01-25 21:08:24 +01:00
rcutorture.c kernel: explicitly include required header files under kernel/ 2008-04-29 08:06:04 -07:00
relay.c Revert "relay: fix splice problem" 2008-05-08 14:06:19 +02:00
res_counter.c memcgroup: add the max_usage member on the res_counter 2008-04-29 08:06:10 -07:00
resource.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
rtmutex-debug.c Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rtmutex-debug.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
rtmutex-tester.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
rtmutex.c hrtimer: more hrtimer_init_sleeper() fallout. 2008-02-13 15:45:36 +01:00
rtmutex.h [PATCH] lockdep: better lock debugging 2006-07-03 15:27:01 -07:00
rtmutex_common.h Don't operate with pid_t in rtmutex tester 2008-02-08 09:22:41 -08:00
rwsem.c sched: mark rwsem functions as __sched for wchan/profiling 2007-12-18 15:21:13 +01:00
sched.c cgroups: fix compile warning 2008-05-14 19:11:14 -07:00
sched_clock.c sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
sched_debug.c sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
sched_fair.c sched: fix weight calculations 2008-05-08 17:00:42 +02:00
sched_features.h sched: /debug/sched_features 2008-04-19 19:45:00 +02:00
sched_idletask.c sched: make rt_sched_class, idle_sched_class static 2008-05-05 23:56:17 +02:00
sched_rt.c sched: fix RT task-wakeup logic 2008-05-05 23:56:18 +02:00
sched_stats.h cpumask: use new cpus_scnprintf function 2008-04-19 19:44:59 +02:00
seccomp.c make seccomp zerocost in schedule 2007-07-16 09:05:50 -07:00
semaphore.c Revert "semaphore: fix" 2008-05-10 20:43:22 -07:00
signal.c signals: add set_restore_sigmask 2008-04-30 08:29:37 -07:00
softirq.c Fix cpu hotplug problem in softirq code 2008-05-01 08:03:58 -07:00
softlockup.c softlockup: fix task state setting 2008-02-29 18:46:53 +01:00
spinlock.c spinlock: lockbreak cleanup 2008-01-30 13:31:20 +01:00
srcu.c make srcu_readers_active() static 2008-02-06 10:41:02 -08:00
stacktrace.c [PATCH] lockdep: stacktrace subsystem, core 2006-07-03 15:27:02 -07:00
stop_machine.c stop_machine: make stop_machine_run more virtualization friendly 2008-05-23 13:09:34 +10:00
sys.c pids: sys_getpgid: fix unsafe *pid usage, s/tasklist/rcu/ 2008-04-30 08:29:49 -07:00
sys_ni.c timerfd: new timerfd API 2008-02-05 09:44:07 -08:00
sysctl.c [PATCH] avoid multiplication overflows and signedness issues for max_fds 2008-05-16 17:22:52 -04:00
sysctl_check.c constify tables in kernel/sysctl_check.c 2008-02-08 09:22:31 -08:00
taskstats.c Use find_task_by_vpid in taskstats 2008-04-30 08:29:48 -07:00
test_kprobes.c kprobes: kretprobe user entry-handler 2008-02-06 10:41:11 -08:00
time.c Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c debugobjects: add timer specific object debugging code 2008-04-30 08:29:53 -07:00
tsacct.c Add scaled time to taskstats based process accounting 2007-10-18 14:37:28 -07:00
uid16.c asmlinkage_protect replaces prevent_tail_call 2008-04-10 17:28:26 -07:00
user.c alloc_uid: cleanup 2008-04-30 08:29:53 -07:00
user_namespace.c eCryptfs: make key module subsystem respect namespaces 2008-04-29 08:06:07 -07:00
utsname.c kernel: explicitly include required header files under kernel/ 2008-04-29 08:06:04 -07:00
utsname_sysctl.c Isolate the UTS namespace's domainname and hostname back 2007-11-29 09:24:53 -08:00
wait.c kernel: remove fastcall in kernel/* 2008-02-08 09:22:31 -08:00
workqueue.c workqueue: remove redundant function invocation 2008-05-01 08:04:02 -07:00