alistair23-linux/kernel
Ken Chen b0432d8f16 sched: Fix sched-domain avg_load calculation
In function find_busiest_group(), the sched-domain avg_load isn't
calculated at all if there is a group imbalance within the domain. This
will cause erroneous imbalance calculation.

The reason is that calculate_imbalance() sees sds->avg_load = 0 and it
will dump entire sds->max_load into imbalance variable, which is used
later on to migrate entire load from busiest CPU to the puller CPU.

This has two really bad effect:

1. stampede of task migration, and they won't be able to break out
   of the bad state because of positive feedback loop: large load
   delta -> heavier load migration -> larger imbalance and the cycle
   goes on.

2. severe imbalance in CPU queue depth.  This causes really long
   scheduling latency blip which affects badly on application that
   has tight latency requirement.

The fix is to have kernel calculate domain avg_load in both cases. This
will ensure that imbalance calculation is always sensible and the target
is usually half way between busiest and puller CPU.

Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/20110408002322.3A0D812217F@elm.corp.google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-11 11:08:54 +02:00
..
debug Fix common misspellings 2011-03-31 11:26:23 -03:00
gcov Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 2011-03-20 18:14:55 -07:00
irq Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-04-07 12:12:58 -07:00
power Fix common misspellings 2011-03-31 11:26:23 -03:00
time Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6 2011-04-07 11:14:49 -07:00
trace Fix common misspellings 2011-03-31 11:26:23 -03:00
.gitignore
acct.c
async.c
audit.c netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms 2011-03-03 10:55:40 -08:00
audit.h
audit_tree.c Fix common misspellings 2011-03-31 11:26:23 -03:00
audit_watch.c kill path_lookup() 2011-03-14 09:15:23 -04:00
auditfilter.c netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms 2011-03-03 10:55:40 -08:00
auditsc.c Fix common misspellings 2011-03-31 11:26:23 -03:00
backtracetest.c
bounds.c memcg: remove direct page_cgroup-to-page pointer 2011-03-23 19:46:28 -07:00
capability.c userns: make has_capability* into real functions 2011-03-23 19:47:06 -07:00
cgroup.c Fix common misspellings 2011-03-31 11:26:23 -03:00
cgroup_freezer.c
compat.c posix-timers: Introduce a syscall for clock tuning. 2011-02-02 15:28:19 +01:00
configs.c
cpu.c Fix common misspellings 2011-03-31 11:26:23 -03:00
cpuset.c cpuset: hold callback_mutex in cpuset_post_clone() 2011-03-23 19:46:35 -07:00
crash_dump.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
cred.c userns: security: make capabilities relative to the user namespace 2011-03-23 19:47:02 -07:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Fix common misspellings 2011-03-31 11:26:23 -03:00
extable.c
fork.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
freezer.c Freezer: Fix a race during freezing of TASK_STOPPED tasks 2010-12-24 15:02:40 +01:00
futex.c Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-25 17:52:22 -07:00
futex_compat.c userns: user namespaces: convert several capable() calls 2011-03-23 19:47:08 -07:00
groups.c userns: user namespaces: convert several capable() calls 2011-03-23 19:47:08 -07:00
hrtimer.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-15 18:53:35 -07:00
hung_task.c
hw_breakpoint.c perf: Dynamic pmu types 2010-12-16 11:36:43 +01:00
irq_work.c irq_work: Use per cpu atomics instead of regular atomics 2010-12-18 15:54:48 +01:00
itimer.c
jump_label.c
kallsyms.c Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-25 17:52:22 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6 2011-04-07 11:14:49 -07:00
kfifo.c
kmod.c
kprobes.c Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2011-01-07 17:02:58 -08:00
ksysfs.c
kthread.c Fix common misspellings 2011-03-31 11:26:23 -03:00
latencytop.c Fix common misspellings 2011-03-31 11:26:23 -03:00
lockdep.c Fix common misspellings 2011-03-31 11:26:23 -03:00
lockdep_internals.h
lockdep_proc.c lockdep: Remove unused 'factor' variable from lockdep_stats_show() 2011-03-23 13:54:47 +01:00
lockdep_states.h
Makefile crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
module.c Fix common misspellings 2011-03-31 11:26:23 -03:00
mutex-debug.c
mutex-debug.h
mutex.c Fix common misspellings 2011-03-31 11:26:23 -03:00
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c userns: user namespaces: convert several capable() calls 2011-03-23 19:47:08 -07:00
padata.c Fix common misspellings 2011-03-31 11:26:23 -03:00
panic.c move x86 specific oops=panic to generic code 2011-03-22 17:44:11 -07:00
params.c Fix common misspellings 2011-03-31 11:26:23 -03:00
perf_event.c perf: Fix task_struct reference leak 2011-03-31 13:02:56 +02:00
pid.c export pid symbols needed for kvm_vcpu_on_spin 2011-03-17 13:08:28 -03:00
pid_namespace.c pidns: call pid_ns_prepare_proc() from create_pid_namespace() 2011-03-23 19:46:58 -07:00
pm_qos_params.c PM QoS: Make pm_qos settings readable 2011-03-15 00:43:18 +01:00
posix-cpu-timers.c Fix common misspellings 2011-03-31 11:26:23 -03:00
posix-timers.c Fix common misspellings 2011-03-31 11:26:23 -03:00
printk.c printk: allow setting DEFAULT_MESSAGE_LEVEL via Kconfig 2011-03-22 17:44:13 -07:00
profile.c
ptrace.c userns: allow ptrace from non-init user namespaces 2011-03-23 19:47:05 -07:00
range.c
rcupdate.c rcu: add comment saying why DEBUG_OBJECTS_RCU_HEAD depends on PREEMPT. 2011-03-04 08:05:41 -08:00
rcutiny.c rcu: avoid pointless blocked-task warnings 2011-01-14 04:58:08 -08:00
rcutiny_plugin.h rcu: call __rcu_read_unlock() in exit_rcu for tiny RCU 2011-03-04 08:05:08 -08:00
rcutorture.c rcutorture: Get rid of duplicate sched.h include 2011-03-04 08:05:17 -08:00
rcutree.c Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2011-01-07 17:02:58 -08:00
rcutree.h rcu: limit rcu_node leaf-level fanout 2010-12-17 12:34:20 -08:00
rcutree_plugin.h rcu: increase synchronize_sched_expedited() batching 2010-12-17 12:34:08 -08:00
rcutree_trace.c rcu,cleanup: simplify the code when cpu is dying 2010-11-29 22:01:58 -08:00
relay.c
res_counter.c memcg: res_counter_read_u64(): fix potential races on 32-bit machines 2011-03-23 19:46:22 -07:00
resource.c resources: add arch hook for preventing allocation in reserved areas 2010-12-17 10:01:09 -08:00
rtmutex-debug.c rtmutex: Simplify PI algorithm and make highest prio task get lock 2011-01-27 21:13:51 -05:00
rtmutex-debug.h
rtmutex-tester.c rtmutex: tester: Remove the remaining BKL leftovers 2011-02-22 22:07:22 +01:00
rtmutex.c rtmutex: Simplify PI algorithm and make highest prio task get lock 2011-01-27 21:13:51 -05:00
rtmutex.h
rtmutex_common.h rtmutex: Simplify PI algorithm and make highest prio task get lock 2011-01-27 21:13:51 -05:00
rwsem.c
sched.c Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-04-07 12:12:58 -07:00
sched_autogroup.c Fix common misspellings 2011-03-31 11:26:23 -03:00
sched_autogroup.h sched, autogroup: Stop going ahead if autogroup is disabled 2011-02-23 11:33:59 +01:00
sched_clock.c sched: Add some clock info to sched_debug 2010-11-23 10:29:08 +01:00
sched_cpupri.c
sched_cpupri.h
sched_debug.c sched: Use a buddy to implement yield_task_fair() 2011-02-03 14:20:33 +01:00
sched_fair.c sched: Fix sched-domain avg_load calculation 2011-04-11 11:08:54 +02:00
sched_features.h
sched_idletask.c sched, doc: Update sched-design-CFS.txt 2011-03-23 14:09:41 +01:00
sched_rt.c Fix common misspellings 2011-03-31 11:26:23 -03:00
sched_stats.h
sched_stoptask.c sched, doc: Update sched-design-CFS.txt 2011-03-23 14:09:41 +01:00
seccomp.c
semaphore.c
signal.c signal.c: fix erroneous syscall kernel-doc 2011-04-08 11:05:24 -07:00
smp.c smp: move smp setup functions to kernel/smp.c 2011-03-22 17:44:11 -07:00
softirq.c Fix common misspellings 2011-03-31 11:26:23 -03:00
spinlock.c
srcu.c rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status 2011-01-14 04:56:49 -08:00
stacktrace.c
stop_machine.c kthread: use kthread_create_on_node() 2011-03-22 17:44:01 -07:00
sys.c userns: user namespaces: convert all capable checks in kernel/sys.c 2011-03-23 19:47:06 -07:00
sys_ni.c vfs: Add open by file handle support 2011-03-15 02:21:44 -04:00
sysctl.c sysctl: restrict write access to dmesg_restrict 2011-03-23 19:46:54 -07:00
sysctl_binary.c open-style analog of vfs_path_lookup() 2011-03-14 09:15:28 -04:00
sysctl_check.c sysctl_check: drop dead code 2011-03-23 19:46:51 -07:00
taskstats.c taskstats: use appropriate printk priority level 2011-03-23 19:47:14 -07:00
test_kprobes.c
time.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-15 18:53:35 -07:00
timeconst.pl
timer.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-15 18:53:35 -07:00
tracepoint.c tracepoints: Fix section alignment using pointer array 2011-02-03 09:28:46 -05:00
tsacct.c
uid16.c userns: user namespaces: convert several capable() calls 2011-03-23 19:47:08 -07:00
up.c
user-return-notifier.c Fix common misspellings 2011-03-31 11:26:23 -03:00
user.c userns: add a user_namespace as creator/owner of uts_namespace 2011-03-23 19:46:59 -07:00
user_namespace.c user_ns: improve the user_ns on-the-slab packaging 2011-01-13 08:03:18 -08:00
utsname.c userns: allow sethostname in a container 2011-03-23 19:47:03 -07:00
utsname_sysctl.c
wait.c Fix common misspellings 2011-03-31 11:26:23 -03:00
watchdog.c kernel/watchdog.c: always return NOTIFY_OK during cpu up/down events 2011-03-22 17:44:12 -07:00
workqueue.c Fix common misspellings 2011-03-31 11:26:23 -03:00
workqueue_sched.h