Commit graph

376448 commits

Author SHA1 Message Date
Alex Shi bf5b986ed4 sched/tg: Use 'unsigned long' for load variable in task group
Since tg->load_avg is smaller than tg->load_weight, we don't need an
atomic64_t variable for load_avg on 32-bit machines.
The same reasoning applies to cfs_rq->tg_load_contrib.

The atomic_long_t/unsigned long variable types are more efficient and
convenient for them.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-11-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:40 +02:00
Alex Shi 72a4cf20cb sched: Change cfs_rq load avg to unsigned long
Since the 'u64 runnable_load_avg, blocked_load_avg' fields in the cfs_rq
struct are smaller than the 'unsigned long' cfs_rq->load.weight, we don't
need u64 variables to describe them. unsigned long is more efficient and
convenient.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Paul Turner <pjt@google.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-10-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:38 +02:00
Alex Shi a003a25b22 sched: Consider runnable load average in move_tasks()
Aside from using the runnable load average in the background, move_tasks
is also the key function in load balancing. We need to consider the
runnable load average in it in order to make it an apples-to-apples load
comparison.

Morten had caught a div u64 bug on ARM, thanks!

Thanks-to: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-8-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:36 +02:00
Alex Shi b92486cbf2 sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
They are the base values in load balancing. Update them with the rq
runnable load average, and the load balancer will then consider the
runnable load avg naturally.

We also tried to include blocked_load_avg as cpu load in balancing,
but that caused a 6% kbuild performance drop on every Intel machine, and
an aim7/oltp drop on some 4-socket machines.
If blocked_load_avg is only added into get_rq_runable_load, hackbench
still drops a little on NHM EX.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-7-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:35 +02:00
Alex Shi 83dfd5235e sched: Update cpu load after task_tick
To get the latest runnable info, we need to do this cpu load update
after task_tick.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-6-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:33 +02:00
Alex Shi 282cf499f0 sched: Fix sleep time double accounting in enqueue entity
The woken migrated task will call __synchronize_entity_decay(se) in
migrate_task_rq_fair, then it needs to set
`se->avg.last_runnable_update -= (-se->avg.decay_count) << 20' before
update_entity_load_avg, in order to avoid the sleep time being counted
twice for se.avg.load_avg_contrib, in both __synchronize_entity_decay
and update_entity_load_avg.

However, if the sleeping task is woken up on the same cpu, it misses
the last_runnable_update adjustment before
update_entity_load_avg(se, 0, 1), so the sleep time was used twice, once
in each function.  So we need to remove the double sleep time accounting.

Paul also contributed some code comments in this commit.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-5-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:32 +02:00
Alex Shi a75cdaa915 sched: Set an initial value of runnable avg for new forked task
We need to initialize the se.avg.{decay_count, load_avg_contrib} for a
new forked task. Otherwise random values of above variables cause a
mess when a new task is enqueued:

    enqueue_task_fair
        enqueue_entity
            enqueue_entity_load_avg

and make fork balancing unbalanced due to the incorrect load_avg_contrib.

Furthermore, Morten Rasmussen noticed that some tasks were not launched
immediately after being created. So Paul and Peter suggested giving the
runnable avg time of a new task a start value equal to sched_slice().

PeterZ said:

> So the 'problem' is that our running avg is a 'floating' average; ie. it
> decays with time. Now we have to guess about the future of our newly
> spawned task -- something that is nigh impossible seeing these CPU
> vendors keep refusing to implement the crystal ball instruction.
>
> So there's two asymptotic cases we want to deal well with; 1) the case
> where the newly spawned program will be 'nearly' idle for its lifetime;
> and 2) the case where its cpu-bound.
>
> Since we have to guess, we'll go for worst case and assume its
> cpu-bound; now we don't want to make the avg so heavy adjusting to the
> near-idle case takes forever. We want to be able to quickly adjust and
> lower our running avg.
>
> Now we also don't want to make our avg too light, such that it gets
> decremented just for the new task not having had a chance to run yet --
> even if when it would run, it would be more cpu-bound than not.
>
> So what we do is we make the initial avg of the same duration as that we
> guess it takes to run each task on the system at least once -- aka
> sched_slice().
>
> Of course we can defeat this with wakeup/fork bombs, but in the 'normal'
> case it should be good enough.
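
A minimal sketch of the idea in the quote above, not the code from this
commit. It assumes a simplified stand-in for the per-task average: the
struct, the function name and the `>> 10` scaling (ns to roughly us) are
all hypothetical. The new task is seeded as if it had been runnable for
exactly one slice, so its contribution starts near its full weight but
rests on a short history that can decay quickly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a per-task load average. */
struct avg_sketch {
    uint32_t runnable_sum;   /* time spent runnable, in ~us units */
    uint32_t period;         /* total tracked time, in ~us units  */
    unsigned long contrib;   /* weight scaled by runnable/period  */
};

/*
 * Seed a newly forked task as if it had been runnable for one slice.
 * slice_ns is whatever the scheduler estimates as one slice.
 */
static void init_new_task_avg(struct avg_sketch *sa, uint64_t slice_ns,
                              unsigned long weight)
{
    uint32_t slice = (uint32_t)(slice_ns >> 10);   /* ns -> roughly us */

    assert(sa != NULL);
    sa->runnable_sum = slice;
    sa->period = slice;
    /* The +1 avoids dividing by zero for a very short slice. */
    sa->contrib = (unsigned long)((uint64_t)weight * sa->runnable_sum /
                                  (sa->period + 1));
}
```

With a 6 ms slice and a weight of 1024, the seeded contribution comes out
just under the full weight, which matches the "assume cpu-bound" guess
described in the quote.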

Paul also contributed most of the code comments in this commit.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Paul Turner <pjt@google.com>
[peterz; added explanation of sched_slice() usage]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-4-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:30 +02:00
Alex Shi fa6bddeb14 sched: Move a few runnable tg variables into CONFIG_SMP
The following 2 variables are only used under CONFIG_SMP, so it's
better to move their definitions into CONFIG_SMP too.

        atomic64_t load_avg;
        atomic_t runnable_avg;

Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-3-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:29 +02:00
Alex Shi 141965c749 Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
Remove the CONFIG_FAIR_GROUP_SCHED guards that cover the runnable info,
so that we can use the runnable load variables.

Also remove 2 CONFIG_FAIR_GROUP_SCHED settings which are not in the
reverted patch (they were introduced in 9ee474f), but also need to be
reverted.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51CA76A3.3050207@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:22 +02:00
Joe Perches be7002e6c6 sched: Don't mix use of typedef ctl_table and struct ctl_table
Just use struct ctl_table.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371063336.2069.22.camel@joe-AO722
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:48 +02:00
Viresh Kumar 94c95ba69f sched: Remove WARN_ON(!sd) from init_sched_groups_power()
sd can't be NULL in init_sched_groups_power() and so checking it for NULL isn't
useful. If the check were required, we would also need to rearrange the code a
bit, as we have already dereferenced the possibly invalid pointer sd to get
sg: sg = sd->groups.
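
The ordering point made above can be sketched as follows. This is an
illustration only; the struct and function names are hypothetical and not
taken from kernel/sched/core.c. A NULL check is only meaningful when it
comes before the first dereference of the pointer it guards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a sched domain and its group. */
struct group_sketch  { int power; };
struct domain_sketch { struct group_sketch *groups; };

/*
 * Returns 0 on success, -1 if sd is NULL. The check precedes the first
 * use of sd; placing it after `sd->groups` has been read would be too
 * late to protect anything.
 */
static int init_groups_power_sketch(struct domain_sketch *sd, int power)
{
    struct group_sketch *sg;

    if (!sd)
        return -1;

    sg = sd->groups;
    if (sg)
        sg->power = power;
    return 0;
}
```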

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/2bbe633cd74b431c05253a8ce61fdfd5066a531b.1370948150.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:47 +02:00
Viresh Kumar cd08e9234c sched: Fix memory leakage in build_sched_groups()
In build_sched_groups() we don't need to call get_group() for cpus
which are already covered in previous iterations. Calling get_group()
would mark the group used and eventually leak it since we wouldn't
connect it and not find it again to free it.

This happens only where sg->cpumask contains more than one cpu (at any
topology level). With this patch, sg's memory is freed for every cpu
except the group leader, since those groups are no longer marked as
used.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/7a61e955abdcbb1dfa9fe493f11a5ec53a11ddd3.1370948150.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:46 +02:00
Viresh Kumar 0936629f01 sched: Use cached value of span instead of calling sched_domain_span()
At the beginning of build_sched_groups() we call sched_domain_span() and
cache its return value in span. A few statements later we call it again to
get the same pointer.

Let's use the cached value instead, as it hasn't changed in between.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/834ecd507071ad88aff039352dbc7e063dd996a7.1370948150.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:46 +02:00
Viresh Kumar 27723a68ca sched: Create for_each_sd_topology()
The for loop for traversing sched_domain_topology was used at multiple places
in core.c. This patch removes the code redundancy by creating
for_each_sd_topology().
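
A sketch of the kind of iterator macro described here, assuming a
terminator-ended table. The table, the struct and the macro name below are
hypothetical, and the actual macro in the patch may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical topology level; a NULL `name` terminates the table. */
struct topo_level_sketch { const char *name; };

static struct topo_level_sketch topo_table[] = {
    { "SMT" }, { "MC" }, { "CPU" }, { NULL },
};

/* One macro replaces every open-coded copy of the same loop header. */
#define for_each_topo_level(tl) \
    for ((tl) = topo_table; (tl)->name; (tl)++)

/* Example caller: walks the table until it reaches the terminator. */
static int count_topo_levels(void)
{
    struct topo_level_sketch *tl;
    int n = 0;

    for_each_topo_level(tl)
        n++;
    return n;
}
```

Keeping the start expression and the termination test in one place means a
later change to the table layout only has to touch the macro.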

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e0e04542f54e9464bd9da54f5ccfe62ec6c4c0bc.1370861520.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:45 +02:00
Viresh Kumar c75e01288c sched: Don't set sd->child to NULL when it is already NULL
Memory for sd is allocated with kzalloc_node() which will initialize its fields
with zero. In build_sched_domain() we are setting sd->child to child even if
child is NULL, which isn't required.

Let's do it only if child isn't NULL.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/f4753a1730051341003ad2ad29a3229c7356678e.1370861520.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:45 +02:00
Viresh Kumar 1c63216940 sched: Don't initialize alloc_state in build_sched_domains()
alloc_state will be overwritten by __visit_domain_allocation_hell() and so we
don't actually need to initialize alloc_state.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/df57734a075cc5ad130e1ae498702e24f2529ab8.1370861520.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:44 +02:00
Viresh Kumar 22da956953 sched: Optimize build_sched_domains() for saving first SD node for a cpu
We save the first scheduling domain for a cpu in build_sched_domains() by
iterating over the nested sd->child list. We don't actually need to do it this
way.

tl will be equal to sched_domain_topology for the first iteration and so we can
set *per_cpu_ptr(d.sd, i) based on that.  So, save pointer to first SD while
running the iteration loop over tl's.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/fc473527cbc4dfa0b8eeef2a59db74684eb59a83.1370436120.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:43 +02:00
Viresh Kumar 4a850cbefa sched: Remove unused params of build_sched_domain()
build_sched_domain() never uses parameter struct s_data *d and so passing it is
useless.

Remove it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/545e0b4536166a15b4475abcafe5ed0db4ad4a2c.1370436120.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:42 +02:00
Viresh Kumar 0a0fca9d83 sched: Rename sched.c as sched/core.c in comments and Documentation
Most of the stuff from kernel/sched.c was moved to kernel/sched/core.c a long
time ago and the comments/Documentation never got updated.

I noticed this while going through sched-domains.txt and so thought of
fixing it globally.

I haven't cross-checked whether the stuff that is referenced in sched/core.c by
all these files is still present and hasn't changed, as that wasn't the motive
behind this patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/cdff76a265326ab8d71922a1db5be599f20aad45.1370329560.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:42 +02:00
Michael Wang 8404c90d05 sched: Remove the useless declaration in kernel/sched/fair.c
default_cfs_period(), do_sched_cfs_period_timer() and do_sched_cfs_slack_timer()
are already defined earlier, so there is no need to declare them again.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51AD8808.7020608@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:41 +02:00
Michael Wang 22b958d8cc sched: Refine the code in unthrottle_cfs_rq()
Directly use rq to save some code.

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51AD87EB.1070605@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:41 +02:00
Kirill Tkhai e23ee74777 sched/rt: Simplify pull_rt_task() logic and remove .leaf_rt_rq_list
[ Peter, this is based off of some of my work, I ran it through a few
  tests and it passed. I also reviewed it, and added my SOB as I am
  somewhat a co-author to it. ]

Based on the patch by Steven Rostedt from previous year:

https://lkml.org/lkml/2012/4/18/517

1) Simplify pull_rt_task() logic: search in pushable tasks of dest runqueue.
The only pullable tasks are the tasks which are pushable in their local rq,
and no others.

2) Remove the .leaf_rt_rq_list member of struct rt_rq and the functions
connected with it: nobody uses it any more.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/287571370557898@web7d.yandex.ru
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:58:40 +02:00
Ingo Molnar d81344c508 Merge branch 'sched/urgent' into sched/core
Merge in fixes before applying ongoing new work.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:55:31 +02:00
Steven Rostedt 29bb9e5a75 tracing/context-tracking: Add preempt_schedule_context() for tracing
Dave Jones hit the following bug report:

 ===============================
 [ INFO: suspicious RCU usage. ]
 3.10.0-rc2+ #1 Not tainted
 -------------------------------
 include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
 other info that might help us debug this:
 RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
 RCU used illegally from extended quiescent state!
 2 locks held by cc1/63645:
  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0

 CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
 Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
  0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
  ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
  0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
 Call Trace:
  [<ffffffff816ae383>] dump_stack+0x19/0x1b
  [<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
  [<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
  [<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
  [<ffffffff8108dffc>] update_curr+0xec/0x240
  [<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
  [<ffffffff816b3a71>] __schedule+0x161/0x9b0
  [<ffffffff816b4721>] preempt_schedule+0x51/0x80
  [<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
  [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
  [<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
  [<ffffffff816be280>] ftrace_call+0x5/0x2f
  [<ffffffff816b681d>] ? retint_careful+0xb/0x2e
  [<ffffffff816b4805>] ? schedule_user+0x5/0x70
  [<ffffffff816b4805>] ? schedule_user+0x5/0x70
  [<ffffffff816b6824>] ? retint_careful+0x12/0x2e
 ------------[ cut here ]------------

What happened was that the function tracer traced the schedule_user() code
that tells RCU that the system is coming back from userspace, and to
add the CPU back to the RCU monitoring.

Because the function tracer calls preempt_disable_notrace() and
preempt_enable_notrace(), and preempt_enable_notrace() checks the
NEED_RESCHED flag, preempt_schedule() is called if the flag is set. But
this happens before the user_exit() function can inform the kernel that
the CPU is no longer in user mode and needs to be accounted for by RCU.

The fix is to create a new preempt_schedule_context() that checks if
the kernel is still in user mode and, if so, switches it to kernel mode
before calling schedule. It also switches back to user mode when coming
back from schedule, if need be.

The only user of this currently is the preempt_enable_notrace(), which is
only used by the tracing subsystem.
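
The control flow described above can be modelled in plain userspace C. This
is a toy model under stated assumptions, not the kernel implementation: the
state variable, the counters and every `_sketch` name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the context-tracking state of one CPU. */
enum ctx_state_sketch { CTX_KERNEL, CTX_USER };

static enum ctx_state_sketch ctx_state = CTX_USER;
static int schedule_calls_total;
static int schedule_calls_in_user_ctx;   /* non-zero would be the bug */

/* Stand-in for schedule(); records which state it was entered in. */
static void schedule_sketch(void)
{
    schedule_calls_total++;
    if (ctx_state == CTX_USER)
        schedule_calls_in_user_ctx++;
}

/*
 * If still marked as user context, switch to kernel context first,
 * schedule, then restore the previous state on the way back.
 */
static void preempt_schedule_context_sketch(void)
{
    bool was_user = (ctx_state == CTX_USER);

    if (was_user)
        ctx_state = CTX_KERNEL;
    schedule_sketch();
    if (was_user)
        ctx_state = CTX_USER;
}
```

In this model the scheduler is never entered while the CPU is still marked
as being in user context, which is the property the commit message says
the fix is meant to restore.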

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:55:10 +02:00
Vincent Guittot 873b4c65b5 sched: Fix clear NOHZ_BALANCE_KICK
I have faced a sequence where the Idle Load Balance was sometimes not
triggered for a while on my platform, in the following scenario:

 CPU 0 and CPU 1 are running tasks and CPU 2 is idle

 CPU 1 kicks the Idle Load Balance
 CPU 1 selects CPU 2 as the new Idle Load Balancer
 CPU 1 sets NOHZ_BALANCE_KICK for CPU 2
 CPU 1 sends a reschedule IPI to CPU 2

 While CPU 2 wakes up, CPU 0 or CPU 1 migrates a waking task A to CPU 2

 CPU 2 finally wakes up, runs task A and discards the Idle Load Balance
       task A quickly goes back to sleep (before a tick occurs on CPU 2)
 CPU 2 goes back to idle with NOHZ_BALANCE_KICK set

Whenever CPU 2 is subsequently selected as the ILB, no reschedule IPI will
be sent because NOHZ_BALANCE_KICK is already set, and no Idle Load Balance
will be performed.

We must then wait for the sched softirq to be raised on CPU 2 by another
part of the kernel before NOHZ_BALANCE_KICK is cleared.

The proposed solution clears NOHZ_BALANCE_KICK in schedule_ipi if
we can't raise the sched_softirq for the Idle Load Balance.

Change since V1:

- move the clearing of NOHZ_BALANCE_KICK into got_nohz_idle_kick if the ILB
  can't run on this CPU (as suggested by Peter)
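
The stuck-flag scenario and the fix can be reproduced with a small model.
This is a sketch under simplifying assumptions: one boolean stands in for
NOHZ_BALANCE_KICK, and the struct and function names are hypothetical
rather than the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 3

/* Hypothetical per-CPU state for the model. */
struct cpu_sketch {
    bool kick_pending;   /* stands in for NOHZ_BALANCE_KICK */
    bool idle;
    int  ipis_received;
    int  balances_run;
};

static struct cpu_sketch cpus[NR_CPUS_SKETCH];

/* Kicker side: send an IPI only if no kick is already pending. */
static bool kick_ilb_sketch(int target)
{
    if (cpus[target].kick_pending)
        return false;
    cpus[target].kick_pending = true;
    cpus[target].ipis_received++;
    return true;
}

/*
 * Target side, run when the IPI arrives. With clear_if_busy == false
 * this models the old behaviour, where a busy CPU left the flag set.
 */
static void handle_kick_ipi_sketch(int cpu, bool clear_if_busy)
{
    if (!cpus[cpu].kick_pending)
        return;
    if (cpus[cpu].idle) {
        cpus[cpu].balances_run++;
        cpus[cpu].kick_pending = false;
    } else if (clear_if_busy) {
        cpus[cpu].kick_pending = false;
    }
}
```

With `clear_if_busy` false, a CPU that was busy when the IPI arrived keeps
the flag set, so every later kick is silently dropped. With it true, the
next kick goes through once the CPU is idle again.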

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1370419991-13870-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-19 12:55:09 +02:00
Andrew Jones b0bc225d0e sched/x86: Construct all sibling maps if smt
Commit 316ad24830 ("sched/x86: Rewrite
set_cpu_sibling_map()") broke the construction of sibling maps,
which also broke the booted_cores accounting.

Before the rewrite, if smt was present, then each map was
updated for each smt sibling. After the rewrite only
cpu_sibling_mask gets updated, as the llc and core maps depend
on 'has_mc = x86_max_cores > 1' instead. This leads to problems
with topologies like the following:

(qemu -smp sockets=2,cores=1,threads=2)

  processor       : 0
  physical id     : 0
  siblings        : 1    <= should be 2
  core id         : 0
  cpu cores       : 1

  processor       : 1
  physical id     : 0
  siblings        : 1    <= should be 2
  core id         : 0
  cpu cores       : 0    <= should be 1

  processor       : 2
  physical id     : 1
  siblings        : 1    <= should be 2
  core id         : 0
  cpu cores       : 1

  processor       : 3
  physical id     : 1
  siblings        : 1    <= should be 2
  core id         : 0
  cpu cores       : 0    <= should be 1

This patch restores the former construction by defining has_mc
as (has_smt || x86_max_cores > 1). This should be fine as there
were no (has_smt && !has_mc) conditions in the context.
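
The two conditions can be compared directly. In this sketch the function
names and parameters are hypothetical; only the boolean expressions come
from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition used after the rewrite: only multi-core packages qualify. */
static bool has_mc_old(int threads_per_core, int max_cores)
{
    (void)threads_per_core;   /* ignored, which is the problem */
    return max_cores > 1;
}

/* Restored condition: SMT siblings also trigger the wider map updates. */
static bool has_mp_new(int threads_per_core, int max_cores)
{
    bool has_smt = threads_per_core > 1;

    return has_smt || max_cores > 1;
}
```

For the qemu topology quoted above (cores=1, threads=2), the old condition
is false and the restored one is true.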

Also rename has_mc to has_mp now that it's not just for cores.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: a.p.zijlstra@chello.nl
Cc: fenghua.yu@intel.com
Link: http://lkml.kernel.org/r/1369831695-11970-1-git-send-email-drjones@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-31 13:10:38 +02:00
Kamalesh Babulal 0de358f1c2 sched/fair: Remove unused variable from expire_cfs_rq_runtime()
Commit 78becc2709 ("sched: Use an accessor to read the rq clock")
introduces rq_clock(), which obsoletes the use of the "rq" variable
in expire_cfs_rq_runtime() and triggers this build warning:

  kernel/sched/fair.c: In function 'expire_cfs_rq_runtime':
  kernel/sched/fair.c:2159:13: warning: unused variable 'rq' [-Wunused-variable]

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Turner <pjt@google.com>
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/1369904660-14169-1-git-send-email-kamalesh@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-31 13:02:29 +02:00
Linus Torvalds dcdbe33add Merge branch 'mn10300' (mn10300 fixes from David Howells)
Merge mn10300 fixes from David Howells.

* emailed patches from David Howells <dhowells@redhat.com>:
  MN10300: Need pci_iomap() and __pci_ioport_map() defining
  MN10300: ASB2305's PCI code needs the definition of XIRQ1
  MN10300: Enable IRQs more in system call exit work path
  MN10300: Fix ret_from_kernel_thread
2013-05-30 13:39:01 +09:00
David Howells 1aeeac7ad4 MN10300: Need pci_iomap() and __pci_ioport_map() defining
Include the generic definitions of pci_iomap() and __pci_ioport_map()
otherwise we can get errors like:

  lib/pci_iomap.c: In function 'pci_iomap':
  lib/pci_iomap.c:37: error: implicit declaration of function '__pci_ioport_map'
  lib/pci_iomap.c:37: warning: return makes pointer from integer without a cast

and:

  drivers/pci/quirks.c: In function 'disable_igfx_irq':
  drivers/pci/quirks.c:2893: error: implicit declaration of function 'pci_iomap'
  drivers/pci/quirks.c:2893: warning: initialization makes pointer from integer without a cast
  drivers/pci/quirks.c: In function 'reset_ivb_igd':
  drivers/pci/quirks.c:3133: warning: assignment makes pointer from integer without a cast

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Ken Cox <jkc@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-30 13:38:48 +09:00
David Howells b8bc9b0237 MN10300: ASB2305's PCI code needs the definition of XIRQ1
The code for PCI in the ASB2305 needs the definition of XIRQ1 from proc/irq.h
otherwise the following error appears:

  arch/mn10300/unit-asb2305/pci.c: In function 'unit_pci_init':
  arch/mn10300/unit-asb2305/pci.c:481: error: 'XIRQ1' undeclared (first use in this function)
  arch/mn10300/unit-asb2305/pci.c:481: error: (Each undeclared identifier is reported only once
  arch/mn10300/unit-asb2305/pci.c:481: error: for each function it appears in.)

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Ken Cox <jkc@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-30 13:38:48 +09:00
David Howells d17fc238ac MN10300: Enable IRQs more in system call exit work path
Enable IRQs when calling schedule() for TIF_NEED_RESCHED and
do_notify_resume().  If interrupts are not enabled during do_notify_resume(), a
warning can be seen (see lower down).

Whilst we're at it, resume_userspace can be made local to entry.S as it is not
called outside of there and it can be merged with the part of work_resched that
occurs after schedule() is called.

  WARNING: at kernel/softirq.c:160 local_bh_enable+0x42/0xa0()
  Call Trace:
    local_bh_enable+0x42/0xa0
    unix_release_sock+0x86/0x23c
    unix_release+0x20/0x28
    sock_release+0x17/0x88
    sock_close+0x20/0x28
    __fput+0xc9/0x1fc
    ____fput+0xb/0x10
    task_work_run+0x64/0x78
    do_notify_resume+0x53d/0x544
    work_notifysig+0xa/0xc

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Ken Cox <jkc@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-30 13:38:48 +09:00
David Howells 1e00227d4e MN10300: Fix ret_from_kernel_thread
ret_from_kernel_thread needs to set A2 to the thread_info pointer before
jumping to syscall_exit.

Without this, we never correctly start userspace.

This was caused by the rejuggling of the fork/exec paths in commit
ddf23e87a8 ("mn10300: switch to saner kernel_execve() semantics")

Reported-by: Ken Cox <jkc@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Ken Cox <jkc@redhat.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-30 13:38:48 +09:00
Linus Torvalds 7b55eab81e Pin control fixes for v3.10:
 - Six patches fixing up the suspend/resume and wakeup
   handling of the Samsung and Exynos drivers.
 - Errorpath fixes for four different drivers. All on
   the probe() errorpath.
 - Make the debugfs code for pin config take the right
   mutex.

Merge tag 'pinctrl-fixes-v3.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin-control fixes from Linus Walleij:
 - Six patches fixing up the suspend/resume and wakeup handling of the
   Samsung and Exynos drivers.
 - Errorpath fixes for four different drivers.  All on the probe()
   errorpath.
 - Make the debugfs code for pin config take the right mutex.

* tag 'pinctrl-fixes-v3.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: pinconf: take the right mutex
  pinctrl: sunxi: fix error return code in sunxi_pinctrl_probe()
  pinctrl: exynos: Handle suspend/resume of GPIO EINT registers
  pinctrl: samsung: Allow per-bank SoC-specific private data
  pinctrl: samsung: Add support for SoC-specific suspend/resume callbacks
  pinctrl: Don't override the error code in probe error handling
  ARM: EXYNOS: Fix EINT wake-up mask configuration when pinctrl is used
  pinctrl: exynos: Add support for set_irq_wake of wake-up EINTs
  pinctrl: samsung: fix suspend/resume functionality
2013-05-30 08:54:29 +09:00
Linus Torvalds c476321533 ARM: Exynos fixes for 3.10-rc
Here's a shorter set of fixes for 3.10, all for Samsung Exynos platforms.
 
 It also includes a defconfig update so that exynos_defconfig provides
 a meaningful set of drivers to boot an unmodified kernel on the Samsung
 ARM-based Chromebooks.

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM Exynos fixes from Olof Johansson:
 "Here's a shorter set of fixes for 3.10, all for Samsung Exynos
  platforms.

  It also includes a defconfig update so that exynos_defconfig provides
  a meaningful set of drivers to boot an unmodified kernel on the
  Samsung ARM-based Chromebooks."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: exynos: defconfig update
  ARM: SAMSUNG: Add names to fimd0 IRQ resources
  ARM: EXYNOS: fix software reset logic for EXYNOS5440 SOC
  ARM: EXYNOS: Fix support of Exynos4210 rev0 SoC
  ARM: dts: Enabling samsung-usb2phy driver for exynos5250
2013-05-29 19:24:55 +09:00
Olof Johansson da9d0fbf5e ARM: exynos: defconfig update
This turns on a number of configs that are useful on the Chromebook, but also
good to have on in general:

* USB host and MMC drivers(!)
* I2C GPIO arbitration driver
* CYAPA trackpad driver
* simplefb
* CROS EC and keyboard drivers
* S5M8767 driver
* MAX77686 drivers
* MAX8997 driver
* DEVTMPFS + mount
* DM_CRYPT (as module)
* CRYPTOLOOP
* HIGHMEM
* PRINTK timestamps

This also turns off DEBUG_LL, and switches the hardcoded Samsung lowlevel
uart to uart 3 (which is only used to show the "uncompressing kernel"
message at boot, it seems).

Signed-off-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Tested-by: Tushar Behera <tushar.behera@linaro.org>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
2013-05-28 17:21:41 -07:00
Linus Torvalds 58f8bbd2e3 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "This is mostly exynos and intel fixes, along with some vblank patches
  I lost from Rob a few months ago that make wayland work better on lots
  of GPUs, also a qxl kconfig fix."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (22 commits)
  qxl: fix Kconfig deps - select FB_DEFERRED_IO
  drm/exynos: replace request_threaded_irq with devm function
  drm/exynos: remove unnecessary devm_kfree
  drm/exynos: fix build warnings from ipp fimc
  drm/exynos: cleanup device pointer usages
  drm/exynos: wait for the completion of pending page flip
  drm/exynos: use drm_send_vblank_event() helper
  drm/i915: avoid premature DP AUX timeouts
  drm/i915: avoid premature timeouts in __wait_seqno()
  drm/i915: use msecs_to_jiffies_timeout instead of open coding the same
  drm/i915: add msecs_to_jiffies_timeout to guarantee minimum duration
  drm/i915: force full modeset if the connector is in DPMS OFF mode
  drm/exynos: page flip fixes
  drm/exynos: exynos_hdmi: Pass correct pointer to free_irq()
  drm/exynos: exynos_drm_ipp: Fix incorrect usage of IS_ERR_OR_NULL
  drm/exynos: exynos_drm_fbdev: Fix incorrect usage of IS_ERR_OR_NULL
  drm/imx: use drm_send_vblank_event() helper
  drm/shmob: use drm_send_vblank_event() helper
  drm/radeon: use drm_send_vblank_event() helper
  drm/nouveau: use drm_send_vblank_event() helper
  ...
2013-05-28 10:11:34 -07:00
Linus Torvalds 30a9e50143 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This push fixes a crash in the new sha256_ssse3 driver as well as a
  DMA setup/teardown bug in caam"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: sha256_ssse3 - fix stack corruption with SSSE3 and AVX implementations
  crypto: caam - fix inconsistent assoc dma mapping direction
2013-05-28 10:09:38 -07:00
Linus Torvalds 320b34e3e0 Merge branch 'for-3.10' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
 "Fixes for a couple of DFS problems, a problem with extended security
  negotiation and two other small cifs fixes"

* 'for-3.10' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix composing of mount options for DFS referrals
  cifs: stop printing the unc= option in /proc/mounts
  cifs: fix error handling when calling cifs_parse_devname
  cifs: allow sec=none mounts to work against servers that don't support extended security
  cifs: fix potential buffer overrun when composing a new options string
  cifs: only set ops for inodes in I_NEW state
2013-05-28 10:08:39 -07:00
Linus Torvalds e3bf756eb9 Two more fixes:
The first one was reported by Mauro Carvalho Chehab, where if a poll()
 is done against a trace buffer for a CPU that has never been online,
 it will crash the kernel, as buffers are only created when a CPU comes
 on line, but the trace files are for all possible CPUs.
 
 This fix is to check if the buffer was allocated and if not return -EINVAL.
 
 That was the simple fix, the real fix is a bit more complex and not for
 a -rc release. We could have the files created when the CPUs come online.
 That would require some design changes.
 
 The second one was reported by Peter Zijlstra. If the kernel command line
 has ftrace=nop, it will lock up the system on boot up. This is because
 the new design for 3.10 has the nop tracer bootstrap the tracing subsystem.
 When ftrace=<trace> is defined and that tracer is registered, it
 starts the tracing, but uses the nop tracer to clear things out.
 What happened here was that ftrace=nop caused the registering of nop
 to start it and use nop before it was initialized.
 
 The only thing nop needs to have done to initialize it is to have the
 tracer point its current_tracer structure member to the nop tracer.
 Doing that before registering the nop tracer makes everything work.

Merge tag 'trace-fixes-v3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Two more fixes:

  The first one was reported by Mauro Carvalho Chehab, where if a poll()
  is done against a trace buffer for a CPU that has never been online,
  it will crash the kernel, as buffers are only created when a CPU comes
  on line, but the trace files are for all possible CPUs.

  This fix is to check if the buffer was allocated and if not return
  -EINVAL.

  That was the simple fix, the real fix is a bit more complex and not
  for a -rc release.  We could have the files created when the CPUs come
  online.  That would require some design changes.

  The second one was reported by Peter Zijlstra.  If the kernel command
  line has ftrace=nop, it will lock up the system on boot up.  This is
  because the new design for 3.10 has the nop tracer bootstrap the
  tracing subsystem.  When ftrace=<trace> is defined and that tracer
  is registered, it starts the tracing, but uses the nop tracer to clear
  things out.  What happened here was that ftrace=nop caused the
  registering of nop to start it and use nop before it was initialized.

  The only thing nop needs to have done to initialize it is to have the
  tracer point its current_tracer structure member to the nop tracer.
  Doing that before registering the nop tracer makes everything work."

* tag 'trace-fixes-v3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Do not poll non allocated cpu buffers
  tracing: Fix crash when ftrace=nop on the kernel command line
2013-05-28 09:39:04 -07:00
Linus Torvalds 3c48dd4964 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
 - futex support that I had missed before,
 - A long-overdue update of the m68k defconfigs.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Update defconfigs for v3.9
  m68k: implement futex.h to support userspace robust futexes and PI mutexes
2013-05-28 09:23:23 -07:00
Linus Torvalds 6e7d43f494 Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
Pull microblaze fixes from Michal Simek:
 "One patch fixes futex support, and my patches fix warnings which were
  reported by Geert's regression testing"

* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Reversed logic in futex cmpxchg
  microblaze: Use proper casting for inb/inw/inl in io.h
  microblaze: Initialize temp variable to remove compilation warning
2013-05-28 09:21:13 -07:00
Steven Rostedt (Red Hat) 6721cb6002 ring-buffer: Do not poll non allocated cpu buffers
The tracing infrastructure sets up trace files for all possible CPUs,
but because it uses ring buffer polling, the polling code can be called
for a CPU whose buffer has not been allocated. This causes a kernel
oops when it accesses a ring buffer CPU buffer that belongs to a
possible CPU but has not been allocated yet because the CPU has never
been online.

Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-05-28 10:53:20 -04:00
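
The guard described in the commit above (check whether the per-CPU buffer was allocated and return -EINVAL if not) can be sketched in userspace C. This is only an illustration, not the kernel code: `ring_buffer_sketch`, `cpu_buffer_sketch`, `poll_cpu_buffer()` and `NR_POSSIBLE_CPUS_SKETCH` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical number of possible CPUs for this sketch. */
#define NR_POSSIBLE_CPUS_SKETCH 4

/* Hypothetical per-CPU buffer: only tracks how many events are pending. */
struct cpu_buffer_sketch {
    int pending;
};

/* One slot per possible CPU; a slot stays NULL until that CPU has been online. */
struct ring_buffer_sketch {
    struct cpu_buffer_sketch *buffers[NR_POSSIBLE_CPUS_SKETCH];
};

/*
 * Return the number of pending events for @cpu, or -EINVAL when the CPU
 * index is out of range or its buffer was never allocated.
 */
int poll_cpu_buffer(const struct ring_buffer_sketch *rb, int cpu)
{
    assert(rb != NULL);

    if (cpu < 0 || cpu >= NR_POSSIBLE_CPUS_SKETCH)
        return -EINVAL;

    /* The important check: never dereference a buffer that does not exist. */
    if (rb->buffers[cpu] == NULL)
        return -EINVAL;

    return rb->buffers[cpu]->pending;
}
```

Polling a CPU that has never been online then yields an error code instead of a NULL dereference, which is the behaviour the commit message describes.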
Stanislaw Gruszka 84f9f3a156 sched: Use swap() macro in scale_stime()
Simple cleanup.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1367501673-6563-1-git-send-email-sgruszka@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 11:58:10 +02:00
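
For readers unfamiliar with this kind of cleanup, here is a minimal userspace sketch of a swap()-style macro replacing an open-coded three-line exchange. It is not the kernel's scale_stime(): `swap_sketch` and `order_descending()` are hypothetical names, and `__typeof__` is a GCC/Clang extension assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Exchange two lvalues of the same type (relies on the __typeof__ extension). */
#define swap_sketch(a, b) \
    do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical helper: make sure *hi holds the larger of the two values. */
void order_descending(uint64_t *hi, uint64_t *lo)
{
    assert(hi != NULL && lo != NULL);

    if (*hi < *lo)
        swap_sketch(*hi, *lo);  /* replaces: tmp = *hi; *hi = *lo; *lo = tmp; */
}
```

The macro keeps the temporary variable out of the caller's scope, so the calling function no longer needs its own scratch variable.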
Geert Uytterhoeven 37c14e83ee m68k: Update defconfigs for v3.9
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2013-05-28 10:22:06 +02:00
Frederic Weisbecker 78becc2709 sched: Use an accessor to read the rq clock
Read the runqueue clock through an accessor. This
prepares for adding a debugging infrastructure to
detect missing or redundant calls to update_rq_clock()
between a scheduler's entry and exit point.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1365724262-20142-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 09:40:27 +02:00
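
One way to see why an accessor helps: once every read goes through a single function, that function can also do bookkeeping. The sketch below is a userspace illustration under stated assumptions, not the kernel's implementation. The `rq_dbg` structure, its counters and all three functions are hypothetical; the commit itself only says the accessor prepares for such debugging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runqueue with debug counters added for the sketch. */
struct rq_dbg {
    uint64_t clock;
    unsigned int updates;            /* clock updates since scheduler entry */
    unsigned int stale_reads;        /* reads made before any update */
    unsigned int redundant_updates;  /* second and later updates per entry */
};

/* Mark a scheduler entry point: no update has happened yet. */
void rq_dbg_enter(struct rq_dbg *rq)
{
    assert(rq != NULL);
    rq->updates = 0;
}

/* Stand-in for update_rq_clock(); @now is supplied by the caller. */
void rq_dbg_update_clock(struct rq_dbg *rq, uint64_t now)
{
    assert(rq != NULL);
    if (rq->updates > 0)
        rq->redundant_updates++;
    rq->clock = now;
    rq->updates++;
}

/* Accessor: the only place the clock is read, so it can flag stale reads. */
uint64_t rq_dbg_clock(struct rq_dbg *rq)
{
    assert(rq != NULL);
    if (rq->updates == 0)
        rq->stale_reads++;
    return rq->clock;
}
```

With direct field reads scattered through the code, neither counter could be maintained without touching every call site.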
Frederic Weisbecker 1a55af2e45 sched: Update rq clock earlier in unthrottle_cfs_rq
In this function we use rq->clock right before the rq clock is
updated. Call update_rq_clock() just before that use instead, to
avoid reading a stale rq clock value.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1365724262-20142-5-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 09:40:25 +02:00
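
The ordering problem can be shown with a small userspace sketch. It assumes, purely for illustration, that the clock is used to accumulate a throttled duration; `throttle_sketch`, its fields and both functions are hypothetical and do not reproduce the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: a cached clock plus throttling bookkeeping. */
struct throttle_sketch {
    uint64_t clock;            /* cached clock, only as fresh as the last update */
    uint64_t throttled_since;  /* clock value when throttling started */
    uint64_t throttled_time;   /* accumulated throttled duration */
};

/* Stand-in for update_rq_clock(); @now is supplied by the caller. */
static void update_clock_sketch(struct throttle_sketch *t, uint64_t now)
{
    t->clock = now;
}

/* Old ordering: the duration is computed from the stale cached clock. */
void unthrottle_stale(struct throttle_sketch *t, uint64_t now)
{
    assert(t != NULL);
    t->throttled_time += t->clock - t->throttled_since;
    update_clock_sketch(t, now);
}

/* Fixed ordering: refresh the clock first, then use it. */
void unthrottle_fresh(struct throttle_sketch *t, uint64_t now)
{
    assert(t != NULL);
    update_clock_sketch(t, now);
    t->throttled_time += t->clock - t->throttled_since;
}
```

Starting both variants from a clock of 100 and unthrottling at time 150, the stale ordering accounts 0 time units and the fresh ordering accounts 50.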
Frederic Weisbecker 1ad4ec0dc7 sched: Update rq clock before calling check_preempt_curr()
check_preempt_curr() of the fair class needs an up-to-date sched clock
value to update runtime stats of the current task of the target's rq.

When a task is woken up, activate_task() is usually called right before
ttwu_do_wakeup() unless the task is still in the runqueue. In the latter
case we need to update the rq clock explicitly because activate_task()
isn't here to do the job for us.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1365724262-20142-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 09:40:24 +02:00
Frederic Weisbecker 71b1da46ff sched: Update rq clock before setting fair group shares
We may update the execution time in

    sched_group_set_shares()->update_cfs_shares()->reweight_entity()->update_curr()

before reweighting the entity while setting the group shares, and this requires
an up-to-date version of the runqueue clock.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1365724262-20142-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 09:40:23 +02:00
Frederic Weisbecker 77bd39702f sched: Update rq clock before migrating tasks out of dying CPU
The sched_class::put_prev_task() callbacks of the rt and fair
classes refer to the rq clock to update their runtime
statistics, but an rq clock update is missing from the CPU
hotplug notifier's entry point into the scheduler.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1365724262-20142-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 09:40:23 +02:00
Neil Zhang c5405a495e sched: Remove redundant update_runtime notifier
migration_call() will do all the things that update_runtime() does.
So let's remove it.

Furthermore, there is a potential risk that the current code will hit
the BUG_ON at line 689 of rt.c when doing CPU hotplug while realtime
threads are running, because runtime is enabled twice while rt_runtime
may already have changed.

Signed-off-by: Neil Zhang <zhangwm@marvell.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1365685499-26515-1-git-send-email-zhangwm@marvell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-05-28 09:40:22 +02:00