Commit graph

1963 commits

Author SHA1 Message Date
Rafael J. Wysocki 0843e83c1a cpufreq: intel_pstate: Proportional algorithm for Atom
The PID algorithm used by the intel_pstate driver tends to drive
performance to the minimum for workloads with utilization below the
setpoint, which is undesirable, so replace it with a modified
"proportional" algorithm on Atom.

The new algorithm will set the new P-state to be 1.25 times the
available maximum times the (frequency-invariant) utilization during
the previous sampling period except when the target P-state computed
this way is lower than the average P-state during the previous
sampling period.  In the latter case, it will increase the target by
50% of the difference between it and the average P-state to prevent
performance from dropping down too fast in some cases.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2016-10-12 20:58:13 +02:00
Rafael J. Wysocki f00593a4bd cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance()
Make the comment explaining the meaning of the perf_scaled variable
in get_target_pstate_use_performance() more straightforward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-10-09 18:54:57 +02:00
Srinivas Pandruvada f9f4872df6 cpufreq: intel_pstate: Fix unsafe HWP MSR access
This is a requirement that MSR MSR_PM_ENABLE must be set to 0x01 before
reading MSR_HWP_CAPABILITIES on a given CPU. If cpufreq init() is
scheduled on a CPU which is not same as policy->cpu or migrates to a
different CPU before calling msr read for MSR_HWP_CAPABILITIES, it
is possible that MSR_PM_ENABLE was not to set to 0x01 on that CPU.
This will cause GP fault. So like other places in this path
rdmsrl_on_cpu should be used instead of rdmsrl.

Moreover the scope of MSR_HWP_CAPABILITIES is on per thread basis, so it
should be read from the same CPU, for which MSR MSR_HWP_REQUEST is
getting set.

dmesg dump or warning:

[   22.014488] WARNING: CPU: 139 PID: 1 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x68/0x70
[   22.014492] unchecked MSR access error: RDMSR from 0x771
[   22.014493] Modules linked in:
[   22.014507] CPU: 139 PID: 1 Comm: swapper/0 Not tainted 4.7.5+ #1
...
...
[   22.014516] Call Trace:
[   22.014542]  [<ffffffff813d7dd1>] dump_stack+0x63/0x82
[   22.014558]  [<ffffffff8107bc8b>] __warn+0xcb/0xf0
[   22.014561]  [<ffffffff8107bcff>] warn_slowpath_fmt+0x4f/0x60
[   22.014563]  [<ffffffff810676f8>] ex_handler_rdmsr_unsafe+0x68/0x70
[   22.014564]  [<ffffffff810677d9>] fixup_exception+0x39/0x50
[   22.014604]  [<ffffffff8102e400>] do_general_protection+0x80/0x150
[   22.014610]  [<ffffffff817f9ec8>] general_protection+0x28/0x30
[   22.014635]  [<ffffffff81687940>] ? get_target_pstate_use_performance+0xb0/0xb0
[   22.014642]  [<ffffffff810600c7>] ? native_read_msr+0x7/0x40
[   22.014657]  [<ffffffff81688123>] intel_pstate_hwp_set+0x23/0x130
[   22.014660]  [<ffffffff81688406>] intel_pstate_set_policy+0x1b6/0x340
[   22.014662]  [<ffffffff816829bb>] cpufreq_set_policy+0xeb/0x2c0
[   22.014664]  [<ffffffff81682f39>] cpufreq_init_policy+0x79/0xe0
[   22.014666]  [<ffffffff81682cb0>] ? cpufreq_update_policy+0x120/0x120
[   22.014669]  [<ffffffff816833a6>] cpufreq_online+0x406/0x820
[   22.014671]  [<ffffffff8168381f>] cpufreq_add_dev+0x5f/0x90
[   22.014717]  [<ffffffff81530ac8>] subsys_interface_register+0xb8/0x100
[   22.014719]  [<ffffffff816821bc>] cpufreq_register_driver+0x14c/0x210
[   22.014749]  [<ffffffff81fe1d90>] intel_pstate_init+0x39d/0x4d5
[   22.014751]  [<ffffffff81fe13f2>] ? cpufreq_gov_dbs_init+0x12/0x12

Cc: 4.3+ <stable@vger.kernel.org> # 4.3+
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-10-09 18:54:13 +02:00
Rafael J. Wysocki b6e2511782 Merge branch 'pm-cpufreq-sched' into pm-cpufreq 2016-10-02 01:42:33 +02:00
Colin Ian King 9ad0a1b6a2 cpufreq: st: add missing \n to end of dev_err message
Trival fix, dev_err message is missing a \n, so add it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-26 15:11:42 +02:00
Colin Ian King 4c232f9469 cpufreq: kirkwood: add missing \n to end of dev_err messages
Trival fix, dev_err messages are missing a \n, so add it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-26 15:10:58 +02:00
Hoan Tran f89f4147f7 cpufreq: CPPC: Avoid overflow when calculating desired_perf
This patch fixes overflow issue when calculating the desired_perf.

Fixes: ad38677df4 (cpufreq: CPPC: Force reporting values in KHz to fix user space interface)
Signed-off-by: Hoan Tran <hotran@apm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-16 23:59:19 +02:00
Dave Gerlach e01072d22d cpufreq: ti: Use generic platdev driver
Now that the cpufreq-dt-platdev is used to create the cpufreq-dt platform
device for all OMAP platforms and the platform code that did it
before has been removed, add ti,am33xx and ti,dra7xx to the machine list
in cpufreq-dt-platdev which had relied on the removed platform code to do
this previously.

Fixes: 7694ca6e1d (cpufreq: omap: Use generic platdev driver)
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.7+ <stable@vger.kernel.org> # 4.7+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-16 23:57:04 +02:00
Srinivas Pandruvada 3ba7bcaa36 cpufreq: intel_pstate: Add io_boost trace
Add io_boost percent to current pstate_sample tracepoint.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-16 23:55:30 +02:00
Rafael J. Wysocki 09c448d3c6 cpufreq: intel_pstate: Use IOWAIT flag in Atom algorithm
Modify the P-state selection algorithm for Atom processors to use
the new SCHED_CPUFREQ_IOWAIT flag instead of the questionable
get_cpu_iowait_time_us() function.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-14 02:28:13 +02:00
Al Stone ad38677df4 cpufreq: CPPC: Force reporting values in KHz to fix user space interface
When CPPC is being used by ACPI on arm64, user space tools such as
cpupower report CPU frequency values from sysfs that are incorrect.

What the driver was doing was reporting the values given by ACPI tables
in whatever scale was used to provide them.  However, the ACPI spec
defines the CPPC values as unitless abstract numbers.  Internal kernel
structures such as struct perf_cap, in contrast, expect these values
to be in KHz.  When these struct values get reported via sysfs, the
user space tools also assume they are in KHz, causing them to report
incorrect values (for example, reporting a CPU frequency of 1MHz when
it should be 1.8GHz).

The downside is that this approach has some assumptions:

   (1) It relies on SMBIOS3 being used, *and* that the Max Frequency
   value for a processor is set to a non-zero value.

   (2) It assumes that all processors run at the same speed, or that
   the CPPC values have all been scaled to reflect relative speed.
   This patch retrieves the largest CPU Max Frequency from a type 4 DMI
   record that it can find.  This may not be an issue, however, as a
   sampling of DMI data on x86 and arm64 indicates there is often only
   one such record regardless.  Since CPPC is relatively new, it is
   unclear if the ACPI ASL will always be written to reflect any sort
   of relative performance of processors of differing speeds.

   (3) It assumes that performance and frequency both scale linearly.

For arm64 servers, this may be sufficient, but it does rely on
firmware values being set correctly.  Hence, other approaches will
be considered in the future.

This has been tested on three arm64 servers, with and without DMI, with
and without CPPC support.

Signed-off-by: Al Stone <ahs3@redhat.com>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:47:44 +02:00
Viresh Kumar 26619804e7 cpufreq: create link to policy only for registered CPUs
If a cpufreq driver is registered very early in the boot stage (e.g.
registered from postcore_initcall()), then cpufreq core may generate
kernel warnings for it.

In this case, the CPUs are brought online, then the cpufreq driver is
registered, and then the CPU topology devices are registered. However,
by the time cpufreq_add_dev() gets called, the cpu device isn't stored
in the per-cpu variable (cpu_sys_devices,) which is read by
get_cpu_device().

So the cpufreq core fails to get device for the CPU, for which
cpufreq_add_dev() was called in the first place and we will hit a
WARN_ON(!cpu_dev).

Even if we reuse the 'dev' parameter passed to cpufreq_add_dev() to
avoid that warning, there might be other CPUs online that share the
policy with the cpu for which cpufreq_add_dev() is called. Eventually
get_cpu_device() will return NULL for them as well, and we will hit the
same WARN_ON() again.

In order to fix these issues, change cpufreq core to create links to the
policy for a cpu only when cpufreq_add_dev() is called for that CPU.

Reuse the 'real_cpus' mask to track that as well.

Note that cpufreq_remove_dev() already handles removal of the links for
individual CPUs and cpufreq_add_dev() has aligned with that now.

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:41:15 +02:00
Julia Lawall 42ce8921cc intel_pstate: constify local structures
For structure types defined in the same file or local header files, find
top-level static structure declarations that have the following
properties:
1. Never reassigned.
2. Address never taken
3. Not passed to a top-level macro call
4. No pointer or array-typed field passed to a function or stored in a
variable.
Declare structures having all of these properties as const.

Done using Coccinelle.
Based on a suggestion by Joe Perches <joe@perches.com>.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:40:24 +02:00
Viresh Kumar 297a66221d cpufreq: dt: Support governor tunables per policy
The cpufreq-dt driver is also used for systems with multiple
clock/voltage domains for CPUs, i.e. multiple cpufreq policies in a
system.

And in such cases the platform users may want to enable "governor
tunables per policy". Support that via platform data, as not all users
of the driver would want that behavior.

Reported-by: Juri Lelli <Juri.Lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:39:12 +02:00
Viresh Kumar 33cc4fc1b2 cpufreq: dt: Update kconfig description
The cpufreq DT driver also supports systems that have multiple
clock/voltage domains for CPUs, i.e. multiple policy systems.

The description of the Kconfig entry was never updated after the driver
was modified to support such systems, fix it.

Reported-by: Juri Lelli <Juri.Lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:39:12 +02:00
Viresh Kumar e86eee6bc2 cpufreq: dt: Remove unused code
This is leftover from an earlier patch which removed the usage of
platform data but forgot to remove this line. Remove it now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:39:12 +02:00
Geert Uytterhoeven ffdf8b867b cpufreq: dt: Add support for r8a7792
Add the compatible string for supporting the generic cpufreq driver on
the Renesas R-Car V2H (r8a7792) SoC.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-13 02:34:14 +02:00
Rafael J. Wysocki d0fbf1d328 Merge back earlier cpufreq material for v4.9. 2016-09-12 14:49:29 +02:00
Rafael J. Wysocki 3689ad7ed6 cpufreq: Drop unnecessary check from cpufreq_policy_alloc()
Since cpufreq_policy_alloc() doesn't use its dev variable for
anything useful, drop that variable from there along with the
NULL check against it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-09-01 00:29:10 +02:00
Wei Yongjun bd37e022e3 cpufreq: dt: Add terminate entry for of_device_id tables
Make sure of_device_id tables are NULL terminated.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Fixes: f56aad1d98 (cpufreq: dt: Add generic platform-device creation support)
CC: 4.7+ <stable@vger.kernel.org> # 4.7+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 02:49:05 +02:00
Markus Elfring b0d8a69d08 cpufreq-SCPI: Delete unnecessary assignment for the field "owner"
The field "owner" is set by the core.
Thus delete an unneeded initialisation.

Generated by scripts/coccinelle/api/platform_no_drv_owner.cocci

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-18 03:42:32 +02:00
Rafael J. Wysocki 58919e83c8 cpufreq / sched: Pass flags to cpufreq_update_util()
It is useful to know the reason why cpufreq_update_util() has just
been called and that can be passed as flags to cpufreq_update_util()
and to the ->func() callback in struct update_util_data.  However,
doing that in addition to passing the util and max arguments they
already take would be clumsy, so avoid it.

Instead, use the observation that the schedutil governor is part
of the scheduler proper, so it can access scheduler data directly.
This allows the util and max arguments of cpufreq_update_util()
and the ->func() callback in struct update_util_data to be replaced
with a flags one, but schedutil has to be modified to follow.

Thus make the schedutil governor obtain the CFS utilization
information from the scheduler and use the "RT" and "DL" flags
instead of the special utilization value of ULONG_MAX to track
updates from the RT and DL sched classes.  Make it non-modular
too to avoid having to export scheduler variables to modules at
large.

Next, update all of the other users of cpufreq_update_util()
and the ->func() callback in struct update_util_data accordingly.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-08-16 22:14:55 +02:00
Chanwoo Choi c4b4057267 cpufreq: dt: Add exynos5433 compatible to use generic cpufreq driver
This patch adds the exynos5433 compatible string for supporting
the generic cpufreq driver on Exynos5433.

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-16 13:57:30 +02:00
Rafael J. Wysocki 0aeeb3e73f Merge branches 'pm-sleep' and 'pm-cpufreq'
* pm-sleep:
  PM / hibernate: Restore processor state before using per-CPU variables
  x86/power/64: Always create temporary identity mapping correctly

* pm-cpufreq:
  cpufreq: powernv: Fix crash in gpstate_timer_handler()
2016-08-12 22:53:58 +02:00
Akshay Adiga 8e85946777 cpufreq: powernv: Fix crash in gpstate_timer_handler()
Commit 09ca4c9b59 (cpufreq: powernv: Replacing pstate_id with
frequency table index) changes calc_global_pstate() to use
cpufreq_table index instead of pstate_id.

But in gpstate_timer_handler(), pstate_id was being passed instead
of cpufreq_table index, which caused index_to_pstate() to access
out of bound indices, leading to this crash.

Adding sanity check for index and pstate, to ensure only valid pstate
and index values are returned.

Call Trace:
[c00000078d66b130] [c00000000011d224] __free_irq+0x234/0x360
(unreliable)
[c00000078d66b1c0] [c00000000011d44c] free_irq+0x6c/0xa0
[c00000078d66b1f0] [c00000000006c4f8] opal_event_shutdown+0x88/0xd0
[c00000078d66b230] [c000000000067a4c] opal_shutdown+0x1c/0x90
[c00000078d66b260] [c000000000063a00] pnv_shutdown+0x20/0x40
[c00000078d66b280] [c000000000021538] machine_restart+0x38/0x90
[c0000000078d66b310] [c000000000965ea0] panic+0x284/0x300
[c00000078d66b3a0] [c00000000001f508] die+0x388/0x450
[c00000078d66b430] [c000000000045a50] bad_page_fault+0xd0/0x140
[c00000078d66b4a0] [c000000000008964] handle_page_fault+0x2c/0x30
   interrupt: 300 at gpstate_timer_handler+0x150/0x260
    LR = gpstate_timer_handler+0x130/0x260
[c00000078d66b7f0] [c000000000132b58] call_timer_fn+0x58/0x1c0
[c00000078d66b880] [c000000000132e20] expire_timers+0x130/0x1d0
[c00000078d66b8f0] [c000000000133068] run_timer_softirq+0x1a8/0x230
[c00000078d66b980] [c0000000000b535c] __do_softirq+0x18c/0x400
[c00000078d66ba70] [c0000000000b5828] irq_exit+0xc8/0x100
[c00000078d66ba90] [c00000000001e214] timer_interrupt+0xa4/0xe0
[c00000078d66bac0] [c0000000000027d0] decrementer_common+0x150/0x180
   interrupt: 901 at arch_local_irq_restore+0x74/0x90
  0] [c000000000106b34] call_cpuidle+0x44/0x90
[c00000078d66be50] [c00000000010708c] cpu_startup_entry+0x38c/0x460
[c00000078d66bf20] [c00000000003d930] start_secondary+0x330/0x380
[c00000078d66bf90] [c000000000008e6c] start_secondary_prolog+0x10/0x14

Fixes: 09ca4c9b59 (cpufreq: powernv: Replacing pstate_id with frequency table index)
Reported-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-06 14:52:26 +02:00
Linus Torvalds 11d8ec408d More power management updates for v4.8-rc1
- Prevent the low-level assembly hibernate code on x86-64 from
    referring to __PAGE_OFFSET directly as a symbol which doesn't work
    when the kernel identity mapping base is randomized, in which case
    __PAGE_OFFSET is a variable (Rafael Wysocki).
 
  - Avoid selecting CPU_FREQ_STAT by default as the statistics are not
    required for proper cpufreq operation (Borislav Petkov).
 
  - Add Skylake-X and Broadwell-X IDs to the intel_pstate's list of
    processors where out-of-band (OBB) control of P-states is possible
    and if that is in use, intel_pstate should not attempt to manage
    P-states (Srinivas Pandruvada).
 
  - Drop some unnecessary checks from the wakeup IRQ handling code in
    the PM core (Markus Elfring).
 
  - Reduce the number operating performance point (OPP) lookups in
    one of the OPP framework's helper functions (Jisheng Zhang).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXpKLNAAoJEILEb/54YlRx6L4P/39GR0kVB7vIyCajdWc4f3gh
 0zh5RbKC0YT4F6upebzgQ9uYS9cv4y+df7ShVwKQAea3wDReEmZhM/egOGw0Ls+8
 SS1MiJq1LSekyMIWH6cXwZsH69/V0LuWTBbzWYBgHUDbfEMlgwV5ZZMEH14/2bWw
 d4SLUUiW5P42im+IDAxpdYneKOrbJo3txj6WbOutgtIrHdPko6lF1dDouKvI1QTk
 zCBOkEB9nELq3rWN/sbPmHzbmbj/yFiiHk5+iqwuKKJZI8PQB9/C6Qmc3wvjtPpc
 GXLPI+OHqLgBMofGsiKOvm3hPQAIjf/ERsUilHLE3qOi/Mi0qj7U4dFivZrPWaCG
 j2bD+b36TffmG1r8L7NYbEKU60syeIFSRqAngbyswu6XF+NdVboaifENGcgM3tBC
 pUC1mFh/4PMKP5zW9mOwE6WSntZkw14CVR+A3fRFOuTBavNvjGdckwLi/aBsZdU3
 K4DJUFzdELF7+JdqnQV35yV2tgMbJhxQa/QBykiFBh3AyqliOZ8uBoIxxLinrGmH
 XWR3kB4ZPRBIStGI9IpG3lNhLLU7mLIdaFhGayicicAwFcLsXN7oKWVzYfUqWA9T
 1ptXApcRf6S9J1JnKHkznoQ1/D1pNYLbH+t7BOtWdK8SWBhMS2Lzo2kyKWsCFkJS
 16J8j1tcLDhN1dvcBT55
 =WvPB
 -----END PGP SIGNATURE-----

Merge tag 'pm-extra-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "A few more fixes and cleanups in the x86-64 low-level hibernation
  code, PM core, cpufreq (Kconfig and intel_pstate), and the operating
  points framework.

  Specifics:

   - Prevent the low-level assembly hibernate code on x86-64 from
     referring to __PAGE_OFFSET directly as a symbol which doesn't work
     when the kernel identity mapping base is randomized, in which case
     __PAGE_OFFSET is a variable (Rafael Wysocki).

   - Avoid selecting CPU_FREQ_STAT by default as the statistics are not
     required for proper cpufreq operation (Borislav Petkov).

   - Add Skylake-X and Broadwell-X IDs to the intel_pstate's list of
     processors where out-of-band (OBB) control of P-states is possible
     and if that is in use, intel_pstate should not attempt to manage
     P-states (Srinivas Pandruvada).

   - Drop some unnecessary checks from the wakeup IRQ handling code in
     the PM core (Markus Elfring).

   - Reduce the number operating performance point (OPP) lookups in one
     of the OPP framework's helper functions (Jisheng Zhang)"

* tag 'pm-extra-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  x86/power/64: Do not refer to __PAGE_OFFSET from assembly code
  cpufreq: Do not default-yes CPU_FREQ_STAT
  cpufreq: intel_pstate: Add more out-of-band IDs
  PM / OPP: optimize dev_pm_opp_set_rate() performance a bit
  PM-wakeup: Delete unnecessary checks before three function calls
2016-08-05 23:26:16 -04:00
Rafael J. Wysocki e2b3b80de5 Merge branches 'pm-sleep', 'pm-cpufreq', 'pm-core' and 'pm-opp'
* pm-sleep:
  x86/power/64: Do not refer to __PAGE_OFFSET from assembly code

* pm-cpufreq:
  cpufreq: Do not default-yes CPU_FREQ_STAT
  cpufreq: intel_pstate: Add more out-of-band IDs

* pm-core:
  PM-wakeup: Delete unnecessary checks before three function calls

* pm-opp:
  PM / OPP: optimize dev_pm_opp_set_rate() performance a bit
2016-08-05 15:46:55 +02:00
Borislav Petkov 79ad70de53 cpufreq: Do not default-yes CPU_FREQ_STAT
CPU frequency transition statistics are not absolutely required for
proper cpufreq operation on the system AFAICT so remove the default-yes
setting in Kconfig.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 01:34:11 +02:00
Linus Torvalds 43a0a98aa8 ARM: SoC driver updates for v4.8
Driver updates for ARM SoCs.
 
 A slew of changes this release cycle. The reset driver tree, that we merge
 through arm-soc for historical reasons, is also sizable this time around.
 
 Among the changes:
 
  - clps711x: Treewide changes to compatible strings, merged here for simplicity.
  - Qualcomm: SCM firmware driver cleanups, move to platform driver
  - ux500: Major cleanups, removal of old mach-specific infrastructure.
  - Atmel external bus memory driver
  - Move of brcmstb platform to the rest of bcm
  - PMC driver updates for tegra, various fixes and improvements
  - Samsung platform driver updates to support 64-bit Exynos platforms
  - Reset controller cleanups moving to devm_reset_controller_register() APIs
  - Reset controller driver for Amlogic Meson
  - Reset controller driver for Hisilicon hi6220
  - ARM SCPI power domain support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnm1XAAoJEIwa5zzehBx35lcP/ApuQarIXeZCQZtjlUBV9McW
 o3o7FhKFHePmEPeoYCvVeK5D8NykTkQv3WpnCknoxPJzxGJF7jbPWQJcVnXfKOXD
 kTcyIK15WL2HHtSE3lYyLfyUPz8AbJyRt0l0cxgcg6jvo+uzlWooNz1y78rLIYzg
 UwRssj7OiHv4dsyYRHZIsjnB8gMWw8rYMk154gP2xy6MnNXXzzOVVnOiVqxSZBm+
 EgIIcROMOqkkHuFlClMYKluIgrmgz1Ypjf+FuAg7dqXZd+TGRrmGXeI7SkGThfLu
 nyvY3N18NViNu7xOUkI9zg7+ifyYM8Si9ylalSICSJdIAxZfiwFqFaLJvVWKU1rY
 rBOBjKckQI0/X9WYusFNFHcijhIFV8/FgGAnVRRMPdvlCss7Zp03C9mR4AEhmKMX
 rLG49x81hU1C+LftC59ml3iB8dhZrrRkbxNHjLFHVGWNrKMrmJKa8JhXGRAoNM+u
 LRauiuJZatqvLfISNvpfcoW2EashVoU3f+uC8ymT3QCyME3wZm0t7T4tllxhMfBl
 sOgJqNkTKDmPLofwm/dASiLML7ZF1WePScrFyOACnj9K4mUD+OaCnowtWoQPu0eI
 aNmT84oosJ2S9F/iUDPtFHXdzQ+1QPPfSiQ9FXMoauciVq/2F+pqq68yYgqoxFOG
 vmkmG2YM4Wyq43u0BONR
 =O8+y
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
 "Driver updates for ARM SoCs.

  A slew of changes this release cycle.  The reset driver tree, that we
  merge through arm-soc for historical reasons, is also sizable this
  time around.

  Among the changes:

   - clps711x: Treewide changes to compatible strings, merged here for simplicity.
   - Qualcomm: SCM firmware driver cleanups, move to platform driver
   - ux500: Major cleanups, removal of old mach-specific infrastructure.
   - Atmel external bus memory driver
   - Move of brcmstb platform to the rest of bcm
   - PMC driver updates for tegra, various fixes and improvements
   - Samsung platform driver updates to support 64-bit Exynos platforms
   - Reset controller cleanups moving to devm_reset_controller_register() APIs
   - Reset controller driver for Amlogic Meson
   - Reset controller driver for Hisilicon hi6220
   - ARM SCPI power domain support"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (100 commits)
  ARM: ux500: consolidate base platform files
  ARM: ux500: move soc_id driver to drivers/soc
  ARM: ux500: call ux500_setup_id later
  ARM: ux500: consolidate soc_device code in id.c
  ARM: ux500: remove cpu_is_u* helpers
  ARM: ux500: use CLK_OF_DECLARE()
  ARM: ux500: move l2x0 init to .init_irq
  mfd: db8500 stop passing around platform data
  ASoC: ab8500-codec: remove platform data based probe
  ARM: ux500: move ab8500_regulator_plat_data into driver
  ARM: ux500: remove unused regulator data
  soc: raspberrypi-power: add CONFIG_OF dependency
  firmware: scpi: add CONFIG_OF dependency
  video: clps711x-fb: Changing the compatibility string to match with the smallest supported chip
  input: clps711x-keypad: Changing the compatibility string to match with the smallest supported chip
  pwm: clps711x: Changing the compatibility string to match with the smallest supported chip
  serial: clps711x: Changing the compatibility string to match with the smallest supported chip
  irqchip: clps711x: Changing the compatibility string to match with the smallest supported chip
  clocksource: clps711x: Changing the compatibility string to match with the smallest supported chip
  clk: clps711x: Changing the compatibility string to match with the smallest supported chip
  ...
2016-08-01 18:36:01 -04:00
Srinivas Pandruvada 65c1262f40 cpufreq: intel_pstate: Add more out-of-band IDs
Add Skylake-X and Broadwell-X IDs for out-of-band (OBB) control of
P-States.

For these processors, if MSR_MISC_PWR_MGMT BIT(8) == 1, then the
Intel P-State driver should exit as OS can't control P-States.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw : Subject/changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-28 23:58:17 +02:00
Linus Torvalds 6453dbdda3 Power management material for v4.8-rc1
- Rework the cpufreq governor interface to make it more straightforward
    and modify the conservative governor to avoid using transition
    notifications (Rafael Wysocki).
 
  - Rework the handling of frequency tables by the cpufreq core to make
    it more efficient (Viresh Kumar).
 
  - Modify the schedutil governor to reduce the number of wakeups it
    causes to occur in cases when the CPU frequency doesn't need to be
    changed (Steve Muckle, Viresh Kumar).
 
  - Fix some minor issues and clean up code in the cpufreq core and
    governors (Rafael Wysocki, Viresh Kumar).
 
  - Add Intel Broxton support to the intel_pstate driver (Srinivas
    Pandruvada).
 
  - Fix problems related to the config TDP feature and to the validity
    of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka,
    Srinivas Pandruvada).
 
  - Make intel_pstate update the cpu_frequency tracepoint even if
    the frequency doesn't change to avoid confusing powertop (Rafael
    Wysocki).
 
  - Clean up the usage of __init/__initdata in intel_pstate, mark some
    of its internal variables as __read_mostly and drop an unused
    structure element from it (Jisheng Zhang, Carsten Emde).
 
  - Clean up the usage of some duplicate MSR symbols in intel_pstate
    and turbostat (Srinivas Pandruvada).
 
  - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay
    Adiga, Viresh Kumar, Ben Dooks).
 
  - Fix a regression (introduced during the 4.5 cycle) in the
    pcc-cpufreq driver by reverting the problematic commit (Andreas
    Herrmann).
 
  - Add support for Intel Denverton to intel_idle, clean up Broxton
    support in it and make it explicitly non-modular (Jacob Pan,
    Jan Beulich, Paul Gortmaker).
 
  - Add support for Denverton and Ivy Bridge server to the Intel RAPL
    power capping driver and make it more careful about the handing
    of MSRs that may not be present (Jacob Pan, Xiaolong Wang).
 
  - Fix resume from hibernation on x86-64 by making the CPU offline
    during resume avoid using MONITOR/MWAIT in the "play dead" loop
    which may lead to an inadvertent "revival" of a "dead" CPU and
    a page fault leading to a kernel crash from it (Rafael Wysocki).
 
  - Make memory management during resume from hibernation more
    straightforward (Rafael Wysocki).
 
  - Add debug features that should help to detect problems related
    to hibernation and resume from it (Rafael Wysocki, Chen Yu).
 
  - Clean up hibernation core somewhat (Rafael Wysocki).
 
  - Prevent KASAN from instrumenting the hibernation core which leads
    to large numbers of false-positives from it (James Morse).
 
  - Prevent PM (hibernate and suspend) notifiers from being called
    during the cleanup phase if they have not been called during the
    corresponding preparation phase which is possible if one of the
    other notifiers returns an error at that time (Lianwei Wang).
 
  - Improve suspend-related debug printout in the tasks freezer and
    clean up suspend-related console handling (Roger Lu, Borislav
    Petkov).
 
  - Update the AnalyzeSuspend script in the kernel sources to
    version 4.2 (Todd Brandt).
 
  - Modify the generic power domains framework to make it handle
    system suspend/resume better (Ulf Hansson).
 
  - Make the runtime PM framework avoid resuming devices synchronously
    when user space changes the runtime PM settings for them and
    improve its error reporting (Rafael Wysocki, Linus Walleij).
 
  - Fix error paths in devfreq drivers (exynos, exynos-ppmu, exynos-bus)
    and in the core, make some devfreq code explicitly non-modular and
    change some of it into tristate (Bartlomiej Zolnierkiewicz,
    Peter Chen, Paul Gortmaker).
 
  - Add DT support to the generic PM clocks management code and make
    it export some more symbols (Jon Hunter, Paul Gortmaker).
 
  - Make the PCI PM core code slightly more robust against possible
    driver errors (Andy Shevchenko).
 
  - Make it possible to change DESTDIR and PREFIX in turbostat
    (Andy Shevchenko).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJXl7/dAAoJEILEb/54YlRx+VgQAIQJOWvxKew3Yl02c/sdj9OT
 5VNnFrzGzdcAPofvvG9qGq8B0Es1vYehJpwwOB21ri8EvYv0riIiU1yrqslObojQ
 oaZOkSBpbIoKjGR4CpYA/A+feE+8EqIBdPGd+lx5a6oRdUi7tRVHBG9lyLO3FB/i
 jan1q8dMpZsmu+Y+rVVHGnCVuIlIEqr2ZnZfCwDAulO2Arp/QFAh4kH08ELATvrl
 bkPa25vq7/VMP/vCDzrfZKD5mUuKogIRu/J5wx4py1nE+FB35cKKyqBOgklLwAeY
 UI8vjDhr/myNUs54AZlktOkq47TCYvjvhX9kmOxBjuWqFbRusU012IRek1fYPRIV
 ZqbkqNX7UEVQwunAEg9AyFwyzEtOht93dQDT5RLEd4QzKuM76gmHpLeTGGMzE+nu
 FnmF9JGl4DVwqpZl9yU2+hR2Mt3bP8OF8qYmNiGUB3KO4emPslhSd+6y8liA5Bx2
 SJf0Gb//vaHCh3/uMnwAonYPqRkZvBLOMwuL1VUjNQfRMnQtDdgHMYB1aT/EglPA
 8ww6j4J8rVRLAxvYQ3UEmNA/vBNclKXblRR18+JddEZP9/oX0ATfwnCCUpr839uk
 xxyQhrm4/AI60+PHWCX4GG80YrKdOGTkF7LXCQZanVWjjuyF17rufegZ2YWLT07v
 JU1Cmumfdy2jJluT8xsR
 =uVGz
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael  Wysocki:
 "Again, the majority of changes go into the cpufreq subsystem, but
  there are no big features this time.  The cpufreq changes that stand
  out somewhat are the governor interface rework and improvements
  related to the handling of frequency tables.  Apart from those, there
  are fixes and new device/CPU IDs in drivers, cleanups and an
  improvement of the new schedutil governor.

  Next, there are some changes in the hibernation core, including a fix
  for a nasty problem related to the MONITOR/MWAIT usage by CPU offline
  during resume from hibernation, a few core improvements related to
  memory management during resume, a couple of additional debug features
  and cleanups.

  Finally, we have some fixes and cleanups in the devfreq subsystem,
  generic power domains framework improvements related to system
  suspend/resume, support for some new chips in intel_idle and in the
  power capping RAPL driver, a new version of the AnalyzeSuspend utility
  and some assorted fixes and cleanups.

  Specifics:

   - Rework the cpufreq governor interface to make it more
     straightforward and modify the conservative governor to avoid using
     transition notifications (Rafael Wysocki).

   - Rework the handling of frequency tables by the cpufreq core to make
     it more efficient (Viresh Kumar).

   - Modify the schedutil governor to reduce the number of wakeups it
     causes to occur in cases when the CPU frequency doesn't need to be
     changed (Steve Muckle, Viresh Kumar).

   - Fix some minor issues and clean up code in the cpufreq core and
     governors (Rafael Wysocki, Viresh Kumar).

   - Add Intel Broxton support to the intel_pstate driver (Srinivas
     Pandruvada).

   - Fix problems related to the config TDP feature and to the validity
     of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka,
     Srinivas Pandruvada).

   - Make intel_pstate update the cpu_frequency tracepoint even if the
     frequency doesn't change to avoid confusing powertop (Rafael
     Wysocki).

   - Clean up the usage of __init/__initdata in intel_pstate, mark some
     of its internal variables as __read_mostly and drop an unused
     structure element from it (Jisheng Zhang, Carsten Emde).

   - Clean up the usage of some duplicate MSR symbols in intel_pstate
     and turbostat (Srinivas Pandruvada).

   - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay
     Adiga, Viresh Kumar, Ben Dooks).

   - Fix a regression (introduced during the 4.5 cycle) in the
     pcc-cpufreq driver by reverting the problematic commit (Andreas
     Herrmann).

   - Add support for Intel Denverton to intel_idle, clean up Broxton
     support in it and make it explicitly non-modular (Jacob Pan, Jan
     Beulich, Paul Gortmaker).

   - Add support for Denverton and Ivy Bridge server to the Intel RAPL
     power capping driver and make it more careful about the handing of
     MSRs that may not be present (Jacob Pan, Xiaolong Wang).

   - Fix resume from hibernation on x86-64 by making the CPU offline
     during resume avoid using MONITOR/MWAIT in the "play dead" loop
     which may lead to an inadvertent "revival" of a "dead" CPU and a
     page fault leading to a kernel crash from it (Rafael Wysocki).

   - Make memory management during resume from hibernation more
     straightforward (Rafael Wysocki).

   - Add debug features that should help to detect problems related to
     hibernation and resume from it (Rafael Wysocki, Chen Yu).

   - Clean up hibernation core somewhat (Rafael Wysocki).

   - Prevent KASAN from instrumenting the hibernation core which leads
     to large numbers of false-positives from it (James Morse).

   - Prevent PM (hibernate and suspend) notifiers from being called
     during the cleanup phase if they have not been called during the
     corresponding preparation phase which is possible if one of the
     other notifiers returns an error at that time (Lianwei Wang).

   - Improve suspend-related debug printout in the tasks freezer and
     clean up suspend-related console handling (Roger Lu, Borislav
     Petkov).

   - Update the AnalyzeSuspend script in the kernel sources to version
     4.2 (Todd Brandt).

   - Modify the generic power domains framework to make it handle system
     suspend/resume better (Ulf Hansson).

   - Make the runtime PM framework avoid resuming devices synchronously
     when user space changes the runtime PM settings for them and
     improve its error reporting (Rafael Wysocki, Linus Walleij).

   - Fix error paths in devfreq drivers (exynos, exynos-ppmu,
     exynos-bus) and in the core, make some devfreq code explicitly
     non-modular and change some of it into tristate (Bartlomiej
     Zolnierkiewicz, Peter Chen, Paul Gortmaker).

   - Add DT support to the generic PM clocks management code and make it
     export some more symbols (Jon Hunter, Paul Gortmaker).

   - Make the PCI PM core code slightly more robust against possible
     driver errors (Andy Shevchenko).

   - Make it possible to change DESTDIR and PREFIX in turbostat (Andy
     Shevchenko)"

* tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits)
  Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency"
  PM / hibernate: Introduce test_resume mode for hibernation
  cpufreq: export cpufreq_driver_resolve_freq()
  cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()
  PCI / PM: check all fields in pci_set_platform_pm()
  cpufreq: acpi-cpufreq: use cached frequency mapping when possible
  cpufreq: schedutil: map raw required frequency to driver frequency
  cpufreq: add cpufreq_driver_resolve_freq()
  cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT
  intel_pstate: Update cpu_frequency tracepoint every time
  cpufreq: intel_pstate: clean remnant struct element
  PM / tools: scripts: AnalyzeSuspend v4.2
  x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
  cpufreq: powernv: Replacing pstate_id with frequency table index
  intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate()
  PM / hibernate: Image data protection during restoration
  PM / hibernate: Add missing braces in __register_nosave_region()
  PM / hibernate: Clean up comments in snapshot.c
  PM / hibernate: Clean up function headers in snapshot.c
  PM / hibernate: Add missing braces in hibernate_setup()
  ...
2016-07-26 17:29:07 -07:00
Linus Torvalds 55392c4c06 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "This update provides the following changes:

   - The rework of the timer wheel which addresses the shortcomings of
     the current wheel (cascading, slow search for next expiring timer,
     etc).  That's the first major change of the wheel in almost 20
     years since Finn implemted it.

   - A large overhaul of the clocksource drivers init functions to
     consolidate the Device Tree initialization

   - Some more Y2038 updates

   - A capability fix for timerfd

   - Yet another clock chip driver

   - The usual pile of updates, comment improvements all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits)
  tick/nohz: Optimize nohz idle enter
  clockevents: Make clockevents_subsys static
  clocksource/drivers/time-armada-370-xp: Fix return value check
  timers: Implement optimization for same expiry time in mod_timer()
  timers: Split out index calculation
  timers: Only wake softirq if necessary
  timers: Forward the wheel clock whenever possible
  timers/nohz: Remove pointless tick_nohz_kick_tick() function
  timers: Optimize collect_expired_timers() for NOHZ
  timers: Move __run_timers() function
  timers: Remove set_timer_slack() leftovers
  timers: Switch to a non-cascading wheel
  timers: Reduce the CPU index space to 256k
  timers: Give a few structs and members proper names
  hlist: Add hlist_is_singular_node() helper
  signals: Use hrtimer for sigtimedwait()
  timers: Remove the deprecated mod_timer_pinned() API
  timers, net/ipv4/inet: Initialize connection request timers as pinned
  timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinned
  timers, drivers/tty/metag_da: Initialize the poll timer as pinned
  ...
2016-07-25 20:43:12 -07:00
Linus Torvalds 8e466955d6 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Intel-SoC enhancements (Andy Shevchenko)

   - Intel CPU symbolic model definition rework (Dave Hansen)

   - ... other misc changes"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  x86/sfi: Enable enumeration of SD devices
  x86/pci: Use MRFLD abbreviation for Merrifield
  x86/platform/intel-mid: Make vertical indentation consistent
  x86/platform/intel-mid: Mark regulators explicitly defined
  x86/platform/intel-mid: Rename mrfl.c to mrfld.c
  x86/platform/intel-mid: Enable spidev on Intel Edison boards
  x86/platform/intel-mid: Extend PWRMU to support Penwell
  x86/pci, x86/platform/intel_mid_pci: Remove duplicate power off code
  x86/platform/intel-mid: Add pinctrl for Intel Merrifield
  x86/platform/intel-mid: Enable GPIO expanders on Edison
  x86/platform/intel-mid: Add Power Management Unit driver
  x86/platform/atom/punit: Enable support for Merrifield
  x86/platform/intel_mid_pci: Rework IRQ0 workaround
  x86, thermal: Clean up and fix CPU model detection for intel_soc_dts_thermal
  x86, mmc: Use Intel family name macros for mmc driver
  x86/intel_telemetry: Use Intel family name macros for telemetry driver
  x86/acpi/lss: Use Intel family name macros for the acpi_lpss driver
  x86/cpufreq: Use Intel family name macros for the intel_pstate cpufreq driver
  x86/platform: Use new Intel model number macros
  x86/intel_idle: Use Intel family macros for intel_idle
  ...
2016-07-25 19:15:35 -07:00
Rafael J. Wysocki bc841e260c Merge branch 'pm-cpu'
* pm-cpu:
  x86: remove duplicate turbo ratio limit MSRs
  tools/power turbostat: Replace MSR_NHM_TURBO_RATIO_LIMIT
  cpufreq: intel_pstate: Replace MSR_NHM_TURBO_RATIO_LIMIT
2016-07-25 13:46:30 +02:00
Andreas Herrmann da7d3abe1c Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency"
This reverts commit 790d849bf8.

Using a v4.7-rc7 kernel on a HP ProLiant triggered following messages

 pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
 cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor

The last line was shown for each CPU in the system.
Testing v4.5 (where commit 790d849b was integrated) triggered
similar messages. Same behaviour on a 2nd HP Proliant system.

So commit 790d849bf (cpufreq: pcc-cpufreq: update default value of
cpuinfo_transition_latency) causes the system to use performance
governor which, I guess, was not the intention of the patch.

Enabling debug output in pcc-cpufreq provides following verbose output:

 pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
 pcc_get_offset: for CPU 0: pcc_cpu_data input_offset: 0x44, pcc_cpu_data output_offset: 0x48
 init: policy->max is 2800000, policy->min is 1200000
 get: get_freq for CPU 0
 get: SUCCESS: (virtual) output_offset for cpu 0 is 0xffffc9000d7c0048, contains a value of: 0xff06. Speed is: 168000 MHz
 cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor
 target: CPU 0 should go to target freq: 2800000 (virtual) input_offset is 0xffffc9000d7c0044
 target: was SUCCESSFUL for cpu 0

I am asking to revert 790d849bf to re-enable usage of ondemand
governor with pcc-cpufreq.

Fixes: 790d849bf (cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency)
CC: <stable@vger.kernel.org> # 4.5+
Signed-off-by: Andreas Herrmann <aherrmann@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-22 23:51:06 +02:00
Steve Muckle ae2c1ca686 cpufreq: export cpufreq_driver_resolve_freq()
Export cpufreq_driver_resolve_freq() since governors may be compiled as
modules.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steve Muckle <smuckle@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-22 13:53:51 +02:00
Viresh Kumar abe8bd024e cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()
The handlers provided by cpufreq core are sufficient for resolving the
frequency for drivers providing ->target_index(), as the core already
has the frequency table and so ->resolve_freq() isn't required for such
platforms.

This patch disallows drivers with ->target_index() callback to use the
->resolve_freq() callback.

Also, it fixes a potential kernel crash for drivers providing ->target()
but no ->resolve_freq().

Fixes: e3c0623608 "cpufreq: add cpufreq_driver_resolve_freq()"
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 23:45:17 +02:00
Steve Muckle 5b6667c76d cpufreq: acpi-cpufreq: use cached frequency mapping when possible
A call to cpufreq_driver_resolve_freq will cache the mapping from
the desired target frequency to the frequency table index. If there
is a mapping for the desired target frequency then use it instead of
looking up the mapping again.

Signed-off-by: Steve Muckle <smuckle@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 22:28:32 +02:00
Steve Muckle e3c0623608 cpufreq: add cpufreq_driver_resolve_freq()
Cpufreq governors may need to know what a particular target frequency
maps to in the driver without necessarily wanting to set the frequency.
Support this operation via a new cpufreq API,
cpufreq_driver_resolve_freq(). This API returns the lowest driver
frequency equal or greater than the target frequency
(CPUFREQ_RELATION_L), subject to any policy (min/max) or driver
limitations. The mapping is also cached in the policy so that a
subsequent fast_switch operation can avoid repeating the same lookup.

The API will call a new cpufreq driver callback, resolve_freq(), if it
has been registered by the driver. Otherwise the frequency is resolved
via cpufreq_frequency_table_target(). Rather than require ->target()
style drivers to provide a resolve_freq() callback it is left to the
caller to ensure that the driver implements this callback if necessary
to use cpufreq_driver_resolve_freq().

Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Steve Muckle <smuckle@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 14:46:08 +02:00
Srinivas Pandruvada da7de91c3e cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT
The MSR MSR_HWP_INTERRUPT is valid only when CPUID.06H:EAX[8] = 1, so
check for feature before accessing this MSR.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 14:29:30 +02:00
Rafael J. Wysocki bc95a454b6 intel_pstate: Update cpu_frequency tracepoint every time
Currently, intel_pstate only updates the cpu_frequency tracepoint
if the new P-state to set is different from the current one, but
that causes powertop to report 100% idle on an 100% loaded system
sometimes.

Prevent that from happening by updating the cpu_frequency tracepoint
every time intel_pstate_update_pstate() is called.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>-
2016-07-21 14:28:37 +02:00
Carsten Emde 2630abc243 cpufreq: intel_pstate: clean remnant struct element
When I was working with the Intel P state driver I came across a
remnant struct element that is no longer needed after the function
intel_pstate_calc_freq() was retired.

Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 14:26:00 +02:00
Akshay Adiga 09ca4c9b59 cpufreq: powernv: Replacing pstate_id with frequency table index
Refactoring code to use frequency table index instead of pstate_id.
This abstraction will make the code independent of the pstate values.

- No functional changes
- The highest frequency is at frequency table index 0 and the frequency
  decreases as the index increases.
- Macros pstates_to_idx() and idx_to_pstate() can be used for conversion
  between pstate_id and index.
- powernv_pstate_info now contains frequency table index to min, max and
  nominal frequency (instead of pstate_ids)
- global_pstate_info new stores index values instead pstate ids.
- variables renamed as *_idx which now store index instead of pstate

Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-12 02:47:10 +02:00
Jan Kiszka 5fc8f707a2 intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate()
If MSR_CONFIG_TDP_CONTROL is locked, we currently try to address some
MSR 0x80000648 or so. Mask out the relevant level bits 0 and 1.

Found while running over the Jailhouse hypervisor which became upset
about this strange MSR index.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-11 15:12:30 +02:00
Srinivas Pandruvada 100cf6f277 cpufreq: intel_pstate: Replace MSR_NHM_TURBO_RATIO_LIMIT
Replace MSR_NHM_TURBO_RATIO_LIMIT with MSR_TURBO_RATIO_LIMIT.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 15:31:58 +02:00
Thomas Gleixner 7bc54b652f timers, cpufreq/powernv: Initialize the gpstate timer as pinned
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.297014487@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:14 +02:00
Viresh Kumar 825773609c cpufreq: Reuse new freq-table helpers
This patch migrates few users of cpufreq tables to the new helpers
that work on sorted freq-tables.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 00:14:27 +02:00
Viresh Kumar da0c6dc00c cpufreq: Handle sorted frequency tables more efficiently
cpufreq drivers aren't required to provide a sorted frequency table
today, and even the ones which provide a sorted table aren't handled
efficiently by cpufreq core.

This patch adds infrastructure to verify if the freq-table provided by
the drivers is sorted or not, and use efficient helpers if they are
sorted.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 00:13:20 +02:00
Rafael J. Wysocki 8d540ea792 cpufreq: Drop redundant check from cpufreq_update_current_freq()
Both callers of cpufreq_update_current_freq(), cpufreq_update_policy()
and cpufreq_start_governor(), check cpufreq_suspended before calling
that function, so drop the redundant cpufreq_suspended check from it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-07-04 13:22:35 +02:00
Rafael J. Wysocki 5d1191ab6c Merge back earlier cpufreq material for v4.8. 2016-07-04 13:21:43 +02:00