Commit graph

303 commits

Author SHA1 Message Date
Linus Torvalds 277edbabf6 Power management and ACPI material for v4.6-rc1, part 1
- Redesign of cpufreq governors and the intel_pstate driver to
    make them use callbacks invoked by the scheduler to trigger CPU
    frequency evaluation instead of using per-CPU deferrable timers
    for that purpose (Rafael Wysocki).
 
  - Reorganization and cleanup of cpufreq governor code to make it
    more straightforward and fix some concurrency problems in it
    (Rafael Wysocki, Viresh Kumar).
 
  - Cleanup and improvements of locking in the cpufreq core (Viresh
    Kumar).
 
  - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
    Kumar, Eric Biggers).
 
  - intel_pstate driver updates including fixes, optimizations and a
    modification to make it enable enable hardware-coordinated P-state
    selection (HWP) by default if supported by the processor (Philippe
    Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
    Franciosi).
 
  - Operating Performance Points (OPP) framework updates to improve
    its handling of voltage regulators and device clocks and updates
    of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
 
  - Updates of the powernv cpufreq driver to fix initialization
    and cleanup problems in it and correct its worker thread handling
    with respect to CPU offline, new powernv_throttle tracepoint
    (Shilpasri Bhat).
 
  - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
 
  - ACPICA updates including one fix for a regression introduced
    by previos changes in the ACPICA code (Bob Moore, Lv Zheng,
    David Box, Colin Ian King).
 
  - Support for installing ACPI tables from initrd (Lv Zheng).
 
  - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
    Chaugule).
 
  - Support for _HID(ACPI0010) devices (ACPI processor containers)
    and ACPI processor driver cleanups (Sudeep Holla).
 
  - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
    Aleksey Makarov).
 
  - Modification of the ACPI PCI IRQ management code to make it treat
    255 in the Interrupt Line register as "not connected" on x86 (as
    per the specification) and avoid attempts to use that value as
    a valid interrupt vector (Chen Fan).
 
  - ACPI APEI fixes related to resource leaks (Josh Hunt).
 
  - Removal of modularity from a few ACPI drivers (BGRT, GHES,
    intel_pmic_crc) that cannot be built as modules in practice (Paul
    Gortmaker).
 
  - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
    as a valid resource type (Harb Abdulhamid).
 
  - New device ID (future AMD I2C controller) in the ACPI driver for
    AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
 
  - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
 
  - cpuidle menu governor optimization to avoid a square root
    computation in it (Rasmus Villemoes).
 
  - Fix for potential use-after-free in the generic device properties
    framework (Heikki Krogerus).
 
  - Updates of the generic power domains (genpd) framework including
    support for multiple power states of a domain, fixes and debugfs
    output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
    Geert Uytterhoeven).
 
  - Intel RAPL power capping driver updates to reduce IPI overhead in
    it (Jacob Pan).
 
  - System suspend/hibernation code cleanups (Eric Biggers, Saurabh
    Sengar).
 
  - Year 2038 fix for the process freezer (Abhilash Jindal).
 
  - turbostat utility updates including new features (decoding of more
    registers and CPUID fields, sub-second intervals support, GFX MHz
    and RC6 printout, --out command line option), fixes (syscall jitter
    detection and workaround, reductioin of the number of syscalls made,
    fixes related to Xeon x200 processors, compiler warning fixes) and
    cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJW50NXAAoJEILEb/54YlRxvr8QAIktC9+ft0y5AmU46hDcBWcK
 QutyWJL9X9BS6DWBJZA2qclDYFmhMfi5Fza1se0gQ9TnLB/KrBwHWLsiYoTsb1k+
 nPKf214aPk+qAhkVuyB4leNWML9Qz9n9jwku/EYxWWpgtbSRf3+0ioIKZeWWc/8V
 JvuaOu4O+g/tkmL7QTrnGWBwhIIssAAV85QPsHkx+g68MrCj4UMMzm7z9G21SPXX
 bmP8yIHsczX/XnRsY0W2NSno7Vdk6ImHpDJ26IAZg28WRNPWICHgGYHvB0TTWMvb
 tts+yqfF7/7QLRjT/M8k9CzDBDE/DnVqoZ0fNJ+aYr7hNKF32mtAN+jH9ZB9dl/P
 fEFapJkPxnWyzAoVoB9Dz0rkcZkYMlbxlLWzUGpaPq0JflUUTzLk0ApSjmMn4HRO
 UddwCDdyHTaYThp3gn6GbOb0pIP0SdOVbI1M2QV2x/4PLcT2Ft8Np1+1RFWOeinZ
 Bdl9AE890big0808mqbBzw/buETwr9FjHtCdDPXpP0vJpkBLu3nIYRNb0LCt39es
 mWMp6dFhGgvGj3D3ahTuV3GI8hdpDkh9SObexa11RCjkTKrXcwEmFxHxLeFXwKYq
 alG278bo6cSChRMziS1lis+W/3tsJRN4TXUSv1PPzJHrFgptQVFRStU9ngBKP+pN
 WB+itPc4Fw0YHOrAFsrx
 =cfty
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "This time the majority of changes go into cpufreq and they are
  significant.

  First off, the way CPU frequency updates are triggered is different
  now.  Instead of having to set up and manage a deferrable timer for
  each CPU in the system to evaluate and possibly change its frequency
  periodically, cpufreq governors set up callbacks to be invoked by the
  scheduler on a regular basis (basically on utilization updates).  The
  "old" governors, "ondemand" and "conservative", still do all of their
  work in process context (although that is triggered by the scheduler
  now), but intel_pstate does it all in the callback invoked by the
  scheduler with no need for any additional asynchronous processing.

  Of course, this eliminates the overhead related to the management of
  all those timers, but also it allows the cpufreq governor code to be
  simplified quite a bit.  On top of that, the common code and data
  structures used by the "ondemand" and "conservative" governors are
  cleaned up and made more straightforward and some long-standing and
  quite annoying problems are addressed.  In particular, the handling of
  governor sysfs attributes is modified and the related locking becomes
  more fine grained which allows some concurrency problems to be avoided
  (particularly deadlocks with the core cpufreq code).

  In principle, the new mechanism for triggering frequency updates
  allows utilization information to be passed from the scheduler to
  cpufreq.  Although the current code doesn't make use of it, in the
  works is a new cpufreq governor that will make decisions based on the
  scheduler's utilization data.  That should allow the scheduler and
  cpufreq to work more closely together in the long run.

  In addition to the core and governor changes, cpufreq drivers are
  updated too.  Fixes and optimizations go into intel_pstate, the
  cpufreq-dt driver is updated on top of some modification in the
  Operating Performance Points (OPP) framework and there are fixes and
  other updates in the powernv cpufreq driver.

  Apart from the cpufreq updates there is some new ACPICA material,
  including a fix for a problem introduced by previous ACPICA updates,
  and some less significant changes in the ACPI code, like CPPC code
  optimizations, ACPI processor driver cleanups and support for loading
  ACPI tables from initrd.

  Also updated are the generic power domains framework, the Intel RAPL
  power capping driver and the turbostat utility and we have a bunch of
  traditional assorted fixes and cleanups.

  Specifics:

   - Redesign of cpufreq governors and the intel_pstate driver to make
     them use callbacks invoked by the scheduler to trigger CPU
     frequency evaluation instead of using per-CPU deferrable timers for
     that purpose (Rafael Wysocki).

   - Reorganization and cleanup of cpufreq governor code to make it more
     straightforward and fix some concurrency problems in it (Rafael
     Wysocki, Viresh Kumar).

   - Cleanup and improvements of locking in the cpufreq core (Viresh
     Kumar).

   - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
     Kumar, Eric Biggers).

   - intel_pstate driver updates including fixes, optimizations and a
     modification to make it enable enable hardware-coordinated P-state
     selection (HWP) by default if supported by the processor (Philippe
     Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
     Franciosi).

   - Operating Performance Points (OPP) framework updates to improve its
     handling of voltage regulators and device clocks and updates of the
     cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).

   - Updates of the powernv cpufreq driver to fix initialization and
     cleanup problems in it and correct its worker thread handling with
     respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
     Bhat).

   - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).

   - ACPICA updates including one fix for a regression introduced by
     previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
     Colin Ian King).

   - Support for installing ACPI tables from initrd (Lv Zheng).

   - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
     Chaugule).

   - Support for _HID(ACPI0010) devices (ACPI processor containers) and
     ACPI processor driver cleanups (Sudeep Holla).

   - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
     Aleksey Makarov).

   - Modification of the ACPI PCI IRQ management code to make it treat
     255 in the Interrupt Line register as "not connected" on x86 (as
     per the specification) and avoid attempts to use that value as a
     valid interrupt vector (Chen Fan).

   - ACPI APEI fixes related to resource leaks (Josh Hunt).

   - Removal of modularity from a few ACPI drivers (BGRT, GHES,
     intel_pmic_crc) that cannot be built as modules in practice (Paul
     Gortmaker).

   - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
     as a valid resource type (Harb Abdulhamid).

   - New device ID (future AMD I2C controller) in the ACPI driver for
     AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).

   - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).

   - cpuidle menu governor optimization to avoid a square root
     computation in it (Rasmus Villemoes).

   - Fix for potential use-after-free in the generic device properties
     framework (Heikki Krogerus).

   - Updates of the generic power domains (genpd) framework including
     support for multiple power states of a domain, fixes and debugfs
     output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
     Geert Uytterhoeven).

   - Intel RAPL power capping driver updates to reduce IPI overhead in
     it (Jacob Pan).

   - System suspend/hibernation code cleanups (Eric Biggers, Saurabh
     Sengar).

   - Year 2038 fix for the process freezer (Abhilash Jindal).

   - turbostat utility updates including new features (decoding of more
     registers and CPUID fields, sub-second intervals support, GFX MHz
     and RC6 printout, --out command line option), fixes (syscall jitter
     detection and workaround, reductioin of the number of syscalls
     made, fixes related to Xeon x200 processors, compiler warning
     fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"

* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
  tools/power turbostat: bugfix: TDP MSRs print bits fixing
  tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
  tools/power turbostat: call __cpuid() instead of __get_cpuid()
  tools/power turbostat: indicate SMX and SGX support
  tools/power turbostat: detect and work around syscall jitter
  tools/power turbostat: show GFX%rc6
  tools/power turbostat: show GFXMHz
  tools/power turbostat: show IRQs per CPU
  tools/power turbostat: make fewer systems calls
  tools/power turbostat: fix compiler warnings
  tools/power turbostat: add --out option for saving output in a file
  tools/power turbostat: re-name "%Busy" field to "Busy%"
  tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
  tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
  tools/power turbostat: allow sub-sec intervals
  ACPI / APEI: ERST: Fixed leaked resources in erst_init
  ACPI / APEI: Fix leaked resources
  intel_pstate: Do not skip samples partially
  intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
  intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
  ...
2016-03-16 14:10:53 -07:00
Rafael J. Wysocki 3fdb74649b Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-tools
Pull turbostat updates for 4.6 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: bugfix: TDP MSRs print bits fixing
  tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
  tools/power turbostat: call __cpuid() instead of __get_cpuid()
  tools/power turbostat: indicate SMX and SGX support
  tools/power turbostat: detect and work around syscall jitter
  tools/power turbostat: show GFX%rc6
  tools/power turbostat: show GFXMHz
  tools/power turbostat: show IRQs per CPU
  tools/power turbostat: make fewer systems calls
  tools/power turbostat: fix compiler warnings
  tools/power turbostat: add --out option for saving output in a file
  tools/power turbostat: re-name "%Busy" field to "Busy%"
  tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
  tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
  tools/power turbostat: allow sub-sec intervals
  tools/power turbostat: Decode MSR_MISC_PWR_MGMT
  tools/power turbostat: decode HWP registers
  x86 msr-index: Simplify syntax for HWP fields
  tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
  tools/power turbostat: decode more CPUID fields
2016-03-14 02:13:05 +01:00
Chen Yu 685b535b2c tools/power turbostat: bugfix: TDP MSRs print bits fixing
MSR_CONFIG_TDP_NOMINAL:
should print all 8 bits of base_ratio (bit 0:7) 0xFF

MSR_CONFIG_TDP_LEVEL_1:
should print all 15 bits of PKG_MIN_PWR_LVL1 (bit 48:62) 0x7FFF
should print all 15 bits of PKG_MAX_PWR_LVL1 (bit 32:46) 0x7FFF
should print all 8 bits of LVL1_RATIO (bit 16:23) 0xFF
should print all 15 bits of PKG_TDP_LVL1 (bit 0:14) 0x7FFF

And the same modification to MSR_CONFIG_TDP_LEVEL_2.

MSR_TURBO_ACTIVATION_RATIO:
should print all 8 bits of MAX_NON_TURBO_RATIO (bit 0:7) 0xFF

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 04:22:57 -04:00
Len Brown 6c34f160df tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008008 (...pkg-cstate-limit=0: unlimited)
should print as
MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008008 (...pkg-cstate-limit=8: unlimited)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 04:22:47 -04:00
Len Brown 5aea2f7f64 tools/power turbostat: call __cpuid() instead of __get_cpuid()
turbostat already checks whether calling each cpuid leavf is legal,
and it doesn't look at the function return value,
so call the simpler gcc intrinsic __cpuid() instead of __get_cpuid().

syntax only, no functional change

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:42 -04:00
Len Brown aa8d8cc79a tools/power turbostat: indicate SMX and SGX support
SGX presence is related to a SKL power workaround,
so lets show when that is enabled.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:42 -04:00
Len Brown 0102b06747 tools/power turbostat: detect and work around syscall jitter
The accuracy of Bzy_Mhz and Busy% depend on reading
the TSC, APERF, and MPERF close together in time.

When there is a very short measurement interval,
or a large system is profoundly idle, the changes
in APERF and MPERF may be very small.
They can be small enough that an expensive interrupt
between reading APERF and MPERF can cause the APERF/MPERF
ratio to become inaccurate, resulting in invalid
calculation and display of Bzy_MHz.

A dummy APERF read of APERF makes this problem
much more rare.  Apparently this 1st systemn call
after exiting a long stretch of idle is when we
typically see expensive timer interrupts that cause
large jitter.

For the cases that dummy APERF read fails to prevent,
we compare the latency of the APERF and MPERF reads.
If they differ by more than 2x, we re-issue them.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:41 -04:00
Len Brown fdf676e51f tools/power turbostat: show GFX%rc6
The column "GFX%c6" show the percentage of time the GPU
is in the "render C6" state, rc6.  Deep package C-states on several
systems depend on the GPU being in RC6.

This information comes from the counter
/sys/class/drm/card0/power/rc6_residency_ms,
as read before and after the measurement interval.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:41 -04:00
Len Brown 27d47356b6 tools/power turbostat: show GFXMHz
Under the column "GFXMHz", show a snapshot of this attribute:
/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz

This is an instantaneous snapshot of what sysfs presents
at the end of the measurement interval.  turbostat does
not average or otherwise perform any math on this value.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:40 -04:00
Len Brown 562a2d377b tools/power turbostat: show IRQs per CPU
The new IRQ column shows how many interrupts have occurred on each CPU
during the measurement inteval.  This information comes from
the difference between /proc/interrupts shapshots made before
and after the measurement interval.

The first row, the system summary, shows the sum of the IRQS
for all CPUs during that interval.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:40 -04:00
Len Brown 36229897ba tools/power turbostat: make fewer systems calls
skip the open(2)/close(2) on each msr read
by keeping the /dev/cpu/*/msr files open.

The remaining read(2) is generally far fewer cycles
than the removed open(2) system call.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:40 -04:00
Len Brown 58cc30a4e6 tools/power turbostat: fix compiler warnings
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:39 -04:00
Len Brown b7d8c1483b tools/power turbostat: add --out option for saving output in a file
By default...

Turbostat --debug gconfiguration info goes to stderr.

In FORK mode, turbostat statistics go to stderr.

In PERIODIC mode, turbostat statistics go to stdout.

These defaults do not change, but an option "--out file"
will send all output above only to the specified file.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:39 -04:00
Len Brown 75d2e44e60 tools/power turbostat: re-name "%Busy" field to "Busy%"
some tools processing turbostat output
have difficulty with items that begin with %...

Reported-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:38 -04:00
Hubert Chrzaniuk cbf97abaf3 tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
Following changes have been made:
- changed MSR_NHM_TURBO_RATIO_LIMIT to MSR_TURBO_RATIO_LIMIT in debug print
  for consistency with Developer Manual
- updated definition of bitfields in MSR_TURBO_RATIO_LIMIT and appropriate
  parsing code
- added x200 to list of architectures that do not support Nahlem compatible
  definition of MSR_TURBO_RATIO_LIMIT register (x200 has the register but
  bits definition is custom)
- fixed typo in code that parses MSR_TURBO_RATIO_LIMIT
  (logical instead of bitwise operator)
- changed MSR_TURBO_RATIO_LIMIT parsing algorithm so the print out had the
  same order as implementations for other platforms

Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:38 -04:00
Chrzaniuk, Hubert 121b48bb77 tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
x200 does not enable any way to programmatically obtain bus clock
speed. Bclk for the architecture has a fixed value of 100 MHz.
At the same time x200 cannot be included in has_snb_msrs since
it does not support C7 idle state.

prior to this patch, MHz values reported on this chip
were erroneously calculated using bclk of 133MHz,
causing MHz values to be reported 33% higher than actual.

Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:37 -04:00
Len Brown 2a0609c02e tools/power turbostat: allow sub-sec intervals
turbostat -i interval_sec

will sample and display statistics every interval_sec.
interval_sec used to be a whole number of seconds,
but now we accept a decimal, as small as 0.001 sec (1 ms).

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:32 -04:00
Colin Ian King 1b69317d2d tools/power turbostat: fix various build warnings
When building with gcc 6 we're getting various build warnings that just
require some trivial function declaration and call fixes:

  turbostat.c: In function ‘dump_cstate_pstate_config_info’:
  turbostat.c:1973:1: warning: type of ‘family’ defaults to ‘int’
   dump_cstate_pstate_config_info(family, model)
  turbostat.c:1973:1: warning: type of ‘model’ defaults to ‘int’
  turbostat.c: In function ‘get_tdp’:
  turbostat.c:2145:8: warning: type of ‘model’ defaults to ‘int’
   double get_tdp(model)
  turbostat.c: In function ‘perf_limit_reasons_probe’:
  turbostat.c:2259:6: warning: type of ‘family’ defaults to ‘int’
   void perf_limit_reasons_probe(family, model)
  turbostat.c:2259:6: warning: type of ‘model’ defaults to ‘int’

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-wbicer8n0s9qe6ql8h9x478e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:39 -03:00
Len Brown f0057310b4 tools/power turbostat: Decode MSR_MISC_PWR_MGMT
This MSR is helpful to show if P-state HW coordination
is enabled or disabled.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:43:05 -05:00
Len Brown 7f5c258e1c tools/power turbostat: decode HWP registers
# turbostat --debug
...
CPUID(6): ... HWP, HWPnotify, HWPwindow, HWPepp, HWPpkg ...
...
cpu0: MSR_PM_ENABLE: 0x00000001 (HWP)
cpu0: MSR_HWP_CAPABILITIES: 0x01050916 (high 0x16 guar 0x9 eff 0x5 low 0x1)
cpu0: MSR_HWP_REQUEST: 0x80001604 (min 0x4 max 0x16 des 0x0 epp 0x80 window 0x0 pkg 0x0)
cpu0: MSR_HWP_INTERRUPT: 0x00000001 (EN_Guaranteed_Perf_Change, Dis_Excursion_Min)
cpu0: MSR_HWP_STATUS: 0x00000000 (No-Guaranteed_Perf_Change, No-Excursion_Min)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:42:34 -05:00
Len Brown 61a87ba789 tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
This CPUID leaf is available on Skylake:

CPUID(0x16): base_mhz: 1500 max_mhz: 2200 bus_mhz: 100

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:42:20 -05:00
Len Brown 69807a638f tools/power turbostat: decode more CPUID fields
for debugging, dump a few more fields:

CPUID(1): SSE3 MONITOR EIST TM2 TSC MSR ACPI-TM TM

cpu0: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MONITOR)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:41:53 -05:00
Rafael J. Wysocki db2b52f752 Merge branch 'pm-tools'
* pm-tools:
  cpupower: Fix build error in cpufreq-info
2016-01-21 00:43:29 +01:00
Rafael J. Wysocki a72aea722f Merge branches 'acpica', 'acpi-video' and 'acpi-fan'
* acpica:
  ACPICA: Update version to 20160108
  ACPICA: Silence a -Wbad-function-cast warning when acpi_uintptr_t is 'uintptr_t'
  ACPICA: Additional 2016 copyright changes
  ACPICA: Reduce regression fix divergence from upstream ACPICA

* acpi-video:
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
  ACPI / video: Revert "thinkpad_acpi: Use acpi_video_handles_brightness_key_presses()"
  ACPI / video: Document acpi_video_handles_brightness_key_presses() a bit
  ACPI / video: Fix using an uninitialized mutex / list_head in acpi_video_handles_brightness_key_presses()
  ACPI / video: Revert "ACPI / video: driver must be registered before checking for keypresses"
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700

* acpi-fan:
  ACPI / fan: Improve acpi_device_update_power error message
2016-01-21 00:41:59 +01:00
Shreyas B. Prabhu 38cb76a307 cpupower: Fix build error in cpufreq-info
Fix the following build error by including limits.h -

utils/cpufreq-info.c: In function ‘get_latency’:
utils/cpufreq-info.c:437:29: error: ‘UINT_MAX’ undeclared (first use in
this function)
  if (!latency || latency == UINT_MAX) {
                             ^
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Fixes: e98f033f94 (cpupower: fix how "cpupower frequency-info" interprets latency)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-19 01:17:23 +01:00
Bob Moore c8100dc464 ACPICA: Additional 2016 copyright changes
All tool/utility signons.
Dual-license module header.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-15 22:18:09 +01:00
Rafael J. Wysocki 8f053a56df Merge branches 'pm-sleep' and 'pm-tools'
* pm-sleep:
  PM / sleep: Add support for read-only sysfs attributes

* pm-tools:
  cpupower: fix how "cpupower frequency-info" interprets latency
  cpupower: rework the "cpupower frequency-info" command
  cpupower: Do not analyse offlined cpus
  cpupower: Provide STATIC variable in Makefile for debug builds
  cpupower: Fix precedence issue
2016-01-12 01:12:03 +01:00
Bob Moore 5920380c67 ACPICA: getopt: Comment update, no functional change
ACPICA commit 0d784a90bc3aac75227c4459c3553de18b9ebe7a

Document one of the option string operators.

Link: https://github.com/acpica/acpica/commit/0d784a90
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-01 03:47:36 +01:00
Bob Moore 1fad87385e ACPICA: Core: Major update for code formatting, no functional changes
ACPICA commit dfa394471f6c01b2ee9433dbc143ec70cb9bca72

Mostly indentation inconsistencies across the code. Split
some long lines, etc.

Link: https://github.com/acpica/acpica/commit/dfa39447
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-01 03:36:53 +01:00
Lv Zheng 37645d6590 tools/power/acpi: Add userspace AML interface support
This patch adds a userspace tool to access Linux kernel AML debugger
interface.

Tow modes are supported by this tool:
1. Interactive: Users are able to launch a debugging shell to talk with
   in-kernel AML debugger.
   Note that it's user duty to ensure kernel runtime integrity by using
   this debugging tool:
   A. Some control methods evaluated by the users may result in kernel
      panics if those control methods shouldn't be evaluated by the OSPMs
      according to the current BIOS/OS configurations.
   B. Currently if a single stepping evaluation couldn't run to an end,
      then the synchronization primitives acquired by the evaluation may
      block normal OSPM control method evaluations.
2. Batch: Users are able to execute debugger commands in a script.
   Note that in addition to the above duties, it's user duty to ensure
   script runtime integrity by using this debugging tool in this mode:
   C. Currently only those commands that are not used for single stepping
      are suitable to be used in this mode.
   D. If the execution of the command may cause a failure that could result
      in an endless kernel execution, the execution of the script may also
      get blocked.
To exit the utility, currently "exit/quit" commands are recommended, but
ctrl-C" can also be used.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-15 00:17:44 +01:00
Jacob Tanenbaum e98f033f94 cpupower: fix how "cpupower frequency-info" interprets latency
the intel-pstate driver does not support the ondemand governor and does not
have a valid value in
/sys/devices/system/cpu/cpu[x]/cpufreq/cpuinfo_transition_latency. The
intel-pstate driver sets cpuinfo_transition_latency to CPUFREQ_ETERNAL (-1),
the value written into cpuinfo_transition_latency is defind as an unsigned
int so checking the read value against max unsigned int will determine if the
value is valid.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-03 02:30:30 +01:00
Jacob Tanenbaum 562e5f1a35 cpupower: rework the "cpupower frequency-info" command
this patch makes two changes to the way that "cpupower
frequancy-info" operates

1. make it so that querying individual values always returns a
   message to the user

currently cpupower frequency info doesn't return anything to the user when
querying an individual value cannot be returned

[root@amd-dinar-09 cpupower]# cpupower -c 4 frequency-info -d
analyzing CPU 4:
[root@amd-dinar-09 cpupower]#

I added messages so that each query prints a message to the terminal

[root@amd-dinar-09 cpupower]# ./cpupower -c 4 frequency-info -d
analyzing CPU 4:
  no or unknown cpufreq driver is active on this CPU
[root@amd-dinar-09 cpupower]#

(this is just one example)

2. change debug_output_one() to use the functions already provided
   by cpufreq-info.c to query individual values of interest.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-03 02:30:30 +01:00
Thomas Renninger ce512b8404 cpupower: Do not analyse offlined cpus
Use sysfs_is_cpu_online(cpu) instead of cpufreq_cpu_exists(cpu) to detect offlined cpus.

Re-arrange printfs slightly to have a consistent output even if you have multiple CPUs
as output and even if offlined cores are in between.

Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-03 02:30:30 +01:00
Thomas Renninger e51207f003 cpupower: Provide STATIC variable in Makefile for debug builds
When working on cpupower code, you often want to compile library code into the
binary.

This allows to execute modified cpupower code, even with library changes
without doing "make install"

Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-03 02:30:30 +01:00
Thomas Renninger 7b0e1bf171 cpupower: Fix precedence issue
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-03 02:30:29 +01:00
Rafael J. Wysocki d9f67dbc0f Merge branch 'pm-tools'
* pm-tools:
  x86: remove unused definition of MSR_NHM_PLATFORM_INFO
  tools/power turbostat: use new name for MSR_PLATFORM_INFO
2015-11-16 22:57:02 +01:00
Len Brown ec0adc539b tools/power turbostat: use new name for MSR_PLATFORM_INFO
MSR_PLATFORM_INFO is the new name for MSR_NHM_PLATFORM_INFO

no functional change

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-13 23:27:13 +01:00
Rafael J. Wysocki f57ab32a84 Merge branch 'pm-tools'
* pm-tools:
  Creating a common structure initialization pattern for struct option
  cpupower: Enable disabled Cstates if they are below max latency
  cpupower: Remove debug message when using cpupower idle-set -D switch
  cpupower: cpupower monitor reports uninitialized values for offline cpus
  tools/power turbostat: bugfix: print MAX_NON_TURBO_RATIO
  tools/power turbostat: simplify Bzy_MHz calculation
2015-11-12 00:22:56 +01:00
Rafael J. Wysocki 89ba7d8c22 Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-tools
Pull turbostat changes for v4.4 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: bugfix: print MAX_NON_TURBO_RATIO
  tools/power turbostat: simplify Bzy_MHz calculation
2015-11-11 00:01:21 +01:00
Sriram Raghunathan 57ab3b0872 Creating a common structure initialization pattern for struct option
This patch tries to creates a common structure initialization
within the cpupower tool.

Previously the ``struct option`` was initialized
using `designated initializer` technique which was
not needed. There were conflicting initialization methods seen with

bench/main.c & others.

Signed-off-by: Sriram Raghunathan <sriram@marirs.net.in>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-02 02:28:59 +01:00
Thomas Renninger 19c9fb896f cpupower: Enable disabled Cstates if they are below max latency
cpupower idle-set -D <latency>
currently only disables all C-states that have a higher latency than the
specified <latency>. But if deep sleep states were already disabled and
have a lower latency, they should get enabled again.

For example:
This call:
cpupower idle-set -D 30
disables all C-states with a higher or equal latency than 30.
If one then calls:
cpupower idle-set -D 100
C-states with a latency between 30-99 will get enabled again with this patch
now. It is ensured that only C-states with a latency of 100 and higher are
disabled.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-02 02:28:59 +01:00
Thomas Renninger 645209472d cpupower: Remove debug message when using cpupower idle-set -D switch
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-02 02:28:59 +01:00
Jacob Tanenbaum 20102ac5be cpupower: cpupower monitor reports uninitialized values for offline cpus
[root@hp-dl980g7-02 linux]# cpupower monitor
...
5472|   0|   1|******|******|******|******|| 0.00|  0.00|  0.00|  0.00|  0.00 *is offline
10567|   0| 159|******|******|******|******||  0.00|  0.00|  0.00|  0.00|  0.00 *is offline
1661206560|859272560| 150|******|******|******|******|| 0.00|  0.00|  0.00|  0.00|  0.00 *is offline
1661206560|943093104| 140|******|******|******|******|| 0.00|  0.00|  0.00|  0.00|  0.00 *is offline

because of this cpupower also holds the incorrect value for the number
of physical packages in the machine

Changed cpupower to initialize the values of an offline cpu's socket and
core to -1, warn the user that one or more cpus is/are
offline and not print statistics for offline cpus.

This fix hides offlined cores where topology cannot be accessed.
With a recent kernel patch suggested from Prarit Bhargava it may be possible
that soft offlined cores' topology can still be parsed.
This patch would then show which cores in which package/socket are offline,
when sane toplogoy information is available.

Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-02 02:28:59 +01:00
Len Brown 759d2a932b tools/power turbostat: bugfix: print MAX_NON_TURBO_RATIO
MSR_TURBO_ACTIVATION_RATIO: 0x00000016 (MAX_NON_TURBO_RATIO=6 lock=0)
should print all 7 bits of MAX_NON_TURBO_RATIO (in decimal):
MSR_TURBO_ACTIVATION_RATIO: 0x00000016 (MAX_NON_TURBO_RATIO=22 lock=0)

Signed-off-by: Len Brown <len.brown@intel.com>
2015-10-22 02:42:12 -04:00
Bob Moore 842e71332e ACPICA: iASL: General cleanup of the file suffix #defines
ACPICA commit bed456ed2976bdaafdef406b982fdf6c539befc0

Removed some extraneous defines, reordered others.

Link: https://github.com/acpica/acpica/commit/bed456ed
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-22 02:01:12 +02:00
Len Brown 21ed5574d1 tools/power turbostat: simplify Bzy_MHz calculation
Bzy_MHz = TSC_delta*tsc_tweak/APERF_delta/MPERF_delta/measurement_interval

becomes

    Bzy_MHz = base_mhz/APERF_delta/MPERF_delta

on systems which support MSR_NHM_PLATFORM_INFO.

base_mhz is calculated directly from the base_ratio
reported in MSR_NHM_PLATFORM_INFO * bclk,
and bclk is discovered via MSR or cpuid.

This reduces the dependency of Bzy_MHz calculation on the TSC.
Previously, there were 4 TSC readings required in each caculation,
the raw TSC delta combined with the measurement_interval.
This also removes the "tsc_tweak" correction factor used when
TSC runs on a different base clock from the CPU's bclk.

After this change, tsc_tweak is used only for %Busy.

The end-result should be a Bzy_MHz result slightly less prone to jitter.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-10-19 22:50:01 -04:00
Len Brown af71b980c0 tools/power turbosat: update version number
Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 09:49:55 -04:00
Len Brown a2b7b74945 tools/power turbostat: SKL: Adjust for TSC difference from base frequency
On a Skylake with 1500MHz base frequency,
the TSC runs at 1512MHz.

This is because the TSC is no longer in the n*100 MHz BCLK domain,
but is now in the m*24MHz crystal clock domain. (24 MHz * 63 = 1512 MHz)

This adds error to several calculations in turbostat,
unless the TSC sample sizes are adjusted for this difference.

Note that calculations in the time domain are immune
from this issue, as the timing sub-system has already
calibrated the TSC against a known wall clock.

AVG_MHz = APERF_delta/measurement_interval

	need no adjustment.  APERF_delta is in the BCLK domain,
	and measurement_interval is in the time domain.

TSC_MHz  =  TSC_delta/measurement_interval

	needs no adjustment -- as we really do want to report
	the actual measured TSC delta here, and measurement_interval
	is in the accurate time domain.

%Busy = MPERF_delta/TSC_delta

	needs adjustment to use TSC_BCLK_DOMAIN_delta.
	TSC_BCLK_DOMAIN_delta = TSC_delta * base_hz / tsc_hz

Bzy_MHz = TSC_delta/APERF_delta/MPERF_delta/measurement_interval

	need adjustment as above.

No other metrics in turbostat need to be adjusted.

Before:

     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
       -     550   24.84    2216    1512
       0    2191   98.73    2219    1514
       2       0    0.01    2130    1512
       1       9    0.43    2016    1512
       3       2    0.08    2016    1512

After:

     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
       -     550   25.05    2198    1512
       0    2190   99.62    2199    1512
       2       0    0.01    2152    1512
       1       9    0.46    2000    1512
       3       2    0.10    2000    1512

Note that in this example, the "Before" Bzy_MHz
was reported as exceeding the 2200 max turbo rate.
Also, even a pinned spin loop would not be reported
as over 99% busy.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 00:50:54 -04:00
Hubert Chrzaniuk b2b34dfe4d tools/power turbostat: KNL workaround for %Busy and Avg_MHz
KNL increments APERF and MPERF every 1024 clocks.
This is compliant with the architecture specification,
which requires that only the ratio of APERF/MPERF need be valid.

However, turbostat takes advantage of the fact that these
two MSRs increment every un-halted clock
at the actual and base frequency:

AVG_MHz = APERF_delta/measurement_interval

%Busy = MPERF_delta/TSC_delta

This quirk is needed for these calculations to also work on KNL,
which would otherwise show a value 1024x smaller than expected.

Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 00:50:54 -04:00
Len Brown 756357b8e4 tools/power turbostat: IVB Xeon: fix --debug regression
Staring in Linux-4.3-rc1,
commit 6fb3143b56 ("tools/power turbostat: dump CONFIG_TDP")
touches MSR 0x648, which is not supported on IVB-Xeon.
This results in "turbostat --debug" exiting on those systems:

turbostat: /dev/cpu/2/msr offset 0x648 read failed: Input/output error

Remove IVB-Xeon from the list of machines supporting with that MSR.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 00:50:48 -04:00