Commit graph

16 commits

Author SHA1 Message Date
Ashwin Chaugule 158c998ea4 ACPI / CPPC: add sysfs support to compute delivered performance
The CPPC tables contain entries for per CPU feedback counters which
allows us to compute the delivered performance over a given interval
of time.

The math for delivered performance per the CPPCv5.0+ spec is:
  reference perf * delta(delivered perf ctr)/delta(ref perf ctr)

Maintaining deltas of the counters in the kernel is messy, as it
depends on when the reads are triggered. (e.g. via the cpufreq
->get() interface). Also the ->get() interace only returns one
value, so cant return raw values. So instead, leave it to userspace
to keep track of raw values and do its math for CPUs it cares about.

delivered and reference perf counters are exposed via the same
sysfs file to avoid the potential "skid", if these values are read
individually from userspace.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 01:02:34 +02:00
Prakash, Prashanth be8b88d7d9 ACPI / CPPC: set a non-zero value for transition_latency
Compute the expected transition latency for frequency transitions
using the values from the PCCT tables when the desired perf
register is in PCC.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Reviewed-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 01:02:33 +02:00
Prakash, Prashanth 80b8286aee ACPI / CPPC: support for batching CPPC requests
CPPC defined in section 8.4.7 of ACPI 6.0 specification suggests
"To amortize the cost of PCC transactions, OSPM should read or write
all PCC registers via a single read or write command when possible"
This patch enables opportunistic batching of frequency transition
requests whenever the request happen to overlap in time.

Currently the access to pcc is serialized by a spin lock which does
not scale well as we increase the number of cores in the system. This
patch improves the scalability by allowing the differnt CPU cores to
update PCC subspace in parallel and by batching requests which will
reduce the certain types of operation(checking command completion bit,
ringing doorbell) by a significant margin.

Profiling shows significant improvement in the overall effeciency
to service freq. transition requests. With this patch we observe close
to 30% of the frequency transition requests being batched with other
requests while running apache bench on a ARM platform with 6
independent domains(or sets of related cpus).

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 01:02:33 +02:00
Prakash, Prashanth 850d64a4a6 ACPI / CPPC: acquire pcc_lock only while accessing PCC subspace
We need to acquire pcc_lock only when we are accessing registers
that are in the PCC subspsace.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 01:02:33 +02:00
Ashwin Chaugule 5bbb86aa4b ACPI / CPPC: restructure read/writes for efficient sys mapped reg ops
For cases where sys mapped CPC registers need to be accessed
frequently, it helps immensly to pre-map them rather than map
and unmap for each operation. e.g. case where feedback counters
are sys mem map registers.

Restructure cpc_read/write and the cpc_regs structure to allow
pre-mapping the system addresses and unmap them when the CPU exits.

Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-31 01:02:33 +02:00
Hoan Tran 2324d15447 ACPI / CPPC: Prevent cpc_desc_ptr points to the invalid data
When CPPC fails to request a PCC channel, the CPC data is freed
and cpc_desc_ptr points to the invalid data.

Avoid this issue by moving the cpc_desc_ptr assignment after the PCC
channel request.

Signed-off-by: Hoan Tran <hotran@apm.com>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-25 02:53:31 +02:00
Hoan Tran 8343c40d3d ACPI: CPPC: Return error if _CPC is invalid on a CPU
Based on 8.4.7.1 section of ACPI 6.1 specification, if the platform
supports CPPC, the _CPC object must exist under all processor objects.
If cpc_desc_ptr pointer is invalid on any CPUs, acpi_get_psd_map()
should return error and CPPC cpufreq driver can not be registered.

Signed-off-by: Hoan Tran <hotran@apm.com>
Reviewed-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-23 23:19:21 +02:00
Prakash, Prashanth f387e5b901 ACPI / CPPC: use MRTT/MPAR to decide if/when a req can be sent
The ACPI spec defines Minimum Request Turnaround Time(MRTT) and
Maximum Periodic Access Rate(MPAR) to prevent the OSPM from sending
too many requests than the platform can handle. For further details
on these parameters please refer to section 14.1.3 of ACPI 6.0 spec.

This patch includes MRTT/MPAR in deciding if or when a CPPC request
can be sent to the platform to make sure CPPC implementation is
compliant to the spec.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 23:35:29 +01:00
Prakash, Prashanth beee23aebc ACPI / CPPC: replace writeX/readX to PCC with relaxed version
We do not have a strict read/write order requirement while accessing
PCC subspace. The only requirement is all access should be committed
before triggering the PCC doorbell to transfer the ownership of PCC
to the platform and this requirement is enforced by the PCC driver.

Profiling on a many core system shows improvement of about 1.8us on
average per freq change request(about 10% improvement on average).
Since these operations are executed while holding the pcc_lock,
reducing this time helps the CPPC implementation to scale much
better as the number of cores increases.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 23:35:29 +01:00
Prakash, Prashanth 77e3d86f2f ACPI / CPPC: optimized cpc_read and cpc_write
cpc_read and cpc_write are used while holding the pcc_lock spin_lock,
so they need to be as fast as possible. acpi_os_read/write_memory
APIs linearly search through a list for cached mapping which is
quite expensive. Since the PCC subspace is already mapped into
virtual address space during initialization, we can just add the
offset and access the necessary CPPC registers.

This patch + similar changes to PCC driver reduce the time per freq.
transition from around 200us to about 20us for the CPPC cpufreq
driver.

Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 23:35:29 +01:00
Ashwin Chaugule ad62e1e677 ACPI / CPPC: Optimize PCC Read Write operations
Previously the send_pcc_cmd() code checked if the
PCC operation had completed before returning from
the function. This check was performed regardless
of the PCC op type (i.e. Read/Write). Knowing
the type of cmd can be used to optimize the check
and avoid needless waiting. e.g. with Write ops,
the actual Writing is done before calling send_pcc_cmd().
And the subsequent Writes will check if the channel is
free at the entry of send_pcc_cmd() anyway.

However, for Read cmds, we need to wait for the cmd
completion bit to be flipped, since the actual Read
ops follow after returning from the send_pcc_cmd(). So,
only do the looping check at the end for Read ops.

Also, instead of using udelay() calls, use ktime as a
means to check for deadlines. The current deadline
in which the Remote should flip the cmd completion bit
is defined as N * Nominal latency. Where N is arbitrary
and large enough to work on slow emulators and Nominal
latency comes from the ACPI table (PCCT). This helps
in working around the CONFIG_HZ effects on udelay()
and also avoids needing different ACPI tables for Silicon
and Emulation platforms.

Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09 23:35:28 +01:00
Rafael J. Wysocki 9832bf3a35 Merge branches 'pm-cpufreq' and 'acpi-cppc'
* pm-cpufreq:
  Revert "Documentation: kernel_parameters for Intel P state driver"
  cpufreq: mediatek: fix build error
  cpufreq: intel_pstate: Add separate support for Airmont cores
  cpufreq: intel_pstate: Replace BYT with ATOM
  Revert "cpufreq: intel_pstate: Use ACPI perf configuration"
  Revert "cpufreq: intel_pstate: Avoid calculation for max/min"

* acpi-cppc:
  ACPI / CPPC: Use h/w reduced version of the PCCT structure
2015-11-20 01:22:10 +01:00
Ashwin Chaugule d29d67357d ACPI / CPPC: Use h/w reduced version of the PCCT structure
CPPC is enabled only on platforms which support the h/w reduced
ACPI specification, so use the h/w reduced version of the PCCT
consistently when deferencing PCCT contents.

Fixes: 337aadff8e (ACPI: Introduce CPU performance controls using CPPC)
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-14 00:00:38 +01:00
Ashwin Chaugule 4219853aef ACPI / CPPC: Fix potential memory leak
Commit 337aadff8e (ACPI: Introduce CPU performance controls using CPPC)
leads to the following static checker warning:

        drivers/acpi/cppc_acpi.c:527 acpi_cppc_processor_probe()
        warn: overwrite may leak 'cpc_ptr'

Fix the warning by removing the bogus per-CPU pointer dereference.

Fixes: 337aadff8e (ACPI: Introduce CPU performance controls using CPPC)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-26 04:47:02 +01:00
Dan Carpenter 32c0b2f606 ACPI / CPPC: signedness bug in register_pcc_channel()
The "pcc_subspace_idx" is -1 if it hasn't been initialized yet.  We need
it to be signed.

Fixes: 337aadff8e (ACPI: Introduce CPU performance controls using CPPC)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-26 04:44:17 +01:00
Ashwin Chaugule 337aadff8e ACPI: Introduce CPU performance controls using CPPC
CPPC stands for Collaborative Processor Performance Controls
and is defined in the ACPI v5.0+ spec. It describes CPU
performance controls on an abstract and continuous scale
allowing the platform (e.g. remote power processor) to flexibly
optimize CPU performance with its knowledge of power budgets
and other architecture specific knowledge.

This patch adds a shim which exports commonly used functions
to get and set CPPC specific controls for each CPU. This enables
CPUFreq drivers to gather per CPU performance data and use
with exisiting governors or even allows for customized governors
which are implemented inside CPUFreq drivers.

Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-12 22:49:55 +02:00