alistair23-linux/tools/perf/util
Andi Kleen 44b1e60ab5 perf stat: Basic support for TopDown in perf stat
Add basic plumbing for TopDown in perf stat

TopDown is intended to replace the "frontend cycles idle" and "backend
cycles idle" metrics in standard perf stat output.  These metrics are
not reliable in many workloads, due to out-of-order effects.

This implements a new --topdown mode in perf stat (similar to
--transaction) that measures the pipeline bottlenecks using
standardized formulas. The whole measurement needs only 5 counters
(one fixed counter).

The result is four metrics:

FrontendBound, BackendBound, BadSpeculation, Retiring

that describe the CPU pipeline behavior on a high level.

The full TopDown methodology has many hierarchical metrics.  This
implementation only supports level 1, which can be collected without
multiplexing. A full implementation of TopDown on top of perf is
available in pmu-tools toplev.  (http://github.com/andikleen/pmu-tools)

The current version works on Intel Core CPUs starting with Sandy Bridge,
and Atom CPUs starting with Silvermont.  In principle the generic
metrics should also be implementable on other out-of-order CPUs.

TopDown level 1 uses a set of abstracted metrics which are generic to
out of order CPU cores (although some CPUs may not implement all of
them):

  topdown-total-slots       Available slots in the pipeline
  topdown-slots-issued      Slots issued into the pipeline
  topdown-slots-retired     Slots successfully retired
  topdown-fetch-bubbles     Pipeline gaps in the frontend
  topdown-recovery-bubbles  Pipeline gaps during recovery
                            from misspeculation

These metrics then allow computing four useful metrics:

FrontendBound, BackendBound, Retiring, BadSpeculation.
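
As a rough illustration of how the five abstract events can be combined
into the four level 1 metrics, here is a minimal sketch in C.  It uses the
commonly published level 1 TopDown formulas and is not the code from the
follow-on patches.  The struct and function names (topdown_counts,
topdown_level1, topdown_compute, topdown_print) are hypothetical, and
clamping a slightly negative BackendBound to zero is an assumption made
here for readability.

```c
#include <stdio.h>

/* Raw counts for the five level 1 events named in the commit message. */
struct topdown_counts {
	double total_slots;       /* topdown-total-slots */
	double slots_issued;      /* topdown-slots-issued */
	double slots_retired;     /* topdown-slots-retired */
	double fetch_bubbles;     /* topdown-fetch-bubbles */
	double recovery_bubbles;  /* topdown-recovery-bubbles */
};

/* The four level 1 metrics, each a fraction of all pipeline slots. */
struct topdown_level1 {
	double frontend_bound;
	double bad_speculation;
	double retiring;
	double backend_bound;
};

/* Hypothetical helper: derive the level 1 metrics from raw counts. */
static struct topdown_level1 topdown_compute(const struct topdown_counts *c)
{
	struct topdown_level1 m = { 0.0, 0.0, 0.0, 0.0 };

	/* Without any slots there is nothing to attribute. */
	if (c->total_slots <= 0.0)
		return m;

	m.frontend_bound = c->fetch_bubbles / c->total_slots;
	m.bad_speculation = (c->slots_issued - c->slots_retired +
			     c->recovery_bubbles) / c->total_slots;
	m.retiring = c->slots_retired / c->total_slots;

	/* Whatever is left over is attributed to the backend. */
	m.backend_bound = 1.0 - (m.frontend_bound + m.bad_speculation +
				 m.retiring);
	if (m.backend_bound < 0.0)
		m.backend_bound = 0.0;  /* assumption: hide measurement noise */

	return m;
}

/* Hypothetical helper: print the metrics as percentages. */
static void topdown_print(const struct topdown_level1 *m)
{
	printf("FrontendBound  %5.1f%%\n", 100.0 * m->frontend_bound);
	printf("BackendBound   %5.1f%%\n", 100.0 * m->backend_bound);
	printf("BadSpeculation %5.1f%%\n", 100.0 * m->bad_speculation);
	printf("Retiring       %5.1f%%\n", 100.0 * m->retiring);
}
```

Because BackendBound is computed as the remainder, the four fractions sum
to one whenever the clamp does not trigger.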

Add a new --topdown option to enable the events.  When --topdown is
specified, set up events for all topdown events supported by the kernel.
Add topdown-* as a special case to the event parser, as is needed for
all events containing '-'.
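
The "set up events for all topdown events supported by the kernel" step can
be pictured as filtering the five event names and joining the survivors
into one event group string.  The sketch below is only an illustration
under stated assumptions: build_topdown_group, the event_supported_fn
callback and the two sample predicates are hypothetical names, and how the
real code probes the kernel for support is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* The five level 1 event names, in the order the commit message lists them. */
static const char *const topdown_events[] = {
	"topdown-total-slots",
	"topdown-slots-issued",
	"topdown-slots-retired",
	"topdown-fetch-bubbles",
	"topdown-recovery-bubbles",
};

/* Hypothetical callback: nonzero if the kernel exposes the named event. */
typedef int (*event_supported_fn)(const char *name);

/*
 * Hypothetical helper: write "{ev1,ev2,...}" for all supported events
 * into buf.  Returns the number of events included, or -1 if buf is
 * too small to hold the result.
 */
static int build_topdown_group(char *buf, size_t len,
			       event_supported_fn supported)
{
	size_t n = sizeof(topdown_events) / sizeof(topdown_events[0]);
	size_t used = 0;
	size_t i;
	int count = 0;

	if (len < 3)  /* need room for '{', '}' and the terminating NUL */
		return -1;
	buf[used++] = '{';

	for (i = 0; i < n; i++) {
		const char *ev = topdown_events[i];
		size_t evlen = strlen(ev);
		size_t need = evlen + (count ? 1 : 0);  /* name plus ',' */

		if (!supported(ev))
			continue;
		/* Always keep room for the closing '}' and the NUL. */
		if (used + need + 2 > len)
			return -1;
		if (count)
			buf[used++] = ',';
		memcpy(buf + used, ev, evlen);
		used += evlen;
		count++;
	}

	buf[used++] = '}';
	buf[used] = '\0';
	return count;
}

/* Sample predicate: pretend every event is available. */
static int all_supported(const char *name)
{
	(void)name;
	return 1;
}

/* Sample predicate: pretend the CPU lacks topdown-recovery-bubbles. */
static int without_recovery_bubbles(const char *name)
{
	return strcmp(name, "topdown-recovery-bubbles") != 0;
}
```

Keeping the support check behind a callback keeps the string building
independent of how support is detected, which matches the commit's note
that the x86 specific code lives under arch/.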

The actual code to compute the metrics is in follow-on patches.

v2: Use standard sysctl read function.
v3: Move x86 specific code to arch/
v4: Enable --metric-only implicitly for topdown.
v5: Add --single-thread option to not force per core mode
v6: Fix output order of topdown metrics
v7: Allow combining with -d
v8: Remove --single-thread again
v9: Rename functions, adding arch_ and topdown_.
v10: Expand man page and describe TopDown better.
     Paste intro into commit description.
     Print error when malloc fails.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-06-06 17:04:15 -03:00
..
include tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/ 2016-01-08 12:35:46 -03:00
intel-pt-decoder perf intel-pt: Fix off-by-one comparison on maximum code 2016-04-25 20:35:59 -03:00
scripting-engines perf tools: Fix usage of max_stack sysctl 2016-05-20 11:43:56 -03:00
alias.c perf tools: Introduce zfree 2013-12-27 15:17:00 -03:00
annotate.c perf annotate: Sort list of recognised instructions 2016-05-20 11:43:57 -03:00
annotate.h perf tools: Remove misplaced __maybe_unused 2016-03-23 12:03:04 -03:00
auxtrace.c perf tools: Add support for skipping itrace instructions 2016-03-30 11:14:09 -03:00
auxtrace.h perf tools: Add support for skipping itrace instructions 2016-03-30 11:14:09 -03:00
bpf-loader.c perf bpf: Automatically create bpf-output event __bpf_stdout__ 2016-04-11 22:18:04 -03:00
bpf-loader.h perf bpf: Clone bpf stdout events in multiple bpf scripts 2016-04-11 22:17:45 -03:00
bpf-prologue.c perf bpf: Add prologue for BPF programs for fetching arguments 2015-11-18 17:51:04 -03:00
bpf-prologue.h perf bpf: Add prologue for BPF programs for fetching arguments 2015-11-18 17:51:04 -03:00
Build perf tools: Remove xrealloc and ALLOC_GROW 2016-05-10 11:58:27 -03:00
build-id.c perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid 2016-05-30 13:15:03 -03:00
build-id.h perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid 2016-05-30 13:15:03 -03:00
cache.h perf tools: Remove xrealloc and ALLOC_GROW 2016-05-10 11:58:27 -03:00
call-path.c perf tools: Refactor code to move call path handling out of thread-stack 2016-05-06 13:00:43 -03:00
call-path.h perf tools: Refactor code to move call path handling out of thread-stack 2016-05-06 13:00:43 -03:00
callchain.c perf hists: Move sort__has_parent into struct perf_hpp_list 2016-05-05 21:03:59 -03:00
callchain.h perf tools: Per event max-stack settings 2016-05-30 12:41:44 -03:00
cgroup.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
cgroup.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
cloexec.c perf bench numa: Fix to show proper convergence stats 2015-06-25 12:28:35 -03:00
cloexec.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
color.c perf config: Bring perf_default_config to the very beginning at main() 2016-02-26 19:49:16 -03:00
color.h perf tools: Remove trail argument to color vsprintf 2015-08-05 16:44:02 -03:00
comm.c perf comm: Use atomic.h for refcounting 2015-05-27 12:21:43 -03:00
comm.h perf tools: Add facility to export data in database-friendly way 2014-10-29 10:32:49 -02:00
config.c perf config: Introduce perf_config_set class 2016-04-14 09:00:42 -03:00
config.h perf config: Introduce perf_config_set class 2016-04-14 09:00:42 -03:00
counts.c perf stat: Move perf_counts struct and functions into separate object 2015-08-08 14:16:49 -03:00
counts.h perf stat: Move perf_counts struct and functions into separate object 2015-08-08 14:16:49 -03:00
cpumap.c perf cpu_map: Add has() method 2016-04-13 10:11:50 -03:00
cpumap.h perf cpu_map: Add has() method 2016-04-13 10:11:50 -03:00
ctype.c perf ui/stdio: Align column header for hierarchy output 2016-02-24 20:21:12 -03:00
data-convert-bt.c perf ctf: Convert invalid chars in a string before set value 2016-05-27 12:08:40 -03:00
data-convert-bt.h perf data: Support using -f to override perf.data file ownership for 'convert' 2015-04-02 13:18:52 -03:00
data.c perf data: Add perf_data_file__switch() helper 2016-04-14 08:57:54 -03:00
data.h perf data: Add perf_data_file__switch() helper 2016-04-14 08:57:54 -03:00
db-export.c perf thread: Adopt get_main_thread from db-export.c 2016-05-30 12:41:43 -03:00
db-export.h perf script: Add call path id to exported sample in db export 2016-05-06 13:00:53 -03:00
debug.c perf tools: Make binary data printer code in trace_event public available 2016-02-24 11:38:01 -03:00
debug.h perf tools: Initialize libapi debug output 2016-02-16 17:12:59 -03:00
demangle-java.c perf symbols: add Java demangling support 2016-02-05 09:46:45 -03:00
demangle-java.h perf symbols: add Java demangling support 2016-02-05 09:46:45 -03:00
dso.c perf tools: Set buildid dir under symfs when --symfs is provided 2016-05-20 11:43:58 -03:00
dso.h perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid 2016-05-30 13:15:03 -03:00
dwarf-aux.c perf probe: Check if dwarf_getlocations() is available 2016-05-12 11:26:59 -03:00
dwarf-aux.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
env.c perf tools: Add perf data cache feature 2016-02-16 17:13:00 -03:00
env.h perf tools: Add perf data cache feature 2016-02-16 17:13:00 -03:00
event.c perf record: Fix crash when kptr is restricted 2016-05-27 09:41:39 -03:00
event.h perf tools: Add time conversion event 2016-03-31 10:52:24 -03:00
evlist.c perf evlist: Fix alloc_mmap() failure path 2016-06-03 14:53:46 -03:00
evlist.h perf evlist: Choose correct reading direction according to evlist->backward 2016-05-30 12:41:45 -03:00
evsel.c perf evsel: Provide way to extract integer value from format_field 2016-06-03 14:53:46 -03:00
evsel.h perf evsel: Provide way to extract integer value from format_field 2016-06-03 14:53:46 -03:00
evsel_fprintf.c perf evsel: Move fprintf methods to separate source file 2016-04-14 19:46:58 -03:00
find-vdso-map.c perf tools: Build programs to copy 32-bit compatibility 2014-10-29 10:32:48 -02:00
genelf.c perf jit: add source line info support 2016-02-05 12:33:09 -03:00
genelf.h perf jit: genelf makes assumptions about endian 2016-03-30 18:12:06 -03:00
genelf_debug.c perf jit: add source line info support 2016-02-05 12:33:09 -03:00
generate-cmdlist.sh perf tools: Do not show trace command if it's not compiled in 2016-01-08 12:46:17 -03:00
group.h perf stat: Basic support for TopDown in perf stat 2016-06-06 17:04:15 -03:00
header.c perf tools: Use SBUILD_ID_SIZE where applicable 2016-05-11 13:06:06 -03:00
header.h perf tools: Remove misplaced __maybe_unused 2016-03-23 12:03:04 -03:00
help-unknown-cmd.c perf help: Do not use ALLOC_GROW in add_cmd_list 2016-05-10 11:58:09 -03:00
help-unknown-cmd.h perf tools: Move help_unknown_cmd() to its own file 2015-12-14 12:30:37 -03:00
hist.c perf report: Add srcline_from/to branch sort keys 2016-05-23 11:25:16 -03:00
hist.h perf report: Add srcline_from/to branch sort keys 2016-05-23 11:25:16 -03:00
intel-bts.c perf tools: Add support for skipping itrace instructions 2016-03-30 11:14:09 -03:00
intel-bts.h perf tools: Add Intel BTS support 2015-08-21 11:34:10 -03:00
intel-pt.c Merge branch 'perf/urgent' into perf/core, to resolve conflict 2016-04-23 14:12:10 +02:00
intel-pt.h perf tools: Pass Intel PT information for decoding MTC and CYC 2015-08-24 17:46:43 -03:00
intlist.c perf util: Add findnew method to intlist 2013-10-14 10:28:48 -03:00
intlist.h perf util: Add findnew method to intlist 2013-10-14 10:28:48 -03:00
jit.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
jitdump.c perf jit: memset() variable 'st' using the correct size 2016-04-19 12:37:01 -03:00
jitdump.h perf jit: Add support for using TSC as a timestamp 2016-04-01 18:42:55 -03:00
kvm-stat.h perf kvm/powerpc: Port perf kvm stat to powerpc 2016-01-29 17:49:54 -03:00
levenshtein.c
levenshtein.h
llvm-utils.c perf llvm: Use strerror_r instead of the thread unsafe strerror one 2016-03-23 17:42:21 -03:00
llvm-utils.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
lzma.c perf tools: Add lzma decompression support for kernel module 2015-03-21 14:53:40 -03:00
machine.c perf callchain: Stop validating callchains by the max_stack sysctl 2016-05-20 11:43:56 -03:00
machine.h perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1 2016-05-20 11:43:54 -03:00
map.c perf tools: Use SBUILD_ID_SIZE where applicable 2016-05-11 13:06:06 -03:00
map.h perf maps: Introduce maps__find_symbol_by_name() 2015-09-30 18:34:25 -03:00
mem-events.c perf script: Display data_src values 2016-02-24 10:32:11 -03:00
mem-events.h perf script: Display data_src values 2016-02-24 10:32:11 -03:00
ordered-events.c perf ordered_events: Introduce reinit() 2016-04-14 08:57:54 -03:00
ordered-events.h perf ordered_events: Introduce reinit() 2016-04-14 08:57:54 -03:00
parse-branch-options.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
parse-branch-options.h perf tools: Move branch option parsing to own file 2015-05-27 21:02:17 -03:00
parse-events.c perf tools: Per event max-stack settings 2016-05-30 12:41:44 -03:00
parse-events.h perf tools: Per event max-stack settings 2016-05-30 12:41:44 -03:00
parse-events.l perf stat: Basic support for TopDown in perf stat 2016-06-06 17:04:15 -03:00
parse-events.y perf tools: Explicitly declare inc_group_count as a void function 2016-03-08 10:11:16 +01:00
parse-regs-options.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
parse-regs-options.h perf record: Add ability to name registers to record 2015-08-31 18:01:33 -03:00
path.c perf tools: Remove unused perf_pathdup, xstrdup functions 2016-03-23 15:27:33 -03:00
PERF-VERSION-GEN perf tools: Fix version when building out of tree 2013-11-07 10:40:47 -03:00
perf_regs.c perf tools: Fix build break on powerpc due to sample_reg_masks 2015-09-30 18:34:27 -03:00
perf_regs.h perf tools: Fix build break on powerpc due to sample_reg_masks 2015-09-30 18:34:27 -03:00
pmu.c perf pmu: Make pmu_formats_string to check return value of strbuf 2016-05-10 11:57:52 -03:00
pmu.h perf tools: Add perf_pmu__format_bits() 2015-08-06 16:49:01 -03:00
pmu.l
pmu.y
probe-event.c perf probe: Check the return value of strbuf APIs 2016-05-10 11:53:34 -03:00
probe-event.h perf symbols: Fix kallsyms perf test on ppc64le 2016-05-05 21:04:03 -03:00
probe-file.c perf probe: Let probe_file__add_event return 0 if succeeded 2016-04-26 13:14:58 -03:00
probe-file.h perf probe: Print deleted events in cmd_probe() 2015-09-04 12:43:44 -03:00
probe-finder.c perf probe: Check the return value of strbuf APIs 2016-05-10 11:53:34 -03:00
probe-finder.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
pstack.c perf tools: Introduce pstack_peek() 2015-05-05 18:13:22 -03:00
pstack.h perf tools: Introduce pstack_peek() 2015-05-05 18:13:22 -03:00
python-ext-sources perf symbols: Move fprintf routines to separate object file 2016-04-14 19:46:53 -03:00
python.c perf python: Support the PERF_RECORD_SWITCH event 2015-10-07 19:41:50 -03:00
quote.c perf tools: Make alias handler to check return value of strbuf 2016-05-10 11:56:52 -03:00
quote.h perf tools: Make alias handler to check return value of strbuf 2016-05-10 11:56:52 -03:00
rb_resort.h perf tools: Add template for generating rbtree resort class 2016-05-05 21:03:55 -03:00
rblist.c perf util: Add findnew method to intlist 2013-10-14 10:28:48 -03:00
rblist.h perf util: Add findnew method to intlist 2013-10-14 10:28:48 -03:00
record.c perf evsel: Do not use globals in config() 2016-04-11 22:18:20 -03:00
session.c perf tools: Per event max-stack settings 2016-05-30 12:41:44 -03:00
session.h perf evsel: Move some methods from session.[ch] to evsel.[ch] 2016-04-13 10:11:52 -03:00
setup.py perf tools: Fix python extension build 2016-02-29 11:18:25 -03:00
sort.c perf report: Add srcline_from/to branch sort keys 2016-05-23 11:25:16 -03:00
sort.h perf report: Add srcline_from/to branch sort keys 2016-05-23 11:25:16 -03:00
srcline.c perf tools: Always use non inlined file name for 'srcfile' sort key 2015-09-02 16:30:46 -03:00
stat-shadow.c perf stat: Update runtime using cpu-clock event 2016-05-16 23:11:46 -03:00
stat.c perf stat: Scale values by unit before metrics 2016-05-09 13:42:09 -03:00
stat.h perf stat: Check for frontend stalled for metrics 2016-03-03 11:10:40 -03:00
strbuf.c perf tools: Rewrite strbuf not to die() 2016-05-10 11:27:58 -03:00
strbuf.h perf tools: Rewrite strbuf not to die() 2016-05-10 11:27:58 -03:00
strfilter.c perf tools: Add strfilter__string to recover rules string 2015-05-04 12:43:54 -03:00
strfilter.h perf tools: Add strfilter__string to recover rules string 2015-05-04 12:43:54 -03:00
string.c tools: Adopt memdup() from tools/perf, moving it to tools/lib/string.c 2015-11-18 17:51:02 -03:00
strlist.c perf tools: Add file_only config option to strlist 2016-01-12 12:42:07 -03:00
strlist.h perf tools: Add file_only config option to strlist 2016-01-12 12:42:07 -03:00
svghelper.c perf tools: Add reference counting for cpu_map object 2015-06-25 15:15:50 -03:00
svghelper.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
symbol-elf.c perf symbols: Fix kallsyms perf test on ppc64le 2016-05-05 21:04:03 -03:00
symbol-minimal.c perf symbols: Fix type error when reading a build-id 2015-10-28 10:02:00 -03:00
symbol.c perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid 2016-05-30 13:15:03 -03:00
symbol.h perf report: Add srcline_from/to branch sort keys 2016-05-23 11:25:16 -03:00
symbol_fprintf.c perf symbols: Move fprintf routines to separate object file 2016-04-14 19:46:53 -03:00
syscalltbl.c perf tools: Build syscall table .c header from kernel's syscall_64.tbl 2016-04-08 09:58:14 -03:00
syscalltbl.h perf tools: Allow generating per-arch syscall table arrays 2016-04-08 09:58:14 -03:00
target.c perf target: Simplify handling of strerror_r return 2015-03-24 12:08:30 -03:00
target.h perf target: Move the checking of which map function to call into function. 2013-12-04 13:46:37 -03:00
term.c perf tools: Move term functions out of util.c 2015-12-09 13:42:02 -03:00
term.h perf tools: Move term functions out of util.c 2015-12-09 13:42:02 -03:00
thread-stack.c perf script: Expose usage of the callchain db export via the python api 2016-05-06 13:00:54 -03:00
thread-stack.h perf script: Expose usage of the callchain db export via the python api 2016-05-06 13:00:54 -03:00
thread.c perf thread: Adopt get_main_thread from db-export.c 2016-05-30 12:41:43 -03:00
thread.h perf thread: Adopt get_main_thread from db-export.c 2016-05-30 12:41:43 -03:00
thread_map.c perf thread_map: Use readdir() instead of deprecated readdir_r() 2016-05-12 11:26:58 -03:00
thread_map.h perf thread_map: Make new_by_tid_str constructor public 2016-04-13 10:11:51 -03:00
tool.h perf tools: Add time conversion event 2016-03-31 10:52:24 -03:00
top.c perf tools: Rename 'perf_record_opts' to 'record_opts 2013-12-19 14:43:45 -03:00
top.h perf top: Use machine->kptr_restrict_warned 2016-05-20 11:43:55 -03:00
trace-event-info.c tools lib api fs: Move tracing_path interface into api/fs/tracing_path.c 2015-09-04 12:00:45 -03:00
trace-event-parse.c irq_poll: make blk-iopoll available outside the block layer 2015-12-11 11:52:24 -08:00
trace-event-read.c perf tools: Stop reading the kallsyms data from perf.data 2015-07-23 22:51:11 -03:00
trace-event-scripting.c perf scripting: No need to pass thread twice to the scripting callbacks 2015-04-02 13:18:41 -03:00
trace-event.c tools lib api fs: Adopt filename__read_str from perf 2016-02-16 17:12:56 -03:00
trace-event.h perf script: Add process_stat/process_stat_interval scripting interface 2016-01-06 20:11:15 -03:00
trigger.h perf tools: Introduce trigger class 2016-04-28 09:58:58 -03:00
tsc.c perf tools: Use 64-bit shifts with (TSC) time conversion 2016-03-08 10:11:18 +01:00
tsc.h perf jit: Add support for using TSC as a timestamp 2016-04-01 18:42:55 -03:00
unwind-libdw.c perf libdw: Check for mmaps also in MAP__VARIABLE tree 2016-01-08 14:16:57 -03:00
unwind-libdw.h perf callchain: Add order support for libdw DWARF unwinder 2015-11-23 18:31:13 -03:00
unwind-libunwind.c perf tools: Add dedicated unwind addr_space member into thread struct 2016-04-08 09:58:02 -03:00
unwind.h perf callchains: Use thread->mg->machine 2014-10-29 10:32:46 -02:00
usage.c perf tools: Simplify die() mechanism 2016-03-23 12:32:31 -03:00
util.c perf tools: Separate accounting of contexts and real addresses in a stack trace 2016-05-16 23:11:54 -03:00
util.h perf tools: Separate accounting of contexts and real addresses in a stack trace 2016-05-16 23:11:54 -03:00
values.c perf tools: Use zfree to help detect use after free bugs 2013-12-27 17:08:19 -03:00
values.h tools: Consolidate types.h 2014-05-01 21:22:39 +02:00
vdso.c perf tools: Fix lockup using 32-bit compat vdso 2015-07-07 11:05:08 -03:00
vdso.h perf machine: Fix up vdso methods names 2015-05-29 12:43:44 -03:00
xyarray.c perf tools: Introduce xyarray__reset function 2015-06-16 10:34:39 -03:00
xyarray.h perf tools: Introduce xyarray__reset function 2015-06-16 10:34:39 -03:00
zlib.c perf tools: Add gzip decompression support for kernel module 2014-11-05 10:11:26 -03:00