1
0
Fork 0
alistair23-linux/tools/perf
Kim Phillips 7bee7eabf0 perf stat: Don't report a null stalled cycles per insn metric
commit 80cc7bb6c1 upstream.

For data collected on machines with front end stalled cycles supported,
such as found on modern AMD CPU families, commit 146540fb54 ("perf
stat: Always separate stalled cycles per insn") introduces a new line in
CSV output with a leading comma that upsets some automated scripts.
Scripts have to use "-e ex_ret_instr" to work around this issue, after
upgrading to a version of perf with that commit.

We could add "if (have_frontend_stalled && !config->csv_sep)" to the not
(total && avg) else clause, to emphasize that CSV users are usually
scripts, and are written to do only what is needed, i.e., they wouldn't
typically invoke "perf stat" without specifying an explicit event list.

But - let alone CSV output - why should users now tolerate a constant
0-reporting extra line in regular terminal output?:

BEFORE:

$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1

 Performance counter stats for 'system wide':

       181,110,981      instructions              #    0.58  insn per cycle
                                                  #    0.00  stalled cycles per insn
       309,876,469      cycles

       1.002202582 seconds time elapsed

The user would not like to see the now permanent:

  "0.00  stalled cycles per insn"

line fixture, as it gives no useful information.

So this patch removes the printing of the zeroed stalled cycles line
altogether, almost reverting the very original commit fb4605ba47
("perf stat: Check for frontend stalled for metrics"), which seems like
it was written to normalize --metric-only column output of common Intel
machines at the time: modern Intel machines have ceased to support the
genericised frontend stalled metrics AFAICT.

AFTER:

$ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1

 Performance counter stats for 'system wide':

       244,071,432      instructions              #    0.69  insn per cycle
       355,353,490      cycles

       1.001862516 seconds time elapsed

Output behaviour when stalled cycles is indeed measured is not affected
(BEFORE == AFTER):

$ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1

 Performance counter stats for 'system wide':

       247,227,799      instructions              #    0.63  insn per cycle
                                                  #    0.26  stalled cycles per insn
       394,745,636      cycles
        63,194,485      stalled-cycles-frontend   #   16.01% frontend cycles idle

       1.002079770 seconds time elapsed

Fixes: 146540fb54 ("perf stat: Always separate stalled cycles per insn")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200207230613.26709-1-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-19 19:53:08 +01:00
..
Documentation perf docs: Correct and clarify jitdump spec 2019-09-30 17:29:51 -03:00
arch perf tools: Fix cross compile for ARM64 2019-12-31 16:44:57 +01:00
bench perf env: Remove needless cpumap.h header 2019-09-20 09:19:21 -03:00
examples/bpf perf augmented_raw_syscalls: Reduce perf_event_output() boilerplate 2019-08-26 11:58:29 -03:00
include/bpf perf include bpf: Add bpf_tail_call() prototype 2019-07-29 18:34:40 -03:00
jvmti perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy() 2019-10-15 11:47:38 -03:00
lib libperf: Add perf_evlist__poll() function 2019-09-25 09:51:49 -03:00
pmu-events perf vendor events s390: Remove name from L1D_RO_EXCL_WRITES description 2020-01-17 19:48:30 +01:00
python treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 407 2019-06-05 17:37:14 +02:00
scripts perf scripts python: exported-sql-viewer.py: Fix use of TRUE with SQLite 2019-12-13 08:42:16 +01:00
tests perf tests: Disable bp_signal testing for arm64 2019-12-31 16:44:12 +01:00
trace perf trace beauty ioctl: Fix off-by-one error in cmd->string table 2019-08-26 11:58:29 -03:00
ui libperf: Add perf_evlist__first()/last() functions 2019-09-25 09:51:48 -03:00
util perf stat: Don't report a null stalled cycles per insn metric 2020-02-19 19:53:08 +01:00
.gitignore perf: Update .gitignore file 2019-08-31 22:27:52 -03:00
Build perf tools: Rename build libperf to perf 2019-02-14 15:18:08 -03:00
CREDITS
MANIFEST tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
Makefile perf tools: Disable parallelism for 'make clean' 2018-08-20 08:54:58 -03:00
Makefile.config perf build: Add detection of java-11-openjdk-devel package 2019-09-25 16:26:41 -03:00
Makefile.perf libtraceevent: Move traceevent plugins in its own subdirectory 2019-09-25 09:51:43 -03:00
builtin-annotate.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-bench.c perf tools: Remove perf.h from source files not needing it 2019-08-29 17:38:32 -03:00
builtin-buildid-cache.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-buildid-list.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-c2c.c perf c2c: Fix return type for histogram sorting comparision functions 2020-02-05 21:22:40 +00:00
builtin-config.c perf tools: Remove util.h from where it is not needed 2019-09-20 09:19:20 -03:00
builtin-data.c perf debug: Remove needless include directives from debug.h 2019-08-31 19:10:19 -03:00
builtin-diff.c perf diff: Use llabs() with 64-bit values 2020-01-04 19:18:26 +01:00
builtin-evlist.c perf evsel: Introduce evsel_fprintf.h 2019-09-25 16:26:34 -03:00
builtin-ftrace.c perf auxtrace: Uninline functions that touch perf_session 2019-08-31 22:24:10 -03:00
builtin-help.c perf debug: Remove needless include directives from debug.h 2019-08-31 19:10:19 -03:00
builtin-inject.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-kallsyms.c perf dsos: Move the dsos struct and its methods to separate source files 2019-08-31 22:24:10 -03:00
builtin-kmem.c perf kmem: Fix memory leak in compact_gfp_flags() 2019-10-16 10:08:32 -03:00
builtin-kvm.c perf tools: Propagate get_cpuid() error 2019-09-30 17:29:54 -03:00
builtin-list.c perf list: Allow plurals for metric, metricgroup 2019-09-25 09:51:42 -03:00
builtin-lock.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-mem.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-probe.c perf probe: No need for symbol.h, symbol_conf is enough 2019-08-31 22:24:10 -03:00
builtin-record.c libperf: Add perf_evlist__poll() function 2019-09-25 09:51:49 -03:00
builtin-report.c perf report: Fix no libunwind compiled warning break s390 issue 2020-02-05 21:22:53 +00:00
builtin-sched.c perf evsel: Introduce evsel_fprintf.h 2019-09-25 16:26:34 -03:00
builtin-script.c perf script: Fix --reltime with --time 2020-01-23 08:23:01 +01:00
builtin-stat.c libperf: Move 'sample_id' from 'struct evsel' to 'struct perf_evsel' 2019-09-25 09:51:47 -03:00
builtin-timechart.c perf session: Return error code for perf_session__new() function on failure 2019-09-20 15:58:11 -03:00
builtin-top.c perf evsel: Move config terms to a separate header 2019-09-25 16:26:40 -03:00
builtin-trace.c perf evsel: Introduce evsel_fprintf.h 2019-09-25 16:26:34 -03:00
builtin-version.c perf symbols: Move mem_info and branch_info out of symbol.h 2019-08-31 22:27:48 -03:00
builtin.h perf tools: Remove needless util.h include from builtin.h 2019-08-28 17:19:34 -03:00
check-headers.sh tools headers uapi: Sync linux/fs.h with the kernel sources 2019-09-30 17:29:22 -03:00
command-list.txt perf help: Add missing subcommand `version` 2018-09-19 14:53:36 -03:00
design.txt perf/doc: Update design.txt for exclude_{host|guest} flags 2019-01-21 11:01:18 +01:00
perf-archive.sh License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
perf-completion.sh perf tools: Auto-complete for events with ':' 2017-12-27 12:16:00 -03:00
perf-read-vdso.c perf tools: Make find_vdso_map() more modular 2019-01-08 13:28:13 -03:00
perf-sys.h perf tools: Make usage of test_attr__* optional for perf-sys.h 2019-10-31 21:38:41 +01:00
perf-with-kcore.sh Merge branch 'x86/cpu' into perf/core, to pick up dependent changes 2019-06-17 12:29:16 +02:00
perf.c libperf: Merge libperf_set_print() into libperf_init() 2019-09-25 09:51:49 -03:00
perf.h perf time-utils: Adopt rdclock() from perf.h 2019-08-29 17:38:32 -03:00