1
0
Fork 0
Commit Graph

32 Commits (redonkable)

Author SHA1 Message Date
Andrey Zhizhikin e0de7af107 This is the 5.4.69 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl91u0cACgkQONu9yGCS
 aT7KmhAAvuW3edfAfzD/F5h4vHaa9rMRmtvp2/FwefBoE4LEi3F6p2gBrUZMA3ds
 DNQ8Nheafeqd63wFkfE//TXYR0rYTxTxa0jTrhtuJCUZ4+anRyG00fEbHPOxvMnJ
 aPwQQVNOfCaUAvRbFdQ4RbuIm5chhX8Bml0ZtqvsAAFJ9XkCh1UPF0VHtSrS7PRL
 lRMBlamLgZqU72naaJaFY2nMp+pvMFPZrzkR7tpv0Z1bqxuJp6L2n/EmcHpmTOJy
 Ze+Wvt1wKk8Ep5Vql5ekXt5lEiInjacwsJZXbb5HfHO++Y+1b+ABt1kSjJx+R3/q
 2Qdztq+9Eoj0N1A4gXdVFoZHqKihhbD49k8YqX4qO5ujTzqgnNyHGSEXyIKvaU6z
 b3b12IvjbcMhM1zm3qvFfrVbbQI3kJf66zSi9NAwsZHlsvxRzslALR8I7mila4r5
 fVOyfGoZxFs44FNW9JG7I85/isAxgg0ogYraMZbk8gmhTtb1ZaN+r7kJeXuTpzOg
 UBAIDYPclMyZeny6tn1/qFuzNGYQQ0R9kxFcTC21Cf2zNLWHNfwCL1vE3Ob+ROIS
 IHcsce6IqWQKGlD8UPjkZiXTLfqCAVi51PsGTVrnidXfa1IBOuvDsVqlghPsjHSD
 30N4VB++9Gbw7LFEP4e33cOZLBLjDEdYd4VuoQFYywDZ3cy6xXo=
 =OoZD
 -----END PGP SIGNATURE-----

Merge tag 'v5.4.69' into 5.4-2.2.x-imx

This is the 5.4.69 stable release

Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
2020-10-01 16:21:52 +00:00
Ian Rogers ff793fe02c perf metricgroup: Free metric_events on error
[ Upstream commit a159e2fe89 ]

Avoid a simple memory leak.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: kp singh <kpsingh@chromium.org>
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200508053629.210324-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01 13:18:06 +02:00
Joakim Zhang 1503fb9f5b MLK-23724-1 tools: perf: fix metricgroup add metric events multiple times
Fix metricgroup add metric events multiple times.

Before:
root@imx8dxlevk:~# ./perf stat -a -I 1000 -M imx8dxl_ddr0_read.all
     1.000135500              40047      imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 320376.0 imx8dxl_ddr0_read.all
     												# 320376.0 imx8dxl_ddr0_read.all
     												# 320376.0 imx8dxl_ddr0_read.all

After:
root@imx8dxlevk:~# ./perf stat -a -I 1000 -M imx8dxl_ddr0_read.all
     1.000135500              40047      imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 320376.0 imx8dxl_ddr0_read.all

Fixes: commit b8552bd2ea ("tools: perf: metricgroup: add metricgroup for each PMU")
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
2020-04-01 18:04:33 +08:00
Joakim Zhang 8d4580bdac MLK-23302-3 tools: perf: metricgroup: match metric with socname if needed
All metrics under one CPUID would be loaded by Perf tool when the CPUID
of SoC is matched. So users could see other platforms' metrics from one
platform, which is very confused. We can match metric/metricgroup with
SOCNAME if needed.

Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
2020-02-12 19:26:56 +08:00
Kajol Jain 4431f64855 perf metricgroup: Fix printing event names of metric group with multiple events
[ Upstream commit eb573e746b ]

Commit f01642e491 ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric events names are not printed
properly

In power9 platform:

command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
     1.000208486
     2.000368863
     2.001400558

Similarly in skylake platform:

command:./perf stat --metric-only -M Power -I 1000
     1.000579994
     2.002189493

With current upstream version, issue is with event name comparison logic
in find_evsel_group(). Current logic is to compare events belonging to a
metric group to the events in perf_evlist.  Since the break statement is
missing in the loop used for comparison between metric group and
perf_evlist events, the loop continues to execute even after getting a
pattern match, and end up in discarding the matches.

Incase of single metric event belongs to metric group, its working fine,
because in case of single event once it compare all events it reaches to
end of perf_evlist.

Example for single metric event in power9 platform:

command:# ./perf stat --metric-only  -M branches_per_inst -I 1000 sleep 1
     1.000094653                  0.2
     1.001337059                  0.0

This patch fixes the issue by making sure once we found all events
belongs to that metric event matched in find_evsel_group(), we
successfully break from that loop by adding corresponding condition.

With this patch:
In power9 platform:

command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
result:#
            time  derat_4k_miss_rate_percent  derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent  derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
     1.000135672                         0.0                  0.3              1.0                         0.0                   0.2                    0.0                    0.0
     2.000380617                         0.0                  0.0              0.0                         0.0                   0.0                    0.0                    0.0

command:# ./perf stat --metric-only -M Power -I 1000

Similarly in skylake platform:
result:#
            time    Turbo_Utilization    C3_Core_Residency  C6_Core_Residency  C7_Core_Residency    C2_Pkg_Residency  C3_Pkg_Residency     C6_Pkg_Residency   C7_Pkg_Residency
     1.000563580                  0.3                  0.0                2.6               44.2                21.9               0.0                  0.0               0.0
     2.002235027                  0.4                  0.0                2.7               43.0                20.7               0.0                  0.0               0.0

Committer testing:

  Before:

  [root@seventh ~]# perf stat --metric-only -M Power -I 1000
  #           time
       1.000383223
       2.001168182
       3.001968545
       4.002741200
       5.003442022
  ^C     5.777687244

  [root@seventh ~]#

  After the patch:

  [root@seventh ~]# perf stat --metric-only -M Power -I 1000
  #           time    Turbo_Utilization    C3_Core_Residency    C6_Core_Residency    C7_Core_Residency     C2_Pkg_Residency     C3_Pkg_Residency     C6_Pkg_Residency     C7_Pkg_Residency
       1.000406577                  0.4                  0.1                  1.4                 97.0                  0.0                  0.0                  0.0                  0.0
       2.001481572                  0.3                  0.0                  0.6                 97.9                  0.0                  0.0                  0.0                  0.0
       3.002332585                  0.2                  0.0                  1.0                 97.5                  0.0                  0.0                  0.0                  0.0
       4.003196624                  0.2                  0.0                  0.3                 98.6                  0.0                  0.0                  0.0                  0.0
       5.004063851                  0.3                  0.0                  0.7                 97.7                  0.0                  0.0                  0.0                  0.0
  ^C     5.471260276                  0.2                  0.0                  0.5                 49.3                  0.0                  0.0                  0.0                  0.0

  [root@seventh ~]#
  [root@seventh ~]# dmesg | grep -i skylake
  [    0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
  [root@seventh ~]#

Fixes: f01642e491 ("perf metricgroup: Support multiple events for metricgroup")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Joakim cherry pick from upstream:
3635b27cc0 perf metricgroup: Fix printing event names of metric group with multiple events

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
2020-02-12 19:26:56 +08:00
Kajol Jain 3635b27cc0 perf metricgroup: Fix printing event names of metric group with multiple events
[ Upstream commit eb573e746b ]

Commit f01642e491 ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric events names are not printed
properly

In power9 platform:

command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
     1.000208486
     2.000368863
     2.001400558

Similarly in skylake platform:

command:./perf stat --metric-only -M Power -I 1000
     1.000579994
     2.002189493

With current upstream version, issue is with event name comparison logic
in find_evsel_group(). Current logic is to compare events belonging to a
metric group to the events in perf_evlist.  Since the break statement is
missing in the loop used for comparison between metric group and
perf_evlist events, the loop continues to execute even after getting a
pattern match, and end up in discarding the matches.

Incase of single metric event belongs to metric group, its working fine,
because in case of single event once it compare all events it reaches to
end of perf_evlist.

Example for single metric event in power9 platform:

command:# ./perf stat --metric-only  -M branches_per_inst -I 1000 sleep 1
     1.000094653                  0.2
     1.001337059                  0.0

This patch fixes the issue by making sure once we found all events
belongs to that metric event matched in find_evsel_group(), we
successfully break from that loop by adding corresponding condition.

With this patch:
In power9 platform:

command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
result:#
            time  derat_4k_miss_rate_percent  derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent  derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
     1.000135672                         0.0                  0.3              1.0                         0.0                   0.2                    0.0                    0.0
     2.000380617                         0.0                  0.0              0.0                         0.0                   0.0                    0.0                    0.0

command:# ./perf stat --metric-only -M Power -I 1000

Similarly in skylake platform:
result:#
            time    Turbo_Utilization    C3_Core_Residency  C6_Core_Residency  C7_Core_Residency    C2_Pkg_Residency  C3_Pkg_Residency     C6_Pkg_Residency   C7_Pkg_Residency
     1.000563580                  0.3                  0.0                2.6               44.2                21.9               0.0                  0.0               0.0
     2.002235027                  0.4                  0.0                2.7               43.0                20.7               0.0                  0.0               0.0

Committer testing:

  Before:

  [root@seventh ~]# perf stat --metric-only -M Power -I 1000
  #           time
       1.000383223
       2.001168182
       3.001968545
       4.002741200
       5.003442022
  ^C     5.777687244

  [root@seventh ~]#

  After the patch:

  [root@seventh ~]# perf stat --metric-only -M Power -I 1000
  #           time    Turbo_Utilization    C3_Core_Residency    C6_Core_Residency    C7_Core_Residency     C2_Pkg_Residency     C3_Pkg_Residency     C6_Pkg_Residency     C7_Pkg_Residency
       1.000406577                  0.4                  0.1                  1.4                 97.0                  0.0                  0.0                  0.0                  0.0
       2.001481572                  0.3                  0.0                  0.6                 97.9                  0.0                  0.0                  0.0                  0.0
       3.002332585                  0.2                  0.0                  1.0                 97.5                  0.0                  0.0                  0.0                  0.0
       4.003196624                  0.2                  0.0                  0.3                 98.6                  0.0                  0.0                  0.0                  0.0
       5.004063851                  0.3                  0.0                  0.7                 97.7                  0.0                  0.0                  0.0                  0.0
  ^C     5.471260276                  0.2                  0.0                  0.5                 49.3                  0.0                  0.0                  0.0                  0.0

  [root@seventh ~]#
  [root@seventh ~]# dmesg | grep -i skylake
  [    0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
  [root@seventh ~]#

Fixes: f01642e491 ("perf metricgroup: Support multiple events for metricgroup")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-12 12:21:25 +01:00
Joakim Zhang b8552bd2ea tools: perf: metricgroup: add metricgroup for each PMU
If we want to add/print metricgroup, we need get pmu evets map via CPUID
string, so it is important to get CPUID string. Now in metricgroup, it
passes NULL to perf_pmu__find_map(), will never get the pmu events map on
ARM64 platforms, due to get_cpuid_str() implement for ARM64 depending on the
cpu info of PMU. The CPUID string will not be same on all CPUs on heterogeneous
platforms, adding provision(using pmu->cpus) to find cpuid string from
associated CPUs of PMU.

The implement of get_cpuid_str() for ARM64 has taken heterogeneous platforms
into consideration, but metricgroup has not. So it is necessory to add
heterogeneous support for metricgroup. When we want to add/print metricgroup,
need iterator each PMU then pass the perf_pmu struct to perf_pmu__find_map().

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
2019-11-28 12:11:37 +08:00
Jin Yao f01642e491 perf metricgroup: Support multiple events for metricgroup
Some uncore metrics don't work as expected. For example, on
cascadelakex:

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1

   Performance counter stats for 'system wide':

           1841092      unc_m_pmm_rpq_inserts
           3680816      unc_m_pmm_wpq_inserts

       1.001775055 seconds time elapsed

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1

   Performance counter stats for 'system wide':

         860649746      unc_m_pmm_rpq_occupancy.all
           1840557      unc_m_pmm_rpq_inserts
       12790627455      unc_m_clockticks

       1.001773348 seconds time elapsed

No metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' or 'UNC_M_PMM_READ_LATENCY' are
reported.

The issue is, the case of an alias expanding to mulitple events is not
supported, typically the uncore events.  (see comments in
find_evsel_group()).

For UNC_M_PMM_BANDWIDTH.TOTAL in above example, the expanded event group
is '{unc_m_pmm_rpq_inserts,unc_m_pmm_wpq_inserts}:W', but the actual
events passed to find_evsel_group are:

  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts

For this multiple events case, it's not supported well.

This patch introduces a new field 'metric_leader' in struct evsel. The
first event is considered as a metric leader. For the rest of same
events, they point to the first event via it's metric_leader field in
struct evsel.

This design is for adding the counting results of all same events to the
first event in group (the metric_leader).

With this patch,

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1

   Performance counter stats for 'system wide':

           1842108      unc_m_pmm_rpq_inserts     #    337.2 MB/sec  UNC_M_PMM_BANDWIDTH.TOTAL
           3682209      unc_m_pmm_wpq_inserts

       1.001819706 seconds time elapsed

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1

   Performance counter stats for 'system wide':

         861970685      unc_m_pmm_rpq_occupancy.all #    219.4 ns  UNC_M_PMM_READ_LATENCY
           1842772      unc_m_pmm_rpq_inserts
       12790196356      unc_m_clockticks

       1.001749103 seconds time elapsed

Now we can see the correct metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' and
'UNC_M_PMM_READ_LATENCY'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190828055932.8269-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 22:27:52 -03:00
Jin Yao 287f2649f7 perf metricgroup: Scale the metric result
Some metrics define the scale unit, such as

    {
        "BriefDescription": "Intel Optane DC persistent memory read latency (ns). Derived from unc_m_pmm_rpq_occupancy.all",
        "Counter": "0,1,2,3",
        "EventCode": "0xE0",
        "EventName": "UNC_M_PMM_READ_LATENCY",
        "MetricExpr": "UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS",
        "MetricName": "UNC_M_PMM_READ_LATENCY",
        "PerPkg": "1",
        "ScaleUnit": "6000000000ns",
        "UMask": "0x1",
        "Unit": "iMC"
    },

For above example, the ratio should be,

ratio = (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS) * 6000000000

But in current code, the ratio is not scaled ( * 6000000000)

With this patch, the ratio is scaled and the unit (ns) is printed.

For example,
  #    219.4 ns  UNC_M_PMM_READ_LATENCY

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190828055932.8269-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 22:27:52 -03:00
Arnaldo Carvalho de Melo b42090256f perf tools: Remove debug.h from header files not needing it
And fix the fallout, adding it to places that must have it since they
use its definitions.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1s3jel4i26chq2g0lydoz7i3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-29 17:38:32 -03:00
Arnaldo Carvalho de Melo 0b8026e8fb perf metricgroup: Remove needless includes from metricgroup.h
There we need just some struct forward declarations, do that instead and
add the includes needed by metricgroup.c.

That should help with needless rebuilds when changing the removed
headers from metricgroup.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1fkskjws6imir2hhztqhdyb0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-22 17:16:56 -03:00
Jiri Olsa 63503dba87 perf evlist: Rename struct perf_evlist to struct evlist
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.

Committer notes:

Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:42 -03:00
Jiri Olsa 32dcd021d0 perf evsel: Rename struct perf_evsel to struct evsel
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.

Committer notes:

Added fixes for arm64, provided by Jiri.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo acc7bfb3db perf metricgroup: Add missing list_del_init() when flushing egroups list
So that at the end each of the entries have its list node struct cleared
and the egroup list head ends emptied.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dxzj1ah350fy9ec0xbhb15b6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo d8f9da2404 perf tools: Use zfree() where applicable
In places where the equivalent was already being done, i.e.:

   free(a);
   a = NULL;

And in placs where struct members are being freed so that if we have
some erroneous reference to its struct, then accesses to freed members
will result in segfaults, which we can detect faster than use after free
to areas that may still have something seemingly valid.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09 10:13:27 -03:00
Andi Kleen 488c3bf7ec perf tools metric: Don't include duration_time in group
The Memory_BW metric generates groups including duration_time, which
maps to a software event.

For some reason this makes the group always not count.

Always put duration_time outside a group when generating metrics.  It's
always the same time, so no need to group it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190628220737.13259-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-02 16:08:16 -03:00
Andi Kleen 9c344d15f5 perf list: Avoid extra : for --raw metrics
When printing the metrics raw, don't print : after the metricgroups.
This helps the command line completion to complete those too.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190628220737.13259-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-02 16:08:16 -03:00
Andi Kleen 2f87f33f42 perf stat: Fix group lookup for metric group
The metric group code tries to find a group it added earlier in the
evlist. Fix the lookup to handle groups with partially overlaps
correctly. When a sub string match fails and we reset the match, we have
to compare the first element again.

I also renamed the find_evsel function to find_evsel_group to make its
purpose clearer.

With the earlier changes this fixes:

Before:

  % perf stat -M UPI,IPC sleep 1
  ...
         1,032,922      uops_retired.retire_slots #      1.1 UPI
         1,896,096      inst_retired.any
         1,896,096      inst_retired.any
         1,177,254      cpu_clk_unhalted.thread

After:

  % perf stat -M UPI,IPC sleep 1
  ...
        1,013,193      uops_retired.retire_slots #      1.1 UPI
           932,033      inst_retired.any
           932,033      inst_retired.any          #      0.9 IPC
         1,091,245      cpu_clk_unhalted.thread

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Fixes: b18f3e3650 ("perf stat: Support JSON metrics in perf stat")
Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-01 22:50:41 -03:00
Arnaldo Carvalho de Melo 80e9073f1f perf metricgroup: Use strsep()
No change in behaviour intended, trivial optimization done by avoiding
looking for spaces in 'g' right after setting it to "No_group".

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-f2siadtp3hb5o0l1w7bvd8bk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-26 11:31:43 -03:00
Arnaldo Carvalho de Melo bd9860bf05 perf tools: Use linux/ctype.h in more places
There were a few places where we still were using the libc version of
ctype.h, switch to the one in tools/lib/ctype.c that the rest of perf
uses.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wa4nz4kt61eze88eprk20tfd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25 21:13:51 -03:00
Thomas Gleixner 2025cf9e19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Jiri Olsa 33bbc571ed perf list: Display metric expressions for --details option
Display metric expression itself when --details is specified.

Current list with no details:

  # perf list metrics
  ...
  TopDownL1:
    IPC
         [Instructions Per Cycle (per logical thread)]
    SLOTS
         [Total issue-pipeline slots]
  ...

Detailed output with metric formula:

  # perf list --details metrics
  ...
  TopDownL1:
    IPC
         [Instructions Per Cycle (per logical thread)]
         [inst_retired.any / cpu_clk_unhalted.thread]
    SLOTS
         [Total issue-pipeline slots]
         [4*(( cpu_clk_unhalted.thread_any / 2 ) if #smt_on else cycles)]
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190213123246.4015-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-14 15:18:09 -03:00
Davidlohr Bueso ca2270292e perf util: Use cached rbtree for rblists
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something required for any of the strlist or intlist traversals
(XXX_for_each_entry()). There are a number of users in perf of these
(particularly strlists), including probes, and buildid.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-5-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-25 15:12:10 +01:00
Michael Petlan 95f04328e4 perf list: Unify metric group description format with PMU event description
PMU event descriptions use 7 spaces + '[' or 8 spaces as indentation.
Metric groups used a tab + '['. This patch unifies it to the way PMU
event descriptions are indented.

BEFORE:

  $ perf list
  [...]
  Metric Groups:

  DSB:
    DSB_Coverage
	  [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  [...]

AFTER:

  $ perf list
  [...]
  Metric Groups:

  DSB:
    DSB_Coverage
         [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  [...]

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
LPU-Reference: 771439042.22924766.1532986504631.JavaMail.zimbra@redhat.com
Link: https://lkml.kernel.org/n/tip-mlo850517m6u1rbjndvd1bwr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-31 11:35:44 -03:00
Thomas Richter 742d92ff21 perf stat: Add transaction flag (-T) support for s390
The 'perf stat' command line flag -T to display transaction counters is
currently supported for x86 only.

Add support for s390. It is based on the metrics flag -M transaction
using the architecture dependent JSON files. This requires a metric
named "transaction" in the JSON files for the platform.

Introduce a new function metricgroup__has_metric() to check for the
existence of a metric_name transaction.

As suggested by Andi Kleen, this is the new approach to support
transactions counters. Other architectures will follow.

Output before:

  [root@p23lp27 perf]# ./perf stat -T -- sleep 1
  Cannot set up transaction events
  [root@p23lp27 perf]#

Output after:

  [root@s35lp76 perf]# ./perf stat -T -- ~/mytesttx 1 >/tmp/111

   Performance counter stats for '/root/mytesttx 1':

                   1      tx_c_tend           #     13.0 transaction
                   1      tx_nc_tend
                  11      tx_nc_tabort
                   0      tx_c_tabort_special
                   0      tx_c_tabort_no_special

         0.001070109 seconds time elapsed

  [root@s35lp76 perf]#

Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180626071701.58190-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24 14:49:37 -03:00
Pravin Shedge 3315d14f8e perf perf: Remove duplicate includes
These duplicate includes have been found with scripts/checkincludes.pl
but they have been removed manually to avoid removing false positives.

Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:49 -03:00
Ganapatrao Kulkarni 54e32dc0f8 perf pmu: Pass pmu as a parameter to get_cpuid_str()
The cpuid string will not be same on all CPUs on heterogeneous platforms
like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid
string from associated CPUs of PMU CORE device.

Also optimise arguments to function pmu_add_cpu_aliases.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: http://lkml.kernel.org/r/20171016183222.25750-2-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-05 10:24:33 -03:00
Andi Kleen 4bd1bef8bb perf script: Allow computing 'perf stat' style metrics
Add support for computing 'perf stat' style metrics in 'perf script'.

When using leader sampling we can get metrics for each sampling period
by computing formulas over the values of the different group members.

This allows things like fine grained IPC tracking through sampling, much
more fine grained than with 'perf stat'.

The metric is still averaged over the sampling period, it is not just
for the sampling point.

This patch adds a new metric output field for 'perf script' that uses
the existing 'perf stat' metrics infrastructure to compute any metrics
supported by 'perf stat'.

For example to sample IPC:

  $ perf record -e '{ref-cycles,cycles,instructions}:S' -a sleep 1
  $ perf script -F metric,ip,sym,time,cpu,comm
  ...
   alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
   alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
   alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
   alsa-sink-ALC32 [000] 42815.856074:    metric:    0.13  insn per cycle
           swapper [000] 42815.857961:  ffffffff81655df0 __schedule
           swapper [000] 42815.857961:  ffffffff81655df0 __schedule
           swapper [000] 42815.857961:  ffffffff81655df0 __schedule
           swapper [000] 42815.857961:    metric:    0.23  insn per cycle
   qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
   qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
   qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
   qemu-system-x86 [000] 42815.858130:    metric:    0.46  insn per cycle
             :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
             :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
             :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
             :4972 [000] 42815.858312:    metric:    0.45  insn per cycle

TopDown:

This requires disabling SMT if you have it enabled, because SMT would
require sampling per core, which is not supported.

  $ perf record -e '{ref-cycles,topdown-fetch-bubbles,\
                     topdown-recovery-bubbles,\
                     topdown-slots-retired,topdown-total-slots,\
                     topdown-slots-issued}:S' -a sleep 1
  $ perf script --header -I -F cpu,ip,sym,event,metric,period
  ...
  [000]     121108               ref-cycles:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     190350    topdown-fetch-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]       2055 topdown-recovery-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     148729    topdown-slots-retired:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     144324      topdown-total-slots:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     160852     topdown-slots-issued:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]   metric:     33.0% frontend bound
  [000]   metric:      3.5% bad speculation
  [000]   metric:     25.8% retiring
  [000]   metric:     37.7% backend bound
  [000]     112112               ref-cycles:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     357222    topdown-fetch-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]       3325 topdown-recovery-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     323553    topdown-slots-retired:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     270507      topdown-total-slots:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     341226     topdown-slots-issued:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]   metric:     33.0% frontend bound
  [000]   metric:      2.9% bad speculation
  [000]   metric:     29.9% retiring
  [000]   metric:     34.2% backend bound
...

v2:
Use evsel->priv for new fields
Port to new base line, support fp output.
Handle stats in ->stats, not ->priv
Minor cleanups

Extra explanation about the use of the term 'averaging', from Andi in the
thread in the Link: tag below:

<quote Andi>
The current samples contains the sum of event counts for a sampling period.

EventA-1           EventA-2                EventA-3      EventA-4
EventB-1     EventB-2                             EventC-3

                         gap with no events                overflow
|-----------------------------------------------------------------|
period-start                                             period-end
^                                                                 ^
|                                                                 |
previous sample                                      current sample

So EventA = 4 and EventB = 3 at the sample point

I generate a metric, let's say EventA / EventB. It applies to the whole period.

But the metric is over a longer time which does not have the same behavior. For
example the gap above doesn't have any events, while they are clustered at the
beginning and end of the sample period.

But we're summing everything together. The metric doesn't know that the gap is
different than the busy period.

That's what I'm trying to express with averaging.
</quote>

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171117214300.32746-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-29 18:18:01 -03:00
Andi Kleen 411bc316f3 perf stat: Fix adding multiple event groups
The -M metric group parser threw away the events of earlier groups when
multiple groups were specified. Fix this here by not overwriting the
string incorrectly.

Now this works correctly:

% perf stat -M Summary,SMT --metric-only -a sleep 1

 Performance counter stats for 'system wide':

Instructions CPI CLKS         CPU_Utilization GFLOPs SMT_2T_Utilization SMT_2T_Utilization Kernel_Utilization CoreIPC CORE_CLKS
900907376.0  2.7 2398954144.0 0.1             0.0    0.2                0.2                0.1                0.4     2080822855.5

while previously it would only show the SMT metrics.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170914205735.18431-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-21 13:12:58 -03:00
Andi Kleen 333b566559 perf pmu: Improve error messages for missing PMUs
When a PMU is missing print a better error message mentioning
the missing PMU.

% mkdir empty
% mount --bind empty /sys/devices/msr
% perf stat -M Summary true
event syntax error: '{inst_retired.any,cycles}:W,{cpu_clk_unhalted.thread}:W,{inst_retired.any}:W,{cpu_clk_unhalted.ref_tsc,msr/tsc/}:W,{fp_comp_ops_exe.sse_scalar..'
                     \___ Cannot find PMU `msr'. Missing kernel support?

It still cannot find the right column for aliases, but it's already a vast improvement.

v2: Check asprintf

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170913215006.32222-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-18 09:40:20 -03:00
Andi Kleen 71b0acce78 perf list: Add metric groups to perf list
Add code to perf list to print metric groups, and metrics
that don't have an event name. The metricgroup code collects
the eventgroups and events into a rblist, and then prints
them according to the configured filters.

The metricgroups are printed by default, but can be
limited by perf list metric or perf list metricgroup

  % perf list metricgroup
  ..
  Metric Groups:

  DSB:
    DSB_Coverage
          [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  FLOPS:
    GFLOPs
          [Giga Floating Point Operations Per Second]
  Frontend:
    IFetch_Line_Utilization
          [Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
  Frontend_Bandwidth:
    DSB_Coverage
          [Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
  Memory_BW:
    MLP
          [Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]

v2: Check return value of asprintf to fix warning on FC26
Fix key in lookup/addition for the groups list

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-13 09:49:13 -03:00
Andi Kleen b18f3e3650 perf stat: Support JSON metrics in perf stat
Add generic support for standalone metrics specified in JSON files to
perf stat. A metric is a formula that uses multiple events to compute a
higher level result (e.g. IPC).

Previously metrics were always tied to an event and automatically
enabled with that event. But now change it that we can have standalone
metrics. They are in the same JSON data structure as events, but don't
have an event name.

We also allow to organize the metrics in metric groups, which allows a
short cut to select several related metrics at once.

Add a new -M / --metrics option to perf stat that adds the metrics or
metric groups specified.

Add the core code to manage and parse the metric groups. They are
collected from the JSON data structures into a separate rblist.  When
computing shadow values look for metrics in that list.  Then they are
computed using the existing saved values infrastructure in stat-shadow.c

The actual JSON metrics are in a separate pull request.

  % perf stat -M Summary --metric-only -a sleep 1

   Performance counter stats for 'system wide':

  Instructions   CLKS          CPU_Utilization  GFLOPs   SMT_2T_Utilization   Kernel_Utilization
  317614222.0    1392930775.0  0.0              0.0      0.2                  0.1

       1.001497549 seconds time elapsed

  % perf stat -M GFLOPs flops

   Performance counter stats for 'flops':

     3,999,541,471  fp_comp_ops_exe.sse_scalar_single #  1.2 GFLOPs   (66.65%)
                14  fp_comp_ops_exe.sse_scalar_double                 (66.65%)
                 0  fp_comp_ops_exe.sse_packed_double                 (66.67%)
                 0  fp_comp_ops_exe.sse_packed_single                 (66.70%)
                 0  simd_fp_256.packed_double                         (66.70%)
                 0  simd_fp_256.packed_single                         (66.67%)
                 0  duration_time

       3.238372845 seconds time elapsed

v2: Add missing header file
v3: Move find_map to pmu.c

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170831194036.30146-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-13 09:49:13 -03:00