1
0
Fork 0
Commit Graph

274 Commits (a6f0cb3ff6cd4520e5b3d7892a4f10c9e87a6372)

Author SHA1 Message Date
Adrian Hunter d443cefd9f perf intel-pt: Fix 'CPU too large' error
commit 5501e9229a upstream.

In some cases, the number of cpus (nr_cpus_online) is confused with the
maximum cpu number (nr_cpus_avail), which results in the error in the
example below:

Example on system with 8 cpus:

 Before:
   # echo 0 > /sys/devices/system/cpu/cpu2/online
   # ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.147 MB perf.data ]
   # ./perf script --itrace=e
   Requested CPU 7 too large. Consider raising MAX_NR_CPUS
   0x25908 [0x8]: failed to process type: 68 [Invalid argument]

 After:
   # ./perf script --itrace=e
   #

Fixes: 8c7274691f ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Fixes: 7df4e36a47 ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19 18:26:16 +01:00
Arnaldo Carvalho de Melo 6498b7a571 perf map: No need to adjust the long name of modules
commit f068435d9b upstream.

At some point in the past we needed to make sure we would get the long
name of modules and not just what we get from /proc/modules, but that
need, as described in the cset that introduced the adjustment function:

Fixes: c03d5184f0 ("perf machine: Adjust dso->long_name for offline module")

Without using the buildid-cache:

  # lsmod | grep trusted
  # insmod trusted.ko
  # lsmod | grep trusted
  trusted                24576  0
  # strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
  openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
  openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
  openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
    probe:key_seal       (on key_seal in trusted)
  # perf probe -l
    probe:key_seal       (on key_seal in trusted)
  #

No attempt at opening '[trusted]'.

Now using the build-id cache:

  # rmmod trusted
  # perf buildid-cache --add ./trusted.ko
  # insmod trusted.ko
  # strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
  openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
  openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
  openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
  openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
  #

Again, no attempt at reading '[trusted]'.

Finally, adding a probe to that function and then using:

[root@quaco ~]# perf trace -e probe_perf:*/max-stack=16/ --max-events=2
     0.000 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
                                       dso__adjust_kmod_long_name (/home/acme/bin/perf)
                                       machine__process_kernel_mmap_event (/home/acme/bin/perf)
                                       machine__process_mmap_event (/home/acme/bin/perf)
                                       perf_event__process_mmap (/home/acme/bin/perf)
                                       machines__deliver_event (/home/acme/bin/perf)
                                       perf_session__deliver_event (/home/acme/bin/perf)
                                       perf_session__process_event (/home/acme/bin/perf)
                                       process_simple (/home/acme/bin/perf)
                                       reader__process_events (/home/acme/bin/perf)
                                       __perf_session__process_events (/home/acme/bin/perf)
                                       perf_session__process_events (/home/acme/bin/perf)
                                       process_buildids (/home/acme/bin/perf)
                                       record__finish_output (/home/acme/bin/perf)
                                       __cmd_record (/home/acme/bin/perf)
                                       cmd_record (/home/acme/bin/perf)
                                       run_builtin (/home/acme/bin/perf)
     0.055 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
                                       dso__adjust_kmod_long_name (/home/acme/bin/perf)
                                       machine__process_kernel_mmap_event (/home/acme/bin/perf)
                                       machine__process_mmap_event (/home/acme/bin/perf)
                                       perf_event__process_mmap (/home/acme/bin/perf)
                                       machines__deliver_event (/home/acme/bin/perf)
                                       perf_session__deliver_event (/home/acme/bin/perf)
                                       perf_session__process_event (/home/acme/bin/perf)
                                       process_simple (/home/acme/bin/perf)
                                       reader__process_events (/home/acme/bin/perf)
                                       __perf_session__process_events (/home/acme/bin/perf)
                                       perf_session__process_events (/home/acme/bin/perf)
                                       process_buildids (/home/acme/bin/perf)
                                       record__finish_output (/home/acme/bin/perf)
                                       __cmd_record (/home/acme/bin/perf)
                                       cmd_record (/home/acme/bin/perf)
                                       run_builtin (/home/acme/bin/perf)
  #

This was the only path I could find using the perf tools that reach at this
function, then as of november/2019, if we put a probe in the line where the
actuall setting of the dso->long_name is done:

  # perf trace -e probe_perf:*
  ^C[root@quaco ~]
  # perf stat -e probe_perf:*  -I 2000
       2.000404265                  0      probe_perf:dso__adjust_kmod_long_name
       4.001142200                  0      probe_perf:dso__adjust_kmod_long_name
       6.001704120                  0      probe_perf:dso__adjust_kmod_long_name
       8.002398316                  0      probe_perf:dso__adjust_kmod_long_name
      10.002984010                  0      probe_perf:dso__adjust_kmod_long_name
      12.003597851                  0      probe_perf:dso__adjust_kmod_long_name
      14.004113303                  0      probe_perf:dso__adjust_kmod_long_name
      16.004582773                  0      probe_perf:dso__adjust_kmod_long_name
      18.005176373                  0      probe_perf:dso__adjust_kmod_long_name
      20.005801605                  0      probe_perf:dso__adjust_kmod_long_name
      22.006467540                  0      probe_perf:dso__adjust_kmod_long_name
  ^C    23.683261941                  0      probe_perf:dso__adjust_kmod_long_name

  #

Its not being used at all.

To further test this I used kvm.ko as the offline module, i.e. removed
if from the buildid-cache by nuking it completely (rm -rf ~/.debug) and
moved it from the normal kernel distro path, removed the modules, stoped
the kvm guest, and then installed it manually, etc.

  # rmmod kvm-intel
  # rmmod kvm
  # lsmod | grep kvm
  # modprobe kvm-intel
  modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
  modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
  modprobe: ERROR: could not insert 'kvm_intel': Unknown symbol in module, or unknown parameter (see dmesg)
  # insmod ./kvm.ko
  # modprobe kvm-intel
  modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
  modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
  # lsmod | grep kvm
  kvm_intel             299008  0
  kvm                   765952  1 kvm_intel
  irqbypass              16384  1 kvm
  #
  # perf probe -x ~/bin/perf machine__findnew_module_map:12 mname=m.name:string filename=filename:string 'dso_long_name=map->dso->long_name:string' 'dso_name=map->dso->name:string'
  # perf probe -l
    probe_perf:machine__findnew_module_map (on machine__findnew_module_map:12@util/machine.c in /home/acme/bin/perf with mname filename dso_long_name dso_name)
  # perf record
  ^C[ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 3.416 MB perf.data (33956 samples) ]
  # perf trace -e probe_perf:machine*
  <SNIP>
       6.322 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[salsa20_generic]", filename: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_long_name: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_name: "[salsa20_generic]")
       6.375 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[kvm]", filename: "[kvm]", dso_long_name: "[kvm]", dso_name: "[kvm]")
  <SNIP>

The filename doesn't come with the path, no point in trying to set the dso->long_name.

  [root@quaco ~]# strace -e open,openat perf probe -m ./kvm.ko kvm_apic_local_deliver |& egrep 'open.*kvm'
  openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 4
  openat(AT_FDCWD, "/sys/module/kvm/notes/.note.gnu.build-id", O_RDONLY) = 4
  openat(AT_FDCWD, "/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 7
  openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 8
  openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/.debug/root/kvm.ko/5955f426cb93f03f30f3e876814be2db80ab0b55/probes", O_RDWR|O_CREAT, 0644) = 3
  openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/.debug/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, ".debug/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
  openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 4
  openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
  [root@quaco ~]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jlfew3lyb24d58egrp0o72o2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-26 10:01:02 +01:00
Adrian Hunter 632a300260 perf callchain: Fix segfault in thread__resolve_callchain_sample()
commit aceb98261e upstream.

Do not dereference 'chain' when it is NULL.

  $ perf record -e intel_pt//u -e branch-misses:u uname
  $ perf report --itrace=l --branch-history
  perf: Segmentation fault

Fixes: e9024d519d ("perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191114142538.4097-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-09 10:20:02 +01:00
Jiri Olsa 20f2be1d48 libperf: Move 'page_size' global variable to libperf
We need the 'page_size' variable in libperf, so move it there.

Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.

Committer notes:

Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Arnaldo Carvalho de Melo 055c67ed39 perf tools: Move event synthesizing routines to separate .c file
For better grouping, in time we may end up making most of these static,
i.e. generalizing the 'perf record' synthesizing code so that based on
the target it can do the right thing and call the needed synthesizers.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-s9zxxhk40s95pjng9panet16@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-20 10:28:21 -03:00
Arnaldo Carvalho de Melo ea49e01cfa perf tools: Move event synthesizing routines to separate header
Those are the only routines using the perf_event__handler_t typedef and
are all related, so move to a separate header to reduce the header
dependency tree, lots of places were getting event.h and even stdio.h,
limits.h indirectly, so fix those as well.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-20 09:19:22 -03:00
Arnaldo Carvalho de Melo d3300a3c4e perf symbols: Move mem_info and branch_info out of symbol.h
The mem_info struct goes to mem-events.h and branch_info goes to
branch.h, where they belong, this way we can remove several headers from
symbols.h and trim the include dependency tree more.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 22:27:48 -03:00
Arnaldo Carvalho de Melo f2a39fe849 perf auxtrace: Uninline functions that touch perf_session
So that we don't carry the session.h include directive in auxtrace.h,
which in turn opens a can of worms of files that were getting all sorts
of things via that include, fix them all.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-d2d83aovpgri2z75wlitquni@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 22:24:10 -03:00
Arnaldo Carvalho de Melo 4a3cec8494 perf dsos: Move the dsos struct and its methods to separate source files
So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 22:24:10 -03:00
Arnaldo Carvalho de Melo 8520a98dba perf debug: Remove needless include directives from debug.h
All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-31 19:10:19 -03:00
Kyle Meyer 8c7274691f perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online
nr_cpus, the number of CPUs online during a record session bound by
MAX_NR_CPUS, can be used as a dynamic alternative for MAX_NR_CPUS in
__machine__synthesize_threads and machine__set_current_tid.

Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <russ.anderson@hpe.com>
Link: http://lore.kernel.org/lkml/20190827214352.94272-6-meyerk@stormcage.eag.rdlabs.hpecorp.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-29 17:38:32 -03:00
Arnaldo Carvalho de Melo 3f604b5f61 perf tool: Rename perf_tool::bpf_event to bpf
No need for that _event suffix, do just like all the other meta event
handlers and suppress that suffix.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/n/tip-03spzxtqafbabbbmnm7y4xfx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 19:39:11 -03:00
Arnaldo Carvalho de Melo ebdba16e95 perf tools: Rename perf_event::ksymbol_event to perf_event::ksymbol
Just like all the other meta events, that extra _event suffix is just
redundant, ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/n/tip-0q8b2xnfs17q0g523oej75s0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 19:39:11 -03:00
Jiri Olsa a2e254d841 libperf: Add PERF_RECORD_LOST_SAMPLES 'struct lost_samples_event' to perf/event.h
Move the PERF_RECORD_LOST_SAMPLES event definition into libperf's
event.h header include.

In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used
events to their generic '__u*' versions.

Perf added 'u*' types mainly to ease up printing __u64 values
as stated in the linux/types.h comment:

  /*
   * We define u64 as uint64_t for every architecture
   * so that we can print it with "%"PRIx64 without getting warnings.
   *
   * typedef __u64 u64;
   * typedef __s64 s64;
   */

Add and use new PRI_lu64 and PRI_lx64 macros for that.  Use extra '_' to
ease up the reading and differentiate them from standard PRI*64 macros.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190825181752.722-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 19:39:10 -03:00
Jiri Olsa 5290ed6955 libperf: Add PERF_RECORD_LOST 'struct lost_event' to perf/event.h
Move the lost_event event definition to libperf's event.h header
include.

In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used
events to their generic '__u*' versions.

Perf added 'u*' types mainly to ease up printing __u64 values as stated
in the linux/types.h comment:

  /*
   * We define u64 as uint64_t for every architecture
   * so that we can print it with "%"PRIx64 without getting warnings.
   *
   * typedef __u64 u64;
   * typedef __s64 s64;
   */

Add and use new PRI_lu64 and PRI_lx64 macros for that.  Use extra '_' to
ease up the reading and differentiate them from standard PRI*64 macros.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190825181752.722-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 19:39:09 -03:00
Arnaldo Carvalho de Melo 97b9d866a6 perf srcline: Add missing srcline.h header to files needing its defs
When srcline was introduced it wrongly added the include to util/sort.h,
even with that header not needing the definitions it provides, fix it by
adding it to the places that need it as a pre patch to remove srcline.h
from sort.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-shuebppedtye8hrgxk15qe3x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 11:58:29 -03:00
Arnaldo Carvalho de Melo aeb00b1aea perf record: Move record_opts and other record decls out of perf.h
And into a separate util/record.h, to better isolate things and make
sure that those who use record_opts and the other moved declarations
are explicitly including the necessary header.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-26 11:58:22 -03:00
Arnaldo Carvalho de Melo 272172bd41 Merge remote-tracking branch 'torvalds/master' into perf/core
To get closer to upstream and check if we need to sync more UAPI
headers, pick up fixes for libbpf that prevent perf's container tests
from completing successfuly, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-12 16:25:00 -03:00
Thomas Richter 12a6d2940b perf record: Fix module size on s390
On s390 the modules loaded in memory have the text segment located after
the GOT and Relocation table. This can be seen with this output:

  [root@m35lp76 perf]# fgrep qeth /proc/modules
  qeth 151552 1 qeth_l2, Live 0x000003ff800b2000
  ...
  [root@m35lp76 perf]# cat /sys/module/qeth/sections/.text
  0x000003ff800b3990
  [root@m35lp76 perf]#

There is an offset of 0x1990 bytes. The size of the qeth module is
151552 bytes (0x25000 in hex).

The location of the GOT/relocation table at the beginning of a module is
unique to s390.

commit 203d8a4aa6 ("perf s390: Fix 'start' address of module's map")
adjusts the start address of a module in the map structures, but does
not adjust the size of the modules. This leads to overlapping of module
maps as this example shows:

[root@m35lp76 perf] # ./perf report -D
     0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000)
          @ 0]:  x /lib/modules/.../qeth.ko.xz
     0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000)
          @ 0]:  x /lib/modules/.../ip6_tables.ko.xz

The module qeth.ko has an adjusted start address modified to b3990, but
its size is unchanged and the module ends at 0x3ff800d8990.  This end
address overlaps with the next modules start address of 0x3ff800d85a0.

When the size of the leading GOT/Relocation table stored in the
beginning of the text segment (0x1990 bytes) is subtracted from module
qeth end address, there are no overlaps anymore:

   0x3ff800d8990 - 0x1990 = 0x0x3ff800d7000

which is the same as

   0x3ff800b2000 + 0x25000 = 0x0x3ff800d7000.

To fix this issue, also adjust the modules size in function
arch__fix_module_text_start(). Add another function parameter named size
and reduce the size of the module when the text segment start address is
changed.

Output after:
     0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670)
          @ 0]:  x /lib/modules/.../qeth.ko.xz
     0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60)
          @ 0]:  x /lib/modules/.../ip6_tables.ko.xz

Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 203d8a4aa6 ("perf s390: Fix 'start' address of module's map")
Link: http://lkml.kernel.org/r/20190724122703.3996-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-08 15:41:11 -03:00
Jiri Olsa 1fc632cef4 libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel
Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'.

Committer notes:

Fixed up these:

 tools/perf/arch/arm/util/auxtrace.c
 tools/perf/arch/arm/util/cs-etm.c
 tools/perf/arch/arm64/util/arm-spe.c
 tools/perf/arch/s390/util/auxtrace.c
 tools/perf/util/cs-etm.c

Also

  cc1: warnings being treated as errors
  tests/sample-parsing.c: In function 'do_test':
  tests/sample-parsing.c:162: error: missing initializer
  tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')

   	struct evsel evsel = {
   		.needs_swap = false,
  -		.core.attr = {
  -			.sample_type = sample_type,
  -			.read_format = read_format,
  +		.core = {
  +			. attr = {
  +				.sample_type = sample_type,
  +				.read_format = read_format,
  +			},

  [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
  gcc (GCC) 4.4.7

Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:45 -03:00
Jiri Olsa 32dcd021d0 perf evsel: Rename struct perf_evsel to struct evsel
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.

Committer notes:

Added fixes for arm64, provided by Jiri.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:42 -03:00
Jiri Olsa 9749b90e56 perf tools: Rename struct thread_map to struct perf_thread_map
Rename struct thread_map to struct perf_thread_map, so it could be part
of libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo d8f9da2404 perf tools: Use zfree() where applicable
In places where the equivalent was already being done, i.e.:

   free(a);
   a = NULL;

And in placs where struct members are being freed so that if we have
some erroneous reference to its struct, then accesses to freed members
will result in segfaults, which we can detect faster than use after free
to areas that may still have something seemingly valid.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo 7f7c536f23 tools lib: Adopt zalloc()/zfree() from tools/perf
Eroding a bit more the tools/perf/util/util.h hodpodge header.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09 10:13:26 -03:00
Arnaldo Carvalho de Melo e3b22a6534 Merge remote-tracking branch 'tip/perf/core' into perf/urgent
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-08 13:06:57 -03:00
Arnaldo Carvalho de Melo 4c00af0e94 perf thread: Allow references to thread objects after machine__exit()
Threads are created when we either synthesize PERF_RECORD_FORK events
for pre-existing threads or when we receive PERF_RECORD_FORK events from
the kernel as new threads get created.

We then keep them in machine->threads[].entries rb trees till when we
receive a PERF_RECORD_EXIT, i.e. that thread terminated.

The thread object has a reference count that is grabbed when, for
instance, we keep that thread referenced in struct hist_entry, in 'perf
report' and 'perf top'.

When we receive a PERF_RECORD_EXIT we remove the thread object from the
rb tree and move it to the corresponding machine->threads[].dead list,
then we do a thread__put(), dropping the reference we had for keeping it
in the rb tree.

In thread__put() we were assuming that when the reference count hit zero
we should remove it from the dead list by simply doing a
list_del_init(&thread->node).

That works well when all the thread lifetime is during the machine that
has the list heads lifetime, since we know that we can do the
list_del_init() and it will update the 'dead' list_head.

But in 'perf sched lat' we were doing:

    machine__new() (via perf_session__new)

    process events, grabbing refcounts to keep those thread objects
    in 'perf sched' local data structures.

    machine__exit() (via perf_session__delete) which would delete the
    'dead' list heads.

    And then doing the final thread__put() for the refcounts 'perf sched'
    rightfully obtained for keeping those thread object references.

    b00m, since thread__put() would do the list_del_init() touching
    a dead dead list head.

Fix it by removing all the dead threads from machine->threads[].dead at
machine__exit(), since whatever is there should have refcounts taken by
things like 'perf sched lat', and make thread__put() check if the thread
is in a linked list before removing it from that list.

Reported-by: Wei Li <liwei391@huawei.com>
Link: https://lkml.kernel.org/r/20190508143648.8153-1-liwei391@huawei.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhipeng Xie <xiezhipeng1@huawei.com>
Link: https://lkml.kernel.org/r/20190704194355.GI10740@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-06 14:29:32 -03:00
Arnaldo Carvalho de Melo 3052ba56bc tools perf: Move from sane_ctype.h obtained from git to the Linux's original
We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps its better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.

This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.

Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25 21:02:47 -03:00
Arnaldo Carvalho de Melo 1b2fc358dd perf tools: Add missing util.h to pick up 'page_size' variable
Not to depend of getting it indirectly.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tirjsmvu4ektw0k7lm8k9lhu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-06-25 18:35:34 -03:00
Donald Yandt 34b65affe1 perf machine: Return NULL instead of null-terminating /proc/version array
Return NULL instead of null-terminating version char array when fgets
fails due to end-of-file or error.

Signed-off-by: Donald Yandt <donald.yandt@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Fixes: 30ba5b0e66 ("perf machine: Null-terminate version char array upon fgets(/proc/version) error")
Link: http://lkml.kernel.org/r/20190528134128.30841-1-donald.yandt@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:44 -03:00
Jiri Olsa 8529f2e673 perf machine: Keep zero in pgoff BPF map
With pgoff set to zero, the map__map_ip function will return BPF
addresses based from 0, which is what we need when we read the data from
a BPF DSO.

Adding BPF symbols with mapped IP addresses as well.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 18:37:42 -03:00
Jiri Olsa ed9adb2035 perf machine: Read also the end of the kernel
We mark the end of kernel based on the first module, but that could
cover some bpf program maps. Reading _etext symbol if it's present to
get precise kernel map end.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190508132010.14512-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-28 09:52:23 -03:00
Donald Yandt 30ba5b0e66 perf machine: Null-terminate version char array upon fgets(/proc/version) error
If fgets() fails due to any other error besides end-of-file, the version
char array may not even be null-terminated.

Signed-off-by: Donald Yandt <donald.yandt@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Fixes: a1645ce12a ("perf: 'perf kvm' tool for monitoring guest performance from host")
Link: http://lkml.kernel.org/r/20190514110100.22019-1-donald.yandt@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-15 16:36:47 -03:00
Wei Li 977c7a6d1e perf machine: Update kernel map address and re-order properly
Since commit 1fb87b8e95 ("perf machine: Don't search for active kernel
start in __machine__create_kernel_maps"), the __machine__create_kernel_maps()
just create a map what start and end are both zero. Though the address will be
updated later, the order of map in the rbtree may be incorrect.

The commit ee05d21791 ("perf machine: Set main kernel end address properly")
fixed the logic in machine__create_kernel_maps(), but it's still wrong in
function machine__process_kernel_mmap_event().

To reproduce this issue, we need an environment which the module address
is before the kernel text segment. I tested it on an aarch64 machine with
kernel 4.19.25:

  [root@localhost hulk]# grep _stext /proc/kallsyms
  ffff000008081000 T _stext
  [root@localhost hulk]# grep _etext /proc/kallsyms
  ffff000009780000 R _etext
  [root@localhost hulk]# tail /proc/modules
  hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000
  nvme_core 126976 7 nvme, Live 0xffff0000018b6000
  mdio 20480 1 ixgbe, Live 0xffff0000018ab000
  hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000
  hns_mdio 20480 2 - Live 0xffff000001822000
  hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000
  dm_mirror 40960 0 - Live 0xffff000001804000
  dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000
  dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000
  dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000
  [root@localhost hulk]#

Before fix:

  [root@localhost bin]# perf record sleep 3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
  [root@localhost bin]# perf buildid-list -i perf.data
  4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso]
  [root@localhost bin]# perf buildid-list -i perf.data -H
  0000000000000000000000000000000000000000 /proc/kcore
  [root@localhost bin]#

After fix:

  [root@localhost tools]# ./perf/perf record sleep 3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
  [root@localhost tools]# ./perf/perf buildid-list -i perf.data
  28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms]
  106c14ce6e4acea3453e484dc604d66666f08a2f [vdso]
  [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H
  28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore

Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-28 14:41:21 -03:00
Arnaldo Carvalho de Melo daecf9e0fa perf tools: Add missing include for symbols.h
Several places were using definitions found in symbols.h but not
including it, getting it by sheer luck from some other headers that now
are in the process of removing that include because they don't need it
or because simply having struct forward declarations is enough, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Davidlohr Bueso f3acb3a8a2 perf machine: Use cached rbtrees
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something required for nearly every operation dealing with
machine->guests and threads->entries.

The conversion is straightforward, however, it's worth noticing that the
rb_erase_init() calls have been replaced by rb_erase_cached() which has
no _init() flavor, however, the node is explicitly cleared next anyway,
which was redundant until now.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20181206191819.30182-3-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-25 15:12:09 +01:00
Song Liu 45178a928a perf tools: Handle PERF_RECORD_BPF_EVENT
This patch adds basic handling of PERF_RECORD_BPF_EVENT.  Tracking of
PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to
turn it on.

Committer notes:

Add dummy machine__process_bpf_event() variant that returns zero for
systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking
the build in such systems.

Remove the needless include <machine.h> from bpf->event.h, provide just
forward declarations for the structs and unions in the parameters, to
reduce compilation time and needless rebuilds when machine.h gets
changed.

Committer testing:

When running with:

 # perf record --bpf-event

On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL
is not present, we fallback to removing those two bits from
perf_event_attr, making the tool to continue to work on older kernels:

  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5779  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off bpf_event
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5779  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off ksymbol
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
  ------------------------------------------------------------

And then proceeds to work without those two features.

As passing --bpf-event is an explicit action performed by the user, perhaps we
should emit a warning telling that the kernel has no such feature, but this can
be done on top of this patch.

Now with a kernel that supports these events, start the 'record --bpf-event -a'
and then run 'perf trace sleep 10000' that will use the BPF
augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus
should generate PERF_RECORD_BPF_EVENT events:

  [root@quaco ~]# perf record -e dummy -a --bpf-event
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.713 MB perf.data ]

  [root@quaco ~]# bpftool prog
  13: cgroup_skb  tag 7be49e3934a125ba  gpl
  	loaded_at 2019-01-19T09:09:43-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  14: cgroup_skb  tag 2a142ef67aaad174  gpl
  	loaded_at 2019-01-19T09:09:43-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  15: cgroup_skb  tag 7be49e3934a125ba  gpl
  	loaded_at 2019-01-19T09:09:43-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 15,16
  16: cgroup_skb  tag 2a142ef67aaad174  gpl
  	loaded_at 2019-01-19T09:09:43-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 15,16
  17: cgroup_skb  tag 7be49e3934a125ba  gpl
  	loaded_at 2019-01-19T09:09:44-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 17,18
  18: cgroup_skb  tag 2a142ef67aaad174  gpl
  	loaded_at 2019-01-19T09:09:44-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 17,18
  21: cgroup_skb  tag 7be49e3934a125ba  gpl
  	loaded_at 2019-01-19T09:09:45-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 21,22
  22: cgroup_skb  tag 2a142ef67aaad174  gpl
  	loaded_at 2019-01-19T09:09:45-0300  uid 0
  	xlated 296B  jited 229B  memlock 4096B  map_ids 21,22
  31: tracepoint  name sys_enter  tag 12504ba9402f952f  gpl
  	loaded_at 2019-01-19T09:19:56-0300  uid 0
  	xlated 512B  jited 374B  memlock 4096B  map_ids 30,29,28
  32: tracepoint  name sys_exit  tag c1bd85c092d6e4aa  gpl
  	loaded_at 2019-01-19T09:19:56-0300  uid 0
  	xlated 256B  jited 191B  memlock 4096B  map_ids 30,29
  # perf report -D | grep PERF_RECORD_BPF_EVENT | nl
     1	0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
     2	0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
     3	0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
     4	0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
     5	0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
     6	0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
     7	0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
     8	0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
     9	7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29
    10	7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29
    11	7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30
    12	7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30
    13	7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31
    14	7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32
  #

There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'.

Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21 17:00:57 -03:00
Song Liu 9aa0bfa370 perf tools: Handle PERF_RECORD_KSYMBOL
This patch handles PERF_RECORD_KSYMBOL in perf record/report.
Specifically, map and symbol are created for ksymbol register, and
removed for ksymbol unregister.

This patch also sets perf_event_attr.ksymbol properly. The flag is ON by
default.

Committer notes:

Use proper inttypes.h for u64, fixing the build in some environments
like in the android NDK r15c targetting ARM 32-bit.

I.e. fixing this build error:

  util/event.c: In function 'perf_event__fprintf_ksymbol':
  util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
            event->ksymbol_event.flags, event->ksymbol_event.name);
            ^
  cc1: all warnings being treated as errors

Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-6-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21 17:00:57 -03:00
Jin Yao a3366db06b perf report: Fix wrong iteration count in --branch-history
By calculating the removed loops, we can get the iteration count.

But the iteration count could be reported incorrectly, reporting
impossibly high counts.

That's because previous code uses the number of removed LBR entries for
the iteration count. That's not good. Fix this by increasing the
iteration count when a loop is detected.

When matching the chain, the iteration count would be added up, finally we need
to compute the average value when printing out.

For example,

  $ perf report --branch-history --stdio --no-children

Before:

  ---f2 +0
     |
     |--33.62%--f1 +9 (cycles:1)
     |          f1 +0
     |          main +22 (cycles:1)
     |          main +17
     |          main +38 (cycles:1)
     |          main +27
     |          f1 +26 (cycles:1)
     |          f1 +24
     |          f2 +27 (cycles:7)
     |          f2 +0
     |          f1 +19 (cycles:1)
     |          f1 +14
     |          f2 +27 (cycles:11)
     |          f2 +0
     |          f1 +9 (cycles:1 iter:2968 avg_cycles:3)
     |          f1 +0
     |          main +22 (cycles:1 iter:2968 avg_cycles:3)
     |          main +17
     |          main +38 (cycles:1 iter:2968 avg_cycles:3)

2968 is an impossible high iteration count and avg_cycles is too small.

After:

  ---f2 +0
     |
     |--33.62%--f1 +9 (cycles:1)
     |          f1 +0
     |          main +22 (cycles:1)
     |          main +17
     |          main +38 (cycles:1)
     |          main +27
     |          f1 +26 (cycles:1)
     |          f1 +24
     |          f2 +27 (cycles:7)
     |          f2 +0
     |          f1 +19 (cycles:1)
     |          f1 +14
     |          f2 +27 (cycles:11)
     |          f2 +0
     |          f1 +9 (cycles:1 iter:1 avg_cycles:23)
     |          f1 +0
     |          main +22 (cycles:1 iter:1 avg_cycles:23)
     |          main +17
     |          main +38 (cycles:1 iter:1 avg_cycles:23)

avg_cycles:23 is the average cycles of this iteration.

Fixes: c4ee06251d ("perf report: Calculate the average cycles of iterations")

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04 12:54:49 -03:00
Mark Drayton 3fcb10e496 perf tools: Allow specifying proc-map-timeout in config file
The default timeout of 500ms for parsing /proc/<pid>/maps files is too
short for profiling many of our services.

This can be overridden by passing --proc-map-timeout to the relevant
command but it'd be nice to globally increase our default value.

This patch permits setting a different default with the
core.proc-map-timeout config file parameter.

Signed-off-by: Mark Drayton <mbd@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:57 -03:00
Ingo Molnar adba163441 perf tools: Fix diverse comment typos
Go over the tools/ files that are maintained in Arnaldo's tree and
fix common typos: half of them were in comments, the other half
in JSON files.

No change in functionality intended.

Committer notes:

This was split from a larger patch as there are code that is,
additionally, maintained outside the kernel tree, so to ease
cherry-picking and/or backporting, split this into multiple patches.

Just typos in comments, no need to backport, reducing the possibility of
possible backporting artifacts.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:56:47 -03:00
Adrian Hunter 8e80ad9983 perf thread: Add fallback functions for cases where cpumode is insufficient
For branch stacks or branch samples, the sample cpumode might not be
correct because it applies only to the sample 'ip' and not necessary to
'addr' or branch stack addresses. Add fallback functions that can be
used to deal with those cases

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:54:13 -03:00
David Miller 4f8f382e63 perf tools: Don't clone maps from parent when synthesizing forks
When synthesizing FORK events, we are trying to create thread objects
for the already running tasks on the machine.

Normally, for a kernel FORK event, we want to clone the parent's maps
because that is what the kernel just did.

But when synthesizing, this should not be done.  If we do, we end up
with overlapping maps as we process the sythesized MMAP2 events that
get delivered shortly thereafter.

Use the FORK event misc flags in an internal way to signal this
situation, so we can elide the map clone when appropriate.

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Link: http://lkml.kernel.org/r/20181030.222404.2085088822877051075.davem@davemloft.net
[ Added comment about flag use in machine__process_fork_event(),
  use ternary op in thread__clone_map_groups() as suggested by Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31 10:18:01 -03:00
David S. Miller e9024d519d perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}
When processing using 'perf report -g caller', which is the default, we
ended up reverting the callchain entries received from the kernel, but
simply reverting throws away the information that tells that from a
point onwards the addresses are for userspace, kernel, guest kernel,
guest user, hypervisor.

The idea is that if we are walking backwards, for each cluster of
non-cpumode entries we have to first scan backwards for the next one and
use that for the cluster.

This seems silly and more expensive than it needs to be but it is enough
for a initial fix.

The code here is really complicated because it is intimately intertwined
with the lbr and branch handling, as well as this callchain order,
further fixes will be needed to properly take into account the cpumode
in those cases.

Another problem with ORDER_CALLER is that the NULL "0" IP that is at the
end of most callchains shows up at the top of the histogram because
every callchain contains it and with ORDER_CALLER it is the first entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Souvik Banerjee <souvik1997@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # 4.19
Link: https://lkml.kernel.org/n/tip-2wt3ayp6j2y2f2xowixa8y6y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-31 09:57:51 -03:00
Milian Wolff 7a8a8fcf7b perf record: Use unmapped IP for inline callchain cursors
Only use the mapped IP to find inline frames, but keep using the
unmapped IP for the callchain cursor. This ensures we properly show the
unmapped IP when displaying a frame we received via the
dso__parse_addr_inlines API for a module which does not contain
sufficient debug symbols to show the srcline.

This is another follow-up to commit 1961018469 ("perf script: Show
virtual addresses instead of offsets").

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Fixes: 1961018469 ("perf script: Show virtual addresses instead of offsets")
Link: http://lkml.kernel.org/r/20180926135207.30263-2-milian.wolff@kdab.com
Link: http://lkml.kernel.org/r/20181002073949.3297-1-milian.wolff@kdab.com
[ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-05 11:18:09 -03:00
Milian Wolff ff4ce2885a perf report: Don't try to map ip to invalid map
Fixes a crash when the report encounters an address that could not be
associated with an mmaped region:

  #0  0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329
  #1  unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329
  #2  0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586
  #3  get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703
  #4  0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725
  #5  0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351
  #6  thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378
  #7  0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750,
      max_stack=<optimized out>) at util/callchain.c:1085

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 2a9d5050dc ("perf script: Show correct offsets for DWARF-based unwinding")
Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-27 16:05:43 -03:00
Jiri Olsa 2af5247530 perf tools: Store compression id into struct dso
Add comp to 'struct dso' to hold the compression index.  It will be used
in the following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180817094813.15086-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-20 08:54:59 -03:00
Jiri Olsa b57334b945 perf machine: Use last_match threads cache only in single thread mode
There's an issue with using threads::last_match in multithread mode
which is enabled during the perf top synthesize. It might crash with
following assertion:

  perf: ...include/linux/refcount.h:109: refcount_inc:
        Assertion `!(!refcount_inc_not_zero(r))' failed.

The gdb backtrace looks like this:

  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  (gdb)
  #0  0x00007ffff50839fb in raise () from /lib64/libc.so.6
  #1  0x00007ffff5085800 in abort () from /lib64/libc.so.6
  #2  0x00007ffff507c0da in __assert_fail_base () from /lib64/libc.so.6
  #3  0x00007ffff507c152 in __assert_fail () from /lib64/libc.so.6
  #4  0x0000000000535ff9 in refcount_inc (r=0x7fffe8009a70)
      at ...include/linux/refcount.h:109
  #5  0x0000000000536771 in thread__get (thread=0x7fffe8009a40)
      at util/thread.c:115
  #6  0x0000000000523cd0 in ____machine__findnew_thread (machine=0xbfde38,
      threads=0xbfdf28, pid=2, tid=2, create=true) at util/machine.c:432
  #7  0x0000000000523eb4 in __machine__findnew_thread (machine=0xbfde38,
      pid=2, tid=2) at util/machine.c:489
  #8  0x0000000000523f24 in machine__findnew_thread (machine=0xbfde38,
      pid=2, tid=2) at util/machine.c:499
  #9  0x0000000000526fbe in machine__process_fork_event (machine=0xbfde38,
  ...

The failing assertion is this one:

  REFCOUNT_WARN(!refcount_inc_not_zero(r), ...

the problem is that we don't serialize access to threads::last_match.
We serialize the access to the threads tree, but we don't care how's
threads::last_match being accessed. Both locked/unlocked paths use
that data and can set it. In multithreaded mode we can end up with
invalid object in thread__get call, like in following paths race:

  thread 1
    ...
    machine__findnew_thread
      down_write(&threads->lock);
      __machine__findnew_thread
        ____machine__findnew_thread
          th = threads->last_match;
          if (th->tid == tid) {
            thread__get

  thread 2
    ...
    machine__find_thread
      down_read(&threads->lock);
      __machine__findnew_thread
        ____machine__findnew_thread
          th = threads->last_match;
          if (th->tid == tid) {
            thread__get

  thread 3
    ...
    machine__process_fork_event
      machine__remove_thread
        __machine__remove_thread
          threads->last_match = NULL
          thread__put
      thread__put

Thread 1 and 2 might got stale last_match, before thread 3 clears
it. Thread 1 and 2 then race with thread 3's thread__put and they
might trigger the refcnt == 0 assertion above.

The patch is disabling the last_match cache for multiple thread
mode. It was originally meant for single thread scenarios, where
it's common to have multiple sequential searches of the same
thread.

In multithread mode this does not make sense, because top's threads
processes different /proc entries and so the 'struct threads' object
is queried for various threads. Moreover we'd need to add more locks
to make it work.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24 14:53:52 -03:00
Jiri Olsa 67fda0f32c perf machine: Add threads__set_last_match function
Separating threads::last_match cache set into separate
threads__set_last_match function.  This will be useful in following
patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24 14:53:42 -03:00
Jiri Olsa f8b2ebb532 perf machine: Add threads__get_last_match function
Separating threads::last_match cache read/check into separate
threads__get_last_match function. This will be useful in following
patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lukasz Odzioba <lukasz.odzioba@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180719143345.12963-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24 14:53:31 -03:00
Sandipan Das 2a9d5050dc perf script: Show correct offsets for DWARF-based unwinding
When perf/data is recorded with the dwarf call-graph option, the
callchain shown by 'perf script' still shows the binary offsets of the
userspace symbols instead of their virtual addresses. Since the symbol
offset calculation is based on using virtual address as the ip, we see
incorrect offsets as well.

The use of virtual addresses affects the ability to find out the
line number in the corresponding source file to which an address
maps to as described in commit 6754075915 ("perf unwind: Use
addr_location::addr instead of ip for entries").

This has also been addressed by temporarily converting the virtual
address to the correponding binary offset so that it can be mapped
to the source line number correctly.

This is a follow-up for commit 1961018469 ("perf script: Show
virtual addresses instead of offsets").

This can be verified on a powerpc64le system running Fedora 27 as
shown below:

  # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton
  # perf record -e probe_libc:inet_pton --call-graph=dwarf ping -6 -c 1 ::1

Before:

  # perf report --stdio --no-children -s sym,srcline -g address

  # Samples: 1  of event 'probe_libc:inet_pton'
  # Event count (approx.): 1
  #
  # Overhead  Symbol                Source:Line
  # ........  ....................  ...........
  #
     100.00%  [.] __GI___inet_pton  inet_pton.c
              |
              ---gaih_inet getaddrinfo.c:537 (inlined)
                 __GI_getaddrinfo getaddrinfo.c:2304 (inlined)
                 main ping.c:519
                 generic_start_main libc-start.c:308 (inlined)
                 __libc_start_main libc-start.c:102
  ...

  # perf script -F comm,ip,sym,symoff,srcline,dso

  ping
                    15af28 __GI___inet_pton+0xffff000099160008 (/usr/lib64/libc-2.26.so)
    libc-2.26.so[ffff80004ca0af28]
                    10fa53 gaih_inet+0xffff000099160f43
    libc-2.26.so[ffff80004c9bfa53] (inlined)
                    1105b3 __GI_getaddrinfo+0xffff000099160163
    libc-2.26.so[ffff80004c9c05b3] (inlined)
                      2d6f main+0xfffffffd9f1003df (/usr/bin/ping)
    ping[fffffffecf882d6f]
                     2369f generic_start_main+0xffff00009916013f
    libc-2.26.so[ffff80004c8d369f] (inlined)
                     23897 __libc_start_main+0xffff0000991600b7 (/usr/lib64/libc-2.26.so)
    libc-2.26.so[ffff80004c8d3897]

After:

  # perf report --stdio --no-children -s sym,srcline -g address

  # Samples: 1  of event 'probe_libc:inet_pton'
  # Event count (approx.): 1
  #
  # Overhead  Symbol                Source:Line
  # ........  ....................  ...........
  #
     100.00%  [.] __GI___inet_pton  inet_pton.c
              |
              ---gaih_inet.constprop.7 getaddrinfo.c:537
                 getaddrinfo getaddrinfo.c:2304
                 main ping.c:519
                 generic_start_main.isra.0 libc-start.c:308
                 __libc_start_main libc-start.c:102
  ...

  # perf script -F comm,ip,sym,symoff,srcline,dso

  ping
              7fffb38aaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
    inet_pton.c:68
              7fffb385fa53 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so)
    getaddrinfo.c:537
              7fffb38605b3 getaddrinfo+0x163 (/usr/lib64/libc-2.26.so)
    getaddrinfo.c:2304
                 130782d6f main+0x3df (/usr/bin/ping)
    ping.c:519
              7fffb377369f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so)
    libc-start.c:308
              7fffb3773897 __libc_start_main+0xb7 (/usr/lib64/libc-2.26.so)
    libc-start.c:102

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes: 6754075915 ("perf unwind: Use addr_location::addr instead of ip for entries")
Link: http://lkml.kernel.org/r/20180703120555.32971-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-24 14:53:11 -03:00