224 commits

Author SHA1 Message Date
Namhyung Kim 77f02f4446 perf sched: Make common options cascading
The -i and -v options can be used in subcommands so enable cascading the
sched_options.  This fixes the following inconvenience in 'perf sched':

  $ perf sched -i perf.data.sched  map
  ... (it works well) ...

  $ perf sched map  -i perf.data.sched
    Error: unknown switch `i'

   Usage: perf sched map [<options>]

          --color-cpus <cpus>
                            highlight given CPUs in map
          --color-pids <pids>
                            highlight given pids in map
          --compact         map output in compact mode
          --cpus <cpus>     display given CPUs in map

With this patch, the second command line works with the perf.data.sched
data file.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161024030003.28534-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-25 10:24:48 -03:00
Arnaldo Carvalho de Melo 4fc76e495b perf sched: Use linux/time64.h
Probably the next step is to introduce linux/time.h and use
timespec_to_ns(), etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4nqhskn27fn93cz3ukbc8drf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-08-23 15:37:33 -03:00
Arnaldo Carvalho de Melo c8b5f2c96d tools: Introduce str_error_r()
The tools so far have been using the strerror_r() GNU variant, that
returns a string, be it the buffer passed or something else.

But that, besides being tricky in cases where we expect that the
function using strerror_r() returns the error formatted in a provided
buffer (we have to check if it returned something else and copy that
instead), breaks the build on systems not using glibc, like Alpine
Linux, where musl libc is used.

So, introduce yet another wrapper, str_error_r(), that has the GNU
interface, but uses the portable XSI variant of strerror_r(), so that
users can rest assured that the provided buffer is used and that it is
what is returned.
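
A minimal sketch of such a wrapper, assuming the build selects the XSI
strerror_r() (the one returning int). The name str_error_r_sketch and
the fallback message are illustrative, not the code from this commit:

```c
#undef _GNU_SOURCE		/* ask for the XSI strerror_r(), which returns int */
#include <stdio.h>
#include <string.h>

/*
 * GNU-style interface (returns a string) built on the XSI variant
 * (returns 0 on success). Hypothetical name, for illustration only.
 */
char *str_error_r_sketch(int errnum, char *buf, size_t buflen)
{
	int err = strerror_r(errnum, buf, buflen);

	if (err)	/* unknown errnum or short buffer: still fill buf */
		snprintf(buf, buflen, "strerror_r(%d) failed: %d", errnum, err);
	return buf;	/* always the caller's buffer */
}
```

Callers can then pass the return value straight to a printf-style
function without checking whether some other string came back.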

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-d4t42fnf48ytlk8rjxs822tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-12 15:19:47 -03:00
Jiri Olsa 73643bb6a2 perf sched map: Display only given cpus
Introduce the --cpus option, which displays only the given cpus. It can
be used together with the --color-cpus option.

  $ perf sched map  --cpus 0,1
        *A0   309999.786924 secs A0 => rcu_sched:7
        *.    309999.786930 secs
    *B0  .    309999.786931 secs B0 => rcuos/2:25
     B0 *A0   309999.786947 secs

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-9-git-send-email-jolsa@kernel.org
[ Added entry to man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:52 -03:00
Jiri Olsa cf294f24f8 perf sched map: Color given cpus
Add the --color-cpus option to display selected cpus with a background
color (red by default).  It helps when navigating through the 'perf
sched map' output.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-8-git-send-email-jolsa@kernel.org
[ Added entry to man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:51 -03:00
Jiri Olsa a151a37a76 perf sched map: Color given pids
Add the --color-pids option to display selected pids in color (blue by
default). It helps when navigating through the 'perf sched map' output.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-7-git-send-email-jolsa@kernel.org
[ Added entry to man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:51 -03:00
Jiri Olsa 8cd91195e5 perf sched: Use color_fprintf for output
As preparation for next patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:51 -03:00
Jiri Olsa 99623c628f perf sched: Add compact display option
Add a compact map display that does not output the whole cpu matrix,
only the cpus that got an event.

  $ perf sched map --compact
    *A0   1082427.094098 secs A0 => perf:19404 (CPU 2)
     A0 *.    1082427.094127 secs .  => swapper:0 (CPU 1)
     A0  .  *B0   1082427.094174 secs B0 => rcuos/2:25 (CPU 3)
     A0  .  *.    1082427.094177 secs
    *C0  .   .    1082427.094187 secs C0 => migration/2:21
     C0 *A0  .    1082427.094193 secs
    *.   A0  .    1082427.094195 secs
    *D0  A0  .    1082427.094402 secs D0 => rngd:968
    *.   A0  .    1082427.094406 secs
     .  *E0  .    1082427.095221 secs E0 => kworker/1:1:5333
     .   E0 *F0   1082427.095227 secs F0 => xterm:3342

It helps to display sane output for small thread loads on big cpu
servers.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-4-git-send-email-jolsa@kernel.org
[ Add entry in 'perf sched' man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-13 10:11:51 -03:00
Josh Poimboeuf 4b6ab94eab perf subcmd: Create subcmd library
Move the subcommand-related files from perf to a new library named
libsubcmd.a.

Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
'exec-cmd.*' to be consistent with the naming of all the other files.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-12-17 14:27:14 -03:00
Jiri Olsa 0014de172d perf sched latency: Fix thread pid reuse issue
The latency subcommand holds a tree of working atoms sorted by the thread's
pid/tid. If a new thread has the same pid and tid as an old one, the old
working atom is found and the assertion in the search function fails:

  thread_atoms_search: Assertion `!(thread != atoms->thread)' failed

Change the sort function to use thread object pointers together with the pid
and tid check. This way a new thread will never find an old one with the same
pid/tid.
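
The idea can be sketched as a comparison function that falls back to
the object address when pid and tid match. The struct and function
names below are simplified stand-ins, not the ones in builtin-sched.c:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's thread and work-atom types. */
struct thread_sketch { int pid; int tid; };
struct atoms_sketch  { struct thread_sketch *thread; };

/*
 * Order by pid, then tid, then by object address, so two distinct
 * thread objects never compare equal even when they share pid and tid.
 */
static int atoms_cmp(const struct atoms_sketch *l, const struct atoms_sketch *r)
{
	if (l->thread == r->thread)
		return 0;
	if (l->thread->pid != r->thread->pid)
		return l->thread->pid < r->thread->pid ? -1 : 1;
	if (l->thread->tid != r->thread->tid)
		return l->thread->tid < r->thread->tid ? -1 : 1;
	return (uintptr_t)l->thread < (uintptr_t)r->thread ? -1 : 1;
}
```

With this ordering a lookup for a reused pid/tid only matches the
entry that holds the very same thread object.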

Link: http://lkml.kernel.org/n/tip-o4doazhhv0zax5zshkg8hnys@git.kernel.org
Reported-by: Mohit Agrawal <moagrawa@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446462625-15807-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-11-05 12:51:00 -03:00
Namhyung Kim c711836972 perf tools: Introduce usage_with_options_msg()
usage_with_options() now sets up a pager before printing its message, so
output from a plain printf() or pr_err() is not shown.  The
usage_with_options_msg() function can be used to print a help message
before the usage strings.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-27 09:28:44 -03:00
Josef Bacik 2f80dd4488 perf sched: Add option to merge like comms to lat output
Sometimes when debugging large multi-threaded applications it is helpful
to collate all of the latency numbers into one bulk record to get an
idea of what is going on.

This patch does this by merging any entries that belong to the same comm
into one entry and then spits out those totals.
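
A rough sketch of the merge step, assuming a flat array of per-thread
records. The struct layout and the name merge_by_comm are hypothetical;
the real code works on perf's own data structures:

```c
#include <string.h>

/* Hypothetical per-thread latency record; field names are illustrative. */
struct lat_entry {
	char   comm[16];
	double runtime_ms;
	int    nr_switches;
	int    nr_merged;	/* how many threads were folded into this entry */
};

/*
 * Fold entries with an identical comm into the first one seen.
 * Returns the number of entries left at the front of the array.
 */
static int merge_by_comm(struct lat_entry *e, int n)
{
	int out = 0;

	for (int i = 0; i < n; i++) {
		int j;

		for (j = 0; j < out; j++)
			if (!strcmp(e[j].comm, e[i].comm))
				break;
		if (j == out) {		/* first entry with this comm */
			e[out] = e[i];
			e[out].nr_merged = 1;
			out++;
		} else {		/* accumulate into the existing entry */
			e[j].runtime_ms  += e[i].runtime_ms;
			e[j].nr_switches += e[i].nr_switches;
			e[j].nr_merged++;
		}
	}
	return out;
}
```

The nr_merged counter is what a "(#)" suffix after the task name would
be printed from.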

I've also slightly changed the output so you can see how many threads
were merged in the processing.  Here is the new default output format:

 -----------------------------------------------------------------------------------------------------------
  Task                 | Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at    |
 -----------------------------------------------------------------------------------------------------------
  chrome:(23)          |  740.878 ms |     2612 | avg:    0.022 ms | max:    0.845 ms | max at: 7935.254223 s
  pulseaudio:1523      |   94.440 ms |      597 | avg:    0.027 ms | max:    0.110 ms | max at: 7934.668372 s
  threaded-ml:6042     |   72.554 ms |      386 | avg:    0.035 ms | max:    1.186 ms | max at: 7935.330911 s
  Chrome_IOThread:3832 |   52.388 ms |      456 | avg:    0.021 ms | max:    1.365 ms | max at: 7935.330602 s
  Chrome_ChildIOT:(7)  |   50.694 ms |      743 | avg:    0.021 ms | max:    1.448 ms | max at: 7935.256659 s
  Compositor:5510      |   30.012 ms |      192 | avg:    0.019 ms | max:    0.131 ms | max at: 7936.636815 s
  plugin_audio_th:6043 |   24.828 ms |      314 | avg:    0.018 ms | max:    0.143 ms | max at: 7936.205994 s
  CompositorTileW:(2)  |   14.099 ms |       45 | avg:    0.022 ms | max:    0.153 ms | max at: 7937.521800 s

The (#) after the task name is the number of tasks merged; if no tasks were
merged, it shows just the pid.  Here is the same trace file with the -p
option, which prints the per-pid latency numbers:

 -----------------------------------------------------------------------------------------------------------
  Task                 | Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at    |
 -----------------------------------------------------------------------------------------------------------
  chrome:5500          |  386.872 ms |      387 | avg:    0.023 ms | max:    0.241 ms | max at: 7936.001694 s
  pulseaudio:1523      |   94.440 ms |      597 | avg:    0.027 ms | max:    0.110 ms | max at: 7934.668372 s
  threaded-ml:6042     |   72.554 ms |      386 | avg:    0.035 ms | max:    1.186 ms | max at: 7935.330911 s
  chrome:10226         |   69.710 ms |      251 | avg:    0.023 ms | max:    0.764 ms | max at: 7935.992305 s
  chrome:4267          |   64.551 ms |      418 | avg:    0.021 ms | max:    0.294 ms | max at: 7937.862427 s
  chrome:4827          |   62.268 ms |       54 | avg:    0.029 ms | max:    0.666 ms | max at: 7935.992813 s
  Chrome_IOThread:3832 |   52.388 ms |      456 | avg:    0.021 ms | max:    1.365 ms | max at: 7935.330602 s
  chrome:3776          |   46.150 ms |      349 | avg:    0.023 ms | max:    0.845 ms | max at: 7935.254223 s

Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/1432300720-30478-1-git-send-email-jbacik@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-05-27 12:21:45 -03:00
Arnaldo Carvalho de Melo b91fc39f4a perf machine: Protect the machine->threads with a rwlock
In addition to using refcounts for the struct thread lifetime
management, we need to protect access to machine->threads from
concurrent access.

That happens in 'perf top', where a thread processes events, inserting
and deleting entries from that rb_tree while another thread decays
hist_entries, that end up dropping references and ultimately deleting
threads from the rb_tree and releasing its resources when no further
hist_entry (or other data structures, like in 'perf sched') references
it.

So the rule is the same as for refcounts + protected trees in the kernel:
get the tree lock, find the object, bump the refcount, drop the tree lock,
return, use the object, then drop the refcount if it is no longer needed.
Keep the reference if storing the object in some other data structure, and
drop it when releasing that data structure.

I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
"perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".

The addr_location__put() one is because as we return references to
several data structures, we may end up adding more reference counting
for the other data structures and then we'll drop it at
addr_location__put() time.

Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-05-08 16:19:27 -03:00
Yunlong Song ff5f3bbd40 perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10
Since sched->replay_repeat is set to 10 as default, the sched->run_avg,
sched->runavg_cpu_usage, and sched->runavg_parent_cpu_usage all use
10 to calculate their value.

However, replay_repeat can be changed to another value with the -r
option, so the calculation above should use replay_repeat instead of the
default value 10 to achieve more accurate results.
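
As a sketch, assuming the running average has the usual form
(avg * (N - 1) + sample) / N, the change amounts to passing the -r
value as N instead of a literal 10. The function name is illustrative:

```c
/*
 * Update a running average over a window of `repeat` runs.
 * `repeat` plays the role of sched->replay_repeat (the -r value)
 * and must be at least 1.
 */
static double runavg_update(double avg, double sample, unsigned int repeat)
{
	if (avg == 0.0)		/* first run seeds the average */
		return sample;
	return (avg * (repeat - 1) + sample) / repeat;
}
```

With repeat = 10, a previous average of 50.198 and a new sample of
219.099 give roughly 67.09; with repeat = 2 the same inputs give
roughly 134.65, which is why hardcoding 10 skews the result.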

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-10-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:27 -03:00
Yunlong Song f0dd330fdf perf sched replay: Support using -f to override perf.data file ownership
Allow using perf.data when it is not owned by the current user or root.

Example:

 $ ls -al perf.data
 -rw------- 1 Yunlong.Song Yunlong.Song 5321918 Mar 25 15:14 perf.data
 $ sudo id
 uid=0(root) gid=0(root) groups=0(root),64(pkcs11)

Before this patch:

 $ sudo perf sched replay -f
 run measurement overhead: 98 nsecs
 sleep measurement overhead: 52909 nsecs
 the run test took 1000015 nsecs
 the sleep test took 1054253 nsecs
 File perf.data not owned by current user or root (use -f to override)

As shown above, the -f option does not work at all.

After this patch:

 $ sudo perf sched replay -f
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 40514 nsecs
 the run test took 1000003 nsecs
 the sleep test took 1056098 nsecs
 nr_run_events:        10
 nr_sleep_events:      1562
 nr_wakeup_events:     5
 task      0 (                  :1:         1), nr_events: 1
 task      1 (                  :2:         2), nr_events: 1
 task      2 (                  :3:         3), nr_events: 1
 ...
 ...
 task   1549 (             :163132:    163132), nr_events: 1
 task   1550 (             :163540:    163540), nr_events: 1
 task   1551 (           <unknown>:         0), nr_events: 10
 ------------------------------------------------------------
 #1  : 50.198, ravg: 50.20, cpu: 2335.18 / 2335.18
 #2  : 219.099, ravg: 67.09, cpu: 2835.11 / 2385.17
 #3  : 238.626, ravg: 84.24, cpu: 3278.26 / 2474.48
 #4  : 200.364, ravg: 95.85, cpu: 2977.41 / 2524.77
 #5  : 176.882, ravg: 103.96, cpu: 2801.35 / 2552.43
 #6  : 191.093, ravg: 112.67, cpu: 2813.70 / 2578.56
 #7  : 189.448, ravg: 120.35, cpu: 2809.21 / 2601.62
 #8  : 200.637, ravg: 128.38, cpu: 2849.91 / 2626.45
 #9  : 248.338, ravg: 140.37, cpu: 4380.61 / 2801.87
 #10 : 511.139, ravg: 177.45, cpu: 3077.73 / 2829.45

As shown above, the -f option really works now.

Besides replay, the -f option also works for latency and map.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-9-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:26 -03:00
Yunlong Song 939cda521a perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files
The soft maximum number of open files for a calling process is 1024,
which is defined as INR_OPEN_CUR in include/uapi/linux/fs.h, and the
hard maximum number of open files for a calling process is 4096, which
is defined as INR_OPEN_MAX in include/uapi/linux/fs.h.

Both INR_OPEN_CUR and INR_OPEN_MAX are used to limit the value of
RLIMIT_NOFILE in include/asm-generic/resource.h.

And the soft maximum number finally decides the limitation of the
maximum files which are allowed to be opened.

That is to say, a process can use at most 1024 file descriptors for its
opened files, or an EMFILE error will happen.

This error can be fixed by increasing the soft maximum number, under the
constraint that the soft maximum number can not exceed the hard maximum
number, or both soft and hard maximum number should be increased
simultaneously with privilege.

For perf sched replay, it uses sys_perf_event_open to create the file
descriptor for each of the tasks in order to handle information of perf
events.

That is to say, each task needs a unique file descriptor. On x86_64,
there may be over 1024 or 4096 tasks corresponding to the records in
perf.data, so there are not enough file descriptors available.

As a result, EMFILE error happens and stops the replay process. To solve
this problem, we adaptively increase the soft and hard maximum number of
open files with a '-f' option.
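
A simplified sketch of the limit handling with getrlimit()/setrlimit().
Unlike the patch described here, it only raises the soft limit and
reports failure instead of also raising the hard limit; the name
ensure_nofile is hypothetical:

```c
#include <sys/resource.h>

/*
 * Make sure at least `needed` descriptors may be open at once.
 * Raises the soft limit toward the hard limit; going past the hard
 * limit would need privilege and is reported as failure (-1) here.
 */
static int ensure_nofile(rlim_t needed)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) < 0)
		return -1;
	if (lim.rlim_cur >= needed)
		return 0;		/* already enough */
	if (needed > lim.rlim_max)
		return -1;		/* would exceed the hard limit */
	lim.rlim_cur = needed;
	return setrlimit(RLIMIT_NOFILE, &lim);
}
```

In the test environment below, ensure_nofile(1552) would succeed
because 1552 is under the hard limit of 4096.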

Example:

Test environment: x86_64 with 160 cores

 $ cat /proc/sys/kernel/pid_max
 163840
 $ cat /proc/sys/fs/file-max
 6815744
 $ ulimit -Sn
 1024
 $ ulimit -Hn
 4096

Before this patch:

 $ perf sched replay
 ...
 task   1549 (             :163132:    163132), nr_events: 1
 task   1550 (             :163540:    163540), nr_events: 1
 task   1551 (           <unknown>:         0), nr_events: 10
 Error: sys_perf_event_open() syscall returned with -1 (Too many open
 files)

After this patch:

 $ perf sched replay
 ...
 task   1549 (             :163132:    163132), nr_events: 1
 task   1550 (             :163540:    163540), nr_events: 1
 task   1551 (           <unknown>:         0), nr_events: 10
 Error: sys_perf_event_open() syscall returned with -1 (Too many open
 files)
 Have a try with -f option

 $ perf sched replay -f
 ...
 task   1549 (             :163132:    163132), nr_events: 1
 task   1550 (             :163540:    163540), nr_events: 1
 task   1551 (           <unknown>:         0), nr_events: 10
 ------------------------------------------------------------
 #1  : 54.401, ravg: 54.40, cpu: 3285.21 / 3285.21
 #2  : 199.548, ravg: 68.92, cpu: 4999.65 / 3456.66
 #3  : 170.483, ravg: 79.07, cpu: 1349.94 / 3245.99
 #4  : 192.034, ravg: 90.37, cpu: 1322.88 / 3053.67
 #5  : 182.929, ravg: 99.62, cpu: 1406.51 / 2888.96
 #6  : 152.974, ravg: 104.96, cpu: 1167.54 / 2716.82
 #7  : 155.579, ravg: 110.02, cpu: 2992.53 / 2744.39
 #8  : 130.557, ravg: 112.08, cpu: 1126.43 / 2582.59
 #9  : 138.520, ravg: 114.72, cpu: 1253.22 / 2449.65
 #10 : 134.328, ravg: 116.68, cpu: 1587.95 / 2363.48

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-8-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:26 -03:00
Yunlong Song 1aff59be53 perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task
There is a sem_wait for each task in wait_for_tasks(), e.g.
sem_wait(&task->work_done_sem).

The sem_wait can continue only when work_done_sem is greater than 0, or
it will be blocked.

For perf sched replay, one task may sem_post the work_done_sem of
another task, so that the work_done_sem of that task is processed in a
reasonable sequence, e.g. sem_post, sem_wait, sem_wait, sem_post...

This sequence simulates the sched process of the running tasks at the
time when perf sched record runs.

As a result, all the tasks are required and their threads must be
successfully created.

If any one of the tasks (task A) fails to create its thread, then
another task (task B), whose work_done_sem needs a sem_post from the
failed task A, will likely block in sem_wait.

And this is a dead halt, since task B's thread_func cannot continue at
all.

To solve this problem, perf sched replay should exit once any task fails
to create its thread.

Example:

Test environment: x86_64 with 160 cores

Before this patch:

 $ perf sched replay
 ...
 Error: sys_perf_event_open() syscall returned with -1 (Too many open
 files)
 ------------------------------------------------------------    <- dead halt

After this patch:

 $ perf sched replay
 ...
 task   1551 (           <unknown>:         0), nr_events: 10
 Error: sys_perf_event_open() syscall returned with -1 (Too many open
 files)
 $

As shown above, perf sched replay finishes the process after printing an
error message and does not block itself.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-7-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:25 -03:00
Yunlong Song 08097abc11 perf sched replay: Fix the segmentation fault problem caused by pr_err in threads
The pr_err in self_open_counters() prints error message to stderr.
Unlike stdout, stderr uses memory buffer on the stack of each calling
process.

The pr_err in self_open_counters() works in a thread called thread_func
created in function create_tasks, which concurrently creates
sched->nr_tasks threads.

If the error happens and pr_err prints the error message in each of
these threads, the stack size of the perf process (default is 8192
kbytes) will quickly run out and the segmentation fault will happen
then.

To solve this problem, pr_err with self_open_counters() should be moved
from newly created threads to the old main thread of the perf process.
Then the pr_err can work in a stable situation without the strange
segmentation fault problem.

Example:

Test environment: x86_64 with 160 cores

Before this patch:

 $ perf sched replay
 ...
 task   1549 (             :163132:    163132), nr_events: 1
 task   1550 (             :163540:    163540), nr_events: 1
 task   1551 (           <unknown>:         0), nr_events: 10
 Segmentation fault

After this patch:

 $ perf sched replay
 ...
 task   1549 (             :163132:    163132), nr_events: 1
 task   1550 (             :163540:    163540), nr_events: 1
 task   1551 (           <unknown>:         0), nr_events: 10
 ...

As shown above, the result continues without any segmentation fault.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-6-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:24 -03:00
Yunlong Song 3a423a5c36 perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations
Although the memory of pid_to_task can be allocated via calloc according
to the value of /proc/sys/kernel/pid_max, it cannot handle the case when
pid_max is changed after 'perf sched record' has created its perf.data.

If the new pid_max configured in 'perf sched replay' is smaller than the
old pid_max configured in 'perf sched record', then it will cause the
assertion failure problem.

To solve this problem, we realloc the memory of pid_to_task stepwise
once the passed-in pid parameter in register_pid is larger than the
current pid_max.
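
The stepwise growth can be sketched as follows. The type and function
names are stand-ins chosen for this example, and the error handling
(returning -1) is an assumption rather than what the patch does:

```c
#include <stdlib.h>

struct task_desc_sketch;	/* opaque stand-in for perf's struct task_desc */

struct pid_table {
	struct task_desc_sketch **slots;
	unsigned long nr;	/* plays the role of pid_max */
};

/*
 * Grow the table so that `pid` is a valid index; new slots start as NULL.
 * Returns 0 on success, -1 if realloc() fails (the table is left intact).
 */
static int pid_table_reserve(struct pid_table *t, unsigned long pid)
{
	struct task_desc_sketch **p;

	if (pid < t->nr)
		return 0;
	p = realloc(t->slots, (pid + 1) * sizeof(*p));
	if (!p)
		return -1;
	while (t->nr <= pid)
		p[t->nr++] = NULL;
	t->slots = p;
	return 0;
}
```

A table sized for a pid_max of 5000 can then still accept a recorded
pid such as 163132 by calling pid_table_reserve() before indexing.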

Example:

Test environment: x86_64 with 160 cores

 $ cat /proc/sys/kernel/pid_max
 163840
 $ perf sched record ls
 $ echo 5000 > /proc/sys/kernel/pid_max
 $ cat /proc/sys/kernel/pid_max
 5000

Before this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55356 nsecs
 the run test took 1000011 nsecs
 the sleep test took 1060940 nsecs
 perf: builtin-sched.c:337: register_pid: Assertion `!(pid >= (unsigned
 long)pid_max)' failed.
 Aborted

After this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55611 nsecs
 the run test took 1000026 nsecs
 the sleep test took 1060486 nsecs
 nr_run_events:        10
 nr_sleep_events:      1562
 nr_wakeup_events:     5
 task      0 (                  :1:         1), nr_events: 1
 task      1 (                  :2:         2), nr_events: 1
 task      2 (                  :3:         3), nr_events: 1
 task      3 (                  :5:         5), nr_events: 1
 ...

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-5-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:23 -03:00
Yunlong Song cb06ac256a perf sched replay: Alloc the memory of pid_to_task dynamically to adapt to the unexpected change of pid_max
The current memory allocation of struct task_desc *pid_to_task[MAX_PID]
is in a permanent and preset way, and it has two problems:

Problem 1: If the pid_max, which is the max number of pids in the
system, is much smaller than MAX_PID (1024*1000), then it causes a waste
of stack memory. This may happen in the case where the number of cpu
cores is much smaller than 1000.

Problem 2: If the pid_max is changed from the default value to a value
larger than MAX_PID, then it will cause assertion failure problem. The
maximum value of pid_max can be set to pid_max_max (see pidmap_init
defined in kernel/pid.c), which equals PID_MAX_LIMIT. In x86_64,
PID_MAX_LIMIT is 4*1024*1024 (defined in include/linux/threads.h). This
value is much larger than MAX_PID, and will take up 32768 Kbytes
(4*1024*1024*8/1024) for memory allocation of pid_to_task, which is much
larger than the default 8192 Kbytes of the stack size of calling
process.

Due to these two problems, we use calloc to allocate the memory of
pid_to_task dynamically.
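
The sizes quoted above follow from one pointer per possible pid. The
sketch below spells out that arithmetic and a heap allocation sized
from the runtime pid_max; the helper names are hypothetical and the
8-byte pointer size is an assumption that holds on x86_64:

```c
#include <stddef.h>
#include <stdlib.h>

/* Size in Kbytes of a table holding one pointer per possible pid. */
static size_t pid_table_kbytes(size_t nr_pids, size_t ptr_size)
{
	return nr_pids * ptr_size / 1024;
}

/* Zero-filled table on the heap, sized from the runtime pid_max. */
static void **pid_table_alloc(size_t pid_max)
{
	return calloc(pid_max, sizeof(void *));
}
```

pid_table_kbytes(4 * 1024 * 1024, 8) is 32768 and
pid_table_kbytes(1024 * 1000, 8) is 8000, matching the figures for
PID_MAX_LIMIT and MAX_PID respectively.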

Example:

Test environment: x86_64 with 160 cores

 $ cat /proc/sys/kernel/pid_max
 163840
 $ echo 1025000 > /proc/sys/kernel/pid_max
 $ cat /proc/sys/kernel/pid_max
 1025000

Run some applications until the pid of some process is greater than
the value of MAX_PID (1024*1000).

Before this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55480 nsecs
 the run test took 1000008 nsecs
 the sleep test took 1063151 nsecs
 perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 1024000)'
 failed.
 Aborted

After this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55435 nsecs
 the run test took 1000004 nsecs
 the sleep test took 1059312 nsecs
 nr_run_events:        10
 nr_sleep_events:      1562
 nr_wakeup_events:     5
 task      0 (                  :1:         1), nr_events: 1
 task      1 (                  :2:         2), nr_events: 1
 task      2 (                  :3:         3), nr_events: 1
 task      3 (                  :5:         5), nr_events: 1
 ...

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-4-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:22 -03:00
Yunlong Song a35e27d0e5 perf sched replay: Increase the MAX_PID value to fix assertion failure problem
Current MAX_PID is only 65536, which will cause assertion failure problem
when CPU cores are more than 64 in x86_64.

This is because the pid_max value in x86_64 is at least
PIDS_PER_CPU_DEFAULT * num_possible_cpus() (see function pidmap_init
defined in kernel/pid.c), where PIDS_PER_CPU_DEFAULT is 1024 (defined in
include/linux/threads.h).

Thus for MAX_PID = 65536, the corresponding number of CPU cores is
65536/1024=64.  This is obviously not enough for x86_64, and will cause
an assertion failure due to BUG_ON(pid >= MAX_PID) in the code.

We increase MAX_PID value from 65536 to 1024*1000, which can be used in
x86_64 with 1000 cores.
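
The relation between core count and MAX_PID can be checked with a
small sketch. The function names are made up for this example, and
only the lower bound on pid_max described above is modeled:

```c
#define PIDS_PER_CPU_DEFAULT 1024	/* value quoted in the message above */

/* Lower bound on pid_max for a machine with `cpus` possible CPUs. */
static unsigned long pid_max_floor(unsigned long cpus)
{
	return PIDS_PER_CPU_DEFAULT * cpus;
}

/* Largest CPU count whose pid_max floor still fits under `max_pid`. */
static unsigned long cpus_covered(unsigned long max_pid)
{
	return max_pid / PIDS_PER_CPU_DEFAULT;
}
```

cpus_covered(65536) is 64 and cpus_covered(1024 * 1000) is 1000, while
pid_max_floor(160) is 163840, the pid_max of the 160-core test machine.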

This number is finally decided according to the limitation of stack size
of calling process.

Running 'ulimit -a' shows that the stack size of any process is 8192
Kbytes, which is defined in include/uapi/linux/resource.h (#define
_STK_LIM (8*1024*1024)).

Thus we choose a large enough value for MAX_PID that still satisfies
the stack size limit, i.e., the perf process takes up a memory space
just smaller than 8192 Kbytes.

We have calculated and tested that 1024*1000 is OK for MAX_PID.

This means perf sched replay can now be used with at most 1000 cores in
x86_64 without any assertion failure problem.

Example:

Test environment: x86_64 with 160 cores

 $ cat /proc/sys/kernel/pid_max
 163840

Before this patch:

 $ perf sched replay
 run measurement overhead: 240 nsecs
 sleep measurement overhead: 55379 nsecs
 the run test took 1000004 nsecs
 the sleep test took 1059424 nsecs
 perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 65536)'
 failed.
 Aborted

After this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55397 nsecs
 the run test took 999920 nsecs
 the sleep test took 1053313 nsecs
 nr_run_events:        10
 nr_sleep_events:      1562
 nr_wakeup_events:     5
 task      0 (                  :1:         1), nr_events: 1
 task      1 (                  :2:         2), nr_events: 1
 task      2 (                  :3:         3), nr_events: 1
 task      3 (                  :5:         5), nr_events: 1
 ...

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-3-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:21 -03:00
Yunlong Song 0755bc4dc7 perf sched replay: Use struct task_desc instead of struct task_task for correct meaning
There is no struct task_task at all; it is a typo in older commits.
Fix it to what it should be in order to avoid unnecessary
misunderstanding.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-2-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:19 -03:00
Arnaldo Carvalho de Melo b7b61cbebd perf ordered_events: Shorten function signatures
By keeping pointers to machines, evlist and tool in ordered_events.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0c6huyaf59mqtm2ek9pmposl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-11 10:17:09 -03:00
Arnaldo Carvalho de Melo ae536acfac perf sched: No need to keep the session around
We were keeping the session around just because we kept pointers to
struct thread instances, but now we reference count them, so no need
for deferring the perf_session__delete call to after we traverse the
work_list entries.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9agtck6jdr3rebdp39z1lo0e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-03 00:17:12 -03:00
Arnaldo Carvalho de Melo f3b623b849 perf tools: Reference count struct thread
We need to do that to stop accumulating entries in the dead_threads
linked list, i.e. we were keeping references to threads in struct hists
that continue to exist even after a thread exited and was removed from
the machine threads rbtree.

We still keep the dead_threads list, but just for debugging, allowing us
to iterate at any given point over the threads that still are referenced
by things like struct hist_entry.
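
A minimal sketch of the get/put pattern described above, assuming a plain
integer counter. The struct and function names are hypothetical stand-ins,
not perf's actual definitions, and the counter is not thread-safe.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct thread; only the counter matters. */
struct thread_sketch {
	int refcnt;
	int tid;
};

/* Allocate a thread holding one reference, owned by the creator. */
static struct thread_sketch *thread_sketch__new(int tid)
{
	struct thread_sketch *t = calloc(1, sizeof(*t));

	if (t) {
		t->tid = tid;
		t->refcnt = 1;
	}
	return t;
}

/* Take a reference, e.g. when a histogram entry starts pointing here. */
static struct thread_sketch *thread_sketch__get(struct thread_sketch *t)
{
	if (t)
		t->refcnt++;
	return t;
}

/* Drop a reference; returns 1 if this was the last one and t was freed. */
static int thread_sketch__put(struct thread_sketch *t)
{
	if (t && --t->refcnt == 0) {
		free(t);
		return 1;
	}
	return 0;
}
```

An exited thread then goes away as soon as its last holder calls the put
function, instead of accumulating on a list.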

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3ejvfyed0r7ue61dkurzjux4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-03 00:17:08 -03:00
Arnaldo Carvalho de Melo 75be989a7a perf evlist: Adopt events_stats from perf_session
For tools that don't deal with perf.data files, thus do not need to
use perf_session.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-kglq67gvauq9tak02a4se00r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-22 22:22:57 -03:00
Arnaldo Carvalho de Melo b3f25b6e04 perf sched: Stop updating hists stats, not used
Not used here, remove to reduce perf_evsel/hists structs interaction.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cb7wkk4a3jpoovzim914ih3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-09 11:46:35 -03:00
Masami Hiramatsu fb74fbda42 perf sched: Use strerror_r instead of strerror
Use strerror_r instead of strerror in error message for thread-safety.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140814022247.3545.4564.stgit@kbuild-fedora.novalocal
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-08-15 13:07:47 -03:00
Namhyung Kim 0a7e6d1b68 perf tools: Check recorded kernel version when finding vmlinux
Currently vmlinux_path__init() only tries to find vmlinux file from
current directory, /boot and some canonical directories with version
number of the running kernel.  This can be a problem when reporting old
data recorded on a kernel version not running currently.

We can use the --symfs option for this, but it is annoying for the user
to pass it every time.  As we already have the info in the perf.data
file, the search can be changed to use it automatically.

Before:

  $ perf report
  ...
  # Samples: 4K of event 'cpu-clock'
  # Event count (approx.): 1067250000
  #
  # Overhead  Command     Shared Object      Symbol
  # ........  ..........  .................  ..............................
      71.87%     swapper  [kernel.kallsyms]  [k] recover_probed_instruction

After:

  # Overhead  Command     Shared Object      Symbol
  # ........  ..........  .................  ....................
      71.87%     swapper  [kernel.kallsyms]  [k] native_safe_halt

This requires changing the signature of symbol__init() to receive a
struct perf_session_env *.

Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-08-13 16:42:21 -03:00
Namhyung Kim 0493410612 perf sched: Move call to symbol__init() after creating session
This is in preparation for fixing dso__load_kernel_sym(), which needs
session info before calling symbol__init().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1407825645-24586-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-08-13 16:34:29 -03:00
Jiri Olsa 0a8cb85c20 perf tools: Rename ordered_samples bool to ordered_events
The time ordering is generic for all kinds of events, so use the generic
name 'ordered_events' for the ordered_samples bool in struct perf_tool.

No functional change was intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-07mrqzcuhsks9wfmxrzsvemz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-08-12 12:02:54 -03:00
Yann Droneaud 57480d2cd9 perf tools: Enable close-on-exec flag on perf file descriptor
In commit a21b0b354d ('perf: Introduce a flag to enable
close-on-exec in perf_event_open()'), the PERF_FLAG_FD_CLOEXEC flag
was added to the perf_event_open(2) syscall to allow userspace
to atomically enable close-on-exec behavior when creating
the file descriptor.

This patch makes the perf tools use the new flag if it is supported
by the kernel, so that the event file descriptors are
automatically closed if the perf tool execs a sub-command.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/1404160127-7475-1-git-send-email-ydroneaud@opteya.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-07-18 09:09:34 +02:00
Adrian Hunter 1fcb876863 perf machine: Fix the value used for unknown pids
The value used for unknown pids cannot be zero because that is used by
the "idle" task.

Use -1 instead.  Also handle the unknown pid case when creating map
groups.

Note that threads with an unknown pid should not occur because fork (or
synthesized) events precede the thread's existence.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1405332185-4050-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-07-16 17:57:33 -03:00
Namhyung Kim 1844dbcbe7 perf tools: Introduce hists__inc_nr_samples()
There is some duplicated code for counting the number of samples.  Add
hists__inc_nr_samples() and reuse it.

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1401335910-16832-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:34:55 +02:00
Dongsheng Yang 9d372ca59b perf sched: Cleanup, remove unused variables in map_switch_event()
In map_switch_event(), we currently do not care about the previous
process, so this patch removes the information that we fetch but never use.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1400218625-14613-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 09:17:50 +02:00
Dongsheng Yang 67d6259dd0 perf sched: Remove nr_state_machine_bugs in perf latency
As we no longer use .success in the sched_wakeup event, we cannot
guarantee that a task is off the run queue when its wakeup event
happens. So the nr_state_machine_bugs message is not correct.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1399945101-21736-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 09:17:36 +02:00
Peter Zijlstra 0680ee7db1 perf tools: Remove usage of trace_sched_wakeup(.success)
trace_sched_wakeup(.success) is a dead argument and has been for ages,
the only reason it's still there is because of brain dead software, which
apparently includes perf tools.

There's a few more instances in pearly snake shit, but that's not
supported as far as I care anyhow, so let that bitrot.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140512181946.GG13467@laptop.programming.kicks-ass.net
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 21:13:44 +02:00
Dongsheng 6bcab4e1ea perf tools: Clarify the output of perf sched map.
In the output of perf sched map, the shortname of each thread is
explained the first time it appears.

Example:
              *A0       228836.978985 secs A0 => perf:23032
          *.   A0       228836.979016 secs B0 => swapper:0
           .  *C0       228836.979099 secs C0 => migration/3:22
  *A0      .   C0       228836.979115 secs
   A0      .  *.        228836.979115 secs

But B0, which is explained as swapper:0, does not appear in the
left part of the output. Instead, we use '.' as the shortname of
swapper:0, so the comment "B0 => swapper:0" is hard to
understand.

This patch clarifies the output of perf sched map by not allocating
a letter-number shortname for swapper:0 and printing ". => swapper:0"
as the explanation for swapper:0.

Example:
              *A0       228836.978985 secs A0 => perf:23032
          * .  A0       228836.979016 secs .  => swapper:0
            . *B0       228836.979099 secs B0 => migration/3:22
  *A0       .  B0       228836.979115 secs
   A0       . * .       228836.979115 secs
   A0     *C0   .       228836.979225 secs C0 => ksoftirqd/2:18
   A0     *D0   .       228836.979236 secs D0 => rcu_sched:7
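
The allocation rule shown in the example can be sketched roughly as
follows. The struct, the function names and the wrap-around behaviour
after Z are assumptions made for illustration; they are not copied from
builtin-sched.c.

```c
#include <string.h>

/* Hypothetical allocator state: the next letter and digit to hand out. */
struct shortname_alloc {
	char letter;	/* 'A'..'Z' */
	char digit;	/* '0'..'9' */
};

static void shortname_alloc__init(struct shortname_alloc *a)
{
	a->letter = 'A';
	a->digit = '0';
}

/*
 * Write the shortname for pid into out. The idle task (pid 0,
 * swapper) always gets "." and does not consume a letter-number name.
 */
static void shortname_alloc__next(struct shortname_alloc *a, int pid,
				  char out[3])
{
	if (pid == 0) {
		strcpy(out, ".");
		return;
	}

	out[0] = a->letter;
	out[1] = a->digit;
	out[2] = '\0';

	/* Advance the letter first; after 'Z' wrap and bump the digit. */
	if (a->letter < 'Z') {
		a->letter++;
	} else {
		a->letter = 'A';
		a->digit = (a->digit < '9') ? (char)(a->digit + 1) : '0';
	}
}
```

Fed the pids from the example in order (23032, 0, 22), this yields A0,
"." and B0, matching the "After" output.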

Signed-off-by: Dongsheng <yangds.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1399354741-19522-1-git-send-email-yangds.fnst@cn.fujitsu.com
[ small style fixes to make checkpatch happy ]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 11:09:05 +02:00
Dongsheng e936e8e459 perf tools: Adapt the TASK_STATE_TO_CHAR_STR to new value in kernel space.
Currently, TASK_STATE_TO_CHAR_STR in kernel space has already been expanded to RSDTtZXxKWP,
but it is still RSDTtZX in the perf sched tool.

This patch updates TASK_STATE_TO_CHAR_STR to the new kernel-space value.
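
A small sketch of why the shorter string matters, assuming the kernel
convention that state 0 maps to the first character and the state with
only bit n set maps to character n + 1. The helper name is hypothetical,
and perf's own lookup code may differ.

```c
#include <string.h>

/*
 * Map a task state bitmask to its one-letter code using str.
 * Returns '?' when the lowest set bit has no character in str.
 */
static char task_state_char(const char *str, unsigned long state)
{
	size_t idx = 0;

	if (state == 0)
		return str[0];

	/* Find the position of the lowest set bit. */
	while (!(state & 1UL)) {
		state >>= 1;
		idx++;
	}
	idx++;	/* character 0 is reserved for state 0 */

	if (idx >= strlen(str))
		return '?';
	return str[idx];
}
```

Under this assumption a state value of 64 has no letter in the old
string "RSDTtZX" but maps to 'x' in "RSDTtZXxKWP".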

Signed-off-by: Dongsheng <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/6d2f55dc1e02c1e29a5d70bfeb9d6e8863caf2aa.1399273302.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 10:01:49 +02:00
Dongsheng 7fff959783 perf tools: Add missing event for perf sched record.
We should record and process the sched:sched_wakeup_new event in the
perf sched tool, but currently there is a processing function
for it while the record subcommand does not record it.

This patch adds -e sched:sched_wakeup_new to perf sched record.

Signed-off-by: Dongsheng <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/710c6edd2162b2cea1711443f54de47c0210d9fd.1399273302.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 10:01:41 +02:00
Ramkumar Ramachandra a83edb2dfc perf sched: Introduce --list-cmds for use by scripts
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1394853474-31019-5-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:05 +02:00
Ramkumar Ramachandra 80790e0b7e perf sched: Fixup header alignment in 'latency' output
Before:

 ---------------------------------------------------------------------------------------------------------------
  Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at     |
 ---------------------------------------------------------------------------------------------------------------
  ...                   |               |          |                  |                  |
  git:24540             |    336.622 ms |       10 | avg:    0.032 ms | max:    0.062 ms | max at: 115610.111046 s
  git:24541             |      0.457 ms |        1 | avg:    0.000 ms | max:    0.000 ms | max at:  0.000000 s
 -----------------------------------------------------------------------------------------
  TOTAL:                |    396.542 ms |      353 |
 ---------------------------------------------------

After:

 -----------------------------------------------------------------------------------------------------------------
  Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms | Maximum delay at       |
 -----------------------------------------------------------------------------------------------------------------
  ...                   |               |          |                  |                  |
  git:24540             |    336.622 ms |       10 | avg:    0.032 ms | max:    0.062 ms | max at: 115610.111046 s
  git:24541             |      0.457 ms |        1 | avg:    0.000 ms | max:    0.000 ms | max at:      0.000000 s
 -----------------------------------------------------------------------------------------------------------------
  TOTAL:                |    396.542 ms |      353 |
 ---------------------------------------------------

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1395065901-25740-1-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-03-18 18:16:55 -03:00
Arnaldo Carvalho de Melo 74cf249d5c perf tools: Use zfree to help detect use after free bugs
Several areas already used this technique, so do some audit to
consistently use it elsewhere.
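
A minimal sketch of the technique: free the memory and reset the pointer
in one step, so that a stale use hits NULL instead of freed memory. The
macro and struct names below are hypothetical; perf's own zfree()
definition may be written differently.

```c
#include <stdlib.h>

/* Free *ptr and set it to NULL; ptr is the address of the pointer. */
#define zfree_sketch(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)

/* Hypothetical owner of a heap-allocated string. */
struct name_holder {
	char *name;
};

/* Release the holder's resources; safe to call more than once. */
static void name_holder__exit(struct name_holder *h)
{
	zfree_sketch(&h->name);
}
```

Because free(NULL) is a no-op, a second cleanup call is harmless, and any
later dereference of the member faults immediately rather than reading
recycled memory.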

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9sbere0kkplwe45ak6rk4a1f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-27 17:08:19 -03:00
Arnaldo Carvalho de Melo 744a971940 perf evsel: Ditch evsel->handler.data field
Not needed since this cset:

  fcf65bf149: perf evsel: Cache associated event_format

So lets trim this struct a bit.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-j8setslokt0goiwxq9dogzqm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 10:40:47 -03:00
Frederic Weisbecker b9c5143a01 perf tools: Use an accessor to read thread comm
As the thread comm is going to be implemented by way of a more
complicated data structure than just a pointer to a string from the
thread struct, convert the readers of comm to use an accessor instead of
accessing it directly.

The accessor will later be overridden to support an enhanced comm
implementation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wr683zwy94hmj4ibogmnv9ce@git.kernel.org
[ Rename thread__comm_curr() to thread__comm_str() ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Fixed up some minor const pointer issues ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-04 11:50:28 -03:00
Adrian Hunter 156a2b0229 perf sched: Optimize build time
builtin-sched.c took a long time to build with -O6 optimization. This
turned out to be caused by:

	.curr_pid = { [0 ... MAX_CPUS - 1] = -1 },

Fix by initializing curr_pid programmatically.

This addresses the problem cured in f36f83f947 using a smaller hammer.
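
Initializing the array at run time, as described above, might look like
the following sketch. The struct name, the helper name and the value of
MAX_CPUS used here are assumptions for illustration only.

```c
#define MAX_CPUS_SKETCH 4096	/* assumed value, for illustration only */

/* Hypothetical stand-in for the relevant part of struct perf_sched. */
struct sched_sketch {
	int curr_pid[MAX_CPUS_SKETCH];
};

/* Set every per-CPU slot to -1 ("no task seen yet") with a loop. */
static void sched_sketch__init(struct sched_sketch *s)
{
	unsigned int i;

	for (i = 0; i < MAX_CPUS_SKETCH; i++)
		s->curr_pid[i] = -1;
}
```

The loop replaces the range designated initializer, so the compiler no
longer has to expand and optimize a very large initializer for the
struct at build time.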

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382427258-17495-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-23 10:24:29 -03:00
Adrian Hunter 8a39df8faa perf sched: Make struct perf_sched sched a local variable
Change "struct perf_sched sched" from being global to being local.

The build slowdown cured by f36f83f947 is dealt with in the following
patch, by programmatically setting perf_sched.curr_pid.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382427258-17495-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-23 10:24:19 -03:00
Jiri Olsa f5fc14124c perf tools: Add data object to handle perf data file
This patch adds a 'struct perf_data_file' object as a placeholder for
all attributes regarding perf.data file handling, and changes
perf_session__new to take it as an argument.

The rest of the functionality will be added later to keep this change
simple enough, because all the places using perf_session are changed
now.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381847254-28809-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-21 17:33:24 -03:00
Adrian Hunter 314add6b1f perf tools: change machine__findnew_thread() to set thread pid
Add a new parameter for 'pid' to machine__findnew_thread().
Change callers to pass 'pid' when it is known.

Note that callers sometimes want to find the main thread
which has the memory maps.  The main thread has tid == pid
so the usage in that case is:

	machine__findnew_thread(machine, pid, pid)

whereas the usage to find the specific thread is:

	machine__findnew_thread(machine, pid, tid)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377591794-30553-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-29 11:51:31 -03:00
David Ahern cb627505ae perf sched: Remove sched_process_fork tracepoint
The PERF_RECORD_FORK event is already collected as part of the use of
cmd_record and those events are analyzed as part of the libperf
machinery.  Using the fork tracepoint as well just duplicates the event
load.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375930261-77273-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-12 10:31:07 -03:00
David Ahern 4a957e4df1 perf sched: Remove sched_process_exit tracepoint
Event is not needed nor analyzed. Since perf-sched leverages perf-record
to capture the sched data, we already capture task events like EXIT.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375930261-77273-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-12 10:31:06 -03:00
David Ahern ffb273dd7e perf sched: Remove thread lookup in sample handler
Not used in the function, so no sense in doing the lookup here. Thread
look up will be done in the timehist command, and no sense in doing it
twice.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375930261-77273-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-12 10:31:06 -03:00
David Ahern ad9def7ca0 perf sched: Simplify arguments to read_events
Destroy argument is not necessary. If session is not returned to caller,
then clean it up.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375930261-77273-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-12 10:31:05 -03:00
Adrian Hunter 380512345e perf tools: struct thread has a tid not a pid
As evident from 'machine__process_fork_event()' and
'machine__process_exit_event()' the 'pid' member of struct thread is
actually the tid.

Rename 'pid' to 'tid' in struct thread accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1372944040-32690-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-12 13:53:50 -03:00
Namhyung Kim f36f83f947 perf sched: Move struct perf_sched definition out of cmd_sched()
For some reason it consumed quite a lot of compile time when declared
as a local variable, and the slowdown disappeared when it was moved out
of the function.  Moving other variables/tables didn't help.

On my system the build time for this single-file change was reduced
from 11s to 3s.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1370324779-16921-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-12 13:52:35 -03:00
Jiri Olsa 4a4d371a4d perf record: Remove -f/--force option
It no longer has any effect on the processing and is marked as obsolete
anyway.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-tvwyspiqr4getzfib2lw06ty@git.kernel.org
Link: http://lkml.kernel.org/r/1372307120-737-1-git-send-email-namhyung@kernel.org
[ combined patch removing the -f usage in various sub-commands, such as 'perf sched', etc, by Namhyung Kim ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-08 17:37:25 -03:00
Arnaldo Carvalho de Melo 1c6763cb99 Revert "perf sched: Handle PERF_RECORD_EXIT events"
This reverts commit 0439539f72.

This caused this segfault:

[root@sandy linux]# perf sched rec
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.306 MB perf.data (~57062 samples) ]
perf
[root@sandy linux]# perf sched lat
perf: builtin-sched.c:781: thread_atoms_search: Assertion `!(thread != atoms->thread)' failed.
Aborted (core dumped)
[root@sandy linux]#

Further investigation is needed to check that even with machine__remove_thread()
not really deleting the thread referenced in the PERF_RECORD_EXIT (it goes to
machine->dead_threads, because references may still exist to them in things like
 hist, etc) some event later comes for this dead thread and then
machine__findnew_thread() will create a new thread instance that will not be the
same as the one referenced by work_atoms->thread in thread_atoms_search().

For now just revert this patch to get 'perf sched lat' working again.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-hg4s6e5txiwqe00h8rdg1sin@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-04-01 12:22:34 -03:00
Arnaldo Carvalho de Melo 28a6b6aa54 perf session: There is no need for a per session hists instance
It was being used just for its stats member, so ditch session->hists and
use just what is needed, session->stats.

This completes the move to support multiple events in the hists layer;
the last user of session->hists was 'perf diff', but Jiri Olsa fixed
that some time ago.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pimk92kek8kcp4dmb1jakoro@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-24 16:40:12 -03:00
Feng Tang 70cb4e963f perf tools: Add a global variable "const char *input_name"
Currently many perf commands (annotate, evlist, report, script, lock,
etc.) support the "-i" option to choose a specific perf data file, and
all of them create a local "input_name" to save the name of that file.

Since most of these commands need it, we can add a global variable for
it, which also brings some other benefits:

1. When calling script browser inside hists/annotation browser, it needs
to know the perf data file name to run that script.

2. For future features such as switching to another perf data file at
runtime, this variable can also help.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1351569369-26732-2-git-send-email-feng.tang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-29 11:45:34 -02:00
Arnaldo Carvalho de Melo 0439539f72 perf sched: Handle PERF_RECORD_EXIT events
Noticed sched wasn't handling those events while introducing
perf_event__process_{fork,exit}.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-035rzjtnv9ri8sssi7ojjjq0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-06 16:34:03 -03:00
Arnaldo Carvalho de Melo f62d3f0f45 perf event: No need to create a thread when handling PERF_RECORD_EXIT
When we were processing a PERF_RECORD_EXIT event we first used
machine__findnew_thread for both the thread exiting and for its parent,
only to use just the thread struct associated with the one exiting, and
to just delete it.

If it existed, i.e. not created at this very moment in
machine__findnew_thread, it will be moved to the machine->dead_threads
linked list, because we may have hist_entries pointing to it, but if it
was created just to be deleted, it will just sit there with no
references at all.

Use the new machine__find_thread() method so that if it is not there, we
don't create it.

As a bonus the parent thread will also not be created at this point.

Create process_fork() and process_exit() helpers to use this and make
the builtins use it instead of the generic process_task(), ditched by
this patch.
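
A minimal sketch of the find vs. findnew distinction described above. The
structs and function names are hypothetical simplifications, not perf's
real 'struct machine' API, and the dead_threads list is left out: a known
thread is simply freed on exit.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced thread table; not perf's real struct machine. */
struct thread {
	int tid;
	struct thread *next;
};

struct machine {
	struct thread *threads;
	int nr_threads;
};

/* Lookup only: returns NULL when the tid is unknown. */
static struct thread *machine_find_thread(struct machine *m, int tid)
{
	struct thread *t;

	for (t = m->threads; t; t = t->next)
		if (t->tid == tid)
			return t;
	return NULL;
}

/* Lookup that creates the entry on a miss. */
static struct thread *machine_findnew_thread(struct machine *m, int tid)
{
	struct thread *t = machine_find_thread(m, tid);

	if (t)
		return t;
	t = calloc(1, sizeof(*t));
	if (!t)
		return NULL;
	t->tid = tid;
	t->next = m->threads;
	m->threads = t;
	m->nr_threads++;
	return t;
}

/* EXIT handling: drop the thread if it is known, never create it.
 * Returns 1 if a thread was removed, 0 otherwise. */
static int process_exit(struct machine *m, int tid)
{
	struct thread **pp;

	for (pp = &m->threads; *pp; pp = &(*pp)->next) {
		if ((*pp)->tid == tid) {
			struct thread *dead = *pp;

			*pp = dead->next;
			free(dead);
			m->nr_threads--;
			return 1;
		}
	}
	return 0;
}
```

With the lookup-only path, an EXIT event for an unknown tid leaves the
table untouched instead of allocating an entry that nothing references.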

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-z7n2y98ebjyrvmytaope4vdl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-06 16:33:45 -03:00
Arnaldo Carvalho de Melo 73ee3b2768 perf sched: Look up thread using tid instead of pid
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zdu8up6vahogckg2uft7wh3n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-10-02 18:36:28 -03:00
Namhyung Kim 60b7d14af4 perf sched: Fixup for the die() removal
The commit a116e05dcf ("perf sched: Remove die() calls") replaced
die() calls with pr_debug + return -1, but it should be pr_err, otherwise
it'll not show up unless -v option is given.  Fix it.
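
A rough sketch of why the message level matters, assuming a verbosity
counter that -v increments. The helper and macro names are hypothetical
and only mimic the idea of level-gated printing; they are not perf's
actual debug macros.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical verbosity level; in this sketch -v would set it to 1. */
static int verbose;

/* Print only when 'level' does not exceed the current verbosity.
 * Returns the number of characters written, or 0 when suppressed. */
static int eprintf_level(int level, const char *fmt, ...)
{
	va_list ap;
	int ret = 0;

	if (level <= verbose) {
		va_start(ap, fmt);
		ret = vfprintf(stderr, fmt, ap);
		va_end(ap);
	}
	return ret;
}

/* Errors are always shown; debug output needs at least one -v. */
#define sketch_pr_err(...)   eprintf_level(0, __VA_ARGS__)
#define sketch_pr_debug(...) eprintf_level(1, __VA_ARGS__)
```

An error path that reports through the debug level and then returns -1
fails silently for anyone not running with -v.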

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347415866-303-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-14 15:48:38 -03:00
Arnaldo Carvalho de Melo 9ec3f4e437 perf sched: Don't read all tracepoint variables in advance
Do it just at the actual consumer of these fields, that way we avoid
needless lookups:

  [root@sandy ~]# perf sched record sleep 30s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]

Before:

  [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null

   Performance counter stats for 'perf sched lat' (10 runs):

          103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                  12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                   0 cpu-migrations            #    0.000 K/sec
               7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
         345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
         106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
          62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
         628,246,586 instructions              #    1.82  insns per cycle
                                               #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
         134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
           1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]

         0.104333272 seconds time elapsed                                          ( +-  0.33% )

After:

  [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null

   Performance counter stats for 'perf sched lat' (10 runs):

         98.848272 task-clock                #    0.993 CPUs utilized            ( +-  0.48% )
                11 context-switches          #    0.112 K/sec                    ( +-  2.83% )
                 0 cpu-migrations            #    0.003 K/sec                    ( +- 50.92% )
             7,604 page-faults               #    0.077 M/sec                    ( +-  0.00% )
       332,216,085 cycles                    #    3.361 GHz                      ( +-  0.14% ) [82.87%]
       100,623,710 stalled-cycles-frontend   #   30.29% frontend cycles idle     ( +-  0.53% ) [82.95%]
        58,788,692 stalled-cycles-backend    #   17.70% backend  cycles idle     ( +-  0.59% ) [67.15%]
       609,402,433 instructions              #    1.83  insns per cycle
                                             #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.76%]
       131,277,138 branches                  # 1328.067 M/sec                    ( +-  0.06% ) [83.77%]
         1,117,871 branch-misses             #    0.85% of all branches          ( +-  0.32% ) [83.51%]

       0.099580430 seconds time elapsed                                          ( +-  0.48% )

  [root@sandy ~]#

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-kracdpw8wqlr0xjh75uk8g11@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 20:39:19 -03:00
Arnaldo Carvalho de Melo 2b7fcbc5a9 perf sched: Use perf_evsel__{int,str}val
This patch also stops reading the common fields, as they were not being used except
for one ->common_pid case that was replaced by sample->tid, i.e. the info is already
in the perf_sample struct.

Also it only fills the _event structures when there is a handler.

  [root@sandy ~]# perf sched record sleep 30s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 8.585 MB perf.data (~375063 samples) ]

Before:

  [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null

   Performance counter stats for 'perf sched lat' (10 runs):

          129.117838 task-clock                #    0.994 CPUs utilized            ( +-  0.28% )
                  14 context-switches          #    0.111 K/sec                    ( +-  2.10% )
                   0 cpu-migrations            #    0.002 K/sec                    ( +- 66.67% )
               7,654 page-faults               #    0.059 M/sec                    ( +-  0.67% )
         438,121,661 cycles                    #    3.393 GHz                      ( +-  0.06% ) [83.06%]
         150,808,605 stalled-cycles-frontend   #   34.42% frontend cycles idle     ( +-  0.14% ) [83.10%]
          80,748,941 stalled-cycles-backend    #   18.43% backend  cycles idle     ( +-  0.64% ) [66.73%]
         758,605,879 instructions              #    1.73  insns per cycle
                                               #    0.20  stalled cycles per insn  ( +-  0.08% ) [83.54%]
         162,164,321 branches                  # 1255.940 M/sec                    ( +-  0.10% ) [83.70%]
           1,609,903 branch-misses             #    0.99% of all branches          ( +-  0.08% ) [83.62%]

         0.129949153 seconds time elapsed                                          ( +-  0.28% )

After:

  [root@sandy ~]# perf stat -r 10 perf sched lat > /dev/null

   Performance counter stats for 'perf sched lat' (10 runs):

          103.592215 task-clock                #    0.993 CPUs utilized            ( +-  0.33% )
                  12 context-switches          #    0.114 K/sec                    ( +-  3.29% )
                   0 cpu-migrations            #    0.000 K/sec
               7,605 page-faults               #    0.073 M/sec                    ( +-  0.00% )
         345,796,112 cycles                    #    3.338 GHz                      ( +-  0.07% ) [82.90%]
         106,876,796 stalled-cycles-frontend   #   30.91% frontend cycles idle     ( +-  0.38% ) [83.23%]
          62,060,877 stalled-cycles-backend    #   17.95% backend  cycles idle     ( +-  0.80% ) [67.14%]
         628,246,586 instructions              #    1.82  insns per cycle
                                               #    0.17  stalled cycles per insn  ( +-  0.04% ) [83.64%]
         134,962,057 branches                  # 1302.820 M/sec                    ( +-  0.10% ) [83.64%]
           1,233,037 branch-misses             #    0.91% of all branches          ( +-  0.29% ) [83.41%]

         0.104333272 seconds time elapsed                                          ( +-  0.33% )

  [root@sandy ~]#

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-weu9t63zkrfrazkn0gxj48xy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 19:33:51 -03:00
Arnaldo Carvalho de Melo 0e9b07e574 perf sched: Use perf_tool as ancestor
So that we can remove all the globals.

Before:

   text	   data	    bss	    dec	    hex	filename
1586833	 110368	1438600	3135801	 2fd939	/tmp/oldperf

After:

   text	   data	    bss	    dec	    hex	filename
1629329	  93568	 848328	2571225	 273bd9	/root/bin/perf
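
A small sketch of the "tool as ancestor" pattern, assuming a base struct
embedded as a member of the tool-specific struct. The struct layouts and
names here are hypothetical; only the container_of idiom itself is the
point.

```c
#include <stddef.h>

/* Hypothetical miniature of a base "tool" with a single callback. */
struct tool {
	int (*sample)(struct tool *tool, int value);
};

/* State that would otherwise live in globals sits next to the base. */
struct sched_tool {
	struct tool tool;
	long nr_events;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The callback only receives the base pointer and derives its context. */
static int sched_sample(struct tool *tool, int value)
{
	struct sched_tool *sched = container_of(tool, struct sched_tool, tool);

	sched->nr_events += value;
	return 0;
}
```

Because every callback can reach its state through the base pointer, the
state no longer has to be file-scope.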

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-oph40vikij0crjz4eyapneov@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 17:29:27 -03:00
Arnaldo Carvalho de Melo 4218e67341 perf sched: Remove unused thread parameter
From the tracepoint handling routines.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mcqd9mv34z6he0wqiz4a3mh9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 13:18:47 -03:00
Irina Tirdea 1d037ca164 perf tools: Use __maybe_unused for unused variables
perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored

__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has variables with this name
in its headers.

The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.
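
For illustration, a minimal sketch of the annotation in use. The macro
definition follows the __attribute__((unused)) spelling quoted above; the
handler function is a hypothetical example.

```c
#include <stddef.h>

/* Single definition, matching the kernel's spelling of the attribute. */
#define __maybe_unused __attribute__((unused))

/* 'ctx' is deliberately ignored; the annotation keeps
 * -Wunused-parameter quiet without changing behavior. */
static int handle_event(int value, void *ctx __maybe_unused)
{
	return value * 2;
}
```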

Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-11 12:19:15 -03:00
Arnaldo Carvalho de Melo a116e05dcf perf sched: Remove die() calls
Just use pr_err() + return -1 and perf_session__process_events to abort
when some event would call die(), then let the perf's main() exit doing
whatever it needs.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-88cwdogxqomsy9tfr8r0as58@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-09 11:39:02 -03:00
Arnaldo Carvalho de Melo 7f7f8d0bea perf sched: Use perf_sample
To reduce the number of parameters passed to the various event handling
functions.

Cc: Andrey Wagin <avagin@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-fc537qykjjqzvyol5fecx6ug@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-08-07 23:46:19 -03:00
Arnaldo Carvalho de Melo fcf65bf149 perf evsel: Cache associated event_format
We already lookup the associated event_format when reading the perf.data
header, so that we can cache the tracepoint name in evsel->name, so do
it a little further and save the event_format itself, so that we can
avoid relookups in tools that need to access it.

Change the tools to take the most obvious advantage, when they were
using pevent_find_event directly. More work is needed for further
removing the need of a pointer to pevent, such as when asking for event
field values ("common_pid" and the other common fields and per
event_format fields).

This is something that was planned but only got actually done when
Andrey Wagin needed to do this lookup at perf_tool->sample() time, when
we don't have access to pevent (session->pevent) to use with
pevent_find_event().

Cc: Andrey Wagin <avagin@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-txkvew2ckko0b594ae8fbnyk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-08-07 23:43:37 -03:00
Arnaldo Carvalho de Melo da3789628f perf tools: Stop using a global trace events description list
The pevent thing is per perf.data file, so I made it stop being static
and become a perf_session member, so tools processing perf.data files
use perf_session and _there_ we read the trace events description into
session->pevent and then change everywhere to stop using that single
global pevent variable and use the per session one.

Note that it _doesn't_ fall back to trace__event_id, as we're not
interested at all in what is present in the
/sys/kernel/debug/tracing/events in the workstation doing the analysis,
just in what is in the perf.data file.

This patch also introduces perf_session__set_tracepoints_handlers that
is the perf perf.data/session way to associate handlers to tracepoint
events by resolving their IDs using the events descriptions stored in a
perf.data file. Make 'perf sched' use it.

Reported-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Tested-by: Dmitry Antipov <dmitry.antipov@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linaro-dev@lists.linaro.org
Cc: patches@linaro.org
Link: http://lkml.kernel.org/r/20120625232016.GA28525@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-06-27 13:08:42 -03:00
Arnaldo Carvalho de Melo 22c8b84320 perf tools: Don't access evsel->name directly
One needs to use perf_evsel__name() so that if needed the name gets
synthesized and stored in evsel->name, from where perf_evsel__name()
will serve from then on.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ml7zbenjmri9bghmrea0jm0d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-06-19 13:06:21 -03:00
Steven Rostedt aaf045f723 perf: Have perf use the new libtraceevent.a library
The event parsing code in perf was originally copied from trace-cmd
but never was kept up-to-date with the changes that were done there.
The trace-cmd libtraceevent.a code is much more mature than what is
currently in perf.

This updates the code to use wrappers to handle the calls to the
new event parsing code. The new code requires a handle to be passed
around, which removes the global event variables and allows
more than one event structure to be read from different files
(and different machines).

But perf still has the old global events and the code throughout
perf does not yet have a nice way to pass around a handle.
A global 'pevent' has been made for perf and the old calls have
been created as wrappers to the new event parsing code that uses
the global pevent.

With this change, perf can later incorporate the pevent handle into
the perf structures and allow more than one file to be read and
compared, that contain different events.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2012-04-25 13:28:48 +02:00
Markus Trippelsdorf 7b78f13603 perf tools: Fix getrusage() related build failure on glibc trunk
On a system running glibc trunk perf doesn't build:

    CC builtin-sched.o
    builtin-sched.c: In function ‘get_cpu_usage_nsec_parent’:
    builtin-sched.c:399:16: error: storage size of ‘ru’ isn’t known
    builtin-sched.c:403:2: error: implicit declaration of function ‘getrusage’ [-Werror=implicit-function-declaration]
    [...]

Fix it by including sys/resource.h.
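
A minimal sketch of a getrusage() caller with the header in place. The
function name is hypothetical and it measures the calling process rather
than reproducing the code in builtin-sched.c.

```c
#include <stdint.h>
#include <sys/resource.h>	/* declares struct rusage and getrusage() */
#include <sys/time.h>

/* CPU time (user + system) consumed so far by this process, in ns.
 * Returns 0 if getrusage() fails. */
static uint64_t cpu_usage_nsec_self(void)
{
	struct rusage ru;
	uint64_t sum;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return 0;

	sum  = (uint64_t)ru.ru_utime.tv_sec * 1000000000ULL +
	       (uint64_t)ru.ru_utime.tv_usec * 1000ULL;
	sum += (uint64_t)ru.ru_stime.tv_sec * 1000000000ULL +
	       (uint64_t)ru.ru_stime.tv_usec * 1000ULL;
	return sum;
}
```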

Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120404084527.GA294@x4
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-04 11:59:00 +02:00
Robert Richter efad14150a perf report: Accept fifos as input file
The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:

 # perf record -a -e cpu-cycles sleep 2 | perf report | cat
 failed to open perf.data: No such file or directory  (try 'perf record' first)

While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.

Applies to the following commands:

 perf annotate
 perf buildid-list
 perf evlist
 perf kmem
 perf lock
 perf report
 perf sched
 perf script
 perf timechart

Also fixes char const* -> const char* type declaration for filename
strings.

v2:
* Prevent potential null pointer access to input_name in
  builtin-report.c. Needed due to removal of patch "perf report: Setup
  browser if stdout is a pipe"
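
A sketch of how a default input could be chosen along these lines,
assuming "-" is the conventional name for reading from stdin. The helper
name and its stdin_fd parameter are hypothetical and exist only to make
the logic testable.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* An explicit name wins. Otherwise use "-" (stdin) when stdin is a
 * FIFO, and fall back to "perf.data". */
static const char *default_input_name(const char *input_name, int stdin_fd)
{
	struct stat st;

	if (input_name && input_name[0] != '\0')
		return input_name;
	if (fstat(stdin_fd, &st) == 0 && S_ISFIFO(st.st_mode))
		return "-";
	return "perf.data";
}
```

A caller would normally pass STDIN_FILENO, so that a command on the
receiving end of a pipe reads the piped data instead of looking for
perf.data.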

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-12-23 17:01:03 -02:00
Arnaldo Carvalho de Melo ee29be625b perf tools: Save some loops using perf_evlist__id2evsel
Since we already ask for PERF_SAMPLE_ID and use it to quickly find the
associated evsel, add handler func + data to struct perf_evsel to avoid
using chains of if(strcmp(event_name)) and also to avoid all the linear
list searches via trace_event_find.
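
A minimal sketch of the dispatch idea, assuming each evsel carries a
handler pointer bound once at setup time. The reduced struct and all
function names are hypothetical, and the id lookup is linear here only
for brevity.

```c
#include <stddef.h>

struct evsel;
typedef int (*sample_handler_t)(struct evsel *evsel, int payload);

/* Hypothetical, reduced evsel: an id plus the handler bound to it. */
struct evsel {
	unsigned long long id;
	sample_handler_t handler;
	int hits;
};

static int count_hit(struct evsel *evsel, int payload)
{
	evsel->hits += payload;
	return 0;
}

/* Resolve a sample id to its evsel. */
static struct evsel *id2evsel(struct evsel *evsels, size_t nr,
			      unsigned long long id)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (evsels[i].id == id)
			return &evsels[i];
	return NULL;
}

/* Dispatch without comparing event names.
 * Returns -1 for an unknown id, 0 when no handler is set. */
static int process_sample(struct evsel *evsels, size_t nr,
			  unsigned long long id, int payload)
{
	struct evsel *evsel = id2evsel(evsels, nr, id);

	if (!evsel)
		return -1;
	if (!evsel->handler)
		return 0;
	return evsel->handler(evsel, payload);
}
```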

To demonstrate the technique convert 'perf sched' to it:

 # perf sched record sleep 5m

And then:

 Performance counter stats for '/tmp/oldperf sched lat':

        646.929438 task-clock                #    0.999 CPUs utilized
                 9 context-switches          #    0.000 M/sec
                 0 CPU-migrations            #    0.000 M/sec
            20,901 page-faults               #    0.032 M/sec
     1,290,144,450 cycles                    #    1.994 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     1,606,158,439 instructions              #    1.24  insns per cycle
       339,088,395 branches                  #  524.151 M/sec
         4,550,735 branch-misses             #    1.34% of all branches

       0.647524759 seconds time elapsed

Versus:

 Performance counter stats for 'perf sched lat':

        473.564691 task-clock                #    0.999 CPUs utilized
                 9 context-switches          #    0.000 M/sec
                 0 CPU-migrations            #    0.000 M/sec
            20,903 page-faults               #    0.044 M/sec
       944,367,984 cycles                    #    1.994 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     1,442,385,571 instructions              #    1.53  insns per cycle
       308,383,106 branches                  #  651.195 M/sec
         4,481,784 branch-misses             #    1.45% of all branches

       0.474215751 seconds time elapsed

[root@emilia ~]#

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1kbzpl74lwi6lavpqke2u2p3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 17:57:40 -02:00
Arnaldo Carvalho de Melo 45694aa770 perf tools: Rename perf_event_ops to perf_tool
To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 10:39:28 -02:00
Arnaldo Carvalho de Melo 743eb86865 perf tools: Resolve machine earlier and pass it to perf_event_ops
Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 10:39:12 -02:00
Arnaldo Carvalho de Melo d20deb64e0 perf tools: Pass tool context in the perf_event_ops functions
So that we don't need to have that many globals.

Next steps will remove the 'session' pointer, that in most cases is
not needed.

Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 10:38:56 -02:00
Arnaldo Carvalho de Melo e3f4260962 perf tools: Use evsel->attr.sample_type instead of session->sample_type
Eventually session->sample_type will go away as we want to support
multiple sample types per session, so use it from the evsel which is a
step in that direction.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-0vwdpjcwbjezw459lw5n3ew1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28 10:38:14 -02:00
Jiri Olsa 580cabed88 perf sched: Usage leftover from trace -> script rename
The 'perf sched' command usage still shows the 'trace' command instead of
the 'script' command.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110809124651.GD2056@jolsa.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 13:32:12 -03:00
Jiri Olsa 4c09bafae3 perf sched: Do not delete session object prematurely
The session object is released prematurely when processing events for
latency command. The session's thread objects are used within the
output_lat_thread function.

Running the following commands:

 # perf sched record
 # perf sched latency

the latter displays incorrect data and might cause an access violation.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1312837414-3819-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 13:31:38 -03:00
Arnaldo Carvalho de Melo 9e69c21082 perf session: Pass evsel in event_ops->sample()
Resolve the sample->id to an evsel, since the most advanced tools,
report and annotate, need it, and the others will too when they evolve to
properly support multi-event perf.data files.

Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-23 19:28:58 -03:00
Kyle McMartin fb7d0b3cef perf tool: Fix gcc 4.6.0 issues
GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
due to the -Werror=unused-but-set-variable flag.

I've gone through and annotated some of the assignments that had side
effects (ie: return value from a function) with the __used annotation,
and in some cases, just removed unused code.

In a few cases, we were assigning something useful, but not using it in
later parts of the function.

kyle@dreadnought:~/src% gcc --version
gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)

Cc: Ingo Molnar <mingo@redhat.com>
LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
[ committer note: Fixed up the annotation fixes, as that code moved recently ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07 12:41:41 -02:00
Arnaldo Carvalho de Melo 8115d60c32 perf tools: Kill event_t typedef, use 'union perf_event' instead
And move the event_t methods to the perf_event__ too.

No code changes, just namespace consistency.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:25:37 -02:00
Arnaldo Carvalho de Melo 8d50e5b417 perf tools: Rename 'struct sample_data' to 'struct perf_sample'
Making the namespace more uniform.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:25:20 -02:00
Arnaldo Carvalho de Melo 9486aa3877 perf tools: Fix 64 bit integer format strings
Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.
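
A short sketch of the portable form, assuming u64 is a typedef of
uint64_t as described above. The helper name is hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

typedef uint64_t u64;

/* PRIu64 expands to the correct length modifier for uint64_t on each
 * ABI, so the same format string works on 32-bit and 64-bit targets. */
static int format_u64(char *buf, size_t size, u64 value)
{
	return snprintf(buf, size, "%" PRIu64, value);
}
```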

Reported by Denis Kirjanov, who provided a patch for one case; I went
and changed all cases.

Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
Cc: Denis Kirjanov <dkirjanov@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pingtian Han <phan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 23:41:57 -02:00
Stephane Eranian 9710118bd4 perf sched: Fix list of events, dropping unsupported ':r' modifier
Looks to me like the :r modifier is not supported anymore, so remove it from
the list of events.

Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <AANLkTim=jawJyBj0iFd0r4-LCKzvjFW+NddzJMD5GUB9@mail.gmail.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-13 11:25:49 -02:00
Jiri Pirko 12f7e03643 perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() fail
on ppc64:
/usr/include/bits/local_lim.h:#define PTHREAD_STACK_MIN	131072

therefore the following set of commands:

gives:
perf.2.6.37test: builtin-sched.c:493: create_tasks: Assertion `!(err)' failed.

So make sure we do not set stack size lower than PTHREAD_STACK_MIN.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110110160417.GB2685@psychotron.brq.redhat.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10 14:16:00 -02:00
Arnaldo Carvalho de Melo e462dc553e perf sched: Fix allocation result check
Bug introduced in ce47dc56.

Reported-by: Mike Galbraith <efault@gmx.de>
Cc: Chris Samuel <chris@csamuel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10 10:48:47 -02:00
Ian Munsie 21ef97f05a perf session: Fallback to unordered processing if no sample_id_all
If we are running the new perf on an old kernel without support for
sample_id_all, we should fall back to the old unordered processing of
events. If we didn't, then we would *always* process events without
timestamps out of order, whether or not we hit a reordering race. In
other words, instead of there being a chance of not attributing samples
correctly, we would guarantee that samples would not be attributed.

While processing all events without timestamps before events with
timestamps may seem like an intuitive solution, it falls down as
PERF_RECORD_EXIT events would also be processed before any samples.
Even with a workaround for that case, samples before/after an exec would
not be attributed correctly.

This patch allows commands to indicate whether they need to fall back to
unordered processing, so that commands that do not care about timestamps
on every event will not be affected. If we do fall back, this will print
out a warning if report -D was invoked.

This patch adds the test in perf_session__new so that we only need to
test once per session. Commands that do not use an event_ops (such as
record and top) can simply pass NULL in its place.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 20:17:51 -02:00
Chris Samuel ce47dc56a2 perf tools: Catch a few uncheck calloc/malloc's
There were a few stray calloc()'s and malloc()'s which were not having
their return values checked for success.

As the calling code either already coped with failure or didn't actually
care we just return -ENOMEM at that point.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Chris Samuel <chris@csamuel.org>
LKML-Reference: <4CDDF95A.1050400@csamuel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-06 12:52:35 -02:00
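A minimal sketch of the pattern ce47dc56a2 above describes: check the result of calloc() and return -ENOMEM on failure. The struct and function names are hypothetical and are not taken from the perf sources. The sense of the test matters: failure is a NULL result.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container, used only for this illustration. */
struct pid_table {
	size_t nr;
	int *pids;
};

/*
 * Allocate room for nr pids (nr is assumed to be non-zero).
 * Returns 0 on success or -ENOMEM if the allocation failed.
 */
static int pid_table__init(struct pid_table *table, size_t nr)
{
	table->pids = calloc(nr, sizeof(*table->pids));
	if (table->pids == NULL)	/* NULL means the allocation failed */
		return -ENOMEM;

	table->nr = nr;
	return 0;
}

static void pid_table__exit(struct pid_table *table)
{
	free(table->pids);
	table->pids = NULL;
	table->nr = 0;
}
```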
Arnaldo Carvalho de Melo 640c03ce83 perf session: Parse sample earlier
At perf_session__process_event, so that we reduce the number of lines in each
tool sample processing routine that now receives a sample_data pointer already
parsed.

This will also be useful in the next patch, where we'll allow sampling the
identity fields in MMAP, FORK, EXIT, etc., so that it will be possible to see
(cpu, timestamp) for every event.

Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.

There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-04 23:05:19 -02:00
Ingo Molnar 133dc4c39c perf: Rename 'perf trace' to 'perf script'
Free the perf trace name space and rename the trace to 'script' which is a
better match for the scripting engine.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-11-16 19:37:44 +01:00
Frederic Weisbecker af64865ba6 perf: Use event__process_task from perf sched
perf sched uses event__process_comm(), which means it can resolve
comms from:

- tasks that have exec'ed (kernel comm events)
- tasks that were running when perf record started the actual
  recording (synthesized comm events)

But perf sched can't resolve the pids of tasks that were created
after the recording started.

To solve this, we need to inherit the comms on fork events using
event__process_task().

This fixes various unresolved pids in perf sched, easily visible
with:
	perf sched record perf bench sched messaging

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-06-01 00:10:32 +02:00
Arnaldo Carvalho de Melo edb7c60e27 perf options: Type check all the remaining OPT_ variants
OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
tools is to set something that has an enum type, that is builtin
compatible with unsigned int.

Several string constifications were done to make OPT_STRING require a
const char * type.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 16:22:41 -03:00
Arnaldo Carvalho de Melo 1967936d68 perf options: Check v type in OPT_U?INTEGER
To avoid problems like the one fixed by Stephane Eranian in 3de29ca, now
we'll get this instead:

	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’

Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
hackers should be already used to this.

With it in place I found some problems, fixed by changing the affected
variables to sensible types or by changing some OPT_INTEGER to OPT_UINTEGER.

Next csets will go thru converting each of the remaining OPT_ so that
review can be made easier by grouping changes per type per patch.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 15:43:38 -03:00
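The compile-time check that 1967936d68 above relies on can be sketched as follows, assuming GCC or Clang (it uses the __typeof__ and __builtin_types_compatible_p extensions and a struct holding only an unnamed bit-field). CHECKED_PTR, struct int_option and INT_OPTION are hypothetical names for this illustration; the real OPT_ macros in perf may be laid out differently.

```c
#include <stddef.h>

/*
 * Evaluates to 0 when e is false. When e is true the bit-field width
 * becomes -1 and the compiler reports "negative width in bit-field".
 */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))

/*
 * Hypothetical macro: yields ptr unchanged (ptr + 0), but only compiles
 * when *ptr has exactly the requested type.
 */
#define CHECKED_PTR(ptr, type) \
	((ptr) + BUILD_BUG_ON_ZERO( \
		!__builtin_types_compatible_p(__typeof__(*(ptr)), type)))

/* Hypothetical option descriptor for an int-valued option. */
struct int_option {
	const char *name;
	int *value;
};

#define INT_OPTION(n, v) { .name = (n), .value = CHECKED_PTR(v, int) }
```

With these definitions, INT_OPTION("verbose", &verbose) compiles when verbose is an int, while passing the address of an unsigned int or a bool stops the build with the bit-field error quoted in the commit message.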
Arnaldo Carvalho de Melo cee75ac7ec perf hist: Clarify events_stats fields usage
The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.

Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.

Looking at the users, builtin-sched.c can make use of these fields and
stop computing the same totals itself.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 13:16:55 -03:00
Tom Zanussi 454c407ec1 perf: add perf-inject builtin
Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.

What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit.  Doing
that in perf-record itself, however, is off-limits.

This patch introduces perf-inject, which does the same job while
leaving perf-record untouched.  Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:

perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.

Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 13:36:56 -03:00