Commit graph

296 commits

Author SHA1 Message Date
Frederic Weisbecker af0a6fa463 perf tools: Fix missing top level callchain
While recursively printing the branches of each callchains, we
forget to display the root. It is never printed.

Say we have:

    symbol
    f1
    f2
     |
     -------- f3
     |        f4
     |
     ---------f5
              f6

Actually we never see that, instead it displays:

    symbol
    |
    --------- f3
    |         f4
    |
    --------- f5
              f6

However f1 is always the same than "symbol" and if we are
sorting by symbols first then "symbol", f1 and f2 will be well
aligned like in the above example, so displaying f1 looks
redundant here.

But if we are sorting by something else first (dso, comm,
etc...), displaying f1 doesn't look redundant but rather
necessary because the symbol is not well aligned anymore with
its callchain:

     comm     dso        symbol
     f1
     f2
     |
     --------- [...]

And we want the callchain to be obvious.
So we fix the bug by printing the root branch, but we also
filter its first entry if we are sorting by symbols first.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1256246604-17156-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 07:55:16 +02:00
Steven Rostedt 4e3b799d7d perf tools: Use strsep() over strtok_r() for parsing single line
The second argument in the strtok_r() function is not to be used
generically and can have different implementations. Currently
the function parsing of the perf trace code uses the second
argument to copy data from. This can crash the tool or just have
unpredictable results.

The correct solution is to use strsep() which has a defined
result.

I also added a check to see if the result was correct, and will
break out of the loop in case it fails to parse as expected.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020232034.237814877@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 13:39:57 +02:00
Arnaldo Carvalho de Melo e42049926e perf annotate: Use the sym_priv_size area for the histogram
We have this sym_priv_size mechanism for attaching private areas
to struct symbol entries but annotate wasn't using it, adding
private areas to struct symbol in addition to a ->priv pointer.

Scrap all that and use the sym_priv_size mechanism.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1256055940-19511-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 21:12:58 +02:00
Arnaldo Carvalho de Melo ed52ce2e3c perf tools: Add ->unmap_ip operation to struct map
We need this because we get section relative addresses when
reading the symtabs, but when a tool like 'perf annotate' needs
to match these address to what 'objdump -dS' produces we need
the address + section back again.

So in annotate now we look at the 'struct hist_entry' instances
(that weren't really being used) so that we iterate only over
the symbols that had some hit and get the map where that
particular hit happened so that we can get the right address to
match with annotate.

Verified that at least:

 perf annotate mmap_read_counter # Uses the ~/bin/perf binary
 perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all

on a 'perf record perf top' session seems to work.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1255979877-12533-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 07:55:51 +02:00
Ingo Molnar c258449bc9 Merge branch 'perf/urgent' into perf/core
Merge reason: Queue up dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 07:51:44 +02:00
Arjan van de Ven 2e600d01c1 perf timechart: Improve the visual appearance of scheduler delays
[from KS feedback]

Currently, scheduler delays are shown in a mostly transparent,
light yellow color. This color is rather hard to see on several
screens, especially projectors.

This patch changes the color of the scheduler delays to be a
much more "hard" yellow that survived the kernel summit
projector.

Reported-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064731.20ae126a@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 03:39:21 +02:00
Arnaldo Carvalho de Melo 20639c15d2 perf tools: Add missing tools/perf/util/include/string.h
To cure a bunch of:

In file included from util/include/linux/bitmap.h:1,
                 from util/header.h:8,
                 from builtin-trace.c:7:
util/include/../../../../include/linux/bitmap.h:8:26: error:
linux/string.h: No such file or directory make: ***
[builtin-trace.o] Error 1 make: *** Waiting for unfinished
jobs....

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1255972296-11500-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-20 02:59:34 +02:00
Frederic Weisbecker db9f11e36d perf tools: Use DECLARE_BITMAP instead of an open-coded array
Use DECLARE_BITMAP instead of an open coded array for our bitmap
of featured sections.

This makes the array an unsigned long instead of a u64 but since
we use a 256 bits bitmap, the array size shouldn't vary between
different boxes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255795038-13751-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:35 +02:00
Frederic Weisbecker 2ba0825075 perf tools: Introduce bitmask'ed additional headers
This provides a new set of bitmasked headers. A new field is
added in the perf headers that implements a bitmap storing
optional features present in the perf.data file.

The layout can be pictured like this:

(Usual perf headers)(Features bitmap)[Feature 0][Feature
n][Feature 255]

If the bit n is set, then the feature n is used in this file.
They are all set in order. This brings a backward and forward
compatibility.

The trace_info section has moved into such optional features,
this is the first and only one for now.

This is backward compatible with the .32 file version although
it doesn't support the previous separate trace.info file.

And finally it doesn't support the current interim development
version.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255792354-11304-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:35 +02:00
Frederic Weisbecker 5a116dd279 perf tools: Use kernel bitmap library
Use the kernel bitmap library for internal perf tools uses.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-19 09:26:34 +02:00
Ingo Molnar bb3c3e8071 Merge commit 'v2.6.32-rc5' into perf/probes
Conflicts:
	kernel/trace/trace_event_profile.c

Merge reason: update to -rc5 and resolve conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:58:25 +02:00
Masami Hiramatsu 9769833b8e perf: Add DIE_IF() macro for error checking
Add DIE_IF() macro and replace ERR_IF() with it, and use
linux/stringify.h.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000818.16556.82452.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:54:01 +02:00
Masami Hiramatsu 89c69c0eee perf: Use eprintf() for debug messages in perf-probe
Replace debug() macro with eprintf() and add -v option for
showing those messages in perf-probe.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000810.16556.38013.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:54:00 +02:00
Masami Hiramatsu 074fc0e4b3 perf: Use die() for error cases in perf-probe
Use die() for exiting perf-probe with errors. This replaces
perror_exit(), msg_exit() and fprintf()+exit() with die(), and
uses die() in semantic_error().

This also renames 'die' local variables to 'dw_die' for avoiding
name confliction.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000801.16556.46866.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:54:00 +02:00
Ingo Molnar 210f9cb2cf perf tools: Bump version to 0.0.2
We released the first version of perf with 0.0.1 in v2.6.31,
time to double our version number to 0.0.2 ;-)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 10:34:28 +02:00
Li Zefan c171b552a7 perf trace: Add filter Suppport
Add a new option "--filter <filter_str>" to perf record, and
it should be right after "-e trace_point":

 #./perf record -R -f -e irq:irq_handler_entry --filter irq==18
 ^C
 # ./perf trace
            perf-4303  ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0
            init-0     ... irq_handler_entry: irq=18 handler=eth0

See Documentation/trace/events.txt for the syntax of filter
expressions.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4AD6955F.90602@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 11:35:23 +02:00
Steven Rostedt c4dc775f53 perf tools: Remove all char * typecasts and use const in prototype
The (char *) for all the static strings was a fix for the
symptom and not the disease. The real issue was that the
function prototypes needed to be declared "const char *".

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194400.635935008@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:43:17 +02:00
Steven Rostedt afdf1a404e perf tools: Handle - and + in parsing trace print format
The opterators '-' and '+' are not handled in the trace print
format.

To do: '++' and '--'.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194400.330843045@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:40 +02:00
Steven Rostedt cda48461c7 perf tools: Add latency format to trace output
Add the irqs disabled, preemption count, need resched, and other
info that is shown in the latency format of ftrace.

 # perf trace -l
    perf-16457   2..s2. 53636.260344: kmem_cache_free: call_site=ffffffff811198f
    perf-16457   2..s2. 53636.264330: kmem_cache_free: call_site=ffffffff811198f
    perf-16457   2d.s4. 53636.300006: kmem_cache_free: call_site=ffffffff810d889

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194400.076588953@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:39 +02:00
Steven Rostedt 0d1da915c7 perf tools: Handle both versions of ftrace output
The ftrace output events can have either arguments or no
arguments. The parser needs to be able to handle both.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194359.790221427@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:39 +02:00
Steven Rostedt ffa1895561 perf tools: Fix bprintk reading in trace output
The bprintk parsing was broken in more ways than one.

The file parsing was incorrect, and the words used by the
arguments are always 4 bytes aligned, even on 64-bit machines.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194359.520931637@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:38 +02:00
Steven Rostedt 07a4bdddcf perf tools: Still continue on failed parsing of an event
Even though an event may fail to parse, we should not kill the
entire report. The trace should still be able to show what it
can.

If an event fails to parse, a warning is printed, and the output
continues.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194359.190809589@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:38 +02:00
Steven Rostedt 13999e5934 perf tools: Handle the case with and without the "signed" trace field
The trace format files now have a "signed" field. But we should
still be able to handle the kernels that do not have this field.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.888239553@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:37 +02:00
Steven Rostedt f1d1feecf0 perf tools: Handle newlines in trace parsing better
New lines between args in the trace format can break the
parsing. This should not be the case.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.637991808@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:37 +02:00
Steven Rostedt b99af87482 perf tools: Handle * as typecast in trace parsing
The '*' is currently only treated as a multiplication, and it
needs to be handled as a typecast pointer.

This is the version used by trace-cmd.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.409327875@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:36 +02:00
Steven Rostedt 0959b8d65c perf tools: Handle arrays in print fields for trace parsing
The array used by the ftrace stack events (caller[x]) causes
issues with the parser. This adds code to handle the case, but
it also assumes that the array is of type long.

Note, this is a special case used (currently) only by the ftrace
user and kernel stack records.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194358.124833639@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:36 +02:00
Steven Rostedt 298ebc3ef2 perf tools: Handle trace parsing of < and >
The code to handle the '<' and '>' ops was all in place, but
they were not in the switch statement to consider them as valid
ops.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194357.807434040@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:35 +02:00
Steven Rostedt 91ff2bc191 perf tools: Fix backslash processing on trace print formats
The handling of backslashes was broken. It would stop parsing
when encountering one. Also, '\n', '\t', '\r' and '\\' were not
converted.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194357.521974680@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:35 +02:00
Steven Rostedt 924a79af2c perf tools: Handle print concatenations in event format file
kmem_alloc ftrace event format had a string that was broken up
by two tokens. "string 1" "string 2". This patch lets the parser
be able to handle the concatenation.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20091014194357.253818714@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 10:42:34 +02:00
Ingo Molnar b226f744d4 Merge branch 'linus' into perf/core
Merge reason: pick up tools/perf/ changes from upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:44:44 +02:00
Arnaldo Carvalho de Melo d5b889f2ec perf tools: Move threads & last_match to threads.c
This was just being copy'n'pasted all over.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091013141629.GD21809@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 17:12:18 +02:00
Vincent Legoll cfed95a693 perf tools: Do not manually count string lengths
Use strlen & macros instead of manually counting string lengths as
this is error prone and may lend to bugs.

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Linus Torvalds <torvalds@osdl.org>
LKML-Reference: <4727185d0910130118m5387058dndb02ac9b384af9f0@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 11:55:31 +02:00
Masami Hiramatsu 23e8ec0d1c perf probe: Add perf probe command support without libdwarf
Enables 'perf probe' even if libdwarf is not installed. If libdwarf is
not found, 'perf probe' just disables dwarf support. Users can use
'perf probe' to set up new events by using kprobe_events format.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
LKML-Reference: <20091007222830.1684.25665.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12 23:31:53 +02:00
Masami Hiramatsu 4ea42b1814 perf: Add perf probe subcommand, a kprobe-event setup helper
Add perf probe subcommand that implements a kprobe-event setup helper
to the perf command.
This allows user to define kprobe events using C expressions (C line
numbers, C function names, and C local variables).

Usage
-----
 perf probe [<options>] -P 'PROBEDEF' [-P 'PROBEDEF' ...]

    -k, --vmlinux <file>  vmlinux/module pathname
    -P, --probe <p|r:[GRP/]NAME FUNC[+OFFS][@SRC]|@SRC:LINE [ARG ...]>
                          probe point definition, where
		p:	kprobe probe
		r:	kretprobe probe
		GRP:	Group name (optional)
		NAME:	Event name
		FUNC:	Function name
		OFFS:	Offset from function entry (in byte)
		SRC:	Source code path
		LINE:	Line number
		ARG:	Probe argument (local variable name or
			kprobe-tracer argument format is supported.)

Changes in v4:
 - Add _GNU_SOURCE macro for strndup().

Changes in v3:
 - Remove -r option because perf always be used for online kernel.
 - Check malloc/calloc results.

Changes in v2:
 - Check synthesized string length.
 - Rename perf kprobe to perf probe.
 - Use spaces for separator and update usage comment.
 - Check error paths in parse_probepoint().
 - Check optimized-out variables.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
LKML-Reference: <20091008211737.29299.14784.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-10-12 23:31:52 +02:00
Ashwin Chaugule 63c9e01e1a perf tools: Remove static debugfs path from parse-events
Timechart doesn't work if debugfs is not in /sys/kernel/debug/.
Fixed by using global debugfs_path which is filled in by perf.

Signed-off-by: Ashwin Chaugule <ashwinc@quicinc.com>
Cc: "Arjan van de Ven" <arjan@linux.intel.com>
LKML-Reference: <a751bdc6978478de6d10440e587a2cc7.squirrel@www.codeaurora.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:41:05 +02:00
Randy Dunlap cbef79a82a perf tools: Fix const char type propagation
The following perf build warnings/errors in function
argument types:

  builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type
  util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type
  util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type
  util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type

... trigger because older GCC is not able to prove that
sort_dimension__add() does not change the string.

Some goes for test_type_token().

Fix this by improving type consistency.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091005131729.78444bfb.randy.dunlap@oracle.com>
[ Also remove ugly type cast now unnecessary. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 08:35:00 +02:00
Frederic Weisbecker 26dd2cb074 perf tools: Provide backward compatibility with previous perf.data version
We have merged the trace.info file into perf.data by adding one
section in the perf headers. This makes it incompatible with
previous version: the new perf tools can't read the older
perf.data.

To support the previous format, we check the headers size. If they
have the same size than in the previous format, then ignore the
trace info section that doesn't exist.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 22:11:02 +02:00
Frederic Weisbecker 97ea1a7fa6 perf tools: Fix thread comm resolution in perf sched
This reverts commit 9a92b479b2 ("perf
tools: Improve thread comm resolution in perf sched") and fixes the
real bug.

The bug was elsewhere:

We are failing to resolve thread names in perf sched because the
table of threads we are building, on top of comm events, has a per
process granularity. But perf sched, unlike the other perf tools,
needs a per thread granularity as we are profiling every tasks
individually.

So fix it by building our threads table using the tid instead of
the pid as the thread identifier.

v2: Revert the previous fix - it is not really needed

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 21:10:21 +02:00
Arnaldo Carvalho de Melo 2e538c4a18 perf tools: Improve kernel/modules symbol lookup
This removes the ovelapping of vmlinux addresses with modules,
using the ELF section name when using --vmlinux and creating a
unique DSO name when using /proc/kallsyms ([kernel].N).

This is done by creating multiple 'struct map' instances for
address ranges backed by DSOs that have just the symbols for that
range and a name that is derived from the ELF section name.o

Now it is possible to ask for just the symbols in some particular
kernel section:

$ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \
	--dsos [kernel].vsyscall_fn | head -15
    52.73%             Xorg  [.] vread_hpet
    18.61%          firefox  [.] vread_hpet
    14.50%     npviewer.bin  [.] vread_hpet
     6.83%           compiz  [.] vread_hpet
     5.73%         glxgears  [.] vread_hpet
     0.63%             java  [.] vread_hpet
     0.30%   gnome-terminal  [.] vread_hpet
     0.23%             perf  [.] vread_hpet
     0.18%            xchat  [.] vread_hpet
$

Now we don't have to first lookup the list of modules and then, if
it fails, vmlinux symbols, its just a simple lookup for the map
then the symbols, just like for threads.

Reports generated using /proc/kallsyms and --vmlinux should provide
the same results, modulo the DSO name for sections other than
".text".

But they don't right now because things like:

 ffffffff81011c20-ffffffff81012068 system_call
 ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs
 ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath
 ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call

I.e. overlapping symbols, again some ASM special case that we have
to fixup.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 19:27:11 +02:00
Arnaldo Carvalho de Melo da21d1b547 perf tools: Up the verbose level for some really verbose stuff
Like printing every symbol created.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 19:27:10 +02:00
Frederic Weisbecker 9a92b479b2 perf tools: Improve thread comm resolution in perf sched
When we get sched traces that involve a task that was already
created before opening the event, we won't have the comm event for
it.

So if we can't find the comm event for a given thread, we look at
the traces that may contain these informations.

Before:

 ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
 kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
 kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
 :5124:5124            |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
 :6244:6244            |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
 firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
 npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
 :6245:6245            |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |

After:

 ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
 kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
 kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
 firefox:5124          |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
 npviewer.bin:6244     |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
 firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
 npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
 npviewer.bin:6245     |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 16:56:33 +02:00
Frederic Weisbecker 016e92fbc9 perf tools: Unify perf.data mapping and events handling
This librarizes the perf.data file mapping and handling in various
perf tools, roughly reducing the amount of code and fixing the
places that mmap from beginning of the file whereas we want to mmap
from the beginning of the data, leading to page fault because the
mmap window is too small since the trace info are written in the
file too.

TODO:

 - convert perf timechart too

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20091007104729.GD5043@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 16:56:32 +02:00
Frederic Weisbecker 03456a158d perf tools: Merge trace.info content into perf.data
This drops the trace.info file and move its contents into the
common perf.data file.

This is done by creating a new trace_info section into this file. A
user of perf headers needs to call perf_header__set_trace_info() to
save the trace meta informations into the perf.data file.

A file created by perf after his patch is unsupported by previous
version because the size of the headers have increased.

That said, it's two new fields that have been added in the end of
the headers, and those could be ignored by previous versions if
they just handled the dynamic header size and then ignore the
unknow part. The offsets guarantee the compatibility. We'll do a
-stable fix for that.

But current previous versions handle the header size using its
static size, not dynamic, then it's not backward compatible with
trace records.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091006213643.GA5343@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07 08:36:10 +02:00
Tom Zanussi 064739bc4b perf trace: Add string/dynamic cases to format_flags
Needed for distinguishing string fields in event stream processing.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: hch@infradead.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254809398-8078-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:04:46 +02:00
Tom Zanussi 2774601811 perf trace: Add subsystem string to struct event
Needed to fully qualify event names for event stream processing.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: hch@infradead.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254809398-8078-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:04:46 +02:00
Tom Zanussi 26a50744b2 tracing/events: Add 'signed' field to format files
The sign info used for filters in the kernel is also useful to
applications that process the trace stream.  Add it to the format
files and make it available to userspace.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: hch@infradead.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254809398-8078-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:04:45 +02:00
Ingo Molnar d9b2002c40 Merge branch 'perf/urgent' into perf/core
Merge reason: Upcoming patch is dependent on a fix in perf/urgent.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:02:34 +02:00
Arnaldo Carvalho de Melo 818331303b perf tools: elf_sym__is_function() should accept "zero" sized functions
Asm routines that end up having size equal to zero are not really
zero sized, and as now we do kernel_maps__fixup_sym_end, at least
for kernel routines this gets fixed.

A similar fixup needs to be done for the userspace bits as well,
but as this fixup started only because in /proc/kallsyms we don't
have the end address nor the function size, it appeared here first.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254796503-27203-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 12:08:08 +02:00
Tom Zanussi b934cdd55f perf trace: Update eval_flag() flags array to match interrupt.h
Add missing BLOCK_IOPOLL_SOFTIRQ entry.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254808849-7829-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 12:02:34 +02:00
Arnaldo Carvalho de Melo a2a99e8e12 perf tools: /proc/modules names don't always match its name
$ cut -d' ' -f1 /proc/modules|grep _|wc -l
 29
 $ cut -d' ' -f1 /proc/modules|grep _|sed 's/$/.ko'/g|while read n;do find /lib/modules/`uname -r` -name $n;done|wc -l
 12

For instance:

 $ grep ^aes_x86 /proc/modules
 aes_x86_64 9056 2 - Live 0xffffffffa0091000
 $ l /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko
 -rw-r--r-- 1 root root 136438 2009-09-22 19:05 /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko

Handle that by introducing a strxfrchar routine that replaces
dashes with underscores when matching file names to loaded modules.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05 20:35:23 +02:00
Arnaldo Carvalho de Melo af427bf529 perf tools: Create maps for modules when processing kallsyms
So that we get kallsyms processing closer to vmlinux + modules
symtabs processing.

One change in behaviour is that since when one specifies --vmlinux
-m should be used to ask for modules, so it is now for kallsyms as
well.

Also continue if one manages to load the vmlinux data but module
processing fails, so that at least some analisys can be done with
part of the needed symbols.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05 20:35:23 +02:00
Arnaldo Carvalho de Melo ec218fc4a7 perf tools: Remove show_mask bitmask
As it was not being exposed via any command line and with --dsos/--comms
we can do this and even more, like asking for just kernel + some module:

[root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\]
--vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15
 # Samples: 619669
 #
 # Overhead          Command  Shared Object  Symbol
 # ........  ...............  .............  ......
 #
      7.12%          swapper  [kernel]       [k] read_hpet
      6.86%             init  [kernel]       [k] read_hpet
      6.22%             init  [kernel]       [k] mwait_idle_with_hints
      5.34%          swapper  [kernel]       [k] mwait_idle_with_hints
      3.01%          firefox  [kernel]       [.] vread_hpet
      2.14%             Xorg  [drm]          [k] drm_clflush_pages
      2.09%           pidgin  [kernel]       [.] vread_hpet
      1.58%     npviewer.bin  [kernel]       [.] vread_hpet
      1.37%          swapper  [kernel]       [k] hpet_next_event
      1.23%             Xorg  [kernel]       [k] read_hpet
[root@doppio linux-2.6-tip]#

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091003233048.GA30535@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04 19:37:11 +02:00
Arnaldo Carvalho de Melo 9735abf11b perf tools: Move hist_entry__add common code to hist.c
Now perf report and annotate do the callgraph/hit processing in
their specialized hist_entry__add functions.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-03 16:01:59 +02:00
Arnaldo Carvalho de Melo 439d473b47 perf tools: Rewrite and improve support for kernel modules
Representing modules as struct map entries, backed by a DSO, etc,
using /proc/modules to find where the module is loaded.

DSOs now can have a short and long name, so that in verbose mode we
can show exactly which .ko or vmlinux image was used.

As kernel modules now are a DSO separate from the kernel, we can
ask for just the hits for a particular set of kernel modules, just
like we can do with shared libraries:

[root@doppio linux-2.6-tip]# perf report -n --vmlinux
/home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15
    84.58%      13266             Xorg  [k] drm_clflush_pages
     4.02%        630             Xorg  [k] trace_kmalloc.clone.0
     3.95%        619             Xorg  [k] drm_ioctl
     2.07%        324             Xorg  [k] drm_addbufs
     1.68%        263             Xorg  [k] drm_gem_close_ioctl
     0.77%        120             Xorg  [k] drm_setmaster_ioctl
     0.70%        110             Xorg  [k] drm_lastclose
     0.68%        106             Xorg  [k] drm_open
     0.54%         85             Xorg  [k] drm_mm_search_free
[root@doppio linux-2.6-tip]#

Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko
would have the same effect. Allowing specifying just 'drm.ko' is left
for another patch.

Processing kallsyms so that per kernel module struct map are
instantiated was also left for another patch. That will allow
removing the module name from each of its symbols.

struct symbol was reduced by removing the ->module backpointer and
moving it (well now the map) to struct symbol_entry in perf top,
that is its only user right now.

The total linecount went down by ~500 lines.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02 10:48:42 +02:00
Arjan van de Ven 39a90a8ef1 perf timechart: Add a power-only mode
For doing work on the Linux power management components, I need to
make long (30+ seconds) traces. Currently, this then results in a
HUGE svg file, with mostly process data that isn't interesting.

This patch adds a --power-only mode to perf timechart that only
outputs the CPU power section of the SVG; this significantly
reduces the size of the SVG file, making even 30+ second traces
viewable with inkscape.

As a minor tweak for the same effect, the minimum text size is
decreased; current inkscape cannot zoom in deep enough to show text
this small, but it reduces inkscape compute time.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: peterz@infradead.org
LKML-Reference: <20090924154013.0675ab71@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01 09:26:40 +02:00
Arnaldo Carvalho de Melo 1b46cddfcc perf tools: Use rb_tree for maps
Threads can have many and kernel modules will be represented as a
tree of maps as well.

Ah, and for a perf.data with 146607 samples:

Before:

[root@doppio ~]# perf stat -r 5 perf report > /dev/null

 Performance counter stats for 'perf report' (5 runs):

     699.823680  task-clock-msecs         #      0.991 CPUs    ( +-   0.454% )
             74  context-switches         #      0.000 M/sec   ( +-   1.709% )
              2  CPU-migrations           #      0.000 M/sec   ( +-  17.008% )
          23114  page-faults              #      0.033 M/sec   ( +-   0.000% )
     1381257019  cycles                   #   1973.721 M/sec   ( +-   0.290% )
     1456894438  instructions             #      1.055 IPC     ( +-   0.007% )
       18779818  cache-references         #     26.835 M/sec   ( +-   0.380% )
         641799  cache-misses             #      0.917 M/sec   ( +-   1.200% )

    0.705972729  seconds time elapsed   ( +-   0.501% )

[root@doppio ~]#

After

 Performance counter stats for 'perf report' (5 runs):

     691.261451  task-clock-msecs         #      0.993 CPUs    ( +-   0.307% )
             72  context-switches         #      0.000 M/sec   ( +-   0.829% )
              6  CPU-migrations           #      0.000 M/sec   ( +-  18.409% )
          23127  page-faults              #      0.033 M/sec   ( +-   0.000% )
     1366395876  cycles                   #   1976.670 M/sec   ( +-   0.153% )
     1443136016  instructions             #      1.056 IPC     ( +-   0.012% )
       17956402  cache-references         #     25.976 M/sec   ( +-   0.325% )
         661924  cache-misses             #      0.958 M/sec   ( +-   1.335% )

    0.696127275  seconds time elapsed   ( +-   0.377% )

I.e. we see some speedup too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
LKML-Reference: <20090928174846.GA3361@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:57:56 +02:00
John Kacur 3d1d07ecd2 perf tools: Put common histogram functions in their own file
Move histogram related functions into their own files (hist.c and
hist.h) and make use of them in builtin-annotate.c and
builtin-report.c.

Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:57:56 +02:00
John Kacur dd68ada2d4 perf tools: Create util/sort.and use it
Create util/sort.[ch] and move common functionality for
builtin-report.c and builtin-annotate.c there, and make use of it.

Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0909241758390.11383@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:27:52 +02:00
John Kacur 8b40f521cf perf tools: Protect header files with a consistent style
There was a colorful mix of header guards - standardize them.

Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0909241756530.11383@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:27:51 +02:00
Eric Dumazet 725b13685c perf tools: Dont use openat()
openat() is still a young glibc facility, better to not use it in a
non performance critical program (perf list)

Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36
on my dev machine for example).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ulrich Drepper <drepper@redhat.com>
LKML-Reference: <4ABB767D.6080004@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:20:08 +02:00
Eric Dumazet a255a9981a perf tools: Fix buffer allocation
"perf top" cores dump on my dev machine, if run from a directory
where vmlinux is present:

  *** glibc detected *** malloc(): memory corruption: 0x085670d0 ***

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@kernel.org>
LKML-Reference: <4ABB6EB7.7000002@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 15:11:23 +02:00
Mike Galbraith dd906a0fe8 perf tools: Handle relative paths while loading module symbols
Inform util/module.c::mod_dso__load_module_paths() that relative
paths do exist in some modules.dep, and make it fail noisily should
it encounter a path that it doesn't understand, or a module it
cannot open.

Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1253779628.10513.8.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 11:40:35 +02:00
Mike Galbraith 508c4d0874 perf tools: Fix module symbol loading bug
Avi Kivity reported 'perf annotate' failures with modules, the
requested function was not annotated.

If there are no modules currently loaded, or the last module
scanned is not loaded, dso__load_modules() steps on the value from
dso__load_vmlinux(), so we happily load the kallsyms symbols on top
of what we've already loaded.

Fix that such that the total count of symbols loaded is returned.
Should module symbol load fail after parsing of vmlinux, is's a
hard failure, so do not silently fall-back to kallsyms.

Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1253697658.11461.36.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-23 13:45:48 +02:00
Ingo Molnar cdd6c482c9 perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!

In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.

Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.

All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)

The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.

Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.

User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)

This patch has been generated via the following script:

  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

  sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

  for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
  done

  FILES=$(find . -name perf_event.*)

  sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.

Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.

( NOTE: 'counters' are still the proper terminology when we deal
  with hardware registers - and these sed scripts are a bit
  over-eager in renaming them. I've undone some of that, but
  in case there's something left where 'counter' would be
  better than 'event' we can undo that on an individual basis
  instead of touching an otherwise nicely automated patch. )

Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 14:28:04 +02:00
Arjan van de Ven 611a546bec perf util: SVG performance improvements
Tweak the output SVG to increase performance in SVG viewers by
limiting the different types of font sizes and by smarter
transformations on the text.

At least with Inkscape this gives a notable performance improvement
during zoom and scrolling.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181438.3a49cb93@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 19:37:36 +02:00
Arjan van de Ven 5094b65545 perf util: Make the timechart SVG width dynamic
This patch adds a command line option for timechart that allows the
user to specify the width of the SVG file.

This patch also makes sure that each second of recording has at
least 200 units (pixels at 96 DPI) of width.  This impacts
recordings longer than 5 seconds; recordings shorter than 5 second
will scale up to have a width of 1000 units for the whole recording
(as before).

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181416.69570c5d@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 19:37:35 +02:00
Arjan van de Ven a92fe7b306 perf timechart: Show the duration of scheduler delays in the SVG
Given that scheduler latencies are the hot thing nowadays, show the
duration of said latencies in the SVG in text form.

In addition, if the latency is more than 10 msec, pick a brighter
yellow color as a way to point these long delays out.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181353.796f4509@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 19:37:35 +02:00
Arjan van de Ven 4f1202c8e6 perf timechart: Show the name of the waker/wakee in timechart
Timechart currently shows thin green lines for sending or receiving
wakeups. This patch also prints (in a very small font) the name of
the process that is being woken/wakes up this process.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090920181328.68baa978@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20 19:37:34 +02:00
Arjan van de Ven ec60a3fe47 perf utils: Use a define for the maximum length of a trace event
As per Ingo's review: use a #define rather than an open coded constant
for the maximum length of a trace event for storing in the perf.data file.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090919133630.10533d3e@infradead.org>
[ add a few comments to nearby functions ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 18:57:53 +02:00
Arjan van de Ven 964a0b3d2b perf utils: Be consistent about minimum text size in the svghelper
Be more consistent in the svghelper about the minimum text size
by having a global #define for this.

There needs to be a minimum text size in order to keep the size
of the SVG file within the reach of what current SVG viewers can
cope with.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20090919133507.7374ef8b@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 18:57:52 +02:00
Arjan van de Ven f48d55ce78 perf: Add a SVG helper library file
The timechart tool writes out SVG format output; this patch adds a
set of helper functions to abstract dealing with SVG from the core
timechart code.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130613.677f0516@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 11:42:13 +02:00
Arjan van de Ven fd39e055c4 perf: Add a sample_event type to the event_union
Add a sample_event type to the event_union so that raw samples can
be processed easily.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130511.411434b5@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 11:42:12 +02:00
Arjan van de Ven a79d7db9fd perf: Allow perf utilities to have "callback" options without arguments
timechart needs to add a "callback" type command line argument that
does not take arguments.

This patch adds the parse-options.h infrastructure to make this
possible.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130440.548666c1@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 11:42:11 +02:00
Arjan van de Ven 8755a8f27a perf: Store trace event name/id pairs in perf.data
The trace event name<->id mapping is dynamic for each kernel
compile. In order for perf.data to be useable outside the actual
system, we thus need to store a table of this mapping for later
use.

This patch adds this table to perf.data, and provides helper
functions for lookup up fields from this table.

To avoid mistakes, lookup-from-table is kept completely seprate
from lookup-from-local-debugfs.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130405.6960d099@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 11:42:11 +02:00
Arjan van de Ven 393b2ad8c7 perf: Add a timestamp to fork events
perf timechart needs to know when a process forked, in order to be
able to visualize properly when tasks start.

This patch adds a time field to the event structure, and fills it
in appropriately.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090912130341.51ad2de2@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-19 11:42:10 +02:00
Li Zefan 1281a49b7a perf trace: Sample timestamp and cpu when using record flag
Sample timestamp and cpu just like the -R option.

Before:
            init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
            init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
            init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042
            init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
            init-0     [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042

After:
            init-0     [001]  7364.568965353: irq_handler_entry: irq=18 handler=eth0
            init-0     [001]  7365.530226877: irq_handler_entry: irq=1 handler=i8042
            init-0     [001]  7365.542831563: irq_handler_entry: irq=18 handler=eth0
            init-0     [001]  7365.644156299: irq_handler_entry: irq=18 handler=eth0
            init-0     [001]  7365.694556201: irq_handler_entry: irq=18 handler=eth0

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4AB1F827.8040905@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18 00:30:25 +02:00
Li Zefan 270bbbe80d perf tools: Increase MAX_EVENT_LENGTH
The name length of some trace events is longer than 30, like
sys_enter_sched_get_priority_max and
ext4_mb_discard_preallocations.

Passing those events to perf-record will fail, try:

  # ./perf record -f -e syscalls:sys_enter_sched_get_priority_max -F 1 -a

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4AB1F4AB.7050205@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18 00:30:25 +02:00
Li Zefan 6706ccf8e7 perf tools: Fix memory leak in read_ftrace_printk()
get_tracing_file() should be paired with put_tracing_file().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4AB1F48F.4070807@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18 00:30:24 +02:00
Ingo Molnar 0ec04e16d0 perf sched: Add 'perf sched map' scheduling event map printout
This prints a textual context-switching outline of workload
captured via perf sched record.

For example, on a 16 CPU box it outputs:

   N1  O1  .   .   .   S1  .   .   .   B0  .  *I0  C1  .   M1  .    23002.773423 secs
   N1  O1  .  *Q0  .   S1  .   .   .   B0  .   I0  C1  .   M1  .    23002.773423 secs
   N1  O1  .   Q0  .   S1  .   .   .   B0  .  *R1  C1  .   M1  .    23002.773485 secs
   N1  O1  .   Q0  .   S1  .  *S0  .   B0  .   R1  C1  .   M1  .    23002.773478 secs
  *L0  O1  .   Q0  .   S1  .   S0  .   B0  .   R1  C1  .   M1  .    23002.773523 secs
   L0  O1  .  *.   .   S1  .   S0  .   B0  .   R1  C1  .   M1  .    23002.773531 secs
   L0  O1  .   .   .   S1  .   S0  .   B0  .   R1  C1 *T1  M1  .    23002.773547 secs T1 => irqbalance:2089
   L0  O1  .   .   .   S1  .   S0  .  *P0  .   R1  C1  T1  M1  .    23002.773549 secs
  *N1  O1  .   .   .   S1  .   S0  .   P0  .   R1  C1  T1  M1  .    23002.773566 secs
   N1  O1  .   .   .  *J0  .   S0  .   P0  .   R1  C1  T1  M1  .    23002.773571 secs
   N1  O1  .   .   .   J0  .   S0 *B0  P0  .   R1  C1  T1  M1  .    23002.773592 secs
   N1  O1  .   .   .   J0  .  *U0  B0  P0  .   R1  C1  T1  M1  .    23002.773582 secs
   N1  O1  .   .   .  *S1  .   U0  B0  P0  .   R1  C1  T1  M1  .    23002.773604 secs
   N1  O1  .   .   .   S1  .   U0  B0 *.   .   R1  C1  T1  M1  .    23002.773615 secs
   N1  O1  .   .   .   S1  .   U0  B0  .   .  *K0  C1  T1  M1  .    23002.773631 secs
   N1  O1  .  *M0  .   S1  .   U0  B0  .   .   K0  C1  T1  M1  .    23002.773624 secs
   N1  O1  .   M0  .   S1  .   U0 *.   .   .   K0  C1  T1  M1  .    23002.773644 secs
   N1  O1  .   M0  .   S1  .   U0  .   .   .  *R1  C1  T1  M1  .    23002.773662 secs
   N1  O1  .   M0  .   S1  .  *.   .   .   .   R1  C1  T1  M1  .    23002.773648 secs
   N1  O1  .  *.   .   S1  .   .   .   .   .   R1  C1  T1  M1  .    23002.773680 secs
   N1  O1  .   .   .  *L0  .   .   .   .   .   R1  C1  T1  M1  .    23002.773717 secs
  *N0  O1  .   .   .   L0  .   .   .   .   .   R1  C1  T1  M1  .    23002.773709 secs
  *N1  O1  .   .   .   L0  .   .   .   .   .   R1  C1  T1  M1  .    23002.773747 secs

Columns stand for individual CPUs, from CPU0 to CPU15, and the
two-letter shortcuts stand for tasks that are running on a CPU.

'*' denotes the CPU that had the event.

A dot signals an idle CPU.

New tasks are assigned new two-letter shortcuts - when they occur
first they are printed. In the above example 'T1' stood for irqbalance:

      T1 => irqbalance:2089

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 16:41:30 +02:00
Ingo Molnar 80ed0987f3 perf sched: Make idle thread and comm/pid names more consistent
Peter noticed that we have 3 ways of referring to the idle thread:

 [idle]:0
 swapper:0
 swapper-0

Standardize on 'swapper:0'.

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 12:15:47 +02:00
Ingo Molnar dc02bf7178 perf sched: Account for lost events, increase default buffering
Output such lost event and state machine weirdness stats:

   TOTAL:                |  14974.910 ms |    46384 |
  ---------------------------------------------------
   INFO: 8.865% lost events (19132 out of 215819, in 8 chunks)
   INFO: 0.198% state machine bugs (49 out of 24708) (due to lost events?)

And increase buffering to -m 1024 (4 MB) by default. Since we
use output multiplexing that kind of space is needed.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16 11:48:05 +02:00
Ingo Molnar ea57c4f520 perf tools: Implement counter output multiplexing
Finish the -M/--multiplex option implementation:

 - separate it out from group_fd

 - correctly set it via the ioctl and dont mmap counters that
   are multiplexed

 - modify the perf record event loop to deal with buffer-less
   counters.

 - remove the -g option from perf sched record

 - account for unordered events in perf sched latency

 - (add -f to perf sched record to ease measurements)

 - skip idle threads (pid==0) in latency output

The result is better latency output by 'perf sched latency':

 -----------------------------------------------------------------------------------
  Task              |  Runtime ms | Switches | Average delay ms | Maximum delay ms |
 -----------------------------------------------------------------------------------
  ksoftirqd/8       |    0.071 ms |        2 | avg:    0.458 ms | max:    0.913 ms |
  at-spi-registry   |    0.609 ms |       19 | avg:    0.013 ms | max:    0.023 ms |
  perf              |    3.316 ms |       16 | avg:    0.013 ms | max:    0.054 ms |
  Xorg              |    0.392 ms |       19 | avg:    0.011 ms | max:    0.018 ms |
  sleep             |    0.537 ms |        2 | avg:    0.009 ms | max:    0.009 ms |
 -----------------------------------------------------------------------------------
  TOTAL:            |    4.925 ms |       58 |
 ---------------------------------------------

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-14 15:45:11 +02:00
Ingo Molnar 1fc35b29b4 perf sched: Implement the 'perf sched record' subcommand
Implement the 'perf sched record' subcommand that adds a
default list of events, turns on raw sampling and system-wide
tracing and passes off the rest of the command to perf record.

This is more convenient than having to specify the events all
the time.

Before:

 $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1

After:

 $ perf sched record -f sleep 1

Also fix an assumption in the event string parser that assumed
that strings passed in can be modified. (In this case they wont
be as they come from a readonly constant section.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13 10:22:51 +02:00
Ingo Molnar b5fae128e4 perf sched: Clean up PID sorting logic
Use a sort list for thread atoms insertion as well - instead of
hardcoded for PID.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13 10:22:50 +02:00
Ingo Molnar d9340c1db3 perf sched: Display time in milliseconds, reorganize output
After:

-----------------------------------------------------------------------------------
 Task              |  runtime ms | switches | average delay ms | maximum delay ms |
-----------------------------------------------------------------------------------
 migration/0       |    0.000 ms |        1 | avg:    0.047 ms | max:    0.047 ms |
 ksoftirqd/0       |    0.000 ms |        1 | avg:    0.039 ms | max:    0.039 ms |
 migration/1       |    0.000 ms |        3 | avg:    0.013 ms | max:    0.016 ms |
 migration/3       |    0.000 ms |        2 | avg:    0.003 ms | max:    0.004 ms |
 migration/4       |    0.000 ms |        1 | avg:    0.022 ms | max:    0.022 ms |
 distccd           |    0.000 ms |        1 | avg:    0.004 ms | max:    0.004 ms |
 distccd           |    0.000 ms |        1 | avg:    0.014 ms | max:    0.014 ms |
 distccd           |    0.000 ms |        2 | avg:    0.000 ms | max:    0.000 ms |
 distccd           |    0.000 ms |        2 | avg:    0.012 ms | max:    0.019 ms |
 distccd           |    0.000 ms |        1 | avg:    0.002 ms | max:    0.002 ms |
 as                |    0.000 ms |        2 | avg:    0.019 ms | max:    0.019 ms |
 as                |    0.000 ms |        3 | avg:    0.015 ms | max:    0.017 ms |
 as                |    0.000 ms |        1 | avg:    0.009 ms | max:    0.009 ms |
 perf              |    0.000 ms |        1 | avg:    0.001 ms | max:    0.001 ms |
 gcc               |    0.000 ms |        1 | avg:    0.021 ms | max:    0.021 ms |
 run-mozilla.sh    |    0.000 ms |        2 | avg:    0.010 ms | max:    0.017 ms |
 mozilla-plugin-   |    0.000 ms |        1 | avg:    0.006 ms | max:    0.006 ms |
 gcc               |    0.000 ms |        2 | avg:    0.013 ms | max:    0.013 ms |
-----------------------------------------------------------------------------------

(The runtime ms column is not filled in yet.)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13 10:22:44 +02:00
Frederic Weisbecker 4653881802 perf sched: Fix bad event alignment
perf sched raises the following error when it meets a sched
switch event:

perf: builtin-sched.c:286: register_pid: Assertion `!(pid >= 65536)' failed.
Abandon

Currently in x86-64, the sched switch events have a hole in the
middle of the structure:

	u16 common_type;
	u8 common_flags;
	u8 common_preempt_count;
	u32 common_pid;
	u32 common_tgid;

	char prev_comm[16];
	u32 prev_pid;
	u32 prev_prio;
			<--- there
	u64 prev_state;
	char next_comm[16];
	u32 next_pid;
	u32 next_prio;

Gcc inserts a 4 bytes hole there for prev_state to be u64
aligned. And the events are exported to userspace with this
hole.

But in userspace, from perf sched, we fetch it using a
structure that has a new field in the beginning: u32 size. This
is because our trace is exported with its size as a field. But
now that we have this new field, the hole in the middle
disappears because it makes prev_state becoming well aligned.

And since we are using a pointer to the raw trace using this
struct, instead of reading prev_state, we are reading the hole.

We could fix it by keeping the size seperate from the struct
but actually there a lot of other potential problems: some
fields may be saved as long in a 64 bits system and later read
as long in a 32 bits system. Also this direct cast doesn't care
about the endianness differences between the host traced
machine and the machine in which we do the post processing.

So instead of using such dangerous direct casts, fetch the
values using the trace parsing API that already takes care of
all these problems.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13 10:22:41 +02:00
Frederic Weisbecker bcd3279f46 perf tools: Allow the specification of all tracepoints at once
Currently, when one wants to activate every tracepoint
counters of a subsystem from perf record, the current sequence
is needed:

  perf record -e subsys:ev1 -e subsys:ev2 -e subsys:ev3

This may annoy the most patient of us.

Now we can just do:

  perf record -e subsys:*

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13 10:22:40 +02:00
Ingo Molnar ec156764d4 perf sched: Import schedbench.c
Import the schedbench.c tool that i wrote some time ago to
simulate scheduler behavior but never finished. It's a good
basis for perf sched nevertheless.

Most of its guts are not hooked up to the perf event loop
yet - that will be done in the patches to come.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-13 10:22:37 +02:00
Linus Torvalds 483e3cd6a3 Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (105 commits)
  ring-buffer: only enable ring_buffer_swap_cpu when needed
  ring-buffer: check for swapped buffers in start of committing
  tracing: report error in trace if we fail to swap latency buffer
  tracing: add trace_array_printk for internal tracers to use
  tracing: pass around ring buffer instead of tracer
  tracing: make tracing_reset safe for external use
  tracing: use timestamp to determine start of latency traces
  tracing: Remove mentioning of legacy latency_trace file from documentation
  tracing/filters: Defer pred allocation, fix memory leak
  tracing: remove users of tracing_reset
  tracing: disable buffers and synchronize_sched before resetting
  tracing: disable update max tracer while reading trace
  tracing: print out start and stop in latency traces
  ring-buffer: disable all cpu buffers when one finds a problem
  ring-buffer: do not count discarded events
  ring-buffer: remove ring_buffer_event_discard
  ring-buffer: fix ring_buffer_read crossing pages
  ring-buffer: remove unnecessary cpu_relax
  ring-buffer: do not swap buffers during a commit
  ring-buffer: do not reset while in a commit
  ...
2009-09-11 13:24:03 -07:00
Ingo Molnar ed011b22ce Merge commit 'v2.6.31-rc9' into tracing/core
Merge reason: move from -rc5 to -rc9.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-06 06:11:42 +02:00
Ulrich Drepper 6b58e7f146 perf tools: Avoid unnecessary work in directory lookups
This patch improves some (common) inefficiencies in the
handling of directory lookups:

- not using the d_type information returned by the kernel

- constructing (absolute) paths for file operation even though
  directory-relative operations using the *at functions is
  possible

There are more places to fix but this is a start.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20090904193951.GB6186@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-04 21:50:17 +02:00
Ingo Molnar 6f4596d931 perf trace: Fix read_string()
We did not account for the enclosing \0. Depending on what malloc()
gave us this resulted in corrupted version string printouts.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03 16:22:45 +02:00
Ingo Molnar 00fc97863c perf trace: Print out in nanoseconds
Print out more accurate timestamps - usecs does not cut it
anymore on fast enough boxes ;-)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03 16:22:02 +02:00
Ingo Molnar 2e01d17911 perf tools: Seek to the end of the header area
Leave the input fd at the data area.

It does not matter right now - but seeking at the end of it
certainly did not make sense.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03 16:21:11 +02:00
Ingo Molnar 65014ab361 perf tools: Work around strict aliasing related warnings
Older versions of GCC are rather stupid about strict aliasing:

  util/trace-event-parse.c: In function 'parse_cmdlines':
  util/trace-event-parse.c:93: warning: dereferencing type-punned pointer will break strict-aliasing rules
  util/trace-event-parse.c: In function 'parse_proc_kallsyms':
  util/trace-event-parse.c:155: warning: dereferencing type-punned pointer will break strict-aliasing rules
  util/trace-event-parse.c:157: warning: dereferencing type-punned pointer will break strict-aliasing rules
  util/trace-event-parse.c:158: warning: dereferencing type-punned pointer will break strict-aliasing rules
  util/trace-event-parse.c: In function 'parse_ftrace_printk':
  util/trace-event-parse.c:294: warning: dereferencing type-punned pointer will break strict-aliasing rules
  util/trace-event-parse.c:295: warning: dereferencing type-punned pointer will break strict-aliasing rules
  make: *** [util/trace-event-parse.o] Error 1

Make it clear to GCC that we intend with those pointers, by passing
them through via an explicit (void *) cast.

We might want to add -fno-strict-aliasing as well, like the kernel
itself does.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02 14:56:33 +02:00
Frederic Weisbecker 561f732c12 perf tools: Complete support for dynamic strings
Complete support for __str_loc type strings of ftrace events
which have dynamic offsets values set for each of them inside
their sammples.

Before:
        geany-5759  [000]     0.000000: lock_release: name
        geany-5759  [000]     0.000000: lock_release: name
        geany-5759  [000]     0.000000: lock_release: name
  kondemand/0-362   [000]     0.000000: lock_release: name
      pdflush-421   [000]     0.000000: lock_release: name

After:
        geany-5759  [000]     0.000000: lock_release: &u->lock
        geany-5759  [000]     0.000000: lock_release: key
        geany-5759  [000]     0.000000: lock_release: &group->notification_mutex
  kondemand/0-362   [000]     0.000000: lock_release: &rq->lock
      pdflush-421   [000]     0.000000: lock_release: &rq->lock

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1251693921-6579-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
2009-08-31 10:04:49 +02:00
Frederic Weisbecker 9b8055a52c perf tools: Unify swapper tasks naming
In perf tools, we hardcode the pid 0 cmdline resolving to
"idle" because the init task is not included in the COMM
events.

But the idle tasks secondary cpus are resolved into their
"init" name through the COMM events.

We have then such strange result in perf report (ditto with
trace):

    19.66%       init    [kernel]          [k] acpi_idle_enter_c1
    17.32%       [idle]  [kernel]          [k] acpi_idle_enter_c1

It's then better to unify the swapper tasks into a single init
name.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1251693921-6579-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2009-08-31 10:04:49 +02:00
Frederic Weisbecker 5b447a6a13 perf tools: Librarize idle thread registration
Librarize register_idle_thread() used by annotate and report.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1251693921-6579-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31 10:04:48 +02:00
Ingo Molnar 19c959627a Merge branch 'perfcounters/tracing' into perfcounters/core
Merge reason: this topic is ready now to merge into the main
              development branch for .32, with functional
              perf trace output.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31 10:03:27 +02:00
Frederic Weisbecker d498bc1f62 perf tools: Fix missing string field printing in perf trace
Some string fields are not printed because of a missing printf
in the post-processing.

Before:
	    perf-10070 [000]     0.000000: sched_switch: task :10070 [120] (R) ==> :5720 [120]
           geany-5720  [000]     0.000000: sched_switch: task :5720 [120] (S) ==> :10070 [120]
            perf-10070 [000]     0.000000: sched_switch: task :10070 [120] (R) ==> :5720 [120]
           geany-5720  [000]     0.000000: sched_switch: task :5720 [120] (S) ==> :10070 [120]
          <idle>-0     [000]     0.000000: sched_switch: task :0 [140] (R) ==> :361 [115]

After:
	    perf-10070 [000]     0.000000: sched_switch: task perf:10070 [120] (R) ==> geany:5720 [120]
           geany-5720  [000]     0.000000: sched_switch: task geany:5720 [120] (S) ==> perf:10070 [120]
            perf-10070 [000]     0.000000: sched_switch: task perf:10070 [120] (R) ==> geany:5720 [120]
           geany-5720  [000]     0.000000: sched_switch: task geany:5720 [120] (S) ==> perf:10070 [120]
          <idle>-0     [000]     0.000000: sched_switch: task swapper:0 [140] (R) ==> kondemand/1:361 [115]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1251427567-10551-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28 07:58:11 +02:00
Frederic Weisbecker 1ef2ed1066 perf tools: Only save the event formats we need
While opening a trace event counter, every events are saved in
the trace.info file. But we only want to save the
specifications of the events we are using.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1251421798-9101-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28 07:58:11 +02:00
Frederic Weisbecker 6e086437f3 perf tools: Save partial non-overlapping map
The librarization of the thread helpers between annotate and
report lost some perf report specifics.

thread__insert_map() had its most uptodate version in perf
report which cared about partial map overlapping. In case of
overlap between two maps, perf annotate's version removes the
whole old map without considering if it partially or
absolutely overlaps the new map.

We exported the odd version, change it by using the perf
report version.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250607843-7395-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-18 17:18:03 +02:00
Ingo Molnar 1f18345bdf perf tools: Remove obsolete defines
The _XOPEN_SOURCE* defines are not really needed on Linux and
it's not like we'll port this to AIX ;-)

The define also broke the build with gcc 4.4.1:

 CC util/trace-event-parse.o
 In file included from util/trace-event-parse.c:32:
 util/util.h:43:1: error: "_XOPEN_SOURCE" redefined

So remove them.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-18 11:00:05 +02:00
Frederic Weisbecker 3f9edc2382 perf tools: Make trace event format parser aware of cast to pointers
The ftrace event format parser handles the usual casts but not
the cast to pointers. Such casts have been introduced recently
with the module trace events and raise the following parsing
error:

	Fatal: bad op token )

This is because it considers the "*" character as a binary
operator. Make it then aware of casts to pointers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1250543271-8383-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-18 00:00:19 +02:00
Frederic Weisbecker 5f9c39dca5 perf tools: Add perf trace
This adds perf trace into the set of perf tools.

It is written to fetch the tracepoint samples from perf events
and display them, according to the events information given by
the debugfs files through the util/trace* tools.

It is a rough first shot and doesn't yet handle the cpu,
timestamps fields and some other things.

Example:

 perf record -f -e workqueue:workqueue_execution:record -F 1 -a
 perf trace

       kblockd/0-236   [000]     0.000000: workqueue_execution: thread=:236 func=cfq_kick_queue+0x0
     kondemand/0-360   [000]     0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0
     kondemand/0-360   [000]     0.000000: workqueue_execution: thread=:360 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0
     kondemand/1-361   [000]     0.000000: workqueue_execution: thread=:361 func=do_dbs_timer+0x0

Todo:

- A lot of things!

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Jon Masters <jonathan@jonmasters.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1250518688-7207-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 16:32:39 +02:00
Steven Rostedt ea4010d136 perf tools: Add trace event information parser
Add util/trace-event-parse.c which provides the handlers to
parse the ftrace events info from the stream and handles the
ftrace perf samples event printing.

This file is a rename of the parse-events.c file from the
trace-cmd tools, written by Steven Rostedt and Josh Triplett,
originated from the git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git

This is a perf tools integration.

[ fweisbec@gmail.com: various changes for perf tools
                      integration. ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Jon Masters <jonathan@jonmasters.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1250518688-7207-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 16:32:39 +02:00
Steven Rostedt 538bafb5cc perf tools: Add trace event debugfs stream reader
Add util/trace-event-read.c which handles trace events
informations reading.

This file is a rename of the trace-read.c file from the
trace-cmd tools, written by Steven Rostedt and Josh Triplett,
originated from the git tree:

   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git

This is its perf tools integration.

[ fweisbec@gmail.com: various changes for perf tools
                      integration. ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Jon Masters <jonathan@jonmasters.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1250518688-7207-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 16:32:38 +02:00
Steven Rostedt 5205094364 perf tools: Add trace event debugfs IO handler
Add util/trace-event-info.c which handles ftrace file IO from
debugfs and provides general helpers to fetch/save ftrace
events informations.

This file is a rename of the trace-cmd.c file from the
trace-cmd tools, written by Steven Rostedt and Josh Triplett,
originated from the git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git

This is a perf tools integration.

For now, ftrace events information is saved in a separate file
than the standard perf.data

[fweisbec@gmail.com: various changes for perf tools integration]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Jon Masters <jonathan@jonmasters.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1250518688-7207-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 16:32:38 +02:00
Frederic Weisbecker 8f28827a16 perf tools: Librarize trace_event() helper
Librarize trace_event() helper so that perf trace can use it
too. Also clean up the debug.h includes a bit.

It's not good to have it included in perf.h because it doesn't
make it flexible against other headers it may need (headers
that can also depend on perf.h and then create a recursive
header dependency).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250453149-664-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 23:06:45 +02:00
Frederic Weisbecker 0d3a5c8859 perf tools: Librarize sample type and attr finding from headers
Librarize the sample type and attr fetching from perf data file
headers so that we can also use it from perf trace.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250448997-30715-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 23:06:44 +02:00
Frederic Weisbecker 0f25bfc8d8 perf tools: Put the show mode into the event headers files
Annotate and report share the same flags to filter events
considering their context (kernel, user, hypervisor).

Both tools have their own definitions of these flags. Factorize
them out into the event headers file.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250445414-29237-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 19:59:13 +02:00
Frederic Weisbecker 2cec19d9d0 perf tools: Factorize the dprintf definition
We have two users of dprintf: report and annotate. Another one
is coming with perf trace. Then factorize it into the debug
file.

While at it, rename dprintf() to dump_printf() so that it
doesn't conflicts with its libc homograph.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250443461-28130-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 19:42:31 +02:00
Ingo Molnar 83a0944fa9 perf: Enable more compiler warnings
Related to a shadowed variable bug fix Valdis Kletnieks noticed
that perf does not get built with -Wshadow, which could have
helped us avoid the bug.

So enable -Wshadow and also enable the following warnings on
perf builds, in addition to the already enabled -Wall -Wextra
-std=gnu99 warnings:

 -Wcast-align
 -Wformat=2
 -Wshadow
 -Winit-self
 -Wpacked
 -Wredundant-decls
 -Wstack-protector
 -Wstrict-aliasing=3
 -Wswitch-default
 -Wswitch-enum
 -Wno-system-headers
 -Wundef
 -Wvolatile-register-var
 -Wwrite-strings
 -Wbad-function-cast
 -Wmissing-declarations
 -Wmissing-prototypes
 -Wnested-externs
 -Wold-style-definition
 -Wstrict-prototypes
 -Wdeclaration-after-statement

And change/fix the perf code to build cleanly under GCC 4.3.2.

The list of warnings enablement is rather arbitrary: it's based
on my (quick) reading of the GCC manpages and trying them on
perf.

I categorized the warnings based on individually enabling them
and looking whether they trigger something in the perf build.
If i liked those warnings (i.e. if they trigger for something
that arguably could be improved) i enabled the warning.

If the warnings seemed to come from language laywers spamming
the build with tons of nuisance warnings i generally kept them
off. Most of the sign conversion related warnings were in
this category. (A second patch enabling some of the sign
warnings might be welcome - sign bugs can be nasty.)

I also kept warnings that seem to make sense from their manpage
description and which produced no actual warnings on our code
base. These warnings might still be turned off if they end up
being a nuisance.

I also left out a few warnings that are not supported in older
compilers.

[ Note that these changes might break the build on older
  compilers i did not test, or on non-x86 architectures that
  produce different warnings, so more testing would be welcome. ]

Reported-by: Valdis.Kletnieks@vt.edu
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-16 10:47:47 +02:00
Frederic Weisbecker 6baa0a5ae0 perf tools: Factorize the thread code in a dedicated file
Factorize the thread management code used by perf-annotate and
perf-report in dedicated source and header files.

v2: pass last_match by address so that it can actually be
modified.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250245313-6995-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 16:10:19 +02:00
Ingo Molnar be750231ce Merge branch 'perfcounters/urgent' into perfcounters/core
Conflicts:
	kernel/perf_counter.c

Merge reason: update to latest upstream (-rc6) and resolve
              the conflict with urgent fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 12:06:12 +02:00
Peter Zijlstra 18408ddc01 perf tools: Add some comments to the event definitions
Just to make it clear that these are _not_ generic event
structures but do rely on the counter configuration.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey J Ashford <cjashfor@us.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
LKML-Reference: <20090813103655.334194326@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 12:00:09 +02:00
Frederic Weisbecker 3a9f131fb0 perf tools: Add a per tracepoint counter attribute to get raw sample
Add a new flag field while opening a tracepoint perf counter:

	-e tracepoint_subsystem:tracepoint_name:flags

This is intended to be generic although for now it only supports the
r[e[c[o[r[d]]]]] flag:

	./perf record -e workqueue:workqueue_insertion:record
	./perf record -e workqueue:workqueue_insertion:r

will have the same effect: enabling the raw samples record for
the given tracepoint counter.

In the future, we may want to support further flags, separated
by commas.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1250152039-7284-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-13 10:37:25 +02:00
Arnaldo Carvalho de Melo 1340e6bbaf perf tools: Fix dso__new handle() to handle deleted DSOs
It is better than showing the map addr, this way at least we
know that we can't get the symtabs because the DSO was deleted
(system update) while an app still used such DSO.

Yeah, don't do that, but if you do, you'll figure it out
quicker this way.

[acme@doppio linux-2.6-tip]$ perf report | head -15
 # Samples: 3796
 #
 # Overhead  Command                                                        Shared Object  Symbol
 # ........  .......  ...................................................................  ......
 #
    23.55%   pidgin  /lib64/libglib-2.0.so.0.2000.4.#prelink#.Pd98lu (deleted)            [.] 0x00000000038844
    21.55%   pidgin  /lib64/libpthread-2.10.1.so.#prelink#.AFwK8Q (deleted)               [.] 0x0000000000a42d
    10.85%   pidgin  [kernel]                                                             [.] vread_hpet
     7.85%   pidgin  /lib64/libgobject-2.0.so.0.2000.4.#prelink#.o1vpU7 (deleted)         [.] 0x00000000014de8
     3.35%   pidgin  /lib64/libc-2.10.1.so (deleted)                                      [.] 0x0000000007a875
     3.19%   pidgin  /lib64/libdbus-1.so.3.4.0.#prelink#.6mwgZP (deleted)                 [.] 0x0000000001d254
     3.06%   pidgin  /usr/lib64/libgtk-x11-2.0.so.0.1600.5.#prelink#.511hAl (deleted)     [.] 0x000000002334e7
     2.90%   pidgin  /usr/lib64/libgdk-x11-2.0.so.0.1600.5.#prelink#.5qlMo1 (deleted)     [.] 0x00000000037b2d
     1.84%   pidgin  [kernel]                                                             [k] do_sys_poll
     1.45%   pidgin  /usr/lib64/libX11.so.6.2.0.#prelink#.iR59Rx (deleted)                [.] 0x0000000004c751
[acme@doppio linux-2.6-tip]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Luis Claudio R. Gonçalves <lclaudio@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090811200436.GA3478@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 14:10:50 +02:00
Arnaldo Carvalho de Melo 247648e374 perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available
In old binutils we can't access bfd_demangle(), use
cplus_demangle() just like oprofile.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Luis Claudio R. Gonçalves <lclaudio@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090811192211.GG18061@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 14:10:49 +02:00
Frederic Weisbecker 66e274f3b8 perf tools: Factorize the map helpers
Factorize the dso mapping helpers into a single purpose common file
"util/map.c"

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:37:37 +02:00
Frederic Weisbecker 1fe2c1066c perf tools: Factorize the event structure definitions in a single file
Factorize the multiple definition of the events structures into a
single util/event.h file.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:04:39 +02:00
Frederic Weisbecker cd84c2ac6d perf tools: Factorize high level dso helpers
Factorize multiple definitions of high level dso helpers into the
symbol source file.

The side effect is a general export of the verbose and eprintf
debugging helpers into a new file dedicated to debugging purposes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:02:38 +02:00
Jason Baron 48c2e17f1f tracing: Add more namespace area to 'perf list' output
The new syscall tracepoints names can be too long for the 'perf list'
output.
Add a few more characters.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-11 20:35:29 +02:00
Brice Goglin 9f86669711 perf report: Add raw displaying of per-thread counters
If --pretty=raw is given to perf report -T, it now displays one
line per-thread per-counter with the raw event id added.

We get:
 #   PID    TID              Name  Raw    Count
   18608  18609      cache-misses  28e   416744
   18608  18609  cache-references  28f  6456792
   18608  18608      cache-misses  28e   448219
   18608  18608  cache-references  28f  7270244
 instead of:

#   PID    TID  cache-misses  cache-references
   18608  18609        416744           6456792
   18608  18608        448219           7270244

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4A802008.5050409@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10 15:48:17 +02:00
Frederic Weisbecker c0a8865e32 perf tools: callchain: Fix bad rounding of minimum rate
Sometimes we get callchain branches that have a rate under the
limit given by the user.

Say you launched:

 perf record -f -g -a ./hackbench 10
 perf report -g fractal,10.0

And you got:

2.33%       hackbench  [kernel]                  [k] _spin_lock_irqsave
                |
                |--78.57%-- remove_wait_queue
                |          poll_freewait
                |          do_sys_poll
                |          sys_poll
                |          sysenter_dispatch
                |          0xf7ffa430
                |          0x1ffadea3c
                |
                |--7.14%-- __up_read
                |          up_read
                |          do_page_fault
                |          page_fault
                |          0xf7ffa430
                |          0xa0df710000000a
                ...

It is abnormal to get a 7.14% branch whereas we passed a 10%
filter.

The problem is that we round down the minimum threshold. This
happens mostly when we have very low number of events. If the
total amount of your branch is 4 and you have a subranch of 3
events, filtering to 90% will be computed like follows:

  limit = 4 * 0.9;

The result is about 3.6, but the cast to integer will round
down to 3. It means that our filter is actually of 75%

We must then explicitly round up the minimum threshold.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: efault@gmx.de
LKML-Reference: <20090809024235.GA10146@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 13:07:46 +02:00
Brice Goglin 8d51327090 perf report: Fix and improve the displaying of per-thread event counters
Improve and fix the handling of per-thread counter stats
recorded via perf record -s. Previously we only displayed
it in debug printouts (-D) and even that output was hard
to disambiguate.

I moved everything to utils/values.[ch] so that we may reuse
it in perf stat.

We get something like this now:

 #  PID   TID  cache-misses  cache-references
   4658  4659        495581           3238779
   4658  4662        498246           3236823
   4658  4663        499531           3243162

Then it'll be easy to add --pretty=raw to display a single line per thread/event.

By the way, -S was also used for --symbol... So I used -T/--thread here.

perf report: Add -T/--threads to display per-thread counter values

 We get something like this now:
 #  PID   TID  cache-misses  cache-references
   4658  4659        495581           3238779
   4658  4662        498246           3236823
   4658  4663        499531           3243162

Per-thread arrays of counter values are managed in utils/values.[ch]

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 13:04:20 +02:00
Frederic Weisbecker b1a88349c3 perf tools: callchain: Fix 'perf report' display to be callchain by default
If we recorded with -g option to record the callchain, right now
we require a -g option to perf report as well - and people reported
this as unnecessary complication: the user already specified -g
once, no need to require it a second time.

So if the recording includes call-chains, display the callchain by
default from perf report.

( The user can override this default using "-g none" option from
  perf report. )

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:42 +02:00
Frederic Weisbecker b0efe213f8 perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty callchains
When the callchain tree comes to insert an empty backtrace, it
raises a spurious warning about the fact we are inserting an
empty. This is spurious because the radix tree assumes it did
something wrong to reach this state. But it didn't, we just met
an empty callchain that has to be ignored.

This happens occasionally with certain types of call-chain
recordings. If it happens it's a big nuisance as perf report
output starts with thousands of warning lines.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:41 +02:00
Pierre Habouzit 7eac7e9e72 perf util: Fix do_read() to fail on EOF instead of busy-looping
While toying with perf, I've noticed that perf record can
easily enter a busy loop when doing something as silly as:

    $ perf record -A ls

Yeah, do_read here really wants to read a known size, not being
able to should die(), not busy-loop ;)

That was the cause for the bug.

Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:39 +02:00
Peter Zijlstra ae07b63f4b perf list: Fix the output to not include tracepoints without an id
Stop perf list from displaying tracepoints without an id file,
those are special tracepoints that are not interfaced to
perfcounters so listing them is erroneous and passing them as
events will produce no output.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:38 +02:00
Arnaldo Carvalho de Melo 94cb9e385d perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc)
Used with perf report --verbose:

[acme@doppio linux-2.6-tip]$ perf report -v | head -16
     5.17%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
     2.56%  firefox  /lib64/libpthread-2.10.1.so            0x0000000000008e02 d [.] __pthread_mutex_lock_internal
     1.94%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x0000000000d0af8f f [.] SearchTable
     1.75%  firefox  [kernel]                               0xffffffffff60013b k [.] vread_hpet
     1.63%  firefox  /lib64/libpthread-2.10.1.so            0x000000000000a404 d [.] __pthread_mutex_unlock
     1.47%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
     1.42%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
     1.24%  firefox  [kernel]                               0xffffffff8102ca4a k [k] read_hpet
     1.16%  firefox  [kernel]                               0xffffffff810f3dd4 k [k] fget_light
     1.11%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
     0.98%  firefox  /usr/lib64/firefox-3.5.2/firefox       0x000000000000dd23 b [.] arena_ralloc
[acme@doppio linux-2.6-tip]$

The new field is just after the symbol address. To help in
figuring out symbol resolution bugs.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:36 +02:00
Peter Zijlstra 8f18aec535 perf report: Fix per task mult-counter stat reporting
Brice Goglin reported:

> I can easily sort them by thread id, but I don't know how to match
> my 4 events with each group of 4 lines.

Also report the counter id and the time running/enabled
stats (in case the counter got time-shared).

Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:35 +02:00
Peter Zijlstra 1c222bce7d perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header
Brice Goglin reported that only the first result from a
multi-counter perf record --stat run is accurate, the
rest looks bogus.

A silly mistake made us re-read the first attribute for
every recorded attribute.

Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:34 +02:00
Frederic Weisbecker 1953287bfe perf tools: Fix call-chain cumul hit based sub-total (fractal mode)
The callchain fractal mode builds each new total hits in a new
branch of profiling by using the parent's hits of the current
branch plus the hits of the children.

This is wrong, the total hits of a branch should be made of the
sum of every children hits, we must ignore the parent hits in
this scope.

This patch also fixes another mistake with the hit counting.

Now the rates are correct.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:33 +02:00
Arnaldo Carvalho de Melo 4d1e00a8af perf symbol: Fix symbol parsing in certain cases: use the build-id as a symlink
In some cases distros have binaries and debuginfo in weird places:

[root@doppio tuna]# ls -la /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
-rwxr-xr-x 1 root root 90024 2009-08-03 19:45 /usr/lib64/firefox-3.5.2/firefox
-rwxr-xr-x 1 root root 90024 2009-08-03 18:23 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
[root@doppio tuna]# sha1sum /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
19a858077d263d5de22c9c5da250d3e4396ae739  /usr/lib64/xulrunner-1.9.1/xulrunner-stub
19a858077d263d5de22c9c5da250d3e4396ae739  /usr/lib64/firefox-3.5.2/firefox
[root@doppio tuna]# rpm -qf /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
xulrunner-1.9.1.2-1.fc11.x86_64
firefox-3.5.2-2.fc11.x86_64
[root@doppio tuna]# ls -la /usr/lib/debug/{usr/lib64/xulrunner-1.9.1/xulrunner-stub,usr/lib64/firefox-3.5.2/firefox}.debug
ls: cannot access /usr/lib/debug/usr/lib64/firefox-3.5.2/firefox.debug: No such file or directory
-rwxr-xr-x 1 root root 403608 2009-08-03 18:22 /usr/lib/debug/usr/lib64/xulrunner-1.9.1/xulrunner-stub.debug

Seemingly we don't have a .symtab when we actually can find it
if we use the .note.gnu.build-id ELF section put in place by
some distros. Use it and find the symbols we need.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06 20:24:37 +02:00
Peter Zijlstra 2cdbc46d7b perf: Auto-detect libbfd
Since the C++ demangling isn't needed for everybody and
bfd/iberty aren't widely/easily available on all machines, make
it optional.

It also allows you to forcefully disable demangling by using
NO_DEMANGLE=1 and otherwise tries to detect libbfd/libiberty
combinations that result in a compiling demangler.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
LKML-Reference: <20090801082048.GX12579@kernel.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05 14:12:08 +02:00
Roel Kluin 7e030655dd perf: Fix read buffer overflow
Check whether index is within bounds before testing the element.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: a.p.zijlstra@chello.nl
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A757BCF.40101@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-04 11:09:56 +02:00
Stoyan Gaydarov 9b30a26bf3 perf tools: Fix faulty check
This patch fixes a spelling error that has resulted from copy
and pasting. The location of the error was found using a
semantic patch but the semantic patch was not trying to find
these errors. After looking things over it seemed logical that
this change was needed. Please review it and then include the
patch if it is in fact the correct change.

Signed-off-by: Stoyan Gaydarov <sgayda2@uiuc.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1248949529-20891-1-git-send-email-sgayda2@uiuc.edu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02 13:47:59 +02:00
Mike Galbraith d20ff6bd6b perf_counter tools: Fix vmlinux symbol generation breakage
vmlinux meets the criteria for symbol adjustment, which breaks vmlinux generated symbols.
Fix this by exempting vmlinux.  This is a bit fragile in that someone could change the
kernel dso's name, but currently that name is also hardwired.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1248091298.18702.18.camel@marge.simson.net>
2009-07-22 18:05:58 +02:00
Jason Baron 5beeded123 perf_counter: Detect debugfs location
If "/sys/kernel/debug" is not a debugfs mount point, search for the debugfs
filesystem in /proc/mounts, but also allows the user to specify
'--debugfs-dir=blah' or set the environment variable: 'PERF_DEBUGFS_DIR'

Signed-off-by: Jason Baron <jbaron@redhat.com>
[ also made it probe "/debug" by default ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090721181629.GA3094@redhat.com>
2009-07-22 18:05:57 +02:00
Jason Baron f6bdafef2a perf_counter: Add tracepoint support to perf list, perf stat
Add support to 'perf list' and 'perf stat' for kernel tracepoints. The
implementation creates a 'for_each_subsystem' and 'for_each_event' for
easy iteration over the tracepoints.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <426129bf9fcc8ee63bb094cf736e7316a7dcd77a.1248190728.git.jbaron@redhat.com>
2009-07-22 18:05:57 +02:00
Arnaldo Carvalho de Melo 28ac909b49 perf symbol: C++ demangling
[acme@doppio ~]$ perf report -s comm,dso,symbol -C firefox -d /usr/lib64/xulrunner-1.9.1/libxul.so | grep :: | head
     2.21%  [.] nsDeque::Push(void*)
     1.78%  [.] GraphWalker::DoWalk(nsDeque&)
     1.30%  [.] GCGraphBuilder::AddNode(void*, nsCycleCollectionParticipant*)
     1.27%  [.] XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
     1.18%  [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
     1.13%  [.] nsDeque::PopFront()
     1.11%  [.] nsGlobalWindow::RunTimeout(nsTimeout*)
     0.97%  [.] nsXPConnect::Traverse(void*, nsCycleCollectionTraversalCallback&)
     0.95%  [.] nsJSEventListener::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&)
     0.95%  [.] nsCOMPtr_base::~nsCOMPtr_base()
[acme@doppio ~]$

Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Suggested-by: Clark Williams <williams@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090720171412.GB10410@ghostprotocols.net>
2009-07-22 18:05:57 +02:00
Arjan van de Ven dfe5a50461 perf: avoid structure size confusion by using a fixed size
for some reason, this structure gets compiled as 36 bytes in some files
(the ones that alloacte it) but 40 bytes in others (the ones that use it).
The cause is an off_t type that gets a different size in different
compilation units for some yet-to-be-explained reason.

But the effect is disasterous; the size/offset members of the struct
are at different offsets, and result in mostly complete garbage.
The parser in perf is so robust that this all gets hidden, and after
skipping an certain amount of samples, it recovers.... so this bug
is not normally noticed.

.... except when you want every sample to be exact.

Fix this by just using an explicitly sized type.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4A655917.9080504@linux.intel.com>
2009-07-22 18:05:57 +02:00
Peter Zijlstra 1d2f37945d Merge commit 'tip/perfcounters/core' into perf-counters-for-linus 2009-07-22 18:05:48 +02:00
Roel Kluin 23cdb5d517 perf_counter tools: Fix index boundary check
Keep index within event_type_descriptors[]

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A5A7F0B.4070106@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-13 10:58:28 +02:00
Arnaldo Carvalho de Melo a25e46c463 perf_counter tools: PLT info is stripped in -debuginfo packages
So we need to get the richer .symtab from the debuginfo
packages but the PLT info from the original DSO where we have
just the leaner .dynsym symtab.

Example:

| [acme@doppio pahole]$ perf report --sort comm,dso,symbol > before
| [acme@doppio pahole]$ perf report --sort comm,dso,symbol > after
| [acme@doppio pahole]$ diff -U1 before after
| --- before	2009-07-11 11:04:22.688595741 -0300
| +++ after	2009-07-11 11:04:33.380595676 -0300
| @@ -80,3 +80,2 @@
|       0.07%  pahole ./build/pahole              [.] pahole_stealer
| -     0.06%  pahole /usr/lib64/libdw-0.141.so   [.] 0x00000000007140
|       0.06%  pahole /usr/lib64/libdw-0.141.so   [.] __libdw_getabbrev
| @@ -91,2 +90,3 @@
|       0.06%  pahole [kernel]                    [k] free_hot_cold_page
| +     0.06%  pahole /usr/lib64/libdw-0.141.so   [.] tfind@plt
|       0.05%  pahole ./build/libdwarves.so.1.0.0 [.] ftype__add_parameter
| @@ -242,2 +242,3 @@
|       0.01%  pahole [kernel]                    [k] account_group_user_time
| +     0.01%  pahole /usr/lib64/libdw-0.141.so   [.] strlen@plt
|       0.01%  pahole ./build/pahole              [.] strcmp@plt
| [acme@doppio pahole]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1247325517-12272-4-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11 19:20:26 +02:00
Arnaldo Carvalho de Melo 27d0fd410c strlist: Introduce strlist__entry and strlist__nr_entries methods
The strlist__entry method allows accessing strlists like an
array, will be used in the 'perf report' to access the first
entry.

We now keep the nr_entries so that we can check if we have just
one entry, will be used in 'perf report' to improve the output
by showing just at the top when we have just, say, one DSO.

While at it use nr_entries to optimize strlist__is_empty by not
using the far more costly rb_first based implementation.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1247325517-12272-2-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11 19:20:25 +02:00
Arnaldo Carvalho de Melo 52d422de22 perf report: Adjust column width to the values sampled
Auto-adjust column width of perf report output to the
longest occuring string length.

Example:

[acme@doppio pahole]$  perf report --sort comm,dso,symbol | head -13

    12.79%   pahole  /usr/lib64/libdw-0.141.so    [.] __libdw_find_attr
     8.90%   pahole  /lib64/libc-2.10.1.so        [.] _int_malloc
     8.68%   pahole  /usr/lib64/libdw-0.141.so    [.] __libdw_form_val_len
     8.15%   pahole  /lib64/libc-2.10.1.so        [.] __GI_strcmp
     6.80%   pahole  /lib64/libc-2.10.1.so        [.] __tsearch
     5.54%   pahole  ./build/libdwarves.so.1.0.0  [.] tag__recode_dwarf_type
[acme@doppio pahole]$

[acme@doppio pahole]$  perf report --sort comm,dso,symbol -d /lib64/libc-2.10.1.so | head -10

    21.92%   pahole  /lib64/libc-2.10.1.so  [.] _int_malloc
    20.08%   pahole  /lib64/libc-2.10.1.so  [.] __GI_strcmp
    16.75%   pahole  /lib64/libc-2.10.1.so  [.] __tsearch
[acme@doppio pahole]$

Also add these extra options to control the new behaviour:

  -w, --field-width

Force each column width to the provided list, for large terminal
readability.

  -t, --field-separator:

Use a special separator character and don't pad with spaces, replacing
all occurances of this separator in symbol names (and other output) with
a '.' character, that thus it's the only non valid separator.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090711014728.GH3452@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11 10:24:11 +02:00
Anton Blanchard 9590b7ba3f perf_counter tools: Rename cache events to remove $
The cache events contain '$' which will hit shell variable
expansion. To avoid confusion change this to 'cache', ie
L1-d$-loads becomes L1-dcache-loads.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20090706120131.GB4391@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-10 10:04:06 +02:00
Frederic Weisbecker 805d127d62 perf report: Add "Fractal" mode output - support callchains with relative overhead rate
The current callchain displays the overhead rates as absolute:
relative to the total overhead.

This patch provides relative overhead percentage, in which each
branch of the callchain tree is a independant instrumentated object.

This provides a 'fractal' view of the call-chain profile: each
sub-graph looks like a profile in itself - relative to its parent.

You can produce such output by using the "fractal" mode
that you can abbreviate via f, fr, fra, frac, etc...

./perf report -s sym -c fractal

Example:

     8.46%  [k] copy_user_generic_string
                |
                |--52.01%-- generic_file_aio_read
                |          do_sync_read
                |          vfs_read
                |          |
                |          |--97.20%-- sys_pread64
                |          |          system_call_fastpath
                |          |          pread64
                |          |
                |           --2.81%-- sys_read
                |                     system_call_fastpath
                |                     __read
                |
                |--39.85%-- generic_file_buffered_write
                |          __generic_file_aio_write_nolock
                |          generic_file_aio_write
                |          do_sync_write
                |          reiserfs_file_write
                |          vfs_write
                |          |
                |          |--97.05%-- sys_pwrite64
                |          |          system_call_fastpath
                |          |          __pwrite64
                |          |
                |           --2.95%-- sys_write
                |                     system_call_fastpath
                |                     __write_nocancel
[...]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-05 10:30:23 +02:00