redonkable/remarkable-linux

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	6630125419	perf archive: Don't try to collect files without a build-id To avoid these errors: [root@doppio ~]# perf archive tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: Exiting with failure status due to previous errors [root@doppio ~]# More work is needed to support archiving symtabs for binaries without a build-id, perhaps creating a perf.data UUID + adding build-ids for the binaries copied into the cache and then have this perf.data session UUID be a directory with symlinks to the by now calculated build-id of the files inside it. Or just do an extra pass and insert the calculated build-ids in the perf.data header. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-03-02 19:27:46 +01:00
Peter Zijlstra	b622d644c7	perf_events, x86: Fixup fixed counter constraints Patch `1da53e0230` ("perf_events, x86: Improve x86 event scheduling") lost us one of the fixed purpose counters and then `ed8777fc13` ("perf_events, x86: Fix event constraint masks") broke it even further. Widen the fixed event mask to event+umask and specify the full config for each of the 3 fixed purpose counters. Then let the init code fill out the placement for the GP regs based on the cpuid info. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-03-02 15:06:47 +01:00
Peter Zijlstra	320ebf09cb	perf, x86: Restrict the ANY flag The ANY flag can show SMT data of another task (like 'top'), so we want to disable it when system-wide profiling is disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-03-02 15:06:46 +01:00
Robert Richter	bb1165d688	perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE For consistency reasons this patch renames ARCH_PERFMON_EVENTSEL0_ENABLE to ARCH_PERFMON_EVENTSEL_ENABLE. The following is performed: $ sed -i -e s/ARCH_PERFMON_EVENTSEL0_ENABLE/ARCH_PERFMON_EVENTSEL_ENABLE/g \ arch/x86/include/asm/perf_event.h arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perfctr-watchdog.c \ arch/x86/oprofile/op_model_amd.c arch/x86/oprofile/op_model_ppro.c Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-03-01 14:21:23 +01:00
Robert Richter	a163b1099d	perf, x86: add some IBS macros to perf_event.h Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-03-01 11:23:15 +01:00
Robert Richter	1d6040f17d	perf, x86: make IBS macros available in perf_event.h This patch moves code from oprofile to perf_event.h to make it also available for usage by perf. Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-03-01 11:23:15 +01:00
Robert Richter	86d62b6fa2	Merge remote branch 'tip/oprofile' into tip/perf/core Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-03-01 11:22:45 +01:00
Frederic Weisbecker	3d083407a1	x86/hw-breakpoints: Remove the name field Remove the name field from the arch_hw_breakpoint. We never deal with target symbols in the arch level, neither do we need to ever store it. It's a legacy for the previous version of the x86 breakpoint backend. Let's remove it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-27 17:24:15 +01:00
Frederic Weisbecker	dd8b1cf681	perf: Remove pointless breakpoint union Remove pointless union in the breakpoint field of hw_perf_event. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org>	2010-02-27 17:10:39 +01:00
Frederic Weisbecker	b67577dfb4	perf lock: Drop the buffers multiplexing dependency We need to deal with time ordered events to build a correct state machine of lock events. This is why we multiplex the lock events buffers. But the ordering is done from the kernel, on the tracing fast path, leading to high contention between cpus. Without multiplexing, the events appear in a weak order. If we have four events, each split per cpu, perf record will read the events buffers in the following order: [ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....] To handle a post processing reordering, we could just read and sort the whole in memory, but it just doesn't scale with high amounts of events: lock events can fill huge amounts of memory in a short time. Basically we need to sort in memory and find a "grace period" point when we know that a given slice of previously sorted events can be committed for post-processing, so that we can unload the memory usage step by step and keep a scalable sorting list. There are no strong rules about how to define such a "grace period". What this patch does is: We define a FLUSH_PERIOD value that defines a grace period in seconds. We want to have a slice of events covering 2 * FLUSH_PERIOD in our sorted list. If FLUSH_PERIOD is big enough, it ensures that all events that occurred in the first half of the timeslice have been buffered, that none remain, and that no further events will land inside this first half of the timeslice. Then once we reach the 2 * FLUSH_PERIOD timeslice, we flush the first half to be gentle with the memory (the second half can still get new events in the middle, so wait another period to flush it). FLUSH_PERIOD is defined to 5 seconds. Say the first event started on time t0. We can safely assume that at the time we are processing events of t0 + 10 seconds, there won't be any more events to read from perf.data that occurred between t0 and t0 + 5 seconds. Hence we can safely flush the first half. 
To point out funky bugs, we have a guardian that checks a new event timestamp is not below the last event's timestamp flushed and that displays a warning in this case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com>	2010-02-27 17:06:19 +01:00
Hitoshi Mitake	84c6f88fc8	perf lock: Fix and add misc documentally things I forgot to add the 'perf lock' line to command-list.txt, so users of perf could not find perf lock when they type 'perf'. Fixing command-list.txt requires a document (tools/perf/Documentation/perf-lock.txt). But perf lock is too much "under construction" to write a stable document, so this is something like a pseudo document for now. I also wrote a description of perf lock in the help section of CONFIG_LOCK_STAT; this will guide users of lock trace events. Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> LKML-Reference: <1265267295-8388-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-02-27 17:05:22 +01:00
Tejun Heo	44ee63587d	percpu: Add __percpu sparse annotations to hw_breakpoint Add __percpu sparse annotations to hw_breakpoint. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. In kernel/hw_breakpoint.c, per_cpu(nr_task_bp_pinned, cpu)'s will trigger spurious noderef related warnings from sparse. Changing it to &per_cpu(nr_task_bp_pinned[0], cpu) will work around the problem but was deemed too ugly by the maintainer. Leave it alone until a better solution can be found. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: K.Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <4B7B4B7A.9050902@kernel.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-02-27 16:23:39 +01:00
Frederic Weisbecker	018cbffe68	Merge commit 'v2.6.33' into perf/core Merge reason: __percpu annotations need the corresponding sparse address space definition upstream. Conflicts: tools/perf/util/probe-event.c (trivial)	2010-02-27 16:18:46 +01:00
Peter Zijlstra	1dd2980d99	perf_event, amd: Fix spinlock initialization Avoid kernels from exploding on AMD machines when they have any lock debugging bits enabled. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 17:25:19 +01:00
Peter Zijlstra	24691ea964	perf_event: Fix preempt warning in perf_clock() A recent commit introduced a preemption warning for perf_clock(), use raw_smp_processor_id() to avoid this, it really doesn't matter which cpu we use here. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1267198583.22519.684.camel@laptop> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 17:25:00 +01:00
David S. Miller	4385d580f2	perf tools: Flush maps on COMM events Even though we don't register the counters until the child is right about to exec(), we're still going to get at least a few events while the fork()'d child is still executing 'perf' and in particular we're going to get the MMAP events. We can't distinguish the ones in the newly executed process because the PID will be the same. One way to solve this would be to have a PERF_RECORD_EXEC event, and when this is seen 'perf' can flush its map cache. We can't use PERF_RECORD_COMM since that's generated by other things, not just exec(). Actually, thinking about it some more, using PERF_RECORD_COMM might be a good enough approximation. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1267196914-16238-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 16:28:45 +01:00
Peter Zijlstra	f22f54f449	perf_events, x86: Split PMU definitions into separate files Split amd,p6,intel into separate files so that we can easily deal with CONFIG_CPU_SUP_* things, needed to make things build now that perf_event.c relies on symbols from amd.c Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 15:44:04 +01:00
Arnaldo Carvalho de Melo	48fb4fdd6b	perf annotate: Handle samples not at objdump output addr boundaries Without this patch we get this for need_resched: [root@mica ~]# perf annotate need_resched ------------------------------------------------ Percent \| Source code & Disassembly of vmlinux ------------------------------------------------ : : : Disassembly of section .text: : : ffffffff810095ed <need_resched>: : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095ed: 55 push %rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 0.00 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi : : static inline struct thread_info current_thread_info(void) : { : struct thread_info ti; : ti = (void )(percpu_read_stable(kernel_stack) + 0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi 0.00 : ffffffff810095fa: 00 00 : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 0.00 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi 0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8 <test_ti_thread_flag> : } 0.00 : ffffffff8100960b: c9 leaveq 0.00 : ffffffff8100960c: 85 c0 test %eax,%eax 0.00 : ffffffff8100960e: 0f 95 c0 setne %al 0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax : Disassembly of section .vsyscall_0: : Disassembly of section .vsyscall_fn: : Disassembly of section .vsyscall_1: : Disassembly of section .vsyscall_2: : Disassembly of section .init.text: : Disassembly of section .altinstr_replacement: : Disassembly of section .exit.text: [root@mica ~]# But from the 'perf report' result we know that there are hits for need_resched on a 4 way machine mostly doing nothing, so after adding code to show what is in each hist offset and collapsing IP hits for what happens between objdump lines we 
get, for the same perf.data file: [root@mica ~]# perf annotate -v need_resched ------------------------------------------------ Percent \| Source code & Disassembly of vmlinux ------------------------------------------------ : : : Disassembly of section .text: : : ffffffff810095ed <need_resched>: : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095ed: 55 push %rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 52.78 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi : : static inline struct thread_info current_thread_info(void) : { : struct thread_info ti; : ti = (void )(percpu_read_stable(kernel_stack) + 0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi 0.00 : ffffffff810095fa: 00 00 : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 9.72 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi 0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8 <test_ti_thread_flag> : } 0.00 : ffffffff8100960b: c9 leaveq 0.00 : ffffffff8100960c: 85 c0 test %eax,%eax 37.50 : ffffffff8100960e: 0f 95 c0 setne %al 0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax : Disassembly of section .vsyscall_0: : Disassembly of section .vsyscall_fn: : Disassembly of section .vsyscall_1: : Disassembly of section .vsyscall_2: : Disassembly of section .init.text: : Disassembly of section .altinstr_replacement: : Disassembly of section .exit.text: [root@mica ~]# And now 'perf annotate -v', verbose mode, will show the hits per precise IP, so that one can make sense of the attribution to each objdumop line: [root@mica ~]# perf annotate -v need_resched Looking at the vmlinux_path (5 entries long) Using /lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux for symbols annotate_sym: 
filename=/lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux, sym=need_resched, start=0xffffffff810095ed, end=0xffffffff81009614 ------------------------------------------------ Percent \| Source code & Disassembly of vmlinux ------------------------------------------------ ffffffff810095f1: 152 ffffffff81009603: 28 ffffffff8100960f: 55 ffffffff81009610: 53 h->sum: 288 <SNIP same annotation> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1267194194-15670-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 15:42:49 +01:00
Robert Richter	cfc9c0b450	oprofile/x86: fix msr access to reserved counters During switching virtual counters there is access to perfctr msrs. If the counter is not available this fails due to an invalid address. This patch fixes this. Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:28:16 +01:00
Robert Richter	c17c8fbf34	oprofile/x86: use kzalloc() instead of kmalloc() Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:20:03 +01:00
Robert Richter	68dc819ce8	oprofile/x86: fix perfctr nmi reservation for mulitplexing Multiple virtual counters share one physical counter. The reservation of virtual counters fails due to duplicate allocation of the same counter. The counters are already reserved. Thus, virtual counter reservation may be removed altogether. This also simplifies the code. Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:19:03 +01:00
Naga Chumbalkar	8588d10671	oprofile/x86: add comment to counter-in-use warning Currently, oprofile fails silently on platforms where a non-OS entity such as the system firmware "enables" and uses a performance counter. There is a warning in the code for this case. The warning indicates an already running counter. If oprofile doesn't collect data, then try using a different performance counter on your platform to monitor the desired event. Delete the counter from the desired event by editing the /usr/share/oprofile/<cpu_type>/<cpu>/events file. If the event cannot be monitored by any other counter, contact your hardware or BIOS vendor. Cc: Shashi Belur <shashi-kiran.belur@hp.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:14:34 +01:00
Robert Richter	98a2e73a06	oprofile/x86: warn user if a counter is already active This patch generates a warning if a counter is already active. Implemented for AMD and P6 models. P4 is not supported. Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Shashi Belur <shashi-kiran.belur@hp.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:14:03 +01:00
Robert Richter	ba52078e19	oprofile/x86: implement randomization for IBS periodic op counter IBS selects an op (execution operation) for sampling by counting either cycles or dispatched ops. Better statistical samples can be produced by adding a software generated random offset to the periodic op counter value with each sample. This patch adds software randomization to the IBS periodic op counter. The lower 12 bits of the 20 bit counter are randomized. IbsOpCurCnt is initialized with a 12 bit random value. There is a work around if the hw can not write to IbsOpCurCnt. Then the lower 8 bits of the 16 bit IbsOpMaxCnt [15:0] value are randomized in the range of -128 to +127 by adding/subtracting an offset to the maximum count (IbsOpMaxCnt). The linear feedback shift register (LFSR) algorithm is used for pseudo-random number generation to have low impact to the memory system. Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:14:02 +01:00
Suravee Suthikulpanit	f125be1469	oprofile/x86: implement lsfr pseudo-random number generator for IBS This patch implements a linear feedback shift register (LFSR) for pseudo-random number generation for IBS. For IBS measurements it would be good to minimize memory traffic in the interrupt handler since every access pollutes the data caches. Computing a maximal period LFSR just needs shifts and ORs. The LFSR method is good enough to randomize the ops at low overhead. 16 pseudo-random bits are enough for the implementation and it doesn't matter that the pattern repeats with a fairly short cycle. It only needs to break up (hard) periodic sampling behavior. The logic was designed by Paul Drongowski. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:14:02 +01:00
Robert Richter	64683da664	oprofile/x86: implement IBS cpuid feature detection This patch adds IBS feature detection using cpuid flags. An IBS capability mask is introduced to test for certain IBS features. The bit mask is the same as for IBS cpuid feature flags (Fn8000_001B_EAX), but bit 0 is used to indicate the existence of IBS. The patch also changes the handling of the IbsOpCntCtl bit (periodic op counter count control). The oprofilefs file for this feature (ibs_op/dispatched_ops) will be only exposed if the feature is available, also the default for the bit is set to count clock cycles. In general, the userland can detect the availability of a feature by checking for the corresponding file in oprofilefs. If it exists, the feature also exists. This may lead to a dynamic file layout depending on the cpu type with that the userland has to deal with. Current opcontrol is compatible. Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:14:02 +01:00
Robert Richter	89baaaa98a	oprofile/x86: remove node check in AMD IBS initialization Standard AMD systems have the same number of nodes as there are northbridge devices. However, there may be kernel configurations (especially for 32 bit) or system setups where the node number is different or cannot be detected properly. Thus the check is not reliable and may fail even though IBS setup was fine. For this reason it is better to remove the check. Cc: stable <stable@kernel.org> Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:14:01 +01:00
Robert Richter	013cfc5067	oprofile/x86: remove OPROFILE_IBS config option OProfile support for IBS has been in the kernel for several versions now. The feature is stable and the code can be activated permanently. As a side effect IBS now also works on nosmp configs. Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:13:55 +01:00
Robert Richter	b309a294e5	oprofile: remove EXPERIMENTAL from the config option description OProfile has been in use for a long time and is no longer experimental. Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:13:54 +01:00
Robert Richter	18b4a4d59e	oprofile: remove tracing build dependency The commit `1155de4` ring-buffer: Make it generally available already made ring-buffer available without the TRACING option enabled. This patch removes the TRACING dependency from oprofile. Fixes also oprofile configuration on ia64. The patch also applies to the 2.6.32-stable kernel. Reported-by: Tony Jones <tonyj@suse.de> Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 14:52:52 +01:00
Peter Zijlstra	6667661df4	perf_events, x86: Remove superflous MSR writes We re-program the event control register every time we reset the count; this appears to be superfluous, hence remove it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 10:56:54 +01:00
Peter Zijlstra	6e37738a2f	perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in() Since the cpu argument to hw_perf_group_sched_in() is always smp_processor_id(), simplify the code a little by removing this argument and using the current cpu where needed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1265890918.5396.3.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 10:56:53 +01:00
Stephane Eranian	38331f62c2	perf_events, x86: AMD event scheduling This patch adds correct AMD NorthBridge event scheduling. NB events are events measuring L3 cache, Hypertransport traffic. They are identified by an event code >= 0xe0. They measure events on the Northbridge which is shared by all cores on a package. NB events are counted on a shared set of counters. When a NB event is programmed in a counter, the data actually comes from a shared counter. Thus, access to those counters needs to be synchronized. We implement the synchronization such that no two cores can be measuring NB events using the same counters. Thus, we maintain a per-NB allocation table. The available slot is propagated using the event_constraint structure. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b703957.0702d00a.6bf2.7b7d@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 10:56:53 +01:00
Stephane Eranian	d76a0812ac	perf_events: Add new start/stop PMU callbacks In certain situations, the kernel may need to stop and start the same event rapidly. The current PMU callbacks do not distinguish between stop and release (i.e., stop + free the resource). Thus, a counter may be released, then it will be immediately re-acquired. Event scheduling will again take place with no guarantee to assign the same counter. On some processors, this may even lead to a failure to assign the event back due to competition between cores. This patch adds a new pair of callbacks to stop and restart a counter without actually releasing the underlying counter resource. On stop, the counter is stopped, its values saved and that's it. On start, the value is reloaded and counter is restarted (on x86, actual restart is delayed until perf_enable()). Signed-off-by: Stephane Eranian <eranian@google.com> [ added fallback to ->enable/->disable for all other PMUs fixed x86_pmu_start() to call x86_pmu.enable() merged __x86_pmu_disable into x86_pmu_stop() ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b703875.0a04d00a.7896.ffffb824@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 10:56:53 +01:00
Peter Zijlstra	3a0304e90a	perf_events: Report the MMAP pgoff value in bytes DaveM reported that currently perf interprets the pgoff value reported by the MMAP events as a byte range, but the kernel reports it as a page offset. Since it's broken (and unusable) anyway, change the kernel behaviour (ABI) to report bytes indeed, avoiding the need for userspace to deal with PAGE_SIZE things. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-26 10:56:52 +01:00
Arnaldo Carvalho de Melo	628ada0cb0	perf annotate: Defer allocating sym_priv->hist array Because symbol->end is not fixed up at symbol_filter time, only after all symbols for a DSO are loaded, and that, for asm symbols, may be bogus, causing segfaults when hits happen in these symbols. Reported-by: David Miller <davem@davemloft.net> Reported-by: Anton Blanchard <anton@samba.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: <stable@kernel.org> # for .33.x. Does not apply cleanly, needs backport. LKML-Reference: <20100225155740.GB8553@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 17:39:14 +01:00
Arnaldo Carvalho de Melo	3846df2e0a	perf symbols: Improve debugging information about symtab origins Be more clear about DSO long names and tell from which file kernel symbols were obtained, all in --verbose mode: [root@mica ~]# perf report -v > /dev/null Looking at the vmlinux_path (5 entries long) Using /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux for symbols [root@mica ~]# mv /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux /tmp/dd [root@mica ~]# perf report -v > /dev/null Looking at the vmlinux_path (5 entries long) Using /proc/kallsyms for symbols [root@mica ~]# Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1266866139-6361-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 12:27:17 +01:00
Arnaldo Carvalho de Melo	c7ad21af2c	perf top: Use a macro instead of a constant variable To overcome a silly gcc warning: cc1: warnings being treated as errors builtin-top.c: In function ‘lookup_sym_source’: builtin-top.c:291: warning: not protecting local variables: variable length buffer make: * [builtin-top.o] Error 1 make: * Waiting for unfinished jobs.... That is emitted for this: const size_t pattern_len = BITS_PER_LONG / 4 + 2; char pattern[pattern_len + 1]; Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1266866062-6287-1-git-send-email-acme@infradead.org> [ -v2: macroify the naming style ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 12:26:16 +01:00
Zhang, Yanmin	37fe5fcb7a	perf symbols: Check the right return variable In function dso__split_kallsyms(), curr_map saves the return value of map__new2. So check it instead of var map after the call returns. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: <stable@kernel.org> # for .33.x Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1267066851.1726.9.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 12:15:24 +01:00
Frederic Weisbecker	c2fbaa4b48	perf/scripts: Tag syscall_name helper as not yet available syscall_name() helper, which resolves a syscall arch number to its name, is not yet available as we first need to implement event injection for it to work. Remove it from the documentation or tag its references as unavailable yet. Once it's implemented, we can just revert the current patch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-02-25 04:07:50 +01:00
Tom Zanussi	cff68e5822	perf/scripts: Add perf-trace-python Documentation Also small update to perf-trace-perl and perf-trace docs. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1264580883-15324-13-git-send-email-tzanussi@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-02-25 04:07:49 +01:00
Tom Zanussi	44ad9cd8f0	perf/scripts: Remove unnecessary PyTuple resizes If we know the size of a tuple in advance, there's no need to resize it - start out with the known size in the first place. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1266822779.6426.4.camel@tropicana> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-02-25 04:07:49 +01:00
Tom Zanussi	4d161f0360	perf/scripts: Add syscall tracing scripts Adds a set of scripts that aggregate system call totals and system call errors. Most are Python scripts that also test basic functionality of the new Python engine, but there's also one Perl script added for comparison and for reference in some new Documentation contained in a later patch. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1264580883-15324-8-git-send-email-tzanussi@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-02-25 04:07:48 +01:00
Tom Zanussi	7e4b21b84c	perf/scripts: Add Python scripting engine Add base support for Python scripting to perf trace. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1264580883-15324-6-git-send-email-tzanussi@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-02-25 04:07:29 +01:00
Linus Torvalds	60b341b778	Linux 2.6.33	2010-02-24 10:52:17 -08:00
Linus Torvalds	1e6c5c4e4c	Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 * 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6: parisc: Set PCI CLS early in boot.	2010-02-24 10:51:21 -08:00
Linus Torvalds	46fe24389a	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Fix broken sn2 build	2010-02-24 10:51:04 -08:00
Carlos O'Donell	5fd4514bb3	parisc: Set PCI CLS early in boot. Set the PCI CLS early in the boot process to prevent device failures. In pcibios_set_master use the new pci_cache_line_size instead of a hard-coded value. Signed-off-by: Carlos O'Donell <carlos@codesourcery.com> Reviewed-by: Grant Grundler <grundler@google.com> Signed-off-by: Kyle McMartin <kyle@redhat.com>	2010-02-24 17:30:36 +00:00
Linus Torvalds	7b1f94b8a6	Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze * 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Fix out_le32() macro microblaze: Fix cache loop function for cache range	2010-02-24 07:43:02 -08:00
Linus Torvalds	83d90addc8	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Revert "block: improve queue_should_plug() by looking at IO depths"	2010-02-24 07:42:42 -08:00
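
The grace-period flushing described in the entry for b67577dfb4 ("perf lock: Drop the buffers multiplexing dependency") can be illustrated with a small userspace sketch. This is not the code from that commit: `struct sorted_queue`, `queue_insert()`, `queue_flush()` and `MAX_PENDING` are hypothetical names chosen for illustration, and a fixed array with insertion sort stands in for whatever list the real tool uses. Only the 5 second FLUSH_PERIOD, the 2 * FLUSH_PERIOD slice, and the "guardian" warning come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLUSH_PERIOD_NS (5ULL * 1000000000ULL) /* 5 s, as stated in the commit message */
#define MAX_PENDING     1024                   /* hypothetical capacity for this sketch */

struct sorted_queue {
	uint64_t ts[MAX_PENDING]; /* pending event timestamps, ascending */
	size_t   nr;              /* number of pending timestamps */
	uint64_t slice_start;     /* start of the current 2 * FLUSH_PERIOD slice */
	uint64_t last_flushed;    /* newest timestamp already handed out */
	int      started;         /* set once the first event has arrived */
	int      warnings;        /* events that arrived below last_flushed */
};

/* Insert one timestamp, keeping the array sorted. Returns 0, or -1 if full. */
static int queue_insert(struct sorted_queue *q, uint64_t ts)
{
	size_t i;

	assert(q != NULL);
	if (q->nr == MAX_PENDING)
		return -1;
	if (!q->started) {
		q->slice_start = ts; /* t0 in the commit message */
		q->started = 1;
	}
	if (ts < q->last_flushed)
		q->warnings++; /* guardian: event older than data already flushed */

	/* Insertion sort step: shift larger timestamps one slot to the right. */
	for (i = q->nr; i > 0 && q->ts[i - 1] > ts; i--)
		q->ts[i] = q->ts[i - 1];
	q->ts[i] = ts;
	q->nr++;
	return 0;
}

/*
 * Once the newest pending event is at least 2 * FLUSH_PERIOD past the slice
 * start, copy out every event from the first half of the slice and advance
 * the slice by one FLUSH_PERIOD. Returns the number of events flushed.
 */
static size_t queue_flush(struct sorted_queue *q, uint64_t *out, size_t out_cap)
{
	uint64_t limit;
	size_t n = 0;

	assert(q != NULL && out != NULL);
	if (q->nr == 0 || q->ts[q->nr - 1] < q->slice_start + 2 * FLUSH_PERIOD_NS)
		return 0;

	limit = q->slice_start + FLUSH_PERIOD_NS;
	while (n < q->nr && n < out_cap && q->ts[n] < limit) {
		out[n] = q->ts[n];
		n++;
	}
	if (n > 0)
		q->last_flushed = out[n - 1];

	/* Drop the flushed prefix and keep the second half for the next round. */
	memmove(q->ts, q->ts + n, (q->nr - n) * sizeof(q->ts[0]));
	q->nr -= n;
	q->slice_start = limit;
	return n;
}
```

A caller would feed each event read from perf.data to `queue_insert()` and call `queue_flush()` afterwards; anything returned is in timestamp order and can go to post-processing, while the second half of the slice stays buffered for one more period.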

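The entries for f125be1469 and ba52078e19 describe using a linear feedback shift register (LFSR) to get cheap pseudo-random bits for IBS sampling. As a rough illustration of the technique only, here is a textbook 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11, a well-known maximal-length configuration. The tap choice, the seed and the function names `lfsr16_next()` and `lfsr16_period()` are assumptions made for this sketch; the log above does not state which polynomial or seed the oprofile code uses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance a 16-bit Fibonacci LFSR by one step.
 * Feedback polynomial: x^16 + x^14 + x^13 + x^11 + 1 (assumed for illustration).
 * Only shifts, XORs and an OR are needed, so no memory is touched.
 */
static uint16_t lfsr16_next(uint16_t state)
{
	uint16_t bit = ((state >> 0) ^ (state >> 2) ^
			(state >> 3) ^ (state >> 5)) & 1u;

	return (uint16_t)((state >> 1) | (bit << 15));
}

/* Count the steps until the register returns to its (non-zero) seed. */
static uint32_t lfsr16_period(uint16_t seed)
{
	uint16_t s = seed;
	uint32_t steps = 0;

	assert(seed != 0); /* the all-zero state never leaves zero */
	do {
		s = lfsr16_next(s);
		steps++;
	} while (s != seed);

	return steps;
}
```

With these taps every non-zero seed cycles through all 65535 non-zero states before repeating, which fits the commit's remark that 16 pseudo-random bits with a fairly short cycle are enough to break up strictly periodic sampling.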

180972 commits