1
0
Fork 0
Commit Graph

287240 Commits (2b144498350860b6ee9dc57ff27a93ad488de5dc)

Author SHA1 Message Date
Srikar Dronamraju 2b14449835 uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints
Add uprobes support to the core kernel, with x86 support.

This commit adds the kernel facilities, the actual uprobes
user-space ABI and perf probe support comes in later commits.

General design:

Uprobes are maintained in an rb-tree indexed by inode and offset
(the offset here is from the start of the mapping). For a unique
(inode, offset) tuple, there can be at most one uprobe in the
rb-tree.

Since the (inode, offset) tuple identifies a unique uprobe, more
than one user may be interested in the same uprobe. This provides
the ability to connect multiple 'consumers' to the same uprobe.

Each consumer defines a handler and a filter (optional). The
'handler' is run every time the uprobe is hit, if it matches the
'filter' criteria.

The first consumer of a uprobe causes the breakpoint to be
inserted at the specified address and subsequent consumers are
appended to this list.  On subsequent probes, the consumer gets
appended to the existing list of consumers. The breakpoint is
removed when the last consumer unregisters. For all other
unregisterations, the consumer is removed from the list of
consumers.

Given a inode, we get a list of the mms that have mapped the
inode. Do the actual registration if mm maps the page where a
probe needs to be inserted/removed.

We use a temporary list to walk through the vmas that map the
inode.

- The number of maps that map the inode, is not known before we
  walk the rmap and keeps changing.
- extending vm_area_struct wasn't recommended, it's a
  size-critical data structure.
- There can be more than one maps of the inode in the same mm.

We add callbacks to the mmap methods to keep an eye on text vmas
that are of interest to uprobes.  When a vma of interest is mapped,
we insert the breakpoint at the right address.

Uprobe works by replacing the instruction at the address defined
by (inode, offset) with the arch specific breakpoint
instruction. We save a copy of the original instruction at the
uprobed address.

This is needed for:

 a. executing the instruction out-of-line (xol).
 b. instruction analysis for any subsequent fixups.
 c. restoring the instruction back when the uprobe is unregistered.

We insert or delete a breakpoint instruction, and this
breakpoint instruction is assumed to be the smallest instruction
available on the platform. For fixed size instruction platforms
this is trivially true, for variable size instruction platforms
the breakpoint instruction is typically the smallest (often a
single byte).

Writing the instruction is done by COWing the page and changing
the instruction during the copy, this even though most platforms
allow atomic writes of the breakpoint instruction. This also
mirrors the behaviour of a ptrace() memory write to a PRIVATE
file map.

The core worker is derived from KSM's replace_page() logic.

In essence, similar to KSM:

 a. allocate a new page and copy over contents of the page that
    has the uprobed vaddr
 b. modify the copy and insert the breakpoint at the required
    address
 c. switch the original page with the copy containing the
    breakpoint
 d. flush page tables.

replace_page() is being replicated here because of some minor
changes in the type of pages and also because Hugh Dickins had
plans to improve replace_page() for KSM specific work.

Instruction analysis on x86 is based on instruction decoder and
determines if an instruction can be probed and determines the
necessary fixups after singlestep.  Instruction analysis is done
at probe insertion time so that we avoid having to repeat the
same analysis every time a probe is hit.

A lot of code here is due to the improvement/suggestions/inputs
from Peter Zijlstra.

Changelog:

(v10):
 - Add code to clear REX.B prefix as suggested by Denys Vlasenko
   and Masami Hiramatsu.

(v9):
 - Use insn_offset_modrm as suggested by Masami Hiramatsu.

(v7):

 Handle comments from Peter Zijlstra:

 - Dont take reference to inode. (expect inode to uprobe_register to be sane).
 - Use PTR_ERR to set the return value.
 - No need to take reference to inode.
 - use PTR_ERR to return error value.
 - register and uprobe_unregister share code.

(v5):

 - Modified del_consumer as per comments from Peter.
 - Drop reference to inode before dropping reference to uprobe.
 - Use i_size_read(inode) instead of inode->i_size.
 - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
 - Includes errno.h as recommended by Stephen Rothwell to fix a build issue
   on sparc defconfig
 - Remove restrictions while unregistering.
 - Earlier code leaked inode references under some conditions while
   registering/unregistering.
 - Continue the vma-rmap walk even if the intermediate vma doesnt
   meet the requirements.
 - Validate the vma found by find_vma before inserting/removing the
   breakpoint
 - Call del_consumer under mutex_lock.
 - Use hash locks.
 - Handle mremap.
 - Introduce find_least_offset_node() instead of close match logic in
   find_uprobe
 - Uprobes no more depends on MM_OWNER; No reference to task_structs
   while inserting/removing a probe.
 - Uses read_mapping_page instead of grab_cache_page so that the pages
   have valid content.
 - pass NULL to get_user_pages for the task parameter.
 - call SetPageUptodate on the new page allocated in write_opcode.
 - fix leaking a reference to the new page under certain conditions.
 - Include Instruction Decoder if Uprobes gets defined.
 - Remove const attributes for instruction prefix arrays.
 - Uses mm_context to know if the application is 32 bit.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Also-written-by: Jim Keniston <jkenisto@us.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linux-mm <linux-mm@kvack.org>
Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
[ Made various small edits to the commit log ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-17 10:00:01 +01:00
Ingo Molnar d1e169da9e perf core fixes and improvements.
Includes the previous pull request + the exclude_{host,guest} feature test and
 fallback to handle older kernels.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPOotAAAoJENZQFvNTUqpA+vAP/RyxxzomxCQILzAfJZ02OMVZ
 iyiX2r4HQ3MOwksJp4Kah316sKLPJjxHIHNxzY/1jJ4Vb/UfJYnMf3mvdbHhN6aq
 oxqFxzFPWXOJMCvLWEsK/kBHScLCoy0JAPGMzOyBObVjCIqJ3/P3/5ltLvBYaugh
 +IRzWMDnqN4fH3QOYGZ962j+MEkr/KgYUJw5VwHNO1WXKotYbGhawGRz6dgpXIwV
 nugWgCS+Nt97tfqK/nAfFht8Gv3kUBkBnPiKuCQLat422IwFVPqJ6ooNi0TeNke0
 QbAdz4Aft5mInIMftzW5kGgN2feH2JTb6VPNcoIsIaiIl9toWNSf//zBuY4Z6dR+
 JdIbvNBzmkZUhqDDqaPg30X53d2bxP1L/+hoWv1Ocgf5vmqQfuQsQm2o5OFcGq83
 XDoYifu8awClsqyLVHu/Meup4+1IDVgIfx+jbW9m9pVQZVCD0xcF/OUqi95mQ9me
 bvDyfQt4Oieov+aNgBa6r1lkS+vwrU8MKGvg3vGUoDUKpDato/59a+i+xoYjsQrB
 ssG3iT/7mTat1g6yo+jqYxyW6WMIVDufwyFy41gJDsfbJej44W2UPB4Q1BsuIeIc
 FljInjJtN6w+9fCz0veUMHSOz0tLqLAtLmNCb30HwXVQqjHteUn7YOHwlTZLNpL4
 FXFm5QeDr6GwLXaHfOgn
 =15Z0
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Includes smaller fixes and improvements plus the exclude_{host,guest} feature
test and fallback to handle older kernels.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-17 09:02:28 +01:00
Arnaldo Carvalho de Melo 808e122630 perf tools: Invert the sample_id_all logic
Instead of requiring that users of perf_record_opts set
.sample_id_all_avail to true, just invert the logic, using
.sample_id_all_missing, that doesn't need to be explicitely initialized
since gcc will zero members ommitted in a struct initialization.

Just like the newly introduced .exclude_{guest,host} feature test.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-14 14:18:57 -02:00
Arnaldo Carvalho de Melo 0c9781280f perf tools: Handle kernels that don't support attr.exclude_{guest,host}
Just fall back to resetting those fields, if set, warning the user that
that feature is not available.

If guest samples appear they will just be discarded because no struct
machine will be found and thus the event will be accounted as not
handled and dropped, see 0c09571.

Reported-by: Namhyung Kim <namhyung@gmail.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-14 14:05:30 -02:00
Stephane Eranian 7e1ccd3804 perf tools: cleanup initialization of attr->size
The perf_event_attr size needs to be initialized in all cases because it
captures the ABI version.

This patch moves the initialization of the field from the
perf_event_open() syscall stub to its proper location in the
event_attr_init().

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120209151238.GA10272@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:35:04 -02:00
Robert Richter f1c67db7e3 perf tools: Factor out feature op to process header sections
There is individual code for each feature to process header sections.

Adding a function pointer .process to struct feature_ops for keeping the
implementation in separate functions. Code to process header sections is
now a generic function.

Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/1328884916-5901-2-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:33:36 -02:00
Robert Richter 08d95bd256 perf tools: Moving code in header.c
Needed for later changes. No modified functionality.

Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/1328884916-5901-1-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:32:32 -02:00
Jiri Olsa 850f8127fa perf tools: Add bitmap_or function into bitmap object
Adding implementation os bitmap_or function to the bitmap object. It is
stolen from the kernel lib/bitmap.o object.

It is used in upcomming patches.

Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1327674868-10486-5-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:28:10 -02:00
Jiri Olsa e90fda0635 perf tools: Add sysfs mountpoint interface
Adding sysfs object to provide sysfs mount information in the same way
as debugfs object does.

The object provides following function:
  sysfs_find_mountpoint

which returns the sysfs mount mount.

Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1327674868-10486-4-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:27:15 -02:00
Jiri Olsa 2837609fef perf tools: Remove unused functions from debugfs object
Following debugfs object functions are not referenced
within the code:

  int debugfs_valid_entry(const char *path);
  int debugfs_umount(void);
  int debugfs_write(const char *entry, const char *value);
  int debugfs_read(const char *entry, char *buffer, size_t size);
  void debugfs_force_cleanup(void);
  int debugfs_make_path(const char *element, char *buffer, int size);

Removing them.

Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1327674868-10486-3-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:25:38 -02:00
Namhyung Kim e334c726ca perf tools: Get rid of ctype.h in symbol.c
The ctype.h in symbol.c was needed because of isupper(). However we now
have it in util.h, it can be changed to use our implementation.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328836217-9118-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:22:50 -02:00
Namhyung Kim 3bd2b8d109 perf tools: ctype.c only wants util.h
The implementation of sane ctype macros only depends on symbols in
util.h not cache.h.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328836217-9118-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:17:40 -02:00
Namhyung Kim 2cd13b0f7d perf tools: Implement islower/isupper macro into util.h
The util.h header provides various ctype macros but lacks those two.

Add them.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328836217-9118-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:15:43 -02:00
Joerg Roedel c4a7dca92b perf tools: Change perf_guest default back to false
Setting perf_guest to true by default makes no sense because the perf
subcommands can not setup guest symbol information and thus not process
and guest samples. The only exception is perf-kvm which changes the
perf_guest value on its own.  So change the default for perf_guest back
to false.

Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 23:14:44 -02:00
Joerg Roedel 0c095715b3 perf top: Don't process samples with no valid machine object
The perf sample processing code relies on a valid machine object. Make
sure that this path is only entered when such a object exists.

A counter for samples where no machine object exits is also introduced
to give the user a message about these samples.

Reported-by: David Ahern <dsahern@gmail.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328893505-4115-2-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 22:55:58 -02:00
David Ahern b52956c961 perf tools: Allow multiple threads or processes in record, stat, top
Allow a user to collect events for multiple threads or processes
using a comma separated list.

e.g., collect data on a VM and its vhost thread:
  perf top -p 21483,21485
  perf stat -p 21483,21485 -ddd
  perf record -p 21483,21485

or monitoring vcpu threads
  perf top -t 21488,21489
  perf stat -t 21488,21489 -ddd
  perf record -t 21488,21489

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 22:54:11 -02:00
David Ahern eca1c3e3f9 perf tools: Fix out of tree compiles
For latest tip/perf/core tree Compiles are failing on:

GEN common-cmds.h
make: *** No rule to make target `../../arch/x86/lib/memset_64.S', needed by `builtin-annotate.o'.  Stop.

Resolve by adding memset.* to the tar file.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1329145057-26302-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 22:46:41 -02:00
Namhyung Kim 6a5c13aff4 perf tools: Fix build dependency of perf python extension
The perf python extention (perf.so) file lacks its dependencies in the
Makefile so that it cannot be refreshed if one of source files it depends
is changed. Fix it by putting them in a separate file and processing it in
both of Makefile and setup.py.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1329043524-12470-1-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-13 18:01:25 -02:00
Masami Hiramatsu f8d98f1095 x86: Fix to decode grouped AVX with VEX pp bits
Fix to decode grouped AVX with VEX pp bits which should be
handled as same as last-prefixes. This fixes below warnings
in posttest with CONFIG_CRYPTO_SHA1_SSSE3=y.

 Warning: arch/x86/tools/test_get_len found difference at <sha1_transform_avx>:ffffffff810d5fc0
 Warning: ffffffff810d6069:	c5 f9 73 de 04       	vpsrldq $0x4,%xmm6,%xmm0
 Warning: objdump says 5 bytes, but insn_get_length() says 4
 ...

With this change, test_get_len can decode it correctly.

 $ arch/x86/tools/test_get_len -v -y
 ffffffff810d6069:       c5 f9 73 de 04          vpsrldq $0x4,%xmm6,%xmm0
 Succeed: decoded and checked 1 instructions

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120210053340.30429.73410.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-11 15:11:35 +01:00
Fernando Luis Vázquez Cao 86f5e6a7b1 watchdog: Fix code/comments mismatches
Reflect the change in the soft and hard lockup thresholds and
their relation to the frequency of the hrtimer and NMI events in
the code comments. While at it, remove references to files that
do not exist anymore.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-11 15:11:33 +01:00
Fernando Luis Vázquez Cao 5f32908943 watchdog: Update Kconfig entries
The soft and hard lockup thresholds have changed so the
corresponding Kconfig entries need to be updated accordingly.
Add a reference to  watchdog_thresh while at it.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-2-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-11 15:11:30 +01:00
Fernando Luis Vázquez Cao 9919cba7ff watchdog: Update documentation
The soft and hard lockup detectors are now built on top of the
hrtimer and perf subsystems. Update the documentation
accordingly.

Signed-off-by: Fernando Luis Vazquez Cao<fernando@oss.ntt.co.jp>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1328827342-6253-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-11 15:11:28 +01:00
David Ahern d366549895 perf record: No build id option fails
A recent refactoring of perf-record introduced the following:

perf record -a -B
Couldn't generating buildids. Use --no-buildid to profile anyway.
sleep: Terminated

I believe the triple negative was meant to be only a double negative.
:-) While I'm there, fixed the grammar on the error message.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-09 12:28:10 -02:00
Stephane Eranian 73323f541f perf tools: fix endianness detection in perf.data
The current version of perf detects whether or not the perf.data file is
written in a different endianness using the attr_size field in the
header of the file. This field represents sizeof(struct perf_event_attr)
as known to perf record. If the sizes do not match, then perf tries the
byte-swapped version. If they match, then the tool assumes a different
endianness.

The issue with the approach is that it assumes the size of
perf_event_attr always has to match between perf record and perf report.
However, the kernel perf_event ABI is extensible.  New fields can be
added to struct perf_event_attr. Consequently, it is not possible to use
attr_size to detect endianness.

This patch takes another approach by using the magic number written at
the beginning of the perf.data file to detect endianness. The magic
number is an eight-byte signature.  It's primary purpose is to identify
(signature) a perf.data file. But it could also be used to encode the
endianness.

The patch introduces a new value for this signature. The key difference
is that the signature is written differently in the file depending on
the endianness. Thus, by comparing the signature from the file with the
tool's own signature it is possible to detect endianness. The new
signature is "PERFILE2".

Backward compatiblity with existing perf.data file is ensured.

Tested-by: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Vince Weaver <vweaver1@eecs.utk.edu>
Link: http://lkml.kernel.org/r/1328187288-24395-15-git-send-email-eranian@google.com
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-09 12:28:10 -02:00
Borislav Petkov c98fdeaa92 x86/sched/perf/AMD: Set sched_clock_stable
Stephane Eranian reported that doing a scheduler latency
measurements with perf on AMD doesn't work out as expected due
to the fact that the sched_clock() granularity is too coarse,
i.e. done in jiffies due to the sched_clock_stable not set,
which, if set, would mean that we get to use the TSC as sample
source which would give us much higher precision.

However, there's no reason not to set sched_clock_stable on AMD
because all families from F10h and upwards do have an invariant
TSC and have the CPUID flag to prove (CPUID_8000_0007_EDX[8]).

Make it so, #1.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Venki Pallipadi <venki@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/r/20120206132546.GA30854@quad
[ Should any non-standard system break the TSC, we should
  mark them so explicitly, in their platform init handler, or
  in a DMI quirk. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-07 13:12:08 +01:00
Ingo Molnar 52fce956a7 perf/core fixes and improvements.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPMEQJAAoJENZQFvNTUqpAPhsP+wRxyE/nsIo1nZGx+sWMJqbm
 xjgqgPdLUR+Ud8/Du7CKSLl23PNqf+GsMPQqY6n8mezowtwpHPigDQjbYaCgD5Le
 P6f8JAXnKXGdw9H/aDoIbuSHWb6JS7EhHTntpHIRfuikTsoB+n9CA6UNnv20gJ2o
 NQQ6xTSxjKZutQQpzDPmz05nmQJ750MnROfSI6/4xCWlXJU29CadOkU0S6EQ83QA
 iSbOhZ5Am2OxkD/ICxgail/AuEmNe9DElcNHUHle6hgiJSCSOMzb94fYCzSvRXyP
 QaZGqDlHyV9ZFC3qKkmtCCfqRdezFkMFYE+NTdlRwtEQ7CVxhvJ2/kWw8T66MZOT
 lO2vj89wQMjaVldxMojLerVKFhn72WA5h8j7UDXCiECusmm4ObERtDGeKDLWngB7
 mNqNvqeKs+honVJEqp7vkxb11QQv0J7KPWppUrOuPY9UQlhZRzec0jUdNUN2MBVO
 GdqguwtD9lXsnbBR+tz6eHJLZAhl6hyAL/+h58cPDPwm6kdMC9ar3RfWvxWAT/8S
 AHxdkAELWGq1obYvbTLp4kgRNr+hyYgKS+6y3TmrnFvV8zw5lKV8hxj0YYezlTvQ
 wIa5Vih553hPcMidbiNqeBCieiZTt6aoYzzx3UsTEPEpjXrvsffx7NQj3N/J58En
 mTOSQpPoP/AfwEvK37qh
 =Ib1W
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

perf/core fixes and improvements.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-07 10:00:41 +01:00
Ingo Molnar 290436c9c6 Linux 3.3-rc2
.. several days delayed. No reason, I just didn't think of it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJPKF4KAAoJEHm+PkMAQRiGw/YH+wXg2DpZUuHeBK52zMGlJBPc
 DzX11/Uan3Y07gM0JbDzuVxwjX4vdxR2bV6r1qsLP8JEUnE8jyFC32DBGi5WAht7
 F4KU/Uov2Ds5/wzvY4Iuo01C+JftQHXuy/Sbhck1d0LI0yjLejRaw+zuJv0x2/eS
 7YqV+KTGE1lDuJs/Gyq1Vqr1g9831AuS1tv/g3gaqBuN6TcPBFCocaVxzwrUc+y6
 94h26XbbOhQRIz38oqUkiqAGnvYS61ocyBcEiRHf0dXkNSDIINqlgukvd7YTXouA
 jj/w/DWpMRcQuYAgqkrurr9+yWC9hVQcsvvQ5sAQnIPcxoR868sg1pO8Oheq+1g=
 =kUzV
 -----END PGP SIGNATURE-----

Merge tag 'v3.3-rc2' into perf/core

Linux 3.3-rc2

Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-07 09:59:29 +01:00
Namhyung Kim 9dac6a29e0 perf stat: Align scaled output of cpu-clock
The output of cpu-clock event is controlled in nsec_printout(),
but its alignment was broken:

 Performance counter stats for 'sleep 1':

         6,038,774 instructions              #    0.00  insns per cycle
               180 faults                    #    0.007 K/sec                   [99.95%]
         1,282,201 branches                  #    0.053 M/sec                   [99.84%]
      24126.221811 cpu-clock                 [99.62%]
      24121.689540 task-clock                #   24.098 CPUs utilized           [99.52%]

       1.001001017 seconds time elapsed

This patch fixes this:

 Performance counter stats for 'sleep 1':

        13,540,843 instructions              #    0.00  insns per cycle
               180 faults                    #    0.007 K/sec                   [99.94%]
         2,875,386 branches                  #    0.119 M/sec                   [99.82%]
      24144.221137 cpu-clock                                                    [99.61%]
      24133.515366 task-clock                #   24.109 CPUs utilized           [99.52%]

       1.001020946 seconds time elapsed

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328514285-26232-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 19:17:39 -02:00
Namhyung Kim 5fde2523bd perf stat: Adjust print unit
The default 'M/sec' unit is not useful if the result is small enough.

Adjust it dynamically according to the value.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328514285-26232-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 19:17:11 -02:00
Franck Bui-Huu 762b2935fc perf doc: Allow producing documentation in a specified output directory
Currently we can put the object files in a different directory by using
'O=' comand line argument.

However the generated documentation files don't honor this directive,

This patch fixes that. It's been tested for man target but the others
seems currently broken so no tests have been done on them so far.

Link: http://lkml.kernel.org/r/1328541443-18003-1-git-send-email-fbuihuu@gmail.com
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 19:16:03 -02:00
Jiri Olsa e89cef136a perf tool: Fix perf stack to non executable on x86_64
By adding following objects:
  bench/mem-memset-x86-64-asm.o
  bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with executable stack.

The reason was that above objects are assembler sourced and are missing the
GNU-stack note section. In such case the linker assumes that the final binary
should not be restricted at all and mark the stack as RWX.

Adding section ".note.GNU-stack" definition to mentioned objects, with all
flags disabled, thus omiting those objects from linker stack flags decision.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams <williams@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
[ committer note: Remaining bits after what was already added to perf/urgent ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 19:14:17 -02:00
Arnaldo Carvalho de Melo 5ddf146f70 Merge branch 'perf/urgent' into perf/core
So that we can get the perf bench exec stack fixes and then apply the
remaining fix for the files added after what is in perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 19:11:02 -02:00
Naveen N. Rao a4a03fc7ef perf evsel: Fix an issue where perf report fails to show the proper percentage
This patch fixes an issue where perf report shows nan% for certain
perf.data files. The below is from a report for a do_fork probe:

   -nan%           sshd  [kernel.kallsyms]  [k] do_fork
   -nan%    packagekitd  [kernel.kallsyms]  [k] do_fork
   -nan%    dbus-daemon  [kernel.kallsyms]  [k] do_fork
   -nan%           bash  [kernel.kallsyms]  [k] do_fork

A git bisect shows commit f3bda2c as the cause. However, looking back
through the git history, I saw commit 640c03c which seems to have
removed the required initialization for perf_sample->period. The problem
only started showing after commit f3bda2c. The below patch re-introduces
the initialization and it fixes the problem for me.

With the below patch, for the same perf.data:

  73.08%             bash  [kernel.kallsyms]  [k] do_fork
   8.97%      11-dhclient  [kernel.kallsyms]  [k] do_fork
   6.41%             sshd  [kernel.kallsyms]  [k] do_fork
   3.85%        20-chrony  [kernel.kallsyms]  [k] do_fork
   2.56%         sendmail  [kernel.kallsyms]  [k] do_fork

This patch applies over current linux-tip commit 9949284.

Problem introduced in:

$ git describe 640c03c
v2.6.37-rc3-83-g640c03c

Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/20120203170113.5190.25558.stgit@localhost6.localdomain6
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 18:59:38 -02:00
Jiri Olsa bf32c9ebc9 perf tools: Fix prefix matching for kernel maps
In some perf ancient versions we used '[kernel.kallsyms._text]' as the
name for the kernel map.

This got changed with commit:
  perf: 'perf kvm' tool for monitoring guest performance from host
  commit a1645ce12a
  Author: Zhang, Yanmin <yanmin_zhang@linux.intel.com>

and we started to use following name '[kernel.kallsyms]_text'.

This name change is important for the report code dealing with ancient
perf data. When processing the kernel map event, we need to recognize
the old naming (dont match the last ']') and initialize the kernel map
correctly.

The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the
superfluous ']' to get correct symbol name.

Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328461865-6127-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 18:57:39 -02:00
Jiri Olsa 7a0153ee15 perf tools: Fix perf stack to non executable on x86_64
By adding following objects:
  bench/mem-memcpy-x86-64-asm.o
the x86_64 perf binary ended up with executable stack.

The reason was that above object are assembler sourced and is missing the
GNU-stack note section. In such case the linker assumes that the final binary
should not be restricted at all and mark the stack as RWX.

Adding section ".note.GNU-stack" definition to mentioned object, with all
flags disabled, thus omiting this object from linker stack flags decision.

Problem introduced in:

  $ git describe ea7872b
  v2.6.37-rc2-19-gea7872b

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=783570
Reported-by: Clark Williams <williams@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1328100848-5630-1-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
[ committer note: Backported fix to perf/urgent (3.3-rc2+) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-06 18:54:06 -02:00
Stephane Eranian 84f2b9b2ed perf: Remove deprecated WARN_ON_ONCE()
With the new throttling/unthrottling code introduced with
commit:

  e050e3f0a7 ("perf: Fix broken interrupt rate throttling")

we occasionally hit two WARN_ON_ONCE() checks in:

  - intel_pmu_pebs_enable()
  - intel_pmu_lbr_enable()
  - x86_pmu_start()

The assertions are no longer problematic. There is a valid
path where they can trigger but it is harmless.

The assertion can be triggered with:

  $ perf record -e instructions:pp ....

Leading to paths:

  intel_pmu_pebs_enable
  intel_pmu_enable_event
  x86_perf_event_set_period
  x86_pmu_start
  perf_adjust_freq_unthr_context
  perf_event_task_tick
  scheduler_tick

And:

  intel_pmu_lbr_enable
  intel_pmu_enable_event
  x86_perf_event_set_period
  x86_pmu_start
  perf_adjust_freq_unthr_context.
  perf_event_task_tick
  scheduler_tick

cpuc->enabled is always on because when we get to
perf_adjust_freq_unthr_context() the PMU is not totally
disabled. Furthermore when we need to adjust a period,
we only stop the event we need to change and not the
entire PMU. Thus, when we re-enable, cpuc->enabled is
already set. Note that when we stop the event, both
pebs and lbr are stopped if necessary (and possible).

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-03 08:24:40 +01:00
Linus Torvalds 6c073a7ee2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: fix safety of rbd_put_client()
  rbd: fix a memory leak in rbd_get_client()
  ceph: create a new session lock to avoid lock inversion
  ceph: fix length validation in parse_reply_info()
  ceph: initialize client debugfs outside of monc->mutex
  ceph: change "ceph.layout" xattr to be "ceph.file.layout"
2012-02-02 15:47:33 -08:00
Josh Triplett ff05f603c3 include/linux/lp8727.h: Remove executable bit
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-02 15:43:40 -08:00
Alex Elder d23a4b3fd6 rbd: fix safety of rbd_put_client()
The rbd_client structure uses a kref to arrange for cleaning up and
freeing an instance when its last reference is dropped.  The cleanup
routine is rbd_client_release(), and one of the things it does is
delete the rbd_client from rbd_client_list.  It acquires node_lock
to do so, but the way it is done is still not safe.

The problem is that when attempting to reuse an existing rbd_client,
the structure found might already be in the process of getting
destroyed and cleaned up.

Here's the scenario, with "CLIENT" representing an existing
rbd_client that's involved in the race:

 Thread on CPU A                | Thread on CPU B
 ---------------                | ---------------
 rbd_put_client(CLIENT)         | rbd_get_client()
   kref_put()                   |   (acquires node_lock)
     kref->refcount becomes 0   |   __rbd_client_find() returns CLIENT
     calls rbd_client_release() |   kref_get(&CLIENT->kref);
                                |   (releases node_lock)
       (acquires node_lock)     |
       deletes CLIENT from list | ...and starts using CLIENT...
       (releases node_lock)     |
       and frees CLIENT         | <-- but CLIENT gets freed here

Fix this by having rbd_put_client() acquire node_lock.  The result
could still be improved, but at least it avoids this problem.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02 12:56:59 -08:00
Christopher Yeoh 8cdb878dcb Fix race in process_vm_rw_core
This fixes the race in process_vm_core found by Oleg (see

  http://article.gmane.org/gmane.linux.kernel/1235667/

for details).

This has been updated since I last sent it as the creation of the new
mm_access() function did almost exactly the same thing as parts of the
previous version of this patch did.

In order to use mm_access() even when /proc isn't enabled, we move it to
kernel/fork.c where other related process mm access functions already
are.

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-02 12:55:17 -08:00
Alex Elder 97bb59a03d rbd: fix a memory leak in rbd_get_client()
If an existing rbd client is found to be suitable for use in
rbd_get_client(), the rbd_options structure is not being
freed as it should.  Fix that.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02 12:49:27 -08:00
Alex Elder d8fb02abdc ceph: create a new session lock to avoid lock inversion
Lockdep was reporting a possible circular lock dependency in
dentry_lease_is_valid().  That function needs to sample the
session's s_cap_gen and and s_cap_ttl fields coherently, but needs
to do so while holding a dentry lock.  The s_cap_lock field was
being used to protect the two fields, but that can't be taken while
holding a lock on a dentry within the session.

In most cases, the s_cap_gen and s_cap_ttl fields only get operated
on separately.  But in three cases they need to be updated together.
Implement a new lock to protect the spots updating both fields
atomically is required.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-02-02 12:49:19 -08:00
Xi Wang 32852a81bc ceph: fix length validation in parse_reply_info()
"len" is read from network and thus needs validation.  Otherwise, given
a bogus "len" value, p+len could be an out-of-bounds pointer, which is
used in further parsing.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02 12:49:11 -08:00
Sage Weil ab434b60ab ceph: initialize client debugfs outside of monc->mutex
Initializing debufs under monc->mutex introduces a lock dependency for
sb->s_type->i_mutex_key, which (combined with several other dependencies)
leads to an annoying lockdep warning.  There's no particular reason to do
the debugfs setup under this lock, so move it out.

It used to be the case that our first monmap could come from the OSD; that
is no longer the case with recent servers, so we will reliably set up the
client entry during the initial authentication.

We don't have to worry about racing with debugfs teardown by
ceph_debugfs_client_cleanup() because ceph_destroy_client() calls
ceph_msgr_flush() first, which will wait for the message dispatch work
to complete (and the debugfs init to complete).

Fixes: #1940
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02 12:49:01 -08:00
Alex Elder 114fc47492 ceph: change "ceph.layout" xattr to be "ceph.file.layout"
The virtual extended attribute named "ceph.layout" is meaningful
only for regular files.  Change its name to be "ceph.file.layout" to
more directly reflect that in the ceph xattr namespace.  Preserve
the old "ceph.layout" name for the time being (until we decide it's
safe to get rid of it entirely).

Add a missing initializer for "readonly" in the terminating entry.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-02-02 12:48:52 -08:00
Robert Richter 781ba9d2ed perf record: Make feature initialization generic
Loop over all features to enable it instead of explicitly enabling every
single feature. Reducing duplicate code and making it more robust to
later changes e.g. when adding more features.

Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/1323966762-8574-3-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-02 17:41:17 -02:00
Srikar Dronamraju 4eced2347c perf probe: Rename target_module to target
This is a precursor patch that modifies names that refer to
kernel/module to also refer to user space names.

Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Arapov <anton@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-mm <linux-mm@kvack.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20120202142040.5967.64156.sendpatchset@srdronam.in.ibm.com
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-02 17:39:15 -02:00
John Kacur 946d863cc0 perf tools: Remove distclean from Makefile help output
distclean as an alias for clean was removed from the perf Makefile by
commit a3d1ee10d1

However, that commit neglected to remove it from the help output of
the perf Makefile, which could result in a user trying the following.

$ cd tools/perf/
$ make help | grep distclean
  distclean		- alias to clean
$ make distclean
make: *** No rule to make target `distclean'.  Stop.

This patch removes it from the Makefile help output.

Link: http://lkml.kernel.org/r/1328134591-19851-1-git-send-email-jkacur@redhat.com
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-02-02 17:37:51 -02:00
Linus Torvalds 24b36da33c Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms/blit: fix blit copy for very large buffers
  drm/radeon/kms: fix TRAVIS panel setup
  drm/radeon: fix use after free in ATRM bios reading code.
  drm/radeon/kms: Fix device tree linkage of DP i2c buses too
  drm/radeon: Set DESKTOP_HEIGHT register to the framebuffer (not mode) height.
  drm/radeon/kms: disable output polling when suspended
  drm/nv50/pm: signedness bug in nv50_pm_clocks_pre()
  drm/nouveau/gem: fix fence_sync race / oops
  drm/nouveau: fix typo on mxmdcb option
  drm/nouveau/mxm: pretend to succeed, even if we can't shadow the MXM-SIS
  drm/nouveau/disp: check that panel power gpio is enabled at init time
2012-02-02 11:19:03 -08:00
Linus Torvalds c84e295b30 Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  Revert "microblaze: Add topology init"
2012-02-02 11:18:18 -08:00