Merge branches 'perf-urgent-for-linus' and 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf changes from Ingo Molnar:
 "As a first remark I'd like to point out that the obsolete '-f'
  (--force) option, which has not done anything for several releases,
  has been removed from 'perf record' and related utilities.  Everyone
  please update muscle memory accordingly! :-)

  Main changes on the perf kernel side:

   - Performance optimizations:
        . for trace events, by Steve Rostedt.
        . for time values, by Peter Zijlstra.

   - New hardware support:
        . for Intel Silvermont (22nm Atom) CPUs, by Zheng Yan
        . for Intel SNB-EP uncore PMUs, by Zheng Yan

   - Enhanced hardware support:
        . for Intel uncore PMUs: add filter support for QPI boxes, by Zheng Yan

   - Core perf events code enhancements and fixes:
        . for full-nohz feature handling, by Frederic Weisbecker
        . for group events, by Jiri Olsa
        . for call chains, by Frederic Weisbecker
        . for event stream parsing, by Adrian Hunter

   - New ABI details:
        . Add attr->mmap2 attribute, by Stephane Eranian
        . Add PERF_EVENT_IOC_ID ioctl to return event ID, by Jiri Olsa
        . Export u64 time_zero on the mmap header page to allow TSC
          calculation, by Adrian Hunter (see the sketch below this list).
        . Add dummy software event, by Adrian Hunter.
        . Add a new PERF_SAMPLE_IDENTIFIER to make samples always
          parseable, by Adrian Hunter.
        . Make Power7 events available via sysfs, by Runzhen Wang.
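
      The time_zero export, the dummy software event and the new
      PERF_EVENT_IOC_ID ioctl combine naturally in self-monitoring
      code. A minimal sketch, assuming x86 and the cap_usr_* mmap-page
      field names of this release (later kernels renamed them to
      cap_user_*); this illustrates the ABI rather than quoting the
      patches, and most error handling is trimmed:

           #include <stdint.h>
           #include <stdio.h>
           #include <string.h>
           #include <sys/ioctl.h>
           #include <sys/mman.h>
           #include <sys/syscall.h>
           #include <unistd.h>
           #include <linux/perf_event.h>

           static inline uint64_t rdtsc(void)
           {
                   uint32_t lo, hi;
                   __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
                   return ((uint64_t)hi << 32) | lo;
           }

           /* Scale a raw TSC value to perf time, using the documented
            * time_zero/time_mult/time_shift relation. */
           static uint64_t tsc_to_perf_time(uint64_t tsc,
                                            struct perf_event_mmap_page *pc)
           {
                   uint64_t quot = tsc >> pc->time_shift;
                   uint64_t rem  = tsc & (((uint64_t)1 << pc->time_shift) - 1);

                   return pc->time_zero + quot * pc->time_mult +
                          ((rem * pc->time_mult) >> pc->time_shift);
           }

           int main(void)
           {
                   struct perf_event_attr attr;
                   struct perf_event_mmap_page *pc;
                   uint64_t id;
                   int fd;

                   memset(&attr, 0, sizeof(attr));
                   attr.size   = sizeof(attr);
                   attr.type   = PERF_TYPE_SOFTWARE;
                   attr.config = PERF_COUNT_SW_DUMMY; /* tracks, counts nothing */

                   fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
                   if (fd < 0)
                           return 1;

                   if (!ioctl(fd, PERF_EVENT_IOC_ID, &id))
                           printf("event id: %llu\n", (unsigned long long)id);

                   /* Map just the metadata page. */
                   pc = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ,
                             MAP_SHARED, fd, 0);
                   if (pc != MAP_FAILED && pc->cap_usr_time_zero)
                           printf("perf time now: %llu ns\n",
                                  (unsigned long long)tsc_to_perf_time(rdtsc(), pc));
                   return 0;
           }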

   - Code cleanups and refactorings:
        . for nohz-full, by Frederic Weisbecker
        . for group events, by Jiri Olsa

   - Documentation updates:
        . for perf_event_type, by Peter Zijlstra

  Main changes on the perf tooling side (some of these tooling changes
  utilize the above kernel side changes):

   - Lots of 'perf trace' enhancements:

        . Make 'perf trace' command line arguments consistent with
          'perf record', by David Ahern.

        . Allow specifying syscalls a la strace, by Arnaldo Carvalho de Melo.

        . Add --verbose and -o/--output options, by Arnaldo Carvalho de Melo.

        . Support ! in -e expressions, to filter a list of syscalls,
          by Arnaldo Carvalho de Melo.

        . Arg formatting improvements to allow masking arguments in
          syscalls such as futex and open, where some arguments are
          ignored and thus should not be printed depending on other args,
          by Arnaldo Carvalho de Melo.

        . Beautify the futex, open, openat, open_by_handle_at and lseek
          syscalls, by Arnaldo Carvalho de Melo.

        . Add option to analyze events in a file versus live, so that
          one can do:

           [root@zoo ~]# perf record -a -e raw_syscalls:* sleep 1
           [ perf record: Woken up 0 times to write data ]
           [ perf record: Captured and wrote 25.150 MB perf.data (~1098836 samples) ]
           [root@zoo ~]# perf trace -i perf.data -e futex --duration 1
              17.799 ( 1.020 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, ua
             113.344 (95.429 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 4294967
             133.778 ( 1.042 ms): 18004 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 429496
           [root@zoo ~]#

          By David Ahern.

        . Honor target pid / tid options when analyzing a file, by David Ahern.

        . Introduce better formatting of syscall arguments, including,
          so far, beautifiers for mmap, madvise and syscall return values,
          by Arnaldo Carvalho de Melo.

        . Handle HUGEPAGE defines in the mmap beautifier, by David Ahern.

   - 'perf report/top' enhancements:

        . Do annotation using /proc/kcore and /proc/kallsyms when
          available, removing the forced need for a vmlinux file for
          kernel assembly annotation. This also improves this use case because
          vmlinux has just the initial kernel image, not what is actually
          in use after various code patchings by things like alternatives.
          By Adrian Hunter.

        . Add --ignore-callees=<regex> option to collapse undesired parts
          of call graphs, by Greg Price.

        . Simplify symbol filtering by doing it at machine class level,
          by Adrian Hunter.

        . Add support for callchains in the gtk UI, by Namhyung Kim.

        . Add --objdump option to 'perf top', by Sukadev Bhattiprolu.

   - 'perf kvm' enhancements:

        . Add option to print only events that exceed a specified time
          duration, by David Ahern.

        . Improve stack trace printing, by David Ahern.

        . Update documentation of the live command, by David Ahern.

        . Add perf kvm stat live mode that combines aspects of 'perf kvm
          stat' record and report, by David Ahern.

        . Add option to analyze specific VM in perf kvm stat report, by
          David Ahern.

        . Do not require /lib/modules/* on a guest, by Jason Wessel.

   - 'perf script' enhancements:

        . Fix symbol offset computation for some dsos, by David Ahern.

        . Fix named threads support, by David Ahern.

        . Don't install scripting files when perl/python support
          is disabled, by Arnaldo Carvalho de Melo.

   - 'perf test' enhancements:

        . Add various improvements and fixes to the "vmlinux matches
          kallsyms" 'perf test' entry, related to the /proc/kcore
          annotation feature. By Adrian Hunter.

        . Add sample parsing test, by Adrian Hunter.

        . Add test for reading object code, by Adrian Hunter.

        . Add attr record group sampling test, by Jiri Olsa.

        . Misc testing infrastructure improvements and other details,
          by Jiri Olsa.

   - 'perf list' enhancements:

        . Skip unsupported hardware events, by Namhyung Kim.

        . List pmu events, by Andi Kleen.

   - 'perf diff' enhancements:

        . Add support for more than two files comparison, by Jiri Olsa.

   - 'perf sched' enhancements:

        . Various improvements, including removing reliance on some
          scheduler tracepoints that provide the same information as the
          PERF_RECORD_{FORK,EXIT} events. By David Ahern.

        . Remove odd build stall by moving a large struct initialization
          from a local variable to a global one, by Namhyung Kim.

   - 'perf stat' enhancements:

        . Add --initial-delay option to skip measuring for a defined
          startup phase, by Andi Kleen.

   - Generic perf tooling infrastructure/plumbing changes:

        . Tidy up sample parsing validation, by Adrian Hunter.

        . Fix up jobserver setup in the libtraceevent Makefile,
          by Arnaldo Carvalho de Melo.

        . Debug improvements, by Adrian Hunter.

        . Fix correlation of samples coming after PERF_RECORD_EXIT event,
          by David Ahern.

        . Improve robustness of the topology parsing code,
          by Stephane Eranian.

        . Add group leader sampling, which allows just one event in a group
          to sample while the other events only have their values read,
          by Jiri Olsa (sketched in code below this list).

        . Add support for a new modifier "D", which requests that the
          event, or group of events, be pinned to the PMU.
          By Michael Ellerman.

        . Support callchain sorting based on addresses, by Andi Kleen.

        . Prep work for multi perf data file storage, by Jiri Olsa.

        . libtraceevent cleanups, by Namhyung Kim.
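
      The group leader sampling mentioned above boils down to a small
      perf_event_open() pattern; the sketch below illustrates that
      pattern and is not code from the patches. The leader carries the
      sample period and a PERF_FORMAT_GROUP read format, so each of its
      samples embeds the sibling's count, while the sibling is opened
      with no period and therefore never samples on its own:

           #include <string.h>
           #include <unistd.h>
           #include <sys/syscall.h>
           #include <linux/perf_event.h>

           /* Open {cycles, instructions}: only the leader samples. */
           static int open_sampling_group(pid_t pid, int *leader, int *sibling)
           {
                   struct perf_event_attr lead, sib;

                   memset(&lead, 0, sizeof(lead));
                   lead.size          = sizeof(lead);
                   lead.type          = PERF_TYPE_HARDWARE;
                   lead.config        = PERF_COUNT_HW_CPU_CYCLES;
                   lead.sample_period = 100000;
                   lead.sample_type   = PERF_SAMPLE_IP | PERF_SAMPLE_READ;
                   lead.read_format   = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
                   lead.disabled      = 1;

                   memset(&sib, 0, sizeof(sib));
                   sib.size        = sizeof(sib);
                   sib.type        = PERF_TYPE_HARDWARE;
                   sib.config      = PERF_COUNT_HW_INSTRUCTIONS;
                   sib.read_format = lead.read_format; /* same group layout */

                   *leader = syscall(__NR_perf_event_open, &lead, pid, -1, -1, 0);
                   if (*leader < 0)
                           return -1;
                   *sibling = syscall(__NR_perf_event_open, &sib, pid, -1,
                                      *leader, 0);
                   if (*sibling < 0) {
                           close(*leader);
                           return -1;
                   }
                   return 0;
           }

      PERF_FORMAT_ID is what lets a parser attribute each value in the
      group read back to its event, which is also where the new
      PERF_EVENT_IOC_ID ioctl fits in.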

  And lots and lots of other fixes and code reorganizations that did not
  make it into the list, see the shortlog, diffstat and the Git log for
  details!"

[ Also merge a leftover from the 3.11 cycle ]

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Prevent race in unthrottling code

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (237 commits)
  perf trace: Tell arg formatters the arg index
  perf trace: Add beautifier for open's flags arg
  perf trace: Add beautifier for lseek's whence arg
  perf tools: Fix symbol offset computation for some dsos
  perf list: Skip unsupported events
  perf tests: Add 'keep tracking' test
  perf tools: Add support for PERF_COUNT_SW_DUMMY
  perf: Add a dummy software event to keep tracking
  perf trace: Add beautifier for futex 'operation' parm
  perf trace: Allow syscall arg formatters to mask args
  perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node()
  perf: Export struct perf_branch_entry to userspace
  perf: Add attr->mmap2 attribute to an event
  perf/x86: Add Silvermont (22nm Atom) support
  perf/x86: use INTEL_UEVENT_EXTRA_REG to define MSR_OFFCORE_RSP_X
  perf trace: Handle missing HUGEPAGE defines
  perf trace: Honor target pid / tid options when analyzing a file
  perf trace: Add option to analyze events in a file versus live
  perf evlist: Add tracepoint lookup by name
  perf tests: Add a sample parsing test
  ...
Linus Torvalds 2013-09-04 08:25:35 -07:00
commit 0d99b70873
144 changed files with 9147 additions and 2324 deletions

@@ -138,11 +138,11 @@ extern ssize_t power_events_sysfs_show(struct device *dev,
#define EVENT_PTR(_id, _suffix) &EVENT_VAR(_id, _suffix).attr.attr
#define EVENT_ATTR(_name, _id, _suffix) \
PMU_EVENT_ATTR(_name, EVENT_VAR(_id, _suffix), PME_PM_##_id, \
PMU_EVENT_ATTR(_name, EVENT_VAR(_id, _suffix), PME_##_id, \
power_events_sysfs_show)
#define GENERIC_EVENT_ATTR(_name, _id) EVENT_ATTR(_name, _id, _g)
#define GENERIC_EVENT_PTR(_id) EVENT_PTR(_id, _g)
#define POWER_EVENT_ATTR(_name, _id) EVENT_ATTR(PM_##_name, _id, _p)
#define POWER_EVENT_ATTR(_name, _id) EVENT_ATTR(_name, _id, _p)
#define POWER_EVENT_PTR(_id) EVENT_PTR(_id, _p)

@@ -0,0 +1,548 @@
/*
* Performance counter support for POWER7 processors.
*
* Copyright 2013 Runzhen Wang, IBM Corporation.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
EVENT(PM_IC_DEMAND_L2_BR_ALL, 0x04898)
EVENT(PM_GCT_UTIL_7_TO_10_SLOTS, 0x020a0)
EVENT(PM_PMC2_SAVED, 0x10022)
EVENT(PM_CMPLU_STALL_DFU, 0x2003c)
EVENT(PM_VSU0_16FLOP, 0x0a0a4)
EVENT(PM_MRK_LSU_DERAT_MISS, 0x3d05a)
EVENT(PM_MRK_ST_CMPL, 0x10034)
EVENT(PM_NEST_PAIR3_ADD, 0x40881)
EVENT(PM_L2_ST_DISP, 0x46180)
EVENT(PM_L2_CASTOUT_MOD, 0x16180)
EVENT(PM_ISEG, 0x020a4)
EVENT(PM_MRK_INST_TIMEO, 0x40034)
EVENT(PM_L2_RCST_DISP_FAIL_ADDR, 0x36282)
EVENT(PM_LSU1_DC_PREF_STREAM_CONFIRM, 0x0d0b6)
EVENT(PM_IERAT_WR_64K, 0x040be)
EVENT(PM_MRK_DTLB_MISS_16M, 0x4d05e)
EVENT(PM_IERAT_MISS, 0x100f6)
EVENT(PM_MRK_PTEG_FROM_LMEM, 0x4d052)
EVENT(PM_FLOP, 0x100f4)
EVENT(PM_THRD_PRIO_4_5_CYC, 0x040b4)
EVENT(PM_BR_PRED_TA, 0x040aa)
EVENT(PM_CMPLU_STALL_FXU, 0x20014)
EVENT(PM_EXT_INT, 0x200f8)
EVENT(PM_VSU_FSQRT_FDIV, 0x0a888)
EVENT(PM_MRK_LD_MISS_EXPOSED_CYC, 0x1003e)
EVENT(PM_LSU1_LDF, 0x0c086)
EVENT(PM_IC_WRITE_ALL, 0x0488c)
EVENT(PM_LSU0_SRQ_STFWD, 0x0c0a0)
EVENT(PM_PTEG_FROM_RL2L3_MOD, 0x1c052)
EVENT(PM_MRK_DATA_FROM_L31_SHR, 0x1d04e)
EVENT(PM_DATA_FROM_L21_MOD, 0x3c046)
EVENT(PM_VSU1_SCAL_DOUBLE_ISSUED, 0x0b08a)
EVENT(PM_VSU0_8FLOP, 0x0a0a0)
EVENT(PM_POWER_EVENT1, 0x1006e)
EVENT(PM_DISP_CLB_HELD_BAL, 0x02092)
EVENT(PM_VSU1_2FLOP, 0x0a09a)
EVENT(PM_LWSYNC_HELD, 0x0209a)
EVENT(PM_PTEG_FROM_DL2L3_SHR, 0x3c054)
EVENT(PM_INST_FROM_L21_MOD, 0x34046)
EVENT(PM_IERAT_XLATE_WR_16MPLUS, 0x040bc)
EVENT(PM_IC_REQ_ALL, 0x04888)
EVENT(PM_DSLB_MISS, 0x0d090)
EVENT(PM_L3_MISS, 0x1f082)
EVENT(PM_LSU0_L1_PREF, 0x0d0b8)
EVENT(PM_VSU_SCALAR_SINGLE_ISSUED, 0x0b884)
EVENT(PM_LSU1_DC_PREF_STREAM_CONFIRM_STRIDE, 0x0d0be)
EVENT(PM_L2_INST, 0x36080)
EVENT(PM_VSU0_FRSP, 0x0a0b4)
EVENT(PM_FLUSH_DISP, 0x02082)
EVENT(PM_PTEG_FROM_L2MISS, 0x4c058)
EVENT(PM_VSU1_DQ_ISSUED, 0x0b09a)
EVENT(PM_CMPLU_STALL_LSU, 0x20012)
EVENT(PM_MRK_DATA_FROM_DMEM, 0x1d04a)
EVENT(PM_LSU_FLUSH_ULD, 0x0c8b0)
EVENT(PM_PTEG_FROM_LMEM, 0x4c052)
EVENT(PM_MRK_DERAT_MISS_16M, 0x3d05c)
EVENT(PM_THRD_ALL_RUN_CYC, 0x2000c)
EVENT(PM_MEM0_PREFETCH_DISP, 0x20083)
EVENT(PM_MRK_STALL_CMPLU_CYC_COUNT, 0x3003f)
EVENT(PM_DATA_FROM_DL2L3_MOD, 0x3c04c)
EVENT(PM_VSU_FRSP, 0x0a8b4)
EVENT(PM_MRK_DATA_FROM_L21_MOD, 0x3d046)
EVENT(PM_PMC1_OVERFLOW, 0x20010)
EVENT(PM_VSU0_SINGLE, 0x0a0a8)
EVENT(PM_MRK_PTEG_FROM_L3MISS, 0x2d058)
EVENT(PM_MRK_PTEG_FROM_L31_SHR, 0x2d056)
EVENT(PM_VSU0_VECTOR_SP_ISSUED, 0x0b090)
EVENT(PM_VSU1_FEST, 0x0a0ba)
EVENT(PM_MRK_INST_DISP, 0x20030)
EVENT(PM_VSU0_COMPLEX_ISSUED, 0x0b096)
EVENT(PM_LSU1_FLUSH_UST, 0x0c0b6)
EVENT(PM_INST_CMPL, 0x00002)
EVENT(PM_FXU_IDLE, 0x1000e)
EVENT(PM_LSU0_FLUSH_ULD, 0x0c0b0)
EVENT(PM_MRK_DATA_FROM_DL2L3_MOD, 0x3d04c)
EVENT(PM_LSU_LMQ_SRQ_EMPTY_ALL_CYC, 0x3001c)
EVENT(PM_LSU1_REJECT_LMQ_FULL, 0x0c0a6)
EVENT(PM_INST_PTEG_FROM_L21_MOD, 0x3e056)
EVENT(PM_INST_FROM_RL2L3_MOD, 0x14042)
EVENT(PM_SHL_CREATED, 0x05082)
EVENT(PM_L2_ST_HIT, 0x46182)
EVENT(PM_DATA_FROM_DMEM, 0x1c04a)
EVENT(PM_L3_LD_MISS, 0x2f082)
EVENT(PM_FXU1_BUSY_FXU0_IDLE, 0x4000e)
EVENT(PM_DISP_CLB_HELD_RES, 0x02094)
EVENT(PM_L2_SN_SX_I_DONE, 0x36382)
EVENT(PM_GRP_CMPL, 0x30004)
EVENT(PM_STCX_CMPL, 0x0c098)
EVENT(PM_VSU0_2FLOP, 0x0a098)
EVENT(PM_L3_PREF_MISS, 0x3f082)
EVENT(PM_LSU_SRQ_SYNC_CYC, 0x0d096)
EVENT(PM_LSU_REJECT_ERAT_MISS, 0x20064)
EVENT(PM_L1_ICACHE_MISS, 0x200fc)
EVENT(PM_LSU1_FLUSH_SRQ, 0x0c0be)
EVENT(PM_LD_REF_L1_LSU0, 0x0c080)
EVENT(PM_VSU0_FEST, 0x0a0b8)
EVENT(PM_VSU_VECTOR_SINGLE_ISSUED, 0x0b890)
EVENT(PM_FREQ_UP, 0x4000c)
EVENT(PM_DATA_FROM_LMEM, 0x3c04a)
EVENT(PM_LSU1_LDX, 0x0c08a)
EVENT(PM_PMC3_OVERFLOW, 0x40010)
EVENT(PM_MRK_BR_MPRED, 0x30036)
EVENT(PM_SHL_MATCH, 0x05086)
EVENT(PM_MRK_BR_TAKEN, 0x10036)
EVENT(PM_CMPLU_STALL_BRU, 0x4004e)
EVENT(PM_ISLB_MISS, 0x0d092)
EVENT(PM_CYC, 0x0001e)
EVENT(PM_DISP_HELD_THERMAL, 0x30006)
EVENT(PM_INST_PTEG_FROM_RL2L3_SHR, 0x2e054)
EVENT(PM_LSU1_SRQ_STFWD, 0x0c0a2)
EVENT(PM_GCT_NOSLOT_BR_MPRED, 0x4001a)
EVENT(PM_1PLUS_PPC_CMPL, 0x100f2)
EVENT(PM_PTEG_FROM_DMEM, 0x2c052)
EVENT(PM_VSU_2FLOP, 0x0a898)
EVENT(PM_GCT_FULL_CYC, 0x04086)
EVENT(PM_MRK_DATA_FROM_L3_CYC, 0x40020)
EVENT(PM_LSU_SRQ_S0_ALLOC, 0x0d09d)
EVENT(PM_MRK_DERAT_MISS_4K, 0x1d05c)
EVENT(PM_BR_MPRED_TA, 0x040ae)
EVENT(PM_INST_PTEG_FROM_L2MISS, 0x4e058)
EVENT(PM_DPU_HELD_POWER, 0x20006)
EVENT(PM_RUN_INST_CMPL, 0x400fa)
EVENT(PM_MRK_VSU_FIN, 0x30032)
EVENT(PM_LSU_SRQ_S0_VALID, 0x0d09c)
EVENT(PM_GCT_EMPTY_CYC, 0x20008)
EVENT(PM_IOPS_DISP, 0x30014)
EVENT(PM_RUN_SPURR, 0x10008)
EVENT(PM_PTEG_FROM_L21_MOD, 0x3c056)
EVENT(PM_VSU0_1FLOP, 0x0a080)
EVENT(PM_SNOOP_TLBIE, 0x0d0b2)
EVENT(PM_DATA_FROM_L3MISS, 0x2c048)
EVENT(PM_VSU_SINGLE, 0x0a8a8)
EVENT(PM_DTLB_MISS_16G, 0x1c05e)
EVENT(PM_CMPLU_STALL_VECTOR, 0x2001c)
EVENT(PM_FLUSH, 0x400f8)
EVENT(PM_L2_LD_HIT, 0x36182)
EVENT(PM_NEST_PAIR2_AND, 0x30883)
EVENT(PM_VSU1_1FLOP, 0x0a082)
EVENT(PM_IC_PREF_REQ, 0x0408a)
EVENT(PM_L3_LD_HIT, 0x2f080)
EVENT(PM_GCT_NOSLOT_IC_MISS, 0x2001a)
EVENT(PM_DISP_HELD, 0x10006)
EVENT(PM_L2_LD, 0x16080)
EVENT(PM_LSU_FLUSH_SRQ, 0x0c8bc)
EVENT(PM_BC_PLUS_8_CONV, 0x040b8)
EVENT(PM_MRK_DATA_FROM_L31_MOD_CYC, 0x40026)
EVENT(PM_CMPLU_STALL_VECTOR_LONG, 0x4004a)
EVENT(PM_L2_RCST_BUSY_RC_FULL, 0x26282)
EVENT(PM_TB_BIT_TRANS, 0x300f8)
EVENT(PM_THERMAL_MAX, 0x40006)
EVENT(PM_LSU1_FLUSH_ULD, 0x0c0b2)
EVENT(PM_LSU1_REJECT_LHS, 0x0c0ae)
EVENT(PM_LSU_LRQ_S0_ALLOC, 0x0d09f)
EVENT(PM_L3_CO_L31, 0x4f080)
EVENT(PM_POWER_EVENT4, 0x4006e)
EVENT(PM_DATA_FROM_L31_SHR, 0x1c04e)
EVENT(PM_BR_UNCOND, 0x0409e)
EVENT(PM_LSU1_DC_PREF_STREAM_ALLOC, 0x0d0aa)
EVENT(PM_PMC4_REWIND, 0x10020)
EVENT(PM_L2_RCLD_DISP, 0x16280)
EVENT(PM_THRD_PRIO_2_3_CYC, 0x040b2)
EVENT(PM_MRK_PTEG_FROM_L2MISS, 0x4d058)
EVENT(PM_IC_DEMAND_L2_BHT_REDIRECT, 0x04098)
EVENT(PM_LSU_DERAT_MISS, 0x200f6)
EVENT(PM_IC_PREF_CANCEL_L2, 0x04094)
EVENT(PM_MRK_FIN_STALL_CYC_COUNT, 0x1003d)
EVENT(PM_BR_PRED_CCACHE, 0x040a0)
EVENT(PM_GCT_UTIL_1_TO_2_SLOTS, 0x0209c)
EVENT(PM_MRK_ST_CMPL_INT, 0x30034)
EVENT(PM_LSU_TWO_TABLEWALK_CYC, 0x0d0a6)
EVENT(PM_MRK_DATA_FROM_L3MISS, 0x2d048)
EVENT(PM_GCT_NOSLOT_CYC, 0x100f8)
EVENT(PM_LSU_SET_MPRED, 0x0c0a8)
EVENT(PM_FLUSH_DISP_TLBIE, 0x0208a)
EVENT(PM_VSU1_FCONV, 0x0a0b2)
EVENT(PM_DERAT_MISS_16G, 0x4c05c)
EVENT(PM_INST_FROM_LMEM, 0x3404a)
EVENT(PM_IC_DEMAND_L2_BR_REDIRECT, 0x0409a)
EVENT(PM_CMPLU_STALL_SCALAR_LONG, 0x20018)
EVENT(PM_INST_PTEG_FROM_L2, 0x1e050)
EVENT(PM_PTEG_FROM_L2, 0x1c050)
EVENT(PM_MRK_DATA_FROM_L21_SHR_CYC, 0x20024)
EVENT(PM_MRK_DTLB_MISS_4K, 0x2d05a)
EVENT(PM_VSU0_FPSCR, 0x0b09c)
EVENT(PM_VSU1_VECT_DOUBLE_ISSUED, 0x0b082)
EVENT(PM_MRK_PTEG_FROM_RL2L3_MOD, 0x1d052)
EVENT(PM_MEM0_RQ_DISP, 0x10083)
EVENT(PM_L2_LD_MISS, 0x26080)
EVENT(PM_VMX_RESULT_SAT_1, 0x0b0a0)
EVENT(PM_L1_PREF, 0x0d8b8)
EVENT(PM_MRK_DATA_FROM_LMEM_CYC, 0x2002c)
EVENT(PM_GRP_IC_MISS_NONSPEC, 0x1000c)
EVENT(PM_PB_NODE_PUMP, 0x10081)
EVENT(PM_SHL_MERGED, 0x05084)
EVENT(PM_NEST_PAIR1_ADD, 0x20881)
EVENT(PM_DATA_FROM_L3, 0x1c048)
EVENT(PM_LSU_FLUSH, 0x0208e)
EVENT(PM_LSU_SRQ_SYNC_COUNT, 0x0d097)
EVENT(PM_PMC2_OVERFLOW, 0x30010)
EVENT(PM_LSU_LDF, 0x0c884)
EVENT(PM_POWER_EVENT3, 0x3006e)
EVENT(PM_DISP_WT, 0x30008)
EVENT(PM_CMPLU_STALL_REJECT, 0x40016)
EVENT(PM_IC_BANK_CONFLICT, 0x04082)
EVENT(PM_BR_MPRED_CR_TA, 0x048ae)
EVENT(PM_L2_INST_MISS, 0x36082)
EVENT(PM_CMPLU_STALL_ERAT_MISS, 0x40018)
EVENT(PM_NEST_PAIR2_ADD, 0x30881)
EVENT(PM_MRK_LSU_FLUSH, 0x0d08c)
EVENT(PM_L2_LDST, 0x16880)
EVENT(PM_INST_FROM_L31_SHR, 0x1404e)
EVENT(PM_VSU0_FIN, 0x0a0bc)
EVENT(PM_LARX_LSU, 0x0c894)
EVENT(PM_INST_FROM_RMEM, 0x34042)
EVENT(PM_DISP_CLB_HELD_TLBIE, 0x02096)
EVENT(PM_MRK_DATA_FROM_DMEM_CYC, 0x2002e)
EVENT(PM_BR_PRED_CR, 0x040a8)
EVENT(PM_LSU_REJECT, 0x10064)
EVENT(PM_GCT_UTIL_3_TO_6_SLOTS, 0x0209e)
EVENT(PM_CMPLU_STALL_END_GCT_NOSLOT, 0x10028)
EVENT(PM_LSU0_REJECT_LMQ_FULL, 0x0c0a4)
EVENT(PM_VSU_FEST, 0x0a8b8)
EVENT(PM_NEST_PAIR0_AND, 0x10883)
EVENT(PM_PTEG_FROM_L3, 0x2c050)
EVENT(PM_POWER_EVENT2, 0x2006e)
EVENT(PM_IC_PREF_CANCEL_PAGE, 0x04090)
EVENT(PM_VSU0_FSQRT_FDIV, 0x0a088)
EVENT(PM_MRK_GRP_CMPL, 0x40030)
EVENT(PM_VSU0_SCAL_DOUBLE_ISSUED, 0x0b088)
EVENT(PM_GRP_DISP, 0x3000a)
EVENT(PM_LSU0_LDX, 0x0c088)
EVENT(PM_DATA_FROM_L2, 0x1c040)
EVENT(PM_MRK_DATA_FROM_RL2L3_MOD, 0x1d042)
EVENT(PM_LD_REF_L1, 0x0c880)
EVENT(PM_VSU0_VECT_DOUBLE_ISSUED, 0x0b080)
EVENT(PM_VSU1_2FLOP_DOUBLE, 0x0a08e)
EVENT(PM_THRD_PRIO_6_7_CYC, 0x040b6)
EVENT(PM_BC_PLUS_8_RSLV_TAKEN, 0x040ba)
EVENT(PM_BR_MPRED_CR, 0x040ac)
EVENT(PM_L3_CO_MEM, 0x4f082)
EVENT(PM_LD_MISS_L1, 0x400f0)
EVENT(PM_DATA_FROM_RL2L3_MOD, 0x1c042)
EVENT(PM_LSU_SRQ_FULL_CYC, 0x1001a)
EVENT(PM_TABLEWALK_CYC, 0x10026)
EVENT(PM_MRK_PTEG_FROM_RMEM, 0x3d052)
EVENT(PM_LSU_SRQ_STFWD, 0x0c8a0)
EVENT(PM_INST_PTEG_FROM_RMEM, 0x3e052)
EVENT(PM_FXU0_FIN, 0x10004)
EVENT(PM_LSU1_L1_SW_PREF, 0x0c09e)
EVENT(PM_PTEG_FROM_L31_MOD, 0x1c054)
EVENT(PM_PMC5_OVERFLOW, 0x10024)
EVENT(PM_LD_REF_L1_LSU1, 0x0c082)
EVENT(PM_INST_PTEG_FROM_L21_SHR, 0x4e056)
EVENT(PM_CMPLU_STALL_THRD, 0x1001c)
EVENT(PM_DATA_FROM_RMEM, 0x3c042)
EVENT(PM_VSU0_SCAL_SINGLE_ISSUED, 0x0b084)
EVENT(PM_BR_MPRED_LSTACK, 0x040a6)
EVENT(PM_MRK_DATA_FROM_RL2L3_MOD_CYC, 0x40028)
EVENT(PM_LSU0_FLUSH_UST, 0x0c0b4)
EVENT(PM_LSU_NCST, 0x0c090)
EVENT(PM_BR_TAKEN, 0x20004)
EVENT(PM_INST_PTEG_FROM_LMEM, 0x4e052)
EVENT(PM_GCT_NOSLOT_BR_MPRED_IC_MISS, 0x4001c)
EVENT(PM_DTLB_MISS_4K, 0x2c05a)
EVENT(PM_PMC4_SAVED, 0x30022)
EVENT(PM_VSU1_PERMUTE_ISSUED, 0x0b092)
EVENT(PM_SLB_MISS, 0x0d890)
EVENT(PM_LSU1_FLUSH_LRQ, 0x0c0ba)
EVENT(PM_DTLB_MISS, 0x300fc)
EVENT(PM_VSU1_FRSP, 0x0a0b6)
EVENT(PM_VSU_VECTOR_DOUBLE_ISSUED, 0x0b880)
EVENT(PM_L2_CASTOUT_SHR, 0x16182)
EVENT(PM_DATA_FROM_DL2L3_SHR, 0x3c044)
EVENT(PM_VSU1_STF, 0x0b08e)
EVENT(PM_ST_FIN, 0x200f0)
EVENT(PM_PTEG_FROM_L21_SHR, 0x4c056)
EVENT(PM_L2_LOC_GUESS_WRONG, 0x26480)
EVENT(PM_MRK_STCX_FAIL, 0x0d08e)
EVENT(PM_LSU0_REJECT_LHS, 0x0c0ac)
EVENT(PM_IC_PREF_CANCEL_HIT, 0x04092)
EVENT(PM_L3_PREF_BUSY, 0x4f080)
EVENT(PM_MRK_BRU_FIN, 0x2003a)
EVENT(PM_LSU1_NCLD, 0x0c08e)
EVENT(PM_INST_PTEG_FROM_L31_MOD, 0x1e054)
EVENT(PM_LSU_NCLD, 0x0c88c)
EVENT(PM_LSU_LDX, 0x0c888)
EVENT(PM_L2_LOC_GUESS_CORRECT, 0x16480)
EVENT(PM_THRESH_TIMEO, 0x10038)
EVENT(PM_L3_PREF_ST, 0x0d0ae)
EVENT(PM_DISP_CLB_HELD_SYNC, 0x02098)
EVENT(PM_VSU_SIMPLE_ISSUED, 0x0b894)
EVENT(PM_VSU1_SINGLE, 0x0a0aa)
EVENT(PM_DATA_TABLEWALK_CYC, 0x3001a)
EVENT(PM_L2_RC_ST_DONE, 0x36380)
EVENT(PM_MRK_PTEG_FROM_L21_MOD, 0x3d056)
EVENT(PM_LARX_LSU1, 0x0c096)
EVENT(PM_MRK_DATA_FROM_RMEM, 0x3d042)
EVENT(PM_DISP_CLB_HELD, 0x02090)
EVENT(PM_DERAT_MISS_4K, 0x1c05c)
EVENT(PM_L2_RCLD_DISP_FAIL_ADDR, 0x16282)
EVENT(PM_SEG_EXCEPTION, 0x028a4)
EVENT(PM_FLUSH_DISP_SB, 0x0208c)
EVENT(PM_L2_DC_INV, 0x26182)
EVENT(PM_PTEG_FROM_DL2L3_MOD, 0x4c054)
EVENT(PM_DSEG, 0x020a6)
EVENT(PM_BR_PRED_LSTACK, 0x040a2)
EVENT(PM_VSU0_STF, 0x0b08c)
EVENT(PM_LSU_FX_FIN, 0x10066)
EVENT(PM_DERAT_MISS_16M, 0x3c05c)
EVENT(PM_MRK_PTEG_FROM_DL2L3_MOD, 0x4d054)
EVENT(PM_GCT_UTIL_11_PLUS_SLOTS, 0x020a2)
EVENT(PM_INST_FROM_L3, 0x14048)
EVENT(PM_MRK_IFU_FIN, 0x3003a)
EVENT(PM_ITLB_MISS, 0x400fc)
EVENT(PM_VSU_STF, 0x0b88c)
EVENT(PM_LSU_FLUSH_UST, 0x0c8b4)
EVENT(PM_L2_LDST_MISS, 0x26880)
EVENT(PM_FXU1_FIN, 0x40004)
EVENT(PM_SHL_DEALLOCATED, 0x05080)
EVENT(PM_L2_SN_M_WR_DONE, 0x46382)
EVENT(PM_LSU_REJECT_SET_MPRED, 0x0c8a8)
EVENT(PM_L3_PREF_LD, 0x0d0ac)
EVENT(PM_L2_SN_M_RD_DONE, 0x46380)
EVENT(PM_MRK_DERAT_MISS_16G, 0x4d05c)
EVENT(PM_VSU_FCONV, 0x0a8b0)
EVENT(PM_ANY_THRD_RUN_CYC, 0x100fa)
EVENT(PM_LSU_LMQ_FULL_CYC, 0x0d0a4)
EVENT(PM_MRK_LSU_REJECT_LHS, 0x0d082)
EVENT(PM_MRK_LD_MISS_L1_CYC, 0x4003e)
EVENT(PM_MRK_DATA_FROM_L2_CYC, 0x20020)
EVENT(PM_INST_IMC_MATCH_DISP, 0x30016)
EVENT(PM_MRK_DATA_FROM_RMEM_CYC, 0x4002c)
EVENT(PM_VSU0_SIMPLE_ISSUED, 0x0b094)
EVENT(PM_CMPLU_STALL_DIV, 0x40014)
EVENT(PM_MRK_PTEG_FROM_RL2L3_SHR, 0x2d054)
EVENT(PM_VSU_FMA_DOUBLE, 0x0a890)
EVENT(PM_VSU_4FLOP, 0x0a89c)
EVENT(PM_VSU1_FIN, 0x0a0be)
EVENT(PM_NEST_PAIR1_AND, 0x20883)
EVENT(PM_INST_PTEG_FROM_RL2L3_MOD, 0x1e052)
EVENT(PM_RUN_CYC, 0x200f4)
EVENT(PM_PTEG_FROM_RMEM, 0x3c052)
EVENT(PM_LSU_LRQ_S0_VALID, 0x0d09e)
EVENT(PM_LSU0_LDF, 0x0c084)
EVENT(PM_FLUSH_COMPLETION, 0x30012)
EVENT(PM_ST_MISS_L1, 0x300f0)
EVENT(PM_L2_NODE_PUMP, 0x36480)
EVENT(PM_INST_FROM_DL2L3_SHR, 0x34044)
EVENT(PM_MRK_STALL_CMPLU_CYC, 0x3003e)
EVENT(PM_VSU1_DENORM, 0x0a0ae)
EVENT(PM_MRK_DATA_FROM_L31_SHR_CYC, 0x20026)
EVENT(PM_NEST_PAIR0_ADD, 0x10881)
EVENT(PM_INST_FROM_L3MISS, 0x24048)
EVENT(PM_EE_OFF_EXT_INT, 0x02080)
EVENT(PM_INST_PTEG_FROM_DMEM, 0x2e052)
EVENT(PM_INST_FROM_DL2L3_MOD, 0x3404c)
EVENT(PM_PMC6_OVERFLOW, 0x30024)
EVENT(PM_VSU_2FLOP_DOUBLE, 0x0a88c)
EVENT(PM_TLB_MISS, 0x20066)
EVENT(PM_FXU_BUSY, 0x2000e)
EVENT(PM_L2_RCLD_DISP_FAIL_OTHER, 0x26280)
EVENT(PM_LSU_REJECT_LMQ_FULL, 0x0c8a4)
EVENT(PM_IC_RELOAD_SHR, 0x04096)
EVENT(PM_GRP_MRK, 0x10031)
EVENT(PM_MRK_ST_NEST, 0x20034)
EVENT(PM_VSU1_FSQRT_FDIV, 0x0a08a)
EVENT(PM_LSU0_FLUSH_LRQ, 0x0c0b8)
EVENT(PM_LARX_LSU0, 0x0c094)
EVENT(PM_IBUF_FULL_CYC, 0x04084)
EVENT(PM_MRK_DATA_FROM_DL2L3_SHR_CYC, 0x2002a)
EVENT(PM_LSU_DC_PREF_STREAM_ALLOC, 0x0d8a8)
EVENT(PM_GRP_MRK_CYC, 0x10030)
EVENT(PM_MRK_DATA_FROM_RL2L3_SHR_CYC, 0x20028)
EVENT(PM_L2_GLOB_GUESS_CORRECT, 0x16482)
EVENT(PM_LSU_REJECT_LHS, 0x0c8ac)
EVENT(PM_MRK_DATA_FROM_LMEM, 0x3d04a)
EVENT(PM_INST_PTEG_FROM_L3, 0x2e050)
EVENT(PM_FREQ_DOWN, 0x3000c)
EVENT(PM_PB_RETRY_NODE_PUMP, 0x30081)
EVENT(PM_INST_FROM_RL2L3_SHR, 0x1404c)
EVENT(PM_MRK_INST_ISSUED, 0x10032)
EVENT(PM_PTEG_FROM_L3MISS, 0x2c058)
EVENT(PM_RUN_PURR, 0x400f4)
EVENT(PM_MRK_GRP_IC_MISS, 0x40038)
EVENT(PM_MRK_DATA_FROM_L3, 0x1d048)
EVENT(PM_CMPLU_STALL_DCACHE_MISS, 0x20016)
EVENT(PM_PTEG_FROM_RL2L3_SHR, 0x2c054)
EVENT(PM_LSU_FLUSH_LRQ, 0x0c8b8)
EVENT(PM_MRK_DERAT_MISS_64K, 0x2d05c)
EVENT(PM_INST_PTEG_FROM_DL2L3_MOD, 0x4e054)
EVENT(PM_L2_ST_MISS, 0x26082)
EVENT(PM_MRK_PTEG_FROM_L21_SHR, 0x4d056)
EVENT(PM_LWSYNC, 0x0d094)
EVENT(PM_LSU0_DC_PREF_STREAM_CONFIRM_STRIDE, 0x0d0bc)
EVENT(PM_MRK_LSU_FLUSH_LRQ, 0x0d088)
EVENT(PM_INST_IMC_MATCH_CMPL, 0x100f0)
EVENT(PM_NEST_PAIR3_AND, 0x40883)
EVENT(PM_PB_RETRY_SYS_PUMP, 0x40081)
EVENT(PM_MRK_INST_FIN, 0x30030)
EVENT(PM_MRK_PTEG_FROM_DL2L3_SHR, 0x3d054)
EVENT(PM_INST_FROM_L31_MOD, 0x14044)
EVENT(PM_MRK_DTLB_MISS_64K, 0x3d05e)
EVENT(PM_LSU_FIN, 0x30066)
EVENT(PM_MRK_LSU_REJECT, 0x40064)
EVENT(PM_L2_CO_FAIL_BUSY, 0x16382)
EVENT(PM_MEM0_WQ_DISP, 0x40083)
EVENT(PM_DATA_FROM_L31_MOD, 0x1c044)
EVENT(PM_THERMAL_WARN, 0x10016)
EVENT(PM_VSU0_4FLOP, 0x0a09c)
EVENT(PM_BR_MPRED_CCACHE, 0x040a4)
EVENT(PM_CMPLU_STALL_IFU, 0x4004c)
EVENT(PM_L1_DEMAND_WRITE, 0x0408c)
EVENT(PM_FLUSH_BR_MPRED, 0x02084)
EVENT(PM_MRK_DTLB_MISS_16G, 0x1d05e)
EVENT(PM_MRK_PTEG_FROM_DMEM, 0x2d052)
EVENT(PM_L2_RCST_DISP, 0x36280)
EVENT(PM_CMPLU_STALL, 0x4000a)
EVENT(PM_LSU_PARTIAL_CDF, 0x0c0aa)
EVENT(PM_DISP_CLB_HELD_SB, 0x020a8)
EVENT(PM_VSU0_FMA_DOUBLE, 0x0a090)
EVENT(PM_FXU0_BUSY_FXU1_IDLE, 0x3000e)
EVENT(PM_IC_DEMAND_CYC, 0x10018)
EVENT(PM_MRK_DATA_FROM_L21_SHR, 0x3d04e)
EVENT(PM_MRK_LSU_FLUSH_UST, 0x0d086)
EVENT(PM_INST_PTEG_FROM_L3MISS, 0x2e058)
EVENT(PM_VSU_DENORM, 0x0a8ac)
EVENT(PM_MRK_LSU_PARTIAL_CDF, 0x0d080)
EVENT(PM_INST_FROM_L21_SHR, 0x3404e)
EVENT(PM_IC_PREF_WRITE, 0x0408e)
EVENT(PM_BR_PRED, 0x0409c)
EVENT(PM_INST_FROM_DMEM, 0x1404a)
EVENT(PM_IC_PREF_CANCEL_ALL, 0x04890)
EVENT(PM_LSU_DC_PREF_STREAM_CONFIRM, 0x0d8b4)
EVENT(PM_MRK_LSU_FLUSH_SRQ, 0x0d08a)
EVENT(PM_MRK_FIN_STALL_CYC, 0x1003c)
EVENT(PM_L2_RCST_DISP_FAIL_OTHER, 0x46280)
EVENT(PM_VSU1_DD_ISSUED, 0x0b098)
EVENT(PM_PTEG_FROM_L31_SHR, 0x2c056)
EVENT(PM_DATA_FROM_L21_SHR, 0x3c04e)
EVENT(PM_LSU0_NCLD, 0x0c08c)
EVENT(PM_VSU1_4FLOP, 0x0a09e)
EVENT(PM_VSU1_8FLOP, 0x0a0a2)
EVENT(PM_VSU_8FLOP, 0x0a8a0)
EVENT(PM_LSU_LMQ_SRQ_EMPTY_CYC, 0x2003e)
EVENT(PM_DTLB_MISS_64K, 0x3c05e)
EVENT(PM_THRD_CONC_RUN_INST, 0x300f4)
EVENT(PM_MRK_PTEG_FROM_L2, 0x1d050)
EVENT(PM_PB_SYS_PUMP, 0x20081)
EVENT(PM_VSU_FIN, 0x0a8bc)
EVENT(PM_MRK_DATA_FROM_L31_MOD, 0x1d044)
EVENT(PM_THRD_PRIO_0_1_CYC, 0x040b0)
EVENT(PM_DERAT_MISS_64K, 0x2c05c)
EVENT(PM_PMC2_REWIND, 0x30020)
EVENT(PM_INST_FROM_L2, 0x14040)
EVENT(PM_GRP_BR_MPRED_NONSPEC, 0x1000a)
EVENT(PM_INST_DISP, 0x200f2)
EVENT(PM_MEM0_RD_CANCEL_TOTAL, 0x30083)
EVENT(PM_LSU0_DC_PREF_STREAM_CONFIRM, 0x0d0b4)
EVENT(PM_L1_DCACHE_RELOAD_VALID, 0x300f6)
EVENT(PM_VSU_SCALAR_DOUBLE_ISSUED, 0x0b888)
EVENT(PM_L3_PREF_HIT, 0x3f080)
EVENT(PM_MRK_PTEG_FROM_L31_MOD, 0x1d054)
EVENT(PM_CMPLU_STALL_STORE, 0x2004a)
EVENT(PM_MRK_FXU_FIN, 0x20038)
EVENT(PM_PMC4_OVERFLOW, 0x10010)
EVENT(PM_MRK_PTEG_FROM_L3, 0x2d050)
EVENT(PM_LSU0_LMQ_LHR_MERGE, 0x0d098)
EVENT(PM_BTAC_HIT, 0x0508a)
EVENT(PM_L3_RD_BUSY, 0x4f082)
EVENT(PM_LSU0_L1_SW_PREF, 0x0c09c)
EVENT(PM_INST_FROM_L2MISS, 0x44048)
EVENT(PM_LSU0_DC_PREF_STREAM_ALLOC, 0x0d0a8)
EVENT(PM_L2_ST, 0x16082)
EVENT(PM_VSU0_DENORM, 0x0a0ac)
EVENT(PM_MRK_DATA_FROM_DL2L3_SHR, 0x3d044)
EVENT(PM_BR_PRED_CR_TA, 0x048aa)
EVENT(PM_VSU0_FCONV, 0x0a0b0)
EVENT(PM_MRK_LSU_FLUSH_ULD, 0x0d084)
EVENT(PM_BTAC_MISS, 0x05088)
EVENT(PM_MRK_LD_MISS_EXPOSED_CYC_COUNT, 0x1003f)
EVENT(PM_MRK_DATA_FROM_L2, 0x1d040)
EVENT(PM_LSU_DCACHE_RELOAD_VALID, 0x0d0a2)
EVENT(PM_VSU_FMA, 0x0a884)
EVENT(PM_LSU0_FLUSH_SRQ, 0x0c0bc)
EVENT(PM_LSU1_L1_PREF, 0x0d0ba)
EVENT(PM_IOPS_CMPL, 0x10014)
EVENT(PM_L2_SYS_PUMP, 0x36482)
EVENT(PM_L2_RCLD_BUSY_RC_FULL, 0x46282)
EVENT(PM_LSU_LMQ_S0_ALLOC, 0x0d0a1)
EVENT(PM_FLUSH_DISP_SYNC, 0x02088)
EVENT(PM_MRK_DATA_FROM_DL2L3_MOD_CYC, 0x4002a)
EVENT(PM_L2_IC_INV, 0x26180)
EVENT(PM_MRK_DATA_FROM_L21_MOD_CYC, 0x40024)
EVENT(PM_L3_PREF_LDST, 0x0d8ac)
EVENT(PM_LSU_SRQ_EMPTY_CYC, 0x40008)
EVENT(PM_LSU_LMQ_S0_VALID, 0x0d0a0)
EVENT(PM_FLUSH_PARTIAL, 0x02086)
EVENT(PM_VSU1_FMA_DOUBLE, 0x0a092)
EVENT(PM_1PLUS_PPC_DISP, 0x400f2)
EVENT(PM_DATA_FROM_L2MISS, 0x200fe)
EVENT(PM_SUSPENDED, 0x00000)
EVENT(PM_VSU0_FMA, 0x0a084)
EVENT(PM_CMPLU_STALL_SCALAR, 0x40012)
EVENT(PM_STCX_FAIL, 0x0c09a)
EVENT(PM_VSU0_FSQRT_FDIV_DOUBLE, 0x0a094)
EVENT(PM_DC_PREF_DST, 0x0d0b0)
EVENT(PM_VSU1_SCAL_SINGLE_ISSUED, 0x0b086)
EVENT(PM_L3_HIT, 0x1f080)
EVENT(PM_L2_GLOB_GUESS_WRONG, 0x26482)
EVENT(PM_MRK_DFU_FIN, 0x20032)
EVENT(PM_INST_FROM_L1, 0x04080)
EVENT(PM_BRU_FIN, 0x10068)
EVENT(PM_IC_DEMAND_REQ, 0x04088)
EVENT(PM_VSU1_FSQRT_FDIV_DOUBLE, 0x0a096)
EVENT(PM_VSU1_FMA, 0x0a086)
EVENT(PM_MRK_LD_MISS_L1, 0x20036)
EVENT(PM_VSU0_2FLOP_DOUBLE, 0x0a08c)
EVENT(PM_LSU_DC_PREF_STRIDED_STREAM_CONFIRM, 0x0d8bc)
EVENT(PM_INST_PTEG_FROM_L31_SHR, 0x2e056)
EVENT(PM_MRK_LSU_REJECT_ERAT_MISS, 0x30064)
EVENT(PM_MRK_DATA_FROM_L2MISS, 0x4d048)
EVENT(PM_DATA_FROM_RL2L3_SHR, 0x1c04c)
EVENT(PM_INST_FROM_PREF, 0x14046)
EVENT(PM_VSU1_SQ, 0x0b09e)
EVENT(PM_L2_LD_DISP, 0x36180)
EVENT(PM_L2_DISP_ALL, 0x46080)
EVENT(PM_THRD_GRP_CMPL_BOTH_CYC, 0x10012)
EVENT(PM_VSU_FSQRT_FDIV_DOUBLE, 0x0a894)
EVENT(PM_BR_MPRED, 0x400f6)
EVENT(PM_INST_PTEG_FROM_DL2L3_SHR, 0x3e054)
EVENT(PM_VSU_1FLOP, 0x0a880)
EVENT(PM_HV_CYC, 0x2000a)
EVENT(PM_MRK_LSU_FIN, 0x40032)
EVENT(PM_MRK_DATA_FROM_RL2L3_SHR, 0x1d04c)
EVENT(PM_DTLB_MISS_16M, 0x4c05e)
EVENT(PM_LSU1_LMQ_LHR_MERGE, 0x0d09a)
EVENT(PM_IFU_FIN, 0x40066)

@@ -53,37 +53,13 @@
/*
* Power7 event codes.
*/
#define PME_PM_CYC 0x1e
#define PME_PM_GCT_NOSLOT_CYC 0x100f8
#define PME_PM_CMPLU_STALL 0x4000a
#define PME_PM_INST_CMPL 0x2
#define PME_PM_LD_REF_L1 0xc880
#define PME_PM_LD_MISS_L1 0x400f0
#define PME_PM_BRU_FIN 0x10068
#define PME_PM_BR_MPRED 0x400f6
#define EVENT(_name, _code) \
PME_##_name = _code,
#define PME_PM_CMPLU_STALL_FXU 0x20014
#define PME_PM_CMPLU_STALL_DIV 0x40014
#define PME_PM_CMPLU_STALL_SCALAR 0x40012
#define PME_PM_CMPLU_STALL_SCALAR_LONG 0x20018
#define PME_PM_CMPLU_STALL_VECTOR 0x2001c
#define PME_PM_CMPLU_STALL_VECTOR_LONG 0x4004a
#define PME_PM_CMPLU_STALL_LSU 0x20012
#define PME_PM_CMPLU_STALL_REJECT 0x40016
#define PME_PM_CMPLU_STALL_ERAT_MISS 0x40018
#define PME_PM_CMPLU_STALL_DCACHE_MISS 0x20016
#define PME_PM_CMPLU_STALL_STORE 0x2004a
#define PME_PM_CMPLU_STALL_THRD 0x1001c
#define PME_PM_CMPLU_STALL_IFU 0x4004c
#define PME_PM_CMPLU_STALL_BRU 0x4004e
#define PME_PM_GCT_NOSLOT_IC_MISS 0x2001a
#define PME_PM_GCT_NOSLOT_BR_MPRED 0x4001a
#define PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS 0x4001c
#define PME_PM_GRP_CMPL 0x30004
#define PME_PM_1PLUS_PPC_CMPL 0x100f2
#define PME_PM_CMPLU_STALL_DFU 0x2003c
#define PME_PM_RUN_CYC 0x200f4
#define PME_PM_RUN_INST_CMPL 0x400fa
enum {
#include "power7-events-list.h"
};
#undef EVENT
/*
* Layout of constraint bits:
@@ -398,96 +374,36 @@ static int power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
};
GENERIC_EVENT_ATTR(cpu-cycles, CYC);
GENERIC_EVENT_ATTR(stalled-cycles-frontend, GCT_NOSLOT_CYC);
GENERIC_EVENT_ATTR(stalled-cycles-backend, CMPLU_STALL);
GENERIC_EVENT_ATTR(instructions, INST_CMPL);
GENERIC_EVENT_ATTR(cache-references, LD_REF_L1);
GENERIC_EVENT_ATTR(cache-misses, LD_MISS_L1);
GENERIC_EVENT_ATTR(branch-instructions, BRU_FIN);
GENERIC_EVENT_ATTR(branch-misses, BR_MPRED);
GENERIC_EVENT_ATTR(cpu-cycles, PM_CYC);
GENERIC_EVENT_ATTR(stalled-cycles-frontend, PM_GCT_NOSLOT_CYC);
GENERIC_EVENT_ATTR(stalled-cycles-backend, PM_CMPLU_STALL);
GENERIC_EVENT_ATTR(instructions, PM_INST_CMPL);
GENERIC_EVENT_ATTR(cache-references, PM_LD_REF_L1);
GENERIC_EVENT_ATTR(cache-misses, PM_LD_MISS_L1);
GENERIC_EVENT_ATTR(branch-instructions, PM_BRU_FIN);
GENERIC_EVENT_ATTR(branch-misses, PM_BR_MPRED);
POWER_EVENT_ATTR(CYC, CYC);
POWER_EVENT_ATTR(GCT_NOSLOT_CYC, GCT_NOSLOT_CYC);
POWER_EVENT_ATTR(CMPLU_STALL, CMPLU_STALL);
POWER_EVENT_ATTR(INST_CMPL, INST_CMPL);
POWER_EVENT_ATTR(LD_REF_L1, LD_REF_L1);
POWER_EVENT_ATTR(LD_MISS_L1, LD_MISS_L1);
POWER_EVENT_ATTR(BRU_FIN, BRU_FIN)
POWER_EVENT_ATTR(BR_MPRED, BR_MPRED);
#define EVENT(_name, _code) POWER_EVENT_ATTR(_name, _name);
#include "power7-events-list.h"
#undef EVENT
POWER_EVENT_ATTR(CMPLU_STALL_FXU, CMPLU_STALL_FXU);
POWER_EVENT_ATTR(CMPLU_STALL_DIV, CMPLU_STALL_DIV);
POWER_EVENT_ATTR(CMPLU_STALL_SCALAR, CMPLU_STALL_SCALAR);
POWER_EVENT_ATTR(CMPLU_STALL_SCALAR_LONG, CMPLU_STALL_SCALAR_LONG);
POWER_EVENT_ATTR(CMPLU_STALL_VECTOR, CMPLU_STALL_VECTOR);
POWER_EVENT_ATTR(CMPLU_STALL_VECTOR_LONG, CMPLU_STALL_VECTOR_LONG);
POWER_EVENT_ATTR(CMPLU_STALL_LSU, CMPLU_STALL_LSU);
POWER_EVENT_ATTR(CMPLU_STALL_REJECT, CMPLU_STALL_REJECT);
POWER_EVENT_ATTR(CMPLU_STALL_ERAT_MISS, CMPLU_STALL_ERAT_MISS);
POWER_EVENT_ATTR(CMPLU_STALL_DCACHE_MISS, CMPLU_STALL_DCACHE_MISS);
POWER_EVENT_ATTR(CMPLU_STALL_STORE, CMPLU_STALL_STORE);
POWER_EVENT_ATTR(CMPLU_STALL_THRD, CMPLU_STALL_THRD);
POWER_EVENT_ATTR(CMPLU_STALL_IFU, CMPLU_STALL_IFU);
POWER_EVENT_ATTR(CMPLU_STALL_BRU, CMPLU_STALL_BRU);
POWER_EVENT_ATTR(GCT_NOSLOT_IC_MISS, GCT_NOSLOT_IC_MISS);
POWER_EVENT_ATTR(GCT_NOSLOT_BR_MPRED, GCT_NOSLOT_BR_MPRED);
POWER_EVENT_ATTR(GCT_NOSLOT_BR_MPRED_IC_MISS, GCT_NOSLOT_BR_MPRED_IC_MISS);
POWER_EVENT_ATTR(GRP_CMPL, GRP_CMPL);
POWER_EVENT_ATTR(1PLUS_PPC_CMPL, 1PLUS_PPC_CMPL);
POWER_EVENT_ATTR(CMPLU_STALL_DFU, CMPLU_STALL_DFU);
POWER_EVENT_ATTR(RUN_CYC, RUN_CYC);
POWER_EVENT_ATTR(RUN_INST_CMPL, RUN_INST_CMPL);
#define EVENT(_name, _code) POWER_EVENT_PTR(_name),
static struct attribute *power7_events_attr[] = {
GENERIC_EVENT_PTR(CYC),
GENERIC_EVENT_PTR(GCT_NOSLOT_CYC),
GENERIC_EVENT_PTR(CMPLU_STALL),
GENERIC_EVENT_PTR(INST_CMPL),
GENERIC_EVENT_PTR(LD_REF_L1),
GENERIC_EVENT_PTR(LD_MISS_L1),
GENERIC_EVENT_PTR(BRU_FIN),
GENERIC_EVENT_PTR(BR_MPRED),
GENERIC_EVENT_PTR(PM_CYC),
GENERIC_EVENT_PTR(PM_GCT_NOSLOT_CYC),
GENERIC_EVENT_PTR(PM_CMPLU_STALL),
GENERIC_EVENT_PTR(PM_INST_CMPL),
GENERIC_EVENT_PTR(PM_LD_REF_L1),
GENERIC_EVENT_PTR(PM_LD_MISS_L1),
GENERIC_EVENT_PTR(PM_BRU_FIN),
GENERIC_EVENT_PTR(PM_BR_MPRED),
POWER_EVENT_PTR(CYC),
POWER_EVENT_PTR(GCT_NOSLOT_CYC),
POWER_EVENT_PTR(CMPLU_STALL),
POWER_EVENT_PTR(INST_CMPL),
POWER_EVENT_PTR(LD_REF_L1),
POWER_EVENT_PTR(LD_MISS_L1),
POWER_EVENT_PTR(BRU_FIN),
POWER_EVENT_PTR(BR_MPRED),
POWER_EVENT_PTR(CMPLU_STALL_FXU),
POWER_EVENT_PTR(CMPLU_STALL_DIV),
POWER_EVENT_PTR(CMPLU_STALL_SCALAR),
POWER_EVENT_PTR(CMPLU_STALL_SCALAR_LONG),
POWER_EVENT_PTR(CMPLU_STALL_VECTOR),
POWER_EVENT_PTR(CMPLU_STALL_VECTOR_LONG),
POWER_EVENT_PTR(CMPLU_STALL_LSU),
POWER_EVENT_PTR(CMPLU_STALL_REJECT),
POWER_EVENT_PTR(CMPLU_STALL_ERAT_MISS),
POWER_EVENT_PTR(CMPLU_STALL_DCACHE_MISS),
POWER_EVENT_PTR(CMPLU_STALL_STORE),
POWER_EVENT_PTR(CMPLU_STALL_THRD),
POWER_EVENT_PTR(CMPLU_STALL_IFU),
POWER_EVENT_PTR(CMPLU_STALL_BRU),
POWER_EVENT_PTR(GCT_NOSLOT_IC_MISS),
POWER_EVENT_PTR(GCT_NOSLOT_BR_MPRED),
POWER_EVENT_PTR(GCT_NOSLOT_BR_MPRED_IC_MISS),
POWER_EVENT_PTR(GRP_CMPL),
POWER_EVENT_PTR(1PLUS_PPC_CMPL),
POWER_EVENT_PTR(CMPLU_STALL_DFU),
POWER_EVENT_PTR(RUN_CYC),
POWER_EVENT_PTR(RUN_INST_CMPL),
#include "power7-events-list.h"
#undef EVENT
NULL
};
static struct attribute_group power7_pmu_events_group = {
.name = "events",
.attrs = power7_events_attr,
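
The power7-pmu.c conversion above is the classic C "X-macro" pattern: the
event list is written once (in power7-events-list.h) and expanded several
times under different EVENT() definitions, here for the PME_* enum, the
sysfs attributes and the attribute pointer array. A minimal standalone
illustration of the idiom, with made-up event names and the list inlined
as a macro instead of a separate header:

        #include <stdio.h>

        /* Single source of truth for the event list. */
        #define EVENT_LIST \
                EVENT(MY_CYCLES, 0x1e) \
                EVENT(MY_INSTRUCTIONS, 0x02)

        /* Expansion 1: an enum of event codes. */
        #define EVENT(name, code) name = code,
        enum { EVENT_LIST };
        #undef EVENT

        /* Expansion 2: a name/code table built from the same list. */
        #define EVENT(name, code) { #name, code },
        static const struct { const char *name; int code; } event_table[] = {
                EVENT_LIST
        };
        #undef EVENT

        int main(void)
        {
                unsigned int i;

                for (i = 0; i < sizeof(event_table) / sizeof(event_table[0]); i++)
                        printf("%-16s 0x%x\n", event_table[i].name,
                               event_table[i].code);
                return 0;
        }

Adding an event now means touching one line in the list, so the enum and
the sysfs attribute table can no longer drift apart.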

@@ -82,7 +82,6 @@ config X86
select HAVE_USER_RETURN_NOTIFIER
select ARCH_BINFMT_ELF_RANDOMIZE_PIE
select HAVE_ARCH_JUMP_LABEL
select HAVE_TEXT_POKE_SMP
select HAVE_GENERIC_HARDIRQS
select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
select SPARSE_IRQ
@@ -2333,10 +2332,6 @@ config HAVE_ATOMIC_IOMAP
def_bool y
depends on X86_32
config HAVE_TEXT_POKE_SMP
bool
select STOP_MACHINE if SMP
config X86_DEV_DMA_OPS
bool
depends on X86_64 || STA2X11

@@ -5,6 +5,7 @@
#include <linux/stddef.h>
#include <linux/stringify.h>
#include <asm/asm.h>
#include <asm/ptrace.h>
/*
* Alternative inline assembly for SMP.
@@ -220,20 +221,11 @@ extern void *text_poke_early(void *addr, const void *opcode, size_t len);
* no thread can be preempted in the instructions being modified (no iret to an
* invalid instruction possible) or if the instructions are changed from a
* consistent state to another consistent state atomically.
* More care must be taken when modifying code in the SMP case because of
* Intel's errata. text_poke_smp() takes care that errata, but still
* doesn't support NMI/MCE handler code modifying.
* On the local CPU you need to be protected against NMI or MCE handlers seeing an
* inconsistent instruction while you patch.
*/
struct text_poke_param {
void *addr;
const void *opcode;
size_t len;
};
extern void *text_poke(void *addr, const void *opcode, size_t len);
extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
extern void text_poke_smp_batch(struct text_poke_param *params, int n);
extern int poke_int3_handler(struct pt_regs *regs);
extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
#endif /* _ASM_X86_ALTERNATIVE_H */

@@ -49,6 +49,7 @@ extern void tsc_init(void);
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
extern int check_tsc_unstable(void);
extern int check_tsc_disabled(void);
extern unsigned long native_calibrate_tsc(void);
extern int tsc_clocksource_reliable;

@@ -11,6 +11,7 @@
#include <linux/memory.h>
#include <linux/stop_machine.h>
#include <linux/slab.h>
#include <linux/kdebug.h>
#include <asm/alternative.h>
#include <asm/sections.h>
#include <asm/pgtable.h>
@@ -596,97 +597,93 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
return addr;
}
/*
* Cross-modifying kernel text with stop_machine().
* This code originally comes from immediate value.
*/
static atomic_t stop_machine_first;
static int wrote_text;
struct text_poke_params {
struct text_poke_param *params;
int nparams;
};
static int __kprobes stop_machine_text_poke(void *data)
static void do_sync_core(void *info)
{
struct text_poke_params *tpp = data;
struct text_poke_param *p;
int i;
if (atomic_xchg(&stop_machine_first, 0)) {
for (i = 0; i < tpp->nparams; i++) {
p = &tpp->params[i];
text_poke(p->addr, p->opcode, p->len);
}
smp_wmb(); /* Make sure other cpus see that this has run */
wrote_text = 1;
} else {
while (!wrote_text)
cpu_relax();
smp_mb(); /* Load wrote_text before following execution */
}
for (i = 0; i < tpp->nparams; i++) {
p = &tpp->params[i];
flush_icache_range((unsigned long)p->addr,
(unsigned long)p->addr + p->len);
}
/*
* Intel Architecture Software Developer's Manual section 7.1.3 specifies
* that a core serializing instruction such as "cpuid" should be
* executed on _each_ core before the new instruction is made visible.
*/
sync_core();
return 0;
}
static bool bp_patching_in_progress;
static void *bp_int3_handler, *bp_int3_addr;
int poke_int3_handler(struct pt_regs *regs)
{
/* bp_patching_in_progress */
smp_rmb();
if (likely(!bp_patching_in_progress))
return 0;
if (user_mode_vm(regs) || regs->ip != (unsigned long)bp_int3_addr)
return 0;
/* set up the specified breakpoint handler */
regs->ip = (unsigned long) bp_int3_handler;
return 1;
}
/**
* text_poke_smp - Update instructions on a live kernel on SMP
* @addr: address to modify
* @opcode: source of the copy
* @len: length to copy
* text_poke_bp() -- update instructions on live kernel on SMP
* @addr: address to patch
* @opcode: opcode of new instruction
* @len: length to copy
* @handler: address to jump to when the temporary breakpoint is hit
*
* Modify multi-byte instruction by using stop_machine() on SMP. This allows
* user to poke/set multi-byte text on SMP. Only non-NMI/MCE code modifying
* should be allowed, since stop_machine() does _not_ protect code against
* NMI and MCE.
* Modify multi-byte instruction by using int3 breakpoint on SMP.
* We completely avoid stop_machine() here, and achieve the
* synchronization using int3 breakpoint.
*
* Note: Must be called under get_online_cpus() and text_mutex.
* The way it is done:
* - add an int3 trap to the address that will be patched
* - sync cores
* - update all but the first byte of the patched range
* - sync cores
* - replace the first byte (int3) by the first byte of
* replacing opcode
* - sync cores
*
* Note: must be called under text_mutex.
*/
void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
{
struct text_poke_params tpp;
struct text_poke_param p;
unsigned char int3 = 0xcc;
bp_int3_handler = handler;
bp_int3_addr = (u8 *)addr + sizeof(int3);
bp_patching_in_progress = true;
/*
* Corresponding read barrier in int3 notifier for
* making sure the in_progress flag is correctly ordered wrt.
* patching
*/
smp_wmb();
text_poke(addr, &int3, sizeof(int3));
on_each_cpu(do_sync_core, NULL, 1);
if (len - sizeof(int3) > 0) {
/* patch all but the first byte */
text_poke((char *)addr + sizeof(int3),
(const char *) opcode + sizeof(int3),
len - sizeof(int3));
/*
* According to Intel, this core syncing is very likely
* not necessary and we'd be safe even without it. But
* better safe than sorry (plus there's not only Intel).
*/
on_each_cpu(do_sync_core, NULL, 1);
}
/* patch the first byte */
text_poke(addr, opcode, sizeof(int3));
on_each_cpu(do_sync_core, NULL, 1);
bp_patching_in_progress = false;
smp_wmb();
p.addr = addr;
p.opcode = opcode;
p.len = len;
tpp.params = &p;
tpp.nparams = 1;
atomic_set(&stop_machine_first, 1);
wrote_text = 0;
/* Use __stop_machine() because the caller already got online_cpus. */
__stop_machine(stop_machine_text_poke, (void *)&tpp, cpu_online_mask);
return addr;
}
/**
* text_poke_smp_batch - Update instructions on a live kernel on SMP
* @params: an array of text_poke parameters
* @n: the number of elements in params.
*
* Modify multi-byte instruction by using stop_machine() on SMP. Since the
* stop_machine() is heavy task, it is better to aggregate text_poke requests
* and do it once if possible.
*
* Note: Must be called under get_online_cpus() and text_mutex.
*/
void __kprobes text_poke_smp_batch(struct text_poke_param *params, int n)
{
struct text_poke_params tpp = {.params = params, .nparams = n};
atomic_set(&stop_machine_first, 1);
wrote_text = 0;
__stop_machine(stop_machine_text_poke, (void *)&tpp, cpu_online_mask);
}
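
The new text_poke_bp() lets other CPUs keep running while the site is
patched: any CPU that executes the transient int3 is redirected to the
handler address by poke_int3_handler() instead of stalling in
stop_machine(). A hedged sketch of a caller, with a hypothetical helper
name, assuming a 5-byte near jmp whose displacement fits in 32 bits
(real callers such as the optimized kprobes code follow the same shape):

        /* Retarget a 5-byte jmp at 'addr' to 'target'; caller holds text_mutex. */
        static void repoint_jmp(void *addr, void *target)
        {
                unsigned char insn[5];
                s32 disp = (s32)((long)target - ((long)addr + sizeof(insn)));

                insn[0] = 0xe9;                 /* near jmp rel32 */
                memcpy(insn + 1, &disp, sizeof(disp));

                /* While the int3 is live, a CPU hitting 'addr' is diverted
                 * to 'target' via the handler argument. */
                text_poke_bp(addr, insn, sizeof(insn), target);
        }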

@@ -1884,6 +1884,7 @@ static struct pmu pmu = {
void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
{
userpg->cap_usr_time = 0;
userpg->cap_usr_time_zero = 0;
userpg->cap_usr_rdpmc = x86_pmu.attr_rdpmc;
userpg->pmc_width = x86_pmu.cntval_bits;
@@ -1897,6 +1898,11 @@ void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
userpg->time_mult = this_cpu_read(cyc2ns);
userpg->time_shift = CYC2NS_SCALE_FACTOR;
userpg->time_offset = this_cpu_read(cyc2ns_offset) - now;
if (sched_clock_stable && !check_tsc_disabled()) {
userpg->cap_usr_time_zero = 1;
userpg->time_zero = this_cpu_read(cyc2ns_offset);
}
}
/*

@@ -641,6 +641,8 @@ extern struct event_constraint intel_core2_pebs_event_constraints[];
extern struct event_constraint intel_atom_pebs_event_constraints[];
extern struct event_constraint intel_slm_pebs_event_constraints[];
extern struct event_constraint intel_nehalem_pebs_event_constraints[];
extern struct event_constraint intel_westmere_pebs_event_constraints[];

@@ -347,8 +347,7 @@ static struct amd_nb *amd_alloc_nb(int cpu)
struct amd_nb *nb;
int i;
nb = kmalloc_node(sizeof(struct amd_nb), GFP_KERNEL | __GFP_ZERO,
cpu_to_node(cpu));
nb = kzalloc_node(sizeof(struct amd_nb), GFP_KERNEL, cpu_to_node(cpu));
if (!nb)
return NULL;

@@ -81,7 +81,8 @@ static struct event_constraint intel_nehalem_event_constraints[] __read_mostly =
static struct extra_reg intel_nehalem_extra_regs[] __read_mostly =
{
INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0xffff, RSP_0),
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0xffff, RSP_0),
INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x100b),
EVENT_EXTRA_END
};
@@ -143,8 +144,9 @@ static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
static struct extra_reg intel_westmere_extra_regs[] __read_mostly =
{
INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0xffff, RSP_0),
INTEL_EVENT_EXTRA_REG(0xbb, MSR_OFFCORE_RSP_1, 0xffff, RSP_1),
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0xffff, RSP_0),
INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0xffff, RSP_1),
INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x100b),
EVENT_EXTRA_END
};
@@ -162,16 +164,27 @@ static struct event_constraint intel_gen_event_constraints[] __read_mostly =
EVENT_CONSTRAINT_END
};
static struct event_constraint intel_slm_event_constraints[] __read_mostly =
{
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF */
FIXED_EVENT_CONSTRAINT(0x0300, 2), /* pseudo CPU_CLK_UNHALTED.REF */
EVENT_CONSTRAINT_END
};
static struct extra_reg intel_snb_extra_regs[] __read_mostly = {
INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0x3f807f8fffull, RSP_0),
INTEL_EVENT_EXTRA_REG(0xbb, MSR_OFFCORE_RSP_1, 0x3f807f8fffull, RSP_1),
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x3f807f8fffull, RSP_0),
INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x3f807f8fffull, RSP_1),
INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
EVENT_EXTRA_END
};
static struct extra_reg intel_snbep_extra_regs[] __read_mostly = {
INTEL_EVENT_EXTRA_REG(0xb7, MSR_OFFCORE_RSP_0, 0x3fffff8fffull, RSP_0),
INTEL_EVENT_EXTRA_REG(0xbb, MSR_OFFCORE_RSP_1, 0x3fffff8fffull, RSP_1),
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x3fffff8fffull, RSP_0),
INTEL_UEVENT_EXTRA_REG(0x01bb, MSR_OFFCORE_RSP_1, 0x3fffff8fffull, RSP_1),
INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
EVENT_EXTRA_END
};
@@ -882,6 +895,140 @@ static __initconst const u64 atom_hw_cache_event_ids
},
};
static struct extra_reg intel_slm_extra_regs[] __read_mostly =
{
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x768005ffff, RSP_0),
INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x768005ffff, RSP_1),
EVENT_EXTRA_END
};
#define SLM_DMND_READ SNB_DMND_DATA_RD
#define SLM_DMND_WRITE SNB_DMND_RFO
#define SLM_DMND_PREFETCH (SNB_PF_DATA_RD|SNB_PF_RFO)
#define SLM_SNP_ANY (SNB_SNP_NONE|SNB_SNP_MISS|SNB_NO_FWD|SNB_HITM)
#define SLM_LLC_ACCESS SNB_RESP_ANY
#define SLM_LLC_MISS (SLM_SNP_ANY|SNB_NON_DRAM)
static __initconst const u64 slm_hw_cache_extra_regs
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
[ C(LL ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = SLM_DMND_READ|SLM_LLC_ACCESS,
[ C(RESULT_MISS) ] = SLM_DMND_READ|SLM_LLC_MISS,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = SLM_DMND_WRITE|SLM_LLC_ACCESS,
[ C(RESULT_MISS) ] = SLM_DMND_WRITE|SLM_LLC_MISS,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = SLM_DMND_PREFETCH|SLM_LLC_ACCESS,
[ C(RESULT_MISS) ] = SLM_DMND_PREFETCH|SLM_LLC_MISS,
},
},
};
static __initconst const u64 slm_hw_cache_event_ids
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
[ C(L1D) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0x0104, /* LD_DCU_MISS */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0,
},
},
[ C(L1I ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x0380, /* ICACHE.ACCESSES */
[ C(RESULT_MISS) ] = 0x0280, /* ICACHE.MISSES */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0,
},
},
[ C(LL ) ] = {
[ C(OP_READ) ] = {
/* OFFCORE_RESPONSE.ANY_DATA.LOCAL_CACHE */
[ C(RESULT_ACCESS) ] = 0x01b7,
/* OFFCORE_RESPONSE.ANY_DATA.ANY_LLC_MISS */
[ C(RESULT_MISS) ] = 0x01b7,
},
[ C(OP_WRITE) ] = {
/* OFFCORE_RESPONSE.ANY_RFO.LOCAL_CACHE */
[ C(RESULT_ACCESS) ] = 0x01b7,
/* OFFCORE_RESPONSE.ANY_RFO.ANY_LLC_MISS */
[ C(RESULT_MISS) ] = 0x01b7,
},
[ C(OP_PREFETCH) ] = {
/* OFFCORE_RESPONSE.PREFETCH.LOCAL_CACHE */
[ C(RESULT_ACCESS) ] = 0x01b7,
/* OFFCORE_RESPONSE.PREFETCH.ANY_LLC_MISS */
[ C(RESULT_MISS) ] = 0x01b7,
},
},
[ C(DTLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0x0804, /* LD_DTLB_MISS */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0,
[ C(RESULT_MISS) ] = 0,
},
},
[ C(ITLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x00c0, /* INST_RETIRED.ANY_P */
[ C(RESULT_MISS) ] = 0x0282, /* ITLB.MISSES */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
[ C(BPU ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x00c4, /* BR_INST_RETIRED.ANY */
[ C(RESULT_MISS) ] = 0x00c5, /* BP_INST_RETIRED.MISPRED */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
};
static inline bool intel_pmu_needs_lbr_smpl(struct perf_event *event)
{
/* user explicitly requested branch sampling */
@@ -1301,11 +1448,11 @@ static void intel_fixup_er(struct perf_event *event, int idx)
if (idx == EXTRA_REG_RSP_0) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= 0x01b7;
event->hw.config |= x86_pmu.extra_regs[EXTRA_REG_RSP_0].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
} else if (idx == EXTRA_REG_RSP_1) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= 0x01bb;
event->hw.config |= x86_pmu.extra_regs[EXTRA_REG_RSP_1].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
}
}
@@ -2176,6 +2323,21 @@ __init int intel_pmu_init(void)
pr_cont("Atom events, ");
break;
case 55: /* Atom 22nm "Silvermont" */
memcpy(hw_cache_event_ids, slm_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, slm_hw_cache_extra_regs,
sizeof(hw_cache_extra_regs));
intel_pmu_lbr_init_atom();
x86_pmu.event_constraints = intel_slm_event_constraints;
x86_pmu.pebs_constraints = intel_slm_pebs_event_constraints;
x86_pmu.extra_regs = intel_slm_extra_regs;
x86_pmu.er_flags |= ERF_HAS_RSP_1;
pr_cont("Silvermont events, ");
break;
case 37: /* 32 nm nehalem, "Clarkdale" */
case 44: /* 32 nm nehalem, "Gulftown" */
case 47: /* 32 nm Xeon E7 */

@@ -224,7 +224,7 @@ static int alloc_pebs_buffer(int cpu)
if (!x86_pmu.pebs)
return 0;
buffer = kmalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO, node);
buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
if (unlikely(!buffer))
return -ENOMEM;
@@ -262,7 +262,7 @@ static int alloc_bts_buffer(int cpu)
if (!x86_pmu.bts)
return 0;
buffer = kmalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_ZERO, node);
buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL, node);
if (unlikely(!buffer))
return -ENOMEM;
@@ -295,7 +295,7 @@ static int alloc_ds_buffer(int cpu)
int node = cpu_to_node(cpu);
struct debug_store *ds;
ds = kmalloc_node(sizeof(*ds), GFP_KERNEL | __GFP_ZERO, node);
ds = kzalloc_node(sizeof(*ds), GFP_KERNEL, node);
if (unlikely(!ds))
return -ENOMEM;
@@ -517,6 +517,32 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END
};
struct event_constraint intel_slm_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */
INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */
INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */
INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */
INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */
INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */
INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */
INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */
INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */
INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */
INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */
INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */
INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */
INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */
INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */
INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */
INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */
INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */
EVENT_CONSTRAINT_END
};
struct event_constraint intel_nehalem_pebs_event_constraints[] = {
INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */

@@ -6,6 +6,8 @@ static struct intel_uncore_type **pci_uncores = empty_uncore;
/* pci bus to socket mapping */
static int pcibus_to_physid[256] = { [0 ... 255] = -1, };
static struct pci_dev *extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX];
static DEFINE_RAW_SPINLOCK(uncore_box_lock);
/* mask of cpus that collect uncore events */
@@ -45,6 +47,24 @@ DEFINE_UNCORE_FORMAT_ATTR(filter_band0, filter_band0, "config1:0-7");
DEFINE_UNCORE_FORMAT_ATTR(filter_band1, filter_band1, "config1:8-15");
DEFINE_UNCORE_FORMAT_ATTR(filter_band2, filter_band2, "config1:16-23");
DEFINE_UNCORE_FORMAT_ATTR(filter_band3, filter_band3, "config1:24-31");
DEFINE_UNCORE_FORMAT_ATTR(match_rds, match_rds, "config1:48-51");
DEFINE_UNCORE_FORMAT_ATTR(match_rnid30, match_rnid30, "config1:32-35");
DEFINE_UNCORE_FORMAT_ATTR(match_rnid4, match_rnid4, "config1:31");
DEFINE_UNCORE_FORMAT_ATTR(match_dnid, match_dnid, "config1:13-17");
DEFINE_UNCORE_FORMAT_ATTR(match_mc, match_mc, "config1:9-12");
DEFINE_UNCORE_FORMAT_ATTR(match_opc, match_opc, "config1:5-8");
DEFINE_UNCORE_FORMAT_ATTR(match_vnw, match_vnw, "config1:3-4");
DEFINE_UNCORE_FORMAT_ATTR(match0, match0, "config1:0-31");
DEFINE_UNCORE_FORMAT_ATTR(match1, match1, "config1:32-63");
DEFINE_UNCORE_FORMAT_ATTR(mask_rds, mask_rds, "config2:48-51");
DEFINE_UNCORE_FORMAT_ATTR(mask_rnid30, mask_rnid30, "config2:32-35");
DEFINE_UNCORE_FORMAT_ATTR(mask_rnid4, mask_rnid4, "config2:31");
DEFINE_UNCORE_FORMAT_ATTR(mask_dnid, mask_dnid, "config2:13-17");
DEFINE_UNCORE_FORMAT_ATTR(mask_mc, mask_mc, "config2:9-12");
DEFINE_UNCORE_FORMAT_ATTR(mask_opc, mask_opc, "config2:5-8");
DEFINE_UNCORE_FORMAT_ATTR(mask_vnw, mask_vnw, "config2:3-4");
DEFINE_UNCORE_FORMAT_ATTR(mask0, mask0, "config2:0-31");
DEFINE_UNCORE_FORMAT_ATTR(mask1, mask1, "config2:32-63");
static u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
{
@@ -281,7 +301,7 @@ static struct attribute *snbep_uncore_cbox_formats_attr[] = {
};
static struct attribute *snbep_uncore_pcu_formats_attr[] = {
&format_attr_event.attr,
&format_attr_event_ext.attr,
&format_attr_occ_sel.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
@@ -301,6 +321,24 @@ static struct attribute *snbep_uncore_qpi_formats_attr[] = {
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
&format_attr_match_rds.attr,
&format_attr_match_rnid30.attr,
&format_attr_match_rnid4.attr,
&format_attr_match_dnid.attr,
&format_attr_match_mc.attr,
&format_attr_match_opc.attr,
&format_attr_match_vnw.attr,
&format_attr_match0.attr,
&format_attr_match1.attr,
&format_attr_mask_rds.attr,
&format_attr_mask_rnid30.attr,
&format_attr_mask_rnid4.attr,
&format_attr_mask_dnid.attr,
&format_attr_mask_mc.attr,
&format_attr_mask_opc.attr,
&format_attr_mask_vnw.attr,
&format_attr_mask0.attr,
&format_attr_mask1.attr,
NULL,
};
@@ -356,13 +394,16 @@ static struct intel_uncore_ops snbep_uncore_msr_ops = {
SNBEP_UNCORE_MSR_OPS_COMMON_INIT(),
};
#define SNBEP_UNCORE_PCI_OPS_COMMON_INIT() \
.init_box = snbep_uncore_pci_init_box, \
.disable_box = snbep_uncore_pci_disable_box, \
.enable_box = snbep_uncore_pci_enable_box, \
.disable_event = snbep_uncore_pci_disable_event, \
.read_counter = snbep_uncore_pci_read_counter
static struct intel_uncore_ops snbep_uncore_pci_ops = {
.init_box = snbep_uncore_pci_init_box,
.disable_box = snbep_uncore_pci_disable_box,
.enable_box = snbep_uncore_pci_enable_box,
.disable_event = snbep_uncore_pci_disable_event,
.enable_event = snbep_uncore_pci_enable_event,
.read_counter = snbep_uncore_pci_read_counter,
SNBEP_UNCORE_PCI_OPS_COMMON_INIT(),
.enable_event = snbep_uncore_pci_enable_event, \
};
static struct event_constraint snbep_uncore_cbox_constraints[] = {
@@ -726,6 +767,61 @@ static struct intel_uncore_type *snbep_msr_uncores[] = {
NULL,
};
enum {
SNBEP_PCI_QPI_PORT0_FILTER,
SNBEP_PCI_QPI_PORT1_FILTER,
};
static int snbep_qpi_hw_config(struct intel_uncore_box *box, struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
struct hw_perf_event_extra *reg2 = &hwc->branch_reg;
if ((hwc->config & SNBEP_PMON_CTL_EV_SEL_MASK) == 0x38) {
reg1->idx = 0;
reg1->reg = SNBEP_Q_Py_PCI_PMON_PKT_MATCH0;
reg1->config = event->attr.config1;
reg2->reg = SNBEP_Q_Py_PCI_PMON_PKT_MASK0;
reg2->config = event->attr.config2;
}
return 0;
}
static void snbep_qpi_enable_event(struct intel_uncore_box *box, struct perf_event *event)
{
struct pci_dev *pdev = box->pci_dev;
struct hw_perf_event *hwc = &event->hw;
struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
struct hw_perf_event_extra *reg2 = &hwc->branch_reg;
if (reg1->idx != EXTRA_REG_NONE) {
int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
struct pci_dev *filter_pdev = extra_pci_dev[box->phys_id][idx];
WARN_ON_ONCE(!filter_pdev);
if (filter_pdev) {
pci_write_config_dword(filter_pdev, reg1->reg,
(u32)reg1->config);
pci_write_config_dword(filter_pdev, reg1->reg + 4,
(u32)(reg1->config >> 32));
pci_write_config_dword(filter_pdev, reg2->reg,
(u32)reg2->config);
pci_write_config_dword(filter_pdev, reg2->reg + 4,
(u32)(reg2->config >> 32));
}
}
pci_write_config_dword(pdev, hwc->config_base, hwc->config | SNBEP_PMON_CTL_EN);
}
static struct intel_uncore_ops snbep_uncore_qpi_ops = {
SNBEP_UNCORE_PCI_OPS_COMMON_INIT(),
.enable_event = snbep_qpi_enable_event,
.hw_config = snbep_qpi_hw_config,
.get_constraint = uncore_get_constraint,
.put_constraint = uncore_put_constraint,
};
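Taken together, snbep_qpi_hw_config() and snbep_qpi_enable_event() expose the QPI packet match/mask registers through the standard config1/config2 event attributes, with the actual writes landing on the companion filter device discovered at probe time. As a hedged userspace sketch (not part of this patch; the PMU type must be read from sysfs at runtime and the config1/config2 filter values shown are placeholders), opening such an event might look like:

	/* Hypothetical consumer of the new QPI filter ABI. The PMU type
	 * comes from /sys/bus/event_source/devices/uncore_qpi_0/type;
	 * the config1/config2 values are illustrative only. */
	#include <linux/perf_event.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int open_qpi_filtered_event(int pmu_type)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = pmu_type;	/* uncore_qpi_0 */
		attr.config = 0x38;	/* only event 0x38 consults the filter */
		attr.config1 = 0x102;	/* match0/match1 (placeholder) */
		attr.config2 = 0x107;	/* mask0/mask1 (placeholder) */

		/* uncore events are system-wide: pid = -1, cpu >= 0 */
		return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
	}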
#define SNBEP_UNCORE_PCI_COMMON_INIT() \
.perf_ctr = SNBEP_PCI_PMON_CTR0, \
.event_ctl = SNBEP_PCI_PMON_CTL0, \
@ -755,17 +851,18 @@ static struct intel_uncore_type snbep_uncore_imc = {
};
static struct intel_uncore_type snbep_uncore_qpi = {
.name = "qpi",
.num_counters = 4,
.num_boxes = 2,
.perf_ctr_bits = 48,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = SNBEP_QPI_PCI_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.ops = &snbep_uncore_pci_ops,
.event_descs = snbep_uncore_qpi_events,
.format_group = &snbep_uncore_qpi_format_group,
.name = "qpi",
.num_counters = 4,
.num_boxes = 2,
.perf_ctr_bits = 48,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = SNBEP_QPI_PCI_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.num_shared_regs = 1,
.ops = &snbep_uncore_qpi_ops,
.event_descs = snbep_uncore_qpi_events,
.format_group = &snbep_uncore_qpi_format_group,
};
@ -807,43 +904,53 @@ static struct intel_uncore_type *snbep_pci_uncores[] = {
static DEFINE_PCI_DEVICE_TABLE(snbep_uncore_pci_ids) = {
{ /* Home Agent */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_HA),
.driver_data = SNBEP_PCI_UNCORE_HA,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_HA, 0),
},
{ /* MC Channel 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC0),
.driver_data = SNBEP_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_IMC, 0),
},
{ /* MC Channel 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC1),
.driver_data = SNBEP_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_IMC, 1),
},
{ /* MC Channel 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC2),
.driver_data = SNBEP_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_IMC, 2),
},
{ /* MC Channel 3 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_IMC3),
.driver_data = SNBEP_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_IMC, 3),
},
{ /* QPI Port 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_QPI0),
.driver_data = SNBEP_PCI_UNCORE_QPI,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_QPI, 0),
},
{ /* QPI Port 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_QPI1),
.driver_data = SNBEP_PCI_UNCORE_QPI,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_QPI, 1),
},
{ /* R2PCIe */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_R2PCIE),
.driver_data = SNBEP_PCI_UNCORE_R2PCIE,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_R2PCIE, 0),
},
{ /* R3QPI Link 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_R3QPI0),
.driver_data = SNBEP_PCI_UNCORE_R3QPI,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_R3QPI, 0),
},
{ /* R3QPI Link 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_UNC_R3QPI1),
.driver_data = SNBEP_PCI_UNCORE_R3QPI,
.driver_data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_R3QPI, 1),
},
{ /* QPI Port 0 filter */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x3c86),
.driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
SNBEP_PCI_QPI_PORT0_FILTER),
},
{ /* QPI Port 1 filter */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x3c96),
.driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
SNBEP_PCI_QPI_PORT1_FILTER),
},
{ /* end: all zeroes */ }
};
@ -1256,71 +1363,71 @@ static struct intel_uncore_type *ivt_pci_uncores[] = {
static DEFINE_PCI_DEVICE_TABLE(ivt_uncore_pci_ids) = {
{ /* Home Agent 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe30),
.driver_data = IVT_PCI_UNCORE_HA,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_HA, 0),
},
{ /* Home Agent 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe38),
.driver_data = IVT_PCI_UNCORE_HA,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_HA, 1),
},
{ /* MC0 Channel 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xeb4),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 0),
},
{ /* MC0 Channel 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xeb5),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 1),
},
{ /* MC0 Channel 3 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xeb0),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 2),
},
{ /* MC0 Channel 4 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xeb1),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 3),
},
{ /* MC1 Channel 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xef4),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 4),
},
{ /* MC1 Channel 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xef5),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 5),
},
{ /* MC1 Channel 3 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xef0),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 6),
},
{ /* MC1 Channel 4 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xef1),
.driver_data = IVT_PCI_UNCORE_IMC,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 7),
},
{ /* QPI0 Port 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe32),
.driver_data = IVT_PCI_UNCORE_QPI,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_QPI, 0),
},
{ /* QPI0 Port 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe33),
.driver_data = IVT_PCI_UNCORE_QPI,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_QPI, 1),
},
{ /* QPI1 Port 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe3a),
.driver_data = IVT_PCI_UNCORE_QPI,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_QPI, 2),
},
{ /* R2PCIe */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe34),
.driver_data = IVT_PCI_UNCORE_R2PCIE,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_R2PCIE, 0),
},
{ /* R3QPI0 Link 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe36),
.driver_data = IVT_PCI_UNCORE_R3QPI,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_R3QPI, 0),
},
{ /* R3QPI0 Link 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe37),
.driver_data = IVT_PCI_UNCORE_R3QPI,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_R3QPI, 1),
},
{ /* R3QPI1 Link 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe3e),
.driver_data = IVT_PCI_UNCORE_R3QPI,
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_R3QPI, 2),
},
{ /* end: all zeroes */ }
};
@ -2606,7 +2713,7 @@ struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type, int cp
size = sizeof(*box) + type->num_shared_regs * sizeof(struct intel_uncore_extra_reg);
box = kmalloc_node(size, GFP_KERNEL | __GFP_ZERO, cpu_to_node(cpu));
box = kzalloc_node(size, GFP_KERNEL, cpu_to_node(cpu));
if (!box)
return NULL;
@ -3167,16 +3274,24 @@ static bool pcidrv_registered;
/*
* add a pci uncore device
*/
static int uncore_pci_add(struct intel_uncore_type *type, struct pci_dev *pdev)
static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct intel_uncore_pmu *pmu;
struct intel_uncore_box *box;
int i, phys_id;
struct intel_uncore_type *type;
int phys_id;
phys_id = pcibus_to_physid[pdev->bus->number];
if (phys_id < 0)
return -ENODEV;
if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {
extra_pci_dev[phys_id][UNCORE_PCI_DEV_IDX(id->driver_data)] = pdev;
pci_set_drvdata(pdev, NULL);
return 0;
}
type = pci_uncores[UNCORE_PCI_DEV_TYPE(id->driver_data)];
box = uncore_alloc_box(type, 0);
if (!box)
return -ENOMEM;
@ -3185,21 +3300,11 @@ static int uncore_pci_add(struct intel_uncore_type *type, struct pci_dev *pdev)
* for performance monitoring unit with multiple boxes,
* each box has a different function id.
*/
for (i = 0; i < type->num_boxes; i++) {
pmu = &type->pmus[i];
if (pmu->func_id == pdev->devfn)
break;
if (pmu->func_id < 0) {
pmu->func_id = pdev->devfn;
break;
}
pmu = NULL;
}
if (!pmu) {
kfree(box);
return -EINVAL;
}
pmu = &type->pmus[UNCORE_PCI_DEV_IDX(id->driver_data)];
if (pmu->func_id < 0)
pmu->func_id = pdev->devfn;
else
WARN_ON_ONCE(pmu->func_id != pdev->devfn);
box->phys_id = phys_id;
box->pci_dev = pdev;
@ -3217,9 +3322,22 @@ static int uncore_pci_add(struct intel_uncore_type *type, struct pci_dev *pdev)
static void uncore_pci_remove(struct pci_dev *pdev)
{
struct intel_uncore_box *box = pci_get_drvdata(pdev);
struct intel_uncore_pmu *pmu = box->pmu;
int cpu, phys_id = pcibus_to_physid[pdev->bus->number];
struct intel_uncore_pmu *pmu;
int i, cpu, phys_id = pcibus_to_physid[pdev->bus->number];
box = pci_get_drvdata(pdev);
if (!box) {
for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
if (extra_pci_dev[phys_id][i] == pdev) {
extra_pci_dev[phys_id][i] = NULL;
break;
}
}
WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX);
return;
}
pmu = box->pmu;
if (WARN_ON_ONCE(phys_id != box->phys_id))
return;
@ -3240,12 +3358,6 @@ static void uncore_pci_remove(struct pci_dev *pdev)
kfree(box);
}
static int uncore_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *id)
{
return uncore_pci_add(pci_uncores[id->driver_data], pdev);
}
static int __init uncore_pci_init(void)
{
int ret;


@ -12,6 +12,15 @@
#define UNCORE_PMC_IDX_FIXED UNCORE_PMC_IDX_MAX_GENERIC
#define UNCORE_PMC_IDX_MAX (UNCORE_PMC_IDX_FIXED + 1)
#define UNCORE_PCI_DEV_DATA(type, idx) ((type << 8) | idx)
#define UNCORE_PCI_DEV_TYPE(data) ((data >> 8) & 0xff)
#define UNCORE_PCI_DEV_IDX(data) (data & 0xff)
#define UNCORE_EXTRA_PCI_DEV 0xff
#define UNCORE_EXTRA_PCI_DEV_MAX 2
/* support up to 8 sockets */
#define UNCORE_SOCKET_MAX 8
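A minimal sketch of the packing these helpers implement (illustrative values):

	/* driver_data carries the PMU type in bits 8-15 and the box or
	 * filter index in bits 0-7, e.g. for QPI port 1: */
	unsigned long data = UNCORE_PCI_DEV_DATA(SNBEP_PCI_UNCORE_QPI, 1);

	/* UNCORE_PCI_DEV_TYPE(data) == SNBEP_PCI_UNCORE_QPI */
	/* UNCORE_PCI_DEV_IDX(data)  == 1 */

The 0xff type value is reserved for UNCORE_EXTRA_PCI_DEV, so filter-only devices can share the same PCI id table without owning a PMU of their own.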
#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
/* SNB event control */
@ -108,6 +117,7 @@
(SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PCU_MSR_PMON_CTL_OCC_SEL_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PMON_CTL_EV_SEL_EXT | \
SNBEP_PMON_CTL_INVERT | \
SNBEP_PCU_MSR_PMON_CTL_TRESH_MASK | \
SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT | \


@ -37,7 +37,19 @@ static void __jump_label_transform(struct jump_entry *entry,
} else
memcpy(&code, ideal_nops[NOP_ATOMIC5], JUMP_LABEL_NOP_SIZE);
(*poker)((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE);
/*
* Make text_poke_bp() the default fallback poker.
*
* At the time the change is being done, just ignore whether we
* are doing a nop -> jump or a jump -> nop transition, and
* assume that a nop is always the 'currently valid' instruction.
*/
if (poker)
(*poker)((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE);
else
text_poke_bp((void *)entry->code, &code, JUMP_LABEL_NOP_SIZE,
(void *)entry->code + JUMP_LABEL_NOP_SIZE);
}
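For context, text_poke_bp() is the breakpoint-based code patcher that replaces the stop_machine()-style text_poke_smp() here. A rough outline of the sequence it performs (simplified; the real implementation in arch/x86/kernel/alternative.c handles the cross-CPU serialization and the int3 handler wiring, and sync_cores() below is a hypothetical stand-in for its on_each_cpu(do_sync_core, ...) calls):

	/* Simplified sketch of int3-based live patching; the handler
	 * registration is reduced to a global that poke_int3_handler()
	 * would consult. */
	static void *bp_int3_handler_sketch;

	static void text_poke_bp_outline(void *addr, const void *opcode,
					 size_t len, void *handler)
	{
		unsigned char int3 = 0xcc;

		bp_int3_handler_sketch = handler;

		/* 1) plant int3 so CPUs hitting the site divert to handler */
		text_poke(addr, &int3, 1);
		sync_cores();

		/* 2) write the tail of the new instruction */
		if (len > 1)
			text_poke((char *)addr + 1, (const char *)opcode + 1,
				  len - 1);
		sync_cores();

		/* 3) replace the int3 with the new first opcode byte */
		text_poke(addr, opcode, 1);
		sync_cores();
	}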
void arch_jump_label_transform(struct jump_entry *entry,
@ -45,7 +57,7 @@ void arch_jump_label_transform(struct jump_entry *entry,
{
get_online_cpus();
mutex_lock(&text_mutex);
__jump_label_transform(entry, type, text_poke_smp);
__jump_label_transform(entry, type, NULL);
mutex_unlock(&text_mutex);
put_online_cpus();
}


@ -82,14 +82,9 @@ extern void synthesize_reljump(void *from, void *to);
extern void synthesize_relcall(void *from, void *to);
#ifdef CONFIG_OPTPROBES
extern int arch_init_optprobes(void);
extern int setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter);
extern unsigned long __recover_optprobed_insn(kprobe_opcode_t *buf, unsigned long addr);
#else /* !CONFIG_OPTPROBES */
static inline int arch_init_optprobes(void)
{
return 0;
}
static inline int setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
{
return 0;


@ -1068,7 +1068,7 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
int __init arch_init_kprobes(void)
{
return arch_init_optprobes();
return 0;
}
int __kprobes arch_trampoline_kprobe(struct kprobe *p)


@ -371,31 +371,6 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
return 0;
}
#define MAX_OPTIMIZE_PROBES 256
static struct text_poke_param *jump_poke_params;
static struct jump_poke_buffer {
u8 buf[RELATIVEJUMP_SIZE];
} *jump_poke_bufs;
static void __kprobes setup_optimize_kprobe(struct text_poke_param *tprm,
u8 *insn_buf,
struct optimized_kprobe *op)
{
s32 rel = (s32)((long)op->optinsn.insn -
((long)op->kp.addr + RELATIVEJUMP_SIZE));
/* Backup instructions which will be replaced by jump address */
memcpy(op->optinsn.copied_insn, op->kp.addr + INT3_SIZE,
RELATIVE_ADDR_SIZE);
insn_buf[0] = RELATIVEJUMP_OPCODE;
*(s32 *)(&insn_buf[1]) = rel;
tprm->addr = op->kp.addr;
tprm->opcode = insn_buf;
tprm->len = RELATIVEJUMP_SIZE;
}
/*
* Replace breakpoints (int3) with relative jumps.
* Caller must hold kprobe_mutex and text_mutex.
@ -403,37 +378,38 @@ static void __kprobes setup_optimize_kprobe(struct text_poke_param *tprm,
void __kprobes arch_optimize_kprobes(struct list_head *oplist)
{
struct optimized_kprobe *op, *tmp;
int c = 0;
u8 insn_buf[RELATIVEJUMP_SIZE];
list_for_each_entry_safe(op, tmp, oplist, list) {
WARN_ON(kprobe_disabled(&op->kp));
/* Setup param */
setup_optimize_kprobe(&jump_poke_params[c],
jump_poke_bufs[c].buf, op);
list_del_init(&op->list);
if (++c >= MAX_OPTIMIZE_PROBES)
break;
}
s32 rel = (s32)((long)op->optinsn.insn -
((long)op->kp.addr + RELATIVEJUMP_SIZE));
/*
* text_poke_smp doesn't support NMI/MCE code modifying.
* However, since kprobes itself also doesn't support NMI/MCE
* code probing, it's not a problem.
*/
text_poke_smp_batch(jump_poke_params, c);
WARN_ON(kprobe_disabled(&op->kp));
/* Backup instructions which will be replaced by jump address */
memcpy(op->optinsn.copied_insn, op->kp.addr + INT3_SIZE,
RELATIVE_ADDR_SIZE);
insn_buf[0] = RELATIVEJUMP_OPCODE;
*(s32 *)(&insn_buf[1]) = rel;
text_poke_bp(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE,
op->optinsn.insn);
list_del_init(&op->list);
}
}
static void __kprobes setup_unoptimize_kprobe(struct text_poke_param *tprm,
u8 *insn_buf,
struct optimized_kprobe *op)
/* Replace a relative jump with a breakpoint (int3). */
void __kprobes arch_unoptimize_kprobe(struct optimized_kprobe *op)
{
u8 insn_buf[RELATIVEJUMP_SIZE];
/* Set int3 to first byte for kprobes */
insn_buf[0] = BREAKPOINT_INSTRUCTION;
memcpy(insn_buf + 1, op->optinsn.copied_insn, RELATIVE_ADDR_SIZE);
tprm->addr = op->kp.addr;
tprm->opcode = insn_buf;
tprm->len = RELATIVEJUMP_SIZE;
text_poke_bp(op->kp.addr, insn_buf, RELATIVEJUMP_SIZE,
op->optinsn.insn);
}
/*
@ -444,34 +420,11 @@ extern void arch_unoptimize_kprobes(struct list_head *oplist,
struct list_head *done_list)
{
struct optimized_kprobe *op, *tmp;
int c = 0;
list_for_each_entry_safe(op, tmp, oplist, list) {
/* Setup param */
setup_unoptimize_kprobe(&jump_poke_params[c],
jump_poke_bufs[c].buf, op);
arch_unoptimize_kprobe(op);
list_move(&op->list, done_list);
if (++c >= MAX_OPTIMIZE_PROBES)
break;
}
/*
* text_poke_smp doesn't support NMI/MCE code modifying.
* However, since kprobes itself also doesn't support NMI/MCE
* code probing, it's not a problem.
*/
text_poke_smp_batch(jump_poke_params, c);
}
/* Replace a relative jump with a breakpoint (int3). */
void __kprobes arch_unoptimize_kprobe(struct optimized_kprobe *op)
{
u8 buf[RELATIVEJUMP_SIZE];
/* Set int3 to first byte for kprobes */
buf[0] = BREAKPOINT_INSTRUCTION;
memcpy(buf + 1, op->optinsn.copied_insn, RELATIVE_ADDR_SIZE);
text_poke_smp(op->kp.addr, buf, RELATIVEJUMP_SIZE);
}
int __kprobes
@ -491,22 +444,3 @@ setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter)
}
return 0;
}
int __kprobes arch_init_optprobes(void)
{
/* Allocate code buffer and parameter array */
jump_poke_bufs = kmalloc(sizeof(struct jump_poke_buffer) *
MAX_OPTIMIZE_PROBES, GFP_KERNEL);
if (!jump_poke_bufs)
return -ENOMEM;
jump_poke_params = kmalloc(sizeof(struct text_poke_param) *
MAX_OPTIMIZE_PROBES, GFP_KERNEL);
if (!jump_poke_params) {
kfree(jump_poke_bufs);
jump_poke_bufs = NULL;
return -ENOMEM;
}
return 0;
}


@ -58,6 +58,7 @@
#include <asm/mce.h>
#include <asm/fixmap.h>
#include <asm/mach_traps.h>
#include <asm/alternative.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
@ -327,6 +328,9 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
ftrace_int3_handler(regs))
return;
#endif
if (poke_int3_handler(regs))
return;
prev_state = exception_enter();
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,


@ -89,6 +89,12 @@ int check_tsc_unstable(void)
}
EXPORT_SYMBOL_GPL(check_tsc_unstable);
int check_tsc_disabled(void)
{
return tsc_disabled;
}
EXPORT_SYMBOL_GPL(check_tsc_disabled);
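check_tsc_disabled() complements check_tsc_unstable() above; the export exists for callers outside tsc.c, and the natural consumer in this series is the x86 perf code, which should not advertise the TSC-based cap_usr_time_zero capability (see the perf_event.h changes below) when the TSC has been disabled.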
#ifdef CONFIG_X86_TSC
int __init notsc_setup(char *str)
{


@ -63,30 +63,6 @@ struct perf_raw_record {
void *data;
};
/*
* single taken branch record layout:
*
* from: source instruction (may not always be a branch insn)
* to: branch target
* mispred: branch target was mispredicted
* predicted: branch target was predicted
*
* support for mispred, predicted is optional. In case it
* is not supported mispred = predicted = 0.
*
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
*/
struct perf_branch_entry {
__u64 from;
__u64 to;
__u64 mispred:1, /* target mispredicted */
predicted:1,/* target predicted */
in_tx:1, /* in transaction */
abort:1, /* transaction abort */
reserved:60;
};
/*
* branch stack layout:
* nr: number of taken branches stored in entries[]


@ -1034,6 +1034,9 @@ struct task_struct {
#ifdef CONFIG_SMP
struct llist_node wake_entry;
int on_cpu;
struct task_struct *last_wakee;
unsigned long wakee_flips;
unsigned long wakee_flip_decay_ts;
#endif
int on_rq;


@ -57,7 +57,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
TP_PROTO(struct task_struct *p, int success),
TP_ARGS(p, success),
TP_ARGS(__perf_task(p), success),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@ -73,9 +73,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
__entry->prio = p->prio;
__entry->success = success;
__entry->target_cpu = task_cpu(p);
)
TP_perf_assign(
__perf_task(p);
),
TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
@ -313,7 +310,7 @@ DECLARE_EVENT_CLASS(sched_stat_template,
TP_PROTO(struct task_struct *tsk, u64 delay),
TP_ARGS(tsk, delay),
TP_ARGS(__perf_task(tsk), __perf_count(delay)),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@ -325,10 +322,6 @@ DECLARE_EVENT_CLASS(sched_stat_template,
memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
__entry->pid = tsk->pid;
__entry->delay = delay;
)
TP_perf_assign(
__perf_count(delay);
__perf_task(tsk);
),
TP_printk("comm=%s pid=%d delay=%Lu [ns]",
@ -372,11 +365,11 @@ DEFINE_EVENT(sched_stat_template, sched_stat_blocked,
* Tracepoint for accounting runtime (time the task is executing
* on a CPU).
*/
TRACE_EVENT(sched_stat_runtime,
DECLARE_EVENT_CLASS(sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
TP_ARGS(tsk, runtime, vruntime),
TP_ARGS(tsk, __perf_count(runtime), vruntime),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN )
@ -390,9 +383,6 @@ TRACE_EVENT(sched_stat_runtime,
__entry->pid = tsk->pid;
__entry->runtime = runtime;
__entry->vruntime = vruntime;
)
TP_perf_assign(
__perf_count(runtime);
),
TP_printk("comm=%s pid=%d runtime=%Lu [ns] vruntime=%Lu [ns]",
@ -401,6 +391,10 @@ TRACE_EVENT(sched_stat_runtime,
(unsigned long long)__entry->vruntime)
);
DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
TP_ARGS(tsk, runtime, vruntime));
/*
* Tracepoint for showing priority inheritance modifying a tasks
* priority.


@ -507,8 +507,14 @@ static inline notrace int ftrace_get_offsets_##call( \
#undef TP_fast_assign
#define TP_fast_assign(args...) args
#undef TP_perf_assign
#define TP_perf_assign(args...)
#undef __perf_addr
#define __perf_addr(a) (a)
#undef __perf_count
#define __perf_count(c) (c)
#undef __perf_task
#define __perf_task(t) (t)
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@ -636,16 +642,13 @@ __attribute__((section("_ftrace_events"))) *__event_##call = &event_##call
#define __get_str(field) (char *)__get_dynamic_array(field)
#undef __perf_addr
#define __perf_addr(a) __addr = (a)
#define __perf_addr(a) (__addr = (a))
#undef __perf_count
#define __perf_count(c) __count = (c)
#define __perf_count(c) (__count = (c))
#undef __perf_task
#define __perf_task(t) __task = (t)
#undef TP_perf_assign
#define TP_perf_assign(args...) args
#define __perf_task(t) (__task = (t))
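Turning these helpers into parenthesised assignment expressions is what allows __perf_count() and __perf_task() to sit directly inside TP_ARGS(): in the ftrace expansion above they reduce to the bare argument, while in the perf expansion they additionally capture the value into the local __count/__task variables, which is why the separate TP_perf_assign() block (deleted above) is no longer needed.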
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@ -663,15 +666,20 @@ perf_trace_##call(void *__data, proto) \
int __data_size; \
int rctx; \
\
perf_fetch_caller_regs(&__regs); \
\
__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
\
head = this_cpu_ptr(event_call->perf_events); \
if (__builtin_constant_p(!__task) && !__task && \
hlist_empty(head)) \
return; \
\
__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
sizeof(u64)); \
__entry_size -= sizeof(u32); \
\
entry = (struct ftrace_raw_##call *)perf_trace_buf_prepare( \
__entry_size, event_call->event.type, &__regs, &rctx); \
perf_fetch_caller_regs(&__regs); \
entry = perf_trace_buf_prepare(__entry_size, \
event_call->event.type, &__regs, &rctx); \
if (!entry) \
return; \
\
@ -679,7 +687,6 @@ perf_trace_##call(void *__data, proto) \
\
{ assign; } \
\
head = this_cpu_ptr(event_call->perf_events); \
perf_trace_buf_submit(entry, __entry_size, rctx, __addr, \
__count, &__regs, head, __task); \
}
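This reordering is the trace-event fast-path optimization: the per-CPU perf_events hlist is now fetched first, and when it is empty (and no __perf_task() filter is in play, per the __builtin_constant_p(!__task) test) the function returns before perf_fetch_caller_regs() and perf_trace_buf_prepare() do any work, so an enabled-but-unused tracepoint costs little more than a load and a branch.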


@ -109,6 +109,7 @@ enum perf_sw_ids {
PERF_COUNT_SW_PAGE_FAULTS_MAJ = 6,
PERF_COUNT_SW_ALIGNMENT_FAULTS = 7,
PERF_COUNT_SW_EMULATION_FAULTS = 8,
PERF_COUNT_SW_DUMMY = 9,
PERF_COUNT_SW_MAX, /* non-ABI */
};
@ -134,8 +135,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_STACK_USER = 1U << 13,
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
PERF_SAMPLE_IDENTIFIER = 1U << 16,
PERF_SAMPLE_MAX = 1U << 16, /* non-ABI */
PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
};
/*
@ -275,8 +277,9 @@ struct perf_event_attr {
exclude_callchain_kernel : 1, /* exclude kernel callchains */
exclude_callchain_user : 1, /* exclude user callchains */
mmap2 : 1, /* include mmap with inode data */
__reserved_1 : 41;
__reserved_1 : 40;
union {
__u32 wakeup_events; /* wakeup every n events */
@ -321,6 +324,7 @@ struct perf_event_attr {
#define PERF_EVENT_IOC_PERIOD _IOW('$', 4, __u64)
#define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
#define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
#define PERF_EVENT_IOC_ID _IOR('$', 7, u64 *)
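Userspace usage of the new ioctl is a one-liner; a minimal sketch:

	/* Fetch the ID that tags this event's records, e.g. to
	 * demultiplex samples read from a shared ring buffer. */
	#include <linux/perf_event.h>
	#include <stdint.h>
	#include <sys/ioctl.h>

	static int read_event_id(int fd, uint64_t *id)
	{
		return ioctl(fd, PERF_EVENT_IOC_ID, id); /* 0 on success */
	}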
enum perf_event_ioc_flags {
PERF_IOC_FLAG_GROUP = 1U << 0,
@ -375,9 +379,12 @@ struct perf_event_mmap_page {
__u64 time_running; /* time event on cpu */
union {
__u64 capabilities;
__u64 cap_usr_time : 1,
cap_usr_rdpmc : 1,
cap_____res : 62;
struct {
__u64 cap_usr_time : 1,
cap_usr_rdpmc : 1,
cap_usr_time_zero : 1,
cap_____res : 61;
};
};
/*
@ -418,12 +425,29 @@ struct perf_event_mmap_page {
__u16 time_shift;
__u32 time_mult;
__u64 time_offset;
/*
* If cap_usr_time_zero, the hardware clock (e.g. TSC) can be calculated
* from sample timestamps.
*
* time = timestamp - time_zero;
* quot = time / time_mult;
* rem = time % time_mult;
* cyc = (quot << time_shift) + (rem << time_shift) / time_mult;
*
* And vice versa:
*
* quot = cyc >> time_shift;
* rem = cyc & ((1 << time_shift) - 1);
* timestamp = time_zero + quot * time_mult +
* ((rem * time_mult) >> time_shift);
*/
__u64 time_zero;
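A direct transcription of the formulas in the comment above, for reference (the seqlock protocol for reading the mmap page consistently, and the cap_usr_time_zero check, are omitted for brevity):

	#include <stdint.h>

	/* perf sample timestamp -> TSC cycles */
	static uint64_t perf_time_to_tsc(uint64_t timestamp,
					 uint16_t time_shift,
					 uint32_t time_mult,
					 uint64_t time_zero)
	{
		uint64_t time = timestamp - time_zero;
		uint64_t quot = time / time_mult;
		uint64_t rem  = time % time_mult;

		return (quot << time_shift) +
		       (rem << time_shift) / time_mult;
	}

	/* TSC cycles -> perf sample timestamp */
	static uint64_t perf_tsc_to_time(uint64_t cyc,
					 uint16_t time_shift,
					 uint32_t time_mult,
					 uint64_t time_zero)
	{
		uint64_t quot = cyc >> time_shift;
		uint64_t rem  = cyc & (((uint64_t)1 << time_shift) - 1);

		return time_zero + quot * time_mult +
		       ((rem * time_mult) >> time_shift);
	}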
/*
* Hole for extension of the self monitor capabilities
*/
__u64 __reserved[120]; /* align to 1k */
__u64 __reserved[119]; /* align to 1k */
/*
* Control data for the mmap() data buffer.
@ -471,13 +495,28 @@ enum perf_event_type {
/*
* If perf_event_attr.sample_id_all is set then all event types will
* have the sample_type selected fields related to where/when
* (identity) an event took place (TID, TIME, ID, CPU, STREAM_ID)
* described in PERF_RECORD_SAMPLE below, it will be stashed just after
* the perf_event_header and the fields already present for the existing
* fields, i.e. at the end of the payload. That way a newer perf.data
* file will be supported by older perf tools, with these new optional
* fields being ignored.
* (identity) an event took place (TID, TIME, ID, STREAM_ID, CPU,
* IDENTIFIER) described in PERF_RECORD_SAMPLE below, it will be stashed
* just after the perf_event_header and the fields already present for
* the existing fields, i.e. at the end of the payload. That way a newer
* perf.data file will be supported by older perf tools, with these new
* optional fields being ignored.
*
* struct sample_id {
* { u32 pid, tid; } && PERF_SAMPLE_TID
* { u64 time; } && PERF_SAMPLE_TIME
* { u64 id; } && PERF_SAMPLE_ID
* { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
* { u32 cpu, res; } && PERF_SAMPLE_CPU
* { u64 id; } && PERF_SAMPLE_IDENTIFIER
* } && perf_event_attr::sample_id_all
*
* Note that PERF_SAMPLE_IDENTIFIER duplicates PERF_SAMPLE_ID. The
* advantage of PERF_SAMPLE_IDENTIFIER is that its position is fixed
* relative to header.size.
*/
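In practice the fixed position means a stream parser can pull the ID out of any record without first decoding the full sample layout; a minimal sketch, assuming every event in the buffer was opened with PERF_SAMPLE_IDENTIFIER set (plus sample_id_all for the non-sample records):

	#include <linux/perf_event.h>

	/* For PERF_RECORD_SAMPLE the identifier is the first u64 of the
	 * payload; for every other record type it is the last u64,
	 * because struct sample_id places it at the end. */
	static __u64 record_id(const struct perf_event_header *hdr)
	{
		const char *rec = (const char *)hdr;

		if (hdr->type == PERF_RECORD_SAMPLE)
			return *(const __u64 *)(rec + sizeof(*hdr));

		return *(const __u64 *)(rec + hdr->size - sizeof(__u64));
	}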
/*
* The MMAP events record the PROT_EXEC mappings so that we can
* correlate userspace IPs to code. They have the following structure:
*
@ -498,6 +537,7 @@ enum perf_event_type {
* struct perf_event_header header;
* u64 id;
* u64 lost;
* struct sample_id sample_id;
* };
*/
PERF_RECORD_LOST = 2,
@ -508,6 +548,7 @@ enum perf_event_type {
*
* u32 pid, tid;
* char comm[];
* struct sample_id sample_id;
* };
*/
PERF_RECORD_COMM = 3,
@ -518,6 +559,7 @@ enum perf_event_type {
* u32 pid, ppid;
* u32 tid, ptid;
* u64 time;
* struct sample_id sample_id;
* };
*/
PERF_RECORD_EXIT = 4,
@ -528,6 +570,7 @@ enum perf_event_type {
* u64 time;
* u64 id;
* u64 stream_id;
* struct sample_id sample_id;
* };
*/
PERF_RECORD_THROTTLE = 5,
@ -539,6 +582,7 @@ enum perf_event_type {
* u32 pid, ppid;
* u32 tid, ptid;
* u64 time;
* struct sample_id sample_id;
* };
*/
PERF_RECORD_FORK = 7,
@ -549,6 +593,7 @@ enum perf_event_type {
* u32 pid, tid;
*
* struct read_format values;
* struct sample_id sample_id;
* };
*/
PERF_RECORD_READ = 8,
@ -557,6 +602,13 @@ enum perf_event_type {
* struct {
* struct perf_event_header header;
*
* #
* # Note that PERF_SAMPLE_IDENTIFIER duplicates PERF_SAMPLE_ID.
* # The advantage of PERF_SAMPLE_IDENTIFIER is that its position
* # is fixed relative to header.size.
* #
*
* { u64 id; } && PERF_SAMPLE_IDENTIFIER
* { u64 ip; } && PERF_SAMPLE_IP
* { u32 pid, tid; } && PERF_SAMPLE_TID
* { u64 time; } && PERF_SAMPLE_TIME
@ -596,11 +648,32 @@ enum perf_event_type {
* u64 dyn_size; } && PERF_SAMPLE_STACK_USER
*
* { u64 weight; } && PERF_SAMPLE_WEIGHT
* { u64 data_src; } && PERF_SAMPLE_DATA_SRC
* };
*/
PERF_RECORD_SAMPLE = 9,
/*
* The MMAP2 records are an augmented version of MMAP; they add
* maj, min, ino numbers used to uniquely identify each mapping.
*
* struct {
* struct perf_event_header header;
*
* u32 pid, tid;
* u64 addr;
* u64 len;
* u64 pgoff;
* u32 maj;
* u32 min;
* u64 ino;
* u64 ino_generation;
* char filename[];
* struct sample_id sample_id;
* };
*/
PERF_RECORD_MMAP2 = 10,
PERF_RECORD_MAX, /* non-ABI */
};
@ -685,4 +758,28 @@ union perf_mem_data_src {
#define PERF_MEM_S(a, s) \
(((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
/*
* single taken branch record layout:
*
* from: source instruction (may not always be a branch insn)
* to: branch target
* mispred: branch target was mispredicted
* predicted: branch target was predicted
*
* support for mispred, predicted is optional. In case it
* is not supported mispred = predicted = 0.
*
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
*/
struct perf_branch_entry {
__u64 from;
__u64 to;
__u64 mispred:1, /* target mispredicted */
predicted:1,/* target predicted */
in_tx:1, /* in transaction */
abort:1, /* transaction abort */
reserved:60;
};
#endif /* _UAPI_LINUX_PERF_EVENT_H */


@ -116,6 +116,9 @@ int get_callchain_buffers(void)
err = alloc_callchain_buffers();
exit:
if (err)
atomic_dec(&nr_callchain_events);
mutex_unlock(&callchain_mutex);
return err;


@ -145,6 +145,7 @@ static DEFINE_PER_CPU(atomic_t, perf_branch_stack_events);
static atomic_t nr_mmap_events __read_mostly;
static atomic_t nr_comm_events __read_mostly;
static atomic_t nr_task_events __read_mostly;
static atomic_t nr_freq_events __read_mostly;
static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
@ -872,12 +873,8 @@ static void perf_pmu_rotate_start(struct pmu *pmu)
WARN_ON(!irqs_disabled());
if (list_empty(&cpuctx->rotation_list)) {
int was_empty = list_empty(head);
if (list_empty(&cpuctx->rotation_list))
list_add(&cpuctx->rotation_list, head);
if (was_empty)
tick_nohz_full_kick();
}
}
static void get_ctx(struct perf_event_context *ctx)
@ -1219,6 +1216,9 @@ static void perf_event__id_header_size(struct perf_event *event)
if (sample_type & PERF_SAMPLE_TIME)
size += sizeof(data->time);
if (sample_type & PERF_SAMPLE_IDENTIFIER)
size += sizeof(data->id);
if (sample_type & PERF_SAMPLE_ID)
size += sizeof(data->id);
@ -2715,7 +2715,7 @@ static void perf_adjust_freq_unthr_context(struct perf_event_context *ctx,
hwc = &event->hw;
if (needs_unthr && hwc->interrupts == MAX_INTERRUPTS) {
if (hwc->interrupts == MAX_INTERRUPTS) {
hwc->interrupts = 0;
perf_log_throttle(event, 1);
event->pmu->start(event, 0);
@ -2814,10 +2814,11 @@ done:
#ifdef CONFIG_NO_HZ_FULL
bool perf_event_can_stop_tick(void)
{
if (list_empty(&__get_cpu_var(rotation_list)))
return true;
else
if (atomic_read(&nr_freq_events) ||
__this_cpu_read(perf_throttled_count))
return false;
else
return true;
}
#endif
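With the rotation list no longer tying perf to the tick, the only remaining reasons to keep the tick alive on a nohz-full CPU are frequency-based events (which need the tick for period adjustment) and a pending unthrottle, which is exactly what the rewritten predicate checks; the new nr_freq_events counter is maintained by the account_event()/unaccount_event() helpers introduced elsewhere in this patch.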
@ -3131,35 +3132,62 @@ static void free_event_rcu(struct rcu_head *head)
static void ring_buffer_put(struct ring_buffer *rb);
static void ring_buffer_detach(struct perf_event *event, struct ring_buffer *rb);
static void unaccount_event_cpu(struct perf_event *event, int cpu)
{
if (event->parent)
return;
if (has_branch_stack(event)) {
if (!(event->attach_state & PERF_ATTACH_TASK))
atomic_dec(&per_cpu(perf_branch_stack_events, cpu));
}
if (is_cgroup_event(event))
atomic_dec(&per_cpu(perf_cgroup_events, cpu));
}
static void unaccount_event(struct perf_event *event)
{
if (event->parent)
return;
if (event->attach_state & PERF_ATTACH_TASK)
static_key_slow_dec_deferred(&perf_sched_events);
if (event->attr.mmap || event->attr.mmap_data)
atomic_dec(&nr_mmap_events);
if (event->attr.comm)
atomic_dec(&nr_comm_events);
if (event->attr.task)
atomic_dec(&nr_task_events);
if (event->attr.freq)
atomic_dec(&nr_freq_events);
if (is_cgroup_event(event))
static_key_slow_dec_deferred(&perf_sched_events);
if (has_branch_stack(event))
static_key_slow_dec_deferred(&perf_sched_events);
unaccount_event_cpu(event, event->cpu);
}
static void __free_event(struct perf_event *event)
{
if (!event->parent) {
if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
put_callchain_buffers();
}
if (event->destroy)
event->destroy(event);
if (event->ctx)
put_ctx(event->ctx);
call_rcu(&event->rcu_head, free_event_rcu);
}
static void free_event(struct perf_event *event)
{
irq_work_sync(&event->pending);
if (!event->parent) {
if (event->attach_state & PERF_ATTACH_TASK)
static_key_slow_dec_deferred(&perf_sched_events);
if (event->attr.mmap || event->attr.mmap_data)
atomic_dec(&nr_mmap_events);
if (event->attr.comm)
atomic_dec(&nr_comm_events);
if (event->attr.task)
atomic_dec(&nr_task_events);
if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
put_callchain_buffers();
if (is_cgroup_event(event)) {
atomic_dec(&per_cpu(perf_cgroup_events, event->cpu));
static_key_slow_dec_deferred(&perf_sched_events);
}
if (has_branch_stack(event)) {
static_key_slow_dec_deferred(&perf_sched_events);
/* is system-wide event */
if (!(event->attach_state & PERF_ATTACH_TASK)) {
atomic_dec(&per_cpu(perf_branch_stack_events,
event->cpu));
}
}
}
unaccount_event(event);
if (event->rb) {
struct ring_buffer *rb;
@ -3183,13 +3211,8 @@ static void free_event(struct perf_event *event)
if (is_cgroup_event(event))
perf_detach_cgroup(event);
if (event->destroy)
event->destroy(event);
if (event->ctx)
put_ctx(event->ctx);
call_rcu(&event->rcu_head, free_event_rcu);
__free_event(event);
}
int perf_event_release_kernel(struct perf_event *event)
@ -3547,6 +3570,15 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case PERF_EVENT_IOC_PERIOD:
return perf_event_period(event, (u64 __user *)arg);
case PERF_EVENT_IOC_ID:
{
u64 id = primary_event_id(event);
if (copy_to_user((void __user *)arg, &id, sizeof(id)))
return -EFAULT;
return 0;
}
case PERF_EVENT_IOC_SET_OUTPUT:
{
int ret;
@ -3644,6 +3676,10 @@ void perf_event_update_userpage(struct perf_event *event)
u64 enabled, running, now;
rcu_read_lock();
rb = rcu_dereference(event->rb);
if (!rb)
goto unlock;
/*
* compute total_time_enabled, total_time_running
* based on snapshot values taken when the event
@ -3654,12 +3690,8 @@ void perf_event_update_userpage(struct perf_event *event)
* NMI context
*/
calc_timer_values(event, &now, &enabled, &running);
rb = rcu_dereference(event->rb);
if (!rb)
goto unlock;
userpg = rb->user_page;
/*
* Disable preemption so as to not let the corresponding user-space
* spin too long if we get preempted.
@ -4254,7 +4286,7 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
if (sample_type & PERF_SAMPLE_TIME)
data->time = perf_clock();
if (sample_type & PERF_SAMPLE_ID)
if (sample_type & (PERF_SAMPLE_ID | PERF_SAMPLE_IDENTIFIER))
data->id = primary_event_id(event);
if (sample_type & PERF_SAMPLE_STREAM_ID)
@ -4293,6 +4325,9 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,
if (sample_type & PERF_SAMPLE_CPU)
perf_output_put(handle, data->cpu_entry);
if (sample_type & PERF_SAMPLE_IDENTIFIER)
perf_output_put(handle, data->id);
}
void perf_event__output_id_sample(struct perf_event *event,
@ -4358,7 +4393,8 @@ static void perf_output_read_group(struct perf_output_handle *handle,
list_for_each_entry(sub, &leader->sibling_list, group_entry) {
n = 0;
if (sub != event)
if ((sub != event) &&
(sub->state == PERF_EVENT_STATE_ACTIVE))
sub->pmu->read(sub);
values[n++] = perf_event_count(sub);
@ -4405,6 +4441,9 @@ void perf_output_sample(struct perf_output_handle *handle,
perf_output_put(handle, *header);
if (sample_type & PERF_SAMPLE_IDENTIFIER)
perf_output_put(handle, data->id);
if (sample_type & PERF_SAMPLE_IP)
perf_output_put(handle, data->ip);
@ -4465,20 +4504,6 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
if (!event->attr.watermark) {
int wakeup_events = event->attr.wakeup_events;
if (wakeup_events) {
struct ring_buffer *rb = handle->rb;
int events = local_inc_return(&rb->events);
if (events >= wakeup_events) {
local_sub(wakeup_events, &rb->events);
local_inc(&rb->wakeup);
}
}
}
if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
if (data->br_stack) {
size_t size;
@ -4514,16 +4539,31 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
if (sample_type & PERF_SAMPLE_STACK_USER)
if (sample_type & PERF_SAMPLE_STACK_USER) {
perf_output_sample_ustack(handle,
data->stack_user_size,
data->regs_user.regs);
}
if (sample_type & PERF_SAMPLE_WEIGHT)
perf_output_put(handle, data->weight);
if (sample_type & PERF_SAMPLE_DATA_SRC)
perf_output_put(handle, data->data_src.val);
if (!event->attr.watermark) {
int wakeup_events = event->attr.wakeup_events;
if (wakeup_events) {
struct ring_buffer *rb = handle->rb;
int events = local_inc_return(&rb->events);
if (events >= wakeup_events) {
local_sub(wakeup_events, &rb->events);
local_inc(&rb->wakeup);
}
}
}
}
void perf_prepare_sample(struct perf_event_header *header,
@ -4683,12 +4723,10 @@ perf_event_read_event(struct perf_event *event,
perf_output_end(&handle);
}
typedef int (perf_event_aux_match_cb)(struct perf_event *event, void *data);
typedef void (perf_event_aux_output_cb)(struct perf_event *event, void *data);
static void
perf_event_aux_ctx(struct perf_event_context *ctx,
perf_event_aux_match_cb match,
perf_event_aux_output_cb output,
void *data)
{
@ -4699,15 +4737,12 @@ perf_event_aux_ctx(struct perf_event_context *ctx,
continue;
if (!event_filter_match(event))
continue;
if (match(event, data))
output(event, data);
output(event, data);
}
}
static void
perf_event_aux(perf_event_aux_match_cb match,
perf_event_aux_output_cb output,
void *data,
perf_event_aux(perf_event_aux_output_cb output, void *data,
struct perf_event_context *task_ctx)
{
struct perf_cpu_context *cpuctx;
@ -4720,7 +4755,7 @@ perf_event_aux(perf_event_aux_match_cb match,
cpuctx = get_cpu_ptr(pmu->pmu_cpu_context);
if (cpuctx->unique_pmu != pmu)
goto next;
perf_event_aux_ctx(&cpuctx->ctx, match, output, data);
perf_event_aux_ctx(&cpuctx->ctx, output, data);
if (task_ctx)
goto next;
ctxn = pmu->task_ctx_nr;
@ -4728,14 +4763,14 @@ perf_event_aux(perf_event_aux_match_cb match,
goto next;
ctx = rcu_dereference(current->perf_event_ctxp[ctxn]);
if (ctx)
perf_event_aux_ctx(ctx, match, output, data);
perf_event_aux_ctx(ctx, output, data);
next:
put_cpu_ptr(pmu->pmu_cpu_context);
}
if (task_ctx) {
preempt_disable();
perf_event_aux_ctx(task_ctx, match, output, data);
perf_event_aux_ctx(task_ctx, output, data);
preempt_enable();
}
rcu_read_unlock();
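Folding the match callback into each output function (see perf_event_task_output() and friends below) drops one indirect call per candidate event and shrinks perf_event_aux()'s signature to a single output callback plus its data cookie.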
@ -4744,7 +4779,7 @@ next:
/*
* task tracking -- fork/exit
*
* enabled by: attr.comm | attr.mmap | attr.mmap_data | attr.task
* enabled by: attr.comm | attr.mmap | attr.mmap2 | attr.mmap_data | attr.task
*/
struct perf_task_event {
@ -4762,6 +4797,13 @@ struct perf_task_event {
} event_id;
};
static int perf_event_task_match(struct perf_event *event)
{
return event->attr.comm || event->attr.mmap ||
event->attr.mmap2 || event->attr.mmap_data ||
event->attr.task;
}
static void perf_event_task_output(struct perf_event *event,
void *data)
{
@ -4771,6 +4813,9 @@ static void perf_event_task_output(struct perf_event *event,
struct task_struct *task = task_event->task;
int ret, size = task_event->event_id.header.size;
if (!perf_event_task_match(event))
return;
perf_event_header__init_id(&task_event->event_id.header, &sample, event);
ret = perf_output_begin(&handle, event,
@ -4793,13 +4838,6 @@ out:
task_event->event_id.header.size = size;
}
static int perf_event_task_match(struct perf_event *event,
void *data __maybe_unused)
{
return event->attr.comm || event->attr.mmap ||
event->attr.mmap_data || event->attr.task;
}
static void perf_event_task(struct task_struct *task,
struct perf_event_context *task_ctx,
int new)
@ -4828,8 +4866,7 @@ static void perf_event_task(struct task_struct *task,
},
};
perf_event_aux(perf_event_task_match,
perf_event_task_output,
perf_event_aux(perf_event_task_output,
&task_event,
task_ctx);
}
@ -4856,6 +4893,11 @@ struct perf_comm_event {
} event_id;
};
static int perf_event_comm_match(struct perf_event *event)
{
return event->attr.comm;
}
static void perf_event_comm_output(struct perf_event *event,
void *data)
{
@ -4865,6 +4907,9 @@ static void perf_event_comm_output(struct perf_event *event,
int size = comm_event->event_id.header.size;
int ret;
if (!perf_event_comm_match(event))
return;
perf_event_header__init_id(&comm_event->event_id.header, &sample, event);
ret = perf_output_begin(&handle, event,
comm_event->event_id.header.size);
@ -4886,12 +4931,6 @@ out:
comm_event->event_id.header.size = size;
}
static int perf_event_comm_match(struct perf_event *event,
void *data __maybe_unused)
{
return event->attr.comm;
}
static void perf_event_comm_event(struct perf_comm_event *comm_event)
{
char comm[TASK_COMM_LEN];
@ -4906,8 +4945,7 @@ static void perf_event_comm_event(struct perf_comm_event *comm_event)
comm_event->event_id.header.size = sizeof(comm_event->event_id) + size;
perf_event_aux(perf_event_comm_match,
perf_event_comm_output,
perf_event_aux(perf_event_comm_output,
comm_event,
NULL);
}
@ -4958,6 +4996,9 @@ struct perf_mmap_event {
const char *file_name;
int file_size;
int maj, min;
u64 ino;
u64 ino_generation;
struct {
struct perf_event_header header;
@ -4970,6 +5011,17 @@ struct perf_mmap_event {
} event_id;
};
static int perf_event_mmap_match(struct perf_event *event,
void *data)
{
struct perf_mmap_event *mmap_event = data;
struct vm_area_struct *vma = mmap_event->vma;
int executable = vma->vm_flags & VM_EXEC;
return (!executable && event->attr.mmap_data) ||
(executable && (event->attr.mmap || event->attr.mmap2));
}
static void perf_event_mmap_output(struct perf_event *event,
void *data)
{
@ -4979,6 +5031,16 @@ static void perf_event_mmap_output(struct perf_event *event,
int size = mmap_event->event_id.header.size;
int ret;
if (!perf_event_mmap_match(event, data))
return;
if (event->attr.mmap2) {
mmap_event->event_id.header.type = PERF_RECORD_MMAP2;
mmap_event->event_id.header.size += sizeof(mmap_event->maj);
mmap_event->event_id.header.size += sizeof(mmap_event->min);
mmap_event->event_id.header.size += sizeof(mmap_event->ino);
}
perf_event_header__init_id(&mmap_event->event_id.header, &sample, event);
ret = perf_output_begin(&handle, event,
mmap_event->event_id.header.size);
@ -4989,6 +5051,14 @@ static void perf_event_mmap_output(struct perf_event *event,
mmap_event->event_id.tid = perf_event_tid(event, current);
perf_output_put(&handle, mmap_event->event_id);
if (event->attr.mmap2) {
perf_output_put(&handle, mmap_event->maj);
perf_output_put(&handle, mmap_event->min);
perf_output_put(&handle, mmap_event->ino);
perf_output_put(&handle, mmap_event->ino_generation);
}
__output_copy(&handle, mmap_event->file_name,
mmap_event->file_size);
@ -4999,21 +5069,12 @@ out:
mmap_event->event_id.header.size = size;
}
static int perf_event_mmap_match(struct perf_event *event,
void *data)
{
struct perf_mmap_event *mmap_event = data;
struct vm_area_struct *vma = mmap_event->vma;
int executable = vma->vm_flags & VM_EXEC;
return (!executable && event->attr.mmap_data) ||
(executable && event->attr.mmap);
}
static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
{
struct vm_area_struct *vma = mmap_event->vma;
struct file *file = vma->vm_file;
int maj = 0, min = 0;
u64 ino = 0, gen = 0;
unsigned int size;
char tmp[16];
char *buf = NULL;
@ -5022,6 +5083,8 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
memset(tmp, 0, sizeof(tmp));
if (file) {
struct inode *inode;
dev_t dev;
/*
* d_path works from the end of the rb backwards, so we
* need to add enough zero bytes after the string to handle
@ -5037,6 +5100,13 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
name = strncpy(tmp, "//toolong", sizeof(tmp));
goto got_name;
}
inode = file_inode(vma->vm_file);
dev = inode->i_sb->s_dev;
ino = inode->i_ino;
gen = inode->i_generation;
maj = MAJOR(dev);
min = MINOR(dev);
} else {
if (arch_vma_name(mmap_event->vma)) {
name = strncpy(tmp, arch_vma_name(mmap_event->vma),
@ -5067,14 +5137,17 @@ got_name:
mmap_event->file_name = name;
mmap_event->file_size = size;
mmap_event->maj = maj;
mmap_event->min = min;
mmap_event->ino = ino;
mmap_event->ino_generation = gen;
if (!(vma->vm_flags & VM_EXEC))
mmap_event->event_id.header.misc |= PERF_RECORD_MISC_MMAP_DATA;
mmap_event->event_id.header.size = sizeof(mmap_event->event_id) + size;
perf_event_aux(perf_event_mmap_match,
perf_event_mmap_output,
perf_event_aux(perf_event_mmap_output,
mmap_event,
NULL);
@ -5104,6 +5177,10 @@ void perf_event_mmap(struct vm_area_struct *vma)
.len = vma->vm_end - vma->vm_start,
.pgoff = (u64)vma->vm_pgoff << PAGE_SHIFT,
},
/* .maj (attr_mmap2 only) */
/* .min (attr_mmap2 only) */
/* .ino (attr_mmap2 only) */
/* .ino_generation (attr_mmap2 only) */
};
perf_event_mmap_event(&mmap_event);
@ -5181,6 +5258,7 @@ static int __perf_event_overflow(struct perf_event *event,
__this_cpu_inc(perf_throttled_count);
hwc->interrupts = MAX_INTERRUPTS;
perf_log_throttle(event, 0);
tick_nohz_full_kick();
ret = 1;
}
}
@ -6446,6 +6524,44 @@ unlock:
return pmu;
}
static void account_event_cpu(struct perf_event *event, int cpu)
{
if (event->parent)
return;
if (has_branch_stack(event)) {
if (!(event->attach_state & PERF_ATTACH_TASK))
atomic_inc(&per_cpu(perf_branch_stack_events, cpu));
}
if (is_cgroup_event(event))
atomic_inc(&per_cpu(perf_cgroup_events, cpu));
}
static void account_event(struct perf_event *event)
{
if (event->parent)
return;
if (event->attach_state & PERF_ATTACH_TASK)
static_key_slow_inc(&perf_sched_events.key);
if (event->attr.mmap || event->attr.mmap_data)
atomic_inc(&nr_mmap_events);
if (event->attr.comm)
atomic_inc(&nr_comm_events);
if (event->attr.task)
atomic_inc(&nr_task_events);
if (event->attr.freq) {
if (atomic_inc_return(&nr_freq_events) == 1)
tick_nohz_full_kick_all();
}
if (has_branch_stack(event))
static_key_slow_inc(&perf_sched_events.key);
if (is_cgroup_event(event))
static_key_slow_inc(&perf_sched_events.key);
account_event_cpu(event, event->cpu);
}
/*
* Allocate and initialize an event structure
*/
@ -6460,7 +6576,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
struct pmu *pmu;
struct perf_event *event;
struct hw_perf_event *hwc;
long err;
long err = -EINVAL;
if ((unsigned)cpu >= nr_cpu_ids) {
if (!task || cpu != -1)
@ -6543,49 +6659,35 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
* we currently do not support PERF_FORMAT_GROUP on inherited events
*/
if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
goto done;
goto err_ns;
pmu = perf_init_event(event);
done:
err = 0;
if (!pmu)
err = -EINVAL;
else if (IS_ERR(pmu))
goto err_ns;
else if (IS_ERR(pmu)) {
err = PTR_ERR(pmu);
if (err) {
if (event->ns)
put_pid_ns(event->ns);
kfree(event);
return ERR_PTR(err);
goto err_ns;
}
if (!event->parent) {
if (event->attach_state & PERF_ATTACH_TASK)
static_key_slow_inc(&perf_sched_events.key);
if (event->attr.mmap || event->attr.mmap_data)
atomic_inc(&nr_mmap_events);
if (event->attr.comm)
atomic_inc(&nr_comm_events);
if (event->attr.task)
atomic_inc(&nr_task_events);
if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
err = get_callchain_buffers();
if (err) {
free_event(event);
return ERR_PTR(err);
}
}
if (has_branch_stack(event)) {
static_key_slow_inc(&perf_sched_events.key);
if (!(event->attach_state & PERF_ATTACH_TASK))
atomic_inc(&per_cpu(perf_branch_stack_events,
event->cpu));
if (err)
goto err_pmu;
}
}
return event;
err_pmu:
if (event->destroy)
event->destroy(event);
err_ns:
if (event->ns)
put_pid_ns(event->ns);
kfree(event);
return ERR_PTR(err);
}
static int perf_copy_attr(struct perf_event_attr __user *uattr,
@ -6867,17 +6969,14 @@ SYSCALL_DEFINE5(perf_event_open,
if (flags & PERF_FLAG_PID_CGROUP) {
err = perf_cgroup_connect(pid, event, &attr, group_leader);
if (err)
goto err_alloc;
/*
* one more event:
* - that has cgroup constraint on event->cpu
* - that may need work on context switch
*/
atomic_inc(&per_cpu(perf_cgroup_events, event->cpu));
static_key_slow_inc(&perf_sched_events.key);
if (err) {
__free_event(event);
goto err_task;
}
}
account_event(event);
/*
* Special case software events and allow them to be part of
* any hardware group.
@ -7073,6 +7172,8 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
goto err;
}
account_event(event);
ctx = find_get_context(event->pmu, task, cpu);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
@ -7109,6 +7210,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
list_for_each_entry_safe(event, tmp, &src_ctx->event_list,
event_entry) {
perf_remove_from_context(event);
unaccount_event_cpu(event, src_cpu);
put_ctx(src_ctx);
list_add(&event->event_entry, &events);
}
@ -7121,6 +7223,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
list_del(&event->event_entry);
if (event->state >= PERF_EVENT_STATE_OFF)
event->state = PERF_EVENT_STATE_INACTIVE;
account_event_cpu(event, dst_cpu);
perf_install_in_context(dst_ctx, event, dst_cpu);
get_ctx(dst_ctx);
}


@ -5133,18 +5133,23 @@ static void destroy_sched_domains(struct sched_domain *sd, int cpu)
* two cpus are in the same cache domain, see cpus_share_cache().
*/
DEFINE_PER_CPU(struct sched_domain *, sd_llc);
DEFINE_PER_CPU(int, sd_llc_size);
DEFINE_PER_CPU(int, sd_llc_id);
static void update_top_cache_domain(int cpu)
{
struct sched_domain *sd;
int id = cpu;
int size = 1;
sd = highest_flag_domain(cpu, SD_SHARE_PKG_RESOURCES);
if (sd)
if (sd) {
id = cpumask_first(sched_domain_span(sd));
size = cpumask_weight(sched_domain_span(sd));
}
rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
per_cpu(sd_llc_size, cpu) = size;
per_cpu(sd_llc_id, cpu) = id;
}


@ -3018,6 +3018,23 @@ static unsigned long cpu_avg_load_per_task(int cpu)
return 0;
}
static void record_wakee(struct task_struct *p)
{
/*
* Rough decay (wiping) for cost saving: don't worry
* about the boundary; a really active task won't care
* about the loss.
*/
if (jiffies > current->wakee_flip_decay_ts + HZ) {
current->wakee_flips = 0;
current->wakee_flip_decay_ts = jiffies;
}
if (current->last_wakee != p) {
current->last_wakee = p;
current->wakee_flips++;
}
}
static void task_waking_fair(struct task_struct *p)
{
@ -3038,6 +3055,7 @@ static void task_waking_fair(struct task_struct *p)
#endif
se->vruntime -= min_vruntime;
record_wakee(p);
}
#ifdef CONFIG_FAIR_GROUP_SCHED
@ -3156,6 +3174,28 @@ static inline unsigned long effective_load(struct task_group *tg, int cpu,
#endif
static int wake_wide(struct task_struct *p)
{
int factor = this_cpu_read(sd_llc_size);
/*
* wakee_flips measures the switching frequency: a high value can
* mean many distinct wakees or rapid switching. Using the LLC size
* as the factor automatically adjusts the looseness, so a bigger
* node leads to more pulling.
*/
if (p->wakee_flips > factor) {
/*
* The wakee is somewhat hot and needs a certain amount of
* CPU; if the waker is far hotter still, prefer to leave
* the wakee alone.
*/
if (current->wakee_flips > (factor * p->wakee_flips))
return 1;
}
return 0;
}
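Concretely, with an LLC spanning 8 CPUs (factor == 8), a master that has recently flipped between, say, 200 distinct wakees will decline an affine wakeup of a worker whose own wakee_flips is 3, since 200 > 8 * 3: the relationship is one-to-many, and spreading the pair across the machine beats packing them onto one cache domain.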
static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
{
s64 this_load, load;
@ -3165,6 +3205,13 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
unsigned long weight;
int balanced;
/*
* If we wake multiple tasks be careful to not bounce
* ourselves around too much.
*/
if (wake_wide(p))
return 0;
idx = sd->wake_idx;
this_cpu = smp_processor_id();
prev_cpu = task_cpu(p);
@ -4172,47 +4219,48 @@ static void update_blocked_averages(int cpu)
}
/*
* Compute the cpu's hierarchical load factor for each task group.
* Compute the hierarchical load factor for cfs_rq and all its ascendants.
* This needs to be done in a top-down fashion because the load of a child
* group is a fraction of its parent's load.
*/
static int tg_load_down(struct task_group *tg, void *data)
static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
{
unsigned long load;
long cpu = (long)data;
if (!tg->parent) {
load = cpu_rq(cpu)->avg.load_avg_contrib;
} else {
load = tg->parent->cfs_rq[cpu]->h_load;
load = div64_ul(load * tg->se[cpu]->avg.load_avg_contrib,
tg->parent->cfs_rq[cpu]->runnable_load_avg + 1);
}
tg->cfs_rq[cpu]->h_load = load;
return 0;
}
static void update_h_load(long cpu)
{
struct rq *rq = cpu_rq(cpu);
struct rq *rq = rq_of(cfs_rq);
struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
unsigned long now = jiffies;
unsigned long load;
if (rq->h_load_throttle == now)
if (cfs_rq->last_h_load_update == now)
return;
rq->h_load_throttle = now;
cfs_rq->h_load_next = NULL;
for_each_sched_entity(se) {
cfs_rq = cfs_rq_of(se);
cfs_rq->h_load_next = se;
if (cfs_rq->last_h_load_update == now)
break;
}
rcu_read_lock();
walk_tg_tree(tg_load_down, tg_nop, (void *)cpu);
rcu_read_unlock();
if (!se) {
cfs_rq->h_load = rq->avg.load_avg_contrib;
cfs_rq->last_h_load_update = now;
}
while ((se = cfs_rq->h_load_next) != NULL) {
load = cfs_rq->h_load;
load = div64_ul(load * se->avg.load_avg_contrib,
cfs_rq->runnable_load_avg + 1);
cfs_rq = group_cfs_rq(se);
cfs_rq->h_load = load;
cfs_rq->last_h_load_update = now;
}
}
static unsigned long task_h_load(struct task_struct *p)
{
struct cfs_rq *cfs_rq = task_cfs_rq(p);
update_cfs_rq_h_load(cfs_rq);
return div64_ul(p->se.avg.load_avg_contrib * cfs_rq->h_load,
cfs_rq->runnable_load_avg + 1);
}
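The walk implements the recurrence h_load(cfs_rq) = h_load(parent) * load_avg_contrib(se) / (runnable_load_avg(parent) + 1), with h_load at the root being rq->avg.load_avg_contrib, but computes it lazily per cfs_rq (throttled by last_h_load_update) instead of walking every task group on each load-balance pass as tg_load_down() did.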
@ -4221,10 +4269,6 @@ static inline void update_blocked_averages(int cpu)
{
}
static inline void update_h_load(long cpu)
{
}
static unsigned long task_h_load(struct task_struct *p)
{
return p->se.avg.load_avg_contrib;
@ -5114,7 +5158,6 @@ redo:
env.src_rq = busiest;
env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);
update_h_load(env.src_cpu);
more_balance:
local_irq_save(flags);
double_rq_lock(env.dst_rq, busiest);


@ -285,7 +285,6 @@ struct cfs_rq {
/* Required to track per-cpu representation of a task_group */
u32 tg_runnable_contrib;
unsigned long tg_load_contrib;
#endif /* CONFIG_FAIR_GROUP_SCHED */
/*
* h_load = weight * f(tg)
@ -294,6 +293,9 @@ struct cfs_rq {
* this group.
*/
unsigned long h_load;
u64 last_h_load_update;
struct sched_entity *h_load_next;
#endif /* CONFIG_FAIR_GROUP_SCHED */
#endif /* CONFIG_SMP */
#ifdef CONFIG_FAIR_GROUP_SCHED
@ -429,9 +431,6 @@ struct rq {
#ifdef CONFIG_FAIR_GROUP_SCHED
/* list of leaf cfs_rq on this cpu: */
struct list_head leaf_cfs_rq_list;
#ifdef CONFIG_SMP
unsigned long h_load_throttle;
#endif /* CONFIG_SMP */
#endif /* CONFIG_FAIR_GROUP_SCHED */
#ifdef CONFIG_RT_GROUP_SCHED
@ -595,6 +594,7 @@ static inline struct sched_domain *highest_flag_domain(int cpu, int flag)
}
DECLARE_PER_CPU(struct sched_domain *, sd_llc);
DECLARE_PER_CPU(int, sd_llc_size);
DECLARE_PER_CPU(int, sd_llc_id);
struct sched_group_power {


@ -553,14 +553,6 @@ void __init lockup_detector_init(void)
{
set_sample_period();
#ifdef CONFIG_NO_HZ_FULL
if (watchdog_user_enabled) {
watchdog_user_enabled = 0;
pr_warning("Disabled lockup detectors by default for full dynticks\n");
pr_warning("You can reactivate it with 'sysctl -w kernel.watchdog=1'\n");
}
#endif
if (watchdog_user_enabled)
watchdog_enable_all_cpus();
}


@ -3,21 +3,6 @@ include ../../scripts/Makefile.include
CC = $(CROSS_COMPILE)gcc
AR = $(CROSS_COMPILE)ar
# Makefiles suck: This macro sets a default value of $(2) for the
# variable named by $(1), unless the variable has been set by
# environment or command line. This is necessary for CC and AR
# because make sets default values, so the simpler ?= approach
# won't work as expected.
define allow-override
$(if $(or $(findstring environment,$(origin $(1))),\
$(findstring command line,$(origin $(1)))),,\
$(eval $(1) = $(2)))
endef
# Allow setting CC and AR, or setting CROSS_COMPILE as a prefix.
$(call allow-override,CC,$(CROSS_COMPILE)gcc)
$(call allow-override,AR,$(CROSS_COMPILE)ar)
# guard against environment variables
LIB_H=
LIB_OBJS=


@ -39,13 +39,8 @@ bindir_relative = bin
bindir = $(prefix)/$(bindir_relative)
man_dir = $(prefix)/share/man
man_dir_SQ = '$(subst ','\'',$(man_dir))'
html_install = $(prefix)/share/kernelshark/html
html_install_SQ = '$(subst ','\'',$(html_install))'
img_install = $(prefix)/share/kernelshark/html/images
img_install_SQ = '$(subst ','\'',$(img_install))'
export man_dir man_dir_SQ html_install html_install_SQ INSTALL
export img_install img_install_SQ
export man_dir man_dir_SQ INSTALL
export DESTDIR DESTDIR_SQ
# copy a bit from Linux kbuild
@ -65,7 +60,7 @@ ifeq ($(BUILD_SRC),)
ifneq ($(BUILD_OUTPUT),)
define build_output
$(if $(VERBOSE:1=),@)$(MAKE) -C $(BUILD_OUTPUT) \
$(if $(VERBOSE:1=),@)+$(MAKE) -C $(BUILD_OUTPUT) \
BUILD_SRC=$(CURDIR) -f $(CURDIR)/Makefile $1
endef
@ -76,10 +71,7 @@ $(if $(BUILD_OUTPUT),, \
all: sub-make
gui: force
$(call build_output, all_cmd)
$(filter-out gui,$(MAKECMDGOALS)): sub-make
$(MAKECMDGOALS): sub-make
sub-make: force
$(call build_output, $(MAKECMDGOALS))
@ -189,6 +181,7 @@ $(obj)/%.o: $(src)/%.c
$(Q)$(call do_compile)
PEVENT_LIB_OBJS = event-parse.o trace-seq.o parse-filter.o parse-utils.o
PEVENT_LIB_OBJS += kbuffer-parse.o
ALL_OBJS = $(PEVENT_LIB_OBJS)
@ -258,9 +251,6 @@ define check_deps
$(RM) $@.$$$$
endef
$(gui_deps): ks_version.h
$(non_gui_deps): tc_version.h
$(all_deps): .%.d: $(src)/%.c
$(Q)$(call check_deps)
@@ -300,7 +290,7 @@ define do_install
$(INSTALL) $1 '$(DESTDIR_SQ)$2'
endef
install_lib: all_cmd install_plugins install_python
install_lib: all_cmd
$(Q)$(call do_install,$(LIB_FILE),$(bindir_SQ))
install: install_lib

View File

@@ -5450,10 +5450,9 @@ int pevent_register_print_function(struct pevent *pevent,
* If @id is >= 0, then it is used to find the event.
* else @sys_name and @event_name are used.
*/
int pevent_register_event_handler(struct pevent *pevent,
int id, char *sys_name, char *event_name,
pevent_event_handler_func func,
void *context)
int pevent_register_event_handler(struct pevent *pevent, int id,
const char *sys_name, const char *event_name,
pevent_event_handler_func func, void *context)
{
struct event_format *event;
struct event_handler *handle;

View File

@@ -69,6 +69,7 @@ struct trace_seq {
};
void trace_seq_init(struct trace_seq *s);
void trace_seq_reset(struct trace_seq *s);
void trace_seq_destroy(struct trace_seq *s);
extern int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
@@ -399,6 +400,7 @@ struct pevent {
int cpus;
int long_size;
int page_size;
struct cmdline *cmdlines;
struct cmdline_list *cmdlist;
@@ -561,7 +563,8 @@ int pevent_print_num_field(struct trace_seq *s, const char *fmt,
struct event_format *event, const char *name,
struct pevent_record *record, int err);
int pevent_register_event_handler(struct pevent *pevent, int id, char *sys_name, char *event_name,
int pevent_register_event_handler(struct pevent *pevent, int id,
const char *sys_name, const char *event_name,
pevent_event_handler_func func, void *context);
int pevent_register_print_function(struct pevent *pevent,
pevent_func_handler func,
@@ -619,6 +622,16 @@ static inline void pevent_set_long_size(struct pevent *pevent, int long_size)
pevent->long_size = long_size;
}
static inline int pevent_get_page_size(struct pevent *pevent)
{
return pevent->page_size;
}
static inline void pevent_set_page_size(struct pevent *pevent, int _page_size)
{
pevent->page_size = _page_size;
}
static inline int pevent_is_file_bigendian(struct pevent *pevent)
{
return pevent->file_bigendian;

View File

@@ -0,0 +1,732 @@
/*
* Copyright (C) 2009, 2010 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation;
* version 2.1 of the License (not later!)
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "kbuffer.h"
#define MISSING_EVENTS (1 << 31)
#define MISSING_STORED (1 << 30)
#define COMMIT_MASK ((1 << 27) - 1)
enum {
KBUFFER_FL_HOST_BIG_ENDIAN = (1<<0),
KBUFFER_FL_BIG_ENDIAN = (1<<1),
KBUFFER_FL_LONG_8 = (1<<2),
KBUFFER_FL_OLD_FORMAT = (1<<3),
};
#define ENDIAN_MASK (KBUFFER_FL_HOST_BIG_ENDIAN | KBUFFER_FL_BIG_ENDIAN)
/** kbuffer
* @timestamp - timestamp of current event
* @lost_events - # of lost events between this subbuffer and previous
* @flags - special flags of the kbuffer
* @subbuffer - pointer to the sub-buffer page
* @data - pointer to the start of data on the sub-buffer page
* @index - index from @data to the @curr event data
* @curr - offset from @data to the start of current event
* (includes metadata)
* @next - offset from @data to the start of next event
* @size - The size of data on @data
* @start - The offset from @subbuffer where @data lives
*
* @read_4 - Function to read 4 raw bytes (may swap)
* @read_8 - Function to read 8 raw bytes (may swap)
* @read_long - Function to read a long word (4 or 8 bytes with needed swap)
*/
struct kbuffer {
unsigned long long timestamp;
long long lost_events;
unsigned long flags;
void *subbuffer;
void *data;
unsigned int index;
unsigned int curr;
unsigned int next;
unsigned int size;
unsigned int start;
unsigned int (*read_4)(void *ptr);
unsigned long long (*read_8)(void *ptr);
unsigned long long (*read_long)(struct kbuffer *kbuf, void *ptr);
int (*next_event)(struct kbuffer *kbuf);
};
static void *zmalloc(size_t size)
{
return calloc(1, size);
}
static int host_is_bigendian(void)
{
unsigned char str[] = { 0x1, 0x2, 0x3, 0x4 };
unsigned int *ptr;
ptr = (unsigned int *)str;
return *ptr == 0x01020304;
}
static int do_swap(struct kbuffer *kbuf)
{
return ((kbuf->flags & KBUFFER_FL_HOST_BIG_ENDIAN) + kbuf->flags) &
ENDIAN_MASK;
}
static unsigned long long __read_8(void *ptr)
{
unsigned long long data = *(unsigned long long *)ptr;
return data;
}
static unsigned long long __read_8_sw(void *ptr)
{
unsigned long long data = *(unsigned long long *)ptr;
unsigned long long swap;
swap = ((data & 0xffULL) << 56) |
((data & (0xffULL << 8)) << 40) |
((data & (0xffULL << 16)) << 24) |
((data & (0xffULL << 24)) << 8) |
((data & (0xffULL << 32)) >> 8) |
((data & (0xffULL << 40)) >> 24) |
((data & (0xffULL << 48)) >> 40) |
((data & (0xffULL << 56)) >> 56);
return swap;
}
static unsigned int __read_4(void *ptr)
{
unsigned int data = *(unsigned int *)ptr;
return data;
}
static unsigned int __read_4_sw(void *ptr)
{
unsigned int data = *(unsigned int *)ptr;
unsigned int swap;
swap = ((data & 0xffULL) << 24) |
((data & (0xffULL << 8)) << 8) |
((data & (0xffULL << 16)) >> 8) |
((data & (0xffULL << 24)) >> 24);
return swap;
}
static unsigned long long read_8(struct kbuffer *kbuf, void *ptr)
{
return kbuf->read_8(ptr);
}
static unsigned int read_4(struct kbuffer *kbuf, void *ptr)
{
return kbuf->read_4(ptr);
}
static unsigned long long __read_long_8(struct kbuffer *kbuf, void *ptr)
{
return kbuf->read_8(ptr);
}
static unsigned long long __read_long_4(struct kbuffer *kbuf, void *ptr)
{
return kbuf->read_4(ptr);
}
static unsigned long long read_long(struct kbuffer *kbuf, void *ptr)
{
return kbuf->read_long(kbuf, ptr);
}
static int calc_index(struct kbuffer *kbuf, void *ptr)
{
return (unsigned long)ptr - (unsigned long)kbuf->data;
}
static int __next_event(struct kbuffer *kbuf);
/**
* kbuffer_alloc - allocate a new kbuffer
* @size: enum to denote size of word
* @endian: enum to denote endianness
*
* Allocates and returns a new kbuffer.
*/
struct kbuffer *
kbuffer_alloc(enum kbuffer_long_size size, enum kbuffer_endian endian)
{
struct kbuffer *kbuf;
int flags = 0;
switch (size) {
case KBUFFER_LSIZE_4:
break;
case KBUFFER_LSIZE_8:
flags |= KBUFFER_FL_LONG_8;
break;
default:
return NULL;
}
switch (endian) {
case KBUFFER_ENDIAN_LITTLE:
break;
case KBUFFER_ENDIAN_BIG:
flags |= KBUFFER_FL_BIG_ENDIAN;
break;
default:
return NULL;
}
kbuf = zmalloc(sizeof(*kbuf));
if (!kbuf)
return NULL;
kbuf->flags = flags;
if (host_is_bigendian())
kbuf->flags |= KBUFFER_FL_HOST_BIG_ENDIAN;
if (do_swap(kbuf)) {
kbuf->read_8 = __read_8_sw;
kbuf->read_4 = __read_4_sw;
} else {
kbuf->read_8 = __read_8;
kbuf->read_4 = __read_4;
}
if (kbuf->flags & KBUFFER_FL_LONG_8)
kbuf->read_long = __read_long_8;
else
kbuf->read_long = __read_long_4;
/* May be changed by kbuffer_set_old_format() */
kbuf->next_event = __next_event;
return kbuf;
}
/** kbuffer_free - free an allocated kbuffer
* @kbuf: The kbuffer to free
*
* Can take NULL as a parameter.
*/
void kbuffer_free(struct kbuffer *kbuf)
{
free(kbuf);
}
static unsigned int type4host(struct kbuffer *kbuf,
unsigned int type_len_ts)
{
if (kbuf->flags & KBUFFER_FL_BIG_ENDIAN)
return (type_len_ts >> 29) & 3;
else
return type_len_ts & 3;
}
static unsigned int len4host(struct kbuffer *kbuf,
unsigned int type_len_ts)
{
if (kbuf->flags & KBUFFER_FL_BIG_ENDIAN)
return (type_len_ts >> 27) & 7;
else
return (type_len_ts >> 2) & 7;
}
static unsigned int type_len4host(struct kbuffer *kbuf,
unsigned int type_len_ts)
{
if (kbuf->flags & KBUFFER_FL_BIG_ENDIAN)
return (type_len_ts >> 27) & ((1 << 5) - 1);
else
return type_len_ts & ((1 << 5) - 1);
}
static unsigned int ts4host(struct kbuffer *kbuf,
unsigned int type_len_ts)
{
if (kbuf->flags & KBUFFER_FL_BIG_ENDIAN)
return type_len_ts & ((1 << 27) - 1);
else
return type_len_ts >> 5;
}
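/*
 * Illustrative sketch (not part of the patch; the helper name is an
 * assumption): the four helpers above all decode one 32-bit event
 * header word, and only the bit positions differ with endianness. On
 * little-endian the layout packs a 5-bit type_len (type in bits 0-1,
 * len in bits 2-4) below a 27-bit timestamp delta, i.e. the inverse of
 * type_len4host()/ts4host():
 */
static unsigned int example_pack_header_le(unsigned int type_len,
					   unsigned int ts_delta)
{
	return (type_len & ((1 << 5) - 1)) | (ts_delta << 5);
}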
/*
* Linux 2.6.30 and earlier (not much earlier) had a different
* ring buffer format. It should be obsolete, but we handle it anyway.
*/
enum old_ring_buffer_type {
OLD_RINGBUF_TYPE_PADDING,
OLD_RINGBUF_TYPE_TIME_EXTEND,
OLD_RINGBUF_TYPE_TIME_STAMP,
OLD_RINGBUF_TYPE_DATA,
};
static unsigned int old_update_pointers(struct kbuffer *kbuf)
{
unsigned long long extend;
unsigned int type_len_ts;
unsigned int type;
unsigned int len;
unsigned int delta;
unsigned int length;
void *ptr = kbuf->data + kbuf->curr;
type_len_ts = read_4(kbuf, ptr);
ptr += 4;
type = type4host(kbuf, type_len_ts);
len = len4host(kbuf, type_len_ts);
delta = ts4host(kbuf, type_len_ts);
switch (type) {
case OLD_RINGBUF_TYPE_PADDING:
kbuf->next = kbuf->size;
return 0;
case OLD_RINGBUF_TYPE_TIME_EXTEND:
extend = read_4(kbuf, ptr);
extend <<= TS_SHIFT;
extend += delta;
delta = extend;
ptr += 4;
break;
case OLD_RINGBUF_TYPE_TIME_STAMP:
/* should never happen! */
kbuf->curr = kbuf->size;
kbuf->next = kbuf->size;
kbuf->index = kbuf->size;
return -1;
default:
if (len)
length = len * 4;
else {
length = read_4(kbuf, ptr);
length -= 4;
ptr += 4;
}
break;
}
kbuf->timestamp += delta;
kbuf->index = calc_index(kbuf, ptr);
kbuf->next = kbuf->index + length;
return type;
}
static int __old_next_event(struct kbuffer *kbuf)
{
int type;
do {
kbuf->curr = kbuf->next;
if (kbuf->next >= kbuf->size)
return -1;
type = old_update_pointers(kbuf);
} while (type == OLD_RINGBUF_TYPE_TIME_EXTEND || type == OLD_RINGBUF_TYPE_PADDING);
return 0;
}
static unsigned int
translate_data(struct kbuffer *kbuf, void *data, void **rptr,
unsigned long long *delta, int *length)
{
unsigned long long extend;
unsigned int type_len_ts;
unsigned int type_len;
type_len_ts = read_4(kbuf, data);
data += 4;
type_len = type_len4host(kbuf, type_len_ts);
*delta = ts4host(kbuf, type_len_ts);
switch (type_len) {
case KBUFFER_TYPE_PADDING:
*length = read_4(kbuf, data);
data += *length;
break;
case KBUFFER_TYPE_TIME_EXTEND:
extend = read_4(kbuf, data);
data += 4;
extend <<= TS_SHIFT;
extend += *delta;
*delta = extend;
*length = 0;
break;
case KBUFFER_TYPE_TIME_STAMP:
data += 12;
*length = 0;
break;
case 0:
*length = read_4(kbuf, data) - 4;
*length = (*length + 3) & ~3;
data += 4;
break;
default:
*length = type_len * 4;
break;
}
*rptr = data;
return type_len;
}
static unsigned int update_pointers(struct kbuffer *kbuf)
{
unsigned long long delta;
unsigned int type_len;
int length;
void *ptr = kbuf->data + kbuf->curr;
type_len = translate_data(kbuf, ptr, &ptr, &delta, &length);
kbuf->timestamp += delta;
kbuf->index = calc_index(kbuf, ptr);
kbuf->next = kbuf->index + length;
return type_len;
}
/**
* kbuffer_translate_data - read raw data to get a record
* @swap: Set to 1 if bytes in words need to be swapped when read
* @data: The raw data to read
* @size: Address to store the size of the event data.
*
* Returns a pointer to the event data. To determine the entire
* record size (record metadata + data) just add the difference between
* @data and the returned value to @size.
*/
void *kbuffer_translate_data(int swap, void *data, unsigned int *size)
{
unsigned long long delta;
struct kbuffer kbuf;
int type_len;
int length;
void *ptr;
if (swap) {
kbuf.read_8 = __read_8_sw;
kbuf.read_4 = __read_4_sw;
kbuf.flags = host_is_bigendian() ? 0 : KBUFFER_FL_BIG_ENDIAN;
} else {
kbuf.read_8 = __read_8;
kbuf.read_4 = __read_4;
kbuf.flags = host_is_bigendian() ? KBUFFER_FL_BIG_ENDIAN: 0;
}
type_len = translate_data(&kbuf, data, &ptr, &delta, &length);
switch (type_len) {
case KBUFFER_TYPE_PADDING:
case KBUFFER_TYPE_TIME_EXTEND:
case KBUFFER_TYPE_TIME_STAMP:
return NULL;
};
*size = length;
return ptr;
}
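/*
 * Illustrative sketch (the wrapper name is an assumption): per the
 * comment above, the metadata length is the distance from @data to the
 * returned payload pointer, so the full record size is that distance
 * plus *size.
 */
static unsigned int example_record_size(int swap, void *data)
{
	unsigned int payload_size;
	void *payload = kbuffer_translate_data(swap, data, &payload_size);

	if (!payload)		/* padding or time extend/stamp record */
		return 0;

	return (unsigned int)((char *)payload - (char *)data) + payload_size;
}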
static int __next_event(struct kbuffer *kbuf)
{
int type;
do {
kbuf->curr = kbuf->next;
if (kbuf->next >= kbuf->size)
return -1;
type = update_pointers(kbuf);
} while (type == KBUFFER_TYPE_TIME_EXTEND || type == KBUFFER_TYPE_PADDING);
return 0;
}
static int next_event(struct kbuffer *kbuf)
{
return kbuf->next_event(kbuf);
}
/**
* kbuffer_next_event - increment the current pointer
* @kbuf: The kbuffer to read
* @ts: Address to store the next record's timestamp (may be NULL to ignore)
*
* Increments the pointers into the subbuffer of the kbuffer to point to the
* next event so that the next kbuffer_read_event() will return a
* new event.
*
* Returns the data of the next event if a new event exists on the subbuffer,
* NULL otherwise.
*/
void *kbuffer_next_event(struct kbuffer *kbuf, unsigned long long *ts)
{
int ret;
if (!kbuf || !kbuf->subbuffer)
return NULL;
ret = next_event(kbuf);
if (ret < 0)
return NULL;
if (ts)
*ts = kbuf->timestamp;
return kbuf->data + kbuf->index;
}
/**
* kbuffer_load_subbuffer - load a new subbuffer into the kbuffer
* @kbuf: The kbuffer to load
* @subbuffer: The subbuffer to load into @kbuf.
*
* Load a new subbuffer (page) into @kbuf. This will reset all
* the pointers and update the @kbuf timestamp. The next read will
* return the first event on @subbuffer.
*
* Returns 0 on success, -1 otherwise.
*/
int kbuffer_load_subbuffer(struct kbuffer *kbuf, void *subbuffer)
{
unsigned long long flags;
void *ptr = subbuffer;
if (!kbuf || !subbuffer)
return -1;
kbuf->subbuffer = subbuffer;
kbuf->timestamp = read_8(kbuf, ptr);
ptr += 8;
kbuf->curr = 0;
if (kbuf->flags & KBUFFER_FL_LONG_8)
kbuf->start = 16;
else
kbuf->start = 12;
kbuf->data = subbuffer + kbuf->start;
flags = read_long(kbuf, ptr);
kbuf->size = (unsigned int)flags & COMMIT_MASK;
if (flags & MISSING_EVENTS) {
if (flags & MISSING_STORED) {
ptr = kbuf->data + kbuf->size;
kbuf->lost_events = read_long(kbuf, ptr);
} else
kbuf->lost_events = -1;
} else
kbuf->lost_events = 0;
kbuf->index = 0;
kbuf->next = 0;
next_event(kbuf);
return 0;
}
/**
* kbuffer_read_event - read the next event in the kbuffer subbuffer
* @kbuf: The kbuffer to read from
* @ts: The address to store the timestamp of the event (may be NULL to ignore)
*
* Returns a pointer to the data part of the current event.
* NULL if no event is left on the subbuffer.
*/
void *kbuffer_read_event(struct kbuffer *kbuf, unsigned long long *ts)
{
if (!kbuf || !kbuf->subbuffer)
return NULL;
if (kbuf->curr >= kbuf->size)
return NULL;
if (ts)
*ts = kbuf->timestamp;
return kbuf->data + kbuf->index;
}
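/*
 * Illustrative usage sketch (the function name is an assumption):
 * together, kbuffer_load_subbuffer(), kbuffer_read_event() and
 * kbuffer_next_event() give the usual iteration pattern over one
 * sub-buffer page.
 */
static void example_walk_subbuffer(struct kbuffer *kbuf, void *page)
{
	unsigned long long ts;
	void *event;

	if (kbuffer_load_subbuffer(kbuf, page) < 0)
		return;

	for (event = kbuffer_read_event(kbuf, &ts); event;
	     event = kbuffer_next_event(kbuf, &ts))
		printf("event at offset %d, ts %llu\n",
		       kbuffer_curr_offset(kbuf), ts);
}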
/**
* kbuffer_timestamp - Return the timestamp of the current event
* @kbuf: The kbuffer to read from
*
* Returns the timestamp of the current (next) event.
*/
unsigned long long kbuffer_timestamp(struct kbuffer *kbuf)
{
return kbuf->timestamp;
}
/**
* kbuffer_read_at_offset - read the event that is at offset
* @kbuf: The kbuffer to read from
* @offset: The offset into the subbuffer
* @ts: The address to store the timestamp of the event (may be NULL to ignore)
*
* The @offset must be an index from the @kbuf subbuffer beginning.
* If @offset is bigger than the stored subbuffer, NULL will be returned.
*
* Returns the data of the record that is at @offset. Note, @offset does
* not need to be the start of the record; the offset just needs to be
* within the record (or at its beginning).
*
* Note, the kbuf timestamp and pointers are updated to the
* returned record. That is, kbuffer_read_event() will return the same
* data and timestamp, and kbuffer_next_event() will increment from
* this record.
*/
void *kbuffer_read_at_offset(struct kbuffer *kbuf, int offset,
unsigned long long *ts)
{
void *data;
if (offset < kbuf->start)
offset = 0;
else
offset -= kbuf->start;
/* Reset the buffer */
kbuffer_load_subbuffer(kbuf, kbuf->subbuffer);
/* Read the first event so @data is never returned uninitialized */
data = kbuffer_read_event(kbuf, ts);
while (kbuf->curr < offset) {
data = kbuffer_next_event(kbuf, ts);
if (!data)
break;
}
return data;
}
/**
* kbuffer_subbuffer_size - the size of the loaded subbuffer
* @kbuf: The kbuffer to read from
*
* Returns the size of the subbuffer. Note, this size is
* where the last event resides. The stored subbuffer may actually be
* bigger due to padding and such.
*/
int kbuffer_subbuffer_size(struct kbuffer *kbuf)
{
return kbuf->size;
}
/**
* kbuffer_curr_index - Return the index of the record
* @kbuf: The kbuffer to read from
*
* Returns the index from the start of the data part of
* the subbuffer to the current location. Note this is not
* from the start of the subbuffer. An index of zero will
* point to the first record. Use kbuffer_curr_offset() for
* the actual offset (which can be used by kbuffer_read_at_offset())
*/
int kbuffer_curr_index(struct kbuffer *kbuf)
{
return kbuf->curr;
}
/**
* kbuffer_curr_offset - Return the offset of the record
* @kbuf: The kbuffer to read from
*
* Returns the offset from the start of the subbuffer to the
* current location.
*/
int kbuffer_curr_offset(struct kbuffer *kbuf)
{
return kbuf->curr + kbuf->start;
}
/**
* kbuffer_event_size - return the size of the event data
* @kbuf: The kbuffer to read
*
* Returns the size of the event data (the payload not counting
* the meta data of the record) of the current event.
*/
int kbuffer_event_size(struct kbuffer *kbuf)
{
return kbuf->next - kbuf->index;
}
/**
* kbuffer_curr_size - return the size of the entire record
* @kbuf: The kbuffer to read
*
* Returns the size of the entire record (meta data and payload)
* of the current event.
*/
int kbuffer_curr_size(struct kbuffer *kbuf)
{
return kbuf->next - kbuf->curr;
}
/**
* kbuffer_missed_events - return the # of missed events from last event.
* @kbuf: The kbuffer to read from
*
* Returns the # of missed events (if recorded) before the current
* event. Note, only events at the beginning of a subbuffer can
* have missed events; all other events within the buffer will
* report zero.
*/
int kbuffer_missed_events(struct kbuffer *kbuf)
{
/* Only the first event can have missed events */
if (kbuf->curr)
return 0;
return kbuf->lost_events;
}
/**
* kbuffer_set_old_format - set the kbuffer to use the old format parsing
* @kbuf: The kbuffer to set
*
* This is obsolete (or should be). The first kernels to use the
* new ring buffer had a slightly different ring buffer format
* (2.6.30 and earlier). It is still somewhat supported by kbuffer,
* but should not be counted on in the future.
*/
void kbuffer_set_old_format(struct kbuffer *kbuf)
{
kbuf->flags |= KBUFFER_FL_OLD_FORMAT;
kbuf->next_event = __old_next_event;
}

View File

@@ -0,0 +1,67 @@
/*
* Copyright (C) 2012 Red Hat Inc, Steven Rostedt <srostedt@redhat.com>
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation;
* version 2.1 of the License (not later!)
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
#ifndef _KBUFFER_H
#define _KBUFFER_H
#ifndef TS_SHIFT
#define TS_SHIFT 27
#endif
enum kbuffer_endian {
KBUFFER_ENDIAN_BIG,
KBUFFER_ENDIAN_LITTLE,
};
enum kbuffer_long_size {
KBUFFER_LSIZE_4,
KBUFFER_LSIZE_8,
};
enum {
KBUFFER_TYPE_PADDING = 29,
KBUFFER_TYPE_TIME_EXTEND = 30,
KBUFFER_TYPE_TIME_STAMP = 31,
};
struct kbuffer;
struct kbuffer *kbuffer_alloc(enum kbuffer_long_size size, enum kbuffer_endian endian);
void kbuffer_free(struct kbuffer *kbuf);
int kbuffer_load_subbuffer(struct kbuffer *kbuf, void *subbuffer);
void *kbuffer_read_event(struct kbuffer *kbuf, unsigned long long *ts);
void *kbuffer_next_event(struct kbuffer *kbuf, unsigned long long *ts);
unsigned long long kbuffer_timestamp(struct kbuffer *kbuf);
void *kbuffer_translate_data(int swap, void *data, unsigned int *size);
void *kbuffer_read_at_offset(struct kbuffer *kbuf, int offset, unsigned long long *ts);
int kbuffer_curr_index(struct kbuffer *kbuf);
int kbuffer_curr_offset(struct kbuffer *kbuf);
int kbuffer_curr_size(struct kbuffer *kbuf);
int kbuffer_event_size(struct kbuffer *kbuf);
int kbuffer_missed_events(struct kbuffer *kbuf);
int kbuffer_subbuffer_size(struct kbuffer *kbuf);
void kbuffer_set_old_format(struct kbuffer *kbuf);
#endif /* _KBUFFER_H */

View File

@@ -48,6 +48,19 @@ void trace_seq_init(struct trace_seq *s)
s->buffer = malloc_or_die(s->buffer_size);
}
/**
* trace_seq_reset - re-initialize the trace_seq structure
* @s: a pointer to the trace_seq structure to reset
*/
void trace_seq_reset(struct trace_seq *s)
{
if (!s)
return;
TRACE_SEQ_CHECK(s);
s->len = 0;
s->readpos = 0;
}
/**
* trace_seq_destroy - free up memory of a trace_seq
* @s: a pointer to the trace_seq to free the buffer

View File

@@ -3,17 +3,17 @@ perf-diff(1)
NAME
----
perf-diff - Read two perf.data files and display the differential profile
perf-diff - Read perf.data files and display the differential profile
SYNOPSIS
--------
[verse]
'perf diff' [oldfile] [newfile]
'perf diff' [baseline file] [data file1] [[data file2] ... ]
DESCRIPTION
-----------
This command displays the performance difference amongst two perf.data files
captured via perf record.
This command displays the performance difference amongst two or more perf.data
files captured via perf record.
If no parameters are passed, it will assume perf.data.old and perf.data.
@@ -75,8 +75,6 @@ OPTIONS
-c::
--compute::
Differential computation selection - delta,ratio,wdiff (default is delta).
If '+' is specified as a first character, the output is sorted based
on the computation results.
See COMPARISON METHODS section for more info.
-p::
@@ -87,6 +85,63 @@ OPTIONS
--formula::
Show formula for given computation.
-o::
--order::
Specify compute sorting column number.
COMPARISON
----------
The comparison is governed by the baseline file. The baseline perf.data
file is iterated for samples. All other perf.data files specified on
the command line are searched for the baseline sample pair. If the pair
is found, the specified computation is made and the result is displayed.
All samples from non-baseline perf.data files that do not match any
baseline entry are displayed with empty space within the baseline column
and possible computation results (delta) in their related column.
Example input files and their samples:
- file A with samples f1, f2, f3, f4, f6
- file B with samples f2, f4, f5
- file C with samples f1, f2, f5
Example output:
x - computation takes place for pair
b - baseline sample percentage
- perf diff A B C
baseline/A compute/B compute/C samples
---------------------------------------
b x f1
b x x f2
b f3
b x f4
b f6
x x f5
- perf diff B A C
baseline/B compute/A compute/C samples
---------------------------------------
b x x f2
b x f4
b x f5
x x f1
x f3
x f6
- perf diff C B A
baseline/C compute/B compute/A samples
---------------------------------------
b x f1
b x x f2
b x f5
x f3
x x f4
x f6
COMPARISON METHODS
------------------
delta
@@ -96,7 +151,7 @@ If specified the 'Delta' column is displayed with value 'd' computed as:
d = A->period_percent - B->period_percent
with:
- A/B being matching hist entry from first/second file specified
- A/B being matching hist entry from data/baseline file specified
(or perf.data/perf.data.old) respectively.
- period_percent being the % of the hist entry period value within
@@ -109,24 +164,26 @@ If specified the 'Ratio' column is displayed with value 'r' computed as:
r = A->period / B->period
with:
- A/B being matching hist entry from first/second file specified
- A/B being matching hist entry from data/baseline file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
wdiff
~~~~~
wdiff:WEIGHT-B,WEIGHT-A
~~~~~~~~~~~~~~~~~~~~~~~
If specified the 'Weighted diff' column is displayed with value 'd' computed as:
d = B->period * WEIGHT-A - A->period * WEIGHT-B
- A/B being matching hist entry from first/second file specified
- A/B being matching hist entry from data/baseline file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
- WEIGHT-A/WEIGHT-B being user-supplied weights in the '-c' option
behind the ':' separator, like '-c wdiff:1,2'.
- WEIGHT-A being the weight of the data file
- WEIGHT-B being the weight of the baseline data file
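As a purely illustrative calculation: with '-c wdiff:1,2' (WEIGHT-B = 1,
WEIGHT-A = 2), B->period = 100 and A->period = 60 give
d = 100 * 2 - 60 * 1 = 140.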
SEE ALSO
--------

View File

@@ -13,6 +13,7 @@ SYNOPSIS
{top|record|report|diff|buildid-list}
'perf kvm' [--host] [--guest] [--guestkallsyms=<path> --guestmodules=<path>
| --guestvmlinux=<path>] {top|record|report|diff|buildid-list|stat}
'perf kvm stat' [record|report|live] [<options>]
DESCRIPTION
-----------
@@ -50,6 +51,10 @@ There are a couple of variants of perf kvm:
'perf kvm stat report' reports statistical data which includes events
handled time, samples, and so on.
'perf kvm stat live' reports statistical data in a live mode (similar to
record + report but with statistical data updated live at a given display
rate).
OPTIONS
-------
-i::
@@ -85,13 +90,50 @@ STAT REPORT OPTIONS
--vcpu=<value>::
analyze events which occur on this vcpu. (default: all vcpus)
--events=<value>::
events to be analyzed. Possible values: vmexit, mmio, ioport.
--event=<value>::
event to be analyzed. Possible values: vmexit, mmio, ioport.
(default: vmexit)
-k::
--key=<value>::
Sorting key. Possible values: sample (default, sort by samples
number), time (sort by average time).
-p::
--pid=::
Analyze events only for given process ID(s) (comma separated list).
STAT LIVE OPTIONS
-----------------
-d::
--display::
Time in seconds between display updates
-m::
--mmap-pages=::
Number of mmap data pages. Must be a power of two.
-a::
--all-cpus::
System-wide collection from all CPUs.
-p::
--pid=::
Analyze events only for given process ID(s) (comma separated list).
--vcpu=<value>::
analyze events which occur on this vcpu. (default: all vcpus)
--event=<value>::
event to be analyzed. Possible values: vmexit, mmio, ioport.
(default: vmexit)
-k::
--key=<value>::
Sorting key. Possible values: sample (default, sort by samples
number), time (sort by average time).
--duration=<value>::
Show events other than HLT that take longer than duration usecs.
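A hypothetical invocation built only from the options documented above,
updating the display every 2 seconds and sorting vmexits by average
handling time:
  perf kvm stat live -d 2 --event=vmexit -k time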
SEE ALSO
--------

View File

@@ -8,7 +8,7 @@ perf-list - List all symbolic event types
SYNOPSIS
--------
[verse]
'perf list' [hw|sw|cache|tracepoint|event_glob]
'perf list' [hw|sw|cache|tracepoint|pmu|event_glob]
DESCRIPTION
-----------
@@ -29,6 +29,8 @@ counted. The following modifiers exist:
G - guest counting (in KVM guests)
H - host counting (not in KVM guests)
p - precise level
S - read sample value (PERF_SAMPLE_READ)
D - pin the event to the PMU
The 'p' modifier can be used for specifying how precise the instruction
address should be. The 'p' modifier can be specified multiple times:
@@ -104,6 +106,8 @@ To limit the list use:
'subsys_glob:event_glob' to filter by tracepoint subsystems such as sched,
block, etc.
. 'pmu' to print the kernel supplied PMU events.
. If none of the above is matched, it will apply the supplied glob to all
events, printing the ones that match.

View File

@@ -115,7 +115,7 @@ OPTIONS
--dump-raw-trace::
Dump raw trace in ASCII.
-g [type,min[,limit],order]::
-g [type,min[,limit],order[,key]]::
--call-graph::
Display call chains using type, min percent threshold, optional print
limit and order.
@@ -129,12 +129,21 @@ OPTIONS
- callee: callee based call graph.
- caller: inverted caller based call graph.
Default: fractal,0.5,callee.
key can be:
- function: compare on functions
- address: compare on individual code addresses
Default: fractal,0.5,callee,function.
-G::
--inverted::
alias for inverted caller based call graph.
--ignore-callees=<regex>::
Ignore callees of the function(s) matching the given regex.
This has the effect of collecting the callers of each such
function into one place in the call-graph tree.
--pretty=<key>::
Pretty printing style. key: normal, raw

View File

@@ -132,6 +132,11 @@ is a useful mode to detect imbalance between physical cores. To enable this mod
use --per-core in addition to -a. (system-wide). The output includes the
core number and the number of online logical processors on that physical processor.
-D msecs::
--initial-delay msecs::
After starting the program, wait msecs before measuring. This is useful to
filter out the startup phase of the program, which is often very different.
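For example (the workload name is illustrative), to skip the first
second of startup before counting:
  perf stat -D 1000 ./my_workload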
EXAMPLES
--------

View File

@@ -155,6 +155,11 @@ Default is to monitor all CPUS.
Default: fractal,0.5,callee.
--ignore-callees=<regex>::
Ignore callees of the function(s) matching the given regex.
This has the effect of collecting the callers of each such
function into one place in the call-graph tree.
--percent-limit::
Do not show entries which have an overhead under that percent.
(Default: 0).

View File

@@ -23,25 +23,45 @@ analysis phases.
OPTIONS
-------
-a::
--all-cpus::
System-wide collection from all CPUs.
-e::
--expr::
List of events to show, currently only syscall names.
Prefixing with ! shows all syscalls but the ones specified. You may
need to escape it.
-o::
--output=::
Output file name.
-p::
--pid=::
Record events on existing process ID (comma separated list).
-t::
--tid=::
Record events on existing thread ID (comma separated list).
-u::
--uid=::
Record events in threads owned by uid. Name or number.
-v::
--verbose=::
Verbosity level.
-i::
--no-inherit::
Child tasks do not inherit counters.
-m::
--mmap-pages=::
Number of mmap data pages. Must be a power of two.
-C::
--cpu::
Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a
comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
@@ -54,6 +74,10 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
--sched::
Accrue thread runtime and provide a summary at the end of the session.
-i::
--input::
Process events from a given perf data file.
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script[1]

View File

@@ -124,7 +124,7 @@ strip-libs = $(filter-out -l%,$(1))
ifneq ($(OUTPUT),)
TE_PATH=$(OUTPUT)
ifneq ($(subdir),)
LK_PATH=$(objtree)/lib/lk/
LK_PATH=$(OUTPUT)/../lib/lk/
else
LK_PATH=$(OUTPUT)
endif
@@ -281,7 +281,7 @@ LIB_H += util/cpumap.h
LIB_H += util/top.h
LIB_H += $(ARCH_INCLUDE)
LIB_H += util/cgroup.h
LIB_H += $(TRACE_EVENT_DIR)event-parse.h
LIB_H += $(LIB_INCLUDE)traceevent/event-parse.h
LIB_H += util/target.h
LIB_H += util/rblist.h
LIB_H += util/intlist.h
@@ -360,6 +360,7 @@ LIB_OBJS += $(OUTPUT)util/rblist.o
LIB_OBJS += $(OUTPUT)util/intlist.o
LIB_OBJS += $(OUTPUT)util/vdso.o
LIB_OBJS += $(OUTPUT)util/stat.o
LIB_OBJS += $(OUTPUT)util/record.o
LIB_OBJS += $(OUTPUT)ui/setup.o
LIB_OBJS += $(OUTPUT)ui/helpline.o
@@ -389,6 +390,10 @@ LIB_OBJS += $(OUTPUT)tests/bp_signal.o
LIB_OBJS += $(OUTPUT)tests/bp_signal_overflow.o
LIB_OBJS += $(OUTPUT)tests/task-exit.o
LIB_OBJS += $(OUTPUT)tests/sw-clock.o
ifeq ($(ARCH),x86)
LIB_OBJS += $(OUTPUT)tests/perf-time-to-tsc.o
endif
LIB_OBJS += $(OUTPUT)tests/code-reading.o
BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
@@ -434,6 +439,7 @@ PERFLIBS = $(LIB_FILE) $(LIBLK) $(LIBTRACEEVENT)
ifneq ($(OUTPUT),)
CFLAGS += -I$(OUTPUT)
endif
LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
ifdef NO_LIBELF
EXTLIBS := $(filter-out -lelf,$(EXTLIBS))
@@ -459,6 +465,7 @@ endif # NO_LIBELF
ifndef NO_LIBUNWIND
LIB_OBJS += $(OUTPUT)util/unwind.o
endif
LIB_OBJS += $(OUTPUT)tests/keep-tracking.o
ifndef NO_LIBAUDIT
BUILTIN_OBJS += $(OUTPUT)builtin-trace.o
@@ -631,10 +638,10 @@ $(OUTPUT)util/parse-events.o: util/parse-events.c $(OUTPUT)PERF-CFLAGS
$(QUIET_CC)$(CC) -o $@ -c $(CFLAGS) -Wno-redundant-decls $<
$(OUTPUT)util/scripting-engines/trace-event-perl.o: util/scripting-engines/trace-event-perl.c $(OUTPUT)PERF-CFLAGS
$(QUIET_CC)$(CC) -o $@ -c $(CFLAGS) $(PERL_EMBED_CCOPTS) -Wno-redundant-decls -Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow $<
$(QUIET_CC)$(CC) -o $@ -c $(CFLAGS) $(PERL_EMBED_CCOPTS) -Wno-redundant-decls -Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow -Wno-undef -Wno-switch-default $<
$(OUTPUT)scripts/perl/Perf-Trace-Util/Context.o: scripts/perl/Perf-Trace-Util/Context.c $(OUTPUT)PERF-CFLAGS
$(QUIET_CC)$(CC) -o $@ -c $(CFLAGS) $(PERL_EMBED_CCOPTS) -Wno-redundant-decls -Wno-strict-prototypes -Wno-unused-parameter -Wno-nested-externs $<
$(QUIET_CC)$(CC) -o $@ -c $(CFLAGS) $(PERL_EMBED_CCOPTS) -Wno-redundant-decls -Wno-strict-prototypes -Wno-unused-parameter -Wno-nested-externs -Wno-undef -Wno-switch-default $<
$(OUTPUT)util/scripting-engines/trace-event-python.o: util/scripting-engines/trace-event-python.c $(OUTPUT)PERF-CFLAGS
$(QUIET_CC)$(CC) -o $@ -c $(CFLAGS) $(PYTHON_EMBED_CCOPTS) -Wno-redundant-decls -Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow $<
@@ -762,17 +769,21 @@ check: $(OUTPUT)common-cmds.h
install-bin: all
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(bindir_SQ)'
$(INSTALL) $(OUTPUT)perf '$(DESTDIR_SQ)$(bindir_SQ)'
$(INSTALL) $(OUTPUT)perf-archive -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
ifndef NO_LIBPERL
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/Perf-Trace-Util/lib/Perf/Trace'
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/bin'
$(INSTALL) $(OUTPUT)perf-archive -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)'
$(INSTALL) scripts/perl/Perf-Trace-Util/lib/Perf/Trace/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/Perf-Trace-Util/lib/Perf/Trace'
$(INSTALL) scripts/perl/*.pl -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl'
$(INSTALL) scripts/perl/bin/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/bin'
endif
ifndef NO_LIBPYTHON
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/Perf-Trace-Util/lib/Perf/Trace'
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/bin'
$(INSTALL) scripts/python/Perf-Trace-Util/lib/Perf/Trace/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/Perf-Trace-Util/lib/Perf/Trace'
$(INSTALL) scripts/python/*.py -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python'
$(INSTALL) scripts/python/bin/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/python/bin'
endif
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(sysconfdir_SQ)/bash_completion.d'
$(INSTALL) bash_completion '$(DESTDIR_SQ)$(sysconfdir_SQ)/bash_completion.d/perf'
$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests'

View File

@@ -6,3 +6,5 @@ ifndef NO_LIBUNWIND
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind.o
endif
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/header.o
LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/tsc.o
LIB_H += arch/$(ARCH)/util/tsc.h

View File

@@ -0,0 +1,59 @@
#include <stdbool.h>
#include <errno.h>
#include <linux/perf_event.h>
#include "../../perf.h"
#include "../../util/types.h"
#include "../../util/debug.h"
#include "tsc.h"
u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc)
{
u64 t, quot, rem;
t = ns - tc->time_zero;
quot = t / tc->time_mult;
rem = t % tc->time_mult;
return (quot << tc->time_shift) +
(rem << tc->time_shift) / tc->time_mult;
}
u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc)
{
u64 quot, rem;
quot = cyc >> tc->time_shift;
rem = cyc & ((1 << tc->time_shift) - 1);
return tc->time_zero + quot * tc->time_mult +
((rem * tc->time_mult) >> tc->time_shift);
}
int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
struct perf_tsc_conversion *tc)
{
bool cap_usr_time_zero;
u32 seq;
int i = 0;
while (1) {
seq = pc->lock;
rmb();
tc->time_mult = pc->time_mult;
tc->time_shift = pc->time_shift;
tc->time_zero = pc->time_zero;
cap_usr_time_zero = pc->cap_usr_time_zero;
rmb();
if (pc->lock == seq && !(seq & 1))
break;
if (++i > 10000) {
pr_debug("failed to get perf_event_mmap_page lock\n");
return -EINVAL;
}
}
if (!cap_usr_time_zero)
return -EOPNOTSUPP;
return 0;
}

View File

@@ -0,0 +1,20 @@
#ifndef TOOLS_PERF_ARCH_X86_UTIL_TSC_H__
#define TOOLS_PERF_ARCH_X86_UTIL_TSC_H__
#include "../../util/types.h"
struct perf_tsc_conversion {
u16 time_shift;
u32 time_mult;
u64 time_zero;
};
struct perf_event_mmap_page;
int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
struct perf_tsc_conversion *tc);
u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc);
u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc);
#endif /* TOOLS_PERF_ARCH_X86_UTIL_TSC_H__ */
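/*
 * Illustrative usage sketch (the wrapper is an assumption): read the
 * conversion parameters from a mapped perf_event_mmap_page, then turn a
 * sample timestamp into a TSC value with the API declared above.
 */
static int example_sample_time_to_tsc(const struct perf_event_mmap_page *pc,
				      u64 sample_time, u64 *tsc)
{
	struct perf_tsc_conversion tc;
	int err;

	err = perf_read_tsc_conversion(pc, &tc);
	if (err)
		return err;	/* e.g. -EOPNOTSUPP without cap_usr_time_zero */

	*tsc = perf_time_to_tsc(sample_time, &tc);
	return 0;
}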

View File

@@ -117,6 +117,8 @@ static void alloc_mem(void **dst, void **src, size_t length)
*src = zalloc(length);
if (!*src)
die("memory allocation failed - maybe length is too large?\n");
/* Make sure to always replace the zero pages even if MMAP_THRESH is crossed */
memset(*src, 0, length);
}
static u64 do_memcpy_cycle(memcpy_t fn, size_t len, bool prefault)

View File

@@ -90,8 +90,7 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_annotate *ann = container_of(tool, struct perf_annotate, tool);
struct addr_location al;
if (perf_event__preprocess_sample(event, machine, &al, sample,
symbol__annotate_init) < 0) {
if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@@ -195,6 +194,8 @@ static int __cmd_annotate(struct perf_annotate *ann)
if (session == NULL)
return -ENOMEM;
machines__set_symbol_filter(&session->machines, symbol__annotate_init);
if (ann->cpu_list) {
ret = perf_session__cpu_bitmap(session, ann->cpu_list,
ann->cpu_bitmap);

View File

@@ -18,15 +18,53 @@
#include "util/util.h"
#include <stdlib.h>
#include <math.h>
static char const *input_old = "perf.data.old",
*input_new = "perf.data";
static char diff__default_sort_order[] = "dso,symbol";
static bool force;
/* Diff command specific HPP columns. */
enum {
PERF_HPP_DIFF__BASELINE,
PERF_HPP_DIFF__PERIOD,
PERF_HPP_DIFF__PERIOD_BASELINE,
PERF_HPP_DIFF__DELTA,
PERF_HPP_DIFF__RATIO,
PERF_HPP_DIFF__WEIGHTED_DIFF,
PERF_HPP_DIFF__FORMULA,
PERF_HPP_DIFF__MAX_INDEX
};
struct diff_hpp_fmt {
struct perf_hpp_fmt fmt;
int idx;
char *header;
int header_width;
};
struct data__file {
struct perf_session *session;
const char *file;
int idx;
struct hists *hists;
struct diff_hpp_fmt fmt[PERF_HPP_DIFF__MAX_INDEX];
};
static struct data__file *data__files;
static int data__files_cnt;
#define data__for_each_file_start(i, d, s) \
for (i = s, d = &data__files[s]; \
i < data__files_cnt; \
i++, d = &data__files[i])
#define data__for_each_file(i, d) data__for_each_file_start(i, d, 0)
#define data__for_each_file_new(i, d) data__for_each_file_start(i, d, 1)
static char diff__default_sort_order[] = "dso,symbol";
static bool force;
static bool show_period;
static bool show_formula;
static bool show_baseline_only;
static bool sort_compute;
static unsigned int sort_compute;
static s64 compute_wdiff_w1;
static s64 compute_wdiff_w2;
@@ -46,6 +84,47 @@ const char *compute_names[COMPUTE_MAX] = {
static int compute;
static int compute_2_hpp[COMPUTE_MAX] = {
[COMPUTE_DELTA] = PERF_HPP_DIFF__DELTA,
[COMPUTE_RATIO] = PERF_HPP_DIFF__RATIO,
[COMPUTE_WEIGHTED_DIFF] = PERF_HPP_DIFF__WEIGHTED_DIFF,
};
#define MAX_COL_WIDTH 70
static struct header_column {
const char *name;
int width;
} columns[PERF_HPP_DIFF__MAX_INDEX] = {
[PERF_HPP_DIFF__BASELINE] = {
.name = "Baseline",
},
[PERF_HPP_DIFF__PERIOD] = {
.name = "Period",
.width = 14,
},
[PERF_HPP_DIFF__PERIOD_BASELINE] = {
.name = "Base period",
.width = 14,
},
[PERF_HPP_DIFF__DELTA] = {
.name = "Delta",
.width = 7,
},
[PERF_HPP_DIFF__RATIO] = {
.name = "Ratio",
.width = 14,
},
[PERF_HPP_DIFF__WEIGHTED_DIFF] = {
.name = "Weighted diff",
.width = 14,
},
[PERF_HPP_DIFF__FORMULA] = {
.name = "Formula",
.width = MAX_COL_WIDTH,
}
};
static int setup_compute_opt_wdiff(char *opt)
{
char *w1_str = opt;
@@ -109,13 +188,6 @@ static int setup_compute(const struct option *opt, const char *str,
return 0;
}
if (*str == '+') {
sort_compute = true;
cstr = (char *) ++str;
if (!*str)
return 0;
}
option = strchr(str, ':');
if (option) {
unsigned len = option++ - str;
@@ -145,42 +217,42 @@ static int setup_compute(const struct option *opt, const char *str,
return -EINVAL;
}
double perf_diff__period_percent(struct hist_entry *he, u64 period)
static double period_percent(struct hist_entry *he, u64 period)
{
u64 total = he->hists->stats.total_period;
return (period * 100.0) / total;
}
double perf_diff__compute_delta(struct hist_entry *he, struct hist_entry *pair)
static double compute_delta(struct hist_entry *he, struct hist_entry *pair)
{
double new_percent = perf_diff__period_percent(he, he->stat.period);
double old_percent = perf_diff__period_percent(pair, pair->stat.period);
double old_percent = period_percent(he, he->stat.period);
double new_percent = period_percent(pair, pair->stat.period);
he->diff.period_ratio_delta = new_percent - old_percent;
he->diff.computed = true;
return he->diff.period_ratio_delta;
pair->diff.period_ratio_delta = new_percent - old_percent;
pair->diff.computed = true;
return pair->diff.period_ratio_delta;
}
double perf_diff__compute_ratio(struct hist_entry *he, struct hist_entry *pair)
static double compute_ratio(struct hist_entry *he, struct hist_entry *pair)
{
double new_period = he->stat.period;
double old_period = pair->stat.period;
double old_period = he->stat.period ?: 1;
double new_period = pair->stat.period;
he->diff.computed = true;
he->diff.period_ratio = new_period / old_period;
return he->diff.period_ratio;
pair->diff.computed = true;
pair->diff.period_ratio = new_period / old_period;
return pair->diff.period_ratio;
}
s64 perf_diff__compute_wdiff(struct hist_entry *he, struct hist_entry *pair)
static s64 compute_wdiff(struct hist_entry *he, struct hist_entry *pair)
{
u64 new_period = he->stat.period;
u64 old_period = pair->stat.period;
u64 old_period = he->stat.period;
u64 new_period = pair->stat.period;
he->diff.computed = true;
he->diff.wdiff = new_period * compute_wdiff_w2 -
old_period * compute_wdiff_w1;
pair->diff.computed = true;
pair->diff.wdiff = new_period * compute_wdiff_w2 -
old_period * compute_wdiff_w1;
return he->diff.wdiff;
return pair->diff.wdiff;
}
static int formula_delta(struct hist_entry *he, struct hist_entry *pair,
@@ -189,15 +261,15 @@ static int formula_delta(struct hist_entry *he, struct hist_entry *pair,
return scnprintf(buf, size,
"(%" PRIu64 " * 100 / %" PRIu64 ") - "
"(%" PRIu64 " * 100 / %" PRIu64 ")",
he->stat.period, he->hists->stats.total_period,
pair->stat.period, pair->hists->stats.total_period);
pair->stat.period, pair->hists->stats.total_period,
he->stat.period, he->hists->stats.total_period);
}
static int formula_ratio(struct hist_entry *he, struct hist_entry *pair,
char *buf, size_t size)
{
double new_period = he->stat.period;
double old_period = pair->stat.period;
double old_period = he->stat.period;
double new_period = pair->stat.period;
return scnprintf(buf, size, "%.0F / %.0F", new_period, old_period);
}
@@ -205,16 +277,16 @@ static int formula_ratio(struct hist_entry *he, struct hist_entry *pair,
static int formula_wdiff(struct hist_entry *he, struct hist_entry *pair,
char *buf, size_t size)
{
u64 new_period = he->stat.period;
u64 old_period = pair->stat.period;
u64 old_period = he->stat.period;
u64 new_period = pair->stat.period;
return scnprintf(buf, size,
"(%" PRIu64 " * " "%" PRId64 ") - (%" PRIu64 " * " "%" PRId64 ")",
new_period, compute_wdiff_w2, old_period, compute_wdiff_w1);
}
int perf_diff__formula(struct hist_entry *he, struct hist_entry *pair,
char *buf, size_t size)
static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
char *buf, size_t size)
{
switch (compute) {
case COMPUTE_DELTA:
@@ -247,7 +319,7 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
{
struct addr_location al;
if (perf_event__preprocess_sample(event, machine, &al, sample, NULL) < 0) {
if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@@ -299,6 +371,29 @@ static void perf_evlist__collapse_resort(struct perf_evlist *evlist)
}
}
static struct hist_entry*
get_pair_data(struct hist_entry *he, struct data__file *d)
{
if (hist_entry__has_pairs(he)) {
struct hist_entry *pair;
list_for_each_entry(pair, &he->pairs.head, pairs.node)
if (pair->hists == d->hists)
return pair;
}
return NULL;
}
static struct hist_entry*
get_pair_fmt(struct hist_entry *he, struct diff_hpp_fmt *dfmt)
{
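/*
 * dfmt points at &d->fmt[dfmt->idx]; stepping back idx elements lands on
 * &d->fmt[0], letting container_of() recover the enclosing data__file.
 */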
void *ptr = dfmt - dfmt->idx;
struct data__file *d = container_of(ptr, struct data__file, fmt);
return get_pair_data(he, d);
}
static void hists__baseline_only(struct hists *hists)
{
struct rb_root *root;
@@ -333,22 +428,24 @@ static void hists__precompute(struct hists *hists)
next = rb_first(root);
while (next != NULL) {
struct hist_entry *he = rb_entry(next, struct hist_entry, rb_node_in);
struct hist_entry *pair = hist_entry__next_pair(he);
struct hist_entry *he, *pair;
he = rb_entry(next, struct hist_entry, rb_node_in);
next = rb_next(&he->rb_node_in);
pair = get_pair_data(he, &data__files[sort_compute]);
if (!pair)
continue;
switch (compute) {
case COMPUTE_DELTA:
perf_diff__compute_delta(he, pair);
compute_delta(he, pair);
break;
case COMPUTE_RATIO:
perf_diff__compute_ratio(he, pair);
compute_ratio(he, pair);
break;
case COMPUTE_WEIGHTED_DIFF:
perf_diff__compute_wdiff(he, pair);
compute_wdiff(he, pair);
break;
default:
BUG_ON(1);
@@ -367,7 +464,7 @@ static int64_t cmp_doubles(double l, double r)
}
static int64_t
hist_entry__cmp_compute(struct hist_entry *left, struct hist_entry *right,
__hist_entry__cmp_compute(struct hist_entry *left, struct hist_entry *right,
int c)
{
switch (c) {
@@ -399,6 +496,36 @@ hist_entry__cmp_compute(struct hist_entry *left, struct hist_entry *right,
return 0;
}
static int64_t
hist_entry__cmp_compute(struct hist_entry *left, struct hist_entry *right,
int c)
{
bool pairs_left = hist_entry__has_pairs(left);
bool pairs_right = hist_entry__has_pairs(right);
struct hist_entry *p_right, *p_left;
if (!pairs_left && !pairs_right)
return 0;
if (!pairs_left || !pairs_right)
return pairs_left ? -1 : 1;
p_left = get_pair_data(left, &data__files[sort_compute]);
p_right = get_pair_data(right, &data__files[sort_compute]);
if (!p_left && !p_right)
return 0;
if (!p_left || !p_right)
return p_left ? -1 : 1;
/*
* We have 2 entries of the same kind, let's
* make the data comparison.
*/
return __hist_entry__cmp_compute(p_left, p_right, c);
}
static void insert_hist_entry_by_compute(struct rb_root *root,
struct hist_entry *he,
int c)
@@ -448,75 +575,121 @@ static void hists__compute_resort(struct hists *hists)
}
}
static void hists__process(struct hists *old, struct hists *new)
static void hists__process(struct hists *hists)
{
hists__match(new, old);
if (show_baseline_only)
hists__baseline_only(new);
else
hists__link(new, old);
hists__baseline_only(hists);
if (sort_compute) {
hists__precompute(new);
hists__compute_resort(new);
hists__precompute(hists);
hists__compute_resort(hists);
} else {
hists__output_resort(new);
hists__output_resort(hists);
}
hists__fprintf(new, true, 0, 0, 0, stdout);
hists__fprintf(hists, true, 0, 0, 0, stdout);
}
static void data__fprintf(void)
{
struct data__file *d;
int i;
fprintf(stdout, "# Data files:\n");
data__for_each_file(i, d)
fprintf(stdout, "# [%d] %s %s\n",
d->idx, d->file,
!d->idx ? "(Baseline)" : "");
fprintf(stdout, "#\n");
}
static void data_process(void)
{
struct perf_evlist *evlist_base = data__files[0].session->evlist;
struct perf_evsel *evsel_base;
bool first = true;
list_for_each_entry(evsel_base, &evlist_base->entries, node) {
struct data__file *d;
int i;
data__for_each_file_new(i, d) {
struct perf_evlist *evlist = d->session->evlist;
struct perf_evsel *evsel;
evsel = evsel_match(evsel_base, evlist);
if (!evsel)
continue;
d->hists = &evsel->hists;
hists__match(&evsel_base->hists, &evsel->hists);
if (!show_baseline_only)
hists__link(&evsel_base->hists,
&evsel->hists);
}
fprintf(stdout, "%s# Event '%s'\n#\n", first ? "" : "\n",
perf_evsel__name(evsel_base));
first = false;
if (verbose || data__files_cnt > 2)
data__fprintf();
hists__process(&evsel_base->hists);
}
}
static void data__free(struct data__file *d)
{
int col;
for (col = 0; col < PERF_HPP_DIFF__MAX_INDEX; col++) {
struct diff_hpp_fmt *fmt = &d->fmt[col];
free(fmt->header);
}
}
static int __cmd_diff(void)
{
int ret, i;
#define older (session[0])
#define newer (session[1])
struct perf_session *session[2];
struct perf_evlist *evlist_new, *evlist_old;
struct perf_evsel *evsel;
bool first = true;
struct data__file *d;
int ret = -EINVAL, i;
older = perf_session__new(input_old, O_RDONLY, force, false,
&tool);
newer = perf_session__new(input_new, O_RDONLY, force, false,
&tool);
if (session[0] == NULL || session[1] == NULL)
return -ENOMEM;
for (i = 0; i < 2; ++i) {
ret = perf_session__process_events(session[i], &tool);
if (ret)
data__for_each_file(i, d) {
d->session = perf_session__new(d->file, O_RDONLY, force,
false, &tool);
if (!d->session) {
pr_err("Failed to open %s\n", d->file);
ret = -ENOMEM;
goto out_delete;
}
ret = perf_session__process_events(d->session, &tool);
if (ret) {
pr_err("Failed to process %s\n", d->file);
goto out_delete;
}
perf_evlist__collapse_resort(d->session->evlist);
}
evlist_old = older->evlist;
evlist_new = newer->evlist;
data_process();
perf_evlist__collapse_resort(evlist_old);
perf_evlist__collapse_resort(evlist_new);
out_delete:
data__for_each_file(i, d) {
if (d->session)
perf_session__delete(d->session);
list_for_each_entry(evsel, &evlist_new->entries, node) {
struct perf_evsel *evsel_old;
evsel_old = evsel_match(evsel, evlist_old);
if (!evsel_old)
continue;
fprintf(stdout, "%s# Event '%s'\n#\n", first ? "" : "\n",
perf_evsel__name(evsel));
first = false;
hists__process(&evsel_old->hists, &evsel->hists);
data__free(d);
}
out_delete:
for (i = 0; i < 2; ++i)
perf_session__delete(session[i]);
free(data__files);
return ret;
#undef older
#undef newer
}
static const char * const diff_usage[] = {
@@ -555,61 +728,310 @@ static const struct option options[] = {
"columns '.' is reserved."),
OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
"Look for files with symbols relative to this directory"),
OPT_UINTEGER('o', "order", &sort_compute, "Specify compute sorting."),
OPT_END()
};
static void ui_init(void)
static double baseline_percent(struct hist_entry *he)
{
/*
* Display baseline/delta/ratio
* formula/periods columns.
*/
perf_hpp__column_enable(PERF_HPP__BASELINE);
struct hists *hists = he->hists;
return 100.0 * he->stat.period / hists->stats.total_period;
}
switch (compute) {
case COMPUTE_DELTA:
perf_hpp__column_enable(PERF_HPP__DELTA);
static int hpp__color_baseline(struct perf_hpp_fmt *fmt,
struct perf_hpp *hpp, struct hist_entry *he)
{
struct diff_hpp_fmt *dfmt =
container_of(fmt, struct diff_hpp_fmt, fmt);
double percent = baseline_percent(he);
char pfmt[20] = " ";
if (!he->dummy) {
scnprintf(pfmt, 20, "%%%d.2f%%%%", dfmt->header_width - 1);
return percent_color_snprintf(hpp->buf, hpp->size,
pfmt, percent);
} else
return scnprintf(hpp->buf, hpp->size, "%*s",
dfmt->header_width, pfmt);
}
static int hpp__entry_baseline(struct hist_entry *he, char *buf, size_t size)
{
double percent = baseline_percent(he);
const char *fmt = symbol_conf.field_sep ? "%.2f" : "%6.2f%%";
int ret = 0;
if (!he->dummy)
ret = scnprintf(buf, size, fmt, percent);
return ret;
}
static void
hpp__entry_unpair(struct hist_entry *he, int idx, char *buf, size_t size)
{
switch (idx) {
case PERF_HPP_DIFF__PERIOD_BASELINE:
scnprintf(buf, size, "%" PRIu64, he->stat.period);
break;
case COMPUTE_RATIO:
perf_hpp__column_enable(PERF_HPP__RATIO);
default:
break;
case COMPUTE_WEIGHTED_DIFF:
perf_hpp__column_enable(PERF_HPP__WEIGHTED_DIFF);
}
}
static void
hpp__entry_pair(struct hist_entry *he, struct hist_entry *pair,
int idx, char *buf, size_t size)
{
double diff;
double ratio;
s64 wdiff;
switch (idx) {
case PERF_HPP_DIFF__DELTA:
if (pair->diff.computed)
diff = pair->diff.period_ratio_delta;
else
diff = compute_delta(he, pair);
if (fabs(diff) >= 0.01)
scnprintf(buf, size, "%+4.2F%%", diff);
break;
case PERF_HPP_DIFF__RATIO:
/* No point for ratio number if we are dummy.. */
if (he->dummy)
break;
if (pair->diff.computed)
ratio = pair->diff.period_ratio;
else
ratio = compute_ratio(he, pair);
if (ratio > 0.0)
scnprintf(buf, size, "%14.6F", ratio);
break;
case PERF_HPP_DIFF__WEIGHTED_DIFF:
/* No point for wdiff number if we are dummy.. */
if (he->dummy)
break;
if (pair->diff.computed)
wdiff = pair->diff.wdiff;
else
wdiff = compute_wdiff(he, pair);
if (wdiff != 0)
scnprintf(buf, size, "%14ld", wdiff);
break;
case PERF_HPP_DIFF__FORMULA:
formula_fprintf(he, pair, buf, size);
break;
case PERF_HPP_DIFF__PERIOD:
scnprintf(buf, size, "%" PRIu64, pair->stat.period);
break;
default:
BUG_ON(1);
};
}
if (show_formula)
perf_hpp__column_enable(PERF_HPP__FORMULA);
static void
__hpp__entry_global(struct hist_entry *he, struct diff_hpp_fmt *dfmt,
char *buf, size_t size)
{
struct hist_entry *pair = get_pair_fmt(he, dfmt);
int idx = dfmt->idx;
if (show_period) {
perf_hpp__column_enable(PERF_HPP__PERIOD);
perf_hpp__column_enable(PERF_HPP__PERIOD_BASELINE);
/* baseline is special */
if (idx == PERF_HPP_DIFF__BASELINE)
hpp__entry_baseline(he, buf, size);
else {
if (pair)
hpp__entry_pair(he, pair, idx, buf, size);
else
hpp__entry_unpair(he, idx, buf, size);
}
}
static int hpp__entry_global(struct perf_hpp_fmt *_fmt, struct perf_hpp *hpp,
struct hist_entry *he)
{
struct diff_hpp_fmt *dfmt =
container_of(_fmt, struct diff_hpp_fmt, fmt);
char buf[MAX_COL_WIDTH] = " ";
__hpp__entry_global(he, dfmt, buf, MAX_COL_WIDTH);
if (symbol_conf.field_sep)
return scnprintf(hpp->buf, hpp->size, "%s", buf);
else
return scnprintf(hpp->buf, hpp->size, "%*s",
dfmt->header_width, buf);
}
static int hpp__header(struct perf_hpp_fmt *fmt,
struct perf_hpp *hpp)
{
struct diff_hpp_fmt *dfmt =
container_of(fmt, struct diff_hpp_fmt, fmt);
BUG_ON(!dfmt->header);
return scnprintf(hpp->buf, hpp->size, dfmt->header);
}
static int hpp__width(struct perf_hpp_fmt *fmt,
struct perf_hpp *hpp __maybe_unused)
{
struct diff_hpp_fmt *dfmt =
container_of(fmt, struct diff_hpp_fmt, fmt);
BUG_ON(dfmt->header_width <= 0);
return dfmt->header_width;
}
static void init_header(struct data__file *d, struct diff_hpp_fmt *dfmt)
{
#define MAX_HEADER_NAME 100
char buf_indent[MAX_HEADER_NAME];
char buf[MAX_HEADER_NAME];
const char *header = NULL;
int width = 0;
BUG_ON(dfmt->idx >= PERF_HPP_DIFF__MAX_INDEX);
header = columns[dfmt->idx].name;
width = columns[dfmt->idx].width;
/* Only our defined HPP fmts should appear here. */
BUG_ON(!header);
if (data__files_cnt > 2)
scnprintf(buf, MAX_HEADER_NAME, "%s/%d", header, d->idx);
#define NAME (data__files_cnt > 2 ? buf : header)
dfmt->header_width = width;
width = (int) strlen(NAME);
if (dfmt->header_width < width)
dfmt->header_width = width;
scnprintf(buf_indent, MAX_HEADER_NAME, "%*s",
dfmt->header_width, NAME);
dfmt->header = strdup(buf_indent);
#undef MAX_HEADER_NAME
#undef NAME
}
static void data__hpp_register(struct data__file *d, int idx)
{
struct diff_hpp_fmt *dfmt = &d->fmt[idx];
struct perf_hpp_fmt *fmt = &dfmt->fmt;
dfmt->idx = idx;
fmt->header = hpp__header;
fmt->width = hpp__width;
fmt->entry = hpp__entry_global;
/* TODO more colors */
if (idx == PERF_HPP_DIFF__BASELINE)
fmt->color = hpp__color_baseline;
init_header(d, dfmt);
perf_hpp__column_register(fmt);
}
static void ui_init(void)
{
struct data__file *d;
int i;
data__for_each_file(i, d) {
/*
* Baseline or compute related columns:
*
* PERF_HPP_DIFF__BASELINE
* PERF_HPP_DIFF__DELTA
* PERF_HPP_DIFF__RATIO
* PERF_HPP_DIFF__WEIGHTED_DIFF
*/
data__hpp_register(d, i ? compute_2_hpp[compute] :
PERF_HPP_DIFF__BASELINE);
/*
* And the rest:
*
* PERF_HPP_DIFF__FORMULA
* PERF_HPP_DIFF__PERIOD
* PERF_HPP_DIFF__PERIOD_BASELINE
*/
if (show_formula && i)
data__hpp_register(d, PERF_HPP_DIFF__FORMULA);
if (show_period)
data__hpp_register(d, i ? PERF_HPP_DIFF__PERIOD :
PERF_HPP_DIFF__PERIOD_BASELINE);
}
}
static int data_init(int argc, const char **argv)
{
struct data__file *d;
static const char *defaults[] = {
"perf.data.old",
"perf.data",
};
bool use_default = true;
int i;
data__files_cnt = 2;
if (argc) {
if (argc == 1)
defaults[1] = argv[0];
else {
data__files_cnt = argc;
use_default = false;
}
} else if (symbol_conf.default_guest_vmlinux_name ||
symbol_conf.default_guest_kallsyms) {
defaults[0] = "perf.data.host";
defaults[1] = "perf.data.guest";
}
if (sort_compute >= (unsigned int) data__files_cnt) {
pr_err("Order option out of limit.\n");
return -EINVAL;
}
data__files = zalloc(sizeof(*data__files) * data__files_cnt);
if (!data__files)
return -ENOMEM;
data__for_each_file(i, d) {
d->file = use_default ? defaults[i] : argv[i];
d->idx = i;
}
return 0;
}
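
The defaults above boil down to the following invocation patterns (a sketch; the file names are made up):

    perf diff                    # perf.data.old vs perf.data
    perf diff other.data         # perf.data.old vs other.data
    perf diff base.data n1 n2    # base.data as baseline, compared to n1 and n2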
int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)
{
sort_order = diff__default_sort_order;
argc = parse_options(argc, argv, options, diff_usage, 0);
if (argc) {
if (argc > 2)
usage_with_options(diff_usage, options);
if (argc == 2) {
input_old = argv[0];
input_new = argv[1];
} else
input_new = argv[0];
} else if (symbol_conf.default_guest_vmlinux_name ||
symbol_conf.default_guest_kallsyms) {
input_old = "perf.data.host";
input_new = "perf.data.guest";
}
if (symbol__init() < 0)
return -1;
if (data_init(argc, argv) < 0)
return -1;
ui_init();
if (setup_sorting() < 0)

@ -38,8 +38,7 @@ struct event_entry {
};
static int perf_event__repipe_synth(struct perf_tool *tool,
union perf_event *event,
struct machine *machine __maybe_unused)
union perf_event *event)
{
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
uint32_t size;
@ -65,39 +64,28 @@ static int perf_event__repipe_op2_synth(struct perf_tool *tool,
struct perf_session *session
__maybe_unused)
{
return perf_event__repipe_synth(tool, event, NULL);
return perf_event__repipe_synth(tool, event);
}
static int perf_event__repipe_event_type_synth(struct perf_tool *tool,
union perf_event *event)
{
return perf_event__repipe_synth(tool, event, NULL);
}
static int perf_event__repipe_tracing_data_synth(union perf_event *event,
struct perf_session *session
__maybe_unused)
{
return perf_event__repipe_synth(NULL, event, NULL);
}
static int perf_event__repipe_attr(union perf_event *event,
struct perf_evlist **pevlist __maybe_unused)
static int perf_event__repipe_attr(struct perf_tool *tool,
union perf_event *event,
struct perf_evlist **pevlist)
{
int ret;
ret = perf_event__process_attr(event, pevlist);
ret = perf_event__process_attr(tool, event, pevlist);
if (ret)
return ret;
return perf_event__repipe_synth(NULL, event, NULL);
return perf_event__repipe_synth(tool, event);
}
static int perf_event__repipe(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct machine *machine)
struct machine *machine __maybe_unused)
{
return perf_event__repipe_synth(tool, event, machine);
return perf_event__repipe_synth(tool, event);
}
typedef int (*inject_handler)(struct perf_tool *tool,
@ -119,7 +107,7 @@ static int perf_event__repipe_sample(struct perf_tool *tool,
build_id__mark_dso_hit(tool, event, sample, evsel, machine);
return perf_event__repipe_synth(tool, event, machine);
return perf_event__repipe_synth(tool, event);
}
static int perf_event__repipe_mmap(struct perf_tool *tool,
@ -148,13 +136,14 @@ static int perf_event__repipe_fork(struct perf_tool *tool,
return err;
}
static int perf_event__repipe_tracing_data(union perf_event *event,
static int perf_event__repipe_tracing_data(struct perf_tool *tool,
union perf_event *event,
struct perf_session *session)
{
int err;
perf_event__repipe_synth(NULL, event, NULL);
err = perf_event__process_tracing_data(event, session);
perf_event__repipe_synth(tool, event);
err = perf_event__process_tracing_data(tool, event, session);
return err;
}
@ -209,7 +198,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
thread = machine__findnew_thread(machine, event->ip.pid);
thread = machine__findnew_thread(machine, sample->pid, sample->pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
event->header.type);
@ -217,7 +206,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
}
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
event->ip.ip, &al);
sample->ip, &al);
if (al.map != NULL) {
if (!al.map->dso->hit) {
@ -312,7 +301,9 @@ found:
sample_sw.period = sample->period;
sample_sw.time = sample->time;
perf_event__synthesize_sample(event_sw, evsel->attr.sample_type,
&sample_sw, false);
evsel->attr.sample_regs_user,
evsel->attr.read_format, &sample_sw,
false);
build_id__mark_dso_hit(tool, event_sw, &sample_sw, evsel, machine);
return perf_event__repipe(tool, event_sw, &sample_sw, machine);
}
@ -407,8 +398,8 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
.throttle = perf_event__repipe,
.unthrottle = perf_event__repipe,
.attr = perf_event__repipe_attr,
.event_type = perf_event__repipe_event_type_synth,
.tracing_data = perf_event__repipe_tracing_data_synth,
.tracing_data = perf_event__repipe_op2_synth,
.finished_round = perf_event__repipe_op2_synth,
.build_id = perf_event__repipe_op2_synth,
},
.input_name = "-",

@ -305,7 +305,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
struct thread *thread = machine__findnew_thread(machine, sample->pid,
sample->pid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
@ -313,7 +314,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return -1;
}
dump_printf(" ... thread: %s:%d\n", thread->comm, thread->pid);
dump_printf(" ... thread: %s:%d\n", thread->comm, thread->tid);
if (evsel->handler.func != NULL) {
tracepoint_handler f = evsel->handler.func;

@ -2,22 +2,26 @@
#include "perf.h"
#include "util/evsel.h"
#include "util/evlist.h"
#include "util/util.h"
#include "util/cache.h"
#include "util/symbol.h"
#include "util/thread.h"
#include "util/header.h"
#include "util/session.h"
#include "util/intlist.h"
#include "util/parse-options.h"
#include "util/trace-event.h"
#include "util/debug.h"
#include <lk/debugfs.h>
#include "util/tool.h"
#include "util/stat.h"
#include "util/top.h"
#include <sys/prctl.h>
#include <sys/timerfd.h>
#include <termios.h>
#include <semaphore.h>
#include <pthread.h>
#include <math.h>
@ -82,6 +86,8 @@ struct exit_reasons_table {
struct perf_kvm_stat {
struct perf_tool tool;
struct perf_record_opts opts;
struct perf_evlist *evlist;
struct perf_session *session;
const char *file_name;
@ -96,10 +102,20 @@ struct perf_kvm_stat {
struct kvm_events_ops *events_ops;
key_cmp_fun compare;
struct list_head kvm_events_cache[EVENTS_CACHE_SIZE];
u64 total_time;
u64 total_count;
u64 lost_events;
u64 duration;
const char *pid_str;
struct intlist *pid_list;
struct rb_root result;
int timerfd;
unsigned int display_time;
bool live;
};
@ -320,6 +336,28 @@ static void init_kvm_event_record(struct perf_kvm_stat *kvm)
INIT_LIST_HEAD(&kvm->kvm_events_cache[i]);
}
static void clear_events_cache_stats(struct list_head *kvm_events_cache)
{
struct list_head *head;
struct kvm_event *event;
unsigned int i;
int j;
for (i = 0; i < EVENTS_CACHE_SIZE; i++) {
head = &kvm_events_cache[i];
list_for_each_entry(event, head, hash_entry) {
/* reset stats for event */
event->total.time = 0;
init_stats(&event->total.stats);
for (j = 0; j < event->max_vcpu; ++j) {
event->vcpu[j].time = 0;
init_stats(&event->vcpu[j].stats);
}
}
}
}
static int kvm_events_hash_fn(u64 key)
{
return key & (EVENTS_CACHE_SIZE - 1);
@ -436,7 +474,7 @@ static bool update_kvm_event(struct kvm_event *event, int vcpu_id,
static bool handle_end_event(struct perf_kvm_stat *kvm,
struct vcpu_event_record *vcpu_record,
struct event_key *key,
u64 timestamp)
struct perf_sample *sample)
{
struct kvm_event *event;
u64 time_begin, time_diff;
@ -472,9 +510,25 @@ static bool handle_end_event(struct perf_kvm_stat *kvm,
vcpu_record->last_event = NULL;
vcpu_record->start_time = 0;
BUG_ON(timestamp < time_begin);
/* seems to happen once in a while during live mode */
if (sample->time < time_begin) {
pr_debug("End time before begin time; skipping event.\n");
return true;
}
time_diff = sample->time - time_begin;
if (kvm->duration && time_diff > kvm->duration) {
char decode[32];
kvm->events_ops->decode_key(kvm, &event->key, decode);
if (strcmp(decode, "HLT")) {
pr_info("%" PRIu64 " VM %d, vcpu %d: %s event took %" PRIu64 "usec\n",
sample->time, sample->pid, vcpu_record->vcpu_id,
decode, time_diff/1000);
}
}
time_diff = timestamp - time_begin;
return update_kvm_event(event, vcpu, time_diff);
}
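
To make the duration check concrete: kvm->duration is given in microseconds and converted to nanoseconds at option-parsing time (see kvm_events_live() below), so a hypothetical session such as

    perf kvm stat live --duration 500

would immediately report any non-HLT event whose handling took longer than 500us, along with its timestamp, pid, vcpu and decoded key.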
@ -521,7 +575,7 @@ static bool handle_kvm_event(struct perf_kvm_stat *kvm,
return handle_begin_event(kvm, vcpu_record, &key, sample->time);
if (kvm->events_ops->is_end_event(evsel, sample, &key))
return handle_end_event(kvm, vcpu_record, &key, sample->time);
return handle_end_event(kvm, vcpu_record, &key, sample);
return true;
}
@ -550,6 +604,8 @@ static int compare_kvm_event_ ## func(struct kvm_event *one, \
GET_EVENT_KEY(time, time);
COMPARE_EVENT_KEY(count, stats.n);
COMPARE_EVENT_KEY(mean, stats.mean);
GET_EVENT_KEY(max, stats.max);
GET_EVENT_KEY(min, stats.min);
#define DEF_SORT_NAME_KEY(name, compare_key) \
{ #name, compare_kvm_event_ ## compare_key }
@ -639,43 +695,81 @@ static struct kvm_event *pop_from_result(struct rb_root *result)
return container_of(node, struct kvm_event, rb);
}
static void print_vcpu_info(int vcpu)
static void print_vcpu_info(struct perf_kvm_stat *kvm)
{
int vcpu = kvm->trace_vcpu;
pr_info("Analyze events for ");
if (kvm->live) {
if (kvm->opts.target.system_wide)
pr_info("all VMs, ");
else if (kvm->opts.target.pid)
pr_info("pid(s) %s, ", kvm->opts.target.pid);
else
pr_info("dazed and confused on what is monitored, ");
}
if (vcpu == -1)
pr_info("all VCPUs:\n\n");
else
pr_info("VCPU %d:\n\n", vcpu);
}
static void show_timeofday(void)
{
char date[64];
struct timeval tv;
struct tm ltime;
gettimeofday(&tv, NULL);
if (localtime_r(&tv.tv_sec, &ltime)) {
strftime(date, sizeof(date), "%H:%M:%S", &ltime);
pr_info("%s.%06ld", date, tv.tv_usec);
} else
pr_info("00:00:00.000000");
return;
}
static void print_result(struct perf_kvm_stat *kvm)
{
char decode[20];
struct kvm_event *event;
int vcpu = kvm->trace_vcpu;
if (kvm->live) {
puts(CONSOLE_CLEAR);
show_timeofday();
}
pr_info("\n\n");
print_vcpu_info(vcpu);
print_vcpu_info(kvm);
pr_info("%20s ", kvm->events_ops->name);
pr_info("%10s ", "Samples");
pr_info("%9s ", "Samples%");
pr_info("%9s ", "Time%");
pr_info("%10s ", "Min Time");
pr_info("%10s ", "Max Time");
pr_info("%16s ", "Avg time");
pr_info("\n\n");
while ((event = pop_from_result(&kvm->result))) {
u64 ecount, etime;
u64 ecount, etime, max, min;
ecount = get_event_count(event, vcpu);
etime = get_event_time(event, vcpu);
max = get_event_max(event, vcpu);
min = get_event_min(event, vcpu);
kvm->events_ops->decode_key(kvm, &event->key, decode);
pr_info("%20s ", decode);
pr_info("%10llu ", (unsigned long long)ecount);
pr_info("%8.2f%% ", (double)ecount / kvm->total_count * 100);
pr_info("%8.2f%% ", (double)etime / kvm->total_time * 100);
pr_info("%8" PRIu64 "us ", min / 1000);
pr_info("%8" PRIu64 "us ", max / 1000);
pr_info("%9.2fus ( +-%7.2f%% )", (double)etime / ecount/1e3,
kvm_event_rel_stddev(vcpu, event));
pr_info("\n");
@ -683,6 +777,29 @@ static void print_result(struct perf_kvm_stat *kvm)
pr_info("\nTotal Samples:%" PRIu64 ", Total events handled time:%.2fus.\n\n",
kvm->total_count, kvm->total_time / 1e3);
if (kvm->lost_events)
pr_info("\nLost events: %" PRIu64 "\n\n", kvm->lost_events);
}
static int process_lost_event(struct perf_tool *tool,
union perf_event *event __maybe_unused,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
{
struct perf_kvm_stat *kvm = container_of(tool, struct perf_kvm_stat, tool);
kvm->lost_events++;
return 0;
}
static bool skip_sample(struct perf_kvm_stat *kvm,
struct perf_sample *sample)
{
if (kvm->pid_list && intlist__find(kvm->pid_list, sample->pid) == NULL)
return true;
return false;
}
static int process_sample_event(struct perf_tool *tool,
@ -691,10 +808,14 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_evsel *evsel,
struct machine *machine)
{
struct thread *thread = machine__findnew_thread(machine, sample->tid);
struct thread *thread;
struct perf_kvm_stat *kvm = container_of(tool, struct perf_kvm_stat,
tool);
if (skip_sample(kvm, sample))
return 0;
thread = machine__findnew_thread(machine, sample->pid, sample->tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
event->header.type);
@ -707,10 +828,20 @@ static int process_sample_event(struct perf_tool *tool,
return 0;
}
static int get_cpu_isa(struct perf_session *session)
static int cpu_isa_config(struct perf_kvm_stat *kvm)
{
char *cpuid = session->header.env.cpuid;
int isa;
char buf[64], *cpuid;
int err, isa;
if (kvm->live) {
err = get_cpuid(buf, sizeof(buf));
if (err != 0) {
pr_err("Failed to look up CPU type (Intel or AMD)\n");
return err;
}
cpuid = buf;
} else
cpuid = kvm->session->header.env.cpuid;
if (strstr(cpuid, "Intel"))
isa = 1;
@ -718,10 +849,361 @@ static int get_cpu_isa(struct perf_session *session)
isa = 0;
else {
pr_err("CPU %s is not supported.\n", cpuid);
isa = -ENOTSUP;
return -ENOTSUP;
}
return isa;
if (isa == 1) {
kvm->exit_reasons = vmx_exit_reasons;
kvm->exit_reasons_size = ARRAY_SIZE(vmx_exit_reasons);
kvm->exit_reasons_isa = "VMX";
}
return 0;
}
static bool verify_vcpu(int vcpu)
{
if (vcpu != -1 && vcpu < 0) {
pr_err("Invalid vcpu:%d.\n", vcpu);
return false;
}
return true;
}
/*
 * Keep the maximum number of events handled per mmap at a modest
 * level so that sample processing stays smooth.
 */
#define PERF_KVM__MAX_EVENTS_PER_MMAP 25
static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
u64 *mmap_time)
{
union perf_event *event;
struct perf_sample sample;
s64 n = 0;
int err;
*mmap_time = ULLONG_MAX;
while ((event = perf_evlist__mmap_read(kvm->evlist, idx)) != NULL) {
err = perf_evlist__parse_sample(kvm->evlist, event, &sample);
if (err) {
pr_err("Failed to parse sample\n");
return -1;
}
err = perf_session_queue_event(kvm->session, event, &sample, 0);
if (err) {
pr_err("Failed to enqueue sample: %d\n", err);
return -1;
}
/* save time stamp of our first sample for this mmap */
if (n == 0)
*mmap_time = sample.time;
/* limit the number of events handled from one mmap in a single pass */
n++;
if (n == PERF_KVM__MAX_EVENTS_PER_MMAP)
break;
}
return n;
}
static int perf_kvm__mmap_read(struct perf_kvm_stat *kvm)
{
int i, err, throttled = 0;
s64 n, ntotal = 0;
u64 flush_time = ULLONG_MAX, mmap_time;
for (i = 0; i < kvm->evlist->nr_mmaps; i++) {
n = perf_kvm__mmap_read_idx(kvm, i, &mmap_time);
if (n < 0)
return -1;
/*
 * The flush time is the minimum of all the individual mmap
 * times. Essentially, we flush all samples queued up from the
 * last pass that fall below this minimal start time -- which
 * leaves a very small race for samples to come in with a lower
 * timestamp. An ioctl returning the perf_clock timestamp would
 * close the race entirely.
 */
if (mmap_time < flush_time)
flush_time = mmap_time;
ntotal += n;
if (n == PERF_KVM__MAX_EVENTS_PER_MMAP)
throttled = 1;
}
/* flush queue after each round in which we processed events */
if (ntotal) {
kvm->session->ordered_samples.next_flush = flush_time;
err = kvm->tool.finished_round(&kvm->tool, NULL, kvm->session);
if (err) {
if (kvm->lost_events)
pr_info("\nLost events: %" PRIu64 "\n\n",
kvm->lost_events);
return err;
}
}
return throttled;
}
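
A worked example of the flush computation, with assumed timestamps: if one round reads three mmaps whose first samples carry times 105, 100 and 130, then

    /* flush_time = min(105, 100, 130) = 100: finished_round() delivers
     * the queued samples up to t=100 in sorted order; anything newer
     * stays queued and is flushed on a later round.
     */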
static volatile int done;
static void sig_handler(int sig __maybe_unused)
{
done = 1;
}
static int perf_kvm__timerfd_create(struct perf_kvm_stat *kvm)
{
struct itimerspec new_value;
int rc = -1;
kvm->timerfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);
if (kvm->timerfd < 0) {
pr_err("timerfd_create failed\n");
goto out;
}
new_value.it_value.tv_sec = kvm->display_time;
new_value.it_value.tv_nsec = 0;
new_value.it_interval.tv_sec = kvm->display_time;
new_value.it_interval.tv_nsec = 0;
if (timerfd_settime(kvm->timerfd, 0, &new_value, NULL) != 0) {
pr_err("timerfd_settime failed: %d\n", errno);
close(kvm->timerfd);
goto out;
}
rc = 0;
out:
return rc;
}
static int perf_kvm__handle_timerfd(struct perf_kvm_stat *kvm)
{
uint64_t c;
int rc;
rc = read(kvm->timerfd, &c, sizeof(uint64_t));
if (rc < 0) {
if (errno == EAGAIN)
return 0;
pr_err("Failed to read timer fd: %d\n", errno);
return -1;
}
if (rc != sizeof(uint64_t)) {
pr_err("Error reading timer fd - invalid size returned\n");
return -1;
}
if (c != 1)
pr_debug("Missed timer beats: %" PRIu64 "\n", c-1);
/* update display */
sort_result(kvm);
print_result(kvm);
/* reset counts */
clear_events_cache_stats(kvm->kvm_events_cache);
kvm->total_count = 0;
kvm->total_time = 0;
kvm->lost_events = 0;
return 0;
}
static int fd_set_nonblock(int fd)
{
long arg = 0;
arg = fcntl(fd, F_GETFL);
if (arg < 0) {
pr_err("Failed to get current flags for fd %d\n", fd);
return -1;
}
if (fcntl(fd, F_SETFL, arg | O_NONBLOCK) < 0) {
pr_err("Failed to set non-block option on fd %d\n", fd);
return -1;
}
return 0;
}
static
int perf_kvm__handle_stdin(struct termios *tc_now, struct termios *tc_save)
{
int c;
tcsetattr(0, TCSANOW, tc_now);
c = getc(stdin);
tcsetattr(0, TCSAFLUSH, tc_save);
if (c == 'q')
return 1;
return 0;
}
static int kvm_events_live_report(struct perf_kvm_stat *kvm)
{
struct pollfd *pollfds = NULL;
int nr_fds, nr_stdin, ret, err = -EINVAL;
struct termios tc, save;
/* live flag must be set first */
kvm->live = true;
ret = cpu_isa_config(kvm);
if (ret < 0)
return ret;
if (!verify_vcpu(kvm->trace_vcpu) ||
!select_key(kvm) ||
!register_kvm_events_ops(kvm)) {
goto out;
}
init_kvm_event_record(kvm);
tcgetattr(0, &save);
tc = save;
tc.c_lflag &= ~(ICANON | ECHO);
tc.c_cc[VMIN] = 0;
tc.c_cc[VTIME] = 0;
signal(SIGINT, sig_handler);
signal(SIGTERM, sig_handler);
/* copy pollfds -- need to add timerfd and stdin */
nr_fds = kvm->evlist->nr_fds;
pollfds = zalloc(sizeof(struct pollfd) * (nr_fds + 2));
if (!pollfds) {
err = -ENOMEM;
goto out;
}
memcpy(pollfds, kvm->evlist->pollfd,
sizeof(struct pollfd) * kvm->evlist->nr_fds);
/* add timer fd */
if (perf_kvm__timerfd_create(kvm) < 0) {
err = -1;
goto out;
}
pollfds[nr_fds].fd = kvm->timerfd;
pollfds[nr_fds].events = POLLIN;
nr_fds++;
pollfds[nr_fds].fd = fileno(stdin);
pollfds[nr_fds].events = POLLIN;
nr_stdin = nr_fds;
nr_fds++;
if (fd_set_nonblock(fileno(stdin)) != 0)
goto out;
/* everything is good - enable the events and process */
perf_evlist__enable(kvm->evlist);
while (!done) {
int rc;
rc = perf_kvm__mmap_read(kvm);
if (rc < 0)
break;
err = perf_kvm__handle_timerfd(kvm);
if (err)
goto out;
if (pollfds[nr_stdin].revents & POLLIN)
done = perf_kvm__handle_stdin(&tc, &save);
if (!rc && !done)
err = poll(pollfds, nr_fds, 100);
}
perf_evlist__disable(kvm->evlist);
if (err == 0) {
sort_result(kvm);
print_result(kvm);
}
out:
if (kvm->timerfd >= 0)
close(kvm->timerfd);
if (pollfds)
free(pollfds);
return err;
}
static int kvm_live_open_events(struct perf_kvm_stat *kvm)
{
int err, rc = -1;
struct perf_evsel *pos;
struct perf_evlist *evlist = kvm->evlist;
perf_evlist__config(evlist, &kvm->opts);
/*
* Note: exclude_{guest,host} do not apply here.
* This command processes KVM tracepoints from host only
*/
list_for_each_entry(pos, &evlist->entries, node) {
struct perf_event_attr *attr = &pos->attr;
/* make sure these *are* set */
attr->sample_type |= PERF_SAMPLE_TID;
attr->sample_type |= PERF_SAMPLE_TIME;
attr->sample_type |= PERF_SAMPLE_CPU;
attr->sample_type |= PERF_SAMPLE_RAW;
/* make sure these are *not*; want as small a sample as possible */
attr->sample_type &= ~PERF_SAMPLE_PERIOD;
attr->sample_type &= ~PERF_SAMPLE_IP;
attr->sample_type &= ~PERF_SAMPLE_CALLCHAIN;
attr->sample_type &= ~PERF_SAMPLE_ADDR;
attr->sample_type &= ~PERF_SAMPLE_READ;
attr->mmap = 0;
attr->comm = 0;
attr->task = 0;
attr->sample_period = 1;
attr->watermark = 0;
attr->wakeup_events = 1000;
/* will enable all once we are ready */
attr->disabled = 1;
}
err = perf_evlist__open(evlist);
if (err < 0) {
printf("Couldn't create the events: %s\n", strerror(errno));
goto out;
}
if (perf_evlist__mmap(evlist, kvm->opts.mmap_pages, false) < 0) {
ui__error("Failed to mmap the events: %s\n", strerror(errno));
perf_evlist__close(evlist);
goto out;
}
rc = 0;
out:
return rc;
}
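
As a sketch of the net effect of the attribute fixups above (assuming no other sample_type bits are set by the defaults), every event ends up with

    /* attr->sample_type == PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
     *                      PERF_SAMPLE_CPU | PERF_SAMPLE_RAW
     * -- just enough to order events in time, attribute them to a
     * vcpu thread and decode the tracepoint payload, while keeping
     * each sample record small.
     */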
static int read_events(struct perf_kvm_stat *kvm)
@ -749,28 +1231,24 @@ static int read_events(struct perf_kvm_stat *kvm)
* Do not use 'isa' recorded in kvm_exit tracepoint since it is not
* traced in the old kernel.
*/
ret = get_cpu_isa(kvm->session);
ret = cpu_isa_config(kvm);
if (ret < 0)
return ret;
if (ret == 1) {
kvm->exit_reasons = vmx_exit_reasons;
kvm->exit_reasons_size = ARRAY_SIZE(vmx_exit_reasons);
kvm->exit_reasons_isa = "VMX";
}
return perf_session__process_events(kvm->session, &kvm->tool);
}
static bool verify_vcpu(int vcpu)
static int parse_target_str(struct perf_kvm_stat *kvm)
{
if (vcpu != -1 && vcpu < 0) {
pr_err("Invalid vcpu:%d.\n", vcpu);
return false;
if (kvm->pid_str) {
kvm->pid_list = intlist__new(kvm->pid_str);
if (kvm->pid_list == NULL) {
pr_err("Error parsing process id string\n");
return -EINVAL;
}
}
return true;
return 0;
}
static int kvm_events_report_vcpu(struct perf_kvm_stat *kvm)
@ -778,6 +1256,9 @@ static int kvm_events_report_vcpu(struct perf_kvm_stat *kvm)
int ret = -EINVAL;
int vcpu = kvm->trace_vcpu;
if (parse_target_str(kvm) != 0)
goto exit;
if (!verify_vcpu(vcpu))
goto exit;
@ -801,16 +1282,11 @@ exit:
return ret;
}
static const char * const record_args[] = {
"record",
"-R",
"-f",
"-m", "1024",
"-c", "1",
"-e", "kvm:kvm_entry",
"-e", "kvm:kvm_exit",
"-e", "kvm:kvm_mmio",
"-e", "kvm:kvm_pio",
static const char * const kvm_events_tp[] = {
"kvm:kvm_entry",
"kvm:kvm_exit",
"kvm:kvm_mmio",
"kvm:kvm_pio",
};
#define STRDUP_FAIL_EXIT(s) \
@ -826,8 +1302,15 @@ kvm_events_record(struct perf_kvm_stat *kvm, int argc, const char **argv)
{
unsigned int rec_argc, i, j;
const char **rec_argv;
const char * const record_args[] = {
"record",
"-R",
"-m", "1024",
"-c", "1",
};
rec_argc = ARRAY_SIZE(record_args) + argc + 2;
rec_argc = ARRAY_SIZE(record_args) + argc + 2 +
2 * ARRAY_SIZE(kvm_events_tp);
rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (rec_argv == NULL)
@ -836,6 +1319,11 @@ kvm_events_record(struct perf_kvm_stat *kvm, int argc, const char **argv)
for (i = 0; i < ARRAY_SIZE(record_args); i++)
rec_argv[i] = STRDUP_FAIL_EXIT(record_args[i]);
for (j = 0; j < ARRAY_SIZE(kvm_events_tp); j++) {
rec_argv[i++] = "-e";
rec_argv[i++] = STRDUP_FAIL_EXIT(kvm_events_tp[j]);
}
rec_argv[i++] = STRDUP_FAIL_EXIT("-o");
rec_argv[i++] = STRDUP_FAIL_EXIT(kvm->file_name);
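
Put together, the argv assembled above corresponds to a command line along these lines (a sketch; <file> stands for kvm->file_name, and any remaining user arguments are appended after it):

    perf record -R -m 1024 -c 1 \
        -e kvm:kvm_entry -e kvm:kvm_exit \
        -e kvm:kvm_mmio -e kvm:kvm_pio \
        -o <file>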
@ -856,6 +1344,8 @@ kvm_events_report(struct perf_kvm_stat *kvm, int argc, const char **argv)
OPT_STRING('k', "key", &kvm->sort_key, "sort-key",
"key for sorting: sample(sort by samples number)"
" time (sort by avg time)"),
OPT_STRING('p', "pid", &kvm->pid_str, "pid",
"analyze events only for given process id(s)"),
OPT_END()
};
@ -878,6 +1368,190 @@ kvm_events_report(struct perf_kvm_stat *kvm, int argc, const char **argv)
return kvm_events_report_vcpu(kvm);
}
static struct perf_evlist *kvm_live_event_list(void)
{
struct perf_evlist *evlist;
char *tp, *name, *sys;
unsigned int j;
int err = -1;
evlist = perf_evlist__new();
if (evlist == NULL)
return NULL;
for (j = 0; j < ARRAY_SIZE(kvm_events_tp); j++) {
tp = strdup(kvm_events_tp[j]);
if (tp == NULL)
goto out;
/* split tracepoint into subsystem and name */
sys = tp;
name = strchr(tp, ':');
if (name == NULL) {
pr_err("Error parsing %s tracepoint: subsystem delimiter not found\n",
kvm_events_tp[j]);
free(tp);
goto out;
}
*name = '\0';
name++;
if (perf_evlist__add_newtp(evlist, sys, name, NULL)) {
pr_err("Failed to add %s tracepoint to the list\n", kvm_events_tp[j]);
free(tp);
goto out;
}
free(tp);
}
err = 0;
out:
if (err) {
perf_evlist__delete(evlist);
evlist = NULL;
}
return evlist;
}
static int kvm_events_live(struct perf_kvm_stat *kvm,
int argc, const char **argv)
{
char errbuf[BUFSIZ];
int err;
const struct option live_options[] = {
OPT_STRING('p', "pid", &kvm->opts.target.pid, "pid",
"record events on existing process id"),
OPT_UINTEGER('m', "mmap-pages", &kvm->opts.mmap_pages,
"number of mmap data pages"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
OPT_BOOLEAN('a', "all-cpus", &kvm->opts.target.system_wide,
"system-wide collection from all CPUs"),
OPT_UINTEGER('d', "display", &kvm->display_time,
"time in seconds between display updates"),
OPT_STRING(0, "event", &kvm->report_event, "report event",
"event for reporting: vmexit, mmio, ioport"),
OPT_INTEGER(0, "vcpu", &kvm->trace_vcpu,
"vcpu id to report"),
OPT_STRING('k', "key", &kvm->sort_key, "sort-key",
"key for sorting: sample(sort by samples number)"
" time (sort by avg time)"),
OPT_U64(0, "duration", &kvm->duration,
"show events other than HALT that take longer than duration usecs"),
OPT_END()
};
const char * const live_usage[] = {
"perf kvm stat live [<options>]",
NULL
};
/* event handling */
kvm->tool.sample = process_sample_event;
kvm->tool.comm = perf_event__process_comm;
kvm->tool.exit = perf_event__process_exit;
kvm->tool.fork = perf_event__process_fork;
kvm->tool.lost = process_lost_event;
kvm->tool.ordered_samples = true;
perf_tool__fill_defaults(&kvm->tool);
/* set defaults */
kvm->display_time = 1;
kvm->opts.user_interval = 1;
kvm->opts.mmap_pages = 512;
kvm->opts.target.uses_mmap = false;
kvm->opts.target.uid_str = NULL;
kvm->opts.target.uid = UINT_MAX;
symbol__init();
disable_buildid_cache();
use_browser = 0;
setup_browser(false);
if (argc) {
argc = parse_options(argc, argv, live_options,
live_usage, 0);
if (argc)
usage_with_options(live_usage, live_options);
}
kvm->duration *= NSEC_PER_USEC; /* convert usec to nsec */
/*
* target-related setup
*/
err = perf_target__validate(&kvm->opts.target);
if (err) {
perf_target__strerror(&kvm->opts.target, err, errbuf, BUFSIZ);
ui__warning("%s", errbuf);
}
if (perf_target__none(&kvm->opts.target))
kvm->opts.target.system_wide = true;
/*
* generate the event list
*/
kvm->evlist = kvm_live_event_list();
if (kvm->evlist == NULL) {
err = -1;
goto out;
}
symbol_conf.nr_events = kvm->evlist->nr_entries;
if (perf_evlist__create_maps(kvm->evlist, &kvm->opts.target) < 0)
usage_with_options(live_usage, live_options);
/*
* perf session
*/
kvm->session = perf_session__new(NULL, O_WRONLY, false, false, &kvm->tool);
if (kvm->session == NULL) {
err = -ENOMEM;
goto out;
}
kvm->session->evlist = kvm->evlist;
perf_session__set_id_hdr_size(kvm->session);
if (perf_target__has_task(&kvm->opts.target))
perf_event__synthesize_thread_map(&kvm->tool,
kvm->evlist->threads,
perf_event__process,
&kvm->session->machines.host);
else
perf_event__synthesize_threads(&kvm->tool, perf_event__process,
&kvm->session->machines.host);
err = kvm_live_open_events(kvm);
if (err)
goto out;
err = kvm_events_live_report(kvm);
out:
exit_browser(0);
if (kvm->session)
perf_session__delete(kvm->session);
kvm->session = NULL;
if (kvm->evlist) {
perf_evlist__delete_maps(kvm->evlist);
perf_evlist__delete(kvm->evlist);
}
return err;
}
static void print_kvm_stat_usage(void)
{
printf("Usage: perf kvm stat <command>\n\n");
@ -885,6 +1559,7 @@ static void print_kvm_stat_usage(void)
printf("# Available commands:\n");
printf("\trecord: record kvm events\n");
printf("\treport: report statistical data of kvm events\n");
printf("\tlive: live reporting of statistical data of kvm events\n");
printf("\nOtherwise, it is the alias of 'perf stat':\n");
}
@ -914,6 +1589,9 @@ static int kvm_cmd_stat(const char *file_name, int argc, const char **argv)
if (!strncmp(argv[1], "rep", 3))
return kvm_events_report(&kvm, argc - 1 , argv + 1);
if (!strncmp(argv[1], "live", 4))
return kvm_events_live(&kvm, argc - 1 , argv + 1);
perf_stat:
return cmd_stat(argc, argv, NULL);
}

@ -13,6 +13,7 @@
#include "util/parse-events.h"
#include "util/cache.h"
#include "util/pmu.h"
int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
{
@ -37,6 +38,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
else if (strcmp(argv[i], "cache") == 0 ||
strcmp(argv[i], "hwcache") == 0)
print_hwcache_events(NULL, false);
else if (strcmp(argv[i], "pmu") == 0)
print_pmu_events(NULL, false);
else if (strcmp(argv[i], "--raw-dump") == 0)
print_events(NULL, true);
else {

@ -805,7 +805,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
struct thread *thread = machine__findnew_thread(machine, sample->tid);
struct thread *thread = machine__findnew_thread(machine, sample->pid,
sample->tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",

@ -14,7 +14,6 @@ static const char *mem_operation = MEM_OPERATION_LOAD;
struct perf_mem {
struct perf_tool tool;
char const *input_name;
symbol_filter_t annotate_init;
bool hide_unresolved;
bool dump_raw;
const char *cpu_list;
@ -69,8 +68,7 @@ dump_raw_samples(struct perf_tool *tool,
struct addr_location al;
const char *fmt;
if (perf_event__preprocess_sample(event, machine, &al, sample,
mem->annotate_init) < 0) {
if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@ -96,7 +94,7 @@ dump_raw_samples(struct perf_tool *tool,
symbol_conf.field_sep,
sample->tid,
symbol_conf.field_sep,
event->ip.ip,
sample->ip,
symbol_conf.field_sep,
sample->addr,
symbol_conf.field_sep,

@ -474,13 +474,6 @@ static int __cmd_record(struct perf_record *rec, int argc, const char **argv)
goto out_delete_session;
}
err = perf_event__synthesize_event_types(tool, process_synthesized_event,
machine);
if (err < 0) {
pr_err("Couldn't synthesize event_types.\n");
goto out_delete_session;
}
if (have_tracepoints(&evsel_list->entries)) {
/*
* FIXME err <= 0 here actually means that
@ -904,7 +897,6 @@ const struct option record_options[] = {
int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
{
int err = -ENOMEM;
struct perf_evsel *pos;
struct perf_evlist *evsel_list;
struct perf_record *rec = &record;
char errbuf[BUFSIZ];
@ -968,11 +960,6 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (perf_evlist__create_maps(evsel_list, &rec->opts.target) < 0)
usage_with_options(record_usage, record_options);
list_for_each_entry(pos, &evsel_list->entries, node) {
if (perf_header__push_event(pos->attr.config, perf_evsel__name(pos)))
goto out_free_fd;
}
if (rec->opts.user_interval != ULLONG_MAX)
rec->opts.default_interval = rec->opts.user_interval;
if (rec->opts.user_freq != UINT_MAX)

@ -49,7 +49,6 @@ struct perf_report {
bool mem_mode;
struct perf_read_values show_threads_values;
const char *pretty_printing_style;
symbol_filter_t annotate_init;
const char *cpu_list;
const char *symbol_filter_str;
float min_percent;
@ -89,7 +88,7 @@ static int perf_report__add_mem_hist_entry(struct perf_tool *tool,
if ((sort__has_parent || symbol_conf.use_callchain) &&
sample->callchain) {
err = machine__resolve_callchain(machine, evsel, al->thread,
sample, &parent);
sample, &parent, al);
if (err)
return err;
}
@ -180,7 +179,7 @@ static int perf_report__add_branch_hist_entry(struct perf_tool *tool,
if ((sort__has_parent || symbol_conf.use_callchain)
&& sample->callchain) {
err = machine__resolve_callchain(machine, evsel, al->thread,
sample, &parent);
sample, &parent, al);
if (err)
return err;
}
@ -254,7 +253,7 @@ static int perf_evsel__add_hist_entry(struct perf_evsel *evsel,
if ((sort__has_parent || symbol_conf.use_callchain) && sample->callchain) {
err = machine__resolve_callchain(machine, evsel, al->thread,
sample, &parent);
sample, &parent, al);
if (err)
return err;
}
@ -305,8 +304,7 @@ static int process_sample_event(struct perf_tool *tool,
struct addr_location al;
int ret;
if (perf_event__preprocess_sample(event, machine, &al, sample,
rep->annotate_init) < 0) {
if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@ -367,7 +365,7 @@ static int process_read_event(struct perf_tool *tool,
static int perf_report__setup_sample_type(struct perf_report *rep)
{
struct perf_session *self = rep->session;
u64 sample_type = perf_evlist__sample_type(self->evlist);
u64 sample_type = perf_evlist__combined_sample_type(self->evlist);
if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
if (sort__has_parent) {
@ -497,7 +495,7 @@ static int __cmd_report(struct perf_report *rep)
ret = perf_session__cpu_bitmap(session, rep->cpu_list,
rep->cpu_bitmap);
if (ret)
goto out_delete;
return ret;
}
if (use_browser <= 0)
@ -508,11 +506,11 @@ static int __cmd_report(struct perf_report *rep)
ret = perf_report__setup_sample_type(rep);
if (ret)
goto out_delete;
return ret;
ret = perf_session__process_events(session, &rep->tool);
if (ret)
goto out_delete;
return ret;
kernel_map = session->machines.host.vmlinux_maps[MAP__FUNCTION];
kernel_kmap = map__kmap(kernel_map);
@ -547,7 +545,7 @@ static int __cmd_report(struct perf_report *rep)
if (dump_trace) {
perf_session__fprintf_nr_events(session, stdout);
goto out_delete;
return 0;
}
nr_samples = 0;
@ -572,7 +570,7 @@ static int __cmd_report(struct perf_report *rep)
if (nr_samples == 0) {
ui__error("The %s file has no samples!\n", session->filename);
goto out_delete;
return 0;
}
list_for_each_entry(pos, &session->evlist->entries, node)
@ -598,19 +596,6 @@ static int __cmd_report(struct perf_report *rep)
} else
perf_evlist__tty_browse_hists(session->evlist, rep, help);
out_delete:
/*
* Speed up the exit process, for large files this can
* take quite a while.
*
* XXX Enable this when using valgrind or if we ever
* librarize this command.
*
* Also experiment with obstacks to see how much speed
* up we'll get here.
*
* perf_session__delete(session);
*/
return ret;
}
@ -680,12 +665,23 @@ parse_callchain_opt(const struct option *opt, const char *arg, int unset)
}
/* get the call chain order */
if (!strcmp(tok2, "caller"))
if (!strncmp(tok2, "caller", strlen("caller")))
callchain_param.order = ORDER_CALLER;
else if (!strcmp(tok2, "callee"))
else if (!strncmp(tok2, "callee", strlen("callee")))
callchain_param.order = ORDER_CALLEE;
else
return -1;
/* Get the sort key */
tok2 = strtok(NULL, ",");
if (!tok2)
goto setup;
if (!strncmp(tok2, "function", strlen("function")))
callchain_param.key = CCKEY_FUNCTION;
else if (!strncmp(tok2, "address", strlen("address")))
callchain_param.key = CCKEY_ADDRESS;
else
return -1;
setup:
if (callchain_register_param(&callchain_param) < 0) {
fprintf(stderr, "Can't register callchain params\n");
@ -694,6 +690,24 @@ setup:
return 0;
}
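
With the new key field appended to the option string, a hypothetical invocation would look like

    perf report -g graph,0.5,caller,address

Note that the switch from strcmp() to strncmp() accepts any token that merely begins with the keyword, e.g. "callers" for "caller".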
int
report_parse_ignore_callees_opt(const struct option *opt __maybe_unused,
const char *arg, int unset __maybe_unused)
{
if (arg) {
int err = regcomp(&ignore_callees_regex, arg, REG_EXTENDED);
if (err) {
char buf[BUFSIZ];
regerror(err, &ignore_callees_regex, buf, sizeof(buf));
pr_err("Invalid --ignore-callees regex: %s\n%s", arg, buf);
return -1;
}
have_ignore_callees = 1;
}
return 0;
}
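
A hypothetical use of the new option, hiding everything called from the matched functions in the call graphs:

    perf report --ignore-callees='^(malloc|free)$'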
static int
parse_branch_mode(const struct option *opt __maybe_unused,
const char *str __maybe_unused, int unset)
@ -736,7 +750,6 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
.lost = perf_event__process_lost,
.read = process_read_event,
.attr = perf_event__process_attr,
.event_type = perf_event__process_event_type,
.tracing_data = perf_event__process_tracing_data,
.build_id = perf_event__process_build_id,
.ordered_samples = true,
@ -780,10 +793,13 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('g', "call-graph", &report, "output_type,min_percent[,print_limit],call_order",
"Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit and callchain order. "
"Default: fractal,0.5,callee", &parse_callchain_opt, callchain_default_opt),
"Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold, optional print limit, callchain order, key (function or address). "
"Default: fractal,0.5,callee,function", &parse_callchain_opt, callchain_default_opt),
OPT_BOOLEAN('G', "inverted", &report.inverted_callchain,
"alias for inverted call graph"),
OPT_CALLBACK(0, "ignore-callees", NULL, "regex",
"ignore callees of these functions in call graphs",
report_parse_ignore_callees_opt),
OPT_STRING('d', "dsos", &symbol_conf.dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('c', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
@ -853,7 +869,6 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
setup_browser(true);
else {
use_browser = 0;
perf_hpp__column_enable(PERF_HPP__OVERHEAD);
perf_hpp__init();
}
@ -907,7 +922,8 @@ repeat:
*/
if (use_browser == 1 && sort__has_sym) {
symbol_conf.priv_size = sizeof(struct annotation);
report.annotate_init = symbol__annotate_init;
machines__set_symbol_filter(&session->machines,
symbol__annotate_init);
/*
* For searching by name on the "Browse map details".
* providing it only in verbose mode not to bloat too
@ -931,14 +947,6 @@ repeat:
if (parent_pattern != default_parent_pattern) {
if (sort_dimension__add("parent") < 0)
goto error;
/*
* Only show the parent fields if we explicitly
* sort that way. If we only use parent machinery
* for filtering, we don't want it.
*/
if (!strstr(sort_order, "parent"))
sort_parent.elide = 1;
}
if (argc) {

@ -109,8 +109,9 @@ struct trace_sched_handler {
int (*wakeup_event)(struct perf_sched *sched, struct perf_evsel *evsel,
struct perf_sample *sample, struct machine *machine);
int (*fork_event)(struct perf_sched *sched, struct perf_evsel *evsel,
struct perf_sample *sample);
/* PERF_RECORD_FORK event, not sched_process_fork tracepoint */
int (*fork_event)(struct perf_sched *sched, union perf_event *event,
struct machine *machine);
int (*migrate_task_event)(struct perf_sched *sched,
struct perf_evsel *evsel,
@ -717,22 +718,31 @@ static int replay_switch_event(struct perf_sched *sched,
return 0;
}
static int replay_fork_event(struct perf_sched *sched, struct perf_evsel *evsel,
struct perf_sample *sample)
static int replay_fork_event(struct perf_sched *sched,
union perf_event *event,
struct machine *machine)
{
const char *parent_comm = perf_evsel__strval(evsel, sample, "parent_comm"),
*child_comm = perf_evsel__strval(evsel, sample, "child_comm");
const u32 parent_pid = perf_evsel__intval(evsel, sample, "parent_pid"),
child_pid = perf_evsel__intval(evsel, sample, "child_pid");
struct thread *child, *parent;
if (verbose) {
printf("sched_fork event %p\n", evsel);
printf("... parent: %s/%d\n", parent_comm, parent_pid);
printf("... child: %s/%d\n", child_comm, child_pid);
child = machine__findnew_thread(machine, event->fork.pid,
event->fork.tid);
parent = machine__findnew_thread(machine, event->fork.ppid,
event->fork.ptid);
if (child == NULL || parent == NULL) {
pr_debug("thread does not exist on fork event: child %p, parent %p\n",
child, parent);
return 0;
}
register_pid(sched, parent_pid, parent_comm);
register_pid(sched, child_pid, child_comm);
if (verbose) {
printf("fork event\n");
printf("... parent: %s/%d\n", parent->comm, parent->tid);
printf("... child: %s/%d\n", child->comm, child->tid);
}
register_pid(sched, parent->tid, parent->comm);
register_pid(sched, child->tid, child->comm);
return 0;
}
@ -824,14 +834,6 @@ static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
return 0;
}
static int latency_fork_event(struct perf_sched *sched __maybe_unused,
struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample __maybe_unused)
{
/* should insert the newcomer */
return 0;
}
static char sched_out_state(u64 prev_state)
{
const char *str = TASK_STATE_TO_CHAR_STR;
@ -934,8 +936,8 @@ static int latency_switch_event(struct perf_sched *sched,
return -1;
}
sched_out = machine__findnew_thread(machine, prev_pid);
sched_in = machine__findnew_thread(machine, next_pid);
sched_out = machine__findnew_thread(machine, 0, prev_pid);
sched_in = machine__findnew_thread(machine, 0, next_pid);
out_events = thread_atoms_search(&sched->atom_root, sched_out, &sched->cmp_pid);
if (!out_events) {
@ -978,7 +980,7 @@ static int latency_runtime_event(struct perf_sched *sched,
{
const u32 pid = perf_evsel__intval(evsel, sample, "pid");
const u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
struct thread *thread = machine__findnew_thread(machine, pid);
struct thread *thread = machine__findnew_thread(machine, 0, pid);
struct work_atoms *atoms = thread_atoms_search(&sched->atom_root, thread, &sched->cmp_pid);
u64 timestamp = sample->time;
int cpu = sample->cpu;
@ -1016,7 +1018,7 @@ static int latency_wakeup_event(struct perf_sched *sched,
if (!success)
return 0;
wakee = machine__findnew_thread(machine, pid);
wakee = machine__findnew_thread(machine, 0, pid);
atoms = thread_atoms_search(&sched->atom_root, wakee, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, wakee))
@ -1070,12 +1072,12 @@ static int latency_migrate_task_event(struct perf_sched *sched,
if (sched->profile_cpu == -1)
return 0;
migrant = machine__findnew_thread(machine, pid);
migrant = machine__findnew_thread(machine, 0, pid);
atoms = thread_atoms_search(&sched->atom_root, migrant, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, migrant))
return -1;
register_pid(sched, migrant->pid, migrant->comm);
register_pid(sched, migrant->tid, migrant->comm);
atoms = thread_atoms_search(&sched->atom_root, migrant, &sched->cmp_pid);
if (!atoms) {
pr_err("migration-event: Internal tree error");
@ -1115,7 +1117,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
sched->all_runtime += work_list->total_runtime;
sched->all_count += work_list->nb_atoms;
ret = printf(" %s:%d ", work_list->thread->comm, work_list->thread->pid);
ret = printf(" %s:%d ", work_list->thread->comm, work_list->thread->tid);
for (i = 0; i < 24 - ret; i++)
printf(" ");
@ -1131,9 +1133,9 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
{
if (l->thread->pid < r->thread->pid)
if (l->thread->tid < r->thread->tid)
return -1;
if (l->thread->pid > r->thread->pid)
if (l->thread->tid > r->thread->tid)
return 1;
return 0;
@ -1289,8 +1291,8 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
return -1;
}
sched_out = machine__findnew_thread(machine, prev_pid);
sched_in = machine__findnew_thread(machine, next_pid);
sched_out = machine__findnew_thread(machine, 0, prev_pid);
sched_in = machine__findnew_thread(machine, 0, next_pid);
sched->curr_thread[this_cpu] = sched_in;
@ -1321,7 +1323,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
printf("*");
if (sched->curr_thread[cpu]) {
if (sched->curr_thread[cpu]->pid)
if (sched->curr_thread[cpu]->tid)
printf("%2s ", sched->curr_thread[cpu]->shortname);
else
printf(". ");
@ -1332,7 +1334,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
printf(" %12.6f secs ", (double)timestamp/1e9);
if (new_shortname) {
printf("%s => %s:%d\n",
sched_in->shortname, sched_in->comm, sched_in->pid);
sched_in->shortname, sched_in->comm, sched_in->tid);
} else {
printf("\n");
}
@ -1379,28 +1381,23 @@ static int process_sched_runtime_event(struct perf_tool *tool,
return 0;
}
static int process_sched_fork_event(struct perf_tool *tool,
struct perf_evsel *evsel,
struct perf_sample *sample,
struct machine *machine __maybe_unused)
static int perf_sched__process_fork_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
{
struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
/* run the fork event through the perf machinery */
perf_event__process_fork(tool, event, sample, machine);
/* and then run additional processing needed for this command */
if (sched->tp_handler->fork_event)
return sched->tp_handler->fork_event(sched, evsel, sample);
return sched->tp_handler->fork_event(sched, event, machine);
return 0;
}
static int process_sched_exit_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct perf_sample *sample __maybe_unused,
struct machine *machine __maybe_unused)
{
pr_debug("sched_exit event %p\n", evsel);
return 0;
}
static int process_sched_migrate_task_event(struct perf_tool *tool,
struct perf_evsel *evsel,
struct perf_sample *sample,
@ -1425,15 +1422,8 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_
struct perf_evsel *evsel,
struct machine *machine)
{
struct thread *thread = machine__findnew_thread(machine, sample->tid);
int err = 0;
if (thread == NULL) {
pr_debug("problem processing %s event, skipping it.\n",
perf_evsel__name(evsel));
return -1;
}
evsel->hists.stats.total_period += sample->period;
hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
@ -1445,7 +1435,7 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_
return err;
}
static int perf_sched__read_events(struct perf_sched *sched, bool destroy,
static int perf_sched__read_events(struct perf_sched *sched,
struct perf_session **psession)
{
const struct perf_evsel_str_handler handlers[] = {
@ -1453,8 +1443,6 @@ static int perf_sched__read_events(struct perf_sched *sched, bool destroy,
{ "sched:sched_stat_runtime", process_sched_runtime_event, },
{ "sched:sched_wakeup", process_sched_wakeup_event, },
{ "sched:sched_wakeup_new", process_sched_wakeup_event, },
{ "sched:sched_process_fork", process_sched_fork_event, },
{ "sched:sched_process_exit", process_sched_exit_event, },
{ "sched:sched_migrate_task", process_sched_migrate_task_event, },
};
struct perf_session *session;
@ -1480,11 +1468,10 @@ static int perf_sched__read_events(struct perf_sched *sched, bool destroy,
sched->nr_lost_chunks = session->stats.nr_events[PERF_RECORD_LOST];
}
if (destroy)
perf_session__delete(session);
if (psession)
*psession = session;
else
perf_session__delete(session);
return 0;
@ -1529,8 +1516,11 @@ static int perf_sched__lat(struct perf_sched *sched)
struct perf_session *session;
setup_pager();
if (perf_sched__read_events(sched, false, &session))
/* save session -- references to threads are held in work_list */
if (perf_sched__read_events(sched, &session))
return -1;
perf_sched__sort_lat(sched);
printf("\n ---------------------------------------------------------------------------------------------------------------\n");
@ -1565,7 +1555,7 @@ static int perf_sched__map(struct perf_sched *sched)
sched->max_cpu = sysconf(_SC_NPROCESSORS_CONF);
setup_pager();
if (perf_sched__read_events(sched, true, NULL))
if (perf_sched__read_events(sched, NULL))
return -1;
print_bad_events(sched);
return 0;
@ -1580,7 +1570,7 @@ static int perf_sched__replay(struct perf_sched *sched)
test_calibrations(sched);
if (perf_sched__read_events(sched, true, NULL))
if (perf_sched__read_events(sched, NULL))
return -1;
printf("nr_run_events: %ld\n", sched->nr_run_events);
@ -1639,7 +1629,6 @@ static int __cmd_record(int argc, const char **argv)
"-e", "sched:sched_stat_sleep",
"-e", "sched:sched_stat_iowait",
"-e", "sched:sched_stat_runtime",
"-e", "sched:sched_process_exit",
"-e", "sched:sched_process_fork",
"-e", "sched:sched_wakeup",
"-e", "sched:sched_migrate_task",
@ -1662,28 +1651,29 @@ static int __cmd_record(int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
}
static const char default_sort_order[] = "avg, max, switch, runtime";
static struct perf_sched sched = {
.tool = {
.sample = perf_sched__process_tracepoint_sample,
.comm = perf_event__process_comm,
.lost = perf_event__process_lost,
.fork = perf_sched__process_fork_event,
.ordered_samples = true,
},
.cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
.sort_list = LIST_HEAD_INIT(sched.sort_list),
.start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
.work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
.curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
.sort_order = default_sort_order,
.replay_repeat = 10,
.profile_cpu = -1,
.next_shortname1 = 'A',
.next_shortname2 = '0',
};
int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
{
const char default_sort_order[] = "avg, max, switch, runtime";
struct perf_sched sched = {
.tool = {
.sample = perf_sched__process_tracepoint_sample,
.comm = perf_event__process_comm,
.lost = perf_event__process_lost,
.fork = perf_event__process_fork,
.ordered_samples = true,
},
.cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
.sort_list = LIST_HEAD_INIT(sched.sort_list),
.start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
.work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
.curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
.sort_order = default_sort_order,
.replay_repeat = 10,
.profile_cpu = -1,
.next_shortname1 = 'A',
.next_shortname2 = '0',
};
const struct option latency_options[] = {
OPT_STRING('s', "sort", &sched.sort_order, "key[,key2...]",
"sort by key(s): runtime, switch, avg, max"),
@ -1729,7 +1719,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.wakeup_event = latency_wakeup_event,
.switch_event = latency_switch_event,
.runtime_event = latency_runtime_event,
.fork_event = latency_fork_event,
.migrate_task_event = latency_migrate_task_event,
};
struct trace_sched_handler map_ops = {

@ -24,6 +24,7 @@ static u64 last_timestamp;
static u64 nr_unordered;
extern const struct option record_options[];
static bool no_callchain;
static bool latency_format;
static bool system_wide;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@ -65,6 +66,7 @@ struct output_option {
static struct {
bool user_set;
bool wildcard_set;
unsigned int print_ip_opts;
u64 fields;
u64 invalid_fields;
} output[PERF_TYPE_MAX] = {
@ -234,6 +236,7 @@ static int perf_session__check_output_opt(struct perf_session *session)
{
int j;
struct perf_evsel *evsel;
struct perf_event_attr *attr;
for (j = 0; j < PERF_TYPE_MAX; ++j) {
evsel = perf_session__find_first_evtype(session, j);
@ -252,6 +255,24 @@ static int perf_session__check_output_opt(struct perf_session *session)
if (evsel && output[j].fields &&
perf_evsel__check_attr(evsel, session))
return -1;
if (evsel == NULL)
continue;
attr = &evsel->attr;
output[j].print_ip_opts = 0;
if (PRINT_FIELD(IP))
output[j].print_ip_opts |= PRINT_IP_OPT_IP;
if (PRINT_FIELD(SYM))
output[j].print_ip_opts |= PRINT_IP_OPT_SYM;
if (PRINT_FIELD(DSO))
output[j].print_ip_opts |= PRINT_IP_OPT_DSO;
if (PRINT_FIELD(SYMOFFSET))
output[j].print_ip_opts |= PRINT_IP_OPT_SYMOFFSET;
}
return 0;
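
For example, if the selected fields for an event type include ip and sym but not dso or symoff, the precomputed mask is

    /* output[j].print_ip_opts == PRINT_IP_OPT_IP | PRINT_IP_OPT_SYM */

which the callers below hand to perf_evsel__print_ip() in place of the old per-field bool arguments.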
@ -381,8 +402,8 @@ static void print_sample_bts(union perf_event *event,
else
printf("\n");
perf_evsel__print_ip(evsel, event, sample, machine,
PRINT_FIELD(SYM), PRINT_FIELD(DSO),
PRINT_FIELD(SYMOFFSET));
output[attr->type].print_ip_opts,
PERF_MAX_STACK_DEPTH);
}
printf(" => ");
@ -396,10 +417,10 @@ static void print_sample_bts(union perf_event *event,
static void process_event(union perf_event *event, struct perf_sample *sample,
struct perf_evsel *evsel, struct machine *machine,
struct addr_location *al)
struct thread *thread,
struct addr_location *al __maybe_unused)
{
struct perf_event_attr *attr = &evsel->attr;
struct thread *thread = al->thread;
if (output[attr->type].fields == 0)
return;
@ -422,9 +443,10 @@ static void process_event(union perf_event *event, struct perf_sample *sample,
printf(" ");
else
printf("\n");
perf_evsel__print_ip(evsel, event, sample, machine,
PRINT_FIELD(SYM), PRINT_FIELD(DSO),
PRINT_FIELD(SYMOFFSET));
output[attr->type].print_ip_opts,
PERF_MAX_STACK_DEPTH);
}
printf("\n");
@ -479,7 +501,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct machine *machine)
{
struct addr_location al;
struct thread *thread = machine__findnew_thread(machine, event->ip.tid);
struct thread *thread = machine__findnew_thread(machine, sample->pid,
sample->tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
@ -498,7 +521,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return 0;
}
if (perf_event__preprocess_sample(event, machine, &al, sample, 0) < 0) {
if (perf_event__preprocess_sample(event, machine, &al, sample) < 0) {
pr_err("problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@ -510,7 +533,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
return 0;
scripting_ops->process_event(event, sample, evsel, machine, &al);
scripting_ops->process_event(event, sample, evsel, machine, thread, &al);
evsel->hists.stats.total_period += sample->period;
return 0;
@ -523,7 +546,6 @@ static struct perf_tool perf_script = {
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
.attr = perf_event__process_attr,
.event_type = perf_event__process_event_type,
.tracing_data = perf_event__process_tracing_data,
.build_id = perf_event__process_build_id,
.ordered_samples = true,

@ -100,6 +100,7 @@ static const char *pre_cmd = NULL;
static const char *post_cmd = NULL;
static bool sync_run = false;
static unsigned int interval = 0;
static unsigned int initial_delay = 0;
static bool forever = false;
static struct timespec ref_time;
static struct cpu_map *aggr_map;
@ -254,7 +255,8 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
if (!perf_target__has_task(&target) &&
perf_evsel__is_group_leader(evsel)) {
attr->disabled = 1;
attr->enable_on_exec = 1;
if (!initial_delay)
attr->enable_on_exec = 1;
}
return perf_evsel__open_per_thread(evsel, evsel_list->threads);
@ -414,6 +416,22 @@ static void print_interval(void)
list_for_each_entry(counter, &evsel_list->entries, node)
print_counter_aggr(counter, prefix);
}
fflush(output);
}
static void handle_initial_delay(void)
{
struct perf_evsel *counter;
if (initial_delay) {
const int ncpus = cpu_map__nr(evsel_list->cpus),
nthreads = thread_map__nr(evsel_list->threads);
usleep(initial_delay * 1000);
list_for_each_entry(counter, &evsel_list->entries, node)
perf_evsel__enable(counter, ncpus, nthreads);
}
}
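
Hypothetical usage of the new option:

    perf stat -D 100 -- ./mybench    # wait 100ms after launch before counting

With a delay given, create_perf_stat_counter() above leaves enable_on_exec unset, so the counters stay disabled across exec and are only enabled here once the delay has elapsed.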
static int __run_perf_stat(int argc, const char **argv)
@ -486,6 +504,7 @@ static int __run_perf_stat(int argc, const char **argv)
if (forks) {
perf_evlist__start_workload(evsel_list);
handle_initial_delay();
if (interval) {
while (!waitpid(child_pid, &status, WNOHANG)) {
@ -497,6 +516,7 @@ static int __run_perf_stat(int argc, const char **argv)
if (WIFSIGNALED(status))
psignal(WTERMSIG(status), argv[0]);
} else {
handle_initial_delay();
while (!done) {
nanosleep(&ts, NULL);
if (interval)
@ -1419,6 +1439,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
"aggregate counts per processor socket", AGGR_SOCKET),
OPT_SET_UINT(0, "per-core", &aggr_mode,
"aggregate counts per physical processor core", AGGR_CORE),
OPT_UINTEGER('D', "delay", &initial_delay,
"ms to wait before starting measurement after program start"),
OPT_END()
};
const char * const stat_usage[] = {

@ -12,6 +12,8 @@
* of the License.
*/
#include <traceevent/event-parse.h>
#include "builtin.h"
#include "util/util.h"
@ -19,6 +21,7 @@
#include "util/color.h"
#include <linux/list.h>
#include "util/cache.h"
#include "util/evlist.h"
#include "util/evsel.h"
#include <linux/rbtree.h>
#include "util/symbol.h"
@ -328,25 +331,6 @@ struct wakeup_entry {
int success;
};
/*
* trace_flag_type is an enumeration that holds different
* states when a trace occurs. These are:
* IRQS_OFF - interrupts were disabled
* IRQS_NOSUPPORT - arch does not support irqs_disabled_flags
* NEED_RESCED - reschedule is requested
* HARDIRQ - inside an interrupt handler
* SOFTIRQ - inside a softirq handler
*/
enum trace_flag_type {
TRACE_FLAG_IRQS_OFF = 0x01,
TRACE_FLAG_IRQS_NOSUPPORT = 0x02,
TRACE_FLAG_NEED_RESCHED = 0x04,
TRACE_FLAG_HARDIRQ = 0x08,
TRACE_FLAG_SOFTIRQ = 0x10,
};
struct sched_switch {
struct trace_entry te;
char prev_comm[TASK_COMM_LEN];
@ -479,6 +463,8 @@ static void sched_switch(int cpu, u64 timestamp, struct trace_entry *te)
}
}
typedef int (*tracepoint_handler)(struct perf_evsel *evsel,
struct perf_sample *sample);
static int process_sample_event(struct perf_tool *tool __maybe_unused,
union perf_event *event __maybe_unused,
@ -486,8 +472,6 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine __maybe_unused)
{
struct trace_entry *te;
if (evsel->attr.sample_type & PERF_SAMPLE_TIME) {
if (!first_time || first_time > sample->time)
first_time = sample->time;
@ -495,69 +479,90 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
last_time = sample->time;
}
te = (void *)sample->raw_data;
if ((evsel->attr.sample_type & PERF_SAMPLE_RAW) && sample->raw_size > 0) {
char *event_str;
#ifdef SUPPORT_OLD_POWER_EVENTS
struct power_entry_old *peo;
peo = (void *)te;
#endif
/*
* FIXME: use evsel, its already mapped from id to perf_evsel,
* remove perf_header__find_event infrastructure bits.
* Mapping all these "power:cpu_idle" strings to the tracepoint
* ID and then just comparing against evsel->attr.config.
*
* e.g.:
*
* if (evsel->attr.config == power_cpu_idle_id)
*/
event_str = perf_header__find_event(te->type);
if (sample->cpu > numcpus)
numcpus = sample->cpu;
if (!event_str)
return 0;
if (sample->cpu > numcpus)
numcpus = sample->cpu;
if (strcmp(event_str, "power:cpu_idle") == 0) {
struct power_processor_entry *ppe = (void *)te;
if (ppe->state == (u32)PWR_EVENT_EXIT)
c_state_end(ppe->cpu_id, sample->time);
else
c_state_start(ppe->cpu_id, sample->time,
ppe->state);
}
else if (strcmp(event_str, "power:cpu_frequency") == 0) {
struct power_processor_entry *ppe = (void *)te;
p_state_change(ppe->cpu_id, sample->time, ppe->state);
}
else if (strcmp(event_str, "sched:sched_wakeup") == 0)
sched_wakeup(sample->cpu, sample->time, sample->pid, te);
else if (strcmp(event_str, "sched:sched_switch") == 0)
sched_switch(sample->cpu, sample->time, te);
#ifdef SUPPORT_OLD_POWER_EVENTS
if (use_old_power_events) {
if (strcmp(event_str, "power:power_start") == 0)
c_state_start(peo->cpu_id, sample->time,
peo->value);
else if (strcmp(event_str, "power:power_end") == 0)
c_state_end(sample->cpu, sample->time);
else if (strcmp(event_str,
"power:power_frequency") == 0)
p_state_change(peo->cpu_id, sample->time,
peo->value);
}
#endif
if (evsel->handler.func != NULL) {
tracepoint_handler f = evsel->handler.func;
return f(evsel, sample);
}
return 0;
}
static int
process_sample_cpu_idle(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
struct power_processor_entry *ppe = sample->raw_data;
if (ppe->state == (u32) PWR_EVENT_EXIT)
c_state_end(ppe->cpu_id, sample->time);
else
c_state_start(ppe->cpu_id, sample->time, ppe->state);
return 0;
}
static int
process_sample_cpu_frequency(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
struct power_processor_entry *ppe = sample->raw_data;
p_state_change(ppe->cpu_id, sample->time, ppe->state);
return 0;
}
static int
process_sample_sched_wakeup(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
struct trace_entry *te = sample->raw_data;
sched_wakeup(sample->cpu, sample->time, sample->pid, te);
return 0;
}
static int
process_sample_sched_switch(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
struct trace_entry *te = sample->raw_data;
sched_switch(sample->cpu, sample->time, te);
return 0;
}
#ifdef SUPPORT_OLD_POWER_EVENTS
static int
process_sample_power_start(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
struct power_entry_old *peo = sample->raw_data;
c_state_start(peo->cpu_id, sample->time, peo->value);
return 0;
}
static int
process_sample_power_end(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
c_state_end(sample->cpu, sample->time);
return 0;
}
static int
process_sample_power_frequency(struct perf_evsel *evsel __maybe_unused,
struct perf_sample *sample)
{
struct power_entry_old *peo = sample->raw_data;
p_state_change(peo->cpu_id, sample->time, peo->value);
return 0;
}
#endif /* SUPPORT_OLD_POWER_EVENTS */
/*
* After the last sample we need to wrap up the current C/P state
* and close out each CPU for these.
@ -974,6 +979,17 @@ static int __cmd_timechart(const char *output_name)
.sample = process_sample_event,
.ordered_samples = true,
};
const struct perf_evsel_str_handler power_tracepoints[] = {
{ "power:cpu_idle", process_sample_cpu_idle },
{ "power:cpu_frequency", process_sample_cpu_frequency },
{ "sched:sched_wakeup", process_sample_sched_wakeup },
{ "sched:sched_switch", process_sample_sched_switch },
#ifdef SUPPORT_OLD_POWER_EVENTS
{ "power:power_start", process_sample_power_start },
{ "power:power_end", process_sample_power_end },
{ "power:power_frequency", process_sample_power_frequency },
#endif
};
struct perf_session *session = perf_session__new(input_name, O_RDONLY,
0, false, &perf_timechart);
int ret = -EINVAL;
@ -984,6 +1000,12 @@ static int __cmd_timechart(const char *output_name)
if (!perf_session__has_traces(session, "timechart record"))
goto out_delete;
if (perf_session__set_tracepoints_handlers(session,
power_tracepoints)) {
pr_err("Initializing session tracepoint handlers failed\n");
goto out_delete;
}
ret = perf_session__process_events(session, &perf_timechart);
if (ret)
goto out_delete;

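The timechart rewrite above replaces a strcmp() chain in process_sample_event() with per-tracepoint handlers registered once through perf_session__set_tracepoints_handlers(). A self-contained sketch of that dispatch pattern, with the perf types reduced to an illustrative generic callback:

    #include <stddef.h>
    #include <string.h>

    typedef int (*sample_handler)(const void *sample);

    struct str_handler {
            const char *name;
            sample_handler handler;
    };

    /* Resolve each tracepoint name to its handler once, up front; the
     * per-sample path then becomes a single indirect call. */
    static sample_handler find_handler(const struct str_handler *tbl,
                                       size_t n, const char *name)
    {
            size_t i;

            for (i = 0; i < n; i++)
                    if (strcmp(tbl[i].name, name) == 0)
                            return tbl[i].handler;
            return NULL;
    }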

@ -40,6 +40,7 @@
#include "util/xyarray.h"
#include "util/sort.h"
#include "util/intlist.h"
#include "arch/common.h"
#include "util/debug.h"
@ -102,7 +103,8 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
/*
* We can't annotate with just /proc/kallsyms
*/
if (map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS) {
if (map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
!dso__is_kcore(map->dso)) {
pr_err("Can't annotate %s: No vmlinux file was found in the "
"path\n", sym->name);
sleep(1);
@ -237,8 +239,6 @@ out_unlock:
pthread_mutex_unlock(&notes->lock);
}
static const char CONSOLE_CLEAR[] = "\033[H\033[2J";
static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
struct addr_location *al,
struct perf_sample *sample)
@ -689,7 +689,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
{
struct perf_top *top = container_of(tool, struct perf_top, tool);
struct symbol *parent = NULL;
u64 ip = event->ip.ip;
u64 ip = sample->ip;
struct addr_location al;
int err;
@ -699,10 +699,10 @@ static void perf_event__process_sample(struct perf_tool *tool,
if (!seen)
seen = intlist__new(NULL);
if (!intlist__has_entry(seen, event->ip.pid)) {
if (!intlist__has_entry(seen, sample->pid)) {
pr_err("Can't find guest [%d]'s kernel information\n",
event->ip.pid);
intlist__add(seen, event->ip.pid);
sample->pid);
intlist__add(seen, sample->pid);
}
return;
}
@ -716,8 +716,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
if (event->header.misc & PERF_RECORD_MISC_EXACT_IP)
top->exact_samples++;
if (perf_event__preprocess_sample(event, machine, &al, sample,
symbol_filter) < 0 ||
if (perf_event__preprocess_sample(event, machine, &al, sample) < 0 ||
al.filtered)
return;
@ -772,8 +771,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
sample->callchain) {
err = machine__resolve_callchain(machine, evsel,
al.thread, sample,
&parent);
&parent, &al);
if (err)
return;
}
@ -838,7 +836,8 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
break;
case PERF_RECORD_MISC_GUEST_KERNEL:
++top->guest_kernel_samples;
machine = perf_session__find_machine(session, event->ip.pid);
machine = perf_session__find_machine(session,
sample.pid);
break;
case PERF_RECORD_MISC_GUEST_USER:
++top->guest_us_samples;
@ -939,6 +938,14 @@ static int __cmd_top(struct perf_top *top)
if (top->session == NULL)
return -ENOMEM;
machines__set_symbol_filter(&top->session->machines, symbol_filter);
if (!objdump_path) {
ret = perf_session_env__lookup_objdump(&top->session->header.env);
if (ret)
goto out_delete;
}
ret = perf_top__setup_sample_type(top);
if (ret)
goto out_delete;
@ -1102,6 +1109,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_CALLBACK_DEFAULT('G', "call-graph", &top.record_opts,
"mode[,dump_size]", record_callchain_help,
&parse_callchain_opt, "fp"),
OPT_CALLBACK(0, "ignore-callees", NULL, "regex",
"ignore callees of these functions in call graphs",
report_parse_ignore_callees_opt),
OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
"Show a column with the sum of periods"),
OPT_STRING(0, "dsos", &symbol_conf.dso_list_str, "dso[,dso...]",
@ -1114,6 +1124,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
"Interleave source code with assembly code (default)"),
OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING(0, "objdump", &objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
"Specify disassembler style (e.g. -M intel for intel syntax)"),
OPT_STRING('u', "uid", &target->uid_str, "user", "user to profile"),

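perf_event__process_sample() above recovers its struct perf_top from the embedded 'tool' member via container_of(). A minimal sketch of that idiom with hypothetical type names:

    #include <stddef.h>

    /* Recover the enclosing struct from a pointer to one of its members,
     * as the kernel's container_of() does. */
    #define container_of_sketch(ptr, type, member) \
            ((type *)((char *)(ptr) - offsetof(type, member)))

    struct outer {
            int other_state;
            int tool;       /* embedded member, as in struct perf_top */
    };

    static struct outer *outer_from_tool(int *tool_ptr)
    {
            return container_of_sketch(tool_ptr, struct outer, tool);
    }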
File diff suppressed because it is too large


@ -46,6 +46,8 @@ ifneq ($(obj-perf),)
obj-perf := $(abspath $(obj-perf))/
endif
LIB_INCLUDE := $(srctree)/tools/lib/
# include ARCH specific config
-include $(src-perf)/arch/$(ARCH)/Makefile
@ -121,8 +123,7 @@ endif
CFLAGS += -I$(src-perf)/util
CFLAGS += -I$(src-perf)
CFLAGS += -I$(TRACE_EVENT_DIR)
CFLAGS += -I$(srctree)/tools/lib/
CFLAGS += -I$(LIB_INCLUDE)
CFLAGS += -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE


@ -125,6 +125,9 @@
#ifndef NSEC_PER_SEC
# define NSEC_PER_SEC 1000000000ULL
#endif
#ifndef NSEC_PER_USEC
# define NSEC_PER_USEC 1000ULL
#endif
static inline unsigned long long rdclock(void)
{


@ -21,7 +21,7 @@ def main():
evsel = perf.evsel(task = 1, comm = 1, mmap = 0,
wakeup_events = 1, watermark = 1,
sample_id_all = 1,
sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_TID | perf.SAMPLE_CPU | perf.SAMPLE_TID)
sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_TID | perf.SAMPLE_CPU)
evsel.open(cpus = cpus, threads = threads);
evlist = perf.evlist(cpus, threads)
evlist.add(evsel)


@ -0,0 +1,36 @@
[config]
command = record
args = -e '{cycles,cache-misses}:S' kill >/dev/null 2>&1
[event-1:base-record]
fd=1
group_fd=-1
sample_type=343
read_format=12
inherit=0
[event-2:base-record]
fd=2
group_fd=1
# cache-misses
type=0
config=3
# default | PERF_SAMPLE_READ
sample_type=343
# PERF_FORMAT_ID | PERF_FORMAT_GROUP
read_format=12
mmap=0
comm=0
enable_on_exec=0
disabled=0
# inherit is disabled for group sampling
inherit=0
# sampling disabled
sample_freq=0
sample_period=0
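# For reference, a hedged decode of the numeric fields above (bit values
# from include/uapi/linux/perf_event.h):
#   sample_type 343 = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME
#                   | PERF_SAMPLE_READ | PERF_SAMPLE_ID | PERF_SAMPLE_PERIOD
#                   = 1 + 2 + 4 + 16 + 64 + 256
#   read_format  12 = PERF_FORMAT_ID | PERF_FORMAT_GROUP = 4 + 8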


@ -93,6 +93,24 @@ static struct test {
.desc = "Test software clock events have valid period values",
.func = test__sw_clock_freq,
},
#if defined(__x86_64__) || defined(__i386__)
{
.desc = "Test converting perf time to TSC",
.func = test__perf_time_to_tsc,
},
#endif
{
.desc = "Test object code reading",
.func = test__code_reading,
},
{
.desc = "Test sample parsing",
.func = test__sample_parsing,
},
{
.desc = "Test using a dummy software event to keep tracking",
.func = test__keep_tracking,
},
{
.func = NULL,
},


@ -0,0 +1,572 @@
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <inttypes.h>
#include <ctype.h>
#include <string.h>
#include "parse-events.h"
#include "evlist.h"
#include "evsel.h"
#include "thread_map.h"
#include "cpumap.h"
#include "machine.h"
#include "event.h"
#include "thread.h"
#include "tests.h"
#define BUFSZ 1024
#define READLEN 128
struct state {
u64 done[1024];
size_t done_cnt;
};
static unsigned int hex(char c)
{
if (c >= '0' && c <= '9')
return c - '0';
if (c >= 'a' && c <= 'f')
return c - 'a' + 10;
return c - 'A' + 10;
}
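/* Worked example for hex() and the byte pairing in read_objdump_line()
 * below: the objdump byte "4f" parses as (hex('4') << 4) | hex('f')
 * = (4 << 4) | 15 = 0x4f. */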
static void read_objdump_line(const char *line, size_t line_len, void **buf,
size_t *len)
{
const char *p;
size_t i;
/* Skip to a colon */
p = strchr(line, ':');
if (!p)
return;
i = p + 1 - line;
/* Read bytes */
while (*len) {
char c1, c2;
/* Skip spaces */
for (; i < line_len; i++) {
if (!isspace(line[i]))
break;
}
/* Get 2 hex digits */
if (i >= line_len || !isxdigit(line[i]))
break;
c1 = line[i++];
if (i >= line_len || !isxdigit(line[i]))
break;
c2 = line[i++];
/* Followed by a space */
if (i < line_len && line[i] && !isspace(line[i]))
break;
/* Store byte */
*(unsigned char *)*buf = (hex(c1) << 4) | hex(c2);
*buf += 1;
*len -= 1;
}
}
static int read_objdump_output(FILE *f, void **buf, size_t *len)
{
char *line = NULL;
size_t line_len;
ssize_t ret;
int err = 0;
while (1) {
ret = getline(&line, &line_len, f);
if (feof(f))
break;
if (ret < 0) {
pr_debug("getline failed\n");
err = -1;
break;
}
read_objdump_line(line, ret, buf, len);
}
free(line);
return err;
}
static int read_via_objdump(const char *filename, u64 addr, void *buf,
size_t len)
{
char cmd[PATH_MAX * 2];
const char *fmt;
FILE *f;
int ret;
fmt = "%s -d --start-address=0x%"PRIx64" --stop-address=0x%"PRIx64" %s";
ret = snprintf(cmd, sizeof(cmd), fmt, "objdump", addr, addr + len,
filename);
if (ret <= 0 || (size_t)ret >= sizeof(cmd))
return -1;
pr_debug("Objdump command is: %s\n", cmd);
/* Ignore objdump errors */
strcat(cmd, " 2>/dev/null");
f = popen(cmd, "r");
if (!f) {
pr_debug("popen failed\n");
return -1;
}
ret = read_objdump_output(f, &buf, &len);
if (len) {
pr_debug("objdump read too few bytes\n");
if (!ret)
ret = len;
}
pclose(f);
return ret;
}
static int read_object_code(u64 addr, size_t len, u8 cpumode,
struct thread *thread, struct machine *machine,
struct state *state)
{
struct addr_location al;
unsigned char buf1[BUFSZ];
unsigned char buf2[BUFSZ];
size_t ret_len;
u64 objdump_addr;
int ret;
pr_debug("Reading object code for memory address: %#"PRIx64"\n", addr);
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION, addr,
&al);
if (!al.map || !al.map->dso) {
pr_debug("thread__find_addr_map failed\n");
return -1;
}
pr_debug("File is: %s\n", al.map->dso->long_name);
if (al.map->dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
!dso__is_kcore(al.map->dso)) {
pr_debug("Unexpected kernel address - skipping\n");
return 0;
}
pr_debug("On file address is: %#"PRIx64"\n", al.addr);
if (len > BUFSZ)
len = BUFSZ;
/* Do not go off the map */
if (addr + len > al.map->end)
len = al.map->end - addr;
/* Read the object code using perf */
ret_len = dso__data_read_offset(al.map->dso, machine, al.addr, buf1,
len);
if (ret_len != len) {
pr_debug("dso__data_read_offset failed\n");
return -1;
}
/*
* Converting addresses for use by objdump requires more information.
* map__load() does that. See map__rip_2objdump() for details.
*/
if (map__load(al.map, NULL))
return -1;
/* objdump struggles with kcore - try each map only once */
if (dso__is_kcore(al.map->dso)) {
size_t d;
for (d = 0; d < state->done_cnt; d++) {
if (state->done[d] == al.map->start) {
pr_debug("kcore map tested already");
pr_debug(" - skipping\n");
return 0;
}
}
if (state->done_cnt >= ARRAY_SIZE(state->done)) {
pr_debug("Too many kcore maps - skipping\n");
return 0;
}
state->done[state->done_cnt++] = al.map->start;
}
/* Read the object code using objdump */
objdump_addr = map__rip_2objdump(al.map, al.addr);
ret = read_via_objdump(al.map->dso->long_name, objdump_addr, buf2, len);
if (ret > 0) {
/*
* The kernel maps are inaccurate - assume objdump is right in
* that case.
*/
if (cpumode == PERF_RECORD_MISC_KERNEL ||
cpumode == PERF_RECORD_MISC_GUEST_KERNEL) {
len -= ret;
if (len) {
pr_debug("Reducing len to %zu\n", len);
} else if (dso__is_kcore(al.map->dso)) {
/*
* objdump cannot handle very large segments
* that may be found in kcore.
*/
pr_debug("objdump failed for kcore");
pr_debug(" - skipping\n");
return 0;
} else {
return -1;
}
}
}
if (ret < 0) {
pr_debug("read_via_objdump failed\n");
return -1;
}
/* The results should be identical */
if (memcmp(buf1, buf2, len)) {
pr_debug("Bytes read differ from those read by objdump\n");
return -1;
}
pr_debug("Bytes read match those read by objdump\n");
return 0;
}
static int process_sample_event(struct machine *machine,
struct perf_evlist *evlist,
union perf_event *event, struct state *state)
{
struct perf_sample sample;
struct thread *thread;
u8 cpumode;
if (perf_evlist__parse_sample(evlist, event, &sample)) {
pr_debug("perf_evlist__parse_sample failed\n");
return -1;
}
thread = machine__findnew_thread(machine, sample.pid, sample.pid);
if (!thread) {
pr_debug("machine__findnew_thread failed\n");
return -1;
}
cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
return read_object_code(sample.ip, READLEN, cpumode, thread, machine,
state);
}
static int process_event(struct machine *machine, struct perf_evlist *evlist,
union perf_event *event, struct state *state)
{
if (event->header.type == PERF_RECORD_SAMPLE)
return process_sample_event(machine, evlist, event, state);
if (event->header.type < PERF_RECORD_MAX)
return machine__process_event(machine, event);
return 0;
}
static int process_events(struct machine *machine, struct perf_evlist *evlist,
struct state *state)
{
union perf_event *event;
int i, ret;
for (i = 0; i < evlist->nr_mmaps; i++) {
while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
ret = process_event(machine, evlist, event, state);
if (ret < 0)
return ret;
}
}
return 0;
}
static int comp(const void *a, const void *b)
{
return *(int *)a - *(int *)b;
}
static void do_sort_something(void)
{
int buf[40960], i;
for (i = 0; i < (int)ARRAY_SIZE(buf); i++)
buf[i] = ARRAY_SIZE(buf) - i - 1;
qsort(buf, ARRAY_SIZE(buf), sizeof(int), comp);
for (i = 0; i < (int)ARRAY_SIZE(buf); i++) {
if (buf[i] != i) {
pr_debug("qsort failed\n");
break;
}
}
}
static void sort_something(void)
{
int i;
for (i = 0; i < 10; i++)
do_sort_something();
}
static void syscall_something(void)
{
int pipefd[2];
int i;
for (i = 0; i < 1000; i++) {
if (pipe(pipefd) < 0) {
pr_debug("pipe failed\n");
break;
}
close(pipefd[1]);
close(pipefd[0]);
}
}
static void fs_something(void)
{
const char *test_file_name = "temp-perf-code-reading-test-file--";
FILE *f;
int i;
for (i = 0; i < 1000; i++) {
f = fopen(test_file_name, "w+");
if (f) {
fclose(f);
unlink(test_file_name);
}
}
}
static void do_something(void)
{
fs_something();
sort_something();
syscall_something();
}
enum {
TEST_CODE_READING_OK,
TEST_CODE_READING_NO_VMLINUX,
TEST_CODE_READING_NO_KCORE,
TEST_CODE_READING_NO_ACCESS,
TEST_CODE_READING_NO_KERNEL_OBJ,
};
static int do_test_code_reading(bool try_kcore)
{
struct machines machines;
struct machine *machine;
struct thread *thread;
struct perf_record_opts opts = {
.mmap_pages = UINT_MAX,
.user_freq = UINT_MAX,
.user_interval = ULLONG_MAX,
.freq = 4000,
.target = {
.uses_mmap = true,
},
};
struct state state = {
.done_cnt = 0,
};
struct thread_map *threads = NULL;
struct cpu_map *cpus = NULL;
struct perf_evlist *evlist = NULL;
struct perf_evsel *evsel = NULL;
int err = -1, ret;
pid_t pid;
struct map *map;
bool have_vmlinux, have_kcore, excl_kernel = false;
pid = getpid();
machines__init(&machines);
machine = &machines.host;
ret = machine__create_kernel_maps(machine);
if (ret < 0) {
pr_debug("machine__create_kernel_maps failed\n");
goto out_err;
}
/* Force the use of kallsyms instead of vmlinux to try kcore */
if (try_kcore)
symbol_conf.kallsyms_name = "/proc/kallsyms";
/* Load kernel map */
map = machine->vmlinux_maps[MAP__FUNCTION];
ret = map__load(map, NULL);
if (ret < 0) {
pr_debug("map__load failed\n");
goto out_err;
}
have_vmlinux = dso__is_vmlinux(map->dso);
have_kcore = dso__is_kcore(map->dso);
/* 2nd time through we just try kcore */
if (try_kcore && !have_kcore)
return TEST_CODE_READING_NO_KCORE;
/* No point getting kernel events if there is no kernel object */
if (!have_vmlinux && !have_kcore)
excl_kernel = true;
threads = thread_map__new_by_tid(pid);
if (!threads) {
pr_debug("thread_map__new_by_tid failed\n");
goto out_err;
}
ret = perf_event__synthesize_thread_map(NULL, threads,
perf_event__process, machine);
if (ret < 0) {
pr_debug("perf_event__synthesize_thread_map failed\n");
goto out_err;
}
thread = machine__findnew_thread(machine, pid, pid);
if (!thread) {
pr_debug("machine__findnew_thread failed\n");
goto out_err;
}
cpus = cpu_map__new(NULL);
if (!cpus) {
pr_debug("cpu_map__new failed\n");
goto out_err;
}
while (1) {
const char *str;
evlist = perf_evlist__new();
if (!evlist) {
pr_debug("perf_evlist__new failed\n");
goto out_err;
}
perf_evlist__set_maps(evlist, cpus, threads);
if (excl_kernel)
str = "cycles:u";
else
str = "cycles";
pr_debug("Parsing event '%s'\n", str);
ret = parse_events(evlist, str);
if (ret < 0) {
pr_debug("parse_events failed\n");
goto out_err;
}
perf_evlist__config(evlist, &opts);
evsel = perf_evlist__first(evlist);
evsel->attr.comm = 1;
evsel->attr.disabled = 1;
evsel->attr.enable_on_exec = 0;
ret = perf_evlist__open(evlist);
if (ret < 0) {
if (!excl_kernel) {
excl_kernel = true;
perf_evlist__delete(evlist);
evlist = NULL;
continue;
}
pr_debug("perf_evlist__open failed\n");
goto out_err;
}
break;
}
ret = perf_evlist__mmap(evlist, UINT_MAX, false);
if (ret < 0) {
pr_debug("perf_evlist__mmap failed\n");
goto out_err;
}
perf_evlist__enable(evlist);
do_something();
perf_evlist__disable(evlist);
ret = process_events(machine, evlist, &state);
if (ret < 0)
goto out_err;
if (!have_vmlinux && !have_kcore && !try_kcore)
err = TEST_CODE_READING_NO_KERNEL_OBJ;
else if (!have_vmlinux && !try_kcore)
err = TEST_CODE_READING_NO_VMLINUX;
else if (excl_kernel)
err = TEST_CODE_READING_NO_ACCESS;
else
err = TEST_CODE_READING_OK;
out_err:
if (evlist) {
perf_evlist__munmap(evlist);
perf_evlist__close(evlist);
perf_evlist__delete(evlist);
}
if (cpus)
cpu_map__delete(cpus);
if (threads)
thread_map__delete(threads);
machines__destroy_kernel_maps(&machines);
machine__delete_threads(machine);
machines__exit(&machines);
return err;
}
int test__code_reading(void)
{
int ret;
ret = do_test_code_reading(false);
if (!ret)
ret = do_test_code_reading(true);
switch (ret) {
case TEST_CODE_READING_OK:
return 0;
case TEST_CODE_READING_NO_VMLINUX:
fprintf(stderr, " (no vmlinux)");
return 0;
case TEST_CODE_READING_NO_KCORE:
fprintf(stderr, " (no kcore)");
return 0;
case TEST_CODE_READING_NO_ACCESS:
fprintf(stderr, " (no access)");
return 0;
case TEST_CODE_READING_NO_KERNEL_OBJ:
fprintf(stderr, " (no kernel obj)");
return 0;
default:
return -1;
};
}

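The test above drives everything through the evlist helpers; for orientation, a hedged sketch of the raw-syscall equivalent of its 'cycles:u' fallback configuration (disabled at creation and enabled explicitly, rather than on exec); the function name is illustrative:

    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>

    static int open_cycles_u(void)
    {
            struct perf_event_attr attr;

            memset(&attr, 0, sizeof(attr));
            attr.size = sizeof(attr);
            attr.type = PERF_TYPE_HARDWARE;
            attr.config = PERF_COUNT_HW_CPU_CYCLES;
            attr.exclude_kernel = 1;        /* the ':u' modifier */
            attr.disabled = 1;              /* enabled later, not on exec */

            /* current thread, any CPU, no group leader, no flags */
            return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    }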

@ -10,14 +10,6 @@
#include "symbol.h"
#include "tests.h"
#define TEST_ASSERT_VAL(text, cond) \
do { \
if (!(cond)) { \
pr_debug("FAILED %s:%d %s\n", __FILE__, __LINE__, text); \
return -1; \
} \
} while (0)
static char *test_file(int size)
{
static char buf_templ[] = "/tmp/test-XXXXXX";


@ -1,6 +1,6 @@
#include <traceevent/event-parse.h>
#include "evsel.h"
#include "tests.h"
#include "event-parse.h"
static int perf_evsel__test_field(struct perf_evsel *evsel, const char *name,
int size, bool should_be_signed)
@ -49,7 +49,7 @@ int test__perf_evsel__tp_sched_test(void)
if (perf_evsel__test_field(evsel, "prev_prio", 4, true))
ret = -1;
if (perf_evsel__test_field(evsel, "prev_state", 8, true))
if (perf_evsel__test_field(evsel, "prev_state", sizeof(long), true))
ret = -1;
if (perf_evsel__test_field(evsel, "next_comm", 16, true))


@ -88,7 +88,8 @@ static struct machine *setup_fake_machine(struct machines *machines)
for (i = 0; i < ARRAY_SIZE(fake_threads); i++) {
struct thread *thread;
thread = machine__findnew_thread(machine, fake_threads[i].pid);
thread = machine__findnew_thread(machine, fake_threads[i].pid,
fake_threads[i].pid);
if (thread == NULL)
goto out;
@ -210,17 +211,15 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
list_for_each_entry(evsel, &evlist->entries, node) {
for (k = 0; k < ARRAY_SIZE(fake_common_samples); k++) {
const union perf_event event = {
.ip = {
.header = {
.misc = PERF_RECORD_MISC_USER,
},
.pid = fake_common_samples[k].pid,
.ip = fake_common_samples[k].ip,
.header = {
.misc = PERF_RECORD_MISC_USER,
},
};
sample.pid = fake_common_samples[k].pid;
sample.ip = fake_common_samples[k].ip;
if (perf_event__preprocess_sample(&event, machine, &al,
&sample, 0) < 0)
&sample) < 0)
goto out;
he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1);
@ -234,17 +233,15 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
for (k = 0; k < ARRAY_SIZE(fake_samples[i]); k++) {
const union perf_event event = {
.ip = {
.header = {
.misc = PERF_RECORD_MISC_USER,
},
.pid = fake_samples[i][k].pid,
.ip = fake_samples[i][k].ip,
.header = {
.misc = PERF_RECORD_MISC_USER,
},
};
sample.pid = fake_samples[i][k].pid;
sample.ip = fake_samples[i][k].ip;
if (perf_event__preprocess_sample(&event, machine, &al,
&sample, 0) < 0)
&sample) < 0)
goto out;
he = __hists__add_entry(&evsel->hists, &al, NULL, 1, 1);


@ -0,0 +1,154 @@
#include <sys/types.h>
#include <unistd.h>
#include <sys/prctl.h>
#include "parse-events.h"
#include "evlist.h"
#include "evsel.h"
#include "thread_map.h"
#include "cpumap.h"
#include "tests.h"
#define CHECK__(x) { \
while ((x) < 0) { \
pr_debug(#x " failed!\n"); \
goto out_err; \
} \
}
#define CHECK_NOT_NULL__(x) { \
while ((x) == NULL) { \
pr_debug(#x " failed!\n"); \
goto out_err; \
} \
}
static int find_comm(struct perf_evlist *evlist, const char *comm)
{
union perf_event *event;
int i, found;
found = 0;
for (i = 0; i < evlist->nr_mmaps; i++) {
while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
if (event->header.type == PERF_RECORD_COMM &&
(pid_t)event->comm.pid == getpid() &&
(pid_t)event->comm.tid == getpid() &&
strcmp(event->comm.comm, comm) == 0)
found += 1;
}
}
return found;
}
/**
* test__keep_tracking - test using a dummy software event to keep tracking.
*
* This function implements a test that checks that tracking events continue
* when an event is disabled but a dummy software event is not disabled. If the
* test passes %0 is returned, otherwise %-1 is returned.
*/
int test__keep_tracking(void)
{
struct perf_record_opts opts = {
.mmap_pages = UINT_MAX,
.user_freq = UINT_MAX,
.user_interval = ULLONG_MAX,
.freq = 4000,
.target = {
.uses_mmap = true,
},
};
struct thread_map *threads = NULL;
struct cpu_map *cpus = NULL;
struct perf_evlist *evlist = NULL;
struct perf_evsel *evsel = NULL;
int found, err = -1;
const char *comm;
threads = thread_map__new(-1, getpid(), UINT_MAX);
CHECK_NOT_NULL__(threads);
cpus = cpu_map__new(NULL);
CHECK_NOT_NULL__(cpus);
evlist = perf_evlist__new();
CHECK_NOT_NULL__(evlist);
perf_evlist__set_maps(evlist, cpus, threads);
CHECK__(parse_events(evlist, "dummy:u"));
CHECK__(parse_events(evlist, "cycles:u"));
perf_evlist__config(evlist, &opts);
evsel = perf_evlist__first(evlist);
evsel->attr.comm = 1;
evsel->attr.disabled = 1;
evsel->attr.enable_on_exec = 0;
if (perf_evlist__open(evlist) < 0) {
fprintf(stderr, " (not supported)");
err = 0;
goto out_err;
}
CHECK__(perf_evlist__mmap(evlist, UINT_MAX, false));
/*
* First, test that a 'comm' event can be found when the event is
* enabled.
*/
perf_evlist__enable(evlist);
comm = "Test COMM 1";
CHECK__(prctl(PR_SET_NAME, (unsigned long)comm, 0, 0, 0));
perf_evlist__disable(evlist);
found = find_comm(evlist, comm);
if (found != 1) {
pr_debug("First time, failed to find tracking event.\n");
goto out_err;
}
/*
* Secondly, test that a 'comm' event can be found when the event is
* disabled with the dummy event still enabled.
*/
perf_evlist__enable(evlist);
evsel = perf_evlist__last(evlist);
CHECK__(perf_evlist__disable_event(evlist, evsel));
comm = "Test COMM 2";
CHECK__(prctl(PR_SET_NAME, (unsigned long)comm, 0, 0, 0));
perf_evlist__disable(evlist);
found = find_comm(evlist, comm);
if (found != 1) {
pr_debug("Seconf time, failed to find tracking event.\n");
goto out_err;
}
err = 0;
out_err:
if (evlist) {
perf_evlist__disable(evlist);
perf_evlist__munmap(evlist);
perf_evlist__close(evlist);
perf_evlist__delete(evlist);
}
if (cpus)
cpu_map__delete(cpus);
if (threads)
thread_map__delete(threads);
return err;
}

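The dummy software event exercised above is new with this merge: it produces no samples of its own but keeps the side-band mmap/comm records flowing while the real counters are disabled. A hedged sketch of the corresponding attr setup (function name illustrative):

    #include <string.h>
    #include <linux/perf_event.h>

    static void init_dummy_attr(struct perf_event_attr *attr)
    {
            memset(attr, 0, sizeof(*attr));
            attr->size = sizeof(*attr);
            attr->type = PERF_TYPE_SOFTWARE;
            attr->config = PERF_COUNT_SW_DUMMY; /* added by this merge */
            attr->mmap = 1;                     /* keep mmap records coming */
            attr->comm = 1;                     /* keep comm records coming */
    }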

@ -1,6 +1,8 @@
PERF := .
MK := Makefile
has = $(shell which $1 2>/dev/null)
# standard single make variable specified
make_clean_all := clean all
make_python_perf_so := python/perf.so
@ -25,6 +27,13 @@ make_help := help
make_doc := doc
make_perf_o := perf.o
make_util_map_o := util/map.o
make_install := install
make_install_bin := install-bin
make_install_doc := install-doc
make_install_man := install-man
make_install_html := install-html
make_install_info := install-info
make_install_pdf := install-pdf
# all the NO_* variable combined
make_minimal := NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1
@ -50,14 +59,27 @@ run += make_no_backtrace
run += make_no_libnuma
run += make_no_libaudit
run += make_no_libbionic
run += make_tags
run += make_cscope
run += make_help
run += make_doc
run += make_perf_o
run += make_util_map_o
run += make_install
run += make_install_bin
# FIXME 'install-*' commented out till they're fixed
# run += make_install_doc
# run += make_install_man
# run += make_install_html
# run += make_install_info
# run += make_install_pdf
run += make_minimal
ifneq ($(call has,ctags),)
run += make_tags
endif
ifneq ($(call has,cscope),)
run += make_cscope
endif
# $(run_O) contains same portion of $(run) tests with '_O' attached
# to distinguish O=... tests
run_O := $(addsuffix _O,$(run))
@ -84,6 +106,31 @@ test_make_python_perf_so := test -f $(PERF)/python/perf.so
test_make_perf_o := test -f $(PERF)/perf.o
test_make_util_map_o := test -f $(PERF)/util/map.o
test_make_install := test -x $$TMP_DEST/bin/perf
test_make_install_O := $(test_make_install)
test_make_install_bin := $(test_make_install)
test_make_install_bin_O := $(test_make_install)
# FIXME nothing gets installed
test_make_install_man := test -f $$TMP_DEST/share/man/man1/perf.1
test_make_install_man_O := $(test_make_install_man)
# FIXME nothing gets installed
test_make_install_doc := $(test_ok)
test_make_install_doc_O := $(test_ok)
# FIXME nothing gets installed
test_make_install_html := $(test_ok)
test_make_install_html_O := $(test_ok)
# FIXME nothing gets installed
test_make_install_info := $(test_ok)
test_make_install_info_O := $(test_ok)
# FIXME nothing gets installed
test_make_install_pdf := $(test_ok)
test_make_install_pdf_O := $(test_ok)
# Kbuild tests only
#test_make_python_perf_so_O := test -f $$TMP/tools/perf/python/perf.so
#test_make_perf_o_O := test -f $$TMP/tools/perf/perf.o
@ -95,7 +142,7 @@ test_make_util_map_o_O := true
test_default = test -x $(PERF)/perf
test = $(if $(test_$1),$(test_$1),$(test_default))
test_default_O = test -x $$TMP/perf
test_default_O = test -x $$TMP_O/perf
test_O = $(if $(test_$1),$(test_$1),$(test_default_O))
all:
@ -111,23 +158,27 @@ clean := @(cd $(PERF); make -s -f $(MK) clean >/dev/null)
$(run):
$(call clean)
@cmd="cd $(PERF) && make -f $(MK) $($@)"; \
@TMP_DEST=$$(mktemp -d); \
cmd="cd $(PERF) && make -f $(MK) DESTDIR=$$TMP_DEST $($@)"; \
echo "- $@: $$cmd" && echo $$cmd > $@ && \
( eval $$cmd ) >> $@ 2>&1; \
echo " test: $(call test,$@)"; \
$(call test,$@) && \
rm -f $@
rm -f $@ && \
rm -rf $$TMP_DEST
$(run_O):
$(call clean)
@TMP=$$(mktemp -d); \
cmd="cd $(PERF) && make -f $(MK) $($(patsubst %_O,%,$@)) O=$$TMP"; \
@TMP_O=$$(mktemp -d); \
TMP_DEST=$$(mktemp -d); \
cmd="cd $(PERF) && make -f $(MK) O=$$TMP_O DESTDIR=$$TMP_DEST $($(patsubst %_O,%,$@))"; \
echo "- $@: $$cmd" && echo $$cmd > $@ && \
( eval $$cmd ) >> $@ 2>&1 && \
echo " test: $(call test_O,$@)"; \
$(call test_O,$@) && \
rm -f $@ && \
rm -rf $$TMP
rm -rf $$TMP_O && \
rm -rf $$TMP_DEST
all: $(run) $(run_O)
@echo OK


@ -72,7 +72,7 @@ int test__basic_mmap(void)
}
evsels[i]->attr.wakeup_events = 1;
perf_evsel__set_sample_id(evsels[i]);
perf_evsel__set_sample_id(evsels[i], false);
perf_evlist__add(evlist, evsels[i]);


@ -7,14 +7,6 @@
#include "tests.h"
#include <linux/hw_breakpoint.h>
#define TEST_ASSERT_VAL(text, cond) \
do { \
if (!(cond)) { \
pr_debug("FAILED %s:%d %s\n", __FILE__, __LINE__, text); \
return -1; \
} \
} while (0)
#define PERF_TP_SAMPLE_TYPE (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
@ -460,6 +452,7 @@ static int test__checkevent_pmu_events(struct perf_evlist *evlist)
evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong pinned", !evsel->attr.pinned);
return 0;
}
@ -528,6 +521,7 @@ static int test__group1(struct perf_evlist *evlist)
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cycles:upp */
evsel = perf_evsel__next(evsel);
@ -543,6 +537,7 @@ static int test__group1(struct perf_evlist *evlist)
TEST_ASSERT_VAL("wrong precise_ip", evsel->attr.precise_ip == 2);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
}
@ -568,6 +563,7 @@ static int test__group2(struct perf_evlist *evlist)
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cache-references + :u modifier */
evsel = perf_evsel__next(evsel);
@ -582,6 +578,7 @@ static int test__group2(struct perf_evlist *evlist)
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cycles:k */
evsel = perf_evsel__next(evsel);
@ -595,6 +592,7 @@ static int test__group2(struct perf_evlist *evlist)
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
}
@ -623,6 +621,7 @@ static int test__group3(struct perf_evlist *evlist __maybe_unused)
!strcmp(leader->group_name, "group1"));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* group1 cycles:kppp */
evsel = perf_evsel__next(evsel);
@ -639,6 +638,7 @@ static int test__group3(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* group2 cycles + G modifier */
evsel = leader = perf_evsel__next(evsel);
@ -656,6 +656,7 @@ static int test__group3(struct perf_evlist *evlist __maybe_unused)
!strcmp(leader->group_name, "group2"));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* group2 1:3 + G modifier */
evsel = perf_evsel__next(evsel);
@ -669,6 +670,7 @@ static int test__group3(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions:u */
evsel = perf_evsel__next(evsel);
@ -682,6 +684,7 @@ static int test__group3(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
}
@ -709,6 +712,7 @@ static int test__group4(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions:kp + p */
evsel = perf_evsel__next(evsel);
@ -724,6 +728,7 @@ static int test__group4(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong precise_ip", evsel->attr.precise_ip == 2);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
}
@ -750,6 +755,7 @@ static int test__group5(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions + G */
evsel = perf_evsel__next(evsel);
@ -764,6 +770,7 @@ static int test__group5(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cycles:G */
evsel = leader = perf_evsel__next(evsel);
@ -780,6 +787,7 @@ static int test__group5(struct perf_evlist *evlist __maybe_unused)
TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong nr_members", evsel->nr_members == 2);
TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions:G */
evsel = perf_evsel__next(evsel);
@ -971,6 +979,142 @@ static int test__group_gh4(struct perf_evlist *evlist)
return 0;
}
static int test__leader_sample1(struct perf_evlist *evlist)
{
struct perf_evsel *evsel, *leader;
TEST_ASSERT_VAL("wrong number of entries", 3 == evlist->nr_entries);
/* cycles - sampling group leader */
evsel = leader = perf_evlist__first(evlist);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->attr.config);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong exclude guest", evsel->attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
/* cache-misses - not sampling */
evsel = perf_evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->attr.config);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong exclude guest", evsel->attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
/* branch-misses - not sampling */
evsel = perf_evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_BRANCH_MISSES == evsel->attr.config);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong exclude guest", evsel->attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
return 0;
}
static int test__leader_sample2(struct perf_evlist *evlist __maybe_unused)
{
struct perf_evsel *evsel, *leader;
TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->nr_entries);
/* instructions - sampling group leader */
evsel = leader = perf_evlist__first(evlist);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_INSTRUCTIONS == evsel->attr.config);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong exclude guest", evsel->attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
/* branch-misses - not sampling */
evsel = perf_evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_BRANCH_MISSES == evsel->attr.config);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong exclude guest", evsel->attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
return 0;
}
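/*
 * Note on the ':S' modifier exercised by the two tests above: the group
 * leader samples with PERF_SAMPLE_READ, and PERF_FORMAT_GROUP (plus the
 * forced PERF_FORMAT_ID) in read_format lets each leader sample carry
 * the counts of every group member, so the non-leader events do not
 * need to sample at all.
 */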
static int test__checkevent_pinned_modifier(struct perf_evlist *evlist)
{
struct perf_evsel *evsel = perf_evlist__first(evlist);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude_kernel", evsel->attr.exclude_kernel);
TEST_ASSERT_VAL("wrong exclude_hv", evsel->attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", evsel->attr.precise_ip);
TEST_ASSERT_VAL("wrong pinned", evsel->attr.pinned);
return test__checkevent_symbolic_name(evlist);
}
static int test__pinned_group(struct perf_evlist *evlist)
{
struct perf_evsel *evsel, *leader;
TEST_ASSERT_VAL("wrong number of entries", 3 == evlist->nr_entries);
/* cycles - group leader */
evsel = leader = perf_evlist__first(evlist);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->attr.config);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong pinned", evsel->attr.pinned);
/* cache-misses - can not be pinned, but will go on with the leader */
evsel = perf_evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->attr.config);
TEST_ASSERT_VAL("wrong pinned", !evsel->attr.pinned);
/* branch-misses - ditto */
evsel = perf_evsel__next(evsel);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_BRANCH_MISSES == evsel->attr.config);
TEST_ASSERT_VAL("wrong pinned", !evsel->attr.pinned);
return 0;
}
static int count_tracepoints(void)
{
char events_path[PATH_MAX];
@ -1187,6 +1331,22 @@ static struct evlist_test test__events[] = {
.name = "{cycles:G,cache-misses:H}:uG",
.check = test__group_gh4,
},
[38] = {
.name = "{cycles,cache-misses,branch-misses}:S",
.check = test__leader_sample1,
},
[39] = {
.name = "{instructions,branch-misses}:Su",
.check = test__leader_sample2,
},
[40] = {
.name = "instructions:uDp",
.check = test__checkevent_pinned_modifier,
},
[41] = {
.name = "{cycles,cache-misses,branch-misses}:D",
.check = test__pinned_group,
},
};
static struct evlist_test test__events_pmu[] = {
@ -1254,24 +1414,20 @@ static int test_events(struct evlist_test *events, unsigned cnt)
static int test_term(struct terms_test *t)
{
struct list_head *terms;
struct list_head terms;
int ret;
terms = malloc(sizeof(*terms));
if (!terms)
return -ENOMEM;
INIT_LIST_HEAD(&terms);
INIT_LIST_HEAD(terms);
ret = parse_events_terms(terms, t->str);
ret = parse_events_terms(&terms, t->str);
if (ret) {
pr_debug("failed to parse terms '%s', err %d\n",
t->str , ret);
return ret;
}
ret = t->check(terms);
parse_events__free_terms(terms);
ret = t->check(&terms);
parse_events__free_terms(&terms);
return ret;
}


@ -0,0 +1,177 @@
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <inttypes.h>
#include <sys/prctl.h>
#include "parse-events.h"
#include "evlist.h"
#include "evsel.h"
#include "thread_map.h"
#include "cpumap.h"
#include "tests.h"
#include "../arch/x86/util/tsc.h"
#define CHECK__(x) { \
while ((x) < 0) { \
pr_debug(#x " failed!\n"); \
goto out_err; \
} \
}
#define CHECK_NOT_NULL__(x) { \
while ((x) == NULL) { \
pr_debug(#x " failed!\n"); \
goto out_err; \
} \
}
static u64 rdtsc(void)
{
unsigned int low, high;
asm volatile("rdtsc" : "=a" (low), "=d" (high));
return low | ((u64)high) << 32;
}
/**
* test__perf_time_to_tsc - test converting perf time to TSC.
*
* This function implements a test that checks that the conversion of perf time
* to and from TSC is consistent with the order of events. If the test passes
* %0 is returned, otherwise %-1 is returned. If TSC conversion is not
* supported then the test passes but " (not supported)" is printed.
*/
int test__perf_time_to_tsc(void)
{
struct perf_record_opts opts = {
.mmap_pages = UINT_MAX,
.user_freq = UINT_MAX,
.user_interval = ULLONG_MAX,
.freq = 4000,
.target = {
.uses_mmap = true,
},
.sample_time = true,
};
struct thread_map *threads = NULL;
struct cpu_map *cpus = NULL;
struct perf_evlist *evlist = NULL;
struct perf_evsel *evsel = NULL;
int err = -1, ret, i;
const char *comm1, *comm2;
struct perf_tsc_conversion tc;
struct perf_event_mmap_page *pc;
union perf_event *event;
u64 test_tsc, comm1_tsc, comm2_tsc;
u64 test_time, comm1_time = 0, comm2_time = 0;
threads = thread_map__new(-1, getpid(), UINT_MAX);
CHECK_NOT_NULL__(threads);
cpus = cpu_map__new(NULL);
CHECK_NOT_NULL__(cpus);
evlist = perf_evlist__new();
CHECK_NOT_NULL__(evlist);
perf_evlist__set_maps(evlist, cpus, threads);
CHECK__(parse_events(evlist, "cycles:u"));
perf_evlist__config(evlist, &opts);
evsel = perf_evlist__first(evlist);
evsel->attr.comm = 1;
evsel->attr.disabled = 1;
evsel->attr.enable_on_exec = 0;
CHECK__(perf_evlist__open(evlist));
CHECK__(perf_evlist__mmap(evlist, UINT_MAX, false));
pc = evlist->mmap[0].base;
ret = perf_read_tsc_conversion(pc, &tc);
if (ret) {
if (ret == -EOPNOTSUPP) {
fprintf(stderr, " (not supported)");
return 0;
}
goto out_err;
}
perf_evlist__enable(evlist);
comm1 = "Test COMM 1";
CHECK__(prctl(PR_SET_NAME, (unsigned long)comm1, 0, 0, 0));
test_tsc = rdtsc();
comm2 = "Test COMM 2";
CHECK__(prctl(PR_SET_NAME, (unsigned long)comm2, 0, 0, 0));
perf_evlist__disable(evlist);
for (i = 0; i < evlist->nr_mmaps; i++) {
while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
struct perf_sample sample;
if (event->header.type != PERF_RECORD_COMM ||
(pid_t)event->comm.pid != getpid() ||
(pid_t)event->comm.tid != getpid())
continue;
if (strcmp(event->comm.comm, comm1) == 0) {
CHECK__(perf_evsel__parse_sample(evsel, event,
&sample));
comm1_time = sample.time;
}
if (strcmp(event->comm.comm, comm2) == 0) {
CHECK__(perf_evsel__parse_sample(evsel, event,
&sample));
comm2_time = sample.time;
}
}
}
if (!comm1_time || !comm2_time)
goto out_err;
test_time = tsc_to_perf_time(test_tsc, &tc);
comm1_tsc = perf_time_to_tsc(comm1_time, &tc);
comm2_tsc = perf_time_to_tsc(comm2_time, &tc);
pr_debug("1st event perf time %"PRIu64" tsc %"PRIu64"\n",
comm1_time, comm1_tsc);
pr_debug("rdtsc time %"PRIu64" tsc %"PRIu64"\n",
test_time, test_tsc);
pr_debug("2nd event perf time %"PRIu64" tsc %"PRIu64"\n",
comm2_time, comm2_tsc);
if (test_time <= comm1_time ||
test_time >= comm2_time)
goto out_err;
if (test_tsc <= comm1_tsc ||
test_tsc >= comm2_tsc)
goto out_err;
err = 0;
out_err:
if (evlist) {
perf_evlist__disable(evlist);
perf_evlist__munmap(evlist);
perf_evlist__close(evlist);
perf_evlist__delete(evlist);
}
if (cpus)
cpu_map__delete(cpus);
if (threads)
thread_map__delete(threads);
return err;
}

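The conversion under test relies on the time_zero/time_mult/time_shift fields this merge exports on the mmap header page. A hedged sketch of the tsc-to-perf-time direction, mirroring the form used in tools/perf/arch/x86/util/tsc.c (struct and function names here are illustrative):

    #include <stdint.h>

    struct tsc_conv {
            uint16_t time_shift;
            uint32_t time_mult;
            uint64_t time_zero;
    };

    /* Split the cycle count so the 64-bit multiply cannot overflow:
     * time = zero + (cyc >> shift) * mult + (((cyc & mask) * mult) >> shift) */
    static uint64_t tsc_to_perf_time_sketch(uint64_t cyc,
                                            const struct tsc_conv *tc)
    {
            uint64_t quot = cyc >> tc->time_shift;
            uint64_t rem  = cyc & ((1ULL << tc->time_shift) - 1);

            return tc->time_zero + quot * tc->time_mult +
                   ((rem * tc->time_mult) >> tc->time_shift);
    }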

@ -0,0 +1,316 @@
#include <stdbool.h>
#include <inttypes.h>
#include "util.h"
#include "event.h"
#include "evsel.h"
#include "tests.h"
#define COMP(m) do { \
if (s1->m != s2->m) { \
pr_debug("Samples differ at '"#m"'\n"); \
return false; \
} \
} while (0)
#define MCOMP(m) do { \
if (memcmp(&s1->m, &s2->m, sizeof(s1->m))) { \
pr_debug("Samples differ at '"#m"'\n"); \
return false; \
} \
} while (0)
static bool samples_same(const struct perf_sample *s1,
const struct perf_sample *s2, u64 type, u64 regs_user,
u64 read_format)
{
size_t i;
if (type & PERF_SAMPLE_IDENTIFIER)
COMP(id);
if (type & PERF_SAMPLE_IP)
COMP(ip);
if (type & PERF_SAMPLE_TID) {
COMP(pid);
COMP(tid);
}
if (type & PERF_SAMPLE_TIME)
COMP(time);
if (type & PERF_SAMPLE_ADDR)
COMP(addr);
if (type & PERF_SAMPLE_ID)
COMP(id);
if (type & PERF_SAMPLE_STREAM_ID)
COMP(stream_id);
if (type & PERF_SAMPLE_CPU)
COMP(cpu);
if (type & PERF_SAMPLE_PERIOD)
COMP(period);
if (type & PERF_SAMPLE_READ) {
if (read_format & PERF_FORMAT_GROUP)
COMP(read.group.nr);
else
COMP(read.one.value);
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
COMP(read.time_enabled);
if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
COMP(read.time_running);
/* PERF_FORMAT_ID is forced for PERF_SAMPLE_READ */
if (read_format & PERF_FORMAT_GROUP) {
for (i = 0; i < s1->read.group.nr; i++)
MCOMP(read.group.values[i]);
} else {
COMP(read.one.id);
}
}
if (type & PERF_SAMPLE_CALLCHAIN) {
COMP(callchain->nr);
for (i = 0; i < s1->callchain->nr; i++)
COMP(callchain->ips[i]);
}
if (type & PERF_SAMPLE_RAW) {
COMP(raw_size);
if (memcmp(s1->raw_data, s2->raw_data, s1->raw_size)) {
pr_debug("Samples differ at 'raw_data'\n");
return false;
}
}
if (type & PERF_SAMPLE_BRANCH_STACK) {
COMP(branch_stack->nr);
for (i = 0; i < s1->branch_stack->nr; i++)
MCOMP(branch_stack->entries[i]);
}
if (type & PERF_SAMPLE_REGS_USER) {
size_t sz = hweight_long(regs_user) * sizeof(u64);
COMP(user_regs.abi);
if (s1->user_regs.abi &&
(!s1->user_regs.regs || !s2->user_regs.regs ||
memcmp(s1->user_regs.regs, s2->user_regs.regs, sz))) {
pr_debug("Samples differ at 'user_regs'\n");
return false;
}
}
if (type & PERF_SAMPLE_STACK_USER) {
COMP(user_stack.size);
if (memcmp(s1->user_stack.data, s2->user_stack.data,
s1->user_stack.size)) {
pr_debug("Samples differ at 'user_stack'\n");
return false;
}
}
if (type & PERF_SAMPLE_WEIGHT)
COMP(weight);
if (type & PERF_SAMPLE_DATA_SRC)
COMP(data_src);
return true;
}
static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
{
struct perf_evsel evsel = {
.needs_swap = false,
.attr = {
.sample_type = sample_type,
.sample_regs_user = sample_regs_user,
.read_format = read_format,
},
};
union perf_event *event;
union {
struct ip_callchain callchain;
u64 data[64];
} callchain = {
/* 3 ips */
.data = {3, 201, 202, 203},
};
union {
struct branch_stack branch_stack;
u64 data[64];
} branch_stack = {
/* 1 branch_entry */
.data = {1, 211, 212, 213},
};
u64 user_regs[64];
const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL};
const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL};
struct perf_sample sample = {
.ip = 101,
.pid = 102,
.tid = 103,
.time = 104,
.addr = 105,
.id = 106,
.stream_id = 107,
.period = 108,
.weight = 109,
.cpu = 110,
.raw_size = sizeof(raw_data),
.data_src = 111,
.raw_data = (void *)raw_data,
.callchain = &callchain.callchain,
.branch_stack = &branch_stack.branch_stack,
.user_regs = {
.abi = PERF_SAMPLE_REGS_ABI_64,
.regs = user_regs,
},
.user_stack = {
.size = sizeof(data),
.data = (void *)data,
},
.read = {
.time_enabled = 0x030a59d664fca7deULL,
.time_running = 0x011b6ae553eb98edULL,
},
};
struct sample_read_value values[] = {{1, 5}, {9, 3}, {2, 7}, {6, 4},};
struct perf_sample sample_out;
size_t i, sz, bufsz;
int err, ret = -1;
for (i = 0; i < sizeof(user_regs); i++)
*(i + (u8 *)user_regs) = i & 0xfe;
if (read_format & PERF_FORMAT_GROUP) {
sample.read.group.nr = 4;
sample.read.group.values = values;
} else {
sample.read.one.value = 0x08789faeb786aa87ULL;
sample.read.one.id = 99;
}
sz = perf_event__sample_event_size(&sample, sample_type,
sample_regs_user, read_format);
bufsz = sz + 4096; /* Add a bit for overrun checking */
event = malloc(bufsz);
if (!event) {
pr_debug("malloc failed\n");
return -1;
}
memset(event, 0xff, bufsz);
event->header.type = PERF_RECORD_SAMPLE;
event->header.misc = 0;
event->header.size = sz;
err = perf_event__synthesize_sample(event, sample_type,
sample_regs_user, read_format,
&sample, false);
if (err) {
pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
"perf_event__synthesize_sample", sample_type, err);
goto out_free;
}
/* The data does not contain 0xff so we use that to check the size */
for (i = bufsz; i > 0; i--) {
if (*(i - 1 + (u8 *)event) != 0xff)
break;
}
if (i != sz) {
pr_debug("Event size mismatch: actual %zu vs expected %zu\n",
i, sz);
goto out_free;
}
evsel.sample_size = __perf_evsel__sample_size(sample_type);
err = perf_evsel__parse_sample(&evsel, event, &sample_out);
if (err) {
pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
"perf_evsel__parse_sample", sample_type, err);
goto out_free;
}
if (!samples_same(&sample, &sample_out, sample_type,
sample_regs_user, read_format)) {
pr_debug("parsing failed for sample_type %#"PRIx64"\n",
sample_type);
goto out_free;
}
ret = 0;
out_free:
free(event);
if (ret && read_format)
pr_debug("read_format %#"PRIx64"\n", read_format);
return ret;
}
/**
* test__sample_parsing - test sample parsing.
*
* This function implements a test that synthesizes a sample event, parses it
* and then checks that the parsed sample matches the original sample. The test
* checks sample format bits separately and together. If the test passes %0 is
* returned, otherwise %-1 is returned.
*/
int test__sample_parsing(void)
{
const u64 rf[] = {4, 5, 6, 7, 12, 13, 14, 15};
u64 sample_type;
u64 sample_regs_user;
size_t i;
int err;
/*
* Fail the test if it has not been updated when new sample format bits
* were added.
*/
if (PERF_SAMPLE_MAX > PERF_SAMPLE_IDENTIFIER << 1) {
pr_debug("sample format has changed - test needs updating\n");
return -1;
}
/* Test each sample format bit separately */
for (sample_type = 1; sample_type != PERF_SAMPLE_MAX;
sample_type <<= 1) {
/* Test read_format variations */
if (sample_type == PERF_SAMPLE_READ) {
for (i = 0; i < ARRAY_SIZE(rf); i++) {
err = do_test(sample_type, 0, rf[i]);
if (err)
return err;
}
continue;
}
if (sample_type == PERF_SAMPLE_REGS_USER)
sample_regs_user = 0x3fff;
else
sample_regs_user = 0;
err = do_test(sample_type, sample_regs_user, 0);
if (err)
return err;
}
/* Test all sample format bits together */
sample_type = PERF_SAMPLE_MAX - 1;
sample_regs_user = 0x3fff;
for (i = 0; i < ARRAY_SIZE(rf); i++) {
err = do_test(sample_type, sample_regs_user, rf[i]);
if (err)
return err;
}
return 0;
}

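For reference, the rf[] read_format table above decodes as follows (bit values from include/uapi/linux/perf_event.h):

    PERF_FORMAT_TOTAL_TIME_ENABLED = 1
    PERF_FORMAT_TOTAL_TIME_RUNNING = 2
    PERF_FORMAT_ID                 = 4
    PERF_FORMAT_GROUP              = 8

so values 4..7 exercise PERF_FORMAT_ID with every time-bit combination, and 12..15 do the same for PERF_FORMAT_GROUP | PERF_FORMAT_ID. Combinations without bit 4 are omitted because PERF_FORMAT_ID is forced for PERF_SAMPLE_READ, as the comment in samples_same() notes.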

@ -1,6 +1,14 @@
#ifndef TESTS_H
#define TESTS_H
#define TEST_ASSERT_VAL(text, cond) \
do { \
if (!(cond)) { \
pr_debug("FAILED %s:%d %s\n", __FILE__, __LINE__, text); \
return -1; \
} \
} while (0)
enum {
TEST_OK = 0,
TEST_FAIL = -1,
@ -27,5 +35,9 @@ int test__bp_signal(void);
int test__bp_signal_overflow(void);
int test__task_exit(void);
int test__sw_clock_freq(void);
int test__perf_time_to_tsc(void);
int test__code_reading(void);
int test__sample_parsing(void);
int test__keep_tracking(void);
#endif /* TESTS_H */


@ -16,6 +16,8 @@ static int vmlinux_matches_kallsyms_filter(struct map *map __maybe_unused,
return 0;
}
#define UM(x) kallsyms_map->unmap_ip(kallsyms_map, (x))
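/*
 * UM() goes through the map's unmap_ip hook to translate a map-relative
 * (object file) address back to a runtime memory address. For ordinary
 * maps perf computes roughly the following; a sketch, not the exact
 * implementation:
 */
static inline u64 unmap_ip_sketch(u64 map_start, u64 map_pgoff, u64 ip)
{
	return ip + map_start - map_pgoff;	/* object -> memory */
}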
int test__vmlinux_matches_kallsyms(void)
{
int err = -1;
@ -25,6 +27,7 @@ int test__vmlinux_matches_kallsyms(void)
struct machine kallsyms, vmlinux;
enum map_type type = MAP__FUNCTION;
struct ref_reloc_sym ref_reloc_sym = { .name = "_stext", };
u64 mem_start, mem_end;
/*
* Step 1:
@ -73,7 +76,7 @@ int test__vmlinux_matches_kallsyms(void)
goto out;
}
ref_reloc_sym.addr = sym->start;
ref_reloc_sym.addr = UM(sym->start);
/*
* Step 5:
@ -123,10 +126,14 @@ int test__vmlinux_matches_kallsyms(void)
if (sym->start == sym->end)
continue;
first_pair = machine__find_kernel_symbol(&kallsyms, type, sym->start, NULL, NULL);
mem_start = vmlinux_map->unmap_ip(vmlinux_map, sym->start);
mem_end = vmlinux_map->unmap_ip(vmlinux_map, sym->end);
first_pair = machine__find_kernel_symbol(&kallsyms, type,
mem_start, NULL, NULL);
pair = first_pair;
if (pair && pair->start == sym->start) {
if (pair && UM(pair->start) == mem_start) {
next_pair:
if (strcmp(sym->name, pair->name) == 0) {
/*
@ -138,12 +145,20 @@ next_pair:
* off the real size. More than that and we
* _really_ have a problem.
*/
s64 skew = sym->end - pair->end;
if (llabs(skew) < page_size)
continue;
s64 skew = mem_end - UM(pair->end);
if (llabs(skew) >= page_size)
pr_debug("%#" PRIx64 ": diff end addr for %s v: %#" PRIx64 " k: %#" PRIx64 "\n",
mem_start, sym->name, mem_end,
UM(pair->end));
/*
* Do not count this as a failure, because we
* could really find a case where it's not
* possible to get proper function end from
* kallsyms.
*/
continue;
pr_debug("%#" PRIx64 ": diff end addr for %s v: %#" PRIx64 " k: %#" PRIx64 "\n",
sym->start, sym->name, sym->end, pair->end);
} else {
struct rb_node *nnd;
detour:
@ -152,7 +167,7 @@ detour:
if (nnd) {
struct symbol *next = rb_entry(nnd, struct symbol, rb_node);
if (next->start == sym->start) {
if (UM(next->start) == mem_start) {
pair = next;
goto next_pair;
}
@ -165,10 +180,11 @@ detour:
}
pr_debug("%#" PRIx64 ": diff name v: %s k: %s\n",
sym->start, sym->name, pair->name);
mem_start, sym->name, pair->name);
}
} else
pr_debug("%#" PRIx64 ": %s not on kallsyms\n", sym->start, sym->name);
pr_debug("%#" PRIx64 ": %s not on kallsyms\n",
mem_start, sym->name);
err = -1;
}
@ -201,16 +217,19 @@ detour:
for (nd = rb_first(&vmlinux.kmaps.maps[type]); nd; nd = rb_next(nd)) {
struct map *pos = rb_entry(nd, struct map, rb_node), *pair;
pair = map_groups__find(&kallsyms.kmaps, type, pos->start);
mem_start = vmlinux_map->unmap_ip(vmlinux_map, pos->start);
mem_end = vmlinux_map->unmap_ip(vmlinux_map, pos->end);
pair = map_groups__find(&kallsyms.kmaps, type, mem_start);
if (pair == NULL || pair->priv)
continue;
if (pair->start == pos->start) {
if (pair->start == mem_start) {
pair->priv = 1;
pr_info(" %" PRIx64 "-%" PRIx64 " %" PRIx64 " %s in kallsyms as",
pos->start, pos->end, pos->pgoff, pos->dso->name);
if (pos->pgoff != pair->pgoff || pos->end != pair->end)
pr_info(": \n*%" PRIx64 "-%" PRIx64 " %" PRIx64 "",
if (mem_end != pair->end)
pr_info(":\n*%" PRIx64 "-%" PRIx64 " %" PRIx64,
pair->start, pair->end, pair->pgoff);
pr_info(" %s\n", pair->dso->name);
pair->priv = 1;
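
The UM() macro and the new mem_start/mem_end variables switch the comparisons from map-relative addresses to memory addresses. A toy model of the map_ip()/unmap_ip() pair, assuming the usual linear layout (the struct and names here are stand-ins, not util/map.h):

#include <stdio.h>

typedef unsigned long long u64;

struct toy_map { u64 start, pgoff; };

/* memory address -> object (map-relative) address */
static u64 toy_map_ip(struct toy_map *m, u64 ip)
{
	return ip - m->start + m->pgoff;
}

/* object address -> memory address, the direction UM() goes */
static u64 toy_unmap_ip(struct toy_map *m, u64 rip)
{
	return rip - m->pgoff + m->start;
}

int main(void)
{
	struct toy_map m = { .start = 0xffffffff81000000ULL, .pgoff = 0 };
	u64 mem = 0xffffffff81234567ULL;

	/* round-trips back to the original memory address */
	printf("%llx\n", toy_unmap_ip(&m, toy_map_ip(&m, mem)));
	return 0;
}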


@ -428,6 +428,14 @@ static void annotate_browser__init_asm_mode(struct annotate_browser *browser)
browser->b.nr_entries = browser->nr_asm_entries;
}
#define SYM_TITLE_MAX_SIZE (PATH_MAX + 64)
static int sym_title(struct symbol *sym, struct map *map, char *title,
size_t sz)
{
return snprintf(title, sz, "%s %s", sym->name, map->dso->long_name);
}
static bool annotate_browser__callq(struct annotate_browser *browser,
struct perf_evsel *evsel,
struct hist_browser_timer *hbt)
@ -438,6 +446,7 @@ static bool annotate_browser__callq(struct annotate_browser *browser,
struct annotation *notes;
struct symbol *target;
u64 ip;
char title[SYM_TITLE_MAX_SIZE];
if (!ins__is_call(dl->ins))
return false;
@ -461,7 +470,8 @@ static bool annotate_browser__callq(struct annotate_browser *browser,
pthread_mutex_unlock(&notes->lock);
symbol__tui_annotate(target, ms->map, evsel, hbt);
ui_browser__show_title(&browser->b, sym->name);
sym_title(sym, ms->map, title, sizeof(title));
ui_browser__show_title(&browser->b, title);
return true;
}
@ -495,7 +505,7 @@ static bool annotate_browser__jump(struct annotate_browser *browser)
dl = annotate_browser__find_offset(browser, dl->ops.target.offset, &idx);
if (dl == NULL) {
ui_helpline__puts("Invallid jump offset");
ui_helpline__puts("Invalid jump offset");
return true;
}
@ -653,8 +663,10 @@ static int annotate_browser__run(struct annotate_browser *browser,
const char *help = "Press 'h' for help on key bindings";
int delay_secs = hbt ? hbt->refresh : 0;
int key;
char title[SYM_TITLE_MAX_SIZE];
if (ui_browser__show(&browser->b, sym->name, help) < 0)
sym_title(sym, ms->map, title, sizeof(title));
if (ui_browser__show(&browser->b, title, help) < 0)
return -1;
annotate_browser__calc_percent(browser, evsel);
@ -720,7 +732,7 @@ static int annotate_browser__run(struct annotate_browser *browser,
"s Toggle source code view\n"
"/ Search string\n"
"r Run available scripts\n"
"? Search previous string\n");
"? Search string backwards\n");
continue;
case 'r':
{


@ -685,8 +685,10 @@ static u64 __hpp_get_##_field(struct hist_entry *he) \
return he->stat._field; \
} \
\
static int hist_browser__hpp_color_##_type(struct perf_hpp *hpp, \
struct hist_entry *he) \
static int \
hist_browser__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused,\
struct perf_hpp *hpp, \
struct hist_entry *he) \
{ \
return __hpp__color_fmt(hpp, he, __hpp_get_##_field, _cb); \
}
@ -701,8 +703,6 @@ __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us, NULL)
void hist_browser__init_hpp(void)
{
perf_hpp__column_enable(PERF_HPP__OVERHEAD);
perf_hpp__init();
perf_hpp__format[PERF_HPP__OVERHEAD].color =
@ -762,9 +762,9 @@ static int hist_browser__show_entry(struct hist_browser *browser,
first = false;
if (fmt->color) {
width -= fmt->color(&hpp, entry);
width -= fmt->color(fmt, &hpp, entry);
} else {
width -= fmt->entry(&hpp, entry);
width -= fmt->entry(fmt, &hpp, entry);
slsmg_printf("%s", s);
}
}
@ -1256,7 +1256,7 @@ static int hists__browser_title(struct hists *hists, char *bf, size_t size,
printed += scnprintf(bf + printed, size - printed,
", Thread: %s(%d)",
(thread->comm_set ? thread->comm : ""),
thread->pid);
thread->tid);
if (dso)
printed += scnprintf(bf + printed, size - printed,
", DSO: %s", dso->short_name);
@ -1579,7 +1579,7 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
asprintf(&options[nr_options], "Zoom %s %s(%d) thread",
(browser->hists->thread_filter ? "out of" : "into"),
(thread->comm_set ? thread->comm : ""),
thread->pid) > 0)
thread->tid) > 0)
zoom_thread = nr_options++;
if (dso != NULL &&
@ -1702,7 +1702,7 @@ zoom_out_thread:
} else {
ui_helpline__fpush("To zoom out press <- or -> + \"Zoom out of %s(%d) thread\"",
thread->comm_set ? thread->comm : "",
thread->pid);
thread->tid);
browser->hists->thread_filter = thread;
sort_thread.elide = true;
pstack__push(fstack, &browser->hists->thread_filter);


@ -91,7 +91,8 @@ static u64 he_get_##_field(struct hist_entry *he) \
return he->stat._field; \
} \
\
static int perf_gtk__hpp_color_##_type(struct perf_hpp *hpp, \
static int perf_gtk__hpp_color_##_type(struct perf_hpp_fmt *fmt __maybe_unused, \
struct perf_hpp *hpp, \
struct hist_entry *he) \
{ \
return __hpp__color_fmt(hpp, he, he_get_##_field); \
@ -108,8 +109,6 @@ __HPP_COLOR_PERCENT_FN(overhead_guest_us, period_guest_us)
void perf_gtk__init_hpp(void)
{
perf_hpp__column_enable(PERF_HPP__OVERHEAD);
perf_hpp__init();
perf_hpp__format[PERF_HPP__OVERHEAD].color =
@ -124,6 +123,81 @@ void perf_gtk__init_hpp(void)
perf_gtk__hpp_color_overhead_guest_us;
}
static void callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize)
{
if (cl->ms.sym)
scnprintf(bf, bfsize, "%s", cl->ms.sym->name);
else
scnprintf(bf, bfsize, "%#" PRIx64, cl->ip);
}
static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
GtkTreeIter *parent, int col, u64 total)
{
struct rb_node *nd;
bool has_single_node = (rb_first(root) == rb_last(root));
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node;
struct callchain_list *chain;
GtkTreeIter iter, new_parent;
bool need_new_parent;
double percent;
u64 hits, child_total;
node = rb_entry(nd, struct callchain_node, rb_node);
hits = callchain_cumul_hits(node);
percent = 100.0 * hits / total;
new_parent = *parent;
need_new_parent = !has_single_node && (node->val_nr > 1);
list_for_each_entry(chain, &node->val, list) {
char buf[128];
gtk_tree_store_append(store, &iter, &new_parent);
scnprintf(buf, sizeof(buf), "%5.2f%%", percent);
gtk_tree_store_set(store, &iter, 0, buf, -1);
callchain_list__sym_name(chain, buf, sizeof(buf));
gtk_tree_store_set(store, &iter, col, buf, -1);
if (need_new_parent) {
/*
* Only show the top-most symbol in a callchain
* if it's not the only callchain.
*/
new_parent = iter;
need_new_parent = false;
}
}
if (callchain_param.mode == CHAIN_GRAPH_REL)
child_total = node->children_hit;
else
child_total = total;
/* Now 'iter' contains info of the last callchain_list */
perf_gtk__add_callchain(&node->rb_root, store, &iter, col,
child_total);
}
}
static void on_row_activated(GtkTreeView *view, GtkTreePath *path,
GtkTreeViewColumn *col __maybe_unused,
gpointer user_data __maybe_unused)
{
bool expanded = gtk_tree_view_row_expanded(view, path);
if (expanded)
gtk_tree_view_collapse_row(view, path);
else
gtk_tree_view_expand_row(view, path, FALSE);
}
static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
float min_pcnt)
{
@ -131,10 +205,11 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
GType col_types[MAX_COLUMNS];
GtkCellRenderer *renderer;
struct sort_entry *se;
GtkListStore *store;
GtkTreeStore *store;
struct rb_node *nd;
GtkWidget *view;
int col_idx;
int sym_col = -1;
int nr_cols;
char s[512];
@ -153,10 +228,13 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
if (se->elide)
continue;
if (se == &sort_sym)
sym_col = nr_cols;
col_types[nr_cols++] = G_TYPE_STRING;
}
store = gtk_list_store_newv(nr_cols, col_types);
store = gtk_tree_store_newv(nr_cols, col_types);
view = gtk_tree_view_new();
@ -165,7 +243,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
col_idx = 0;
perf_hpp__for_each_format(fmt) {
fmt->header(&hpp);
fmt->header(fmt, &hpp);
gtk_tree_view_insert_column_with_attributes(GTK_TREE_VIEW(view),
-1, ltrim(s),
@ -183,6 +261,18 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
col_idx++, NULL);
}
for (col_idx = 0; col_idx < nr_cols; col_idx++) {
GtkTreeViewColumn *column;
column = gtk_tree_view_get_column(GTK_TREE_VIEW(view), col_idx);
gtk_tree_view_column_set_resizable(column, TRUE);
if (col_idx == sym_col) {
gtk_tree_view_set_expander_column(GTK_TREE_VIEW(view),
column);
}
}
gtk_tree_view_set_model(GTK_TREE_VIEW(view), GTK_TREE_MODEL(store));
g_object_unref(GTK_TREE_MODEL(store));
@ -199,17 +289,17 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
if (percent < min_pcnt)
continue;
gtk_list_store_append(store, &iter);
gtk_tree_store_append(store, &iter, NULL);
col_idx = 0;
perf_hpp__for_each_format(fmt) {
if (fmt->color)
fmt->color(&hpp, h);
fmt->color(fmt, &hpp, h);
else
fmt->entry(&hpp, h);
fmt->entry(fmt, &hpp, h);
gtk_list_store_set(store, &iter, col_idx++, s, -1);
gtk_tree_store_set(store, &iter, col_idx++, s, -1);
}
list_for_each_entry(se, &hist_entry__sort_list, list) {
@ -219,10 +309,26 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
se->se_snprintf(h, s, ARRAY_SIZE(s),
hists__col_len(hists, se->se_width_idx));
gtk_list_store_set(store, &iter, col_idx++, s, -1);
gtk_tree_store_set(store, &iter, col_idx++, s, -1);
}
if (symbol_conf.use_callchain && sort__has_sym) {
u64 total;
if (callchain_param.mode == CHAIN_GRAPH_REL)
total = h->stat.period;
else
total = hists->stats.total_period;
perf_gtk__add_callchain(&h->sorted_chain, store, &iter,
sym_col, total);
}
}
gtk_tree_view_set_rules_hint(GTK_TREE_VIEW(view), TRUE);
g_signal_connect(view, "row-activated",
G_CALLBACK(on_row_activated), NULL);
gtk_container_add(GTK_CONTAINER(window), view);
}


@ -1,4 +1,5 @@
#include <math.h>
#include <linux/compiler.h>
#include "../util/hist.h"
#include "../util/util.h"
@ -79,7 +80,8 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
}
#define __HPP_HEADER_FN(_type, _str, _min_width, _unit_width) \
static int hpp__header_##_type(struct perf_hpp *hpp) \
static int hpp__header_##_type(struct perf_hpp_fmt *fmt __maybe_unused, \
struct perf_hpp *hpp) \
{ \
int len = _min_width; \
\
@ -92,7 +94,8 @@ static int hpp__header_##_type(struct perf_hpp *hpp) \
}
#define __HPP_WIDTH_FN(_type, _min_width, _unit_width) \
static int hpp__width_##_type(struct perf_hpp *hpp __maybe_unused) \
static int hpp__width_##_type(struct perf_hpp_fmt *fmt __maybe_unused, \
struct perf_hpp *hpp __maybe_unused) \
{ \
int len = _min_width; \
\
@ -110,14 +113,16 @@ static u64 he_get_##_field(struct hist_entry *he) \
return he->stat._field; \
} \
\
static int hpp__color_##_type(struct perf_hpp *hpp, struct hist_entry *he) \
static int hpp__color_##_type(struct perf_hpp_fmt *fmt __maybe_unused, \
struct perf_hpp *hpp, struct hist_entry *he) \
{ \
return __hpp__fmt(hpp, he, he_get_##_field, " %6.2f%%", \
(hpp_snprint_fn)percent_color_snprintf, true); \
}
#define __HPP_ENTRY_PERCENT_FN(_type, _field) \
static int hpp__entry_##_type(struct perf_hpp *hpp, struct hist_entry *he) \
static int hpp__entry_##_type(struct perf_hpp_fmt *_fmt __maybe_unused, \
struct perf_hpp *hpp, struct hist_entry *he) \
{ \
const char *fmt = symbol_conf.field_sep ? " %.2f" : " %6.2f%%"; \
return __hpp__fmt(hpp, he, he_get_##_field, fmt, \
@ -130,7 +135,8 @@ static u64 he_get_raw_##_field(struct hist_entry *he) \
return he->stat._field; \
} \
\
static int hpp__entry_##_type(struct perf_hpp *hpp, struct hist_entry *he) \
static int hpp__entry_##_type(struct perf_hpp_fmt *_fmt __maybe_unused, \
struct perf_hpp *hpp, struct hist_entry *he) \
{ \
const char *fmt = symbol_conf.field_sep ? " %"PRIu64 : " %11"PRIu64; \
return __hpp__fmt(hpp, he, he_get_raw_##_field, fmt, scnprintf, false); \
@ -157,196 +163,6 @@ HPP_PERCENT_FNS(overhead_guest_us, "guest usr", period_guest_us, 9, 8)
HPP_RAW_FNS(samples, "Samples", nr_events, 12, 12)
HPP_RAW_FNS(period, "Period", period, 12, 12)
static int hpp__header_baseline(struct perf_hpp *hpp)
{
return scnprintf(hpp->buf, hpp->size, "Baseline");
}
static int hpp__width_baseline(struct perf_hpp *hpp __maybe_unused)
{
return 8;
}
static double baseline_percent(struct hist_entry *he)
{
struct hist_entry *pair = hist_entry__next_pair(he);
struct hists *pair_hists = pair ? pair->hists : NULL;
double percent = 0.0;
if (pair) {
u64 total_period = pair_hists->stats.total_period;
u64 base_period = pair->stat.period;
percent = 100.0 * base_period / total_period;
}
return percent;
}
static int hpp__color_baseline(struct perf_hpp *hpp, struct hist_entry *he)
{
double percent = baseline_percent(he);
if (hist_entry__has_pairs(he) || symbol_conf.field_sep)
return percent_color_snprintf(hpp->buf, hpp->size, " %6.2f%%", percent);
else
return scnprintf(hpp->buf, hpp->size, " ");
}
static int hpp__entry_baseline(struct perf_hpp *hpp, struct hist_entry *he)
{
double percent = baseline_percent(he);
const char *fmt = symbol_conf.field_sep ? "%.2f" : " %6.2f%%";
if (hist_entry__has_pairs(he) || symbol_conf.field_sep)
return scnprintf(hpp->buf, hpp->size, fmt, percent);
else
return scnprintf(hpp->buf, hpp->size, " ");
}
static int hpp__header_period_baseline(struct perf_hpp *hpp)
{
const char *fmt = symbol_conf.field_sep ? "%s" : "%12s";
return scnprintf(hpp->buf, hpp->size, fmt, "Period Base");
}
static int hpp__width_period_baseline(struct perf_hpp *hpp __maybe_unused)
{
return 12;
}
static int hpp__entry_period_baseline(struct perf_hpp *hpp, struct hist_entry *he)
{
struct hist_entry *pair = hist_entry__next_pair(he);
u64 period = pair ? pair->stat.period : 0;
const char *fmt = symbol_conf.field_sep ? "%" PRIu64 : "%12" PRIu64;
return scnprintf(hpp->buf, hpp->size, fmt, period);
}
static int hpp__header_delta(struct perf_hpp *hpp)
{
const char *fmt = symbol_conf.field_sep ? "%s" : "%7s";
return scnprintf(hpp->buf, hpp->size, fmt, "Delta");
}
static int hpp__width_delta(struct perf_hpp *hpp __maybe_unused)
{
return 7;
}
static int hpp__entry_delta(struct perf_hpp *hpp, struct hist_entry *he)
{
struct hist_entry *pair = hist_entry__next_pair(he);
const char *fmt = symbol_conf.field_sep ? "%s" : "%7.7s";
char buf[32] = " ";
double diff = 0.0;
if (pair) {
if (he->diff.computed)
diff = he->diff.period_ratio_delta;
else
diff = perf_diff__compute_delta(he, pair);
} else
diff = perf_diff__period_percent(he, he->stat.period);
if (fabs(diff) >= 0.01)
scnprintf(buf, sizeof(buf), "%+4.2F%%", diff);
return scnprintf(hpp->buf, hpp->size, fmt, buf);
}
static int hpp__header_ratio(struct perf_hpp *hpp)
{
const char *fmt = symbol_conf.field_sep ? "%s" : "%14s";
return scnprintf(hpp->buf, hpp->size, fmt, "Ratio");
}
static int hpp__width_ratio(struct perf_hpp *hpp __maybe_unused)
{
return 14;
}
static int hpp__entry_ratio(struct perf_hpp *hpp, struct hist_entry *he)
{
struct hist_entry *pair = hist_entry__next_pair(he);
const char *fmt = symbol_conf.field_sep ? "%s" : "%14s";
char buf[32] = " ";
double ratio = 0.0;
if (pair) {
if (he->diff.computed)
ratio = he->diff.period_ratio;
else
ratio = perf_diff__compute_ratio(he, pair);
}
if (ratio > 0.0)
scnprintf(buf, sizeof(buf), "%+14.6F", ratio);
return scnprintf(hpp->buf, hpp->size, fmt, buf);
}
static int hpp__header_wdiff(struct perf_hpp *hpp)
{
const char *fmt = symbol_conf.field_sep ? "%s" : "%14s";
return scnprintf(hpp->buf, hpp->size, fmt, "Weighted diff");
}
static int hpp__width_wdiff(struct perf_hpp *hpp __maybe_unused)
{
return 14;
}
static int hpp__entry_wdiff(struct perf_hpp *hpp, struct hist_entry *he)
{
struct hist_entry *pair = hist_entry__next_pair(he);
const char *fmt = symbol_conf.field_sep ? "%s" : "%14s";
char buf[32] = " ";
s64 wdiff = 0;
if (pair) {
if (he->diff.computed)
wdiff = he->diff.wdiff;
else
wdiff = perf_diff__compute_wdiff(he, pair);
}
if (wdiff != 0)
scnprintf(buf, sizeof(buf), "%14ld", wdiff);
return scnprintf(hpp->buf, hpp->size, fmt, buf);
}
static int hpp__header_formula(struct perf_hpp *hpp)
{
const char *fmt = symbol_conf.field_sep ? "%s" : "%70s";
return scnprintf(hpp->buf, hpp->size, fmt, "Formula");
}
static int hpp__width_formula(struct perf_hpp *hpp __maybe_unused)
{
return 70;
}
static int hpp__entry_formula(struct perf_hpp *hpp, struct hist_entry *he)
{
struct hist_entry *pair = hist_entry__next_pair(he);
const char *fmt = symbol_conf.field_sep ? "%s" : "%-70s";
char buf[96] = " ";
if (pair)
perf_diff__formula(he, pair, buf, sizeof(buf));
return scnprintf(hpp->buf, hpp->size, fmt, buf);
}
#define HPP__COLOR_PRINT_FNS(_name) \
{ \
.header = hpp__header_ ## _name, \
@ -363,19 +179,13 @@ static int hpp__entry_formula(struct perf_hpp *hpp, struct hist_entry *he)
}
struct perf_hpp_fmt perf_hpp__format[] = {
HPP__COLOR_PRINT_FNS(baseline),
HPP__COLOR_PRINT_FNS(overhead),
HPP__COLOR_PRINT_FNS(overhead_sys),
HPP__COLOR_PRINT_FNS(overhead_us),
HPP__COLOR_PRINT_FNS(overhead_guest_sys),
HPP__COLOR_PRINT_FNS(overhead_guest_us),
HPP__PRINT_FNS(samples),
HPP__PRINT_FNS(period),
HPP__PRINT_FNS(period_baseline),
HPP__PRINT_FNS(delta),
HPP__PRINT_FNS(ratio),
HPP__PRINT_FNS(wdiff),
HPP__PRINT_FNS(formula)
HPP__PRINT_FNS(period)
};
LIST_HEAD(perf_hpp__list);
@ -396,6 +206,8 @@ LIST_HEAD(perf_hpp__list);
void perf_hpp__init(void)
{
perf_hpp__column_enable(PERF_HPP__OVERHEAD);
if (symbol_conf.show_cpu_utilization) {
perf_hpp__column_enable(PERF_HPP__OVERHEAD_SYS);
perf_hpp__column_enable(PERF_HPP__OVERHEAD_US);
@ -424,46 +236,6 @@ void perf_hpp__column_enable(unsigned col)
perf_hpp__column_register(&perf_hpp__format[col]);
}
static inline void advance_hpp(struct perf_hpp *hpp, int inc)
{
hpp->buf += inc;
hpp->size -= inc;
}
int hist_entry__period_snprintf(struct perf_hpp *hpp, struct hist_entry *he,
bool color)
{
const char *sep = symbol_conf.field_sep;
struct perf_hpp_fmt *fmt;
char *start = hpp->buf;
int ret;
bool first = true;
if (symbol_conf.exclude_other && !he->parent)
return 0;
perf_hpp__for_each_format(fmt) {
/*
* If there's no field_sep, we still need
* to display initial ' '.
*/
if (!sep || !first) {
ret = scnprintf(hpp->buf, hpp->size, "%s", sep ?: " ");
advance_hpp(hpp, ret);
} else
first = false;
if (color && fmt->color)
ret = fmt->color(hpp, he);
else
ret = fmt->entry(hpp, he);
advance_hpp(hpp, ret);
}
return hpp->buf - start;
}
int hist_entry__sort_snprintf(struct hist_entry *he, char *s, size_t size,
struct hists *hists)
{
@ -499,7 +271,7 @@ unsigned int hists__sort_list_width(struct hists *hists)
if (i)
ret += 2;
ret += fmt->width(&dummy_hpp);
ret += fmt->width(fmt, &dummy_hpp);
}
list_for_each_entry(se, &hist_entry__sort_list, list)


@ -30,7 +30,6 @@ void setup_browser(bool fallback_to_pager)
if (fallback_to_pager)
setup_pager();
perf_hpp__column_enable(PERF_HPP__OVERHEAD);
perf_hpp__init();
break;
}


@ -308,6 +308,47 @@ static size_t hist_entry__callchain_fprintf(struct hist_entry *he,
return hist_entry_callchain__fprintf(he, total_period, left_margin, fp);
}
static inline void advance_hpp(struct perf_hpp *hpp, int inc)
{
hpp->buf += inc;
hpp->size -= inc;
}
static int hist_entry__period_snprintf(struct perf_hpp *hpp,
struct hist_entry *he,
bool color)
{
const char *sep = symbol_conf.field_sep;
struct perf_hpp_fmt *fmt;
char *start = hpp->buf;
int ret;
bool first = true;
if (symbol_conf.exclude_other && !he->parent)
return 0;
perf_hpp__for_each_format(fmt) {
/*
* If there's no field_sep, we still need
* to display initial ' '.
*/
if (!sep || !first) {
ret = scnprintf(hpp->buf, hpp->size, "%s", sep ?: " ");
advance_hpp(hpp, ret);
} else
first = false;
if (color && fmt->color)
ret = fmt->color(fmt, hpp, he);
else
ret = fmt->entry(fmt, hpp, he);
advance_hpp(hpp, ret);
}
return hpp->buf - start;
}
static int hist_entry__fprintf(struct hist_entry *he, size_t size,
struct hists *hists, FILE *fp)
{
@ -365,7 +406,7 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
else
first = false;
fmt->header(&dummy_hpp);
fmt->header(fmt, &dummy_hpp);
fprintf(fp, "%s", bf);
}
@ -410,7 +451,7 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
else
first = false;
width = fmt->width(&dummy_hpp);
width = fmt->width(fmt, &dummy_hpp);
for (i = 0; i < width; i++)
fprintf(fp, ".");
}


@ -110,10 +110,10 @@ static int jump__parse(struct ins_operands *ops)
{
const char *s = strchr(ops->raw, '+');
ops->target.addr = strtoll(ops->raw, NULL, 16);
ops->target.addr = strtoull(ops->raw, NULL, 16);
if (s++ != NULL)
ops->target.offset = strtoll(s, NULL, 16);
ops->target.offset = strtoull(s, NULL, 16);
else
ops->target.offset = UINT64_MAX;
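
The strtoll() -> strtoull() switch matters because kernel text addresses exceed LLONG_MAX, and strtoll() saturates at LLONG_MAX (with errno set to ERANGE) instead of returning the value. A minimal demonstration:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	const char *addr = "ffffffff81000000";

	printf("strtoll : %llx\n", (unsigned long long)strtoll(addr, NULL, 16));	/* 7fffffffffffffff */
	printf("strtoull: %llx\n", strtoull(addr, NULL, 16));				/* ffffffff81000000 */
	return 0;
}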
@ -821,11 +821,55 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
if (dl == NULL)
return -1;
if (dl->ops.target.offset == UINT64_MAX)
dl->ops.target.offset = dl->ops.target.addr -
map__rip_2objdump(map, sym->start);
/*
* kcore has no symbols, so add the call target name if it is on the
* same map.
*/
if (dl->ins && ins__is_call(dl->ins) && !dl->ops.target.name) {
struct symbol *s;
u64 ip = dl->ops.target.addr;
if (ip >= map->start && ip <= map->end) {
ip = map->map_ip(map, ip);
s = map__find_symbol(map, ip, NULL);
if (s && s->start == ip)
dl->ops.target.name = strdup(s->name);
}
}
disasm__add(&notes->src->source, dl);
return 0;
}
static void delete_last_nop(struct symbol *sym)
{
struct annotation *notes = symbol__annotation(sym);
struct list_head *list = &notes->src->source;
struct disasm_line *dl;
while (!list_empty(list)) {
dl = list_entry(list->prev, struct disasm_line, node);
if (dl->ins && dl->ins->ops) {
if (dl->ins->ops != &nop_ops)
return;
} else {
if (!strstr(dl->line, " nop ") &&
!strstr(dl->line, " nopl ") &&
!strstr(dl->line, " nopw "))
return;
}
list_del(&dl->node);
disasm_line__free(dl);
}
}
int symbol__annotate(struct symbol *sym, struct map *map, size_t privsize)
{
struct dso *dso = map->dso;
@ -864,7 +908,8 @@ fallback:
free_filename = false;
}
if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS) {
if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
!dso__is_kcore(dso)) {
char bf[BUILD_ID_SIZE * 2 + 16] = " with build id ";
char *build_id_msg = NULL;
@ -898,7 +943,7 @@ fallback:
snprintf(command, sizeof(command),
"%s %s%s --start-address=0x%016" PRIx64
" --stop-address=0x%016" PRIx64
" -d %s %s -C %s|grep -v %s|expand",
" -d %s %s -C %s 2>/dev/null|grep -v %s|expand",
objdump_path ? objdump_path : "objdump",
disassembler_style ? "-M " : "",
disassembler_style ? disassembler_style : "",
@ -918,6 +963,13 @@ fallback:
if (symbol__parse_objdump_line(sym, map, file, privsize) < 0)
break;
/*
* kallsyms does not have symbol sizes, so there may be a nop at the end.
* Remove it.
*/
if (dso__is_kcore(dso))
delete_last_nop(sym);
pclose(file);
out_free_filename:
if (free_filename)


@ -18,13 +18,14 @@
int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
struct perf_sample *sample,
struct perf_evsel *evsel __maybe_unused,
struct machine *machine)
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
struct thread *thread = machine__findnew_thread(machine, sample->pid,
sample->pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
@ -33,7 +34,7 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
}
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
event->ip.ip, &al);
sample->ip, &al);
if (al.map != NULL)
al.map->dso->hit = 1;
@ -47,7 +48,9 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused,
__maybe_unused,
struct machine *machine)
{
struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
struct thread *thread = machine__findnew_thread(machine,
event->fork.pid,
event->fork.tid);
dump_printf("(%d:%d):(%d:%d)\n", event->fork.pid, event->fork.tid,
event->fork.ppid, event->fork.ptid);


@ -15,19 +15,12 @@
#include <errno.h>
#include <math.h>
#include "hist.h"
#include "util.h"
#include "callchain.h"
__thread struct callchain_cursor callchain_cursor;
bool ip_callchain__valid(struct ip_callchain *chain,
const union perf_event *event)
{
unsigned int chain_size = event->header.size;
chain_size -= (unsigned long)&event->ip.__more_data - (unsigned long)event;
return chain->nr * sizeof(u64) <= chain_size;
}
#define chain_for_each_child(child, parent) \
list_for_each_entry(child, &parent->children, siblings)
@ -327,7 +320,8 @@ append_chain(struct callchain_node *root,
/*
* Lookup in the current node
* If we have a symbol, then compare the start to match
* anywhere inside a function.
* anywhere inside a function, unless function
* mode is disabled.
*/
list_for_each_entry(cnode, &root->val, list) {
struct callchain_cursor_node *node;
@ -339,7 +333,8 @@ append_chain(struct callchain_node *root,
sym = node->sym;
if (cnode->ms.sym && sym) {
if (cnode->ms.sym && sym &&
callchain_param.key == CCKEY_FUNCTION) {
if (cnode->ms.sym->start != sym->start)
break;
} else if (cnode->ip != node->ip)


@ -41,12 +41,18 @@ struct callchain_param;
typedef void (*sort_chain_func_t)(struct rb_root *, struct callchain_root *,
u64, struct callchain_param *);
enum chain_key {
CCKEY_FUNCTION,
CCKEY_ADDRESS
};
struct callchain_param {
enum chain_mode mode;
u32 print_limit;
double min_percent;
sort_chain_func_t sort;
enum chain_order order;
enum chain_key key;
};
struct callchain_list {
@ -103,11 +109,6 @@ int callchain_append(struct callchain_root *root,
int callchain_merge(struct callchain_cursor *cursor,
struct callchain_root *dst, struct callchain_root *src);
struct ip_callchain;
union perf_event;
bool ip_callchain__valid(struct ip_callchain *chain,
const union perf_event *event);
/*
* Initialize a cursor before adding entries inside, but keep
* the previously allocated entries as a cache.
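
The new chain_key setting decides when two callchain entries are considered the same, as used by append_chain() above. A stand-alone sketch of that comparison, assuming the enum from this hunk and toy stand-ins for the real cursor/symbol types:

#include <stdbool.h>

typedef unsigned long long u64;

struct toy_sym { u64 start; };

static bool chain_entries_match(enum chain_key key,
				struct toy_sym *a_sym, u64 a_ip,
				struct toy_sym *b_sym, u64 b_ip)
{
	/* CCKEY_FUNCTION: any two hits inside the same function match */
	if (key == CCKEY_FUNCTION && a_sym && b_sym)
		return a_sym->start == b_sym->start;
	/* CCKEY_ADDRESS: the exact instruction addresses must match */
	return a_ip == b_ip;
}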


@ -41,7 +41,7 @@ static inline int cpu_map__nr(const struct cpu_map *map)
return map ? map->nr : 1;
}
static inline bool cpu_map__all(const struct cpu_map *map)
static inline bool cpu_map__empty(const struct cpu_map *map)
{
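/* a dummy map built for per-thread monitoring has a single -1 entry */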
return map ? map->map[0] == -1 : true;
}


@ -78,6 +78,8 @@ int dso__binary_type_file(struct dso *dso, enum dso_binary_type type,
symbol_conf.symfs, build_id_hex, build_id_hex + 2);
break;
case DSO_BINARY_TYPE__VMLINUX:
case DSO_BINARY_TYPE__GUEST_VMLINUX:
case DSO_BINARY_TYPE__SYSTEM_PATH_DSO:
snprintf(file, size, "%s%s",
symbol_conf.symfs, dso->long_name);
@ -93,11 +95,14 @@ int dso__binary_type_file(struct dso *dso, enum dso_binary_type type,
dso->long_name);
break;
case DSO_BINARY_TYPE__KCORE:
case DSO_BINARY_TYPE__GUEST_KCORE:
snprintf(file, size, "%s", dso->long_name);
break;
default:
case DSO_BINARY_TYPE__KALLSYMS:
case DSO_BINARY_TYPE__VMLINUX:
case DSO_BINARY_TYPE__GUEST_KALLSYMS:
case DSO_BINARY_TYPE__GUEST_VMLINUX:
case DSO_BINARY_TYPE__JAVA_JIT:
case DSO_BINARY_TYPE__NOT_FOUND:
ret = -1;
@ -419,6 +424,7 @@ struct dso *dso__new(const char *name)
dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
dso->data_type = DSO_BINARY_TYPE__NOT_FOUND;
dso->loaded = 0;
dso->rel = 0;
dso->sorted_by_name = 0;
dso->has_build_id = 0;
dso->kernel = DSO_TYPE_USER;


@ -3,6 +3,7 @@
#include <linux/types.h>
#include <linux/rbtree.h>
#include <stdbool.h>
#include "types.h"
#include "map.h"
@ -20,6 +21,8 @@ enum dso_binary_type {
DSO_BINARY_TYPE__SYSTEM_PATH_DSO,
DSO_BINARY_TYPE__GUEST_KMODULE,
DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE,
DSO_BINARY_TYPE__KCORE,
DSO_BINARY_TYPE__GUEST_KCORE,
DSO_BINARY_TYPE__NOT_FOUND,
};
@ -84,6 +87,7 @@ struct dso {
u8 lname_alloc:1;
u8 sorted_by_name;
u8 loaded;
u8 rel;
u8 build_id[BUILD_ID_SIZE];
const char *short_name;
char *long_name;
@ -146,4 +150,17 @@ size_t dso__fprintf_buildid(struct dso *dso, FILE *fp);
size_t dso__fprintf_symbols_by_name(struct dso *dso,
enum map_type type, FILE *fp);
size_t dso__fprintf(struct dso *dso, enum map_type type, FILE *fp);
static inline bool dso__is_vmlinux(struct dso *dso)
{
return dso->data_type == DSO_BINARY_TYPE__VMLINUX ||
dso->data_type == DSO_BINARY_TYPE__GUEST_VMLINUX;
}
static inline bool dso__is_kcore(struct dso *dso)
{
return dso->data_type == DSO_BINARY_TYPE__KCORE ||
dso->data_type == DSO_BINARY_TYPE__GUEST_KCORE;
}
#endif /* __PERF_DSO */


@ -595,6 +595,7 @@ void thread__find_addr_map(struct thread *self,
struct addr_location *al)
{
struct map_groups *mg = &self->mg;
bool load_map = false;
al->thread = self;
al->addr = addr;
@ -609,11 +610,13 @@ void thread__find_addr_map(struct thread *self,
if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
al->level = 'k';
mg = &machine->kmaps;
load_map = true;
} else if (cpumode == PERF_RECORD_MISC_USER && perf_host) {
al->level = '.';
} else if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest) {
al->level = 'g';
mg = &machine->kmaps;
load_map = true;
} else {
/*
* 'u' means guest os user space.
@ -654,18 +657,25 @@ try_again:
mg = &machine->kmaps;
goto try_again;
}
} else
} else {
/*
* Kernel maps might be changed when loading symbols so loading
* must be done prior to using kernel maps.
*/
if (load_map)
map__load(al->map, machine->symbol_filter);
al->addr = al->map->map_ip(al->map, al->addr);
}
}
void thread__find_addr_location(struct thread *thread, struct machine *machine,
u8 cpumode, enum map_type type, u64 addr,
struct addr_location *al,
symbol_filter_t filter)
struct addr_location *al)
{
thread__find_addr_map(thread, machine, cpumode, type, addr, al);
if (al->map != NULL)
al->sym = map__find_symbol(al->map, al->addr, filter);
al->sym = map__find_symbol(al->map, al->addr,
machine->symbol_filter);
else
al->sym = NULL;
}
@ -673,11 +683,11 @@ void thread__find_addr_location(struct thread *thread, struct machine *machine,
int perf_event__preprocess_sample(const union perf_event *event,
struct machine *machine,
struct addr_location *al,
struct perf_sample *sample,
symbol_filter_t filter)
struct perf_sample *sample)
{
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
struct thread *thread = machine__findnew_thread(machine, sample->pid,
sample->pid);
if (thread == NULL)
return -1;
@ -686,7 +696,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
!strlist__has_entry(symbol_conf.comm_list, thread->comm))
goto out_filtered;
dump_printf(" ... thread: %s:%d\n", thread->comm, thread->pid);
dump_printf(" ... thread: %s:%d\n", thread->comm, thread->tid);
/*
* Have we already created the kernel maps for this machine?
*
@ -699,7 +709,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
machine__create_kernel_maps(machine);
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
event->ip.ip, al);
sample->ip, al);
dump_printf(" ...... dso: %s\n",
al->map ? al->map->dso->long_name :
al->level == 'H' ? "[hypervisor]" : "<not found>");
@ -717,7 +727,8 @@ int perf_event__preprocess_sample(const union perf_event *event,
dso->long_name)))))
goto out_filtered;
al->sym = map__find_symbol(al->map, al->addr, filter);
al->sym = map__find_symbol(al->map, al->addr,
machine->symbol_filter);
}
if (symbol_conf.sym_list &&


@ -8,16 +8,6 @@
#include "map.h"
#include "build-id.h"
/*
* PERF_SAMPLE_IP | PERF_SAMPLE_TID | *
*/
struct ip_event {
struct perf_event_header header;
u64 ip;
u32 pid, tid;
unsigned char __more_data[];
};
struct mmap_event {
struct perf_event_header header;
u32 pid, tid;
@ -63,7 +53,8 @@ struct read_event {
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD | \
PERF_SAMPLE_IDENTIFIER)
struct sample_event {
struct perf_event_header header;
@ -71,6 +62,7 @@ struct sample_event {
};
struct regs_dump {
u64 abi;
u64 *regs;
};
@ -80,6 +72,23 @@ struct stack_dump {
char *data;
};
struct sample_read_value {
u64 value;
u64 id;
};
struct sample_read {
u64 time_enabled;
u64 time_running;
union {
struct {
u64 nr;
struct sample_read_value *values;
} group;
struct sample_read_value one;
};
};
struct perf_sample {
u64 ip;
u32 pid, tid;
@ -97,6 +106,7 @@ struct perf_sample {
struct branch_stack *branch_stack;
struct regs_dump user_regs;
struct stack_dump user_stack;
struct sample_read read;
};
#define PERF_MEM_DATA_SRC_NONE \
@ -116,7 +126,7 @@ struct build_id_event {
enum perf_user_event_type { /* above any possible kernel type */
PERF_RECORD_USER_TYPE_START = 64,
PERF_RECORD_HEADER_ATTR = 64,
PERF_RECORD_HEADER_EVENT_TYPE = 65,
PERF_RECORD_HEADER_EVENT_TYPE = 65, /* deprecated */
PERF_RECORD_HEADER_TRACING_DATA = 66,
PERF_RECORD_HEADER_BUILD_ID = 67,
PERF_RECORD_FINISHED_ROUND = 68,
@ -148,7 +158,6 @@ struct tracing_data_event {
union perf_event {
struct perf_event_header header;
struct ip_event ip;
struct mmap_event mmap;
struct comm_event comm;
struct fork_event fork;
@ -216,12 +225,14 @@ struct addr_location;
int perf_event__preprocess_sample(const union perf_event *self,
struct machine *machine,
struct addr_location *al,
struct perf_sample *sample,
symbol_filter_t filter);
struct perf_sample *sample);
const char *perf_event__name(unsigned int id);
size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
u64 sample_regs_user, u64 read_format);
int perf_event__synthesize_sample(union perf_event *event, u64 type,
u64 sample_regs_user, u64 read_format,
const struct perf_sample *sample,
bool swapped);
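
The struct sample_read above mirrors the kernel's PERF_SAMPLE_READ payload. A simplified, self-contained walker for that layout (the PERF_FORMAT_* values are the real uapi ones; the function is an illustrative stand-in for the parsing in evsel.c):

#include <stddef.h>
#include <stdint.h>

#define PERF_FORMAT_TOTAL_TIME_ENABLED (1U << 0)
#define PERF_FORMAT_TOTAL_TIME_RUNNING (1U << 1)
#define PERF_FORMAT_ID                 (1U << 2)
#define PERF_FORMAT_GROUP              (1U << 3)

/* walk a PERF_SAMPLE_READ payload; returns the number of u64s consumed */
static size_t skim_sample_read(const uint64_t *a, uint64_t read_format)
{
	const uint64_t *p = a;

	if (read_format & PERF_FORMAT_GROUP) {
		uint64_t nr = *p++;

		if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
			p++;
		if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
			p++;
		p += 2 * nr;	/* one { value, id } pair per group member */
	} else {
		p++;		/* value */
		if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
			p++;
		if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
			p++;
		if (read_format & PERF_FORMAT_ID)
			p++;
	}
	return p - a;
}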


@ -14,6 +14,7 @@
#include "target.h"
#include "evlist.h"
#include "evsel.h"
#include "debug.h"
#include <unistd.h>
#include "parse-events.h"
@ -48,26 +49,19 @@ struct perf_evlist *perf_evlist__new(void)
return evlist;
}
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts)
/**
* perf_evlist__set_id_pos - set the positions of event ids.
* @evlist: selected event list
*
* Events with compatible sample types all have the same id_pos
* and is_pos. For convenience, put a copy on evlist.
*/
void perf_evlist__set_id_pos(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
/*
* Set the evsel leader links before we configure attributes,
* since some might depend on this info.
*/
if (opts->group)
perf_evlist__set_leader(evlist);
struct perf_evsel *first = perf_evlist__first(evlist);
if (evlist->cpus->map[0] < 0)
opts->no_inherit = true;
list_for_each_entry(evsel, &evlist->entries, node) {
perf_evsel__config(evsel, opts);
if (evlist->nr_entries > 1)
perf_evsel__set_sample_id(evsel);
}
evlist->id_pos = first->id_pos;
evlist->is_pos = first->is_pos;
}
static void perf_evlist__purge(struct perf_evlist *evlist)
@ -100,15 +94,20 @@ void perf_evlist__delete(struct perf_evlist *evlist)
void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
{
list_add_tail(&entry->node, &evlist->entries);
++evlist->nr_entries;
if (!evlist->nr_entries++)
perf_evlist__set_id_pos(evlist);
}
void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
struct list_head *list,
int nr_entries)
{
bool set_id_pos = !evlist->nr_entries;
list_splice_tail(list, &evlist->entries);
evlist->nr_entries += nr_entries;
if (set_id_pos)
perf_evlist__set_id_pos(evlist);
}
void __perf_evlist__set_leader(struct list_head *list)
@ -209,6 +208,21 @@ perf_evlist__find_tracepoint_by_id(struct perf_evlist *evlist, int id)
return NULL;
}
struct perf_evsel *
perf_evlist__find_tracepoint_by_name(struct perf_evlist *evlist,
const char *name)
{
struct perf_evsel *evsel;
list_for_each_entry(evsel, &evlist->entries, node) {
if ((evsel->attr.type == PERF_TYPE_TRACEPOINT) &&
(strcmp(evsel->name, name) == 0))
return evsel;
}
return NULL;
}
int perf_evlist__add_newtp(struct perf_evlist *evlist,
const char *sys, const char *name, void *handler)
{
@ -232,7 +246,7 @@ void perf_evlist__disable(struct perf_evlist *evlist)
for (cpu = 0; cpu < nr_cpus; cpu++) {
list_for_each_entry(pos, &evlist->entries, node) {
if (!perf_evsel__is_group_leader(pos))
if (!perf_evsel__is_group_leader(pos) || !pos->fd)
continue;
for (thread = 0; thread < nr_threads; thread++)
ioctl(FD(pos, cpu, thread),
@ -250,7 +264,7 @@ void perf_evlist__enable(struct perf_evlist *evlist)
for (cpu = 0; cpu < nr_cpus; cpu++) {
list_for_each_entry(pos, &evlist->entries, node) {
if (!perf_evsel__is_group_leader(pos))
if (!perf_evsel__is_group_leader(pos) || !pos->fd)
continue;
for (thread = 0; thread < nr_threads; thread++)
ioctl(FD(pos, cpu, thread),
@ -259,6 +273,44 @@ void perf_evlist__enable(struct perf_evlist *evlist)
}
}
int perf_evlist__disable_event(struct perf_evlist *evlist,
struct perf_evsel *evsel)
{
int cpu, thread, err;
if (!evsel->fd)
return 0;
for (cpu = 0; cpu < evlist->cpus->nr; cpu++) {
for (thread = 0; thread < evlist->threads->nr; thread++) {
err = ioctl(FD(evsel, cpu, thread),
PERF_EVENT_IOC_DISABLE, 0);
if (err)
return err;
}
}
return 0;
}
int perf_evlist__enable_event(struct perf_evlist *evlist,
struct perf_evsel *evsel)
{
int cpu, thread, err;
if (!evsel->fd)
return -EINVAL;
for (cpu = 0; cpu < evlist->cpus->nr; cpu++) {
for (thread = 0; thread < evlist->threads->nr; thread++) {
err = ioctl(FD(evsel, cpu, thread),
PERF_EVENT_IOC_ENABLE, 0);
if (err)
return err;
}
}
return 0;
}
static int perf_evlist__alloc_pollfd(struct perf_evlist *evlist)
{
int nr_cpus = cpu_map__nr(evlist->cpus);
@ -302,6 +354,24 @@ static int perf_evlist__id_add_fd(struct perf_evlist *evlist,
{
u64 read_data[4] = { 0, };
int id_idx = 1; /* The first entry is the counter value */
u64 id;
int ret;
ret = ioctl(fd, PERF_EVENT_IOC_ID, &id);
if (!ret)
goto add;
if (errno != ENOTTY)
return -1;
/* Legacy way to get the event id. All hail to old kernels! */
/*
* This way does not work with group format read, so bail
* out in that case.
*/
if (perf_evlist__read_format(evlist) & PERF_FORMAT_GROUP)
return -1;
if (!(evsel->attr.read_format & PERF_FORMAT_ID) ||
read(fd, &read_data, sizeof(read_data)) == -1)
@ -312,25 +382,39 @@ static int perf_evlist__id_add_fd(struct perf_evlist *evlist,
if (evsel->attr.read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
++id_idx;
perf_evlist__id_add(evlist, evsel, cpu, thread, read_data[id_idx]);
id = read_data[id_idx];
add:
perf_evlist__id_add(evlist, evsel, cpu, thread, id);
return 0;
}
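
A condensed, runnable version of the id lookup above; it assumes a kernel new enough to have PERF_EVENT_IOC_ID and, for the legacy path, a read_format of TOTAL_TIME_ENABLED | TOTAL_TIME_RUNNING | ID so the id is the fourth u64 of a counter read():

#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/perf_event.h>

static int get_event_id(int fd, uint64_t *id)
{
	uint64_t read_data[4] = { 0, };	/* value, time_enabled, time_running, id */

	if (!ioctl(fd, PERF_EVENT_IOC_ID, id))
		return 0;
	if (errno != ENOTTY)
		return -1;
	/* legacy fallback: pick the id out of a counter read() */
	if (read(fd, read_data, sizeof(read_data)) == -1)
		return -1;
	*id = read_data[3];
	return 0;
}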
struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id)
{
struct hlist_head *head;
struct perf_sample_id *sid;
int hash;
if (evlist->nr_entries == 1)
return perf_evlist__first(evlist);
hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
head = &evlist->heads[hash];
hlist_for_each_entry(sid, head, node)
if (sid->id == id)
return sid->evsel;
return sid;
return NULL;
}
struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
{
struct perf_sample_id *sid;
if (evlist->nr_entries == 1)
return perf_evlist__first(evlist);
sid = perf_evlist__id2sid(evlist, id);
if (sid)
return sid->evsel;
if (!perf_evlist__sample_id_all(evlist))
return perf_evlist__first(evlist);
@ -338,6 +422,55 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
return NULL;
}
static int perf_evlist__event2id(struct perf_evlist *evlist,
union perf_event *event, u64 *id)
{
const u64 *array = event->sample.array;
ssize_t n;
n = (event->header.size - sizeof(event->header)) >> 3;
if (event->header.type == PERF_RECORD_SAMPLE) {
if (evlist->id_pos >= n)
return -1;
*id = array[evlist->id_pos];
} else {
if (evlist->is_pos > n)
return -1;
n -= evlist->is_pos;
*id = array[n];
}
return 0;
}
static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
union perf_event *event)
{
struct hlist_head *head;
struct perf_sample_id *sid;
int hash;
u64 id;
if (evlist->nr_entries == 1)
return perf_evlist__first(evlist);
if (perf_evlist__event2id(evlist, event, &id))
return NULL;
/* Synthesized events have an id of zero */
if (!id)
return perf_evlist__first(evlist);
hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
head = &evlist->heads[hash];
hlist_for_each_entry(sid, head, node) {
if (sid->id == id)
return sid->evsel;
}
return NULL;
}
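
In miniature, what id_pos/is_pos buy: with PERF_SAMPLE_IDENTIFIER set, the id is the first u64 of a PERF_RECORD_SAMPLE body and the last u64 of every other sampled event, so any record can be routed to its evsel. A toy version of the lookup in perf_evlist__event2id(), with positions assuming PERF_SAMPLE_IDENTIFIER:

#include <stddef.h>
#include <stdint.h>

enum { ID_POS = 0, IS_POS = 1 };	/* positions when PERF_SAMPLE_IDENTIFIER is set */

static uint64_t event_id(const uint64_t *array, size_t n_u64, int is_sample)
{
	return is_sample ? array[ID_POS] : array[n_u64 - IS_POS];
}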
union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
{
struct perf_mmap *md = &evlist->mmap[idx];
@ -403,16 +536,20 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
return event;
}
static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
{
if (evlist->mmap[idx].base != NULL) {
munmap(evlist->mmap[idx].base, evlist->mmap_len);
evlist->mmap[idx].base = NULL;
}
}
void perf_evlist__munmap(struct perf_evlist *evlist)
{
int i;
for (i = 0; i < evlist->nr_mmaps; i++) {
if (evlist->mmap[i].base != NULL) {
munmap(evlist->mmap[i].base, evlist->mmap_len);
evlist->mmap[i].base = NULL;
}
}
for (i = 0; i < evlist->nr_mmaps; i++)
__perf_evlist__munmap(evlist, i);
free(evlist->mmap);
evlist->mmap = NULL;
@ -421,7 +558,7 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
{
evlist->nr_mmaps = cpu_map__nr(evlist->cpus);
if (cpu_map__all(evlist->cpus))
if (cpu_map__empty(evlist->cpus))
evlist->nr_mmaps = thread_map__nr(evlist->threads);
evlist->mmap = zalloc(evlist->nr_mmaps * sizeof(struct perf_mmap));
return evlist->mmap != NULL ? 0 : -ENOMEM;
@ -450,6 +587,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
int nr_cpus = cpu_map__nr(evlist->cpus);
int nr_threads = thread_map__nr(evlist->threads);
pr_debug2("perf event ring buffer mmapped per cpu\n");
for (cpu = 0; cpu < nr_cpus; cpu++) {
int output = -1;
@ -477,12 +615,8 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
return 0;
out_unmap:
for (cpu = 0; cpu < nr_cpus; cpu++) {
if (evlist->mmap[cpu].base != NULL) {
munmap(evlist->mmap[cpu].base, evlist->mmap_len);
evlist->mmap[cpu].base = NULL;
}
}
for (cpu = 0; cpu < nr_cpus; cpu++)
__perf_evlist__munmap(evlist, cpu);
return -1;
}
@ -492,6 +626,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
int thread;
int nr_threads = thread_map__nr(evlist->threads);
pr_debug2("perf event ring buffer mmapped per thread\n");
for (thread = 0; thread < nr_threads; thread++) {
int output = -1;
@ -517,12 +652,8 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
return 0;
out_unmap:
for (thread = 0; thread < nr_threads; thread++) {
if (evlist->mmap[thread].base != NULL) {
munmap(evlist->mmap[thread].base, evlist->mmap_len);
evlist->mmap[thread].base = NULL;
}
}
for (thread = 0; thread < nr_threads; thread++)
__perf_evlist__munmap(evlist, thread);
return -1;
}
@ -573,7 +704,7 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
return -ENOMEM;
}
if (cpu_map__all(cpus))
if (cpu_map__empty(cpus))
return perf_evlist__mmap_per_thread(evlist, prot, mask);
return perf_evlist__mmap_per_cpu(evlist, prot, mask);
@ -650,20 +781,66 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
struct perf_evsel *pos;
list_for_each_entry_continue(pos, &evlist->entries, node) {
if (first->attr.sample_type != pos->attr.sample_type)
if (evlist->nr_entries == 1)
return true;
if (evlist->id_pos < 0 || evlist->is_pos < 0)
return false;
list_for_each_entry(pos, &evlist->entries, node) {
if (pos->id_pos != evlist->id_pos ||
pos->is_pos != evlist->is_pos)
return false;
}
return true;
}
u64 perf_evlist__sample_type(struct perf_evlist *evlist)
u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist)
{
struct perf_evsel *evsel;
if (evlist->combined_sample_type)
return evlist->combined_sample_type;
list_for_each_entry(evsel, &evlist->entries, node)
evlist->combined_sample_type |= evsel->attr.sample_type;
return evlist->combined_sample_type;
}
u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
{
evlist->combined_sample_type = 0;
return __perf_evlist__combined_sample_type(evlist);
}
bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
u64 read_format = first->attr.read_format;
u64 sample_type = first->attr.sample_type;
list_for_each_entry_continue(pos, &evlist->entries, node) {
if (read_format != pos->attr.read_format)
return false;
}
/* PERF_SAMPLE_READ implies PERF_FORMAT_ID. */
if ((sample_type & PERF_SAMPLE_READ) &&
!(read_format & PERF_FORMAT_ID)) {
return false;
}
return true;
}
u64 perf_evlist__read_format(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist);
return first->attr.sample_type;
return first->attr.read_format;
}
u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist)
@ -692,6 +869,9 @@ u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist)
if (sample_type & PERF_SAMPLE_CPU)
size += sizeof(data->cpu) * 2;
if (sample_type & PERF_SAMPLE_IDENTIFIER)
size += sizeof(data->id);
out:
return size;
}
@ -782,13 +962,6 @@ int perf_evlist__prepare_workload(struct perf_evlist *evlist,
close(go_pipe[1]);
fcntl(go_pipe[0], F_SETFD, FD_CLOEXEC);
/*
* Do a dummy execvp to get the PLT entry resolved,
* so we avoid the resolver overhead on the real
* execvp call.
*/
execvp("", (char **)argv);
/*
* Tell the parent we're ready to go
*/
@ -838,7 +1011,7 @@ out_close_ready_pipe:
int perf_evlist__start_workload(struct perf_evlist *evlist)
{
if (evlist->workload.cork_fd > 0) {
char bf;
char bf = 0;
int ret;
/*
* Remove the cork, let it rip!
@ -857,7 +1030,10 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
struct perf_sample *sample)
{
struct perf_evsel *evsel = perf_evlist__first(evlist);
struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
if (!evsel)
return -EFAULT;
return perf_evsel__parse_sample(evsel, event, sample);
}


@ -32,6 +32,9 @@ struct perf_evlist {
int nr_fds;
int nr_mmaps;
int mmap_len;
int id_pos;
int is_pos;
u64 combined_sample_type;
struct {
int cork_fd;
pid_t pid;
@ -71,6 +74,10 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter);
struct perf_evsel *
perf_evlist__find_tracepoint_by_id(struct perf_evlist *evlist, int id);
struct perf_evsel *
perf_evlist__find_tracepoint_by_name(struct perf_evlist *evlist,
const char *name);
void perf_evlist__id_add(struct perf_evlist *evlist, struct perf_evsel *evsel,
int cpu, int thread, u64 id);
@ -78,11 +85,15 @@ void perf_evlist__add_pollfd(struct perf_evlist *evlist, int fd);
struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id);
struct perf_sample_id *perf_evlist__id2sid(struct perf_evlist *evlist, u64 id);
union perf_event *perf_evlist__mmap_read(struct perf_evlist *self, int idx);
int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);
void perf_evlist__set_id_pos(struct perf_evlist *evlist);
bool perf_can_sample_identifier(void);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);
@ -99,6 +110,11 @@ void perf_evlist__munmap(struct perf_evlist *evlist);
void perf_evlist__disable(struct perf_evlist *evlist);
void perf_evlist__enable(struct perf_evlist *evlist);
int perf_evlist__disable_event(struct perf_evlist *evlist,
struct perf_evsel *evsel);
int perf_evlist__enable_event(struct perf_evlist *evlist,
struct perf_evsel *evsel);
void perf_evlist__set_selected(struct perf_evlist *evlist,
struct perf_evsel *evsel);
@ -118,7 +134,9 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist);
void __perf_evlist__set_leader(struct list_head *list);
void perf_evlist__set_leader(struct perf_evlist *evlist);
u64 perf_evlist__sample_type(struct perf_evlist *evlist);
u64 perf_evlist__read_format(struct perf_evlist *evlist);
u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
@ -127,6 +145,7 @@ int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *even
bool perf_evlist__valid_sample_type(struct perf_evlist *evlist);
bool perf_evlist__valid_sample_id_all(struct perf_evlist *evlist);
bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
struct list_head *list,

Some files were not shown because too many files have changed in this diff.