1
0
Fork 0
Commit Graph

3553 Commits (redonkable)

Author SHA1 Message Date
Linus Torvalds fbf4432ff7 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM

 - a small number of misc things

 - lib/ updates

 - checkpatch

 - autofs updates

 - ipc/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (126 commits)
  ipc: optimize semget/shmget/msgget for lots of keys
  ipc/sem: play nicer with large nsops allocations
  ipc/sem: drop sem_checkid helper
  ipc: convert kern_ipc_perm.refcount from atomic_t to refcount_t
  ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
  ipc: convert ipc_namespace.count from atomic_t to refcount_t
  kcov: support compat processes
  sh: defconfig: cleanup from old Kconfig options
  mn10300: defconfig: cleanup from old Kconfig options
  m32r: defconfig: cleanup from old Kconfig options
  drivers/pps: use surrounding "if PPS" to remove numerous dependency checks
  drivers/pps: aesthetic tweaks to PPS-related content
  cpumask: make cpumask_next() out-of-line
  kmod: move #ifdef CONFIG_MODULES wrapper to Makefile
  kmod: split off umh headers into its own file
  MAINTAINERS: clarify kmod is just a kernel module loader
  kmod: split out umh code into its own file
  test_kmod: flip INT checks to be consistent
  test_kmod: remove paranoid UINT_MAX check on uint range processing
  vfat: deduplicate hex2bin()
  ...
2017-09-09 10:30:07 -07:00
Alexey Dobriyan 9b130ad5bb treewide: make "nr_cpu_ids" unsigned
First, number of CPUs can't be negative number.

Second, different signnnedness leads to suboptimal code in the following
cases:

1)
	kmalloc(nr_cpu_ids * sizeof(X));

"int" has to be sign extended to size_t.

2)
	while (loff_t *pos < nr_cpu_ids)

MOVSXD is 1 byte longed than the same MOV.

Other cases exist as well. Basically compiler is told that nr_cpu_ids
can't be negative which can't be deduced if it is "int".

Code savings on allyesconfig kernel: -3KB

	add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
	function                                     old     new   delta
	coretemp_cpu_online                          450     512     +62
	rcu_init_one                                1234    1272     +38
	pci_device_probe                             374     399     +25

				...

	pgdat_reclaimable_pages                      628     556     -72
	select_fallback_rq                           446     369     -77
	task_numa_find_cpu                          1923    1807    -116

Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:48 -07:00
Linus Torvalds 42c8e86c9c Nothing new in development for this release. These are mostly
fixes that were found during development of changes for the next merge
 window and fixes that were sent to me late in the last cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEQEw9Eu0DdyUUkuUUybkF8mrZjcsFAlmypCkUHHJvc3RlZHRA
 Z29vZG1pcy5vcmcACgkQybkF8mrZjcvLdAf/SsYlTViKTxM/jgsDD8fsbS9yOjl7
 9s9WgXkCHlvvpdATQIOBTSXKjc4OWDspwpybkaogf/Pz5xo1qo2JhqgdOK85UxUf
 vbYOt0lKEb+wEFXeeZCAIT3yTS22ILazNE9k6/u/0URF4cByTSnNPMWr9h9OJHzO
 n5gToZgkGNeLMiPa45eY9n7TqHAGvHRSMYzETyrD8LTiEw1IYLaCaWIYswNTrH7o
 TMMT4bmCRWc8XACpqH5EWK0Wq69JuV6trJBHxiJKNJfebl5ojAs5gsARMMoDP3vV
 q1sTjtgPE/anOOGRwnxlKz3jIcMDGfY0Aw3kFoXkWN3ROsJRm8apUd4QPQ==
 =dDI4
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Nothing new in development for this release. These are mostly fixes
  that were found during development of changes for the next merge
  window and fixes that were sent to me late in the last cycle"

* tag 'trace-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Apply trace_clock changes to instance max buffer
  tracing: Fix clear of RECORDED_TGID flag when disabling trace event
  tracing: Add barrier to trace_printk() buffer nesting modification
  ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
  ftrace: Fix selftest goto location on error
  ftrace: Zero out ftrace hashes when a module is removed
  tracing: Only have rmmod clear buffers that its events were active in
  ftrace: Fix debug preempt config name in stack_tracer_{en,dis}able
2017-09-08 15:08:14 -07:00
Linus Torvalds a0725ab0c7 Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
 "This is the first pull request for 4.14, containing most of the code
  changes. It's a quiet series this round, which I think we needed after
  the churn of the last few series. This contains:

   - Fix for a registration race in loop, from Anton Volkov.

   - Overflow complaint fix from Arnd for DAC960.

   - Series of drbd changes from the usual suspects.

   - Conversion of the stec/skd driver to blk-mq. From Bart.

   - A few BFQ improvements/fixes from Paolo.

   - CFQ improvement from Ritesh, allowing idling for group idle.

   - A few fixes found by Dan's smatch, courtesy of Dan.

   - A warning fixup for a race between changing the IO scheduler and
     device remova. From David Jeffery.

   - A few nbd fixes from Josef.

   - Support for cgroup info in blktrace, from Shaohua.

   - Also from Shaohua, new features in the null_blk driver to allow it
     to actually hold data, among other things.

   - Various corner cases and error handling fixes from Weiping Zhang.

   - Improvements to the IO stats tracking for blk-mq from me. Can
     drastically improve performance for fast devices and/or big
     machines.

   - Series from Christoph removing bi_bdev as being needed for IO
     submission, in preparation for nvme multipathing code.

   - Series from Bart, including various cleanups and fixes for switch
     fall through case complaints"

* 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits)
  kernfs: checking for IS_ERR() instead of NULL
  drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set
  drbd: Fix allyesconfig build, fix recent commit
  drbd: switch from kmalloc() to kmalloc_array()
  drbd: abort drbd_start_resync if there is no connection
  drbd: move global variables to drbd namespace and make some static
  drbd: rename "usermode_helper" to "drbd_usermode_helper"
  drbd: fix race between handshake and admin disconnect/down
  drbd: fix potential deadlock when trying to detach during handshake
  drbd: A single dot should be put into a sequence.
  drbd: fix rmmod cleanup, remove _all_ debugfs entries
  drbd: Use setup_timer() instead of init_timer() to simplify the code.
  drbd: fix potential get_ldev/put_ldev refcount imbalance during attach
  drbd: new disk-option disable-write-same
  drbd: Fix resource role for newly created resources in events2
  drbd: mark symbols static where possible
  drbd: Send P_NEG_ACK upon write error in protocol != C
  drbd: add explicit plugging when submitting batches
  drbd: change list_for_each_safe to while(list_first_entry_or_null)
  drbd: introduce drbd_recv_header_maybe_unplug
  ...
2017-09-07 11:59:42 -07:00
Baohong Liu 170b3b1050 tracing: Apply trace_clock changes to instance max buffer
Currently trace_clock timestamps are applied to both regular and max
buffers only for global trace. For instance trace, trace_clock
timestamps are applied only to regular buffer. But, regular and max
buffers can be swapped, for example, following a snapshot. So, for
instance trace, bad timestamps can be seen following a snapshot.
Let's apply trace_clock timestamps to instance max buffer as well.

Link: http://lkml.kernel.org/r/ebdb168d0be042dcdf51f81e696b17fabe3609c1.1504642143.git.tom.zanussi@linux.intel.com

Cc: stable@vger.kernel.org
Fixes: 277ba0446 ("tracing: Add interface to allow multiple trace buffers")
Signed-off-by: Baohong Liu <baohong.liu@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-09-06 20:52:20 -04:00
Linus Torvalds aae3dbb477 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Support ipv6 checksum offload in sunvnet driver, from Shannon
    Nelson.

 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
    Dumazet.

 3) Allow generic XDP to work on virtual devices, from John Fastabend.

 4) Add bpf device maps and XDP_REDIRECT, which can be used to build
    arbitrary switching frameworks using XDP. From John Fastabend.

 5) Remove UFO offloads from the tree, gave us little other than bugs.

 6) Remove the IPSEC flow cache, from Florian Westphal.

 7) Support ipv6 route offload in mlxsw driver.

 8) Support VF representors in bnxt_en, from Sathya Perla.

 9) Add support for forward error correction modes to ethtool, from
    Vidya Sagar Ravipati.

10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.

11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.

12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.

13) Add new jump instructions to eBPF, from Daniel Borkmann.

14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.

15) Support XDP in tap driver, from Jason Wang.

16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

17) Add Huawei hinic ethernet driver.

18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
  i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
  i40e: avoid NVM acquire deadlock during NVM update
  drivers: net: xgene: Remove return statement from void function
  drivers: net: xgene: Configure tx/rx delay for ACPI
  drivers: net: xgene: Read tx/rx delay for ACPI
  rocker: fix kcalloc parameter order
  rds: Fix non-atomic operation on shared flag variable
  net: sched: don't use GFP_KERNEL under spin lock
  vhost_net: correctly check tx avail during rx busy polling
  net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
  rxrpc: Make service connection lookup always check for retry
  net: stmmac: Delete dead code for MDIO registration
  gianfar: Fix Tx flow control deactivation
  cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
  cxgb4: Fix pause frame count in t4_get_port_stats
  cxgb4: fix memory leak
  tun: rename generic_xdp to skb_xdp
  tun: reserve extra headroom only when XDP is set
  net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
  net: dsa: bcm_sf2: Advertise number of egress queues
  ...
2017-09-06 14:45:08 -07:00
Chunyu Hu 7685ab6c58 tracing: Fix clear of RECORDED_TGID flag when disabling trace event
When disabling one trace event, the RECORDED_TGID flag in the event
file is not correctly cleared. It's clearing RECORDED_CMD flag when
it should clear RECORDED_TGID flag.

Link: http://lkml.kernel.org/r/1504589806-8425-1-git-send-email-chuhu@redhat.com

Cc: Joel Fernandes <joelaf@google.com>
Cc: stable@vger.kernel.org
Fixes: d914ba37d7 ("tracing: Add support for recording tgid of tasks")
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-09-05 12:00:09 -04:00
Steven Rostedt (VMware) 3d9622c12c tracing: Add barrier to trace_printk() buffer nesting modification
trace_printk() uses 4 buffers, one for each context (normal, softirq, irq
and NMI), such that it does not need to worry about one context preempting
the other. There's a nesting counter that gets incremented to figure out
which buffer to use. If the context gets preempted by another context which
calls trace_printk() it will increment the counter and use the next buffer,
and restore the counter when it is finished.

The problem is that gcc may optimize the modification of the buffer nesting
counter and it may not be incremented in memory before the buffer is used.
If this happens, and the context gets interrupted by another context, it
could pick the same buffer and corrupt the one that is being used.

Compiler barriers need to be added after the nesting variable is incremented
and before it is decremented to prevent usage of the context buffers by more
than one context at the same time.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Fixes: e2ace00117 ("tracing: Choose static tp_printk buffer by explicit nesting count")
Hat-tip-to: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-09-05 11:54:33 -04:00
David S. Miller 6026e043d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three cases of simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-01 17:42:05 -07:00
Steven Rostedt (VMware) edb096e007 ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
If function tracing is disabled by the user via the function-trace option or
the proc sysctl file, and a ftrace_ops that was allocated on the heap is
unregistered, then the shutdown code exits out without doing the proper
clean up. This was found via kmemleak and running the ftrace selftests, as
one of the tests unregisters with function tracing disabled.

 # cat kmemleak
unreferenced object 0xffffffffa0020000 (size 4096):
  comm "swapper/0", pid 1, jiffies 4294668889 (age 569.209s)
  hex dump (first 32 bytes):
    55 ff 74 24 10 55 48 89 e5 ff 74 24 18 55 48 89  U.t$.UH...t$.UH.
    e5 48 81 ec a8 00 00 00 48 89 44 24 50 48 89 4c  .H......H.D$PH.L
  backtrace:
    [<ffffffff81d64665>] kmemleak_vmalloc+0x85/0xf0
    [<ffffffff81355631>] __vmalloc_node_range+0x281/0x3e0
    [<ffffffff8109697f>] module_alloc+0x4f/0x90
    [<ffffffff81091170>] arch_ftrace_update_trampoline+0x160/0x420
    [<ffffffff81249947>] ftrace_startup+0xe7/0x300
    [<ffffffff81249bd2>] register_ftrace_function+0x72/0x90
    [<ffffffff81263786>] trace_selftest_ops+0x204/0x397
    [<ffffffff82bb8971>] trace_selftest_startup_function+0x394/0x624
    [<ffffffff81263a75>] run_tracer_selftest+0x15c/0x1d7
    [<ffffffff82bb83f1>] init_trace_selftests+0x75/0x192
    [<ffffffff81002230>] do_one_initcall+0x90/0x1e2
    [<ffffffff82b7d620>] kernel_init_freeable+0x350/0x3fe
    [<ffffffff81d61ec3>] kernel_init+0x13/0x122
    [<ffffffff81d72c6a>] ret_from_fork+0x2a/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff

Cc: stable@vger.kernel.org
Fixes: 12cce594fa ("ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-09-01 13:55:49 -04:00
Steven Rostedt (VMware) 46320a6acc ftrace: Fix selftest goto location on error
In the second iteration of trace_selftest_ops(), the error goto label is
wrong in the case where trace_selftest_test_global_cnt is off. In the
case of error, it leaks the dynamic ops that was allocated.

Cc: stable@vger.kernel.org
Fixes: 95950c2e ("ftrace: Add self-tests for multiple function trace users")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-09-01 12:04:09 -04:00
Steven Rostedt (VMware) 2a5bfe4762 ftrace: Zero out ftrace hashes when a module is removed
When a ftrace filter has a module function, and that module is removed, the
filter still has its address as being enabled. This can cause interesting
side effects. Nothing dangerous, but unwanted functions can be traced
because of it.

 # cd /sys/kernel/tracing
 # echo ':mod:snd_seq' > set_ftrace_filter
 # cat set_ftrace_filter
snd_use_lock_sync_helper [snd_seq]
check_event_type_and_length [snd_seq]
snd_seq_ioctl_pversion [snd_seq]
snd_seq_ioctl_client_id [snd_seq]
snd_seq_ioctl_get_queue_tempo [snd_seq]
update_timestamp_of_queue [snd_seq]
snd_seq_ioctl_get_queue_status [snd_seq]
snd_seq_set_queue_tempo [snd_seq]
snd_seq_ioctl_set_queue_tempo [snd_seq]
snd_seq_ioctl_get_queue_timer [snd_seq]
seq_free_client1 [snd_seq]
[..]
 # rmmod snd_seq
 # cat set_ftrace_filter

 # modprobe kvm
 # cat set_ftrace_filter
kvm_set_cr4 [kvm]
kvm_emulate_hypercall [kvm]
kvm_set_dr [kvm]

This is because removing the snd_seq module after it was being filtered,
left the address of the snd_seq functions in the hash. When the kvm module
was loaded, some of its functions were loaded at the same address as the
snd_seq module. This would enable them to be filtered and traced.

Now we don't want to clear the hash completely. That would cause removing a
module where only its functions are filtered, to cause the tracing to enable
all functions, as an empty filter means to trace all functions. Instead,
just set the hash ip address to zero. Then it will never match any function.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-31 19:55:12 -04:00
Steven Rostedt (VMware) 065e63f951 tracing: Only have rmmod clear buffers that its events were active in
Currently, when a module event is enabled, when that module is removed, it
clears all ring buffers. This is to prevent another module from being loaded
and having one of its trace event IDs from reusing a trace event ID of the
removed module. This could cause undesirable effects as the trace event of
the new module would be using its own processing algorithms to process raw
data of another event. To prevent this, when a module is loaded, if any of
its events have been used (signified by the WAS_ENABLED event call flag,
which is never cleared), all ring buffers are cleared, just in case any one
of them contains event data of the removed event.

The problem is, there's no reason to clear all ring buffers if only one (or
less than all of them) uses one of the events. Instead, only clear the ring
buffers that recorded the events of a module that is being removed.

To do this, instead of keeping the WAS_ENABLED flag with the trace event
call, move it to the per instance (per ring buffer) event file descriptor.
The event file descriptor maps each event to a separate ring buffer
instance. Then when the module is removed, only the ring buffers that
activated one of the module's events get cleared. The rest are not touched.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-31 17:47:38 -04:00
Zhou Chengming 75e8387685 perf/ftrace: Fix double traces of perf on ftrace:function
When running perf on the ftrace:function tracepoint, there is a bug
which can be reproduced by:

  perf record -e ftrace:function -a sleep 20 &
  perf record -e ftrace:function ls
  perf script

              ls 10304 [005]   171.853235: ftrace:function:
  perf_output_begin
              ls 10304 [005]   171.853237: ftrace:function:
  perf_output_begin
              ls 10304 [005]   171.853239: ftrace:function:
  task_tgid_nr_ns
              ls 10304 [005]   171.853240: ftrace:function:
  task_tgid_nr_ns
              ls 10304 [005]   171.853242: ftrace:function:
  __task_pid_nr_ns
              ls 10304 [005]   171.853244: ftrace:function:
  __task_pid_nr_ns

We can see that all the function traces are doubled.

The problem is caused by the inconsistency of the register
function perf_ftrace_event_register() with the probe function
perf_ftrace_function_call(). The former registers one probe
for every perf_event. And the latter handles all perf_events
on the current cpu. So when two perf_events on the current cpu,
the traces of them will be doubled.

So this patch adds an extra parameter "event" for perf_tp_event,
only send sample data to this event when it's not NULL.

Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: alexander.shishkin@linux.intel.com
Cc: huawei.libin@huawei.com
Link: http://lkml.kernel.org/r/1503668977-12526-1-git-send-email-zhouchengming1@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29 13:29:29 +02:00
Linus Torvalds 415be6c256 Various bug fixes:
- Two small memory leaks in error paths.
 
  - A missed return error code on an error path.
 
  - A fix to check the tracing ring buffer CPU when it doesn't
    exist (caused by setting maxcpus on the command line that is less
    than the actual number of CPUs, and then onlining them manually).
 
  - A fix to have the reset of boot tracers called by lateinit_sync()
    instead of just lateinit(). As some of the tracers register via
    lateinit(), and if the clear happens before the tracer is registered,
    it will never start even though it was told to via the kernel command
    line.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEQEw9Eu0DdyUUkuUUybkF8mrZjcsFAlme4DwUHHJvc3RlZHRA
 Z29vZG1pcy5vcmcACgkQybkF8mrZjcssNwf+Itap7Mtbk48wJYNqfjk1pzyiOcYV
 WM88EOBFM46dttVN6cBs2uUmtdvmX/g52RtsHzG6ZbwxzLE+tIGbSO2plGoknOyD
 lro5CSHT2j3bu0enqkxfznDUT0PNrELEaYBoMK0yhMsXm0v+XqHUxkIqb19Ubuo+
 ORPBShZghJtAiEBFArV1nXBW1kzrIFJwjymdF2ccqUlg+XxtPS1wgnZPIOjCa8ia
 YM4bX3aTUh4LiUUvS7FlJsrwjB+JFOHdXu1Vg140CvJEon1a+bW4Jx88MxoN6zrp
 xmFlXm/8MLWz27GO11IkveH01mSrdP67bKIIx8v2ybPBbwsW0Msb2HfZkw==
 =rUJQ
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Various bug fixes:

   - Two small memory leaks in error paths.

   - A missed return error code on an error path.

   - A fix to check the tracing ring buffer CPU when it doesn't exist
     (caused by setting maxcpus on the command line that is less than
     the actual number of CPUs, and then onlining them manually).

   - A fix to have the reset of boot tracers called by lateinit_sync()
     instead of just lateinit(). As some of the tracers register via
     lateinit(), and if the clear happens before the tracer is
     registered, it will never start even though it was told to via the
     kernel command line"

* tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix freeing of filter in create_filter() when set_str is false
  tracing: Fix kmemleak in tracing_map_array_free()
  ftrace: Check for null ret_stack on profile function graph entry function
  ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
  tracing: Missing error code in tracer_alloc_buffers()
  tracing: Call clear_boot_tracer() at lateinit_sync
2017-08-24 14:08:22 -07:00
Steven Rostedt (VMware) 8b0db1a5bd tracing: Fix freeing of filter in create_filter() when set_str is false
Performing the following task with kmemleak enabled:

 # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
 # echo 'enable_event:kmem:kmalloc:3 if irq >' > trigger
 # echo 'enable_event:kmem:kmalloc:3 if irq > 31' > trigger
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800b9290308 (size 32):
  comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81cef5aa>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff81357938>] kmem_cache_alloc_trace+0x158/0x290
    [<ffffffff81261c09>] create_filter_start.constprop.28+0x99/0x940
    [<ffffffff812639c9>] create_filter+0xa9/0x160
    [<ffffffff81263bdc>] create_event_filter+0xc/0x10
    [<ffffffff812655e5>] set_trigger_filter+0xe5/0x210
    [<ffffffff812660c4>] event_enable_trigger_func+0x324/0x490
    [<ffffffff812652e2>] event_trigger_write+0x1a2/0x260
    [<ffffffff8138cf87>] __vfs_write+0xd7/0x380
    [<ffffffff8138f421>] vfs_write+0x101/0x260
    [<ffffffff8139187b>] SyS_write+0xab/0x130
    [<ffffffff81cfd501>] entry_SYSCALL_64_fastpath+0x1f/0xbe
    [<ffffffffffffffff>] 0xffffffffffffffff

The function create_filter() is passed a 'filterp' pointer that gets
allocated, and if "set_str" is true, it is up to the caller to free it, even
on error. The problem is that the pointer is not freed by create_filter()
when set_str is false. This is a bug, and it is not up to the caller to free
the filter on error if it doesn't care about the string.

Link: http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 38b78eb85 ("tracing: Factorize filter creation")
Reported-by: Chunyu Hu <chuhu@redhat.com>
Tested-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-24 10:07:38 -04:00
Chunyu Hu 475bb3c69a tracing: Fix kmemleak in tracing_map_array_free()
kmemleak reported the below leak when I was doing clear of the hist
trigger. With this patch, the kmeamleak is gone.

unreferenced object 0xffff94322b63d760 (size 32):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 01 00 00 04 00 00 00 08 00 00 00 ff 00 00 00  ................
    10 00 00 00 00 00 00 00 80 a8 7a f2 31 94 ff ff  ..........z.1...
  backtrace:
    [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff9e424cba>] kmem_cache_alloc_trace+0xca/0x1d0
    [<ffffffff9e377736>] tracing_map_array_alloc+0x26/0x140
    [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
    [<ffffffff9e38b935>] create_hist_data+0x535/0x750
    [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
    [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
    [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
    [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
    [<ffffffff9e450b85>] SyS_write+0x55/0xc0
    [<ffffffff9e203857>] do_syscall_64+0x67/0x150
    [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff9431f27aa880 (size 128):
  comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
  hex dump (first 32 bytes):
    00 00 8c 2a 32 94 ff ff 00 f0 8b 2a 32 94 ff ff  ...*2......*2...
    00 e0 8b 2a 32 94 ff ff 00 d0 8b 2a 32 94 ff ff  ...*2......*2...
  backtrace:
    [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff9e425348>] __kmalloc+0xe8/0x220
    [<ffffffff9e3777c1>] tracing_map_array_alloc+0xb1/0x140
    [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
    [<ffffffff9e38b935>] create_hist_data+0x535/0x750
    [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
    [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
    [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
    [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
    [<ffffffff9e450b85>] SyS_write+0x55/0xc0
    [<ffffffff9e203857>] do_syscall_64+0x67/0x150
    [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
    [<ffffffffffffffff>] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/1502705898-27571-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 08d43a5fa0 ("tracing: Add lock-free tracing_map")
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-24 10:05:51 -04:00
Steven Rostedt (VMware) a8f0f9e499 ftrace: Check for null ret_stack on profile function graph entry function
There's a small race when function graph shutsdown and the calling of the
registered function graph entry callback. The callback must not reference
the task's ret_stack without first checking that it is not NULL. Note, when
a ret_stack is allocated for a task, it stays allocated until the task exits.
The problem here, is that function_graph is shutdown, and a new task was
created, which doesn't have its ret_stack allocated. But since some of the
functions are still being traced, the callbacks can still be called.

The normal function_graph code handles this, but starting with commit
8861dd303c ("ftrace: Access ret_stack->subtime only in the function
profiler") the profiler code references the ret_stack on function entry, but
doesn't check if it is NULL first.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611

Cc: stable@vger.kernel.org
Fixes: 8861dd303c ("ftrace: Access ret_stack->subtime only in the function profiler")
Reported-by: lilydjwg@gmail.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-24 10:04:01 -04:00
Christoph Hellwig 74d46992e0 block: replace bi_bdev with a gendisk pointer and partitions index
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-23 12:49:55 -06:00
David S. Miller 463910e2df Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-08-15 20:23:23 -07:00
Daniel Borkmann 88a5c690b6 bpf: fix bpf_trace_printk on 32 bit archs
James reported that on MIPS32 bpf_trace_printk() is currently
broken while MIPS64 works fine:

  bpf_trace_printk() uses conditional operators to attempt to
  pass different types to __trace_printk() depending on the
  format operators. This doesn't work as intended on 32-bit
  architectures where u32 and long are passed differently to
  u64, since the result of C conditional operators follows the
  "usual arithmetic conversions" rules, such that the values
  passed to __trace_printk() will always be u64 [causing issues
  later in the va_list handling for vscnprintf()].

  For example the samples/bpf/tracex5 test printed lines like
  below on MIPS32, where the fd and buf have come from the u64
  fd argument, and the size from the buf argument:

    [...] 1180.941542: 0x00000001: write(fd=1, buf=  (null), size=6258688)

  Instead of this:

    [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512)

One way to get it working is to expand various combinations
of argument types into 8 different combinations for 32 bit
and 64 bit kernels. Fix tested by James on MIPS32 and MIPS64
as well that it resolves the issue.

Fixes: 9c959c863f ("tracing: Allow BPF programs to call bpf_trace_printk()")
Reported-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15 17:32:15 -07:00
Yonghong Song cf5f5cea27 bpf: add support for sys_enter_* and sys_exit_* tracepoints
Currently, bpf programs cannot be attached to sys_enter_* and sys_exit_*
style tracepoints. The iovisor/bcc issue #748
(https://github.com/iovisor/bcc/issues/748) documents this issue.
For example, if you try to attach a bpf program to tracepoints
syscalls/sys_enter_newfstat, you will get the following error:
   # ./tools/trace.py t:syscalls:sys_enter_newfstat
   Ioctl(PERF_EVENT_IOC_SET_BPF): Invalid argument
   Failed to attach BPF to tracepoint

The main reason is that syscalls/sys_enter_* and syscalls/sys_exit_*
tracepoints are treated differently from other tracepoints and there
is no bpf hook to it.

This patch adds bpf support for these syscalls tracepoints by
  . permitting bpf attachment in ioctl PERF_EVENT_IOC_SET_BPF
  . calling bpf programs in perf_syscall_enter and perf_syscall_exit

The legality of bpf program ctx access is also checked.
Function trace_event_get_offsets returns correct max offset for each
specific syscall tracepoint, which is compared against the maximum offset
access in bpf program.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:09:48 -07:00
Steven Rostedt (VMware) a7e52ad7ed ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
Chunyu Hu reported:
  "per_cpu trace directories and files are created for all possible cpus,
   but only the cpus which have ever been on-lined have their own per cpu
   ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
   open handler for trace file 'trace_pipe_raw' is always trying to access
   field of ring_buffer_per_cpu, and would panic with the NULL pointer.

   Align the behavior of trace_pipe_raw with trace_pipe, that returns -NODEV
   when openning it if that cpu does not have trace ring buffer.

   Reproduce:
   cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
   (cpu31 is never on-lined, this is a 16 cores x86_64 box)

   Tested with:
   1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
      Got -NODEV.
   2) oneline cpu15, read trace_pipe_raw of cpu15.
      Get the raw trace data.

   Call trace:
   [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
   [ 5760.961678]  tracing_buffers_read+0x1f6/0x230
   [ 5760.962695]  __vfs_read+0x37/0x160
   [ 5760.963498]  ? __vfs_read+0x5/0x160
   [ 5760.964339]  ? security_file_permission+0x9d/0xc0
   [ 5760.965451]  ? __vfs_read+0x5/0x160
   [ 5760.966280]  vfs_read+0x8c/0x130
   [ 5760.967070]  SyS_read+0x55/0xc0
   [ 5760.967779]  do_syscall_64+0x67/0x150
   [ 5760.968687]  entry_SYSCALL64_slow_path+0x25/0x25"

This was introduced by the addition of the feature to reuse reader pages
instead of re-allocating them. The problem is that the allocation of a
reader page (which is per cpu) does not check if the cpu is online and set
up for the ring buffer.

Link: http://lkml.kernel.org/r/1500880866-1177-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: 73a757e631 ("ring-buffer: Return reader page back into existing ring buffer")
Reported-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-02 14:23:02 -04:00
Dan Carpenter 147d88e0b5 tracing: Missing error code in tracer_alloc_buffers()
If ring_buffer_alloc() or one of the next couple function calls fail
then we should return -ENOMEM but the current code returns success.

Link: http://lkml.kernel.org/r/20170801110201.ajdkct7vwzixahvx@mwanda

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: b32614c034 ('tracing/rb: Convert to hotplug state machine')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-02 14:19:57 -04:00
Steven Rostedt (VMware) 4bb0f0e73c tracing: Call clear_boot_tracer() at lateinit_sync
The clear_boot_tracer function is used to reset the default_bootup_tracer
string to prevent it from being accessed after boot, as it originally points
to init data. But since clear_boot_tracer() is called via the
init_lateinit() call, it races with the initcall for registering the hwlat
tracer. If someone adds "ftrace=hwlat" to the kernel command line, depending
on how the linker sets up the text, the saved command line may be cleared,
and the hwlat tracer never is initialized.

Simply have the clear_boot_tracer() be called by initcall_lateinit_sync() as
that's for tasks to be called after lateinit.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196551

Cc: stable@vger.kernel.org
Fixes: e7c15cd8a ("tracing: Added hardware latency tracer")
Reported-by: Zamir SUN <sztsian@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-02 14:19:57 -04:00
Shaohua Li 35fe6d7632 block: use standard blktrace API to output cgroup info for debug notes
Currently cfq/bfq/blk-throttle output cgroup info in trace in their own
way. Now we have standard blktrace API for this, so convert them to use
it.

Note, this changes the behavior a little bit. cgroup info isn't output
by default, we only do this with 'blk_cgroup' option enabled. cgroup
info isn't output as a string by default too, we only do this with
'blk_cgname' option enabled. Also cgroup info is output in different
position of the note string. I think these behavior changes aren't a big
issue (actually we make trace data shorter which is good), since the
blktrace note is solely for debugging.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-29 09:00:03 -06:00
Shaohua Li 69fd5c3917 blktrace: add an option to allow displaying cgroup path
By default we output cgroup id in blktrace. This adds an option to
display cgroup path. Since get cgroup path is a relativly heavy
operation, we don't enable it by default.

with the option enabled, blktrace will output something like this:
dd-1353  [007] d..2   293.015252:   8,0   /test/level  D   R 24 + 8 [dd]

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-29 09:00:03 -06:00
Shaohua Li ca1136c99b blktrace: export cgroup info in trace
Currently blktrace isn't cgroup aware. blktrace prints out task name of
current context, but the task of current context isn't always in the
cgroup where the BIO comes from. We can't use task name to find out IO
cgroup. For example, Writeback BIOs always comes from flusher thread but
the BIOs are for different blk cgroups. Request could be requeued and
dispatched from completely different tasks. MD/DM are another examples.

This patch tries to fix the gap. We print out cgroup fhandle info in
blktrace. Userspace can use open_by_handle_at() syscall to find the
cgroup by fhandle. Or userspace can use name_to_handle_at() syscall to
find fhandle for a cgroup and use a BPF program to filter out blktrace
for a specific cgroup.

We add a new 'blk_cgroup' trace option for blk tracer. It's default off.
Application which doesn't know the new option isn't affected.  When it's
on, we output fhandle info right after blk_io_trace with an extra bit
set in event action. So from application point of view, blktrace with
the option will output new actions.

I didn't change blk trace event yet, since I'm not sure if changing the
trace event output is an ABI issue. If not, I'll do it later.

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-07-29 09:00:03 -06:00
Chunyan Zhang f86f418059 trace: fix the errors caused by incompatible type of RCU variables
The variables which are processed by RCU functions should be annotated
as RCU, otherwise sparse will report the errors like below:

"error: incompatible types in comparison expression (different
address spaces)"

Link: http://lkml.kernel.org/r/1496823171-7758-1-git-send-email-zhang.chunyan@linaro.org

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
[ Updated to not be 100% 80 column strict ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-20 09:27:29 -04:00
Chunyu Hu db9108e054 tracing: Fix kmemleak in instance_rmdir
Hit the kmemleak when executing instance_rmdir, it forgot releasing
mem of tracing_cpumask. With this fix, the warn does not appear any
more.

unreferenced object 0xffff93a8dfaa7c18 (size 8):
  comm "mkdir", pid 1436, jiffies 4294763622 (age 9134.308s)
  hex dump (first 8 bytes):
    ff ff ff ff ff ff ff ff                          ........
  backtrace:
    [<ffffffff88b6567a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff8861ea41>] __kmalloc_node+0xf1/0x280
    [<ffffffff88b505d3>] alloc_cpumask_var_node+0x23/0x30
    [<ffffffff88b5060e>] alloc_cpumask_var+0xe/0x10
    [<ffffffff88571ab0>] instance_mkdir+0x90/0x240
    [<ffffffff886e5100>] tracefs_syscall_mkdir+0x40/0x70
    [<ffffffff886565c9>] vfs_mkdir+0x109/0x1b0
    [<ffffffff8865b1d0>] SyS_mkdir+0xd0/0x100
    [<ffffffff88403857>] do_syscall_64+0x67/0x150
    [<ffffffff88b710e7>] return_from_SYSCALL_64+0x0/0x6a
    [<ffffffffffffffff>] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/1500546969-12594-1-git-send-email-chuhu@redhat.com

Cc: stable@vger.kernel.org
Fixes: ccfe9e42e4 ("tracing: Make tracing_cpumask available for all instances")
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-20 09:24:25 -04:00
Joel Fernandes 848618857d tracing/ring_buffer: Try harder to allocate
ftrace can fail to allocate per-CPU ring buffer on systems with a large
number of CPUs coupled while large amounts of cache happening in the
page cache. Currently the ring buffer allocation doesn't retry in the VM
implementation even if direct-reclaim made some progress but still
wasn't able to find a free page. On retrying I see that the allocations
almost always succeed. The retry doesn't happen because __GFP_NORETRY is
used in the tracer to prevent the case where we might OOM, however if we
drop __GFP_NORETRY, we risk destabilizing the system if OOM killer is
triggered. To prevent this situation, use the __GFP_RETRY_MAYFAIL flag
introduced recently [1].

Tested the following still succeeds without destabilizing a system with
1GB memory.
echo 300000 > /sys/kernel/debug/tracing/buffer_size_kb

[1] https://marc.info/?l=linux-mm&m=149820805124906&w=2

Link: http://lkml.kernel.org/r/20170713021416.8897-1-joelaf@google.com

Cc: Tim Murray <timmurray@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-19 08:22:12 -04:00
Linus Torvalds bc0f51d359 A few more minor updates:
- Show the tgid mappings for user space trace tools to use
 
  - Fix and optimize the comm and tgid cache recording
 
  - Sanitize derived kprobe names
 
  - Ftrace selftest updates
 
  - trace file header fix
 
  - Update of Documentation/trace/ftrace.txt
 
  - Compiler warning fixes
 
  - Fix possible uninitialized variable
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJZZ2rbFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 V3MIAI3NZ3dr0dKJ7DMF1jsQc24YF/bMG2noWm2b9+H/sO+gbnJKsizqzrB2Cm8S
 lFCYGSydLKGGZgKob3wkAX15iO2fxcUvJOKzkKxmyDbwAteABRf9LSr/llthRIsT
 8kSPI5bgJ5dah+lvhl9+1ekarsIZGr41svY97Knj9A2K18kQplnSNqgatkIuV2Kn
 hIoiPI0tG2y27In2JJoaTedAHj4NIwmI3nhTt6nks0GN7ICx3bMcvdE9l+zB+OLJ
 akAehsTk3kcNb66ttoj6ZTzGZ7kaes96Cl6/uamVpXzh3SXla36ux1r9Kp8bgONE
 EgrJwbRwU8BMDaattutDxT7/XmU=
 =TPGB
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:
 "A few more minor updates:

   - Show the tgid mappings for user space trace tools to use

   - Fix and optimize the comm and tgid cache recording

   - Sanitize derived kprobe names

   - Ftrace selftest updates

   - trace file header fix

   - Update of Documentation/trace/ftrace.txt

   - Compiler warning fixes

   - Fix possible uninitialized variable"

* tag 'trace-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix uninitialized variable in match_records()
  ftrace: Remove an unneeded NULL check
  ftrace: Hide cached module code for !CONFIG_MODULES
  tracing: Do note expose stack_trace_filter without DYNAMIC_FTRACE
  tracing: Update Documentation/trace/ftrace.txt
  tracing: Fixup trace file header alignment
  selftests/ftrace: Add a testcase for kprobe event naming
  selftests/ftrace: Add a test to probe module functions
  selftests/ftrace: Update multiple kprobes test for powerpc
  trace/kprobes: Sanitize derived event names
  tracing: Attempt to record other information even if some fail
  tracing: Treat recording tgid for idle task as a success
  tracing: Treat recording comm for idle task as a success
  tracing: Add saved_tgids file to show cached pid to tgid mappings
2017-07-13 13:17:19 -07:00
Dan Carpenter 2e028c4fe1 ftrace: Fix uninitialized variable in match_records()
My static checker complains that if "func" is NULL then "clear_filter"
is uninitialized.  This seems like it could be true, although it's
possible something subtle is happening that I haven't seen.

    kernel/trace/ftrace.c:3844 match_records()
    error: uninitialized symbol 'clear_filter'.

Link: http://lkml.kernel.org/r/20170712073556.h6tkpjcdzjaozozs@mwanda

Cc: stable@vger.kernel.org
Fixes: f0a3b154bd ("ftrace: Clarify code for mod command")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-12 09:48:31 -04:00
Dan Carpenter 44925dfff0 ftrace: Remove an unneeded NULL check
"func" can't be NULL and it doesn't make sense to check because we've
already derefenced it.

Link: http://lkml.kernel.org/r/20170712073340.4enzeojeoupuds5a@mwanda

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-12 09:45:42 -04:00
Arnd Bergmann 69449bbd65 ftrace: Hide cached module code for !CONFIG_MODULES
When modules are disabled, we get a harmless build warning:

kernel/trace/ftrace.c:4051:13: error: 'process_cached_mods' defined but not used [-Werror=unused-function]

This adds the same #ifdef around the new code that exists around
its caller.

Link: http://lkml.kernel.org/r/20170710084413.1820568-1-arnd@arndb.de

Fixes: d7fbf8df7c ("ftrace: Implement cached modules tracing on module load")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-11 19:29:04 -04:00
Steven Rostedt (VMware) bbd1d27d86 tracing: Do note expose stack_trace_filter without DYNAMIC_FTRACE
The "stack_trace_filter" file only makes sense if DYNAMIC_FTRACE is
configured in. If it is not, then the user can not filter any functions.

Not only that, the open function causes warnings when DYNAMIC_FTRACE is not
set.

Link: http://lkml.kernel.org/r/20170710110521.600806-1-arnd@arndb.de

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-11 19:21:04 -04:00
Steven Rostedt (VMware) b11fb73743 tracing: Fixup trace file header alignment
The addition of TGID to the tracing header added a check to see if TGID
shoudl be displayed or not, and updated the header accordingly.
Unfortunately, it broke the default header.

Also add constant strings to use for spacing. This does remove the
visibility of the header a bit, but cuts it down from the extended lines
much greater than 80 characters.

Before this change:

 # tracer: function
 #
 #                            _-----=> irqs-off
 #                           / _----=> need-resched
 #                          | / _---=> hardirq/softirq
 #                          || / _--=> preempt-depth
 #                          ||| /     delay
 #           TASK-PID   CPU#||||    TIMESTAMP  FUNCTION
 #              | |       | ||||       |         |
        swapper/0-1     [000] ....     0.277830: migration_init <-do_one_initcall
        swapper/0-1     [002] d...    13.861967: Unknown type 1201
        swapper/0-1     [002] d..1    13.861970: Unknown type 1202

After this change:

 # tracer: function
 #
 #                              _-----=> irqs-off
 #                             / _----=> need-resched
 #                            | / _---=> hardirq/softirq
 #                            || / _--=> preempt-depth
 #                            ||| /     delay
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
        swapper/0-1     [000] ....     0.278245: migration_init <-do_one_initcall
        swapper/0-1     [003] d...    13.861189: Unknown type 1201
        swapper/0-1     [003] d..1    13.861192: Unknown type 1202

Cc: Joel Fernandes <joelaf@google.com>
Fixes: 441dae8f2f ("tracing: Add support for display of tgid in trace output")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-11 16:48:19 -04:00
Linus Torvalds c3931a87db Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A couple of fixes for perf and kprobes:

   - Add he missing exclude_kernel attribute for the precise_ip level so
     !CAP_SYS_ADMIN users get the proper results.

   - Warn instead of failing completely when perf has no unwind support
     for a particular architectiure built in.

   - Ensure that jprobes are at function entry and not at some random
     place"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Ensure that jprobe probepoints are at function entry
  kprobes: Simplify register_jprobes()
  kprobes: Rename [arch_]function_offset_within_entry() to [arch_]kprobe_on_func_entry()
  perf unwind: Do not fail due to missing unwind support
  perf evsel: Set attr.exclude_kernel when probing max attr.precise_ip
2017-07-09 10:49:47 -07:00
Naveen N. Rao fca18a47cf trace/kprobes: Sanitize derived event names
When we derive event names, convert some expected symbols (such as ':'
used to specify module:name and '.' present in some symbols) into
underscores so that the event name is not rejected.

Before this patch:
    # echo 'p kobject_example:foo_store' > kprobe_events
    trace_kprobe: Failed to allocate trace_probe.(-22)
    -sh: write error: Invalid argument

After this patch:
    # echo 'p kobject_example:foo_store' > kprobe_events
    # cat kprobe_events
    p:kprobes/p_kobject_example_foo_store_0 kobject_example:foo_store

Link: http://lkml.kernel.org/r/66c189e09e71361aba91dd4a5bd146a1b62a7a51.1499453040.git.naveen.n.rao@linux.vnet.ibm.com

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-09 07:45:53 -04:00
Naveen N. Rao 659b957f20 kprobes: Rename [arch_]function_offset_within_entry() to [arch_]kprobe_on_func_entry()
Rename function_offset_within_entry() to scope it to kprobe namespace by
using kprobe_ prefix, and to also simplify it.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3aa6c7e2e4fb6e00f3c24fa306496a66edb558ea.1499443367.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-08 11:05:34 +02:00
Linus Torvalds ef3ad0898a linux-kselftest-4.13-rc1-update
This update consists of:
 
 -- TAP13 framework and changes to some tests to convert to TAP13.
    Converting kselftest output to standard format will help identify
    run to run differences and pin point failures easily. TAP13 format
    has been in use for several years and the output is human friendly.
 
    Please find the specification:
    https://testanything.org/tap-version-13-specification.html
 
    Credit goes to Tim Bird for recommending TAP13 as a suitable format,
    and to Grag KH for kick starting the work with help from Paul Elder
    and Alice Ferrazzi
 
    The first phase of the TAp13 conversion is included in this update.
    Future updates will include updates to rest of the tests.
 
 -- Masami Hiramatsu fixed ftrace to run on 4.9 stable kernels.
 
 -- Kselftest documnetation has been converted to ReST format. Document
    now has a new home under Documentation/dev-tools.
 
 -- kselftest_harness.h is now available for general use as a result of
    Mickaël Salaün's work.
 
 -- Several fixes to skip and/or fail tests gracefully on older releases.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZXo9JAAoJEAsCRMQNDUMc1OUQAOJsBFWiMgWWxOZg1RBT5khl
 7OvGLoHsu3qydF5gzVnyDuEZAGHRc4c6OKqbHIqQB3tp9o4PnX2m9KIa6z7sjzys
 jett2ZjMe7BtctBluZF0zVyCbRdAXgfxp7QGfv/CkN+hw4uztwFwen4LpwvJseLd
 gkie/lVPFKszyaWfiF3pDPazk5qhc53ChjAhnSkRY8HlwFcVtZwO7Ptvex0l8gO2
 t+ZxhX9zt3jxRbiHq5h/N6EDw2pPthvSR4iT4FcyYiwqxUK64Nq5RQpkxJTfu0iz
 l2mxMTNol/tDKH+iOvWJX565LzVXxonCf8Cne4mooqegkn0f2bnkPqoE5N8OwTdd
 oIGT/Vq84C5eQwPubtr2oXr6Xh7pywbPW8h7fn972QWl5ySbR4JEmdBzSviF5ALq
 Dwz8lJeGX6qYpSKz8aVqKYJ3U31hYxT/EPhGIJ4VtjcTxyfgcobaD26W0vT0Cjad
 dIdK11IDMxErquS1Vb/kkTzVxCnVhmWRsjmUeKLl/FxDkhiJmjIxaCOvtitzsiHz
 tooMpcCQ7Z97QbDxKfolpcCC563okYhUoca3EhZLq9pZkEwfbGN9YI4/i608oSaA
 K4mJgdL6c704TqGwouIBn/+MTWq4LOkzN2zUP0kpY2z61GvEPMYxmdoQBn2yHBb9
 cnt9MZNlZML2YqnMjiDf
 =j1Um
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-4.13-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:
 "This update consists of:

   - TAP13 framework and changes to some tests to convert to TAP13.
     Converting kselftest output to standard format will help identify
     run to run differences and pin point failures easily. TAP13 format
     has been in use for several years and the output is human friendly.

     Please find the specification:
       https://testanything.org/tap-version-13-specification.html

     Credit goes to Tim Bird for recommending TAP13 as a suitable
     format, and to Grag KH for kick starting the work with help from
     Paul Elder and Alice Ferrazzi

     The first phase of the TAp13 conversion is included in this update.
     Future updates will include updates to rest of the tests.

   - Masami Hiramatsu fixed ftrace to run on 4.9 stable kernels.

   - Kselftest documnetation has been converted to ReST format. Document
     now has a new home under Documentation/dev-tools.

   - kselftest_harness.h is now available for general use as a result of
     Mickaël Salaün's work.

   - Several fixes to skip and/or fail tests gracefully on older
     releases"

* tag 'linux-kselftest-4.13-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (48 commits)
  selftests: membarrier: use ksft_* var arg msg api
  selftests: breakpoints: breakpoint_test_arm64: convert test to use TAP13
  selftests: breakpoints: step_after_suspend_test use ksft_* var arg msg api
  selftests: breakpoint_test: use ksft_* var arg msg api
  kselftest: add ksft_print_msg() function to output general information
  kselftest: make ksft_* output functions variadic
  selftests/capabilities: Fix the test_execve test
  selftests: intel_pstate: add .gitignore
  selftests: fix memory-hotplug test
  selftests: add missing test name in memory-hotplug test
  selftests: check percentage range for memory-hotplug test
  selftests: check hot-pluggagble memory for memory-hotplug test
  selftests: typo correction for memory-hotplug test
  selftests: ftrace: Use md5sum to take less time of checking logs
  tools/testing/selftests/sysctl: Add pre-check to the value of writes_strict
  kselftest.rst: do some adjustments after ReST conversion
  selftest/net/Makefile: Specify output with $(OUTPUT)
  selftest/intel_pstate/aperf: Use LDLIBS instead of LDFLAGS
  selftest/memfd/Makefile: Fix build error
  selftests: lib: Skip tests on missing test modules
  ...
2017-07-07 14:04:47 -07:00
Joel Fernandes 29b1a8ad7d tracing: Attempt to record other information even if some fail
In recent patches where we record comm and tgid at the same time, we skip
continuing to record if any fail. Fix that by trying to record as many things
as we can even if some couldn't be recorded. If any information isn't recorded,
then we don't set trace_taskinfo_save as before.

Link: http://lkml.kernel.org/r/20170706230023.17942-3-joelaf@google.com

Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-07 09:11:34 -04:00
Joel Fernandes bd45d34d25 tracing: Treat recording tgid for idle task as a success
Currently we stop recording tgid for non-idle tasks when switching from/to idle
task since we treat that as a record failure. Fix that by treat recording of
tgid for idle task as a success.

Link: http://lkml.kernel.org/r/20170706230023.17942-2-joelaf@google.com

Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Reported-by: Michael Sartain <mikesart@gmail.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-07 09:04:23 -04:00
Joel Fernandes eaf260ac04 tracing: Treat recording comm for idle task as a success
Currently we stop recording comm for non-idle tasks when switching from/to idle
task since we treat that as a record failure. Fix that by treat recording of
comm for idle task as a success.

Link: http://lkml.kernel.org/r/20170706230023.17942-1-joelaf@google.com

Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Reported-by: Michael Sartain <mikesart@gmail.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-07 09:04:14 -04:00
Linus Torvalds 2074006dac The new features of this release:
- Added TRACE_DEFINE_SIZEOF() which allows trace events that use
     sizeof() it the TP_printk() to be converted to the actual size such
     that trace-cmd and perf can parse them correctly.
 
   - Some rework of the TRACE_DEFINE_ENUM() such that the above
     TRACE_DEFINE_SIZEOF() could reuse the same code.
 
   - Recording of tgid (Thread Group ID). This is similar to how
     task COMMs are recorded (cached at sched_switch), where it is
     in a table and used on output of the trace and trace_pipe files.
 
   - Have ":mod:<module>" be cached when written into set_ftrace_filter.
     Then the functions of the module will be traced at module load.
 
   - Some random clean ups and small fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJZXjYuFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 fsgIAKUvhpn2igoYCR9tWqu+DovEmwxCIumbCzmCFQcRKlLttRte94yY5+W9hnV0
 JPzd9T9zBDVqq1fI7iIop1SuTwEfKW6lJom0usZ8AFpK+YKm6FHnQ28POlvHzre2
 lzO41tpRWiehLQsITZ47eByhsvEfhx86mYT/oM1JSR6Pii1OpjyNYmDMw6BaMNBT
 kSCQFgIhzAhVuHjwAnB/S++E/ou7M5bCwCb5CNh7MubKubV5upHpoJcgYGO+WWa6
 56H/iEhff4EECTGJVefd8e78MtJPL8EsuM0nAcMPlnl8AaiOpP7XCdlgTwdefLvP
 b3o+nP15voSHkARGXC6eM6gH0po=
 =rvGB
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "The new features of this release:

   - Added TRACE_DEFINE_SIZEOF() which allows trace events that use
     sizeof() it the TP_printk() to be converted to the actual size such
     that trace-cmd and perf can parse them correctly.

   - Some rework of the TRACE_DEFINE_ENUM() such that the above
     TRACE_DEFINE_SIZEOF() could reuse the same code.

   - Recording of tgid (Thread Group ID). This is similar to how task
     COMMs are recorded (cached at sched_switch), where it is in a table
     and used on output of the trace and trace_pipe files.

   - Have ":mod:<module>" be cached when written into set_ftrace_filter.
     Then the functions of the module will be traced at module load.

   - Some random clean ups and small fixes"

* tag 'trace-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (26 commits)
  ftrace: Test for NULL iter->tr in regex for stack_trace_filter changes
  ftrace: Decrement count for dyn_ftrace_total_info for init functions
  ftrace: Unlock hash mutex on failed allocation in process_mod_list()
  tracing: Add support for display of tgid in trace output
  tracing: Add support for recording tgid of tasks
  ftrace: Decrement count for dyn_ftrace_total_info file
  ftrace: Remove unused function ftrace_arch_read_dyn_info()
  sh/ftrace: Remove only user of ftrace_arch_read_dyn_info()
  ftrace: Have cached module filters be an active filter
  ftrace: Implement cached modules tracing on module load
  ftrace: Have the cached module list show in set_ftrace_filter
  ftrace: Add :mod: caching infrastructure to trace_array
  tracing: Show address when function names are not found
  ftrace: Add missing comment for FTRACE_OPS_FL_RCU
  tracing: Rename update the enum_map file
  tracing: Add TRACE_DEFINE_SIZEOF() macros
  tracing: define TRACE_DEFINE_SIZEOF() macro to map sizeof's to their values
  tracing: Rename enum_replace to eval_replace
  trace: rename enum_map functions
  trace: rename trace.c enum functions
  ...
2017-07-06 19:45:45 -07:00
Michael Sartain 99c621d704 tracing: Add saved_tgids file to show cached pid to tgid mappings
Export the cached pid / tgid mappings in debugfs tracing saved_tgids file.
This allows user apps to translate the pids from a trace to their respective
thread group.

Example saved_tgids file with pid / tgid values separated by ' ':

  # cat saved_tgids
  1048 1048
  1047 1047
  7 7
  1049 1047
  1054 1047
  1053 1047

Link: http://lkml.kernel.org/r/20170630004023.064965233@goodmis.org
Link: http://lkml.kernel.org/r/20170706040713.unwkumbta5menygi@mikesart-cos

Reviewed-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Michael Sartain <mikesart@fastmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-06 10:11:53 -04:00
Linus Torvalds 5518b69b76 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Reasonably busy this cycle, but perhaps not as busy as in the 4.12
  merge window:

   1) Several optimizations for UDP processing under high load from
      Paolo Abeni.

   2) Support pacing internally in TCP when using the sch_fq packet
      scheduler for this is not practical. From Eric Dumazet.

   3) Support mutliple filter chains per qdisc, from Jiri Pirko.

   4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

   5) Add batch dequeueing to vhost_net, from Jason Wang.

   6) Flesh out more completely SCTP checksum offload support, from
      Davide Caratti.

   7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
      Neira Ayuso, and Matthias Schiffer.

   8) Add devlink support to nfp driver, from Simon Horman.

   9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
      Prabhu.

  10) Add stack depth tracking to BPF verifier and use this information
      in the various eBPF JITs. From Alexei Starovoitov.

  11) Support XDP on qed device VFs, from Yuval Mintz.

  12) Introduce BPF PROG ID for better introspection of installed BPF
      programs. From Martin KaFai Lau.

  13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

  14) For loads, allow narrower accesses in bpf verifier checking, from
      Yonghong Song.

  15) Support MIPS in the BPF selftests and samples infrastructure, the
      MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
      Daney.

  16) Support kernel based TLS, from Dave Watson and others.

  17) Remove completely DST garbage collection, from Wei Wang.

  18) Allow installing TCP MD5 rules using prefixes, from Ivan
      Delalande.

  19) Add XDP support to Intel i40e driver, from Björn Töpel

  20) Add support for TC flower offload in nfp driver, from Simon
      Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
      Kicinski, and Bert van Leeuwen.

  21) IPSEC offloading support in mlx5, from Ilan Tayari.

  22) Add HW PTP support to macb driver, from Rafal Ozieblo.

  23) Networking refcount_t conversions, From Elena Reshetova.

  24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
      for tuning the TCP sockopt settings of a group of applications,
      currently via CGROUPs"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
  net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
  dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
  cxgb4: Support for get_ts_info ethtool method
  cxgb4: Add PTP Hardware Clock (PHC) support
  cxgb4: time stamping interface for PTP
  nfp: default to chained metadata prepend format
  nfp: remove legacy MAC address lookup
  nfp: improve order of interfaces in breakout mode
  net: macb: remove extraneous return when MACB_EXT_DESC is defined
  bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
  bpf: fix return in load_bpf_file
  mpls: fix rtm policy in mpls_getroute
  net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
  net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
  ...
2017-07-05 12:31:59 -07:00
Steven Rostedt (VMware) 69d71879d2 ftrace: Test for NULL iter->tr in regex for stack_trace_filter changes
As writing into stack_trace_filter, the iter-tr is not set and is NULL.
Check if it is NULL before dereferencing it in ftrace_regex_release().

Fixes: 8c08f0d5c6 ("ftrace: Have cached module filters be an active filter")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-05 09:52:18 -04:00
Steven Rostedt (VMware) 4dce17b26b Merge commit '0f17976568b3f72e676450af0c0db6f8752253d6' into trace/ftrace/core
Need to get the changes from 0f17976568 ("ftrace: Fix regression with
module command in stack_trace_filter") as it is required to fix some other
changes with stack_trace_filter and the new development code.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-07-05 09:51:24 -04:00
Linus Torvalds c6b1e36c8f Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block
Pull core block/IO updates from Jens Axboe:
 "This is the main pull request for the block layer for 4.13. Not a huge
  round in terms of features, but there's a lot of churn related to some
  core cleanups.

  Note this depends on the UUID tree pull request, that Christoph
  already sent out.

  This pull request contains:

   - A series from Christoph, unifying the error/stats codes in the
     block layer. We now use blk_status_t everywhere, instead of using
     different schemes for different places.

   - Also from Christoph, some cleanups around request allocation and IO
     scheduler interactions in blk-mq.

   - And yet another series from Christoph, cleaning up how we handle
     and do bounce buffering in the block layer.

   - A blk-mq debugfs series from Bart, further improving on the support
     we have for exporting internal information to aid debugging IO
     hangs or stalls.

   - Also from Bart, a series that cleans up the request initialization
     differences across types of devices.

   - A series from Goldwyn Rodrigues, allowing the block layer to return
     failure if we will block and the user asked for non-blocking.

   - Patch from Hannes for supporting setting loop devices block size to
     that of the underlying device.

   - Two series of patches from Javier, fixing various issues with
     lightnvm, particular around pblk.

   - A series from me, adding support for write hints. This comes with
     NVMe support as well, so applications can help guide data placement
     on flash to improve performance, latencies, and write
     amplification.

   - A series from Ming, improving and hardening blk-mq support for
     stopping/starting and quiescing hardware queues.

   - Two pull requests for NVMe updates. Nothing major on the feature
     side, but lots of cleanups and bug fixes. From the usual crew.

   - A series from Neil Brown, greatly improving the bio rescue set
     support. Most notably, this kills the bio rescue work queues, if we
     don't really need them.

   - Lots of other little bug fixes that are all over the place"

* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
  lightnvm: pblk: set line bitmap check under debug
  lightnvm: pblk: verify that cache read is still valid
  lightnvm: pblk: add initialization check
  lightnvm: pblk: remove target using async. I/Os
  lightnvm: pblk: use vmalloc for GC data buffer
  lightnvm: pblk: use right metadata buffer for recovery
  lightnvm: pblk: schedule if data is not ready
  lightnvm: pblk: remove unused return variable
  lightnvm: pblk: fix double-free on pblk init
  lightnvm: pblk: fix bad le64 assignations
  nvme: Makefile: remove dead build rule
  blk-mq: map all HWQ also in hyperthreaded system
  nvmet-rdma: register ib_client to not deadlock in device removal
  nvme_fc: fix error recovery on link down.
  nvmet_fc: fix crashes on bad opcodes
  nvme_fc: Fix crash when nvme controller connection fails.
  nvme_fc: replace ioabort msleep loop with completion
  nvme_fc: fix double calls to nvme_cleanup_cmd()
  nvme-fabrics: verify that a controller returns the correct NQN
  nvme: simplify nvme_dev_attrs_are_visible
  ...
2017-07-03 10:34:51 -07:00
John Fastabend 7bda4b40c5 bpf: extend bpf_trace_printk to support %i
Currently, bpf_trace_printk does not support common formatting
symbol '%i' however vsprintf does and is what eventually gets
called by bpf helper. If users are used to '%i' and currently
make use of it, then bpf_trace_printk will just return with
error without dumping anything to the trace pipe, so just add
support for '%i' to the helper.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-03 02:22:52 -07:00
Daniel Borkmann f96da09473 bpf: simplify narrower ctx access
This work tries to make the semantics and code around the
narrower ctx access a bit easier to follow. Right now
everything is done inside the .is_valid_access(). Offset
matching is done differently for read/write types, meaning
writes don't support narrower access and thus matching only
on offsetof(struct foo, bar) is enough whereas for read
case that supports narrower access we must check for
offsetof(struct foo, bar) + offsetof(struct foo, bar) +
sizeof(<bar>) - 1 for each of the cases. For read cases of
individual members that don't support narrower access (like
packet pointers or skb->cb[] case which has its own narrow
access logic), we check as usual only offsetof(struct foo,
bar) like in write case. Then, for the case where narrower
access is allowed, we also need to set the aux info for the
access. Meaning, ctx_field_size and converted_op_size have
to be set. First is the original field size e.g. sizeof(<bar>)
as in above example from the user facing ctx, and latter
one is the target size after actual rewrite happened, thus
for the kernel facing ctx. Also here we need the range match
and we need to keep track changing convert_ctx_access() and
converted_op_size from is_valid_access() as both are not at
the same location.

We can simplify the code a bit: check_ctx_access() becomes
simpler in that we only store ctx_field_size as a meta data
and later in convert_ctx_accesses() we fetch the target_size
right from the location where we do convert. Should the verifier
be misconfigured we do reject for BPF_WRITE cases or target_size
that are not provided. For the subsystems, we always work on
ranges in is_valid_access() and add small helpers for ranges
and narrow access, convert_ctx_accesses() sets target_size
for the relevant instruction.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-03 02:22:52 -07:00
Sabrina Dubroca 9e52b32567 tracing/kprobes: Allow to create probe with a module name starting with a digit
Always try to parse an address, since kstrtoul() will safely fail when
given a symbol as input. If that fails (which will be the case for a
symbol), try to parse a symbol instead.

This allows creating a probe such as:

    p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0

Which is necessary for this command to work:

    perf probe -m 8021q -a vlan_gro_receive

Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net

Cc: stable@vger.kernel.org
Fixes: 413d37d1e ("tracing: Add kprobe-based event tracer")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-29 23:13:23 -04:00
Steven Rostedt (VMware) 0f17976568 ftrace: Fix regression with module command in stack_trace_filter
When doing the following command:

 # echo ":mod:kvm_intel" > /sys/kernel/tracing/stack_trace_filter

it triggered a crash.

This happened with the clean up of probes. It required all callers to the
regex function (doing ftrace filtering) to have ops->private be a pointer to
a trace_array. But for the stack tracer, that is not the case.

Allow for the ops->private to be NULL, and change the function command
callbacks to handle the trace_array pointer being NULL as well.

Fixes: d2afd57a4b ("tracing/ftrace: Allow instances to have their own function probes")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-29 10:05:45 -04:00
Steven Rostedt (VMware) 4ec7846785 ftrace: Decrement count for dyn_ftrace_total_info for init functions
Init boot up functions may be traced, but they are also freed when the
kernel finishes booting. These are removed from the ftrace tables, and the
debug variable for dyn_ftrace_total_info needs to reflect that as well.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-28 11:57:03 -04:00
Steven Rostedt (VMware) 3b58a3c72f ftrace: Unlock hash mutex on failed allocation in process_mod_list()
If the new_hash fails to allocate, then unlock the hash mutex on error.

Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-28 09:09:38 -04:00
Joel Fernandes 441dae8f2f tracing: Add support for display of tgid in trace output
Earlier patches introduced ability to record the tgid using the 'record-tgid'
option. Here we read the tgid and output it if the option is enabled.

Link: http://lkml.kernel.org/r/20170626053844.5746-3-joelaf@google.com

Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Tested-by: Michael Sartain <mikesart@gmail.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-27 13:30:28 -04:00
Joel Fernandes d914ba37d7 tracing: Add support for recording tgid of tasks
Inorder to support recording of tgid, the following changes are made:

* Introduce a new API (tracing_record_taskinfo) to additionally record the tgid
  along with the task's comm at the same time. This has has the benefit of not
  setting trace_cmdline_save before all the information for a task is saved.
* Add a new API tracing_record_taskinfo_sched_switch to record task information
  for 2 tasks at a time (previous and next) and use it from sched_switch probe.
* Preserve the old API (tracing_record_cmdline) and create it as a wrapper
  around the new one so that existing callers aren't affected.
* Reuse the existing sched_switch and sched_wakeup probes to record tgid
  information and add a new option 'record-tgid' to enable recording of tgid

When record-tgid option isn't enabled to being with, we take care to make sure
that there's isn't memory or runtime overhead.

Link: http://lkml.kernel.org/r/20170627020155.5139-1-joelaf@google.com

Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Tested-by: Michael Sartain <mikesart@gmail.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-27 13:30:28 -04:00
Steven Rostedt (VMware) 83dd14933e ftrace: Decrement count for dyn_ftrace_total_info file
The dyn_ftrace_total_info file is used to show how many functions have been
converted into nops and can be used by ftrace. The problem is that it does
not get decremented when functions are removed (init boot code being freed,
and modules being freed). That means the number is very inaccurate everytime
functions are removed from the ftrace tables. Decrement it when functions
are removed.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-27 13:30:27 -04:00
Steven Rostedt (VMware) 6a9c981b1e ftrace: Remove unused function ftrace_arch_read_dyn_info()
ftrace_arch_read_dyn_info() was used so that archs could add its own debug
information into the dyn_ftrace_total_info in the tracefs file system. That
file is for debugging usage of dynamic ftrace. No arch uses that function
anymore, so just get rid of it.

This also allows for tracing_read_dyn_info() to be cleaned up a bit.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-27 13:30:22 -04:00
Steven Rostedt (VMware) 8c08f0d5c6 ftrace: Have cached module filters be an active filter
When a module filter is added to set_ftrace_filter, if the module is not
loaded, it is cached. This should be considered an active filter, and
function tracing should be filtered by this. That is, if a cached module
filter is the only filter set, then no function tracing should be happening,
as all the functions available will be filtered out.

This makes sense, as the reason to add a cached module filter, is to trace
the module when you load it. There shouldn't be any other tracing happening
until then.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-26 11:53:04 -04:00
Steven Rostedt (VMware) d7fbf8df7c ftrace: Implement cached modules tracing on module load
If a module is cached in the set_ftrace_filter, and that module is loaded,
then enable tracing on that module as if the cached module text was written
into set_ftrace_filter just as the module is loaded.

  # echo ":mod:kvm_intel" >
  # cat /sys/kernel/tracing/set_ftrace_filter
 #### all functions enabled ####
 :mod:kvm_intel
  # modprobe kvm_intel
  # cat /sys/kernel/tracing/set_ftrace_filter
 vmx_get_rflags [kvm_intel]
 vmx_get_pkru [kvm_intel]
 vmx_get_interrupt_shadow [kvm_intel]
 vmx_rdtscp_supported [kvm_intel]
 vmx_invpcid_supported [kvm_intel]
 [..]

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-26 11:53:03 -04:00
Steven Rostedt (VMware) 5985ea8bd5 ftrace: Have the cached module list show in set_ftrace_filter
When writing in a module filter into set_ftrace_filter for a module that is
not yet loaded, it it cached, and will be executed when the module is loaded
(although that is not implemented yet at this commit). Display the list of
cached modules to be traced.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-26 11:53:02 -04:00
Steven Rostedt (VMware) 673feb9d76 ftrace: Add :mod: caching infrastructure to trace_array
This is the start of the infrastructure work to allow for tracing module
functions before it is loaded.

Currently the following command:

  # echo :mod:some-mod > set_ftrace_filter

will enable tracing of all functions within the module "some-mod" if it is
loaded. What we want, is if the module is not loaded, that line will be
saved. When the module is loaded, then the "some-mod" will have that line
executed on it, so that the functions within it starts being traced.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-26 11:53:02 -04:00
Yonghong Song 239946314e bpf: possibly avoid extra masking for narrower load in verifier
Commit 31fd85816d ("bpf: permits narrower load from bpf program
context fields") permits narrower load for certain ctx fields.
The commit however will already generate a masking even if
the prog-specific ctx conversion produces the result with
narrower size.

For example, for __sk_buff->protocol, the ctx conversion
loads the data into register with 2-byte load.
A narrower 2-byte load should not generate masking.
For __sk_buff->vlan_present, the conversion function
set the result as either 0 or 1, essentially a byte.
The narrower 2-byte or 1-byte load should not generate masking.

To avoid unnecessary masking, prog-specific *_is_valid_access
now passes converted_op_size back to verifier, which indicates
the valid data width after perceived future conversion.
Based on this information, verifier is able to avoid
unnecessary marking.

Since we want more information back from prog-specific
*_is_valid_access checking, all of them are packed into
one data structure for more clarity.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:04:11 -04:00
Steven Rostedt (VMware) feaf1283d1 tracing: Show address when function names are not found
Currently, when a function is not found in kallsyms, instead of simply
showing the function address, it shows nothing at all:

 # echo ':mod:kvm_intel' > /sys/kernel/tracing/set_ftrace_filter
 # echo function > /sys/kernel/tracing/set_ftrace_filter
 # qemu -enable-kvm /home/my-qemu-image
   <Ctrl-C>
 # rmmod kvm_intel
 # cat /sys/kernel/tracing/trace
 qemu-system-x86-2408  [001] d..2   135.013238:  <-kvm_arch_hardware_enable
 qemu-system-x86-2408  [001] ....   135.014574:  <-kvm_arch_vm_ioctl
 qemu-system-x86-2408  [001] ....   135.015420:  <-kvm_vm_ioctl_check_extension
 qemu-system-x86-2408  [001] ....   135.045411:  <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045412:  <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045412:  <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045412:  <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ...1   135.045413:  <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045413:  <-__do_cpuid_ent

When it should show:

 qemu-system-x86-2408  [001] d..2   135.013238: 0xffffffffa02a39f0 <-kvm_arch_hardware_enable
 qemu-system-x86-2408  [001] ....   135.014574: 0xffffffffa02a2ba0 <-kvm_arch_vm_ioctl
 qemu-system-x86-2408  [001] ....   135.015420: 0xffffffffa029e4e0 <-kvm_vm_ioctl_check_extension
 qemu-system-x86-2408  [001] ....   135.045411: 0xffffffffa02a1380 <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045412: 0xffffffffa029e160 <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045412: 0xffffffffa029e180 <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045412: 0xffffffffa029e520 <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ...1   135.045413: 0xffffffffa02a13b0 <-__do_cpuid_ent
 qemu-system-x86-2408  [001] ....   135.045413: 0xffffffffa02a1380 <-__do_cpuid_ent

instead.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-22 17:10:06 -04:00
Yonghong Song 31fd85816d bpf: permits narrower load from bpf program context fields
Currently, verifier will reject a program if it contains an
narrower load from the bpf context structure. For example,
        __u8 h = __sk_buff->hash, or
        __u16 p = __sk_buff->protocol
        __u32 sample_period = bpf_perf_event_data->sample_period
which are narrower loads of 4-byte or 8-byte field.

This patch solves the issue by:
  . Introduce a new parameter ctx_field_size to carry the
    field size of narrower load from prog type
    specific *__is_valid_access validator back to verifier.
  . The non-zero ctx_field_size for a memory access indicates
    (1). underlying prog type specific convert_ctx_accesses
         supporting non-whole-field access
    (2). the current insn is a narrower or whole field access.
  . In verifier, for such loads where load memory size is
    less than ctx_field_size, verifier transforms it
    to a full field load followed by proper masking.
  . Currently, __sk_buff and bpf_perf_event_data->sample_period
    are supporting narrowing loads.
  . Narrower stores are still not allowed as typical ctx stores
    are just normal stores.

Because of this change, some tests in verifier will fail and
these tests are removed. As a bonus, rename some out of bound
__sk_buff->cb access to proper field name and remove two
redundant "skb cb oob" tests.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14 14:56:25 -04:00
Jeremy Linton 681bec0367 tracing: Rename update the enum_map file
The enum_map file is used to display a list of symbol
to name conversions. As its now used to resolve sizeof
lets update the name and description.

Link: http://lkml.kernel.org/r/20170531215653.3240-13-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:13:06 -04:00
Jeremy Linton 67ec0d8595 tracing: Rename enum_replace to eval_replace
The enum_replace stanza works as is for sizeof()
calls as well as enums. Rename it as well.

Link: http://lkml.kernel.org/r/20170531215653.3240-9-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:10:34 -04:00
Jeremy Linton f57a41434f trace: rename enum_map functions
Rename the core trace enum routines to use eval, to
reflect their use by more than just enum to value mapping.

Link: http://lkml.kernel.org/r/20170531215653.3240-8-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:10:23 -04:00
Jeremy Linton 5f60b351a7 trace: rename trace.c enum functions
Rename the init and trace_enum_jmp_to_tail() routines
to reflect their use by more than enumerated types.

Link: http://lkml.kernel.org/r/20170531215653.3240-7-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:10:12 -04:00
Jeremy Linton 1793ed939b trace: rename trace_enum_mutex to trace_eval_mutex
There is a lock protecting the trace_enum_map, rename
it to reflect the use by more than enums.

Link: http://lkml.kernel.org/r/20170531215653.3240-6-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:10:01 -04:00
Jeremy Linton 23bf8cb8dc trace: rename trace enum data structures in trace.c
The enum map entries can be exported to userspace
via a sys enum_map file. Rename those functions
and structures to reflect the fact that we are using
them for more than enums.

Link: http://lkml.kernel.org/r/20170531215653.3240-5-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:09:50 -04:00
Jeremy Linton 99be647c58 trace: rename struct module entry for trace enums
Each module has a list of enum's its contributing to the
enum map, rename that entry to reflect its use by more than
enums.

Link: http://lkml.kernel.org/r/20170531215653.3240-4-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:09:31 -04:00
Jeremy Linton 00f4b652b6 trace: rename trace_enum_map to trace_eval_map
Each enum is loaded into the trace_enum_map, as we
are now using this for more than enums rename it.

Link: http://lkml.kernel.org/r/20170531215653.3240-3-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:08:57 -04:00
Jeremy Linton 02fd7f68f5 trace: rename kernel enum section to eval
The kernel and its modules have sections containing the enum
string to value conversions. Rename this section because we
intend to store more than enums in it.

Link: http://lkml.kernel.org/r/20170531215653.3240-2-jeremy.linton@arm.com

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 17:08:46 -04:00
Joel Fernandes 6a5ae63a0c tracing: Remove unused declaration of trace_stop_cmdline_recording
trace_stop_cmdline_recording declaration isn't in use, remove it.

Link: http://lkml.kernel.org/r/20170609025327.9508-2-joelaf@google.com

Cc: kernel-team@android.com
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-06-13 09:40:52 -04:00
Jens Axboe 8f66439eec Linux 4.12-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh
 biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G
 kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU
 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+
 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E
 W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4=
 =m2Gx
 -----END PGP SIGNATURE-----

Merge tag 'v4.12-rc5' into for-4.13/block

We've already got a few conflicts and upcoming work depends on some of the
changes that have gone into mainline as regression fixes for this series.

Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream
trees to continue working on 4.13 changes.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-12 08:30:13 -06:00
Daniel Borkmann 20b9d7ac48 bpf: avoid excessive stack usage for perf_sample_data
perf_sample_data consumes 386 bytes on stack, reduce excessive stack
usage and move it to per cpu buffer. It's allowed due to preemption
being disabled for tracing, xdp and tc programs, thus at all times
only one program can run on a specific CPU and programs cannot run
from interrupt. We similarly also handle bpf_pt_regs.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:45 -04:00
Christoph Hellwig 4e4cbee93d block: switch bios to blk_status_t
Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep arround at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09 09:27:32 -06:00
Masami Hiramatsu c3ca46ef71 ftrace/kprobes: selftests: Check kretprobe maxactive is supported
Check the kretprobe maxactive is supported by kprobe_events
interface. To ensure the kernel feature, this changes ftrace
README to describe it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07 10:07:22 -06:00
David S. Miller 216fe8f021 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just some simple overlapping changes in marvell PHY driver
and the DSA core code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 22:20:08 -04:00
Alexei Starovoitov f91840a32d perf, bpf: Add BPF support to all perf_event types
Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all
perf_event types, including HW_CACHE, RAW, and dynamic pmu events.
Only tracepoint/kprobe events are treated differently which require
BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly.

Also add support for reading all event counters using
bpf_perf_event_read() helper.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 21:58:01 -04:00
Luis Henriques f9797c2f20 ftrace: Fix memory leak in ftrace_graph_release()
ftrace_hash is being kfree'ed in ftrace_graph_release(), however the
->buckets field is not.  This results in a memory leak that is easily
captured by kmemleak:

unreferenced object 0xffff880038afe000 (size 8192):
  comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0
    [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80
    [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100
    [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0
    [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0
    [<ffffffff81140df7>] vfs_open+0x47/0x60
    [<ffffffff81150f95>] path_openat+0x2a5/0x1020
    [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0
    [<ffffffff811411df>] do_sys_open+0x12f/0x200
    [<ffffffff811412ce>] SyS_open+0x1e/0x20
    [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94
    [<ffffffffffffffff>] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com

Cc: stable@vger.kernel.org
Fixes: b9b0c831be ("ftrace: Convert graph filter to use hash tables")
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-26 22:35:48 -04:00
Linus Torvalds 56f410cf45 This fixes a bug caused by not cleaning up the new instance unique triggers
when deleting an instance. It also creates a selftest that triggers that bug.
 
 Fix the delayed optimization happening after kprobes boot up self tests
 being removed by freeing of init memory.
 
 Comment kprobes on why the delay optimization is not a problem for removal
 of modules, to keep other developers from searching that riddle.
 
 Fix another rcu isn't watching in stack trace tracing.
 
 Naveen N. Rao (4):
       ftrace: Simplify glob handling in unregister_ftrace_function_probe_func()
       ftrace/instances: Clear function triggers when removing instances
       selftests/ftrace: Fix bashisms
       selftests/ftrace: Add test to remove instance with active event triggers
 
 Steven Rostedt (1):
       tracing: Move postpone selftests to core from early_initcall
 
 Steven Rostedt (VMware) (3):
       ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub
       kprobes: Document how optimized kprobes are removed from module unload
       tracing: Make sure RCU is watching before calling a stack trace
 
 Thomas Gleixner (1):
       tracing/kprobes: Enforce kprobes teardown after testing
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJZIQapFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 A6MIAKFLb6mQ4flRBXpWd2tD2B4DQpQ0H7SovseZnlH6Q7grU6POY/qbNl9xXiBA
 3NavxqbIYokH8cxEqGAusL7ASUFPXJj6erMM1uc1WRuAzMpIjvgNacOtW5R+c5S9
 ofR1xtKlBo/854J/IP6M3J0WqrK+B7TsS1WYKohe/tFMBpolbnFloHVfMMZlaL58
 CQhCoAhkjJRsta6dJhbo+HoQy03VGyWsfFHtutBpIwsf81Naq4Stpxp7jdZLWhB8
 Di5QdOji9lDayK6Uk7DDZqHxbjC9z6cCS9nVWIGHkE4AMpR3peYtsyCaAOBjVMLV
 2OuhuREfZgKaYVMjUfdeYCayDAY=
 =1gek
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix a bug caused by not cleaning up the new instance unique triggers
   when deleting an instance. It also creates a selftest that triggers
   that bug.

 - Fix the delayed optimization happening after kprobes boot up self
   tests being removed by freeing of init memory.

 - Comment kprobes on why the delay optimization is not a problem for
   removal of modules, to keep other developers from searching that
   riddle.

 - Fix another case of rcu not watching in stack trace tracing.

* tag 'trace-v4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Make sure RCU is watching before calling a stack trace
  kprobes: Document how optimized kprobes are removed from module unload
  selftests/ftrace: Add test to remove instance with active event triggers
  selftests/ftrace: Fix bashisms
  ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub
  ftrace/instances: Clear function triggers when removing instances
  ftrace: Simplify glob handling in unregister_ftrace_function_probe_func()
  tracing/kprobes: Enforce kprobes teardown after testing
  tracing: Move postpone selftests to core from early_initcall
2017-05-20 23:39:03 -07:00
Linus Torvalds 894e21642d Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A small collection of fixes that should go into this cycle.

   - a pull request from Christoph for NVMe, which ended up being
     manually applied to avoid pulling in newer bits in master. Mostly
     fibre channel fixes from James, but also a few fixes from Jon and
     Vijay

   - a pull request from Konrad, with just a single fix for xen-blkback
     from Gustavo.

   - a fuseblk bdi fix from Jan, fixing a regression in this series with
     the dynamic backing devices.

   - a blktrace fix from Shaohua, replacing sscanf() with kstrtoull().

   - a request leak fix for drbd from Lars, fixing a regression in the
     last series with the kref changes. This will go to stable as well"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvmet: release the sq ref on rdma read errors
  nvmet-fc: remove target cpu scheduling flag
  nvme-fc: stop queues on error detection
  nvme-fc: require target or discovery role for fc-nvme targets
  nvme-fc: correct port role bits
  nvme: unmap CMB and remove sysfs file in reset path
  blktrace: fix integer parse
  fuseblk: Fix warning in super_setup_bdi_name()
  block: xen-blkback: add null check to avoid null pointer dereference
  drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub()
2017-05-20 16:12:30 -07:00
Shaohua Li 5f3394530f blktrace: fix integer parse
sscanf is a very poor way to parse integer. For example, I input
"discard" for act_mask, it gets 0xd and completely messes up. Using
correct API to do integer parse.

This patch also makes attributes accept any base of integer.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-19 09:21:15 -06:00
Steven Rostedt (VMware) a33d7d94ee tracing: Make sure RCU is watching before calling a stack trace
As stack tracing now requires "rcu watching", force RCU to be watching when
recording a stack trace.

Link: http://lkml.kernel.org/r/20170512172449.879684501@goodmis.org

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-18 23:57:56 -04:00
Steven Rostedt (VMware) 8a49f3e03c ftrace: Remove #ifdef from code and add clear_ftrace_function_probes() stub
No need to add ugly #ifdefs in the code. Having a standard stub file is much
prettier.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-17 21:53:32 -04:00
Naveen N. Rao a0e6369e4b ftrace/instances: Clear function triggers when removing instances
If instance directories are deleted while there are registered function
triggers:

  # cd /sys/kernel/debug/tracing/instances
  # mkdir test
  # echo "schedule:enable_event:sched:sched_switch" > test/set_ftrace_filter
  # rmdir test
  Unable to handle kernel paging request for data at address 0x00000008
  Unable to handle kernel paging request for data at address 0x00000008
  Faulting instruction address: 0xc0000000021edde8
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048
  NUMA
  pSeries
  Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc kvm iptable_filter fuse binfmt_misc pseries_rng rng_core vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath virtio_net virtio_blk virtio_pci crc32c_vpmsum virtio_ring virtio
  CPU: 8 PID: 8694 Comm: rmdir Not tainted 4.11.0-nnr+ #113
  task: c0000000bab52800 task.stack: c0000000baba0000
  NIP: c0000000021edde8 LR: c0000000021f0590 CTR: c000000002119620
  REGS: c0000000baba3870 TRAP: 0300   Not tainted  (4.11.0-nnr+)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>
    CR: 22002422  XER: 20000000
  CFAR: 00007fffabb725a8 DAR: 0000000000000008 DSISR: 40000000 SOFTE: 0
  GPR00: c00000000220f750 c0000000baba3af0 c000000003157e00 0000000000000000
  GPR04: 0000000000000040 00000000000000eb 0000000000000040 0000000000000000
  GPR08: 0000000000000000 0000000000000113 0000000000000000 c00000000305db98
  GPR12: c000000002119620 c00000000fd42c00 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 c0000000bab52e90 0000000000000000
  GPR24: 0000000000000000 00000000000000eb 0000000000000040 c0000000baba3bb0
  GPR28: c00000009cb06eb0 c0000000bab52800 c00000009cb06eb0 c0000000baba3bb0
  NIP [c0000000021edde8] ring_buffer_lock_reserve+0x8/0x4e0
  LR [c0000000021f0590] trace_event_buffer_lock_reserve+0xe0/0x1a0
  Call Trace:
  [c0000000baba3af0] [c0000000021f96c8] trace_event_buffer_commit+0x1b8/0x280 (unreliable)
  [c0000000baba3b60] [c00000000220f750] trace_event_buffer_reserve+0x80/0xd0
  [c0000000baba3b90] [c0000000021196b8] trace_event_raw_event_sched_switch+0x98/0x180
  [c0000000baba3c10] [c0000000029d9980] __schedule+0x6e0/0xab0
  [c0000000baba3ce0] [c000000002122230] do_task_dead+0x70/0xc0
  [c0000000baba3d10] [c0000000020ea9c8] do_exit+0x828/0xd00
  [c0000000baba3dd0] [c0000000020eaf70] do_group_exit+0x60/0x100
  [c0000000baba3e10] [c0000000020eb034] SyS_exit_group+0x24/0x30
  [c0000000baba3e30] [c00000000200bcec] system_call+0x38/0x54
  Instruction dump:
  60000000 60420000 7d244b78 7f63db78 4bffaa09 393efff8 793e0020 39200000
  4bfffecc 60420000 3c4c00f7 3842a020 <81230008> 2f890000 409e02f0 a14d0008
  ---[ end trace b917b8985d0e650b ]---
  Unable to handle kernel paging request for data at address 0x00000008
  Faulting instruction address: 0xc0000000021edde8
  Unable to handle kernel paging request for data at address 0x00000008
  Faulting instruction address: 0xc0000000021edde8
  Faulting instruction address: 0xc0000000021edde8

To address this, let's clear all registered function probes before
deleting the ftrace instance.

Link: http://lkml.kernel.org/r/c5f1ca624043690bd94642bb6bffd3f2fc504035.1494956770.git.naveen.n.rao@linux.vnet.ibm.com

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-17 21:52:22 -04:00
Naveen N. Rao cbab567c3d ftrace: Simplify glob handling in unregister_ftrace_function_probe_func()
Handle a NULL glob properly and simplify the check.

Link: http://lkml.kernel.org/r/5df74d4ffb4721db6d5a22fa08ca031d62ead493.1494956770.git.naveen.n.rao@linux.vnet.ibm.com

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-17 21:51:54 -04:00
Thomas Gleixner 30e7d894c1 tracing/kprobes: Enforce kprobes teardown after testing
Enabling the tracer selftest triggers occasionally the warning in
text_poke(), which warns when the to be modified page is not marked
reserved.

The reason is that the tracer selftest installs kprobes on functions marked
__init for testing. These probes are removed after the tests, but that
removal schedules the delayed kprobes_optimizer work, which will do the
actual text poke. If the work is executed after the init text is freed,
then the warning triggers. The bug can be reproduced reliably when the work
delay is increased.

Flush the optimizer work and wait for the optimizing/unoptimizing lists to
become empty before returning from the kprobes tracer selftest. That
ensures that all operations which were queued due to the probes removal
have completed.

Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 6274de498 ("kprobes: Support delayed unoptimizing")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-17 21:50:27 -04:00
Steven Rostedt b9ef0326c0 tracing: Move postpone selftests to core from early_initcall
I hit the following lockdep splat when booting with ftrace selftests
enabled, as well as CONFIG_PREEMPT and LOCKDEP.

 Testing dynamic ftrace ops #1:
 (1 0 1 0 0)
 (1 1 2 0 0)
 (2 1 3 0 169)
 (2 2 4 0 50066)
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 13 at kernel/rcu/srcutree.c:202 check_init_srcu_struct+0x60/0x70
 Modules linked in:
 CPU: 0 PID: 13 Comm: rcu_tasks_kthre Not tainted 4.12.0-rc1-test+ #587
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 task: ffff880119628040 task.stack: ffffc900006a4000
 RIP: 0010:check_init_srcu_struct+0x60/0x70
 RSP: 0000:ffffc900006a7d98 EFLAGS: 00010246
 RAX: 0000000000000246 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: ffff880119628040 RSI: 00000000ffffffff RDI: ffffffff81e5fb40
 RBP: ffffc900006a7e20 R08: 00000023b403c000 R09: 0000000000000001
 R10: ffffc900006a7e40 R11: 0000000000000000 R12: ffffffff81e5fb40
 R13: 0000000000000286 R14: ffff880119628040 R15: ffffc900006a7e98
 FS:  0000000000000000(0000) GS:ffff88011ea00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff88011edff000 CR3: 0000000001e0f000 CR4: 00000000001406f0
 Call Trace:
  ? __synchronize_srcu+0x6e/0x140
  ? lock_acquire+0xdc/0x1d0
  ? ktime_get_mono_fast_ns+0x5d/0xb0
  synchronize_srcu+0x6f/0x110
  ? synchronize_srcu+0x6f/0x110
  rcu_tasks_kthread+0x20a/0x540
  kthread+0x114/0x150
  ? __rcu_read_unlock+0x70/0x70
  ? kthread_create_on_node+0x40/0x40
  ret_from_fork+0x2e/0x40
 Code: f6 83 70 06 00 00 03 49 89 c5 74 0d be 01 00 00 00 48 89 df e8 42 fa ff ff 4c 89 ee 4c 89 e7 e8 b7 42 75 00 5b 41 5c 41 5d 5d c3 <0f> ff eb aa 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
 ---[ end trace 5c3f4206ce50f6ac ]---

What happens is that the selftests include a creating of a dynamically
allocated ftrace_ops, which requires the use of synchronize_rcu_tasks()
which uses srcu, and triggers the above warning.

It appears that synchronize_rcu_tasks() is not set up at early_initcall(),
but it is at core_initcall(). By moving the tests down to that location
works out properly.

Link: http://lkml.kernel.org/r/20170517111435.7388c033@gandalf.local.home

Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-17 21:39:38 -04:00
Linus Torvalds 29250d301b This is a trivial patch that changes a check for a cpumask from a NULL
pointer to using cpumask_available(), which will do the check. This is
 because cpumasks when not allocated are always set, and clang complains
 about it.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJZEcUIFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 zygH/051hNj3aZlSZCD1pXwPkHgeVBn7lSB9k6hcJ5J/OknL/hNXws3Dv4Lb7Dzj
 cZhg62LTwwS6PVCJtOHk+PE/c5FIdY9o1mXJpAst6wbl9Sp1lzPJbFum45UadvWn
 UyU3RP0ncSgfojyrwIu6XyND7/NatdYk9irTMWL9+cDuy9xGvJgRX1sf7tXmxj4C
 AbZzQorDw7XDczDbvFM1XyPU3ApGUDqQ7VhCEBP6ivE+5Ceoo9xi/z7yfKyjLeb+
 H7+/eA8ztaMLgTzLWwkFKdP/knqwPmAb+MHTR0DoLHcVe7fbbxFS7x+cfR8mfIA9
 8tA5SUxc7bymRvDAcN2dMrtL7f8=
 =3hKI
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "This is a trivial patch that changes a check for a cpumask from a NULL
  pointer to using cpumask_available(), which will do the check. This is
  because cpumasks when not allocated are always set, and clang
  complains about it"

* tag 'trace-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Use cpumask_available() to check if cpumask variable may be used
2017-05-10 11:26:25 -07:00
Linus Torvalds 00d9593335 These are three simple changes.
The first one is just a switch from using strcpy() to strlcpy(). Someone
 thought that it may cause an overflow bug, but since it only copies comms
 into a pre-allocated array of TASK_COMM_LEN, and no comm should ever
 be bigger than that, nor not end with a nul character, this change is more
 of a safety precaution than fixing anything that is actually broken.
 
 The other two changes are simply cleaning and optimizing some code.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJZEKYJFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 +NcH/jK6ELGykogqi2FfLNzwJTjVpHdKrrMKyxHcC+jXv3mJUyK+0qKHkCO6zyy1
 EWAbTrSMjHGG6r6AP42QLRRehsijk7xXjJm86T771PNtSgY4xCKobFisk73KR4YB
 2Y1JXkSpKH2kKgdixR9hcg4h5RTv16KeAMu2cLSMxRfDEr1mBvv7LU8ZrobJSx2C
 LGR/241bTxOB6mWCmjqSTVrhHyEAMgJhVwV+ym7qfjqQULGhgFmq3CVTicFU0PWx
 UkzrcwYT2T56jU3Ngu/e1KkEZq0/rG7O86iSxgwnuraW4n48u3rpkl/q9eZ029Hd
 /kxyyXBKQDxx6cQd4hZrYUTW4IU=
 =8/K8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:
 "These are three simple changes.

  The first one is just a switch from using strcpy() to strlcpy().
  Someone thought that it may cause an overflow bug, but since it only
  copies comms into a pre-allocated array of TASK_COMM_LEN, and no comm
  should ever be bigger than that, nor not end with a nul character,
  this change is more of a safety precaution than fixing anything that
  is actually broken.

  The other two changes are simply cleaning and optimizing some code"

* tag 'trace-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Simplify ftrace_match_record() even more
  ftrace: Remove an unneeded condition
  tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()
2017-05-08 20:36:38 -07:00
Matthias Kaehlcke 4dbbe2d8e9 tracing: Use cpumask_available() to check if cpumask variable may be used
This fixes the following clang warning:

kernel/trace/trace.c:3231:12: warning: address of array 'iter->started'
  will always evaluate to 'true' [-Wpointer-bool-conversion]
        if (iter->started)

Link: http://lkml.kernel.org/r/20170421234110.117075-1-mka@chromium.org

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-08 21:22:04 -04:00
Deepa Dinamani 51aad0aee5 trace: make trace_hwlat timestamp y2038 safe
struct timespec is not y2038 safe on 32 bit machines and needs to be
replaced by struct timespec64 in order to represent times beyond year
2038 on such machines.

Fix all the timestamp representation in struct trace_hwlat and all the
corresponding implementations.

Link: http://lkml.kernel.org/r/1491613030-11599-3-git-send-email-deepa.kernel@gmail.com
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:15 -07:00
Steven Rostedt (VMware) 77c0eddeee ftrace: Simplify ftrace_match_record() even more
Dan Carpenter sent a patch to remove a check in ftrace_match_record()
because the logic of the code made the check redundant. I looked deeper into
the code, and made the following logic table, with the three variables and
the result of the original code.

modname        mod_matches     exclude_mod         result
-------        -----------     -----------         ------
  0                 0               0              return 0
  0                 0               1              func_match
  0                 1               *             < cannot exist >
  1                 0               0              return 0
  1                 0               1              func_match
  1                 1               0              func_match
  1                 1               1              return 0

Notice that when mod_matches == exclude mod, the result is always to
return 0, and when mod_matches != exclude_mod, then the result is to test
the function. This means we only need test if mod_matches is equal to
exclude_mod.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-03 22:15:13 -04:00
Dan Carpenter 31805c9052 ftrace: Remove an unneeded condition
We know that "mod_matches" is true here so there is no need to check
again.

Link: http://lkml.kernel.org/r/20170331152130.GA4947@mwanda

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-03 22:15:12 -04:00
Amey Telawane e09e28671c tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()
Strcpy is inherently not safe, and strlcpy() should be used instead.
__trace_find_cmdline() uses strcpy() because the comms saved must have a
terminating nul character, but it doesn't hurt to add the extra protection
of using strlcpy() instead of strcpy().

Link: http://lkml.kernel.org/r/1493806274-13936-1-git-send-email-amit.pundir@linaro.org

Signed-off-by: Amey Telawane <ameyt@codeaurora.org>
[AmitP: Cherry-picked this commit from CodeAurora kernel/msm-3.10
https://source.codeaurora.org/quic/la/kernel/msm-3.10/commit/?id=2161ae9a70b12cf18ac8e5952a20161ffbccb477]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[ Updated change log and removed the "- 1" from len parameter ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-05-03 22:15:11 -04:00