linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V6 00/15] perf/x86: Add perf text poke events
@ 2020-04-05 20:13 Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 01/15] perf: Add perf text poke event Adrian Hunter
                   ` (16 more replies)
  0 siblings, 17 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Hi

Here are patches to add a text poke event to record changes to kernel text
(i.e. self-modifying code) in order to support tracers like Intel PT
decoding through jump labels, kprobes and ftrace trampolines.

The first 8 patches make the kernel changes and the subsequent patches are
tools changes.

The next 4 patches add support for updating perf tools' data cache
with the changed bytes.

The next patch is an Intel PT specific tools change.

The final 2 patches add perf script --show-text-poke-events option

Patches also here:

	git://git.infradead.org/users/ahunter/linux-perf.git text_poke

Changes in V6

  perf: Add perf text poke event

	Adjustments due to re-base on top of cgroup event patches

  perf/x86: Add perf text poke events for kprobes

	Expand commit message
	Add Masami's Ack

  perf tools: Add support for PERF_RECORD_TEXT_POKE

	Adjustments due to re-base on top of cgroup event patches
	Improvements to perf_event__fprintf_text_poke()

  perf script: Add option --show-text-poke-events
  perf script: Show text poke address symbol

	New patches

Changes in V5

  perf/x86: Add perf text poke events for kprobes

	Simplify optimized kprobes events (Peter)

Changes in V4

  kprobes: Add symbols for kprobe insn pages

	Change "module name" from kprobe to __builtin__kprobes
	Added comment about "module name" use

  ftrace: Add symbols for ftrace trampolines
	
	Change "module name" from ftrace to __builtin__ftrace
	Move calls of ftrace_add_trampoline_to_kallsyms() and
	ftrace_remove_trampoline_from_kallsyms() into
	kernel/trace/ftrace.c
	Added comment about "module name" use

  ftrace: Add perf ksymbol events for ftrace trampolines

	Move changes into kernel/trace/ftrace.c

  ftrace: Add perf text poke events for ftrace trampolines

	Move changes into kernel/trace/ftrace.c

Changes in V3:

  perf: Add perf text poke event

	To prevent warning, cast pointer to (unsigned long) not (u64)

  kprobes: Add symbols for kprobe insn pages

	Expand commit message
	Remove un-needed declarations of kprobe_cache_get_kallsym() and arch_kprobe_get_kallsym() when !CONFIG_KPROBES

  ftrace: Add symbols for ftrace trampolines

	Expand commit message
	Make ftrace_get_trampoline_kallsym() static

Changes in V2:

  perf: Add perf text poke event

	Separate out x86 changes
	The text poke event now has old len and new len
	Revised commit message

  perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers

	New patch containing x86 changes from original first patch

  kprobes: Add symbols for kprobe insn pages
  kprobes: Add perf ksymbol events for kprobe insn pages
  perf/x86: Add perf text poke events for kprobes
  ftrace: Add symbols for ftrace trampolines
  ftrace: Add perf ksymbol events for ftrace trampolines
  ftrace: Add perf text poke events for ftrace trampolines
  perf kcore_copy: Fix module map when there are no modules loaded
  perf evlist: Disable 'immediate' events last

	New patches

  perf tools: Add support for PERF_RECORD_TEXT_POKE

	The text poke event now has old len and new len
	Also select ksymbol events with text poke events

  perf tools: Add support for PERF_RECORD_KSYMBOL_TYPE_OOL

	New patch

  perf intel-pt: Add support for text poke events

	The text poke event now has old len and new len
	Allow for the address not having a map yet


Changes since RFC:

  Dropped 'flags' from the new event.  The consensus seemed to be that text
  pokes should employ a scheme similar to x86's INT3 method instead.

  dropped tools patches already applied.


Example:

  For jump labels, the kernel needs
	CONFIG_JUMP_LABEL=y
  and also an easy to flip jump label is in sysctl_schedstats() which needs
	CONFIG_SCHEDSTATS=y
	CONFIG_PROC_SYSCTL=y
	CONFIG_SCHED_DEBUG=y

  Also note the 'sudo perf record' is put into the background which, as
  written, needs sudo credential caching (otherwise the background task
  will stop awaiting the sudo password), hence the 'sudo echo' to start.

Before:

  $ sudo echo
  $ sudo perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
  [1] 1640
  $ cat /proc/sys/kernel/sched_schedstats
  0
  $ sudo bash -c 'echo 1 > /proc/sys/kernel/sched_schedstats'
  $ cat /proc/sys/kernel/sched_schedstats
  1
  $ sudo bash -c 'echo 0 > /proc/sys/kernel/sched_schedstats'
  $ cat /proc/sys/kernel/sched_schedstats
  0
  $ sudo kill 1640
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 16.635 MB perf.data.before ]
  $ perf script -i perf.data.before --itrace=e >/dev/null
  Warning:
  1946 instruction trace errors

After:

  $ sudo echo
  $ sudo perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
  [1] 1882
  $ cat /proc/sys/kernel/sched_schedstats
  0
  $ sudo bash -c 'echo 1 > /proc/sys/kernel/sched_schedstats'
  $ cat /proc/sys/kernel/sched_schedstats
  1
  $ sudo bash -c 'echo 0 > /proc/sys/kernel/sched_schedstats'
  $ cat /proc/sys/kernel/sched_schedstats
  0
  $ sudo kill 1882
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 10.893 MB perf.data.after ]
  $ perf script -i perf.data.after --itrace=e
  $


Adrian Hunter (15):
      perf: Add perf text poke event
      perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers
      kprobes: Add symbols for kprobe insn pages
      kprobes: Add perf ksymbol events for kprobe insn pages
      perf/x86: Add perf text poke events for kprobes
      ftrace: Add symbols for ftrace trampolines
      ftrace: Add perf ksymbol events for ftrace trampolines
      ftrace: Add perf text poke events for ftrace trampolines
      perf kcore_copy: Fix module map when there are no modules loaded
      perf evlist: Disable 'immediate' events last
      perf tools: Add support for PERF_RECORD_TEXT_POKE
      perf tools: Add support for PERF_RECORD_KSYMBOL_TYPE_OOL
      perf intel-pt: Add support for text poke events
      perf script: Add option --show-text-poke-events
      perf script: Show text poke address symbol

 arch/x86/include/asm/kprobes.h            |   2 +
 arch/x86/kernel/alternative.c             |  37 ++++++++++-
 arch/x86/kernel/kprobes/core.c            |  15 ++++-
 arch/x86/kernel/kprobes/opt.c             |  38 +++++++++--
 include/linux/ftrace.h                    |  12 ++--
 include/linux/kprobes.h                   |  15 +++++
 include/linux/perf_event.h                |   8 +++
 include/uapi/linux/perf_event.h           |  26 +++++++-
 kernel/events/core.c                      |  90 +++++++++++++++++++++++++-
 kernel/kallsyms.c                         |  42 +++++++++++--
 kernel/kprobes.c                          |  57 +++++++++++++++++
 kernel/trace/ftrace.c                     | 101 +++++++++++++++++++++++++++++-
 tools/include/uapi/linux/perf_event.h     |  26 +++++++-
 tools/lib/perf/include/perf/event.h       |   9 +++
 tools/perf/Documentation/perf-script.txt  |   4 ++
 tools/perf/arch/x86/util/intel-pt.c       |   4 ++
 tools/perf/builtin-record.c               |  45 +++++++++++++
 tools/perf/builtin-script.c               |  23 ++++++-
 tools/perf/tests/perf-record.c            |   4 +-
 tools/perf/util/dso.c                     |   3 +
 tools/perf/util/dso.h                     |   1 +
 tools/perf/util/event.c                   |  62 +++++++++++++++++-
 tools/perf/util/event.h                   |   9 ++-
 tools/perf/util/evlist.c                  |  31 ++++++---
 tools/perf/util/evlist.h                  |   1 +
 tools/perf/util/evsel.c                   |   7 ++-
 tools/perf/util/intel-pt.c                |  77 ++++++++++++++++++++++-
 tools/perf/util/machine.c                 |  49 +++++++++++++++
 tools/perf/util/machine.h                 |   3 +
 tools/perf/util/map.c                     |   5 ++
 tools/perf/util/map.h                     |   3 +-
 tools/perf/util/perf_event_attr_fprintf.c |   1 +
 tools/perf/util/record.c                  |  10 +++
 tools/perf/util/record.h                  |   1 +
 tools/perf/util/session.c                 |  23 +++++++
 tools/perf/util/symbol-elf.c              |   7 +++
 tools/perf/util/symbol.c                  |   1 +
 tools/perf/util/tool.h                    |   3 +-
 38 files changed, 815 insertions(+), 40 deletions(-)


Regards
Adrian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH V6 01/15] perf: Add perf text poke event
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 02/15] perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers Adrian Hunter
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Record (single instruction) changes to the kernel text (i.e.
self-modifying code) in order to support tracers like Intel PT and
ARM CoreSight.

A copy of the running kernel code is needed as a reference point (e.g.
from /proc/kcore). The text poke event records the old bytes and the
new bytes so that the event can be processed forwards or backwards.

The basic problem is recording the modified instruction in an
unambiguous manner given SMP instruction cache (in)coherence. That is,
when modifying an instruction concurrently any solution with one or
multiple timestamps is not sufficient:

	CPU0				CPU1
 0
 1	write insn A
 2					execute insn A
 3	sync-I$
 4

Due to I$, CPU1 might execute either the old or new A. No matter where
we record tracepoints on CPU0, one simply cannot tell what CPU1 will
have observed, except that at 0 it must be the old one and at 4 it
must be the new one.

To solve this, take inspiration from x86 text poking, which has to
solve this exact problem due to variable length instruction encoding
and I-fetch windows.

 1) overwrite the instruction with a breakpoint and sync I$

This guarantees that that code flow will never hit the target
instruction anymore, on any CPU (or rather, it will cause an
exception).

 2) issue the TEXT_POKE event

 3) overwrite the breakpoint with the new instruction and sync I$

Now we know that any execution after the TEXT_POKE event will either
observe the breakpoint (and hit the exception) or the new instruction.

So by guarding the TEXT_POKE event with an exception on either side;
we can now tell, without doubt, which instruction another CPU will
have observed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/linux/perf_event.h      |  8 +++
 include/uapi/linux/perf_event.h | 21 +++++++-
 kernel/events/core.c            | 90 ++++++++++++++++++++++++++++++++-
 3 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 9c3e7619c929..2d6e05d601f6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1232,6 +1232,9 @@ extern void perf_event_exec(void);
 extern void perf_event_comm(struct task_struct *tsk, bool exec);
 extern void perf_event_namespaces(struct task_struct *tsk);
 extern void perf_event_fork(struct task_struct *tsk);
+extern void perf_event_text_poke(const void *addr,
+				 const void *old_bytes, size_t old_len,
+				 const void *new_bytes, size_t new_len);
 
 /* Callchains */
 DECLARE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
@@ -1482,6 +1485,11 @@ static inline void perf_event_exec(void)				{ }
 static inline void perf_event_comm(struct task_struct *tsk, bool exec)	{ }
 static inline void perf_event_namespaces(struct task_struct *tsk)	{ }
 static inline void perf_event_fork(struct task_struct *tsk)		{ }
+static inline void perf_event_text_poke(const void *addr,
+					const void *old_bytes,
+					size_t old_len,
+					const void *new_bytes,
+					size_t new_len)			{ }
 static inline void perf_event_init(void)				{ }
 static inline int  perf_swevent_get_recursion_context(void)		{ return -1; }
 static inline void perf_swevent_put_recursion_context(int rctx)		{ }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..e5bee6c17b86 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -383,7 +383,8 @@ struct perf_event_attr {
 				bpf_event      :  1, /* include bpf events */
 				aux_output     :  1, /* generate AUX records instead of events */
 				cgroup         :  1, /* include cgroup events */
-				__reserved_1   : 31;
+				text_poke      :  1, /* include text poke events */
+				__reserved_1   : 30;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -1024,6 +1025,24 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_CGROUP			= 19,
 
+	/*
+	 * Records changes to kernel text i.e. self-modified code. 'old_len' is
+	 * the number of old bytes, 'new_len' is the number of new bytes. Either
+	 * 'old_len' or 'new_len' may be zero to indicate, for example, the
+	 * addition or removal of a trampoline. 'bytes' contains the old bytes
+	 * followed immediately by the new bytes.
+	 *
+	 * struct {
+	 *	struct perf_event_header	header;
+	 *	u64				addr;
+	 *	u16				old_len;
+	 *	u16				new_len;
+	 *	u8				bytes[];
+	 *	struct sample_id		sample_id;
+	 * };
+	 */
+	PERF_RECORD_TEXT_POKE			= 20,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1569979c8912..7ecc932f4845 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -388,6 +388,7 @@ static atomic_t nr_switch_events __read_mostly;
 static atomic_t nr_ksymbol_events __read_mostly;
 static atomic_t nr_bpf_events __read_mostly;
 static atomic_t nr_cgroup_events __read_mostly;
+static atomic_t nr_text_poke_events __read_mostly;
 
 static LIST_HEAD(pmus);
 static DEFINE_MUTEX(pmus_lock);
@@ -4554,7 +4555,7 @@ static bool is_sb_event(struct perf_event *event)
 	if (attr->mmap || attr->mmap_data || attr->mmap2 ||
 	    attr->comm || attr->comm_exec ||
 	    attr->task || attr->ksymbol ||
-	    attr->context_switch ||
+	    attr->context_switch || attr->text_poke ||
 	    attr->bpf_event)
 		return true;
 	return false;
@@ -4630,6 +4631,8 @@ static void unaccount_event(struct perf_event *event)
 		atomic_dec(&nr_ksymbol_events);
 	if (event->attr.bpf_event)
 		atomic_dec(&nr_bpf_events);
+	if (event->attr.text_poke)
+		atomic_dec(&nr_text_poke_events);
 
 	if (dec) {
 		if (!atomic_add_unless(&perf_sched_count, -1, 1))
@@ -8598,6 +8601,89 @@ void perf_event_bpf_event(struct bpf_prog *prog,
 	perf_iterate_sb(perf_event_bpf_output, &bpf_event, NULL);
 }
 
+struct perf_text_poke_event {
+	const void		*old_bytes;
+	const void		*new_bytes;
+	size_t			pad;
+	u16			old_len;
+	u16			new_len;
+
+	struct {
+		struct perf_event_header	header;
+
+		u64				addr;
+	} event_id;
+};
+
+static int perf_event_text_poke_match(struct perf_event *event)
+{
+	return event->attr.text_poke;
+}
+
+static void perf_event_text_poke_output(struct perf_event *event, void *data)
+{
+	struct perf_text_poke_event *text_poke_event = data;
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	u64 padding = 0;
+	int ret;
+
+	if (!perf_event_text_poke_match(event))
+		return;
+
+	perf_event_header__init_id(&text_poke_event->event_id.header, &sample, event);
+
+	ret = perf_output_begin(&handle, event, text_poke_event->event_id.header.size);
+	if (ret)
+		return;
+
+	perf_output_put(&handle, text_poke_event->event_id);
+	perf_output_put(&handle, text_poke_event->old_len);
+	perf_output_put(&handle, text_poke_event->new_len);
+
+	__output_copy(&handle, text_poke_event->old_bytes, text_poke_event->old_len);
+	__output_copy(&handle, text_poke_event->new_bytes, text_poke_event->new_len);
+
+	if (text_poke_event->pad)
+		__output_copy(&handle, &padding, text_poke_event->pad);
+
+	perf_event__output_id_sample(event, &handle, &sample);
+
+	perf_output_end(&handle);
+}
+
+void perf_event_text_poke(const void *addr, const void *old_bytes,
+			  size_t old_len, const void *new_bytes, size_t new_len)
+{
+	struct perf_text_poke_event text_poke_event;
+	size_t tot, pad;
+
+	if (!atomic_read(&nr_text_poke_events))
+		return;
+
+	tot  = sizeof(text_poke_event.old_len) + old_len;
+	tot += sizeof(text_poke_event.new_len) + new_len;
+	pad  = ALIGN(tot, sizeof(u64)) - tot;
+
+	text_poke_event = (struct perf_text_poke_event){
+		.old_bytes    = old_bytes,
+		.new_bytes    = new_bytes,
+		.pad          = pad,
+		.old_len      = old_len,
+		.new_len      = new_len,
+		.event_id  = {
+			.header = {
+				.type = PERF_RECORD_TEXT_POKE,
+				.misc = PERF_RECORD_MISC_KERNEL,
+				.size = sizeof(text_poke_event.event_id) + tot + pad,
+			},
+			.addr = (unsigned long)addr,
+		},
+	};
+
+	perf_iterate_sb(perf_event_text_poke_output, &text_poke_event, NULL);
+}
+
 void perf_event_itrace_started(struct perf_event *event)
 {
 	event->attach_state |= PERF_ATTACH_ITRACE;
@@ -10917,6 +11003,8 @@ static void account_event(struct perf_event *event)
 		atomic_inc(&nr_ksymbol_events);
 	if (event->attr.bpf_event)
 		atomic_inc(&nr_bpf_events);
+	if (event->attr.text_poke)
+		atomic_inc(&nr_text_poke_events);
 
 	if (inc) {
 		/*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 02/15] perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 01/15] perf: Add perf text poke event Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 03/15] kprobes: Add symbols for kprobe insn pages Adrian Hunter
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Add support for perf text poke event for text_poke_bp_batch() callers. That
includes jump labels. See comments for more details.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/kernel/alternative.c | 37 ++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 15ac0d5f4b40..69b49521bc4b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -3,6 +3,7 @@
 
 #include <linux/module.h>
 #include <linux/sched.h>
+#include <linux/perf_event.h>
 #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/stringify.h>
@@ -947,6 +948,7 @@ struct text_poke_loc {
 	s32 rel32;
 	u8 opcode;
 	const u8 text[POKE_MAX_OPCODE_SIZE];
+	u8 old;
 };
 
 struct bp_patching_desc {
@@ -1115,8 +1117,10 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
 	/*
 	 * First step: add a int3 trap to the address that will be patched.
 	 */
-	for (i = 0; i < nr_entries; i++)
+	for (i = 0; i < nr_entries; i++) {
+		tp[i].old = *(u8 *)text_poke_addr(&tp[i]);
 		text_poke(text_poke_addr(&tp[i]), &int3, INT3_INSN_SIZE);
+	}
 
 	text_poke_sync();
 
@@ -1124,14 +1128,45 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
 	 * Second step: update all but the first byte of the patched range.
 	 */
 	for (do_sync = 0, i = 0; i < nr_entries; i++) {
+		u8 old[POKE_MAX_OPCODE_SIZE] = { tp[i].old, };
 		int len = text_opcode_size(tp[i].opcode);
 
 		if (len - INT3_INSN_SIZE > 0) {
+			memcpy(old + INT3_INSN_SIZE,
+			       text_poke_addr(&tp[i]) + INT3_INSN_SIZE,
+			       len - INT3_INSN_SIZE);
 			text_poke(text_poke_addr(&tp[i]) + INT3_INSN_SIZE,
 				  (const char *)tp[i].text + INT3_INSN_SIZE,
 				  len - INT3_INSN_SIZE);
 			do_sync++;
 		}
+
+		/*
+		 * Emit a perf event to record the text poke, primarily to
+		 * support Intel PT decoding which must walk the executable code
+		 * to reconstruct the trace. The flow up to here is:
+		 *   - write INT3 byte
+		 *   - IPI-SYNC
+		 *   - write instruction tail
+		 * At this point the actual control flow will be through the
+		 * INT3 and handler and not hit the old or new instruction.
+		 * Intel PT outputs FUP/TIP packets for the INT3, so the flow
+		 * can still be decoded. Subsequently:
+		 *   - emit RECORD_TEXT_POKE with the new instruction
+		 *   - IPI-SYNC
+		 *   - write first byte
+		 *   - IPI-SYNC
+		 * So before the text poke event timestamp, the decoder will see
+		 * either the old instruction flow or FUP/TIP of INT3. After the
+		 * text poke event timestamp, the decoder will see either the
+		 * new instruction flow or FUP/TIP of INT3. Thus decoders can
+		 * use the timestamp as the point at which to modify the
+		 * executable code.
+		 * The old instruction is recorded so that the event can be
+		 * processed forwards or backwards.
+		 */
+		perf_event_text_poke(text_poke_addr(&tp[i]), old, len,
+				     tp[i].text, len);
 	}
 
 	if (do_sync) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 03/15] kprobes: Add symbols for kprobe insn pages
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 01/15] perf: Add perf text poke event Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 02/15] perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 04/15] kprobes: Add perf ksymbol events " Adrian Hunter
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Symbols are needed for tools to describe instruction addresses. Pages
allocated for kprobe's purposes need symbols to be created for them.
Add such symbols to be visible via /proc/kallsyms.

Note: kprobe insn pages are not used if ftrace is configured. To see the
effect of this patch, the kernel must be configured with:

	# CONFIG_FUNCTION_TRACER is not set
	CONFIG_KPROBES=y

and for optimised kprobes:

	CONFIG_OPTPROBES=y

Example on x86:

	# perf probe __schedule
	Added new event:
	  probe:__schedule     (on __schedule)
	# cat /proc/kallsyms | grep '\[__builtin__kprobes\]'
	ffffffffc00d4000 t kprobe_insn_page     [__builtin__kprobes]
	ffffffffc00d6000 t kprobe_optinsn_page  [__builtin__kprobes]

Note: This patch adds "__builtin__kprobes" as a module name in
/proc/kallsyms for symbols for pages allocated for kprobes' purposes, even
though "__builtin__kprobes" is not a module.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/linux/kprobes.h | 15 ++++++++++++++
 kernel/kallsyms.c       | 37 +++++++++++++++++++++++++++++----
 kernel/kprobes.c        | 45 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 04bdaf01112c..62d682f47b5e 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -242,6 +242,7 @@ struct kprobe_insn_cache {
 	struct mutex mutex;
 	void *(*alloc)(void);	/* allocate insn page */
 	void (*free)(void *);	/* free insn page */
+	const char *sym;	/* symbol for insn pages */
 	struct list_head pages; /* list of kprobe_insn_page */
 	size_t insn_size;	/* size of instruction slot */
 	int nr_garbage;
@@ -272,6 +273,8 @@ static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
 {									\
 	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
 }
+#define KPROBE_INSN_PAGE_SYM		"kprobe_insn_page"
+#define KPROBE_OPTINSN_PAGE_SYM		"kprobe_optinsn_page"
 #else /* __ARCH_WANT_KPROBES_INSN_SLOT */
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
@@ -373,6 +376,13 @@ void dump_kprobe(struct kprobe *kp);
 void *alloc_insn_page(void);
 void free_insn_page(void *page);
 
+int kprobe_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
+		       char *sym);
+int kprobe_cache_get_kallsym(struct kprobe_insn_cache *c, unsigned int *symnum,
+			     unsigned long *value, char *type, char *sym);
+
+int arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
+			    char *type, char *sym);
 #else /* !CONFIG_KPROBES: */
 
 static inline int kprobes_built_in(void)
@@ -435,6 +445,11 @@ static inline bool within_kprobe_blacklist(unsigned long addr)
 {
 	return true;
 }
+static inline int kprobe_get_kallsym(unsigned int symnum, unsigned long *value,
+				     char *type, char *sym)
+{
+	return -ERANGE;
+}
 #endif /* CONFIG_KPROBES */
 static inline int disable_kretprobe(struct kretprobe *rp)
 {
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index a9b3f660dee7..92b510aa6d79 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -24,6 +24,7 @@
 #include <linux/slab.h>
 #include <linux/filter.h>
 #include <linux/ftrace.h>
+#include <linux/kprobes.h>
 #include <linux/compiler.h>
 
 /*
@@ -439,6 +440,7 @@ struct kallsym_iter {
 	loff_t pos_arch_end;
 	loff_t pos_mod_end;
 	loff_t pos_ftrace_mod_end;
+	loff_t pos_bpf_end;
 	unsigned long value;
 	unsigned int nameoff; /* If iterating in core kernel symbols. */
 	char type;
@@ -498,11 +500,33 @@ static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter)
 
 static int get_ksymbol_bpf(struct kallsym_iter *iter)
 {
+	int ret;
+
 	strlcpy(iter->module_name, "bpf", MODULE_NAME_LEN);
 	iter->exported = 0;
-	return bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end,
-			       &iter->value, &iter->type,
-			       iter->name) < 0 ? 0 : 1;
+	ret = bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end,
+			      &iter->value, &iter->type,
+			      iter->name);
+	if (ret < 0) {
+		iter->pos_bpf_end = iter->pos;
+		return 0;
+	}
+
+	return 1;
+}
+
+/*
+ * This uses "__builtin__kprobes" as a module name for symbols for pages
+ * allocated for kprobes' purposes, even though "__builtin__kprobes" is not a
+ * module.
+ */
+static int get_ksymbol_kprobe(struct kallsym_iter *iter)
+{
+	strlcpy(iter->module_name, "__builtin__kprobes", MODULE_NAME_LEN);
+	iter->exported = 0;
+	return kprobe_get_kallsym(iter->pos - iter->pos_bpf_end,
+				  &iter->value, &iter->type,
+				  iter->name) < 0 ? 0 : 1;
 }
 
 /* Returns space to next name. */
@@ -529,6 +553,7 @@ static void reset_iter(struct kallsym_iter *iter, loff_t new_pos)
 		iter->pos_arch_end = 0;
 		iter->pos_mod_end = 0;
 		iter->pos_ftrace_mod_end = 0;
+		iter->pos_bpf_end = 0;
 	}
 }
 
@@ -553,7 +578,11 @@ static int update_iter_mod(struct kallsym_iter *iter, loff_t pos)
 	    get_ksymbol_ftrace_mod(iter))
 		return 1;
 
-	return get_ksymbol_bpf(iter);
+	if ((!iter->pos_bpf_end || iter->pos_bpf_end > pos) &&
+	    get_ksymbol_bpf(iter))
+		return 1;
+
+	return get_ksymbol_kprobe(iter);
 }
 
 /* Returns false if pos at or past end of file. */
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 2625c241ac00..229d1b596690 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -118,6 +118,7 @@ struct kprobe_insn_cache kprobe_insn_slots = {
 	.mutex = __MUTEX_INITIALIZER(kprobe_insn_slots.mutex),
 	.alloc = alloc_insn_page,
 	.free = free_insn_page,
+	.sym = KPROBE_INSN_PAGE_SYM,
 	.pages = LIST_HEAD_INIT(kprobe_insn_slots.pages),
 	.insn_size = MAX_INSN_SIZE,
 	.nr_garbage = 0,
@@ -296,6 +297,7 @@ struct kprobe_insn_cache kprobe_optinsn_slots = {
 	.mutex = __MUTEX_INITIALIZER(kprobe_optinsn_slots.mutex),
 	.alloc = alloc_insn_page,
 	.free = free_insn_page,
+	.sym = KPROBE_OPTINSN_PAGE_SYM,
 	.pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages),
 	/* .insn_size is initialized later */
 	.nr_garbage = 0,
@@ -2179,6 +2181,49 @@ int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
 	return 0;
 }
 
+int kprobe_cache_get_kallsym(struct kprobe_insn_cache *c, unsigned int *symnum,
+			     unsigned long *value, char *type, char *sym)
+{
+	struct kprobe_insn_page *kip;
+	int ret = -ERANGE;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(kip, &c->pages, list) {
+		if ((*symnum)--)
+			continue;
+		strlcpy(sym, c->sym, KSYM_NAME_LEN);
+		*type = 't';
+		*value = (unsigned long)kip->insns;
+		ret = 0;
+		break;
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
+int __weak arch_kprobe_get_kallsym(unsigned int *symnum, unsigned long *value,
+				   char *type, char *sym)
+{
+	return -ERANGE;
+}
+
+int kprobe_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
+		       char *sym)
+{
+#ifdef __ARCH_WANT_KPROBES_INSN_SLOT
+	if (!kprobe_cache_get_kallsym(&kprobe_insn_slots, &symnum, value, type, sym))
+		return 0;
+#ifdef CONFIG_OPTPROBES
+	if (!kprobe_cache_get_kallsym(&kprobe_optinsn_slots, &symnum, value, type, sym))
+		return 0;
+#endif
+#endif
+	if (!arch_kprobe_get_kallsym(&symnum, value, type, sym))
+		return 0;
+	return -ERANGE;
+}
+
 int __init __weak arch_populate_kprobe_blacklist(void)
 {
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 04/15] kprobes: Add perf ksymbol events for kprobe insn pages
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (2 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 03/15] kprobes: Add symbols for kprobe insn pages Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 05/15] perf/x86: Add perf text poke events for kprobes Adrian Hunter
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Symbols are needed for tools to describe instruction addresses. Pages
allocated for kprobe's purposes need symbols to be created for them.
Add such symbols to be visible via perf ksymbol events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 include/uapi/linux/perf_event.h |  5 +++++
 kernel/kprobes.c                | 12 ++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e5bee6c17b86..e1a4179144a1 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1049,6 +1049,11 @@ enum perf_event_type {
 enum perf_record_ksymbol_type {
 	PERF_RECORD_KSYMBOL_TYPE_UNKNOWN	= 0,
 	PERF_RECORD_KSYMBOL_TYPE_BPF		= 1,
+	/*
+	 * Out of line code such as kprobe-replaced instructions or optimized
+	 * kprobes.
+	 */
+	PERF_RECORD_KSYMBOL_TYPE_OOL		= 2,
 	PERF_RECORD_KSYMBOL_TYPE_MAX		/* non-ABI */
 };
 
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 229d1b596690..f880eb2189c0 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -35,6 +35,7 @@
 #include <linux/ftrace.h>
 #include <linux/cpu.h>
 #include <linux/jump_label.h>
+#include <linux/perf_event.h>
 
 #include <asm/sections.h>
 #include <asm/cacheflush.h>
@@ -184,6 +185,10 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 	kip->cache = c;
 	list_add_rcu(&kip->list, &c->pages);
 	slot = kip->insns;
+
+	/* Record the perf ksymbol register event after adding the page */
+	perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL, (u64)kip->insns,
+			   PAGE_SIZE, false, c->sym);
 out:
 	mutex_unlock(&c->mutex);
 	return slot;
@@ -202,6 +207,13 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 		 * next time somebody inserts a probe.
 		 */
 		if (!list_is_singular(&kip->list)) {
+			/*
+			 * Record perf ksymbol unregister event before removing
+			 * the page.
+			 */
+			perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
+					   (u64)kip->insns, PAGE_SIZE, true,
+					   kip->cache->sym);
 			list_del_rcu(&kip->list);
 			synchronize_rcu();
 			kip->cache->free(kip->insns);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 05/15] perf/x86: Add perf text poke events for kprobes
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (3 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 04/15] kprobes: Add perf ksymbol events " Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 06/15] ftrace: Add symbols for ftrace trampolines Adrian Hunter
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Add perf text poke events for kprobes. That includes:

 - the replaced instruction(s) which are executed out-of-line
   i.e. arch_copy_kprobe() and arch_remove_kprobe()

 - the INT3 that activates the kprobe
   i.e. arch_arm_kprobe() and arch_disarm_kprobe()

 - optimised kprobe function
   i.e. arch_prepare_optimized_kprobe() and
      __arch_remove_optimized_kprobe()

 - optimised kprobe
   i.e. arch_optimize_kprobes() and arch_unoptimize_kprobe()

Resulting in 8 possible text_poke events:

 0:  NULL -> probe.ainsn.insn (if ainsn.boostable && !kp.post_handler)
					arch_copy_kprobe()

 1:  old0 -> INT3			arch_arm_kprobe()

 // boosted kprobe active

 2:  NULL -> optprobe_trampoline	arch_prepare_optimized_kprobe()

 3:  INT3,old1,old2,old3,old4 -> JMP32	arch_optimize_kprobes()

 // optprobe active

 4:  JMP32 -> INT3,old1,old2,old3,old4

 // optprobe disabled and kprobe active (this sometimes goes back to 3)
					arch_unoptimize_kprobe()

 5:  optprobe_trampoline -> NULL	arch_remove_optimized_kprobe()

 // boosted kprobe active

 6:  INT3 -> old0			arch_disarm_kprobe()

 7:  probe.ainsn.insn -> NULL (if ainsn.boostable && !kp.post_handler)
					arch_remove_kprobe()

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/x86/include/asm/kprobes.h |  2 ++
 arch/x86/kernel/kprobes/core.c | 15 +++++++++++++-
 arch/x86/kernel/kprobes/opt.c  | 38 +++++++++++++++++++++++++++++-----
 3 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 073eb7ad2f56..143bc9abe99c 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -66,6 +66,8 @@ struct arch_specific_insn {
 	 */
 	bool boostable;
 	bool if_modifier;
+	/* Number of bytes of text poked */
+	int tp_len;
 };
 
 struct arch_optimized_insn {
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 4d7022a740ab..6d8b58c84cbc 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -33,6 +33,7 @@
 #include <linux/hardirq.h>
 #include <linux/preempt.h>
 #include <linux/sched/debug.h>
+#include <linux/perf_event.h>
 #include <linux/extable.h>
 #include <linux/kdebug.h>
 #include <linux/kallsyms.h>
@@ -471,6 +472,9 @@ static int arch_copy_kprobe(struct kprobe *p)
 	/* Also, displacement change doesn't affect the first byte */
 	p->opcode = buf[0];
 
+	p->ainsn.tp_len = len;
+	perf_event_text_poke(p->ainsn.insn, NULL, 0, buf, len);
+
 	/* OK, write back the instruction(s) into ROX insn buffer */
 	text_poke(p->ainsn.insn, buf, len);
 
@@ -502,12 +506,18 @@ int arch_prepare_kprobe(struct kprobe *p)
 
 void arch_arm_kprobe(struct kprobe *p)
 {
-	text_poke(p->addr, ((unsigned char []){INT3_INSN_OPCODE}), 1);
+	u8 int3 = INT3_INSN_OPCODE;
+
+	text_poke(p->addr, &int3, 1);
 	text_poke_sync();
+	perf_event_text_poke(p->addr, &p->opcode, 1, &int3, 1);
 }
 
 void arch_disarm_kprobe(struct kprobe *p)
 {
+	u8 int3 = INT3_INSN_OPCODE;
+
+	perf_event_text_poke(p->addr, &int3, 1, &p->opcode, 1);
 	text_poke(p->addr, &p->opcode, 1);
 	text_poke_sync();
 }
@@ -515,6 +525,9 @@ void arch_disarm_kprobe(struct kprobe *p)
 void arch_remove_kprobe(struct kprobe *p)
 {
 	if (p->ainsn.insn) {
+		/* Record the perf event before freeing the slot */
+		perf_event_text_poke(p->ainsn.insn, p->ainsn.insn,
+				     p->ainsn.tp_len, NULL, 0);
 		free_insn_slot(p->ainsn.insn, p->ainsn.boostable);
 		p->ainsn.insn = NULL;
 	}
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index ea13f6888284..f86be31d37cc 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -6,6 +6,7 @@
  * Copyright (C) Hitachi Ltd., 2012
  */
 #include <linux/kprobes.h>
+#include <linux/perf_event.h>
 #include <linux/ptrace.h>
 #include <linux/string.h>
 #include <linux/slab.h>
@@ -354,8 +355,15 @@ int arch_within_optimized_kprobe(struct optimized_kprobe *op,
 static
 void __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
 {
-	if (op->optinsn.insn) {
-		free_optinsn_slot(op->optinsn.insn, dirty);
+	u8 *slot = op->optinsn.insn;
+	if (slot) {
+		int len = TMPL_END_IDX + op->optinsn.size + JMP32_INSN_SIZE;
+
+		/* Record the perf event before freeing the slot */
+		if (dirty)
+			perf_event_text_poke(slot, slot, len, NULL, 0);
+
+		free_optinsn_slot(slot, dirty);
 		op->optinsn.insn = NULL;
 		op->optinsn.size = 0;
 	}
@@ -426,8 +434,15 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 			   (u8 *)op->kp.addr + op->optinsn.size);
 	len += JMP32_INSN_SIZE;
 
+	/*
+	 * Note	len = TMPL_END_IDX + op->optinsn.size + JMP32_INSN_SIZE is also
+	 * used in __arch_remove_optimized_kprobe().
+	 */
+
 	/* We have to use text_poke() for instruction buffer because it is RO */
+	perf_event_text_poke(slot, NULL, 0, buf, len);
 	text_poke(slot, buf, len);
+
 	ret = 0;
 out:
 	kfree(buf);
@@ -479,10 +494,23 @@ void arch_optimize_kprobes(struct list_head *oplist)
  */
 void arch_unoptimize_kprobe(struct optimized_kprobe *op)
 {
-	arch_arm_kprobe(&op->kp);
-	text_poke(op->kp.addr + INT3_INSN_SIZE,
-		  op->optinsn.copied_insn, DISP32_SIZE);
+	u8 new[JMP32_INSN_SIZE] = { INT3_INSN_OPCODE, };
+	u8 old[JMP32_INSN_SIZE];
+	u8 *addr = op->kp.addr;
+
+	memcpy(old, op->kp.addr, JMP32_INSN_SIZE);
+	memcpy(new + INT3_INSN_SIZE,
+	       op->optinsn.copied_insn,
+	       JMP32_INSN_SIZE - INT3_INSN_SIZE);
+
+	text_poke(addr, new, INT3_INSN_SIZE);
 	text_poke_sync();
+	text_poke(addr + INT3_INSN_SIZE,
+		  new + INT3_INSN_SIZE,
+		  JMP32_INSN_SIZE - INT3_INSN_SIZE);
+	text_poke_sync();
+
+	perf_event_text_poke(op->kp.addr, old, JMP32_INSN_SIZE, new, JMP32_INSN_SIZE);
 }
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 06/15] ftrace: Add symbols for ftrace trampolines
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (4 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 05/15] perf/x86: Add perf text poke events for kprobes Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 07/15] ftrace: Add perf ksymbol events " Adrian Hunter
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Symbols are needed for tools to describe instruction addresses. Pages
allocated for ftrace's purposes need symbols to be created for them.
Add such symbols to be visible via /proc/kallsyms.

Example on x86 with CONFIG_DYNAMIC_FTRACE=y

	# echo function > /sys/kernel/debug/tracing/current_tracer
	# cat /proc/kallsyms | grep '\[__builtin__ftrace\]'
	ffffffffc0238000 t ftrace_trampoline    [__builtin__ftrace]

Note: This patch adds "__builtin__ftrace" as a module name in /proc/kallsyms for
symbols for pages allocated for ftrace's purposes, even though "__builtin__ftrace"
is not a module.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/linux/ftrace.h | 12 ++++---
 kernel/kallsyms.c      |  5 +++
 kernel/trace/ftrace.c  | 77 ++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 88 insertions(+), 6 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index db95244a62d4..ea726ad1fa83 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -58,9 +58,6 @@ struct ftrace_direct_func;
 const char *
 ftrace_mod_address_lookup(unsigned long addr, unsigned long *size,
 		   unsigned long *off, char **modname, char *sym);
-int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *value,
-			   char *type, char *name,
-			   char *module_name, int *exported);
 #else
 static inline const char *
 ftrace_mod_address_lookup(unsigned long addr, unsigned long *size,
@@ -68,6 +65,13 @@ ftrace_mod_address_lookup(unsigned long addr, unsigned long *size,
 {
 	return NULL;
 }
+#endif
+
+#if defined(CONFIG_FUNCTION_TRACER) && defined(CONFIG_DYNAMIC_FTRACE)
+int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *value,
+			   char *type, char *name,
+			   char *module_name, int *exported);
+#else
 static inline int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *value,
 					 char *type, char *name,
 					 char *module_name, int *exported)
@@ -76,7 +80,6 @@ static inline int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *val
 }
 #endif
 
-
 #ifdef CONFIG_FUNCTION_TRACER
 
 extern int ftrace_enabled;
@@ -207,6 +210,7 @@ struct ftrace_ops {
 	struct ftrace_ops_hash		old_hash;
 	unsigned long			trampoline;
 	unsigned long			trampoline_size;
+	struct list_head		list;
 #endif
 };
 
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 92b510aa6d79..b6b8613906da 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -484,6 +484,11 @@ static int get_ksymbol_mod(struct kallsym_iter *iter)
 	return 1;
 }
 
+/*
+ * ftrace_mod_get_kallsym() may also get symbols for pages allocated for ftrace
+ * purposes. In that case "__builtin__ftrace" is used as a module name, even
+ * though "__builtin__ftrace" is not a module.
+ */
 static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter)
 {
 	int ret = ftrace_mod_get_kallsym(iter->pos - iter->pos_mod_end,
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index fd81c7de77a7..4e1c2f0bea76 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2765,6 +2765,38 @@ void __weak arch_ftrace_trampoline_free(struct ftrace_ops *ops)
 {
 }
 
+/* List of trace_ops that have allocated trampolines */
+static LIST_HEAD(ftrace_ops_trampoline_list);
+
+static void ftrace_add_trampoline_to_kallsyms(struct ftrace_ops *ops)
+{
+	lockdep_assert_held(&ftrace_lock);
+	list_add_rcu(&ops->list, &ftrace_ops_trampoline_list);
+}
+
+static void ftrace_remove_trampoline_from_kallsyms(struct ftrace_ops *ops)
+{
+	lockdep_assert_held(&ftrace_lock);
+	list_del_rcu(&ops->list);
+}
+
+/*
+ * "__builtin__ftrace" is used as a module name in /proc/kallsyms for symbols
+ * for pages allocated for ftrace purposes, even though "__builtin__ftrace" is
+ * not a module.
+ */
+#define FTRACE_TRAMPOLINE_MOD "__builtin__ftrace"
+#define FTRACE_TRAMPOLINE_SYM "ftrace_trampoline"
+
+static void ftrace_trampoline_free(struct ftrace_ops *ops)
+{
+	if (ops && (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP) &&
+	    ops->trampoline)
+		ftrace_remove_trampoline_from_kallsyms(ops);
+
+	arch_ftrace_trampoline_free(ops);
+}
+
 static void ftrace_startup_enable(int command)
 {
 	if (saved_ftrace_func != ftrace_trace_function) {
@@ -2935,7 +2967,7 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command)
 			synchronize_rcu_tasks();
 
  free_ops:
-		arch_ftrace_trampoline_free(ops);
+		ftrace_trampoline_free(ops);
 	}
 
 	return 0;
@@ -6178,6 +6210,27 @@ struct ftrace_mod_map {
 	unsigned int		num_funcs;
 };
 
+static int ftrace_get_trampoline_kallsym(unsigned int symnum,
+					 unsigned long *value, char *type,
+					 char *name, char *module_name,
+					 int *exported)
+{
+	struct ftrace_ops *op;
+
+	list_for_each_entry_rcu(op, &ftrace_ops_trampoline_list, list) {
+		if (!op->trampoline || symnum--)
+			continue;
+		*value = op->trampoline;
+		*type = 't';
+		strlcpy(name, FTRACE_TRAMPOLINE_SYM, KSYM_NAME_LEN);
+		strlcpy(module_name, FTRACE_TRAMPOLINE_MOD, MODULE_NAME_LEN);
+		*exported = 0;
+		return 0;
+	}
+
+	return -ERANGE;
+}
+
 #ifdef CONFIG_MODULES
 
 #define next_to_ftrace_page(p) container_of(p, struct ftrace_page, next)
@@ -6514,6 +6567,7 @@ int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *value,
 {
 	struct ftrace_mod_map *mod_map;
 	struct ftrace_mod_func *mod_func;
+	int ret;
 
 	preempt_disable();
 	list_for_each_entry_rcu(mod_map, &ftrace_mod_maps, list) {
@@ -6540,8 +6594,10 @@ int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *value,
 		WARN_ON(1);
 		break;
 	}
+	ret = ftrace_get_trampoline_kallsym(symnum, value, type, name,
+					    module_name, exported);
 	preempt_enable();
-	return -ERANGE;
+	return ret;
 }
 
 #else
@@ -6553,6 +6609,18 @@ allocate_ftrace_mod_map(struct module *mod,
 {
 	return NULL;
 }
+int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *value,
+			   char *type, char *name, char *module_name,
+			   int *exported)
+{
+	int ret;
+
+	preempt_disable();
+	ret = ftrace_get_trampoline_kallsym(symnum, value, type, name,
+					    module_name, exported);
+	preempt_enable();
+	return ret;
+}
 #endif /* CONFIG_MODULES */
 
 struct ftrace_init_func {
@@ -6733,7 +6801,12 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 
 static void ftrace_update_trampoline(struct ftrace_ops *ops)
 {
+	unsigned long trampoline = ops->trampoline;
+
 	arch_ftrace_update_trampoline(ops);
+	if (ops->trampoline && ops->trampoline != trampoline &&
+	    (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+		ftrace_add_trampoline_to_kallsyms(ops);
 }
 
 void ftrace_init_trace_array(struct trace_array *tr)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 07/15] ftrace: Add perf ksymbol events for ftrace trampolines
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (5 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 06/15] ftrace: Add symbols for ftrace trampolines Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 08/15] ftrace: Add perf text poke " Adrian Hunter
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Symbols are needed for tools to describe instruction addresses. Pages
allocated for ftrace's purposes need symbols to be created for them.
Add such symbols to be visible via perf ksymbol events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/uapi/linux/perf_event.h |  2 +-
 kernel/trace/ftrace.c           | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e1a4179144a1..52ca2093831c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1051,7 +1051,7 @@ enum perf_record_ksymbol_type {
 	PERF_RECORD_KSYMBOL_TYPE_BPF		= 1,
 	/*
 	 * Out of line code such as kprobe-replaced instructions or optimized
-	 * kprobes.
+	 * kprobes or ftrace trampolines.
 	 */
 	PERF_RECORD_KSYMBOL_TYPE_OOL		= 2,
 	PERF_RECORD_KSYMBOL_TYPE_MAX		/* non-ABI */
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 4e1c2f0bea76..398d58b8ffa5 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2791,8 +2791,13 @@ static void ftrace_remove_trampoline_from_kallsyms(struct ftrace_ops *ops)
 static void ftrace_trampoline_free(struct ftrace_ops *ops)
 {
 	if (ops && (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP) &&
-	    ops->trampoline)
+	    ops->trampoline) {
+		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
+				   ops->trampoline, ops->trampoline_size,
+				   true, FTRACE_TRAMPOLINE_SYM);
+		/* Remove from kallsyms after the perf events */
 		ftrace_remove_trampoline_from_kallsyms(ops);
+	}
 
 	arch_ftrace_trampoline_free(ops);
 }
@@ -6805,8 +6810,13 @@ static void ftrace_update_trampoline(struct ftrace_ops *ops)
 
 	arch_ftrace_update_trampoline(ops);
 	if (ops->trampoline && ops->trampoline != trampoline &&
-	    (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+	    (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP)) {
+		/* Add to kallsyms before the perf events */
 		ftrace_add_trampoline_to_kallsyms(ops);
+		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
+				   ops->trampoline, ops->trampoline_size, false,
+				   FTRACE_TRAMPOLINE_SYM);
+	}
 }
 
 void ftrace_init_trace_array(struct trace_array *tr)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (6 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 07/15] ftrace: Add perf ksymbol events " Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-21 13:45   ` Peter Zijlstra
  2020-04-05 20:13 ` [PATCH V6 09/15] perf kcore_copy: Fix module map when there are no modules loaded Adrian Hunter
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Add perf text poke events for ftrace trampolines when created and when
freed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 kernel/trace/ftrace.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 398d58b8ffa5..3318f7eca76c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2792,6 +2792,13 @@ static void ftrace_trampoline_free(struct ftrace_ops *ops)
 {
 	if (ops && (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP) &&
 	    ops->trampoline) {
+		/*
+		 * Record the text poke event before the ksymbol unregister
+		 * event.
+		 */
+		perf_event_text_poke((void *)ops->trampoline,
+				     (void *)ops->trampoline,
+				     ops->trampoline_size, NULL, 0);
 		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
 				   ops->trampoline, ops->trampoline_size,
 				   true, FTRACE_TRAMPOLINE_SYM);
@@ -6816,6 +6823,13 @@ static void ftrace_update_trampoline(struct ftrace_ops *ops)
 		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
 				   ops->trampoline, ops->trampoline_size, false,
 				   FTRACE_TRAMPOLINE_SYM);
+		/*
+		 * Record the perf text poke event after the ksymbol register
+		 * event.
+		 */
+		perf_event_text_poke((void *)ops->trampoline, NULL, 0,
+				     (void *)ops->trampoline,
+				     ops->trampoline_size);
 	}
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 09/15] perf kcore_copy: Fix module map when there are no modules loaded
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (7 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 08/15] ftrace: Add perf text poke " Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 10/15] perf evlist: Disable 'immediate' events last Adrian Hunter
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

In the absence of any modules, no "modules" map is created, but there are
other executable pages to map, due to eBPF JIT, kprobe or ftrace. Map them
by recognizing that the first "module" symbol is not necessarily from a
module, and adjust the map accordingly.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/symbol-elf.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index be5b493f8284..5e43054bffea 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1458,6 +1458,7 @@ struct kcore_copy_info {
 	u64 first_symbol;
 	u64 last_symbol;
 	u64 first_module;
+	u64 first_module_symbol;
 	u64 last_module_symbol;
 	size_t phnum;
 	struct list_head phdrs;
@@ -1534,6 +1535,8 @@ static int kcore_copy__process_kallsyms(void *arg, const char *name, char type,
 		return 0;
 
 	if (strchr(name, '[')) {
+		if (!kci->first_module_symbol || start < kci->first_module_symbol)
+			kci->first_module_symbol = start;
 		if (start > kci->last_module_symbol)
 			kci->last_module_symbol = start;
 		return 0;
@@ -1731,6 +1734,10 @@ static int kcore_copy__calc_maps(struct kcore_copy_info *kci, const char *dir,
 		kci->etext += page_size;
 	}
 
+	if (kci->first_module_symbol &&
+	    (!kci->first_module || kci->first_module_symbol < kci->first_module))
+		kci->first_module = kci->first_module_symbol;
+
 	kci->first_module = round_down(kci->first_module, page_size);
 
 	if (kci->last_module_symbol) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 10/15] perf evlist: Disable 'immediate' events last
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (8 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 09/15] perf kcore_copy: Fix module map when there are no modules loaded Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 11/15] perf tools: Add support for PERF_RECORD_TEXT_POKE Adrian Hunter
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Events marked as 'immediate' are started before other events to ensure
that there is context at the start of the main tracing events. The same
is true at the end of tracing, so disable 'immediate' events after other
events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evlist.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1548237b6558..06c4586065cc 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -379,22 +379,33 @@ void evlist__disable(struct evlist *evlist)
 {
 	struct evsel *pos;
 	struct affinity affinity;
-	int cpu, i;
+	int cpu, i, imm = 0;
+	bool has_imm = false;
 
 	if (affinity__setup(&affinity) < 0)
 		return;
 
-	evlist__for_each_cpu(evlist, i, cpu) {
-		affinity__set(&affinity, cpu);
-
-		evlist__for_each_entry(evlist, pos) {
-			if (evsel__cpu_iter_skip(pos, cpu))
-				continue;
-			if (pos->disabled || !perf_evsel__is_group_leader(pos) || !pos->core.fd)
-				continue;
-			evsel__disable_cpu(pos, pos->cpu_iter - 1);
+	/* Disable 'immediate' events last */
+	for (imm = 0; imm <= 1; imm++) {
+		evlist__for_each_cpu(evlist, i, cpu) {
+			affinity__set(&affinity, cpu);
+
+			evlist__for_each_entry(evlist, pos) {
+				if (evsel__cpu_iter_skip(pos, cpu))
+					continue;
+				if (pos->disabled || !perf_evsel__is_group_leader(pos) || !pos->core.fd)
+					continue;
+				if (pos->immediate)
+					has_imm = true;
+				if (pos->immediate != imm)
+					continue;
+				evsel__disable_cpu(pos, pos->cpu_iter - 1);
+			}
 		}
+		if (!has_imm)
+			break;
 	}
+
 	affinity__cleanup(&affinity);
 	evlist__for_each_entry(evlist, pos) {
 		if (!perf_evsel__is_group_leader(pos) || !pos->core.fd)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 11/15] perf tools: Add support for PERF_RECORD_TEXT_POKE
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (9 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 10/15] perf evlist: Disable 'immediate' events last Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 12/15] perf tools: Add support for PERF_RECORD_KSYMBOL_TYPE_OOL Adrian Hunter
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Add processing for PERF_RECORD_TEXT_POKE events. When a text poke event is
processed, then the kernel dso data cache is updated with the poked bytes.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/include/uapi/linux/perf_event.h     | 21 +++++++++-
 tools/lib/perf/include/perf/event.h       |  9 +++++
 tools/perf/builtin-record.c               | 45 ++++++++++++++++++++++
 tools/perf/util/event.c                   | 47 +++++++++++++++++++++++
 tools/perf/util/event.h                   |  5 +++
 tools/perf/util/evlist.h                  |  1 +
 tools/perf/util/evsel.c                   |  7 +++-
 tools/perf/util/machine.c                 | 43 +++++++++++++++++++++
 tools/perf/util/machine.h                 |  3 ++
 tools/perf/util/perf_event_attr_fprintf.c |  1 +
 tools/perf/util/record.c                  | 10 +++++
 tools/perf/util/record.h                  |  1 +
 tools/perf/util/session.c                 | 23 +++++++++++
 tools/perf/util/tool.h                    |  3 +-
 14 files changed, 216 insertions(+), 3 deletions(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 7b2d6fc9e6ed..e5bee6c17b86 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -383,7 +383,8 @@ struct perf_event_attr {
 				bpf_event      :  1, /* include bpf events */
 				aux_output     :  1, /* generate AUX records instead of events */
 				cgroup         :  1, /* include cgroup events */
-				__reserved_1   : 31;
+				text_poke      :  1, /* include text poke events */
+				__reserved_1   : 30;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -1024,6 +1025,24 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_CGROUP			= 19,
 
+	/*
+	 * Records changes to kernel text i.e. self-modified code. 'old_len' is
+	 * the number of old bytes, 'new_len' is the number of new bytes. Either
+	 * 'old_len' or 'new_len' may be zero to indicate, for example, the
+	 * addition or removal of a trampoline. 'bytes' contains the old bytes
+	 * followed immediately by the new bytes.
+	 *
+	 * struct {
+	 *	struct perf_event_header	header;
+	 *	u64				addr;
+	 *	u16				old_len;
+	 *	u16				new_len;
+	 *	u8				bytes[];
+	 *	struct sample_id		sample_id;
+	 * };
+	 */
+	PERF_RECORD_TEXT_POKE			= 20,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 69b44d2cc0f5..842028858d66 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -111,6 +111,14 @@ struct perf_record_cgroup {
 	char			 path[PATH_MAX];
 };
 
+struct perf_record_text_poke_event {
+	struct perf_event_header header;
+	__u64			addr;
+	__u16			old_len;
+	__u16			new_len;
+	__u8			bytes[];
+};
+
 struct perf_record_sample {
 	struct perf_event_header header;
 	__u64			 array[];
@@ -367,6 +375,7 @@ union perf_event {
 	struct perf_record_sample		sample;
 	struct perf_record_bpf_event		bpf;
 	struct perf_record_ksymbol		ksymbol;
+	struct perf_record_text_poke_event	text_poke;
 	struct perf_record_header_attr		attr;
 	struct perf_record_event_update		event_update;
 	struct perf_record_header_event_type	event_type;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7d7912e121d6..839edc23242c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -723,6 +723,43 @@ static int record__auxtrace_init(struct record *rec __maybe_unused)
 
 #endif
 
+static int record__config_text_poke(struct evlist *evlist)
+{
+	struct evsel *evsel;
+	int err;
+
+	/* Nothing to do if text poke is already configured */
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel->core.attr.text_poke)
+			return 0;
+	}
+
+	err = parse_events(evlist, "dummy:u", NULL);
+	if (err)
+		return err;
+
+	evsel = evlist__last(evlist);
+
+	evsel->core.attr.freq = 0;
+	evsel->core.attr.sample_period = 1;
+	evsel->core.attr.text_poke = 1;
+	evsel->core.attr.ksymbol = 1;
+
+	evsel->core.system_wide = true;
+	evsel->no_aux_samples = true;
+	evsel->immediate = true;
+
+	/* Text poke must be collected on all CPUs */
+	perf_cpu_map__put(evsel->core.own_cpus);
+	evsel->core.own_cpus = perf_cpu_map__new(NULL);
+	perf_cpu_map__put(evsel->core.cpus);
+	evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus);
+
+	perf_evsel__set_sample_bit(evsel, TIME);
+
+	return 0;
+}
+
 static bool record__kcore_readable(struct machine *machine)
 {
 	char kcore[PATH_MAX];
@@ -2620,6 +2657,14 @@ int cmd_record(int argc, const char **argv)
 	if (rec->opts.full_auxtrace)
 		rec->buildid_all = true;
 
+	if (rec->opts.text_poke) {
+		err = record__config_text_poke(rec->evlist);
+		if (err) {
+			pr_err("record__config_text_poke failed, error %d\n", err);
+			goto out;
+		}
+	}
+
 	if (record_opts__config(&rec->opts)) {
 		err = -EINVAL;
 		goto out;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index dc0e11214ae1..cdaf58734254 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -31,6 +31,7 @@
 #include "stat.h"
 #include "session.h"
 #include "bpf-event.h"
+#include "print_binary.h"
 #include "tool.h"
 #include "../perf.h"
 
@@ -55,6 +56,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_KSYMBOL]			= "KSYMBOL",
 	[PERF_RECORD_BPF_EVENT]			= "BPF_EVENT",
 	[PERF_RECORD_CGROUP]			= "CGROUP",
+	[PERF_RECORD_TEXT_POKE]			= "TEXT_POKE",
 	[PERF_RECORD_HEADER_ATTR]		= "ATTR",
 	[PERF_RECORD_HEADER_EVENT_TYPE]		= "EVENT_TYPE",
 	[PERF_RECORD_HEADER_TRACING_DATA]	= "TRACING_DATA",
@@ -267,6 +269,14 @@ int perf_event__process_bpf(struct perf_tool *tool __maybe_unused,
 	return machine__process_bpf(machine, event, sample);
 }
 
+int perf_event__process_text_poke(struct perf_tool *tool __maybe_unused,
+				  union perf_event *event,
+				  struct perf_sample *sample,
+				  struct machine *machine)
+{
+	return machine__process_text_poke(machine, event, sample);
+}
+
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
 {
 	return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64 "]: %c %s\n",
@@ -413,6 +423,40 @@ size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp)
 		       event->bpf.type, event->bpf.flags, event->bpf.id);
 }
 
+static int text_poke_printer(enum binary_printer_ops op, unsigned int val,
+			     void *extra, FILE *fp)
+{
+	bool old = *(bool *)extra;
+
+	switch ((int)op) {
+	case BINARY_PRINT_LINE_BEGIN:
+		return fprintf(fp, "            %s bytes:", old ? "Old" : "New");
+	case BINARY_PRINT_NUM_DATA:
+		return fprintf(fp, " %02x", val);
+	case BINARY_PRINT_LINE_END:
+		return fprintf(fp, "\n");
+	default:
+		return 0;
+	}
+}
+
+size_t perf_event__fprintf_text_poke(union perf_event *event, FILE *fp)
+{
+	struct perf_record_text_poke_event *tp = &event->text_poke;
+	size_t ret;
+	bool old;
+
+	ret = fprintf(fp, " %" PRI_lx64 " old len %u new len %u\n",
+		      tp->addr, tp->old_len, tp->new_len);
+	old = true;
+	ret += binary__fprintf(tp->bytes, tp->old_len, 16, text_poke_printer,
+			       &old, fp);
+	old = false;
+	ret += binary__fprintf(tp->bytes + tp->old_len, tp->new_len, 16,
+			       text_poke_printer, &old, fp);
+	return ret;
+}
+
 size_t perf_event__fprintf(union perf_event *event, FILE *fp)
 {
 	size_t ret = fprintf(fp, "PERF_RECORD_%s",
@@ -457,6 +501,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
 	case PERF_RECORD_BPF_EVENT:
 		ret += perf_event__fprintf_bpf(event, fp);
 		break;
+	case PERF_RECORD_TEXT_POKE:
+		ret += perf_event__fprintf_text_poke(event, fp);
+		break;
 	default:
 		ret += fprintf(fp, "\n");
 	}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index b8289f160f07..dbbba6dea51d 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -351,6 +351,10 @@ int perf_event__process_bpf(struct perf_tool *tool,
 			    union perf_event *event,
 			    struct perf_sample *sample,
 			    struct machine *machine);
+int perf_event__process_text_poke(struct perf_tool *tool,
+				  union perf_event *event,
+				  struct perf_sample *sample,
+				  struct machine *machine);
 int perf_event__process(struct perf_tool *tool,
 			union perf_event *event,
 			struct perf_sample *sample,
@@ -385,6 +389,7 @@ size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_cgroup(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp);
+size_t perf_event__fprintf_text_poke(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf(union perf_event *event, FILE *fp);
 
 int kallsyms__get_function_start(const char *kallsyms_filename,
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index f5bd5c386df1..93e9b35e53b0 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -175,6 +175,7 @@ struct callchain_param;
 void perf_evlist__set_id_pos(struct evlist *evlist);
 bool perf_can_sample_identifier(void);
 bool perf_can_record_switch_events(void);
+bool perf_can_record_text_poke_events(void);
 bool perf_can_record_cpu_wide(void);
 bool perf_can_aux_sample(void);
 void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index eb880efbce16..b617d3b1fa95 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1098,7 +1098,12 @@ void perf_evsel__config(struct evsel *evsel, struct record_opts *opts,
 	attr->mmap  = track;
 	attr->mmap2 = track && !perf_missing_features.mmap2;
 	attr->comm  = track;
-	attr->ksymbol = track && !perf_missing_features.ksymbol;
+	/*
+	 * ksymbol is tracked separately with text poke because it needs to be
+	 * system wide and enabled immediately.
+	 */
+	if (!opts->text_poke)
+		attr->ksymbol = track && !perf_missing_features.ksymbol;
 	attr->bpf_event = track && !opts->no_bpf_event && !perf_missing_features.bpf;
 
 	if (opts->record_namespaces)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 97142e9671be..0fff43cf15e5 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -796,6 +796,47 @@ int machine__process_ksymbol(struct machine *machine __maybe_unused,
 	return machine__process_ksymbol_register(machine, event, sample);
 }
 
+int machine__process_text_poke(struct machine *machine, union perf_event *event,
+			       struct perf_sample *sample __maybe_unused)
+{
+	struct map *map = maps__find(&machine->kmaps, event->text_poke.addr);
+	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+
+	if (dump_trace)
+		perf_event__fprintf_text_poke(event, stdout);
+
+	if (!event->text_poke.new_len)
+		return 0;
+
+	if (cpumode != PERF_RECORD_MISC_KERNEL) {
+		pr_debug("%s: unsupported cpumode - ignoring\n", __func__);
+		return 0;
+	}
+
+	if (map && map->dso) {
+		u8 *new_bytes = event->text_poke.bytes + event->text_poke.old_len;
+		int ret;
+
+		/*
+		 * Kernel maps might be changed when loading symbols so loading
+		 * must be done prior to using kernel maps.
+		 */
+		map__load(map);
+		ret = dso__data_write_cache_addr(map->dso, map, machine,
+						 event->text_poke.addr,
+						 new_bytes,
+						 event->text_poke.new_len);
+		if (ret != event->text_poke.new_len)
+			pr_debug("Failed to write kernel text poke at %#" PRI_lx64 "\n",
+				 event->text_poke.addr);
+	} else {
+		pr_debug("Failed to find kernel text poke address map for %#" PRI_lx64 "\n",
+			 event->text_poke.addr);
+	}
+
+	return 0;
+}
+
 static struct map *machine__addnew_module_map(struct machine *machine, u64 start,
 					      const char *filename)
 {
@@ -1918,6 +1959,8 @@ int machine__process_event(struct machine *machine, union perf_event *event,
 		ret = machine__process_ksymbol(machine, event, sample); break;
 	case PERF_RECORD_BPF_EVENT:
 		ret = machine__process_bpf(machine, event, sample); break;
+	case PERF_RECORD_TEXT_POKE:
+		ret = machine__process_text_poke(machine, event, sample); break;
 	default:
 		ret = -1;
 		break;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index fa1be9ea00fa..062c36a8433c 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -138,6 +138,9 @@ int machine__process_mmap2_event(struct machine *machine, union perf_event *even
 int machine__process_ksymbol(struct machine *machine,
 			     union perf_event *event,
 			     struct perf_sample *sample);
+int machine__process_text_poke(struct machine *machine,
+			       union perf_event *event,
+			       struct perf_sample *sample);
 int machine__process_event(struct machine *machine, union perf_event *event,
 				struct perf_sample *sample);
 
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index b94fa07f5d32..e67a227c0ce7 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -147,6 +147,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(aux_watermark, p_unsigned);
 	PRINT_ATTRf(sample_max_stack, p_unsigned);
 	PRINT_ATTRf(aux_sample_size, p_unsigned);
+	PRINT_ATTRf(text_poke, p_unsigned);
 
 	return ret;
 }
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 7def66168503..207ba2a65008 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -97,6 +97,11 @@ static void perf_probe_context_switch(struct evsel *evsel)
 	evsel->core.attr.context_switch = 1;
 }
 
+static void perf_probe_text_poke(struct evsel *evsel)
+{
+	evsel->core.attr.text_poke = 1;
+}
+
 bool perf_can_sample_identifier(void)
 {
 	return perf_probe_api(perf_probe_sample_identifier);
@@ -112,6 +117,11 @@ bool perf_can_record_switch_events(void)
 	return perf_probe_api(perf_probe_context_switch);
 }
 
+bool perf_can_record_text_poke_events(void)
+{
+	return perf_probe_api(perf_probe_text_poke);
+}
+
 bool perf_can_record_cpu_wide(void)
 {
 	struct perf_event_attr attr = {
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 24316458be20..d0fd99864c49 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -47,6 +47,7 @@ struct record_opts {
 	bool	      sample_id;
 	bool	      no_bpf_event;
 	bool	      kcore;
+	bool	      text_poke;
 	unsigned int  freq;
 	unsigned int  mmap_pages;
 	unsigned int  auxtrace_mmap_pages;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 0b0bfe5bef17..9e011f3a83ef 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -491,6 +491,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
 		tool->ksymbol = perf_event__process_ksymbol;
 	if (tool->bpf == NULL)
 		tool->bpf = perf_event__process_bpf;
+	if (tool->text_poke == NULL)
+		tool->text_poke = perf_event__process_text_poke;
 	if (tool->read == NULL)
 		tool->read = process_event_sample_stub;
 	if (tool->throttle == NULL)
@@ -660,6 +662,24 @@ static void perf_event__switch_swap(union perf_event *event, bool sample_id_all)
 		swap_sample_id_all(event, &event->context_switch + 1);
 }
 
+static void perf_event__text_poke_swap(union perf_event *event, bool sample_id_all)
+{
+	event->text_poke.addr    = bswap_64(event->text_poke.addr);
+	event->text_poke.old_len = bswap_16(event->text_poke.old_len);
+	event->text_poke.new_len = bswap_16(event->text_poke.new_len);
+
+	if (sample_id_all) {
+		size_t len = sizeof(event->text_poke.old_len) +
+			     sizeof(event->text_poke.new_len) +
+			     event->text_poke.old_len +
+			     event->text_poke.new_len;
+		void *data = &event->text_poke.old_len;
+
+		data += PERF_ALIGN(len, sizeof(u64));
+		swap_sample_id_all(event, data);
+	}
+}
+
 static void perf_event__throttle_swap(union perf_event *event,
 				      bool sample_id_all)
 {
@@ -933,6 +953,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_SWITCH]		  = perf_event__switch_swap,
 	[PERF_RECORD_SWITCH_CPU_WIDE]	  = perf_event__switch_swap,
 	[PERF_RECORD_NAMESPACES]	  = perf_event__namespaces_swap,
+	[PERF_RECORD_TEXT_POKE]		  = perf_event__text_poke_swap,
 	[PERF_RECORD_HEADER_ATTR]	  = perf_event__hdr_attr_swap,
 	[PERF_RECORD_HEADER_EVENT_TYPE]	  = perf_event__event_type_swap,
 	[PERF_RECORD_HEADER_TRACING_DATA] = perf_event__tracing_data_swap,
@@ -1476,6 +1497,8 @@ static int machines__deliver_event(struct machines *machines,
 		return tool->ksymbol(tool, event, sample, machine);
 	case PERF_RECORD_BPF_EVENT:
 		return tool->bpf(tool, event, sample, machine);
+	case PERF_RECORD_TEXT_POKE:
+		return tool->text_poke(tool, event, sample, machine);
 	default:
 		++evlist->stats.nr_unknown_events;
 		return -1;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 3fb67bd31e4a..bbbc0dcd461f 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -57,7 +57,8 @@ struct perf_tool {
 			throttle,
 			unthrottle,
 			ksymbol,
-			bpf;
+			bpf,
+			text_poke;
 
 	event_attr_op	attr;
 	event_attr_op	event_update;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 12/15] perf tools: Add support for PERF_RECORD_KSYMBOL_TYPE_OOL
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (10 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 11/15] perf tools: Add support for PERF_RECORD_TEXT_POKE Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 13/15] perf intel-pt: Add support for text poke events Adrian Hunter
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

PERF_RECORD_KSYMBOL_TYPE_OOL marks an executable page. Create a map backed
only by memory, which will be populated as necessary by text poke events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/include/uapi/linux/perf_event.h | 5 +++++
 tools/perf/util/dso.c                 | 3 +++
 tools/perf/util/dso.h                 | 1 +
 tools/perf/util/machine.c             | 6 ++++++
 tools/perf/util/map.c                 | 5 +++++
 tools/perf/util/map.h                 | 3 ++-
 tools/perf/util/symbol.c              | 1 +
 7 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index e5bee6c17b86..52ca2093831c 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -1049,6 +1049,11 @@ enum perf_event_type {
 enum perf_record_ksymbol_type {
 	PERF_RECORD_KSYMBOL_TYPE_UNKNOWN	= 0,
 	PERF_RECORD_KSYMBOL_TYPE_BPF		= 1,
+	/*
+	 * Out of line code such as kprobe-replaced instructions or optimized
+	 * kprobes or ftrace trampolines.
+	 */
+	PERF_RECORD_KSYMBOL_TYPE_OOL		= 2,
 	PERF_RECORD_KSYMBOL_TYPE_MAX		/* non-ABI */
 };
 
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 91f21239608b..9eb50ff6da96 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -191,6 +191,7 @@ int dso__read_binary_type_filename(const struct dso *dso,
 	case DSO_BINARY_TYPE__GUEST_KALLSYMS:
 	case DSO_BINARY_TYPE__JAVA_JIT:
 	case DSO_BINARY_TYPE__BPF_PROG_INFO:
+	case DSO_BINARY_TYPE__OOL:
 	case DSO_BINARY_TYPE__NOT_FOUND:
 		ret = -1;
 		break;
@@ -881,6 +882,8 @@ static struct dso_cache *dso_cache__populate(struct dso *dso,
 
 	if (dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO)
 		*ret = bpf_read(dso, cache_offset, cache->data);
+	else if (dso->binary_type == DSO_BINARY_TYPE__OOL)
+		*ret = DSO__DATA_CACHE_SIZE;
 	else
 		*ret = file_read(dso, machine, cache_offset, cache->data);
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 2db64b79617a..b482453bc3d1 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -40,6 +40,7 @@ enum dso_binary_type {
 	DSO_BINARY_TYPE__GUEST_KCORE,
 	DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO,
 	DSO_BINARY_TYPE__BPF_PROG_INFO,
+	DSO_BINARY_TYPE__OOL,
 	DSO_BINARY_TYPE__NOT_FOUND,
 };
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0fff43cf15e5..8447334ac056 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -756,6 +756,12 @@ static int machine__process_ksymbol_register(struct machine *machine,
 			return -ENOMEM;
 		}
 
+		if (event->ksymbol.ksym_type == PERF_RECORD_KSYMBOL_TYPE_OOL) {
+			map->dso->binary_type = DSO_BINARY_TYPE__OOL;
+			map->dso->data.file_size = event->ksymbol.len;
+			dso__set_loaded(map->dso);
+		}
+
 		map->start = event->ksymbol.addr;
 		map->end = map->start + event->ksymbol.len;
 		maps__insert(&machine->kmaps, map);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 53d96611e6a6..f9dc8c5493ea 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -267,6 +267,11 @@ bool __map__is_bpf_prog(const struct map *map)
 	return name && (strstr(name, "bpf_prog_") == name);
 }
 
+bool __map__is_ool(const struct map *map)
+{
+	return map->dso && map->dso->binary_type == DSO_BINARY_TYPE__OOL;
+}
+
 bool map__has_symbols(const struct map *map)
 {
 	return dso__has_symbols(map->dso);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 067036e8970c..9e312ae2d656 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -147,11 +147,12 @@ int map__set_kallsyms_ref_reloc_sym(struct map *map, const char *symbol_name,
 bool __map__is_kernel(const struct map *map);
 bool __map__is_extra_kernel_map(const struct map *map);
 bool __map__is_bpf_prog(const struct map *map);
+bool __map__is_ool(const struct map *map);
 
 static inline bool __map__is_kmodule(const struct map *map)
 {
 	return !__map__is_kernel(map) && !__map__is_extra_kernel_map(map) &&
-	       !__map__is_bpf_prog(map);
+	       !__map__is_bpf_prog(map) && !__map__is_ool(map);
 }
 
 bool map__has_symbols(const struct map *map);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 26bc6a0096ce..7d4b5ca0bd12 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1544,6 +1544,7 @@ static bool dso__is_compatible_symtab_type(struct dso *dso, bool kmod,
 		return true;
 
 	case DSO_BINARY_TYPE__BPF_PROG_INFO:
+	case DSO_BINARY_TYPE__OOL:
 	case DSO_BINARY_TYPE__NOT_FOUND:
 	default:
 		return false;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (11 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 12/15] perf tools: Add support for PERF_RECORD_KSYMBOL_TYPE_OOL Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-21 12:50   ` Jiri Olsa
  2020-04-05 20:13 ` [PATCH V6 14/15] perf script: Add option --show-text-poke-events Adrian Hunter
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Select text poke events when available and the kernel is being traced.
Process text poke events to invalidate entries in Intel PT's instruction
cache.

Example:

  The example requires kernel config:
    CONFIG_PROC_SYSCTL=y
    CONFIG_SCHED_DEBUG=y
    CONFIG_SCHEDSTATS=y

  Before:

    # perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
    # cat /proc/sys/kernel/sched_schedstats
    0
    # echo 1 > /proc/sys/kernel/sched_schedstats
    # cat /proc/sys/kernel/sched_schedstats
    1
    # echo 0 > /proc/sys/kernel/sched_schedstats
    # cat /proc/sys/kernel/sched_schedstats
    0
    # kill %1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 3.341 MB perf.data.before ]
    [1]+  Terminated                 perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
    # perf script -i perf.data.before --itrace=e >/dev/null
    Warning:
    474 instruction trace errors

  After:

    # perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
    # cat /proc/sys/kernel/sched_schedstats
    0
    # echo 1 > /proc/sys/kernel/sched_schedstats
    # cat /proc/sys/kernel/sched_schedstats
    1
    # echo 0 > /proc/sys/kernel/sched_schedstats
    # cat /proc/sys/kernel/sched_schedstats
    0
    # kill %1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 2.646 MB perf.data.after ]
    [1]+  Terminated                 perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
    # perf script -i perf.data.after --itrace=e >/dev/null

Example:

  The example requires kernel config:
    # CONFIG_FUNCTION_TRACER is not set

  Before:
    # perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
    # perf probe __schedule
    Added new event:
      probe:__schedule     (on __schedule)

    You can now use it in all perf tools, such as:

            perf record -e probe:__schedule -aR sleep 1

    # perf record -e probe:__schedule -aR sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.026 MB perf.data (68 samples) ]
    # perf probe -d probe:__schedule
    Removed event: probe:__schedule
    # kill %1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 41.268 MB t1 ]
    [1]+  Terminated                 perf record --kcore -m,64M -o t1 -a -e intel_pt//k
    # perf script -i t1 --itrace=e >/dev/null
    Warning:
    207 instruction trace errors

  After:
    # perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
    # perf probe __schedule
    Added new event:
      probe:__schedule     (on __schedule)

    You can now use it in all perf tools, such as:

        perf record -e probe:__schedule -aR sleep 1

    # perf record -e probe:__schedule -aR sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.028 MB perf.data (107 samples) ]
    # perf probe -d probe:__schedule
    Removed event: probe:__schedule
    # kill %1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 39.978 MB t1 ]
    [1]+  Terminated                 perf record --kcore -m,64M -o t1 -a -e intel_pt//k
    # perf script -i t1 --itrace=e >/dev/null
    # perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
    6 565303693547 0x291f18 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027a000 len 4096 type 2 flags 0x0 name kprobe_insn_page
    6 565303697010 0x291f68 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 0 new len 6
    6 565303838278 0x291fa8 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027c000 len 4096 type 2 flags 0x0 name kprobe_optinsn_page
    6 565303848286 0x291ff8 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 0 new len 106
    6 565369336743 0x292af8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
    7 566434327704 0x217c208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
    6 566456313475 0x293198 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 106 new len 0
    6 566456314935 0x293238 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 6 new len 0

Example:

  The example requires kernel config:
    CONFIG_FUNCTION_TRACER=y

  Before:
    # perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
    # perf probe __kmalloc
    Added new event:
      probe:__kmalloc      (on __kmalloc)

    You can now use it in all perf tools, such as:

        perf record -e probe:__kmalloc -aR sleep 1

    # perf record -e probe:__kmalloc -aR sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.022 MB perf.data (6 samples) ]
    # perf probe -d probe:__kmalloc
    Removed event: probe:__kmalloc
    # kill %1
    [ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 43.850 MB t1 ]
    [1]+  Terminated                 perf record --kcore -m,64M -o t1 -a -e intel_pt//k
    # perf script -i t1 --itrace=e >/dev/null
    Warning:
    8 instruction trace errors

  After:
    # perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
    # perf probe __kmalloc
    Added new event:
      probe:__kmalloc      (on __kmalloc)

    You can now use it in all perf tools, such as:

            perf record -e probe:__kmalloc -aR sleep 1

    # perf record -e probe:__kmalloc -aR sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.037 MB perf.data (206 samples) ]
    # perf probe -d probe:__kmalloc
    Removed event: probe:__kmalloc
    # kill %1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 41.442 MB t1 ]
    [1]+  Terminated                 perf record --kcore -m,64M -o t1 -a -e intel_pt//k
    # perf script -i t1 --itrace=e >/dev/null
    # perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
    5 312216133258 0x8bafe0 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc0360000 len 415 type 2 flags 0x0 name ftrace_trampoline
    5 312216133494 0x8bb030 [0x1d8]: PERF_RECORD_TEXT_POKE addr 0xffffffffc0360000 old len 0 new len 415
    5 312216229563 0x8bb208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
    5 312216239063 0x8bb248 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
    5 312216727230 0x8bb288 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
    5 312216739322 0x8bb2c8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
    5 312216748321 0x8bb308 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
    7 313287163462 0x2817430 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
    7 313287174890 0x2817470 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
    7 313287818979 0x28174b0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
    7 313287829357 0x28174f0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
    7 313287841246 0x2817530 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/arch/x86/util/intel-pt.c |  4 ++
 tools/perf/util/intel-pt.c          | 75 +++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index 1643aed8c4c8..0140b4466209 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -827,6 +827,10 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 		}
 	}
 
+	if (have_timing_info && !intel_pt_evsel->core.attr.exclude_kernel &&
+	    perf_can_record_text_poke_events() && perf_can_record_cpu_wide())
+		opts->text_poke = true;
+
 	if (intel_pt_evsel) {
 		/*
 		 * To obtain the auxtrace buffer file descriptor, the auxtrace
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 23c8289c2472..470a3b413869 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -514,6 +514,17 @@ intel_pt_cache_lookup(struct dso *dso, struct machine *machine, u64 offset)
 	return auxtrace_cache__lookup(dso->auxtrace_cache, offset);
 }
 
+static void intel_pt_cache_invalidate(struct dso *dso, struct machine *machine,
+				      u64 offset)
+{
+	struct auxtrace_cache *c = intel_pt_cache(dso, machine);
+
+	if (!c)
+		return;
+
+	auxtrace_cache__remove(dso->auxtrace_cache, offset);
+}
+
 static inline u8 intel_pt_cpumode(struct intel_pt *pt, uint64_t ip)
 {
 	return ip >= pt->kernel_start ?
@@ -2594,6 +2605,67 @@ static int intel_pt_process_itrace_start(struct intel_pt *pt,
 					event->itrace_start.tid);
 }
 
+static int intel_pt_find_map(struct thread *thread, u8 cpumode, u64 addr,
+			     struct addr_location *al)
+{
+	if (!al->map || addr < al->map->start || addr >= al->map->end) {
+		if (!thread__find_map(thread, cpumode, addr, al))
+			return -1;
+	}
+
+	return 0;
+}
+
+/* Invalidate all instruction cache entries that overlap the text poke */
+static int intel_pt_text_poke(struct intel_pt *pt, union perf_event *event)
+{
+	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+	u64 addr = event->text_poke.addr + event->text_poke.new_len - 1;
+	/* Assume text poke begins in a basic block no more than 4096 bytes */
+	int cnt = 4096 + event->text_poke.new_len;
+	struct thread *thread = pt->unknown_thread;
+	struct addr_location al = { .map = NULL };
+	struct machine *machine = pt->machine;
+	struct intel_pt_cache_entry *e;
+	u64 offset;
+
+	if (!event->text_poke.new_len)
+		return 0;
+
+	for (; cnt; cnt--, addr--) {
+		if (intel_pt_find_map(thread, cpumode, addr, &al)) {
+			if (addr < event->text_poke.addr)
+				return 0;
+			continue;
+		}
+
+		if (!al.map->dso || !al.map->dso->auxtrace_cache)
+			continue;
+
+		offset = al.map->map_ip(al.map, addr);
+
+		e = intel_pt_cache_lookup(al.map->dso, machine, offset);
+		if (!e)
+			continue;
+
+		if (addr + e->byte_cnt + e->length <= event->text_poke.addr) {
+			/*
+			 * No overlap. Working backwards there cannot be another
+			 * basic block that overlaps the text poke if there is a
+			 * branch instruction before the text poke address.
+			 */
+			if (e->branch != INTEL_PT_BR_NO_BRANCH)
+				return 0;
+		} else {
+			intel_pt_cache_invalidate(al.map->dso, machine, offset);
+			intel_pt_log("Invalidated instruction cache for %s at %#"PRIx64"\n",
+				     al.map->dso->long_name, addr);
+		}
+	}
+
+	return 0;
+}
+
 static int intel_pt_process_event(struct perf_session *session,
 				  union perf_event *event,
 				  struct perf_sample *sample,
@@ -2655,6 +2727,9 @@ static int intel_pt_process_event(struct perf_session *session,
 		 event->header.type == PERF_RECORD_SWITCH_CPU_WIDE)
 		err = intel_pt_context_switch(pt, event, sample);
 
+	if (!err && event->header.type == PERF_RECORD_TEXT_POKE)
+		err = intel_pt_text_poke(pt, event);
+
 	intel_pt_log("event %u: cpu %d time %"PRIu64" tsc %#"PRIx64" ",
 		     event->header.type, sample->cpu, sample->time, timestamp);
 	intel_pt_log_event(event);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 14/15] perf script: Add option --show-text-poke-events
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (12 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 13/15] perf intel-pt: Add support for text poke events Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-05 20:13 ` [PATCH V6 15/15] perf script: Show text poke address symbol Adrian Hunter
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

Consistent with other new events, add an option to perf script to display
text poke events and ksymbol events. Both text poke events and ksymbol
events are displayed because some text pokes (e.g. ftrace trampolines) have
corresponding ksymbol events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-script.txt |  4 ++++
 tools/perf/builtin-script.c              | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 99a9853a11ba..8fe507d69f95 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -322,6 +322,10 @@ OPTIONS
 --show-cgroup-events
 	Display cgroup events i.e. events of type PERF_RECORD_CGROUP.
 
+--show-text-poke-events
+	Display text poke events i.e. events of type PERF_RECORD_TEXT_POKE and
+	PERF_RECORD_KSYMBOL.
+
 --demangle::
 	Demangle symbol names to human readable form. It's enabled by default,
 	disable with --no-demangle.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 348fec9743a4..e934bf46269d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1695,6 +1695,7 @@ struct perf_script {
 	bool			show_round_events;
 	bool			show_bpf_events;
 	bool			show_cgroup_events;
+	bool			show_text_poke_events;
 	bool			allocated;
 	bool			per_event_dump;
 	struct evswitch		evswitch;
@@ -2319,6 +2320,18 @@ process_bpf_events(struct perf_tool *tool __maybe_unused,
 			   sample->tid);
 }
 
+static int process_text_poke_events(struct perf_tool *tool,
+				    union perf_event *event,
+				    struct perf_sample *sample,
+				    struct machine *machine)
+{
+	if (perf_event__process_text_poke(tool, event, sample, machine) < 0)
+		return -1;
+
+	return print_event(tool, event, sample, machine, sample->pid,
+			   sample->tid);
+}
+
 static void sig_handler(int sig __maybe_unused)
 {
 	session_done = 1;
@@ -2427,6 +2440,10 @@ static int __cmd_script(struct perf_script *script)
 		script->tool.ksymbol = process_bpf_events;
 		script->tool.bpf     = process_bpf_events;
 	}
+	if (script->show_text_poke_events) {
+		script->tool.ksymbol   = process_bpf_events;
+		script->tool.text_poke = process_text_poke_events;
+	}
 
 	if (perf_script__setup_per_event_dump(script)) {
 		pr_err("Couldn't create the per event dump files\n");
@@ -3452,6 +3469,8 @@ int cmd_script(int argc, const char **argv)
 		    "Show round events (if recorded)"),
 	OPT_BOOLEAN('\0', "show-bpf-events", &script.show_bpf_events,
 		    "Show bpf related events (if recorded)"),
+	OPT_BOOLEAN('\0', "show-text-poke-events", &script.show_text_poke_events,
+		    "Show text poke related events (if recorded)"),
 	OPT_BOOLEAN('\0', "per-event-dump", &script.per_event_dump,
 		    "Dump trace output to files named by the monitored events"),
 	OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH V6 15/15] perf script: Show text poke address symbol
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (13 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 14/15] perf script: Add option --show-text-poke-events Adrian Hunter
@ 2020-04-05 20:13 ` Adrian Hunter
  2020-04-20  6:21 ` [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
  2020-04-21 13:46 ` Peter Zijlstra
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-05 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

It is generally more useful to show the symbol with an address. In this
case, the print function requires the 'machine' which means changing
callers to provide it as a parameter. It is optional because most events
do not need it and the callers that matter can provide it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/builtin-script.c    |  4 ++--
 tools/perf/tests/perf-record.c |  4 ++--
 tools/perf/util/event.c        | 23 ++++++++++++++++++-----
 tools/perf/util/event.h        |  6 ++++--
 tools/perf/util/intel-pt.c     |  2 +-
 tools/perf/util/machine.c      |  2 +-
 6 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index e934bf46269d..f06470e8f07b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2168,7 +2168,7 @@ static int print_event_with_time(struct perf_tool *tool,
 					   event->header.type, stdout);
 	}
 
-	perf_event__fprintf(event, stdout);
+	perf_event__fprintf(machine, event, stdout);
 
 	thread__put(thread);
 
@@ -2303,7 +2303,7 @@ process_finished_round_event(struct perf_tool *tool __maybe_unused,
 			     struct ordered_events *oe __maybe_unused)
 
 {
-	perf_event__fprintf(event, stdout);
+	perf_event__fprintf(NULL, event, stdout);
 	return 0;
 }
 
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index 2195fc205e72..a187a70b4a98 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -185,14 +185,14 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus
 				err = perf_evlist__parse_sample(evlist, event, &sample);
 				if (err < 0) {
 					if (verbose > 0)
-						perf_event__fprintf(event, stderr);
+						perf_event__fprintf(NULL, event, stderr);
 					pr_debug("Couldn't parse sample\n");
 					goto out_delete_evlist;
 				}
 
 				if (verbose > 0) {
 					pr_info("%" PRIu64" %d ", sample.time, sample.cpu);
-					perf_event__fprintf(event, stderr);
+					perf_event__fprintf(NULL, event, stderr);
 				}
 
 				if (prev_time > sample.time) {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index cdaf58734254..e151df9b28d7 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -440,14 +440,26 @@ static int text_poke_printer(enum binary_printer_ops op, unsigned int val,
 	}
 }
 
-size_t perf_event__fprintf_text_poke(union perf_event *event, FILE *fp)
+size_t perf_event__fprintf_text_poke(struct machine *machine,
+				     union perf_event *event, FILE *fp)
 {
 	struct perf_record_text_poke_event *tp = &event->text_poke;
 	size_t ret;
 	bool old;
 
-	ret = fprintf(fp, " %" PRI_lx64 " old len %u new len %u\n",
-		      tp->addr, tp->old_len, tp->new_len);
+	ret = fprintf(fp, " %" PRI_lx64 " ", tp->addr);
+	if (machine) {
+		struct addr_location al;
+
+		al.map = maps__find(&machine->kmaps, tp->addr);
+		if (al.map && map__load(al.map) >= 0) {
+			al.addr = al.map->map_ip(al.map, tp->addr);
+			al.sym = map__find_symbol(al.map, al.addr);
+			if (al.sym)
+				ret += symbol__fprintf_symname_offs(al.sym, &al, fp);
+		}
+	}
+	ret += fprintf(fp, " old len %u new len %u\n", tp->old_len, tp->new_len);
 	old = true;
 	ret += binary__fprintf(tp->bytes, tp->old_len, 16, text_poke_printer,
 			       &old, fp);
@@ -457,7 +469,8 @@ size_t perf_event__fprintf_text_poke(union perf_event *event, FILE *fp)
 	return ret;
 }
 
-size_t perf_event__fprintf(union perf_event *event, FILE *fp)
+size_t perf_event__fprintf(struct machine *machine, union perf_event *event,
+			   FILE *fp)
 {
 	size_t ret = fprintf(fp, "PERF_RECORD_%s",
 			     perf_event__name(event->header.type));
@@ -502,7 +515,7 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
 		ret += perf_event__fprintf_bpf(event, fp);
 		break;
 	case PERF_RECORD_TEXT_POKE:
-		ret += perf_event__fprintf_text_poke(event, fp);
+		ret += perf_event__fprintf_text_poke(machine, event, fp);
 		break;
 	default:
 		ret += fprintf(fp, "\n");
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index dbbba6dea51d..611c8e5ad9d9 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -389,8 +389,10 @@ size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_cgroup(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_bpf(union perf_event *event, FILE *fp);
-size_t perf_event__fprintf_text_poke(union perf_event *event, FILE *fp);
-size_t perf_event__fprintf(union perf_event *event, FILE *fp);
+size_t perf_event__fprintf_text_poke(struct machine *machine,
+				     union perf_event *event, FILE *fp);
+size_t perf_event__fprintf(struct machine *machine, union perf_event *event,
+			   FILE *fp);
 
 int kallsyms__get_function_start(const char *kallsyms_filename,
 				 const char *symbol_name, u64 *addr);
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 470a3b413869..902f85ded944 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -230,7 +230,7 @@ static void intel_pt_log_event(union perf_event *event)
 	if (!intel_pt_enable_logging || !f)
 		return;
 
-	perf_event__fprintf(event, f);
+	perf_event__fprintf(NULL, event, f);
 }
 
 static void intel_pt_dump_sample(struct perf_session *session,
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 8447334ac056..4c023667a073 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -809,7 +809,7 @@ int machine__process_text_poke(struct machine *machine, union perf_event *event,
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 
 	if (dump_trace)
-		perf_event__fprintf_text_poke(event, stdout);
+		perf_event__fprintf_text_poke(machine, event, stdout);
 
 	if (!event->text_poke.new_len)
 		return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 00/15] perf/x86: Add perf text poke events
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (14 preceding siblings ...)
  2020-04-05 20:13 ` [PATCH V6 15/15] perf script: Show text poke address symbol Adrian Hunter
@ 2020-04-20  6:21 ` Adrian Hunter
  2020-04-21 13:46 ` Peter Zijlstra
  16 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-20  6:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

On 5/04/20 11:13 pm, Adrian Hunter wrote:
> Hi
> 
> Here are patches to add a text poke event to record changes to kernel text
> (i.e. self-modifying code) in order to support tracers like Intel PT
> decoding through jump labels, kprobes and ftrace trampolines.
> 
> The first 8 patches make the kernel changes and the subsequent patches are
> tools changes.
> 
> The next 4 patches add support for updating perf tools' data cache
> with the changed bytes.
> 
> The next patch is an Intel PT specific tools change.
> 
> The final 2 patches add perf script --show-text-poke-events option
> 
> Patches also here:
> 
> 	git://git.infradead.org/users/ahunter/linux-perf.git text_poke
> 
> Changes in V6
> 
>   perf: Add perf text poke event
> 
> 	Adjustments due to re-base on top of cgroup event patches
> 
>   perf/x86: Add perf text poke events for kprobes
> 
> 	Expand commit message
> 	Add Masami's Ack
> 
>   perf tools: Add support for PERF_RECORD_TEXT_POKE
> 
> 	Adjustments due to re-base on top of cgroup event patches
> 	Improvements to perf_event__fprintf_text_poke()
> 
>   perf script: Add option --show-text-poke-events
>   perf script: Show text poke address symbol
> 
> 	New patches
> 

Any comments?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-05 20:13 ` [PATCH V6 13/15] perf intel-pt: Add support for text poke events Adrian Hunter
@ 2020-04-21 12:50   ` Jiri Olsa
  2020-04-21 13:02     ` Adrian Hunter
  0 siblings, 1 reply; 32+ messages in thread
From: Jiri Olsa @ 2020-04-21 12:50 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On Sun, Apr 05, 2020 at 11:13:25PM +0300, Adrian Hunter wrote:
> Select text poke events when available and the kernel is being traced.
> Process text poke events to invalidate entries in Intel PT's instruction
> cache.
> 
> Example:
> 
>   The example requires kernel config:
>     CONFIG_PROC_SYSCTL=y
>     CONFIG_SCHED_DEBUG=y
>     CONFIG_SCHEDSTATS=y
> 
>   Before:
> 
>     # perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
>     # cat /proc/sys/kernel/sched_schedstats
>     0
>     # echo 1 > /proc/sys/kernel/sched_schedstats
>     # cat /proc/sys/kernel/sched_schedstats
>     1
>     # echo 0 > /proc/sys/kernel/sched_schedstats
>     # cat /proc/sys/kernel/sched_schedstats
>     0
>     # kill %1
>     [ perf record: Woken up 1 times to write data ]
>     [ perf record: Captured and wrote 3.341 MB perf.data.before ]
>     [1]+  Terminated                 perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
>     # perf script -i perf.data.before --itrace=e >/dev/null
>     Warning:
>     474 instruction trace errors
> 
>   After:
> 
>     # perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
>     # cat /proc/sys/kernel/sched_schedstats
>     0
>     # echo 1 > /proc/sys/kernel/sched_schedstats
>     # cat /proc/sys/kernel/sched_schedstats
>     1
>     # echo 0 > /proc/sys/kernel/sched_schedstats
>     # cat /proc/sys/kernel/sched_schedstats
>     0
>     # kill %1
>     [ perf record: Woken up 1 times to write data ]
>     [ perf record: Captured and wrote 2.646 MB perf.data.after ]
>     [1]+  Terminated                 perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
>     # perf script -i perf.data.after --itrace=e >/dev/null


I'm still seeing some, probably I'm missing some CONFIG, will check

	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
	[1] 5880
	# cat /proc/sys/kernel/sched_schedstats
	0
	# echo 1 > /proc/sys/kernel/sched_schedstats
	# cat /proc/sys/kernel/sched_schedstats
	1
	# echo 0 > /proc/sys/kernel/sched_schedstats
	# kill %1
	# [ perf record: Woken up 1 times to write data ]
	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
	# ./perf script --itrace=e -i perf.data.after > /dev/null
	Warning:
	18837 instruction trace errors

	# ./perf script --itrace=e -i perf.data.after | head
	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction

jirka


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-21 12:50   ` Jiri Olsa
@ 2020-04-21 13:02     ` Adrian Hunter
  2020-04-21 13:06       ` Jiri Olsa
  0 siblings, 1 reply; 32+ messages in thread
From: Adrian Hunter @ 2020-04-21 13:02 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On 21/04/20 3:50 pm, Jiri Olsa wrote:
> On Sun, Apr 05, 2020 at 11:13:25PM +0300, Adrian Hunter wrote:
>> Select text poke events when available and the kernel is being traced.
>> Process text poke events to invalidate entries in Intel PT's instruction
>> cache.
>>
>> Example:
>>
>>   The example requires kernel config:
>>     CONFIG_PROC_SYSCTL=y
>>     CONFIG_SCHED_DEBUG=y
>>     CONFIG_SCHEDSTATS=y
>>
>>   Before:
>>
>>     # perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
>>     # cat /proc/sys/kernel/sched_schedstats
>>     0
>>     # echo 1 > /proc/sys/kernel/sched_schedstats
>>     # cat /proc/sys/kernel/sched_schedstats
>>     1
>>     # echo 0 > /proc/sys/kernel/sched_schedstats
>>     # cat /proc/sys/kernel/sched_schedstats
>>     0
>>     # kill %1
>>     [ perf record: Woken up 1 times to write data ]
>>     [ perf record: Captured and wrote 3.341 MB perf.data.before ]
>>     [1]+  Terminated                 perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
>>     # perf script -i perf.data.before --itrace=e >/dev/null
>>     Warning:
>>     474 instruction trace errors
>>
>>   After:
>>
>>     # perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
>>     # cat /proc/sys/kernel/sched_schedstats
>>     0
>>     # echo 1 > /proc/sys/kernel/sched_schedstats
>>     # cat /proc/sys/kernel/sched_schedstats
>>     1
>>     # echo 0 > /proc/sys/kernel/sched_schedstats
>>     # cat /proc/sys/kernel/sched_schedstats
>>     0
>>     # kill %1
>>     [ perf record: Woken up 1 times to write data ]
>>     [ perf record: Captured and wrote 2.646 MB perf.data.after ]
>>     [1]+  Terminated                 perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
>>     # perf script -i perf.data.after --itrace=e >/dev/null
> 
> 
> I'm still seeing some, probably I'm missing some CONFIG, will check
> 
> 	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
> 	[1] 5880
> 	# cat /proc/sys/kernel/sched_schedstats
> 	0
> 	# echo 1 > /proc/sys/kernel/sched_schedstats
> 	# cat /proc/sys/kernel/sched_schedstats
> 	1
> 	# echo 0 > /proc/sys/kernel/sched_schedstats
> 	# kill %1
> 	# [ perf record: Woken up 1 times to write data ]
> 	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
> 	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
> 	# ./perf script --itrace=e -i perf.data.after > /dev/null
> 	Warning:
> 	18837 instruction trace errors
> 
> 	# ./perf script --itrace=e -i perf.data.after | head
> 	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
> 	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction

I don't think it can be CONFIG.  Can you share the branch you are using?
Then I can test it.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-21 13:02     ` Adrian Hunter
@ 2020-04-21 13:06       ` Jiri Olsa
  2020-04-21 13:10         ` Adrian Hunter
  0 siblings, 1 reply; 32+ messages in thread
From: Jiri Olsa @ 2020-04-21 13:06 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On Tue, Apr 21, 2020 at 04:02:32PM +0300, Adrian Hunter wrote:
> On 21/04/20 3:50 pm, Jiri Olsa wrote:
> > On Sun, Apr 05, 2020 at 11:13:25PM +0300, Adrian Hunter wrote:
> >> Select text poke events when available and the kernel is being traced.
> >> Process text poke events to invalidate entries in Intel PT's instruction
> >> cache.
> >>
> >> Example:
> >>
> >>   The example requires kernel config:
> >>     CONFIG_PROC_SYSCTL=y
> >>     CONFIG_SCHED_DEBUG=y
> >>     CONFIG_SCHEDSTATS=y
> >>
> >>   Before:
> >>
> >>     # perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
> >>     # cat /proc/sys/kernel/sched_schedstats
> >>     0
> >>     # echo 1 > /proc/sys/kernel/sched_schedstats
> >>     # cat /proc/sys/kernel/sched_schedstats
> >>     1
> >>     # echo 0 > /proc/sys/kernel/sched_schedstats
> >>     # cat /proc/sys/kernel/sched_schedstats
> >>     0
> >>     # kill %1
> >>     [ perf record: Woken up 1 times to write data ]
> >>     [ perf record: Captured and wrote 3.341 MB perf.data.before ]
> >>     [1]+  Terminated                 perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
> >>     # perf script -i perf.data.before --itrace=e >/dev/null
> >>     Warning:
> >>     474 instruction trace errors
> >>
> >>   After:
> >>
> >>     # perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
> >>     # cat /proc/sys/kernel/sched_schedstats
> >>     0
> >>     # echo 1 > /proc/sys/kernel/sched_schedstats
> >>     # cat /proc/sys/kernel/sched_schedstats
> >>     1
> >>     # echo 0 > /proc/sys/kernel/sched_schedstats
> >>     # cat /proc/sys/kernel/sched_schedstats
> >>     0
> >>     # kill %1
> >>     [ perf record: Woken up 1 times to write data ]
> >>     [ perf record: Captured and wrote 2.646 MB perf.data.after ]
> >>     [1]+  Terminated                 perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
> >>     # perf script -i perf.data.after --itrace=e >/dev/null
> > 
> > 
> > I'm still seeing some, probably I'm missing some CONFIG, will check
> > 
> > 	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
> > 	[1] 5880
> > 	# cat /proc/sys/kernel/sched_schedstats
> > 	0
> > 	# echo 1 > /proc/sys/kernel/sched_schedstats
> > 	# cat /proc/sys/kernel/sched_schedstats
> > 	1
> > 	# echo 0 > /proc/sys/kernel/sched_schedstats
> > 	# kill %1
> > 	# [ perf record: Woken up 1 times to write data ]
> > 	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
> > 	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
> > 	# ./perf script --itrace=e -i perf.data.after > /dev/null
> > 	Warning:
> > 	18837 instruction trace errors
> > 
> > 	# ./perf script --itrace=e -i perf.data.after | head
> > 	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
> > 	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
> > 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction
> 
> I don't think it can be CONFIG.  Can you share the branch you are using?
> Then I can test it.
> 

I built the one you mentioned in cover email:
  git://git.infradead.org/users/ahunter/linux-perf.git text_poke

jirka


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-21 13:06       ` Jiri Olsa
@ 2020-04-21 13:10         ` Adrian Hunter
  2020-04-21 13:35           ` Jiri Olsa
  0 siblings, 1 reply; 32+ messages in thread
From: Adrian Hunter @ 2020-04-21 13:10 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On 21/04/20 4:06 pm, Jiri Olsa wrote:
> On Tue, Apr 21, 2020 at 04:02:32PM +0300, Adrian Hunter wrote:
>> On 21/04/20 3:50 pm, Jiri Olsa wrote:
>>> On Sun, Apr 05, 2020 at 11:13:25PM +0300, Adrian Hunter wrote:
>>>> Select text poke events when available and the kernel is being traced.
>>>> Process text poke events to invalidate entries in Intel PT's instruction
>>>> cache.
>>>>
>>>> Example:
>>>>
>>>>   The example requires kernel config:
>>>>     CONFIG_PROC_SYSCTL=y
>>>>     CONFIG_SCHED_DEBUG=y
>>>>     CONFIG_SCHEDSTATS=y
>>>>
>>>>   Before:
>>>>
>>>>     # perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
>>>>     # cat /proc/sys/kernel/sched_schedstats
>>>>     0
>>>>     # echo 1 > /proc/sys/kernel/sched_schedstats
>>>>     # cat /proc/sys/kernel/sched_schedstats
>>>>     1
>>>>     # echo 0 > /proc/sys/kernel/sched_schedstats
>>>>     # cat /proc/sys/kernel/sched_schedstats
>>>>     0
>>>>     # kill %1
>>>>     [ perf record: Woken up 1 times to write data ]
>>>>     [ perf record: Captured and wrote 3.341 MB perf.data.before ]
>>>>     [1]+  Terminated                 perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
>>>>     # perf script -i perf.data.before --itrace=e >/dev/null
>>>>     Warning:
>>>>     474 instruction trace errors
>>>>
>>>>   After:
>>>>
>>>>     # perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
>>>>     # cat /proc/sys/kernel/sched_schedstats
>>>>     0
>>>>     # echo 1 > /proc/sys/kernel/sched_schedstats
>>>>     # cat /proc/sys/kernel/sched_schedstats
>>>>     1
>>>>     # echo 0 > /proc/sys/kernel/sched_schedstats
>>>>     # cat /proc/sys/kernel/sched_schedstats
>>>>     0
>>>>     # kill %1
>>>>     [ perf record: Woken up 1 times to write data ]
>>>>     [ perf record: Captured and wrote 2.646 MB perf.data.after ]
>>>>     [1]+  Terminated                 perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
>>>>     # perf script -i perf.data.after --itrace=e >/dev/null
>>>
>>>
>>> I'm still seeing some, probably I'm missing some CONFIG, will check
>>>
>>> 	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
>>> 	[1] 5880
>>> 	# cat /proc/sys/kernel/sched_schedstats
>>> 	0
>>> 	# echo 1 > /proc/sys/kernel/sched_schedstats
>>> 	# cat /proc/sys/kernel/sched_schedstats
>>> 	1
>>> 	# echo 0 > /proc/sys/kernel/sched_schedstats
>>> 	# kill %1
>>> 	# [ perf record: Woken up 1 times to write data ]
>>> 	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
>>> 	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
>>> 	# ./perf script --itrace=e -i perf.data.after > /dev/null
>>> 	Warning:
>>> 	18837 instruction trace errors
>>>
>>> 	# ./perf script --itrace=e -i perf.data.after | head
>>> 	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
>>> 	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
>>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction
>>
>> I don't think it can be CONFIG.  Can you share the branch you are using?
>> Then I can test it.
>>
> 
> I built the one you mentioned in cover email:
>   git://git.infradead.org/users/ahunter/linux-perf.git text_poke

The tools and the kernel?

Does it successfully decode anything? e.g.
./perf script --itrace=be -i perf.data.after | head


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-21 13:10         ` Adrian Hunter
@ 2020-04-21 13:35           ` Jiri Olsa
  2020-04-21 13:45             ` Adrian Hunter
  0 siblings, 1 reply; 32+ messages in thread
From: Jiri Olsa @ 2020-04-21 13:35 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On Tue, Apr 21, 2020 at 04:10:33PM +0300, Adrian Hunter wrote:

SNIP

> >>> I'm still seeing some, probably I'm missing some CONFIG, will check
> >>>
> >>> 	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
> >>> 	[1] 5880
> >>> 	# cat /proc/sys/kernel/sched_schedstats
> >>> 	0
> >>> 	# echo 1 > /proc/sys/kernel/sched_schedstats
> >>> 	# cat /proc/sys/kernel/sched_schedstats
> >>> 	1
> >>> 	# echo 0 > /proc/sys/kernel/sched_schedstats
> >>> 	# kill %1
> >>> 	# [ perf record: Woken up 1 times to write data ]
> >>> 	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
> >>> 	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
> >>> 	# ./perf script --itrace=e -i perf.data.after > /dev/null
> >>> 	Warning:
> >>> 	18837 instruction trace errors
> >>>
> >>> 	# ./perf script --itrace=e -i perf.data.after | head
> >>> 	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
> >>> 	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
> >>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction
> >>
> >> I don't think it can be CONFIG.  Can you share the branch you are using?
> >> Then I can test it.
> >>
> > 
> > I built the one you mentioned in cover email:
> >   git://git.infradead.org/users/ahunter/linux-perf.git text_poke
> 
> The tools and the kernel?

both from that branch

> 
> Does it successfully decode anything? e.g.
> ./perf script --itrace=be -i perf.data.after | head
> 

# ./perf script --itrace=be -i perf.data.after | head
            perf  5880 [000]  9274.419884:          1  branches:k:                 0 [unknown] ([unknown]) => ffffffff8106a4a6 native_write_msr+0x6 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8106a4ab native_write_msr+0xb (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff810152f9 pt_config_start+0x59 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff81015304 pt_config_start+0x64 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff81015648 pt_event_add+0x38 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8101565d pt_event_add+0x4d (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3ca event_sched_in.isra.0+0xea (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d3dc event_sched_in.isra.0+0xfc (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3e3 event_sched_in.isra.0+0x103 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d3f4 event_sched_in.isra.0+0x114 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d497 event_sched_in.isra.0+0x1b7 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d4c1 event_sched_in.isra.0+0x1e1 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff814f9560 __list_add_valid+0x0 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff814f9593 __list_add_valid+0x33 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d4c6 event_sched_in.isra.0+0x1e6 (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d4e2 event_sched_in.isra.0+0x202 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3fa event_sched_in.isra.0+0x11a (/lib/modules/5.6.0-rc6+/build/vmlinux)
            perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d401 event_sched_in.isra.0+0x121 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d40d event_sched_in.isra.0+0x12d (/lib/modules/5.6.0-rc6+/build/vmlinux)

# ./perf script --call-trace -i perf.data.after | head
            perf  5880 [000]  9274.419884188:  cbr:  8 freq:  798 MHz ( 36%)    
            perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __list_add_valid            
            perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          perf_pmu_enable.part.0      
            perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
            perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __x86_indirect_thunk_rax    
            perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
            perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __list_add_valid            
            perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          rb_next                     
            perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          merge_sched_in              
            perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax

jirka


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-05 20:13 ` [PATCH V6 08/15] ftrace: Add perf text poke " Adrian Hunter
@ 2020-04-21 13:45   ` Peter Zijlstra
  2020-04-22  7:39     ` Adrian Hunter
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2020-04-21 13:45 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

On Sun, Apr 05, 2020 at 11:13:20PM +0300, Adrian Hunter wrote:
> Add perf text poke events for ftrace trampolines when created and when
> freed.

Maybe also put in a little more detail on the various events. Because
arch_ftrace_update_trampoline() can also generate text_poke_bp() events,
to update an existing trampoline.

A diagram, like with the kprobes thing perhaps.

> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  kernel/trace/ftrace.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 398d58b8ffa5..3318f7eca76c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2792,6 +2792,13 @@ static void ftrace_trampoline_free(struct ftrace_ops *ops)
>  {
>  	if (ops && (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP) &&
>  	    ops->trampoline) {
> +		/*
> +		 * Record the text poke event before the ksymbol unregister
> +		 * event.
> +		 */
> +		perf_event_text_poke((void *)ops->trampoline,
> +				     (void *)ops->trampoline,
> +				     ops->trampoline_size, NULL, 0);
>  		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
>  				   ops->trampoline, ops->trampoline_size,
>  				   true, FTRACE_TRAMPOLINE_SYM);
> @@ -6816,6 +6823,13 @@ static void ftrace_update_trampoline(struct ftrace_ops *ops)
>  		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
>  				   ops->trampoline, ops->trampoline_size, false,
>  				   FTRACE_TRAMPOLINE_SYM);
> +		/*
> +		 * Record the perf text poke event after the ksymbol register
> +		 * event.
> +		 */
> +		perf_event_text_poke((void *)ops->trampoline, NULL, 0,
> +				     (void *)ops->trampoline,
> +				     ops->trampoline_size);
>  	}
>  }
>  
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-21 13:35           ` Jiri Olsa
@ 2020-04-21 13:45             ` Adrian Hunter
  2020-04-21 15:38               ` Adrian Hunter
  0 siblings, 1 reply; 32+ messages in thread
From: Adrian Hunter @ 2020-04-21 13:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On 21/04/20 4:35 pm, Jiri Olsa wrote:
> On Tue, Apr 21, 2020 at 04:10:33PM +0300, Adrian Hunter wrote:
> 
> SNIP
> 
>>>>> I'm still seeing some, probably I'm missing some CONFIG, will check
>>>>>
>>>>> 	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
>>>>> 	[1] 5880
>>>>> 	# cat /proc/sys/kernel/sched_schedstats
>>>>> 	0
>>>>> 	# echo 1 > /proc/sys/kernel/sched_schedstats
>>>>> 	# cat /proc/sys/kernel/sched_schedstats
>>>>> 	1
>>>>> 	# echo 0 > /proc/sys/kernel/sched_schedstats
>>>>> 	# kill %1
>>>>> 	# [ perf record: Woken up 1 times to write data ]
>>>>> 	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
>>>>> 	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
>>>>> 	# ./perf script --itrace=e -i perf.data.after > /dev/null
>>>>> 	Warning:
>>>>> 	18837 instruction trace errors
>>>>>
>>>>> 	# ./perf script --itrace=e -i perf.data.after | head
>>>>> 	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
>>>>> 	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
>>>>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction
>>>>
>>>> I don't think it can be CONFIG.  Can you share the branch you are using?
>>>> Then I can test it.
>>>>
>>>
>>> I built the one you mentioned in cover email:
>>>   git://git.infradead.org/users/ahunter/linux-perf.git text_poke
>>
>> The tools and the kernel?
> 
> both from that branch
> 
>>
>> Does it successfully decode anything? e.g.
>> ./perf script --itrace=be -i perf.data.after | head
>>
> 
> # ./perf script --itrace=be -i perf.data.after | head
>             perf  5880 [000]  9274.419884:          1  branches:k:                 0 [unknown] ([unknown]) => ffffffff8106a4a6 native_write_msr+0x6 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8106a4ab native_write_msr+0xb (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff810152f9 pt_config_start+0x59 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff81015304 pt_config_start+0x64 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff81015648 pt_event_add+0x38 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8101565d pt_event_add+0x4d (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3ca event_sched_in.isra.0+0xea (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d3dc event_sched_in.isra.0+0xfc (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3e3 event_sched_in.isra.0+0x103 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d3f4 event_sched_in.isra.0+0x114 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d497 event_sched_in.isra.0+0x1b7 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d4c1 event_sched_in.isra.0+0x1e1 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff814f9560 __list_add_valid+0x0 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff814f9593 __list_add_valid+0x33 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d4c6 event_sched_in.isra.0+0x1e6 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d4e2 event_sched_in.isra.0+0x202 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3fa event_sched_in.isra.0+0x11a (/lib/modules/5.6.0-rc6+/build/vmlinux)
>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d401 event_sched_in.isra.0+0x121 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d40d event_sched_in.isra.0+0x12d (/lib/modules/5.6.0-rc6+/build/vmlinux)
> 
> # ./perf script --call-trace -i perf.data.after | head
>             perf  5880 [000]  9274.419884188:  cbr:  8 freq:  798 MHz ( 36%)    
>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __list_add_valid            
>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          perf_pmu_enable.part.0      
>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __x86_indirect_thunk_rax    
>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __list_add_valid            
>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          rb_next                     
>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          merge_sched_in              
>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
> 

perf is not using kcore, which explains the errors.  Not sure why it prefers vmlinux.  It should effectively be doing:

perf script -i perf.data.after --itrace=e --kallsyms perf.data.after/kcore_dir/kallsyms

What is in the perf.data.after directory?






^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 00/15] perf/x86: Add perf text poke events
  2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
                   ` (15 preceding siblings ...)
  2020-04-20  6:21 ` [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
@ 2020-04-21 13:46 ` Peter Zijlstra
  16 siblings, 0 replies; 32+ messages in thread
From: Peter Zijlstra @ 2020-04-21 13:46 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

On Sun, Apr 05, 2020 at 11:13:12PM +0300, Adrian Hunter wrote:
> 
> Adrian Hunter (15):
>       perf: Add perf text poke event
>       perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers
>       kprobes: Add symbols for kprobe insn pages
>       kprobes: Add perf ksymbol events for kprobe insn pages
>       perf/x86: Add perf text poke events for kprobes
>       ftrace: Add symbols for ftrace trampolines
>       ftrace: Add perf ksymbol events for ftrace trampolines
>       ftrace: Add perf text poke events for ftrace trampolines

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 13/15] perf intel-pt: Add support for text poke events
  2020-04-21 13:45             ` Adrian Hunter
@ 2020-04-21 15:38               ` Adrian Hunter
  0 siblings, 0 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-21 15:38 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Arnaldo Carvalho de Melo, linux-kernel

On 21/04/20 4:45 pm, Adrian Hunter wrote:
> On 21/04/20 4:35 pm, Jiri Olsa wrote:
>> On Tue, Apr 21, 2020 at 04:10:33PM +0300, Adrian Hunter wrote:
>>
>> SNIP
>>
>>>>>> I'm still seeing some, probably I'm missing some CONFIG, will check
>>>>>>
>>>>>> 	# ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
>>>>>> 	[1] 5880
>>>>>> 	# cat /proc/sys/kernel/sched_schedstats
>>>>>> 	0
>>>>>> 	# echo 1 > /proc/sys/kernel/sched_schedstats
>>>>>> 	# cat /proc/sys/kernel/sched_schedstats
>>>>>> 	1
>>>>>> 	# echo 0 > /proc/sys/kernel/sched_schedstats
>>>>>> 	# kill %1
>>>>>> 	# [ perf record: Woken up 1 times to write data ]
>>>>>> 	[ perf record: Captured and wrote 6.181 MB perf.data.after ]
>>>>>> 	[1]+  Terminated              ./perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
>>>>>> 	# ./perf script --itrace=e -i perf.data.after > /dev/null
>>>>>> 	Warning:
>>>>>> 	18837 instruction trace errors
>>>>>>
>>>>>> 	# ./perf script --itrace=e -i perf.data.after | head
>>>>>> 	instruction trace error type 1 time 9274.420582345 cpu 9 pid 845 tid 845 ip 0xffffffff814e6cf2 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422229726 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a320 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422231972 cpu 39 pid 5880 tid 5880 ip 0xffffffffa0315c1c code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422236141 cpu 39 pid 5880 tid 5880 ip 0xffffffff81143263 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422237424 cpu 39 pid 5880 tid 5880 ip 0xffffffff8115c388 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffff811428c9 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422239028 cpu 39 pid 5880 tid 5880 ip 0xffffffffa13279fb code 5: Failed to get instruction
>>>>>> 	instruction trace error type 1 time 9274.422242556 cpu 39 pid 5880 tid 5880 ip 0xffffffff814e9c73 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffff810d7da2 code 6: Trace doesn't match instruction
>>>>>> 	instruction trace error type 1 time 9274.422258915 cpu 39 pid 5880 tid 5880 ip 0xffffffffa030a0b2 code 6: Trace doesn't match instruction
>>>>>
>>>>> I don't think it can be CONFIG.  Can you share the branch you are using?
>>>>> Then I can test it.
>>>>>
>>>>
>>>> I built the one you mentioned in cover email:
>>>>   git://git.infradead.org/users/ahunter/linux-perf.git text_poke
>>>
>>> The tools and the kernel?
>>
>> both from that branch
>>
>>>
>>> Does it successfully decode anything? e.g.
>>> ./perf script --itrace=be -i perf.data.after | head
>>>
>>
>> # ./perf script --itrace=be -i perf.data.after | head
>>             perf  5880 [000]  9274.419884:          1  branches:k:                 0 [unknown] ([unknown]) => ffffffff8106a4a6 native_write_msr+0x6 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8106a4ab native_write_msr+0xb (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff810152f9 pt_config_start+0x59 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff81015304 pt_config_start+0x64 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff81015648 pt_event_add+0x38 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8101565d pt_event_add+0x4d (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3ca event_sched_in.isra.0+0xea (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d3dc event_sched_in.isra.0+0xfc (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3e3 event_sched_in.isra.0+0x103 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d3f4 event_sched_in.isra.0+0x114 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d497 event_sched_in.isra.0+0x1b7 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d4c1 event_sched_in.isra.0+0x1e1 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff814f9560 __list_add_valid+0x0 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff814f9593 __list_add_valid+0x33 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d4c6 event_sched_in.isra.0+0x1e6 (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d4e2 event_sched_in.isra.0+0x202 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d3fa event_sched_in.isra.0+0x11a (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>             perf  5880 [000]  9274.419884:          1  branches:k:  ffffffff8121d401 event_sched_in.isra.0+0x121 (/lib/modules/5.6.0-rc6+/build/vmlinux) => ffffffff8121d40d event_sched_in.isra.0+0x12d (/lib/modules/5.6.0-rc6+/build/vmlinux)
>>
>> # ./perf script --call-trace -i perf.data.after | head
>>             perf  5880 [000]  9274.419884188:  cbr:  8 freq:  798 MHz ( 36%)    
>>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __list_add_valid            
>>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          perf_pmu_enable.part.0      
>>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
>>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __x86_indirect_thunk_rax    
>>             perf  5880 [000]  9274.419884389: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
>>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          __list_add_valid            
>>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          rb_next                     
>>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )          merge_sched_in              
>>             perf  5880 [000]  9274.419884709: (/lib/modules/5.6.0-rc6+/build/vmlinux             )              __x86_indirect_thunk_rax
>>
> 
> perf is not using kcore, which explains the errors.  Not sure why it prefers vmlinux.  It should effectively be doing:
> 
> perf script -i perf.data.after --itrace=e --kallsyms perf.data.after/kcore_dir/kallsyms
> 
> What is in the perf.data.after directory?

I played around but have not yet figured out why perf prefers vmlinux on
your system, but I will investigate more.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-21 13:45   ` Peter Zijlstra
@ 2020-04-22  7:39     ` Adrian Hunter
  2020-04-22 11:46       ` Peter Zijlstra
  0 siblings, 1 reply; 32+ messages in thread
From: Adrian Hunter @ 2020-04-22  7:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

On 21/04/20 4:45 pm, Peter Zijlstra wrote:
> On Sun, Apr 05, 2020 at 11:13:20PM +0300, Adrian Hunter wrote:
>> Add perf text poke events for ftrace trampolines when created and when
>> freed.
> 
> Maybe also put in a little more detail on the various events. Because
> arch_ftrace_update_trampoline() can also generate text_poke_bp() events,
> to update an existing trampoline.
> 
> A diagram, like with the kprobes thing perhaps.

How about adding this:

There can be 3 text_poke events for ftrace trampolines:

1. NULL -> trampoline
   By ftrace_update_trampoline() when !ops->trampoline
   Trampoline created

2. [e.g. on x86] CALL rel32 -> CALL rel32
   By arch_ftrace_update_trampoline() when ops->trampoline and
                        ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
   [e.g. on x86] via text_poke_bp() which generates text poke events
   Trampoline-called function target updated

3. trampoline -> NULL
   By ftrace_trampoline_free() when ops->trampoline and
                 ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
   Trampoline freed

> 
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  kernel/trace/ftrace.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index 398d58b8ffa5..3318f7eca76c 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -2792,6 +2792,13 @@ static void ftrace_trampoline_free(struct ftrace_ops *ops)
>>  {
>>  	if (ops && (ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP) &&
>>  	    ops->trampoline) {
>> +		/*
>> +		 * Record the text poke event before the ksymbol unregister
>> +		 * event.
>> +		 */
>> +		perf_event_text_poke((void *)ops->trampoline,
>> +				     (void *)ops->trampoline,
>> +				     ops->trampoline_size, NULL, 0);
>>  		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
>>  				   ops->trampoline, ops->trampoline_size,
>>  				   true, FTRACE_TRAMPOLINE_SYM);
>> @@ -6816,6 +6823,13 @@ static void ftrace_update_trampoline(struct ftrace_ops *ops)
>>  		perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_OOL,
>>  				   ops->trampoline, ops->trampoline_size, false,
>>  				   FTRACE_TRAMPOLINE_SYM);
>> +		/*
>> +		 * Record the perf text poke event after the ksymbol register
>> +		 * event.
>> +		 */
>> +		perf_event_text_poke((void *)ops->trampoline, NULL, 0,
>> +				     (void *)ops->trampoline,
>> +				     ops->trampoline_size);
>>  	}
>>  }
>>  
>> -- 
>> 2.17.1
>>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-22  7:39     ` Adrian Hunter
@ 2020-04-22 11:46       ` Peter Zijlstra
  2020-04-27  7:23         ` Adrian Hunter
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2020-04-22 11:46 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Masami Hiramatsu, Steven Rostedt, Borislav Petkov,
	H . Peter Anvin, x86, Mark Rutland, Alexander Shishkin,
	Mathieu Poirier, Leo Yan, Arnaldo Carvalho de Melo, Jiri Olsa,
	linux-kernel

On Wed, Apr 22, 2020 at 10:39:06AM +0300, Adrian Hunter wrote:
> On 21/04/20 4:45 pm, Peter Zijlstra wrote:
> > On Sun, Apr 05, 2020 at 11:13:20PM +0300, Adrian Hunter wrote:
> >> Add perf text poke events for ftrace trampolines when created and when
> >> freed.
> > 
> > Maybe also put in a little more detail on the various events. Because
> > arch_ftrace_update_trampoline() can also generate text_poke_bp() events,
> > to update an existing trampoline.
> > 
> > A diagram, like with the kprobes thing perhaps.
> 
> How about adding this:
> 
> There can be 3 text_poke events for ftrace trampolines:
> 
> 1. NULL -> trampoline
>    By ftrace_update_trampoline() when !ops->trampoline
>    Trampoline created
> 
> 2. [e.g. on x86] CALL rel32 -> CALL rel32
>    By arch_ftrace_update_trampoline() when ops->trampoline and
>                         ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
>    [e.g. on x86] via text_poke_bp() which generates text poke events
>    Trampoline-called function target updated
> 
> 3. trampoline -> NULL
>    By ftrace_trampoline_free() when ops->trampoline and
>                  ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
>    Trampoline freed

Yes, very nice. Thanks!

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-22 11:46       ` Peter Zijlstra
@ 2020-04-27  7:23         ` Adrian Hunter
  2020-04-27 18:34           ` Steven Rostedt
  2020-04-27 20:10           ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 32+ messages in thread
From: Adrian Hunter @ 2020-04-27  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan, Jiri Olsa,
	linux-kernel

On 22/04/20 2:46 pm, Peter Zijlstra wrote:
> On Wed, Apr 22, 2020 at 10:39:06AM +0300, Adrian Hunter wrote:
>> On 21/04/20 4:45 pm, Peter Zijlstra wrote:
>>> On Sun, Apr 05, 2020 at 11:13:20PM +0300, Adrian Hunter wrote:
>>>> Add perf text poke events for ftrace trampolines when created and when
>>>> freed.
>>>
>>> Maybe also put in a little more detail on the various events. Because
>>> arch_ftrace_update_trampoline() can also generate text_poke_bp() events,
>>> to update an existing trampoline.
>>>
>>> A diagram, like with the kprobes thing perhaps.
>>
>> How about adding this:
>>
>> There can be 3 text_poke events for ftrace trampolines:
>>
>> 1. NULL -> trampoline
>>    By ftrace_update_trampoline() when !ops->trampoline
>>    Trampoline created
>>
>> 2. [e.g. on x86] CALL rel32 -> CALL rel32
>>    By arch_ftrace_update_trampoline() when ops->trampoline and
>>                         ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
>>    [e.g. on x86] via text_poke_bp() which generates text poke events
>>    Trampoline-called function target updated
>>
>> 3. trampoline -> NULL
>>    By ftrace_trampoline_free() when ops->trampoline and
>>                  ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
>>    Trampoline freed
> 
> Yes, very nice. Thanks!

Arnaldo, do you have this patchset on your radar?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-27  7:23         ` Adrian Hunter
@ 2020-04-27 18:34           ` Steven Rostedt
  2020-04-28 16:47             ` Peter Zijlstra
  2020-04-27 20:10           ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2020-04-27 18:34 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Masami Hiramatsu, Borislav Petkov, H . Peter Anvin, x86,
	Mark Rutland, Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Jiri Olsa, linux-kernel

On Mon, 27 Apr 2020 10:23:05 +0300
Adrian Hunter <adrian.hunter@intel.com> wrote:

> On 22/04/20 2:46 pm, Peter Zijlstra wrote:
> > On Wed, Apr 22, 2020 at 10:39:06AM +0300, Adrian Hunter wrote:  
> >> On 21/04/20 4:45 pm, Peter Zijlstra wrote:  
> >>> On Sun, Apr 05, 2020 at 11:13:20PM +0300, Adrian Hunter wrote:  
> >>>> Add perf text poke events for ftrace trampolines when created and when
> >>>> freed.  
> >>>
> >>> Maybe also put in a little more detail on the various events. Because
> >>> arch_ftrace_update_trampoline() can also generate text_poke_bp() events,
> >>> to update an existing trampoline.
> >>>
> >>> A diagram, like with the kprobes thing perhaps.  
> >>
> >> How about adding this:
> >>
> >> There can be 3 text_poke events for ftrace trampolines:
> >>
> >> 1. NULL -> trampoline
> >>    By ftrace_update_trampoline() when !ops->trampoline
> >>    Trampoline created
> >>
> >> 2. [e.g. on x86] CALL rel32 -> CALL rel32
> >>    By arch_ftrace_update_trampoline() when ops->trampoline and
> >>                         ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
> >>    [e.g. on x86] via text_poke_bp() which generates text poke events
> >>    Trampoline-called function target updated
> >>
> >> 3. trampoline -> NULL
> >>    By ftrace_trampoline_free() when ops->trampoline and
> >>                  ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
> >>    Trampoline freed  
> > 
> > Yes, very nice. Thanks!  
> 
> Arnaldo, do you have this patchset on your radar?

Arnaldo deals with the userspace perf code. This looks like it needs to go
through the x86 tree.

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-27  7:23         ` Adrian Hunter
  2020-04-27 18:34           ` Steven Rostedt
@ 2020-04-27 20:10           ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 32+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-04-27 20:10 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Peter Zijlstra, Ingo Molnar, Masami Hiramatsu, Steven Rostedt,
	Borislav Petkov, H . Peter Anvin, x86, Mark Rutland,
	Alexander Shishkin, Mathieu Poirier, Leo Yan, Jiri Olsa,
	linux-kernel

Em Mon, Apr 27, 2020 at 10:23:05AM +0300, Adrian Hunter escreveu:
> On 22/04/20 2:46 pm, Peter Zijlstra wrote:
> > On Wed, Apr 22, 2020 at 10:39:06AM +0300, Adrian Hunter wrote:
> >> On 21/04/20 4:45 pm, Peter Zijlstra wrote:
> >>> On Sun, Apr 05, 2020 at 11:13:20PM +0300, Adrian Hunter wrote:
> >>>> Add perf text poke events for ftrace trampolines when created and when
> >>>> freed.
> >>>
> >>> Maybe also put in a little more detail on the various events. Because
> >>> arch_ftrace_update_trampoline() can also generate text_poke_bp() events,
> >>> to update an existing trampoline.
> >>>
> >>> A diagram, like with the kprobes thing perhaps.
> >>
> >> How about adding this:
> >>
> >> There can be 3 text_poke events for ftrace trampolines:
> >>
> >> 1. NULL -> trampoline
> >>    By ftrace_update_trampoline() when !ops->trampoline
> >>    Trampoline created
> >>
> >> 2. [e.g. on x86] CALL rel32 -> CALL rel32
> >>    By arch_ftrace_update_trampoline() when ops->trampoline and
> >>                         ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
> >>    [e.g. on x86] via text_poke_bp() which generates text poke events
> >>    Trampoline-called function target updated
> >>
> >> 3. trampoline -> NULL
> >>    By ftrace_trampoline_free() when ops->trampoline and
> >>                  ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP
> >>    Trampoline freed
> > 
> > Yes, very nice. Thanks!
> 
> Arnaldo, do you have this patchset on your radar?

Now it is, will look it into it,

- Arnaldo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH V6 08/15] ftrace: Add perf text poke events for ftrace trampolines
  2020-04-27 18:34           ` Steven Rostedt
@ 2020-04-28 16:47             ` Peter Zijlstra
  0 siblings, 0 replies; 32+ messages in thread
From: Peter Zijlstra @ 2020-04-28 16:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Ingo Molnar,
	Masami Hiramatsu, Borislav Petkov, H . Peter Anvin, x86,
	Mark Rutland, Alexander Shishkin, Mathieu Poirier, Leo Yan,
	Jiri Olsa, linux-kernel

On Mon, Apr 27, 2020 at 02:34:42PM -0400, Steven Rostedt wrote:
> Arnaldo deals with the userspace perf code. This looks like it needs to go
> through the x86 tree.

Sometimes, when a series crosses both, it'll go through either tree.

I'm fine with picking up the kernel bits I suppose, but then we need to
be careful to have them land in order or something. Whatever people
want.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2020-04-28 16:47 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-05 20:13 [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 01/15] perf: Add perf text poke event Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 02/15] perf/x86: Add support for perf text poke event for text_poke_bp_batch() callers Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 03/15] kprobes: Add symbols for kprobe insn pages Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 04/15] kprobes: Add perf ksymbol events " Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 05/15] perf/x86: Add perf text poke events for kprobes Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 06/15] ftrace: Add symbols for ftrace trampolines Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 07/15] ftrace: Add perf ksymbol events " Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 08/15] ftrace: Add perf text poke " Adrian Hunter
2020-04-21 13:45   ` Peter Zijlstra
2020-04-22  7:39     ` Adrian Hunter
2020-04-22 11:46       ` Peter Zijlstra
2020-04-27  7:23         ` Adrian Hunter
2020-04-27 18:34           ` Steven Rostedt
2020-04-28 16:47             ` Peter Zijlstra
2020-04-27 20:10           ` Arnaldo Carvalho de Melo
2020-04-05 20:13 ` [PATCH V6 09/15] perf kcore_copy: Fix module map when there are no modules loaded Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 10/15] perf evlist: Disable 'immediate' events last Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 11/15] perf tools: Add support for PERF_RECORD_TEXT_POKE Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 12/15] perf tools: Add support for PERF_RECORD_KSYMBOL_TYPE_OOL Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 13/15] perf intel-pt: Add support for text poke events Adrian Hunter
2020-04-21 12:50   ` Jiri Olsa
2020-04-21 13:02     ` Adrian Hunter
2020-04-21 13:06       ` Jiri Olsa
2020-04-21 13:10         ` Adrian Hunter
2020-04-21 13:35           ` Jiri Olsa
2020-04-21 13:45             ` Adrian Hunter
2020-04-21 15:38               ` Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 14/15] perf script: Add option --show-text-poke-events Adrian Hunter
2020-04-05 20:13 ` [PATCH V6 15/15] perf script: Show text poke address symbol Adrian Hunter
2020-04-20  6:21 ` [PATCH V6 00/15] perf/x86: Add perf text poke events Adrian Hunter
2020-04-21 13:46 ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).