linux-perf-users.vger.kernel.org archive mirror
* [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
@ 2021-06-27 13:18 Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 01/10] " Adrian Hunter
                   ` (12 more replies)
  0 siblings, 13 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Hi
 
In some cases, users want to filter very large amounts of data (e.g. from
AUX area tracing like Intel PT) looking for something specific. While a
scripting language such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written in C,
built as shared objects, and loaded by perf script.
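
For illustration only, here is a rough sketch of a filter that keeps per-run
state, using the 'start', 'filter_event' and 'stop' entry points that patch 1
documents. The two structure and enum definitions are abridged stand-ins so
that the snippet compiles by itself; they are an assumption of this sketch and
do not match the real layout, so a real filter would include
<perf/perf_dlfilter.h> instead. 'struct call_stats' is a hypothetical name.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * ASSUMPTION: abridged stand-ins so this sketch compiles on its own. They do
 * NOT match the real structure layout; a real filter would instead
 * #include <perf/perf_dlfilter.h> and drop these definitions.
 */
enum { PERF_DLFILTER_FLAG_CALL = 1 << 1 };

struct perf_dlfilter_sample {
	uint32_t size;
	uint32_t flags;
};

/* Hypothetical per-run state, allocated in start() and freed in stop() */
struct call_stats {
	uint64_t seen;
	uint64_t kept;
};

int start(void **data, void *ctx)
{
	struct call_stats *st = calloc(1, sizeof(*st));

	(void)ctx;		/* not needed by this sketch */
	if (!st)
		return -1;	/* a negative value reports an error */
	*data = st;		/* handed back to filter_event() and stop() */
	return 0;
}

int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	struct call_stats *st = data;

	(void)ctx;
	st->seen++;
	if (!(sample->flags & PERF_DLFILTER_FLAG_CALL))
		return 1;	/* 1 filters the sample out */
	st->kept++;
	return 0;		/* 0 keeps the sample */
}

int stop(void *data, void *ctx)
{
	struct call_stats *st = data;

	(void)ctx;
	fprintf(stderr, "kept %llu of %llu samples\n",
		(unsigned long long)st->kept, (unsigned long long)st->seen);
	free(st);
	return 0;
}
```

With the real header in place of the stand-ins, such a file would be built and
loaded in the same way as the example in the perf-dlfilter documentation added
by patch 1.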

This is V2.

The main patch is patch 1.

The other patches add more functionality, except for patch 5 which installs
the C API header file.


Changes in V2:
    perf script: Move filter_cpu() earlier
    perf script: Move filtering before scripting
    perf script: Share addr_al between functions
	Dropped because they have now been applied.

    perf script: Add API for filtering via dynamically loaded shared object
	Move 2 members of struct perf_dlfilter_sample
	Add 'ctx' as an argument to 'start' and 'stop'
	Find dlfilter .so files in current directory or exec-path/dlfilters

    perf script: Add option to list dlfilters
	New patch

    perf script: Add option to pass arguments to dlfilters
	New patch


Adrian Hunter (10):
      perf script: Add API for filtering via dynamically loaded shared object
      perf script: Add dlfilter__filter_event_early()
      perf script: Add option to list dlfilters
      perf script: Add option to pass arguments to dlfilters
      perf build: Install perf_dlfilter.h
      perf dlfilter: Add resolve_address() to perf_dlfilter_fns
      perf dlfilter: Add insn() to perf_dlfilter_fns
      perf dlfilter: Add srcline() to perf_dlfilter_fns
      perf dlfilter: Add attr() to perf_dlfilter_fns
      perf dlfilter: Add object_code() to perf_dlfilter_fns

 tools/perf/Documentation/perf-dlfilter.txt | 251 ++++++++++++
 tools/perf/Documentation/perf-script.txt   |  15 +-
 tools/perf/Makefile.config                 |   3 +
 tools/perf/Makefile.perf                   |   4 +-
 tools/perf/builtin-script.c                |  86 +++-
 tools/perf/util/Build                      |   1 +
 tools/perf/util/dlfilter.c                 | 615 +++++++++++++++++++++++++++++
 tools/perf/util/dlfilter.h                 |  97 +++++
 tools/perf/util/perf_dlfilter.h            | 150 +++++++
 9 files changed, 1211 insertions(+), 11 deletions(-)
 create mode 100644 tools/perf/Documentation/perf-dlfilter.txt
 create mode 100644 tools/perf/util/dlfilter.c
 create mode 100644 tools/perf/util/dlfilter.h
 create mode 100644 tools/perf/util/perf_dlfilter.h


Regards
Adrian


* [PATCH V2 01/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 02/10] perf script: Add dlfilter__filter_event_early() Adrian Hunter
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

In some cases, users want to filter very large amounts of data (e.g. from
AUX area tracing like Intel PT) looking for something specific. While a
scripting language such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written in C,
built as shared objects, and loaded by perf script.
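
As a rough restatement of the return-value convention, the check that this
patch adds to process_sample_event() behaves like the helper below: zero lets
processing continue, a positive value drops the sample without error, and a
negative value is propagated as an error. The helper name and its
'process_sample' parameter are hypothetical and are not part of the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch) that restates the check added
 * to process_sample_event(): a positive filter result drops the sample
 * without error, a negative result is returned to the caller, and zero lets
 * processing continue.
 */
static int apply_filter_result(int filter_ret, bool *process_sample)
{
	*process_sample = (filter_ret == 0);
	return filter_ret > 0 ? 0 : filter_ret;
}
```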

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt | 222 ++++++++++++++
 tools/perf/Documentation/perf-script.txt   |   7 +-
 tools/perf/builtin-script.c                |  25 +-
 tools/perf/util/Build                      |   1 +
 tools/perf/util/dlfilter.c                 | 330 +++++++++++++++++++++
 tools/perf/util/dlfilter.h                 |  74 +++++
 tools/perf/util/perf_dlfilter.h            | 123 ++++++++
 7 files changed, 780 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/Documentation/perf-dlfilter.txt
 create mode 100644 tools/perf/util/dlfilter.c
 create mode 100644 tools/perf/util/dlfilter.h
 create mode 100644 tools/perf/util/perf_dlfilter.h

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
new file mode 100644
index 000000000000..15d5f4b01c97
--- /dev/null
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -0,0 +1,222 @@
+perf-dlfilter(1)
+================
+
+NAME
+----
+perf-dlfilter - Filter sample events using a dynamically loaded shared
+object file
+
+SYNOPSIS
+--------
+[verse]
+'perf script' [--dlfilter file.so ]
+
+DESCRIPTION
+-----------
+
+This option is used to process data through a custom filter provided by a
+dynamically loaded shared object file.
+
+If 'file.so' does not contain "/", then it will be found either in the current
+directory, or perf tools exec path which is ~/libexec/perf-core/dlfilters for
+a local build and install (refer perf --exec-path), or the dynamic linker
+paths.
+
+API
+---
+
+The API for filtering consists of the following:
+
+[source,c]
+----
+#include <perf/perf_dlfilter.h>
+
+const struct perf_dlfilter_fns perf_dlfilter_fns;
+
+int start(void **data, void *ctx);
+int stop(void *data, void *ctx);
+int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
+----
+
+If implemented, 'start' will be called at the beginning, before any
+calls to 'filter_event' . Return 0 to indicate success,
+or return a negative error code. '*data' can be assigned for use by other
+functions. 'ctx' is needed for calls to perf_dlfilter_fns, but most
+perf_dlfilter_fns are not valid when called from 'start'.
+
+If implemented, 'stop' will be called at the end, after any calls to
+'filter_event'. Return 0 to indicate success, or
+return a negative error code. 'data' is set by 'start'. 'ctx' is needed
+for calls to perf_dlfilter_fns, but most perf_dlfilter_fns are not valid
+when called from 'stop'.
+
+If implemented, 'filter_event' will be called for each sample event.
+Return 0 to keep the sample event, 1 to filter it out, or return a negative
+error code. 'data' is set by 'start'. 'ctx' is needed for calls to
+'perf_dlfilter_fns'.
+
+The perf_dlfilter_sample structure
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+'filter_event' is passed a perf_dlfilter_sample
+structure, which contains the following fields:
+[source,c]
+----
+/*
+ * perf sample event information (as per perf script and <linux/perf_event.h>)
+ */
+struct perf_dlfilter_sample {
+	__u32 size; /* Size of this structure (for compatibility checking) */
+	__u16 ins_lat;		/* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */
+	__u16 p_stage_cyc;	/* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */
+	__u64 ip;
+	__s32 pid;
+	__s32 tid;
+	__u64 time;
+	__u64 addr;
+	__u64 id;
+	__u64 stream_id;
+	__u64 period;
+	__u64 weight;		/* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */
+	__u64 transaction;	/* Refer PERF_SAMPLE_TRANSACTION in <linux/perf_event.h> */
+	__u64 insn_cnt;	/* For instructions-per-cycle (IPC) */
+	__u64 cyc_cnt;		/* For instructions-per-cycle (IPC) */
+	__s32 cpu;
+	__u32 flags;		/* Refer PERF_DLFILTER_FLAG_* above */
+	__u64 data_src;		/* Refer PERF_SAMPLE_DATA_SRC in <linux/perf_event.h> */
+	__u64 phys_addr;	/* Refer PERF_SAMPLE_PHYS_ADDR in <linux/perf_event.h> */
+	__u64 data_page_size;	/* Refer PERF_SAMPLE_DATA_PAGE_SIZE in <linux/perf_event.h> */
+	__u64 code_page_size;	/* Refer PERF_SAMPLE_CODE_PAGE_SIZE in <linux/perf_event.h> */
+	__u64 cgroup;		/* Refer PERF_SAMPLE_CGROUP in <linux/perf_event.h> */
+	__u8  cpumode;		/* Refer CPUMODE_MASK etc in <linux/perf_event.h> */
+	__u8  addr_correlates_sym; /* True => resolve_addr() can be called */
+	__u16 misc;		/* Refer perf_event_header in <linux/perf_event.h> */
+	__u32 raw_size;		/* Refer PERF_SAMPLE_RAW in <linux/perf_event.h> */
+	const void *raw_data;	/* Refer PERF_SAMPLE_RAW in <linux/perf_event.h> */
+	__u64 brstack_nr;	/* Number of brstack entries */
+	const struct perf_branch_entry *brstack; /* Refer <linux/perf_event.h> */
+	__u64 raw_callchain_nr;	/* Number of raw_callchain entries */
+	const __u64 *raw_callchain; /* Refer <linux/perf_event.h> */
+	const char *event;
+};
+----
+
+The perf_dlfilter_fns structure
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The 'perf_dlfilter_fns' structure is populated with function pointers when the
+file is loaded. The functions can be called by 'filter_event'.
+
+[source,c]
+----
+struct perf_dlfilter_fns {
+	const struct perf_dlfilter_al *(*resolve_ip)(void *ctx);
+	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
+	void *(*reserved[126])(void *);
+};
+----
+
+'resolve_ip' returns information about ip.
+
+'resolve_addr' returns information about addr (if addr_correlates_sym).
+
+The perf_dlfilter_al structure
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The 'perf_dlfilter_al' structure contains information about an address.
+
+[source,c]
+----
+/*
+ * Address location (as per perf script)
+ */
+struct perf_dlfilter_al {
+	__u32 size; /* Size of this structure (for compatibility checking) */
+	__u32 symoff;
+	const char *sym;
+	__u64 addr; /* Mapped address (from dso) */
+	__u64 sym_start;
+	__u64 sym_end;
+	const char *dso;
+	__u8  sym_binding; /* STB_LOCAL, STB_GLOBAL or STB_WEAK, refer <elf.h> */
+	__u8  is_64_bit; /* Only valid if dso is not NULL */
+	__u8  is_kernel_ip; /* True if in kernel space */
+	__u32 buildid_size;
+	__u8 *buildid;
+	/* Below members are only populated by resolve_ip() */
+	__u8 filtered; /* true if this sample event will be filtered out */
+	const char *comm;
+};
+----
+
+perf_dlfilter_sample flags
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The 'flags' member of 'perf_dlfilter_sample' corresponds with the flags field
+of perf script. The bits of the flags are as follows:
+
+[source,c]
+----
+/* Definitions for perf_dlfilter_sample flags */
+enum {
+	PERF_DLFILTER_FLAG_BRANCH	= 1ULL << 0,
+	PERF_DLFILTER_FLAG_CALL		= 1ULL << 1,
+	PERF_DLFILTER_FLAG_RETURN	= 1ULL << 2,
+	PERF_DLFILTER_FLAG_CONDITIONAL	= 1ULL << 3,
+	PERF_DLFILTER_FLAG_SYSCALLRET	= 1ULL << 4,
+	PERF_DLFILTER_FLAG_ASYNC	= 1ULL << 5,
+	PERF_DLFILTER_FLAG_INTERRUPT	= 1ULL << 6,
+	PERF_DLFILTER_FLAG_TX_ABORT	= 1ULL << 7,
+	PERF_DLFILTER_FLAG_TRACE_BEGIN	= 1ULL << 8,
+	PERF_DLFILTER_FLAG_TRACE_END	= 1ULL << 9,
+	PERF_DLFILTER_FLAG_IN_TX	= 1ULL << 10,
+	PERF_DLFILTER_FLAG_VMENTRY	= 1ULL << 11,
+	PERF_DLFILTER_FLAG_VMEXIT	= 1ULL << 12,
+};
+----
+
+EXAMPLE
+-------
+
+Filter out everything except branches from "foo" to "bar":
+
+[source,c]
+----
+#include <perf/perf_dlfilter.h>
+#include <string.h>
+
+const struct perf_dlfilter_fns perf_dlfilter_fns;
+
+int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
+{
+	const struct perf_dlfilter_al *al;
+	const struct perf_dlfilter_al *addr_al;
+
+	if (!sample->ip || !sample->addr_correlates_sym)
+		return 1;
+
+	al = perf_dlfilter_fns.resolve_ip(ctx);
+	if (!al || !al->sym || strcmp(al->sym, "foo"))
+		return 1;
+
+	addr_al = perf_dlfilter_fns.resolve_addr(ctx);
+	if (!addr_al || !addr_al->sym || strcmp(addr_al->sym, "bar"))
+		return 1;
+
+	return 0;
+}
+----
+
+To build the shared object, assuming perf has been installed for the local user
+i.e. perf_dlfilter.h is in ~/include/perf :
+
+	gcc -c -I ~/include -fpic dlfilter-example.c
+	gcc -shared -o dlfilter-example.so dlfilter-example.o
+
+To use the filter with perf script:
+
+	perf script --dlfilter dlfilter-example.so
+
+SEE ALSO
+--------
+linkperf:perf-script[1]
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 48a5f5b26dd4..2306c81b606b 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -98,6 +98,10 @@ OPTIONS
         Generate perf-script.[ext] starter script for given language,
         using current perf.data.
 
+--dlfilter=<file>::
+	Filter sample events using the given shared object file.
+	Refer linkperf:perf-dlfilter[1]
+
 -a::
         Force system-wide collection.  Scripts run without a <command>
         normally use -a by default, while scripts run with a <command>
@@ -483,4 +487,5 @@ include::itrace.txt[]
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
-linkperf:perf-script-python[1], linkperf:perf-intel-pt[1]
+linkperf:perf-script-python[1], linkperf:perf-intel-pt[1],
+linkperf:perf-dlfilter[1]
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d2771a997e26..aaf2922643a0 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -55,6 +55,7 @@
 #include <subcmd/pager.h>
 #include <perf/evlist.h>
 #include <linux/err.h>
+#include "util/dlfilter.h"
 #include "util/record.h"
 #include "util/util.h"
 #include "perf.h"
@@ -79,6 +80,7 @@ static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 static struct perf_stat_config	stat_config;
 static int			max_blocks;
 static bool			native_arch;
+static struct dlfilter		*dlfilter;
 
 unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;
 
@@ -2175,6 +2177,7 @@ static int process_sample_event(struct perf_tool *tool,
 	struct perf_script *scr = container_of(tool, struct perf_script, tool);
 	struct addr_location al;
 	struct addr_location addr_al;
+	int ret = 0;
 
 	if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
 					  sample->time)) {
@@ -2213,6 +2216,13 @@ static int process_sample_event(struct perf_tool *tool,
 	if (evswitch__discard(&scr->evswitch, evsel))
 		goto out_put;
 
+	ret = dlfilter__filter_event(dlfilter, event, sample, evsel, machine, &al, &addr_al);
+	if (ret) {
+		if (ret > 0)
+			ret = 0;
+		goto out_put;
+	}
+
 	if (scripting_ops) {
 		struct addr_location *addr_al_ptr = NULL;
 
@@ -2229,7 +2239,7 @@ static int process_sample_event(struct perf_tool *tool,
 
 out_put:
 	addr_location__put(&al);
-	return 0;
+	return ret;
 }
 
 static int process_attr(struct perf_tool *tool, union perf_event *event,
@@ -3568,6 +3578,7 @@ int cmd_script(int argc, const char **argv)
 	};
 	struct utsname uts;
 	char *script_path = NULL;
+	const char *dlfilter_file = NULL;
 	const char **__argv;
 	int i, j, err = 0;
 	struct perf_script script = {
@@ -3615,6 +3626,7 @@ int cmd_script(int argc, const char **argv)
 		     parse_scriptname),
 	OPT_STRING('g', "gen-script", &generate_script_lang, "lang",
 		   "generate perf-script.xx script in specified language"),
+	OPT_STRING(0, "dlfilter", &dlfilter_file, "file", "filter .so file name"),
 	OPT_STRING('i', "input", &input_name, "file", "input file name"),
 	OPT_BOOLEAN('d', "debug-mode", &debug_mode,
 		   "do various checks like samples ordering and lost events"),
@@ -3933,6 +3945,12 @@ int cmd_script(int argc, const char **argv)
 		exit(-1);
 	}
 
+	if (dlfilter_file) {
+		dlfilter = dlfilter__new(dlfilter_file);
+		if (!dlfilter)
+			return -1;
+	}
+
 	if (!script_name) {
 		setup_pager();
 		use_browser = 0;
@@ -4032,6 +4050,10 @@ int cmd_script(int argc, const char **argv)
 		goto out_delete;
 	}
 
+	err = dlfilter__start(dlfilter, session);
+	if (err)
+		goto out_delete;
+
 	if (script_name) {
 		err = scripting_ops->start_script(script_name, argc, argv, session);
 		if (err)
@@ -4081,6 +4103,7 @@ int cmd_script(int argc, const char **argv)
 
 	if (script_started)
 		cleanup_scripting();
+	dlfilter__cleanup(dlfilter);
 out:
 	return err;
 }
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 95e15d1035ab..1a909b53dc15 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -126,6 +126,7 @@ perf-y += parse-regs-options.o
 perf-y += parse-sublevel-options.o
 perf-y += term.o
 perf-y += help-unknown-cmd.o
+perf-y += dlfilter.o
 perf-y += mem-events.o
 perf-y += vsprintf.o
 perf-y += units.o
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
new file mode 100644
index 000000000000..03c4bf150656
--- /dev/null
+++ b/tools/perf/util/dlfilter.c
@@ -0,0 +1,330 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * dlfilter.c: Interface to perf script --dlfilter shared object
+ * Copyright (c) 2021, Intel Corporation.
+ */
+#include <dlfcn.h>
+#include <stdlib.h>
+#include <string.h>
+#include <linux/zalloc.h>
+#include <linux/build_bug.h>
+
+#include "debug.h"
+#include "event.h"
+#include "evsel.h"
+#include "dso.h"
+#include "map.h"
+#include "thread.h"
+#include "symbol.h"
+#include "dlfilter.h"
+#include "perf_dlfilter.h"
+
+static void al_to_d_al(struct addr_location *al, struct perf_dlfilter_al *d_al)
+{
+	struct symbol *sym = al->sym;
+
+	d_al->size = sizeof(*d_al);
+	if (al->map) {
+		struct dso *dso = al->map->dso;
+
+		if (symbol_conf.show_kernel_path && dso->long_name)
+			d_al->dso = dso->long_name;
+		else
+			d_al->dso = dso->name;
+		d_al->is_64_bit = dso->is_64_bit;
+		d_al->buildid_size = dso->bid.size;
+		d_al->buildid = dso->bid.data;
+	} else {
+		d_al->dso = NULL;
+		d_al->is_64_bit = 0;
+		d_al->buildid_size = 0;
+		d_al->buildid = NULL;
+	}
+	if (sym) {
+		d_al->sym = sym->name;
+		d_al->sym_start = sym->start;
+		d_al->sym_end = sym->end;
+		if (al->addr < sym->end)
+			d_al->symoff = al->addr - sym->start;
+		else
+			d_al->symoff = al->addr - al->map->start - sym->start;
+		d_al->sym_binding = sym->binding;
+	} else {
+		d_al->sym = NULL;
+		d_al->sym_start = 0;
+		d_al->sym_end = 0;
+		d_al->symoff = 0;
+		d_al->sym_binding = 0;
+	}
+	d_al->addr = al->addr;
+	d_al->comm = NULL;
+	d_al->filtered = 0;
+}
+
+static struct addr_location *get_al(struct dlfilter *d)
+{
+	struct addr_location *al = d->al;
+
+	if (!al->thread && machine__resolve(d->machine, al, d->sample) < 0)
+		return NULL;
+	return al;
+}
+
+static struct thread *get_thread(struct dlfilter *d)
+{
+	struct addr_location *al = get_al(d);
+
+	return al ? al->thread : NULL;
+}
+
+static const struct perf_dlfilter_al *dlfilter__resolve_ip(void *ctx)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+	struct perf_dlfilter_al *d_al = d->d_ip_al;
+	struct addr_location *al;
+
+	if (!d->ctx_valid)
+		return NULL;
+
+	/* 'size' is also used to indicate already initialized */
+	if (d_al->size)
+		return d_al;
+
+	al = get_al(d);
+	if (!al)
+		return NULL;
+
+	al_to_d_al(al, d_al);
+
+	d_al->is_kernel_ip = machine__kernel_ip(d->machine, d->sample->ip);
+	d_al->comm = al->thread ? thread__comm_str(al->thread) : ":-1";
+	d_al->filtered = al->filtered;
+
+	return d_al;
+}
+
+static const struct perf_dlfilter_al *dlfilter__resolve_addr(void *ctx)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+	struct perf_dlfilter_al *d_addr_al = d->d_addr_al;
+	struct addr_location *addr_al = d->addr_al;
+
+	if (!d->ctx_valid || !d->d_sample->addr_correlates_sym)
+		return NULL;
+
+	/* 'size' is also used to indicate already initialized */
+	if (d_addr_al->size)
+		return d_addr_al;
+
+	if (!addr_al->thread) {
+		struct thread *thread = get_thread(d);
+
+		if (!thread)
+			return NULL;
+		thread__resolve(thread, addr_al, d->sample);
+	}
+
+	al_to_d_al(addr_al, d_addr_al);
+
+	d_addr_al->is_kernel_ip = machine__kernel_ip(d->machine, d->sample->addr);
+
+	return d_addr_al;
+}
+
+static const struct perf_dlfilter_fns perf_dlfilter_fns = {
+	.resolve_ip      = dlfilter__resolve_ip,
+	.resolve_addr    = dlfilter__resolve_addr,
+};
+
+#define CHECK_FLAG(x) BUILD_BUG_ON((u64)PERF_DLFILTER_FLAG_ ## x != (u64)PERF_IP_FLAG_ ## x)
+
+static int dlfilter__init(struct dlfilter *d, const char *file)
+{
+	CHECK_FLAG(BRANCH);
+	CHECK_FLAG(CALL);
+	CHECK_FLAG(RETURN);
+	CHECK_FLAG(CONDITIONAL);
+	CHECK_FLAG(SYSCALLRET);
+	CHECK_FLAG(ASYNC);
+	CHECK_FLAG(INTERRUPT);
+	CHECK_FLAG(TX_ABORT);
+	CHECK_FLAG(TRACE_BEGIN);
+	CHECK_FLAG(TRACE_END);
+	CHECK_FLAG(IN_TX);
+	CHECK_FLAG(VMENTRY);
+	CHECK_FLAG(VMEXIT);
+
+	memset(d, 0, sizeof(*d));
+	d->file = strdup(file);
+	if (!d->file)
+		return -1;
+	return 0;
+}
+
+static void dlfilter__exit(struct dlfilter *d)
+{
+	zfree(&d->file);
+}
+
+static int dlfilter__open(struct dlfilter *d)
+{
+	d->handle = dlopen(d->file, RTLD_NOW);
+	if (!d->handle) {
+		pr_err("dlopen failed for: '%s'\n", d->file);
+		return -1;
+	}
+	d->start = dlsym(d->handle, "start");
+	d->filter_event = dlsym(d->handle, "filter_event");
+	d->stop = dlsym(d->handle, "stop");
+	d->fns = dlsym(d->handle, "perf_dlfilter_fns");
+	if (d->fns)
+		memcpy(d->fns, &perf_dlfilter_fns, sizeof(struct perf_dlfilter_fns));
+	return 0;
+}
+
+static int dlfilter__close(struct dlfilter *d)
+{
+	return dlclose(d->handle);
+}
+
+struct dlfilter *dlfilter__new(const char *file)
+{
+	struct dlfilter *d = malloc(sizeof(*d));
+
+	if (!d)
+		return NULL;
+
+	if (dlfilter__init(d, file))
+		goto err_free;
+
+	if (dlfilter__open(d))
+		goto err_exit;
+
+	return d;
+
+err_exit:
+	dlfilter__exit(d);
+err_free:
+	free(d);
+	return NULL;
+}
+
+static void dlfilter__free(struct dlfilter *d)
+{
+	if (d) {
+		dlfilter__exit(d);
+		free(d);
+	}
+}
+
+int dlfilter__start(struct dlfilter *d, struct perf_session *session)
+{
+	if (d) {
+		d->session = session;
+		if (d->start)
+			return d->start(&d->data, d);
+	}
+	return 0;
+}
+
+static int dlfilter__stop(struct dlfilter *d)
+{
+	if (d && d->stop)
+		return d->stop(d->data, d);
+	return 0;
+}
+
+void dlfilter__cleanup(struct dlfilter *d)
+{
+	if (d) {
+		dlfilter__stop(d);
+		dlfilter__close(d);
+		dlfilter__free(d);
+	}
+}
+
+#define ASSIGN(x) d_sample.x = sample->x
+
+int dlfilter__do_filter_event(struct dlfilter *d,
+			      union perf_event *event,
+			      struct perf_sample *sample,
+			      struct evsel *evsel,
+			      struct machine *machine,
+			      struct addr_location *al,
+			      struct addr_location *addr_al)
+{
+	struct perf_dlfilter_sample d_sample;
+	struct perf_dlfilter_al d_ip_al;
+	struct perf_dlfilter_al d_addr_al;
+	int ret;
+
+	d->event       = event;
+	d->sample      = sample;
+	d->evsel       = evsel;
+	d->machine     = machine;
+	d->al          = al;
+	d->addr_al     = addr_al;
+	d->d_sample    = &d_sample;
+	d->d_ip_al     = &d_ip_al;
+	d->d_addr_al   = &d_addr_al;
+
+	d_sample.size  = sizeof(d_sample);
+	d_ip_al.size   = 0; /* To indicate d_ip_al is not initialized */
+	d_addr_al.size = 0; /* To indicate d_addr_al is not initialized */
+
+	ASSIGN(ip);
+	ASSIGN(pid);
+	ASSIGN(tid);
+	ASSIGN(time);
+	ASSIGN(addr);
+	ASSIGN(id);
+	ASSIGN(stream_id);
+	ASSIGN(period);
+	ASSIGN(weight);
+	ASSIGN(ins_lat);
+	ASSIGN(p_stage_cyc);
+	ASSIGN(transaction);
+	ASSIGN(insn_cnt);
+	ASSIGN(cyc_cnt);
+	ASSIGN(cpu);
+	ASSIGN(flags);
+	ASSIGN(data_src);
+	ASSIGN(phys_addr);
+	ASSIGN(data_page_size);
+	ASSIGN(code_page_size);
+	ASSIGN(cgroup);
+	ASSIGN(cpumode);
+	ASSIGN(misc);
+	ASSIGN(raw_size);
+	ASSIGN(raw_data);
+
+	if (sample->branch_stack) {
+		d_sample.brstack_nr = sample->branch_stack->nr;
+		d_sample.brstack = (struct perf_branch_entry *)perf_sample__branch_entries(sample);
+	} else {
+		d_sample.brstack_nr = 0;
+		d_sample.brstack = NULL;
+	}
+
+	if (sample->callchain) {
+		d_sample.raw_callchain_nr = sample->callchain->nr;
+		d_sample.raw_callchain = (__u64 *)sample->callchain->ips;
+	} else {
+		d_sample.raw_callchain_nr = 0;
+		d_sample.raw_callchain = NULL;
+	}
+
+	d_sample.addr_correlates_sym =
+		(evsel->core.attr.sample_type & PERF_SAMPLE_ADDR) &&
+		sample_addr_correlates_sym(&evsel->core.attr);
+
+	d_sample.event = evsel__name(evsel);
+
+	d->ctx_valid = true;
+
+	ret = d->filter_event(d->data, &d_sample, d);
+
+	d->ctx_valid = false;
+
+	return ret;
+}
diff --git a/tools/perf/util/dlfilter.h b/tools/perf/util/dlfilter.h
new file mode 100644
index 000000000000..22b7636028dd
--- /dev/null
+++ b/tools/perf/util/dlfilter.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * dlfilter.h: Interface to perf script --dlfilter shared object
+ * Copyright (c) 2021, Intel Corporation.
+ */
+
+#ifndef PERF_UTIL_DLFILTER_H
+#define PERF_UTIL_DLFILTER_H
+
+struct perf_session;
+union  perf_event;
+struct perf_sample;
+struct evsel;
+struct machine;
+struct addr_location;
+struct perf_dlfilter_fns;
+struct perf_dlfilter_sample;
+struct perf_dlfilter_al;
+
+struct dlfilter {
+	char				*file;
+	void				*handle;
+	void				*data;
+	struct perf_session		*session;
+	bool				ctx_valid;
+
+	union perf_event		*event;
+	struct perf_sample		*sample;
+	struct evsel			*evsel;
+	struct machine			*machine;
+	struct addr_location		*al;
+	struct addr_location		*addr_al;
+	struct perf_dlfilter_sample	*d_sample;
+	struct perf_dlfilter_al		*d_ip_al;
+	struct perf_dlfilter_al		*d_addr_al;
+
+	int (*start)(void **data, void *ctx);
+	int (*stop)(void *data, void *ctx);
+
+	int (*filter_event)(void *data,
+			    const struct perf_dlfilter_sample *sample,
+			    void *ctx);
+
+	struct perf_dlfilter_fns *fns;
+};
+
+struct dlfilter *dlfilter__new(const char *file);
+
+int dlfilter__start(struct dlfilter *d, struct perf_session *session);
+
+int dlfilter__do_filter_event(struct dlfilter *d,
+			      union perf_event *event,
+			      struct perf_sample *sample,
+			      struct evsel *evsel,
+			      struct machine *machine,
+			      struct addr_location *al,
+			      struct addr_location *addr_al);
+
+void dlfilter__cleanup(struct dlfilter *d);
+
+static inline int dlfilter__filter_event(struct dlfilter *d,
+					 union perf_event *event,
+					 struct perf_sample *sample,
+					 struct evsel *evsel,
+					 struct machine *machine,
+					 struct addr_location *al,
+					 struct addr_location *addr_al)
+{
+	if (!d || !d->filter_event)
+		return 0;
+	return dlfilter__do_filter_event(d, event, sample, evsel, machine, al, addr_al);
+}
+
+#endif
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
new file mode 100644
index 000000000000..82833ee8680d
--- /dev/null
+++ b/tools/perf/util/perf_dlfilter.h
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * perf_dlfilter.h: API for perf --dlfilter shared object
+ * Copyright (c) 2021, Intel Corporation.
+ */
+#ifndef _LINUX_PERF_DLFILTER_H
+#define _LINUX_PERF_DLFILTER_H
+
+#include <linux/perf_event.h>
+#include <linux/types.h>
+
+/* Definitions for perf_dlfilter_sample flags */
+enum {
+	PERF_DLFILTER_FLAG_BRANCH	= 1ULL << 0,
+	PERF_DLFILTER_FLAG_CALL		= 1ULL << 1,
+	PERF_DLFILTER_FLAG_RETURN	= 1ULL << 2,
+	PERF_DLFILTER_FLAG_CONDITIONAL	= 1ULL << 3,
+	PERF_DLFILTER_FLAG_SYSCALLRET	= 1ULL << 4,
+	PERF_DLFILTER_FLAG_ASYNC	= 1ULL << 5,
+	PERF_DLFILTER_FLAG_INTERRUPT	= 1ULL << 6,
+	PERF_DLFILTER_FLAG_TX_ABORT	= 1ULL << 7,
+	PERF_DLFILTER_FLAG_TRACE_BEGIN	= 1ULL << 8,
+	PERF_DLFILTER_FLAG_TRACE_END	= 1ULL << 9,
+	PERF_DLFILTER_FLAG_IN_TX	= 1ULL << 10,
+	PERF_DLFILTER_FLAG_VMENTRY	= 1ULL << 11,
+	PERF_DLFILTER_FLAG_VMEXIT	= 1ULL << 12,
+};
+
+/*
+ * perf sample event information (as per perf script and <linux/perf_event.h>)
+ */
+struct perf_dlfilter_sample {
+	__u32 size; /* Size of this structure (for compatibility checking) */
+	__u16 ins_lat;		/* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */
+	__u16 p_stage_cyc;	/* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */
+	__u64 ip;
+	__s32 pid;
+	__s32 tid;
+	__u64 time;
+	__u64 addr;
+	__u64 id;
+	__u64 stream_id;
+	__u64 period;
+	__u64 weight;		/* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */
+	__u64 transaction;	/* Refer PERF_SAMPLE_TRANSACTION in <linux/perf_event.h> */
+	__u64 insn_cnt;	/* For instructions-per-cycle (IPC) */
+	__u64 cyc_cnt;		/* For instructions-per-cycle (IPC) */
+	__s32 cpu;
+	__u32 flags;		/* Refer PERF_DLFILTER_FLAG_* above */
+	__u64 data_src;		/* Refer PERF_SAMPLE_DATA_SRC in <linux/perf_event.h> */
+	__u64 phys_addr;	/* Refer PERF_SAMPLE_PHYS_ADDR in <linux/perf_event.h> */
+	__u64 data_page_size;	/* Refer PERF_SAMPLE_DATA_PAGE_SIZE in <linux/perf_event.h> */
+	__u64 code_page_size;	/* Refer PERF_SAMPLE_CODE_PAGE_SIZE in <linux/perf_event.h> */
+	__u64 cgroup;		/* Refer PERF_SAMPLE_CGROUP in <linux/perf_event.h> */
+	__u8  cpumode;		/* Refer CPUMODE_MASK etc in <linux/perf_event.h> */
+	__u8  addr_correlates_sym; /* True => resolve_addr() can be called */
+	__u16 misc;		/* Refer perf_event_header in <linux/perf_event.h> */
+	__u32 raw_size;		/* Refer PERF_SAMPLE_RAW in <linux/perf_event.h> */
+	const void *raw_data;	/* Refer PERF_SAMPLE_RAW in <linux/perf_event.h> */
+	__u64 brstack_nr;	/* Number of brstack entries */
+	const struct perf_branch_entry *brstack; /* Refer <linux/perf_event.h> */
+	__u64 raw_callchain_nr;	/* Number of raw_callchain entries */
+	const __u64 *raw_callchain; /* Refer <linux/perf_event.h> */
+	const char *event;
+};
+
+/*
+ * Address location (as per perf script)
+ */
+struct perf_dlfilter_al {
+	__u32 size; /* Size of this structure (for compatibility checking) */
+	__u32 symoff;
+	const char *sym;
+	__u64 addr; /* Mapped address (from dso) */
+	__u64 sym_start;
+	__u64 sym_end;
+	const char *dso;
+	__u8  sym_binding; /* STB_LOCAL, STB_GLOBAL or STB_WEAK, refer <elf.h> */
+	__u8  is_64_bit; /* Only valid if dso is not NULL */
+	__u8  is_kernel_ip; /* True if in kernel space */
+	__u32 buildid_size;
+	__u8 *buildid;
+	/* Below members are only populated by resolve_ip() */
+	__u8 filtered; /* True if this sample event will be filtered out */
+	const char *comm;
+};
+
+struct perf_dlfilter_fns {
+	/* Return information about ip */
+	const struct perf_dlfilter_al *(*resolve_ip)(void *ctx);
+	/* Return information about addr (if addr_correlates_sym) */
+	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
+	/* Reserved */
+	void *(*reserved[126])(void *);
+};
+
+/*
+ * If implemented, 'start' will be called at the beginning,
+ * before any calls to 'filter_event'. Return 0 to indicate success,
+ * or return a negative error code. '*data' can be assigned for use
+ * by other functions. 'ctx' is needed for calls to perf_dlfilter_fns,
+ * but most perf_dlfilter_fns are not valid when called from 'start'.
+ */
+int start(void **data, void *ctx);
+
+/*
+ * If implemented, 'stop' will be called at the end,
+ * after any calls to 'filter_event'. Return 0 to indicate success, or
+ * return a negative error code. 'data' is set by start(). 'ctx' is
+ * needed for calls to perf_dlfilter_fns, but most perf_dlfilter_fns
+ * are not valid when called from 'stop'.
+ */
+int stop(void *data, void *ctx);
+
+/*
+ * If implemented, 'filter_event' will be called for each sample
+ * event. Return 0 to keep the sample event, 1 to filter it out, or
+ * return a negative error code. 'data' is set by start(). 'ctx' is
+ * needed for calls to perf_dlfilter_fns.
+ */
+int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
+
+#endif
-- 
2.17.1



* [PATCH V2 02/10] perf script: Add dlfilter__filter_event_early()
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 01/10] " Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 03/10] perf script: Add option to list dlfilters Adrian Hunter
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.
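
As an illustration only (not part of the patch), a filter that needs nothing
but sample fields might implement just the early callback, roughly as below.
The struct is a two-field stand-in for the one in perf_dlfilter.h, and
'wanted_tid' is a hypothetical hard-coded value.

```c
#include <stdint.h>

/*
 * Stand-in for the definition in perf_dlfilter.h, cut down to two fields.
 * Assumption: a real filter includes perf_dlfilter.h instead.
 */
struct perf_dlfilter_sample {
	int32_t tid;
	uint64_t time;
};

/* Hypothetical selection criterion, hard-coded for the sketch */
static const int32_t wanted_tid = 1234;

/*
 * Called before perf's internal filtering, so samples that perf would
 * later drop by itself (time ranges, CPU list, ...) also arrive here.
 */
int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	(void)data;
	(void)ctx;
	/* Return 1 to filter the sample out, 0 to keep it */
	return sample->tid != wanted_tid;
}
```

Keeping the early callback to cheap tests on sample fields is presumably
where the speed-up comes from: rejected samples never reach address
resolution or the rest of the processing path.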

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt | 13 +++++++----
 tools/perf/builtin-script.c                | 26 +++++++++++++++-------
 tools/perf/util/dlfilter.c                 |  9 ++++++--
 tools/perf/util/dlfilter.h                 | 21 +++++++++++++++--
 tools/perf/util/perf_dlfilter.h            |  6 +++++
 5 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index 15d5f4b01c97..aef1c32babd1 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -36,16 +36,17 @@ const struct perf_dlfilter_fns perf_dlfilter_fns;
 int start(void **data, void *ctx);
 int stop(void *data, void *ctx);
 int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
+int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
 ----
 
 If implemented, 'start' will be called at the beginning, before any
-calls to 'filter_event' . Return 0 to indicate success,
+calls to 'filter_event' or 'filter_event_early'. Return 0 to indicate success,
 or return a negative error code. '*data' can be assigned for use by other
 functions. 'ctx' is needed for calls to perf_dlfilter_fns, but most
 perf_dlfilter_fns are not valid when called from 'start'.
 
 If implemented, 'stop' will be called at the end, after any calls to
-'filter_event'. Return 0 to indicate success, or
+'filter_event' or 'filter_event_early'. Return 0 to indicate success, or
 return a negative error code. 'data' is set by 'start'. 'ctx' is needed
 for calls to perf_dlfilter_fns, but most perf_dlfilter_fns are not valid
 when called from 'stop'.
@@ -55,10 +56,13 @@ Return 0 to keep the sample event, 1 to filter it out, or return a negative
 error code. 'data' is set by 'start'. 'ctx' is needed for calls to
 'perf_dlfilter_fns'.
 
+'filter_event_early' is the same as 'filter_event' except it is called before
+internal filtering.
+
 The perf_dlfilter_sample structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-'filter_event' is passed a perf_dlfilter_sample
+'filter_event' and 'filter_event_early' are passed a perf_dlfilter_sample
 structure, which contains the following fields:
 [source,c]
 ----
@@ -105,7 +109,8 @@ The perf_dlfilter_fns structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The 'perf_dlfilter_fns' structure is populated with function pointers when the
-file is loaded. The functions can be called by 'filter_event'.
+file is loaded. The functions can be called by 'filter_event' or
+'filter_event_early'.
 
 [source,c]
 ----
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index aaf2922643a0..e47affe674a5 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2179,9 +2179,20 @@ static int process_sample_event(struct perf_tool *tool,
 	struct addr_location addr_al;
 	int ret = 0;
 
+	/* Set thread to NULL to indicate addr_al and al are not initialized */
+	addr_al.thread = NULL;
+	al.thread = NULL;
+
+	ret = dlfilter__filter_event_early(dlfilter, event, sample, evsel, machine, &al, &addr_al);
+	if (ret) {
+		if (ret > 0)
+			ret = 0;
+		goto out_put;
+	}
+
 	if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
 					  sample->time)) {
-		return 0;
+		goto out_put;
 	}
 
 	if (debug_mode) {
@@ -2192,24 +2203,22 @@ static int process_sample_event(struct perf_tool *tool,
 			nr_unordered++;
 		}
 		last_timestamp = sample->time;
-		return 0;
+		goto out_put;
 	}
 
 	if (filter_cpu(sample))
-		return 0;
+		goto out_put;
 
 	if (machine__resolve(machine, &al, sample) < 0) {
 		pr_err("problem processing %d event, skipping it.\n",
 		       event->header.type);
-		return -1;
+		ret = -1;
+		goto out_put;
 	}
 
 	if (al.filtered)
 		goto out_put;
 
-	/* Set thread to NULL to indicate addr_al is not initialized */
-	addr_al.thread = NULL;
-
 	if (!show_event(sample, evsel, al.thread, &al, &addr_al))
 		goto out_put;
 
@@ -2238,7 +2247,8 @@ static int process_sample_event(struct perf_tool *tool,
 	}
 
 out_put:
-	addr_location__put(&al);
+	if (al.thread)
+		addr_location__put(&al);
 	return ret;
 }
 
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index 03c4bf150656..e93c49c07999 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -175,6 +175,7 @@ static int dlfilter__open(struct dlfilter *d)
 	}
 	d->start = dlsym(d->handle, "start");
 	d->filter_event = dlsym(d->handle, "filter_event");
+	d->filter_event_early = dlsym(d->handle, "filter_event_early");
 	d->stop = dlsym(d->handle, "stop");
 	d->fns = dlsym(d->handle, "perf_dlfilter_fns");
 	if (d->fns)
@@ -251,7 +252,8 @@ int dlfilter__do_filter_event(struct dlfilter *d,
 			      struct evsel *evsel,
 			      struct machine *machine,
 			      struct addr_location *al,
-			      struct addr_location *addr_al)
+			      struct addr_location *addr_al,
+			      bool early)
 {
 	struct perf_dlfilter_sample d_sample;
 	struct perf_dlfilter_al d_ip_al;
@@ -322,7 +324,10 @@ int dlfilter__do_filter_event(struct dlfilter *d,
 
 	d->ctx_valid = true;
 
-	ret = d->filter_event(d->data, &d_sample, d);
+	if (early)
+		ret = d->filter_event_early(d->data, &d_sample, d);
+	else
+		ret = d->filter_event(d->data, &d_sample, d);
 
 	d->ctx_valid = false;
 
diff --git a/tools/perf/util/dlfilter.h b/tools/perf/util/dlfilter.h
index 22b7636028dd..a994560e563d 100644
--- a/tools/perf/util/dlfilter.h
+++ b/tools/perf/util/dlfilter.h
@@ -40,6 +40,9 @@ struct dlfilter {
 	int (*filter_event)(void *data,
 			    const struct perf_dlfilter_sample *sample,
 			    void *ctx);
+	int (*filter_event_early)(void *data,
+				  const struct perf_dlfilter_sample *sample,
+				  void *ctx);
 
 	struct perf_dlfilter_fns *fns;
 };
@@ -54,7 +57,8 @@ int dlfilter__do_filter_event(struct dlfilter *d,
 			      struct evsel *evsel,
 			      struct machine *machine,
 			      struct addr_location *al,
-			      struct addr_location *addr_al);
+			      struct addr_location *addr_al,
+			      bool early);
 
 void dlfilter__cleanup(struct dlfilter *d);
 
@@ -68,7 +72,20 @@ static inline int dlfilter__filter_event(struct dlfilter *d,
 {
 	if (!d || !d->filter_event)
 		return 0;
-	return dlfilter__do_filter_event(d, event, sample, evsel, machine, al, addr_al);
+	return dlfilter__do_filter_event(d, event, sample, evsel, machine, al, addr_al, false);
+}
+
+static inline int dlfilter__filter_event_early(struct dlfilter *d,
+					       union perf_event *event,
+					       struct perf_sample *sample,
+					       struct evsel *evsel,
+					       struct machine *machine,
+					       struct addr_location *al,
+					       struct addr_location *addr_al)
+{
+	if (!d || !d->filter_event_early)
+		return 0;
+	return dlfilter__do_filter_event(d, event, sample, evsel, machine, al, addr_al, true);
 }
 
 #endif
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index 82833ee8680d..f7a847fdee59 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -120,4 +120,10 @@ int stop(void *data, void *ctx);
  */
 int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
 
+/*
+ * The same as 'filter_event' except it is called before internal
+ * filtering.
+ */
+int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
+
 #endif
-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V2 03/10] perf script: Add option to list dlfilters
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 01/10] " Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 02/10] perf script: Add dlfilter__filter_event_early() Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 04/10] perf script: Add option to pass arguments to dlfilters Adrian Hunter
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add option --list-dlfilters to list dlfilters found in the current directory
and in the dlfilters subdirectory of the exec-path, e.g.
~/libexec/perf-core/dlfilters. Use with option -v (which must come before
option --list-dlfilters) to show long descriptions.
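
A filter-side sketch of the new callback, for illustration only: the two
description strings are made up. It always assigns '*long_description'
because the listing code in this patch reads that pointer after the call.

```c
#include <stddef.h>

/*
 * Return the one-line description and hand back a longer one through
 * 'long_description'. Both strings here are hypothetical examples.
 */
const char *filter_description(const char **long_description)
{
	static const char desc[] = "Keep samples from one thread";
	static const char long_desc[] =
		"Keeps only samples whose tid matches the value given\n"
		"with --dlarg. Everything else is filtered out.";

	/* Assign unconditionally; use NULL when there is no long text */
	*long_description = long_desc;
	return desc;
}
```

The long description may span several lines: with -v, the listing code
splits it at newlines and prints each line indented.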

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt |   4 +
 tools/perf/Documentation/perf-script.txt   |   4 +
 tools/perf/builtin-script.c                |   2 +
 tools/perf/util/dlfilter.c                 | 117 ++++++++++++++++++++-
 tools/perf/util/dlfilter.h                 |   2 +
 tools/perf/util/perf_dlfilter.h            |   6 ++
 6 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index aef1c32babd1..8bc219f3eb83 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -37,6 +37,7 @@ int start(void **data, void *ctx);
 int stop(void *data, void *ctx);
 int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
 int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
+const char *filter_description(const char **long_description);
 ----
 
 If implemented, 'start' will be called at the beginning, before any
@@ -59,6 +60,9 @@ error code. 'data' is set by 'start'. 'ctx' is needed for calls to
 'filter_event_early' is the same as 'filter_event' except it is called before
 internal filtering.
 
+If implemented, 'filter_description' should return a one-line description
+of the filter, and optionally a longer description.
+
 The perf_dlfilter_sample structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2306c81b606b..d2705d6b9874 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -102,6 +102,10 @@ OPTIONS
 	Filter sample events using the given shared object file.
 	Refer linkperf:perf-dlfilter[1]
 
+--list-dlfilters=::
+        Display a list of available dlfilters. Use with option -v (must come
+        before option --list-dlfilters) to show long descriptions.
+
 -a::
         Force system-wide collection.  Scripts run without a <command>
         normally use -a by default, while scripts run with a <command>
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index e47affe674a5..4ffba1dbc55d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3631,6 +3631,8 @@ int cmd_script(int argc, const char **argv)
 		    "show latency attributes (irqs/preemption disabled, etc)"),
 	OPT_CALLBACK_NOOPT('l', "list", NULL, NULL, "list available scripts",
 			   list_available_scripts),
+	OPT_CALLBACK_NOOPT(0, "list-dlfilters", NULL, NULL, "list available dlfilters",
+			   list_available_dlfilters),
 	OPT_CALLBACK('s', "script", NULL, "name",
 		     "script file name (lang:script name, script name, or *)",
 		     parse_scriptname),
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index e93c49c07999..288a2b57378c 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -6,6 +6,8 @@
 #include <dlfcn.h>
 #include <stdlib.h>
 #include <string.h>
+#include <dirent.h>
+#include <subcmd/exec-cmd.h>
 #include <linux/zalloc.h>
 #include <linux/build_bug.h>
 
@@ -136,6 +138,35 @@ static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_addr    = dlfilter__resolve_addr,
 };
 
+static char *find_dlfilter(const char *file)
+{
+	char path[PATH_MAX];
+	char *exec_path;
+
+	if (strchr(file, '/'))
+		goto out;
+
+	if (!access(file, R_OK)) {
+		/*
+		 * Prepend "./" so that dlopen will find the file in the
+		 * current directory.
+		 */
+		snprintf(path, sizeof(path), "./%s", file);
+		file = path;
+		goto out;
+	}
+
+	exec_path = get_argv_exec_path();
+	if (!exec_path)
+		goto out;
+	snprintf(path, sizeof(path), "%s/dlfilters/%s", exec_path, file);
+	free(exec_path);
+	if (!access(path, R_OK))
+		file = path;
+out:
+	return strdup(file);
+}
+
 #define CHECK_FLAG(x) BUILD_BUG_ON((u64)PERF_DLFILTER_FLAG_ ## x != (u64)PERF_IP_FLAG_ ## x)
 
 static int dlfilter__init(struct dlfilter *d, const char *file)
@@ -155,7 +186,7 @@ static int dlfilter__init(struct dlfilter *d, const char *file)
 	CHECK_FLAG(VMEXIT);
 
 	memset(d, 0, sizeof(*d));
-	d->file = strdup(file);
+	d->file = find_dlfilter(file);
 	if (!d->file)
 		return -1;
 	return 0;
@@ -333,3 +364,87 @@ int dlfilter__do_filter_event(struct dlfilter *d,
 
 	return ret;
 }
+
+static bool get_filter_desc(const char *dirname, const char *name,
+			    char **desc, char **long_desc)
+{
+	char path[PATH_MAX];
+	void *handle;
+	const char *(*desc_fn)(const char **long_description);
+
+	snprintf(path, sizeof(path), "%s/%s", dirname, name);
+	handle = dlopen(path, RTLD_NOW);
+	if (!handle || !(dlsym(handle, "filter_event") || dlsym(handle, "filter_event_early")))
+		return false;
+	desc_fn = dlsym(handle, "filter_description");
+	if (desc_fn) {
+		const char *dsc;
+		const char *long_dsc = NULL;
+
+		dsc = desc_fn(&long_dsc);
+		if (dsc)
+			*desc = strdup(dsc);
+		if (long_dsc)
+			*long_desc = strdup(long_dsc);
+	}
+	dlclose(handle);
+	return true;
+}
+
+static void list_filters(const char *dirname)
+{
+	struct dirent *entry;
+	DIR *dir;
+
+	dir = opendir(dirname);
+	if (!dir)
+		return;
+
+	while ((entry = readdir(dir)) != NULL)
+	{
+		size_t n = strlen(entry->d_name);
+		char *long_desc = NULL;
+		char *desc = NULL;
+
+		if (entry->d_type == DT_DIR || n < 4 ||
+		    strcmp(".so", entry->d_name + n - 3))
+			continue;
+		if (!get_filter_desc(dirname, entry->d_name, &desc, &long_desc))
+			continue;
+		printf("  %-36s %s\n", entry->d_name, desc ? desc : "");
+		if (verbose) {
+			char *p = long_desc;
+			char *line;
+
+			while ((line = strsep(&p, "\n")) != NULL)
+				printf("%39s%s\n", "", line);
+		}
+		free(long_desc);
+		free(desc);
+	}
+
+	closedir(dir);
+}
+
+int list_available_dlfilters(const struct option *opt __maybe_unused,
+			     const char *s __maybe_unused,
+			     int unset __maybe_unused)
+{
+	char path[PATH_MAX];
+	char *exec_path;
+
+	printf("List of available dlfilters:\n");
+
+	list_filters(".");
+
+	exec_path = get_argv_exec_path();
+	if (!exec_path)
+		goto out;
+	snprintf(path, sizeof(path), "%s/dlfilters", exec_path);
+
+	list_filters(path);
+
+	free(exec_path);
+out:
+	exit(0);
+}
diff --git a/tools/perf/util/dlfilter.h b/tools/perf/util/dlfilter.h
index a994560e563d..a1ed38da3ce6 100644
--- a/tools/perf/util/dlfilter.h
+++ b/tools/perf/util/dlfilter.h
@@ -88,4 +88,6 @@ static inline int dlfilter__filter_event_early(struct dlfilter *d,
 	return dlfilter__do_filter_event(d, event, sample, evsel, machine, al, addr_al, true);
 }
 
+int list_available_dlfilters(const struct option *opt, const char *s, int unset);
+
 #endif
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index f7a847fdee59..31ad4c100181 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -126,4 +126,10 @@ int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ct
  */
 int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx);
 
+/*
+ * If implemented, return a one-line description of the filter, and optionally
+ * a longer description.
+ */
+const char *filter_description(const char **long_description);
+
 #endif
-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V2 04/10] perf script: Add option to pass arguments to dlfilters
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (2 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 03/10] perf script: Add option to list dlfilters Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 05/10] perf build: Install perf_dlfilter.h Adrian Hunter
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than one argument.
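
A rough sketch of how a filter might consume the arguments from 'start',
which this patch makes a valid place to call args(). This is not code from
the patch: the one-member struct is a stand-in for the real
perf_dlfilter_fns in perf_dlfilter.h, and parse_tid_arg() and 'struct cfg'
are hypothetical helpers.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Stand-in with only the member used here. Assumption: a real filter
 * includes perf_dlfilter.h, whose struct has more members in a fixed order.
 */
struct perf_dlfilter_fns {
	char **(*args)(void *ctx, int *dlargc);
};

/* perf populates this object when it loads the shared object */
struct perf_dlfilter_fns perf_dlfilter_fns;

/* Hypothetical configuration built from the --dlarg values */
struct cfg {
	long tid;
};

/* Expect exactly one argument holding a number; return 0 or -1 */
int parse_tid_arg(int dlargc, char **dlargv, long *tid)
{
	char *end;

	if (!dlargv || dlargc != 1)
		return -1;
	errno = 0;
	*tid = strtol(dlargv[0], &end, 0);
	if (errno || end == dlargv[0] || *end)
		return -1;
	return 0;
}

int start(void **data, void *ctx)
{
	int dlargc = 0;
	char **dlargv = perf_dlfilter_fns.args(ctx, &dlargc);
	struct cfg *cfg = malloc(sizeof(*cfg));

	if (!cfg)
		return -1;
	if (parse_tid_arg(dlargc, dlargv, &cfg->tid)) {
		free(cfg);
		return -1;	/* negative value reports an error */
	}
	*data = cfg;
	return 0;
}

int stop(void *data, void *ctx)
{
	(void)ctx;
	free(data);
	return 0;
}
```

With a filter built this way (the file name is made up), the invocation
would look like: perf script --dlfilter myfilter.so --dlarg 1234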

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt | 10 +++--
 tools/perf/Documentation/perf-script.txt   |  4 ++
 tools/perf/builtin-script.c                | 35 ++++++++++++++++-
 tools/perf/util/dlfilter.c                 | 45 ++++++++++++++++++----
 tools/perf/util/dlfilter.h                 |  6 ++-
 tools/perf/util/perf_dlfilter.h            |  4 +-
 6 files changed, 91 insertions(+), 13 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index 8bc219f3eb83..5795ab3ca23b 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -9,13 +9,14 @@ object file
 SYNOPSIS
 --------
 [verse]
-'perf script' [--dlfilter file.so ]
+'perf script' [--dlfilter file.so ] [ --dlarg arg ]...
 
 DESCRIPTION
 -----------
 
 This option is used to process data through a custom filter provided by a
-dynamically loaded shared object file.
+dynamically loaded shared object file. Arguments can be passed using --dlarg
+and retrieved using perf_dlfilter_fns.args().
 
 If 'file.so' does not contain "/", then it will be found either in the current
 directory, or perf tools exec path which is ~/libexec/perf-core/dlfilters for
@@ -121,7 +122,8 @@ file is loaded. The functions can be called by 'filter_event' or
 struct perf_dlfilter_fns {
 	const struct perf_dlfilter_al *(*resolve_ip)(void *ctx);
 	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
-	void *(*reserved[126])(void *);
+	char **(*args)(void *ctx, int *dlargc);
+	void *(*reserved[125])(void *);
 };
 ----
 
@@ -129,6 +131,8 @@ struct perf_dlfilter_fns {
 
 'resolve_addr' returns information about addr (if addr_correlates_sym).
 
+'args' returns arguments from --dlarg options.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index d2705d6b9874..aa3a0b2c29a2 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -102,6 +102,10 @@ OPTIONS
 	Filter sample events using the given shared object file.
 	Refer linkperf:perf-dlfilter[1]
 
+--dlarg=<arg>::
+	Pass 'arg' as an argument to the dlfilter. --dlarg may be repeated
+	to add more arguments.
+
 --list-dlfilters=::
         Display a list of available dlfilters. Use with option -v (must come
         before option --list-dlfilters) to show long descriptions.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4ffba1dbc55d..2030936cc891 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -81,6 +81,8 @@ static struct perf_stat_config	stat_config;
 static int			max_blocks;
 static bool			native_arch;
 static struct dlfilter		*dlfilter;
+static int			dlargc;
+static char			**dlargv;
 
 unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;
 
@@ -3175,6 +3177,34 @@ static int list_available_scripts(const struct option *opt __maybe_unused,
 	exit(0);
 }
 
+static int add_dlarg(const struct option *opt __maybe_unused,
+		     const char *s, int unset __maybe_unused)
+{
+	char *arg = strdup(s);
+	void *a;
+
+	if (!arg)
+		return -1;
+
+	a = realloc(dlargv, sizeof(dlargv[0]) * (dlargc + 1));
+	if (!a) {
+		free(arg);
+		return -1;
+	}
+
+	dlargv = a;
+	dlargv[dlargc++] = arg;
+
+	return 0;
+}
+
+static void free_dlarg(void)
+{
+	while (dlargc--)
+		free(dlargv[dlargc]);
+	free(dlargv);
+}
+
 /*
  * Some scripts specify the required events in their "xxx-record" file,
  * this function will check if the events in perf.data match those
@@ -3639,6 +3669,8 @@ int cmd_script(int argc, const char **argv)
 	OPT_STRING('g', "gen-script", &generate_script_lang, "lang",
 		   "generate perf-script.xx script in specified language"),
 	OPT_STRING(0, "dlfilter", &dlfilter_file, "file", "filter .so file name"),
+	OPT_CALLBACK(0, "dlarg", NULL, "argument", "filter argument",
+		     add_dlarg),
 	OPT_STRING('i', "input", &input_name, "file", "input file name"),
 	OPT_BOOLEAN('d', "debug-mode", &debug_mode,
 		   "do various checks like samples ordering and lost events"),
@@ -3958,7 +3990,7 @@ int cmd_script(int argc, const char **argv)
 	}
 
 	if (dlfilter_file) {
-		dlfilter = dlfilter__new(dlfilter_file);
+		dlfilter = dlfilter__new(dlfilter_file, dlargc, dlargv);
 		if (!dlfilter)
 			return -1;
 	}
@@ -4116,6 +4148,7 @@ int cmd_script(int argc, const char **argv)
 	if (script_started)
 		cleanup_scripting();
 	dlfilter__cleanup(dlfilter);
+	free_dlarg();
 out:
 	return err;
 }
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index 288a2b57378c..eaa3cea49178 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -133,9 +133,26 @@ static const struct perf_dlfilter_al *dlfilter__resolve_addr(void *ctx)
 	return d_addr_al;
 }
 
+static char **dlfilter__args(void *ctx, int *dlargc)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+
+	if (dlargc)
+		*dlargc = 0;
+	else
+		return NULL;
+
+	if (!d->ctx_valid && !d->in_start && !d->in_stop)
+		return NULL;
+
+	*dlargc = d->dlargc;
+	return d->dlargv;
+}
+
 static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_ip      = dlfilter__resolve_ip,
 	.resolve_addr    = dlfilter__resolve_addr,
+	.args            = dlfilter__args,
 };
 
 static char *find_dlfilter(const char *file)
@@ -169,7 +186,7 @@ static char *find_dlfilter(const char *file)
 
 #define CHECK_FLAG(x) BUILD_BUG_ON((u64)PERF_DLFILTER_FLAG_ ## x != (u64)PERF_IP_FLAG_ ## x)
 
-static int dlfilter__init(struct dlfilter *d, const char *file)
+static int dlfilter__init(struct dlfilter *d, const char *file, int dlargc, char **dlargv)
 {
 	CHECK_FLAG(BRANCH);
 	CHECK_FLAG(CALL);
@@ -189,6 +206,8 @@ static int dlfilter__init(struct dlfilter *d, const char *file)
 	d->file = find_dlfilter(file);
 	if (!d->file)
 		return -1;
+	d->dlargc = dlargc;
+	d->dlargv = dlargv;
 	return 0;
 }
 
@@ -219,14 +238,14 @@ static int dlfilter__close(struct dlfilter *d)
 	return dlclose(d->handle);
 }
 
-struct dlfilter *dlfilter__new(const char *file)
+struct dlfilter *dlfilter__new(const char *file, int dlargc, char **dlargv)
 {
 	struct dlfilter *d = malloc(sizeof(*d));
 
 	if (!d)
 		return NULL;
 
-	if (dlfilter__init(d, file))
+	if (dlfilter__init(d, file, dlargc, dlargv))
 		goto err_free;
 
 	if (dlfilter__open(d))
@@ -253,16 +272,28 @@ int dlfilter__start(struct dlfilter *d, struct perf_session *session)
 {
 	if (d) {
 		d->session = session;
-		if (d->start)
-			return d->start(&d->data, d);
+		if (d->start) {
+			int ret;
+
+			d->in_start = true;
+			ret = d->start(&d->data, d);
+			d->in_start = false;
+			return ret;
+		}
 	}
 	return 0;
 }
 
 static int dlfilter__stop(struct dlfilter *d)
 {
-	if (d && d->stop)
-		return d->stop(d->data, d);
+	if (d && d->stop) {
+		int ret;
+
+		d->in_stop = true;
+		ret = d->stop(d->data, d);
+		d->in_stop = false;
+		return ret;
+	}
 	return 0;
 }
 
diff --git a/tools/perf/util/dlfilter.h b/tools/perf/util/dlfilter.h
index a1ed38da3ce6..505980442360 100644
--- a/tools/perf/util/dlfilter.h
+++ b/tools/perf/util/dlfilter.h
@@ -23,6 +23,10 @@ struct dlfilter {
 	void				*data;
 	struct perf_session		*session;
 	bool				ctx_valid;
+	bool				in_start;
+	bool				in_stop;
+	int				dlargc;
+	char				**dlargv;
 
 	union perf_event		*event;
 	struct perf_sample		*sample;
@@ -47,7 +51,7 @@ struct dlfilter {
 	struct perf_dlfilter_fns *fns;
 };
 
-struct dlfilter *dlfilter__new(const char *file);
+struct dlfilter *dlfilter__new(const char *file, int dlargc, char **dlargv);
 
 int dlfilter__start(struct dlfilter *d, struct perf_session *session);
 
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index 31ad4c100181..35e03aa41a7f 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -90,8 +90,10 @@ struct perf_dlfilter_fns {
 	const struct perf_dlfilter_al *(*resolve_ip)(void *ctx);
 	/* Return information about addr (if addr_correlates_sym) */
 	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
+	/* Return arguments from --dlarg option */
+	char **(*args)(void *ctx, int *dlargc);
 	/* Reserved */
-	void *(*reserved[126])(void *);
+	void *(*reserved[125])(void *);
 };
 
 /*
-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V2 05/10] perf build: Install perf_dlfilter.h
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (3 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 04/10] perf script: Add option to pass arguments to dlfilters Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 06/10] perf dlfilter: Add resolve_address() to perf_dlfilter_fns Adrian Hunter
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Users of the --dlfilter option need to include perf_dlfilter.h
in their filters. Install it to the include path.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Makefile.config | 3 +++
 tools/perf/Makefile.perf   | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 8dc7cef684dc..eb8e487ef90b 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -1117,6 +1117,8 @@ prefix ?= $(HOME)
 endif
 bindir_relative = bin
 bindir = $(abspath $(prefix)/$(bindir_relative))
+includedir_relative = include
+includedir = $(abspath $(prefix)/$(includedir_relative))
 mandir = share/man
 infodir = share/info
 perfexecdir = libexec/perf-core
@@ -1149,6 +1151,7 @@ ETC_PERFCONFIG_SQ = $(subst ','\'',$(ETC_PERFCONFIG))
 STRACE_GROUPS_DIR_SQ = $(subst ','\'',$(STRACE_GROUPS_DIR))
 DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
 bindir_SQ = $(subst ','\'',$(bindir))
+includedir_SQ = $(subst ','\'',$(includedir))
 mandir_SQ = $(subst ','\'',$(mandir))
 infodir_SQ = $(subst ','\'',$(infodir))
 perfexecdir_SQ = $(subst ','\'',$(perfexecdir))
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index e47f04e5b51e..c9e0de5b00c1 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -923,7 +923,9 @@ install-tools: all install-gtk
 	$(call QUIET_INSTALL, binaries) \
 		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(bindir_SQ)'; \
 		$(INSTALL) $(OUTPUT)perf '$(DESTDIR_SQ)$(bindir_SQ)'; \
-		$(LN) '$(DESTDIR_SQ)$(bindir_SQ)/perf' '$(DESTDIR_SQ)$(bindir_SQ)/trace'
+		$(LN) '$(DESTDIR_SQ)$(bindir_SQ)/perf' '$(DESTDIR_SQ)$(bindir_SQ)/trace'; \
+		$(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(includedir_SQ)/perf'; \
+		$(INSTALL) util/perf_dlfilter.h -t '$(DESTDIR_SQ)$(includedir_SQ)/perf'
 ifndef NO_PERF_READ_VDSO32
 	$(call QUIET_INSTALL, perf-read-vdso32) \
 		$(INSTALL) $(OUTPUT)perf-read-vdso32 '$(DESTDIR_SQ)$(bindir_SQ)';
-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V2 06/10] perf dlfilter: Add resolve_address() to perf_dlfilter_fns
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (4 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 05/10] perf build: Install perf_dlfilter.h Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 07/10] perf dlfilter: Add insn() " Adrian Hunter
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add a function, for use by dlfilters, to resolve addresses from branch
stacks or callchains.
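
The calling convention (set al->size first, check the return value) can be
sketched as follows. This is illustrative only: both structs are abridged
stand-ins for the ones in perf_dlfilter.h, address_is_kernel() is a
hypothetical helper, and stub_resolve_address() merely imitates perf's side
so the sketch can be exercised outside perf. Inside perf, the call only
succeeds from 'filter_event' or 'filter_event_early'.

```c
#include <stdint.h>
#include <string.h>

/* Abridged stand-ins; a real filter includes perf_dlfilter.h instead */
struct perf_dlfilter_al {
	uint32_t size;		/* set by the caller before the call */
	const char *sym;
	uint8_t is_kernel_ip;
};

struct perf_dlfilter_fns {
	int32_t (*resolve_address)(void *ctx, uint64_t address,
				   struct perf_dlfilter_al *al);
};

/* perf populates this object when it loads the shared object */
struct perf_dlfilter_fns perf_dlfilter_fns;

/* Hypothetical helper: 1 for a kernel address, 0 for user, -1 on failure */
int address_is_kernel(void *ctx, uint64_t address)
{
	struct perf_dlfilter_al al;

	memset(&al, 0, sizeof(al));
	al.size = sizeof(al);	/* tells perf how much it may copy back */
	if (perf_dlfilter_fns.resolve_address(ctx, address, &al))
		return -1;
	return al.is_kernel_ip ? 1 : 0;
}

/*
 * Hypothetical stub imitating perf: fill a local copy, then copy back no
 * more than the caller's declared size and preserve that size field.
 */
int32_t stub_resolve_address(void *ctx, uint64_t address,
			     struct perf_dlfilter_al *al)
{
	struct perf_dlfilter_al tmp = { .sym = "stub_symbol" };
	uint32_t sz;

	(void)ctx;
	if (!al)
		return -1;
	tmp.is_kernel_ip = (address >> 63) & 1;	/* crude stand-in rule */
	sz = al->size;
	memcpy(al, &tmp, sz < sizeof(tmp) ? sz : sizeof(tmp));
	al->size = sz;
	return 0;
}
```

Having the caller pass its own struct size lets perf copy back only as many
bytes as the filter was compiled to expect, which keeps older filters working
if the struct grows later.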

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt |  6 ++++-
 tools/perf/util/dlfilter.c                 | 29 ++++++++++++++++++++++
 tools/perf/util/perf_dlfilter.h            |  7 +++++-
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index 5795ab3ca23b..642323ac8934 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -123,7 +123,8 @@ struct perf_dlfilter_fns {
 	const struct perf_dlfilter_al *(*resolve_ip)(void *ctx);
 	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
 	char **(*args)(void *ctx, int *dlargc);
-	void *(*reserved[125])(void *);
+	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
+	void *(*reserved[124])(void *);
 };
 ----
 
@@ -133,6 +134,9 @@ struct perf_dlfilter_fns {
 
 'args' returns arguments from --dlarg options.
 
+'resolve_address' provides information about 'address'. al->size must be set
+before calling. Returns 0 on success, -1 otherwise.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index eaa3cea49178..4b03227541d3 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -149,10 +149,39 @@ static char **dlfilter__args(void *ctx, int *dlargc)
 	return d->dlargv;
 }
 
+static __s32 dlfilter__resolve_address(void *ctx, __u64 address, struct perf_dlfilter_al *d_al_p)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+	struct perf_dlfilter_al d_al;
+	struct addr_location al;
+	struct thread *thread;
+	__u32 sz;
+
+	if (!d->ctx_valid || !d_al_p)
+		return -1;
+
+	thread = get_thread(d);
+	if (!thread)
+		return -1;
+
+	thread__find_symbol_fb(thread, d->sample->cpumode, address, &al);
+
+	al_to_d_al(&al, &d_al);
+
+	d_al.is_kernel_ip = machine__kernel_ip(d->machine, address);
+
+	sz = d_al_p->size;
+	memcpy(d_al_p, &d_al, min((size_t)sz, sizeof(d_al)));
+	d_al_p->size = sz;
+
+	return 0;
+}
+
 static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_ip      = dlfilter__resolve_ip,
 	.resolve_addr    = dlfilter__resolve_addr,
 	.args            = dlfilter__args,
+	.resolve_address = dlfilter__resolve_address,
 };
 
 static char *find_dlfilter(const char *file)
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index 35e03aa41a7f..dfd0f8482824 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -92,8 +92,13 @@ struct perf_dlfilter_fns {
 	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
 	/* Return arguments from --dlarg option */
 	char **(*args)(void *ctx, int *dlargc);
+	/*
+	 * Return information about address (al->size must be set before
+	 * calling). Returns 0 on success, -1 otherwise.
+	 */
+	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
 	/* Reserved */
-	void *(*reserved[125])(void *);
+	void *(*reserved[124])(void *);
 };
 
 /*
-- 
2.17.1
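
The "al->size must be set before calling" rule above can be illustrated outside perf. What follows is only a sketch of the size-negotiated copy that dlfilter__resolve_address() performs; the demo_al_v1/demo_al_v2 layouts and demo_fill() are hypothetical names made up for illustration and are not part of the dlfilter API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical newer layout, as known to the provider (perf, in the patch) */
struct demo_al_v2 {
	uint32_t size;	/* caller sets this to sizeof() of its own layout */
	uint32_t flags;
	uint64_t addr;
	uint64_t extra;	/* member that an older caller does not know about */
};

/* Hypothetical older layout that an already-built filter was compiled with */
struct demo_al_v1 {
	uint32_t size;
	uint32_t flags;
	uint64_t addr;
};

/*
 * Copy no more than the caller's declared size, then restore the caller's
 * size member, as the memcpy()/min() sequence in the patch does.
 * Returns 0 on success, -1 otherwise.
 */
int demo_fill(void *caller_al, const struct demo_al_v2 *src)
{
	uint32_t sz;
	size_t n;

	if (!caller_al || !src)
		return -1;

	/* 'size' is the first member of both layouts */
	memcpy(&sz, caller_al, sizeof(sz));
	n = sz < sizeof(*src) ? sz : sizeof(*src);
	memcpy(caller_al, src, n);
	memcpy(caller_al, &sz, sizeof(sz));

	return 0;
}
```

With this scheme a filter built against the smaller layout never has memory beyond its own structure written, which is presumably why the size member has to be filled in by the caller first.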



* [PATCH V2 07/10] perf dlfilter: Add insn() to perf_dlfilter_fns
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (5 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 06/10] perf dlfilter: Add resolve_address() to perf_dlfilter_fns Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 08/10] perf dlfilter: Add srcline() " Adrian Hunter
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add a function, for use by dlfilters, to return instruction bytes.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt |  5 +++-
 tools/perf/util/dlfilter.c                 | 32 ++++++++++++++++++++++
 tools/perf/util/perf_dlfilter.h            |  4 ++-
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index 642323ac8934..bc4dba0995d8 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -124,7 +124,8 @@ struct perf_dlfilter_fns {
 	const struct perf_dlfilter_al *(*resolve_addr)(void *ctx);
 	char **(*args)(void *ctx, int *dlargc);
 	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
-	void *(*reserved[124])(void *);
+	const __u8 *(*insn)(void *ctx, __u32 *length);
+	void *(*reserved[123])(void *);
 };
 ----
 
@@ -137,6 +138,8 @@ struct perf_dlfilter_fns {
 'resolve_address' provides information about 'address'. al->size must be set
 before calling. Returns 0 on success, -1 otherwise.
 
+'insn' returns instruction bytes and length.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index 4b03227541d3..79a899255c01 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -17,6 +17,7 @@
 #include "dso.h"
 #include "map.h"
 #include "thread.h"
+#include "trace-event.h"
 #include "symbol.h"
 #include "dlfilter.h"
 #include "perf_dlfilter.h"
@@ -177,11 +178,42 @@ static __s32 dlfilter__resolve_address(void *ctx, __u64 address, struct perf_dlf
 	return 0;
 }
 
+static const __u8 *dlfilter__insn(void *ctx, __u32 *len)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+
+	if (!len)
+		return NULL;
+
+	*len = 0;
+
+	if (!d->ctx_valid)
+		return NULL;
+
+	if (d->sample->ip && !d->sample->insn_len) {
+		struct addr_location *al = d->al;
+
+		if (!al->thread && machine__resolve(d->machine, al, d->sample) < 0)
+			return NULL;
+
+		if (al->thread->maps && al->thread->maps->machine)
+			script_fetch_insn(d->sample, al->thread, al->thread->maps->machine);
+	}
+
+	if (!d->sample->insn_len)
+		return NULL;
+
+	*len = d->sample->insn_len;
+
+	return (__u8 *)d->sample->insn;
+}
+
 static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_ip      = dlfilter__resolve_ip,
 	.resolve_addr    = dlfilter__resolve_addr,
 	.args            = dlfilter__args,
 	.resolve_address = dlfilter__resolve_address,
+	.insn            = dlfilter__insn,
 };
 
 static char *find_dlfilter(const char *file)
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index dfd0f8482824..763c5af3c5f7 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -97,8 +97,10 @@ struct perf_dlfilter_fns {
 	 * calling). Returns 0 on success, -1 otherwise.
 	 */
 	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
+	/* Return instruction bytes and length */
+	const __u8 *(*insn)(void *ctx, __u32 *length);
 	/* Reserved */
-	void *(*reserved[124])(void *);
+	void *(*reserved[123])(void *);
 };
 
 /*
-- 
2.17.1
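
As a rough illustration of how a filter might consume insn(), here is a sketch that tests whether the sampled instruction is a one-byte x86 near return (opcode 0xc3). The typedef copies the insn() signature from the patch; demo_is_near_ret() and demo_stub_insn() are hypothetical helpers, the latter only standing in for perf so the logic can be tried on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Same signature as the insn() member added by this patch */
typedef const uint8_t *(*demo_insn_fn)(void *ctx, uint32_t *length);

/*
 * Return 1 if the instruction is a one-byte x86 near return (0xc3),
 * otherwise 0. No instruction bytes available also gives 0.
 */
int demo_is_near_ret(demo_insn_fn insn, void *ctx)
{
	uint32_t len = 0;
	const uint8_t *bytes;

	if (!insn)
		return 0;

	bytes = insn(ctx, &len);
	if (!bytes || !len)
		return 0;

	return len == 1 && bytes[0] == 0xc3;
}

/* Stub provider for use outside perf: ctx points at a single opcode byte */
const uint8_t *demo_stub_insn(void *ctx, uint32_t *length)
{
	if (!ctx) {
		*length = 0;
		return NULL;
	}
	*length = 1;
	return ctx;
}
```

In an actual dlfilter the function pointer would be expected to come from the perf_dlfilter_fns table rather than a stub. Prefixed forms such as "rep ret" (f3 c3) are deliberately not matched by this sketch.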



* [PATCH V2 08/10] perf dlfilter: Add srcline() to perf_dlfilter_fns
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (6 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 07/10] perf dlfilter: Add insn() " Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 09/10] perf dlfilter: Add attr() " Adrian Hunter
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add a function, for use by dlfilters, to return source code file name and
line number.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt |  5 +++-
 tools/perf/util/dlfilter.c                 | 28 ++++++++++++++++++++++
 tools/perf/util/perf_dlfilter.h            |  4 +++-
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index bc4dba0995d8..df118ddcd7f4 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -125,7 +125,8 @@ struct perf_dlfilter_fns {
 	char **(*args)(void *ctx, int *dlargc);
 	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
 	const __u8 *(*insn)(void *ctx, __u32 *length);
-	void *(*reserved[123])(void *);
+	const char *(*srcline)(void *ctx, __u32 *line_number);
+	void *(*reserved[122])(void *);
 };
 ----
 
@@ -140,6 +141,8 @@ before calling. Returns 0 on success, -1 otherwise.
 
 'insn' returns instruction bytes and length.
 
+'srcline' returns source file name and line number.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index 79a899255c01..f4ce1a80bddc 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -19,6 +19,7 @@
 #include "thread.h"
 #include "trace-event.h"
 #include "symbol.h"
+#include "srcline.h"
 #include "dlfilter.h"
 #include "perf_dlfilter.h"
 
@@ -208,12 +209,39 @@ static const __u8 *dlfilter__insn(void *ctx, __u32 *len)
 	return (__u8 *)d->sample->insn;
 }
 
+static const char *dlfilter__srcline(void *ctx, __u32 *line_no)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+	struct addr_location *al;
+	unsigned int line = 0;
+	char *srcfile = NULL;
+	struct map *map;
+	u64 addr;
+
+	if (!d->ctx_valid || !line_no)
+		return NULL;
+
+	al = get_al(d);
+	if (!al)
+		return NULL;
+
+	map = al->map;
+	addr = al->addr;
+
+	if (map && map->dso)
+		srcfile = get_srcline_split(map->dso, map__rip_2objdump(map, addr), &line);
+
+	*line_no = line;
+	return srcfile;
+}
+
 static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_ip      = dlfilter__resolve_ip,
 	.resolve_addr    = dlfilter__resolve_addr,
 	.args            = dlfilter__args,
 	.resolve_address = dlfilter__resolve_address,
 	.insn            = dlfilter__insn,
+	.srcline         = dlfilter__srcline,
 };
 
 static char *find_dlfilter(const char *file)
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index 763c5af3c5f7..b989918056e2 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -99,8 +99,10 @@ struct perf_dlfilter_fns {
 	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
 	/* Return instruction bytes and length */
 	const __u8 *(*insn)(void *ctx, __u32 *length);
+	/* Return source file name and line number */
+	const char *(*srcline)(void *ctx, __u32 *line_number);
 	/* Reserved */
-	void *(*reserved[123])(void *);
+	void *(*reserved[122])(void *);
 };
 
 /*
-- 
2.17.1
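
One possible use of srcline() is to keep only samples that fall within a range of lines in a given source file. The sketch below assumes nothing beyond the srcline() signature in the patch; demo_in_source_range() and demo_stub_srcline() are hypothetical, and the stub exists only so that the matching logic can be run without perf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same signature as the srcline() member added by this patch */
typedef const char *(*demo_srcline_fn)(void *ctx, uint32_t *line_number);

/*
 * Return 1 if the source file name ends with 'suffix' and the line number
 * lies in [first, last], otherwise 0. A NULL file name (no source
 * information) also gives 0.
 */
int demo_in_source_range(demo_srcline_fn srcline, void *ctx,
			 const char *suffix, uint32_t first, uint32_t last)
{
	uint32_t line = 0;
	const char *file;
	size_t flen, slen;

	if (!srcline || !suffix)
		return 0;

	file = srcline(ctx, &line);
	if (!file)
		return 0;

	flen = strlen(file);
	slen = strlen(suffix);
	if (slen > flen || strcmp(file + flen - slen, suffix))
		return 0;

	return line >= first && line <= last;
}

/* Stub provider for use outside perf: always reports the same location */
const char *demo_stub_srcline(void *ctx, uint32_t *line_number)
{
	(void)ctx;
	*line_number = 42;
	return "kernel/sched/core.c";
}
```

Matching on a suffix rather than the full path is a choice made for this sketch, since the directory prefix recorded in debug information tends to vary between builds.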



* [PATCH V2 09/10] perf dlfilter: Add attr() to perf_dlfilter_fns
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (7 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 08/10] perf dlfilter: Add srcline() " Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 13:18 ` [PATCH V2 10/10] perf dlfilter: Add object_code() " Adrian Hunter
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add a function, for use by dlfilters, to return the perf_event_attr
structure.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt |  5 ++++-
 tools/perf/util/dlfilter.c                 | 11 +++++++++++
 tools/perf/util/perf_dlfilter.h            |  4 +++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index df118ddcd7f4..6b1d9da16feb 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -126,7 +126,8 @@ struct perf_dlfilter_fns {
 	__s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al);
 	const __u8 *(*insn)(void *ctx, __u32 *length);
 	const char *(*srcline)(void *ctx, __u32 *line_number);
-	void *(*reserved[122])(void *);
+	struct perf_event_attr *(*attr)(void *ctx);
+	void *(*reserved[121])(void *);
 };
 ----
 
@@ -143,6 +144,8 @@ before calling. Returns 0 on success, -1 otherwise.
 
 'srcline' returns source file name and line number.
 
+'attr' returns perf_event_attr, refer <linux/perf_event.h>.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index f4ce1a80bddc..e50d524906d6 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -235,6 +235,16 @@ static const char *dlfilter__srcline(void *ctx, __u32 *line_no)
 	return srcfile;
 }
 
+static struct perf_event_attr *dlfilter__attr(void *ctx)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+
+	if (!d->ctx_valid)
+		return NULL;
+
+	return &d->evsel->core.attr;
+}
+
 static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_ip      = dlfilter__resolve_ip,
 	.resolve_addr    = dlfilter__resolve_addr,
@@ -242,6 +252,7 @@ static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_address = dlfilter__resolve_address,
 	.insn            = dlfilter__insn,
 	.srcline         = dlfilter__srcline,
+	.attr            = dlfilter__attr,
 };
 
 static char *find_dlfilter(const char *file)
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index b989918056e2..f3fc92fcb3c0 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -101,8 +101,10 @@ struct perf_dlfilter_fns {
 	const __u8 *(*insn)(void *ctx, __u32 *length);
 	/* Return source file name and line number */
 	const char *(*srcline)(void *ctx, __u32 *line_number);
+	/* Return perf_event_attr, refer <linux/perf_event.h> */
+	struct perf_event_attr *(*attr)(void *ctx);
 	/* Reserved */
-	void *(*reserved[122])(void *);
+	void *(*reserved[121])(void *);
 };
 
 /*
-- 
2.17.1



* [PATCH V2 10/10] perf dlfilter: Add object_code() to perf_dlfilter_fns
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (8 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 09/10] perf dlfilter: Add attr() " Adrian Hunter
@ 2021-06-27 13:18 ` Adrian Hunter
  2021-06-27 16:13 ` [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Andi Kleen
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-27 13:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

Add a function, for use by dlfilters, to read object code.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-dlfilter.txt |  5 +++-
 tools/perf/util/dlfilter.c                 | 34 ++++++++++++++++++++++
 tools/perf/util/perf_dlfilter.h            |  4 ++-
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt
index 6b1d9da16feb..02842cb4cf90 100644
--- a/tools/perf/Documentation/perf-dlfilter.txt
+++ b/tools/perf/Documentation/perf-dlfilter.txt
@@ -127,7 +127,8 @@ struct perf_dlfilter_fns {
 	const __u8 *(*insn)(void *ctx, __u32 *length);
 	const char *(*srcline)(void *ctx, __u32 *line_number);
 	struct perf_event_attr *(*attr)(void *ctx);
-	void *(*reserved[121])(void *);
+	__s32 (*object_code)(void *ctx, __u64 ip, void *buf, __u32 len);
+	void *(*reserved[120])(void *);
 };
 ----
 
@@ -146,6 +147,8 @@ before calling. Returns 0 on success, -1 otherwise.
 
 'attr' returns perf_event_attr, refer <linux/perf_event.h>.
 
+'object_code' reads object code and returns the number of bytes read.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index e50d524906d6..ca33fbc5efde 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -245,6 +245,39 @@ static struct perf_event_attr *dlfilter__attr(void *ctx)
 	return &d->evsel->core.attr;
 }
 
+static __s32 dlfilter__object_code(void *ctx, __u64 ip, void *buf, __u32 len)
+{
+	struct dlfilter *d = (struct dlfilter *)ctx;
+	struct addr_location *al;
+	struct addr_location a;
+	struct map *map;
+	u64 offset;
+
+	if (!d->ctx_valid)
+		return -1;
+
+	al = get_al(d);
+	if (!al)
+		return -1;
+
+	map = al->map;
+
+	if (map && ip >= map->start && ip < map->end &&
+	    machine__kernel_ip(d->machine, ip) == machine__kernel_ip(d->machine, d->sample->ip))
+		goto have_map;
+
+	thread__find_map_fb(al->thread, d->sample->cpumode, ip, &a);
+	if (!a.map)
+		return -1;
+
+	map = a.map;
+have_map:
+	offset = map->map_ip(map, ip);
+	if (ip + len >= map->end)
+		len = map->end - ip;
+	return dso__data_read_offset(map->dso, d->machine, offset, buf, len);
+}
+
 static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.resolve_ip      = dlfilter__resolve_ip,
 	.resolve_addr    = dlfilter__resolve_addr,
@@ -253,6 +286,7 @@ static const struct perf_dlfilter_fns perf_dlfilter_fns = {
 	.insn            = dlfilter__insn,
 	.srcline         = dlfilter__srcline,
 	.attr            = dlfilter__attr,
+	.object_code     = dlfilter__object_code,
 };
 
 static char *find_dlfilter(const char *file)
diff --git a/tools/perf/util/perf_dlfilter.h b/tools/perf/util/perf_dlfilter.h
index f3fc92fcb3c0..3eef03d661b4 100644
--- a/tools/perf/util/perf_dlfilter.h
+++ b/tools/perf/util/perf_dlfilter.h
@@ -103,8 +103,10 @@ struct perf_dlfilter_fns {
 	const char *(*srcline)(void *ctx, __u32 *line_number);
 	/* Return perf_event_attr, refer <linux/perf_event.h> */
 	struct perf_event_attr *(*attr)(void *ctx);
+	/* Read object code, return number of bytes read */
+	__s32 (*object_code)(void *ctx, __u64 ip, void *buf, __u32 len);
 	/* Reserved */
-	void *(*reserved[121])(void *);
+	void *(*reserved[120])(void *);
 };
 
 /*
-- 
2.17.1
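
The length handling in dlfilter__object_code() means a caller can get fewer bytes than it asked for when the read would run past the end of the map. A stand-alone sketch of that clamp follows; demo_clamp_len() is a hypothetical helper, and returning 0 for an address outside the map is an assumption of the sketch (the patch instead looks up another map in that case).

```c
#include <stdint.h>

/*
 * Clamp a read of 'len' bytes at 'ip' so that it does not extend past the
 * end of the mapping [map_start, map_end). Returns the number of bytes
 * that may be read, or 0 if 'ip' is outside the mapping.
 */
uint32_t demo_clamp_len(uint64_t ip, uint32_t len,
			uint64_t map_start, uint64_t map_end)
{
	if (ip < map_start || ip >= map_end)
		return 0;

	/*
	 * Gives the same result as "if (ip + len >= map_end)" in the patch
	 * whenever ip + len does not wrap, and avoids the addition.
	 */
	if (len > map_end - ip)
		len = (uint32_t)(map_end - ip);

	return len;
}
```

A filter calling object_code() would therefore want to check the returned byte count rather than assume the whole buffer was filled.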



* Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (9 preceding siblings ...)
  2021-06-27 13:18 ` [PATCH V2 10/10] perf dlfilter: Add object_code() " Adrian Hunter
@ 2021-06-27 16:13 ` Andi Kleen
  2021-06-28  7:23   ` Adrian Hunter
  2021-06-29 19:28 ` Namhyung Kim
  2021-07-01 17:40 ` Arnaldo Carvalho de Melo
  12 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2021-06-27 16:13 UTC (permalink / raw)
  To: Adrian Hunter, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel


On 6/27/2021 6:18 AM, Adrian Hunter wrote:
> Hi
>   
> In some cases, users want to filter very large amounts of data (e.g. from
> AUX area tracing like Intel PT) looking for something specific. While
> scripting such as Python can be used, Python is 10 to 20 times slower than
> C. So define a C API so that custom filters can be written and loaded.

While I appreciate this for complex cases, in my experience filtering is 
usually just a simple expression. It would be nice to also have a way to 
do this reasonably fast without having to write a custom C file.   Is 
the 10x-20x overhead just the python interpreter, or is it related to 
perf? Maybe we could have some kind of python fast path just for 
filters? Or maybe the alternative would be to have a frontend in perf 
that can automatically generate/compile such a C filter based on a 
simple expression, but I'm not sure if that would be much simpler.

-Andi



* Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-27 16:13 ` [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Andi Kleen
@ 2021-06-28  7:23   ` Adrian Hunter
  2021-06-28 14:57     ` Andi Kleen
  0 siblings, 1 reply; 17+ messages in thread
From: Adrian Hunter @ 2021-06-28  7:23 UTC (permalink / raw)
  To: Andi Kleen, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

On 27/06/21 7:13 pm, Andi Kleen wrote:
> 
> On 6/27/2021 6:18 AM, Adrian Hunter wrote:
>> Hi In some cases, users want to filter very large amounts of data
>> (e.g. from AUX area tracing like Intel PT) looking for something
>> specific. While scripting such as Python can be used, Python is 10
>> to 20 times slower than C. So define a C API so that custom filters
>> can be written and loaded.
> 
> While I appreciate this for complex cases, in my experience filtering
> is usually just a simple expression. It would be nice to also have a
> way to do this reasonably fast without having to write a custom C

I do not agree that writing C filters is a hassle e.g. a minimal do-nothing
filter is only a few lines:

#include <perf/perf_dlfilter.h>

int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	return 0;
}

(Actually, the filter program does not have to have any LOC at all, but that
is not much of an example)

Additionally, a script to do the build is fairly trivial e.g. I use this:

$ cat `which make-dlfilter.sh `
#!/bin/bash

set -ex

if test -z "${1}" ; then
        echo "Name required"
        exit 1
fi

name="${1%.c}"

if test "${name}" = "${1}" ; then
        name="${1%.so}"
fi

gcc -c -I ~/include -fpic "${name}.c"

gcc -shared -o "${name}.so" "${name}.o"


> file.   Is the 10x-20x overhead just the python interpreter, or is it
> related to perf?


AFAICT the Python C API used to interface to Python performs fairly similarly
to the Python interpreter.

>                  Maybe we could have some kind of python fast path
> just for filters?

I expect there are ways to make it more efficient, but I doubt it would ever
come close to C.

> just for filters? Or maybe the alternative would be to have a
> frontend in perf that can automatically generate/compile such a C
> filter based on a simple expression, but I'm not sure if that would
> be much simpler.

If gcc is available, perf script could, in fact, build the .so on the fly
since the compile time is very quick.

Another point is that filters can be used for more than just filtering.
Here is an example which sums cycles per-cpu and prints them, and the difference
to the last print, at the beginning of each line.  I think this was something
you were interested in doing?


#include <perf/perf_dlfilter.h>
#include <stdio.h>

#define MAX_CPU 4096

__u64 cycles[MAX_CPU];
__u64 cycles_rpt[MAX_CPU];

/* Early hook: accumulate the cycle count per CPU */
int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	__s32 cpu = sample->cpu;

	if (cpu >= 0 && cpu < MAX_CPU)
		cycles[cpu] += sample->cyc_cnt;
	return 0;
}

/* Print the running total and the difference to the last print for this CPU */
int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
{
	__s32 cpu = sample->cpu;

	if (cpu >= 0 && cpu < MAX_CPU) {
		printf("%10llu %10llu ", cycles[cpu], cycles[cpu] - cycles_rpt[cpu]);
		cycles_rpt[cpu] = cycles[cpu];
	} else {
		/* CPU out of range: pad to the same width as the two columns */
		printf("%22s", "");
	}
	return 0;
}

const char *filter_description(const char **long_description)
{
	return "Print the number of cycles at the start of each line";
}


* Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-28  7:23   ` Adrian Hunter
@ 2021-06-28 14:57     ` Andi Kleen
  2021-06-28 19:30       ` Adrian Hunter
  0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2021-06-28 14:57 UTC (permalink / raw)
  To: Adrian Hunter, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel


On 6/28/2021 12:23 AM, Adrian Hunter wrote:
> On 27/06/21 7:13 pm, Andi Kleen wrote:
>> On 6/27/2021 6:18 AM, Adrian Hunter wrote:
>>> Hi In some cases, users want to filter very large amounts of data
>>> (e.g. from AUX area tracing like Intel PT) looking for something
>>> specific. While scripting such as Python can be used, Python is 10
>>> to 20 times slower than C. So define a C API so that custom filters
>>> can be written and loaded.
>> While I appreciate this for complex cases, in my experience filtering
>> is usually just a simple expression. It would be nice to also have a
>> way to do this reasonably fast without having to write a custom C
> I do not agree that writing C filters is a hassle e.g. a minimal do-nothing
> filter is only a few lines:

It still doesn't seem user friendly. Maybe it's obvious to you, but I 
suspect we left behind most of even the sophisticated perf users here.


>
>>                   Maybe we could have some kind of python fast path
>> just for filters?
> I expect there are ways to make it more efficient, but I doubt it would ever
> come close to C.

If it's within 2-3x I guess it would be ok. For any larger data files we 
should parallelize anyways, and that works fine with the --time x/y 
method (although it usually also needs some custom scripting, perhaps 
need to figure out how to make it more user friendly)


>
>> just for filters? Or maybe the alternative would be to have a
>> frontend in perf that can automatically generate/compile such a C
>> filter based on a simple expression, but I'm not sure if that would
>> be much simpler.
> If gcc is available, perf script could, in fact, build the .so on the fly
> since the compile time is very quick.
>
> Another point is that filters can be used for more than just filtering.
> Here is an example which sums cycles per-cpu and prints them, and the difference
> to the last print, at the beginning of each line.  I think this was something
> you were interested in doing?

Yes that's great and useful, but I would prefer to not maintain custom 
plugins for it. Often when I write a script it has to run in all kinds 
of weird environments that some random person installed, and it's not 
clear how portable building C will be there. And I doubt I can just copy 
the .so files around.

BTW I'm not arguing to not do the plugin (I can imagine extreme cases 
where such a plugin is the best option), but really for most of these 
things there should be easier and more portable alternatives, even if 
they are slightly slower.

-Andi



* Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-28 14:57     ` Andi Kleen
@ 2021-06-28 19:30       ` Adrian Hunter
  0 siblings, 0 replies; 17+ messages in thread
From: Adrian Hunter @ 2021-06-28 19:30 UTC (permalink / raw)
  To: Andi Kleen, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

On 28/06/21 5:57 pm, Andi Kleen wrote:
> 
> On 6/28/2021 12:23 AM, Adrian Hunter wrote:
>> On 27/06/21 7:13 pm, Andi Kleen wrote:
>>> On 6/27/2021 6:18 AM, Adrian Hunter wrote:
>>>> Hi In some cases, users want to filter very large amounts of data
>>>> (e.g. from AUX area tracing like Intel PT) looking for something
>>>> specific. While scripting such as Python can be used, Python is 10
>>>> to 20 times slower than C. So define a C API so that custom filters
>>>> can be written and loaded.
>>> While I appreciate this for complex cases, in my experience filtering
>>> is usually just a simple expression. It would be nice to also have a
>>> way to do this reasonably fast without having to write a custom C
>> I do not agree that writing C filters is a hassle e.g. a minimal do-nothing
>> filter is only a few lines:
> 
> It still doesn't seem user friendly. Maybe it's obvious to you, but I suspect we left behind most of even the sophisticated perf users here.
> 

Fair enough.

> 
>>
>>>                   Maybe we could have some kind of python fast path
>>> just for filters?
>> I expect there are ways to make it more efficient, but I doubt it would ever
>> come close to C.
> 
> If it's within 2-3x I guess it would be ok. For any larger data files we should parallelize anyways, and that works fine with the --time x/y method (although it usually also needs some custom scripting, perhaps need to figure out how to make it more user friendly)

I am not sure Python could do that, maybe something else.

Parallelization is on the list of things to do.  Splitting by time is OK but gets trickier if you want to put the results back together in time order.  Also call chains get broken at the splits in time.

> 
> 
>>
>>> just for filters? Or maybe the alternative would be to have a
>>> frontend in perf that can automatically generate/compile such a C
>>> filter based on a simple expression, but I'm not sure if that would
>>> be much simpler.
>> If gcc is available, perf script could, in fact, build the .so on the fly
>> since the compile time is very quick.
>>
>> Another point is that filters can be used for more than just filtering.
>> Here is an example which sums cycles per-cpu and prints them, and the difference
>> to the last print, at the beginning of each line.  I think this was something
>> you were interested in doing?
> 
> Yes that's great and useful, but I would prefer to not maintain custom plugins for it. Often when I write a script it has to run in all kinds of weird environments that some random person installed, and it's not clear how portable building C will be there. And I doubt I can just copy the .so files around.

That is true.  .so files have limitations.

> 
> BTW I'm not arguing to not do the plugin (I can imagine extreme cases where such a plugin is the best option), but really for most of these things there should be easier and more portable alternatives, even if they are slightly slower.

Right.  The documentation could definitely point out limitations and more user-friendly alternatives like using Python scripting.






* Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (10 preceding siblings ...)
  2021-06-27 16:13 ` [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Andi Kleen
@ 2021-06-29 19:28 ` Namhyung Kim
  2021-07-01 17:40 ` Arnaldo Carvalho de Melo
  12 siblings, 0 replies; 17+ messages in thread
From: Namhyung Kim @ 2021-06-29 19:28 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Leo Yan, Kan Liang, linux-perf-users,
	linux-kernel

Hi Adrian,

On Sun, Jun 27, 2021 at 6:18 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> Hi
>
> In some cases, users want to filter very large amounts of data (e.g. from
> AUX area tracing like Intel PT) looking for something specific. While
> scripting such as Python can be used, Python is 10 to 20 times slower than
> C. So define a C API so that custom filters can be written and loaded.

Thanks for your work!  I guess we can use this for perf report (and others)
to have a custom filter too.

Thanks,
Namhyung

>
> This is V2.
>
> The main patch is patch 1.
>
> The other patches add more functionality, except for patch 5 which installs
> the C API header file.
>
>
> Changes in V2:
>     perf script: Move filter_cpu() earlier
>     perf script: Move filtering before scripting
>     perf script: Share addr_al between functions
>         Dropped because they have now been applied.
>
>     perf script: Add API for filtering via dynamically loaded shared object
>         Move 2 members of struct perf_dlfilter_sample
>         Add 'ctx' as an argument to 'start' and 'stop'
>         Find dlfilter .so files in current directory or exec-path/dlfilters
>
>     perf script: Add option to list dlfilters
>         New patch
>
>     perf script: Add option to pass arguments to dlfilters
>         New patch
>
>
> Adrian Hunter (10):
>       perf script: Add API for filtering via dynamically loaded shared object
>       perf script: Add dlfilter__filter_event_early()
>       perf script: Add option to list dlfilters
>       perf script: Add option to pass arguments to dlfilters
>       perf build: Install perf_dlfilter.h
>       perf dlfilter: Add resolve_address() to perf_dlfilter_fns
>       perf dlfilter: Add insn() to perf_dlfilter_fns
>       perf dlfilter: Add srcline() to perf_dlfilter_fns
>       perf dlfilter: Add attr() to perf_dlfilter_fns
>       perf dlfilter: Add object_code() to perf_dlfilter_fns
>
>  tools/perf/Documentation/perf-dlfilter.txt | 251 ++++++++++++
>  tools/perf/Documentation/perf-script.txt   |  15 +-
>  tools/perf/Makefile.config                 |   3 +
>  tools/perf/Makefile.perf                   |   4 +-
>  tools/perf/builtin-script.c                |  86 +++-
>  tools/perf/util/Build                      |   1 +
>  tools/perf/util/dlfilter.c                 | 615 +++++++++++++++++++++++++++++
>  tools/perf/util/dlfilter.h                 |  97 +++++
>  tools/perf/util/perf_dlfilter.h            | 150 +++++++
>  9 files changed, 1211 insertions(+), 11 deletions(-)
>  create mode 100644 tools/perf/Documentation/perf-dlfilter.txt
>  create mode 100644 tools/perf/util/dlfilter.c
>  create mode 100644 tools/perf/util/dlfilter.h
>  create mode 100644 tools/perf/util/perf_dlfilter.h
>
>
> Regards
> Adrian


* Re: [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object
  2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
                   ` (11 preceding siblings ...)
  2021-06-29 19:28 ` Namhyung Kim
@ 2021-07-01 17:40 ` Arnaldo Carvalho de Melo
  12 siblings, 0 replies; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-01 17:40 UTC
  To: Adrian Hunter
  Cc: Jiri Olsa, Andi Kleen, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Namhyung Kim, Leo Yan, Kan Liang, linux-perf-users, linux-kernel

On Sun, Jun 27, 2021 at 04:18:08PM +0300, Adrian Hunter wrote:
> Hi
>  
> In some cases, users want to filter very large amounts of data (e.g. from
> AUX area tracing like Intel PT) looking for something specific. While
> scripting such as Python can be used, Python is 10 to 20 times slower than
> C. So define a C API so that custom filters can be written and loaded.
> 
> This is V2.
> 
> The main patch is patch 1.
> 
> The other patches add more functionality, except for patch 5 which installs
> the C API header file.

Thanks! applied.

Please consider adding a 'perf test' entry that checks the output is
what is expected, and that exercises this code so that 'perf test'
segfaults if we break something it uses.

- Arnaldo

Thread overview: 17+ messages
2021-06-27 13:18 [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 01/10] " Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 02/10] perf script: Add dlfilter__filter_event_early() Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 03/10] perf script: Add option to list dlfilters Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 04/10] perf script: Add option to pass arguments to dlfilters Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 05/10] perf build: Install perf_dlfilter.h Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 06/10] perf dlfilter: Add resolve_address() to perf_dlfilter_fns Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 07/10] perf dlfilter: Add insn() " Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 08/10] perf dlfilter: Add srcline() " Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 09/10] perf dlfilter: Add attr() " Adrian Hunter
2021-06-27 13:18 ` [PATCH V2 10/10] perf dlfilter: Add object_code() " Adrian Hunter
2021-06-27 16:13 ` [PATCH V2 00/10] perf script: Add API for filtering via dynamically loaded shared object Andi Kleen
2021-06-28  7:23   ` Adrian Hunter
2021-06-28 14:57     ` Andi Kleen
2021-06-28 19:30       ` Adrian Hunter
2021-06-29 19:28 ` Namhyung Kim
2021-07-01 17:40 ` Arnaldo Carvalho de Melo
