* [PATCH V2 0/2] bpf: enable/disable events stored in PERF_EVENT_ARRAY maps trace data output when perf sampling
@ 2015-10-14 12:37 Kaixu Xia
  2015-10-14 12:37 ` [PATCH V2 1/2] bpf: control the trace data output on current cpu " Kaixu Xia
  2015-10-14 12:37 ` [PATCH V2 2/2] bpf: control a set of perf events by creating a new ioctl PERF_EVENT_IOC_SET_ENABLER Kaixu Xia
  0 siblings, 2 replies; 8+ messages in thread
From: Kaixu Xia @ 2015-10-14 12:37 UTC (permalink / raw)
  To: ast, davem, acme, mingo, a.p.zijlstra, masami.hiramatsu.pt,
	jolsa, daniel
  Cc: xiakaixu, wangnan0, linux-kernel, pi3orama, hekuang, netdev

Previous RFC patch url:
https://lkml.org/lkml/2015/10/12/135

changes in V2:
 - rebase the whole patch set onto the net-next tree (4b418bf);
 - remove the added flag perf_sample_disable in bpf_map;
 - move the added fields in struct perf_event to a proper place
   to avoid cacheline misses;
 - use a counter-based flag instead of a 0/1 switch to account
   for reentering events;
 - use a single helper bpf_perf_event_sample_control() to enable/
   disable events;
 - implement a lightweight solution to control the trace data
   output on the current cpu;
 - create a new ioctl PERF_EVENT_IOC_SET_ENABLER to enable/disable
   a set of events;

Before this patch,
   $ ./perf record -e cycles -a sleep 1
   $ ./perf report --stdio
	# To display the perf.data header info, please use --header/--header-only option
	#
	#
	# Total Lost Samples: 0
	#
	# Samples: 643  of event 'cycles'
	# Event count (approx.): 128313904
	...

After this patch,
   $ ./perf record -e pmux=cycles --event perf-bpf.o/my_cycles_map=pmux/ -a sleep 1
   $ ./perf report --stdio
	# To display the perf.data header info, please use --header/--header-only option
	#
	#
	# Total Lost Samples: 0
	#
	# Samples: 25  of event 'cycles'
	# Event count (approx.): 5788400
	...

The bpf program example:

  struct bpf_map_def SEC("maps") my_cycles_map = {
          .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
          .key_size = sizeof(int),
          .value_size = sizeof(u32),
          .max_entries = 32, 
  };

  SEC("enter=sys_write")
  int bpf_prog_1(struct pt_regs *ctx)
  {
          bpf_perf_event_sample_control(&my_cycles_map, 32, 0); 
          return 0;
  }

  SEC("exit=sys_write%return")
  int bpf_prog_2(struct pt_regs *ctx)
  {
          bpf_perf_event_sample_control(&my_cycles_map, 32, 1); 
          return 0;
  }

Considering sampling control at the function level: if we do not use
the PERF_EVENT_IOC_SET_ENABLER ioctl on the perf user side, we must
set a high sample frequency to dump the trace data.

Kaixu Xia (2):
  bpf: control the trace data output on current cpu when perf sampling
  bpf: control a set of perf events by creating a new ioctl
    PERF_EVENT_IOC_SET_ENABLER

 include/linux/perf_event.h      |  2 ++
 include/uapi/linux/bpf.h        |  5 ++++
 include/uapi/linux/perf_event.h |  4 +++-
 kernel/bpf/verifier.c           |  3 ++-
 kernel/events/core.c            | 53 +++++++++++++++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c        | 35 +++++++++++++++++++++++++++
 6 files changed, 100 insertions(+), 2 deletions(-)

-- 
1.8.3.4



* [PATCH V2 1/2] bpf: control the trace data output on current cpu when perf sampling
  2015-10-14 12:37 [PATCH V2 0/2] bpf: enable/disable events stored in PERF_EVENT_ARRAY maps trace data output when perf sampling Kaixu Xia
@ 2015-10-14 12:37 ` Kaixu Xia
  2015-10-14 13:49   ` [RFC PATCH] bpf: bpf_perf_event_sample_control_proto can be static kbuild test robot
                     ` (2 more replies)
  2015-10-14 12:37 ` [PATCH V2 2/2] bpf: control a set of perf events by creating a new ioctl PERF_EVENT_IOC_SET_ENABLER Kaixu Xia
  1 sibling, 3 replies; 8+ messages in thread
From: Kaixu Xia @ 2015-10-14 12:37 UTC (permalink / raw)
  To: ast, davem, acme, mingo, a.p.zijlstra, masami.hiramatsu.pt,
	jolsa, daniel
  Cc: xiakaixu, wangnan0, linux-kernel, pi3orama, hekuang, netdev

This patch adds the flag sample_disable to control the trace data
output process during perf sampling. By setting this flag and
integrating it with eBPF, we can control the data output process and
get only the samples we are most interested in.

The bpf helper bpf_perf_event_sample_control() can control the
perf_event on the current cpu.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
---
 include/linux/perf_event.h      |  1 +
 include/uapi/linux/bpf.h        |  5 +++++
 include/uapi/linux/perf_event.h |  3 ++-
 kernel/bpf/verifier.c           |  3 ++-
 kernel/events/core.c            | 13 +++++++++++++
 kernel/trace/bpf_trace.c        | 32 ++++++++++++++++++++++++++++++++
 6 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 092a0e8..dcbf7d5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -472,6 +472,7 @@ struct perf_event {
 	struct irq_work			pending;
 
 	atomic_t			event_limit;
+	atomic_t			sample_disable;
 
 	void (*destroy)(struct perf_event *);
 	struct rcu_head			rcu_head;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 564f1f0..e2c99c6 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -287,6 +287,11 @@ enum bpf_func_id {
 	 * Return: realm if != 0
 	 */
 	BPF_FUNC_get_route_realm,
+
+	/**
+	 * u64 bpf_perf_event_sample_control(&map, index, flag)
+	 */
+	BPF_FUNC_perf_event_sample_control,
 	__BPF_FUNC_MAX_ID,
 };
 
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 2881145..a2b9dd7 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -331,7 +331,8 @@ struct perf_event_attr {
 				comm_exec      :  1, /* flag comm events that are due to an exec */
 				use_clockid    :  1, /* use @clockid for time fields */
 				context_switch :  1, /* context switch data */
-				__reserved_1   : 37;
+				sample_disable :  1, /* don't output data on samples */
+				__reserved_1   : 36;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1d6b97b..3ffe630 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -245,6 +245,7 @@ static const struct {
 } func_limit[] = {
 	{BPF_MAP_TYPE_PROG_ARRAY, BPF_FUNC_tail_call},
 	{BPF_MAP_TYPE_PERF_EVENT_ARRAY, BPF_FUNC_perf_event_read},
+	{BPF_MAP_TYPE_PERF_EVENT_ARRAY, BPF_FUNC_perf_event_sample_control},
 };
 
 static void print_verifier_state(struct verifier_env *env)
@@ -910,7 +911,7 @@ static int check_map_func_compatibility(struct bpf_map *map, int func_id)
 		 * don't allow any other map type to be passed into
 		 * the special func;
 		 */
-		if (bool_map != bool_func)
+		if (bool_func && bool_map != bool_func)
 			return -EINVAL;
 	}
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b11756f..942351c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6337,6 +6337,9 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
+	if (!atomic_read(&event->sample_disable))
+		return ret;
+
 	if (event->overflow_handler)
 		event->overflow_handler(event, data, regs);
 	else
@@ -7709,6 +7712,14 @@ static void account_event(struct perf_event *event)
 	account_event_cpu(event, event->cpu);
 }
 
+static void perf_event_check_sample_flag(struct perf_event *event)
+{
+	if (event->attr.sample_disable == 1)
+		atomic_set(&event->sample_disable, 0);
+	else
+		atomic_set(&event->sample_disable, 1);
+}
+
 /*
  * Allocate and initialize a event structure
  */
@@ -7840,6 +7851,8 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		}
 	}
 
+	perf_event_check_sample_flag(event);
+
 	return event;
 
 err_per_task:
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0fe96c7..f261333 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -215,6 +215,36 @@ const struct bpf_func_proto bpf_perf_event_read_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag, u64 r4, u64 r5)
+{
+	struct bpf_map *map = (struct bpf_map *) (unsigned long) r1;
+	struct bpf_array *array = container_of(map, struct bpf_array, map);
+	struct perf_event *event;
+
+	if (unlikely(index >= array->map.max_entries))
+		return -E2BIG;
+
+	event = (struct perf_event *)array->ptrs[index];
+	if (!event)
+		return -ENOENT;
+
+	if (flag)
+		atomic_dec(&event->sample_disable);
+	else
+		atomic_inc(&event->sample_disable);
+
+	return 0;
+}
+
+const struct bpf_func_proto bpf_perf_event_sample_control_proto = {
+	.func		= bpf_perf_event_sample_control,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_CONST_MAP_PTR,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func_id)
 {
 	switch (func_id) {
@@ -242,6 +272,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
 		return &bpf_get_smp_processor_id_proto;
 	case BPF_FUNC_perf_event_read:
 		return &bpf_perf_event_read_proto;
+	case BPF_FUNC_perf_event_sample_control:
+		return &bpf_perf_event_sample_control_proto;
 	default:
 		return NULL;
 	}
-- 
1.8.3.4



* [PATCH V2 2/2] bpf: control a set of perf events by creating a new ioctl PERF_EVENT_IOC_SET_ENABLER
  2015-10-14 12:37 [PATCH V2 0/2] bpf: enable/disable events stored in PERF_EVENT_ARRAY maps trace data output when perf sampling Kaixu Xia
  2015-10-14 12:37 ` [PATCH V2 1/2] bpf: control the trace data output on current cpu " Kaixu Xia
@ 2015-10-14 12:37 ` Kaixu Xia
  2015-10-14 21:28   ` Alexei Starovoitov
  1 sibling, 1 reply; 8+ messages in thread
From: Kaixu Xia @ 2015-10-14 12:37 UTC (permalink / raw)
  To: ast, davem, acme, mingo, a.p.zijlstra, masami.hiramatsu.pt,
	jolsa, daniel
  Cc: xiakaixu, wangnan0, linux-kernel, pi3orama, hekuang, netdev

This patch creates a new ioctl PERF_EVENT_IOC_SET_ENABLER to let
perf select an event as the 'enabler'. We can then use this 'enabler'
event to enable/disable a set of events. The event on CPU 0 is
treated as the 'enabler' event by default.
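
For reference, a minimal user-space sketch of how the new ioctl might
be driven (event_fd and enabler_fd are assumed to be perf_event_open()
file descriptors created by the caller; illustrative only, not part of
this patch):

  #include <sys/ioctl.h>
  #include <linux/perf_event.h>

  /* Attach an 'enabler' to a previously opened event.
   * PERF_EVENT_IOC_SET_ENABLER is introduced by this patch; passing -1
   * as the enabler fd is a no-op, so the event keeps controlling itself.
   */
  static int set_sample_enabler(int event_fd, int enabler_fd)
  {
          return ioctl(event_fd, PERF_EVENT_IOC_SET_ENABLER, enabler_fd);
  }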

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
---
 include/linux/perf_event.h      |  1 +
 include/uapi/linux/perf_event.h |  1 +
 kernel/events/core.c            | 42 ++++++++++++++++++++++++++++++++++++++++-
 kernel/trace/bpf_trace.c        |  5 ++++-
 4 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index dcbf7d5..bc9fe77 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -473,6 +473,7 @@ struct perf_event {
 
 	atomic_t			event_limit;
 	atomic_t			sample_disable;
+	atomic_t			*p_sample_disable;
 
 	void (*destroy)(struct perf_event *);
 	struct rcu_head			rcu_head;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index a2b9dd7..3b4fb90 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -393,6 +393,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
 #define PERF_EVENT_IOC_ID		_IOR('$', 7, __u64 *)
 #define PERF_EVENT_IOC_SET_BPF		_IOW('$', 8, __u32)
+#define PERF_EVENT_IOC_SET_ENABLER	_IO ('$', 9)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 942351c..03d2594 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4152,6 +4152,7 @@ static int perf_event_set_output(struct perf_event *event,
 				 struct perf_event *output_event);
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
 static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd);
+static int perf_event_set_sample_enabler(struct perf_event *event, u32 enabler_fd);
 
 static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned long arg)
 {
@@ -4208,6 +4209,9 @@ static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned lon
 	case PERF_EVENT_IOC_SET_BPF:
 		return perf_event_set_bpf_prog(event, arg);
 
+	case PERF_EVENT_IOC_SET_ENABLER:
+		return perf_event_set_sample_enabler(event, arg);
+
 	default:
 		return -ENOTTY;
 	}
@@ -6337,7 +6341,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
-	if (!atomic_read(&event->sample_disable))
+	if (!atomic_read(event->p_sample_disable))
 		return ret;
 
 	if (event->overflow_handler)
@@ -6989,6 +6993,35 @@ static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd)
 	return 0;
 }
 
+static int perf_event_set_sample_enabler(struct perf_event *event, u32 enabler_fd)
+{
+	int ret;
+	struct fd enabler;
+	struct perf_event *enabler_event;
+
+	if (enabler_fd == -1)
+		return 0;
+
+	ret = perf_fget_light(enabler_fd, &enabler);
+	if (ret)
+		return ret;
+	enabler_event = enabler.file->private_data;
+	if (event == enabler_event) {
+		fdput(enabler);
+		return 0;
+	}
+
+	/* they must be on the same PMU*/
+	if (event->pmu != enabler_event->pmu) {
+		fdput(enabler);
+		return -EINVAL;
+	}
+
+	event->p_sample_disable = &enabler_event->sample_disable;
+	fdput(enabler);
+	return 0;
+}
+
 static void perf_event_free_bpf_prog(struct perf_event *event)
 {
 	struct bpf_prog *prog;
@@ -7023,6 +7056,11 @@ static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd)
 	return -ENOENT;
 }
 
+static int perf_event_set_sample_enabler(struct perf_event *event, u32 group_fd)
+{
+	return -ENOENT;
+}
+
 static void perf_event_free_bpf_prog(struct perf_event *event)
 {
 }
@@ -7718,6 +7756,8 @@ static void perf_event_check_sample_flag(struct perf_event *event)
 		atomic_set(&event->sample_disable, 0);
 	else
 		atomic_set(&event->sample_disable, 1);
+
+	event->p_sample_disable = &event->sample_disable;
 }
 
 /*
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f261333..d012be3 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -221,9 +221,12 @@ static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag, u64 r4, u6
 	struct bpf_array *array = container_of(map, struct bpf_array, map);
 	struct perf_event *event;
 
-	if (unlikely(index >= array->map.max_entries))
+	if (unlikely(index > array->map.max_entries))
 		return -E2BIG;
 
+	if (index == array->map.max_entries)
+		index = 0;
+
 	event = (struct perf_event *)array->ptrs[index];
 	if (!event)
 		return -ENOENT;
-- 
1.8.3.4



* Re: [PATCH V2 1/2] bpf: control the trace data output on current cpu when perf sampling
  2015-10-14 12:37 ` [PATCH V2 1/2] bpf: control the trace data output on current cpu " Kaixu Xia
  2015-10-14 13:49   ` [RFC PATCH] bpf: bpf_perf_event_sample_control_proto can be static kbuild test robot
@ 2015-10-14 13:49   ` kbuild test robot
  2015-10-14 21:21   ` Alexei Starovoitov
  2 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2015-10-14 13:49 UTC (permalink / raw)
  To: Kaixu Xia
  Cc: kbuild-all, ast, davem, acme, mingo, a.p.zijlstra,
	masami.hiramatsu.pt, jolsa, daniel, xiakaixu, wangnan0,
	linux-kernel, pi3orama, hekuang, netdev

Hi Kaixu,

[auto build test WARNING on net-next/master -- if it's inappropriate base, please suggest rules for selecting the more suitable base]

url:    https://github.com/0day-ci/linux/commits/Kaixu-Xia/bpf-enable-disable-events-stored-in-PERF_EVENT_ARRAY-maps-trace-data-output-when-perf-sampling/20151014-204158
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> kernel/trace/bpf_trace.c:239:29: sparse: symbol 'bpf_perf_event_sample_control_proto' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* [RFC PATCH] bpf: bpf_perf_event_sample_control_proto can be static
  2015-10-14 12:37 ` [PATCH V2 1/2] bpf: control the trace data output on current cpu " Kaixu Xia
@ 2015-10-14 13:49   ` kbuild test robot
  2015-10-14 13:49   ` [PATCH V2 1/2] bpf: control the trace data output on current cpu when perf sampling kbuild test robot
  2015-10-14 21:21   ` Alexei Starovoitov
  2 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2015-10-14 13:49 UTC (permalink / raw)
  To: Kaixu Xia
  Cc: kbuild-all, ast, davem, acme, mingo, a.p.zijlstra,
	masami.hiramatsu.pt, jolsa, daniel, xiakaixu, wangnan0,
	linux-kernel, pi3orama, hekuang, netdev


Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 bpf_trace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index f261333..ade24a1 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -236,7 +236,7 @@ static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag, u64 r4, u6
 	return 0;
 }
 
-const struct bpf_func_proto bpf_perf_event_sample_control_proto = {
+static const struct bpf_func_proto bpf_perf_event_sample_control_proto = {
 	.func		= bpf_perf_event_sample_control,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,


* Re: [PATCH V2 1/2] bpf: control the trace data output on current cpu when perf sampling
  2015-10-14 12:37 ` [PATCH V2 1/2] bpf: control the trace data output on current cpu " Kaixu Xia
  2015-10-14 13:49   ` [RFC PATCH] bpf: bpf_perf_event_sample_control_proto can be static kbuild test robot
  2015-10-14 13:49   ` [PATCH V2 1/2] bpf: control the trace data output on current cpu when perf sampling kbuild test robot
@ 2015-10-14 21:21   ` Alexei Starovoitov
  2 siblings, 0 replies; 8+ messages in thread
From: Alexei Starovoitov @ 2015-10-14 21:21 UTC (permalink / raw)
  To: Kaixu Xia, davem, acme, mingo, a.p.zijlstra, masami.hiramatsu.pt,
	jolsa, daniel
  Cc: wangnan0, linux-kernel, pi3orama, hekuang, netdev

On 10/14/15 5:37 AM, Kaixu Xia wrote:
> This patch adds the flag sample_disable to control the trace data
> output process when perf sampling. By setting this flag and
> integrating with ebpf, we can control the data output process and
> get the samples we are most interested in.
>
> The bpf helper bpf_perf_event_sample_control() can control the
> perf_event on current cpu.
>
> Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
...
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6337,6 +6337,9 @@ static int __perf_event_overflow(struct perf_event *event,
>   		irq_work_queue(&event->pending);
>   	}
>
> +	if (!atomic_read(&event->sample_disable))
> +		return ret;
> +

the condition check and the name are inconsistent.
It's either
if (!enabled) return
or
if (disabled) return

>   	if (event->overflow_handler)
>   		event->overflow_handler(event, data, regs);
>   	else
> @@ -7709,6 +7712,14 @@ static void account_event(struct perf_event *event)
>   	account_event_cpu(event, event->cpu);
>   }
>
> +static void perf_event_check_sample_flag(struct perf_event *event)
> +{
> +	if (event->attr.sample_disable == 1)
> +		atomic_set(&event->sample_disable, 0);
> +	else
> +		atomic_set(&event->sample_disable, 1);
> +}

why introduce new attribute for this?
we already have 'disabled' flag.

> +static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag, u64 r4, u64 r5)
> +{
> +	struct bpf_map *map = (struct bpf_map *) (unsigned long) r1;
> +	struct bpf_array *array = container_of(map, struct bpf_array, map);
> +	struct perf_event *event;
> +
> +	if (unlikely(index >= array->map.max_entries))
> +		return -E2BIG;
> +
> +	event = (struct perf_event *)array->ptrs[index];
> +	if (!event)
> +		return -ENOENT;
> +
> +	if (flag)

please check only bit 0 and check that all other bits are zero as well
for future extensibility.

> +		atomic_dec(&event->sample_disable);

it should be atomic_dec_if_positive();

> +	else
> +		atomic_inc(&event->sample_disable);

and atomic_add_unless()
to make sure we don't wrap on either side.
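
Combining the points above, the flag handling in
bpf_perf_event_sample_control() could then read roughly as follows
(an untested sketch of the review feedback, not code from the posted
series):

	/* accept only bit 0 of the flag; reserve the other bits */
	if (flag & ~1ULL)
		return -EINVAL;

	if (flag & 1)
		/* disable: never let the counter drop below zero */
		atomic_dec_if_positive(&event->sample_disable);
	else
		/* enable: never let the counter wrap past INT_MAX */
		atomic_add_unless(&event->sample_disable, 1, INT_MAX);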

> +const struct bpf_func_proto bpf_perf_event_sample_control_proto = {

static.



* Re: [PATCH V2 2/2] bpf: control a set of perf events by creating a new ioctl PERF_EVENT_IOC_SET_ENABLER
  2015-10-14 12:37 ` [PATCH V2 2/2] bpf: control a set of perf events by creating a new ioctl PERF_EVENT_IOC_SET_ENABLER Kaixu Xia
@ 2015-10-14 21:28   ` Alexei Starovoitov
  2015-10-15  2:21     ` xiakaixu
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2015-10-14 21:28 UTC (permalink / raw)
  To: Kaixu Xia, davem, acme, mingo, a.p.zijlstra, masami.hiramatsu.pt,
	jolsa, daniel
  Cc: wangnan0, linux-kernel, pi3orama, hekuang, netdev

On 10/14/15 5:37 AM, Kaixu Xia wrote:
> +	event->p_sample_disable = &enabler_event->sample_disable;

I don't like it as a concept, and the implementation is buggy.
What happens here when enabler is alive, but other event is destroyed?

> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -221,9 +221,12 @@ static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag, u64 r4, u6
>   	struct bpf_array *array = container_of(map, struct bpf_array, map);
>   	struct perf_event *event;
>
> -	if (unlikely(index >= array->map.max_entries))
> +	if (unlikely(index > array->map.max_entries))
>   		return -E2BIG;
>
> +	if (index == array->map.max_entries)
> +		index = 0;

what is this hack for?

Either use notification and user space disable or
call bpf_perf_event_sample_control() manually for each cpu.
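
A per-cpu variant of the cover-letter program would then look roughly
like this (a sketch, assuming the PERF_EVENT_ARRAY map is populated
with one event per cpu at index == cpu; the map and section names
follow the cover-letter example):

  SEC("enter=sys_write")
  int bpf_prog_1(struct pt_regs *ctx)
  {
          u32 cpu = bpf_get_smp_processor_id();

          bpf_perf_event_sample_control(&my_cycles_map, cpu, 0);
          return 0;
  }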




* Re: [PATCH V2 2/2] bpf: control a set of perf events by creating a new ioctl PERF_EVENT_IOC_SET_ENABLER
  2015-10-14 21:28   ` Alexei Starovoitov
@ 2015-10-15  2:21     ` xiakaixu
  0 siblings, 0 replies; 8+ messages in thread
From: xiakaixu @ 2015-10-15  2:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: davem, acme, mingo, a.p.zijlstra, masami.hiramatsu.pt, jolsa,
	daniel, wangnan0, linux-kernel, pi3orama, hekuang, netdev

On 2015/10/15 5:28, Alexei Starovoitov wrote:
> On 10/14/15 5:37 AM, Kaixu Xia wrote:
>> +    event->p_sample_disable = &enabler_event->sample_disable;
> 
> I don't like it as a concept and it's buggy implementation.
> What happens here when enabler is alive, but other event is destroyed?
> 
>> --- a/kernel/trace/bpf_trace.c
>> +++ b/kernel/trace/bpf_trace.c
>> @@ -221,9 +221,12 @@ static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag, u64 r4, u6
>>       struct bpf_array *array = container_of(map, struct bpf_array, map);
>>       struct perf_event *event;
>>
>> -    if (unlikely(index >= array->map.max_entries))
>> +    if (unlikely(index > array->map.max_entries))
>>           return -E2BIG;
>>
>> +    if (index == array->map.max_entries)
>> +        index = 0;
> 
> what is this hack for ?
> 
> Either use notification and user space disable or
> call bpf_perf_event_sample_control() manually for each cpu.

I will discard the current implementation that controls a set of
perf events through the 'enabler' event. Calling
bpf_perf_event_sample_control() manually for each cpu is fine. Maybe
we can add a loop to control all the events stored in the maps by
checking the index, OK?
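
A hypothetical sketch of that idea (not posted code) would extend the
helper from patch 1/2 to treat index == max_entries as "all events in
the map" and walk the array:

	static u64 bpf_perf_event_sample_control(u64 r1, u64 index, u64 flag,
						 u64 r4, u64 r5)
	{
		struct bpf_map *map = (struct bpf_map *) (unsigned long) r1;
		struct bpf_array *array = container_of(map, struct bpf_array, map);
		struct perf_event *event;
		u64 i, first, last;

		if (unlikely(index > array->map.max_entries))
			return -E2BIG;

		if (index == array->map.max_entries) {
			/* sentinel index: control every event in the map */
			first = 0;
			last = array->map.max_entries;
		} else {
			/* control a single event */
			first = index;
			last = index + 1;
		}

		for (i = first; i < last; i++) {
			event = (struct perf_event *)array->ptrs[i];
			if (!event)
				continue;
			if (flag)
				atomic_dec(&event->sample_disable);
			else
				atomic_inc(&event->sample_disable);
		}
		return 0;
	}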
