linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/1] perf/core: Wake up parent event if inherited event has no ring buffer
@ 2021-12-06 11:38 James Clark
  2021-12-06 11:38 ` [RFC PATCH 1/1] " James Clark
  0 siblings, 1 reply; 8+ messages in thread
From: James Clark @ 2021-12-06 11:38 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
  Cc: linux-perf-users, leo.yan, Suzuki.Poulose, James Clark, linux-kernel

This is one possible fix to the issue described in the commit message.

What I'm not sure about is whether these other two possibilities should
be done instead, which is what I'm looking for feedback on:

 * Assign the ring buffer of the parent event to the child in inherit_event()

 * Move the poll list from the RB to the event, although it was moved into
   the RB for a specific reason in 10c6db1

James Clark (1):
  perf/core: Wake up parent event if inherited event has no ring buffer

 kernel/events/core.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer
  2021-12-06 11:38 [RFC PATCH 0/1] perf/core: Wake up parent event if inherited event has no ring buffer James Clark
@ 2021-12-06 11:38 ` James Clark
  2022-01-24 11:37   ` Ruben Ayrapetyan
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: James Clark @ 2021-12-06 11:38 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
  Cc: linux-perf-users, leo.yan, Suzuki.Poulose, James Clark,
	Ruben Ayrapetyan, linux-kernel

When using per-process mode and event inheritance is set to true, forked
processes will create a new perf events via inherit_event() ->
perf_event_alloc(). But these events will not have ring buffers assigned
to them. Any call to wakeup will be dropped if it's called on an event
with no ring buffer assigned because that's the object that holds the
wakeup list.

If the child event is disabled due to a call to perf_aux_output_begin()
or perf_aux_output_end(), the wakeup is dropped leaving userspace
hanging forever on the poll.

Normally the event is explicitly re-enabled by userspace after it wakes
up to read the aux data, but in this case it does not get woken up so
the event remains disabled.

This can be reproduced when using Arm SPE and 'stress' which forks once
before running the workload. By looking at the list of aux buffers read,
it's apparent that they stop after the fork:

  perf record -e arm_spe// -vvv -- stress -c 1

With this patch applied they continue to be printed. This behaviour
doesn't happen when using systemwide or per-cpu mode.

Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
---
 kernel/events/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 30d94f68c5bd..76217a3f9cdd 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5993,6 +5993,9 @@ static void ring_buffer_wakeup(struct perf_event *event)
 
 	rcu_read_lock();
 	rb = rcu_dereference(event->rb);
+	if (!rb && event->parent)
+		rb = rcu_dereference(event->parent->rb);
+
 	if (rb) {
 		list_for_each_entry_rcu(event, &rb->event_list, rb_entry)
 			wake_up_all(&event->waitq);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer
  2021-12-06 11:38 ` [RFC PATCH 1/1] " James Clark
@ 2022-01-24 11:37   ` Ruben Ayrapetyan
  2022-01-24 11:49   ` Peter Zijlstra
  2022-01-27 11:50   ` [tip: perf/urgent] perf: Always wake the parent event tip-bot2 for James Clark
  2 siblings, 0 replies; 8+ messages in thread
From: Ruben Ayrapetyan @ 2022-01-24 11:37 UTC (permalink / raw)
  To: James Clark, peterz, mingo, acme, Mark Rutland,
	alexander.shishkin, jolsa, namhyung
  Cc: linux-perf-users, leo.yan, Suzuki Poulose, linux-kernel

Hi everyone,

I've tried running a benchmark with and without this patch applied, with the following options:
 perf record -e arm_spe_0/event_filter=0,jitter=0,ts_enable=1,pa_enable=0,min_latency=0/ ...
repeating it 60 times with and without the patch.

Without the patch, the output perf.data sizes were: 1.0G 1.0G 1.0G 1.0G 1.0G 1.0G 1.0G 1.0G 47M 48M 48M 48M 48M 48M 48M 48M 48M 48M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 49M 60M 60M 60M 60M 61M 61M 61M 61M 61M 61M 61M 62M 62M 62M 62M 62M 62M 62M 62M 62M 63M 63M 63M 65M 990M i.e. 47MB to 1.0GB

With the patch, the output sizes were: 1.4G 1.4G 1.4G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G 1.5G i.e. 1.4GB to 1.5GB.

Tested-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer
  2021-12-06 11:38 ` [RFC PATCH 1/1] " James Clark
  2022-01-24 11:37   ` Ruben Ayrapetyan
@ 2022-01-24 11:49   ` Peter Zijlstra
  2022-01-24 14:58     ` James Clark
  2022-01-27 11:50   ` [tip: perf/urgent] perf: Always wake the parent event tip-bot2 for James Clark
  2 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2022-01-24 11:49 UTC (permalink / raw)
  To: James Clark
  Cc: mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung,
	linux-perf-users, leo.yan, Suzuki.Poulose, Ruben Ayrapetyan,
	linux-kernel

On Mon, Dec 06, 2021 at 11:38:40AM +0000, James Clark wrote:
> When using per-process mode and event inheritance is set to true, forked
> processes will create a new perf events via inherit_event() ->
> perf_event_alloc(). But these events will not have ring buffers assigned
> to them. Any call to wakeup will be dropped if it's called on an event
> with no ring buffer assigned because that's the object that holds the
> wakeup list.
> 
> If the child event is disabled due to a call to perf_aux_output_begin()
> or perf_aux_output_end(), the wakeup is dropped leaving userspace
> hanging forever on the poll.
> 
> Normally the event is explicitly re-enabled by userspace after it wakes
> up to read the aux data, but in this case it does not get woken up so
> the event remains disabled.
> 
> This can be reproduced when using Arm SPE and 'stress' which forks once
> before running the workload. By looking at the list of aux buffers read,
> it's apparent that they stop after the fork:
> 
>   perf record -e arm_spe// -vvv -- stress -c 1
> 
> With this patch applied they continue to be printed. This behaviour
> doesn't happen when using systemwide or per-cpu mode.
> 
> Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>
> Signed-off-by: James Clark <james.clark@arm.com>
> ---

Would this be the better patch?


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 479c9e672ec4..b1c1928c0e7c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5985,6 +5985,8 @@ static void ring_buffer_attach(struct perf_event *event,
 	struct perf_buffer *old_rb = NULL;
 	unsigned long flags;
 
+	WARN_ON_ONCE(event->parent);
+
 	if (event->rb) {
 		/*
 		 * Should be impossible, we set this when removing
@@ -6042,6 +6044,9 @@ static void ring_buffer_wakeup(struct perf_event *event)
 {
 	struct perf_buffer *rb;
 
+	if (event->parent)
+		event = event->parent;
+
 	rcu_read_lock();
 	rb = rcu_dereference(event->rb);
 	if (rb) {
@@ -6055,6 +6060,9 @@ struct perf_buffer *ring_buffer_get(struct perf_event *event)
 {
 	struct perf_buffer *rb;
 
+	if (event->parent)
+		event = event->parent;
+
 	rcu_read_lock();
 	rb = rcu_dereference(event->rb);
 	if (rb) {
@@ -6763,7 +6771,7 @@ static unsigned long perf_prepare_sample_aux(struct perf_event *event,
 	if (WARN_ON_ONCE(READ_ONCE(sampler->oncpu) != smp_processor_id()))
 		goto out;
 
-	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
+	rb = ring_buffer_get(sampler);
 	if (!rb)
 		goto out;
 
@@ -6829,7 +6837,7 @@ static void perf_aux_sample_output(struct perf_event *event,
 	if (WARN_ON_ONCE(!sampler || !data->aux_size))
 		return;
 
-	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
+	rb = ring_buffer_get(sampler);
 	if (!rb)
 		return;
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer
  2022-01-24 11:49   ` Peter Zijlstra
@ 2022-01-24 14:58     ` James Clark
  2022-01-25 19:12       ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: James Clark @ 2022-01-24 14:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung,
	linux-perf-users, leo.yan, Suzuki.Poulose, Ruben Ayrapetyan,
	linux-kernel



On 24/01/2022 11:49, Peter Zijlstra wrote:
> On Mon, Dec 06, 2021 at 11:38:40AM +0000, James Clark wrote:
>> When using per-process mode and event inheritance is set to true, forked
>> processes will create a new perf events via inherit_event() ->
>> perf_event_alloc(). But these events will not have ring buffers assigned
>> to them. Any call to wakeup will be dropped if it's called on an event
>> with no ring buffer assigned because that's the object that holds the
>> wakeup list.
>>
>> If the child event is disabled due to a call to perf_aux_output_begin()
>> or perf_aux_output_end(), the wakeup is dropped leaving userspace
>> hanging forever on the poll.
>>
>> Normally the event is explicitly re-enabled by userspace after it wakes
>> up to read the aux data, but in this case it does not get woken up so
>> the event remains disabled.
>>
>> This can be reproduced when using Arm SPE and 'stress' which forks once
>> before running the workload. By looking at the list of aux buffers read,
>> it's apparent that they stop after the fork:
>>
>>   perf record -e arm_spe// -vvv -- stress -c 1
>>
>> With this patch applied they continue to be printed. This behaviour
>> doesn't happen when using systemwide or per-cpu mode.
>>
>> Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>
>> Signed-off-by: James Clark <james.clark@arm.com>
>> ---
> 
> Would this be the better patch?

Yes I tested this and it also works. There is one other suspicious access
of ->rb followed by if(rb) here in perf_poll(), but maybe it works out ok?

	mutex_lock(&event->mmap_mutex);
	rb = event->rb;
	if (rb)
		events = atomic_xchg(&rb->poll, 0);

We also have a Perf self test that covers this failure for Arm SPE now, I'm not
sure if I should post that separately or with your new version of this fix?

Thanks
James
 
> 
> 
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 479c9e672ec4..b1c1928c0e7c 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5985,6 +5985,8 @@ static void ring_buffer_attach(struct perf_event *event,
>  	struct perf_buffer *old_rb = NULL;
>  	unsigned long flags;
>  
> +	WARN_ON_ONCE(event->parent);
> +
>  	if (event->rb) {
>  		/*
>  		 * Should be impossible, we set this when removing
> @@ -6042,6 +6044,9 @@ static void ring_buffer_wakeup(struct perf_event *event)
>  {
>  	struct perf_buffer *rb;
>  
> +	if (event->parent)
> +		event = event->parent;
> +
>  	rcu_read_lock();
>  	rb = rcu_dereference(event->rb);
>  	if (rb) {
> @@ -6055,6 +6060,9 @@ struct perf_buffer *ring_buffer_get(struct perf_event *event)
>  {
>  	struct perf_buffer *rb;
>  
> +	if (event->parent)
> +		event = event->parent;
> +
>  	rcu_read_lock();
>  	rb = rcu_dereference(event->rb);
>  	if (rb) {
> @@ -6763,7 +6771,7 @@ static unsigned long perf_prepare_sample_aux(struct perf_event *event,
>  	if (WARN_ON_ONCE(READ_ONCE(sampler->oncpu) != smp_processor_id()))
>  		goto out;
>  
> -	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
> +	rb = ring_buffer_get(sampler);
>  	if (!rb)
>  		goto out;
>  
> @@ -6829,7 +6837,7 @@ static void perf_aux_sample_output(struct perf_event *event,
>  	if (WARN_ON_ONCE(!sampler || !data->aux_size))
>  		return;
>  
> -	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
> +	rb = ring_buffer_get(sampler);
>  	if (!rb)
>  		return;
>  
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer
  2022-01-24 14:58     ` James Clark
@ 2022-01-25 19:12       ` Peter Zijlstra
  2022-01-26 16:15         ` James Clark
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2022-01-25 19:12 UTC (permalink / raw)
  To: James Clark
  Cc: mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung,
	linux-perf-users, leo.yan, Suzuki.Poulose, Ruben Ayrapetyan,
	linux-kernel

On Mon, Jan 24, 2022 at 02:58:18PM +0000, James Clark wrote:

> > Would this be the better patch?
> 
> Yes I tested this and it also works.

Excellent!

> There is one other suspicious access
> of ->rb followed by if(rb) here in perf_poll(), but maybe it works out ok?
> 
> 	mutex_lock(&event->mmap_mutex);
> 	rb = event->rb;
> 	if (rb)
> 		events = atomic_xchg(&rb->poll, 0);
> 

That case must be good since it is the event from a file. Those cannot
be clones.

> We also have a Perf self test that covers this failure for Arm SPE now, I'm not
> sure if I should post that separately or with your new version of this fix?

They'd be routed through separate trees anyway, Arnaldo takes the
userspace stuff.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/1] perf/core: Wake up parent event if inherited event has no ring buffer
  2022-01-25 19:12       ` Peter Zijlstra
@ 2022-01-26 16:15         ` James Clark
  0 siblings, 0 replies; 8+ messages in thread
From: James Clark @ 2022-01-26 16:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung,
	linux-perf-users, leo.yan, Suzuki.Poulose, Ruben Ayrapetyan,
	linux-kernel



On 25/01/2022 19:12, Peter Zijlstra wrote:
> On Mon, Jan 24, 2022 at 02:58:18PM +0000, James Clark wrote:
> 
>>> Would this be the better patch?
>>
>> Yes I tested this and it also works.
> 
> Excellent!
> 
>> There is one other suspicious access
>> of ->rb followed by if(rb) here in perf_poll(), but maybe it works out ok?
>>
>> 	mutex_lock(&event->mmap_mutex);
>> 	rb = event->rb;
>> 	if (rb)
>> 		events = atomic_xchg(&rb->poll, 0);
>>
> 
> That case must be good since it is the event from a file. Those cannot
> be clones.
> 
>> We also have a Perf self test that covers this failure for Arm SPE now, I'm not
>> sure if I should post that separately or with your new version of this fix?
> 
> They'd be routed through separate trees anyway, Arnaldo takes the
> userspace stuff.
> 

Ok we will post that one separately then. What are the next steps for this one?
Will you just put your version directly in your tree, or should I resubmit it with a
new commit message?



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tip: perf/urgent] perf: Always wake the parent event
  2021-12-06 11:38 ` [RFC PATCH 1/1] " James Clark
  2022-01-24 11:37   ` Ruben Ayrapetyan
  2022-01-24 11:49   ` Peter Zijlstra
@ 2022-01-27 11:50   ` tip-bot2 for James Clark
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for James Clark @ 2022-01-27 11:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Ruben Ayrapetyan, James Clark, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     961c39121759ad09a89598ec4ccdd34ae0468a19
Gitweb:        https://git.kernel.org/tip/961c39121759ad09a89598ec4ccdd34ae0468a19
Author:        James Clark <james.clark@arm.com>
AuthorDate:    Mon, 06 Dec 2021 11:38:40 
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 26 Jan 2022 15:06:06 +01:00

perf: Always wake the parent event

When using per-process mode and event inheritance is set to true,
forked processes will create a new perf events via inherit_event() ->
perf_event_alloc(). But these events will not have ring buffers
assigned to them. Any call to wakeup will be dropped if it's called on
an event with no ring buffer assigned because that's the object that
holds the wakeup list.

If the child event is disabled due to a call to
perf_aux_output_begin() or perf_aux_output_end(), the wakeup is
dropped leaving userspace hanging forever on the poll.

Normally the event is explicitly re-enabled by userspace after it
wakes up to read the aux data, but in this case it does not get woken
up so the event remains disabled.

This can be reproduced when using Arm SPE and 'stress' which forks once
before running the workload. By looking at the list of aux buffers read,
it's apparent that they stop after the fork:

  perf record -e arm_spe// -vvv -- stress -c 1

With this patch applied they continue to be printed. This behaviour
doesn't happen when using systemwide or per-cpu mode.

Reported-by: Ruben Ayrapetyan <Ruben.Ayrapetyan@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211206113840.130802-2-james.clark@arm.com
---
 kernel/events/core.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 479c9e6..b1c1928 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5985,6 +5985,8 @@ static void ring_buffer_attach(struct perf_event *event,
 	struct perf_buffer *old_rb = NULL;
 	unsigned long flags;
 
+	WARN_ON_ONCE(event->parent);
+
 	if (event->rb) {
 		/*
 		 * Should be impossible, we set this when removing
@@ -6042,6 +6044,9 @@ static void ring_buffer_wakeup(struct perf_event *event)
 {
 	struct perf_buffer *rb;
 
+	if (event->parent)
+		event = event->parent;
+
 	rcu_read_lock();
 	rb = rcu_dereference(event->rb);
 	if (rb) {
@@ -6055,6 +6060,9 @@ struct perf_buffer *ring_buffer_get(struct perf_event *event)
 {
 	struct perf_buffer *rb;
 
+	if (event->parent)
+		event = event->parent;
+
 	rcu_read_lock();
 	rb = rcu_dereference(event->rb);
 	if (rb) {
@@ -6763,7 +6771,7 @@ static unsigned long perf_prepare_sample_aux(struct perf_event *event,
 	if (WARN_ON_ONCE(READ_ONCE(sampler->oncpu) != smp_processor_id()))
 		goto out;
 
-	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
+	rb = ring_buffer_get(sampler);
 	if (!rb)
 		goto out;
 
@@ -6829,7 +6837,7 @@ static void perf_aux_sample_output(struct perf_event *event,
 	if (WARN_ON_ONCE(!sampler || !data->aux_size))
 		return;
 
-	rb = ring_buffer_get(sampler->parent ? sampler->parent : sampler);
+	rb = ring_buffer_get(sampler);
 	if (!rb)
 		return;
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-01-27 11:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-06 11:38 [RFC PATCH 0/1] perf/core: Wake up parent event if inherited event has no ring buffer James Clark
2021-12-06 11:38 ` [RFC PATCH 1/1] " James Clark
2022-01-24 11:37   ` Ruben Ayrapetyan
2022-01-24 11:49   ` Peter Zijlstra
2022-01-24 14:58     ` James Clark
2022-01-25 19:12       ` Peter Zijlstra
2022-01-26 16:15         ` James Clark
2022-01-27 11:50   ` [tip: perf/urgent] perf: Always wake the parent event tip-bot2 for James Clark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).