All of lore.kernel.org
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@fb.com>
To: linux-kernel <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Kernel Team <Kernel-team@fb.com>,
	Stephane Eranian <eranian@google.com>,
	Lucian Grijincu <lucian@fb.com>
Subject: Re: [PATCH v2] perf/core: fix userpage->time_enabled of inactive events
Date: Mon, 27 Sep 2021 16:33:43 +0000	[thread overview]
Message-ID: <A38DD9FF-E142-4BD0-89C4-2B1B699812D7@fb.com> (raw)
In-Reply-To: <20210924012800.2461781-1-songliubraving@fb.com>

Hi Peter, 

> On Sep 23, 2021, at 6:28 PM, Song Liu <songliubraving@fb.com> wrote:
> 
> Users of rdpmc rely on the mmapped user page to calculate accurate
> time_enabled. Currently, userpage->time_enabled is only updated when the
> event is added to the pmu. As a result, inactive event (due to counter
> multiplexing) does not have accurate userpage->time_enabled. This can
> be reproduced with something like:
> 
>   /* open 20 task perf_event "cycles", to create multiplexing */
> 
>   fd = perf_event_open();  /* open task perf_event "cycles" */
>   userpage = mmap(fd);     /* use mmap and rdmpc */
> 
>   while (true) {
>     time_enabled_mmap = xxx; /* use logic in perf_event_mmap_page */
>     time_enabled_read = read(fd).time_enabled;
>     if (time_enabled_mmap > time_enabled_read)
>         BUG();
>   }
> 
> Fix this by updating userpage for inactive events in merge_sched_in.
> 
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reported-and-tested-by: Lucian Grijincu <lucian@fb.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>

Could you please share your comments on this version?

Thanks,
Song

> ---
> include/linux/perf_event.h |  4 +++-
> kernel/events/core.c       | 49 ++++++++++++++++++++++++++++++++++----
> 2 files changed, 48 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 2d510ad750edc..4aa52f7a48c16 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -683,7 +683,9 @@ struct perf_event {
> 	/*
> 	 * timestamp shadows the actual context timing but it can
> 	 * be safely used in NMI interrupt context. It reflects the
> -	 * context time as it was when the event was last scheduled in.
> +	 * context time as it was when the event was last scheduled in,
> +	 * or when ctx_sched_in failed to schedule the event because we
> +	 * run out of PMC.
> 	 *
> 	 * ctx_time already accounts for ctx->timestamp. Therefore to
> 	 * compute ctx_time for a sample, simply add perf_clock().
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 1cb1f9b8392e2..d73f986eef7b3 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3707,6 +3707,46 @@ static noinline int visit_groups_merge(struct perf_cpu_context *cpuctx,
> 	return 0;
> }
> 
> +static inline bool event_update_userpage(struct perf_event *event)
> +{
> +	/*
> +	 * Checking mmap_count to avoid unnecessary work. This does leave a
> +	 * corner case: if the event is enabled before mmap(), the first
> +	 * time the event gets scheduled is via:
> +	 *
> +	 *  __perf_event_enable (or __perf_install_in_context)
> +	 *      -> ctx_resched
> +	 *         -> perf_event_sched_in
> +	 *            -> ctx_sched_in
> +	 *
> +	 * with mmap_count of 0, so we will skip here. As a result,
> +	 * userpage->offset is not accurate after mmap and before the
> +	 * first rotation.
> +	 *
> +	 * To avoid the discrepancy of this window, the user space should
> +	 * mmap the event before enabling it.
> +	 */
> +	if (likely(!atomic_read(&event->mmap_count)))
> +		return false;
> +
> +	perf_event_update_time(event);
> +	perf_set_shadow_time(event, event->ctx);
> +	perf_event_update_userpage(event);
> +
> +	return true;
> +}
> +
> +static inline void group_update_userpage(struct perf_event *group_event)
> +{
> +	struct perf_event *event;
> +
> +	if (!event_update_userpage(group_event))
> +		return;
> +
> +	for_each_sibling_event(event, group_event)
> +		event_update_userpage(event);
> +}
> +
> static int merge_sched_in(struct perf_event *event, void *data)
> {
> 	struct perf_event_context *ctx = event->ctx;
> @@ -3725,14 +3765,15 @@ static int merge_sched_in(struct perf_event *event, void *data)
> 	}
> 
> 	if (event->state == PERF_EVENT_STATE_INACTIVE) {
> +		*can_add_hw = 0;
> 		if (event->attr.pinned) {
> 			perf_cgroup_event_disable(event, ctx);
> 			perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
> +		} else {
> +			ctx->rotate_necessary = 1;
> +			perf_mux_hrtimer_restart(cpuctx);
> +			group_update_userpage(event);
> 		}
> -
> -		*can_add_hw = 0;
> -		ctx->rotate_necessary = 1;
> -		perf_mux_hrtimer_restart(cpuctx);
> 	}
> 
> 	return 0;
> -- 
> 2.30.2
> 


  reply	other threads:[~2021-09-27 16:33 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-24  1:28 [PATCH v2] perf/core: fix userpage->time_enabled of inactive events Song Liu
2021-09-27 16:33 ` Song Liu [this message]
2021-09-29  9:18 ` Peter Zijlstra
2021-09-29 19:38   ` Song Liu
2021-09-29 17:04 ` Athira Rajeev
2021-09-29 19:41   ` Song Liu
2021-09-30  4:32     ` Athira Rajeev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=A38DD9FF-E142-4BD0-89C4-2B1B699812D7@fb.com \
    --to=songliubraving@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=acme@kernel.org \
    --cc=eranian@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lucian@fb.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.