All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: "Gupta, Sourab" <sourab.gupta@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"Wu, Jabin" <jabin.wu@intel.com>,
	"Woo, Insoo" <insoo.woo@intel.com>,
	"Bragg, Robert" <robert.bragg@intel.com>
Subject: Re: [RFC 5/7] drm/i915: Wait for GPU to finish before event stop in Gen Perf PMU
Date: Thu, 25 Jun 2015 09:02:47 +0100	[thread overview]
Message-ID: <20150625080247.GD30757@nuc-i3427.alporthouse.com> (raw)
In-Reply-To: <1435212288.1938.16.camel@sourabgu-desktop>

On Thu, Jun 25, 2015 at 06:02:35AM +0000, Gupta, Sourab wrote:
> On Mon, 2015-06-22 at 16:09 +0000, Daniel Vetter wrote:
> > On Mon, Jun 22, 2015 at 02:22:54PM +0100, Chris Wilson wrote:
> > > On Mon, Jun 22, 2015 at 03:25:07PM +0530, sourab.gupta@intel.com wrote:
> > > > From: Sourab Gupta <sourab.gupta@intel.com>
> > > > 
> > > > To collect timestamps around any GPU workload, we need to insert
> > > > commands to capture them into the ringbuffer. Therefore, during the stop event
> > > > call, we need to wait for GPU to complete processing the last request for
> > > > which these commands were inserted.
> > > > We need to ensure this processing is done before event_destroy callback which
> > > > deallocates the buffer for holding the data.
> > > 
> > > There's no reason for this to be synchronous. Just that you need an
> > > active reference on the output buffer to be only released after the
> > > final request.
> > 
> > Yeah I think the interaction between OA sampling and GEM is the critical
> > piece here for both patch series. Step one is to have a per-pmu lock to
> > keep track of data private to OA and mmio based sampling as a starting
> > point. Then we need to figure out how to structure the connection without
> > OA/PMU and gem trampling over each another's feet.
> > 
> > Wrt requests those are refcounted already and just waiting on them doesn't
> > need dev->struct_mutex. That should be all you need, assuming you do
> > correctly refcount them ...
> > -Daniel
> 
> Hi Daniel/Chris,
> 
> Thanks for your feedback. I acknowledge the issue with increasing
> coverage scope of i915 dev mutex. I'll have a per-pmu lock to keep track
> of data private to OA in my next version of patches.
> 
> W.r.t. having the synchronous wait during event stop, I thought through
> the method of having active reference on output buffer. This will let us
> have a delayed buffer destroy (controlled by decrement of output buffer
> refcount as and when gpu requests complete). This will be decoupled from
> the event stop here. But that is not the only thing at play here.
> 
> The event stop is expected to disable the OA unit (which it is doing
> right now). In my usecase, I can't disable OA unit, till the time GPU
> processes the MI_RPC requests scheduled already, which otherwise may
> lead to hangs.
> So, I'm waiting synchronously for requests to complete before disabling
> OA unit.

There's no issue here. Just hook into the retire-notification callback
for the last active request of the oa buffer. If you are using LRI to
disable the OA Context writing to the buffer, you can simply create a
request for the LRI and use the active reference as the retire
notification. (Otoh, if the oa buffer is inactive you can simply do the
decoupling immediately.) You shouldn't need a object-free-notification
callback aiui.
 
> Also, the event_destroy callback should be destroying the buffers
> associated with event. (Right now, in current design, event_destroy
> callback waits for the above operations to be completed before
> proceeding to destroy the buffer).
> 
> If I try to create a function which destroys the output buffer according
> to all references being released, all these operations will have to be
> performed together in that function, in this serial order (so that when
> refcount drops to 0, i.e. requests have completed, OA unit will be
> disabled, and then the buffer destroyed). This would lead to these
> operations being decoupled from the event_stop and event_destroy
> functions. This may be non-intentional behaviour w.r.t. these callbacks
> from userspace perspective.

When the perf event is stopped, you stop generating perf events. But the
buffer may still be alive, and may be resurrected if you a new event is
started, but you will want to be sure to ensure that pending oa writes
are ignored.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2015-06-25  8:03 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-22  9:55 [RFC 0/7] Introduce framework for forwarding generic non-OA performance sourab.gupta
2015-06-22  9:55 ` [RFC 1/7] drm/i915: Add a new PMU for handling non-OA counter data profiling requests sourab.gupta
2015-06-22  9:55 ` [RFC 2/7] drm/i915: Register routines for Gen perf PMU driver sourab.gupta
2015-06-22  9:55 ` [RFC 3/7] drm/i915: Introduce timestamp node for timestamp data collection sourab.gupta
2015-06-22  9:55 ` [RFC 4/7] drm/i915: Add mechanism for forwarding the data samples to userspace through Gen PMU perf interface sourab.gupta
2015-06-22 13:21   ` Chris Wilson
2015-06-22  9:55 ` [RFC 5/7] drm/i915: Wait for GPU to finish before event stop in Gen Perf PMU sourab.gupta
2015-06-22 13:22   ` Chris Wilson
2015-06-22 16:09     ` Daniel Vetter
2015-06-25  6:02       ` Gupta, Sourab
2015-06-25  7:42         ` Daniel Vetter
2015-06-25  8:27           ` Gupta, Sourab
2015-06-25 11:47             ` Robert Bragg
2015-06-25  8:02         ` Chris Wilson [this message]
2015-06-25 17:31           ` Robert Bragg
2015-06-25 17:37             ` Chris Wilson
2015-06-25 18:20               ` Chris Wilson
2015-06-25 13:02         ` Robert Bragg
2015-06-25 13:07           ` Robert Bragg
2015-06-22  9:55 ` [RFC 6/7] drm/i915: Add routines for inserting commands in the ringbuf for capturing timestamps sourab.gupta
2015-06-22  9:55 ` [RFC 7/7] drm/i915: Add support for retrieving MMIO register values in Gen Perf PMU sourab.gupta
2015-06-22 13:29   ` Chris Wilson
2015-06-22 16:06   ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150625080247.GD30757@nuc-i3427.alporthouse.com \
    --to=chris@chris-wilson.co.uk \
    --cc=a.p.zijlstra@chello.nl \
    --cc=insoo.woo@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jabin.wu@intel.com \
    --cc=robert.bragg@intel.com \
    --cc=sourab.gupta@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.