From mboxrd@z Thu Jan 1 00:00:00 1970
From: Robert Bragg
Subject: Re: [PATCH v7 06/11] drm/i915: Enable i915 perf stream for Haswell OA unit
Date: Wed, 26 Oct 2016 16:03:57 +0100
Message-ID:
References: <20161024231934.2243-1-robert@sixbynine.org> <20161024231934.2243-7-robert@sixbynine.org>
Reply-To: robert@sixbynine.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1438516752=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
To: Matthew Auld
Cc: ML dri-devel , David Airlie , Intel Graphics Development , Sourab Gupta , Daniel Vetter
List-Id: dri-devel@lists.freedesktop.org

--===============1438516752==
Content-Type: multipart/alternative; boundary=001a1142c2286677af053fc5f0d2

--001a1142c2286677af053fc5f0d2
Content-Type: text/plain; charset=UTF-8

On 26 Oct 2016 11:08 a.m., "Matthew Auld" wrote:
>
> On 26 October 2016 at 00:51, Robert Bragg wrote:
> >
> >
> > On Tue, Oct 25, 2016 at 10:35 PM, Matthew Auld
> > wrote:
> >>
> >> On 25 October 2016 at 00:19, Robert Bragg wrote:
> >
> >
> >>
> >>
> >> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> >> > b/drivers/gpu/drm/i915/i915_drv.h
> >> > index 3448d05..ea24814 100644
> >> > --- a/drivers/gpu/drm/i915/i915_drv.h
> >> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> > @@ -1764,6 +1764,11 @@ struct intel_wm_config {
> >>
> >> >
> >> >  struct drm_i915_private {
> >> > @@ -2149,16 +2164,46 @@ struct drm_i915_private {
> >> >
> >> >         struct {
> >> >                 bool initialized;
> >> > +
> >> >                 struct mutex lock;
> >> >                 struct list_head streams;
> >> >
> >> > +                spinlock_t hook_lock;
> >> > +
> >> >                 struct {
> >> > -                        u32 metrics_set;
> >> > +                        struct i915_perf_stream *exclusive_stream;
> >> > +
> >> > +                        u32 specific_ctx_id;
> >> Can we just get rid of this, now that the vma remains pinned we can
> >> simply get the ggtt address at the time of configuring the OA_CONTROL
> >> register ?
> >
> >
> > I considered that, but would ideally prefer to keep it considering the gen8+
> > patches to come. For gen8+ (with execlists) the context ID isn't a gtt
> > offset.
> >
> >>
> >>
> >> > +
> >> > +                        struct hrtimer poll_check_timer;
> >> > +                        wait_queue_head_t poll_wq;
> >> > +                        atomic_t pollin;
> >> > +
> >>
> >
> >>
> >> > +/* The maximum exponent the hardware accepts is 63 (essentially it
> >> > selects one
> >> > + * of the 64bit timestamp bits to trigger reports from) but there's
> >> > currently
> >> > + * no known use case for sampling as infrequently as once per 47
> >> > thousand years.
> >> > + *
> >> > + * Since the timestamps included in OA reports are only 32bits it seems
> >> > + * reasonable to limit the OA exponent where it's still possible to
> >> > account for
> >> > + * overflow in OA report timestamps.
> >> > + */
> >> > +#define OA_EXPONENT_MAX 31
> >> > +
> >> > +#define INVALID_CTX_ID 0xffffffff
> >> We shouldn't need this anymore.
> >
> >
> > yeah I removed it and then added it back, just for the sake of explicitly
> > setting the specific_ctx_id to an invalid ID when closing the exclusive
> > stream - though resetting the value isn't strictly necessary.
> Can we not make the specific_ctx_id per-stream, the gem context
> already is, then we don't need to be concerned with resetting it ?

Hmm, I'm not sure about that; conceptually, to me, it's global OA unit
state.

Currently the driver only supports a single exclusive stream, while
Sourab later relaxes that to a per-engine stream, and that could be
relaxed further with non-OA metric stream types.

With multiple streams we'll still only be able to program a single ctx
id in OACONTROL.

Conceptually, other stream types could be associated with different
contexts (if they don't depend on the OA unit), so to me stream->ctx
isn't necessarily OA unit state.

It probably could be played around with, but right now we don't track
OA-specific state in the stream. For the ID it's just semantics to say
it's OA state, and we could consider that it's maybe generally useful
to track the ID, even for future non-OA streams. That might mean
potentially redundantly pinning state for the sake of tracking the ID
for streams that don't end up needing it.

--001a1142c2286677af053fc5f0d2
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--001a1142c2286677af053fc5f0d2-- --===============1438516752== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KSW50ZWwtZ2Z4 IG1haWxpbmcgbGlzdApJbnRlbC1nZnhAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vaW50ZWwtZ2Z4Cg== --===============1438516752==--