From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ankit Navik
Subject: Re: [PATCH v2 1/4] drm/i915: Get active pending request for given context
Date: Thu, 14 Mar 2019 14:21:38 +0530
Message-ID:
References: <1541477601-10883-1-git-send-email-ankit.p.navik@intel.com>
 <1541477601-10883-2-git-send-email-ankit.p.navik@intel.com>
 <21c50b40-18f4-ed42-6f88-d231d32efc4b@linux.intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1749424119=="
Return-path:
Received: from mail-pf1-x441.google.com (mail-pf1-x441.google.com [IPv6:2607:f8b0:4864:20::441])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 0BD1B6E24E
 for ; Thu, 14 Mar 2019 08:51:51 +0000 (UTC)
Received: by mail-pf1-x441.google.com with SMTP id n125so3367931pfn.5
 for ; Thu, 14 Mar 2019 01:51:51 -0700 (PDT)
In-Reply-To: <21c50b40-18f4-ed42-6f88-d231d32efc4b@linux.intel.com>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
To: Tvrtko Ursulin
Cc: Ankit Navik , intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

--===============1749424119==
Content-Type: multipart/alternative; boundary="00000000000079ff0c05840a08fe"

--00000000000079ff0c05840a08fe
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Tvrtko,

On Tue, Nov 6, 2018 at 3:14 PM Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:

> On 06/11/2018 04:13, Ankit Navik wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch gives us the active pending request count which is yet
> > to be submitted to the GPU
> >
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c            | 1 +
> >   drivers/gpu/drm/i915/i915_drv.h            | 5 +++++
> >   drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
> >   drivers/gpu/drm/i915/i915_gem_context.h    | 6 ++++++
> >   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
> >   drivers/gpu/drm/i915/intel_lrc.c           | 6 ++++++
> >   6 files changed, 24 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index f8cfd16..d37c46e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
> >       mutex_init(&dev_priv->av_mutex);
> >       mutex_init(&dev_priv->wm.wm_mutex);
> >       mutex_init(&dev_priv->pps_mutex);
> > +     mutex_init(&dev_priv->pred_mutex);
> >
> >       i915_memcpy_init_early(dev_priv);
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 4aca534..137ec33 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1609,6 +1609,11 @@ struct drm_i915_private {
> >        * controller on different i2c buses. */
> >       struct mutex gmbus_mutex;
> >
> > +     /** pred_mutex protects against councurrent usage of pending
> > +      * request counter for multiple contexts
> > +      */
> > +     struct mutex pred_mutex;
> > +
> >       /**
> >        * Base address of the gmbus and gpio block.
> >        */
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index b10770c..0bcbe32 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
> >       }
> >
> >       trace_i915_context_create(ctx);
> > +     atomic_set(&ctx->req_cnt, 0);
> >
> >       return ctx;
> >   }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> > index b116e49..04e3ff7 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -194,6 +194,12 @@ struct i915_gem_context {
> >        * context close.
> >        */
> >       struct list_head handles_list;
> > +
> > +     /** req_cnt: tracks the pending commands, based on which we decide to
> > +      * go for low/medium/high load configuration of the GPU, this is
> > +      * controlled via a mutex
> > +      */
> > +     atomic_t req_cnt;
> >   };
> >
> >   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 3f0c612..8afa2a5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >                       struct drm_syncobj **fences)
> >   {
> >       struct i915_execbuffer eb;
> > +     struct drm_i915_private *dev_priv = to_i915(dev);
> >       struct dma_fence *in_fence = NULL;
> >       struct sync_file *out_fence = NULL;
> >       int out_fence_fd = -1;
> > @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >        */
> >       eb.request->batch = eb.batch;
> >
> > +     mutex_lock(&dev_priv->pred_mutex);
> > +     atomic_inc(&eb.ctx->req_cnt);

> Point of going to atomic_t was to remove need for the mutex.

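For reference, a minimal sketch of the point above. It uses userspace C11 atomics as a stand-in for the kernel's atomic_t, and the ctx_counter names are hypothetical, not taken from the patch. The increment is a single atomic read-modify-write, so no lock is taken around it:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the req_cnt field in the patch. */
struct ctx_counter {
	atomic_int req_cnt;
};

static void ctx_counter_init(struct ctx_counter *c)
{
	atomic_init(&c->req_cnt, 0);
}

/* Submission path: one atomic read-modify-write, no mutex around it. */
static int ctx_counter_queue(struct ctx_counter *c)
{
	int prev = atomic_fetch_add(&c->req_cnt, 1);

	/* A negative value before the increment would mean broken accounting. */
	assert(prev >= 0);
	return prev + 1;
}

/* Reader side: a plain atomic load is enough to sample the count. */
static int ctx_counter_pending(struct ctx_counter *c)
{
	return atomic_load(&c->req_cnt);
}
```

In the quoted hunk this would correspond to keeping only the atomic_inc() line and dropping the mutex_lock()/mutex_unlock() pair around it.
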
> > +     mutex_unlock(&dev_priv->pred_mutex);
> > +
> >       trace_i915_request_queue(eb.request, eb.batch_flags);
> >       err = eb_submit(&eb);
> >   err_request:
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 1744792..bcbb66b 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >                       trace_i915_request_in(rq, port_index(port, execlists));
> >                       last = rq;
> >                       submit = true;
> > +
> > +                     mutex_lock(&rq->i915->pred_mutex);
> > +                     if (atomic_read(&rq->gem_context->req_cnt) > 0)
> > +                             atomic_dec(&rq->gem_context->req_cnt);

> Hitting underflow is a hint accounting does not work as expected. I
> really think you need to fix it by gathering some ideas from the patches
> I've pointed at in the previous round.

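To illustrate the accounting concern, here is a rough sketch in userspace C11. It is not the approach from the patches referred to above, and all names are hypothetical. It assumes the underflow comes from the decrement running more often than the increment, for example if a request passes through the dequeue path more than once. That cause is an assumption, not something stated in this thread. Rather than clamping at zero, each request remembers whether it has been counted, so every increment is paired with exactly one decrement:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request that records whether it is currently counted. */
struct request {
	atomic_int *ctx_req_cnt;	/* pending counter of the owning context */
	bool counted;			/* plain flag: assumes per-request calls are serialized */
};

/* Queue path: count the request once and remember that we did. */
static void request_queue(struct request *rq, atomic_int *cnt)
{
	rq->ctx_req_cnt = cnt;
	rq->counted = true;
	atomic_fetch_add(cnt, 1);
}

/* Submit path: may be reached more than once for the same request. */
static void request_submit(struct request *rq)
{
	int prev;

	if (!rq->counted)
		return;		/* already accounted for, nothing to undo */

	rq->counted = false;
	prev = atomic_fetch_sub(rq->ctx_req_cnt, 1);

	/* With balanced accounting the counter can never drop below zero. */
	assert(prev > 0);
}
```

With this pairing, a second pass through the submit path is a no-op, and the assertion flags an imbalance instead of hiding it behind a "> 0" check.
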
I have submitted patch v4.
I tried the approach you suggested, but didn't see much
power benefit with it.

Regards, Ankit

> And there is also GuC to think about.
>
> Regards,
>
> Tvrtko
>
> > +
> > +                     mutex_unlock(&rq->i915->pred_mutex);
> >               }
> >
> >               rb_erase_cached(&p->node, &execlists->queue);
> >
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--00000000000079ff0c05840a08fe--

--===============1749424119==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline

X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KSW50ZWwtZ2Z4
IG1haWxpbmcgbGlzdApJbnRlbC1nZnhAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlz
dHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vaW50ZWwtZ2Z4

--===============1749424119==--