linux-arm-msm.vger.kernel.org archive mirror
* [RFC 0/3] dma-fence: Add a "boost" mechanism
@ 2021-05-19 18:38 Rob Clark
  2021-05-19 18:38 ` [RFC 1/3] dma-fence: Add boost fence op Rob Clark
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Rob Clark @ 2021-05-19 18:38 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Rob Clark,
	moderated list:DMA BUFFER SHARING FRAMEWORK, open list,
	open list:DMA BUFFER SHARING FRAMEWORK

From: Rob Clark <robdclark@chromium.org>

In some cases, like double-buffered rendering, missing vblanks can
trick the GPU into running at a lower frequency, when really we
want to be running at a higher frequency so that we stop missing
vblanks in the first place.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

The last patch is just a proof of concept; in reality I think it
may want to be a bit more clever.  But I'm sending it out as-is as
an RFC to get feedback.

Rob Clark (3):
  dma-fence: Add boost fence op
  drm/atomic: Call dma_fence_boost() when we've missed a vblank
  drm/msm: Wire up gpu boost

 drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
 drivers/gpu/drm/msm/msm_fence.c     | 10 ++++++++++
 drivers/gpu/drm/msm/msm_gpu.c       | 13 +++++++++++++
 drivers/gpu/drm/msm/msm_gpu.h       |  2 ++
 include/linux/dma-fence.h           | 26 ++++++++++++++++++++++++++
 5 files changed, 62 insertions(+)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC 1/3] dma-fence: Add boost fence op
  2021-05-19 18:38 [RFC 0/3] dma-fence: Add a "boost" mechanism Rob Clark
@ 2021-05-19 18:38 ` Rob Clark
  2021-05-20  6:46   ` Christian König
  2021-05-19 18:38 ` [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank Rob Clark
  2021-05-19 18:38 ` [RFC 3/3] drm/msm: Wire up gpu boost Rob Clark
  2 siblings, 1 reply; 20+ messages in thread
From: Rob Clark @ 2021-05-19 18:38 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Rob Clark, Sumit Semwal,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK, open list

From: Rob Clark <robdclark@chromium.org>

Add a way to hint to the fence signaler that a fence waiter has missed a
deadline waiting on the fence.

In some cases, missing a vblank can result in lower gpu utilization,
when really we want to go in the opposite direction and boost gpu freq.
The boost callback gives some feedback to the fence signaler that we
are missing deadlines, so it can take this into account in its freq/
utilization calculations.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 9f12efaaa93a..172702521acc 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -231,6 +231,17 @@ struct dma_fence_ops {
 	signed long (*wait)(struct dma_fence *fence,
 			    bool intr, signed long timeout);
 
+	/**
+	 * @boost:
+	 *
+	 * Optional callback, to indicate that a fence waiter missed a deadline.
+	 * This can serve as a signal that (if possible) whatever signals the
+	 * fence should boost its clocks.
+	 *
+	 * This can be called in any context that can call dma_fence_wait().
+	 */
+	void (*boost)(struct dma_fence *fence);
+
 	/**
 	 * @release:
 	 *
@@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
 	return ret < 0 ? ret : 0;
 }
 
+/**
+ * dma_fence_boost - hint from waiter that it missed a deadline
+ *
+ * @fence: the fence that caused the missed deadline
+ *
+ * This function gives a hint from a fence waiter that a deadline was
+ * missed, so that the fence signaler can factor this into device
+ * power state decisions.
+ */
+static inline void dma_fence_boost(struct dma_fence *fence)
+{
+	if (fence->ops->boost)
+		fence->ops->boost(fence);
+}
+
 struct dma_fence *dma_fence_get_stub(void);
 u64 dma_fence_context_alloc(unsigned num);
 
-- 
2.30.2



* [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank
  2021-05-19 18:38 [RFC 0/3] dma-fence: Add a "boost" mechanism Rob Clark
  2021-05-19 18:38 ` [RFC 1/3] dma-fence: Add boost fence op Rob Clark
@ 2021-05-19 18:38 ` Rob Clark
  2021-05-20 16:29   ` Daniel Vetter
  2021-05-19 18:38 ` [RFC 3/3] drm/msm: Wire up gpu boost Rob Clark
  2 siblings, 1 reply; 20+ messages in thread
From: Rob Clark @ 2021-05-19 18:38 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Rob Clark, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Daniel Vetter,
	open list

From: Rob Clark <robdclark@chromium.org>

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 560aaecba31b..fe10fc2e7f86 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
 	int i, ret;
 
 	for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+		u64 vblank_count;
+
 		if (!new_plane_state->fence)
 			continue;
 
 		WARN_ON(!new_plane_state->fb);
 
+		vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
+
 		/*
 		 * If waiting for fences pre-swap (ie: nonblock), userspace can
 		 * still interrupt the operation. Instead of blocking until the
@@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
 		if (ret)
 			return ret;
 
+		/*
+		 * Check if we've missed a vblank while waiting, and if we
+		 * have, signal the fence that its signaler should be boosted.
+		 */
+		if (vblank_count != drm_crtc_vblank_count(new_plane_state->crtc))
+			dma_fence_boost(new_plane_state->fence);
+
 		dma_fence_put(new_plane_state->fence);
 		new_plane_state->fence = NULL;
 	}
-- 
2.30.2



* [RFC 3/3] drm/msm: Wire up gpu boost
  2021-05-19 18:38 [RFC 0/3] dma-fence: Add a "boost" mechanism Rob Clark
  2021-05-19 18:38 ` [RFC 1/3] dma-fence: Add boost fence op Rob Clark
  2021-05-19 18:38 ` [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank Rob Clark
@ 2021-05-19 18:38 ` Rob Clark
  2 siblings, 0 replies; 20+ messages in thread
From: Rob Clark @ 2021-05-19 18:38 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Rob Clark, Rob Clark, Sean Paul,
	David Airlie, Daniel Vetter, Sumit Semwal, Christian König,
	open list, open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

From: Rob Clark <robdclark@chromium.org>

Note: at this point I haven't given much consideration to how much
we should boost, or for how long.  And perhaps we should only boost
at less than 50% utilization?  For now, this is only an example of a
dma_fence_boost() implementation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_fence.c | 10 ++++++++++
 drivers/gpu/drm/msm/msm_gpu.c   | 13 +++++++++++++
 drivers/gpu/drm/msm/msm_gpu.h   |  2 ++
 3 files changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index cd59a5918038..e58895603726 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -8,6 +8,7 @@
 
 #include "msm_drv.h"
 #include "msm_fence.h"
+#include "msm_gpu.h"
 
 
 struct msm_fence_context *
@@ -114,10 +115,19 @@ static bool msm_fence_signaled(struct dma_fence *fence)
 	return fence_completed(f->fctx, f->base.seqno);
 }
 
+static void msm_fence_boost(struct dma_fence *fence)
+{
+	struct msm_fence *f = to_msm_fence(fence);
+	struct msm_drm_private *priv = f->fctx->dev->dev_private;
+
+	msm_gpu_boost(priv->gpu);
+}
+
 static const struct dma_fence_ops msm_fence_ops = {
 	.get_driver_name = msm_fence_get_driver_name,
 	.get_timeline_name = msm_fence_get_timeline_name,
 	.signaled = msm_fence_signaled,
+	.boost = msm_fence_boost,
 };
 
 struct dma_fence *
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 9dd1c58430ab..c90b79116500 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -62,6 +62,10 @@ static int msm_devfreq_get_dev_status(struct device *dev,
 	status->total_time = ktime_us_delta(time, gpu->devfreq.time);
 	gpu->devfreq.time = time;
 
+	if (atomic_dec_if_positive(&gpu->devfreq.boost) >= 0) {
+		status->busy_time = status->total_time;
+	}
+
 	return 0;
 }
 
@@ -84,6 +88,15 @@ static struct devfreq_dev_profile msm_devfreq_profile = {
 	.get_cur_freq = msm_devfreq_get_cur_freq,
 };
 
+void msm_gpu_boost(struct msm_gpu *gpu)
+{
+	if (!gpu->funcs->gpu_busy)
+		return;
+
+	/* Add three devfreq polling intervals worth of boost: */
+	atomic_add(3, &gpu->devfreq.boost);
+}
+
 static void msm_devfreq_init(struct msm_gpu *gpu)
 {
 	/* We need target support to do devfreq */
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 18baf935e143..7a082a12d98f 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -150,6 +150,7 @@ struct msm_gpu {
 		struct devfreq *devfreq;
 		u64 busy_cycles;
 		ktime_t time;
+		atomic_t boost;
 	} devfreq;
 
 	uint32_t suspend_count;
@@ -295,6 +296,7 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)
 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
 int msm_gpu_pm_resume(struct msm_gpu *gpu);
 void msm_gpu_resume_devfreq(struct msm_gpu *gpu);
+void msm_gpu_boost(struct msm_gpu *gpu);
 
 int msm_gpu_hw_init(struct msm_gpu *gpu);
 
-- 
2.30.2



* Re: [RFC 1/3] dma-fence: Add boost fence op
  2021-05-19 18:38 ` [RFC 1/3] dma-fence: Add boost fence op Rob Clark
@ 2021-05-20  6:46   ` Christian König
  2021-05-20 14:07     ` Rob Clark
  0 siblings, 1 reply; 20+ messages in thread
From: Christian König @ 2021-05-20  6:46 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: freedreno, linux-arm-msm, Rob Clark, Sumit Semwal,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK, open list

Uff, that looks very hardware specific to me.

As far as I can see you can also implement this completely inside the
backend by starting a timer on enable_signaling, can't you?

Christian.

Am 19.05.21 um 20:38 schrieb Rob Clark:
> From: Rob Clark <robdclark@chromium.org>
>
> Add a way to hint to the fence signaler that a fence waiter has missed a
> deadline waiting on the fence.
>
> In some cases, missing a vblank can result in lower gpu utilization,
> when really we want to go in the opposite direction and boost gpu freq.
> The boost callback gives some feedback to the fence signaler that we
> are missing deadlines, so it can take this into account in it's freq/
> utilization calculations.
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>   include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
>   1 file changed, 26 insertions(+)
>
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 9f12efaaa93a..172702521acc 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -231,6 +231,17 @@ struct dma_fence_ops {
>   	signed long (*wait)(struct dma_fence *fence,
>   			    bool intr, signed long timeout);
>   
> +	/**
> +	 * @boost:
> +	 *
> +	 * Optional callback, to indicate that a fence waiter missed a deadline.
> +	 * This can serve as a signal that (if possible) whatever signals the
> +	 * fence should boost it's clocks.
> +	 *
> +	 * This can be called in any context that can call dma_fence_wait().
> +	 */
> +	void (*boost)(struct dma_fence *fence);
> +
>   	/**
>   	 * @release:
>   	 *
> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
>   	return ret < 0 ? ret : 0;
>   }
>   
> +/**
> + * dma_fence_boost - hint from waiter that it missed a deadline
> + *
> + * @fence: the fence that caused the missed deadline
> + *
> + * This function gives a hint from a fence waiter that a deadline was
> + * missed, so that the fence signaler can factor this in to device
> + * power state decisions
> + */
> +static inline void dma_fence_boost(struct dma_fence *fence)
> +{
> +	if (fence->ops->boost)
> +		fence->ops->boost(fence);
> +}
> +
>   struct dma_fence *dma_fence_get_stub(void);
>   u64 dma_fence_context_alloc(unsigned num);
>   



* Re: [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20  6:46   ` Christian König
@ 2021-05-20 14:07     ` Rob Clark
  2021-05-20 14:11       ` Christian König
  2021-05-20 16:25       ` Daniel Vetter
  0 siblings, 2 replies; 20+ messages in thread
From: Rob Clark @ 2021-05-20 14:07 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, freedreno, linux-arm-msm, Rob Clark, Sumit Semwal,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK, open list

On Wed, May 19, 2021 at 11:47 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Uff, that looks very hardware specific to me.

How so?  I'm not sure I agree.. and even if it were not useful for
some hw, it should be useful for enough drivers (and harm none), so
I still think it is a good idea.

The fallback plan is to go the i915 route and stop using atomic
helpers and do the same thing inside the driver, but that doesn't help
any of the cases where you have a separate kms and gpu driver.

> As far as I can see you can also implement completely inside the backend
> by starting a timer on enable_signaling, don't you?

Not really.. I mean, the fact that something waited on a fence could
be a useful input signal to a gpu freq governor, but it is entirely
insufficient..

If the cpu is spending a lot of time waiting on a fence, cpufreq will
clock down so you spend less time waiting.  And no problem has been
solved.  You absolutely need the concept of a missed deadline, and a
timer doesn't give you that.
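[Editorial sketch: the distinction above can be made concrete with a
tiny userspace model of the waiter-side pattern from patches 1 and 2.
All struct and function names here (fence_ops, fence_boost,
wait_then_check, demo_boost) are made up for illustration and are not
the real kernel API; the point is that the boost only fires when a
vblank actually slipped by during the wait, which a plain timer
cannot tell you.]

```c
#include <assert.h>
#include <stddef.h>

struct fence;

struct fence_ops {
	void (*boost)(struct fence *fence);	/* optional callback */
};

struct fence {
	const struct fence_ops *ops;
};

static int boost_count;

static void demo_boost(struct fence *fence)
{
	(void)fence;
	boost_count++;		/* a real driver would bump gpu freq here */
}

static const struct fence_ops demo_ops = { .boost = demo_boost };

/* Equivalent of the dma_fence_boost() helper in patch 1: a no-op
 * unless the signaler opted in by providing a boost callback. */
static void fence_boost(struct fence *fence)
{
	if (fence->ops->boost)
		fence->ops->boost(fence);
}

/* Waiter side (cf. patch 2): the vblank count is sampled before and
 * after the fence wait; a missed deadline is reported only if a
 * vblank slipped by while we were blocked.  A timeout alone cannot
 * distinguish "slow but on time" from "missed the deadline". */
static void wait_then_check(struct fence *fence,
			    unsigned long vblanks_before,
			    unsigned long vblanks_after)
{
	/* (the actual dma_fence_wait() happens between the samples) */
	if (vblanks_after != vblanks_before)
		fence_boost(fence);
}
```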

BR,
-R

> Christian.
>
> Am 19.05.21 um 20:38 schrieb Rob Clark:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Add a way to hint to the fence signaler that a fence waiter has missed a
> > deadline waiting on the fence.
> >
> > In some cases, missing a vblank can result in lower gpu utilization,
> > when really we want to go in the opposite direction and boost gpu freq.
> > The boost callback gives some feedback to the fence signaler that we
> > are missing deadlines, so it can take this into account in it's freq/
> > utilization calculations.
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >   include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
> >   1 file changed, 26 insertions(+)
> >
> > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > index 9f12efaaa93a..172702521acc 100644
> > --- a/include/linux/dma-fence.h
> > +++ b/include/linux/dma-fence.h
> > @@ -231,6 +231,17 @@ struct dma_fence_ops {
> >       signed long (*wait)(struct dma_fence *fence,
> >                           bool intr, signed long timeout);
> >
> > +     /**
> > +      * @boost:
> > +      *
> > +      * Optional callback, to indicate that a fence waiter missed a deadline.
> > +      * This can serve as a signal that (if possible) whatever signals the
> > +      * fence should boost it's clocks.
> > +      *
> > +      * This can be called in any context that can call dma_fence_wait().
> > +      */
> > +     void (*boost)(struct dma_fence *fence);
> > +
> >       /**
> >        * @release:
> >        *
> > @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> >       return ret < 0 ? ret : 0;
> >   }
> >
> > +/**
> > + * dma_fence_boost - hint from waiter that it missed a deadline
> > + *
> > + * @fence: the fence that caused the missed deadline
> > + *
> > + * This function gives a hint from a fence waiter that a deadline was
> > + * missed, so that the fence signaler can factor this in to device
> > + * power state decisions
> > + */
> > +static inline void dma_fence_boost(struct dma_fence *fence)
> > +{
> > +     if (fence->ops->boost)
> > +             fence->ops->boost(fence);
> > +}
> > +
> >   struct dma_fence *dma_fence_get_stub(void);
> >   u64 dma_fence_context_alloc(unsigned num);
> >
>


* Re: [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 14:07     ` Rob Clark
@ 2021-05-20 14:11       ` Christian König
  2021-05-20 14:54         ` Rob Clark
  2021-05-20 16:25       ` Daniel Vetter
  1 sibling, 1 reply; 20+ messages in thread
From: Christian König @ 2021-05-20 14:11 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, freedreno, linux-arm-msm, Rob Clark, Sumit Semwal,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK, open list



Am 20.05.21 um 16:07 schrieb Rob Clark:
> On Wed, May 19, 2021 at 11:47 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Uff, that looks very hardware specific to me.
> Howso?  I'm not sure I agree.. and even if it was not useful for some
> hw, it should be useful for enough drivers (and harm no drivers), so I
> still think it is a good idea
>
> The fallback plan is to go the i915 route and stop using atomic
> helpers and do the same thing inside the driver, but that doesn't help
> any of the cases where you have a separate kms and gpu driver.

Yeah, that's certainly not something we want.

>> As far as I can see you can also implement completely inside the backend
>> by starting a timer on enable_signaling, don't you?
> Not really.. I mean, the fact that something waited on a fence could
> be a useful input signal to gpu freq governor, but it is entirely
> insufficient..
>
> If the cpu is spending a lot of time waiting on a fence, cpufreq will
> clock down so you spend less time waiting.  And no problem has been
> solved.  You absolutely need the concept of a missed deadline, and a
> timer doesn't give you that.

Ok then I probably don't understand the use case here.

What exactly do you try to solve?

Thanks,
Christian.

>
> BR,
> -R
>
>> Christian.
>>
>> Am 19.05.21 um 20:38 schrieb Rob Clark:
>>> From: Rob Clark <robdclark@chromium.org>
>>>
>>> Add a way to hint to the fence signaler that a fence waiter has missed a
>>> deadline waiting on the fence.
>>>
>>> In some cases, missing a vblank can result in lower gpu utilization,
>>> when really we want to go in the opposite direction and boost gpu freq.
>>> The boost callback gives some feedback to the fence signaler that we
>>> are missing deadlines, so it can take this into account in it's freq/
>>> utilization calculations.
>>>
>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>> ---
>>>    include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
>>>    1 file changed, 26 insertions(+)
>>>
>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>> index 9f12efaaa93a..172702521acc 100644
>>> --- a/include/linux/dma-fence.h
>>> +++ b/include/linux/dma-fence.h
>>> @@ -231,6 +231,17 @@ struct dma_fence_ops {
>>>        signed long (*wait)(struct dma_fence *fence,
>>>                            bool intr, signed long timeout);
>>>
>>> +     /**
>>> +      * @boost:
>>> +      *
>>> +      * Optional callback, to indicate that a fence waiter missed a deadline.
>>> +      * This can serve as a signal that (if possible) whatever signals the
>>> +      * fence should boost it's clocks.
>>> +      *
>>> +      * This can be called in any context that can call dma_fence_wait().
>>> +      */
>>> +     void (*boost)(struct dma_fence *fence);
>>> +
>>>        /**
>>>         * @release:
>>>         *
>>> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
>>>        return ret < 0 ? ret : 0;
>>>    }
>>>
>>> +/**
>>> + * dma_fence_boost - hint from waiter that it missed a deadline
>>> + *
>>> + * @fence: the fence that caused the missed deadline
>>> + *
>>> + * This function gives a hint from a fence waiter that a deadline was
>>> + * missed, so that the fence signaler can factor this in to device
>>> + * power state decisions
>>> + */
>>> +static inline void dma_fence_boost(struct dma_fence *fence)
>>> +{
>>> +     if (fence->ops->boost)
>>> +             fence->ops->boost(fence);
>>> +}
>>> +
>>>    struct dma_fence *dma_fence_get_stub(void);
>>>    u64 dma_fence_context_alloc(unsigned num);
>>>



* Re: [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 14:11       ` Christian König
@ 2021-05-20 14:54         ` Rob Clark
  2021-05-20 16:01           ` [Linaro-mm-sig] " Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Rob Clark @ 2021-05-20 14:54 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, freedreno, linux-arm-msm, Rob Clark, Sumit Semwal,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK, open list

On Thu, May 20, 2021 at 7:11 AM Christian König
<christian.koenig@amd.com> wrote:
>
>
>
> Am 20.05.21 um 16:07 schrieb Rob Clark:
> > On Wed, May 19, 2021 at 11:47 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Uff, that looks very hardware specific to me.
> > Howso?  I'm not sure I agree.. and even if it was not useful for some
> > hw, it should be useful for enough drivers (and harm no drivers), so I
> > still think it is a good idea
> >
> > The fallback plan is to go the i915 route and stop using atomic
> > helpers and do the same thing inside the driver, but that doesn't help
> > any of the cases where you have a separate kms and gpu driver.
>
> Yeah, that's certainly not something we want.
>
> >> As far as I can see you can also implement completely inside the backend
> >> by starting a timer on enable_signaling, don't you?
> > Not really.. I mean, the fact that something waited on a fence could
> > be a useful input signal to gpu freq governor, but it is entirely
> > insufficient..
> >
> > If the cpu is spending a lot of time waiting on a fence, cpufreq will
> > clock down so you spend less time waiting.  And no problem has been
> > solved.  You absolutely need the concept of a missed deadline, and a
> > timer doesn't give you that.
>
> Ok then I probably don't understand the use case here.
>
> What exactly do you try to solve?

Basically situations where you are ping-ponging between GPU and CPU..
for example if you are double buffering instead of triple buffering,
and doing vblank sync'd pageflips.  The GPU, without any extra signal,
could get stuck at 30fps and a low gpu freq, because it ends up idle
while waiting an extra vblank cycle for the next back-buffer to
become available.  Whereas if it boosted to a higher freq and stopped
missing the vblank deadline, it would be less idle because it would
get the next back-buffer sooner.
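[Editorial sketch: a rough userspace model of how patch 3 turns that
boost hint into a frequency change via devfreq.  The names
(gpu_devfreq, gpu_boost, get_dev_status) are illustrative stand-ins,
not the kernel API: each boost request credits three polling
intervals during which the GPU reports itself fully busy, nudging
the governor toward a higher clock.]

```c
#include <assert.h>

struct gpu_devfreq {
	int boost;		/* remaining boosted polling intervals */
	long busy_time;
	long total_time;
};

static void gpu_boost(struct gpu_devfreq *df)
{
	/* three devfreq polling intervals worth of boost */
	df->boost += 3;
}

/* Called once per devfreq polling interval; while boost credits
 * remain, report the GPU as 100% busy so the governor ramps the
 * clocks up instead of down. */
static void get_dev_status(struct gpu_devfreq *df, long busy, long total)
{
	df->busy_time = busy;
	df->total_time = total;

	if (df->boost > 0) {	/* atomic_dec_if_positive() in the patch */
		df->boost--;
		df->busy_time = df->total_time;
	}
}
```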

BR,
-R

> Thanks,
> Christian.
>
> >
> > BR,
> > -R
> >
> >> Christian.
> >>
> >> Am 19.05.21 um 20:38 schrieb Rob Clark:
> >>> From: Rob Clark <robdclark@chromium.org>
> >>>
> >>> Add a way to hint to the fence signaler that a fence waiter has missed a
> >>> deadline waiting on the fence.
> >>>
> >>> In some cases, missing a vblank can result in lower gpu utilization,
> >>> when really we want to go in the opposite direction and boost gpu freq.
> >>> The boost callback gives some feedback to the fence signaler that we
> >>> are missing deadlines, so it can take this into account in it's freq/
> >>> utilization calculations.
> >>>
> >>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>> ---
> >>>    include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
> >>>    1 file changed, 26 insertions(+)
> >>>
> >>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> >>> index 9f12efaaa93a..172702521acc 100644
> >>> --- a/include/linux/dma-fence.h
> >>> +++ b/include/linux/dma-fence.h
> >>> @@ -231,6 +231,17 @@ struct dma_fence_ops {
> >>>        signed long (*wait)(struct dma_fence *fence,
> >>>                            bool intr, signed long timeout);
> >>>
> >>> +     /**
> >>> +      * @boost:
> >>> +      *
> >>> +      * Optional callback, to indicate that a fence waiter missed a deadline.
> >>> +      * This can serve as a signal that (if possible) whatever signals the
> >>> +      * fence should boost it's clocks.
> >>> +      *
> >>> +      * This can be called in any context that can call dma_fence_wait().
> >>> +      */
> >>> +     void (*boost)(struct dma_fence *fence);
> >>> +
> >>>        /**
> >>>         * @release:
> >>>         *
> >>> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> >>>        return ret < 0 ? ret : 0;
> >>>    }
> >>>
> >>> +/**
> >>> + * dma_fence_boost - hint from waiter that it missed a deadline
> >>> + *
> >>> + * @fence: the fence that caused the missed deadline
> >>> + *
> >>> + * This function gives a hint from a fence waiter that a deadline was
> >>> + * missed, so that the fence signaler can factor this in to device
> >>> + * power state decisions
> >>> + */
> >>> +static inline void dma_fence_boost(struct dma_fence *fence)
> >>> +{
> >>> +     if (fence->ops->boost)
> >>> +             fence->ops->boost(fence);
> >>> +}
> >>> +
> >>>    struct dma_fence *dma_fence_get_stub(void);
> >>>    u64 dma_fence_context_alloc(unsigned num);
> >>>
>


* Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 14:54         ` Rob Clark
@ 2021-05-20 16:01           ` Christian König
  2021-05-20 16:34             ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Christian König @ 2021-05-20 16:01 UTC (permalink / raw)
  To: Rob Clark, Christian König
  Cc: Rob Clark, linux-arm-msm, open list, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, freedreno,
	open list:DMA BUFFER SHARING FRAMEWORK

Am 20.05.21 um 16:54 schrieb Rob Clark:
> On Thu, May 20, 2021 at 7:11 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>>
>> Am 20.05.21 um 16:07 schrieb Rob Clark:
>>> On Wed, May 19, 2021 at 11:47 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Uff, that looks very hardware specific to me.
>>> Howso?  I'm not sure I agree.. and even if it was not useful for some
>>> hw, it should be useful for enough drivers (and harm no drivers), so I
>>> still think it is a good idea
>>>
>>> The fallback plan is to go the i915 route and stop using atomic
>>> helpers and do the same thing inside the driver, but that doesn't help
>>> any of the cases where you have a separate kms and gpu driver.
>> Yeah, that's certainly not something we want.
>>
>>>> As far as I can see you can also implement completely inside the backend
>>>> by starting a timer on enable_signaling, don't you?
>>> Not really.. I mean, the fact that something waited on a fence could
>>> be a useful input signal to gpu freq governor, but it is entirely
>>> insufficient..
>>>
>>> If the cpu is spending a lot of time waiting on a fence, cpufreq will
>>> clock down so you spend less time waiting.  And no problem has been
>>> solved.  You absolutely need the concept of a missed deadline, and a
>>> timer doesn't give you that.
>> Ok then I probably don't understand the use case here.
>>
>> What exactly do you try to solve?
> Basically situations where you are ping-ponging between GPU and CPU..
> for example if you are double buffering instead of triple buffering,
> and doing vblank sync'd pageflips.  The GPU, without any extra signal,
> could get stuck at 30fps and a low gpu freq, because it ends up idle
> while waiting for an extra vblank cycle for the next back-buffer to
> become available.  Whereas if it boosted up to a higher freq and
> stopped missing a vblank deadline, it would be less idle due to
> getting the next back-buffer sooner (due to not missing a vblank
> deadline).

Ok, that is the why, but what about the how?

How does it help to have this boost callback rather than just starting
a timer on enable_signaling and stopping it when the signal arrives?

Regards,
Christian.

>
> BR,
> -R
>
>> Thanks,
>> Christian.
>>
>>> BR,
>>> -R
>>>
>>>> Christian.
>>>>
>>>> Am 19.05.21 um 20:38 schrieb Rob Clark:
>>>>> From: Rob Clark <robdclark@chromium.org>
>>>>>
>>>>> Add a way to hint to the fence signaler that a fence waiter has missed a
>>>>> deadline waiting on the fence.
>>>>>
>>>>> In some cases, missing a vblank can result in lower gpu utilization,
>>>>> when really we want to go in the opposite direction and boost gpu freq.
>>>>> The boost callback gives some feedback to the fence signaler that we
>>>>> are missing deadlines, so it can take this into account in it's freq/
>>>>> utilization calculations.
>>>>>
>>>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>>>> ---
>>>>>     include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
>>>>>     1 file changed, 26 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>>> index 9f12efaaa93a..172702521acc 100644
>>>>> --- a/include/linux/dma-fence.h
>>>>> +++ b/include/linux/dma-fence.h
>>>>> @@ -231,6 +231,17 @@ struct dma_fence_ops {
>>>>>         signed long (*wait)(struct dma_fence *fence,
>>>>>                             bool intr, signed long timeout);
>>>>>
>>>>> +     /**
>>>>> +      * @boost:
>>>>> +      *
>>>>> +      * Optional callback, to indicate that a fence waiter missed a deadline.
>>>>> +      * This can serve as a signal that (if possible) whatever signals the
>>>>> +      * fence should boost it's clocks.
>>>>> +      *
>>>>> +      * This can be called in any context that can call dma_fence_wait().
>>>>> +      */
>>>>> +     void (*boost)(struct dma_fence *fence);
>>>>> +
>>>>>         /**
>>>>>          * @release:
>>>>>          *
>>>>> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
>>>>>         return ret < 0 ? ret : 0;
>>>>>     }
>>>>>
>>>>> +/**
>>>>> + * dma_fence_boost - hint from waiter that it missed a deadline
>>>>> + *
>>>>> + * @fence: the fence that caused the missed deadline
>>>>> + *
>>>>> + * This function gives a hint from a fence waiter that a deadline was
>>>>> + * missed, so that the fence signaler can factor this in to device
>>>>> + * power state decisions
>>>>> + */
>>>>> +static inline void dma_fence_boost(struct dma_fence *fence)
>>>>> +{
>>>>> +     if (fence->ops->boost)
>>>>> +             fence->ops->boost(fence);
>>>>> +}
>>>>> +
>>>>>     struct dma_fence *dma_fence_get_stub(void);
>>>>>     u64 dma_fence_context_alloc(unsigned num);
>>>>>
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-mm-sig



* Re: [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 14:07     ` Rob Clark
  2021-05-20 14:11       ` Christian König
@ 2021-05-20 16:25       ` Daniel Vetter
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Vetter @ 2021-05-20 16:25 UTC (permalink / raw)
  To: Rob Clark, Matthew Brost
  Cc: Christian König, Rob Clark, linux-arm-msm, open list,
	dri-devel, moderated list:DMA BUFFER SHARING FRAMEWORK,
	freedreno, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, May 20, 2021 at 4:03 PM Rob Clark <robdclark@gmail.com> wrote:
>
> On Wed, May 19, 2021 at 11:47 PM Christian König
> <christian.koenig@amd.com> wrote:
> >
> > Uff, that looks very hardware specific to me.
>
> Howso?  I'm not sure I agree.. and even if it was not useful for some
> hw, it should be useful for enough drivers (and harm no drivers), so I
> still think it is a good idea
>
> The fallback plan is to go the i915 route and stop using atomic
> helpers and do the same thing inside the driver, but that doesn't help
> any of the cases where you have a separate kms and gpu driver.

Don't, because the i915 plan is to actually move towards drm/scheduler
and atomic helpers.

> > As far as I can see you can also implement completely inside the backend
> > by starting a timer on enable_signaling, don't you?
>
> Not really.. I mean, the fact that something waited on a fence could
> be a useful input signal to gpu freq governor, but it is entirely
> insufficient..
>
> If the cpu is spending a lot of time waiting on a fence, cpufreq will
> clock down so you spend less time waiting.  And no problem has been
> solved.  You absolutely need the concept of a missed deadline, and a
> timer doesn't give you that.

Yup agreed.

Adding Matt Brost, since he's planning all this boostback work.
-Daniel

>
> BR,
> -R
>
> > Christian.
> >
> > Am 19.05.21 um 20:38 schrieb Rob Clark:
> > > From: Rob Clark <robdclark@chromium.org>
> > >
> > > Add a way to hint to the fence signaler that a fence waiter has missed a
> > > deadline waiting on the fence.
> > >
> > > In some cases, missing a vblank can result in lower gpu utilization,
> > > when really we want to go in the opposite direction and boost gpu freq.
> > > The boost callback gives some feedback to the fence signaler that we
> > > are missing deadlines, so it can take this into account in it's freq/
> > > utilization calculations.
> > >
> > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > ---
> > >   include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
> > >   1 file changed, 26 insertions(+)
> > >
> > > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > > index 9f12efaaa93a..172702521acc 100644
> > > --- a/include/linux/dma-fence.h
> > > +++ b/include/linux/dma-fence.h
> > > @@ -231,6 +231,17 @@ struct dma_fence_ops {
> > >       signed long (*wait)(struct dma_fence *fence,
> > >                           bool intr, signed long timeout);
> > >
> > > +     /**
> > > +      * @boost:
> > > +      *
> > > +      * Optional callback, to indicate that a fence waiter missed a deadline.
> > > +      * This can serve as a signal that (if possible) whatever signals the
> > > +      * fence should boost it's clocks.
> > > +      *
> > > +      * This can be called in any context that can call dma_fence_wait().
> > > +      */
> > > +     void (*boost)(struct dma_fence *fence);
> > > +
> > >       /**
> > >        * @release:
> > >        *
> > > @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> > >       return ret < 0 ? ret : 0;
> > >   }
> > >
> > > +/**
> > > + * dma_fence_boost - hint from waiter that it missed a deadline
> > > + *
> > > + * @fence: the fence that caused the missed deadline
> > > + *
> > > + * This function gives a hint from a fence waiter that a deadline was
> > > + * missed, so that the fence signaler can factor this in to device
> > > + * power state decisions
> > > + */
> > > +static inline void dma_fence_boost(struct dma_fence *fence)
> > > +{
> > > +     if (fence->ops->boost)
> > > +             fence->ops->boost(fence);
> > > +}
> > > +
> > >   struct dma_fence *dma_fence_get_stub(void);
> > >   u64 dma_fence_context_alloc(unsigned num);
> > >
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank
  2021-05-19 18:38 ` [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank Rob Clark
@ 2021-05-20 16:29   ` Daniel Vetter
  2021-05-30 14:33     ` Rob Clark
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-05-20 16:29 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, freedreno, linux-arm-msm, Rob Clark,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Daniel Vetter, open list, Matthew Brost

On Wed, May 19, 2021 at 11:38:53AM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>  drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 560aaecba31b..fe10fc2e7f86 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
>  	int i, ret;
>  
>  	for_each_new_plane_in_state(state, plane, new_plane_state, i) {
> +		u64 vblank_count;
> +
>  		if (!new_plane_state->fence)
>  			continue;
>  
>  		WARN_ON(!new_plane_state->fb);
>  
> +		vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
> +
>  		/*
>  		 * If waiting for fences pre-swap (ie: nonblock), userspace can
>  		 * still interrupt the operation. Instead of blocking until the
> @@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
>  		if (ret)
>  			return ret;
>  
> +		/*
> +		 * Check if we've missed a vblank while waiting, and if we have,
> +		 * signal the fence that its signaler should be boosted
> +		 */
> +		if (vblank_count != drm_crtc_vblank_count(new_plane_state->crtc))
> +			dma_fence_boost(new_plane_state->fence);

I think we should do a lot better here:
- maybe only bother doing this for single-crtc updates, and only if
  modeset isn't set. No one else cares about latency.

- We should boost _right_ when we've missed the frame, so I think we
  should have a _timeout wait here that guesstimates when the vblank is
  over (might need to throw in a vblank wait if we missed) and then boost
  immediately. Not wait a bunch of frames (worst case) until we finally
  decide to boost.
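A tiny userspace model of that second point, i.e. wait only until the estimated vblank and boost immediately on timeout (all names and types below are illustrative, not the kernel API):

```c
/*
 * Userspace model of the "boost right when we've missed the frame" idea:
 * the wait is bounded by the estimated vblank deadline, and a miss kicks
 * the signaler immediately instead of several frames later.  Everything
 * here is a made-up sketch, not kernel code.
 */
#include <assert.h>
#include <stdbool.h>

struct model_fence {
	bool signaled;
	int boost_count;	/* times the signaler was asked to boost */
};

/*
 * Returns true if the fence signaled in time for the target vblank.
 * On a miss, boosts right away, once per missed deadline.
 */
static bool wait_fence_for_vblank(struct model_fence *f,
				  unsigned long signal_time_us,
				  unsigned long vblank_deadline_us)
{
	if (f->signaled && signal_time_us <= vblank_deadline_us)
		return true;

	f->boost_count++;	/* deadline missed: kick the signaler now */
	return false;
}
```

The point of the model is only the ordering: the boost happens at the deadline, not at some later vblank-count comparison.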

Otherwise I really like this, I think it's about the only real reason i915
isn't using atomic helpers.

Also adding Matt B for this topic.
-Daniel

> +
>  		dma_fence_put(new_plane_state->fence);
>  		new_plane_state->fence = NULL;
>  	}
> -- 
> 2.30.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 16:01           ` [Linaro-mm-sig] " Christian König
@ 2021-05-20 16:34             ` Daniel Vetter
  2021-05-20 16:40               ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-05-20 16:34 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Christian König, Rob Clark, linux-arm-msm,
	open list, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, freedreno,
	open list:DMA BUFFER SHARING FRAMEWORK

On Thu, May 20, 2021 at 06:01:39PM +0200, Christian König wrote:
> Am 20.05.21 um 16:54 schrieb Rob Clark:
> > On Thu, May 20, 2021 at 7:11 AM Christian König
> > <christian.koenig@amd.com> wrote:
> > > 
> > > 
> > > Am 20.05.21 um 16:07 schrieb Rob Clark:
> > > > On Wed, May 19, 2021 at 11:47 PM Christian König
> > > > <christian.koenig@amd.com> wrote:
> > > > > Uff, that looks very hardware specific to me.
> > > > Howso?  I'm not sure I agree.. and even if it was not useful for some
> > > > hw, it should be useful for enough drivers (and harm no drivers), so I
> > > > still think it is a good idea
> > > > 
> > > > The fallback plan is to go the i915 route and stop using atomic
> > > > helpers and do the same thing inside the driver, but that doesn't help
> > > > any of the cases where you have a separate kms and gpu driver.
> > > Yeah, that's certainly not something we want.
> > > 
> > > > > As far as I can see you can also implement completely inside the backend
> > > > > by starting a timer on enable_signaling, don't you?
> > > > Not really.. I mean, the fact that something waited on a fence could
> > > > be a useful input signal to gpu freq governor, but it is entirely
> > > > insufficient..
> > > > 
> > > > If the cpu is spending a lot of time waiting on a fence, cpufreq will
> > > > clock down so you spend less time waiting.  And no problem has been
> > > > solved.  You absolutely need the concept of a missed deadline, and a
> > > > timer doesn't give you that.
> > > Ok then I probably don't understand the use case here.
> > > 
> > > What exactly do you try to solve?
> > Basically situations where you are ping-ponging between GPU and CPU..
> > for example if you are double buffering instead of triple buffering,
> > and doing vblank sync'd pageflips.  The GPU, without any extra signal,
> > could get stuck at 30fps and a low gpu freq, because it ends up idle
> > while waiting for an extra vblank cycle for the next back-buffer to
> > become available.  Whereas if it boosted up to a higher freq and
> > stopped missing a vblank deadline, it would be less idle due to
> > getting the next back-buffer sooner (due to not missing a vblank
> > deadline).
> 
> Ok, that is the why, but what about the how?
> 
> How does it help to have this boost callback and not just start a timer on
> enable signaling and stop it when the signal arrives?

Because the render side (or drm/scheduler, if msm would use that) has no
idea which vblank a rendering is actually for.

So boosting right when you've missed your frame (not what Rob implements
currently, but fixable) is the right semantics.

The other issue is that for cpu waits, we want to differentiate between fence
waits that userspace does intentionally (e.g. wait ioctl) and waits that
random other things are doing within the kernel to keep track of progress.

For the former we know that userspace is stuck waiting for the gpu, and we
probably want to boost. For the latter we most definitely do _not_ want to
boost.

Otoh I do agree with you that the current api is a bit awkward, so perhaps
we do need a dma_fence_userspace_wait wrapper which boosts automatically
after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you
give it a vblank target, and if the fence isn't signalled by then, we kick
it real hard.
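As a rough userspace model of that split (both wrapper names are hypothetical — neither exists in the kernel as of this thread):

```c
/*
 * Sketch of the two kinds of wait discussed above.  An explicit wait on
 * behalf of userspace may boost the signaler once its grace period
 * expires; a kernel-internal bookkeeping wait never boosts.  All names
 * and types are made up for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_fence;

struct model_fence_ops {
	/* optional, mirroring the proposed ->boost() op */
	void (*boost)(struct model_fence *fence);
};

struct model_fence {
	const struct model_fence_ops *ops;
	bool signaled;
	int boost_count;
};

/* Explicit userspace wait: boost if still unsignaled past the grace time. */
static bool model_fence_userspace_wait(struct model_fence *f,
				       unsigned long waited_us,
				       unsigned long grace_us)
{
	if (f->signaled)
		return true;
	if (waited_us >= grace_us && f->ops && f->ops->boost)
		f->ops->boost(f);	/* optional op, like the RFC's helper */
	return false;
}

/* Kernel-internal wait: tracks progress only, never boosts. */
static bool model_fence_internal_wait(struct model_fence *f)
{
	return f->signaled;
}

static void model_boost(struct model_fence *f)
{
	f->boost_count++;
}
```

The NULL check on the op mirrors the RFC's inline dma_fence_boost() helper; the difference between the two wrappers is purely who is waiting and why.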

But otherwise yes this is absolutely a thing that matters a ton. If you
look at Matt Brost's scheduler rfc, there's also a line item in there
about adding this kind of boosting to drm/scheduler.
-Daniel


> 
> Regards,
> Christian.
> 
> > 
> > BR,
> > -R
> > 
> > > Thanks,
> > > Christian.
> > > 
> > > > BR,
> > > > -R
> > > > 
> > > > > Christian.
> > > > > 
> > > > > Am 19.05.21 um 20:38 schrieb Rob Clark:
> > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > 
> > > > > > Add a way to hint to the fence signaler that a fence waiter has missed a
> > > > > > deadline waiting on the fence.
> > > > > > 
> > > > > > In some cases, missing a vblank can result in lower gpu utilization,
> > > > > > when really we want to go in the opposite direction and boost gpu freq.
> > > > > > The boost callback gives some feedback to the fence signaler that we
> > > > > > are missing deadlines, so it can take this into account in it's freq/
> > > > > > utilization calculations.
> > > > > > 
> > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > > > ---
> > > > > >     include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
> > > > > >     1 file changed, 26 insertions(+)
> > > > > > 
> > > > > > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > > > > > index 9f12efaaa93a..172702521acc 100644
> > > > > > --- a/include/linux/dma-fence.h
> > > > > > +++ b/include/linux/dma-fence.h
> > > > > > @@ -231,6 +231,17 @@ struct dma_fence_ops {
> > > > > >         signed long (*wait)(struct dma_fence *fence,
> > > > > >                             bool intr, signed long timeout);
> > > > > > 
> > > > > > +     /**
> > > > > > +      * @boost:
> > > > > > +      *
> > > > > > +      * Optional callback, to indicate that a fence waiter missed a deadline.
> > > > > > +      * This can serve as a signal that (if possible) whatever signals the
> > > > > > +      * fence should boost it's clocks.
> > > > > > +      *
> > > > > > +      * This can be called in any context that can call dma_fence_wait().
> > > > > > +      */
> > > > > > +     void (*boost)(struct dma_fence *fence);
> > > > > > +
> > > > > >         /**
> > > > > >          * @release:
> > > > > >          *
> > > > > > @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> > > > > >         return ret < 0 ? ret : 0;
> > > > > >     }
> > > > > > 
> > > > > > +/**
> > > > > > + * dma_fence_boost - hint from waiter that it missed a deadline
> > > > > > + *
> > > > > > + * @fence: the fence that caused the missed deadline
> > > > > > + *
> > > > > > + * This function gives a hint from a fence waiter that a deadline was
> > > > > > + * missed, so that the fence signaler can factor this in to device
> > > > > > + * power state decisions
> > > > > > + */
> > > > > > +static inline void dma_fence_boost(struct dma_fence *fence)
> > > > > > +{
> > > > > > +     if (fence->ops->boost)
> > > > > > +             fence->ops->boost(fence);
> > > > > > +}
> > > > > > +
> > > > > >     struct dma_fence *dma_fence_get_stub(void);
> > > > > >     u64 dma_fence_context_alloc(unsigned num);
> > > > > > 
> > _______________________________________________
> > Linaro-mm-sig mailing list
> > Linaro-mm-sig@lists.linaro.org
> > https://lists.linaro.org/mailman/listinfo/linaro-mm-sig
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 16:34             ` Daniel Vetter
@ 2021-05-20 16:40               ` Christian König
  2021-05-20 17:08                 ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Christian König @ 2021-05-20 16:40 UTC (permalink / raw)
  To: Christian König, Rob Clark, Rob Clark, linux-arm-msm,
	open list, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, freedreno,
	open list:DMA BUFFER SHARING FRAMEWORK

Am 20.05.21 um 18:34 schrieb Daniel Vetter:
> On Thu, May 20, 2021 at 06:01:39PM +0200, Christian König wrote:
>> Am 20.05.21 um 16:54 schrieb Rob Clark:
>>> On Thu, May 20, 2021 at 7:11 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> Am 20.05.21 um 16:07 schrieb Rob Clark:
>>>>> On Wed, May 19, 2021 at 11:47 PM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Uff, that looks very hardware specific to me.
>>>>> Howso?  I'm not sure I agree.. and even if it was not useful for some
>>>>> hw, it should be useful for enough drivers (and harm no drivers), so I
>>>>> still think it is a good idea
>>>>>
>>>>> The fallback plan is to go the i915 route and stop using atomic
>>>>> helpers and do the same thing inside the driver, but that doesn't help
>>>>> any of the cases where you have a separate kms and gpu driver.
>>>> Yeah, that's certainly not something we want.
>>>>
>>>>>> As far as I can see you can also implement completely inside the backend
>>>>>> by starting a timer on enable_signaling, don't you?
>>>>> Not really.. I mean, the fact that something waited on a fence could
>>>>> be a useful input signal to gpu freq governor, but it is entirely
>>>>> insufficient..
>>>>>
>>>>> If the cpu is spending a lot of time waiting on a fence, cpufreq will
>>>>> clock down so you spend less time waiting.  And no problem has been
>>>>> solved.  You absolutely need the concept of a missed deadline, and a
>>>>> timer doesn't give you that.
>>>> Ok then I probably don't understand the use case here.
>>>>
>>>> What exactly do you try to solve?
>>> Basically situations where you are ping-ponging between GPU and CPU..
>>> for example if you are double buffering instead of triple buffering,
>>> and doing vblank sync'd pageflips.  The GPU, without any extra signal,
>>> could get stuck at 30fps and a low gpu freq, because it ends up idle
>>> while waiting for an extra vblank cycle for the next back-buffer to
>>> become available.  Whereas if it boosted up to a higher freq and
>>> stopped missing a vblank deadline, it would be less idle due to
>>> getting the next back-buffer sooner (due to not missing a vblank
>>> deadline).
>> Ok, that is the why, but what about the how?
>>
>> How does it help to have this boost callback and not just start a timer on
>> enable signaling and stop it when the signal arrives?
> Because the render side (or drm/scheduler, if msm would use that) has no
> idea which vblank a rendering is actually for.

AH! So we are basically telling the fence backend that we have just 
missed an event we waited for.

So what we want to know is how long the frontend wanted to wait instead 
of how long the backend took for rendering.

> So boosting right when you've missed your frame (not what Rob implements
> currently, but fixable) is the right semantics.
>
> The other issue is that for cpu waits, we want to differentiate between fence
> waits that userspace does intentionally (e.g. wait ioctl) and waits that
> random other things are doing within the kernel to keep track of progress.
>
> For the former we know that userspace is stuck waiting for the gpu, and we
> probably want to boost. For the latter we most definitely do _not_ want to
> boost.
>
> Otoh I do agree with you that the current api is a bit awkward, so perhaps
> we do need a dma_fence_userspace_wait wrapper which boosts automatically
> after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you
> give it a vblank target, and if the fence isn't signalled by then, we kick
> it real hard.

Yeah, something like a use-case-driven API would be nice to have.

For this particular case I suggest that we somehow extend the enable 
signaling callback.

> But otherwise yes this is absolutely a thing that matters a ton. If you
> look at Matt Brost's scheduler rfc, there's also a line item in there
> about adding this kind of boosting to drm/scheduler.

BTW: I still can't see this in my inbox.

Do you have a link?

Christian.

> -Daniel
>
>
>> Regards,
>> Christian.
>>
>>> BR,
>>> -R
>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> Am 19.05.21 um 20:38 schrieb Rob Clark:
>>>>>>> From: Rob Clark <robdclark@chromium.org>
>>>>>>>
>>>>>>> Add a way to hint to the fence signaler that a fence waiter has missed a
>>>>>>> deadline waiting on the fence.
>>>>>>>
>>>>>>> In some cases, missing a vblank can result in lower gpu utilization,
>>>>>>> when really we want to go in the opposite direction and boost gpu freq.
>>>>>>> The boost callback gives some feedback to the fence signaler that we
>>>>>>> are missing deadlines, so it can take this into account in it's freq/
>>>>>>> utilization calculations.
>>>>>>>
>>>>>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>>>>>> ---
>>>>>>>      include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
>>>>>>>      1 file changed, 26 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>>>>> index 9f12efaaa93a..172702521acc 100644
>>>>>>> --- a/include/linux/dma-fence.h
>>>>>>> +++ b/include/linux/dma-fence.h
>>>>>>> @@ -231,6 +231,17 @@ struct dma_fence_ops {
>>>>>>>          signed long (*wait)(struct dma_fence *fence,
>>>>>>>                              bool intr, signed long timeout);
>>>>>>>
>>>>>>> +     /**
>>>>>>> +      * @boost:
>>>>>>> +      *
>>>>>>> +      * Optional callback, to indicate that a fence waiter missed a deadline.
>>>>>>> +      * This can serve as a signal that (if possible) whatever signals the
>>>>>>> +      * fence should boost it's clocks.
>>>>>>> +      *
>>>>>>> +      * This can be called in any context that can call dma_fence_wait().
>>>>>>> +      */
>>>>>>> +     void (*boost)(struct dma_fence *fence);
>>>>>>> +
>>>>>>>          /**
>>>>>>>           * @release:
>>>>>>>           *
>>>>>>> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
>>>>>>>          return ret < 0 ? ret : 0;
>>>>>>>      }
>>>>>>>
>>>>>>> +/**
>>>>>>> + * dma_fence_boost - hint from waiter that it missed a deadline
>>>>>>> + *
>>>>>>> + * @fence: the fence that caused the missed deadline
>>>>>>> + *
>>>>>>> + * This function gives a hint from a fence waiter that a deadline was
>>>>>>> + * missed, so that the fence signaler can factor this in to device
>>>>>>> + * power state decisions
>>>>>>> + */
>>>>>>> +static inline void dma_fence_boost(struct dma_fence *fence)
>>>>>>> +{
>>>>>>> +     if (fence->ops->boost)
>>>>>>> +             fence->ops->boost(fence);
>>>>>>> +}
>>>>>>> +
>>>>>>>      struct dma_fence *dma_fence_get_stub(void);
>>>>>>>      u64 dma_fence_context_alloc(unsigned num);
>>>>>>>
>>> _______________________________________________
>>> Linaro-mm-sig mailing list
>>> Linaro-mm-sig@lists.linaro.org
>>> https://lists.linaro.org/mailman/listinfo/linaro-mm-sig



* Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 16:40               ` Christian König
@ 2021-05-20 17:08                 ` Daniel Vetter
  2021-05-21  7:43                   ` Christian König
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-05-20 17:08 UTC (permalink / raw)
  To: Christian König
  Cc: Christian König, Rob Clark, Rob Clark, linux-arm-msm,
	open list, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, freedreno,
	open list:DMA BUFFER SHARING FRAMEWORK

On Thu, May 20, 2021 at 6:41 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 20.05.21 um 18:34 schrieb Daniel Vetter:
> > On Thu, May 20, 2021 at 06:01:39PM +0200, Christian König wrote:
> >> Am 20.05.21 um 16:54 schrieb Rob Clark:
> >>> On Thu, May 20, 2021 at 7:11 AM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>>
> >>>> Am 20.05.21 um 16:07 schrieb Rob Clark:
> >>>>> On Wed, May 19, 2021 at 11:47 PM Christian König
> >>>>> <christian.koenig@amd.com> wrote:
> >>>>>> Uff, that looks very hardware specific to me.
> >>>>> Howso?  I'm not sure I agree.. and even if it was not useful for some
> >>>>> hw, it should be useful for enough drivers (and harm no drivers), so I
> >>>>> still think it is a good idea
> >>>>>
> >>>>> The fallback plan is to go the i915 route and stop using atomic
> >>>>> helpers and do the same thing inside the driver, but that doesn't help
> >>>>> any of the cases where you have a separate kms and gpu driver.
> >>>> Yeah, that's certainly not something we want.
> >>>>
> >>>>>> As far as I can see you can also implement completely inside the backend
> >>>>>> by starting a timer on enable_signaling, don't you?
> >>>>> Not really.. I mean, the fact that something waited on a fence could
> >>>>> be a useful input signal to gpu freq governor, but it is entirely
> >>>>> insufficient..
> >>>>>
> >>>>> If the cpu is spending a lot of time waiting on a fence, cpufreq will
> >>>>> clock down so you spend less time waiting.  And no problem has been
> >>>>> solved.  You absolutely need the concept of a missed deadline, and a
> >>>>> timer doesn't give you that.
> >>>> Ok then I probably don't understand the use case here.
> >>>>
> >>>> What exactly do you try to solve?
> >>> Basically situations where you are ping-ponging between GPU and CPU..
> >>> for example if you are double buffering instead of triple buffering,
> >>> and doing vblank sync'd pageflips.  The GPU, without any extra signal,
> >>> could get stuck at 30fps and a low gpu freq, because it ends up idle
> >>> while waiting for an extra vblank cycle for the next back-buffer to
> >>> become available.  Whereas if it boosted up to a higher freq and
> >>> stopped missing a vblank deadline, it would be less idle due to
> >>> getting the next back-buffer sooner (due to not missing a vblank
> >>> deadline).
> >> Ok, that is the why, but what about the how?
> >>
> >> How does it help to have this boost callback and not just start a timer on
> >> enable signaling and stop it when the signal arrives?
> > Because the render side (or drm/scheduler, if msm would use that) has no
> > idea which vblank a rendering is actually for.
>
> AH! So we are basically telling the fence backend that we have just
> missed an event we waited for.
>
> So what we want to know is how long the frontend wanted to wait instead
> of how long the backend took for rendering.

tbh I'm not sure the timestamp matters at all. What we do in i915 is
boost quite aggressively, and then let the usual clock tuning whittle
it down if we overshot. Plus some cool-down to prevent
abuse/continuous boosting. I think we also differentiate between
display boost and userspace waits.

On the display side we also wait until the vblank has passed we aimed
for (atm always the next, we don't have target_frame support like
amdgpu), to avoid boosting when there's no point.
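Roughly, as a toy model (made-up numbers and names; i915's actual RPS logic is more involved):

```c
/*
 * Toy model of "boost hard, let the governor settle, with a cool-down
 * against abuse".  Everything here is an illustrative sketch, not the
 * i915 implementation.
 */
#include <assert.h>

struct model_gov {
	int freq;			/* current frequency level */
	int min_freq, max_freq;
	unsigned long last_boost_ms;
	unsigned long cooldown_ms;	/* anti-abuse: no back-to-back boosts */
};

/* Boost aggressively to max, unless still cooling down from the last one. */
static void model_gov_boost(struct model_gov *g, unsigned long now_ms)
{
	if (now_ms - g->last_boost_ms < g->cooldown_ms)
		return;
	g->freq = g->max_freq;
	g->last_boost_ms = now_ms;
}

/* Periodic tuning tick whittles the frequency back down if we overshot. */
static void model_gov_tick(struct model_gov *g)
{
	if (g->freq > g->min_freq)
		g->freq--;
}
```

The missed-deadline timestamp never enters the picture: the boost is a blunt "go to max now", and the regular tuning path handles the rest.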

> > So boosting right when you've missed your frame (not what Rob implements
> > currently, but fixable) is the right semantics.
> >
> > The other issue is that for cpu waits, we want to differentiate between fence
> > waits that userspace does intentionally (e.g. wait ioctl) and waits that
> > random other things are doing within the kernel to keep track of progress.
> >
> > For the former we know that userspace is stuck waiting for the gpu, and we
> > probably want to boost. For the latter we most definitely do _not_ want to
> > boost.
> >
> > Otoh I do agree with you that the current api is a bit awkward, so perhaps
> > we do need a dma_fence_userspace_wait wrapper which boosts automatically
> > after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you
> > give it a vblank target, and if the fence isn't signalled by then, we kick
> > it real hard.
>
> Yeah, something like a use-case-driven API would be nice to have.
>
> For this particular case I suggest that we somehow extend the enable
> signaling callback.
>
> > But otherwise yes this is absolutely a thing that matters a ton. If you
> > look at Matt Brost's scheduler rfc, there's also a line item in there
> > about adding this kind of boosting to drm/scheduler.
>
> BTW: I still can't see this in my inbox.

You've replied already:

https://lore.kernel.org/dri-devel/20210518235830.133834-1-matthew.brost@intel.com/

It's just the big picture plan of what areas we're all trying to
tackle with some why, so that everyone knows what's coming in the next
half year at least. Probably longer until this is all sorted. I think
Matt has some poc hacked-up pile, but nothing really to show.
-Daniel

> Do you have a link?
>
> Christian.
>
> > -Daniel
> >
> >
> >> Regards,
> >> Christian.
> >>
> >>> BR,
> >>> -R
> >>>
> >>>> Thanks,
> >>>> Christian.
> >>>>
> >>>>> BR,
> >>>>> -R
> >>>>>
> >>>>>> Christian.
> >>>>>>
> >>>>>> Am 19.05.21 um 20:38 schrieb Rob Clark:
> >>>>>>> From: Rob Clark <robdclark@chromium.org>
> >>>>>>>
> >>>>>>> Add a way to hint to the fence signaler that a fence waiter has missed a
> >>>>>>> deadline waiting on the fence.
> >>>>>>>
> >>>>>>> In some cases, missing a vblank can result in lower gpu utilization,
> >>>>>>> when really we want to go in the opposite direction and boost gpu freq.
> >>>>>>> The boost callback gives some feedback to the fence signaler that we
> >>>>>>> are missing deadlines, so it can take this into account in it's freq/
> >>>>>>> utilization calculations.
> >>>>>>>
> >>>>>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>>>>>> ---
> >>>>>>>      include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
> >>>>>>>      1 file changed, 26 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> >>>>>>> index 9f12efaaa93a..172702521acc 100644
> >>>>>>> --- a/include/linux/dma-fence.h
> >>>>>>> +++ b/include/linux/dma-fence.h
> >>>>>>> @@ -231,6 +231,17 @@ struct dma_fence_ops {
> >>>>>>>          signed long (*wait)(struct dma_fence *fence,
> >>>>>>>                              bool intr, signed long timeout);
> >>>>>>>
> >>>>>>> +     /**
> >>>>>>> +      * @boost:
> >>>>>>> +      *
> >>>>>>> +      * Optional callback, to indicate that a fence waiter missed a deadline.
> >>>>>>> +      * This can serve as a signal that (if possible) whatever signals the
> >>>>>>> +      * fence should boost it's clocks.
> >>>>>>> +      *
> >>>>>>> +      * This can be called in any context that can call dma_fence_wait().
> >>>>>>> +      */
> >>>>>>> +     void (*boost)(struct dma_fence *fence);
> >>>>>>> +
> >>>>>>>          /**
> >>>>>>>           * @release:
> >>>>>>>           *
> >>>>>>> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> >>>>>>>          return ret < 0 ? ret : 0;
> >>>>>>>      }
> >>>>>>>
> >>>>>>> +/**
> >>>>>>> + * dma_fence_boost - hint from waiter that it missed a deadline
> >>>>>>> + *
> >>>>>>> + * @fence: the fence that caused the missed deadline
> >>>>>>> + *
> >>>>>>> + * This function gives a hint from a fence waiter that a deadline was
> >>>>>>> + * missed, so that the fence signaler can factor this in to device
> >>>>>>> + * power state decisions
> >>>>>>> + */
> >>>>>>> +static inline void dma_fence_boost(struct dma_fence *fence)
> >>>>>>> +{
> >>>>>>> +     if (fence->ops->boost)
> >>>>>>> +             fence->ops->boost(fence);
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>      struct dma_fence *dma_fence_get_stub(void);
> >>>>>>>      u64 dma_fence_context_alloc(unsigned num);
> >>>>>>>
> >>> _______________________________________________
> >>> Linaro-mm-sig mailing list
> >>> Linaro-mm-sig@lists.linaro.org
> >>> https://lists.linaro.org/mailman/listinfo/linaro-mm-sig
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
  2021-05-20 17:08                 ` Daniel Vetter
@ 2021-05-21  7:43                   ` Christian König
  2021-05-21 14:21                     ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Christian König @ 2021-05-21  7:43 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Rob Clark, Rob Clark, linux-arm-msm,
	open list, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, freedreno,
	open list:DMA BUFFER SHARING FRAMEWORK

Am 20.05.21 um 19:08 schrieb Daniel Vetter:
> [SNIP]
>> AH! So we are basically telling the fence backend that we have just
>> missed an event we waited for.
>>
>> So what we want to know is how long the frontend wanted to wait instead
>> of how long the backend took for rendering.
> tbh I'm not sure the timestamp matters at all. What we do in i915 is
> boost quite aggressively, and then let the usual clock tuning whittle
> it down if we overshot. Plus some cool-down to prevent
> abuse/continuous boosting. I think we also differentiate between
> display boost and userspace waits.

I was not thinking about timestamps here, but more about which
information we need at which place.

> On the display side we also wait until the vblank we aimed for has
> passed (atm always the next, we don't have target_frame support like
> amdgpu), to avoid boosting when there's no point.
>
>>> So boosting right when you've missed your frame (not what Rob implements
>>> currently, but fixable) is the right semantics.
>>>
>>> The other issue is that for cpu waits, we want to differentiate from fence
>>> waits that userspace does intentionally (e.g. wait ioctl) and waits that
>>> random other things are doing within the kernel to keep track of progress.
>>>
>>> For the former we know that userspace is stuck waiting for the gpu, and we
>>> probably want to boost. For the latter we most definitely do _not_ want to
>>> boost.
>>>
>>> Otoh I do agree with you that the current api is a bit awkward, so perhaps
>>> we do need a dma_fence_userspace_wait wrapper which boosts automatically
>>> after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you
>>> give it a vblank target, and if the fence isn't signalled by then, we kick
>>> it real hard.
>> Yeah, something like a use-case-driven API would be nice to have.
>>
>> For this particular case I suggest that we somehow extend the enable
>> signaling callback.
>>
>>> But otherwise yes this is absolutely a thing that matters a ton. If you
>>> look at Matt Brost's scheduler rfc, there's also a line item in there
>>> about adding this kind of boosting to drm/scheduler.
>> BTW: I still can't see this in my inbox.
> You've replied already:
>
> https://lore.kernel.org/dri-devel/20210518235830.133834-1-matthew.brost@intel.com/

Yeah, but doesn't that also require some changes to the DRM scheduler?

I was expecting that this is a bit more than just two patches.

Christian.

>
> It's just the big picture plan of what areas we're all trying to
> tackle with some why, so that everyone knows what's coming in the next
> half year at least. Probably longer until this is all sorted. I think
> Matt has some poc hacked-up pile, but nothing really to show.
> -Daniel
>
>> Do you have a link?
>>
>> Christian.
>>
>>> -Daniel
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>>> Thanks,
>>>>>> Christian.
>>>>>>
>>>>>>> BR,
>>>>>>> -R
>>>>>>>
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> Am 19.05.21 um 20:38 schrieb Rob Clark:
>>>>>>>>> From: Rob Clark <robdclark@chromium.org>
>>>>>>>>>
>>>>>>>>> Add a way to hint to the fence signaler that a fence waiter has missed a
>>>>>>>>> deadline waiting on the fence.
>>>>>>>>>
>>>>>>>>> In some cases, missing a vblank can result in lower gpu utilization,
>>>>>>>>> when really we want to go in the opposite direction and boost gpu freq.
>>>>>>>>> The boost callback gives some feedback to the fence signaler that we
>>>>>>>>> are missing deadlines, so it can take this into account in its freq/
>>>>>>>>> utilization calculations.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>>>>>>>> ---
>>>>>>>>>       include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
>>>>>>>>>       1 file changed, 26 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>>>>>>> index 9f12efaaa93a..172702521acc 100644
>>>>>>>>> --- a/include/linux/dma-fence.h
>>>>>>>>> +++ b/include/linux/dma-fence.h
>>>>>>>>> @@ -231,6 +231,17 @@ struct dma_fence_ops {
>>>>>>>>>           signed long (*wait)(struct dma_fence *fence,
>>>>>>>>>                               bool intr, signed long timeout);
>>>>>>>>>
>>>>>>>>> +     /**
>>>>>>>>> +      * @boost:
>>>>>>>>> +      *
>>>>>>>>> +      * Optional callback, to indicate that a fence waiter missed a deadline.
>>>>>>>>> +      * This can serve as a signal that (if possible) whatever signals the
>>>>>>>>> +      * fence should boost its clocks.
>>>>>>>>> +      *
>>>>>>>>> +      * This can be called in any context that can call dma_fence_wait().
>>>>>>>>> +      */
>>>>>>>>> +     void (*boost)(struct dma_fence *fence);
>>>>>>>>> +
>>>>>>>>>           /**
>>>>>>>>>            * @release:
>>>>>>>>>            *
>>>>>>>>> @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
>>>>>>>>>           return ret < 0 ? ret : 0;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * dma_fence_boost - hint from waiter that it missed a deadline
>>>>>>>>> + *
>>>>>>>>> + * @fence: the fence that caused the missed deadline
>>>>>>>>> + *
>>>>>>>>> + * This function gives a hint from a fence waiter that a deadline was
>>>>>>>>> + * missed, so that the fence signaler can factor this into device
>>>>>>>>> + * power state decisions.
>>>>>>>>> + */
>>>>>>>>> +static inline void dma_fence_boost(struct dma_fence *fence)
>>>>>>>>> +{
>>>>>>>>> +     if (fence->ops->boost)
>>>>>>>>> +             fence->ops->boost(fence);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>       struct dma_fence *dma_fence_get_stub(void);
>>>>>>>>>       u64 dma_fence_context_alloc(unsigned num);
>>>>>>>>>
>>>>> _______________________________________________
>>>>> Linaro-mm-sig mailing list
>>>>> Linaro-mm-sig@lists.linaro.org
>>>>> https://lists.linaro.org/mailman/listinfo/linaro-mm-sig
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Linaro-mm-sig] [RFC 1/3] dma-fence: Add boost fence op
  2021-05-21  7:43                   ` Christian König
@ 2021-05-21 14:21                     ` Daniel Vetter
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Vetter @ 2021-05-21 14:21 UTC (permalink / raw)
  To: Christian König
  Cc: Daniel Vetter, Christian König, Rob Clark, Rob Clark,
	linux-arm-msm, open list, dri-devel,
	moderated list:DMA BUFFER SHARING FRAMEWORK, freedreno,
	open list:DMA BUFFER SHARING FRAMEWORK

On Fri, May 21, 2021 at 09:43:59AM +0200, Christian König wrote:
> Am 20.05.21 um 19:08 schrieb Daniel Vetter:
> > [SNIP]
> > > AH! So we are basically telling the fence backend that we have just
> > > missed an event we waited for.
> > > 
> > > So what we want to know is how long the frontend wanted to wait instead
> > > of how long the backend took for rendering.
> > tbh I'm not sure the timestamp matters at all. What we do in i915 is
> > boost quite aggressively, and then let the usual clock tuning whittle
> > it down if we overshot. Plus some cool-down to prevent
> > abuse/continuous boosting. I think we also differentiate between
> > display boost and userspace waits.
> 
> I was not thinking about time stamps here, but more like which information
> we need at which place.
> 
> > On the display side we also wait until the vblank we aimed for has
> > passed (atm always the next, we don't have target_frame support like
> > amdgpu), to avoid boosting when there's no point.
> > 
> > > > So boosting right when you've missed your frame (not what Rob implements
> > > > currently, but fixable) is the right semantics.
> > > > 
> > > > The other issue is that for cpu waits, we want to differentiate from fence
> > > > waits that userspace does intentionally (e.g. wait ioctl) and waits that
> > > > random other things are doing within the kernel to keep track of progress.
> > > > 
> > > > For the former we know that userspace is stuck waiting for the gpu, and we
> > > > probably want to boost. For the latter we most definitely do _not_ want to
> > > > boost.
> > > > 
> > > > Otoh I do agree with you that the current api is a bit awkward, so perhaps
> > > > we do need a dma_fence_userspace_wait wrapper which boosts automatically
> > > > after a bit. And similarly perhaps a drm_vblank_dma_fence_wait, where you
> > > > give it a vblank target, and if the fence isn't signalled by then, we kick
> > > > it real hard.
> > > Yeah, something like a use-case-driven API would be nice to have.
> > > 
> > > For this particular case I suggest that we somehow extend the enable
> > > signaling callback.
> > > 
> > > > But otherwise yes this is absolutely a thing that matters a ton. If you
> > > > look at Matt Brost's scheduler rfc, there's also a line item in there
> > > > about adding this kind of boosting to drm/scheduler.
> > > BTW: I still can't see this in my inbox.
> > You've replied already:
> > 
> > https://lore.kernel.org/dri-devel/20210518235830.133834-1-matthew.brost@intel.com/
> 
> Yeah, but doesn't that also require some changes to the DRM scheduler?
> 
> I was expecting that this is a bit more than just two patches.

It's just the rfc document, per the new rfc process:

https://dri.freedesktop.org/docs/drm/gpu/rfc/

It's rather obviously not any piece of code in there, but just meant to
check rough direction before we go rewrite the entire i915 execbuf
frontend.
-Daniel

> 
> Christian.
> 
> > 
> > It's just the big picture plan of what areas we're all trying to
> > tackle with some why, so that everyone knows what's coming in the next
> > half year at least. Probably longer until this is all sorted. I think
> > Matt has some poc hacked-up pile, but nothing really to show.
> > -Daniel
> > 
> > > Do you have a link?
> > > 
> > > Christian.
> > > 
> > > > -Daniel
> > > > 
> > > > 
> > > > > Regards,
> > > > > Christian.
> > > > > 
> > > > > > BR,
> > > > > > -R
> > > > > > 
> > > > > > > Thanks,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > > BR,
> > > > > > > > -R
> > > > > > > > 
> > > > > > > > > Christian.
> > > > > > > > > 
> > > > > > > > > Am 19.05.21 um 20:38 schrieb Rob Clark:
> > > > > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > > > > > 
> > > > > > > > > > Add a way to hint to the fence signaler that a fence waiter has missed a
> > > > > > > > > > deadline waiting on the fence.
> > > > > > > > > > 
> > > > > > > > > > In some cases, missing a vblank can result in lower gpu utilization,
> > > > > > > > > > when really we want to go in the opposite direction and boost gpu freq.
> > > > > > > > > > The boost callback gives some feedback to the fence signaler that we
> > > > > > > > > > are missing deadlines, so it can take this into account in its freq/
> > > > > > > > > > utilization calculations.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > > > > > > > ---
> > > > > > > > > >       include/linux/dma-fence.h | 26 ++++++++++++++++++++++++++
> > > > > > > > > >       1 file changed, 26 insertions(+)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> > > > > > > > > > index 9f12efaaa93a..172702521acc 100644
> > > > > > > > > > --- a/include/linux/dma-fence.h
> > > > > > > > > > +++ b/include/linux/dma-fence.h
> > > > > > > > > > @@ -231,6 +231,17 @@ struct dma_fence_ops {
> > > > > > > > > >           signed long (*wait)(struct dma_fence *fence,
> > > > > > > > > >                               bool intr, signed long timeout);
> > > > > > > > > > 
> > > > > > > > > > +     /**
> > > > > > > > > > +      * @boost:
> > > > > > > > > > +      *
> > > > > > > > > > +      * Optional callback, to indicate that a fence waiter missed a deadline.
> > > > > > > > > > +      * This can serve as a signal that (if possible) whatever signals the
> > > > > > > > > > +      * fence should boost its clocks.
> > > > > > > > > > +      *
> > > > > > > > > > +      * This can be called in any context that can call dma_fence_wait().
> > > > > > > > > > +      */
> > > > > > > > > > +     void (*boost)(struct dma_fence *fence);
> > > > > > > > > > +
> > > > > > > > > >           /**
> > > > > > > > > >            * @release:
> > > > > > > > > >            *
> > > > > > > > > > @@ -586,6 +597,21 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
> > > > > > > > > >           return ret < 0 ? ret : 0;
> > > > > > > > > >       }
> > > > > > > > > > 
> > > > > > > > > > +/**
> > > > > > > > > > + * dma_fence_boost - hint from waiter that it missed a deadline
> > > > > > > > > > + *
> > > > > > > > > > + * @fence: the fence that caused the missed deadline
> > > > > > > > > > + *
> > > > > > > > > > + * This function gives a hint from a fence waiter that a deadline was
> > > > > > > > > > + * missed, so that the fence signaler can factor this into device
> > > > > > > > > > + * power state decisions.
> > > > > > > > > > + */
> > > > > > > > > > +static inline void dma_fence_boost(struct dma_fence *fence)
> > > > > > > > > > +{
> > > > > > > > > > +     if (fence->ops->boost)
> > > > > > > > > > +             fence->ops->boost(fence);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > >       struct dma_fence *dma_fence_get_stub(void);
> > > > > > > > > >       u64 dma_fence_context_alloc(unsigned num);
> > > > > > > > > > 
> > > > > > _______________________________________________
> > > > > > Linaro-mm-sig mailing list
> > > > > > Linaro-mm-sig@lists.linaro.org
> > > > > > https://lists.linaro.org/mailman/listinfo/linaro-mm-sig
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank
  2021-05-20 16:29   ` Daniel Vetter
@ 2021-05-30 14:33     ` Rob Clark
  2021-06-01 14:18       ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Rob Clark @ 2021-05-30 14:33 UTC (permalink / raw)
  To: Rob Clark, dri-devel, freedreno, linux-arm-msm, Rob Clark,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, open list, Matthew Brost
  Cc: Daniel Vetter

On Thu, May 20, 2021 at 9:29 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, May 19, 2021 at 11:38:53AM -0700, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >  drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 560aaecba31b..fe10fc2e7f86 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> >       int i, ret;
> >
> >       for_each_new_plane_in_state(state, plane, new_plane_state, i) {
> > +             u64 vblank_count;
> > +
> >               if (!new_plane_state->fence)
> >                       continue;
> >
> >               WARN_ON(!new_plane_state->fb);
> >
> > +             vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
> > +
> >               /*
> >                * If waiting for fences pre-swap (ie: nonblock), userspace can
> >                * still interrupt the operation. Instead of blocking until the
> > @@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> >               if (ret)
> >                       return ret;
> >
> > +             /*
> > +              * Check if we've missed a vblank while waiting, and if we have,
> > +              * signal the fence that its signaler should be boosted
> > +              */
> > +             if (vblank_count != drm_crtc_vblank_count(new_plane_state->crtc))
> > +                     dma_fence_boost(new_plane_state->fence);
>
> I think we should do a lot better here:
> - maybe only bother doing this for single-crtc updates, and only if
>   modeset isn't set. No one else cares about latency.
>
> - We should boost _right_ when we've missed the frame, so I think we
>   should have a _timeout wait here that guesstimates when the vblank is
>   over (might need to throw in a vblank wait if we missed) and then boost
>   immediately. Not wait a bunch of frames (worst case) until we finally
>   decide to boost.

I was thinking about this a bit more.. How about rather than calling
some fence->op->boost() type thing when we are about to miss a vblank
(IMO that is also already too late), we do something more like
fence->ops->set_deadline() before we even wait?

It's probably a bit impossible for a gpu driver to really predict how
long some rendering will take, but other cases like video decoder are
somewhat more predictable.. the fence provider could predict, given the
remaining time until the deadline, what clk rates are required to get
you there.

BR,
-R


>
> Otherwise I really like this, I think it's about the only real reason i915
> isn't using atomic helpers.
>
> Also adding Matt B for this topic.
> -Daniel
>
> > +
> >               dma_fence_put(new_plane_state->fence);
> >               new_plane_state->fence = NULL;
> >       }
> > --
> > 2.30.2
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank
  2021-05-30 14:33     ` Rob Clark
@ 2021-06-01 14:18       ` Daniel Vetter
  2021-06-01 15:46         ` Rob Clark
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Vetter @ 2021-06-01 14:18 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, freedreno, linux-arm-msm, Rob Clark,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, open list, Matthew Brost, Daniel Vetter

On Sun, May 30, 2021 at 07:33:57AM -0700, Rob Clark wrote:
> On Thu, May 20, 2021 at 9:29 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Wed, May 19, 2021 at 11:38:53AM -0700, Rob Clark wrote:
> > > From: Rob Clark <robdclark@chromium.org>
> > >
> > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > ---
> > >  drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 560aaecba31b..fe10fc2e7f86 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> > >       int i, ret;
> > >
> > >       for_each_new_plane_in_state(state, plane, new_plane_state, i) {
> > > +             u64 vblank_count;
> > > +
> > >               if (!new_plane_state->fence)
> > >                       continue;
> > >
> > >               WARN_ON(!new_plane_state->fb);
> > >
> > > +             vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
> > > +
> > >               /*
> > >                * If waiting for fences pre-swap (ie: nonblock), userspace can
> > >                * still interrupt the operation. Instead of blocking until the
> > > @@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> > >               if (ret)
> > >                       return ret;
> > >
> > > +             /*
> > > +              * Check if we've missed a vblank while waiting, and if we have,
> > > +              * signal the fence that its signaler should be boosted
> > > +              */
> > > +             if (vblank_count != drm_crtc_vblank_count(new_plane_state->crtc))
> > > +                     dma_fence_boost(new_plane_state->fence);
> >
> > I think we should do a lot better here:
> > - maybe only bother doing this for single-crtc updates, and only if
> >   modeset isn't set. No one else cares about latency.
> >
> > - We should boost _right_ when we've missed the frame, so I think we
> >   should have a _timeout wait here that guesstimates when the vblank is
> >   over (might need to throw in a vblank wait if we missed) and then boost
> >   immediately. Not wait a bunch of frames (worst case) until we finally
> >   decide to boost.
> 
> I was thinking about this a bit more.. How about rather than calling
> some fence->op->boost() type thing when we are about to miss a vblank
> (IMO that is also already too late), we do something more like
> fence->ops->set_deadline() before we even wait?

Hm yeah that sounds like a clean idea.

Even more, why not add the deadline/waiter information to the callback
we're adding? That way drivers can inspect it whenever they feel like and
don't have to duplicate the tracking. And it's probably easier to
tune/adjust to the myriads of use-cases (flip target miss, userspace wait,
wakeup boost maybe too ...).

I like this direction a lot more than what we discussed with post-miss
hints thus far.

> It's probably a bit impossible for a gpu driver to really predict how
> long some rendering will take, but other cases like video decoder are
> somewhat more predictable.. the fence provider could predict given the
> remaining time until the deadline what clk rates are required to get
> you there.

Well if we do have a deadline the driver can note that in its scheduler
and arm a timer to kick the clocks. Or maybe use past history to do this
upfront.
-Daniel

> 
> BR,
> -R
> 
> 
> >
> > Otherwise I really like this, I think it's about the only real reason i915
> > isn't using atomic helpers.
> >
> > Also adding Matt B for this topic.
> > -Daniel
> >
> > > +
> > >               dma_fence_put(new_plane_state->fence);
> > >               new_plane_state->fence = NULL;
> > >       }
> > > --
> > > 2.30.2
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank
  2021-06-01 14:18       ` Daniel Vetter
@ 2021-06-01 15:46         ` Rob Clark
  2021-06-01 16:11           ` Daniel Vetter
  0 siblings, 1 reply; 20+ messages in thread
From: Rob Clark @ 2021-06-01 15:46 UTC (permalink / raw)
  To: Rob Clark, dri-devel, freedreno, linux-arm-msm, Rob Clark,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, open list, Matthew Brost
  Cc: Daniel Vetter

On Tue, Jun 1, 2021 at 7:18 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Sun, May 30, 2021 at 07:33:57AM -0700, Rob Clark wrote:
> > On Thu, May 20, 2021 at 9:29 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Wed, May 19, 2021 at 11:38:53AM -0700, Rob Clark wrote:
> > > > From: Rob Clark <robdclark@chromium.org>
> > > >
> > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > ---
> > > >  drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
> > > >  1 file changed, 11 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > index 560aaecba31b..fe10fc2e7f86 100644
> > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > @@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> > > >       int i, ret;
> > > >
> > > >       for_each_new_plane_in_state(state, plane, new_plane_state, i) {
> > > > +             u64 vblank_count;
> > > > +
> > > >               if (!new_plane_state->fence)
> > > >                       continue;
> > > >
> > > >               WARN_ON(!new_plane_state->fb);
> > > >
> > > > +             vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
> > > > +
> > > >               /*
> > > >                * If waiting for fences pre-swap (ie: nonblock), userspace can
> > > >                * still interrupt the operation. Instead of blocking until the
> > > > @@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> > > >               if (ret)
> > > >                       return ret;
> > > >
> > > > +             /*
> > > > +              * Check if we've missed a vblank while waiting, and if we have,
> > > > +              * signal the fence that its signaler should be boosted
> > > > +              */
> > > > +             if (vblank_count != drm_crtc_vblank_count(new_plane_state->crtc))
> > > > +                     dma_fence_boost(new_plane_state->fence);
> > >
> > > I think we should do a lot better here:
> > > - maybe only bother doing this for single-crtc updates, and only if
> > >   modeset isn't set. No one else cares about latency.
> > >
> > > - We should boost _right_ when we've missed the frame, so I think we
> > >   should have a _timeout wait here that guesstimates when the vblank is
> > >   over (might need to throw in a vblank wait if we missed) and then boost
> > >   immediately. Not wait a bunch of frames (worst case) until we finally
> > >   decide to boost.
> >
> > I was thinking about this a bit more.. How about rather than calling
> > some fence->op->boost() type thing when we are about to miss a vblank
> > (IMO that is also already too late), we do something more like
> > fence->ops->set_deadline() before we even wait?
>
> Hm yeah that sounds like a clean idea.
>
> Even more, why not add the deadline/waiter information to the callback
> we're adding? That way drivers can inspect it whenever they feel like and
> don't have to duplicate the tracking. And it's probably easier to
> tune/adjust to the myriads of use-cases (flip target miss, userspace wait,
> wakeup boost maybe too ...).

You mean, enumerate the types of deadline?

For userspace waits, we might have a timeout, but not really
(currently) any more information than that?  The vblank deadline is
the only type of deadline that seems pretty clear to me.

I suppose we could do something like:

   dma_fence_set_deadline(fence, &(struct dma_fence_deadline){
           .type = DMA_FENCE_DEADLINE_VBLANK,
           .time = next_vblank_ktime,
       });

to make it a bit more extensible to add more deadline types or
additional optional information

BR,
-R

>
> I like this direction a lot more than what we discussed with post-miss
> hints thus far.
>
> > It's probably a bit impossible for a gpu driver to really predict how
> > long some rendering will take, but other cases like video decoder are
> > somewhat more predictable.. the fence provider could predict given the
> > remaining time until the deadline what clk rates are required to get
> > you there.
>
> Well if we do have a deadline the driver can note that in its scheduler
> > and arm a timer to kick the clocks. Or maybe use past history to do this
> upfront.
> -Daniel
>
> >
> > BR,
> > -R
> >
> >
> > >
> > > Otherwise I really like this, I think it's about the only real reason i915
> > > isn't using atomic helpers.
> > >
> > > Also adding Matt B for this topic.
> > > -Daniel
> > >
> > > > +
> > > >               dma_fence_put(new_plane_state->fence);
> > > >               new_plane_state->fence = NULL;
> > > >       }
> > > > --
> > > > 2.30.2
> > > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank
  2021-06-01 15:46         ` Rob Clark
@ 2021-06-01 16:11           ` Daniel Vetter
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Vetter @ 2021-06-01 16:11 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, freedreno, linux-arm-msm, Rob Clark,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, open list, Matthew Brost, Daniel Vetter

On Tue, Jun 01, 2021 at 08:46:14AM -0700, Rob Clark wrote:
> On Tue, Jun 1, 2021 at 7:18 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Sun, May 30, 2021 at 07:33:57AM -0700, Rob Clark wrote:
> > > On Thu, May 20, 2021 at 9:29 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Wed, May 19, 2021 at 11:38:53AM -0700, Rob Clark wrote:
> > > > > From: Rob Clark <robdclark@chromium.org>
> > > > >
> > > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > > ---
> > > > >  drivers/gpu/drm/drm_atomic_helper.c | 11 +++++++++++
> > > > >  1 file changed, 11 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > index 560aaecba31b..fe10fc2e7f86 100644
> > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > @@ -1435,11 +1435,15 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> > > > >       int i, ret;
> > > > >
> > > > >       for_each_new_plane_in_state(state, plane, new_plane_state, i) {
> > > > > +             u64 vblank_count;
> > > > > +
> > > > >               if (!new_plane_state->fence)
> > > > >                       continue;
> > > > >
> > > > >               WARN_ON(!new_plane_state->fb);
> > > > >
> > > > > +             vblank_count = drm_crtc_vblank_count(new_plane_state->crtc);
> > > > > +
> > > > >               /*
> > > > >                * If waiting for fences pre-swap (ie: nonblock), userspace can
> > > > >                * still interrupt the operation. Instead of blocking until the
> > > > > @@ -1449,6 +1453,13 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
> > > > >               if (ret)
> > > > >                       return ret;
> > > > >
> > > > > +             /*
> > > > > +              * Check if we've missed a vblank while waiting, and if we have,
> > > > > +              * signal the fence that its signaler should be boosted
> > > > > +              */
> > > > > +             if (vblank_count != drm_crtc_vblank_count(new_plane_state->crtc))
> > > > > +                     dma_fence_boost(new_plane_state->fence);
> > > >
> > > > I think we should do a lot better here:
> > > > - maybe only bother doing this for single-crtc updates, and only if
> > > >   modeset isn't set. No one else cares about latency.
> > > >
> > > > - We should boost _right_ when we've missed the frame, so I think we
> > > >   should have a _timeout wait here that guesstimates when the vblank is
> > > >   over (might need to throw in a vblank wait if we missed) and then boost
> > > >   immediately. Not wait a bunch of frames (worst case) until we finally
> > > >   decide to boost.
> > >
> > > I was thinking about this a bit more.. How about rather than calling
> > > some fence->op->boost() type thing when we are about to miss a vblank
> > > (IMO that is also already too late), we do something more like
> > > fence->ops->set_deadline() before we even wait?
> >
> > Hm yeah that sounds like a clean idea.
> >
> > Even more, why not add the deadline/waiter information to the callback
> > we're adding? That way drivers can inspect it whenever they feel like and
> > don't have to duplicate the tracking. And it's probably easier to
> > tune/adjust to the myriads of use-cases (flip target miss, userspace wait,
> > wakeup boost maybe too ...).
> 
> You mean, enumerate the types of deadline?
> 
> For userspace waits, we might have a timeout, but not really
> (currently) any more information than that?  The vblank deadline is
> the only type of deadline that seems pretty clear to me.
> 
> I suppose we could do something like:
> 
>    dma_fence_set_deadline(fence, &(struct dma_fence_deadline){
>            .type = DMA_FENCE_DEADLINE_VBLANK,
>            .time = next_vblank_ktime,
>        });
> 
> to make it a bit more extensible to add more deadline types or
> additional optional information.

Nah, not enumerate the types of deadlines, but the types of waits. Some of
which might have a deadline (like page flip), some won't (like userspace
waiting or poll() on a dma-buf fd or whatever).

What I had in mind is roughly


diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 6ffb4b2c6371..e7c239145273 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -116,6 +116,8 @@ typedef void (*dma_fence_func_t)(struct dma_fence *fence,
 struct dma_fence_cb {
 	struct list_head node;
 	dma_fence_func_t func;
+	enum dma_fence_wait_type wait_type;
+	ktime_t deadline; /* fixme how do we indicate no deadline? */
 };
 
 /**

With that, waiters, irrespective of whether they use dma_fence_wait or
something else like the dma-buf fd poll stuff, can indicate to the
driver what kind of wait with what kind of deadline this is.

Maybe we should make this a sub-struct, so that it can also be passed to
dma_fence_wait().
-Daniel

> 
> BR,
> -R
> 
> >
> > I like this direction a lot more than what we discussed with post-miss
> > hints thus far.
> >
> > > It's probably a bit impossible for a gpu driver to really predict how
> > > long some rendering will take, but other cases like video decoders are
> > > somewhat more predictable.. given the remaining time until the
> > > deadline, the fence provider could predict what clk rates are
> > > required to get you there.
> >
> > Well if we do have a deadline the driver can note that in its scheduler
> > and arm a driver to kick the clocks. Or maybe use past history to do this
> > upfront.
> > -Daniel
> >
> > >
> > > BR,
> > > -R
> > >
> > >
> > > >
> > > > Otherwise I really like this, I think it's about the only real reason i915
> > > > isn't using atomic helpers.
> > > >
> > > > Also adding Matt B for this topic.
> > > > -Daniel
> > > >
> > > > > +
> > > > >               dma_fence_put(new_plane_state->fence);
> > > > >               new_plane_state->fence = NULL;
> > > > >       }
> > > > > --
> > > > > 2.30.2
> > > > >
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch



Thread overview: 20+ messages
-- links below jump to the message on this page --
2021-05-19 18:38 [RFC 0/3] dma-fence: Add a "boost" mechanism Rob Clark
2021-05-19 18:38 ` [RFC 1/3] dma-fence: Add boost fence op Rob Clark
2021-05-20  6:46   ` Christian König
2021-05-20 14:07     ` Rob Clark
2021-05-20 14:11       ` Christian König
2021-05-20 14:54         ` Rob Clark
2021-05-20 16:01           ` [Linaro-mm-sig] " Christian König
2021-05-20 16:34             ` Daniel Vetter
2021-05-20 16:40               ` Christian König
2021-05-20 17:08                 ` Daniel Vetter
2021-05-21  7:43                   ` Christian König
2021-05-21 14:21                     ` Daniel Vetter
2021-05-20 16:25       ` Daniel Vetter
2021-05-19 18:38 ` [RFC 2/3] drm/atomic: Call dma_fence_boost() when we've missed a vblank Rob Clark
2021-05-20 16:29   ` Daniel Vetter
2021-05-30 14:33     ` Rob Clark
2021-06-01 14:18       ` Daniel Vetter
2021-06-01 15:46         ` Rob Clark
2021-06-01 16:11           ` Daniel Vetter
2021-05-19 18:38 ` [RFC 3/3] drm/msm: Wire up gpu boost Rob Clark
