dri-devel.lists.freedesktop.org archive mirror
* [PATCH] drm/etnaviv: slow down FE idle polling
@ 2023-06-07 12:59 Lucas Stach
  2023-06-14 18:37 ` Christian Gmeiner
  2023-06-15  4:09 ` Sui Jingfeng
  0 siblings, 2 replies; 10+ messages in thread
From: Lucas Stach @ 2023-06-07 12:59 UTC (permalink / raw)
  To: etnaviv; +Cc: patchwork-lst, kernel, dri-devel, Russell King

Currently the FE is spinning way too fast when polling for new work in
the FE idleloop. As each poll fetches 16 bytes from memory, a GPU running
at 1GHz with the current setting of 200 wait cycle between fetches causes
80 MB/s of memory traffic just to check for new work when the GPU is
otherwise idle, which is more FE traffic than in some GPU loaded cases.

Significantly increase the number of wait cycles to slow down the poll
interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
providing a max latency which should not hurt most use-cases. The FE WAIT
command seems to have some unknown discrete steps in the wait cycles so
we may over/undershoot the target a bit, but that should be harmless.
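
For reference, the numbers above work out as follows (using the 1 GHz
clock from the example; actual rates differ per SoC):

  old: 10^9 Hz / 200 cycles per poll        = 5,000,000 polls/s * 16 B = 80 MB/s
  new: 10^9 Hz / ~30,518 cycles (1/32768 s) ≈ 32,768 polls/s * 16 B    ≈ 512 KB/s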

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
index cf741c5c82d2..384df1659be6 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
@@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
 	OUT(buffer, VIV_FE_END_HEADER_OP_END);
 }
 
-static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
+static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
+			    unsigned int waitcycles)
 {
 	buffer->user_size = ALIGN(buffer->user_size, 8);
 
-	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
+	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
 }
 
 static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
@@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
 	/* initialize buffer */
 	buffer->user_size = 0;
 
-	CMD_WAIT(buffer);
+	CMD_WAIT(buffer, gpu->fe_waitcycles);
 	CMD_LINK(buffer, 2,
 		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
 		 + buffer->user_size - 4);
@@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
 	CMD_END(buffer);
 
 	/* Append waitlink */
-	CMD_WAIT(buffer);
+	CMD_WAIT(buffer, gpu->fe_waitcycles);
 	CMD_LINK(buffer, 2,
 		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
 		 + buffer->user_size - 4);
@@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
 
 	CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
 		       VIVS_GL_EVENT_FROM_PE);
-	CMD_WAIT(buffer);
+	CMD_WAIT(buffer, gpu->fe_waitcycles);
 	CMD_LINK(buffer, 2,
 		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
 		 + buffer->user_size - 4);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 41aab1aa330b..8c20dff32240 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
 		clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
 		etnaviv_gpu_load_clock(gpu, clock);
 	}
+
+	/*
+	 * Choose number of wait cycles to target a ~30us (1/32768) max latency
+	 * until new work is picked up by the FE when it polls in the idle loop.
+	 */
+	gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
+				 0xffffUL);
 }
 
 static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
index 98c6f9c320fc..e1e1de59c38d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -150,6 +150,7 @@ struct etnaviv_gpu {
 	struct clk *clk_shader;
 
 	unsigned int freq_scale;
+	unsigned int fe_waitcycles;
 	unsigned long base_rate_core;
 	unsigned long base_rate_shader;
 };
-- 
2.39.2
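
As a concrete illustration of the formula added above (the clock rate here
is only an example, not a value from the patch): with base_rate_core =
800 MHz and freq_scale = 0,

  fe_waitcycles = min(800000000 >> 15, 0xffff) = min(24414, 65535) = 24414

which at 800 MHz corresponds to one FE poll roughly every 30.5 µs.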



* Re: [PATCH] drm/etnaviv: slow down FE idle polling
  2023-06-07 12:59 [PATCH] drm/etnaviv: slow down FE idle polling Lucas Stach
@ 2023-06-14 18:37 ` Christian Gmeiner
  2023-06-15  4:09 ` Sui Jingfeng
  1 sibling, 0 replies; 10+ messages in thread
From: Christian Gmeiner @ 2023-06-14 18:37 UTC (permalink / raw)
  To: Lucas Stach; +Cc: kernel, patchwork-lst, etnaviv, dri-devel, Russell King

Hi Lucas

>
> Currently the FE is spinning way too fast when polling for new work in
> the FE idleloop. As each poll fetches 16 bytes from memory, a GPU running
> at 1GHz with the current setting of 200 wait cycle between fetches causes
> 80 MB/s of memory traffic just to check for new work when the GPU is
> otherwise idle, which is more FE traffic than in some GPU loaded cases.
>
> Significantly increase the number of wait cycles to slow down the poll
> interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
> providing a max latency which should not hurt most use-cases. The FE WAIT
> command seems to have some unknown discrete steps in the wait cycles so
> we may over/undershoot the target a bit, but that should be harmless.
>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>

> ---
>  drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
>  3 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> index cf741c5c82d2..384df1659be6 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> @@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
>         OUT(buffer, VIV_FE_END_HEADER_OP_END);
>  }
>
> -static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
> +static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
> +                           unsigned int waitcycles)
>  {
>         buffer->user_size = ALIGN(buffer->user_size, 8);
>
> -       OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
> +       OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
>  }
>
>  static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
> @@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
>         /* initialize buffer */
>         buffer->user_size = 0;
>
> -       CMD_WAIT(buffer);
> +       CMD_WAIT(buffer, gpu->fe_waitcycles);
>         CMD_LINK(buffer, 2,
>                  etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>                  + buffer->user_size - 4);
> @@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
>         CMD_END(buffer);
>
>         /* Append waitlink */
> -       CMD_WAIT(buffer);
> +       CMD_WAIT(buffer, gpu->fe_waitcycles);
>         CMD_LINK(buffer, 2,
>                  etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>                  + buffer->user_size - 4);
> @@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
>
>         CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
>                        VIVS_GL_EVENT_FROM_PE);
> -       CMD_WAIT(buffer);
> +       CMD_WAIT(buffer, gpu->fe_waitcycles);
>         CMD_LINK(buffer, 2,
>                  etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>                  + buffer->user_size - 4);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> index 41aab1aa330b..8c20dff32240 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> @@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
>                 clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
>                 etnaviv_gpu_load_clock(gpu, clock);
>         }
> +
> +       /*
> +        * Choose number of wait cycles to target a ~30us (1/32768) max latency
> +        * until new work is picked up by the FE when it polls in the idle loop.
> +        */
> +       gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
> +                                0xffffUL);
>  }
>
>  static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> index 98c6f9c320fc..e1e1de59c38d 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> @@ -150,6 +150,7 @@ struct etnaviv_gpu {
>         struct clk *clk_shader;
>
>         unsigned int freq_scale;
> +       unsigned int fe_waitcycles;
>         unsigned long base_rate_core;
>         unsigned long base_rate_shader;
>  };
> --
> 2.39.2
>


-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy


* Re: drm/etnaviv: slow down FE idle polling
  2023-06-07 12:59 [PATCH] drm/etnaviv: slow down FE idle polling Lucas Stach
  2023-06-14 18:37 ` Christian Gmeiner
@ 2023-06-15  4:09 ` Sui Jingfeng
  2023-06-15  9:04   ` Lucas Stach
  1 sibling, 1 reply; 10+ messages in thread
From: Sui Jingfeng @ 2023-06-15  4:09 UTC (permalink / raw)
  To: Lucas Stach, etnaviv; +Cc: Russell King, dri-devel, kernel, patchwork-lst

Hi,

On 2023/6/7 20:59, Lucas Stach wrote:
> Currently the FE is spinning way too fast when polling for new work in
'way' -> 'away'
> the FE idleloop.
'idleloop' -> 'idle loop'
>   As each poll fetches 16 bytes from memory, a GPU running
> at 1GHz with the current setting of 200 wait cycle between fetches causes
> 80 MB/s of memory traffic just to check for new work when the GPU is
> otherwise idle, which is more FE traffic than in some GPU loaded cases.
>
> Significantly increase the number of wait cycles to slow down the poll
> interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
> providing a max latency which should not hurt most use-cases. The FE WAIT
> command seems to have some unknown discrete steps in the wait cycles
add a comma here.
>   so
> we may over/undershoot the target a bit, but that should be harmless.
overshoot or undershoot
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
> ---
>   drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
>   drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
>   drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
>   3 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> index cf741c5c82d2..384df1659be6 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> @@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
>   	OUT(buffer, VIV_FE_END_HEADER_OP_END);
>   }
>   
> -static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
> +static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
> +			    unsigned int waitcycles)
>   {
>   	buffer->user_size = ALIGN(buffer->user_size, 8);
>   
> -	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
> +	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
>   }
>   
>   static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
> @@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
>   	/* initialize buffer */
>   	buffer->user_size = 0;
>   
> -	CMD_WAIT(buffer);
> +	CMD_WAIT(buffer, gpu->fe_waitcycles);
>   	CMD_LINK(buffer, 2,
>   		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>   		 + buffer->user_size - 4);
> @@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
>   	CMD_END(buffer);
>   
>   	/* Append waitlink */
> -	CMD_WAIT(buffer);
> +	CMD_WAIT(buffer, gpu->fe_waitcycles);
>   	CMD_LINK(buffer, 2,
>   		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>   		 + buffer->user_size - 4);
> @@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
>   
>   	CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
>   		       VIVS_GL_EVENT_FROM_PE);
> -	CMD_WAIT(buffer);
> +	CMD_WAIT(buffer, gpu->fe_waitcycles);
>   	CMD_LINK(buffer, 2,
>   		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>   		 + buffer->user_size - 4);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> index 41aab1aa330b..8c20dff32240 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> @@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
>   		clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
>   		etnaviv_gpu_load_clock(gpu, clock);
>   	}
> +
> +	/*
> +	 * Choose number of wait cycles to target a ~30us (1/32768) max latency
> +	 * until new work is picked up by the FE when it polls in the idle loop.
> +	 */
> +	gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
> +				 0xffffUL);

This patch is NOT effective on our hardware, GC1000 v5037 (ls7a1000 +
ls3a5000).

As gpu->base_rate_core is 0, in the end gpu->fe_waitcycles is
also zero.

But after applying this patch, glmark2 still runs happily, no influence. So


Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>

>   }
>   
>   static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> index 98c6f9c320fc..e1e1de59c38d 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> @@ -150,6 +150,7 @@ struct etnaviv_gpu {
>   	struct clk *clk_shader;
>   
>   	unsigned int freq_scale;
> +	unsigned int fe_waitcycles;
>   	unsigned long base_rate_core;
>   	unsigned long base_rate_shader;
>   };

-- 
Jingfeng



* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15  4:09 ` Sui Jingfeng
@ 2023-06-15  9:04   ` Lucas Stach
  2023-06-15  9:16     ` Sui Jingfeng
  0 siblings, 1 reply; 10+ messages in thread
From: Lucas Stach @ 2023-06-15  9:04 UTC (permalink / raw)
  To: Sui Jingfeng, etnaviv; +Cc: Russell King, dri-devel, kernel, patchwork-lst

Am Donnerstag, dem 15.06.2023 um 12:09 +0800 schrieb Sui Jingfeng:
> Hi,
> 
> On 2023/6/7 20:59, Lucas Stach wrote:
> > Currently the FE is spinning way too fast when polling for new work in
> 'way' -> 'away'
> > the FE idleloop.
> 'idleloop' -> 'idle loop'
> >   As each poll fetches 16 bytes from memory, a GPU running
> > at 1GHz with the current setting of 200 wait cycle between fetches causes
> > 80 MB/s of memory traffic just to check for new work when the GPU is
> > otherwise idle, which is more FE traffic than in some GPU loaded cases.
> > 
> > Significantly increase the number of wait cycles to slow down the poll
> > interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
> > providing a max latency which should not hurt most use-cases. The FE WAIT
> > command seems to have some unknown discrete steps in the wait cycles
> add a comma here.
> >   so
> > we may over/undershoot the target a bit, but that should be harmless.
> overshoot or undershoot
> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> > Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
> > ---
> >   drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
> >   drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
> >   drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
> >   3 files changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> > index cf741c5c82d2..384df1659be6 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> > @@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
> >   	OUT(buffer, VIV_FE_END_HEADER_OP_END);
> >   }
> >   
> > -static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
> > +static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
> > +			    unsigned int waitcycles)
> >   {
> >   	buffer->user_size = ALIGN(buffer->user_size, 8);
> >   
> > -	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
> > +	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
> >   }
> >   
> >   static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
> > @@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
> >   	/* initialize buffer */
> >   	buffer->user_size = 0;
> >   
> > -	CMD_WAIT(buffer);
> > +	CMD_WAIT(buffer, gpu->fe_waitcycles);
> >   	CMD_LINK(buffer, 2,
> >   		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
> >   		 + buffer->user_size - 4);
> > @@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
> >   	CMD_END(buffer);
> >   
> >   	/* Append waitlink */
> > -	CMD_WAIT(buffer);
> > +	CMD_WAIT(buffer, gpu->fe_waitcycles);
> >   	CMD_LINK(buffer, 2,
> >   		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
> >   		 + buffer->user_size - 4);
> > @@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
> >   
> >   	CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
> >   		       VIVS_GL_EVENT_FROM_PE);
> > -	CMD_WAIT(buffer);
> > +	CMD_WAIT(buffer, gpu->fe_waitcycles);
> >   	CMD_LINK(buffer, 2,
> >   		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
> >   		 + buffer->user_size - 4);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > index 41aab1aa330b..8c20dff32240 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> > @@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
> >   		clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
> >   		etnaviv_gpu_load_clock(gpu, clock);
> >   	}
> > +
> > +	/*
> > +	 * Choose number of wait cycles to target a ~30us (1/32768) max latency
> > +	 * until new work is picked up by the FE when it polls in the idle loop.
> > +	 */
> > +	gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
> > +				 0xffffUL);
> 
> This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 + 
> ls3a5000).
> 
> As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is 
> also zero.
> 
Uh, that's a problem, as the patch will then have the opposite effect
on your platform by speeding up the idle loop. Thanks for catching
this! I'll improve the patch to keep a reasonable amount of wait cycles
in this case.
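
One possible shape for that, as an untested sketch (the exact fallback
value is still to be decided), would be to fall back to the previous
hardcoded value when the clock framework reports a rate of 0:

	/*
	 * Untested sketch: keep the old 200 wait cycles if the core clock
	 * rate is unknown (reported as 0), instead of programming a
	 * zero-cycle wait.
	 */
	gpu->fe_waitcycles = clamp(gpu->base_rate_core >> (15 - gpu->freq_scale),
				   200UL, 0xffffUL);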

Regards,
Lucas

> But after apply this path, the glmark2 still run happily, no influence. So
> 
> 
> Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>
> 
> >   }
> >   
> >   static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > index 98c6f9c320fc..e1e1de59c38d 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> > @@ -150,6 +150,7 @@ struct etnaviv_gpu {
> >   	struct clk *clk_shader;
> >   
> >   	unsigned int freq_scale;
> > +	unsigned int fe_waitcycles;
> >   	unsigned long base_rate_core;
> >   	unsigned long base_rate_shader;
> >   };
> 



* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15  9:04   ` Lucas Stach
@ 2023-06-15  9:16     ` Sui Jingfeng
  2023-06-15  9:20       ` Christian Gmeiner
  0 siblings, 1 reply; 10+ messages in thread
From: Sui Jingfeng @ 2023-06-15  9:16 UTC (permalink / raw)
  To: Lucas Stach, etnaviv; +Cc: Russell King, dri-devel, kernel, patchwork-lst

Hi,

On 2023/6/15 17:04, Lucas Stach wrote:
> Am Donnerstag, dem 15.06.2023 um 12:09 +0800 schrieb Sui Jingfeng:
>> Hi,
>>
>> On 2023/6/7 20:59, Lucas Stach wrote:
>>> Currently the FE is spinning way too fast when polling for new work in
>> 'way' -> 'away'
>>> the FE idleloop.
>> 'idleloop' -> 'idle loop'
>>>    As each poll fetches 16 bytes from memory, a GPU running
>>> at 1GHz with the current setting of 200 wait cycle between fetches causes
>>> 80 MB/s of memory traffic just to check for new work when the GPU is
>>> otherwise idle, which is more FE traffic than in some GPU loaded cases.
>>>
>>> Significantly increase the number of wait cycles to slow down the poll
>>> interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
>>> providing a max latency which should not hurt most use-cases. The FE WAIT
>>> command seems to have some unknown discrete steps in the wait cycles
>> add a comma here.
>>>    so
>>> we may over/undershoot the target a bit, but that should be harmless.
>> overshoot or undershoot
>>> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>>> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
>>> ---
>>>    drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
>>>    drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
>>>    drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
>>>    3 files changed, 14 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
>>> index cf741c5c82d2..384df1659be6 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
>>> @@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
>>>    	OUT(buffer, VIV_FE_END_HEADER_OP_END);
>>>    }
>>>    
>>> -static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
>>> +static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
>>> +			    unsigned int waitcycles)
>>>    {
>>>    	buffer->user_size = ALIGN(buffer->user_size, 8);
>>>    
>>> -	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
>>> +	OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
>>>    }
>>>    
>>>    static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
>>> @@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
>>>    	/* initialize buffer */
>>>    	buffer->user_size = 0;
>>>    
>>> -	CMD_WAIT(buffer);
>>> +	CMD_WAIT(buffer, gpu->fe_waitcycles);
>>>    	CMD_LINK(buffer, 2,
>>>    		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>>>    		 + buffer->user_size - 4);
>>> @@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
>>>    	CMD_END(buffer);
>>>    
>>>    	/* Append waitlink */
>>> -	CMD_WAIT(buffer);
>>> +	CMD_WAIT(buffer, gpu->fe_waitcycles);
>>>    	CMD_LINK(buffer, 2,
>>>    		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>>>    		 + buffer->user_size - 4);
>>> @@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
>>>    
>>>    	CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
>>>    		       VIVS_GL_EVENT_FROM_PE);
>>> -	CMD_WAIT(buffer);
>>> +	CMD_WAIT(buffer, gpu->fe_waitcycles);
>>>    	CMD_LINK(buffer, 2,
>>>    		 etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>>>    		 + buffer->user_size - 4);
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>>> index 41aab1aa330b..8c20dff32240 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>>> @@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
>>>    		clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
>>>    		etnaviv_gpu_load_clock(gpu, clock);
>>>    	}
>>> +
>>> +	/*
>>> +	 * Choose number of wait cycles to target a ~30us (1/32768) max latency
>>> +	 * until new work is picked up by the FE when it polls in the idle loop.
>>> +	 */
>>> +	gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
>>> +				 0xffffUL);
>> This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 +
>> ls3a5000).
>>
>> As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is
>> also zero.
>>
> Uh, that's a problem, as the patch will then have the opposite effect
> on your platform by speeding up the idle loop. Thanks for catching
> this! I'll improve the patch to keep a reasonable amount of wait cycles
> in this case.

It's OK, no big problem as far as I can see. (It's my platform's problem,
not yours.)

Merging it is also OK; if we find something wrong we could fix it with
another patch.

> Regards,
> Lucas
>
>> But after apply this path, the glmark2 still run happily, no influence. So
>>
>>
>> Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>
>>
>>>    }
>>>    
>>>    static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
>>> index 98c6f9c320fc..e1e1de59c38d 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
>>> @@ -150,6 +150,7 @@ struct etnaviv_gpu {
>>>    	struct clk *clk_shader;
>>>    
>>>    	unsigned int freq_scale;
>>> +	unsigned int fe_waitcycles;
>>>    	unsigned long base_rate_core;
>>>    	unsigned long base_rate_shader;
>>>    };

-- 
Jingfeng



* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15  9:16     ` Sui Jingfeng
@ 2023-06-15  9:20       ` Christian Gmeiner
  2023-06-15  9:37         ` Sui Jingfeng
  0 siblings, 1 reply; 10+ messages in thread
From: Christian Gmeiner @ 2023-06-15  9:20 UTC (permalink / raw)
  To: Sui Jingfeng; +Cc: etnaviv, dri-devel, patchwork-lst, kernel, Russell King

Hi

Am Do., 15. Juni 2023 um 11:16 Uhr schrieb Sui Jingfeng
<suijingfeng@loongson.cn>:
>
> Hi,
>
> On 2023/6/15 17:04, Lucas Stach wrote:
> > Am Donnerstag, dem 15.06.2023 um 12:09 +0800 schrieb Sui Jingfeng:
> >> Hi,
> >>
> >> On 2023/6/7 20:59, Lucas Stach wrote:
> >>> Currently the FE is spinning way too fast when polling for new work in
> >> 'way' -> 'away'
> >>> the FE idleloop.
> >> 'idleloop' -> 'idle loop'
> >>>    As each poll fetches 16 bytes from memory, a GPU running
> >>> at 1GHz with the current setting of 200 wait cycle between fetches causes
> >>> 80 MB/s of memory traffic just to check for new work when the GPU is
> >>> otherwise idle, which is more FE traffic than in some GPU loaded cases.
> >>>
> >>> Significantly increase the number of wait cycles to slow down the poll
> >>> interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
> >>> providing a max latency which should not hurt most use-cases. The FE WAIT
> >>> command seems to have some unknown discrete steps in the wait cycles
> >> add a comma here.
> >>>    so
> >>> we may over/undershoot the target a bit, but that should be harmless.
> >> overshoot or undershoot
> >>> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> >>> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
> >>> ---
> >>>    drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
> >>>    drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
> >>>    drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
> >>>    3 files changed, 14 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> >>> index cf741c5c82d2..384df1659be6 100644
> >>> --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> >>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
> >>> @@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
> >>>     OUT(buffer, VIV_FE_END_HEADER_OP_END);
> >>>    }
> >>>
> >>> -static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
> >>> +static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
> >>> +                       unsigned int waitcycles)
> >>>    {
> >>>     buffer->user_size = ALIGN(buffer->user_size, 8);
> >>>
> >>> -   OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
> >>> +   OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
> >>>    }
> >>>
> >>>    static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
> >>> @@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
> >>>     /* initialize buffer */
> >>>     buffer->user_size = 0;
> >>>
> >>> -   CMD_WAIT(buffer);
> >>> +   CMD_WAIT(buffer, gpu->fe_waitcycles);
> >>>     CMD_LINK(buffer, 2,
> >>>              etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
> >>>              + buffer->user_size - 4);
> >>> @@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
> >>>     CMD_END(buffer);
> >>>
> >>>     /* Append waitlink */
> >>> -   CMD_WAIT(buffer);
> >>> +   CMD_WAIT(buffer, gpu->fe_waitcycles);
> >>>     CMD_LINK(buffer, 2,
> >>>              etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
> >>>              + buffer->user_size - 4);
> >>> @@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
> >>>
> >>>     CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
> >>>                    VIVS_GL_EVENT_FROM_PE);
> >>> -   CMD_WAIT(buffer);
> >>> +   CMD_WAIT(buffer, gpu->fe_waitcycles);
> >>>     CMD_LINK(buffer, 2,
> >>>              etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
> >>>              + buffer->user_size - 4);
> >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> >>> index 41aab1aa330b..8c20dff32240 100644
> >>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> >>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> >>> @@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
> >>>             clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
> >>>             etnaviv_gpu_load_clock(gpu, clock);
> >>>     }
> >>> +
> >>> +   /*
> >>> +    * Choose number of wait cycles to target a ~30us (1/32768) max latency
> >>> +    * until new work is picked up by the FE when it polls in the idle loop.
> >>> +    */
> >>> +   gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
> >>> +                            0xffffUL);
> >> This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 +
> >> ls3a5000).
> >>
> >> As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is
> >> also zero.
> >>
> > Uh, that's a problem, as the patch will then have the opposite effect
> > on your platform by speeding up the idle loop. Thanks for catching
> > this! I'll improve the patch to keep a reasonable amount of wait cycles
> > in this case.
>
> It's OK, no big problem as far as I can see. (it my platform's problem,
> not your problem)
>

It will become a problem, as it eats up bandwidth that you want to
spend on real graphics work.

>
> Merge it is also OK, if we found something wrong we could fix it with a
> another patch.
>

Hmm.. I think the fix for this problem is more or less an extra "if",
so I would love to see a proper fix before this patch gets merged.

> > Regards,
> > Lucas
> >
> >> But after apply this path, the glmark2 still run happily, no influence. So
> >>
> >>
> >> Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>
> >>
> >>>    }
> >>>
> >>>    static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
> >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> >>> index 98c6f9c320fc..e1e1de59c38d 100644
> >>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> >>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
> >>> @@ -150,6 +150,7 @@ struct etnaviv_gpu {
> >>>     struct clk *clk_shader;
> >>>
> >>>     unsigned int freq_scale;
> >>> +   unsigned int fe_waitcycles;
> >>>     unsigned long base_rate_core;
> >>>     unsigned long base_rate_shader;
> >>>    };
>
> --
> Jingfeng
>


-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy


* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15  9:20       ` Christian Gmeiner
@ 2023-06-15  9:37         ` Sui Jingfeng
  2023-06-15  9:53           ` Lucas Stach
  0 siblings, 1 reply; 10+ messages in thread
From: Sui Jingfeng @ 2023-06-15  9:37 UTC (permalink / raw)
  To: Christian Gmeiner; +Cc: etnaviv, dri-devel, patchwork-lst, kernel, Russell King

Hi,

On 2023/6/15 17:20, Christian Gmeiner wrote:
> Hi
>
> Am Do., 15. Juni 2023 um 11:16 Uhr schrieb Sui Jingfeng
> <suijingfeng@loongson.cn>:
>> Hi,
>>
>> On 2023/6/15 17:04, Lucas Stach wrote:
>>> Am Donnerstag, dem 15.06.2023 um 12:09 +0800 schrieb Sui Jingfeng:
>>>> Hi,
>>>>
>>>> On 2023/6/7 20:59, Lucas Stach wrote:
>>>>> Currently the FE is spinning way too fast when polling for new work in
>>>> 'way' -> 'away'
>>>>> the FE idleloop.
>>>> 'idleloop' -> 'idle loop'
>>>>>     As each poll fetches 16 bytes from memory, a GPU running
>>>>> at 1GHz with the current setting of 200 wait cycle between fetches causes
>>>>> 80 MB/s of memory traffic just to check for new work when the GPU is
>>>>> otherwise idle, which is more FE traffic than in some GPU loaded cases.
>>>>>
>>>>> Significantly increase the number of wait cycles to slow down the poll
>>>>> interval to ~30µs, limiting the FE idle memory traffic to 512 KB/s, while
>>>>> providing a max latency which should not hurt most use-cases. The FE WAIT
>>>>> command seems to have some unknown discrete steps in the wait cycles
>>>> add a comma here.
>>>>>     so
>>>>> we may over/undershoot the target a bit, but that should be harmless.
>>>> overshoot or undershoot
>>>>> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>>>>> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
>>>>> ---
>>>>>     drivers/gpu/drm/etnaviv/etnaviv_buffer.c | 11 ++++++-----
>>>>>     drivers/gpu/drm/etnaviv/etnaviv_gpu.c    |  7 +++++++
>>>>>     drivers/gpu/drm/etnaviv/etnaviv_gpu.h    |  1 +
>>>>>     3 files changed, 14 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
>>>>> index cf741c5c82d2..384df1659be6 100644
>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_buffer.c
>>>>> @@ -53,11 +53,12 @@ static inline void CMD_END(struct etnaviv_cmdbuf *buffer)
>>>>>      OUT(buffer, VIV_FE_END_HEADER_OP_END);
>>>>>     }
>>>>>
>>>>> -static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer)
>>>>> +static inline void CMD_WAIT(struct etnaviv_cmdbuf *buffer,
>>>>> +                       unsigned int waitcycles)
>>>>>     {
>>>>>      buffer->user_size = ALIGN(buffer->user_size, 8);
>>>>>
>>>>> -   OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | 200);
>>>>> +   OUT(buffer, VIV_FE_WAIT_HEADER_OP_WAIT | waitcycles);
>>>>>     }
>>>>>
>>>>>     static inline void CMD_LINK(struct etnaviv_cmdbuf *buffer,
>>>>> @@ -168,7 +169,7 @@ u16 etnaviv_buffer_init(struct etnaviv_gpu *gpu)
>>>>>      /* initialize buffer */
>>>>>      buffer->user_size = 0;
>>>>>
>>>>> -   CMD_WAIT(buffer);
>>>>> +   CMD_WAIT(buffer, gpu->fe_waitcycles);
>>>>>      CMD_LINK(buffer, 2,
>>>>>               etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>>>>>               + buffer->user_size - 4);
>>>>> @@ -320,7 +321,7 @@ void etnaviv_sync_point_queue(struct etnaviv_gpu *gpu, unsigned int event)
>>>>>      CMD_END(buffer);
>>>>>
>>>>>      /* Append waitlink */
>>>>> -   CMD_WAIT(buffer);
>>>>> +   CMD_WAIT(buffer, gpu->fe_waitcycles);
>>>>>      CMD_LINK(buffer, 2,
>>>>>               etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>>>>>               + buffer->user_size - 4);
>>>>> @@ -503,7 +504,7 @@ void etnaviv_buffer_queue(struct etnaviv_gpu *gpu, u32 exec_state,
>>>>>
>>>>>      CMD_LOAD_STATE(buffer, VIVS_GL_EVENT, VIVS_GL_EVENT_EVENT_ID(event) |
>>>>>                     VIVS_GL_EVENT_FROM_PE);
>>>>> -   CMD_WAIT(buffer);
>>>>> +   CMD_WAIT(buffer, gpu->fe_waitcycles);
>>>>>      CMD_LINK(buffer, 2,
>>>>>               etnaviv_cmdbuf_get_va(buffer, &gpu->mmu_context->cmdbuf_mapping)
>>>>>               + buffer->user_size - 4);
>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>>>>> index 41aab1aa330b..8c20dff32240 100644
>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>>>>> @@ -493,6 +493,13 @@ static void etnaviv_gpu_update_clock(struct etnaviv_gpu *gpu)
>>>>>              clock |= VIVS_HI_CLOCK_CONTROL_FSCALE_VAL(fscale);
>>>>>              etnaviv_gpu_load_clock(gpu, clock);
>>>>>      }
>>>>> +
>>>>> +   /*
>>>>> +    * Choose number of wait cycles to target a ~30us (1/32768) max latency
>>>>> +    * until new work is picked up by the FE when it polls in the idle loop.
>>>>> +    */
>>>>> +   gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
>>>>> +                            0xffffUL);
>>>> This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 +
>>>> ls3a5000).
>>>>
>>>> As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is
>>>> also zero.
>>>>
>>> Uh, that's a problem, as the patch will then have the opposite effect
>>> on your platform by speeding up the idle loop. Thanks for catching
>>> this! I'll improve the patch to keep a reasonable amount of wait cycles
>>> in this case.
>> It's OK, no big problem as far as I can see. (it my platform's problem,
>> not your problem)
>>
> It will become a problem as it eats up the bandwidth that you want to
> spend for real graphic work.
>
>> Merge it is also OK, if we found something wrong we could fix it with a
>> another patch.
>>
> Hmm.. I think that the fix for this problem is more or less an extra
> if so I would love to see a proper fix
> before this patch gets merged.

It just has no effect (at least none that I can find).

I have tried: the glmark2 score does not change, it gets neither better
nor worse.

>>> Regards,
>>> Lucas
>>>
>>>> But after apply this path, the glmark2 still run happily, no influence. So
>>>>
>>>>
>>>> Tested-by: Sui Jingfeng <suijingfeng@loongson.cn>
>>>>
>>>>>     }
>>>>>
>>>>>     static int etnaviv_hw_reset(struct etnaviv_gpu *gpu)
>>>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
>>>>> index 98c6f9c320fc..e1e1de59c38d 100644
>>>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
>>>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
>>>>> @@ -150,6 +150,7 @@ struct etnaviv_gpu {
>>>>>      struct clk *clk_shader;
>>>>>
>>>>>      unsigned int freq_scale;
>>>>> +   unsigned int fe_waitcycles;
>>>>>      unsigned long base_rate_core;
>>>>>      unsigned long base_rate_shader;
>>>>>     };
>> --
>> Jingfeng
>>
>
-- 
Jingfeng



* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15  9:37         ` Sui Jingfeng
@ 2023-06-15  9:53           ` Lucas Stach
  2023-06-15 13:41             ` Chris Healy
  0 siblings, 1 reply; 10+ messages in thread
From: Lucas Stach @ 2023-06-15  9:53 UTC (permalink / raw)
  To: Sui Jingfeng, Christian Gmeiner
  Cc: kernel, patchwork-lst, etnaviv, dri-devel, Russell King

Am Donnerstag, dem 15.06.2023 um 17:37 +0800 schrieb Sui Jingfeng:
> Hi,
> 
[...]
> > > > > > +
> > > > > > +   /*
> > > > > > +    * Choose number of wait cycles to target a ~30us (1/32768) max latency
> > > > > > +    * until new work is picked up by the FE when it polls in the idle loop.
> > > > > > +    */
> > > > > > +   gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
> > > > > > +                            0xffffUL);
> > > > > This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 +
> > > > > ls3a5000).
> > > > > 
> > > > > As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is
> > > > > also zero.
> > > > > 
> > > > Uh, that's a problem, as the patch will then have the opposite effect
> > > > on your platform by speeding up the idle loop. Thanks for catching
> > > > this! I'll improve the patch to keep a reasonable amount of wait cycles
> > > > in this case.
> > > It's OK, no big problem as far as I can see. (it my platform's problem,
> > > not your problem)
> > > 
> > It will become a problem as it eats up the bandwidth that you want to
> > spend for real graphic work.
> > 
> > > Merge it is also OK, if we found something wrong we could fix it with a
> > > another patch.
> > > 
> > Hmm.. I think that the fix for this problem is more or less an extra
> > if so I would love to see a proper fix
> > before this patch gets merged.

Right, we don't merge known broken stuff. We are all humans and bugs
and oversights happen, but we don't knowingly regress things.

> 
> It just no effect(at least I can't find).
> 
> I have tried, The score of glmark2 does not change, not become better, 
> not become worse.

That's because it only affects your system when the GPU is idle but
isn't in runtime PM yet. If you measure the DRAM bandwidth in that time
window you'll see that the GPU now uses much more bandwidth, slowing
down other workloads.

Regards,
Lucas



* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15  9:53           ` Lucas Stach
@ 2023-06-15 13:41             ` Chris Healy
  2023-06-15 13:51               ` Sui Jingfeng
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Healy @ 2023-06-15 13:41 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Sui Jingfeng, etnaviv, dri-devel, patchwork-lst, kernel, Russell King

Jingfeng,

Does your design have any bus PMU counters that can be used to measure
DRAM bandwidth of the 3D GPU directly or even indirectly?

Regards,

Chris

On Thu, Jun 15, 2023 at 2:53 AM Lucas Stach <l.stach@pengutronix.de> wrote:
>
> Am Donnerstag, dem 15.06.2023 um 17:37 +0800 schrieb Sui Jingfeng:
> > Hi,
> >
> [...]
> > > > > > > +
> > > > > > > +   /*
> > > > > > > +    * Choose number of wait cycles to target a ~30us (1/32768) max latency
> > > > > > > +    * until new work is picked up by the FE when it polls in the idle loop.
> > > > > > > +    */
> > > > > > > +   gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
> > > > > > > +                            0xffffUL);
> > > > > > This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 +
> > > > > > ls3a5000).
> > > > > >
> > > > > > As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is
> > > > > > also zero.
> > > > > >
> > > > > Uh, that's a problem, as the patch will then have the opposite effect
> > > > > on your platform by speeding up the idle loop. Thanks for catching
> > > > > this! I'll improve the patch to keep a reasonable amount of wait cycles
> > > > > in this case.
> > > > It's OK, no big problem as far as I can see. (it my platform's problem,
> > > > not your problem)
> > > >
> > > It will become a problem as it eats up the bandwidth that you want to
> > > spend for real graphic work.
> > >
> > > > Merge it is also OK, if we found something wrong we could fix it with a
> > > > another patch.
> > > >
> > > Hmm.. I think that the fix for this problem is more or less an extra
> > > if so I would love to see a proper fix
> > > before this patch gets merged.
>
> Right, we don't merge known broken stuff. We are all humans and bugs
> and oversights happen, but we don't knowingly regress things.
>
> >
> > It just no effect(at least I can't find).
> >
> > I have tried, The score of glmark2 does not change, not become better,
> > not become worse.
>
> That's because it only affects your system when the GPU is idle but
> isn't in runtime PM yet. If you measure the DRAM bandwidth in that time
> window you'll see that the GPU now uses much more bandwidth, slowing
> down other workloads.
>
> Regards,
> Lucas
>


* Re: drm/etnaviv: slow down FE idle polling
  2023-06-15 13:41             ` Chris Healy
@ 2023-06-15 13:51               ` Sui Jingfeng
  0 siblings, 0 replies; 10+ messages in thread
From: Sui Jingfeng @ 2023-06-15 13:51 UTC (permalink / raw)
  To: Chris Healy, Lucas Stach
  Cc: etnaviv, dri-devel, patchwork-lst, kernel, Russell King

Hi,

On 2023/6/15 21:41, Chris Healy wrote:
> Jingfeng,
>
> Does your design have any bus PMU counters that can be used to measure
> DRAM bandwidth of the 3D GPU directly or even indirectly?

No, it seems that we don't have such hardware.

What we can do is measure from the CPU side, say with a memcpy program,
testing system RAM to VRAM, VRAM to system RAM, and system RAM to system
RAM bandwidth.

Our 3A5000 system RAM bandwidth can reach 10 GB/s (tested with a memcpy
test program on 1920x1080 buffers).

But the GPU is inside the north bridge, and accessing memory from there
is a bit slower because it needs to cross the HT bus. But I don't mind.
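
Roughly this kind of thing, as a minimal sketch (buffer size and iteration
count are arbitrary choices, and it only measures plain system RAM to
system RAM copies):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
	const size_t size = 1920 * 1080 * 4;	/* one 32bpp 1080p frame */
	const int iters = 256;
	char *src = malloc(size), *dst = malloc(size);
	struct timespec t0, t1;
	double secs;

	if (!src || !dst)
		return 1;
	memset(src, 0x5a, size);	/* touch the pages first */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < iters; i++)
		memcpy(dst, src, size);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
	/* counts bytes copied, not total read+write bus traffic */
	printf("%.2f GB/s\n", (double)size * iters / secs / 1e9);

	free(src);
	free(dst);
	return 0;
}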

> Regards,
>
> Chris
>
> On Thu, Jun 15, 2023 at 2:53 AM Lucas Stach <l.stach@pengutronix.de> wrote:
>> Am Donnerstag, dem 15.06.2023 um 17:37 +0800 schrieb Sui Jingfeng:
>>> Hi,
>>>
>> [...]
>>>>>>>> +
>>>>>>>> +   /*
>>>>>>>> +    * Choose number of wait cycles to target a ~30us (1/32768) max latency
>>>>>>>> +    * until new work is picked up by the FE when it polls in the idle loop.
>>>>>>>> +    */
>>>>>>>> +   gpu->fe_waitcycles = min(gpu->base_rate_core >> (15 - gpu->freq_scale),
>>>>>>>> +                            0xffffUL);
>>>>>>> This patch is NOT effective on our hardware GC1000 v5037 (ls7a1000 +
>>>>>>> ls3a5000).
>>>>>>>
>>>>>>> As the gpu->base_rate_core is 0,  so, in the end gpu->fe_waitcycles is
>>>>>>> also zero.
>>>>>>>
>>>>>> Uh, that's a problem, as the patch will then have the opposite effect
>>>>>> on your platform by speeding up the idle loop. Thanks for catching
>>>>>> this! I'll improve the patch to keep a reasonable amount of wait cycles
>>>>>> in this case.
>>>>> It's OK, no big problem as far as I can see. (it my platform's problem,
>>>>> not your problem)
>>>>>
>>>> It will become a problem as it eats up the bandwidth that you want to
>>>> spend for real graphic work.
>>>>
>>>>> Merge it is also OK, if we found something wrong we could fix it with a
>>>>> another patch.
>>>>>
>>>> Hmm.. I think that the fix for this problem is more or less an extra
>>>> if so I would love to see a proper fix
>>>> before this patch gets merged.
>> Right, we don't merge known broken stuff. We are all humans and bugs
>> and oversights happen, but we don't knowingly regress things.
>>
>>> It just no effect(at least I can't find).
>>>
>>> I have tried, The score of glmark2 does not change, not become better,
>>> not become worse.
>> That's because it only affects your system when the GPU is idle but
>> isn't in runtime PM yet. If you measure the DRAM bandwidth in that time
>> window you'll see that the GPU now uses much more bandwidth, slowing
>> down other workloads.
>>
>> Regards,
>> Lucas
>>
-- 
Jingfeng


