All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate
@ 2018-11-20 11:37 Sharat Masetty
       [not found] ` <1542713851-14375-1-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2018-11-20 11:37 ` [PATCH 4/4] drm/msm: Optimize adreno_show_object() Sharat Masetty
  0 siblings, 2 replies; 8+ messages in thread
From: Sharat Masetty @ 2018-11-20 11:37 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Sharat Masetty,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

The ringbuffer data to capture at crashtime can end up being large
sometimes, and the size can vary from being less than a page to the
full size of 32KB. So use the kvmalloc variant that perfectly fits the bill.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index d88d00d..6ebe842 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -397,7 +397,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state)
 				size = j + 1;
 
 		if (size) {
-			state->ring[i].data = kmalloc(size << 2, GFP_KERNEL);
+			state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
 			if (state->ring[i].data) {
 				memcpy(state->ring[i].data, gpu->rb[i]->start, size << 2);
 				state->ring[i].data_size = size << 2;
@@ -440,7 +440,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
-		kfree(state->ring[i].data);
+		kvfree(state->ring[i].data);
 
 	for (i = 0; state->bos && i < state->nr_bos; i++)
 		kvfree(state->bos[i].data);
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/4] include/linux/ascii85: Update ascii85_encode()
       [not found] ` <1542713851-14375-1-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2018-11-20 11:37   ` Sharat Masetty
       [not found]     ` <1542713851-14375-2-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2018-11-20 11:37   ` [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data Sharat Masetty
  2018-11-20 15:42   ` [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate Jordan Crouse
  2 siblings, 1 reply; 8+ messages in thread
From: Sharat Masetty @ 2018-11-20 11:37 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Sharat Masetty,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

The current implementation of ascii85_encode() does not copy the encoded
buffer 'z' to the output buffer in case the input is zero. This patch
simply adds this missing piece. This makes it easier to use this
function to encode large buffers.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
---
 include/linux/ascii85.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/ascii85.h b/include/linux/ascii85.h
index 4cc4020..00646fc 100644
--- a/include/linux/ascii85.h
+++ b/include/linux/ascii85.h
@@ -23,8 +23,12 @@
 {
 	int i;
 
-	if (in == 0)
-		return "z";
+	if (in == 0) {
+		out[0] = 'z';
+		out[1] = '\0';
+
+		return out;
+	}
 
 	out[5] = '\0';
 	for (i = 5; i--; ) {
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data
       [not found] ` <1542713851-14375-1-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2018-11-20 11:37   ` [PATCH 2/4] include/linux/ascii85: Update ascii85_encode() Sharat Masetty
@ 2018-11-20 11:37   ` Sharat Masetty
  2018-11-20 15:52     ` Jordan Crouse
  2018-11-20 15:42   ` [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate Jordan Crouse
  2 siblings, 1 reply; 8+ messages in thread
From: Sharat Masetty @ 2018-11-20 11:37 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Sharat Masetty,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

The ring substructure in msm_gpu_state is an extension of
msm_gpu_state_bo, so this patch changes the ring structure
to reuse the msm_gpu_state_bo as a base class, instead of
redefining the required variables.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 20 +++++++++++---------
 drivers/gpu/drm/msm/msm_gpu.h           |  4 +---
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 6ebe842..bbf8d3e 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -383,7 +383,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state)
 		int size = 0, j;
 
 		state->ring[i].fence = gpu->rb[i]->memptrs->fence;
-		state->ring[i].iova = gpu->rb[i]->iova;
+		state->ring[i].bo.iova = gpu->rb[i]->iova;
 		state->ring[i].seqno = gpu->rb[i]->seqno;
 		state->ring[i].rptr = get_rptr(adreno_gpu, gpu->rb[i]);
 		state->ring[i].wptr = get_wptr(gpu->rb[i]);
@@ -397,10 +397,12 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state)
 				size = j + 1;
 
 		if (size) {
-			state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
-			if (state->ring[i].data) {
-				memcpy(state->ring[i].data, gpu->rb[i]->start, size << 2);
-				state->ring[i].data_size = size << 2;
+			state->ring[i].bo.data =
+				kvmalloc(size << 2, GFP_KERNEL);
+			if (state->ring[i].bo.data) {
+				memcpy(state->ring[i].bo.data,
+						gpu->rb[i]->start, size << 2);
+				state->ring[i].bo.size = size << 2;
 			}
 		}
 	}
@@ -440,7 +442,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
-		kvfree(state->ring[i].data);
+		kvfree(state->ring[i].bo.data);
 
 	for (i = 0; state->bos && i < state->nr_bos; i++)
 		kvfree(state->bos[i].data);
@@ -522,15 +524,15 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
 
 	for (i = 0; i < gpu->nr_rings; i++) {
 		drm_printf(p, "  - id: %d\n", i);
-		drm_printf(p, "    iova: 0x%016llx\n", state->ring[i].iova);
+		drm_printf(p, "    iova: 0x%016llx\n", state->ring[i].bo.iova);
 		drm_printf(p, "    last-fence: %d\n", state->ring[i].seqno);
 		drm_printf(p, "    retired-fence: %d\n", state->ring[i].fence);
 		drm_printf(p, "    rptr: %d\n", state->ring[i].rptr);
 		drm_printf(p, "    wptr: %d\n", state->ring[i].wptr);
 		drm_printf(p, "    size: %d\n", MSM_GPU_RINGBUFFER_SZ);
 
-		adreno_show_object(p, state->ring[i].data,
-			state->ring[i].data_size);
+		adreno_show_object(p, state->ring[i].bo.data,
+			state->ring[i].bo.size);
 	}
 
 	if (state->bos) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 7dc775f..a3a6371 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -198,13 +198,11 @@ struct msm_gpu_state {
 	struct timeval time;
 
 	struct {
-		u64 iova;
 		u32 fence;
 		u32 seqno;
 		u32 rptr;
 		u32 wptr;
-		void *data;
-		int data_size;
+		struct msm_gpu_state_bo bo;
 	} ring[MSM_GPU_MAX_RINGS];
 
 	int nr_registers;
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/4] drm/msm: Optimize adreno_show_object()
  2018-11-20 11:37 [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate Sharat Masetty
       [not found] ` <1542713851-14375-1-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2018-11-20 11:37 ` Sharat Masetty
       [not found]   ` <1542713851-14375-4-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Sharat Masetty @ 2018-11-20 11:37 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, Sharat Masetty, dri-devel

When the userspace tries to read the crashstate dump, the read side
implementation in the driver currently ascii85 encodes all the binary
buffers and it does this each time the read system call is called.
A userspace tool like cat typically does a page by page read and the
number of read calls depends on the size of the data captured by the
driver. This is certainly not desirable and does not scale well with
large captures.

This patch encodes the buffer only once in the read path. With this there
is an immediate >10X speed improvement in crashstate save time.

Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
---
Changes from v1:
	Addressed comments from Jordan Crouse

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 80 ++++++++++++++++++++++++---------
 drivers/gpu/drm/msm/msm_gpu.h           |  1 +
 2 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index bbf8d3e..7749967 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -441,11 +441,15 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
 {
 	int i;

-	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
+	for (i = 0; i < ARRAY_SIZE(state->ring); i++) {
 		kvfree(state->ring[i].bo.data);
+		kvfree(state->ring[i].bo.encoded);
+	}

-	for (i = 0; state->bos && i < state->nr_bos; i++)
+	for (i = 0; state->bos && i < state->nr_bos; i++) {
 		kvfree(state->bos[i].data);
+		kvfree(state->bos[i].encoded);
+	}

 	kfree(state->bos);
 	kfree(state->comm);
@@ -472,34 +476,70 @@ int adreno_gpu_state_put(struct msm_gpu_state *state)

 #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)

-static void adreno_show_object(struct drm_printer *p, u32 *ptr, int len)
+static char *adreno_gpu_ascii85_encode(u32 *src, size_t len)
 {
-	char out[ASCII85_BUFSZ];
-	long l, datalen, i;
+	void *buf;
+	size_t buf_itr = 0;
+	long i, l;

-	if (!ptr || !len)
-		return;
+	if (!len)
+		return NULL;
+
+	l = ascii85_encode_len(len);

 	/*
-	 * Only dump the non-zero part of the buffer - rarely will any data
-	 * completely fill the entire allocated size of the buffer
+	 * ascii85 outputs either a 5 byte string or a 1 byte string. So we
+	 * account for the worst case of 5 bytes per dword plus the 1 for '\0'
 	 */
-	for (datalen = 0, i = 0; i < len >> 2; i++) {
-		if (ptr[i])
-			datalen = (i << 2) + 1;
+	buf = kvmalloc((l * 5) + 1, GFP_KERNEL);
+	if (!buf)
+		return NULL;
+
+	for (i = 0; i < l; i++) {
+		ascii85_encode(src[i], buf + buf_itr);
+
+		if (src[i] == 0)
+			buf_itr += 1;
+		else
+			buf_itr += 5;
 	}

-	/* Skip printing the object if it is empty */
-	if (datalen == 0)
+	return buf;
+}
+
+static void adreno_show_object(struct drm_printer *p,
+		struct msm_gpu_state_bo *bo)
+{
+	if ((!bo->data && !bo->encoded) || !bo->size)
 		return;

-	l = ascii85_encode_len(datalen);
+	if (!bo->encoded) {
+		long datalen, i;
+		u32 *buf = bo->data;
+
+		/*
+		 * Only dump the non-zero part of the buffer - rarely will
+		 * any data completely fill the entire allocated size of
+		 * the buffer.
+		 */
+		for (datalen = 0, i = 0; i < (bo->size) >> 2; i++) {
+			if (buf[i])
+				datalen = ((i + 1) << 2);
+		}
+
+		bo->encoded = adreno_gpu_ascii85_encode(buf, datalen);
+
+		kvfree(bo->data);
+		bo->data = NULL;
+
+		if (!bo->encoded)
+			return;
+	}

 	drm_puts(p, "    data: !!ascii85 |\n");
 	drm_puts(p, "     ");

-	for (i = 0; i < l; i++)
-		drm_puts(p, ascii85_encode(ptr[i], out));
+	drm_puts(p, bo->encoded);

 	drm_puts(p, "\n");
 }
@@ -531,8 +571,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
 		drm_printf(p, "    wptr: %d\n", state->ring[i].wptr);
 		drm_printf(p, "    size: %d\n", MSM_GPU_RINGBUFFER_SZ);

-		adreno_show_object(p, state->ring[i].bo.data,
-			state->ring[i].bo.size);
+		adreno_show_object(p, &(state->ring[i].bo));
 	}

 	if (state->bos) {
@@ -543,8 +582,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
 				state->bos[i].iova);
 			drm_printf(p, "    size: %zd\n", state->bos[i].size);

-			adreno_show_object(p, state->bos[i].data,
-				state->bos[i].size);
+			adreno_show_object(p, &(state->bos[i]));
 		}
 	}

diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index a3a6371..737bf45 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -191,6 +191,7 @@ struct msm_gpu_state_bo {
 	u64 iova;
 	size_t size;
 	void *data;
+	char *encoded;
 };

 struct msm_gpu_state {
--
1.9.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate
       [not found] ` <1542713851-14375-1-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2018-11-20 11:37   ` [PATCH 2/4] include/linux/ascii85: Update ascii85_encode() Sharat Masetty
  2018-11-20 11:37   ` [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data Sharat Masetty
@ 2018-11-20 15:42   ` Jordan Crouse
  2 siblings, 0 replies; 8+ messages in thread
From: Jordan Crouse @ 2018-11-20 15:42 UTC (permalink / raw)
  To: Sharat Masetty
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Tue, Nov 20, 2018 at 05:07:28PM +0530, Sharat Masetty wrote:
> The ringbuffer data to capture at crashtime can end up being large
> sometimes, and the size can vary from being less than a page to the
> full size of 32KB. So use the kvmalloc variant that perfectly fits the bill.
> 
> Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>

Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>

> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index d88d00d..6ebe842 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -397,7 +397,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state)
>  				size = j + 1;
>  
>  		if (size) {
> -			state->ring[i].data = kmalloc(size << 2, GFP_KERNEL);
> +			state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
>  			if (state->ring[i].data) {
>  				memcpy(state->ring[i].data, gpu->rb[i]->start, size << 2);
>  				state->ring[i].data_size = size << 2;
> @@ -440,7 +440,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
>  	int i;
>  
>  	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
> -		kfree(state->ring[i].data);
> +		kvfree(state->ring[i].data);
>  
>  	for (i = 0; state->bos && i < state->nr_bos; i++)
>  		kvfree(state->bos[i].data);
> -- 
> 1.9.1
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/4] include/linux/ascii85: Update ascii85_encode()
       [not found]     ` <1542713851-14375-2-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2018-11-20 15:47       ` Jordan Crouse
  0 siblings, 0 replies; 8+ messages in thread
From: Jordan Crouse @ 2018-11-20 15:47 UTC (permalink / raw)
  To: Sharat Masetty
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Tue, Nov 20, 2018 at 05:07:29PM +0530, Sharat Masetty wrote:
> The current implementation of ascii85_encode() does not copy the encoded
> buffer 'z' to the output buffer in case the input is zero. This patch
> simply adds this missing piece. This makes it easier to use this
> function to encode large buffers.
> 
> Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>

Adding intel-gfx since they are the other consumers of this code but this seems
legit to me.

Jordan

> ---
>  include/linux/ascii85.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/ascii85.h b/include/linux/ascii85.h
> index 4cc4020..00646fc 100644
> --- a/include/linux/ascii85.h
> +++ b/include/linux/ascii85.h
> @@ -23,8 +23,12 @@
>  {
>  	int i;
>  
> -	if (in == 0)
> -		return "z";
> +	if (in == 0) {
> +		out[0] = 'z';
> +		out[1] = '\0';
> +
> +		return out;
> +	}
>  
>  	out[5] = '\0';
>  	for (i = 5; i--; ) {
> -- 
> 1.9.1
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data
  2018-11-20 11:37   ` [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data Sharat Masetty
@ 2018-11-20 15:52     ` Jordan Crouse
  0 siblings, 0 replies; 8+ messages in thread
From: Jordan Crouse @ 2018-11-20 15:52 UTC (permalink / raw)
  To: Sharat Masetty; +Cc: linux-arm-msm, freedreno, dri-devel

On Tue, Nov 20, 2018 at 05:07:30PM +0530, Sharat Masetty wrote:
> The ring substructure in msm_gpu_state is an extension of
> msm_gpu_state_bo, so this patch changes the ring structure
> to reuse the msm_gpu_state_bo as a base class, instead of
> redefining the required variables.
> 
> Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>

Looks okay.  There might be some code consolidation to be had here but perhaps
you've already addressed this in a future patch or we can do it as a rainy day
cleanup.

Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 20 +++++++++++---------
>  drivers/gpu/drm/msm/msm_gpu.h           |  4 +---
>  2 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 6ebe842..bbf8d3e 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -383,7 +383,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state)
>  		int size = 0, j;
>  
>  		state->ring[i].fence = gpu->rb[i]->memptrs->fence;
> -		state->ring[i].iova = gpu->rb[i]->iova;
> +		state->ring[i].bo.iova = gpu->rb[i]->iova;
>  		state->ring[i].seqno = gpu->rb[i]->seqno;
>  		state->ring[i].rptr = get_rptr(adreno_gpu, gpu->rb[i]);
>  		state->ring[i].wptr = get_wptr(gpu->rb[i]);
> @@ -397,10 +397,12 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct msm_gpu_state *state)
>  				size = j + 1;
>  
>  		if (size) {
> -			state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
> -			if (state->ring[i].data) {
> -				memcpy(state->ring[i].data, gpu->rb[i]->start, size << 2);
> -				state->ring[i].data_size = size << 2;
> +			state->ring[i].bo.data =
> +				kvmalloc(size << 2, GFP_KERNEL);
> +			if (state->ring[i].bo.data) {
> +				memcpy(state->ring[i].bo.data,
> +						gpu->rb[i]->start, size << 2);
> +				state->ring[i].bo.size = size << 2;

Today I learned there that kvmemdup() does not exist and I was sad about it.

>  			}
>  		}
>  	}
> @@ -440,7 +442,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
>  	int i;
>  
>  	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
> -		kvfree(state->ring[i].data);
> +		kvfree(state->ring[i].bo.data);
>  
>  	for (i = 0; state->bos && i < state->nr_bos; i++)
>  		kvfree(state->bos[i].data);
> @@ -522,15 +524,15 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
>  
>  	for (i = 0; i < gpu->nr_rings; i++) {
>  		drm_printf(p, "  - id: %d\n", i);
> -		drm_printf(p, "    iova: 0x%016llx\n", state->ring[i].iova);
> +		drm_printf(p, "    iova: 0x%016llx\n", state->ring[i].bo.iova);
>  		drm_printf(p, "    last-fence: %d\n", state->ring[i].seqno);
>  		drm_printf(p, "    retired-fence: %d\n", state->ring[i].fence);
>  		drm_printf(p, "    rptr: %d\n", state->ring[i].rptr);
>  		drm_printf(p, "    wptr: %d\n", state->ring[i].wptr);
>  		drm_printf(p, "    size: %d\n", MSM_GPU_RINGBUFFER_SZ);
>  
> -		adreno_show_object(p, state->ring[i].data,
> -			state->ring[i].data_size);
> +		adreno_show_object(p, state->ring[i].bo.data,
> +			state->ring[i].bo.size);
>  	}
>  
>  	if (state->bos) {
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 7dc775f..a3a6371 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -198,13 +198,11 @@ struct msm_gpu_state {
>  	struct timeval time;
>  
>  	struct {
> -		u64 iova;
>  		u32 fence;
>  		u32 seqno;
>  		u32 rptr;
>  		u32 wptr;
> -		void *data;
> -		int data_size;
> +		struct msm_gpu_state_bo bo;
>  	} ring[MSM_GPU_MAX_RINGS];
>  
>  	int nr_registers;
> -- 
> 1.9.1
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/4] drm/msm: Optimize adreno_show_object()
       [not found]   ` <1542713851-14375-4-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2018-11-20 15:56     ` Jordan Crouse
  0 siblings, 0 replies; 8+ messages in thread
From: Jordan Crouse @ 2018-11-20 15:56 UTC (permalink / raw)
  To: Sharat Masetty
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Tue, Nov 20, 2018 at 05:07:31PM +0530, Sharat Masetty wrote:
> When the userspace tries to read the crashstate dump, the read side
> implementation in the driver currently ascii85 encodes all the binary
> buffers and it does this each time the read system call is called.
> A userspace tool like cat typically does a page by page read and the
> number of read calls depends on the size of the data captured by the
> driver. This is certainly not desirable and does not scale well with
> large captures.
> 
> This patch encodes the buffer only once in the read path. With this there
> is an immediate >10X speed improvement in crashstate save time.
> 
> Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>

Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>

I like this a lot.

> ---
> Changes from v1:
> 	Addressed comments from Jordan Crouse
> 
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 80 ++++++++++++++++++++++++---------
>  drivers/gpu/drm/msm/msm_gpu.h           |  1 +
>  2 files changed, 60 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index bbf8d3e..7749967 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -441,11 +441,15 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
>  {
>  	int i;
> 
> -	for (i = 0; i < ARRAY_SIZE(state->ring); i++)
> +	for (i = 0; i < ARRAY_SIZE(state->ring); i++) {
>  		kvfree(state->ring[i].bo.data);
> +		kvfree(state->ring[i].bo.encoded);
> +	}
> 
> -	for (i = 0; state->bos && i < state->nr_bos; i++)
> +	for (i = 0; state->bos && i < state->nr_bos; i++) {
>  		kvfree(state->bos[i].data);
> +		kvfree(state->bos[i].encoded);
> +	}
> 
>  	kfree(state->bos);
>  	kfree(state->comm);
> @@ -472,34 +476,70 @@ int adreno_gpu_state_put(struct msm_gpu_state *state)
> 
>  #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
> 
> -static void adreno_show_object(struct drm_printer *p, u32 *ptr, int len)
> +static char *adreno_gpu_ascii85_encode(u32 *src, size_t len)
>  {
> -	char out[ASCII85_BUFSZ];
> -	long l, datalen, i;
> +	void *buf;
> +	size_t buf_itr = 0;
> +	long i, l;
> 
> -	if (!ptr || !len)
> -		return;
> +	if (!len)
> +		return NULL;
> +
> +	l = ascii85_encode_len(len);
> 
>  	/*
> -	 * Only dump the non-zero part of the buffer - rarely will any data
> -	 * completely fill the entire allocated size of the buffer
> +	 * ascii85 outputs either a 5 byte string or a 1 byte string. So we
> +	 * account for the worst case of 5 bytes per dword plus the 1 for '\0'
>  	 */
> -	for (datalen = 0, i = 0; i < len >> 2; i++) {
> -		if (ptr[i])
> -			datalen = (i << 2) + 1;
> +	buf = kvmalloc((l * 5) + 1, GFP_KERNEL);
> +	if (!buf)
> +		return NULL;
> +
> +	for (i = 0; i < l; i++) {
> +		ascii85_encode(src[i], buf + buf_itr);
> +
> +		if (src[i] == 0)
> +			buf_itr += 1;
> +		else
> +			buf_itr += 5;
>  	}
> 
> -	/* Skip printing the object if it is empty */
> -	if (datalen == 0)
> +	return buf;
> +}
> +
> +static void adreno_show_object(struct drm_printer *p,
> +		struct msm_gpu_state_bo *bo)
> +{
> +	if ((!bo->data && !bo->encoded) || !bo->size)
>  		return;
> 
> -	l = ascii85_encode_len(datalen);
> +	if (!bo->encoded) {
> +		long datalen, i;
> +		u32 *buf = bo->data;
> +
> +		/*
> +		 * Only dump the non-zero part of the buffer - rarely will
> +		 * any data completely fill the entire allocated size of
> +		 * the buffer.
> +		 */
> +		for (datalen = 0, i = 0; i < (bo->size) >> 2; i++) {
> +			if (buf[i])
> +				datalen = ((i + 1) << 2);
> +		}
> +
> +		bo->encoded = adreno_gpu_ascii85_encode(buf, datalen);
> +
> +		kvfree(bo->data);
> +		bo->data = NULL;
> +
> +		if (!bo->encoded)
> +			return;
> +	}
> 
>  	drm_puts(p, "    data: !!ascii85 |\n");
>  	drm_puts(p, "     ");
> 
> -	for (i = 0; i < l; i++)
> -		drm_puts(p, ascii85_encode(ptr[i], out));
> +	drm_puts(p, bo->encoded);
> 
>  	drm_puts(p, "\n");
>  }
> @@ -531,8 +571,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
>  		drm_printf(p, "    wptr: %d\n", state->ring[i].wptr);
>  		drm_printf(p, "    size: %d\n", MSM_GPU_RINGBUFFER_SZ);
> 
> -		adreno_show_object(p, state->ring[i].bo.data,
> -			state->ring[i].bo.size);
> +		adreno_show_object(p, &(state->ring[i].bo));
>  	}
> 
>  	if (state->bos) {
> @@ -543,8 +582,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
>  				state->bos[i].iova);
>  			drm_printf(p, "    size: %zd\n", state->bos[i].size);
> 
> -			adreno_show_object(p, state->bos[i].data,
> -				state->bos[i].size);
> +			adreno_show_object(p, &(state->bos[i]));
>  		}
>  	}
> 
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index a3a6371..737bf45 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -191,6 +191,7 @@ struct msm_gpu_state_bo {
>  	u64 iova;
>  	size_t size;
>  	void *data;
> +	char *encoded;
>  };
> 
>  struct msm_gpu_state {
> --
> 1.9.1
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-11-20 15:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-20 11:37 [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate Sharat Masetty
     [not found] ` <1542713851-14375-1-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-11-20 11:37   ` [PATCH 2/4] include/linux/ascii85: Update ascii85_encode() Sharat Masetty
     [not found]     ` <1542713851-14375-2-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-11-20 15:47       ` Jordan Crouse
2018-11-20 11:37   ` [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data Sharat Masetty
2018-11-20 15:52     ` Jordan Crouse
2018-11-20 15:42   ` [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate Jordan Crouse
2018-11-20 11:37 ` [PATCH 4/4] drm/msm: Optimize adreno_show_object() Sharat Masetty
     [not found]   ` <1542713851-14375-4-git-send-email-smasetty-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-11-20 15:56     ` Jordan Crouse

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.