All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately
@ 2018-04-17 15:58 Robin Murphy
  2018-04-17 15:58 ` [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing Robin Murphy
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Robin Murphy @ 2018-04-17 15:58 UTC (permalink / raw)
  To: amd-gfx, dri-devel; +Cc: okaya, alexander.deucher, christian.koenig

For dma_map_sg(), DMA API implementations are free to merge consecutive
segments into a single DMA mapping if conditions are suitable, thus the
resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
iterates may be packed into fewer entries than ttm->sg->nents implies.

The current implementation does not account for this, meaning that its
callers either have to reject the 0 < count < nents case or risk getting
bogus DMA addresses beyond the first segment. Fortunately this is quite
easy to handle without having to rejig structures to also store the
mapped count, since the total DMA length should still be equal to the
total buffer length. All we need is a second scatterlist cursor to
iterate through the DMA addresses independently of the page addresses.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: Remember to iterate dma_len correctly as well.

 drivers/gpu/drm/drm_prime.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7856a9b3f8a8..b8ca06ea7d80 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -933,16 +933,18 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 				     dma_addr_t *addrs, int max_entries)
 {
 	unsigned count;
-	struct scatterlist *sg;
+	struct scatterlist *sg, *dma_sg;
 	struct page *page;
-	u32 len, index;
+	u32 len, dma_len, index;
 	dma_addr_t addr;
 
 	index = 0;
+	dma_sg = sgt->sgl;
+	dma_len = sg_dma_len(dma_sg);
+	addr = sg_dma_address(dma_sg);
 	for_each_sg(sgt->sgl, sg, sgt->nents, count) {
 		len = sg->length;
 		page = sg_page(sg);
-		addr = sg_dma_address(sg);
 
 		while (len > 0) {
 			if (WARN_ON(index >= max_entries))
@@ -955,8 +957,15 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 			page++;
 			addr += PAGE_SIZE;
 			len -= PAGE_SIZE;
+			dma_len -= PAGE_SIZE;
 			index++;
 		}
+
+		if (dma_len == 0) {
+			dma_sg = sg_next(dma_sg);
+			dma_len = sg_dma_len(dma_sg);
+			addr = sg_dma_address(dma_sg);
+		}
 	}
 	return 0;
 }
-- 
2.16.1.dirty

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
  2018-04-17 15:58 [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Robin Murphy
@ 2018-04-17 15:58 ` Robin Murphy
       [not found]   ` <622c9ac3784170880dbde6900146e68631df958a.1523977133.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2018-04-17 15:58 ` [PATCH v2 3/3] drm/radeon: " Robin Murphy
  2018-04-17 16:29 ` [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Christian König
  2 siblings, 1 reply; 12+ messages in thread
From: Robin Murphy @ 2018-04-17 15:58 UTC (permalink / raw)
  To: amd-gfx, dri-devel; +Cc: okaya, alexander.deucher, christian.koenig

The amdgpu driver doesn't appear to directly use the scatterlist mapped
by amdgpu_ttm_tt_pin_userptr(), it merely hands it off to
drm_prime_sg_to_page_addr_arrays() to generate the dma_address array
which it actually cares about. Now that the latter can cope with
dma_map_sg() coalescing dma-contiguous segments such that it returns
0 < count < nents, we can relax the current count == nents check to
only consider genuine failure as other drivers do.

This avoids a false negative on arm64 systems with an IOMMU.

Reported-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: Expand commit message to clarify

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 205da3ff9cd0..f81e96a4242f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -813,7 +813,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
 	r = -ENOMEM;
 	nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-	if (nents != ttm->sg->nents)
+	if (nents == 0)
 		goto release_sg;
 
 	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.16.1.dirty

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/3] drm/radeon: Allow dma_map_sg() coalescing
  2018-04-17 15:58 [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Robin Murphy
  2018-04-17 15:58 ` [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing Robin Murphy
@ 2018-04-17 15:58 ` Robin Murphy
  2018-04-17 16:29 ` [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Christian König
  2 siblings, 0 replies; 12+ messages in thread
From: Robin Murphy @ 2018-04-17 15:58 UTC (permalink / raw)
  To: amd-gfx, dri-devel; +Cc: okaya, alexander.deucher, christian.koenig

Much like amdgpu, the radeon driver doesn't appear to directly use the
scatterlist mapped by radeon_ttm_tt_pin_userptr(), it merely hands it
off to drm_prime_sg_to_page_addr_arrays() to generate the dma_address
array which it actually cares about. Now that the latter can cope with
dma_map_sg() coalescing dma-contiguous segments such that it returns
0 < count < nents, we can relax the current count == nents check to
only consider genuine failure as other drivers do.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: New

 drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 8689fcca051c..7c099192c7fa 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -585,7 +585,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 
 	r = -ENOMEM;
 	nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-	if (nents != ttm->sg->nents)
+	if (nents == 0)
 		goto release_sg;
 
 	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
-- 
2.16.1.dirty

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately
  2018-04-17 15:58 [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Robin Murphy
  2018-04-17 15:58 ` [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing Robin Murphy
  2018-04-17 15:58 ` [PATCH v2 3/3] drm/radeon: " Robin Murphy
@ 2018-04-17 16:29 ` Christian König
  2018-04-17 18:22   ` Robin Murphy
  2 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2018-04-17 16:29 UTC (permalink / raw)
  To: Robin Murphy, amd-gfx, dri-devel
  Cc: okaya, alexander.deucher, christian.koenig

Am 17.04.2018 um 17:58 schrieb Robin Murphy:
> For dma_map_sg(), DMA API implementations are free to merge consecutive
> segments into a single DMA mapping if conditions are suitable, thus the
> resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
> iterates may be packed into fewer entries than ttm->sg->nents implies.
>
> The current implementation does not account for this, meaning that its
> callers either have to reject the 0 < count < nents case or risk getting
> bogus DMA addresses beyond the first segment. Fortunately this is quite
> easy to handle without having to rejig structures to also store the
> mapped count, since the total DMA length should still be equal to the
> total buffer length. All we need is a second scatterlist cursor to
> iterate through the DMA addresses independently of the page addresses.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Reviewed-by: Christian König <christian.koenig@amd.com> for the whole 
series.

> ---
>
> v2: Remember to iterate dma_len correctly as well.
>
>   drivers/gpu/drm/drm_prime.c | 15 ++++++++++++---
>   1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 7856a9b3f8a8..b8ca06ea7d80 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -933,16 +933,18 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
>   				     dma_addr_t *addrs, int max_entries)
>   {
>   	unsigned count;
> -	struct scatterlist *sg;
> +	struct scatterlist *sg, *dma_sg;
>   	struct page *page;
> -	u32 len, index;
> +	u32 len, dma_len, index;
>   	dma_addr_t addr;
>   
>   	index = 0;
> +	dma_sg = sgt->sgl;
> +	dma_len = sg_dma_len(dma_sg);
> +	addr = sg_dma_address(dma_sg);
>   	for_each_sg(sgt->sgl, sg, sgt->nents, count) {
>   		len = sg->length;
>   		page = sg_page(sg);
> -		addr = sg_dma_address(sg);
>   
>   		while (len > 0) {
>   			if (WARN_ON(index >= max_entries))
> @@ -955,8 +957,15 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
>   			page++;
>   			addr += PAGE_SIZE;
>   			len -= PAGE_SIZE;
> +			dma_len -= PAGE_SIZE;
>   			index++;
>   		}
> +
> +		if (dma_len == 0) {
> +			dma_sg = sg_next(dma_sg);
> +			dma_len = sg_dma_len(dma_sg);
> +			addr = sg_dma_address(dma_sg);
> +		}
>   	}
>   	return 0;
>   }

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately
  2018-04-17 16:29 ` [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Christian König
@ 2018-04-17 18:22   ` Robin Murphy
       [not found]     ` <73ac5c65-1dbd-8711-5770-75d79389bf44-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Robin Murphy @ 2018-04-17 18:22 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, dri-devel; +Cc: okaya, alexander.deucher

On 17/04/18 17:29, Christian König wrote:
> Am 17.04.2018 um 17:58 schrieb Robin Murphy:
>> For dma_map_sg(), DMA API implementations are free to merge consecutive
>> segments into a single DMA mapping if conditions are suitable, thus the
>> resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
>> iterates may be packed into fewer entries than ttm->sg->nents implies.
>>
>> The current implementation does not account for this, meaning that its
>> callers either have to reject the 0 < count < nents case or risk getting
>> bogus DMA addresses beyond the first segment. Fortunately this is quite
>> easy to handle without having to rejig structures to also store the
>> mapped count, since the total DMA length should still be equal to the
>> total buffer length. All we need is a second scatterlist cursor to
>> iterate through the DMA addresses independently of the page addresses.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> 
> Reviewed-by: Christian König <christian.koenig@amd.com> for the whole 
> series.

Thanks Christian.

FWIW, the following *completely untested* hack should in theory give the 
AMD IOMMU similar segment-merging behaviour to the arm64 IOMMU DMA ops, 
if that helps widen the scope for testing/investigation (I have neither 
an AMD/ATI graphics card nor a PCIe-capable arm64 box to hand just at 
the moment).

Robin.

----->8-----
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2a99f0f14795..60b0e495b567 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2489,11 +2489,11 @@ static int map_sg(struct device *dev, struct 
scatterlist *sglist,
  		  int nelems, enum dma_data_direction direction,
  		  unsigned long attrs)
  {
-	int mapped_pages = 0, npages = 0, prot = 0, i;
+	int mapped_pages = 0, npages = 0, prot = 0, i, count;
  	struct protection_domain *domain;
  	struct dma_ops_domain *dma_dom;
-	struct scatterlist *s;
-	unsigned long address;
+	struct scatterlist *s, *d;
+	unsigned long address, max_seg;
  	u64 dma_mask;

  	domain = get_domain(dev);
@@ -2535,7 +2535,28 @@ static int map_sg(struct device *dev, struct 
scatterlist *sglist,
  		s->dma_length   = s->length;
  	}

-	return nelems;
+	d = sglist;
+	max_seg = dma_get_max_seg_size(dev);
+	count = 1;
+	nelems -= 1;
+	for_each_sg(sg_next(sglist), s, nelems, i) {
+		dma_addr_t s_dma_addr = s->dma_address;
+		unsigned int s_dma_len = s->dma_length;
+
+		s->dma_address = 0;
+		s->dma_length = 0;
+		if (s_dma_addr == d->dma_address + d->dma_length &&
+		    d->dma_length + s_dma_len <= max_seg) {
+			d->dma_length += s_dma_len;
+		} else {
+			d = sg_next(d);
+			d->dma_address = s_dma_addr;
+			d->dma_length = s_dma_len;
+			count++;
+		}
+	}
+
+	return count;

  out_unmap:
  	pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately
       [not found]     ` <73ac5c65-1dbd-8711-5770-75d79389bf44-5wv7dgnIgG8@public.gmane.org>
@ 2018-04-17 18:25       ` Deucher, Alexander
  0 siblings, 0 replies; 12+ messages in thread
From: Deucher, Alexander @ 2018-04-17 18:25 UTC (permalink / raw)
  To: Robin Murphy, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Suthikulpanit,
	Suravee
  Cc: okaya-sgV2jX0FEOL9JmXXK+q4OQ


[-- Attachment #1.1: Type: text/plain, Size: 3748 bytes --]

+ Suravee

________________________________
From: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
Sent: Tuesday, April 17, 2018 2:22:46 PM
To: Koenig, Christian; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; dri-devel-PD4FTy7X32mMSPqsTGOZug@public.gmane.orgesktop.org
Cc: okaya-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org; Deucher, Alexander
Subject: Re: [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately

On 17/04/18 17:29, Christian König wrote:
> Am 17.04.2018 um 17:58 schrieb Robin Murphy:
>> For dma_map_sg(), DMA API implementations are free to merge consecutive
>> segments into a single DMA mapping if conditions are suitable, thus the
>> resulting DMA addresses which drm_prime_sg_to_page_addr_arrays()
>> iterates may be packed into fewer entries than ttm->sg->nents implies.
>>
>> The current implementation does not account for this, meaning that its
>> callers either have to reject the 0 < count < nents case or risk getting
>> bogus DMA addresses beyond the first segment. Fortunately this is quite
>> easy to handle without having to rejig structures to also store the
>> mapped count, since the total DMA length should still be equal to the
>> total buffer length. All we need is a second scatterlist cursor to
>> iterate through the DMA addresses independently of the page addresses.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>
> Reviewed-by: Christian König <christian.koenig-5C7GfCeVMHo@public.gmane.org> for the whole
> series.

Thanks Christian.

FWIW, the following *completely untested* hack should in theory give the
AMD IOMMU similar segment-merging behaviour to the arm64 IOMMU DMA ops,
if that helps widen the scope for testing/investigation (I have neither
an AMD/ATI graphics card nor a PCIe-capable arm64 box to hand just at
the moment).

Robin.

----->8-----
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2a99f0f14795..60b0e495b567 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2489,11 +2489,11 @@ static int map_sg(struct device *dev, struct
scatterlist *sglist,
                   int nelems, enum dma_data_direction direction,
                   unsigned long attrs)
  {
-       int mapped_pages = 0, npages = 0, prot = 0, i;
+       int mapped_pages = 0, npages = 0, prot = 0, i, count;
         struct protection_domain *domain;
         struct dma_ops_domain *dma_dom;
-       struct scatterlist *s;
-       unsigned long address;
+       struct scatterlist *s, *d;
+       unsigned long address, max_seg;
         u64 dma_mask;

         domain = get_domain(dev);
@@ -2535,7 +2535,28 @@ static int map_sg(struct device *dev, struct
scatterlist *sglist,
                 s->dma_length   = s->length;
         }

-       return nelems;
+       d = sglist;
+       max_seg = dma_get_max_seg_size(dev);
+       count = 1;
+       nelems -= 1;
+       for_each_sg(sg_next(sglist), s, nelems, i) {
+               dma_addr_t s_dma_addr = s->dma_address;
+               unsigned int s_dma_len = s->dma_length;
+
+               s->dma_address = 0;
+               s->dma_length = 0;
+               if (s_dma_addr == d->dma_address + d->dma_length &&
+                   d->dma_length + s_dma_len <= max_seg) {
+                       d->dma_length += s_dma_len;
+               } else {
+                       d = sg_next(d);
+                       d->dma_address = s_dma_addr;
+                       d->dma_length = s_dma_len;
+                       count++;
+               }
+       }
+
+       return count;

  out_unmap:
         pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",

[-- Attachment #1.2: Type: text/html, Size: 7200 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
       [not found]   ` <622c9ac3784170880dbde6900146e68631df958a.1523977133.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2018-04-17 21:13     ` Sinan Kaya
       [not found]       ` <59bb29d9-da0c-1542-88bf-71bed96f3ed2-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Sinan Kaya @ 2018-04-17 21:13 UTC (permalink / raw)
  To: Robin Murphy, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: alexander.deucher-5C7GfCeVMHo, David1.Zhou-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo

On 4/17/2018 11:58 AM, Robin Murphy wrote:
> The amdgpu driver doesn't appear to directly use the scatterlist mapped
> by amdgpu_ttm_tt_pin_userptr(), it merely hands it off to
> drm_prime_sg_to_page_addr_arrays() to generate the dma_address array
> which it actually cares about. Now that the latter can cope with
> dma_map_sg() coalescing dma-contiguous segments such that it returns
> 0 < count < nents, we can relax the current count == nents check to
> only consider genuine failure as other drivers do.
> 
> This avoids a false negative on arm64 systems with an IOMMU.
> 
> Reported-by: Sinan Kaya <okaya@codeaurora.org>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> 
> v2: Expand commit message to clarify
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 205da3ff9cd0..f81e96a4242f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -813,7 +813,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
>  
>  	r = -ENOMEM;
>  	nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
> -	if (nents != ttm->sg->nents)
> +	if (nents == 0)
>  		goto release_sg;
>  
>  	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
> 

Tested-by: Sinan Kaya <okaya@codeaurora.org>

using QDF2400 and XFX Vega64 GPU for the first two patches.

./builddir/tests/amdgpu/amdgpu_test -s 1

Suite: Basic Tests
  Test: Userptr Test ...passed

Userptr Test fails without this patch.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
       [not found]       ` <59bb29d9-da0c-1542-88bf-71bed96f3ed2-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2018-04-25 20:44         ` Sinan Kaya
  2018-04-27 15:54           ` Robin Murphy
  0 siblings, 1 reply; 12+ messages in thread
From: Sinan Kaya @ 2018-04-25 20:44 UTC (permalink / raw)
  To: Robin Murphy, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: alexander.deucher-5C7GfCeVMHo, David1.Zhou-5C7GfCeVMHo,
	christian.koenig-5C7GfCeVMHo

On 4/17/2018 5:13 PM, Sinan Kaya wrote:
> Tested-by: Sinan Kaya <okaya@codeaurora.org>
> 
> using QDF2400 and XFX Vega64 GPU for the first two patches.
> 
> ./builddir/tests/amdgpu/amdgpu_test -s 1
> 
> Suite: Basic Tests
>   Test: Userptr Test ...passed
> 
> Userptr Test fails without this patch.


I'm taking this back. I observed a crash with the HSA applications:

ubuntu@ubuntu:~/amdgpu$_./vectoradd_hip.exe
[  834.002206] create_process:620
[  837.413021] Unable to handle kernel NULL pointer dereference at virtual address 00000018
[  837.414097] user pgtable: 4k pages, 48-bit VAs, pgd = ffff80000d448000
[  837.427034] [0000000000000018] *pgd=000000000a424003, *pud=000000000e0b3003, *pmd=0000000000000000
[  837.427414] Internal error: Oops: 96000006 [#1] SMP
[  837.427744] Modules linked in:
[  837.457060] CPU: 3 PID: 2321 Comm: vectoradd_hip.e Not tainted 4.13.0 #5
[  837.463076] task: ffff80000dfb0d80 task.stack: ffff80000e17c000
[  837.473795] PC is at drm_prime_sg_to_page_addr_arrays+0xac/0xec
[  837.482877] LR is at drm_prime_sg_to_page_addr_arrays+0xac/0xec
[  837.491910] pc : [<ffff0000084877e8>] lr : [<ffff0000084877e8>] pstate: 80400149
[  837.492022] sp : ffff80000e17f850
[  837.492115] x29: ffff80000e17f850 x28: ffff80000d586700
[  837.516635] x27: 0000000000000000 x26: 0000e10410004000
[  837.526444] x25: ffff80000cb91880 x24: 0000000000000002
[  837.534974] x23: ffff80000cb91910 x22: ffff80000cb91900
[  837.535178] x21: 0000000000000002 x20: 0000000000000000
[  837.535340] x19: ffff80000cb91880 x18: 0000ffffffffd278
[  837.560498] x17: 0000ffffbef39240 x16: ffff0000081bb868
[  837.560684] x15: 0000ffffbf6fe000 x14: 0000000000000000
[  837.574764] x13: 0000000000000000 x12: 0000000000000000
[  837.588881] x11: 0000000000000001 x10: ffff80000a449038
[  837.593181] x9 : 0000000000000000 x8 : ffff80000cb91980
[  837.604606] x7 : 0000000000000000 x6 : 000000000000003f
[  837.612801] x5 : 0000000000000040 x4 : 0000000000000000
[  837.617425] x3 : 0000000000000002 x2 : 0000000000000000
[  837.625768] x1 : 0000000000000002 x0 : 0000000000000000
[  838.516100] [<ffff0000084877e8>] drm_prime_sg_to_page_addr_arrays+0xac/0xec
[  838.516385] [<ffff0000084c4df4>] amdgpu_ttm_tt_populate+0x80/0xe8
[  838.545137] [<ffff000008498fa4>] ttm_tt_bind+0x3c/0x7c
[  838.558468] [<ffff00000849abb4>] ttm_bo_handle_move_mem+0x12c/0x340
[  838.562518] [<ffff00000849b988>] ttm_bo_validate+0x90/0x100
[  838.572370] [<ffff00000849bc54>] ttm_bo_init_reserved+0x25c/0x324
[  838.582103] [<ffff0000084c82b4>] amdgpu_bo_do_create+0x140/0x3e4
[  838.591609] [<ffff0000084c8598>] amdgpu_bo_create+0x40/0x15c
[  838.601034] [<ffff00000856abc4>] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x36c/0x80c
[  838.609631] [<ffff0000084a3224>] kfd_ioctl_alloc_memory_of_gpu+0xfc/0x180
[  838.621500] [<ffff0000084a49c0>] kfd_ioctl+0x144/0x1e8
[  838.632253] [<ffff0000081bb0e8>] vfs_ioctl+0x18/0x40
[  838.641592] [<ffff0000081bb758>] do_vfs_ioctl+0x5ac/0x6bc
[  838.649349] [<ffff0000081bb8c4>] SyS_ioctl+0x5c/0x8c
[  838.649609] [<ffff000008082bf0>] el0_svc_naked+0x24/0x28
[  838.649776] Code: 17fffff1 350000d4 aa1903e0 97fab3f6 (b9401814)
[  838.672742] ---[ end trace fb2627bd4d4c9818 ]---

Robin, if you want to debug this; feel free to send me a debug patch.


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
  2018-04-25 20:44         ` Sinan Kaya
@ 2018-04-27 15:54           ` Robin Murphy
  2018-04-27 16:20             ` Sinan Kaya
  2018-04-27 19:42             ` Sinan Kaya
  0 siblings, 2 replies; 12+ messages in thread
From: Robin Murphy @ 2018-04-27 15:54 UTC (permalink / raw)
  To: Sinan Kaya, amd-gfx, dri-devel; +Cc: alexander.deucher, christian.koenig

On 25/04/18 21:44, Sinan Kaya wrote:
> On 4/17/2018 5:13 PM, Sinan Kaya wrote:
>> Tested-by: Sinan Kaya <okaya@codeaurora.org>
>>
>> using QDF2400 and XFX Vega64 GPU for the first two patches.
>>
>> ./builddir/tests/amdgpu/amdgpu_test -s 1
>>
>> Suite: Basic Tests
>>    Test: Userptr Test ...passed
>>
>> Userptr Test fails without this patch.
> 
> 
> I'm taking this back. I observed a crash with the HSA applications:

FWIW, was this working before, or is this somewhere new that we're only 
reaching now that pin_userpages can succeed?

> ubuntu@ubuntu:~/amdgpu$_./vectoradd_hip.exe
> [  834.002206] create_process:620
> [  837.413021] Unable to handle kernel NULL pointer dereference at virtual address 00000018

£5 says that's sg_dma_len(NULL), which implies either that something's 
gone horribly wrong with the scatterlist DMA mapping such that the 
lengths don't match, or much more likely that ttm.dma_address is NULL 
and I've missed the tiny subtlety below. Does that fix matters?

Robin.


----->8-----
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index b8ca06ea7d80..6a31229e4611 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -961,7 +961,7 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table 
*sgt, struct page **pages,
  			index++;
  		}

-		if (dma_len == 0) {
+		if (addrs && dma_len == 0) {
  			dma_sg = sg_next(dma_sg);
  			dma_len = sg_dma_len(dma_sg);
  			addr = sg_dma_address(dma_sg);
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
  2018-04-27 15:54           ` Robin Murphy
@ 2018-04-27 16:20             ` Sinan Kaya
  2018-04-27 19:42             ` Sinan Kaya
  1 sibling, 0 replies; 12+ messages in thread
From: Sinan Kaya @ 2018-04-27 16:20 UTC (permalink / raw)
  To: Robin Murphy, amd-gfx, dri-devel; +Cc: alexander.deucher, christian.koenig

On 4/27/2018 11:54 AM, Robin Murphy wrote:
>> I'm taking this back. I observed a crash with the HSA applications:
> 
> FWIW, was this working before, or is this somewhere new that we're only reaching now that pin_userpages can succeed?

Yes, HSA application is a different test. Previous DRM library unit test is
still passing. It must have some unique characteristic. 

I'll test your patch and report.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
  2018-04-27 15:54           ` Robin Murphy
  2018-04-27 16:20             ` Sinan Kaya
@ 2018-04-27 19:42             ` Sinan Kaya
  2018-04-30 13:00               ` Robin Murphy
  1 sibling, 1 reply; 12+ messages in thread
From: Sinan Kaya @ 2018-04-27 19:42 UTC (permalink / raw)
  To: Robin Murphy, amd-gfx, dri-devel; +Cc: alexander.deucher, christian.koenig

On 4/27/2018 11:54 AM, Robin Murphy wrote:
> 
>> ubuntu@ubuntu:~/amdgpu$_./vectoradd_hip.exe
>> [  834.002206] create_process:620
>> [  837.413021] Unable to handle kernel NULL pointer dereference at virtual address 00000018
> 
> £5 says that's sg_dma_len(NULL), which implies either that something's gone horribly wrong with the scatterlist DMA mapping such that the lengths don't match, or much more likely that ttm.dma_address is NULL and I've missed the tiny subtlety below. Does that fix matters?

Turned out to be a null pointer problem after sg_next(). The following helped.

+               if (addrs && (dma_len == 0)) {
                        dma_sg = sg_next(dma_sg);
-                       dma_len = sg_dma_len(dma_sg);
-                       addr = sg_dma_address(dma_sg);
+                       if (dma_sg) {
+                               dma_len = sg_dma_len(dma_sg);
+                               addr = sg_dma_address(dma_sg);
+                       }
                }
 

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing
  2018-04-27 19:42             ` Sinan Kaya
@ 2018-04-30 13:00               ` Robin Murphy
  0 siblings, 0 replies; 12+ messages in thread
From: Robin Murphy @ 2018-04-30 13:00 UTC (permalink / raw)
  To: Sinan Kaya, amd-gfx, dri-devel; +Cc: alexander.deucher, christian.koenig

On 27/04/18 20:42, Sinan Kaya wrote:
> On 4/27/2018 11:54 AM, Robin Murphy wrote:
>>
>>> ubuntu@ubuntu:~/amdgpu$_./vectoradd_hip.exe
>>> [  834.002206] create_process:620
>>> [  837.413021] Unable to handle kernel NULL pointer dereference at virtual address 00000018
>>
>> £5 says that's sg_dma_len(NULL), which implies either that something's gone horribly wrong with the scatterlist DMA mapping such that the lengths don't match, or much more likely that ttm.dma_address is NULL and I've missed the tiny subtlety below. Does that fix matters?
> 
> Turned out to be a null pointer problem after sg_next(). The following helped.

Ugh, right, the whole thing's in the wrong place such that when addrs is 
valid we can dereference junk on the way out of the loop (entirely 
needlessly)... v3 coming up.

Robin.

> 
> +               if (addrs && (dma_len == 0)) {
>                          dma_sg = sg_next(dma_sg);
> -                       dma_len = sg_dma_len(dma_sg);
> -                       addr = sg_dma_address(dma_sg);
> +                       if (dma_sg) {
> +                               dma_len = sg_dma_len(dma_sg);
> +                               addr = sg_dma_address(dma_sg);
> +                       }
>                  }
>   
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-04-30 13:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-17 15:58 [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Robin Murphy
2018-04-17 15:58 ` [PATCH v2 2/3] drm/amdgpu: Allow dma_map_sg() coalescing Robin Murphy
     [not found]   ` <622c9ac3784170880dbde6900146e68631df958a.1523977133.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2018-04-17 21:13     ` Sinan Kaya
     [not found]       ` <59bb29d9-da0c-1542-88bf-71bed96f3ed2-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-04-25 20:44         ` Sinan Kaya
2018-04-27 15:54           ` Robin Murphy
2018-04-27 16:20             ` Sinan Kaya
2018-04-27 19:42             ` Sinan Kaya
2018-04-30 13:00               ` Robin Murphy
2018-04-17 15:58 ` [PATCH v2 3/3] drm/radeon: " Robin Murphy
2018-04-17 16:29 ` [PATCH v2 1/3] drm/prime: Iterate SG DMA addresses separately Christian König
2018-04-17 18:22   ` Robin Murphy
     [not found]     ` <73ac5c65-1dbd-8711-5770-75d79389bf44-5wv7dgnIgG8@public.gmane.org>
2018-04-17 18:25       ` Deucher, Alexander

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.