* [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc
@ 2022-07-18  6:28 Liu Song
       [not found] ` <CGME20220816123958eucas1p1b03a5efa1f5804245a5c1a9b27529015@eucas1p1.samsung.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Liu Song @ 2022-07-18  6:28 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, linux-kernel

From: Liu Song <liusong@linux.alibaba.com>

In "dma_alloc_from_dev_coherent" and "dma_direct_alloc",
the allocated memory is explicitly set to 0.

A helper function "use_dev_coherent_memory" is introduced here to
determine whether the memory is allocated by "dma_alloc_from_dev_coherent".

And use "get_dma_ops" to determine whether the memory is allocated by
"dma_direct_alloc".

After this modification, memory allocated using "dma_pool_zalloc" can avoid
duplicate memset.

Signed-off-by: Liu Song <liusong@linux.alibaba.com>
---
 include/linux/dma-map-ops.h | 5 +++++
 mm/dmapool.c                | 5 ++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 0d5b06b..c29948d 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -171,6 +171,10 @@ int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
 int dma_release_from_dev_coherent(struct device *dev, int order, void *vaddr);
 int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, size_t size, int *ret);
+static inline bool use_dev_coherent_memory(struct device *dev)
+{
+	return dev->dma_mem ? true : false;
+}
 #else
 static inline int dma_declare_coherent_memory(struct device *dev,
 		phys_addr_t phys_addr, dma_addr_t device_addr, size_t size)
@@ -180,6 +184,7 @@ static inline int dma_declare_coherent_memory(struct device *dev,
 #define dma_alloc_from_dev_coherent(dev, size, handle, ret) (0)
 #define dma_release_from_dev_coherent(dev, order, vaddr) (0)
 #define dma_mmap_from_dev_coherent(dev, vma, vaddr, order, ret) (0)
+#define use_dev_coherent_memory(dev) (0)
 #endif /* CONFIG_DMA_DECLARE_COHERENT */
 
 #ifdef CONFIG_DMA_GLOBAL_POOL
diff --git a/mm/dmapool.c b/mm/dmapool.c
index a7eb5d0..6e03530 100644
--- a/mm/dmapool.c
+++ b/mm/dmapool.c
@@ -21,6 +21,7 @@
 
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma-map-ops.h>
 #include <linux/dmapool.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
@@ -372,7 +373,9 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
 #endif
 	spin_unlock_irqrestore(&pool->lock, flags);
 
-	if (want_init_on_alloc(mem_flags))
+	if (want_init_on_alloc(mem_flags) &&
+		!use_dev_coherent_memory(pool->dev) &&
+		get_dma_ops(pool->dev))
 		memset(retval, 0, pool->size);
 
 	return retval;
-- 
1.8.3.1



* Re: [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc
       [not found] ` <CGME20220816123958eucas1p1b03a5efa1f5804245a5c1a9b27529015@eucas1p1.samsung.com>
@ 2022-08-16 12:39     ` Marek Szyprowski
  0 siblings, 0 replies; 11+ messages in thread
From: Marek Szyprowski @ 2022-08-16 12:39 UTC (permalink / raw)
  To: Liu Song, akpm, linux-arm-kernel; +Cc: linux-mm, linux-kernel

Hi,

On 18.07.2022 08:28, Liu Song wrote:
> From: Liu Song <liusong@linux.alibaba.com>
>
> In "dma_alloc_from_dev_coherent" and "dma_direct_alloc",
> the allocated memory is explicitly set to 0.
>
> A helper function "use_dev_coherent_memory" is introduced here to
> determine whether the memory is allocated by "dma_alloc_from_dev_coherent".
>
> And use "get_dma_ops" to determine whether the memory is allocated by
> "dma_direct_alloc".
>
> After this modification, memory allocated using "dma_pool_zalloc" can avoid
> duplicate memset.
>
> Signed-off-by: Liu Song <liusong@linux.alibaba.com>

This patch landed in linux-next (next-20220816). Unfortunately it causes
serious issues on ARM 32bit systems. I've observed it on an ARM 32bit
Samsung Exynos 5422 based Odroid XU4 board with the USB r8152 driver.
After applying this patch and loading the r8152 driver I only see the
following endless messages in the log:

xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?

It looks like there are drivers which rely on the fact that the dma
coherent buffers are always zeroed.

> ---
>   include/linux/dma-map-ops.h | 5 +++++
>   mm/dmapool.c                | 5 ++++-
>   2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
> index 0d5b06b..c29948d 100644
> --- a/include/linux/dma-map-ops.h
> +++ b/include/linux/dma-map-ops.h
> @@ -171,6 +171,10 @@ int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
>   int dma_release_from_dev_coherent(struct device *dev, int order, void *vaddr);
>   int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
>   		void *cpu_addr, size_t size, int *ret);
> +static inline bool use_dev_coherent_memory(struct device *dev)
> +{
> +	return dev->dma_mem ? true : false;
> +}
>   #else
>   static inline int dma_declare_coherent_memory(struct device *dev,
>   		phys_addr_t phys_addr, dma_addr_t device_addr, size_t size)
> @@ -180,6 +184,7 @@ static inline int dma_declare_coherent_memory(struct device *dev,
>   #define dma_alloc_from_dev_coherent(dev, size, handle, ret) (0)
>   #define dma_release_from_dev_coherent(dev, order, vaddr) (0)
>   #define dma_mmap_from_dev_coherent(dev, vma, vaddr, order, ret) (0)
> +#define use_dev_coherent_memory(dev) (0)
>   #endif /* CONFIG_DMA_DECLARE_COHERENT */
>   
>   #ifdef CONFIG_DMA_GLOBAL_POOL
> diff --git a/mm/dmapool.c b/mm/dmapool.c
> index a7eb5d0..6e03530 100644
> --- a/mm/dmapool.c
> +++ b/mm/dmapool.c
> @@ -21,6 +21,7 @@
>   
>   #include <linux/device.h>
>   #include <linux/dma-mapping.h>
> +#include <linux/dma-map-ops.h>
>   #include <linux/dmapool.h>
>   #include <linux/kernel.h>
>   #include <linux/list.h>
> @@ -372,7 +373,9 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
>   #endif
>   	spin_unlock_irqrestore(&pool->lock, flags);
>   
> -	if (want_init_on_alloc(mem_flags))
> +	if (want_init_on_alloc(mem_flags) &&
> +		!use_dev_coherent_memory(pool->dev) &&
> +		get_dma_ops(pool->dev))
>   		memset(retval, 0, pool->size);
>   
>   	return retval;

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc
  2022-08-16 12:39     ` Marek Szyprowski
@ 2022-08-16 15:00       ` Robin Murphy
  -1 siblings, 0 replies; 11+ messages in thread
From: Robin Murphy @ 2022-08-16 15:00 UTC (permalink / raw)
  To: Marek Szyprowski, Liu Song, akpm, linux-arm-kernel, Christoph Hellwig
  Cc: linux-mm, linux-kernel, iommu

On 2022-08-16 13:39, Marek Szyprowski wrote:
> Hi,
> 
> On 18.07.2022 08:28, Liu Song wrote:
>> From: Liu Song <liusong@linux.alibaba.com>
>>
>> In "dma_alloc_from_dev_coherent" and "dma_direct_alloc",
>> the allocated memory is explicitly set to 0.
>>
>> A helper function "use_dev_coherent_memory" is introduced here to
>> determine whether the memory is allocated by "dma_alloc_from_dev_coherent".
>>
>> And use "get_dma_ops" to determine whether the memory is allocated by
>> "dma_direct_alloc".
>>
>> After this modification, memory allocated using "dma_pool_zalloc" can avoid
>> duplicate memset.
>>
>> Signed-off-by: Liu Song <liusong@linux.alibaba.com>
> 
> This patch landed in linux-next (next-20220816). Unfortunately it causes
> serious issues on ARM 32bit systems. I've observed it on an ARM 32bit
> Samsung Exynos 5422 based Odroid XU4 board with the USB r8152 driver.
> After applying this patch and loading the r8152 driver I only see the
> following endless messages in the log:
> 
> xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
> xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
> xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
> 
> It looks like there are drivers which rely on the fact that the dma
> coherent buffers are always zeroed.

It's not even that, the change here is just obviously broken, since it 
ends up entirely ignoring want_init_on_alloc() for devices using 
dma-direct. Sure, the memory backing a dma_page is zeroed *once* by its 
initial dma-coherent allocation, but who says we're not reallocating
pool entries from an existing dma_page?
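
The reuse hazard can be reproduced in a minimal userspace sketch. This is
not kernel code: toy_page, toy_alloc() and toy_free() are hypothetical
stand-ins for a dma_page and the pool alloc/free paths, and calloc() plays
the role of the coherent allocation that zeroes the backing memory once.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 16
#define BLOCKS_PER_PAGE 4

/* Hypothetical stand-in for a dma_page: one buffer carved into blocks. */
struct toy_page {
	unsigned char *vaddr;
	int in_use[BLOCKS_PER_PAGE];
};

/* The backing memory is zeroed exactly once, when the page is created. */
static int toy_page_init(struct toy_page *page)
{
	page->vaddr = calloc(BLOCKS_PER_PAGE, BLOCK_SIZE);
	memset(page->in_use, 0, sizeof(page->in_use));
	return page->vaddr ? 0 : -1;
}

/* Hand out the first free block; zero it only if the caller asks. */
static unsigned char *toy_alloc(struct toy_page *page, int zero)
{
	for (int i = 0; i < BLOCKS_PER_PAGE; i++) {
		if (!page->in_use[i]) {
			unsigned char *block = page->vaddr + i * BLOCK_SIZE;

			page->in_use[i] = 1;
			if (zero)
				memset(block, 0, BLOCK_SIZE);
			return block;
		}
	}
	return NULL;
}

/* Return a block to the page without scrubbing its contents. */
static void toy_free(struct toy_page *page, unsigned char *block)
{
	page->in_use[(block - page->vaddr) / BLOCK_SIZE] = 0;
}

/*
 * Returns 1 if a recycled block still holds the previous user's data
 * when the per-allocation memset is skipped, -1 on allocation failure.
 */
static int toy_reuse_leaks_stale_data(void)
{
	struct toy_page page;
	unsigned char *first, *second;
	int stale;

	if (toy_page_init(&page))
		return -1;

	first = toy_alloc(&page, 0);
	memset(first, 0xAB, BLOCK_SIZE);	/* previous user dirties the block */
	toy_free(&page, first);

	second = toy_alloc(&page, 0);		/* same block, memset skipped */
	stale = (second == first) && (second[0] == 0xAB);

	free(page.vaddr);
	return stale;
}
```

Calling toy_reuse_leaks_stale_data() shows the recycled block handed back
with the old contents, while toy_alloc(&page, 1) on the same block returns
zeroed memory. The one-time zeroing of the page says nothing about blocks
that have already been used and freed.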

I'm not convinced it's worth trying to special-case this at all, since 
we can only do it reliably for the first pool entry allocated from a new 
dma_page, and that will only happen as the pool initially grows to a 
suitable size for its working set, after which no further new pages are 
likely to be allocated for the lifetime of the pool. Even if there is a 
case to be made for doing so, it would need to be based on the flow 
through dma_pool_alloc() itself, not some nonsense heuristic on the device.
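
As a rough illustration of what a decision "based on the flow through
dma_pool_alloc()" could look like, here is a userspace sketch under stated
assumptions. It is not a proposed patch: toy_pool, toy_pool_alloc() and the
memsets counter are hypothetical, calloc() stands in for a zeroing coherent
allocation, and the sketch always wants zeroed blocks rather than consulting
want_init_on_alloc().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 16
#define BLOCKS_PER_PAGE 4

/* Hypothetical single-page pool; the page is allocated lazily. */
struct toy_pool {
	unsigned char *page;		/* NULL until the pool first grows */
	bool in_use[BLOCKS_PER_PAGE];
	unsigned int memsets;		/* counts explicit zeroing, for observation */
};

static unsigned char *toy_pool_alloc(struct toy_pool *pool)
{
	bool fresh_page = false;

	if (!pool->page) {
		/* Pool grows: the new page arrives already zeroed. */
		pool->page = calloc(BLOCKS_PER_PAGE, BLOCK_SIZE);
		if (!pool->page)
			return NULL;
		fresh_page = true;
	}

	for (size_t i = 0; i < BLOCKS_PER_PAGE; i++) {
		unsigned char *block;

		if (pool->in_use[i])
			continue;

		block = pool->page + i * BLOCK_SIZE;
		pool->in_use[i] = true;

		/*
		 * Skip the memset only when this very call allocated the
		 * page; any other block may have been used and freed before.
		 */
		if (!fresh_page) {
			memset(block, 0, BLOCK_SIZE);
			pool->memsets++;
		}
		return block;
	}
	return NULL;
}

/* Return a block to the pool without scrubbing its contents. */
static void toy_pool_free(struct toy_pool *pool, unsigned char *block)
{
	pool->in_use[(block - pool->page) / BLOCK_SIZE] = false;
}
```

In this sketch the memset is avoided exactly once per new page, which is the
limited saving described above: every later allocation, including a recycled
block, still pays for the zeroing.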

Andrew, please drop this patch.

Thanks,
Robin.

>> ---
>>    include/linux/dma-map-ops.h | 5 +++++
>>    mm/dmapool.c                | 5 ++++-
>>    2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
>> index 0d5b06b..c29948d 100644
>> --- a/include/linux/dma-map-ops.h
>> +++ b/include/linux/dma-map-ops.h
>> @@ -171,6 +171,10 @@ int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
>>    int dma_release_from_dev_coherent(struct device *dev, int order, void *vaddr);
>>    int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
>>    		void *cpu_addr, size_t size, int *ret);
>> +static inline bool use_dev_coherent_memory(struct device *dev)
>> +{
>> +	return dev->dma_mem ? true : false;
>> +}
>>    #else
>>    static inline int dma_declare_coherent_memory(struct device *dev,
>>    		phys_addr_t phys_addr, dma_addr_t device_addr, size_t size)
>> @@ -180,6 +184,7 @@ static inline int dma_declare_coherent_memory(struct device *dev,
>>    #define dma_alloc_from_dev_coherent(dev, size, handle, ret) (0)
>>    #define dma_release_from_dev_coherent(dev, order, vaddr) (0)
>>    #define dma_mmap_from_dev_coherent(dev, vma, vaddr, order, ret) (0)
>> +#define use_dev_coherent_memory(dev) (0)
>>    #endif /* CONFIG_DMA_DECLARE_COHERENT */
>>    
>>    #ifdef CONFIG_DMA_GLOBAL_POOL
>> diff --git a/mm/dmapool.c b/mm/dmapool.c
>> index a7eb5d0..6e03530 100644
>> --- a/mm/dmapool.c
>> +++ b/mm/dmapool.c
>> @@ -21,6 +21,7 @@
>>    
>>    #include <linux/device.h>
>>    #include <linux/dma-mapping.h>
>> +#include <linux/dma-map-ops.h>
>>    #include <linux/dmapool.h>
>>    #include <linux/kernel.h>
>>    #include <linux/list.h>
>> @@ -372,7 +373,9 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
>>    #endif
>>    	spin_unlock_irqrestore(&pool->lock, flags);
>>    
>> -	if (want_init_on_alloc(mem_flags))
>> +	if (want_init_on_alloc(mem_flags) &&
>> +		!use_dev_coherent_memory(pool->dev) &&
>> +		get_dma_ops(pool->dev))
>>    		memset(retval, 0, pool->size);
>>    
>>    	return retval;
> 
> Best regards


* Re: [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc
  2022-08-16 15:00       ` Robin Murphy
@ 2022-08-17  2:03         ` Liu Song
  -1 siblings, 0 replies; 11+ messages in thread
From: Liu Song @ 2022-08-17  2:03 UTC (permalink / raw)
  To: Robin Murphy, Marek Szyprowski, akpm, linux-arm-kernel,
	Christoph Hellwig
  Cc: linux-mm, linux-kernel, iommu

> On 2022-08-16 13:39, Marek Szyprowski wrote:
>> Hi,
>>
>> On 18.07.2022 08:28, Liu Song wrote:
>>> From: Liu Song <liusong@linux.alibaba.com>
>>>
>>> In "dma_alloc_from_dev_coherent" and "dma_direct_alloc",
>>> the allocated memory is explicitly set to 0.
>>>
>>> A helper function "use_dev_coherent_memory" is introduced here to
>>> determine whether the memory is allocated by 
>>> "dma_alloc_from_dev_coherent".
>>>
>>> And use "get_dma_ops" to determine whether the memory is allocated by
>>> "dma_direct_alloc".
>>>
>>> After this modification, memory allocated using "dma_pool_zalloc" 
>>> can avoid
>>> duplicate memset.
>>>
>>> Signed-off-by: Liu Song <liusong@linux.alibaba.com>
>>
>> This patch landed in linux-next (next-20220816). Unfortunately it causes
>> serious issues on ARM 32bit systems. I've observed it on an ARM 32bit
>> Samsung Exynos 5422 based Odroid XU4 board with the USB r8152 driver.
>> After applying this patch and loading the r8152 driver I only see the
>> following endless messages in the log:
>>
>> xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
>> xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
>> xhci-hcd xhci-hcd.9.auto: WARN Event TRB for slot 1 ep 0 with no TDs queued?
>>
>> It looks like there are drivers which rely on the fact that the dma
>> coherent buffers are always zeroed.
>
> It's not even that, the change here is just obviously broken, since it 
> ends up entirely ignoring want_init_on_alloc() for devices using 
> dma-direct. Sure, the memory backing a dma_page is zeroed *once* by 
> its initial dma-coherent allocation, but who says we're not
> reallocating pool entries from an existing dma_page?
>
> I'm not convinced it's worth trying to special-case this at all, since 
> we can only do it reliably for the first pool entry allocated from a 
> new dma_page, and that will only happen as the pool initially grows to 
> a suitable size for its working set, after which no further new pages 
> are likely to be allocated for the lifetime of the pool. Even if there 
> is a case to be made for doing so, it would need to be based on the 
> flow through dma_pool_alloc() itself, not some nonsense heuristic on 
> the device.

Hi,

First of all, I am very sorry that there are branches I missed and did not
consider fully, but there is a possibility that the memset to 0 will be
repeated whether or not the memory is re-allocated from the dma pool, and
this patch needs to be fixed.


Thanks


>
> Andrew, please drop this patch.
>
> Thanks,
> Robin.
>
>>> ---
>>>    include/linux/dma-map-ops.h | 5 +++++
>>>    mm/dmapool.c                | 5 ++++-
>>>    2 files changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
>>> index 0d5b06b..c29948d 100644
>>> --- a/include/linux/dma-map-ops.h
>>> +++ b/include/linux/dma-map-ops.h
>>> @@ -171,6 +171,10 @@ int dma_alloc_from_dev_coherent(struct device *dev, ssize_t size,
>>>    int dma_release_from_dev_coherent(struct device *dev, int order, void *vaddr);
>>>    int dma_mmap_from_dev_coherent(struct device *dev, struct vm_area_struct *vma,
>>>    		void *cpu_addr, size_t size, int *ret);
>>> +static inline bool use_dev_coherent_memory(struct device *dev)
>>> +{
>>> +	return dev->dma_mem ? true : false;
>>> +}
>>>    #else
>>>    static inline int dma_declare_coherent_memory(struct device *dev,
>>>    		phys_addr_t phys_addr, dma_addr_t device_addr, size_t size)
>>> @@ -180,6 +184,7 @@ static inline int dma_declare_coherent_memory(struct device *dev,
>>>    #define dma_alloc_from_dev_coherent(dev, size, handle, ret) (0)
>>>    #define dma_release_from_dev_coherent(dev, order, vaddr) (0)
>>>    #define dma_mmap_from_dev_coherent(dev, vma, vaddr, order, ret) (0)
>>> +#define use_dev_coherent_memory(dev) (0)
>>>    #endif /* CONFIG_DMA_DECLARE_COHERENT */
>>>
>>>    #ifdef CONFIG_DMA_GLOBAL_POOL
>>> diff --git a/mm/dmapool.c b/mm/dmapool.c
>>> index a7eb5d0..6e03530 100644
>>> --- a/mm/dmapool.c
>>> +++ b/mm/dmapool.c
>>> @@ -21,6 +21,7 @@
>>>
>>>    #include <linux/device.h>
>>>    #include <linux/dma-mapping.h>
>>> +#include <linux/dma-map-ops.h>
>>>    #include <linux/dmapool.h>
>>>    #include <linux/kernel.h>
>>>    #include <linux/list.h>
>>> @@ -372,7 +373,9 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
>>>    #endif
>>>    	spin_unlock_irqrestore(&pool->lock, flags);
>>>
>>> -	if (want_init_on_alloc(mem_flags))
>>> +	if (want_init_on_alloc(mem_flags) &&
>>> +		!use_dev_coherent_memory(pool->dev) &&
>>> +		get_dma_ops(pool->dev))
>>>    		memset(retval, 0, pool->size);
>>>
>>>    	return retval;
>>
>> Best regards


* Re: [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc
  2022-08-16 15:00       ` Robin Murphy
@ 2022-08-17  5:36         ` Christoph Hellwig
  -1 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2022-08-17  5:36 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Marek Szyprowski, Liu Song, akpm, linux-arm-kernel,
	Christoph Hellwig, linux-mm, linux-kernel, iommu

>>> A helper function "use_dev_coherent_memory" is introduced here to
>>> determine whether the memory is allocated by "dma_alloc_from_dev_coherent".
>>>
>>> And use "get_dma_ops" to determine whether the memory is allocated by
>>> "dma_direct_alloc".

WTF?  get_dma_ops is private to the DMA API layer, and dmapool has no
business even using that.  Even independent of this particular case,
consumers of an API never have any business looking at the implementation
of the API, that is the whole point of the abstraction.

> It's not even that, the change here is just obviously broken, since it ends 
> up entirely ignoring want_init_on_alloc() for devices using dma-direct. 
> Sure, the memory backing a dma_page is zeroed *once* by its initial 
> dma-coherent allocation, but who says we're not reallocating pool
> entries from an existing dma_page?

And yes, in addition to that it also is completely broken.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc
  2022-08-17  5:36         ` Christoph Hellwig
@ 2022-08-18  8:37           ` Liu Song
  -1 siblings, 0 replies; 11+ messages in thread
From: Liu Song @ 2022-08-18  8:37 UTC (permalink / raw)
  To: Christoph Hellwig, Robin Murphy, akpm
  Cc: Marek Szyprowski, linux-arm-kernel, linux-mm, linux-kernel, iommu

>>>> A helper function "use_dev_coherent_memory" is introduced here to
>>>> determine whether the memory is allocated by "dma_alloc_from_dev_coherent".
>>>>
>>>> And use "get_dma_ops" to determine whether the memory is allocated by
>>>> "dma_direct_alloc".
> WTF?  get_dma_ops is private to the DMA API layer, and dmapool has no
> business even using that.  Even independent of this particular case,
> consumers of an API never have any business looking at the implementation
> of the API, that is the whole point of the abstraction.
>
>> It's not even that, the change here is just obviously broken, since it ends
>> up entirely ignoring want_init_on_alloc() for devices using dma-direct.
>> Sure, the memory backing a dma_page is zeroed *once* by its initial
>> dma-coherent allocation, but who says we're not reallocating pool
>> entries from an existing dma_page?
> And yes, in addition to that it also is completely broken.

After reading everyone's comments, I found that fixing this patch would
make the code look strange, and the remaining benefit would be negligible,
so I agree to drop this patch.

Sorry for the trouble again.

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-08-18  8:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-18  6:28 [PATCH v2] mm/dmapool.c: avoid duplicate memset within dma_pool_alloc Liu Song
     [not found] ` <CGME20220816123958eucas1p1b03a5efa1f5804245a5c1a9b27529015@eucas1p1.samsung.com>
2022-08-16 12:39   ` Marek Szyprowski
2022-08-16 12:39     ` Marek Szyprowski
2022-08-16 15:00     ` Robin Murphy
2022-08-16 15:00       ` Robin Murphy
2022-08-17  2:03       ` Liu Song
2022-08-17  2:03         ` Liu Song
2022-08-17  5:36       ` Christoph Hellwig
2022-08-17  5:36         ` Christoph Hellwig
2022-08-18  8:37         ` Liu Song
2022-08-18  8:37           ` Liu Song
