linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
@ 2017-01-10 14:18 Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 1/5] dma: Add simple dma_noop_mmap Vladimir Murzin
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-10 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

It seems that the addition of cache support for M-class cpus uncovered
a latent bug in DMA usage. The NOMMU memory model has been treated as
always consistent; however, for R/M classes of cpu the memory can be
covered by an MPU, which in turn might configure RAM as Normal,
i.e. bufferable and cacheable. This breaks dma_alloc_coherent() and
friends, since data can now get stuck in caches or be buffered.

This patch set addresses the issue by providing a region of memory
suitable for consistent DMA operations. It is assumed that such a
region is marked by the MPU as non-cacheable. Robin suggested
advertising such memory as a reserved shared-dma-pool, rather than
using a homebrew command line option, and extending dma-coherent to
provide a default DMA area in a similar way to what is done for CMA
(PATCH 2/5). That allows us to offload all bookkeeping onto the generic
coherent DMA framework, and it seems it might be reused by other
architectures such as c6x and blackfin.
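
For illustration, such a region could be declared along the following
lines (a hedged sketch: the label, address and size are made up, while
"shared-dma-pool", "no-map" and "linux,dma-default" come from PATCH 2/5):

	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		/* region the MPU is expected to map as non-cacheable */
		dma_coherent: dma@60000000 {
			compatible = "shared-dma-pool";
			reg = <0x60000000 0x100000>;
			no-map;
			linux,dma-default;
		};
	};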

A dedicated DMA region is required in all cases except the following:
 - the MMU/MPU is off
 - the cpu is v7m w/o cache support
 - the device is coherent

If one of the above conditions is true, DMA operations are forced to be
coherent and wired up with dma_noop_ops.

To make life easier, NOMMU DMA operations are kept in a separate
compilation unit.

Since the issue was reported at the same time as Benjamin sent his
patch [1] to allow mmap for NOMMU, his case is also addressed in this
series (PATCH 1/5 and PATCH 3/5).

Thanks!

[1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1

Vladimir Murzin (5):
  dma: Add simple dma_noop_mmap
  drivers: dma-coherent: Introduce default DMA pool
  ARM: NOMMU: Introduce dma operations for noMMU
  ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
  ARM: dma-mapping: Remove traces of NOMMU code

 .../bindings/reserved-memory/reserved-memory.txt   |   3 +
 arch/arm/include/asm/dma-mapping.h                 |   3 +-
 arch/arm/mm/Kconfig                                |   2 +-
 arch/arm/mm/Makefile                               |   5 +-
 arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
 arch/arm/mm/dma-mapping.c                          |  26 +--
 drivers/base/dma-coherent.c                        |  59 ++++-
 lib/dma-noop.c                                     |  21 ++
 8 files changed, 335 insertions(+), 36 deletions(-)
 create mode 100644 arch/arm/mm/dma-mapping-nommu.c

-- 
2.0.0

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 1/5] dma: Add simple dma_noop_mmap
  2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
@ 2017-01-10 14:18 ` Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 2/5] drivers: dma-coherent: Introduce default DMA pool Vladimir Murzin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-10 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds a simple implementation of mmap to dma_noop_ops.
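
As a hedged usage sketch (the foo_* names and fields below are made
up), a NOMMU driver could then back its mmap() file operation with the
buffer it previously obtained from dma_alloc_coherent():

	struct foo_dev {
		struct device	*dev;
		void		*vaddr;
		dma_addr_t	dma_handle;
		size_t		size;
	};

	static int foo_mmap(struct file *file, struct vm_area_struct *vma)
	{
		struct foo_dev *foo = file->private_data;

		/* ends up in dma_noop_mmap() when dma_noop_ops is in use */
		return dma_mmap_coherent(foo->dev, vma, foo->vaddr,
					 foo->dma_handle, foo->size);
	}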

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 lib/dma-noop.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/lib/dma-noop.c b/lib/dma-noop.c
index 3d766e7..96638a1 100644
--- a/lib/dma-noop.c
+++ b/lib/dma-noop.c
@@ -64,6 +64,26 @@ static int dma_noop_supported(struct device *dev, u64 mask)
 	return 1;
 }
 
+static int dma_noop_mmap(struct device *dev, struct vm_area_struct *vma,
+			 void *cpu_addr, dma_addr_t dma_addr, size_t size,
+			 unsigned long attrs)
+{
+	unsigned long user_count = vma_pages(vma);
+	unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	unsigned long pfn = page_to_pfn(virt_to_page(cpu_addr));
+	unsigned long off = vma->vm_pgoff;
+	int ret = -ENXIO;
+
+	if (off < count && user_count <= (count - off)) {
+		ret = remap_pfn_range(vma, vma->vm_start,
+				      pfn + off,
+				      user_count << PAGE_SHIFT,
+				      vma->vm_page_prot);
+	}
+
+	return ret;
+}
+
 struct dma_map_ops dma_noop_ops = {
 	.alloc			= dma_noop_alloc,
 	.free			= dma_noop_free,
@@ -71,6 +91,7 @@ struct dma_map_ops dma_noop_ops = {
 	.map_sg			= dma_noop_map_sg,
 	.mapping_error		= dma_noop_mapping_error,
 	.dma_supported		= dma_noop_supported,
+	.mmap			= dma_noop_mmap,
 };
 
 EXPORT_SYMBOL(dma_noop_ops);
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 2/5] drivers: dma-coherent: Introduce default DMA pool
  2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 1/5] dma: Add simple dma_noop_mmap Vladimir Murzin
@ 2017-01-10 14:18 ` Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 3/5] ARM: NOMMU: Introduce dma operations for noMMU Vladimir Murzin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-10 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

This patch introduces a default coherent DMA pool, similar to the
default CMA area concept. To keep other users safe, the code is kept
under CONFIG_ARM.
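
From a consumer's point of view nothing changes (hedged sketch, sizes
are arbitrary): a device with no per-device coherent region assigned
still gets its buffer served from the default pool through
dma_alloc_from_coherent():

	dma_addr_t dma;
	void *vaddr;

	vaddr = dma_alloc_coherent(dev, SZ_64K, &dma, GFP_KERNEL);
	if (!vaddr)
		return -ENOMEM;

	/* ... use the buffer ... */

	dma_free_coherent(dev, SZ_64K, vaddr, dma);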

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 .../bindings/reserved-memory/reserved-memory.txt   |  3 ++
 drivers/base/dma-coherent.c                        | 59 +++++++++++++++++++---
 2 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
index 3da0ebd..16291f2 100644
--- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
@@ -68,6 +68,9 @@ Linux implementation note:
 - If a "linux,cma-default" property is present, then Linux will use the
   region for the default pool of the contiguous memory allocator.
 
+- If a "linux,dma-default" property is present, then Linux will use the
+  region for the default pool of the consistent DMA allocator.
+
 Device node references to reserved memory
 -----------------------------------------
 Regions in the /reserved-memory node may be referenced by other device
diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 640a7e6..b52ba27 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -18,6 +18,15 @@ struct dma_coherent_mem {
 	spinlock_t	spinlock;
 };
 
+static struct dma_coherent_mem *dma_coherent_default_memory __ro_after_init;
+
+static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *dev)
+{
+	if (dev && dev->dma_mem)
+		return dev->dma_mem;
+	return dma_coherent_default_memory;
+}
+
 static bool dma_init_coherent_memory(
 	phys_addr_t phys_addr, dma_addr_t device_addr, size_t size, int flags,
 	struct dma_coherent_mem **mem)
@@ -83,6 +92,9 @@ static void dma_release_coherent_memory(struct dma_coherent_mem *mem)
 static int dma_assign_coherent_memory(struct device *dev,
 				      struct dma_coherent_mem *mem)
 {
+	if (!dev)
+		return -ENODEV;
+
 	if (dev->dma_mem)
 		return -EBUSY;
 
@@ -161,15 +173,12 @@ EXPORT_SYMBOL(dma_mark_declared_memory_occupied);
 int dma_alloc_from_coherent(struct device *dev, ssize_t size,
 				       dma_addr_t *dma_handle, void **ret)
 {
-	struct dma_coherent_mem *mem;
+	struct dma_coherent_mem *mem = dev_get_coherent_memory(dev);
 	int order = get_order(size);
 	unsigned long flags;
 	int pageno;
 	int dma_memory_map;
 
-	if (!dev)
-		return 0;
-	mem = dev->dma_mem;
 	if (!mem)
 		return 0;
 
@@ -223,7 +232,7 @@ EXPORT_SYMBOL(dma_alloc_from_coherent);
  */
 int dma_release_from_coherent(struct device *dev, int order, void *vaddr)
 {
-	struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+	struct dma_coherent_mem *mem = dev_get_coherent_memory(dev);
 
 	if (mem && vaddr >= mem->virt_base && vaddr <
 		   (mem->virt_base + (mem->size << PAGE_SHIFT))) {
@@ -257,7 +266,7 @@ EXPORT_SYMBOL(dma_release_from_coherent);
 int dma_mmap_from_coherent(struct device *dev, struct vm_area_struct *vma,
 			   void *vaddr, size_t size, int *ret)
 {
-	struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+	struct dma_coherent_mem *mem = dev_get_coherent_memory(dev);
 
 	if (mem && vaddr >= mem->virt_base && vaddr + size <=
 		   (mem->virt_base + (mem->size << PAGE_SHIFT))) {
@@ -287,6 +296,8 @@ EXPORT_SYMBOL(dma_mmap_from_coherent);
 #include <linux/of_fdt.h>
 #include <linux/of_reserved_mem.h>
 
+static struct reserved_mem *dma_reserved_default_memory __initdata;
+
 static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
 {
 	struct dma_coherent_mem *mem = rmem->priv;
@@ -307,7 +318,8 @@ static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
 static void rmem_dma_device_release(struct reserved_mem *rmem,
 				    struct device *dev)
 {
-	dev->dma_mem = NULL;
+	if (dev)
+		dev->dma_mem = NULL;
 }
 
 static const struct reserved_mem_ops rmem_dma_ops = {
@@ -327,6 +339,12 @@ static int __init rmem_dma_setup(struct reserved_mem *rmem)
 		pr_err("Reserved memory: regions without no-map are not yet supported\n");
 		return -EINVAL;
 	}
+
+	if (of_get_flat_dt_prop(node, "linux,dma-default", NULL)) {
+		WARN(dma_reserved_default_memory,
+		     "Reserved memory: region for default DMA coherent area is redefined\n");
+		dma_reserved_default_memory = rmem;
+	}
 #endif
 
 	rmem->ops = &rmem_dma_ops;
@@ -334,5 +352,32 @@ static int __init rmem_dma_setup(struct reserved_mem *rmem)
 		&rmem->base, (unsigned long)rmem->size / SZ_1M);
 	return 0;
 }
+
+static int __init dma_init_reserved_memory(void)
+{
+	const struct reserved_mem_ops *ops;
+	int ret;
+
+	if (!dma_reserved_default_memory)
+		return -ENOMEM;
+
+	ops = dma_reserved_default_memory->ops;
+
+	/*
+	 * We rely on rmem_dma_device_init() not propagating the error from
+	 * dma_assign_coherent_memory() for a NULL device.
+	 */
+	ret = ops->device_init(dma_reserved_default_memory, NULL);
+
+	if (!ret) {
+		dma_coherent_default_memory = dma_reserved_default_memory->priv;
+		pr_info("DMA: default coherent area is set\n");
+	}
+
+	return ret;
+}
+
+core_initcall(dma_init_reserved_memory);
+
 RESERVEDMEM_OF_DECLARE(dma, "shared-dma-pool", rmem_dma_setup);
 #endif
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 3/5] ARM: NOMMU: Introduce dma operations for noMMU
  2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 1/5] dma: Add simple dma_noop_mmap Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 2/5] drivers: dma-coherent: Introduce default DMA pool Vladimir Murzin
@ 2017-01-10 14:18 ` Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 4/5] ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus Vladimir Murzin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-10 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

R/M classes of cpus can have memory covered by MPU which in turn might
configure RAM as Normal i.e. bufferable and cacheable. It breaks
dma_alloc_coherent() and friends, since data can stuck in caches now
or be buffered.

This patch factors out DMA support for NOMMU configuration into
separate entity which provides dedicated dma_ops. We have to handle
there several cases:
- configurations with MMU/MPU setup
- configurations without MMU/MPU setup
- special case for M-class, since caches and MPU there are optional

In general we rely on default DMA area for coherent allocations or/and
per-device memory reserves suitable for coherent DMA, so if such
regions are set coherent allocations go from there.

In case MPU/MPU was not setup we fallback to normal page allocator for
DMA memory allocation.

In case we run M-class cpus, for configuration without cache support
(like Cortex-M3/M4) dma operations are forced to be coherent and wired
with dma-noop (such decision is made based on cacheid global
variable); however, if caches are detected there and no DMA coherent
region is given (either default or per-device), dma is disallowed even
MPU is not set - it is because M-class implement system memory map
which defines part of address space as Normal memory.

Reported-by: Alexandre Torgue <alexandre.torgue@st.com>
Reported-by: Andras Szemzo <sza@esh.hu>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/include/asm/dma-mapping.h |   3 +-
 arch/arm/mm/Makefile               |   5 +-
 arch/arm/mm/dma-mapping-nommu.c    | 252 +++++++++++++++++++++++++++++++++++++
 3 files changed, 256 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm/mm/dma-mapping-nommu.c

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index bf02dbd..559faad 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -20,7 +20,8 @@ static inline struct dma_map_ops *__generic_dma_ops(struct device *dev)
 {
 	if (dev && dev->archdata.dma_ops)
 		return dev->archdata.dma_ops;
-	return &arm_dma_ops;
+
+	return IS_ENABLED(CONFIG_MMU) ? &arm_dma_ops : &dma_noop_ops;
 }
 
 static inline struct dma_map_ops *get_dma_ops(struct device *dev)
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 2ac7988..5796357 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -2,9 +2,8 @@
 # Makefile for the linux arm-specific parts of the memory manager.
 #
 
-obj-y				:= dma-mapping.o extable.o fault.o init.o \
-				   iomap.o
-
+obj-y				:= extable.o fault.o init.o iomap.o
+obj-y				+= dma-mapping$(MMUEXT).o
 obj-$(CONFIG_MMU)		+= fault-armv.o flush.o idmap.o ioremap.o \
 				   mmap.o pgd.o mmu.o pageattr.o
 
diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
new file mode 100644
index 0000000..76f00c9
--- /dev/null
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -0,0 +1,252 @@
+/*
+ *  Based on linux/arch/arm/mm/dma-mapping.c
+ *
+ *  Copyright (C) 2000-2004 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/export.h>
+#include <linux/mm.h>
+#include <linux/dma-mapping.h>
+#include <linux/scatterlist.h>
+
+#include <asm/cachetype.h>
+#include <asm/cacheflush.h>
+#include <asm/outercache.h>
+#include <asm/cp15.h>
+
+#include "dma.h"
+
+/*
+ *  dma_noop_ops is used if
+ *   - MMU/MPU is off
+ *   - cpu is v7m w/o cache support
+ *   - device is coherent
+ *  otherwise arm_nommu_dma_ops is used.
+ *
+ *  arm_nommu_dma_ops relies on consistent DMA memory (please refer to
+ *  [1] on how to declare such memory).
+ *
+ *  [1] Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+ */
+
+static void *arm_nommu_dma_alloc(struct device *dev, size_t size,
+				 dma_addr_t *dma_handle, gfp_t gfp,
+				 unsigned long attrs)
+
+{
+	struct dma_map_ops *ops = &dma_noop_ops;
+
+	/*
+	 * We are here because either:
+	 * - no consistent DMA region has been defined, so we can't
+	 *   continue, or
+	 * - there is no space left in the consistent DMA region, so we
+	 *   can only fall back to the generic allocator if the caller
+	 *   has advertised that consistency is not required.
+	 */
+
+	if (attrs & DMA_ATTR_NON_CONSISTENT)
+		return ops->alloc(dev, size, dma_handle, gfp, attrs);
+
+	WARN_ON_ONCE(1);
+	return NULL;
+}
+
+static void arm_nommu_dma_free(struct device *dev, size_t size,
+			       void *cpu_addr, dma_addr_t dma_addr,
+			       unsigned long attrs)
+{
+	struct dma_map_ops *ops = &dma_noop_ops;
+
+	if (attrs & DMA_ATTR_NON_CONSISTENT) {
+		ops->free(dev, size, cpu_addr, dma_addr, attrs);
+		return;
+	}
+	WARN_ON_ONCE(1);
+}
+
+static int arm_nommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+			      void *cpu_addr, dma_addr_t dma_addr, size_t size,
+			      unsigned long attrs)
+{
+	struct dma_map_ops *ops = &dma_noop_ops;
+	int ret;
+
+	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
+		return ret;
+
+	if (attrs & DMA_ATTR_NON_CONSISTENT)
+		return ops->mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+
+	WARN_ON_ONCE(1);
+	return -ENXIO;
+}
+
+static void __dma_page_cpu_to_dev(phys_addr_t paddr, size_t size,
+				  enum dma_data_direction dir)
+{
+	dmac_map_area(__va(paddr), size, dir);
+
+	if (dir == DMA_FROM_DEVICE)
+		outer_inv_range(paddr, paddr + size);
+	else
+		outer_clean_range(paddr, paddr + size);
+}
+
+static void __dma_page_dev_to_cpu(phys_addr_t paddr, size_t size,
+				  enum dma_data_direction dir)
+{
+	if (dir != DMA_TO_DEVICE) {
+		outer_inv_range(paddr, paddr + size);
+		dmac_unmap_area(__va(paddr), size, dir);
+	}
+}
+
+static dma_addr_t arm_nommu_dma_map_page(struct device *dev, struct page *page,
+					 unsigned long offset, size_t size,
+					 enum dma_data_direction dir,
+					 unsigned long attrs)
+{
+	dma_addr_t handle = page_to_phys(page) + offset;
+
+	__dma_page_cpu_to_dev(handle, size, dir);
+
+	return handle;
+}
+
+static void arm_nommu_dma_unmap_page(struct device *dev, dma_addr_t handle,
+				     size_t size, enum dma_data_direction dir,
+				     unsigned long attrs)
+{
+	__dma_page_dev_to_cpu(handle, size, dir);
+}
+
+
+static int arm_nommu_dma_map_sg(struct device *dev, struct scatterlist *sgl,
+				int nents, enum dma_data_direction dir,
+				unsigned long attrs)
+{
+	int i;
+	struct scatterlist *sg;
+
+	for_each_sg(sgl, sg, nents, i) {
+		sg_dma_address(sg) = sg_phys(sg);
+		sg_dma_len(sg) = sg->length;
+		__dma_page_cpu_to_dev(sg_dma_address(sg), sg_dma_len(sg), dir);
+	}
+
+	return nents;
+}
+
+static void arm_nommu_dma_unmap_sg(struct device *dev, struct scatterlist *sgl,
+				   int nents, enum dma_data_direction dir,
+				   unsigned long attrs)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i)
+		__dma_page_dev_to_cpu(sg_dma_address(sg), sg_dma_len(sg), dir);
+}
+
+static void arm_nommu_dma_sync_single_for_device(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	__dma_page_cpu_to_dev(handle, size, dir);
+}
+
+static void arm_nommu_dma_sync_single_for_cpu(struct device *dev,
+		dma_addr_t handle, size_t size, enum dma_data_direction dir)
+{
+	__dma_page_dev_to_cpu(handle, size, dir);
+}
+
+static void arm_nommu_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
+					     int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i)
+		__dma_page_cpu_to_dev(sg_dma_address(sg), sg_dma_len(sg), dir);
+}
+
+static void arm_nommu_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
+					  int nents, enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i)
+		__dma_page_dev_to_cpu(sg_dma_address(sg), sg_dma_len(sg), dir);
+}
+
+struct dma_map_ops arm_nommu_dma_ops = {
+	.alloc			= arm_nommu_dma_alloc,
+	.free			= arm_nommu_dma_free,
+	.mmap			= arm_nommu_dma_mmap,
+	.map_page		= arm_nommu_dma_map_page,
+	.unmap_page		= arm_nommu_dma_unmap_page,
+	.map_sg			= arm_nommu_dma_map_sg,
+	.unmap_sg		= arm_nommu_dma_unmap_sg,
+	.sync_single_for_device	= arm_nommu_dma_sync_single_for_device,
+	.sync_single_for_cpu	= arm_nommu_dma_sync_single_for_cpu,
+	.sync_sg_for_device	= arm_nommu_dma_sync_sg_for_device,
+	.sync_sg_for_cpu	= arm_nommu_dma_sync_sg_for_cpu,
+};
+EXPORT_SYMBOL(arm_nommu_dma_ops);
+
+static struct dma_map_ops *arm_nommu_get_dma_map_ops(bool coherent)
+{
+	return coherent ? &dma_noop_ops : &arm_nommu_dma_ops;
+}
+
+void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+			const struct iommu_ops *iommu, bool coherent)
+{
+	struct dma_map_ops *dma_ops;
+
+	if (IS_ENABLED(CONFIG_CPU_V7M)) {
+		/*
+		 * Cache support for v7m is optional, so the device can be
+		 * treated as coherent if no cache has been detected. Note that
+		 * it is not enough to check whether the MPU is in use, since
+		 * in the absence of an MPU the system memory map is used.
+		 */
+		dev->archdata.dma_coherent = (cacheid) ? coherent : true;
+	} else {
+		/*
+		 * Assume coherent DMA in case MMU/MPU has not been set up.
+		 */
+		dev->archdata.dma_coherent = (get_cr() & CR_M) ? coherent : true;
+	}
+
+	dma_ops = arm_nommu_get_dma_map_ops(dev->archdata.dma_coherent);
+
+	set_dma_ops(dev, dma_ops);
+}
+
+void arch_teardown_dma_ops(struct device *dev)
+{
+}
+
+int dma_supported(struct device *dev, u64 mask)
+{
+	return 1;
+}
+
+EXPORT_SYMBOL(dma_supported);
+
+#define PREALLOC_DMA_DEBUG_ENTRIES	4096
+
+static int __init dma_debug_do_init(void)
+{
+	dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES);
+	return 0;
+}
+core_initcall(dma_debug_do_init);
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 4/5] ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
  2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
                   ` (2 preceding siblings ...)
  2017-01-10 14:18 ` [RFC PATCH v4 3/5] ARM: NOMMU: Introduce dma operations for noMMU Vladimir Murzin
@ 2017-01-10 14:18 ` Vladimir Murzin
  2017-01-10 14:18 ` [RFC PATCH v4 5/5] ARM: dma-mapping: Remove traces of NOMMU code Vladimir Murzin
  2017-01-11 13:17 ` [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Benjamin Gaignard
  5 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-10 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

Now we have a dedicated non-cacheable region for consistent DMA
operations. However, that region can still be marked as bufferable by
the MPU, so it is safer to have barriers by default.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/mm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 0b79f12..64a1465c 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -1029,7 +1029,7 @@ config ARM_L1_CACHE_SHIFT
 
 config ARM_DMA_MEM_BUFFERABLE
 	bool "Use non-cacheable memory for DMA" if (CPU_V6 || CPU_V6K) && !CPU_V7
-	default y if CPU_V6 || CPU_V6K || CPU_V7
+	default y if CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7M
 	help
 	  Historically, the kernel has used strongly ordered mappings to
 	  provide DMA coherent memory.  With the advent of ARMv7, mapping
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 5/5] ARM: dma-mapping: Remove traces of NOMMU code
  2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
                   ` (3 preceding siblings ...)
  2017-01-10 14:18 ` [RFC PATCH v4 4/5] ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus Vladimir Murzin
@ 2017-01-10 14:18 ` Vladimir Murzin
  2017-01-11 13:17 ` [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Benjamin Gaignard
  5 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-10 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

DMA operations for the NOMMU case have just been factored out into a
separate compilation unit, so don't keep the dead code around.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm/mm/dma-mapping.c | 26 ++------------------------
 1 file changed, 2 insertions(+), 24 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ab77100..d8a755b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -344,8 +344,6 @@ static void __dma_free_buffer(struct page *page, size_t size)
 	}
 }
 
-#ifdef CONFIG_MMU
-
 static void *__alloc_from_contiguous(struct device *dev, size_t size,
 				     pgprot_t prot, struct page **ret_page,
 				     const void *caller, bool want_vaddr,
@@ -646,22 +644,6 @@ static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot)
 	return prot;
 }
 
-#define nommu() 0
-
-#else	/* !CONFIG_MMU */
-
-#define nommu() 1
-
-#define __get_dma_pgprot(attrs, prot)				__pgprot(0)
-#define __alloc_remap_buffer(dev, size, gfp, prot, ret, c, wv)	NULL
-#define __alloc_from_pool(size, ret_page)			NULL
-#define __alloc_from_contiguous(dev, size, prot, ret, c, wv, coherent_flag)	NULL
-#define __free_from_pool(cpu_addr, size)			do { } while (0)
-#define __free_from_contiguous(dev, page, cpu_addr, size, wv)	do { } while (0)
-#define __dma_free_remap(cpu_addr, size)			do { } while (0)
-
-#endif	/* CONFIG_MMU */
-
 static void *__alloc_simple_buffer(struct device *dev, size_t size, gfp_t gfp,
 				   struct page **ret_page)
 {
@@ -803,7 +785,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 
 	if (cma)
 		buf->allocator = &cma_allocator;
-	else if (nommu() || is_coherent)
+	else if (is_coherent)
 		buf->allocator = &simple_allocator;
 	else if (allowblock)
 		buf->allocator = &remap_allocator;
@@ -852,8 +834,7 @@ static int __arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		 unsigned long attrs)
 {
-	int ret = -ENXIO;
-#ifdef CONFIG_MMU
+	int ret = -ENXIO;
 	unsigned long nr_vma_pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
 	unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	unsigned long pfn = dma_to_pfn(dev, dma_addr);
@@ -868,7 +849,6 @@ static int __arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 				      vma->vm_end - vma->vm_start,
 				      vma->vm_page_prot);
 	}
-#endif	/* CONFIG_MMU */
 
 	return ret;
 }
@@ -887,9 +867,7 @@ int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		 unsigned long attrs)
 {
-#ifdef CONFIG_MMU
 	vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
-#endif	/* CONFIG_MMU */
 	return __arm_dma_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
 }
 
-- 
2.0.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
                   ` (4 preceding siblings ...)
  2017-01-10 14:18 ` [RFC PATCH v4 5/5] ARM: dma-mapping: Remove traces of NOMMU code Vladimir Murzin
@ 2017-01-11 13:17 ` Benjamin Gaignard
  2017-01-11 14:34   ` Vladimir Murzin
  5 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2017-01-11 13:17 UTC (permalink / raw)
  To: linux-arm-kernel

2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
> Hi,
>
> It seem that addition of cache support for M-class cpus uncovered
> latent bug in DMA usage. NOMMU memory model has been treated as being
> always consistent; however, for R/M classes of cpu memory can be
> covered by MPU which in turn might configure RAM as Normal
> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
> friends, since data can stuck in caches now or be buffered.
>
> This patch set is trying to address the issue by providing region of
> memory suitable for consistent DMA operations. It is supposed that
> such region is marked by MPU as non-cacheable. Robin suggested to
> advertise such memory as reserved shared-dma-pool, rather then using
> homebrew command line option, and extend dma-coherent to provide
> default DMA area in the similar way as it is done for CMA (PATCH
> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
> framework, and it is seems that it might be reused by other
> architectures like c6x and blackfin.
>
> Dedicated DMA region is required for cases other than:
>  - MMU/MPU is off
>  - cpu is v7m w/o cache support
>  - device is coherent
>
> In case one of the above conditions is true dma operations are forced
> to be coherent and wired with dma_noop_ops.
>
> To make life easier NOMMU dma operations are kept in separate
> compilation unit.
>
> Since the issue was reported in the same time as Benjamin sent his
> patch [1] to allow mmap for NOMMU, his case is also addressed in this
> series (PATCH 1/5 and PATCH 3/5).
>
> Thanks!

I have tested this v4 on my setup (stm32f4, no cache, no MPU) and
unfortunately it doesn't work with my drm/kms driver.
I don't get any errors, but nothing is displayed, unlike what I see
when using the current dma-mapping code.
I guess the issue comes from dma-noop, where __get_free_pages() is
used instead of the alloc_pages() used in dma-mapping.

Since my hardware has neither cache nor MPU (and so uses dma-noop), I
haven't reserved a specific memory region.
Buffer addresses and vma parameters look correct... What could I have
missed here?

Benjamin

>
> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>
> Vladimir Murzin (5):
>   dma: Add simple dma_noop_mmap
>   drivers: dma-coherent: Introduce default DMA pool
>   ARM: NOMMU: Introduce dma operations for noMMU
>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>   ARM: dma-mapping: Remove traces of NOMMU code
>
>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>  arch/arm/mm/Kconfig                                |   2 +-
>  arch/arm/mm/Makefile                               |   5 +-
>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>  arch/arm/mm/dma-mapping.c                          |  26 +--
>  drivers/base/dma-coherent.c                        |  59 ++++-
>  lib/dma-noop.c                                     |  21 ++
>  8 files changed, 335 insertions(+), 36 deletions(-)
>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>
> --
> 2.0.0
>



-- 
Benjamin Gaignard

Graphic Study Group

Linaro.org | Open source software for ARM SoCs

Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-11 13:17 ` [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Benjamin Gaignard
@ 2017-01-11 14:34   ` Vladimir Murzin
  2017-01-12 10:35     ` Benjamin Gaignard
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-11 14:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/01/17 13:17, Benjamin Gaignard wrote:
> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>> Hi,
>>
>> It seem that addition of cache support for M-class cpus uncovered
>> latent bug in DMA usage. NOMMU memory model has been treated as being
>> always consistent; however, for R/M classes of cpu memory can be
>> covered by MPU which in turn might configure RAM as Normal
>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>> friends, since data can stuck in caches now or be buffered.
>>
>> This patch set is trying to address the issue by providing region of
>> memory suitable for consistent DMA operations. It is supposed that
>> such region is marked by MPU as non-cacheable. Robin suggested to
>> advertise such memory as reserved shared-dma-pool, rather then using
>> homebrew command line option, and extend dma-coherent to provide
>> default DMA area in the similar way as it is done for CMA (PATCH
>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>> framework, and it is seems that it might be reused by other
>> architectures like c6x and blackfin.
>>
>> Dedicated DMA region is required for cases other than:
>>  - MMU/MPU is off
>>  - cpu is v7m w/o cache support
>>  - device is coherent
>>
>> In case one of the above conditions is true dma operations are forced
>> to be coherent and wired with dma_noop_ops.
>>
>> To make life easier NOMMU dma operations are kept in separate
>> compilation unit.
>>
>> Since the issue was reported in the same time as Benjamin sent his
>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>> series (PATCH 1/5 and PATCH 3/5).
>>
>> Thanks!
> 
> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
> it doesn't work with my drm/kms driver.

I guess the same goes for fbmem, but it would be better to have
confirmation, since the amba-clcd I use has not been ported to drm/kms
(yet), so I can't test.

> I haven't any errors but nothing is displayed unlike what I have when
> using current dma-mapping
> code.
> I guess the issue is coming from dma-noop where __get_free_pages() is
> used instead of alloc_pages()
> in dma-mapping.

Unless I've missed something, below are the call stacks for both:

#1
__alloc_simple_buffer
	__dma_alloc_buffer
		alloc_pages
		split_page
		__dma_clear_buffer
			memset
	page_address

#2
__get_free_pages
	alloc_pages
	page_address

So the difference is that the nommu case in dma-mapping.c zeroes the
memory, handles DMA_ATTR_NO_KERNEL_MAPPING and optimises memory usage.

Is any of the above critical for your driver?

> 
> Since my hardware doesn't have cache or MPU (and so use dma-noop) I
> haven't reserved specific memory region.
> Buffer addresses and vma parameters look correct... What could I have
> miss here ?

No ideas, sorry...

Cheers
Vladimir

> 
> Benjamin
> 
>>
>> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>>
>> Vladimir Murzin (5):
>>   dma: Add simple dma_noop_mmap
>>   drivers: dma-coherent: Introduce default DMA pool
>>   ARM: NOMMU: Introduce dma operations for noMMU
>>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>>   ARM: dma-mapping: Remove traces of NOMMU code
>>
>>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>>  arch/arm/mm/Kconfig                                |   2 +-
>>  arch/arm/mm/Makefile                               |   5 +-
>>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>>  arch/arm/mm/dma-mapping.c                          |  26 +--
>>  drivers/base/dma-coherent.c                        |  59 ++++-
>>  lib/dma-noop.c                                     |  21 ++
>>  8 files changed, 335 insertions(+), 36 deletions(-)
>>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>>
>> --
>> 2.0.0
>>
> 
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-11 14:34   ` Vladimir Murzin
@ 2017-01-12 10:35     ` Benjamin Gaignard
  2017-01-12 10:55       ` Benjamin Gaignard
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2017-01-12 10:35 UTC (permalink / raw)
  To: linux-arm-kernel

2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
> On 11/01/17 13:17, Benjamin Gaignard wrote:
>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>> Hi,
>>>
>>> It seem that addition of cache support for M-class cpus uncovered
>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>> always consistent; however, for R/M classes of cpu memory can be
>>> covered by MPU which in turn might configure RAM as Normal
>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>> friends, since data can stuck in caches now or be buffered.
>>>
>>> This patch set is trying to address the issue by providing region of
>>> memory suitable for consistent DMA operations. It is supposed that
>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>> advertise such memory as reserved shared-dma-pool, rather then using
>>> homebrew command line option, and extend dma-coherent to provide
>>> default DMA area in the similar way as it is done for CMA (PATCH
>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>> framework, and it is seems that it might be reused by other
>>> architectures like c6x and blackfin.
>>>
>>> Dedicated DMA region is required for cases other than:
>>>  - MMU/MPU is off
>>>  - cpu is v7m w/o cache support
>>>  - device is coherent
>>>
>>> In case one of the above conditions is true dma operations are forced
>>> to be coherent and wired with dma_noop_ops.
>>>
>>> To make life easier NOMMU dma operations are kept in separate
>>> compilation unit.
>>>
>>> Since the issue was reported in the same time as Benjamin sent his
>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>> series (PATCH 1/5 and PATCH 3/5).
>>>
>>> Thanks!
>>
>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>> it doesn't work with my drm/kms driver.
>
> I guess the same is for fbmem, but would be better to have confirmation since
> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>
>> I haven't any errors but nothing is displayed unlike what I have when
>> using current dma-mapping
>> code.
>> I guess the issue is coming from dma-noop where __get_free_pages() is
>> used instead of alloc_pages()
>> in dma-mapping.
>
> Unless I've missed something bellow is a call stack for both
>
> #1
> __alloc_simple_buffer
>         __dma_alloc_buffer
>                 alloc_pages
>                 split_page
>                 __dma_clear_buffer
>                         memset
>         page_address
>
> #2
> __get_free_pages
>         alloc_pages
>         page_address
>
> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>
> Is something from above critical for your driver?

I have removed all the differences (split_page, __dma_clear_buffer,
memset) from #1 and it still works.
The DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.

I have investigated further and found that dma-noop doesn't take care
of the "dma-ranges" property which is set in DT.
I believe that is the root cause of my problem with your patches.

Benjamin

>
>>
>> Since my hardware doesn't have cache or MPU (and so use dma-noop) I
>> haven't reserved specific memory region.
>> Buffer addresses and vma parameters look correct... What could I have
>> miss here ?
>
> No ideas, sorry...
>
> Cheers
> Vladimir
>
>>
>> Benjamin
>>
>>>
>>> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>>>
>>> Vladimir Murzin (5):
>>>   dma: Add simple dma_noop_mmap
>>>   drivers: dma-coherent: Introduce default DMA pool
>>>   ARM: NOMMU: Introduce dma operations for noMMU
>>>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>>>   ARM: dma-mapping: Remove traces of NOMMU code
>>>
>>>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>>>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>>>  arch/arm/mm/Kconfig                                |   2 +-
>>>  arch/arm/mm/Makefile                               |   5 +-
>>>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>>>  arch/arm/mm/dma-mapping.c                          |  26 +--
>>>  drivers/base/dma-coherent.c                        |  59 ++++-
>>>  lib/dma-noop.c                                     |  21 ++
>>>  8 files changed, 335 insertions(+), 36 deletions(-)
>>>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>>>
>>> --
>>> 2.0.0
>>>
>>
>>
>>
>



-- 
Benjamin Gaignard

Graphic Study Group

Linaro.org | Open source software for ARM SoCs

Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-12 10:35     ` Benjamin Gaignard
@ 2017-01-12 10:55       ` Benjamin Gaignard
  2017-01-12 16:52         ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2017-01-12 10:55 UTC (permalink / raw)
  To: linux-arm-kernel

2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>> Hi,
>>>>
>>>> It seem that addition of cache support for M-class cpus uncovered
>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>> always consistent; however, for R/M classes of cpu memory can be
>>>> covered by MPU which in turn might configure RAM as Normal
>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>> friends, since data can stuck in caches now or be buffered.
>>>>
>>>> This patch set is trying to address the issue by providing region of
>>>> memory suitable for consistent DMA operations. It is supposed that
>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>> homebrew command line option, and extend dma-coherent to provide
>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>> framework, and it is seems that it might be reused by other
>>>> architectures like c6x and blackfin.
>>>>
>>>> Dedicated DMA region is required for cases other than:
>>>>  - MMU/MPU is off
>>>>  - cpu is v7m w/o cache support
>>>>  - device is coherent
>>>>
>>>> In case one of the above conditions is true dma operations are forced
>>>> to be coherent and wired with dma_noop_ops.
>>>>
>>>> To make life easier NOMMU dma operations are kept in separate
>>>> compilation unit.
>>>>
>>>> Since the issue was reported in the same time as Benjamin sent his
>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>
>>>> Thanks!
>>>
>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>> it doesn't work with my drm/kms driver.
>>
>> I guess the same is for fbmem, but would be better to have confirmation since
>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>
>>> I haven't any errors but nothing is displayed unlike what I have when
>>> using current dma-mapping
>>> code.
>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>> used instead of alloc_pages()
>>> in dma-mapping.
>>
>> Unless I've missed something bellow is a call stack for both
>>
>> #1
>> __alloc_simple_buffer
>>         __dma_alloc_buffer
>>                 alloc_pages
>>                 split_page
>>                 __dma_clear_buffer
>>                         memset
>>         page_address
>>
>> #2
>> __get_free_pages
>>         alloc_pages
>>         page_address
>>
>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>
>> Is something from above critical for your driver?
>
> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
> from #1 and it is still working.
> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>
> I have investigated more and found that dma-noop doesn't take care of
> "dma-ranges" property which is set in DT.
> I believed that is the root cause of my problem with your patches.

After further testing, changing virt_to_phys to virt_to_dma in
dma-noop.c fixes the issue; modetest and fbdemo are functional again.
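
Roughly, the change I tested boils down to the following in
dma_noop_alloc() (ARM-only, since virt_to_dma() is an ARM-specific
helper that honours dma_pfn_offset):

	ret = (void *)__get_free_pages(gfp, get_order(size));
	if (ret)
		*dma_handle = virt_to_dma(dev, ret);	/* was virt_to_phys(ret) */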

>
> Benjamin
>
>>
>>>
>>> Since my hardware doesn't have cache or MPU (and so use dma-noop) I
>>> haven't reserved specific memory region.
>>> Buffer addresses and vma parameters look correct... What could I have
>>> miss here ?
>>
>> No ideas, sorry...
>>
>> Cheers
>> Vladimir
>>
>>>
>>> Benjamin
>>>
>>>>
>>>> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>>>>
>>>> Vladimir Murzin (5):
>>>>   dma: Add simple dma_noop_mmap
>>>>   drivers: dma-coherent: Introduce default DMA pool
>>>>   ARM: NOMMU: Introduce dma operations for noMMU
>>>>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>>>>   ARM: dma-mapping: Remove traces of NOMMU code
>>>>
>>>>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>>>>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>>>>  arch/arm/mm/Kconfig                                |   2 +-
>>>>  arch/arm/mm/Makefile                               |   5 +-
>>>>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>>>>  arch/arm/mm/dma-mapping.c                          |  26 +--
>>>>  drivers/base/dma-coherent.c                        |  59 ++++-
>>>>  lib/dma-noop.c                                     |  21 ++
>>>>  8 files changed, 335 insertions(+), 36 deletions(-)
>>>>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>>>>
>>>> --
>>>> 2.0.0
>>>>
>>>
>>>
>>>
>>
>
>
>
> --
> Benjamin Gaignard
>
> Graphic Study Group
>
> Linaro.org ? Open source software for ARM SoCs
>
> Follow Linaro: Facebook | Twitter | Blog



-- 
Benjamin Gaignard

Graphic Study Group

Linaro.org | Open source software for ARM SoCs

Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-12 10:55       ` Benjamin Gaignard
@ 2017-01-12 16:52         ` Vladimir Murzin
  2017-01-12 17:04           ` Robin Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-12 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/01/17 10:55, Benjamin Gaignard wrote:
> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>> Hi,
>>>>>
>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>
>>>>> This patch set is trying to address the issue by providing region of
>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>> framework, and it is seems that it might be reused by other
>>>>> architectures like c6x and blackfin.
>>>>>
>>>>> Dedicated DMA region is required for cases other than:
>>>>>  - MMU/MPU is off
>>>>>  - cpu is v7m w/o cache support
>>>>>  - device is coherent
>>>>>
>>>>> In case one of the above conditions is true dma operations are forced
>>>>> to be coherent and wired with dma_noop_ops.
>>>>>
>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>> compilation unit.
>>>>>
>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>
>>>>> Thanks!
>>>>
>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>> it doesn't work with my drm/kms driver.
>>>
>>> I guess the same is for fbmem, but would be better to have confirmation since
>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>
>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>> using current dma-mapping
>>>> code.
>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>> used instead of alloc_pages()
>>>> in dma-mapping.
>>>
>>> Unless I've missed something bellow is a call stack for both
>>>
>>> #1
>>> __alloc_simple_buffer
>>>         __dma_alloc_buffer
>>>                 alloc_pages
>>>                 split_page
>>>                 __dma_clear_buffer
>>>                         memset
>>>         page_address
>>>
>>> #2
>>> __get_free_pages
>>>         alloc_pages
>>>         page_address
>>>
>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>
>>> Is something from above critical for your driver?
>>
>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>> from #1 and it is still working.
>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>
>> I have investigated more and found that dma-noop doesn't take care of
>> "dma-ranges" property which is set in DT.
>> I believed that is the root cause of my problem with your patches.
> 
> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
> modetest and fbdemo are now still functional.
> 

Thanks for narrowing it down! I had not noticed that the stm32f4 remaps
its memory, so the dma-ranges property is in use.

It looks like virt_to_dma is ARM-specific, so I probably have to
discard the idea of reusing dma_noop_ops and instead switch the logic
in dma-mapping-nommu.c based on an is_device_dma_coherent(dev) check.
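
A rough sketch of that direction (the helper name is made up; the point
is that the coherent path would then live where ARM-specific address
translation is available):

	static void *arm_nommu_dma_alloc_coherent(struct device *dev, size_t size,
						  dma_addr_t *dma_handle, gfp_t gfp)
	{
		void *ret = (void *)__get_free_pages(gfp, get_order(size));

		if (ret)
			*dma_handle = virt_to_dma(dev, ret); /* honours dma-ranges */

		return ret;
	}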

Meanwhile, I'm quite puzzled about how such memory remapping is
supposed to work together with reserved memory. It seems dma-ranges is
not accounted for while reserving memory (it is too early), nor while
allocating/mapping/etc.

Cheers
Vladimir

>>
>> Benjamin
>>
>>>
>>>>
>>>> Since my hardware doesn't have cache or MPU (and so use dma-noop) I
>>>> haven't reserved specific memory region.
>>>> Buffer addresses and vma parameters look correct... What could I have
>>>> miss here ?
>>>
>>> No ideas, sorry...
>>>
>>> Cheers
>>> Vladimir
>>>
>>>>
>>>> Benjamin
>>>>
>>>>>
>>>>> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>>>>>
>>>>> Vladimir Murzin (5):
>>>>>   dma: Add simple dma_noop_mmap
>>>>>   drivers: dma-coherent: Introduce default DMA pool
>>>>>   ARM: NOMMU: Introduce dma operations for noMMU
>>>>>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>>>>>   ARM: dma-mapping: Remove traces of NOMMU code
>>>>>
>>>>>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>>>>>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>>>>>  arch/arm/mm/Kconfig                                |   2 +-
>>>>>  arch/arm/mm/Makefile                               |   5 +-
>>>>>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>>>>>  arch/arm/mm/dma-mapping.c                          |  26 +--
>>>>>  drivers/base/dma-coherent.c                        |  59 ++++-
>>>>>  lib/dma-noop.c                                     |  21 ++
>>>>>  8 files changed, 335 insertions(+), 36 deletions(-)
>>>>>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>>>>>
>>>>> --
>>>>> 2.0.0
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Benjamin Gaignard
>>
>> Graphic Study Group
>>
>> Linaro.org ? Open source software for ARM SoCs
>>
>> Follow Linaro: Facebook | Twitter | Blog
> 
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-12 16:52         ` Vladimir Murzin
@ 2017-01-12 17:04           ` Robin Murphy
  2017-01-12 17:15             ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2017-01-12 17:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/01/17 16:52, Vladimir Murzin wrote:
> On 12/01/17 10:55, Benjamin Gaignard wrote:
>> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>> Hi,
>>>>>>
>>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>>
>>>>>> This patch set is trying to address the issue by providing region of
>>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>>> framework, and it is seems that it might be reused by other
>>>>>> architectures like c6x and blackfin.
>>>>>>
>>>>>> Dedicated DMA region is required for cases other than:
>>>>>>  - MMU/MPU is off
>>>>>>  - cpu is v7m w/o cache support
>>>>>>  - device is coherent
>>>>>>
>>>>>> In case one of the above conditions is true dma operations are forced
>>>>>> to be coherent and wired with dma_noop_ops.
>>>>>>
>>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>>> compilation unit.
>>>>>>
>>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>>
>>>>>> Thanks!
>>>>>
>>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>>> it doesn't work with my drm/kms driver.
>>>>
>>>> I guess the same is for fbmem, but would be better to have confirmation since
>>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>>
>>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>>> using current dma-mapping
>>>>> code.
>>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>>> used instead of alloc_pages()
>>>>> in dma-mapping.
>>>>
>>>> Unless I've missed something bellow is a call stack for both
>>>>
>>>> #1
>>>> __alloc_simple_buffer
>>>>         __dma_alloc_buffer
>>>>                 alloc_pages
>>>>                 split_page
>>>>                 __dma_clear_buffer
>>>>                         memset
>>>>         page_address
>>>>
>>>> #2
>>>> __get_free_pages
>>>>         alloc_pages
>>>>         page_address
>>>>
>>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>>
>>>> Is something from above critical for your driver?
>>>
>>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>>> from #1 and it is still working.
>>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>>
>>> I have investigated more and found that dma-noop doesn't take care of
>>> "dma-ranges" property which is set in DT.
>>> I believed that is the root cause of my problem with your patches.
>>
>> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
>> modetest and fbdemo are now still functional.
>>
> 
> Thanks for narrowing it down! I did not noticed that stm32f4 remap its memory,
> so dma-ranges property is in use.
> 
> It looks like virt_to_dma is ARM specific, so I probably have to discard idea
> of reusing dma-noop-ops and switch logic into dma-mapping-nommu.c based on
> is_device_dma_coherent(dev) check.

dma_pfn_offset is a member of struct device, so it should be OK for
dma_noop_ops to also make reference to it (and assume it's zero if not
explicitly set).
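
Something along these lines, purely illustrative (treating an unset
dma_pfn_offset as zero):

	static dma_addr_t dma_noop_phys_to_dma(struct device *dev, phys_addr_t paddr)
	{
		unsigned long pfn_offset = dev ? dev->dma_pfn_offset : 0;

		return (dma_addr_t)paddr - ((dma_addr_t)pfn_offset << PAGE_SHIFT);
	}

with dma_noop_alloc()/dma_noop_map_page() feeding it the result of
virt_to_phys()/page_to_phys() instead of using those values directly.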

> Meanwhile, I'm quite puzzled on how such memory remaping should work together
> with reserved memory. It seem it doesn't account dma-ranges while reserving
> memory (it is too early) nor while allocating/mapping/etc.

The reserved memory is described in terms of CPU physical addresses, so
a device offset shouldn't matter from that perspective. It only comes
into play at the point you generate the dma_addr_t to hand off to the
device - only then do you need to transform the CPU physical address of
the allocated/mapped page into the device's view of that page (i.e.
subtract the offset).

Robin.

> 
> Cheers
> Vladimir
> 
>>>
>>> Benjamin
>>>
>>>>
>>>>>
>>>>> Since my hardware doesn't have cache or MPU (and so use dma-noop) I
>>>>> haven't reserved specific memory region.
>>>>> Buffer addresses and vma parameters look correct... What could I have
>>>>> miss here ?
>>>>
>>>> No ideas, sorry...
>>>>
>>>> Cheers
>>>> Vladimir
>>>>
>>>>>
>>>>> Benjamin
>>>>>
>>>>>>
>>>>>> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>>>>>>
>>>>>> Vladimir Murzin (5):
>>>>>>   dma: Add simple dma_noop_mmap
>>>>>>   drivers: dma-coherent: Introduce default DMA pool
>>>>>>   ARM: NOMMU: Introduce dma operations for noMMU
>>>>>>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>>>>>>   ARM: dma-mapping: Remove traces of NOMMU code
>>>>>>
>>>>>>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>>>>>>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>>>>>>  arch/arm/mm/Kconfig                                |   2 +-
>>>>>>  arch/arm/mm/Makefile                               |   5 +-
>>>>>>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>>>>>>  arch/arm/mm/dma-mapping.c                          |  26 +--
>>>>>>  drivers/base/dma-coherent.c                        |  59 ++++-
>>>>>>  lib/dma-noop.c                                     |  21 ++
>>>>>>  8 files changed, 335 insertions(+), 36 deletions(-)
>>>>>>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>>>>>>
>>>>>> --
>>>>>> 2.0.0
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Benjamin Gaignard
>>>
>>> Graphic Study Group
>>>
>>> Linaro.org ? Open source software for ARM SoCs
>>>
>>> Follow Linaro: Facebook | Twitter | Blog
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-12 17:04           ` Robin Murphy
@ 2017-01-12 17:15             ` Vladimir Murzin
  2017-01-12 18:07               ` Robin Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-12 17:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/01/17 17:04, Robin Murphy wrote:
> On 12/01/17 16:52, Vladimir Murzin wrote:
>> On 12/01/17 10:55, Benjamin Gaignard wrote:
>>> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>>>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>>>
>>>>>>> This patch set is trying to address the issue by providing region of
>>>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>>>> framework, and it is seems that it might be reused by other
>>>>>>> architectures like c6x and blackfin.
>>>>>>>
>>>>>>> Dedicated DMA region is required for cases other than:
>>>>>>>  - MMU/MPU is off
>>>>>>>  - cpu is v7m w/o cache support
>>>>>>>  - device is coherent
>>>>>>>
>>>>>>> In case one of the above conditions is true dma operations are forced
>>>>>>> to be coherent and wired with dma_noop_ops.
>>>>>>>
>>>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>>>> compilation unit.
>>>>>>>
>>>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>>>
>>>>>>> Thanks!
>>>>>>
>>>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>>>> it doesn't work with my drm/kms driver.
>>>>>
>>>>> I guess the same is for fbmem, but would be better to have confirmation since
>>>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>>>
>>>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>>>> using current dma-mapping
>>>>>> code.
>>>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>>>> used instead of alloc_pages()
>>>>>> in dma-mapping.
>>>>>
>>>>> Unless I've missed something bellow is a call stack for both
>>>>>
>>>>> #1
>>>>> __alloc_simple_buffer
>>>>>         __dma_alloc_buffer
>>>>>                 alloc_pages
>>>>>                 split_page
>>>>>                 __dma_clear_buffer
>>>>>                         memset
>>>>>         page_address
>>>>>
>>>>> #2
>>>>> __get_free_pages
>>>>>         alloc_pages
>>>>>         page_address
>>>>>
>>>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>>>
>>>>> Is something from above critical for your driver?
>>>>
>>>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>>>> from #1 and it is still working.
>>>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>>>
>>>> I have investigated more and found that dma-noop doesn't take care of
>>>> "dma-ranges" property which is set in DT.
>>>> I believed that is the root cause of my problem with your patches.
>>>
>>> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
>>> modetest and fbdemo are now still functional.
>>>
>>
>> Thanks for narrowing it down! I did not noticed that stm32f4 remap its memory,
>> so dma-ranges property is in use.
>>
>> It looks like virt_to_dma is ARM specific, so I probably have to discard idea
>> of reusing dma-noop-ops and switch logic into dma-mapping-nommu.c based on
>> is_device_dma_coherent(dev) check.
> 
> dma_pfn_offset is a member of struct device, so it should be OK for
> dma_noop_ops to also make reference to it (and assume it's zero if not
> explicitly set).
> 
>> Meanwhile, I'm quite puzzled on how such memory remaping should work together
>> with reserved memory. It seem it doesn't account dma-ranges while reserving
>> memory (it is too early) nor while allocating/mapping/etc.
> 
> The reserved memory is described in terms of CPU physical addresses, so
> a device offset shouldn't matter from that perspective. It only comes
> into play at the point you generate the dma_addr_t to hand off to the
> device - only then do you need to transform the CPU physical address of
> the allocated/mapped page into the device's view of that page (i.e.
> subtract the offset).

Thanks for the explanation! So dma-coherent.c should be modified, right? I see
that some architectures provide phys_to_dma/dma_to_phys helpers, primarily for
swiotlb; is it safe to reuse them given that a default implementation is
provided? Nothing under Documentation explains how they are supposed to be
used, sorry if I'm asking a stupid question.

Cheers
Vladimir


> 
> Robin.
> 
>>
>> Cheers
>> Vladimir
>>
>>>>
>>>> Benjamin
>>>>
>>>>>
>>>>>>
>>>>>> Since my hardware doesn't have cache or MPU (and so use dma-noop) I
>>>>>> haven't reserved specific memory region.
>>>>>> Buffer addresses and vma parameters look correct... What could I have
>>>>>> miss here ?
>>>>>
>>>>> No ideas, sorry...
>>>>>
>>>>> Cheers
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> Benjamin
>>>>>>
>>>>>>>
>>>>>>> [1] http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8633/1
>>>>>>>
>>>>>>> Vladimir Murzin (5):
>>>>>>>   dma: Add simple dma_noop_mmap
>>>>>>>   drivers: dma-coherent: Introduce default DMA pool
>>>>>>>   ARM: NOMMU: Introduce dma operations for noMMU
>>>>>>>   ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
>>>>>>>   ARM: dma-mapping: Remove traces of NOMMU code
>>>>>>>
>>>>>>>  .../bindings/reserved-memory/reserved-memory.txt   |   3 +
>>>>>>>  arch/arm/include/asm/dma-mapping.h                 |   3 +-
>>>>>>>  arch/arm/mm/Kconfig                                |   2 +-
>>>>>>>  arch/arm/mm/Makefile                               |   5 +-
>>>>>>>  arch/arm/mm/dma-mapping-nommu.c                    | 252 +++++++++++++++++++++
>>>>>>>  arch/arm/mm/dma-mapping.c                          |  26 +--
>>>>>>>  drivers/base/dma-coherent.c                        |  59 ++++-
>>>>>>>  lib/dma-noop.c                                     |  21 ++
>>>>>>>  8 files changed, 335 insertions(+), 36 deletions(-)
>>>>>>>  create mode 100644 arch/arm/mm/dma-mapping-nommu.c
>>>>>>>
>>>>>>> --
>>>>>>> 2.0.0
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Benjamin Gaignard
>>>>
>>>> Graphic Study Group
>>>>
>>>> Linaro.org ? Open source software for ARM SoCs
>>>>
>>>> Follow Linaro: Facebook | Twitter | Blog
>>>
>>>
>>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-12 17:15             ` Vladimir Murzin
@ 2017-01-12 18:07               ` Robin Murphy
  2017-01-13  9:12                 ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2017-01-12 18:07 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/01/17 17:15, Vladimir Murzin wrote:
> On 12/01/17 17:04, Robin Murphy wrote:
>> On 12/01/17 16:52, Vladimir Murzin wrote:
>>> On 12/01/17 10:55, Benjamin Gaignard wrote:
>>>> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>>>>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>>>>
>>>>>>>> This patch set is trying to address the issue by providing region of
>>>>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>>>>> framework, and it is seems that it might be reused by other
>>>>>>>> architectures like c6x and blackfin.
>>>>>>>>
>>>>>>>> Dedicated DMA region is required for cases other than:
>>>>>>>>  - MMU/MPU is off
>>>>>>>>  - cpu is v7m w/o cache support
>>>>>>>>  - device is coherent
>>>>>>>>
>>>>>>>> In case one of the above conditions is true dma operations are forced
>>>>>>>> to be coherent and wired with dma_noop_ops.
>>>>>>>>
>>>>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>>>>> compilation unit.
>>>>>>>>
>>>>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>
>>>>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>>>>> it doesn't work with my drm/kms driver.
>>>>>>
>>>>>> I guess the same is for fbmem, but would be better to have confirmation since
>>>>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>>>>
>>>>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>>>>> using current dma-mapping
>>>>>>> code.
>>>>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>>>>> used instead of alloc_pages()
>>>>>>> in dma-mapping.
>>>>>>
>>>>>> Unless I've missed something bellow is a call stack for both
>>>>>>
>>>>>> #1
>>>>>> __alloc_simple_buffer
>>>>>>         __dma_alloc_buffer
>>>>>>                 alloc_pages
>>>>>>                 split_page
>>>>>>                 __dma_clear_buffer
>>>>>>                         memset
>>>>>>         page_address
>>>>>>
>>>>>> #2
>>>>>> __get_free_pages
>>>>>>         alloc_pages
>>>>>>         page_address
>>>>>>
>>>>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>>>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>>>>
>>>>>> Is something from above critical for your driver?
>>>>>
>>>>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>>>>> from #1 and it is still working.
>>>>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>>>>
>>>>> I have investigated more and found that dma-noop doesn't take care of
>>>>> "dma-ranges" property which is set in DT.
>>>>> I believed that is the root cause of my problem with your patches.
>>>>
>>>> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
>>>> modetest and fbdemo are now still functional.
>>>>
>>>
>>> Thanks for narrowing it down! I did not noticed that stm32f4 remap its memory,
>>> so dma-ranges property is in use.
>>>
>>> It looks like virt_to_dma is ARM specific, so I probably have to discard idea
>>> of reusing dma-noop-ops and switch logic into dma-mapping-nommu.c based on
>>> is_device_dma_coherent(dev) check.
>>
>> dma_pfn_offset is a member of struct device, so it should be OK for
>> dma_noop_ops to also make reference to it (and assume it's zero if not
>> explicitly set).
>>
>>> Meanwhile, I'm quite puzzled on how such memory remaping should work together
>>> with reserved memory. It seem it doesn't account dma-ranges while reserving
>>> memory (it is too early) nor while allocating/mapping/etc.
>>
>> The reserved memory is described in terms of CPU physical addresses, so
>> a device offset shouldn't matter from that perspective. It only comes
>> into play at the point you generate the dma_addr_t to hand off to the
>> device - only then do you need to transform the CPU physical address of
>> the allocated/mapped page into the device's view of that page (i.e.
>> subtract the offset).
> 
> Thanks for explanation! So dma-coherent.c should be modified, right? I see
> that some architectures provide phys_to_dma/dma_to_phys helpers primary for
> swiotlb, is it safe to reuse them given that default implementation is
> provided? Nothing under Documentation explains how they supposed to be used,
> sorry if asking stupid question.

Those are essentially SWIOTLB-specific, so can't be universally relied
upon. I think something like this ought to suffice:

---8<---
diff --git a/lib/dma-noop.c b/lib/dma-noop.c
index 3d766e78fbe2..fbb1b37750d5 100644
--- a/lib/dma-noop.c
+++ b/lib/dma-noop.c
@@ -8,6 +8,11 @@
 #include <linux/dma-mapping.h>
 #include <linux/scatterlist.h>

+static dma_addr_t dma_noop_dev_offset(struct device *dev)
+{
+       return (dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT;
+}
+
 static void *dma_noop_alloc(struct device *dev, size_t size,
                            dma_addr_t *dma_handle, gfp_t gfp,
                            unsigned long attrs)
@@ -16,7 +21,7 @@ static void *dma_noop_alloc(struct device *dev, size_t
size,

        ret = (void *)__get_free_pages(gfp, get_order(size));
        if (ret)
-               *dma_handle = virt_to_phys(ret);
+               *dma_handle = virt_to_phys(ret) - dma_noop_dev_offset(dev);
        return ret;
 }

@@ -32,7 +37,7 @@ static dma_addr_t dma_noop_map_page(struct device
*dev, struct page *page,
                                      enum dma_data_direction dir,
                                      unsigned long attrs)
 {
-       return page_to_phys(page) + offset;
+       return page_to_phys(page) + offset - dma_noop_dev_offset(dev);
 }

 static int dma_noop_map_sg(struct device *dev, struct scatterlist *sgl,
int nents,
@@ -47,7 +52,8 @@ static int dma_noop_map_sg(struct device *dev, struct
scatterlist *sgl, int nent

                BUG_ON(!sg_page(sg));
                va = sg_virt(sg);
-               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va);
+               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va) -
+                                       dma_noop_dev_offset(dev);
                sg_dma_len(sg) = sg->length;
        }
--->8---

intentionally whitespace-damaged by copy-pasting off my terminal to
emphasise how utterly untested it is ;)
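
For reference, the offset it relies on is what the DT code works out from a
"dma-ranges" entry (child bus address, parent CPU address, size) - roughly
the following, sketched with illustrative names rather than the real helpers:

	/* such that dma_addr = cpu_phys - (dma_pfn_offset << PAGE_SHIFT) */
	static unsigned long dma_ranges_to_pfn_offset(phys_addr_t cpu_base,
						      dma_addr_t dma_base)
	{
		return PFN_DOWN(cpu_base - dma_base);
	}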

Robin.

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-12 18:07               ` Robin Murphy
@ 2017-01-13  9:12                 ` Vladimir Murzin
  2017-01-13 12:40                   ` Robin Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-13  9:12 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/01/17 18:07, Robin Murphy wrote:
> On 12/01/17 17:15, Vladimir Murzin wrote:
>> On 12/01/17 17:04, Robin Murphy wrote:
>>> On 12/01/17 16:52, Vladimir Murzin wrote:
>>>> On 12/01/17 10:55, Benjamin Gaignard wrote:
>>>>> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>>>>>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>>>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>>>>>
>>>>>>>>> This patch set is trying to address the issue by providing region of
>>>>>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>>>>>> framework, and it is seems that it might be reused by other
>>>>>>>>> architectures like c6x and blackfin.
>>>>>>>>>
>>>>>>>>> Dedicated DMA region is required for cases other than:
>>>>>>>>>  - MMU/MPU is off
>>>>>>>>>  - cpu is v7m w/o cache support
>>>>>>>>>  - device is coherent
>>>>>>>>>
>>>>>>>>> In case one of the above conditions is true dma operations are forced
>>>>>>>>> to be coherent and wired with dma_noop_ops.
>>>>>>>>>
>>>>>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>>>>>> compilation unit.
>>>>>>>>>
>>>>>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>>>>>> it doesn't work with my drm/kms driver.
>>>>>>>
>>>>>>> I guess the same is for fbmem, but would be better to have confirmation since
>>>>>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>>>>>
>>>>>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>>>>>> using current dma-mapping
>>>>>>>> code.
>>>>>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>>>>>> used instead of alloc_pages()
>>>>>>>> in dma-mapping.
>>>>>>>
>>>>>>> Unless I've missed something bellow is a call stack for both
>>>>>>>
>>>>>>> #1
>>>>>>> __alloc_simple_buffer
>>>>>>>         __dma_alloc_buffer
>>>>>>>                 alloc_pages
>>>>>>>                 split_page
>>>>>>>                 __dma_clear_buffer
>>>>>>>                         memset
>>>>>>>         page_address
>>>>>>>
>>>>>>> #2
>>>>>>> __get_free_pages
>>>>>>>         alloc_pages
>>>>>>>         page_address
>>>>>>>
>>>>>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>>>>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>>>>>
>>>>>>> Is something from above critical for your driver?
>>>>>>
>>>>>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>>>>>> from #1 and it is still working.
>>>>>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>>>>>
>>>>>> I have investigated more and found that dma-noop doesn't take care of
>>>>>> "dma-ranges" property which is set in DT.
>>>>>> I believed that is the root cause of my problem with your patches.
>>>>>
>>>>> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
>>>>> modetest and fbdemo are now still functional.
>>>>>
>>>>
>>>> Thanks for narrowing it down! I did not noticed that stm32f4 remap its memory,
>>>> so dma-ranges property is in use.
>>>>
>>>> It looks like virt_to_dma is ARM specific, so I probably have to discard idea
>>>> of reusing dma-noop-ops and switch logic into dma-mapping-nommu.c based on
>>>> is_device_dma_coherent(dev) check.
>>>
>>> dma_pfn_offset is a member of struct device, so it should be OK for
>>> dma_noop_ops to also make reference to it (and assume it's zero if not
>>> explicitly set).
>>>
>>>> Meanwhile, I'm quite puzzled on how such memory remaping should work together
>>>> with reserved memory. It seem it doesn't account dma-ranges while reserving
>>>> memory (it is too early) nor while allocating/mapping/etc.
>>>
>>> The reserved memory is described in terms of CPU physical addresses, so
>>> a device offset shouldn't matter from that perspective. It only comes
>>> into play at the point you generate the dma_addr_t to hand off to the
>>> device - only then do you need to transform the CPU physical address of
>>> the allocated/mapped page into the device's view of that page (i.e.
>>> subtract the offset).
>>
>> Thanks for explanation! So dma-coherent.c should be modified, right? I see
>> that some architectures provide phys_to_dma/dma_to_phys helpers primary for
>> swiotlb, is it safe to reuse them given that default implementation is
>> provided? Nothing under Documentation explains how they supposed to be used,
>> sorry if asking stupid question.
> 
> Those are essentially SWIOTLB-specific, so can't be universally relied
> upon. I think something like this ought to suffice:

Yup, but what about dma-coherent.c? Currently it has 

int dma_alloc_from_coherent(struct device *dev, ssize_t size,
				       dma_addr_t *dma_handle, void **ret)
{
...
	*dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
	*ret = mem->virt_base + (pageno << PAGE_SHIFT);
...
}

Given that reserved memory is described in terms of CPU phys addresses, wouldn't
we need to take dma_pfn_offset into account? What am I missing?

Thanks
Vladimir

> 
> ---8<---
> diff --git a/lib/dma-noop.c b/lib/dma-noop.c
> index 3d766e78fbe2..fbb1b37750d5 100644
> --- a/lib/dma-noop.c
> +++ b/lib/dma-noop.c
> @@ -8,6 +8,11 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/scatterlist.h>
> 
> +static dma_addr_t dma_noop_dev_offset(struct device *dev)
> +{
> +       return (dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT;
> +}
> +
>  static void *dma_noop_alloc(struct device *dev, size_t size,
>                             dma_addr_t *dma_handle, gfp_t gfp,
>                             unsigned long attrs)
> @@ -16,7 +21,7 @@ static void *dma_noop_alloc(struct device *dev, size_t
> size,
> 
>         ret = (void *)__get_free_pages(gfp, get_order(size));
>         if (ret)
> -               *dma_handle = virt_to_phys(ret);
> +               *dma_handle = virt_to_phys(ret) - dma_noop_dev_offset(dev);
>         return ret;
>  }
> 
> @@ -32,7 +37,7 @@ static dma_addr_t dma_noop_map_page(struct device
> *dev, struct page *page,
>                                       enum dma_data_direction dir,
>                                       unsigned long attrs)
>  {
> -       return page_to_phys(page) + offset;
> +       return page_to_phys(page) + offset - dma_noop_dev_offset(dev);
>  }
> 
>  static int dma_noop_map_sg(struct device *dev, struct scatterlist *sgl,
> int nents,
> @@ -47,7 +52,8 @@ static int dma_noop_map_sg(struct device *dev, struct
> scatterlist *sgl, int nent
> 
>                 BUG_ON(!sg_page(sg));
>                 va = sg_virt(sg);
> -               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va);
> +               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va) -
> +                                       dma_noop_dev_offset(dev);
>                 sg_dma_len(sg) = sg->length;
>         }
> --->8---
> 
> intentionally whitespace-damaged by copy-pasting off my terminal to
> emphasise how utterly untested it is ;)
> 
> Robin.
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-13  9:12                 ` Vladimir Murzin
@ 2017-01-13 12:40                   ` Robin Murphy
  2017-01-16 11:58                     ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2017-01-13 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 13/01/17 09:12, Vladimir Murzin wrote:
> On 12/01/17 18:07, Robin Murphy wrote:
>> On 12/01/17 17:15, Vladimir Murzin wrote:
>>> On 12/01/17 17:04, Robin Murphy wrote:
>>>> On 12/01/17 16:52, Vladimir Murzin wrote:
>>>>> On 12/01/17 10:55, Benjamin Gaignard wrote:
>>>>>> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>>>>>>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>>>>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>>>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>>>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>>>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>>>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>>>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>>>>>>
>>>>>>>>>> This patch set is trying to address the issue by providing region of
>>>>>>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>>>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>>>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>>>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>>>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>>>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>>>>>>> framework, and it is seems that it might be reused by other
>>>>>>>>>> architectures like c6x and blackfin.
>>>>>>>>>>
>>>>>>>>>> Dedicated DMA region is required for cases other than:
>>>>>>>>>>  - MMU/MPU is off
>>>>>>>>>>  - cpu is v7m w/o cache support
>>>>>>>>>>  - device is coherent
>>>>>>>>>>
>>>>>>>>>> In case one of the above conditions is true dma operations are forced
>>>>>>>>>> to be coherent and wired with dma_noop_ops.
>>>>>>>>>>
>>>>>>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>>>>>>> compilation unit.
>>>>>>>>>>
>>>>>>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>>>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>>>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>>>>>>> it doesn't work with my drm/kms driver.
>>>>>>>>
>>>>>>>> I guess the same is for fbmem, but would be better to have confirmation since
>>>>>>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>>>>>>
>>>>>>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>>>>>>> using current dma-mapping
>>>>>>>>> code.
>>>>>>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>>>>>>> used instead of alloc_pages()
>>>>>>>>> in dma-mapping.
>>>>>>>>
>>>>>>>> Unless I've missed something bellow is a call stack for both
>>>>>>>>
>>>>>>>> #1
>>>>>>>> __alloc_simple_buffer
>>>>>>>>         __dma_alloc_buffer
>>>>>>>>                 alloc_pages
>>>>>>>>                 split_page
>>>>>>>>                 __dma_clear_buffer
>>>>>>>>                         memset
>>>>>>>>         page_address
>>>>>>>>
>>>>>>>> #2
>>>>>>>> __get_free_pages
>>>>>>>>         alloc_pages
>>>>>>>>         page_address
>>>>>>>>
>>>>>>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>>>>>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>>>>>>
>>>>>>>> Is something from above critical for your driver?
>>>>>>>
>>>>>>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>>>>>>> from #1 and it is still working.
>>>>>>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>>>>>>
>>>>>>> I have investigated more and found that dma-noop doesn't take care of
>>>>>>> "dma-ranges" property which is set in DT.
>>>>>>> I believed that is the root cause of my problem with your patches.
>>>>>>
>>>>>> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
>>>>>> modetest and fbdemo are now still functional.
>>>>>>
>>>>>
>>>>> Thanks for narrowing it down! I did not noticed that stm32f4 remap its memory,
>>>>> so dma-ranges property is in use.
>>>>>
>>>>> It looks like virt_to_dma is ARM specific, so I probably have to discard idea
>>>>> of reusing dma-noop-ops and switch logic into dma-mapping-nommu.c based on
>>>>> is_device_dma_coherent(dev) check.
>>>>
>>>> dma_pfn_offset is a member of struct device, so it should be OK for
>>>> dma_noop_ops to also make reference to it (and assume it's zero if not
>>>> explicitly set).
>>>>
>>>>> Meanwhile, I'm quite puzzled on how such memory remaping should work together
>>>>> with reserved memory. It seem it doesn't account dma-ranges while reserving
>>>>> memory (it is too early) nor while allocating/mapping/etc.
>>>>
>>>> The reserved memory is described in terms of CPU physical addresses, so
>>>> a device offset shouldn't matter from that perspective. It only comes
>>>> into play at the point you generate the dma_addr_t to hand off to the
>>>> device - only then do you need to transform the CPU physical address of
>>>> the allocated/mapped page into the device's view of that page (i.e.
>>>> subtract the offset).
>>>
>>> Thanks for explanation! So dma-coherent.c should be modified, right? I see
>>> that some architectures provide phys_to_dma/dma_to_phys helpers primary for
>>> swiotlb, is it safe to reuse them given that default implementation is
>>> provided? Nothing under Documentation explains how they supposed to be used,
>>> sorry if asking stupid question.
>>
>> Those are essentially SWIOTLB-specific, so can't be universally relied
>> upon. I think something like this ought to suffice:
> 
> Yup, but what about dma-coherent.c? Currently it has 
> 
> int dma_alloc_from_coherent(struct device *dev, ssize_t size,
> 				       dma_addr_t *dma_handle, void **ret)
> {
> ...
> 	*dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
> 	*ret = mem->virt_base + (pageno << PAGE_SHIFT);
> ...
> }
> 
> In case reserved memory is described in terms of CPU phys addresses, would not
> we need to take into account dma_pfn_offset? What I'm missing?

Ah yes, I overlooked that one. AFAICS, that's intended to be accounted
for when calling dma_init_coherent_memory (i.e. phys_addr vs.
device_addr), but that's a bit awkward for a global pool.

How utterly disgusting do you think this (or some variant thereof) looks?

	/* Apply device-specific offset for the global pool */
	if (mem == dma_coherent_default_memory)
		*handle += dev->dma_pfn_offset << PAGE_SHIFT;

Robin.

> Thanks
> Vladimir
> 
>>
>> ---8<---
>> diff --git a/lib/dma-noop.c b/lib/dma-noop.c
>> index 3d766e78fbe2..fbb1b37750d5 100644
>> --- a/lib/dma-noop.c
>> +++ b/lib/dma-noop.c
>> @@ -8,6 +8,11 @@
>>  #include <linux/dma-mapping.h>
>>  #include <linux/scatterlist.h>
>>
>> +static dma_addr_t dma_noop_dev_offset(struct device *dev)
>> +{
>> +       return (dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT;
>> +}
>> +
>>  static void *dma_noop_alloc(struct device *dev, size_t size,
>>                             dma_addr_t *dma_handle, gfp_t gfp,
>>                             unsigned long attrs)
>> @@ -16,7 +21,7 @@ static void *dma_noop_alloc(struct device *dev, size_t
>> size,
>>
>>         ret = (void *)__get_free_pages(gfp, get_order(size));
>>         if (ret)
>> -               *dma_handle = virt_to_phys(ret);
>> +               *dma_handle = virt_to_phys(ret) - dma_noop_dev_offset(dev);
>>         return ret;
>>  }
>>
>> @@ -32,7 +37,7 @@ static dma_addr_t dma_noop_map_page(struct device
>> *dev, struct page *page,
>>                                       enum dma_data_direction dir,
>>                                       unsigned long attrs)
>>  {
>> -       return page_to_phys(page) + offset;
>> +       return page_to_phys(page) + offset - dma_noop_dev_offset(dev);
>>  }
>>
>>  static int dma_noop_map_sg(struct device *dev, struct scatterlist *sgl,
>> int nents,
>> @@ -47,7 +52,8 @@ static int dma_noop_map_sg(struct device *dev, struct
>> scatterlist *sgl, int nent
>>
>>                 BUG_ON(!sg_page(sg));
>>                 va = sg_virt(sg);
>> -               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va);
>> +               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va) -
>> +                                       dma_noop_dev_offset(dev);
>>                 sg_dma_len(sg) = sg->length;
>>         }
>> --->8---
>>
>> intentionally whitespace-damaged by copy-pasting off my terminal to
>> emphasise how utterly untested it is ;)
>>
>> Robin.
>>
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU
  2017-01-13 12:40                   ` Robin Murphy
@ 2017-01-16 11:58                     ` Vladimir Murzin
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Murzin @ 2017-01-16 11:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 13/01/17 12:40, Robin Murphy wrote:
> On 13/01/17 09:12, Vladimir Murzin wrote:
>> On 12/01/17 18:07, Robin Murphy wrote:
>>> On 12/01/17 17:15, Vladimir Murzin wrote:
>>>> On 12/01/17 17:04, Robin Murphy wrote:
>>>>> On 12/01/17 16:52, Vladimir Murzin wrote:
>>>>>> On 12/01/17 10:55, Benjamin Gaignard wrote:
>>>>>>> 2017-01-12 11:35 GMT+01:00 Benjamin Gaignard <benjamin.gaignard@linaro.org>:
>>>>>>>> 2017-01-11 15:34 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>>>> On 11/01/17 13:17, Benjamin Gaignard wrote:
>>>>>>>>>> 2017-01-10 15:18 GMT+01:00 Vladimir Murzin <vladimir.murzin@arm.com>:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> It seem that addition of cache support for M-class cpus uncovered
>>>>>>>>>>> latent bug in DMA usage. NOMMU memory model has been treated as being
>>>>>>>>>>> always consistent; however, for R/M classes of cpu memory can be
>>>>>>>>>>> covered by MPU which in turn might configure RAM as Normal
>>>>>>>>>>> i.e. bufferable and cacheable. It breaks dma_alloc_coherent() and
>>>>>>>>>>> friends, since data can stuck in caches now or be buffered.
>>>>>>>>>>>
>>>>>>>>>>> This patch set is trying to address the issue by providing region of
>>>>>>>>>>> memory suitable for consistent DMA operations. It is supposed that
>>>>>>>>>>> such region is marked by MPU as non-cacheable. Robin suggested to
>>>>>>>>>>> advertise such memory as reserved shared-dma-pool, rather then using
>>>>>>>>>>> homebrew command line option, and extend dma-coherent to provide
>>>>>>>>>>> default DMA area in the similar way as it is done for CMA (PATCH
>>>>>>>>>>> 2/5). It allows us to offload all bookkeeping on generic coherent DMA
>>>>>>>>>>> framework, and it is seems that it might be reused by other
>>>>>>>>>>> architectures like c6x and blackfin.
>>>>>>>>>>>
>>>>>>>>>>> Dedicated DMA region is required for cases other than:
>>>>>>>>>>>  - MMU/MPU is off
>>>>>>>>>>>  - cpu is v7m w/o cache support
>>>>>>>>>>>  - device is coherent
>>>>>>>>>>>
>>>>>>>>>>> In case one of the above conditions is true dma operations are forced
>>>>>>>>>>> to be coherent and wired with dma_noop_ops.
>>>>>>>>>>>
>>>>>>>>>>> To make life easier NOMMU dma operations are kept in separate
>>>>>>>>>>> compilation unit.
>>>>>>>>>>>
>>>>>>>>>>> Since the issue was reported in the same time as Benjamin sent his
>>>>>>>>>>> patch [1] to allow mmap for NOMMU, his case is also addressed in this
>>>>>>>>>>> series (PATCH 1/5 and PATCH 3/5).
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> I have tested this v4 on my setup (stm32f4, no cache, no MPU) and unfortunately
>>>>>>>>>> it doesn't work with my drm/kms driver.
>>>>>>>>>
>>>>>>>>> I guess the same is for fbmem, but would be better to have confirmation since
>>>>>>>>> amba-clcd I use has not been ported to drm/kms (yet), so I can't test.
>>>>>>>>>
>>>>>>>>>> I haven't any errors but nothing is displayed unlike what I have when
>>>>>>>>>> using current dma-mapping
>>>>>>>>>> code.
>>>>>>>>>> I guess the issue is coming from dma-noop where __get_free_pages() is
>>>>>>>>>> used instead of alloc_pages()
>>>>>>>>>> in dma-mapping.
>>>>>>>>>
>>>>>>>>> Unless I've missed something bellow is a call stack for both
>>>>>>>>>
>>>>>>>>> #1
>>>>>>>>> __alloc_simple_buffer
>>>>>>>>>         __dma_alloc_buffer
>>>>>>>>>                 alloc_pages
>>>>>>>>>                 split_page
>>>>>>>>>                 __dma_clear_buffer
>>>>>>>>>                         memset
>>>>>>>>>         page_address
>>>>>>>>>
>>>>>>>>> #2
>>>>>>>>> __get_free_pages
>>>>>>>>>         alloc_pages
>>>>>>>>>         page_address
>>>>>>>>>
>>>>>>>>> So the difference is that nommu case in dma-mapping.c memzeros memory, handles
>>>>>>>>> DMA_ATTR_NO_KERNEL_MAPPING and does optimisation of memory usage.
>>>>>>>>>
>>>>>>>>> Is something from above critical for your driver?
>>>>>>>>
>>>>>>>> I have removed all the diff (split_page,  __dma_clear_buffer, memset)
>>>>>>>> from #1 and it is still working.
>>>>>>>> DMA_ATTR_NO_KERNEL_MAPPING flag is not set when allocating the buffer.
>>>>>>>>
>>>>>>>> I have investigated more and found that dma-noop doesn't take care of
>>>>>>>> "dma-ranges" property which is set in DT.
>>>>>>>> I believed that is the root cause of my problem with your patches.
>>>>>>>
>>>>>>> After testing changing virt_to_phys to virt_to_dma in dma-noop.c fix the issue
>>>>>>> modetest and fbdemo are now still functional.
>>>>>>>
>>>>>>
>>>>>> Thanks for narrowing it down! I did not noticed that stm32f4 remap its memory,
>>>>>> so dma-ranges property is in use.
>>>>>>
>>>>>> It looks like virt_to_dma is ARM specific, so I probably have to discard idea
>>>>>> of reusing dma-noop-ops and switch logic into dma-mapping-nommu.c based on
>>>>>> is_device_dma_coherent(dev) check.
>>>>>
>>>>> dma_pfn_offset is a member of struct device, so it should be OK for
>>>>> dma_noop_ops to also make reference to it (and assume it's zero if not
>>>>> explicitly set).
>>>>>
>>>>>> Meanwhile, I'm quite puzzled on how such memory remaping should work together
>>>>>> with reserved memory. It seem it doesn't account dma-ranges while reserving
>>>>>> memory (it is too early) nor while allocating/mapping/etc.
>>>>>
>>>>> The reserved memory is described in terms of CPU physical addresses, so
>>>>> a device offset shouldn't matter from that perspective. It only comes
>>>>> into play at the point you generate the dma_addr_t to hand off to the
>>>>> device - only then do you need to transform the CPU physical address of
>>>>> the allocated/mapped page into the device's view of that page (i.e.
>>>>> subtract the offset).
>>>>
>>>> Thanks for explanation! So dma-coherent.c should be modified, right? I see
>>>> that some architectures provide phys_to_dma/dma_to_phys helpers primary for
>>>> swiotlb, is it safe to reuse them given that default implementation is
>>>> provided? Nothing under Documentation explains how they supposed to be used,
>>>> sorry if asking stupid question.
>>>
>>> Those are essentially SWIOTLB-specific, so can't be universally relied
>>> upon. I think something like this ought to suffice:
>>
>> Yup, but what about dma-coherent.c? Currently it has 
>>
>> int dma_alloc_from_coherent(struct device *dev, ssize_t size,
>> 				       dma_addr_t *dma_handle, void **ret)
>> {
>> ...
>> 	*dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
>> 	*ret = mem->virt_base + (pageno << PAGE_SHIFT);
>> ...
>> }
>>
>> In case reserved memory is described in terms of CPU phys addresses, would not
>> we need to take into account dma_pfn_offset? What I'm missing?
> 
> Ah yes, I overlooked that one. AFAICS, that's intended to be accounted
> for when calling dma_init_coherent_memory (i.e. phys_addr vs.
> device_addr), but that's a bit awkward for a global pool.
> 
> How utterly disgusting do you think this (or some variant thereof) looks?
> 
> 	/* Apply device-specific offset for the global pool */
> 	if (mem == dma_coherent_default_memory)
> 		*handle += dev->dma_pfn_offset << PAGE_SHIFT;

It'd work for the default DMA region, but IMO the issue is wider here... does
the following look better?

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index b52ba27..22daa4c 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -27,6 +27,15 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
 	return dma_coherent_default_memory;
 }
 
+static inline dma_addr_t dma_get_device_base(struct device *dev,
+					     struct dma_coherent_mem * mem)
+{
+	if (!dev)
+		return mem->pfn_base << PAGE_SHIFT;
+
+	return (mem->pfn_base + dev->dma_pfn_offset) << PAGE_SHIFT;
+}
+
 static bool dma_init_coherent_memory(
 	phys_addr_t phys_addr, dma_addr_t device_addr, size_t size, int flags,
 	struct dma_coherent_mem **mem)
@@ -92,13 +101,19 @@ static void dma_release_coherent_memory(struct dma_coherent_mem *mem)
 static int dma_assign_coherent_memory(struct device *dev,
 				      struct dma_coherent_mem *mem)
 {
+	unsigned long dma_pfn_offset = mem->pfn_base - PFN_DOWN(mem->device_base);
+
 	if (!dev)
 		return -ENODEV;
 
 	if (dev->dma_mem)
 		return -EBUSY;
 
+	if (dev->dma_pfn_offset)
+		WARN_ON(dev->dma_pfn_offset != dma_pfn_offset);
+
 	dev->dma_mem = mem;
+	dev->dma_pfn_offset = dma_pfn_offset;
 	/* FIXME: this routine just ignores DMA_MEMORY_INCLUDES_CHILDREN */
 
 	return 0;
@@ -145,7 +160,7 @@ void *dma_mark_declared_memory_occupied(struct device *dev,
 		return ERR_PTR(-EINVAL);
 
 	spin_lock_irqsave(&mem->spinlock, flags);
-	pos = (device_addr - mem->device_base) >> PAGE_SHIFT;
+	pos = PFN_DOWN(device_addr - dma_get_device_base(dev, mem));
 	err = bitmap_allocate_region(mem->bitmap, pos, get_order(size));
 	spin_unlock_irqrestore(&mem->spinlock, flags);
 
@@ -195,8 +210,9 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
 	/*
 	 * Memory was found in the per-device area.
 	 */
-	*dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
+	*dma_handle = dma_get_device_base(dev, mem) + (pageno << PAGE_SHIFT);
 	*ret = mem->virt_base + (pageno << PAGE_SHIFT);
+
 	dma_memory_map = (mem->flags & DMA_MEMORY_MAP);
 	spin_unlock_irqrestore(&mem->spinlock, flags);
 	if (dma_memory_map)

Cheers
Vladimir

> 
> Robin.
> 
>> Thanks
>> Vladimir
>>
>>>
>>> ---8<---
>>> diff --git a/lib/dma-noop.c b/lib/dma-noop.c
>>> index 3d766e78fbe2..fbb1b37750d5 100644
>>> --- a/lib/dma-noop.c
>>> +++ b/lib/dma-noop.c
>>> @@ -8,6 +8,11 @@
>>>  #include <linux/dma-mapping.h>
>>>  #include <linux/scatterlist.h>
>>>
>>> +static dma_addr_t dma_noop_dev_offset(struct device *dev)
>>> +{
>>> +       return (dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT;
>>> +}
>>> +
>>>  static void *dma_noop_alloc(struct device *dev, size_t size,
>>>                             dma_addr_t *dma_handle, gfp_t gfp,
>>>                             unsigned long attrs)
>>> @@ -16,7 +21,7 @@ static void *dma_noop_alloc(struct device *dev, size_t
>>> size,
>>>
>>>         ret = (void *)__get_free_pages(gfp, get_order(size));
>>>         if (ret)
>>> -               *dma_handle = virt_to_phys(ret);
>>> +               *dma_handle = virt_to_phys(ret) - dma_noop_dev_offset(dev);
>>>         return ret;
>>>  }
>>>
>>> @@ -32,7 +37,7 @@ static dma_addr_t dma_noop_map_page(struct device
>>> *dev, struct page *page,
>>>                                       enum dma_data_direction dir,
>>>                                       unsigned long attrs)
>>>  {
>>> -       return page_to_phys(page) + offset;
>>> +       return page_to_phys(page) + offset - dma_noop_dev_offset(dev);
>>>  }
>>>
>>>  static int dma_noop_map_sg(struct device *dev, struct scatterlist *sgl,
>>> int nents,
>>> @@ -47,7 +52,8 @@ static int dma_noop_map_sg(struct device *dev, struct
>>> scatterlist *sgl, int nent
>>>
>>>                 BUG_ON(!sg_page(sg));
>>>                 va = sg_virt(sg);
>>> -               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va);
>>> +               sg_dma_address(sg) = (dma_addr_t)virt_to_phys(va) -
>>> +                                       dma_noop_dev_offset(dev);
>>>                 sg_dma_len(sg) = sg->length;
>>>         }
>>> --->8---
>>>
>>> intentionally whitespace-damaged by copy-pasting off my terminal to
>>> emphasise how utterly untested it is ;)
>>>
>>> Robin.
>>>
>>
> 
> 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-01-16 11:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
2017-01-10 14:18 [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Vladimir Murzin
2017-01-10 14:18 ` [RFC PATCH v4 1/5] dma: Add simple dma_noop_mmap Vladimir Murzin
2017-01-10 14:18 ` [RFC PATCH v4 2/5] drivers: dma-coherent: Introduce default DMA pool Vladimir Murzin
2017-01-10 14:18 ` [RFC PATCH v4 3/5] ARM: NOMMU: Introduce dma operations for noMMU Vladimir Murzin
2017-01-10 14:18 ` [RFC PATCH v4 4/5] ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus Vladimir Murzin
2017-01-10 14:18 ` [RFC PATCH v4 5/5] ARM: dma-mapping: Remove traces of NOMMU code Vladimir Murzin
2017-01-11 13:17 ` [RFC PATCH v4 0/5] ARM: Fix dma_alloc_coherent() and friends for NOMMU Benjamin Gaignard
2017-01-11 14:34   ` Vladimir Murzin
2017-01-12 10:35     ` Benjamin Gaignard
2017-01-12 10:55       ` Benjamin Gaignard
2017-01-12 16:52         ` Vladimir Murzin
2017-01-12 17:04           ` Robin Murphy
2017-01-12 17:15             ` Vladimir Murzin
2017-01-12 18:07               ` Robin Murphy
2017-01-13  9:12                 ` Vladimir Murzin
2017-01-13 12:40                   ` Robin Murphy
2017-01-16 11:58                     ` Vladimir Murzin
