linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] drm+dma: cache support for arm, etc
@ 2019-08-14 21:59 Rob Clark
  2019-08-14 21:59 ` [PATCH 1/6] arm64: export arch_sync_dma_for_*() Rob Clark
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Rob Clark @ 2019-08-14 21:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, Chris Wilson, Masahiro Yamada,
	Benjamin Gaignard, Mauro Carvalho Chehab, Will Deacon,
	Christoph Hellwig, Emil Velikov, Rob Clark, Michael Ellerman,
	Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT AARCH64 ARCHITECTURE, Daniel Vetter,
	open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Deepak Sharma,
	Joerg Roedel, Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC 32-BIT AND 64-BIT, Alexios Zavras,
	Russell King, Doug Anderson, Thomas Gleixner, Sean Paul,
	Allison Randal, Christophe Leroy, Enrico Weigelt, Ard Biesheuvel,
	Greg Kroah-Hartman, open list, Paul Burton, Souptick Joarder,
	Andrew Morton, open list:DRM DRIVER FOR MSM ADRENO GPU

From: Rob Clark <robdclark@chromium.org>

This is a replacement for a previous patches[1] that was adding arm64
support for drm_clflush.  I've also added a patch to solve a similar
cache issue in vgem.

The first few patches just export arch_sync_dma_for_*().  Possibly
instead the EXPORT_SYMBOL_GPL() should be somewere central, rather
than per-arch (but where would make sense?)

The fourth adds (and exports) these ops for arch/arm.  (Arnd Bergmann
mentioned on IRC that Christoph Hellwig was working on this already
for arch/arm which could replace the fourth patch.)

The last two patches actually fix things.

[1] https://patchwork.freedesktop.org/series/64732/

Rob Clark (6):
  arm64: export arch_sync_dma_for_*()
  mips: export arch_sync_dma_for_*()
  powerpc: export arch_sync_dma_for_*()
  arm: add arch_sync_dma_for_*()
  drm/msm: stop abusing DMA API
  drm/vgem: fix cache synchronization on arm/arm64 (take two)

 arch/arm/Kconfig                  |   2 +
 arch/arm/mm/dma-mapping-nommu.c   |  14 +++
 arch/arm/mm/dma-mapping.c         |  28 ++++++
 arch/arm64/mm/dma-mapping.c       |   2 +
 arch/arm64/mm/flush.c             |   2 +
 arch/mips/mm/dma-noncoherent.c    |   2 +
 arch/powerpc/mm/dma-noncoherent.c |   2 +
 drivers/gpu/drm/drm_cache.c       |  20 ++++-
 drivers/gpu/drm/msm/msm_gem.c     |  37 +++-----
 drivers/gpu/drm/vgem/vgem_drv.c   | 145 ++++++++++++++++++++----------
 include/drm/drm_cache.h           |   4 +
 11 files changed, 182 insertions(+), 76 deletions(-)

-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] arm64: export arch_sync_dma_for_*()
  2019-08-14 21:59 [PATCH 0/6] drm+dma: cache support for arm, etc Rob Clark
@ 2019-08-14 21:59 ` Rob Clark
  2019-08-14 21:59 ` [PATCH 2/6] mips: " Rob Clark
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Rob Clark @ 2019-08-14 21:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Clark, Masayoshi Mizuma, Joerg Roedel, Catalin Marinas,
	Robin Murphy, linux-kernel, Jesper Dangaard Brouer, Will Deacon,
	Christoph Hellwig, linux-arm-kernel

From: Rob Clark <robdclark@chromium.org>

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 arch/arm64/mm/dma-mapping.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 1d3f0b5a9940..ea5ae11d07f7 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -24,12 +24,14 @@ void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
 {
 	__dma_map_area(phys_to_virt(paddr), size, dir);
 }
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_device);
 
 void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir)
 {
 	__dma_unmap_area(phys_to_virt(paddr), size, dir);
 }
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_cpu);
 
 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/6] mips: export arch_sync_dma_for_*()
  2019-08-14 21:59 [PATCH 0/6] drm+dma: cache support for arm, etc Rob Clark
  2019-08-14 21:59 ` [PATCH 1/6] arm64: export arch_sync_dma_for_*() Rob Clark
@ 2019-08-14 21:59 ` Rob Clark
  2019-08-14 21:59 ` [PATCH 4/6] arm: add arch_sync_dma_for_*() Rob Clark
  2019-08-15  6:51 ` [PATCH 0/6] drm+dma: cache support for arm, etc Christoph Hellwig
  3 siblings, 0 replies; 13+ messages in thread
From: Rob Clark @ 2019-08-14 21:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Rob Clark, Maciej W. Rozycki, Hauke Mehrtens, Greg Kroah-Hartman,
	Maxime Ripard, Catalin Marinas, David Airlie, Maarten Lankhorst,
	linux-kernel, Ralf Baechle, linux-mips, Sean Paul, Paul Burton,
	linux-arm-kernel, Daniel Vetter, James Hogan, Thomas Gleixner,
	Will Deacon, Christoph Hellwig, Allison Randal

From: Rob Clark <robdclark@chromium.org>

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 arch/arm64/mm/flush.c          |  2 ++
 arch/mips/mm/dma-noncoherent.c |  2 ++
 drivers/gpu/drm/drm_cache.c    | 20 +++++++++++++++++---
 include/drm/drm_cache.h        |  4 ++++
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index dc19300309d2..f0eb6320c979 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -93,3 +93,5 @@ void arch_invalidate_pmem(void *addr, size_t size)
 }
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
 #endif
+
+EXPORT_SYMBOL_GPL(__flush_dcache_area);
diff --git a/arch/mips/mm/dma-noncoherent.c b/arch/mips/mm/dma-noncoherent.c
index ed56c6fa7be2..bd5debe1b423 100644
--- a/arch/mips/mm/dma-noncoherent.c
+++ b/arch/mips/mm/dma-noncoherent.c
@@ -131,6 +131,7 @@ void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
 {
 	dma_sync_phys(paddr, size, dir);
 }
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_device);
 
 #ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU
 void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
@@ -139,6 +140,7 @@ void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
 	if (cpu_needs_post_dma_flush(dev))
 		dma_sync_phys(paddr, size, dir);
 }
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_cpu);
 #endif
 
 void arch_dma_cache_sync(struct device *dev, void *vaddr, size_t size,
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 3bd76e918b5d..90105c637797 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -69,6 +69,14 @@ static void drm_cache_flush_clflush(struct page *pages[],
 }
 #endif
 
+#if defined(__powerpc__)
+static void __flush_dcache_area(void *addr, size_t len)
+{
+	flush_dcache_range((unsigned long)addr,
+			   (unsigned long)addr + PAGE_SIZE);
+}
+#endif
+
 /**
  * drm_clflush_pages - Flush dcache lines of a set of pages.
  * @pages: List of pages to be flushed.
@@ -90,7 +98,7 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
 	if (wbinvd_on_all_cpus())
 		pr_err("Timed out waiting for cache flush\n");
 
-#elif defined(__powerpc__)
+#elif defined(__powerpc__) || defined(CONFIG_ARM64)
 	unsigned long i;
 	for (i = 0; i < num_pages; i++) {
 		struct page *page = pages[i];
@@ -100,8 +108,7 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
 			continue;
 
 		page_virtual = kmap_atomic(page);
-		flush_dcache_range((unsigned long)page_virtual,
-				   (unsigned long)page_virtual + PAGE_SIZE);
+		__flush_dcache_area(page_virtual, PAGE_SIZE);
 		kunmap_atomic(page_virtual);
 	}
 #else
@@ -135,6 +142,13 @@ drm_clflush_sg(struct sg_table *st)
 
 	if (wbinvd_on_all_cpus())
 		pr_err("Timed out waiting for cache flush\n");
+#elif defined(CONFIG_ARM64)
+	struct sg_page_iter sg_iter;
+
+	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
+		struct page *p = sg_page_iter_page(&sg_iter);
+		drm_clflush_pages(&p, 1);
+	}
 #else
 	pr_err("Architecture has no drm_cache.c support\n");
 	WARN_ON_ONCE(1);
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index 987ff16b9420..f94e7bd3eca4 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -40,6 +40,10 @@ void drm_clflush_sg(struct sg_table *st);
 void drm_clflush_virt_range(void *addr, unsigned long length);
 bool drm_need_swiotlb(int dma_bits);
 
+#if defined(CONFIG_X86) || defined(__powerpc__) || defined(CONFIG_ARM64)
+#define HAS_DRM_CACHE 1
+#endif
+
 
 static inline bool drm_arch_can_wc_memory(void)
 {
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/6] arm: add arch_sync_dma_for_*()
  2019-08-14 21:59 [PATCH 0/6] drm+dma: cache support for arm, etc Rob Clark
  2019-08-14 21:59 ` [PATCH 1/6] arm64: export arch_sync_dma_for_*() Rob Clark
  2019-08-14 21:59 ` [PATCH 2/6] mips: " Rob Clark
@ 2019-08-14 21:59 ` Rob Clark
  2019-08-15  6:51 ` [PATCH 0/6] drm+dma: cache support for arm, etc Christoph Hellwig
  3 siblings, 0 replies; 13+ messages in thread
From: Rob Clark @ 2019-08-14 21:59 UTC (permalink / raw)
  To: dri-devel
  Cc: Kate Stewart, Wolfram Sang, Linus Walleij, Wolfram Sang (Renesas),
	Benjamin Gaignard, Mauro Carvalho Chehab, Christoph Hellwig,
	Rob Clark, Russell King, Mike Rapoport, linux-arm-kernel,
	Vladimir Murzin, Arnd Bergmann, Anshuman Khandual,
	Masahiro Yamada, Doug Anderson, Thomas Gleixner, Vlastimil Babka,
	Ard Biesheuvel, linux-kernel, Paul Burton, Souptick Joarder,
	Andrew Morton, Robin Murphy

From: Rob Clark <robdclark@chromium.org>

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 arch/arm/Kconfig                |  2 ++
 arch/arm/mm/dma-mapping-nommu.c | 14 ++++++++++++++
 arch/arm/mm/dma-mapping.c       | 28 ++++++++++++++++++++++++++++
 3 files changed, 44 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 33b00579beff..a48a7263a2c1 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -18,6 +18,8 @@ config ARM
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
 	select ARCH_HAS_STRICT_MODULE_RWX if MMU
+	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_TEARDOWN_DMA_OPS if MMU
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAVE_CUSTOM_GPIO_H
diff --git a/arch/arm/mm/dma-mapping-nommu.c b/arch/arm/mm/dma-mapping-nommu.c
index 52b82559d99b..4a3df952151f 100644
--- a/arch/arm/mm/dma-mapping-nommu.c
+++ b/arch/arm/mm/dma-mapping-nommu.c
@@ -84,6 +84,13 @@ static void __dma_page_cpu_to_dev(phys_addr_t paddr, size_t size,
 		outer_clean_range(paddr, paddr + size);
 }
 
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
+{
+	__dma_page_cpu_to_dev(paddr, size, dir);
+}
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_device);
+
 static void __dma_page_dev_to_cpu(phys_addr_t paddr, size_t size,
 				  enum dma_data_direction dir)
 {
@@ -93,6 +100,13 @@ static void __dma_page_dev_to_cpu(phys_addr_t paddr, size_t size,
 	}
 }
 
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
+{
+	__dma_page_dev_to_cpu(paddr, size, dir);
+}
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_cpu);
+
 static dma_addr_t arm_nommu_dma_map_page(struct device *dev, struct page *page,
 					 unsigned long offset, size_t size,
 					 enum dma_data_direction dir,
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 6774b03aa405..8ead93196194 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -979,6 +979,13 @@ static void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
 	/* FIXME: non-speculating: flush on bidirectional mappings? */
 }
 
+void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
+{
+	__dma_page_cpu_to_dev(phys_to_page(paddr), paddr % PAGE_SIZE, size, dir);
+}
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_device);
+
 static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 	size_t size, enum dma_data_direction dir)
 {
@@ -1013,6 +1020,27 @@ static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
 	}
 }
 
+void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
+		size_t size, enum dma_data_direction dir)
+{
+	__dma_page_dev_to_cpu(phys_to_page(paddr), paddr % PAGE_SIZE, size, dir);
+}
+EXPORT_SYMBOL_GPL(arch_sync_dma_for_cpu);
+
+/*
+ * arch_dma_{alloc,free} fail-stubs needed to avoid link-errors in dma/direct.c
+ * (which is not actually used on arch/arm)
+ */
+void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
+		gfp_t flags, unsigned long attrs)
+{
+	return NULL;
+}
+void arch_dma_free(struct device *dev, size_t size, void *vaddr,
+		dma_addr_t dma_handle, unsigned long attrs)
+{
+}
+
 /**
  * arm_dma_map_sg - map a set of SG buffers for streaming mode DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-14 21:59 [PATCH 0/6] drm+dma: cache support for arm, etc Rob Clark
                   ` (2 preceding siblings ...)
  2019-08-14 21:59 ` [PATCH 4/6] arm: add arch_sync_dma_for_*() Rob Clark
@ 2019-08-15  6:51 ` Christoph Hellwig
  2019-08-15 13:54   ` Rob Clark
  3 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2019-08-15  6:51 UTC (permalink / raw)
  To: Rob Clark
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Christoph Hellwig, Emil Velikov, Rob Clark,
	Michael Ellerman, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Deepak Sharma,
	Joerg Roedel, Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Paul Burton,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU

As said before I don't think these low-level helpers are the
right API to export, but even if they did you'd just cover a tiny
subset of the architectures.

Also to distil the previous thread - if you remap memory to uncached
the helper to use is arch_dma_prep_coherent, which does a writeback+
invalidate everywhere, and there is no need to clean up after a
long-term uncached mapping.  We might still get speculations into
that area, if we don't remap the direct mapping, but it isn't like
invalidting that just before freeing the memory is going to help
anyone.

Also it seems like patches 5 and 6 are missing in my inbox.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-15  6:51 ` [PATCH 0/6] drm+dma: cache support for arm, etc Christoph Hellwig
@ 2019-08-15 13:54   ` Rob Clark
  2019-08-15 17:53     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Rob Clark @ 2019-08-15 13:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Emil Velikov, Deepak Sharma, Michael Ellerman,
	Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Joerg Roedel,
	Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Rob Clark,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU

On Wed, Aug 14, 2019 at 11:51 PM Christoph Hellwig <hch@lst.de> wrote:
>
> As said before I don't think these low-level helpers are the
> right API to export, but even if they did you'd just cover a tiny
> subset of the architectures.

Are you thinking instead something like:

void dma_sync_sg_for_{cpu,device}(struct device *dev, struct scatterlist *sgl,
                                  int nents, enum dma_data_direction dir)
{
    for_each_sg(sgl, sg, nents, i) {
        arch_sync_dma_for_..(dev, sg_phys(sg), sg->length, dir);
    }
}
EXPORT_SYMBOL_GPL(dma_sync_sg_for_..)

or did you have something else in mind?

I guess something like this would avoid figuring out *which* archs
actually build drm..


> Also to distil the previous thread - if you remap memory to uncached
> the helper to use is arch_dma_prep_coherent, which does a writeback+
> invalidate everywhere, and there is no need to clean up after a
> long-term uncached mapping.  We might still get speculations into
> that area, if we don't remap the direct mapping, but it isn't like
> invalidting that just before freeing the memory is going to help
> anyone.

hmm, IIUC the aarch64 cache instructions, what I'm doing now is equiv
to what I would get with dma_map_sg(DMA_BIDIRECTIONAL) and
arch_dma_prep_coherent() is equiv to what I'd get w/ DMA_FROM_DEVICE
(but a single pass instead of separate inv+clean passes)..

but I can respin this with a single dma_prep_coherent_sg() which uses
arch_dma_prep_coherent()..

> Also it seems like patches 5 and 6 are missing in my inbox.

Hmm, not entirely sure why.. you should be on the cc list for each
individual patch.

But here is the patchwork link if you want to see them:

https://patchwork.freedesktop.org/series/65211/

BR,
-R

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-15 13:54   ` Rob Clark
@ 2019-08-15 17:53     ` Christoph Hellwig
  2019-08-15 18:21       ` Koenig, Christian
  2019-08-16 21:04       ` Rob Clark
  0 siblings, 2 replies; 13+ messages in thread
From: Christoph Hellwig @ 2019-08-15 17:53 UTC (permalink / raw)
  To: Rob Clark
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Christoph Hellwig, Emil Velikov, Deepak Sharma,
	Michael Ellerman, Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Joerg Roedel,
	Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Rob Clark,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU, christian.koenig

On Thu, Aug 15, 2019 at 06:54:39AM -0700, Rob Clark wrote:
> On Wed, Aug 14, 2019 at 11:51 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > As said before I don't think these low-level helpers are the
> > right API to export, but even if they did you'd just cover a tiny
> > subset of the architectures.
> 
> Are you thinking instead something like:
> 
> void dma_sync_sg_for_{cpu,device}(struct device *dev, struct scatterlist *sgl,
>                                   int nents, enum dma_data_direction dir)
> {
>     for_each_sg(sgl, sg, nents, i) {
>         arch_sync_dma_for_..(dev, sg_phys(sg), sg->length, dir);
>     }
> }
> EXPORT_SYMBOL_GPL(dma_sync_sg_for_..)
> 
> or did you have something else in mind?

No.  We really need an interface thay says please give me uncached
memory (for some definition of uncached that includes that grapics
drivers call write combine), and then let the architecture do the right
thing.  Basically dma_alloc_coherent with DMA_ATTR_NO_KERNEL_MAPPING
is superficially close to what you want, except that the way the drm
drivers work you can't actually use it.

The reason for that is if we can we really need to not create another
uncachable alias, but instead change the page attributes in place.
On x86 we can and must do that for example, and based on the
conversation with Will arm64 could do that fairly easily.  arm32 can
right now only do that for CMA, though.

The big question is what API do we want.  I had a pretty similar
discussion with Christian on doing such an allocation for amdgpu,
where the device normally is cache coherent, but they actually want
to turn it into non-coherent by using PCIe unsnooped transactions.

Here is my high level plan, which still has a few lose end:

 (1) provide a new API:

	struct page *dma_alloc_pages(struct device *dev, unsigned nr_pages,
			gfp_t gfp, unsigned long flags);
	void dma_free_pages(struct device *dev, unsigned nr_pages,
			unsigned long flags);

     These give you back page backed memory that is guaranteed to be
     addressable by the device (no swiotlb or similar).  The memory can
     then be mapped using dma_map*, including unmap and dma_sync to
     bounce ownership around.  This is the replacement for the current
     dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT API, that is rather
     badly defined.

 (2) Add support for DMA_ATTR_NO_KERNEL_MAPPING to this new API instead
     of dma_alloc_attrs.  The initial difference with that flag is just
     that we allow highmem, but in the future we could also unmap this
     memory from the kernel linear mapping entirely on architectures
     where we can easily do that.

 (3) Add a dma_pages_map/dma_pages_unmap or similar API that allows you
     to get a kernel mapping for parts or all of a
     DMA_ATTR_NO_KERNEL_MAPPING allocation.  This is to replace things
     like your open-coded vmap in msm (or similarly elsewhere in dma-buf
     providers).

 (4) Add support for a DMA_ATTR_UNCACHABLE flags (or similar) to the new
     API, that maps the pages as uncachable iff they have a kernel
     mapping, including invalidating the caches at time of this page
     attribute change (or creation of a new mapping).  This API will fail
     if the architecture does not allow in-place remapping.  Note that for
     arm32 we could always dip into the CMA pool if one is present to not
     fail.  We'll also need some helper to map from the DMA_ATTR_* flags
     to a pgprot for mapping the page to userspace.  There is also a few
     other weird bits here, e.g. on architectures like mips that use an
     uncached segment we'll have to fail use with the plain
     DMA_ATTR_UNCACHABLE flag, but it could be supported with
     DMA_ATTR_UNCACHABLE | DMA_ATTR_NO_KERNEL_MAPPING.

I was hoping to get most of this done for this merge window, but I'm
probably lucky if I get at least parts done.  Too much distraction.

> Hmm, not entirely sure why.. you should be on the cc list for each
> individual patch.

They finally made it, although even with the delay they only ended up
in the spam mailbox.  I still can't see them on the various mailing
lists.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-15 17:53     ` Christoph Hellwig
@ 2019-08-15 18:21       ` Koenig, Christian
  2019-08-15 18:27         ` Christoph Hellwig
  2019-08-16 21:04       ` Rob Clark
  1 sibling, 1 reply; 13+ messages in thread
From: Koenig, Christian @ 2019-08-15 18:21 UTC (permalink / raw)
  To: Christoph Hellwig, Rob Clark
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Emil Velikov, Sharma, Deepak, Michael Ellerman,
	Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Joerg Roedel,
	Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Rob Clark,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU

Am 15.08.19 um 19:53 schrieb Christoph Hellwig:
> On Thu, Aug 15, 2019 at 06:54:39AM -0700, Rob Clark wrote:
>> On Wed, Aug 14, 2019 at 11:51 PM Christoph Hellwig <hch@lst.de> wrote:
>>> As said before I don't think these low-level helpers are the
>>> right API to export, but even if they did you'd just cover a tiny
>>> subset of the architectures.
>> Are you thinking instead something like:
>>
>> void dma_sync_sg_for_{cpu,device}(struct device *dev, struct scatterlist *sgl,
>>                                    int nents, enum dma_data_direction dir)
>> {
>>      for_each_sg(sgl, sg, nents, i) {
>>          arch_sync_dma_for_..(dev, sg_phys(sg), sg->length, dir);
>>      }
>> }
>> EXPORT_SYMBOL_GPL(dma_sync_sg_for_..)
>>
>> or did you have something else in mind?
> No.  We really need an interface thay says please give me uncached
> memory (for some definition of uncached that includes that grapics
> drivers call write combine), and then let the architecture do the right
> thing.  Basically dma_alloc_coherent with DMA_ATTR_NO_KERNEL_MAPPING
> is superficially close to what you want, except that the way the drm
> drivers work you can't actually use it.

The terminology graphics driver use is USWC (Uncached Speculative Write 
Combine).

Essentially it is a leftover from the AGP days where the graphics 
devices couldn't snoop the CPU caches.

Nowadays they either don't want to snoop the CPU caches for performance 
reasons.

Or because of special requirements that certain areas of system memory 
are not cached for certain functionality.

For example the "VRAM" on AMD GPUs which are integrated into the CPU is 
just stolen system memory. Then you can scanout your picture to the 
display from this memory or "normal" system memory, but only if the 
system memory is mapped as USWC.

> The reason for that is if we can we really need to not create another
> uncachable alias, but instead change the page attributes in place.
> On x86 we can and must do that for example, and based on the
> conversation with Will arm64 could do that fairly easily.  arm32 can
> right now only do that for CMA, though.
>
> The big question is what API do we want.  I had a pretty similar
> discussion with Christian on doing such an allocation for amdgpu,
> where the device normally is cache coherent, but they actually want
> to turn it into non-coherent by using PCIe unsnooped transactions.
>
> Here is my high level plan, which still has a few lose end:
>
>   (1) provide a new API:
>
> 	struct page *dma_alloc_pages(struct device *dev, unsigned nr_pages,
> 			gfp_t gfp, unsigned long flags);
> 	void dma_free_pages(struct device *dev, unsigned nr_pages,
> 			unsigned long flags);
>
>       These give you back page backed memory that is guaranteed to be
>       addressable by the device (no swiotlb or similar).  The memory can
>       then be mapped using dma_map*, including unmap and dma_sync to
>       bounce ownership around.  This is the replacement for the current
>       dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT API, that is rather
>       badly defined.
>
>   (2) Add support for DMA_ATTR_NO_KERNEL_MAPPING to this new API instead
>       of dma_alloc_attrs.  The initial difference with that flag is just
>       that we allow highmem, but in the future we could also unmap this
>       memory from the kernel linear mapping entirely on architectures
>       where we can easily do that.

Mhm, why would we want to do this?

>
>   (3) Add a dma_pages_map/dma_pages_unmap or similar API that allows you
>       to get a kernel mapping for parts or all of a
>       DMA_ATTR_NO_KERNEL_MAPPING allocation.  This is to replace things
>       like your open-coded vmap in msm (or similarly elsewhere in dma-buf
>       providers).
>
>   (4) Add support for a DMA_ATTR_UNCACHABLE flags (or similar) to the new
>       API, that maps the pages as uncachable iff they have a kernel
>       mapping, including invalidating the caches at time of this page
>       attribute change (or creation of a new mapping).  This API will fail
>       if the architecture does not allow in-place remapping.  Note that for
>       arm32 we could always dip into the CMA pool if one is present to not
>       fail.  We'll also need some helper to map from the DMA_ATTR_* flags
>       to a pgprot for mapping the page to userspace.  There is also a few
>       other weird bits here, e.g. on architectures like mips that use an
>       uncached segment we'll have to fail use with the plain
>       DMA_ATTR_UNCACHABLE flag, but it could be supported with
>       DMA_ATTR_UNCACHABLE | DMA_ATTR_NO_KERNEL_MAPPING.
>
> I was hoping to get most of this done for this merge window, but I'm
> probably lucky if I get at least parts done.  Too much distraction.

Thanks again for taking care of this,
Christian.

>
>> Hmm, not entirely sure why.. you should be on the cc list for each
>> individual patch.
> They finally made it, although even with the delay they only ended up
> in the spam mailbox.  I still can't see them on the various mailing
> lists.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-15 18:21       ` Koenig, Christian
@ 2019-08-15 18:27         ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2019-08-15 18:27 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Christoph Hellwig, Emil Velikov, Rob Clark,
	Michael Ellerman, Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Sharma, Deepak,
	Joerg Roedel, Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Ard Biesheuvel,
	Enrico Weigelt, open list, Rob Clark, Souptick Joarder,
	Greg Kroah-Hartman, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU

On Thu, Aug 15, 2019 at 06:21:00PM +0000, Koenig, Christian wrote:
> >   (2) Add support for DMA_ATTR_NO_KERNEL_MAPPING to this new API instead
> >       of dma_alloc_attrs.  The initial difference with that flag is just
> >       that we allow highmem, but in the future we could also unmap this
> >       memory from the kernel linear mapping entirely on architectures
> >       where we can easily do that.
> 
> Mhm, why would we want to do this?

To avoid the CPU misspeculating into this memory.  For example NVMe SSDs
have a feature called host memory buffer that is a lot like your stolen
main ram for the GPU case.  We currently hand the SSD a
DMA_ATTR_NO_KERNEL_MAPPING allocation if it requests such a buffer.  If
possible we'd really like to make sure no speculative execution bug
(or intentional attacker with a kernel exploit for that matter) can easily
access that memory.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-15 17:53     ` Christoph Hellwig
  2019-08-15 18:21       ` Koenig, Christian
@ 2019-08-16 21:04       ` Rob Clark
  2019-08-19  5:23         ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread
From: Rob Clark @ 2019-08-16 21:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Emil Velikov, Deepak Sharma, Michael Ellerman,
	Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Joerg Roedel,
	Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Rob Clark,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU, christian.koenig

On Thu, Aug 15, 2019 at 10:53 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Aug 15, 2019 at 06:54:39AM -0700, Rob Clark wrote:
> > On Wed, Aug 14, 2019 at 11:51 PM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > As said before I don't think these low-level helpers are the
> > > right API to export, but even if they did you'd just cover a tiny
> > > subset of the architectures.
> >
> > Are you thinking instead something like:
> >
> > void dma_sync_sg_for_{cpu,device}(struct device *dev, struct scatterlist *sgl,
> >                                   int nents, enum dma_data_direction dir)
> > {
> >     for_each_sg(sgl, sg, nents, i) {
> >         arch_sync_dma_for_..(dev, sg_phys(sg), sg->length, dir);
> >     }
> > }
> > EXPORT_SYMBOL_GPL(dma_sync_sg_for_..)
> >
> > or did you have something else in mind?
>
> No.  We really need an interface thay says please give me uncached
> memory (for some definition of uncached that includes that grapics
> drivers call write combine), and then let the architecture do the right
> thing.  Basically dma_alloc_coherent with DMA_ATTR_NO_KERNEL_MAPPING
> is superficially close to what you want, except that the way the drm
> drivers work you can't actually use it.

I don't disagree about needing an API to get uncached memory (or
ideally just something outside of the linear map).  But I think this
is a separate problem.

What I was hoping for, for v5.4, is a way to stop abusing dma_map/sync
for cache ops to get rid of the hack I had to make for v5.3.  And also
to fix vgem on non-x86.  (Unfortunately changing vgem to used cached
mappings breaks x86 CI, but fixes CI on arm/arm64..)  We can do that
without any changes in allocation.  There is still the possibility for
problems due to cached alias, but that has been a problem this whole
time, it isn't something new.

BR,
-R

> The reason for that is if we can we really need to not create another
> uncachable alias, but instead change the page attributes in place.
> On x86 we can and must do that for example, and based on the
> conversation with Will arm64 could do that fairly easily.  arm32 can
> right now only do that for CMA, though.
>
> The big question is what API do we want.  I had a pretty similar
> discussion with Christian on doing such an allocation for amdgpu,
> where the device normally is cache coherent, but they actually want
> to turn it into non-coherent by using PCIe unsnooped transactions.
>
> Here is my high level plan, which still has a few lose end:
>
>  (1) provide a new API:
>
>         struct page *dma_alloc_pages(struct device *dev, unsigned nr_pages,
>                         gfp_t gfp, unsigned long flags);
>         void dma_free_pages(struct device *dev, unsigned nr_pages,
>                         unsigned long flags);
>
>      These give you back page backed memory that is guaranteed to be
>      addressable by the device (no swiotlb or similar).  The memory can
>      then be mapped using dma_map*, including unmap and dma_sync to
>      bounce ownership around.  This is the replacement for the current
>      dma_alloc_attrs with DMA_ATTR_NON_CONSISTENT API, that is rather
>      badly defined.
>
>  (2) Add support for DMA_ATTR_NO_KERNEL_MAPPING to this new API instead
>      of dma_alloc_attrs.  The initial difference with that flag is just
>      that we allow highmem, but in the future we could also unmap this
>      memory from the kernel linear mapping entirely on architectures
>      where we can easily do that.
>
>  (3) Add a dma_pages_map/dma_pages_unmap or similar API that allows you
>      to get a kernel mapping for parts or all of a
>      DMA_ATTR_NO_KERNEL_MAPPING allocation.  This is to replace things
>      like your open-coded vmap in msm (or similarly elsewhere in dma-buf
>      providers).
>
>  (4) Add support for a DMA_ATTR_UNCACHABLE flags (or similar) to the new
>      API, that maps the pages as uncachable iff they have a kernel
>      mapping, including invalidating the caches at time of this page
>      attribute change (or creation of a new mapping).  This API will fail
>      if the architecture does not allow in-place remapping.  Note that for
>      arm32 we could always dip into the CMA pool if one is present to not
>      fail.  We'll also need some helper to map from the DMA_ATTR_* flags
>      to a pgprot for mapping the page to userspace.  There is also a few
>      other weird bits here, e.g. on architectures like mips that use an
>      uncached segment we'll have to fail use with the plain
>      DMA_ATTR_UNCACHABLE flag, but it could be supported with
>      DMA_ATTR_UNCACHABLE | DMA_ATTR_NO_KERNEL_MAPPING.
>
> I was hoping to get most of this done for this merge window, but I'm
> probably lucky if I get at least parts done.  Too much distraction.
>
> > Hmm, not entirely sure why.. you should be on the cc list for each
> > individual patch.
>
> They finally made it, although even with the delay they only ended up
> in the spam mailbox.  I still can't see them on the various mailing
> lists.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-16 21:04       ` Rob Clark
@ 2019-08-19  5:23         ` Christoph Hellwig
  2019-08-19 14:39           ` Rob Clark
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2019-08-19  5:23 UTC (permalink / raw)
  To: Rob Clark
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Christoph Hellwig, Emil Velikov, Deepak Sharma,
	Michael Ellerman, Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Joerg Roedel,
	Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Rob Clark,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU, christian.koenig

On Fri, Aug 16, 2019 at 02:04:35PM -0700, Rob Clark wrote:
> I don't disagree about needing an API to get uncached memory (or
> ideally just something outside of the linear map).  But I think this
> is a separate problem.
> 
> What I was hoping for, for v5.4, is a way to stop abusing dma_map/sync
> for cache ops to get rid of the hack I had to make for v5.3.  And also
> to fix vgem on non-x86.  (Unfortunately changing vgem to used cached
> mappings breaks x86 CI, but fixes CI on arm/arm64..)  We can do that
> without any changes in allocation.  There is still the possibility for
> problems due to cached alias, but that has been a problem this whole
> time, it isn't something new.

But that just means we start exposing random low-level APIs that
people will quickly abuse..  In fact even your simple plan to some
extent already is an abuse of the intent of these functions, and
it also requires a lot of knowledge in the driver that in the normal
cases drivers can't know (e.g. is the device dma coherent or not).

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
  2019-08-19  5:23         ` Christoph Hellwig
@ 2019-08-19 14:39           ` Rob Clark
  0 siblings, 0 replies; 13+ messages in thread
From: Rob Clark @ 2019-08-19 14:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Emil Velikov, Deepak Sharma, Michael Ellerman,
	Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Joerg Roedel,
	Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Enrico Weigelt,
	Ard Biesheuvel, Greg Kroah-Hartman, open list, Rob Clark,
	Souptick Joarder, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU, christian.koenig

On Sun, Aug 18, 2019 at 10:23 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Fri, Aug 16, 2019 at 02:04:35PM -0700, Rob Clark wrote:
> > I don't disagree about needing an API to get uncached memory (or
> > ideally just something outside of the linear map).  But I think this
> > is a separate problem.
> >
> > What I was hoping for, for v5.4, is a way to stop abusing dma_map/sync
> > for cache ops to get rid of the hack I had to make for v5.3.  And also
> > to fix vgem on non-x86.  (Unfortunately changing vgem to used cached
> > mappings breaks x86 CI, but fixes CI on arm/arm64..)  We can do that
> > without any changes in allocation.  There is still the possibility for
> > problems due to cached alias, but that has been a problem this whole
> > time, it isn't something new.
>
> But that just means we start exposing random low-level APIs that
> people will quickly abuse..  In fact even your simple plan to some
> extent already is an abuse of the intent of these functions, and
> it also requires a lot of knowledge in the driver that in the normal
> cases drivers can't know (e.g. is the device dma coherent or not).

I can agree that most drivers should use the higher level APIs.. but
not that we must prevent *all* drivers from using them.  Most of what
DMA API is trying to solve doesn't apply to a driver like drm/msm..
which is how we ended up with hacks to try and misuse the high level
API to accomplish what we need.

Perhaps we can protect the prototypes with #ifdef LOWLEVEL_DMA_API /
#endif type thing to make it more obvious to other drivers that it
probably isn't the API they should use?

BR,
-R

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/6] drm+dma: cache support for arm, etc
       [not found] <215e5cb7-0fcf-48db-a656-817054dde420@email.android.com>
@ 2019-08-15 18:52 ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2019-08-15 18:52 UTC (permalink / raw)
  To: Koenig, Christian
  Cc: Kate Stewart, Masayoshi Mizuma, Maciej W. Rozycki, Eric Biggers,
	Catalin Marinas, Imre Deak, dri-devel, Chris Wilson,
	Masahiro Yamada, Benjamin Gaignard, Mauro Carvalho Chehab,
	Will Deacon, Christoph Hellwig, Emil Velikov, Rob Clark,
	Michael Ellerman, Paul Burton, Mike Rapoport, Geert Uytterhoeven,
	moderated list:ARM64 PORT (AARCH64 ARCHITECTURE),
	Daniel Vetter, open list:MIPS, Linus Walleij, Robin Murphy,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Sharma, Deepak,
	Joerg Roedel, Arnd Bergmann, Anshuman Khandual, Hauke Mehrtens,
	Jesper Dangaard Brouer, Wolfram Sang (Renesas),
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Alexios Zavras, Russell King, Doug Anderson, Thomas Gleixner,
	Sean Paul, Allison Randal, Christophe Leroy, Ard Biesheuvel,
	Enrico Weigelt, open list, Rob Clark, Souptick Joarder,
	Greg Kroah-Hartman, Andrew Morton,
	open list:DRM DRIVER FOR MSM ADRENO GPU

On Thu, Aug 15, 2019 at 06:48:39PM +0000, Koenig, Christian wrote:
> Well, for the graphics case I absolutely need to keep the linear kernel mapping. Because for certain use cases the memory is accessed by the kernel all the time as well.

Then don't set DMA_ATTR_NO_KERNEL_MAPPING.  At least for x86 and arm64
we should be able to support uncached allocations easily even without
that, and I plan to support that for your use case.  But if the driver
is explicitly saying it doesn't want a permanent kernel mapping it makes
sense to (optionally if the architecture can easily do it) unmap the
memory from the kernel linear mapping.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-08-19 14:39 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-14 21:59 [PATCH 0/6] drm+dma: cache support for arm, etc Rob Clark
2019-08-14 21:59 ` [PATCH 1/6] arm64: export arch_sync_dma_for_*() Rob Clark
2019-08-14 21:59 ` [PATCH 2/6] mips: " Rob Clark
2019-08-14 21:59 ` [PATCH 4/6] arm: add arch_sync_dma_for_*() Rob Clark
2019-08-15  6:51 ` [PATCH 0/6] drm+dma: cache support for arm, etc Christoph Hellwig
2019-08-15 13:54   ` Rob Clark
2019-08-15 17:53     ` Christoph Hellwig
2019-08-15 18:21       ` Koenig, Christian
2019-08-15 18:27         ` Christoph Hellwig
2019-08-16 21:04       ` Rob Clark
2019-08-19  5:23         ` Christoph Hellwig
2019-08-19 14:39           ` Rob Clark
     [not found] <215e5cb7-0fcf-48db-a656-817054dde420@email.android.com>
2019-08-15 18:52 ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).