IOMMU Archive on lore.kernel.org
 help / color / Atom feed
* [RFC PATCH 0/3] fix dma_mask for CCW devices 
@ 2019-09-23 12:34 Halil Pasic
  2019-09-23 12:34 ` [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable Halil Pasic
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Halil Pasic @ 2019-09-23 12:34 UTC (permalink / raw)
  To: Christoph Hellwig, Gerald Schaefer
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel, Halil Pasic,
	Christian Borntraeger, iommu

Commit 37db8985b211 ("s390/cio: add basic protected virtualization
support") breaks virtio-ccw devices with VIRTIO_F_IOMMU_PLATFORM for non
Protected Virtualization (PV) guests. The problem is that the dma_mask
of the CCW device, which is used by virtio core, gets changed from 64 to
31 bit. This is done because some of the DMA allocations do require 31
bit addressable memory, but it has unfavorable side effects. 

For PV the only drawback is that some of the virtio structures must end
up in ZONE_DMA (with PV we have the bounce the buffers mapped via DMA
API anyway).

But for non PV guests we have a problem: because of the 31 bit mask
guests bigger than 2G are likely to try bouncing buffers. The swiotlb
however is only initialized for PV guests (because we don't want to
bounce anything for non PV guests). The first map of a buffer with
an address beyond 0x7fffffff kills the guest.

This series sets out to fix this problem by first making the GPF_DMA
flag count for DMA API allocations -- on s390 at least.  Then we set
dma_mask to 64 bit and do the allocations for the memory that needs to
be 31 bit addressable with the GPF_DMA flag.

For CCW devices we could probably just not clear any GFP flags at
all but, I decided to be conservative and change only what really needs
to be changed.

I'm not perfectly satisfied with this solution, but I believe it is good
enough, and I can't think of anything better at the moment. Ideas
welcome.

Halil Pasic (3):
  dma-mapping: make overriding GFP_* flags arch customizable
  s390/virtio: fix virtio-ccw DMA without PV
  dma-mapping: warn on harmful GFP_* flags

 arch/s390/Kconfig             |  1 +
 arch/s390/include/asm/cio.h   |  5 +++--
 arch/s390/mm/init.c           | 20 ++++++++++++++++++++
 drivers/s390/cio/css.c        | 16 +++++++++-------
 drivers/s390/cio/device.c     |  5 +++--
 drivers/s390/cio/device_ops.c |  3 ++-
 include/linux/dma-mapping.h   | 13 +++++++++++++
 kernel/dma/Kconfig            |  6 ++++++
 kernel/dma/mapping.c          |  4 +---
 9 files changed, 58 insertions(+), 15 deletions(-)

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable
  2019-09-23 12:34 [RFC PATCH 0/3] fix dma_mask for CCW devices Halil Pasic
@ 2019-09-23 12:34 ` Halil Pasic
  2019-09-23 15:21   ` Christoph Hellwig
  2019-09-23 12:34 ` [RFC PATCH 2/3] s390/virtio: fix virtio-ccw DMA without PV Halil Pasic
  2019-09-23 12:34 ` [RFC PATCH 3/3] dma-mapping: warn on harmful GFP_* flags Halil Pasic
  2 siblings, 1 reply; 9+ messages in thread
From: Halil Pasic @ 2019-09-23 12:34 UTC (permalink / raw)
  To: Christoph Hellwig, Gerald Schaefer
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel, Halil Pasic,
	Christian Borntraeger, iommu

Before commit 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in
common code") tweaking the client code supplied GFP_* flags used to be
an issue handled in the architecture specific code. The commit message
suggests, that fixing the client code would actually be a better way
of dealing with this.

On s390 common I/O devices are generally capable of using the full 64
bit address space for DMA I/O, but some chunks of the DMA memory need to
be 31 bit addressable (in physical address space) because the
instructions involved mandate it. Before switching to DMA API this used
to be a non-issue, we used to allocate those chunks from ZONE_DMA.
Currently our only option with the DMA API is to restrict the devices to
(via dma_mask and dma_mask_coherent) to 31 bit, which is sub-optimal.

Thus s390 we would benefit form having control over what flags are
dropped.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 include/linux/dma-mapping.h | 10 ++++++++++
 kernel/dma/Kconfig          |  6 ++++++
 kernel/dma/mapping.c        |  4 +---
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 4a1c4fca475a..5024bc863fa7 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -817,4 +817,14 @@ static inline int dma_mmap_wc(struct device *dev,
 #define dma_unmap_len_set(PTR, LEN_NAME, VAL)    do { } while (0)
 #endif
 
+#ifdef CONFIG_ARCH_HAS_DMA_OVERRIDE_GFP_FLAGS
+extern gfp_t dma_override_gfp_flags(struct device *dev, gfp_t flags);
+#else
+static inline gfp_t dma_override_gfp_flags(struct device *dev, gfp_t flags)
+{
+	/* let the implementation decide on the zone to allocate from: */
+	return flags & ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
+}
+#endif
+
 #endif
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 73c5c2b8e824..4756c75047e3 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -54,6 +54,12 @@ config ARCH_HAS_DMA_PREP_COHERENT
 config ARCH_HAS_DMA_COHERENT_TO_PFN
 	bool
 
+config ARCH_HAS_DMA_MMAP_PGPROT
+	bool
+
+config ARCH_HAS_DMA_OVERRIDE_GFP_FLAGS
+	bool
+
 config ARCH_HAS_FORCE_DMA_UNENCRYPTED
 	bool
 
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index d9334f31a5af..535b809548e2 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -303,9 +303,7 @@ void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 	if (dma_alloc_from_dev_coherent(dev, size, dma_handle, &cpu_addr))
 		return cpu_addr;
 
-	/* let the implementation decide on the zone to allocate from: */
-	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
-
+	flag = dma_override_gfp_flags(dev, flag);
 	if (dma_is_direct(ops))
 		cpu_addr = dma_direct_alloc(dev, size, dma_handle, flag, attrs);
 	else if (ops->alloc)
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC PATCH 2/3] s390/virtio: fix virtio-ccw DMA without PV
  2019-09-23 12:34 [RFC PATCH 0/3] fix dma_mask for CCW devices Halil Pasic
  2019-09-23 12:34 ` [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable Halil Pasic
@ 2019-09-23 12:34 ` Halil Pasic
  2019-09-23 12:34 ` [RFC PATCH 3/3] dma-mapping: warn on harmful GFP_* flags Halil Pasic
  2 siblings, 0 replies; 9+ messages in thread
From: Halil Pasic @ 2019-09-23 12:34 UTC (permalink / raw)
  To: Christoph Hellwig, Gerald Schaefer
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel, Halil Pasic,
	Christian Borntraeger, iommu

Commit 37db8985b211 ("s390/cio: add basic protected virtualization
support") breaks virtio-ccw devices with VIRTIO_F_IOMMU_PLATFORM for non
Protected Virtualization (PV) guests. The problem is that the dma_mask
of the ccw device, which is used by virtio core, gets changed from 64 to
31 bit, because some of the DMA allocations do require 31 bit
addressable memory. For PV the only drawback is that some of the virtio
structures must end up in ZONE_DMA because we have the bounce the
buffers mapped via DMA API anyway.

But for non PV guests we have a problem: because of the 31 bit mask
guests bigger than 2G are likely to try bouncing buffers. The swiotlb
however is only initialized for PV guests, because we don't want to
bounce anything for non PV guests. The first such map kills the guest.

Let us go back to differentiating between allocations that need to be
from ZONE_DMA and the ones that don't using the GFP_DMA flag. For that
we need to make sure dma_override_gfp_flags() won't clear away GFP_DMA
like it does by default. Then we can fix the dma_mask. CCW devices are
very well capable of  DMA-ing data to/from any address, it is just that
certain things need do be 31 addressable because of the 31 bit heritage.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: 37db8985b211 ("s390/cio: add basic protected virtualization support")
---

I was conservative about preserving old behavior for PCI. Could we just
not clear any flags in dma_override_gfp_flags()?
---
 arch/s390/Kconfig             |  1 +
 arch/s390/include/asm/cio.h   |  5 +++--
 arch/s390/mm/init.c           | 20 ++++++++++++++++++++
 drivers/s390/cio/css.c        | 16 +++++++++-------
 drivers/s390/cio/device.c     |  5 +++--
 drivers/s390/cio/device_ops.c |  3 ++-
 6 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index f933a473b128..e61351b61ce7 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -60,6 +60,7 @@ config S390
 	def_bool y
 	select ARCH_BINFMT_ELF_STATE
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
+	select ARCH_HAS_DMA_OVERRIDE_GFP_FLAGS
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index b5bfb3123cb1..32041ec48170 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -364,10 +364,11 @@ extern void cio_dma_free(void *cpu_addr, size_t size);
 extern struct device *cio_get_dma_css_dev(void);
 
 void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
-			size_t size);
+			size_t size, gfp_t flags);
 void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size);
 void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev);
-struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages);
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages,
+				   gfp_t flags);
 
 /* Function from drivers/s390/cio/chsc.c */
 int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index a124f19f7b3c..757e2cc60a1a 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -161,6 +161,26 @@ bool force_dma_unencrypted(struct device *dev)
 	return is_prot_virt_guest();
 }
 
+
+gfp_t dma_override_gfp_flags(struct device *dev, gfp_t flags)
+{
+	gfp_t  taboo_mask;
+	const char *taboo_msg;
+
+	if (dma_is_direct(dev->dma_ops)) {
+		/* cio: we have to mix in some allocations from ZONE_DMA */
+		taboo_mask = __GFP_DMA32 | __GFP_HIGHMEM;
+		taboo_msg = "__GFP_DMA32, __GFP_HIGHMEM";
+	} else {
+		/* pci */
+		taboo_mask = __GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM;
+		taboo_msg = " __GFP_DMA, __GFP_DMA32, __GFP_HIGHMEM,";
+	}
+	dev_WARN_ONCE(dev, flags & taboo_mask,
+		      "fixme: don't dma_alloc with any of: %s\n", taboo_msg);
+	return flags & ~taboo_mask;
+}
+
 /* protected virtualization */
 static void pv_init(void)
 {
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 22c55816100b..3115602384d7 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -231,7 +231,7 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
 	 * The physical addresses of some the dma structures that can
 	 * belong to a subchannel need to fit 31 bit width (e.g. ccw).
 	 */
-	sch->dev.coherent_dma_mask = DMA_BIT_MASK(31);
+	sch->dev.coherent_dma_mask = DMA_BIT_MASK(64);
 	sch->dev.dma_mask = &sch->dev.coherent_dma_mask;
 	return sch;
 
@@ -1091,7 +1091,8 @@ struct device *cio_get_dma_css_dev(void)
 	return &channel_subsystems[0]->device;
 }
 
-struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev,
+				   int nr_pages, gfp_t flags)
 {
 	struct gen_pool *gp_dma;
 	void *cpu_addr;
@@ -1103,7 +1104,7 @@ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
 		return NULL;
 	for (i = 0; i < nr_pages; ++i) {
 		cpu_addr = dma_alloc_coherent(dma_dev, PAGE_SIZE, &dma_addr,
-					      CIO_DMA_GFP);
+					      flags);
 		if (!cpu_addr)
 			return gp_dma;
 		gen_pool_add_virt(gp_dma, (unsigned long) cpu_addr,
@@ -1134,14 +1135,14 @@ void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev)
 static int cio_dma_pool_init(void)
 {
 	/* No need to free up the resources: compiled in */
-	cio_dma_pool = cio_gp_dma_create(cio_get_dma_css_dev(), 1);
+	cio_dma_pool = cio_gp_dma_create(cio_get_dma_css_dev(), 1, CIO_DMA_GFP);
 	if (!cio_dma_pool)
 		return -ENOMEM;
 	return 0;
 }
 
 void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
-			size_t size)
+			size_t size, gfp_t flags)
 {
 	dma_addr_t dma_addr;
 	unsigned long addr;
@@ -1153,7 +1154,7 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
 	while (!addr) {
 		chunk_size = round_up(size, PAGE_SIZE);
 		addr = (unsigned long) dma_alloc_coherent(dma_dev,
-					 chunk_size, &dma_addr, CIO_DMA_GFP);
+					 chunk_size, &dma_addr, flags);
 		if (!addr)
 			return NULL;
 		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
@@ -1179,7 +1180,8 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
  */
 void *cio_dma_zalloc(size_t size)
 {
-	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
+	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(),
+				 size, CIO_DMA_GFP);
 }
 
 void cio_dma_free(void *cpu_addr, size_t size)
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 131430bd48d9..7726e9c4d719 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -711,12 +711,13 @@ static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 		goto err_priv;
 	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
 	cdev->dev.dma_mask = &cdev->dev.coherent_dma_mask;
-	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
+	dma_pool = cio_gp_dma_create(&cdev->dev, 1, GFP_KERNEL | GFP_DMA);
 	if (!dma_pool)
 		goto err_dma_pool;
 	cdev->private->dma_pool = dma_pool;
 	cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev,
-					sizeof(*cdev->private->dma_area));
+					sizeof(*cdev->private->dma_area),
+					GFP_KERNEL | GFP_DMA);
 	if (!cdev->private->dma_area)
 		goto err_dma_area;
 	return cdev;
diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
index d722458c5928..57dcb6175b71 100644
--- a/drivers/s390/cio/device_ops.c
+++ b/drivers/s390/cio/device_ops.c
@@ -706,7 +706,8 @@ EXPORT_SYMBOL_GPL(ccw_device_get_schid);
  */
 void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
 {
-	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size);
+	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size,
+				 GFP_KERNEL | GFP_DMA);
 }
 EXPORT_SYMBOL(ccw_device_dma_zalloc);
 
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC PATCH 3/3] dma-mapping: warn on harmful GFP_* flags
  2019-09-23 12:34 [RFC PATCH 0/3] fix dma_mask for CCW devices Halil Pasic
  2019-09-23 12:34 ` [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable Halil Pasic
  2019-09-23 12:34 ` [RFC PATCH 2/3] s390/virtio: fix virtio-ccw DMA without PV Halil Pasic
@ 2019-09-23 12:34 ` Halil Pasic
  2 siblings, 0 replies; 9+ messages in thread
From: Halil Pasic @ 2019-09-23 12:34 UTC (permalink / raw)
  To: Christoph Hellwig, Gerald Schaefer
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel, Halil Pasic,
	Christian Borntraeger, iommu

The commit message of commit 57bf5a8963f8 ("dma-mapping: clear harmful
GFP_* flags in common code") says that probably warn when we encounter
harmful GFP_* flags which we clean -- because the client code is best
case silly if not buggy. I concur with that.

Let's warn once when we encounter silly GFP_* flags. The guys caring
about the respective client code will hopefully fix these soon.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---

I'm not too happy with my warning message. Suggestions welcome!
---
 include/linux/dma-mapping.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 5024bc863fa7..299f36ac8668 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -823,6 +823,9 @@ extern gfp_t dma_override_gfp_flags(struct device *dev, gfp_t flags);
 static inline gfp_t dma_override_gfp_flags(struct device *dev, gfp_t flags)
 {
 	/* let the implementation decide on the zone to allocate from: */
+	dev_WARN_ONCE(dev,
+		      flags & (__GFP_DMA32 | __GFP_DMA | __GFP_HIGHMEM),
+		      "fixme: don't dma_alloc with any of: __GFP_DMA32, __GFP_DMA, __GFP_HIGHMEM\n");
 	return flags & ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM);
 }
 #endif
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable
  2019-09-23 12:34 ` [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable Halil Pasic
@ 2019-09-23 15:21   ` Christoph Hellwig
  2019-09-26 12:37     ` Halil Pasic
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2019-09-23 15:21 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel,
	Christian Borntraeger, iommu, Christoph Hellwig, Gerald Schaefer

On Mon, Sep 23, 2019 at 02:34:16PM +0200, Halil Pasic wrote:
> Before commit 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in
> common code") tweaking the client code supplied GFP_* flags used to be
> an issue handled in the architecture specific code. The commit message
> suggests, that fixing the client code would actually be a better way
> of dealing with this.
> 
> On s390 common I/O devices are generally capable of using the full 64
> bit address space for DMA I/O, but some chunks of the DMA memory need to
> be 31 bit addressable (in physical address space) because the
> instructions involved mandate it. Before switching to DMA API this used
> to be a non-issue, we used to allocate those chunks from ZONE_DMA.
> Currently our only option with the DMA API is to restrict the devices to
> (via dma_mask and dma_mask_coherent) to 31 bit, which is sub-optimal.
> 
> Thus s390 we would benefit form having control over what flags are
> dropped.

No way, sorry.  You need to express that using a dma mask instead of
overloading the GFP flags.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable
  2019-09-23 15:21   ` Christoph Hellwig
@ 2019-09-26 12:37     ` Halil Pasic
  2019-09-26 13:04       ` Robin Murphy
  0 siblings, 1 reply; 9+ messages in thread
From: Halil Pasic @ 2019-09-26 12:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel,
	Christian Borntraeger, iommu, Gerald Schaefer

On Mon, 23 Sep 2019 17:21:17 +0200
Christoph Hellwig <hch@lst.de> wrote:

> On Mon, Sep 23, 2019 at 02:34:16PM +0200, Halil Pasic wrote:
> > Before commit 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in
> > common code") tweaking the client code supplied GFP_* flags used to be
> > an issue handled in the architecture specific code. The commit message
> > suggests, that fixing the client code would actually be a better way
> > of dealing with this.
> > 
> > On s390 common I/O devices are generally capable of using the full 64
> > bit address space for DMA I/O, but some chunks of the DMA memory need to
> > be 31 bit addressable (in physical address space) because the
> > instructions involved mandate it. Before switching to DMA API this used
> > to be a non-issue, we used to allocate those chunks from ZONE_DMA.
> > Currently our only option with the DMA API is to restrict the devices to
> > (via dma_mask and dma_mask_coherent) to 31 bit, which is sub-optimal.
> > 
> > Thus s390 we would benefit form having control over what flags are
> > dropped.
> 
> No way, sorry.  You need to express that using a dma mask instead of
> overloading the GFP flags.

Thanks for your feedback and sorry for the delay. Can you help me figure
out how can I express that using a dma mask? 

IMHO what you ask from me is frankly impossible.

What I need is the ability to ask for  (considering the physical
address) 31 bit addressable DMA memory if the chunk is supposed to host
control-type data that needs to be 31 bit addressable because that is
how the architecture is, without affecting the normal data-path. So
normally 64 bit mask is fine but occasionally (control) we would need
a 31 bit mask.

Regards,
Halil

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable
  2019-09-26 12:37     ` Halil Pasic
@ 2019-09-26 13:04       ` Robin Murphy
  2019-09-27  0:33         ` Halil Pasic
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Murphy @ 2019-09-26 13:04 UTC (permalink / raw)
  To: Halil Pasic, Christoph Hellwig
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel,
	Christian Borntraeger, iommu, Gerald Schaefer

On 26/09/2019 13:37, Halil Pasic wrote:
> On Mon, 23 Sep 2019 17:21:17 +0200
> Christoph Hellwig <hch@lst.de> wrote:
> 
>> On Mon, Sep 23, 2019 at 02:34:16PM +0200, Halil Pasic wrote:
>>> Before commit 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in
>>> common code") tweaking the client code supplied GFP_* flags used to be
>>> an issue handled in the architecture specific code. The commit message
>>> suggests, that fixing the client code would actually be a better way
>>> of dealing with this.
>>>
>>> On s390 common I/O devices are generally capable of using the full 64
>>> bit address space for DMA I/O, but some chunks of the DMA memory need to
>>> be 31 bit addressable (in physical address space) because the
>>> instructions involved mandate it. Before switching to DMA API this used
>>> to be a non-issue, we used to allocate those chunks from ZONE_DMA.
>>> Currently our only option with the DMA API is to restrict the devices to
>>> (via dma_mask and dma_mask_coherent) to 31 bit, which is sub-optimal.
>>>
>>> Thus s390 we would benefit form having control over what flags are
>>> dropped.
>>
>> No way, sorry.  You need to express that using a dma mask instead of
>> overloading the GFP flags.
> 
> Thanks for your feedback and sorry for the delay. Can you help me figure
> out how can I express that using a dma mask?
> 
> IMHO what you ask from me is frankly impossible.
> 
> What I need is the ability to ask for  (considering the physical
> address) 31 bit addressable DMA memory if the chunk is supposed to host
> control-type data that needs to be 31 bit addressable because that is
> how the architecture is, without affecting the normal data-path. So
> normally 64 bit mask is fine but occasionally (control) we would need
> a 31 bit mask.

If it's possible to rework the "data" path to use streaming mappings 
instead of coherent allocations, you could potentially mimic what virtio 
does for a similar situation - see commit a0be1db4304f.

Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable
  2019-09-26 13:04       ` Robin Murphy
@ 2019-09-27  0:33         ` Halil Pasic
  2019-09-27 21:21           ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Halil Pasic @ 2019-09-27  0:33 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel,
	Christian Borntraeger, iommu, Christoph Hellwig, Gerald Schaefer

On Thu, 26 Sep 2019 14:04:13 +0100
Robin Murphy <robin.murphy@arm.com> wrote:

> On 26/09/2019 13:37, Halil Pasic wrote:
> > On Mon, 23 Sep 2019 17:21:17 +0200
> > Christoph Hellwig <hch@lst.de> wrote:
> > 
> >> On Mon, Sep 23, 2019 at 02:34:16PM +0200, Halil Pasic wrote:
> >>> Before commit 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in
> >>> common code") tweaking the client code supplied GFP_* flags used to be
> >>> an issue handled in the architecture specific code. The commit message
> >>> suggests, that fixing the client code would actually be a better way
> >>> of dealing with this.
> >>>
> >>> On s390 common I/O devices are generally capable of using the full 64
> >>> bit address space for DMA I/O, but some chunks of the DMA memory need to
> >>> be 31 bit addressable (in physical address space) because the
> >>> instructions involved mandate it. Before switching to DMA API this used
> >>> to be a non-issue, we used to allocate those chunks from ZONE_DMA.
> >>> Currently our only option with the DMA API is to restrict the devices to
> >>> (via dma_mask and coherent_dma_mask) to 31 bit, which is sub-optimal.
> >>>
> >>> Thus s390 we would benefit form having control over what flags are
> >>> dropped.
> >>
> >> No way, sorry.  You need to express that using a dma mask instead of
> >> overloading the GFP flags.
> > 
> > Thanks for your feedback and sorry for the delay. Can you help me figure
> > out how can I express that using a dma mask?
> > 
> > IMHO what you ask from me is frankly impossible.
> > 
> > What I need is the ability to ask for  (considering the physical
> > address) 31 bit addressable DMA memory if the chunk is supposed to host
> > control-type data that needs to be 31 bit addressable because that is
> > how the architecture is, without affecting the normal data-path. So
> > normally 64 bit mask is fine but occasionally (control) we would need
> > a 31 bit mask.
> 
> If it's possible to rework the "data" path to use streaming mappings 
> instead of coherent allocations, you could potentially mimic what virtio 
> does for a similar situation - see commit a0be1db4304f.
> 

Thank you for your feedback. Just to be sure we are on the same pager, I
read commit a0be1db4304f like this:
1) virtio_pci_legacy needs to allocate the virtqueues so that the base
address fits 44 bits
2) if 64 bit dma is possible they set coherent_dma_mask to
  DMA_BIT_MASK(44) and dma_mask to DMA_BIT_MASK(64)
3) since the queues get allocated with coherent allocations 1) is
satisfied
4) when the streaming mappings see a buffer that is beyond
  DMA_BIT_MASK(44) then it has to treat it as not coherent memory
  and do the syncing magic (which isn't actually required, just
  a side effect of the workaround.

I'm actually trying to get virtio_ccw working nice with Protected
Virtualization (best think of encrypted memory). So the "data" path
is mostly the same as for virtio_pci.

But unlike virtio_pci_legacy we are perfectly fine with virtqueues at
an arbitrary address.

We can make
coherent_dma_mask == DMA_BIT_MASK(31) != dma_mask == DMA_BIT_MASK(64)
but that affects all dma coherent allocations and needlessly force
the virtio control structures into the  [0..2G] range. Furthermore this
whole issue has nothing to do with memory coherence. For ccw devices
memory at addresses above 2G is no less coherent for ccw devices than
memory at addresses below 2G.

I've already implemented a patch (see after the scissors line) that
takes a similar route as commit a0be1db4304f, but I consider that a
workaround at best. But if that is what the community wants... I have to
get the job done one way or the other.

Many thanks for your help and your time.

-------------------------------8<------------------------------------

From: Halil Pasic <pasic@linux.ibm.com>
Date: Thu, 25 Jul 2019 18:44:21 +0200
Subject: [PATCH 1/1] s390/cio: fix virtio-ccw DMA without PV

Commit 37db8985b211 ("s390/cio: add basic protected virtualization
support") breaks virtio-ccw devices with VIRTIO_F_IOMMU_PLATFORM for non
Protected Virtualization (PV) guests. The problem is that the dma_mask
of the ccw device, which is used by virtio core, gets changed from 64 to
31 bit, because some of the DMA allocations do require 31 bit
addressable memory. For PV the only drawback is that some of the virtio
structures must end up in ZONE_DMA because we have the bounce the
buffers mapped via DMA API anyway.

But for non PV guests we have a problem: because of the 31 bit mask
guests bigger than 2G are likely to try bouncing buffers. The swiotlb
however is only initialized for PV guests, because we don't want to
bounce anything for non PV guests. The first such map kills the guest.

Since the DMA API won't allow us to specify for each allocating whether
we need memory from ZONE_DMA (31 bit addressable) or any DMA capable
memory will do, let us abuse coherent_dma_mask (which is used for
allocations) to force allocating form ZONE_DMA while changing dma_mask
to DMA_BIT_MASK(64).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Fixes: 37db8985b211 ("s390/cio: add basic protected virtualization support")
---
 drivers/s390/cio/cio.h    | 1 +
 drivers/s390/cio/css.c    | 8 +++++++-
 drivers/s390/cio/device.c | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/cio/cio.h b/drivers/s390/cio/cio.h
index ba7d248..dcdaba6 100644
--- a/drivers/s390/cio/cio.h
+++ b/drivers/s390/cio/cio.h
@@ -113,6 +113,7 @@ struct subchannel {
 	enum sch_todo todo;
 	struct work_struct todo_work;
 	struct schib_config config;
+	u64 dma_mask;
 	char *driver_override; /* Driver name to force a match */
 } __attribute__ ((aligned(8)));
 
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 22c5581..d63e80a 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -232,7 +232,13 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
 	 * belong to a subchannel need to fit 31 bit width (e.g. ccw).
 	 */
 	sch->dev.coherent_dma_mask = DMA_BIT_MASK(31);
-	sch->dev.dma_mask = &sch->dev.coherent_dma_mask;
+	/*
+	 * But we don't have such restrictions imposed on the stuff that
+	 * is handled by the streaming API. Using coherent_dma_mask != dma_mask
+	 * is just a workaround.
+	 */
+	sch->dma_mask = DMA_BIT_MASK(64);
+	sch->dev.dma_mask = &sch->dma_mask;
 	return sch;
 
 err:
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 131430b..0c6245f 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -710,7 +710,7 @@ static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 	if (!cdev->private)
 		goto err_priv;
 	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
-	cdev->dev.dma_mask = &cdev->dev.coherent_dma_mask;
+	cdev->dev.dma_mask = sch->dev.dma_mask;
 	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
 	if (!dma_pool)
 		goto err_dma_pool;
-- 
2.5.5




_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable
  2019-09-27  0:33         ` Halil Pasic
@ 2019-09-27 21:21           ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2019-09-27 21:21 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, Janosch Frank, Vasily Gorbik, Cornelia Huck,
	Heiko Carstens, Peter Oberparleiter, linux-kernel,
	Christian Borntraeger, iommu, Robin Murphy, Christoph Hellwig,
	Gerald Schaefer

On Fri, Sep 27, 2019 at 02:33:14AM +0200, Halil Pasic wrote:
> Thank you for your feedback. Just to be sure we are on the same pager, I
> read commit a0be1db4304f like this:
> 1) virtio_pci_legacy needs to allocate the virtqueues so that the base
> address fits 44 bits
> 2) if 64 bit dma is possible they set coherent_dma_mask to
>   DMA_BIT_MASK(44) and dma_mask to DMA_BIT_MASK(64)
> 3) since the queues get allocated with coherent allocations 1) is
> satisfied
> 4) when the streaming mappings see a buffer that is beyond
>   DMA_BIT_MASK(44) then it has to treat it as not coherent memory
>   and do the syncing magic (which isn't actually required, just
>   a side effect of the workaround.

1-3 is correct, 4 is not.  The coherent mask is a little misnamed and
doesn't have to anything with coherency.  It is the mask for DMA
allocations, while the dma mask is for streaming mappings.

> I've already implemented a patch (see after the scissors line) that
> takes a similar route as commit a0be1db4304f, but I consider that a
> workaround at best. But if that is what the community wants... I have to
> get the job done one way or the other.

That patch (minus the comments about being a workaround) is what you
should have done from the beginning.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23 12:34 [RFC PATCH 0/3] fix dma_mask for CCW devices Halil Pasic
2019-09-23 12:34 ` [RFC PATCH 1/3] dma-mapping: make overriding GFP_* flags arch customizable Halil Pasic
2019-09-23 15:21   ` Christoph Hellwig
2019-09-26 12:37     ` Halil Pasic
2019-09-26 13:04       ` Robin Murphy
2019-09-27  0:33         ` Halil Pasic
2019-09-27 21:21           ` Christoph Hellwig
2019-09-23 12:34 ` [RFC PATCH 2/3] s390/virtio: fix virtio-ccw DMA without PV Halil Pasic
2019-09-23 12:34 ` [RFC PATCH 3/3] dma-mapping: warn on harmful GFP_* flags Halil Pasic

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git