All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/12] s390: virtio: support protected virtualization
@ 2019-04-04 23:16 Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Halil Pasic
                   ` (13 more replies)
  0 siblings, 14 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Enhanced virtualization protection technology may require the use of
bounce buffers for I/O. While support for this was built into the virtio
core,  virtio-ccw wasn't changed accordingly.

Thus what needs to be done to bring virtio-ccw up to speed with respect
to this is:
* use some 'new' common virtio stuff
* make sure that virtio-ccw specific stuff uses shared memory when
  talking to the hypervisor (except communication blocks like ORB, these
  are handled by the hypervisor)
* make sure the DMA API does what is necessary to talk through shared
  memory if we are a protected virtualization guest.
* make sure the common IO layer plays along as well (airqs, sense).

The series is structured in incremental fashion: some of the changes are
overridden by following patches. The main reason why is that this is how I
developed. But I think it ain't bad for the didactic and we are a bit more
flexible with regards to throwing out some of the stuff in the end.

Important notes:

* This is an early (WIP) RFC that does not add any function to the
  kernel at his stage, as the ultravisor interactions are left out.
  The purpose is getting some early feedback ASAP.

* In future these patches will depend on some code interacting with the
  ultravisor (WIP by Vasily and Janosch).

* The s390 names are by no means final, and are not properly explained. Should
  not hamper understanding too much. If it does please ask.

* The existing naming in the common infrastructure (kernel internal
  interfaces) is pretty much based on the AMD SEV terminology. Thus the
  names aren't always perfect. There might be merit to changing these
  names to more abstract ones. I did not put much thought into that at
  the current stage.


Testing:

Please use iommu_platform=on for any virtio devices you are going
to test this code with (so virtio actually uses the DMA API).

Looking forward to your review or any other type of input.

Halil Pasic (12):
  virtio/s390: use vring_create_virtqueue
  virtio/s390: DMA support for virtio-ccw
  s390/mm: force swiotlb for protected virtualization
  s390/cio: introduce cio DMA pool
  s390/cio: add protected virtualization support to cio
  s390/airq: use DMA memory for adapter interrupts
  virtio/s390: use DMA memory for ccw I/O
  virtio/s390: add indirection to indicators access
  virtio/s390: use DMA memory for notifiers
  virtio/s390: consolidate DMA allocations
  virtio/s390: use the cio DMA pool
  virtio/s390:  make airq summary indicators DMA

 arch/s390/Kconfig                   |   5 +
 arch/s390/include/asm/Kbuild        |   1 -
 arch/s390/include/asm/airq.h        |   2 +
 arch/s390/include/asm/cio.h         |   4 +
 arch/s390/include/asm/dma-mapping.h |  13 ++
 arch/s390/include/asm/mem_encrypt.h |  18 +++
 arch/s390/mm/init.c                 |  44 +++++
 drivers/s390/cio/airq.c             |  18 ++-
 drivers/s390/cio/ccwreq.c           |   8 +-
 drivers/s390/cio/css.c              |  63 ++++++++
 drivers/s390/cio/device.c           |  46 ++++--
 drivers/s390/cio/device_fsm.c       |  40 ++---
 drivers/s390/cio/device_id.c        |  18 +--
 drivers/s390/cio/device_ops.c       |   4 +-
 drivers/s390/cio/device_pgid.c      |  20 +--
 drivers/s390/cio/device_status.c    |  24 +--
 drivers/s390/cio/io_sch.h           |  19 ++-
 drivers/s390/virtio/virtio_ccw.c    | 310 ++++++++++++++++++++----------------
 18 files changed, 444 insertions(+), 213 deletions(-)
 create mode 100644 arch/s390/include/asm/dma-mapping.h
 create mode 100644 arch/s390/include/asm/mem_encrypt.h

-- 
2.16.4

^ permalink raw reply	[flat|nested] 87+ messages in thread

* [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-08 11:01     ` Cornelia Huck
  2019-04-04 23:16 ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Halil Pasic
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
establishes a new way of allocating virtqueues (as a part of the effort
that taught DMA to virtio rings).

In the future we will want virtio-ccw to use the DMA API as well.

Let us switch from the legacy method of allocating virtqueues to
vring_create_virtqueue() as the first step into that direction.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 74c328321889..edf4afe2d688 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -108,7 +108,6 @@ struct virtio_rev_info {
 struct virtio_ccw_vq_info {
 	struct virtqueue *vq;
 	int num;
-	void *queue;
 	union {
 		struct vq_info_block s;
 		struct vq_info_block_legacy l;
@@ -423,7 +422,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 	struct virtio_ccw_device *vcdev = to_vc_device(vq->vdev);
 	struct virtio_ccw_vq_info *info = vq->priv;
 	unsigned long flags;
-	unsigned long size;
 	int ret;
 	unsigned int index = vq->index;
 
@@ -461,8 +459,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 			 ret, index);
 
 	vring_del_virtqueue(vq);
-	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-	free_pages_exact(info->queue, size);
 	kfree(info->info_block);
 	kfree(info);
 }
@@ -494,7 +490,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 	int err;
 	struct virtqueue *vq = NULL;
 	struct virtio_ccw_vq_info *info;
-	unsigned long size = 0; /* silence the compiler */
+	u64 queue;
 	unsigned long flags;
 
 	/* Allocate queue. */
@@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		err = info->num;
 		goto out_err;
 	}
-	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
-	if (info->queue == NULL) {
-		dev_warn(&vcdev->cdev->dev, "no queue\n");
-		err = -ENOMEM;
-		goto out_err;
-	}
+	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
+				    vdev, true, true, ctx,
+				    virtio_ccw_kvm_notify, callback, name);
 
-	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
-				 true, ctx, info->queue, virtio_ccw_kvm_notify,
-				 callback, name);
 	if (!vq) {
 		/* For now, we fail if we can't get the requested size. */
 		dev_warn(&vcdev->cdev->dev, "no vq\n");
@@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		goto out_err;
 	}
 
+
 	/* Register it with the host. */
+	queue = virtqueue_get_desc_addr(vq);
 	if (vcdev->revision == 0) {
-		info->info_block->l.queue = (__u64)info->queue;
+		info->info_block->l.queue = queue;
 		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
 		info->info_block->l.index = i;
 		info->info_block->l.num = info->num;
 		ccw->count = sizeof(info->info_block->l);
 	} else {
-		info->info_block->s.desc = (__u64)info->queue;
+		info->info_block->s.desc = queue;
 		info->info_block->s.index = i;
 		info->info_block->s.num = info->num;
 		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
@@ -572,8 +563,6 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 	if (vq)
 		vring_del_virtqueue(vq);
 	if (info) {
-		if (info->queue)
-			free_pages_exact(info->queue, size);
 		kfree(info->info_block);
 	}
 	kfree(info);
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-09  9:57     ` Cornelia Huck
  2019-04-04 23:16 ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Halil Pasic
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Currently we have a problem if a virtio-ccw device has
VIRTIO_F_IOMMU_PLATFORM. In future we do want to support DMA API with
virtio-ccw.

Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
with virtio-ccw.

Let us also switch from legacy avail/used accessors to the DMA aware
ones (even if it isn't strictly necessary).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index edf4afe2d688..5956c9e820bb 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,6 +66,7 @@ struct virtio_ccw_device {
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
+	__u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -536,8 +537,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		info->info_block->s.desc = queue;
 		info->info_block->s.index = i;
 		info->info_block->s.num = info->num;
-		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
-		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
+		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
+		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
 		ccw->count = sizeof(info->info_block->s);
 	}
 	ccw->cmd_code = CCW_CMD_SET_VQ;
@@ -769,10 +770,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
 	/*
-	 * Packed ring isn't enabled on virtio_ccw for now,
-	 * because virtio_ccw uses some legacy accessors,
-	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
-	 * which aren't available in packed ring currently.
+	 * There shouldn't be anything that precludes supporting paced.
+	 * TODO: Remove the limitation after having another look into this.
 	 */
 	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
@@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
+	vcdev->vdev.dev.parent = &cdev->dev;
+	cdev->dev.dma_mask = &vcdev->dma_mask;
+
+	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
+	if (ret)
+		ret = dma_set_mask_and_coherent(&cdev->dev,
+						DMA_BIT_MASK(32));
+	if (ret) {
+		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
+		goto out_free;
+	}
+
 	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
 				   GFP_DMA | GFP_KERNEL);
 	if (!vcdev->config_block) {
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-09 10:16     ` Cornelia Huck
  2019-04-09 12:22     ` Christoph Hellwig
  2019-04-04 23:16 ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Halil Pasic
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On s390 protected virtualization guests also have to use bounce I/O
buffers.  That requires some plumbing.

Let us make sure any device using DMA API accordingly is spared from the
problems that hypervisor attempting I/O to a non-shared secure page would
bring.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/Kconfig                   |  4 ++++
 arch/s390/include/asm/Kbuild        |  1 -
 arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
 arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
 arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
 5 files changed, 79 insertions(+), 1 deletion(-)
 create mode 100644 arch/s390/include/asm/dma-mapping.h
 create mode 100644 arch/s390/include/asm/mem_encrypt.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index b6e3d0653002..46c69283a67b 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+config ARCH_HAS_MEM_ENCRYPT
+        def_bool y
+
 config MMU
 	def_bool y
 
@@ -190,6 +193,7 @@ config S390
 	select ARCH_HAS_SCALED_CPUTIME
 	select VIRT_TO_BUS
 	select HAVE_NMI
+	select SWIOTLB
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/Kbuild b/arch/s390/include/asm/Kbuild
index 12d77cb11fe5..ba55cd472950 100644
--- a/arch/s390/include/asm/Kbuild
+++ b/arch/s390/include/asm/Kbuild
@@ -8,7 +8,6 @@ generic-y += asm-offsets.h
 generic-y += cacheflush.h
 generic-y += device.h
 generic-y += dma-contiguous.h
-generic-y += dma-mapping.h
 generic-y += div64.h
 generic-y += emergency-restart.h
 generic-y += export.h
diff --git a/arch/s390/include/asm/dma-mapping.h b/arch/s390/include/asm/dma-mapping.h
new file mode 100644
index 000000000000..8985da6ecdfd
--- /dev/null
+++ b/arch/s390/include/asm/dma-mapping.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_DMA_MAPPING_H
+#define _ASM_S390_DMA_MAPPING_H
+
+#include <linux/dma-contiguous.h>
+
+static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
+{
+	return NULL;
+}
+
+#endif /* _ASM_S390_DMA_MAPPING_H */
+
diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
new file mode 100644
index 000000000000..0898c09a888c
--- /dev/null
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef S390_MEM_ENCRYPT_H__
+#define S390_MEM_ENCRYPT_H__
+
+#ifndef __ASSEMBLY__
+
+#define sme_me_mask	0ULL
+
+static inline bool sme_active(void) { return false; }
+extern bool sev_active(void);
+
+int set_memory_encrypted(unsigned long addr, int numpages);
+int set_memory_decrypted(unsigned long addr, int numpages);
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* S390_MEM_ENCRYPT_H__ */
+
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 3e82f66d5c61..a47bd4998d24 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -18,6 +18,7 @@
 #include <linux/mman.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
+#include <linux/swiotlb.h>
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/pagemap.h>
@@ -29,6 +30,7 @@
 #include <linux/export.h>
 #include <linux/cma.h>
 #include <linux/gfp.h>
+#include <linux/dma-mapping.h>
 #include <asm/processor.h>
 #include <linux/uaccess.h>
 #include <asm/pgtable.h>
@@ -42,6 +44,7 @@
 #include <asm/sclp.h>
 #include <asm/set_memory.h>
 #include <asm/kasan.h>
+#include <asm/dma-mapping.h>
 
 pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
 
@@ -126,6 +129,45 @@ void mark_rodata_ro(void)
 	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
 }
 
+int set_memory_encrypted(unsigned long addr, int numpages)
+{
+	/* also called for the swiotlb bounce buffers, make all pages shared */
+	/* TODO: do ultravisor calls */
+	return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_encrypted);
+
+int set_memory_decrypted(unsigned long addr, int numpages)
+{
+	/* also called for the swiotlb bounce buffers, make all pages shared */
+	/* TODO: do ultravisor calls */
+	return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_decrypted);
+
+/* are we a protected virtualization guest? */
+bool sev_active(void)
+{
+	/*
+	 * TODO: Do proper detection using ultravisor, for now let us fake we
+	 *  have it so the code gets exercised.
+	 */
+	return true;
+}
+EXPORT_SYMBOL_GPL(sev_active);
+
+/* protected virtualization */
+static void pv_init(void)
+{
+	if (!sev_active())
+		return;
+
+	/* make sure bounce buffers are shared */
+	swiotlb_init(1);
+	swiotlb_update_mem_attributes();
+	swiotlb_force = SWIOTLB_FORCE;
+}
+
 void __init mem_init(void)
 {
 	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
@@ -134,6 +176,8 @@ void __init mem_init(void)
 	set_max_mapnr(max_low_pfn);
         high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
 
+	pv_init();
+
 	/* Setup guest page hinting */
 	cmma_init();
 
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (2 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-09 10:44     ` Cornelia Huck
  2019-04-11 18:25     ` Sebastian Ott
  2019-04-04 23:16 ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Halil Pasic
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

To support protected virtualization cio will need to make sure the
memory used for communication with the hypervisor is DMA memory.

Let us introduce a DMA pool to cio that will help us in allocating
deallocating those chunks of memory.

We use a gen_pool backed with DMA pages to avoid each allocation
effectively wasting a page, as we typically allocate much less
than PAGE_SIZE.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/Kconfig           |  1 +
 arch/s390/include/asm/cio.h |  3 +++
 drivers/s390/cio/css.c      | 57 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 61 insertions(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 46c69283a67b..e8099ab47368 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -194,6 +194,7 @@ config S390
 	select VIRT_TO_BUS
 	select HAVE_NMI
 	select SWIOTLB
+	select GENERIC_ALLOCATOR
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index 1727180e8ca1..4510e418614a 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -328,6 +328,9 @@ static inline u8 pathmask_to_pos(u8 mask)
 void channel_subsystem_reinit(void);
 extern void css_schedule_reprobe(void);
 
+extern void *cio_dma_zalloc(size_t size);
+extern void cio_dma_free(void *cpu_addr, size_t size);
+
 /* Function from drivers/s390/cio/chsc.c */
 int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
 int chsc_sstpi(void *page, void *result, size_t size);
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index aea502922646..72629d99d8e4 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -20,6 +20,8 @@
 #include <linux/reboot.h>
 #include <linux/suspend.h>
 #include <linux/proc_fs.h>
+#include <linux/genalloc.h>
+#include <linux/dma-mapping.h>
 #include <asm/isc.h>
 #include <asm/crw.h>
 
@@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
 	NULL,
 };
 
+static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
+
 static int __init setup_css(int nr)
 {
 	struct channel_subsystem *css;
@@ -899,6 +903,9 @@ static int __init setup_css(int nr)
 	dev_set_name(&css->device, "css%x", nr);
 	css->device.groups = cssdev_attr_groups;
 	css->device.release = channel_subsystem_release;
+	/* some cio DMA memory needs to be 31 bit addressable */
+	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
+	css->device.dma_mask = &css_dev_dma_mask;
 
 	mutex_init(&css->mutex);
 	css->cssid = chsc_get_cssid(nr);
@@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
 	.notifier_call = css_power_event,
 };
 
+#define POOL_INIT_PAGES 1
+static struct gen_pool *cio_dma_pool;
+/* Currently cio supports only a single css */
+static struct device *cio_dma_css;
+static gfp_t  cio_dma_flags;
+
+static void __init cio_dma_pool_init(void)
+{
+	void *cpu_addr;
+	dma_addr_t dma_addr;
+	int i;
+
+	cio_dma_css = &channel_subsystems[0]->device;
+	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
+	cio_dma_pool = gen_pool_create(3, -1);
+	/* No need to free up the resources: compiled in */
+	for (i = 0; i < POOL_INIT_PAGES; ++i) {
+		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
+					      cio_dma_flags);
+		if (!cpu_addr)
+			return;
+		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
+				  dma_addr, PAGE_SIZE, -1);
+	}
+
+}
+
+void *cio_dma_zalloc(size_t size)
+{
+	dma_addr_t dma_addr;
+	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
+
+	if (!addr) {
+		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
+					PAGE_SIZE, &dma_addr, cio_dma_flags);
+		if (!addr)
+			return NULL;
+		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
+		addr = gen_pool_alloc(cio_dma_pool, size);
+	}
+	return (void *) addr;
+}
+
+void cio_dma_free(void *cpu_addr, size_t size)
+{
+	memset(cpu_addr, 0, size);
+	gen_pool_free(cio_dma_pool, (unsigned long) cpu_addr, size);
+}
+
 /*
  * Now that the driver core is running, we can setup our channel subsystem.
  * The struct subchannel's are created during probing.
@@ -1063,6 +1119,7 @@ static int __init css_bus_init(void)
 		unregister_reboot_notifier(&css_reboot_notifier);
 		goto out_unregister;
 	}
+	cio_dma_pool_init();
 	css_init_done = 1;
 
 	/* Enable default isc for I/O subchannels. */
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (3 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-09 17:55     ` Cornelia Huck
  2019-04-11 14:15     ` Sebastian Ott
  2019-04-04 23:16 ` [RFC PATCH 06/12] s390/airq: use DMA memory for adapter interrupts Halil Pasic
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Virtio-ccw relies on cio mechanisms for bootstrapping the ccw device.
Thus we need to make sure any memory that is used for communication with
the hypervisor is shared.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/cio/ccwreq.c        |  8 +++----
 drivers/s390/cio/device.c        | 46 +++++++++++++++++++++++++++++++---------
 drivers/s390/cio/device_fsm.c    | 40 +++++++++++++++++-----------------
 drivers/s390/cio/device_id.c     | 18 ++++++++--------
 drivers/s390/cio/device_ops.c    |  4 ++--
 drivers/s390/cio/device_pgid.c   | 20 ++++++++---------
 drivers/s390/cio/device_status.c | 24 ++++++++++-----------
 drivers/s390/cio/io_sch.h        | 19 ++++++++++++-----
 8 files changed, 107 insertions(+), 72 deletions(-)

diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
index 603268a33ea1..dafbceb311b3 100644
--- a/drivers/s390/cio/ccwreq.c
+++ b/drivers/s390/cio/ccwreq.c
@@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
 		return;
 	req->done = 1;
 	ccw_device_set_timeout(cdev, 0);
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	if (rc && rc != -ENODEV && req->drc)
 		rc = req->drc;
 	req->callback(cdev, req->data, rc);
@@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
 			continue;
 		}
 		/* Perform start function. */
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		rc = cio_start(sch, cp, (u8) req->mask);
 		if (rc == 0) {
 			/* I/O started successfully. */
@@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
  */
 static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 {
-	struct irb *irb = &cdev->private->irb;
+	struct irb *irb = &cdev->private->dma_area->irb;
 	struct cmd_scsw *scsw = &irb->scsw.cmd;
 	enum uc_todo todo;
 
@@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 		CIO_TRACE_EVENT(2, "sensedata");
 		CIO_HEX_EVENT(2, &cdev->private->dev_id,
 			      sizeof(struct ccw_dev_id));
-		CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
+		CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT);
 		/* Check for command reject. */
 		if (irb->ecw[0] & SNS0_CMD_REJECT)
 			return IO_REJECTED;
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 1540229a37bb..62f50e288740 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -689,6 +689,7 @@ ccw_device_release(struct device *dev)
 	cdev = to_ccwdev(dev);
 	/* Release reference of parent subchannel. */
 	put_device(cdev->dev.parent);
+	cio_dma_free(cdev->private->dma_area, sizeof(*cdev->private->dma_area));
 	kfree(cdev->private);
 	kfree(cdev);
 }
@@ -698,13 +699,22 @@ static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 	struct ccw_device *cdev;
 
 	cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
-	if (cdev) {
-		cdev->private = kzalloc(sizeof(struct ccw_device_private),
-					GFP_KERNEL | GFP_DMA);
-		if (cdev->private)
-			return cdev;
-	}
+	if (!cdev)
+		goto err_cdev;
+	cdev->private = kzalloc(sizeof(struct ccw_device_private),
+				GFP_KERNEL | GFP_DMA);
+	if (!cdev->private)
+		goto err_priv;
+	cdev->private->dma_area = cio_dma_zalloc(
+					sizeof(*cdev->private->dma_area));
+	if (!cdev->private->dma_area)
+		goto err_dma_area;
+	return cdev;
+err_dma_area:
+	kfree(cdev->private);
+err_priv:
 	kfree(cdev);
+err_cdev:
 	return ERR_PTR(-ENOMEM);
 }
 
@@ -1061,6 +1071,11 @@ static int io_subchannel_probe(struct subchannel *sch)
 	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
 	if (!io_priv)
 		goto out_schedule;
+	io_priv->dma_area = cio_dma_zalloc(sizeof(*io_priv->dma_area));
+	if (!io_priv->dma_area) {
+		kfree(io_priv);
+		goto out_schedule;
+	}
 
 	set_io_private(sch, io_priv);
 	css_schedule_eval(sch->schid);
@@ -1088,6 +1103,7 @@ static int io_subchannel_remove(struct subchannel *sch)
 	set_io_private(sch, NULL);
 	spin_unlock_irq(sch->lock);
 out_free:
+	cio_dma_free(io_priv->dma_area, sizeof(*io_priv->dma_area));
 	kfree(io_priv);
 	sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group);
 	return 0;
@@ -1593,20 +1609,29 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
 		return ERR_CAST(sch);
 
 	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
-	if (!io_priv) {
-		put_device(&sch->dev);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!io_priv)
+		goto err_priv;
+	io_priv->dma_area = cio_dma_zalloc(sizeof(*io_priv->dma_area));
+	if (!io_priv->dma_area)
+		goto err_dma_area;
 	set_io_private(sch, io_priv);
 	cdev = io_subchannel_create_ccwdev(sch);
 	if (IS_ERR(cdev)) {
 		put_device(&sch->dev);
+		cio_dma_free(io_priv->dma_area, sizeof(*io_priv->dma_area));
 		kfree(io_priv);
 		return cdev;
 	}
 	cdev->drv = drv;
 	ccw_device_set_int_class(cdev);
 	return cdev;
+
+err_dma_area:
+		kfree(io_priv);
+err_priv:
+	put_device(&sch->dev);
+	return ERR_PTR(-ENOMEM);
+
 }
 
 void __init ccw_device_destroy_console(struct ccw_device *cdev)
@@ -1617,6 +1642,7 @@ void __init ccw_device_destroy_console(struct ccw_device *cdev)
 	set_io_private(sch, NULL);
 	put_device(&sch->dev);
 	put_device(&cdev->dev);
+	cio_dma_free(io_priv->dma_area, sizeof(*io_priv->dma_area));
 	kfree(io_priv);
 }
 
diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c
index 9169af7dbb43..fea23b44795b 100644
--- a/drivers/s390/cio/device_fsm.c
+++ b/drivers/s390/cio/device_fsm.c
@@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev)
 			       sizeof(struct tcw), 0);
 	} else {
 		printk(KERN_WARNING "cio: orb indicates command mode\n");
-		if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw ||
-		    (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws)
+		if ((void *)(addr_t)orb->cmd.cpa == &private->dma_area->sense_ccw ||
+		    (void *)(addr_t)orb->cmd.cpa == cdev->private->dma_area->iccws)
 			printk(KERN_WARNING "cio: last channel program "
 			       "(intern):\n");
 		else
@@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev)
 void ccw_device_update_sense_data(struct ccw_device *cdev)
 {
 	memset(&cdev->id, 0, sizeof(cdev->id));
-	cdev->id.cu_type   = cdev->private->senseid.cu_type;
-	cdev->id.cu_model  = cdev->private->senseid.cu_model;
-	cdev->id.dev_type  = cdev->private->senseid.dev_type;
-	cdev->id.dev_model = cdev->private->senseid.dev_model;
+	cdev->id.cu_type   = cdev->private->dma_area->senseid.cu_type;
+	cdev->id.cu_model  = cdev->private->dma_area->senseid.cu_model;
+	cdev->id.dev_type  = cdev->private->dma_area->senseid.dev_type;
+	cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model;
 }
 
 int ccw_device_test_sense_data(struct ccw_device *cdev)
 {
-	return cdev->id.cu_type == cdev->private->senseid.cu_type &&
-		cdev->id.cu_model == cdev->private->senseid.cu_model &&
-		cdev->id.dev_type == cdev->private->senseid.dev_type &&
-		cdev->id.dev_model == cdev->private->senseid.dev_model;
+	return cdev->id.cu_type == cdev->private->dma_area->senseid.cu_type &&
+		cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model &&
+		cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type &&
+		cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model;
 }
 
 /*
@@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state)
 		cio_disable_subchannel(sch);
 
 	/* Reset device status. */
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	cdev->private->state = state;
 
@@ -509,13 +509,13 @@ void ccw_device_verify_done(struct ccw_device *cdev, int err)
 		ccw_device_done(cdev, DEV_STATE_ONLINE);
 		/* Deliver fake irb to device driver, if needed. */
 		if (cdev->private->flags.fake_irb) {
-			create_fake_irb(&cdev->private->irb,
+			create_fake_irb(&cdev->private->dma_area->irb,
 					cdev->private->flags.fake_irb);
 			cdev->private->flags.fake_irb = 0;
 			if (cdev->handler)
 				cdev->handler(cdev, cdev->private->intparm,
-					      &cdev->private->irb);
-			memset(&cdev->private->irb, 0, sizeof(struct irb));
+					      &cdev->private->dma_area->irb);
+			memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		}
 		ccw_device_report_path_events(cdev);
 		ccw_device_handle_broken_paths(cdev);
@@ -672,7 +672,7 @@ ccw_device_online_verify(struct ccw_device *cdev, enum dev_event dev_event)
 
 	if (scsw_actl(&sch->schib.scsw) != 0 ||
 	    (scsw_stctl(&sch->schib.scsw) & SCSW_STCTL_STATUS_PEND) ||
-	    (scsw_stctl(&cdev->private->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
+	    (scsw_stctl(&cdev->private->dma_area->irb.scsw) & SCSW_STCTL_STATUS_PEND)) {
 		/*
 		 * No final status yet or final status not yet delivered
 		 * to the device driver. Can't do path verification now,
@@ -719,7 +719,7 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 	 *  - fast notification was requested (primary status)
 	 *  - unsolicited interrupts
 	 */
-	stctl = scsw_stctl(&cdev->private->irb.scsw);
+	stctl = scsw_stctl(&cdev->private->dma_area->irb.scsw);
 	ending_status = (stctl & SCSW_STCTL_SEC_STATUS) ||
 		(stctl == (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND)) ||
 		(stctl == SCSW_STCTL_STATUS_PEND);
@@ -735,9 +735,9 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 
 	if (cdev->handler)
 		cdev->handler(cdev, cdev->private->intparm,
-			      &cdev->private->irb);
+			      &cdev->private->dma_area->irb);
 
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	return 1;
 }
 
@@ -759,7 +759,7 @@ ccw_device_irq(struct ccw_device *cdev, enum dev_event dev_event)
 			/* Unit check but no sense data. Need basic sense. */
 			if (ccw_device_do_sense(cdev, irb) != 0)
 				goto call_handler_unsol;
-			memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+			memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 			cdev->private->state = DEV_STATE_W4SENSE;
 			cdev->private->intparm = 0;
 			return;
@@ -842,7 +842,7 @@ ccw_device_w4sense(struct ccw_device *cdev, enum dev_event dev_event)
 	if (scsw_fctl(&irb->scsw) &
 	    (SCSW_FCTL_CLEAR_FUNC | SCSW_FCTL_HALT_FUNC)) {
 		cdev->private->flags.dosense = 0;
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		ccw_device_accumulate_irb(cdev, irb);
 		goto call_handler;
 	}
diff --git a/drivers/s390/cio/device_id.c b/drivers/s390/cio/device_id.c
index f6df83a9dfbb..ea8a0fc6c0b6 100644
--- a/drivers/s390/cio/device_id.c
+++ b/drivers/s390/cio/device_id.c
@@ -99,7 +99,7 @@ static int diag210_to_senseid(struct senseid *senseid, struct diag210 *diag)
 static int diag210_get_dev_info(struct ccw_device *cdev)
 {
 	struct ccw_dev_id *dev_id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	struct diag210 diag_data;
 	int rc;
 
@@ -134,8 +134,8 @@ static int diag210_get_dev_info(struct ccw_device *cdev)
 static void snsid_init(struct ccw_device *cdev)
 {
 	cdev->private->flags.esid = 0;
-	memset(&cdev->private->senseid, 0, sizeof(cdev->private->senseid));
-	cdev->private->senseid.cu_type = 0xffff;
+	memset(&cdev->private->dma_area->senseid, 0, sizeof(cdev->private->dma_area->senseid));
+	cdev->private->dma_area->senseid.cu_type = 0xffff;
 }
 
 /*
@@ -143,16 +143,16 @@ static void snsid_init(struct ccw_device *cdev)
  */
 static int snsid_check(struct ccw_device *cdev, void *data)
 {
-	struct cmd_scsw *scsw = &cdev->private->irb.scsw.cmd;
+	struct cmd_scsw *scsw = &cdev->private->dma_area->irb.scsw.cmd;
 	int len = sizeof(struct senseid) - scsw->count;
 
 	/* Check for incomplete SENSE ID data. */
 	if (len < SENSE_ID_MIN_LEN)
 		goto out_restart;
-	if (cdev->private->senseid.cu_type == 0xffff)
+	if (cdev->private->dma_area->senseid.cu_type == 0xffff)
 		goto out_restart;
 	/* Check for incompatible SENSE ID data. */
-	if (cdev->private->senseid.reserved != 0xff)
+	if (cdev->private->dma_area->senseid.reserved != 0xff)
 		return -EOPNOTSUPP;
 	/* Check for extended-identification information. */
 	if (len > SENSE_ID_BASIC_LEN)
@@ -170,7 +170,7 @@ static int snsid_check(struct ccw_device *cdev, void *data)
 static void snsid_callback(struct ccw_device *cdev, void *data, int rc)
 {
 	struct ccw_dev_id *id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	int vm = 0;
 
 	if (rc && MACHINE_IS_VM) {
@@ -200,7 +200,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 {
 	struct subchannel *sch = to_subchannel(cdev->dev.parent);
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	CIO_TRACE_EVENT(4, "snsid");
 	CIO_HEX_EVENT(4, &cdev->private->dev_id, sizeof(cdev->private->dev_id));
@@ -208,7 +208,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 	snsid_init(cdev);
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_ID;
-	cp->cda		= (u32) (addr_t) &cdev->private->senseid;
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->senseid;
 	cp->count	= sizeof(struct senseid);
 	cp->flags	= CCW_FLAG_SLI;
 	/* Request setup. */
diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
index 4435ae0b3027..8c3ccb8768a0 100644
--- a/drivers/s390/cio/device_ops.c
+++ b/drivers/s390/cio/device_ops.c
@@ -429,8 +429,8 @@ struct ciw *ccw_device_get_ciw(struct ccw_device *cdev, __u32 ct)
 	if (cdev->private->flags.esid == 0)
 		return NULL;
 	for (ciw_cnt = 0; ciw_cnt < MAX_CIWS; ciw_cnt++)
-		if (cdev->private->senseid.ciw[ciw_cnt].ct == ct)
-			return cdev->private->senseid.ciw + ciw_cnt;
+		if (cdev->private->dma_area->senseid.ciw[ciw_cnt].ct == ct)
+			return cdev->private->dma_area->senseid.ciw + ciw_cnt;
 	return NULL;
 }
 
diff --git a/drivers/s390/cio/device_pgid.c b/drivers/s390/cio/device_pgid.c
index d30a3babf176..e97baa89cbf8 100644
--- a/drivers/s390/cio/device_pgid.c
+++ b/drivers/s390/cio/device_pgid.c
@@ -57,7 +57,7 @@ static void verify_done(struct ccw_device *cdev, int rc)
 static void nop_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp->cmd_code	= CCW_CMD_NOOP;
 	cp->cda		= 0;
@@ -134,9 +134,9 @@ static void nop_callback(struct ccw_device *cdev, void *data, int rc)
 static void spid_build_cp(struct ccw_device *cdev, u8 fn)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
-	struct pgid *pgid = &cdev->private->pgid[i];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[i];
 
 	pgid->inf.fc	= fn;
 	cp->cmd_code	= CCW_CMD_SET_PGID;
@@ -300,7 +300,7 @@ static int pgid_cmp(struct pgid *p1, struct pgid *p2)
 static void pgid_analyze(struct ccw_device *cdev, struct pgid **p,
 			 int *mismatch, u8 *reserved, u8 *reset)
 {
-	struct pgid *pgid = &cdev->private->pgid[0];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[0];
 	struct pgid *first = NULL;
 	int lpm;
 	int i;
@@ -342,7 +342,7 @@ static u8 pgid_to_donepm(struct ccw_device *cdev)
 		lpm = 0x80 >> i;
 		if ((cdev->private->pgid_valid_mask & lpm) == 0)
 			continue;
-		pgid = &cdev->private->pgid[i];
+		pgid = &cdev->private->dma_area->pgid[i];
 		if (sch->opm & lpm) {
 			if (pgid->inf.ps.state1 != SNID_STATE1_GROUPED)
 				continue;
@@ -368,7 +368,7 @@ static void pgid_fill(struct ccw_device *cdev, struct pgid *pgid)
 	int i;
 
 	for (i = 0; i < 8; i++)
-		memcpy(&cdev->private->pgid[i], pgid, sizeof(struct pgid));
+		memcpy(&cdev->private->dma_area->pgid[i], pgid, sizeof(struct pgid));
 }
 
 /*
@@ -435,12 +435,12 @@ static void snid_done(struct ccw_device *cdev, int rc)
 static void snid_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
 
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_PGID;
-	cp->cda		= (u32) (addr_t) &cdev->private->pgid[i];
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->pgid[i];
 	cp->count	= sizeof(struct pgid);
 	cp->flags	= CCW_FLAG_SLI;
 	req->cp		= cp;
@@ -516,7 +516,7 @@ static void verify_start(struct ccw_device *cdev)
 	sch->lpm = sch->schib.pmcw.pam;
 
 	/* Initialize PGID data. */
-	memset(cdev->private->pgid, 0, sizeof(cdev->private->pgid));
+	memset(cdev->private->dma_area->pgid, 0, sizeof(cdev->private->dma_area->pgid));
 	cdev->private->pgid_valid_mask = 0;
 	cdev->private->pgid_todo_mask = sch->schib.pmcw.pam;
 	cdev->private->path_notoper_mask = 0;
@@ -626,7 +626,7 @@ struct stlck_data {
 static void stlck_build_cp(struct ccw_device *cdev, void *buf1, void *buf2)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp[0].cmd_code = CCW_CMD_STLCK;
 	cp[0].cda = (u32) (addr_t) buf1;
diff --git a/drivers/s390/cio/device_status.c b/drivers/s390/cio/device_status.c
index 7d5c7892b2c4..a9aabde604f4 100644
--- a/drivers/s390/cio/device_status.c
+++ b/drivers/s390/cio/device_status.c
@@ -79,15 +79,15 @@ ccw_device_accumulate_ecw(struct ccw_device *cdev, struct irb *irb)
 	 * are condition that have to be met for the extended control
 	 * bit to have meaning. Sick.
 	 */
-	cdev->private->irb.scsw.cmd.ectl = 0;
+	cdev->private->dma_area->irb.scsw.cmd.ectl = 0;
 	if ((irb->scsw.cmd.stctl & SCSW_STCTL_ALERT_STATUS) &&
 	    !(irb->scsw.cmd.stctl & SCSW_STCTL_INTER_STATUS))
-		cdev->private->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
+		cdev->private->dma_area->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
 	/* Check if extended control word is valid. */
-	if (!cdev->private->irb.scsw.cmd.ectl)
+	if (!cdev->private->dma_area->irb.scsw.cmd.ectl)
 		return;
 	/* Copy concurrent sense / model dependent information. */
-	memcpy (&cdev->private->irb.ecw, irb->ecw, sizeof (irb->ecw));
+	memcpy (&cdev->private->dma_area->irb.ecw, irb->ecw, sizeof (irb->ecw));
 }
 
 /*
@@ -118,7 +118,7 @@ ccw_device_accumulate_esw(struct ccw_device *cdev, struct irb *irb)
 	if (!ccw_device_accumulate_esw_valid(irb))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/* Copy last path used mask. */
 	cdev_irb->esw.esw1.lpum = irb->esw.esw1.lpum;
@@ -210,7 +210,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 		ccw_device_path_notoper(cdev);
 	/* No irb accumulation for transport mode irbs. */
 	if (scsw_is_tm(&irb->scsw)) {
-		memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+		memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 		return;
 	}
 	/*
@@ -219,7 +219,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 	if (!scsw_is_solicited(&irb->scsw))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/*
 	 * If the clear function had been performed, all formerly pending
@@ -227,7 +227,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct irb *irb)
 	 * intermediate accumulated status to the device driver.
 	 */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_CLEAR_FUNC)
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	/* Copy bits which are valid only for the start function. */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_START_FUNC) {
@@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct irb *irb)
 	/*
 	 * We have ending status but no sense information. Do a basic sense.
 	 */
-	sense_ccw = &to_io_private(sch)->sense_ccw;
+	sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
 	sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
-	sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
+	sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
 	sense_ccw->count = SENSE_MAX_COUNT;
 	sense_ccw->flags = CCW_FLAG_SLI;
 
@@ -364,7 +364,7 @@ ccw_device_accumulate_basic_sense(struct ccw_device *cdev, struct irb *irb)
 
 	if (!(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK) &&
 	    (irb->scsw.cmd.dstat & DEV_STAT_CHN_END)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 	}
 	/* Check if path verification is required. */
@@ -386,7 +386,7 @@ ccw_device_accumulate_and_sense(struct ccw_device *cdev, struct irb *irb)
 	/* Check for basic sense. */
 	if (cdev->private->flags.dosense &&
 	    !(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 		return 0;
 	}
diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
index 90e4e3a7841b..fc3220fede0f 100644
--- a/drivers/s390/cio/io_sch.h
+++ b/drivers/s390/cio/io_sch.h
@@ -9,15 +9,20 @@
 #include "css.h"
 #include "orb.h"
 
+
+struct io_subchannel_dma_area {
+	struct ccw1 sense_ccw;	/* static ccw for sense command */
+};
+
 struct io_subchannel_private {
 	union orb orb;		/* operation request block */
-	struct ccw1 sense_ccw;	/* static ccw for sense command */
 	struct ccw_device *cdev;/* pointer to the child ccw device */
 	struct {
 		unsigned int suspend:1;	/* allow suspend */
 		unsigned int prefetch:1;/* deny prefetch */
 		unsigned int inter:1;	/* suppress intermediate interrupts */
 	} __packed options;
+	struct io_subchannel_dma_area *dma_area;
 } __aligned(8);
 
 #define to_io_private(n) ((struct io_subchannel_private *) \
@@ -115,6 +120,13 @@ enum cdev_todo {
 #define FAKE_CMD_IRB	1
 #define FAKE_TM_IRB	2
 
+struct ccw_device_dma_area {
+	struct senseid senseid;	/* SenseID info */
+	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
+	struct irb irb;		/* device status */
+	struct pgid pgid[8];	/* path group IDs per chpid*/
+};
+
 struct ccw_device_private {
 	struct ccw_device *cdev;
 	struct subchannel *sch;
@@ -156,11 +168,7 @@ struct ccw_device_private {
 	} __attribute__((packed)) flags;
 	unsigned long intparm;	/* user interruption parameter */
 	struct qdio_irq *qdio_data;
-	struct irb irb;		/* device status */
 	int async_kill_io_rc;
-	struct senseid senseid;	/* SenseID info */
-	struct pgid pgid[8];	/* path group IDs per chpid*/
-	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
 	struct work_struct todo_work;
 	enum cdev_todo todo;
 	wait_queue_head_t wait_q;
@@ -169,6 +177,7 @@ struct ccw_device_private {
 	struct list_head cmb_list;	/* list of measured devices */
 	u64 cmb_start_time;		/* clock value of cmb reset */
 	void *cmb_wait;			/* deferred cmb enable/disable */
+	struct ccw_device_dma_area *dma_area;
 	enum interruption_class int_class;
 };
 
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 06/12] s390/airq: use DMA memory for adapter interrupts
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (4 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O Halil Pasic
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Protected virtualization guests have to use shared pages for airq
notifier bit vectors, because hypervisor needs to write these bits.

Let us make sure we allocate DMA memory for the notifier bit vectors.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 arch/s390/include/asm/airq.h |  2 ++
 arch/s390/include/asm/cio.h  |  1 +
 drivers/s390/cio/airq.c      | 18 ++++++++++++++----
 drivers/s390/cio/css.c       |  8 +++++++-
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index fcf539efb32f..1492d4856049 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -11,6 +11,7 @@
 #define _ASM_S390_AIRQ_H
 
 #include <linux/bit_spinlock.h>
+#include <linux/dma-mapping.h>
 
 struct airq_struct {
 	struct hlist_node list;		/* Handler queueing. */
@@ -29,6 +30,7 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
 	unsigned long *vector;	/* Adapter interrupt bit vector */
+	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
 	unsigned long *avail;	/* Allocation bit mask for the bit vector */
 	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
 	unsigned long *ptr;	/* Pointer associated with each bit */
diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index 4510e418614a..036d3a40d10d 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -330,6 +330,7 @@ extern void css_schedule_reprobe(void);
 
 extern void *cio_dma_zalloc(size_t size);
 extern void cio_dma_free(void *cpu_addr, size_t size);
+extern struct device *cio_get_dma_css_dev(void);
 
 /* Function from drivers/s390/cio/chsc.c */
 int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index a45011e4529e..7a5c0a08ee09 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -19,6 +19,7 @@
 
 #include <asm/airq.h>
 #include <asm/isc.h>
+#include <asm/cio.h>
 
 #include "cio.h"
 #include "cio_debug.h"
@@ -113,6 +114,11 @@ void __init init_airq_interrupts(void)
 	setup_irq(THIN_INTERRUPT, &airq_interrupt);
 }
 
+static inline unsigned long iv_size(unsigned long bits)
+{
+	return BITS_TO_LONGS(bits) * sizeof(unsigned long);
+}
+
 /**
  * airq_iv_create - create an interrupt vector
  * @bits: number of bits in the interrupt vector
@@ -123,14 +129,15 @@ void __init init_airq_interrupts(void)
 struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 {
 	struct airq_iv *iv;
-	unsigned long size;
+	unsigned long size = 0;
 
 	iv = kzalloc(sizeof(*iv), GFP_KERNEL);
 	if (!iv)
 		goto out;
 	iv->bits = bits;
-	size = BITS_TO_LONGS(bits) * sizeof(unsigned long);
-	iv->vector = kzalloc(size, GFP_KERNEL);
+	size = iv_size(bits);
+	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
+						 &iv->vector_dma, GFP_KERNEL);
 	if (!iv->vector)
 		goto out_free;
 	if (flags & AIRQ_IV_ALLOC) {
@@ -165,7 +172,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->avail);
-	kfree(iv->vector);
+	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
+			  iv->vector_dma);
 	kfree(iv);
 out:
 	return NULL;
@@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->vector);
+	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
+			  iv->vector, iv->vector_dma);
 	kfree(iv->avail);
 	kfree(iv);
 }
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 72629d99d8e4..9e21c2a50682 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -1031,13 +1031,19 @@ static struct gen_pool *cio_dma_pool;
 static struct device *cio_dma_css;
 static gfp_t  cio_dma_flags;
 
+
+struct device *cio_get_dma_css_dev(void)
+{
+	return &channel_subsystems[0]->device;
+}
+
 static void __init cio_dma_pool_init(void)
 {
 	void *cpu_addr;
 	dma_addr_t dma_addr;
 	int i;
 
-	cio_dma_css = &channel_subsystems[0]->device;
+	cio_dma_css =  cio_get_dma_css_dev();
 	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
 	cio_dma_pool = gen_pool_create(3, -1);
 	/* No need to free up the resources: compiled in */
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (5 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 06/12] s390/airq: use DMA memory for adapter interrupts Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-10  8:42     ` Cornelia Huck
  2019-04-04 23:16 ` [RFC PATCH 08/12] virtio/s390: add indirection to indicators access Halil Pasic
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Before virtio-ccw could get away with not using DMA API for the pieces of
memory it does ccw I/O with. With protected virtualization this has to
change, since the hypervisor needs to read and sometimes also write these
pieces of memory.

Let us make sure all ccw I/O is done through shared memory.

Note: The control blocks of I/O instructions do not need to be shared.
These are marshalled by the ultravisor.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++++----------------
 1 file changed, 107 insertions(+), 70 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 5956c9e820bb..9c412a581a50 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -49,6 +49,7 @@ struct vq_config_block {
 struct virtio_ccw_device {
 	struct virtio_device vdev;
 	__u8 *status;
+	dma_addr_t status_dma_addr;
 	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
 	struct ccw_device *cdev;
 	__u32 curr_io;
@@ -61,6 +62,7 @@ struct virtio_ccw_device {
 	unsigned long indicators;
 	unsigned long indicators2;
 	struct vq_config_block *config_block;
+	dma_addr_t config_block_dma_addr;
 	bool is_thinint;
 	bool going_away;
 	bool device_lost;
@@ -113,6 +115,7 @@ struct virtio_ccw_vq_info {
 		struct vq_info_block s;
 		struct vq_info_block_legacy l;
 	} *info_block;
+	dma_addr_t info_block_dma_addr;
 	int bit_nr;
 	struct list_head node;
 	long cookie;
@@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
 	return container_of(vdev, struct virtio_ccw_device, vdev);
 }
 
+#define vc_dma_decl_struct(type, field) \
+	dma_addr_t field ## _dma_addr;  \
+	struct type *field
+
+static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
+				   dma_addr_t *dma_handle)
+{
+	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
+				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
+}
+
+static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
+				 void *cpu_addr, dma_addr_t dma_handle)
+{
+	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
+}
+
+#define vc_dma_alloc_struct(vdev, ptr) \
+	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
+#define vc_dma_free_struct(vdev, ptr) \
+	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))
+
 static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
 {
 	unsigned long i, flags;
@@ -322,12 +347,12 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 {
 	int ret;
 	unsigned long *indicatorp = NULL;
-	struct virtio_thinint_area *thinint_area = NULL;
+	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
+	dma_addr_t indicatorp_dma_addr;
 	struct airq_info *airq_info = vcdev->airq_info;
 
 	if (vcdev->is_thinint) {
-		thinint_area = kzalloc(sizeof(*thinint_area),
-				       GFP_DMA | GFP_KERNEL);
+		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
 		if (!thinint_area)
 			return;
 		thinint_area->summary_indicator =
@@ -338,8 +363,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		ccw->cda = (__u32)(unsigned long) thinint_area;
 	} else {
 		/* payload is the address of the indicators */
-		indicatorp = kmalloc(sizeof(&vcdev->indicators),
-				     GFP_DMA | GFP_KERNEL);
+		indicatorp = __vc_dma_alloc(&vcdev->vdev,
+					    sizeof(&vcdev->indicators),
+					    &indicatorp_dma_addr);
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
@@ -359,8 +385,10 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 			 "Failed to deregister indicators (%d)\n", ret);
 	else if (vcdev->is_thinint)
 		virtio_ccw_drop_indicators(vcdev);
-	kfree(indicatorp);
-	kfree(thinint_area);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
+			       indicatorp, indicatorp_dma_addr);
+	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 }
 
 static inline long __do_kvm_notify(struct subchannel_id schid,
@@ -460,17 +488,17 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
 			 ret, index);
 
 	vring_del_virtqueue(vq);
-	kfree(info->info_block);
+	vc_dma_free_struct(vq->vdev, info->info_block);
 	kfree(info);
 }
 
 static void virtio_ccw_del_vqs(struct virtio_device *vdev)
 {
 	struct virtqueue *vq, *n;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
@@ -479,7 +507,7 @@ static void virtio_ccw_del_vqs(struct virtio_device *vdev)
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
 		virtio_ccw_del_vq(vq, ccw);
 
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
@@ -501,8 +529,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 		err = -ENOMEM;
 		goto out_err;
 	}
-	info->info_block = kzalloc(sizeof(*info->info_block),
-				   GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, info->info_block);
 	if (!info->info_block) {
 		dev_warn(&vcdev->cdev->dev, "no info block\n");
 		err = -ENOMEM;
@@ -564,7 +591,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
 	if (vq)
 		vring_del_virtqueue(vq);
 	if (info) {
-		kfree(info->info_block);
+		vc_dma_free_struct(vdev, info->info_block);
 	}
 	kfree(info);
 	return ERR_PTR(err);
@@ -575,10 +602,10 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
 					   struct ccw1 *ccw)
 {
 	int ret;
-	struct virtio_thinint_area *thinint_area = NULL;
+	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
 	struct airq_info *info;
 
-	thinint_area = kzalloc(sizeof(*thinint_area), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
 	if (!thinint_area) {
 		ret = -ENOMEM;
 		goto out;
@@ -614,7 +641,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
 		virtio_ccw_drop_indicators(vcdev);
 	}
 out:
-	kfree(thinint_area);
+	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 	return ret;
 }
 
@@ -627,10 +654,11 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	unsigned long *indicatorp = NULL;
+	dma_addr_t indicatorp_dma_addr;
 	int ret, i, queue_idx = 0;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
 
@@ -654,7 +682,8 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = kmalloc(sizeof(&vcdev->indicators), GFP_DMA | GFP_KERNEL);
+	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(&vcdev->indicators),
+				    &indicatorp_dma_addr);
 	if (!indicatorp)
 		goto out;
 	*indicatorp = (unsigned long) &vcdev->indicators;
@@ -686,12 +715,16 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	if (ret)
 		goto out;
 
-	kfree(indicatorp);
-	kfree(ccw);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
+			       indicatorp, indicatorp_dma_addr);
+	vc_dma_free_struct(vdev, ccw);
 	return 0;
 out:
-	kfree(indicatorp);
-	kfree(ccw);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
+			       indicatorp, indicatorp_dma_addr);
+	vc_dma_free_struct(vdev, ccw);
 	virtio_ccw_del_vqs(vdev);
 	return ret;
 }
@@ -699,9 +732,9 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 static void virtio_ccw_reset(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
@@ -714,22 +747,22 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
 	ccw->count = 0;
 	ccw->cda = 0;
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	struct virtio_feature_desc *features;
+	vc_dma_decl_struct(virtio_feature_desc, features);
+	vc_dma_decl_struct(ccw1, ccw);
 	int ret;
 	u64 rc;
-	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return 0;
 
-	features = kzalloc(sizeof(*features), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, features);
 	if (!features) {
 		rc = 0;
 		goto out_free;
@@ -762,8 +795,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 		rc |= (u64)le32_to_cpu(features->features) << 32;
 
 out_free:
-	kfree(features);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, features);
+	vc_dma_free_struct(vdev, ccw);
 	return rc;
 }
 
@@ -779,8 +812,8 @@ static void ccw_transport_features(struct virtio_device *vdev)
 static int virtio_ccw_finalize_features(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	struct virtio_feature_desc *features;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(virtio_feature_desc, features);
+	vc_dma_decl_struct(ccw1, ccw);
 	int ret;
 
 	if (vcdev->revision >= 1 &&
@@ -790,11 +823,11 @@ static int virtio_ccw_finalize_features(struct virtio_device *vdev)
 		return -EINVAL;
 	}
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
 
-	features = kzalloc(sizeof(*features), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, features);
 	if (!features) {
 		ret = -ENOMEM;
 		goto out_free;
@@ -829,8 +862,8 @@ static int virtio_ccw_finalize_features(struct virtio_device *vdev)
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_FEAT);
 
 out_free:
-	kfree(features);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, features);
+	vc_dma_free_struct(vdev, ccw);
 
 	return ret;
 }
@@ -840,15 +873,17 @@ static void virtio_ccw_get_config(struct virtio_device *vdev,
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	int ret;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 	void *config_area;
 	unsigned long flags;
+	dma_addr_t config_area_dma_addr;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = kzalloc(VIRTIO_CCW_CONFIG_SIZE, GFP_DMA | GFP_KERNEL);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE,
+				     &config_area_dma_addr);
 	if (!config_area)
 		goto out_free;
 
@@ -870,8 +905,9 @@ static void virtio_ccw_get_config(struct virtio_device *vdev,
 		memcpy(buf, config_area + offset, len);
 
 out_free:
-	kfree(config_area);
-	kfree(ccw);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area,
+				     config_area_dma_addr);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static void virtio_ccw_set_config(struct virtio_device *vdev,
@@ -879,15 +915,17 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 				  unsigned len)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 	void *config_area;
 	unsigned long flags;
+	dma_addr_t config_area_dma_addr;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = kzalloc(VIRTIO_CCW_CONFIG_SIZE, GFP_DMA | GFP_KERNEL);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE,
+				     &config_area_dma_addr);
 	if (!config_area)
 		goto out_free;
 
@@ -906,20 +944,21 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_CONFIG);
 
 out_free:
-	kfree(config_area);
-	kfree(ccw);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area,
+				     config_area_dma_addr);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static u8 virtio_ccw_get_status(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	u8 old_status = *vcdev->status;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 
 	if (vcdev->revision < 1)
 		return *vcdev->status;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return old_status;
 
@@ -934,7 +973,7 @@ static u8 virtio_ccw_get_status(struct virtio_device *vdev)
  * handler anyway), vcdev->status was not overwritten and we just
  * return the old status, which is fine.
 */
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 
 	return *vcdev->status;
 }
@@ -943,10 +982,10 @@ static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	u8 old_status = *vcdev->status;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(ccw1, ccw);
 	int ret;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
@@ -960,7 +999,7 @@ static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
 	/* Write failed? We assume status is unchanged. */
 	if (ret)
 		*vcdev->status = old_status;
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
@@ -993,8 +1032,8 @@ static void virtio_ccw_release_dev(struct device *_d)
 	struct virtio_device *dev = dev_to_virtio(_d);
 	struct virtio_ccw_device *vcdev = to_vc_device(dev);
 
-	kfree(vcdev->status);
-	kfree(vcdev->config_block);
+	vc_dma_free_struct(dev, vcdev->status);
+	vc_dma_free_struct(dev, vcdev->config_block);
 	kfree(vcdev);
 }
 
@@ -1198,16 +1237,16 @@ static int virtio_ccw_offline(struct ccw_device *cdev)
 
 static int virtio_ccw_set_transport_rev(struct virtio_ccw_device *vcdev)
 {
-	struct virtio_rev_info *rev;
-	struct ccw1 *ccw;
+	vc_dma_decl_struct(virtio_rev_info, rev);
+	vc_dma_decl_struct(ccw1, ccw);
 	int ret;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
-	rev = kzalloc(sizeof(*rev), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, rev);
 	if (!rev) {
-		kfree(ccw);
+		vc_dma_free_struct(&vcdev->vdev, ccw);
 		return -ENOMEM;
 	}
 
@@ -1237,8 +1276,8 @@ static int virtio_ccw_set_transport_rev(struct virtio_ccw_device *vcdev)
 		}
 	} while (ret == -EOPNOTSUPP);
 
-	kfree(ccw);
-	kfree(rev);
+	vc_dma_free_struct(&vcdev->vdev, ccw);
+	vc_dma_free_struct(&vcdev->vdev, rev);
 	return ret;
 }
 
@@ -1266,13 +1305,12 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		goto out_free;
 	}
 
-	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
-				   GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, vcdev->config_block);
 	if (!vcdev->config_block) {
 		ret = -ENOMEM;
 		goto out_free;
 	}
-	vcdev->status = kzalloc(sizeof(*vcdev->status), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, vcdev->status);
 	if (!vcdev->status) {
 		ret = -ENOMEM;
 		goto out_free;
@@ -1280,7 +1318,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 
 	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
 
-	vcdev->vdev.dev.parent = &cdev->dev;
 	vcdev->vdev.dev.release = virtio_ccw_release_dev;
 	vcdev->vdev.config = &virtio_ccw_config_ops;
 	vcdev->cdev = cdev;
@@ -1314,8 +1351,8 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	return ret;
 out_free:
 	if (vcdev) {
-		kfree(vcdev->status);
-		kfree(vcdev->config_block);
+		vc_dma_free_struct(&vcdev->vdev, vcdev->status);
+		vc_dma_free_struct(&vcdev->vdev, vcdev->config_block);
 	}
 	kfree(vcdev);
 	return ret;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 08/12] virtio/s390: add indirection to indicators access
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (6 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 09/12] virtio/s390: use DMA memory for notifiers Halil Pasic
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

This will come in handy soon when we pull out the indicators from
virtio_ccw_device to a memory area that is shared with the hypervisor
(in particular for protected virtualization guests).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 46 ++++++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 9c412a581a50..156166ae4df4 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -71,6 +71,16 @@ struct virtio_ccw_device {
 	__u64 dma_mask;
 };
 
+static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
+{
+	return &vcdev->indicators;
+}
+
+static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
+{
+	return &vcdev->indicators2;
+}
+
 struct vq_info_block_legacy {
 	__u64 queue;
 	__u32 align;
@@ -364,17 +374,17 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 	} else {
 		/* payload is the address of the indicators */
 		indicatorp = __vc_dma_alloc(&vcdev->vdev,
-					    sizeof(&vcdev->indicators),
+					    sizeof(indicators(vcdev)),
 					    &indicatorp_dma_addr);
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
 		ccw->cmd_code = CCW_CMD_SET_IND;
-		ccw->count = sizeof(&vcdev->indicators);
+		ccw->count = sizeof(indicators(vcdev));
 		ccw->cda = (__u32)(unsigned long) indicatorp;
 	}
 	/* Deregister indicators from host. */
-	vcdev->indicators = 0;
+	*indicators(vcdev) = 0;
 	ccw->flags = 0;
 	ret = ccw_io_helper(vcdev, ccw,
 			    vcdev->is_thinint ?
@@ -386,7 +396,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 	else if (vcdev->is_thinint)
 		virtio_ccw_drop_indicators(vcdev);
 	if (indicatorp)
-		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
 			       indicatorp, indicatorp_dma_addr);
 	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 }
@@ -682,11 +692,11 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(&vcdev->indicators),
+	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(indicators(vcdev)),
 				    &indicatorp_dma_addr);
 	if (!indicatorp)
 		goto out;
-	*indicatorp = (unsigned long) &vcdev->indicators;
+	*indicatorp = (unsigned long) indicators(vcdev);
 	if (vcdev->is_thinint) {
 		ret = virtio_ccw_register_adapter_ind(vcdev, vqs, nvqs, ccw);
 		if (ret)
@@ -695,34 +705,34 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	}
 	if (!vcdev->is_thinint) {
 		/* Register queue indicators with host. */
-		vcdev->indicators = 0;
+		*indicators(vcdev) = 0;
 		ccw->cmd_code = CCW_CMD_SET_IND;
 		ccw->flags = 0;
-		ccw->count = sizeof(&vcdev->indicators);
+		ccw->count = sizeof(indicators(vcdev));
 		ccw->cda = (__u32)(unsigned long) indicatorp;
 		ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_IND);
 		if (ret)
 			goto out;
 	}
 	/* Register indicators2 with host for config changes */
-	*indicatorp = (unsigned long) &vcdev->indicators2;
-	vcdev->indicators2 = 0;
+	*indicatorp = (unsigned long) indicators2(vcdev);
+	*indicators2(vcdev) = 0;
 	ccw->cmd_code = CCW_CMD_SET_CONF_IND;
 	ccw->flags = 0;
-	ccw->count = sizeof(&vcdev->indicators2);
+	ccw->count = sizeof(indicators2(vcdev));
 	ccw->cda = (__u32)(unsigned long) indicatorp;
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_CONF_IND);
 	if (ret)
 		goto out;
 
 	if (indicatorp)
-		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
 			       indicatorp, indicatorp_dma_addr);
 	vc_dma_free_struct(vdev, ccw);
 	return 0;
 out:
 	if (indicatorp)
-		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
 			       indicatorp, indicatorp_dma_addr);
 	vc_dma_free_struct(vdev, ccw);
 	virtio_ccw_del_vqs(vdev);
@@ -1131,17 +1141,17 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 			vcdev->err = -EIO;
 	}
 	virtio_ccw_check_activity(vcdev, activity);
-	for_each_set_bit(i, &vcdev->indicators,
-			 sizeof(vcdev->indicators) * BITS_PER_BYTE) {
+	for_each_set_bit(i, indicators(vcdev),
+			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
 		/* The bit clear must happen before the vring kick. */
-		clear_bit(i, &vcdev->indicators);
+		clear_bit(i, indicators(vcdev));
 		barrier();
 		vq = virtio_ccw_vq_by_ind(vcdev, i);
 		vring_interrupt(0, vq);
 	}
-	if (test_bit(0, &vcdev->indicators2)) {
+	if (test_bit(0, indicators2(vcdev))) {
 		virtio_config_changed(&vcdev->vdev);
-		clear_bit(0, &vcdev->indicators2);
+		clear_bit(0, indicators2(vcdev));
 	}
 }
 
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 09/12] virtio/s390: use DMA memory for notifiers
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (7 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 08/12] virtio/s390: add indirection to indicators access Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations Halil Pasic
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

From: Halil Pasic <pasic@linux.vnet.ibm.com>

The hypervisor is supposed to poke the classic notifiers, if these are
used, out of band with regards to ccw I/O. So these need to be allocated
as DMA memory (which is shared memory for protected virtualization
guests).

Let us factor out notifiers in a satellite that is allocated as DMA
memory to accommodate that.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 156166ae4df4..aa45a6a027ae 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -46,6 +46,11 @@ struct vq_config_block {
 #define VIRTIO_CCW_CONFIG_SIZE 0x100
 /* same as PCI config space size, should be enough for all drivers */
 
+struct vcdev_dma_area {
+	unsigned long indicators;
+	unsigned long indicators2;
+};
+
 struct virtio_ccw_device {
 	struct virtio_device vdev;
 	__u8 *status;
@@ -59,8 +64,6 @@ struct virtio_ccw_device {
 	spinlock_t lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
-	unsigned long indicators;
-	unsigned long indicators2;
 	struct vq_config_block *config_block;
 	dma_addr_t config_block_dma_addr;
 	bool is_thinint;
@@ -69,16 +72,18 @@ struct virtio_ccw_device {
 	unsigned int config_ready;
 	void *airq_info;
 	__u64 dma_mask;
+	struct vcdev_dma_area *dma_area;
+	dma_addr_t dma_area_dma_addr;
 };
 
 static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
 {
-	return &vcdev->indicators;
+	return &vcdev->dma_area->indicators;
 }
 
 static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
 {
-	return &vcdev->indicators2;
+	return &vcdev->dma_area->indicators2;
 }
 
 struct vq_info_block_legacy {
@@ -1044,6 +1049,8 @@ static void virtio_ccw_release_dev(struct device *_d)
 
 	vc_dma_free_struct(dev, vcdev->status);
 	vc_dma_free_struct(dev, vcdev->config_block);
+	__vc_dma_free(&vcdev->vdev, PAGE_SIZE, vcdev->dma_area,
+		      vcdev->dma_area_dma_addr);
 	kfree(vcdev);
 }
 
@@ -1325,6 +1332,12 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
+	vcdev->dma_area = __vc_dma_alloc(&vcdev->vdev, PAGE_SIZE,
+					 &vcdev->dma_area_dma_addr);
+	if (!vcdev->dma_area) {
+		ret = -ENOMEM;
+		goto out_free;
+	}
 
 	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
 
@@ -1363,6 +1376,8 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	if (vcdev) {
 		vc_dma_free_struct(&vcdev->vdev, vcdev->status);
 		vc_dma_free_struct(&vcdev->vdev, vcdev->config_block);
+		__vc_dma_free(&vcdev->vdev, PAGE_SIZE, vcdev->dma_area,
+			      vcdev->dma_area_dma_addr);
 	}
 	kfree(vcdev);
 	return ret;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (8 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 09/12] virtio/s390: use DMA memory for notifiers Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-10  8:46     ` Cornelia Huck
  2019-04-04 23:16 ` [RFC PATCH 11/12] virtio/s390: use the cio DMA pool Halil Pasic
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

We can reduce the number of DMA allocations by pulling the bits
of memory that belongs to virtio_ccw_device and needs to be DMA
memory into a single area.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 48 ++++++++++++++--------------------------
 1 file changed, 16 insertions(+), 32 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index aa45a6a027ae..7268149f2ee8 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -49,12 +49,12 @@ struct vq_config_block {
 struct vcdev_dma_area {
 	unsigned long indicators;
 	unsigned long indicators2;
+	struct vq_config_block config_block;
+	__u8 status; /* TODO check __aligned(8); */
 };
 
 struct virtio_ccw_device {
 	struct virtio_device vdev;
-	__u8 *status;
-	dma_addr_t status_dma_addr;
 	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
 	struct ccw_device *cdev;
 	__u32 curr_io;
@@ -64,8 +64,6 @@ struct virtio_ccw_device {
 	spinlock_t lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
-	struct vq_config_block *config_block;
-	dma_addr_t config_block_dma_addr;
 	bool is_thinint;
 	bool going_away;
 	bool device_lost;
@@ -450,15 +448,15 @@ static int virtio_ccw_read_vq_conf(struct virtio_ccw_device *vcdev,
 {
 	int ret;
 
-	vcdev->config_block->index = index;
+	vcdev->dma_area->config_block.index = index;
 	ccw->cmd_code = CCW_CMD_READ_VQ_CONF;
 	ccw->flags = 0;
 	ccw->count = sizeof(struct vq_config_block);
-	ccw->cda = (__u32)(unsigned long)(vcdev->config_block);
+	ccw->cda = (__u32)(unsigned long)(&vcdev->dma_area->config_block);
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_VQ_CONF);
 	if (ret)
 		return ret;
-	return vcdev->config_block->num ?: -ENOENT;
+	return vcdev->dma_area->config_block.num ?: -ENOENT;
 }
 
 static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
@@ -754,7 +752,7 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
 		return;
 
 	/* Zero status bits. */
-	*vcdev->status = 0;
+	vcdev->dma_area->status = 0;
 
 	/* Send a reset ccw on device. */
 	ccw->cmd_code = CCW_CMD_VDEV_RESET;
@@ -967,11 +965,11 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 static u8 virtio_ccw_get_status(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	u8 old_status = *vcdev->status;
+	u8 old_status = vcdev->dma_area->status;
 	vc_dma_decl_struct(ccw1, ccw);
 
 	if (vcdev->revision < 1)
-		return *vcdev->status;
+		return vcdev->dma_area->status;
 
 	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
@@ -979,24 +977,24 @@ static u8 virtio_ccw_get_status(struct virtio_device *vdev)
 
 	ccw->cmd_code = CCW_CMD_READ_STATUS;
 	ccw->flags = 0;
-	ccw->count = sizeof(*vcdev->status);
-	ccw->cda = (__u32)(unsigned long)vcdev->status;
+	ccw->count = sizeof(vcdev->dma_area->status);
+	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_STATUS);
 /*
  * If the channel program failed (should only happen if the device
  * was hotunplugged, and then we clean up via the machine check
- * handler anyway), vcdev->status was not overwritten and we just
+ * handler anyway), vcdev->dma_area->status was not overwritten and we just
  * return the old status, which is fine.
 */
 	vc_dma_free_struct(vdev, ccw);
 
-	return *vcdev->status;
+	return vcdev->dma_area->status;
 }
 
 static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	u8 old_status = *vcdev->status;
+	u8 old_status = vcdev->dma_area->status;
 	vc_dma_decl_struct(ccw1, ccw);
 	int ret;
 
@@ -1005,15 +1003,15 @@ static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
 		return;
 
 	/* Write the status to the host. */
-	*vcdev->status = status;
+	vcdev->dma_area->status = status;
 	ccw->cmd_code = CCW_CMD_WRITE_STATUS;
 	ccw->flags = 0;
 	ccw->count = sizeof(status);
-	ccw->cda = (__u32)(unsigned long)vcdev->status;
+	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS);
 	/* Write failed? We assume status is unchanged. */
 	if (ret)
-		*vcdev->status = old_status;
+		vcdev->dma_area->status = old_status;
 	vc_dma_free_struct(vdev, ccw);
 }
 
@@ -1047,8 +1045,6 @@ static void virtio_ccw_release_dev(struct device *_d)
 	struct virtio_device *dev = dev_to_virtio(_d);
 	struct virtio_ccw_device *vcdev = to_vc_device(dev);
 
-	vc_dma_free_struct(dev, vcdev->status);
-	vc_dma_free_struct(dev, vcdev->config_block);
 	__vc_dma_free(&vcdev->vdev, PAGE_SIZE, vcdev->dma_area,
 		      vcdev->dma_area_dma_addr);
 	kfree(vcdev);
@@ -1322,16 +1318,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		goto out_free;
 	}
 
-	vc_dma_alloc_struct(&vcdev->vdev, vcdev->config_block);
-	if (!vcdev->config_block) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-	vc_dma_alloc_struct(&vcdev->vdev, vcdev->status);
-	if (!vcdev->status) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
 	vcdev->dma_area = __vc_dma_alloc(&vcdev->vdev, PAGE_SIZE,
 					 &vcdev->dma_area_dma_addr);
 	if (!vcdev->dma_area) {
@@ -1374,8 +1360,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	return ret;
 out_free:
 	if (vcdev) {
-		vc_dma_free_struct(&vcdev->vdev, vcdev->status);
-		vc_dma_free_struct(&vcdev->vdev, vcdev->config_block);
 		__vc_dma_free(&vcdev->vdev, PAGE_SIZE, vcdev->dma_area,
 			      vcdev->dma_area_dma_addr);
 	}
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 11/12] virtio/s390: use the cio DMA pool
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (9 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-04 23:16 ` [RFC PATCH 12/12] virtio/s390: make airq summary indicators DMA Halil Pasic
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Lets use the cio DMA pool for DMA allocations to avoid problems with
granularity, and make things simpler and cheaper.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 54 ++++++++++++++--------------------------
 1 file changed, 18 insertions(+), 36 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 7268149f2ee8..ba1dafe04968 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -71,7 +71,6 @@ struct virtio_ccw_device {
 	void *airq_info;
 	__u64 dma_mask;
 	struct vcdev_dma_area *dma_area;
-	dma_addr_t dma_area_dma_addr;
 };
 
 static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
@@ -128,7 +127,6 @@ struct virtio_ccw_vq_info {
 		struct vq_info_block s;
 		struct vq_info_block_legacy l;
 	} *info_block;
-	dma_addr_t info_block_dma_addr;
 	int bit_nr;
 	struct list_head node;
 	long cookie;
@@ -184,26 +182,23 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
 }
 
 #define vc_dma_decl_struct(type, field) \
-	dma_addr_t field ## _dma_addr;  \
 	struct type *field
 
-static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
-				   dma_addr_t *dma_handle)
+static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size)
 {
-	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
-				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
+	return (void *) cio_dma_zalloc(size);
 }
 
 static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
-				 void *cpu_addr, dma_addr_t dma_handle)
+				 void *cpu_addr)
 {
-	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
+	cio_dma_free(cpu_addr, size);
 }
 
 #define vc_dma_alloc_struct(vdev, ptr) \
-	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
+	({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); })
 #define vc_dma_free_struct(vdev, ptr) \
-	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))
+	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr))
 
 static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
 {
@@ -361,7 +356,6 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 	int ret;
 	unsigned long *indicatorp = NULL;
 	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
-	dma_addr_t indicatorp_dma_addr;
 	struct airq_info *airq_info = vcdev->airq_info;
 
 	if (vcdev->is_thinint) {
@@ -377,8 +371,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 	} else {
 		/* payload is the address of the indicators */
 		indicatorp = __vc_dma_alloc(&vcdev->vdev,
-					    sizeof(indicators(vcdev)),
-					    &indicatorp_dma_addr);
+					    sizeof(indicators(vcdev)));
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
@@ -400,7 +393,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		virtio_ccw_drop_indicators(vcdev);
 	if (indicatorp)
 		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
-			       indicatorp, indicatorp_dma_addr);
+			       indicatorp);
 	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 }
 
@@ -667,7 +660,6 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	unsigned long *indicatorp = NULL;
-	dma_addr_t indicatorp_dma_addr;
 	int ret, i, queue_idx = 0;
 	vc_dma_decl_struct(ccw1, ccw);
 
@@ -695,8 +687,7 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(indicators(vcdev)),
-				    &indicatorp_dma_addr);
+	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(indicators(vcdev)));
 	if (!indicatorp)
 		goto out;
 	*indicatorp = (unsigned long) indicators(vcdev);
@@ -730,13 +721,13 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 
 	if (indicatorp)
 		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
-			       indicatorp, indicatorp_dma_addr);
+			       indicatorp);
 	vc_dma_free_struct(vdev, ccw);
 	return 0;
 out:
 	if (indicatorp)
 		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
-			       indicatorp, indicatorp_dma_addr);
+			       indicatorp);
 	vc_dma_free_struct(vdev, ccw);
 	virtio_ccw_del_vqs(vdev);
 	return ret;
@@ -889,14 +880,12 @@ static void virtio_ccw_get_config(struct virtio_device *vdev,
 	vc_dma_decl_struct(ccw1, ccw);
 	void *config_area;
 	unsigned long flags;
-	dma_addr_t config_area_dma_addr;
 
 	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE,
-				     &config_area_dma_addr);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE);
 	if (!config_area)
 		goto out_free;
 
@@ -918,8 +907,7 @@ static void virtio_ccw_get_config(struct virtio_device *vdev,
 		memcpy(buf, config_area + offset, len);
 
 out_free:
-	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area,
-				     config_area_dma_addr);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area);
 	vc_dma_free_struct(vdev, ccw);
 }
 
@@ -931,14 +919,12 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 	vc_dma_decl_struct(ccw1, ccw);
 	void *config_area;
 	unsigned long flags;
-	dma_addr_t config_area_dma_addr;
 
 	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE,
-				     &config_area_dma_addr);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE);
 	if (!config_area)
 		goto out_free;
 
@@ -957,8 +943,7 @@ static void virtio_ccw_set_config(struct virtio_device *vdev,
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_CONFIG);
 
 out_free:
-	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area,
-				     config_area_dma_addr);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area);
 	vc_dma_free_struct(vdev, ccw);
 }
 
@@ -1045,8 +1030,7 @@ static void virtio_ccw_release_dev(struct device *_d)
 	struct virtio_device *dev = dev_to_virtio(_d);
 	struct virtio_ccw_device *vcdev = to_vc_device(dev);
 
-	__vc_dma_free(&vcdev->vdev, PAGE_SIZE, vcdev->dma_area,
-		      vcdev->dma_area_dma_addr);
+	vc_dma_free_struct(&vcdev->vdev, vcdev->dma_area);
 	kfree(vcdev);
 }
 
@@ -1318,8 +1302,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		goto out_free;
 	}
 
-	vcdev->dma_area = __vc_dma_alloc(&vcdev->vdev, PAGE_SIZE,
-					 &vcdev->dma_area_dma_addr);
+	vc_dma_alloc_struct(&vcdev->vdev, vcdev->dma_area);
 	if (!vcdev->dma_area) {
 		ret = -ENOMEM;
 		goto out_free;
@@ -1360,8 +1343,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	return ret;
 out_free:
 	if (vcdev) {
-		__vc_dma_free(&vcdev->vdev, PAGE_SIZE, vcdev->dma_area,
-			      vcdev->dma_area_dma_addr);
+		vc_dma_free_struct(&vcdev->vdev, vcdev->dma_area);
 	}
 	kfree(vcdev);
 	return ret;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* [RFC PATCH 12/12] virtio/s390:  make airq summary indicators DMA
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
                   ` (10 preceding siblings ...)
  2019-04-04 23:16 ` [RFC PATCH 11/12] virtio/s390: use the cio DMA pool Halil Pasic
@ 2019-04-04 23:16 ` Halil Pasic
  2019-04-10  9:20   ` Cornelia Huck
  2019-04-12 13:47   ` David Hildenbrand
  13 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-04 23:16 UTC (permalink / raw)
  To: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky, Sebastian Ott
  Cc: Halil Pasic, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

Hypervisor needs to interact with the summary indicators, so these
need to be DMA memory as well (at least for protected virtualization
guests).

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index ba1dafe04968..5ef82461d0f2 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -141,11 +141,17 @@ static int virtio_ccw_use_airq = 1;
 
 struct airq_info {
 	rwlock_t lock;
-	u8 summary_indicator;
+	u8 summary_indicator_idx;
 	struct airq_struct airq;
 	struct airq_iv *aiv;
 };
 static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
+static u8 *summary_indicators;
+
+static inline u8 *get_summary_indicator(struct airq_info *info)
+{
+	return summary_indicators + info->summary_indicator_idx;
+}
 
 #define CCW_CMD_SET_VQ 0x13
 #define CCW_CMD_VDEV_RESET 0x33
@@ -229,7 +235,7 @@ static void virtio_airq_handler(struct airq_struct *airq)
 			break;
 		vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai));
 	}
-	info->summary_indicator = 0;
+	*(get_summary_indicator(info)) = 0;
 	smp_wmb();
 	/* Walk through indicators field, summary indicator not active. */
 	for (ai = 0;;) {
@@ -241,7 +247,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
 	read_unlock(&info->lock);
 }
 
-static struct airq_info *new_airq_info(void)
+/* call with airq_areas_lock held */
+static struct airq_info *new_airq_info(int index)
 {
 	struct airq_info *info;
 	int rc;
@@ -256,7 +263,8 @@ static struct airq_info *new_airq_info(void)
 		return NULL;
 	}
 	info->airq.handler = virtio_airq_handler;
-	info->airq.lsi_ptr = &info->summary_indicator;
+	info->summary_indicator_idx = index;
+	info->airq.lsi_ptr = get_summary_indicator(info);
 	info->airq.lsi_mask = 0xff;
 	info->airq.isc = VIRTIO_AIRQ_ISC;
 	rc = register_adapter_interrupt(&info->airq);
@@ -277,8 +285,9 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs,
 	unsigned long bit, flags;
 
 	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
+		/* TODO: this seems to be racy */
 		if (!airq_areas[i])
-			airq_areas[i] = new_airq_info();
+			airq_areas[i] = new_airq_info(i);
 		info = airq_areas[i];
 		if (!info)
 			return 0;
@@ -363,7 +372,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		if (!thinint_area)
 			return;
 		thinint_area->summary_indicator =
-			(unsigned long) &airq_info->summary_indicator;
+			(unsigned long) get_summary_indicator(airq_info);
 		thinint_area->isc = VIRTIO_AIRQ_ISC;
 		ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
 		ccw->count = sizeof(*thinint_area);
@@ -626,7 +635,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
 	}
 	info = vcdev->airq_info;
 	thinint_area->summary_indicator =
-		(unsigned long) &info->summary_indicator;
+		(unsigned long) get_summary_indicator(info);
 	thinint_area->isc = VIRTIO_AIRQ_ISC;
 	ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
 	ccw->flags = CCW_FLAG_SLI;
@@ -1515,6 +1524,7 @@ static int __init virtio_ccw_init(void)
 {
 	/* parse no_auto string before we do anything further */
 	no_auto_parse();
+	summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);
 	return ccw_driver_register(&virtio_ccw_driver);
 }
 device_initcall(virtio_ccw_init);
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
  2019-04-04 23:16 ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Halil Pasic
@ 2019-04-08 11:01     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-08 11:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael S . Tsirkin

On Fri,  5 Apr 2019 01:16:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> establishes a new way of allocating virtqueues (as a part of the effort
> that taught DMA to virtio rings).
> 
> In the future we will want virtio-ccw to use the DMA API as well.
> 
> Let us switch from the legacy method of allocating virtqueues to
> vring_create_virtqueue() as the first step into that direction.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 74c328321889..edf4afe2d688 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c

> @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		err = info->num;
>  		goto out_err;
>  	}
> -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> -	if (info->queue == NULL) {
> -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> -		err = -ENOMEM;
> -		goto out_err;
> -	}
> +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> +				    vdev, true, true, ctx,

This second true means 'may_reduce_num'. Looking at the vring code, it
seems that this parameter is never checked; the code will try to
allocate a smaller queue if it can't get the requested size in any
case... this will probably be a problem for legacy virtio-pci, which
explicitly sets may_reduce_num to false. (I can try to come up with a
patch to fix that.)

> +				    virtio_ccw_kvm_notify, callback, name);
>  
> -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> -				 callback, name);
>  	if (!vq) {
>  		/* For now, we fail if we can't get the requested size. */
>  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		goto out_err;
>  	}
>  
> +

Extra blank line :)

>  	/* Register it with the host. */
> +	queue = virtqueue_get_desc_addr(vq);
>  	if (vcdev->revision == 0) {
> -		info->info_block->l.queue = (__u64)info->queue;
> +		info->info_block->l.queue = queue;
>  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
>  		info->info_block->l.index = i;
>  		info->info_block->l.num = info->num;

You always fill in the size requested by the host, but the actual size
may be smaller (see above). I don't think that is allowed for revision
0 (which implies !virtio-1). You probably need to call
vring_create_virtqueue with may_reduce_num=false for revision 0 (and
wait for the generic vring code to be fixed...)

>  		ccw->count = sizeof(info->info_block->l);
>  	} else {
> -		info->info_block->s.desc = (__u64)info->queue;
> +		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;

Here, you need to obtain the actual number via
virtqueue_get_vring_size().

>  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
@ 2019-04-08 11:01     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-08 11:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Michael S . Tsirkin, Farhan Ali, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:11 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> establishes a new way of allocating virtqueues (as a part of the effort
> that taught DMA to virtio rings).
> 
> In the future we will want virtio-ccw to use the DMA API as well.
> 
> Let us switch from the legacy method of allocating virtqueues to
> vring_create_virtqueue() as the first step into that direction.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
>  1 file changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 74c328321889..edf4afe2d688 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c

> @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		err = info->num;
>  		goto out_err;
>  	}
> -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> -	if (info->queue == NULL) {
> -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> -		err = -ENOMEM;
> -		goto out_err;
> -	}
> +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> +				    vdev, true, true, ctx,

This second true means 'may_reduce_num'. Looking at the vring code, it
seems that this parameter is never checked; the code will try to
allocate a smaller queue if it can't get the requested size in any
case... this will probably be a problem for legacy virtio-pci, which
explicitly sets may_reduce_num to false. (I can try to come up with a
patch to fix that.)

> +				    virtio_ccw_kvm_notify, callback, name);
>  
> -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> -				 callback, name);
>  	if (!vq) {
>  		/* For now, we fail if we can't get the requested size. */
>  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		goto out_err;
>  	}
>  
> +

Extra blank line :)

>  	/* Register it with the host. */
> +	queue = virtqueue_get_desc_addr(vq);
>  	if (vcdev->revision == 0) {
> -		info->info_block->l.queue = (__u64)info->queue;
> +		info->info_block->l.queue = queue;
>  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
>  		info->info_block->l.index = i;
>  		info->info_block->l.num = info->num;

You always fill in the size requested by the host, but the actual size
may be smaller (see above). I don't think that is allowed for revision
0 (which implies !virtio-1). You probably need to call
vring_create_virtqueue with may_reduce_num=false for revision 0 (and
wait for the generic vring code to be fixed...)

>  		ccw->count = sizeof(info->info_block->l);
>  	} else {
> -		info->info_block->s.desc = (__u64)info->queue;
> +		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;

Here, you need to obtain the actual number via
virtqueue_get_vring_size().

>  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
  2019-04-08 11:01     ` Cornelia Huck
@ 2019-04-08 12:37       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 87+ messages in thread
From: Michael S. Tsirkin @ 2019-04-08 12:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Mon, Apr 08, 2019 at 01:01:28PM +0200, Cornelia Huck wrote:
> On Fri,  5 Apr 2019 01:16:11 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > establishes a new way of allocating virtqueues (as a part of the effort
> > that taught DMA to virtio rings).
> > 
> > In the future we will want virtio-ccw to use the DMA API as well.
> > 
> > Let us switch from the legacy method of allocating virtqueues to
> > vring_create_virtqueue() as the first step into that direction.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
> >  1 file changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 74c328321889..edf4afe2d688 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> 
> > @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		err = info->num;
> >  		goto out_err;
> >  	}
> > -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> > -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> > -	if (info->queue == NULL) {
> > -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> > -		err = -ENOMEM;
> > -		goto out_err;
> > -	}
> > +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> > +				    vdev, true, true, ctx,
> 
> This second true means 'may_reduce_num'. Looking at the vring code, it
> seems that this parameter is never checked; the code will try to
> allocate a smaller queue if it can't get the requested size in any
> case... this will probably be a problem for legacy virtio-pci, which
> explicitly sets may_reduce_num to false. (I can try to come up with a
> patch to fix that.)

Yes, pls do. Not too late for a bugfix to go into the current linux.

> > +				    virtio_ccw_kvm_notify, callback, name);
> >  
> > -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> > -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> > -				 callback, name);
> >  	if (!vq) {
> >  		/* For now, we fail if we can't get the requested size. */
> >  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> > @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		goto out_err;
> >  	}
> >  
> > +
> 
> Extra blank line :)
> 
> >  	/* Register it with the host. */
> > +	queue = virtqueue_get_desc_addr(vq);
> >  	if (vcdev->revision == 0) {
> > -		info->info_block->l.queue = (__u64)info->queue;
> > +		info->info_block->l.queue = queue;
> >  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
> >  		info->info_block->l.index = i;
> >  		info->info_block->l.num = info->num;
> 
> You always fill in the size requested by the host, but the actual size
> may be smaller (see above). I don't think that is allowed for revision
> 0 (which implies !virtio-1). You probably need to call
> vring_create_virtqueue with may_reduce_num=false for revision 0 (and
> wait for the generic vring code to be fixed...)
> 
> >  		ccw->count = sizeof(info->info_block->l);
> >  	} else {
> > -		info->info_block->s.desc = (__u64)info->queue;
> > +		info->info_block->s.desc = queue;
> >  		info->info_block->s.index = i;
> >  		info->info_block->s.num = info->num;
> 
> Here, you need to obtain the actual number via
> virtqueue_get_vring_size().
> 
> >  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
@ 2019-04-08 12:37       ` Michael S. Tsirkin
  0 siblings, 0 replies; 87+ messages in thread
From: Michael S. Tsirkin @ 2019-04-08 12:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Halil Pasic,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Mon, Apr 08, 2019 at 01:01:28PM +0200, Cornelia Huck wrote:
> On Fri,  5 Apr 2019 01:16:11 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > establishes a new way of allocating virtqueues (as a part of the effort
> > that taught DMA to virtio rings).
> > 
> > In the future we will want virtio-ccw to use the DMA API as well.
> > 
> > Let us switch from the legacy method of allocating virtqueues to
> > vring_create_virtqueue() as the first step into that direction.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
> >  1 file changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 74c328321889..edf4afe2d688 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> 
> > @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		err = info->num;
> >  		goto out_err;
> >  	}
> > -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> > -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> > -	if (info->queue == NULL) {
> > -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> > -		err = -ENOMEM;
> > -		goto out_err;
> > -	}
> > +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> > +				    vdev, true, true, ctx,
> 
> This second true means 'may_reduce_num'. Looking at the vring code, it
> seems that this parameter is never checked; the code will try to
> allocate a smaller queue if it can't get the requested size in any
> case... this will probably be a problem for legacy virtio-pci, which
> explicitly sets may_reduce_num to false. (I can try to come up with a
> patch to fix that.)

Yes, pls do. Not too late for a bugfix to go into the current linux.

> > +				    virtio_ccw_kvm_notify, callback, name);
> >  
> > -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> > -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> > -				 callback, name);
> >  	if (!vq) {
> >  		/* For now, we fail if we can't get the requested size. */
> >  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> > @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		goto out_err;
> >  	}
> >  
> > +
> 
> Extra blank line :)
> 
> >  	/* Register it with the host. */
> > +	queue = virtqueue_get_desc_addr(vq);
> >  	if (vcdev->revision == 0) {
> > -		info->info_block->l.queue = (__u64)info->queue;
> > +		info->info_block->l.queue = queue;
> >  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
> >  		info->info_block->l.index = i;
> >  		info->info_block->l.num = info->num;
> 
> You always fill in the size requested by the host, but the actual size
> may be smaller (see above). I don't think that is allowed for revision
> 0 (which implies !virtio-1). You probably need to call
> vring_create_virtqueue with may_reduce_num=false for revision 0 (and
> wait for the generic vring code to be fixed...)
> 
> >  		ccw->count = sizeof(info->info_block->l);
> >  	} else {
> > -		info->info_block->s.desc = (__u64)info->queue;
> > +		info->info_block->s.desc = queue;
> >  		info->info_block->s.index = i;
> >  		info->info_block->s.num = info->num;
> 
> Here, you need to obtain the actual number via
> virtqueue_get_vring_size().
> 
> >  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
  2019-04-08 11:01     ` Cornelia Huck
  (?)
  (?)
@ 2019-04-08 13:20     ` Halil Pasic
  -1 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-08 13:20 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman, Michael S . Tsirkin

On Mon, 8 Apr 2019 13:01:28 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:11 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
> > establishes a new way of allocating virtqueues (as a part of the effort
> > that taught DMA to virtio rings).
> > 
> > In the future we will want virtio-ccw to use the DMA API as well.
> > 
> > Let us switch from the legacy method of allocating virtqueues to
> > vring_create_virtqueue() as the first step into that direction.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 ++++++++-------------------
> >  1 file changed, 8 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index 74c328321889..edf4afe2d688 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> 
> > @@ -516,17 +512,10 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		err = info->num;
> >  		goto out_err;
> >  	}
> > -	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
> > -	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
> > -	if (info->queue == NULL) {
> > -		dev_warn(&vcdev->cdev->dev, "no queue\n");
> > -		err = -ENOMEM;
> > -		goto out_err;
> > -	}
> > +	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
> > +				    vdev, true, true, ctx,
> 
> This second true means 'may_reduce_num'. Looking at the vring code, it
> seems that this parameter is never checked; the code will try to
> allocate a smaller queue if it can't get the requested size in any
> case... this will probably be a problem for legacy virtio-pci, which
> explicitly sets may_reduce_num to false. (I can try to come up with a
> patch to fix that.)
> 

Right.

> > +				    virtio_ccw_kvm_notify, callback, name);
> >  
> > -	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
> > -				 true, ctx, info->queue, virtio_ccw_kvm_notify,
> > -				 callback, name);
> >  	if (!vq) {
> >  		/* For now, we fail if we can't get the requested size. */
> >  		dev_warn(&vcdev->cdev->dev, "no vq\n");
> > @@ -534,15 +523,17 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		goto out_err;
> >  	}
> >  
> > +
> 
> Extra blank line :)
> 
> >  	/* Register it with the host. */
> > +	queue = virtqueue_get_desc_addr(vq);
> >  	if (vcdev->revision == 0) {
> > -		info->info_block->l.queue = (__u64)info->queue;
> > +		info->info_block->l.queue = queue;
> >  		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
> >  		info->info_block->l.index = i;
> >  		info->info_block->l.num = info->num;
> 
> You always fill in the size requested by the host, but the actual size
> may be smaller (see above). I don't think that is allowed for revision
> 0 (which implies !virtio-1). You probably need to call
> vring_create_virtqueue with may_reduce_num=false for revision 0 (and
> wait for the generic vring code to be fixed...)

I will have a look into this.

> 
> >  		ccw->count = sizeof(info->info_block->l);
> >  	} else {
> > -		info->info_block->s.desc = (__u64)info->queue;
> > +		info->info_block->s.desc = queue;
> >  		info->info_block->s.index = i;
> >  		info->info_block->s.num = info->num;
> 
> Here, you need to obtain the actual number via
> virtqueue_get_vring_size().
> 

Will change as requested.

Thanks for having a look!

Regards,
Halil

> >  		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
  2019-04-04 23:16 ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Halil Pasic
@ 2019-04-09  9:57     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09  9:57 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:12 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Currently we have a problem if a virtio-ccw device has
> VIRTIO_F_IOMMU_PLATFORM. 

Can you please describe what the actual problem is?

> In future we do want to support DMA API with
> virtio-ccw.
> 
> Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> with virtio-ccw.
> 
> Let us also switch from legacy avail/used accessors to the DMA aware
> ones (even if it isn't strictly necessary).

I think with this change we can remove the legacy accessors, if I
didn't mis-grep.

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index edf4afe2d688..5956c9e820bb 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -66,6 +66,7 @@ struct virtio_ccw_device {
>  	bool device_lost;
>  	unsigned int config_ready;
>  	void *airq_info;
> +	__u64 dma_mask;

u64?

>  };
>  
>  struct vq_info_block_legacy {
> @@ -536,8 +537,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;
> -		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
> -		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
> +		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
> +		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
>  		ccw->count = sizeof(info->info_block->s);
>  	}
>  	ccw->cmd_code = CCW_CMD_SET_VQ;
> @@ -769,10 +770,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
>  static void ccw_transport_features(struct virtio_device *vdev)
>  {
>  	/*
> -	 * Packed ring isn't enabled on virtio_ccw for now,
> -	 * because virtio_ccw uses some legacy accessors,
> -	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
> -	 * which aren't available in packed ring currently.
> +	 * There shouldn't be anything that precludes supporting paced.

s/paced/packed/

> +	 * TODO: Remove the limitation after having another look into this.
>  	 */
>  	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
>  }
> @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  		ret = -ENOMEM;
>  		goto out_free;
>  	}
> +	vcdev->vdev.dev.parent = &cdev->dev;

That one makes sense, pci and mmio are doing that as well.

> +	cdev->dev.dma_mask = &vcdev->dma_mask;

That one feels a bit weird. Will this change in one of the follow-on
patches? (Have not yet looked at the whole series.)

> +
> +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> +	if (ret)
> +		ret = dma_set_mask_and_coherent(&cdev->dev,
> +						DMA_BIT_MASK(32));
> +	if (ret) {
> +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");

This does not look like you'd try to continue?

> +		goto out_free;
> +	}
> +
>  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
>  				   GFP_DMA | GFP_KERNEL);
>  	if (!vcdev->config_block) {

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
@ 2019-04-09  9:57     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09  9:57 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:12 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Currently we have a problem if a virtio-ccw device has
> VIRTIO_F_IOMMU_PLATFORM. 

Can you please describe what the actual problem is?

> In future we do want to support DMA API with
> virtio-ccw.
> 
> Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> with virtio-ccw.
> 
> Let us also switch from legacy avail/used accessors to the DMA aware
> ones (even if it isn't strictly necessary).

I think with this change we can remove the legacy accessors, if I
didn't mis-grep.

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index edf4afe2d688..5956c9e820bb 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -66,6 +66,7 @@ struct virtio_ccw_device {
>  	bool device_lost;
>  	unsigned int config_ready;
>  	void *airq_info;
> +	__u64 dma_mask;

u64?

>  };
>  
>  struct vq_info_block_legacy {
> @@ -536,8 +537,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
>  		info->info_block->s.desc = queue;
>  		info->info_block->s.index = i;
>  		info->info_block->s.num = info->num;
> -		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
> -		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
> +		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
> +		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
>  		ccw->count = sizeof(info->info_block->s);
>  	}
>  	ccw->cmd_code = CCW_CMD_SET_VQ;
> @@ -769,10 +770,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
>  static void ccw_transport_features(struct virtio_device *vdev)
>  {
>  	/*
> -	 * Packed ring isn't enabled on virtio_ccw for now,
> -	 * because virtio_ccw uses some legacy accessors,
> -	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
> -	 * which aren't available in packed ring currently.
> +	 * There shouldn't be anything that precludes supporting paced.

s/paced/packed/

> +	 * TODO: Remove the limitation after having another look into this.
>  	 */
>  	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
>  }
> @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  		ret = -ENOMEM;
>  		goto out_free;
>  	}
> +	vcdev->vdev.dev.parent = &cdev->dev;

That one makes sense, pci and mmio are doing that as well.

> +	cdev->dev.dma_mask = &vcdev->dma_mask;

That one feels a bit weird. Will this change in one of the follow-on
patches? (Have not yet looked at the whole series.)

> +
> +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> +	if (ret)
> +		ret = dma_set_mask_and_coherent(&cdev->dev,
> +						DMA_BIT_MASK(32));
> +	if (ret) {
> +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");

This does not look like you'd try to continue?

> +		goto out_free;
> +	}
> +
>  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
>  				   GFP_DMA | GFP_KERNEL);
>  	if (!vcdev->config_block) {

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
  2019-04-04 23:16 ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Halil Pasic
@ 2019-04-09 10:16     ` Cornelia Huck
  2019-04-09 12:22     ` Christoph Hellwig
  1 sibling, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 10:16 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:13 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On s390 protected virtualization guests also have to use bounce I/O
> buffers.  That requires some plumbing.
> 
> Let us make sure any device using DMA API accordingly is spared from the
> problems that hypervisor attempting I/O to a non-shared secure page would
> bring.

I have problems parsing this sentence :(

Do you mean that we want to exclude pages for I/O from encryption?

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig                   |  4 ++++
>  arch/s390/include/asm/Kbuild        |  1 -
>  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
>  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
>  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 79 insertions(+), 1 deletion(-)
>  create mode 100644 arch/s390/include/asm/dma-mapping.h
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h

(...)

> @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
>  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
>  }
>  
> +int set_memory_encrypted(unsigned long addr, int numpages)
> +{
> +	/* also called for the swiotlb bounce buffers, make all pages shared */
> +	/* TODO: do ultravisor calls */
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> +
> +int set_memory_decrypted(unsigned long addr, int numpages)
> +{
> +	/* also called for the swiotlb bounce buffers, make all pages shared */
> +	/* TODO: do ultravisor calls */
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> +
> +/* are we a protected virtualization guest? */
> +bool sev_active(void)
> +{
> +	/*
> +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> +	 *  have it so the code gets exercised.

That's the swiotlb stuff, right?

(The patches will obviously need some reordering before it is actually
getting merged.)

> +	 */
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(sev_active);
> +
> +/* protected virtualization */
> +static void pv_init(void)
> +{
> +	if (!sev_active())
> +		return;
> +
> +	/* make sure bounce buffers are shared */
> +	swiotlb_init(1);
> +	swiotlb_update_mem_attributes();
> +	swiotlb_force = SWIOTLB_FORCE;
> +}
> +
>  void __init mem_init(void)
>  {
>  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> @@ -134,6 +176,8 @@ void __init mem_init(void)
>  	set_max_mapnr(max_low_pfn);
>          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>  
> +	pv_init();
> +
>  	/* Setup guest page hinting */
>  	cmma_init();
>  

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
@ 2019-04-09 10:16     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 10:16 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:13 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On s390 protected virtualization guests also have to use bounce I/O
> buffers.  That requires some plumbing.
> 
> Let us make sure any device using DMA API accordingly is spared from the
> problems that hypervisor attempting I/O to a non-shared secure page would
> bring.

I have problems parsing this sentence :(

Do you mean that we want to exclude pages for I/O from encryption?

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig                   |  4 ++++
>  arch/s390/include/asm/Kbuild        |  1 -
>  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
>  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
>  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 79 insertions(+), 1 deletion(-)
>  create mode 100644 arch/s390/include/asm/dma-mapping.h
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h

(...)

> @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
>  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
>  }
>  
> +int set_memory_encrypted(unsigned long addr, int numpages)
> +{
> +	/* also called for the swiotlb bounce buffers, make all pages shared */
> +	/* TODO: do ultravisor calls */
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> +
> +int set_memory_decrypted(unsigned long addr, int numpages)
> +{
> +	/* also called for the swiotlb bounce buffers, make all pages shared */
> +	/* TODO: do ultravisor calls */
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> +
> +/* are we a protected virtualization guest? */
> +bool sev_active(void)
> +{
> +	/*
> +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> +	 *  have it so the code gets exercised.

That's the swiotlb stuff, right?

(The patches will obviously need some reordering before it is actually
getting merged.)

> +	 */
> +	return true;
> +}
> +EXPORT_SYMBOL_GPL(sev_active);
> +
> +/* protected virtualization */
> +static void pv_init(void)
> +{
> +	if (!sev_active())
> +		return;
> +
> +	/* make sure bounce buffers are shared */
> +	swiotlb_init(1);
> +	swiotlb_update_mem_attributes();
> +	swiotlb_force = SWIOTLB_FORCE;
> +}
> +
>  void __init mem_init(void)
>  {
>  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> @@ -134,6 +176,8 @@ void __init mem_init(void)
>  	set_max_mapnr(max_low_pfn);
>          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>  
> +	pv_init();
> +
>  	/* Setup guest page hinting */
>  	cmma_init();
>  

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-04 23:16 ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Halil Pasic
@ 2019-04-09 10:44     ` Cornelia Huck
  2019-04-11 18:25     ` Sebastian Ott
  1 sibling, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 10:44 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:14 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> To support protected virtualization cio will need to make sure the
> memory used for communication with the hypervisor is DMA memory.
> 
> Let us introduce a DMA pool to cio that will help us in allocating

missing 'and'

> deallocating those chunks of memory.
> 
> We use a gen_pool backed with DMA pages to avoid each allocation
> effectively wasting a page, as we typically allocate much less
> than PAGE_SIZE.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig           |  1 +
>  arch/s390/include/asm/cio.h |  3 +++
>  drivers/s390/cio/css.c      | 57 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+)
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 46c69283a67b..e8099ab47368 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -194,6 +194,7 @@ config S390
>  	select VIRT_TO_BUS
>  	select HAVE_NMI
>  	select SWIOTLB
> +	select GENERIC_ALLOCATOR
>  
>  
>  config SCHED_OMIT_FRAME_POINTER
> diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
> index 1727180e8ca1..4510e418614a 100644
> --- a/arch/s390/include/asm/cio.h
> +++ b/arch/s390/include/asm/cio.h
> @@ -328,6 +328,9 @@ static inline u8 pathmask_to_pos(u8 mask)
>  void channel_subsystem_reinit(void);
>  extern void css_schedule_reprobe(void);
>  
> +extern void *cio_dma_zalloc(size_t size);
> +extern void cio_dma_free(void *cpu_addr, size_t size);
> +
>  /* Function from drivers/s390/cio/chsc.c */
>  int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
>  int chsc_sstpi(void *page, void *result, size_t size);
> diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> index aea502922646..72629d99d8e4 100644
> --- a/drivers/s390/cio/css.c
> +++ b/drivers/s390/cio/css.c
> @@ -20,6 +20,8 @@
>  #include <linux/reboot.h>
>  #include <linux/suspend.h>
>  #include <linux/proc_fs.h>
> +#include <linux/genalloc.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/isc.h>
>  #include <asm/crw.h>
>  
> @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
>  	NULL,
>  };
>  
> +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> +
>  static int __init setup_css(int nr)
>  {
>  	struct channel_subsystem *css;
> @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
>  	dev_set_name(&css->device, "css%x", nr);
>  	css->device.groups = cssdev_attr_groups;
>  	css->device.release = channel_subsystem_release;
> +	/* some cio DMA memory needs to be 31 bit addressable */
> +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> +	css->device.dma_mask = &css_dev_dma_mask;

Question: Does this percolate down to the child devices eventually?
E.g., you have a ccw_device getting the mask from its parent
subchannel, which gets it from its parent css? If so, does that clash
with the the mask you used for the virtio_ccw_device in a previous
patch? Or are they two really different things?

>  
>  	mutex_init(&css->mutex);
>  	css->cssid = chsc_get_cssid(nr);
> @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
>  	.notifier_call = css_power_event,
>  };
>  
> +#define POOL_INIT_PAGES 1
> +static struct gen_pool *cio_dma_pool;
> +/* Currently cio supports only a single css */
> +static struct device *cio_dma_css;

That global variable feels wrong, especially if you plan to support
MCSS-E in the future. (Do you? :) If yes, should the dma pool be global
or per-css? As css0 currently is the root device for the channel
subsystem stuff, you'd either need a new parent to hang this off from
or size this with the number of css images.)

For now, just grab channel_subsystems[0]->device directly?

> +static gfp_t  cio_dma_flags;
> +
> +static void __init cio_dma_pool_init(void)
> +{
> +	void *cpu_addr;
> +	dma_addr_t dma_addr;
> +	int i;
> +
> +	cio_dma_css = &channel_subsystems[0]->device;
> +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> +	cio_dma_pool = gen_pool_create(3, -1);
> +	/* No need to free up the resources: compiled in */
> +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> +					      cio_dma_flags);
> +		if (!cpu_addr)
> +			return;
> +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> +				  dma_addr, PAGE_SIZE, -1);
> +	}
> +
> +}
> +
> +void *cio_dma_zalloc(size_t size)
> +{
> +	dma_addr_t dma_addr;
> +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> +
> +	if (!addr) {
> +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> +		if (!addr)
> +			return NULL;
> +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> +		addr = gen_pool_alloc(cio_dma_pool, size);
> +	}
> +	return (void *) addr;

At this point, you're always going via the css0 device. I'm wondering
whether you should pass in the cssid here and use css_by_id(cssid) to
make this future proof. But then, I'm not quite clear from which
context this will be called.

> +}
> +
> +void cio_dma_free(void *cpu_addr, size_t size)
> +{
> +	memset(cpu_addr, 0, size);
> +	gen_pool_free(cio_dma_pool, (unsigned long) cpu_addr, size);
> +}
> +
>  /*
>   * Now that the driver core is running, we can setup our channel subsystem.
>   * The struct subchannel's are created during probing.
> @@ -1063,6 +1119,7 @@ static int __init css_bus_init(void)
>  		unregister_reboot_notifier(&css_reboot_notifier);
>  		goto out_unregister;
>  	}
> +	cio_dma_pool_init();
>  	css_init_done = 1;
>  
>  	/* Enable default isc for I/O subchannels. */

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
@ 2019-04-09 10:44     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 10:44 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:14 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> To support protected virtualization cio will need to make sure the
> memory used for communication with the hypervisor is DMA memory.
> 
> Let us introduce a DMA pool to cio that will help us in allocating

missing 'and'

> deallocating those chunks of memory.
> 
> We use a gen_pool backed with DMA pages to avoid each allocation
> effectively wasting a page, as we typically allocate much less
> than PAGE_SIZE.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  arch/s390/Kconfig           |  1 +
>  arch/s390/include/asm/cio.h |  3 +++
>  drivers/s390/cio/css.c      | 57 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 61 insertions(+)
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 46c69283a67b..e8099ab47368 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -194,6 +194,7 @@ config S390
>  	select VIRT_TO_BUS
>  	select HAVE_NMI
>  	select SWIOTLB
> +	select GENERIC_ALLOCATOR
>  
>  
>  config SCHED_OMIT_FRAME_POINTER
> diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
> index 1727180e8ca1..4510e418614a 100644
> --- a/arch/s390/include/asm/cio.h
> +++ b/arch/s390/include/asm/cio.h
> @@ -328,6 +328,9 @@ static inline u8 pathmask_to_pos(u8 mask)
>  void channel_subsystem_reinit(void);
>  extern void css_schedule_reprobe(void);
>  
> +extern void *cio_dma_zalloc(size_t size);
> +extern void cio_dma_free(void *cpu_addr, size_t size);
> +
>  /* Function from drivers/s390/cio/chsc.c */
>  int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
>  int chsc_sstpi(void *page, void *result, size_t size);
> diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> index aea502922646..72629d99d8e4 100644
> --- a/drivers/s390/cio/css.c
> +++ b/drivers/s390/cio/css.c
> @@ -20,6 +20,8 @@
>  #include <linux/reboot.h>
>  #include <linux/suspend.h>
>  #include <linux/proc_fs.h>
> +#include <linux/genalloc.h>
> +#include <linux/dma-mapping.h>
>  #include <asm/isc.h>
>  #include <asm/crw.h>
>  
> @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
>  	NULL,
>  };
>  
> +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> +
>  static int __init setup_css(int nr)
>  {
>  	struct channel_subsystem *css;
> @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
>  	dev_set_name(&css->device, "css%x", nr);
>  	css->device.groups = cssdev_attr_groups;
>  	css->device.release = channel_subsystem_release;
> +	/* some cio DMA memory needs to be 31 bit addressable */
> +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> +	css->device.dma_mask = &css_dev_dma_mask;

Question: Does this percolate down to the child devices eventually?
E.g., you have a ccw_device getting the mask from its parent
subchannel, which gets it from its parent css? If so, does that clash
with the the mask you used for the virtio_ccw_device in a previous
patch? Or are they two really different things?

>  
>  	mutex_init(&css->mutex);
>  	css->cssid = chsc_get_cssid(nr);
> @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
>  	.notifier_call = css_power_event,
>  };
>  
> +#define POOL_INIT_PAGES 1
> +static struct gen_pool *cio_dma_pool;
> +/* Currently cio supports only a single css */
> +static struct device *cio_dma_css;

That global variable feels wrong, especially if you plan to support
MCSS-E in the future. (Do you? :) If yes, should the dma pool be global
or per-css? As css0 currently is the root device for the channel
subsystem stuff, you'd either need a new parent to hang this off from
or size this with the number of css images.)

For now, just grab channel_subsystems[0]->device directly?

> +static gfp_t  cio_dma_flags;
> +
> +static void __init cio_dma_pool_init(void)
> +{
> +	void *cpu_addr;
> +	dma_addr_t dma_addr;
> +	int i;
> +
> +	cio_dma_css = &channel_subsystems[0]->device;
> +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> +	cio_dma_pool = gen_pool_create(3, -1);
> +	/* No need to free up the resources: compiled in */
> +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> +					      cio_dma_flags);
> +		if (!cpu_addr)
> +			return;
> +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> +				  dma_addr, PAGE_SIZE, -1);
> +	}
> +
> +}
> +
> +void *cio_dma_zalloc(size_t size)
> +{
> +	dma_addr_t dma_addr;
> +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> +
> +	if (!addr) {
> +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> +		if (!addr)
> +			return NULL;
> +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> +		addr = gen_pool_alloc(cio_dma_pool, size);
> +	}
> +	return (void *) addr;

At this point, you're always going via the css0 device. I'm wondering
whether you should pass in the cssid here and use css_by_id(cssid) to
make this future proof. But then, I'm not quite clear from which
context this will be called.

> +}
> +
> +void cio_dma_free(void *cpu_addr, size_t size)
> +{
> +	memset(cpu_addr, 0, size);
> +	gen_pool_free(cio_dma_pool, (unsigned long) cpu_addr, size);
> +}
> +
>  /*
>   * Now that the driver core is running, we can setup our channel subsystem.
>   * The struct subchannel's are created during probing.
> @@ -1063,6 +1119,7 @@ static int __init css_bus_init(void)
>  		unregister_reboot_notifier(&css_reboot_notifier);
>  		goto out_unregister;
>  	}
> +	cio_dma_pool_init();
>  	css_init_done = 1;
>  
>  	/* Enable default isc for I/O subchannels. */

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
  2019-04-09 10:16     ` Cornelia Huck
  (?)
@ 2019-04-09 10:54     ` Halil Pasic
  2019-04-09 17:18         ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-09 10:54 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 12:16:47 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:13 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On s390 protected virtualization guests also have to use bounce I/O
> > buffers.  That requires some plumbing.
> > 
> > Let us make sure any device using DMA API accordingly is spared from the
                    ^,                                   ^,
Maybe this helps...

> > problems that hypervisor attempting I/O to a non-shared secure page would
> > bring.
> 
> I have problems parsing this sentence :(
> 
> Do you mean that we want to exclude pages for I/O from encryption?

The intended meaning is:
* Devices that do use DMA API (properly) to get get/map the memory
  that is used to talk to hypervisor should be OK with PV (protected
  virtualizaton). I.e. for such devices PV or not PV is basically
  transparent.
* But if a device does not use DMA API for the memory that is used to
  talk to the hypervisor this patch won't help.

And yes the gist of it is: memory accessed by the hypervisor needs to
be on pages excluded from protection (which in case of PV is technically
not encryption).

Does that help?

> 
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  arch/s390/Kconfig                   |  4 ++++
> >  arch/s390/include/asm/Kbuild        |  1 -
> >  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
> >  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
> >  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
> >  5 files changed, 79 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/s390/include/asm/dma-mapping.h
> >  create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 
> (...)
> 
> > @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
> >  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
> >  }
> >  
> > +int set_memory_encrypted(unsigned long addr, int numpages)
> > +{
> > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > +	/* TODO: do ultravisor calls */
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> > +
> > +int set_memory_decrypted(unsigned long addr, int numpages)
> > +{
> > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > +	/* TODO: do ultravisor calls */
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> > +
> > +/* are we a protected virtualization guest? */
> > +bool sev_active(void)
> > +{
> > +	/*
> > +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> > +	 *  have it so the code gets exercised.
> 
> That's the swiotlb stuff, right?
> 

You mean 'That' == code to get exercised == 'swiotlb stuff'?

If yes then the answer is kind of. The swiotlb (i.e. bounce buffers) is
when we map (like we map the buffers pointed to by the descriptors in
case of the virtio ring). The other part of it is the memory allocated
as DMA coherent (i.e. the virtio ring (desc, avail used) itself).

> (The patches will obviously need some reordering before it is actually
> getting merged.)
> 

What do you mean by reordering?

One reason why this is an early RFC is the missing dependency (i.e. the
stuff described by most of the TODO comments). As pointed out in the
cover letter. Another reason is that I wanted to avoid putting a lots of
effort into fine-polishing before clarifying the getting some feedback
on the basics from the community. ;)


> > +	 */
> > +	return true;
> > +}
> > +EXPORT_SYMBOL_GPL(sev_active);
> > +
> > +/* protected virtualization */
> > +static void pv_init(void)
> > +{
> > +	if (!sev_active())
> > +		return;
> > +
> > +	/* make sure bounce buffers are shared */
> > +	swiotlb_init(1);
> > +	swiotlb_update_mem_attributes();
> > +	swiotlb_force = SWIOTLB_FORCE;
> > +}
> > +
> >  void __init mem_init(void)
> >  {
> >  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> > @@ -134,6 +176,8 @@ void __init mem_init(void)
> >  	set_max_mapnr(max_low_pfn);
> >          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> >  
> > +	pv_init();
> > +
> >  	/* Setup guest page hinting */
> >  	cmma_init();
> >  
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
  2019-04-09  9:57     ` Cornelia Huck
  (?)
@ 2019-04-09 11:29     ` Halil Pasic
  2019-04-09 13:01         ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-09 11:29 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 11:57:43 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:12 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > Currently we have a problem if a virtio-ccw device has
> > VIRTIO_F_IOMMU_PLATFORM. 
> 
> Can you please describe what the actual problem is?
> 

Without this patch:

WARNING: CPU: 2 PID: 26
at [..]/kernel/dma/mapping.c:251
dma_alloc_attrs+0x8e/0xd0 Modules linked in: CPU: 2 PID: 26 Comm:
kworker/u6:1 Not tainted 5.1.0-rc3-00023-g1ec89ec #596 Hardware name:
IBM 2964 NC9 712 (KVM/Linux) Workqueue: events_unbound async_run_entry_fn
Krnl PSW : 0704c00180000000 000000000021b18e (dma_alloc_attrs+0x8e/0xd0)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000
0000000000001406 000003e00040f838 0000000000002dc0 0000000000000100
0000000000000001 0000000000001000 000000000236f028 000003e00040f838
0000000000001406 000000004b289828 0000000000000080 000003e00040f6f8
000003e00040f6a0 Krnl Code: 000000000021b17e: f0e00004ebaf
srp	4(15,%r0),2991(%r14),0 000000000021b184: f0c0000407f4
srp	4(13,%r0),2036,0 #000000000021b18a: a7f40001
brc	15,21b18c >000000000021b18e: ec5520bc0055	risbg
%r5,%r5,32,188,0 000000000021b194: b9020011		ltgr
%r1,%r1 000000000021b198: a784ffd9		brc	8,21b14a
           000000000021b19c: e31010000002	ltg	%r1,0(%r1)
           000000000021b1a2: a7840012		brc	8,21b1c6
Call Trace:
([<0000000000000004>] 0x4)
 [<00000000007a7d54>] vring_alloc_queue+0x74/0x90 
 [<00000000007a8390>] vring_create_virtqueue+0xf8/0x288 
 [<0000000000919ec0>] virtio_ccw_find_vqs+0xf8/0x950 
 [<000000000080772e>] init_vq+0x16e/0x318 
 [<00000000008087c4>] virtblk_probe+0xf4/0xb58 
 [<00000000007a62a6>] virtio_dev_probe+0x1a6/0x250 
 [<00000000007ea498>] really_probe+0x1c8/0x290 
 [<00000000007ea746>] driver_probe_device+0x86/0x160 
 [<00000000007e7cba>] bus_for_each_drv+0x7a/0xc0 
 [<00000000007ea23c>] __device_attach+0xfc/0x180 
 [<00000000007e9116>] bus_probe_device+0xae/0xc8 
 [<00000000007e5066>] device_add+0x3fe/0x698 
 [<00000000007a5d92>] register_virtio_device+0xca/0x120 
 [<00000000009195a2>] virtio_ccw_online+0x1b2/0x220 
 [<000000000089853e>] ccw_device_set_online+0x1d6/0x4d8 
 [<0000000000918cf6>] virtio_ccw_auto_online+0x26/0x58 
 [<00000000001a61b6>] async_run_entry_fn+0x5e/0x158 
 [<0000000000199322>] process_one_work+0x25a/0x668 
 [<000000000019977a>] worker_thread+0x4a/0x428 
 [<00000000001a1ae8>] kthread+0x150/0x170 
 [<0000000000aeab3a>] kernel_thread_starter+0x6/0xc 
 [<0000000000aeab34>] kernel_thread_starter+0x0/0xc 

[..]

virtio_ccw 0.0.0301: no vq
---[ end trace d35815958c12cad3 ]---
virtio_ccw 0.0.0300: no vq
virtio_blk: probe of virtio1 failed with error -12
virtio_blk: probe of virtio3 failed with error -12

Means virtio devices broken.

Should I
s/we have a problem if a virtio-ccw device/virtio-ccw devices do not work if the device/
?

> > In future we do want to support DMA API with
> > virtio-ccw.
> > 
> > Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> > with virtio-ccw.
> > 
> > Let us also switch from legacy avail/used accessors to the DMA aware
> > ones (even if it isn't strictly necessary).
> 
> I think with this change we can remove the legacy accessors, if I
> didn't mis-grep.
> 

That is possible, I can do that in v1.

> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
> >  1 file changed, 17 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index edf4afe2d688..5956c9e820bb 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -66,6 +66,7 @@ struct virtio_ccw_device {
> >  	bool device_lost;
> >  	unsigned int config_ready;
> >  	void *airq_info;
> > +	__u64 dma_mask;
> 
> u64?
> 

Right, has nothing to do with userspace.

> >  };
> >  
> >  struct vq_info_block_legacy {
> > @@ -536,8 +537,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
> >  		info->info_block->s.desc = queue;
> >  		info->info_block->s.index = i;
> >  		info->info_block->s.num = info->num;
> > -		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
> > -		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
> > +		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
> > +		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
> >  		ccw->count = sizeof(info->info_block->s);
> >  	}
> >  	ccw->cmd_code = CCW_CMD_SET_VQ;
> > @@ -769,10 +770,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev)
> >  static void ccw_transport_features(struct virtio_device *vdev)
> >  {
> >  	/*
> > -	 * Packed ring isn't enabled on virtio_ccw for now,
> > -	 * because virtio_ccw uses some legacy accessors,
> > -	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
> > -	 * which aren't available in packed ring currently.
> > +	 * There shouldn't be anything that precludes supporting paced.
> 
> s/paced/packed/

Thx!

> 
> > +	 * TODO: Remove the limitation after having another look into this.
> >  	 */
> >  	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
> >  }
> > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> >  		ret = -ENOMEM;
> >  		goto out_free;
> >  	}
> > +	vcdev->vdev.dev.parent = &cdev->dev;
> 
> That one makes sense, pci and mmio are doing that as well.
> 
> > +	cdev->dev.dma_mask = &vcdev->dma_mask;
> 
> That one feels a bit weird. Will this change in one of the follow-on
> patches? (Have not yet looked at the whole series.)

I don't thinks so. Do you mean this should happen within the cio code?
I think I started out with the idea to keep the scope as narrow as
possible. Do you have any suggestions?

> 
> > +
> > +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> > +	if (ret)
> > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > +						DMA_BIT_MASK(32));
> > +	if (ret) {
> > +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
> 
> This does not look like you'd try to continue?
> 

I remember now. First I did continue, then I changed this to fail hard
so I can not ignore any such problems while smoke testing ('I don't always
check the kernel messages'), but kept the old message. This basically
should not fail anyway, otherwise we have a problem AFAIU.

By the way virtio-pci tries to continue indeed, and this is also where
the wording comes from ;).

What would you prefer? Try to continue or fail right away?

Regards,
Halil

> > +		goto out_free;
> > +	}
> > +
> >  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
> >  				   GFP_DMA | GFP_KERNEL);
> >  	if (!vcdev->config_block) {
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-09 10:44     ` Cornelia Huck
  (?)
@ 2019-04-09 12:11     ` Halil Pasic
  2019-04-09 17:14         ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-09 12:11 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 12:44:58 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:14 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > To support protected virtualization cio will need to make sure the
> > memory used for communication with the hypervisor is DMA memory.
> > 
> > Let us introduce a DMA pool to cio that will help us in allocating
> 
> missing 'and'
> 

Nod.

> > deallocating those chunks of memory.
> > 
> > We use a gen_pool backed with DMA pages to avoid each allocation
> > effectively wasting a page, as we typically allocate much less
> > than PAGE_SIZE.
> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  arch/s390/Kconfig           |  1 +
> >  arch/s390/include/asm/cio.h |  3 +++
> >  drivers/s390/cio/css.c      | 57 +++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 61 insertions(+)
> > 
> > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> > index 46c69283a67b..e8099ab47368 100644
> > --- a/arch/s390/Kconfig
> > +++ b/arch/s390/Kconfig
> > @@ -194,6 +194,7 @@ config S390
> >  	select VIRT_TO_BUS
> >  	select HAVE_NMI
> >  	select SWIOTLB
> > +	select GENERIC_ALLOCATOR
> >  
> >  
> >  config SCHED_OMIT_FRAME_POINTER
> > diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
> > index 1727180e8ca1..4510e418614a 100644
> > --- a/arch/s390/include/asm/cio.h
> > +++ b/arch/s390/include/asm/cio.h
> > @@ -328,6 +328,9 @@ static inline u8 pathmask_to_pos(u8 mask)
> >  void channel_subsystem_reinit(void);
> >  extern void css_schedule_reprobe(void);
> >  
> > +extern void *cio_dma_zalloc(size_t size);
> > +extern void cio_dma_free(void *cpu_addr, size_t size);
> > +
> >  /* Function from drivers/s390/cio/chsc.c */
> >  int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
> >  int chsc_sstpi(void *page, void *result, size_t size);
> > diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> > index aea502922646..72629d99d8e4 100644
> > --- a/drivers/s390/cio/css.c
> > +++ b/drivers/s390/cio/css.c
> > @@ -20,6 +20,8 @@
> >  #include <linux/reboot.h>
> >  #include <linux/suspend.h>
> >  #include <linux/proc_fs.h>
> > +#include <linux/genalloc.h>
> > +#include <linux/dma-mapping.h>
> >  #include <asm/isc.h>
> >  #include <asm/crw.h>
> >  
> > @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
> >  	NULL,
> >  };
> >  
> > +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> > +
> >  static int __init setup_css(int nr)
> >  {
> >  	struct channel_subsystem *css;
> > @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
> >  	dev_set_name(&css->device, "css%x", nr);
> >  	css->device.groups = cssdev_attr_groups;
> >  	css->device.release = channel_subsystem_release;
> > +	/* some cio DMA memory needs to be 31 bit addressable */
> > +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> > +	css->device.dma_mask = &css_dev_dma_mask;
> 
> Question: Does this percolate down to the child devices eventually?
> E.g., you have a ccw_device getting the mask from its parent
> subchannel, which gets it from its parent css? If so, does that clash
> with the the mask you used for the virtio_ccw_device in a previous
> patch? Or are they two really different things?
> 

AFAIK id does not percolate. I could not find anything showing in this
direction in drivers core at least. AFAIU dma_mask is also about the
addressing limitations of the device itself (in the good old pci world,
not really applicable for ccw devices).

Regarding virtio_ccw, I need to set the DMA stuff on the parent device
because that is how the stuff in virtio_ring.c works. There I we can
get away with DMA_BIT_MASK(64) as that stuff is not used in places where
31 bit addresses are required. For the actual ccw stuff I used the
cio DMA pool introduced here.

FWIW the allocation mechanisms provided by  DMA API (as documented) is
not very well suited with what we need for ccw. This is why I choose to
do this gen_pool thing. The gist of it is as-is at the moment we get
page granularity allocations from DMA API. Of course we could use
dma_pool which is kind of perfect for uniformly sized objects. As you
will see in the rest of the series, we have a comparatively small number
of smallish objects of varying sizes. And some of these are short lived.

So the child devices like subchannels and ccw devices do not use DMA API
directly, except for virtio_ccw.

Does all that make any sense to you?


> >  
> >  	mutex_init(&css->mutex);
> >  	css->cssid = chsc_get_cssid(nr);
> > @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
> >  	.notifier_call = css_power_event,
> >  };
> >  
> > +#define POOL_INIT_PAGES 1
> > +static struct gen_pool *cio_dma_pool;
> > +/* Currently cio supports only a single css */
> > +static struct device *cio_dma_css;
> 
> That global variable feels wrong, especially if you plan to support
> MCSS-E in the future. (Do you? :) 

Not that I'm aware of any plans to add support MCSS-E.

> If yes, should the dma pool be global
> or per-css? As css0 currently is the root device for the channel
> subsystem stuff, you'd either need a new parent to hang this off from
> or size this with the number of css images.)
> 
> For now, just grab channel_subsystems[0]->device directly?
> 

Patch 6 gets rid of this variable and adds an accessor instead:

+struct device *cio_get_dma_css_dev(void)
+{
+	return &channel_subsystems[0]->device;
+}
+

I can move that here if you like (for v1).

Yes it is kind of unfortunate that some pieces of this code look
like there could be more than one css, but actually MCSS-E is not
supported. I would prefer sorting these problems out when they arise, if
they ever arise.

> > +static gfp_t  cio_dma_flags;
> > +
> > +static void __init cio_dma_pool_init(void)
> > +{
> > +	void *cpu_addr;
> > +	dma_addr_t dma_addr;
> > +	int i;
> > +
> > +	cio_dma_css = &channel_subsystems[0]->device;
> > +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> > +	cio_dma_pool = gen_pool_create(3, -1);
> > +	/* No need to free up the resources: compiled in */
> > +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> > +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> > +					      cio_dma_flags);
> > +		if (!cpu_addr)
> > +			return;
> > +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> > +				  dma_addr, PAGE_SIZE, -1);
> > +	}
> > +
> > +}
> > +
> > +void *cio_dma_zalloc(size_t size)
> > +{
> > +	dma_addr_t dma_addr;
> > +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> > +
> > +	if (!addr) {
> > +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> > +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> > +		if (!addr)
> > +			return NULL;
> > +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> > +		addr = gen_pool_alloc(cio_dma_pool, size);
> > +	}
> > +	return (void *) addr;
> 
> At this point, you're always going via the css0 device. I'm wondering
> whether you should pass in the cssid here and use css_by_id(cssid) to
> make this future proof. But then, I'm not quite clear from which
> context this will be called.

As said before I don't see MCSS-E coming. Regarding the client code,
please check out the rest of the series. Especially patch 6. 

From my perspective it would be at this stage saner to make it more
obvious that only one css is supported (at the moment), than to implement
MCSS-E, or to make this (kind of) MCSS-E ready.

Regards,
Halil

> 
> > +}
> > +
> > +void cio_dma_free(void *cpu_addr, size_t size)
> > +{
> > +	memset(cpu_addr, 0, size);
> > +	gen_pool_free(cio_dma_pool, (unsigned long) cpu_addr, size);
> > +}
> > +
> >  /*
> >   * Now that the driver core is running, we can setup our channel subsystem.
> >   * The struct subchannel's are created during probing.
> > @@ -1063,6 +1119,7 @@ static int __init css_bus_init(void)
> >  		unregister_reboot_notifier(&css_reboot_notifier);
> >  		goto out_unregister;
> >  	}
> > +	cio_dma_pool_init();
> >  	css_init_done = 1;
> >  
> >  	/* Enable default isc for I/O subchannels. */
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
  2019-04-04 23:16 ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Halil Pasic
@ 2019-04-09 12:22     ` Christoph Hellwig
  2019-04-09 12:22     ` Christoph Hellwig
  1 sibling, 0 replies; 87+ messages in thread
From: Christoph Hellwig @ 2019-04-09 12:22 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

> +++ b/arch/s390/include/asm/dma-mapping.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_S390_DMA_MAPPING_H
> +#define _ASM_S390_DMA_MAPPING_H
> +
> +#include <linux/dma-contiguous.h>
> +
> +static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
> +{
> +	return NULL;
> +}
> +
> +#endif /* _ASM_S390_DMA_MAPPING_H */
> +

Congratulations!  You ust create an entirely pointless duplicate of
include/asm-generic/dma-mapping.h.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
@ 2019-04-09 12:22     ` Christoph Hellwig
  0 siblings, 0 replies; 87+ messages in thread
From: Christoph Hellwig @ 2019-04-09 12:22 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Cornelia Huck, virtualization, Martin Schwidefsky,
	Farhan Ali, Viktor Mihajlovski, Janosch Frank

> +++ b/arch/s390/include/asm/dma-mapping.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_S390_DMA_MAPPING_H
> +#define _ASM_S390_DMA_MAPPING_H
> +
> +#include <linux/dma-contiguous.h>
> +
> +static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
> +{
> +	return NULL;
> +}
> +
> +#endif /* _ASM_S390_DMA_MAPPING_H */
> +

Congratulations!  You ust create an entirely pointless duplicate of
include/asm-generic/dma-mapping.h.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
  2019-04-09 12:22     ` Christoph Hellwig
  (?)
@ 2019-04-09 12:39     ` Halil Pasic
  -1 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-09 12:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Tue, 9 Apr 2019 05:22:41 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> > +++ b/arch/s390/include/asm/dma-mapping.h
> > @@ -0,0 +1,13 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_S390_DMA_MAPPING_H
> > +#define _ASM_S390_DMA_MAPPING_H
> > +
> > +#include <linux/dma-contiguous.h>
> > +
> > +static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
> > +{
> > +	return NULL;
> > +}
> > +
> > +#endif /* _ASM_S390_DMA_MAPPING_H */
> > +
> 
> Congratulations!  You ust create an entirely pointless duplicate of
> include/asm-generic/dma-mapping.h.
> 

Right! Thanks for spotting. There was a stage in development when my
arch/s390/include/asm/dma-mapping.h looked quite different, but I did
end up with an entirely pointless duplicate indeed.

I will fix it for v1.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
  2019-04-09 11:29     ` Halil Pasic
@ 2019-04-09 13:01         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 13:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 13:29:27 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 11:57:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:12 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > Currently we have a problem if a virtio-ccw device has
> > > VIRTIO_F_IOMMU_PLATFORM.   
> > 
> > Can you please describe what the actual problem is?
> >   
> 
> Without this patch:
> 
> WARNING: CPU: 2 PID: 26
> at [..]/kernel/dma/mapping.c:251
> dma_alloc_attrs+0x8e/0xd0 Modules linked in: CPU: 2 PID: 26 Comm:
> kworker/u6:1 Not tainted 5.1.0-rc3-00023-g1ec89ec #596 Hardware name:
> IBM 2964 NC9 712 (KVM/Linux) Workqueue: events_unbound async_run_entry_fn
> Krnl PSW : 0704c00180000000 000000000021b18e (dma_alloc_attrs+0x8e/0xd0)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000
> 0000000000001406 000003e00040f838 0000000000002dc0 0000000000000100
> 0000000000000001 0000000000001000 000000000236f028 000003e00040f838
> 0000000000001406 000000004b289828 0000000000000080 000003e00040f6f8
> 000003e00040f6a0 Krnl Code: 000000000021b17e: f0e00004ebaf
> srp	4(15,%r0),2991(%r14),0 000000000021b184: f0c0000407f4
> srp	4(13,%r0),2036,0 #000000000021b18a: a7f40001
> brc	15,21b18c >000000000021b18e: ec5520bc0055	risbg
> %r5,%r5,32,188,0 000000000021b194: b9020011		ltgr
> %r1,%r1 000000000021b198: a784ffd9		brc	8,21b14a
>            000000000021b19c: e31010000002	ltg	%r1,0(%r1)
>            000000000021b1a2: a7840012		brc	8,21b1c6
> Call Trace:
> ([<0000000000000004>] 0x4)
>  [<00000000007a7d54>] vring_alloc_queue+0x74/0x90 
>  [<00000000007a8390>] vring_create_virtqueue+0xf8/0x288 
>  [<0000000000919ec0>] virtio_ccw_find_vqs+0xf8/0x950 
>  [<000000000080772e>] init_vq+0x16e/0x318 
>  [<00000000008087c4>] virtblk_probe+0xf4/0xb58 
>  [<00000000007a62a6>] virtio_dev_probe+0x1a6/0x250 
>  [<00000000007ea498>] really_probe+0x1c8/0x290 
>  [<00000000007ea746>] driver_probe_device+0x86/0x160 
>  [<00000000007e7cba>] bus_for_each_drv+0x7a/0xc0 
>  [<00000000007ea23c>] __device_attach+0xfc/0x180 
>  [<00000000007e9116>] bus_probe_device+0xae/0xc8 
>  [<00000000007e5066>] device_add+0x3fe/0x698 
>  [<00000000007a5d92>] register_virtio_device+0xca/0x120 
>  [<00000000009195a2>] virtio_ccw_online+0x1b2/0x220 
>  [<000000000089853e>] ccw_device_set_online+0x1d6/0x4d8 
>  [<0000000000918cf6>] virtio_ccw_auto_online+0x26/0x58 
>  [<00000000001a61b6>] async_run_entry_fn+0x5e/0x158 
>  [<0000000000199322>] process_one_work+0x25a/0x668 
>  [<000000000019977a>] worker_thread+0x4a/0x428 
>  [<00000000001a1ae8>] kthread+0x150/0x170 
>  [<0000000000aeab3a>] kernel_thread_starter+0x6/0xc 
>  [<0000000000aeab34>] kernel_thread_starter+0x0/0xc 
> 
> [..]
> 
> virtio_ccw 0.0.0301: no vq
> ---[ end trace d35815958c12cad3 ]---
> virtio_ccw 0.0.0300: no vq
> virtio_blk: probe of virtio1 failed with error -12
> virtio_blk: probe of virtio3 failed with error -12
> 
> Means virtio devices broken.
> 
> Should I
> s/we have a problem if a virtio-ccw device/virtio-ccw devices do not work if the device/
> ?

Much better :)

(That is, this happens if we switch on the feature bit in the
hypervisor, right?)

> 
> > > In future we do want to support DMA API with
> > > virtio-ccw.
> > > 
> > > Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> > > with virtio-ccw.
> > > 
> > > Let us also switch from legacy avail/used accessors to the DMA aware
> > > ones (even if it isn't strictly necessary).  
> > 
> > I think with this change we can remove the legacy accessors, if I
> > didn't mis-grep.
> >   
> 
> That is possible, I can do that in v1.

Sounds good.

> 
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
> > >  1 file changed, 17 insertions(+), 6 deletions(-)

(...)

> > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > >  		ret = -ENOMEM;
> > >  		goto out_free;
> > >  	}
> > > +	vcdev->vdev.dev.parent = &cdev->dev;  
> > 
> > That one makes sense, pci and mmio are doing that as well.
> >   
> > > +	cdev->dev.dma_mask = &vcdev->dma_mask;  
> > 
> > That one feels a bit weird. Will this change in one of the follow-on
> > patches? (Have not yet looked at the whole series.)  
> 
> I don't thinks so. Do you mean this should happen within the cio code?
> I think I started out with the idea to keep the scope as narrow as
> possible. Do you have any suggestions?

From what I see, you set the mask from the virtio-ccw side, then
propagate it up to the general ccw_device, and then the generic virtio
code will fetch it from the ccw_device. Don't you potentially need
something for other ccw_devices in that protected hipervisor case as
well (e.g for 3270)?

> 
> >   
> > > +
> > > +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> > > +	if (ret)
> > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > +						DMA_BIT_MASK(32));
> > > +	if (ret) {
> > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");  
> > 
> > This does not look like you'd try to continue?
> >   
> 
> I remember now. First I did continue, then I changed this to fail hard
> so I can not ignore any such problems while smoke testing ('I don't always
> check the kernel messages'), but kept the old message. This basically
> should not fail anyway, otherwise we have a problem AFAIU.
> 
> By the way virtio-pci tries to continue indeed, and this is also where
> the wording comes from ;).
> 
> What would you prefer? Try to continue or fail right away?

If it does not have a chance of working properly in the general case,
I'd fail.

> 
> Regards,
> Halil
> 
> > > +		goto out_free;
> > > +	}
> > > +
> > >  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
> > >  				   GFP_DMA | GFP_KERNEL);
> > >  	if (!vcdev->config_block) {  
> >   
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
@ 2019-04-09 13:01         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 13:01 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 13:29:27 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 11:57:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:12 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > Currently we have a problem if a virtio-ccw device has
> > > VIRTIO_F_IOMMU_PLATFORM.   
> > 
> > Can you please describe what the actual problem is?
> >   
> 
> Without this patch:
> 
> WARNING: CPU: 2 PID: 26
> at [..]/kernel/dma/mapping.c:251
> dma_alloc_attrs+0x8e/0xd0 Modules linked in: CPU: 2 PID: 26 Comm:
> kworker/u6:1 Not tainted 5.1.0-rc3-00023-g1ec89ec #596 Hardware name:
> IBM 2964 NC9 712 (KVM/Linux) Workqueue: events_unbound async_run_entry_fn
> Krnl PSW : 0704c00180000000 000000000021b18e (dma_alloc_attrs+0x8e/0xd0)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000
> 0000000000001406 000003e00040f838 0000000000002dc0 0000000000000100
> 0000000000000001 0000000000001000 000000000236f028 000003e00040f838
> 0000000000001406 000000004b289828 0000000000000080 000003e00040f6f8
> 000003e00040f6a0 Krnl Code: 000000000021b17e: f0e00004ebaf
> srp	4(15,%r0),2991(%r14),0 000000000021b184: f0c0000407f4
> srp	4(13,%r0),2036,0 #000000000021b18a: a7f40001
> brc	15,21b18c >000000000021b18e: ec5520bc0055	risbg
> %r5,%r5,32,188,0 000000000021b194: b9020011		ltgr
> %r1,%r1 000000000021b198: a784ffd9		brc	8,21b14a
>            000000000021b19c: e31010000002	ltg	%r1,0(%r1)
>            000000000021b1a2: a7840012		brc	8,21b1c6
> Call Trace:
> ([<0000000000000004>] 0x4)
>  [<00000000007a7d54>] vring_alloc_queue+0x74/0x90 
>  [<00000000007a8390>] vring_create_virtqueue+0xf8/0x288 
>  [<0000000000919ec0>] virtio_ccw_find_vqs+0xf8/0x950 
>  [<000000000080772e>] init_vq+0x16e/0x318 
>  [<00000000008087c4>] virtblk_probe+0xf4/0xb58 
>  [<00000000007a62a6>] virtio_dev_probe+0x1a6/0x250 
>  [<00000000007ea498>] really_probe+0x1c8/0x290 
>  [<00000000007ea746>] driver_probe_device+0x86/0x160 
>  [<00000000007e7cba>] bus_for_each_drv+0x7a/0xc0 
>  [<00000000007ea23c>] __device_attach+0xfc/0x180 
>  [<00000000007e9116>] bus_probe_device+0xae/0xc8 
>  [<00000000007e5066>] device_add+0x3fe/0x698 
>  [<00000000007a5d92>] register_virtio_device+0xca/0x120 
>  [<00000000009195a2>] virtio_ccw_online+0x1b2/0x220 
>  [<000000000089853e>] ccw_device_set_online+0x1d6/0x4d8 
>  [<0000000000918cf6>] virtio_ccw_auto_online+0x26/0x58 
>  [<00000000001a61b6>] async_run_entry_fn+0x5e/0x158 
>  [<0000000000199322>] process_one_work+0x25a/0x668 
>  [<000000000019977a>] worker_thread+0x4a/0x428 
>  [<00000000001a1ae8>] kthread+0x150/0x170 
>  [<0000000000aeab3a>] kernel_thread_starter+0x6/0xc 
>  [<0000000000aeab34>] kernel_thread_starter+0x0/0xc 
> 
> [..]
> 
> virtio_ccw 0.0.0301: no vq
> ---[ end trace d35815958c12cad3 ]---
> virtio_ccw 0.0.0300: no vq
> virtio_blk: probe of virtio1 failed with error -12
> virtio_blk: probe of virtio3 failed with error -12
> 
> Means virtio devices broken.
> 
> Should I
> s/we have a problem if a virtio-ccw device/virtio-ccw devices do not work if the device/
> ?

Much better :)

(That is, this happens if we switch on the feature bit in the
hypervisor, right?)

> 
> > > In future we do want to support DMA API with
> > > virtio-ccw.
> > > 
> > > Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
> > > with virtio-ccw.
> > > 
> > > Let us also switch from legacy avail/used accessors to the DMA aware
> > > ones (even if it isn't strictly necessary).  
> > 
> > I think with this change we can remove the legacy accessors, if I
> > didn't mis-grep.
> >   
> 
> That is possible, I can do that in v1.

Sounds good.

> 
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  drivers/s390/virtio/virtio_ccw.c | 23 +++++++++++++++++------
> > >  1 file changed, 17 insertions(+), 6 deletions(-)

(...)

> > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > >  		ret = -ENOMEM;
> > >  		goto out_free;
> > >  	}
> > > +	vcdev->vdev.dev.parent = &cdev->dev;  
> > 
> > That one makes sense, pci and mmio are doing that as well.
> >   
> > > +	cdev->dev.dma_mask = &vcdev->dma_mask;  
> > 
> > That one feels a bit weird. Will this change in one of the follow-on
> > patches? (Have not yet looked at the whole series.)  
> 
> I don't thinks so. Do you mean this should happen within the cio code?
> I think I started out with the idea to keep the scope as narrow as
> possible. Do you have any suggestions?

From what I see, you set the mask from the virtio-ccw side, then
propagate it up to the general ccw_device, and then the generic virtio
code will fetch it from the ccw_device. Don't you potentially need
something for other ccw_devices in that protected hipervisor case as
well (e.g for 3270)?

> 
> >   
> > > +
> > > +	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
> > > +	if (ret)
> > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > +						DMA_BIT_MASK(32));
> > > +	if (ret) {
> > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");  
> > 
> > This does not look like you'd try to continue?
> >   
> 
> I remember now. First I did continue, then I changed this to fail hard
> so I can not ignore any such problems while smoke testing ('I don't always
> check the kernel messages'), but kept the old message. This basically
> should not fail anyway, otherwise we have a problem AFAIU.
> 
> By the way virtio-pci tries to continue indeed, and this is also where
> the wording comes from ;).
> 
> What would you prefer? Try to continue or fail right away?

If it does not have a chance of working properly in the general case,
I'd fail.

> 
> Regards,
> Halil
> 
> > > +		goto out_free;
> > > +	}
> > > +
> > >  	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
> > >  				   GFP_DMA | GFP_KERNEL);
> > >  	if (!vcdev->config_block) {  
> >   
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
  2019-04-09 13:01         ` Cornelia Huck
  (?)
@ 2019-04-09 13:23         ` Halil Pasic
  2019-04-09 15:47             ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-09 13:23 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 15:01:20 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Tue, 9 Apr 2019 13:29:27 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Tue, 9 Apr 2019 11:57:43 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Fri,  5 Apr 2019 01:16:12 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > Currently we have a problem if a virtio-ccw device has
> > > > VIRTIO_F_IOMMU_PLATFORM.   
> > > 
> > > Can you please describe what the actual problem is?
> > >   
> > 
> > Without this patch:

[..]

> > virtio_ccw 0.0.0300: no vq
> > virtio_blk: probe of virtio1 failed with error -12
> > virtio_blk: probe of virtio3 failed with error -12
> > 
> > Means virtio devices broken.
> > 
> > Should I
> > s/we have a problem if a virtio-ccw device/virtio-ccw devices do not work if the device/
> > ?
> 
> Much better :)
> 
> (That is, this happens if we switch on the feature bit in the
> hypervisor, right?)
> 

Yes, that is with qemu -device virtio-blk-ccw,iommu_platform=on (and no
PV whatsoever).

I will change the commit message accordingly. 

[..]

> > > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > > >  		ret = -ENOMEM;
> > > >  		goto out_free;
> > > >  	}
> > > > +	vcdev->vdev.dev.parent = &cdev->dev;  
> > > 
> > > That one makes sense, pci and mmio are doing that as well.
> > >   
> > > > +	cdev->dev.dma_mask = &vcdev->dma_mask;  
> > > 
> > > That one feels a bit weird. Will this change in one of the follow-on
> > > patches? (Have not yet looked at the whole series.)  
> > 
> > I don't thinks so. Do you mean this should happen within the cio code?
> > I think I started out with the idea to keep the scope as narrow as
> > possible. Do you have any suggestions?
> 
> From what I see, you set the mask from the virtio-ccw side, then
> propagate it up to the general ccw_device, and then the generic virtio
> code will fetch it from the ccw_device.

Right! For some reason dma_mask is a pointer. And I need virtio core to
use a sane value for virtio_ccw devices.

> Don't you potentially need
> something for other ccw_devices in that protected hipervisor case as
> well (e.g for 3270)?


Maybe, maybe not. The first stage is likely to be virito only. I would
prefer sorting out stuff like 3270 as the need arises. Also see my
response to patch 4 (Message-Id: <20190409141114.7dcce94a@oc2783563651>).

> 
> > 
> > >   
> > > > +
> > > > +	ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > DMA_BIT_MASK(64));
> > > > +	if (ret)
> > > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > +
> > > > DMA_BIT_MASK(32));
> > > > +	if (ret) {
> > > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit
> > > > or 32-bit DMA.  Trying to continue, but this might not
> > > > work.\n");  
> > > 
> > > This does not look like you'd try to continue?
> > >   
> > 
> > I remember now. First I did continue, then I changed this to fail
> > hard so I can not ignore any such problems while smoke testing ('I
> > don't always check the kernel messages'), but kept the old message.
> > This basically should not fail anyway, otherwise we have a problem
> > AFAIU.
> > 
> > By the way virtio-pci tries to continue indeed, and this is also
> > where the wording comes from ;).
> > 
> > What would you prefer? Try to continue or fail right away?
> 
> If it does not have a chance of working properly in the general case,
> I'd fail.
> 

Agreed! I will make it so. Would dropping '  Trying to continue, but
this might not work.' from the warning message work for you?

I could also drop the attempt to set a 32 bit mask if you agree. Do you?

Many thanks for your review!

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
  2019-04-09 13:23         ` Halil Pasic
@ 2019-04-09 15:47             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 15:47 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 15:23:13 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 15:01:20 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Tue, 9 Apr 2019 13:29:27 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Tue, 9 Apr 2019 11:57:43 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri,  5 Apr 2019 01:16:12 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > > > >  		ret = -ENOMEM;
> > > > >  		goto out_free;
> > > > >  	}
> > > > > +	vcdev->vdev.dev.parent = &cdev->dev;    
> > > > 
> > > > That one makes sense, pci and mmio are doing that as well.
> > > >     
> > > > > +	cdev->dev.dma_mask = &vcdev->dma_mask;    
> > > > 
> > > > That one feels a bit weird. Will this change in one of the follow-on
> > > > patches? (Have not yet looked at the whole series.)    
> > > 
> > > I don't thinks so. Do you mean this should happen within the cio code?
> > > I think I started out with the idea to keep the scope as narrow as
> > > possible. Do you have any suggestions?  
> > 
> > From what I see, you set the mask from the virtio-ccw side, then
> > propagate it up to the general ccw_device, and then the generic virtio
> > code will fetch it from the ccw_device.  
> 
> Right! For some reason dma_mask is a pointer. And I need virtio core to
> use a sane value for virtio_ccw devices.
> 
> > Don't you potentially need
> > something for other ccw_devices in that protected hipervisor case as
> > well (e.g for 3270)?  
> 
> 
> Maybe, maybe not. The first stage is likely to be virito only. I would
> prefer sorting out stuff like 3270 as the need arises. Also see my
> response to patch 4 (Message-Id: <20190409141114.7dcce94a@oc2783563651>).

As long as the infrastructure is flexible enough to be extended later,
ok. I still need to read that mail.

> 
> >   
> > >   
> > > >     
> > > > > +
> > > > > +	ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > > DMA_BIT_MASK(64));
> > > > > +	if (ret)
> > > > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > > +
> > > > > DMA_BIT_MASK(32));
> > > > > +	if (ret) {
> > > > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit
> > > > > or 32-bit DMA.  Trying to continue, but this might not
> > > > > work.\n");    
> > > > 
> > > > This does not look like you'd try to continue?
> > > >     
> > > 
> > > I remember now. First I did continue, then I changed this to fail
> > > hard so I can not ignore any such problems while smoke testing ('I
> > > don't always check the kernel messages'), but kept the old message.
> > > This basically should not fail anyway, otherwise we have a problem
> > > AFAIU.
> > > 
> > > By the way virtio-pci tries to continue indeed, and this is also
> > > where the wording comes from ;).
> > > 
> > > What would you prefer? Try to continue or fail right away?  
> > 
> > If it does not have a chance of working properly in the general case,
> > I'd fail.
> >   
> 
> Agreed! I will make it so. Would dropping '  Trying to continue, but
> this might not work.' from the warning message work for you?

Sounds fine.

> 
> I could also drop the attempt to set a 32 bit mask if you agree. Do you?

Only if you also drop it from the message as well ;)

Not sure in what cases you'll fail to set a 64 bit mask, but succeed
with a 32 bit mask. If there's no sensible situation where that might
happen, I'd just go ahead and drop it.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw
@ 2019-04-09 15:47             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 15:47 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 15:23:13 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 15:01:20 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Tue, 9 Apr 2019 13:29:27 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Tue, 9 Apr 2019 11:57:43 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri,  5 Apr 2019 01:16:12 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > > > @@ -1255,6 +1254,18 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> > > > >  		ret = -ENOMEM;
> > > > >  		goto out_free;
> > > > >  	}
> > > > > +	vcdev->vdev.dev.parent = &cdev->dev;    
> > > > 
> > > > That one makes sense, pci and mmio are doing that as well.
> > > >     
> > > > > +	cdev->dev.dma_mask = &vcdev->dma_mask;    
> > > > 
> > > > That one feels a bit weird. Will this change in one of the follow-on
> > > > patches? (Have not yet looked at the whole series.)    
> > > 
> > > I don't thinks so. Do you mean this should happen within the cio code?
> > > I think I started out with the idea to keep the scope as narrow as
> > > possible. Do you have any suggestions?  
> > 
> > From what I see, you set the mask from the virtio-ccw side, then
> > propagate it up to the general ccw_device, and then the generic virtio
> > code will fetch it from the ccw_device.  
> 
> Right! For some reason dma_mask is a pointer. And I need virtio core to
> use a sane value for virtio_ccw devices.
> 
> > Don't you potentially need
> > something for other ccw_devices in that protected hipervisor case as
> > well (e.g for 3270)?  
> 
> 
> Maybe, maybe not. The first stage is likely to be virito only. I would
> prefer sorting out stuff like 3270 as the need arises. Also see my
> response to patch 4 (Message-Id: <20190409141114.7dcce94a@oc2783563651>).

As long as the infrastructure is flexible enough to be extended later,
ok. I still need to read that mail.

> 
> >   
> > >   
> > > >     
> > > > > +
> > > > > +	ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > > DMA_BIT_MASK(64));
> > > > > +	if (ret)
> > > > > +		ret = dma_set_mask_and_coherent(&cdev->dev,
> > > > > +
> > > > > DMA_BIT_MASK(32));
> > > > > +	if (ret) {
> > > > > +		dev_warn(&cdev->dev, "Failed to enable 64-bit
> > > > > or 32-bit DMA.  Trying to continue, but this might not
> > > > > work.\n");    
> > > > 
> > > > This does not look like you'd try to continue?
> > > >     
> > > 
> > > I remember now. First I did continue, then I changed this to fail
> > > hard so I can not ignore any such problems while smoke testing ('I
> > > don't always check the kernel messages'), but kept the old message.
> > > This basically should not fail anyway, otherwise we have a problem
> > > AFAIU.
> > > 
> > > By the way virtio-pci tries to continue indeed, and this is also
> > > where the wording comes from ;).
> > > 
> > > What would you prefer? Try to continue or fail right away?  
> > 
> > If it does not have a chance of working properly in the general case,
> > I'd fail.
> >   
> 
> Agreed! I will make it so. Would dropping '  Trying to continue, but
> this might not work.' from the warning message work for you?

Sounds fine.

> 
> I could also drop the attempt to set a 32 bit mask if you agree. Do you?

Only if you also drop it from the message as well ;)

Not sure in what cases you'll fail to set a 64 bit mask, but succeed
with a 32 bit mask. If there's no sensible situation where that might
happen, I'd just go ahead and drop it.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-09 12:11     ` Halil Pasic
@ 2019-04-09 17:14         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:14 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 14:11:14 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 12:44:58 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:14 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
> > >  	NULL,
> > >  };
> > >  
> > > +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> > > +
> > >  static int __init setup_css(int nr)
> > >  {
> > >  	struct channel_subsystem *css;
> > > @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
> > >  	dev_set_name(&css->device, "css%x", nr);
> > >  	css->device.groups = cssdev_attr_groups;
> > >  	css->device.release = channel_subsystem_release;
> > > +	/* some cio DMA memory needs to be 31 bit addressable */
> > > +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> > > +	css->device.dma_mask = &css_dev_dma_mask;  
> > 
> > Question: Does this percolate down to the child devices eventually?
> > E.g., you have a ccw_device getting the mask from its parent
> > subchannel, which gets it from its parent css? If so, does that clash
> > with the the mask you used for the virtio_ccw_device in a previous
> > patch? Or are they two really different things?
> >   
> 
> AFAIK id does not percolate. I could not find anything showing in this
> direction in drivers core at least. AFAIU dma_mask is also about the
> addressing limitations of the device itself (in the good old pci world,
> not really applicable for ccw devices).
> 
> Regarding virtio_ccw, I need to set the DMA stuff on the parent device
> because that is how the stuff in virtio_ring.c works. There I we can
> get away with DMA_BIT_MASK(64) as that stuff is not used in places where
> 31 bit addresses are required. For the actual ccw stuff I used the
> cio DMA pool introduced here.

Ok, so those are two different users then.

> 
> FWIW the allocation mechanisms provided by  DMA API (as documented) is
> not very well suited with what we need for ccw. This is why I choose to
> do this gen_pool thing. The gist of it is as-is at the moment we get
> page granularity allocations from DMA API. Of course we could use
> dma_pool which is kind of perfect for uniformly sized objects. As you
> will see in the rest of the series, we have a comparatively small number
> of smallish objects of varying sizes. And some of these are short lived.
> 
> So the child devices like subchannels and ccw devices do not use DMA API
> directly, except for virtio_ccw.
> 
> Does all that make any sense to you?

I think I need to read more of the series, but that should be enough
info to get me started :)

> 
> 
> > >  
> > >  	mutex_init(&css->mutex);
> > >  	css->cssid = chsc_get_cssid(nr);
> > > @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
> > >  	.notifier_call = css_power_event,
> > >  };
> > >  
> > > +#define POOL_INIT_PAGES 1
> > > +static struct gen_pool *cio_dma_pool;
> > > +/* Currently cio supports only a single css */
> > > +static struct device *cio_dma_css;  
> > 
> > That global variable feels wrong, especially if you plan to support
> > MCSS-E in the future. (Do you? :)   
> 
> Not that I'm aware of any plans to add support MCSS-E.
> 
> > If yes, should the dma pool be global
> > or per-css? As css0 currently is the root device for the channel
> > subsystem stuff, you'd either need a new parent to hang this off from
> > or size this with the number of css images.)
> > 
> > For now, just grab channel_subsystems[0]->device directly?
> >   
> 
> Patch 6 gets rid of this variable and adds an accessor instead:
> 
> +struct device *cio_get_dma_css_dev(void)
> +{
> +	return &channel_subsystems[0]->device;
> +}
> +
> 
> I can move that here if you like (for v1).

An accessor looks more sane than a global variable, yes.

> 
> Yes it is kind of unfortunate that some pieces of this code look
> like there could be more than one css, but actually MCSS-E is not
> supported. I would prefer sorting these problems out when they arise, if
> they ever arise.

As long as it's not too unreasonable, I think we should keep the
infrastructure for multiple css instances in place.

> 
> > > +static gfp_t  cio_dma_flags;
> > > +
> > > +static void __init cio_dma_pool_init(void)
> > > +{
> > > +	void *cpu_addr;
> > > +	dma_addr_t dma_addr;
> > > +	int i;
> > > +
> > > +	cio_dma_css = &channel_subsystems[0]->device;
> > > +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> > > +	cio_dma_pool = gen_pool_create(3, -1);
> > > +	/* No need to free up the resources: compiled in */
> > > +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> > > +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> > > +					      cio_dma_flags);
> > > +		if (!cpu_addr)
> > > +			return;
> > > +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> > > +				  dma_addr, PAGE_SIZE, -1);
> > > +	}
> > > +
> > > +}
> > > +
> > > +void *cio_dma_zalloc(size_t size)
> > > +{
> > > +	dma_addr_t dma_addr;
> > > +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> > > +
> > > +	if (!addr) {
> > > +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> > > +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> > > +		if (!addr)
> > > +			return NULL;
> > > +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> > > +		addr = gen_pool_alloc(cio_dma_pool, size);
> > > +	}
> > > +	return (void *) addr;  
> > 
> > At this point, you're always going via the css0 device. I'm wondering
> > whether you should pass in the cssid here and use css_by_id(cssid) to
> > make this future proof. But then, I'm not quite clear from which
> > context this will be called.  
> 
> As said before I don't see MCSS-E coming. Regarding the client code,
> please check out the rest of the series. Especially patch 6. 
> 
> From my perspective it would be at this stage saner to make it more
> obvious that only one css is supported (at the moment), than to implement
> MCSS-E, or to make this (kind of) MCSS-E ready.

I disagree. I think there's value in keeping the interfaces clean
(within reasonable bounds, of course.) Even if there is no
implementation of MCSS-E other than QEMU... it is probably a good idea
to spend some brain cycles to make this conceptually clean.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
@ 2019-04-09 17:14         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:14 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 14:11:14 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 12:44:58 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:14 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > @@ -886,6 +888,8 @@ static const struct attribute_group *cssdev_attr_groups[] = {
> > >  	NULL,
> > >  };
> > >  
> > > +static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
> > > +
> > >  static int __init setup_css(int nr)
> > >  {
> > >  	struct channel_subsystem *css;
> > > @@ -899,6 +903,9 @@ static int __init setup_css(int nr)
> > >  	dev_set_name(&css->device, "css%x", nr);
> > >  	css->device.groups = cssdev_attr_groups;
> > >  	css->device.release = channel_subsystem_release;
> > > +	/* some cio DMA memory needs to be 31 bit addressable */
> > > +	css->device.coherent_dma_mask = DMA_BIT_MASK(31),
> > > +	css->device.dma_mask = &css_dev_dma_mask;  
> > 
> > Question: Does this percolate down to the child devices eventually?
> > E.g., you have a ccw_device getting the mask from its parent
> > subchannel, which gets it from its parent css? If so, does that clash
> > with the the mask you used for the virtio_ccw_device in a previous
> > patch? Or are they two really different things?
> >   
> 
> AFAIK id does not percolate. I could not find anything showing in this
> direction in drivers core at least. AFAIU dma_mask is also about the
> addressing limitations of the device itself (in the good old pci world,
> not really applicable for ccw devices).
> 
> Regarding virtio_ccw, I need to set the DMA stuff on the parent device
> because that is how the stuff in virtio_ring.c works. There I we can
> get away with DMA_BIT_MASK(64) as that stuff is not used in places where
> 31 bit addresses are required. For the actual ccw stuff I used the
> cio DMA pool introduced here.

Ok, so those are two different users then.

> 
> FWIW the allocation mechanisms provided by  DMA API (as documented) is
> not very well suited with what we need for ccw. This is why I choose to
> do this gen_pool thing. The gist of it is as-is at the moment we get
> page granularity allocations from DMA API. Of course we could use
> dma_pool which is kind of perfect for uniformly sized objects. As you
> will see in the rest of the series, we have a comparatively small number
> of smallish objects of varying sizes. And some of these are short lived.
> 
> So the child devices like subchannels and ccw devices do not use DMA API
> directly, except for virtio_ccw.
> 
> Does all that make any sense to you?

I think I need to read more of the series, but that should be enough
info to get me started :)

> 
> 
> > >  
> > >  	mutex_init(&css->mutex);
> > >  	css->cssid = chsc_get_cssid(nr);
> > > @@ -1018,6 +1025,55 @@ static struct notifier_block css_power_notifier = {
> > >  	.notifier_call = css_power_event,
> > >  };
> > >  
> > > +#define POOL_INIT_PAGES 1
> > > +static struct gen_pool *cio_dma_pool;
> > > +/* Currently cio supports only a single css */
> > > +static struct device *cio_dma_css;  
> > 
> > That global variable feels wrong, especially if you plan to support
> > MCSS-E in the future. (Do you? :)   
> 
> Not that I'm aware of any plans to add support MCSS-E.
> 
> > If yes, should the dma pool be global
> > or per-css? As css0 currently is the root device for the channel
> > subsystem stuff, you'd either need a new parent to hang this off from
> > or size this with the number of css images.)
> > 
> > For now, just grab channel_subsystems[0]->device directly?
> >   
> 
> Patch 6 gets rid of this variable and adds an accessor instead:
> 
> +struct device *cio_get_dma_css_dev(void)
> +{
> +	return &channel_subsystems[0]->device;
> +}
> +
> 
> I can move that here if you like (for v1).

An accessor looks more sane than a global variable, yes.

> 
> Yes it is kind of unfortunate that some pieces of this code look
> like there could be more than one css, but actually MCSS-E is not
> supported. I would prefer sorting these problems out when they arise, if
> they ever arise.

As long as it's not too unreasonable, I think we should keep the
infrastructure for multiple css instances in place.

> 
> > > +static gfp_t  cio_dma_flags;
> > > +
> > > +static void __init cio_dma_pool_init(void)
> > > +{
> > > +	void *cpu_addr;
> > > +	dma_addr_t dma_addr;
> > > +	int i;
> > > +
> > > +	cio_dma_css = &channel_subsystems[0]->device;
> > > +	cio_dma_flags = GFP_DMA | GFP_KERNEL | __GFP_ZERO;
> > > +	cio_dma_pool = gen_pool_create(3, -1);
> > > +	/* No need to free up the resources: compiled in */
> > > +	for (i = 0; i < POOL_INIT_PAGES; ++i) {
> > > +		cpu_addr = dma_alloc_coherent(cio_dma_css, PAGE_SIZE, &dma_addr,
> > > +					      cio_dma_flags);
> > > +		if (!cpu_addr)
> > > +			return;
> > > +		gen_pool_add_virt(cio_dma_pool, (unsigned long) cpu_addr,
> > > +				  dma_addr, PAGE_SIZE, -1);
> > > +	}
> > > +
> > > +}
> > > +
> > > +void *cio_dma_zalloc(size_t size)
> > > +{
> > > +	dma_addr_t dma_addr;
> > > +	unsigned long addr = gen_pool_alloc(cio_dma_pool, size);
> > > +
> > > +	if (!addr) {
> > > +		addr = (unsigned long) dma_alloc_coherent(cio_dma_css,
> > > +					PAGE_SIZE, &dma_addr, cio_dma_flags);
> > > +		if (!addr)
> > > +			return NULL;
> > > +		gen_pool_add_virt(cio_dma_pool, addr, dma_addr, PAGE_SIZE, -1);
> > > +		addr = gen_pool_alloc(cio_dma_pool, size);
> > > +	}
> > > +	return (void *) addr;  
> > 
> > At this point, you're always going via the css0 device. I'm wondering
> > whether you should pass in the cssid here and use css_by_id(cssid) to
> > make this future proof. But then, I'm not quite clear from which
> > context this will be called.  
> 
> As said before I don't see MCSS-E coming. Regarding the client code,
> please check out the rest of the series. Especially patch 6. 
> 
> From my perspective it would be at this stage saner to make it more
> obvious that only one css is supported (at the moment), than to implement
> MCSS-E, or to make this (kind of) MCSS-E ready.

I disagree. I think there's value in keeping the interfaces clean
(within reasonable bounds, of course.) Even if there is no
implementation of MCSS-E other than QEMU... it is probably a good idea
to spend some brain cycles to make this conceptually clean.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
  2019-04-09 10:54     ` Halil Pasic
@ 2019-04-09 17:18         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:18 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 12:54:16 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 12:16:47 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:13 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On s390 protected virtualization guests also have to use bounce I/O
> > > buffers.  That requires some plumbing.
> > > 
> > > Let us make sure any device using DMA API accordingly is spared from the  
>                     ^,                                   ^,
> Maybe this helps...
> 
> > > problems that hypervisor attempting I/O to a non-shared secure page would
> > > bring.  
> > 
> > I have problems parsing this sentence :(
> > 
> > Do you mean that we want to exclude pages for I/O from encryption?  
> 
> The intended meaning is:
> * Devices that do use DMA API (properly) to get get/map the memory
>   that is used to talk to hypervisor should be OK with PV (protected
>   virtualizaton). I.e. for such devices PV or not PV is basically
>   transparent.
> * But if a device does not use DMA API for the memory that is used to
>   talk to the hypervisor this patch won't help.
> 
> And yes the gist of it is: memory accessed by the hypervisor needs to
> be on pages excluded from protection (which in case of PV is technically
> not encryption).
> 
> Does that help?

Hm, let me sleep on this. The original sentence was a bit too
convoluted for me...

> 
> >   
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  arch/s390/Kconfig                   |  4 ++++
> > >  arch/s390/include/asm/Kbuild        |  1 -
> > >  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
> > >  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
> > >  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
> > >  5 files changed, 79 insertions(+), 1 deletion(-)
> > >  create mode 100644 arch/s390/include/asm/dma-mapping.h
> > >  create mode 100644 arch/s390/include/asm/mem_encrypt.h  
> > 
> > (...)
> >   
> > > @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
> > >  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
> > >  }
> > >  
> > > +int set_memory_encrypted(unsigned long addr, int numpages)
> > > +{
> > > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > > +	/* TODO: do ultravisor calls */
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> > > +
> > > +int set_memory_decrypted(unsigned long addr, int numpages)
> > > +{
> > > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > > +	/* TODO: do ultravisor calls */
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> > > +
> > > +/* are we a protected virtualization guest? */
> > > +bool sev_active(void)
> > > +{
> > > +	/*
> > > +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> > > +	 *  have it so the code gets exercised.  
> > 
> > That's the swiotlb stuff, right?
> >   
> 
> You mean 'That' == code to get exercised == 'swiotlb stuff'?
> 
> If yes then the answer is kind of. The swiotlb (i.e. bounce buffers) is
> when we map (like we map the buffers pointed to by the descriptors in
> case of the virtio ring). The other part of it is the memory allocated
> as DMA coherent (i.e. the virtio ring (desc, avail used) itself).

Ok.

> 
> > (The patches will obviously need some reordering before it is actually
> > getting merged.)
> >   
> 
> What do you mean by reordering?
> 
> One reason why this is an early RFC is the missing dependency (i.e. the
> stuff described by most of the TODO comments). As pointed out in the
> cover letter. Another reason is that I wanted to avoid putting a lots of
> effort into fine-polishing before clarifying the getting some feedback
> on the basics from the community. ;)

Sure. I'm just reading top-down and unconditionally enabling this is
something that obviously needs to be changed in later iterations ;)

> 
> 
> > > +	 */
> > > +	return true;
> > > +}
> > > +EXPORT_SYMBOL_GPL(sev_active);
> > > +
> > > +/* protected virtualization */
> > > +static void pv_init(void)
> > > +{
> > > +	if (!sev_active())
> > > +		return;
> > > +
> > > +	/* make sure bounce buffers are shared */
> > > +	swiotlb_init(1);
> > > +	swiotlb_update_mem_attributes();
> > > +	swiotlb_force = SWIOTLB_FORCE;
> > > +}
> > > +
> > >  void __init mem_init(void)
> > >  {
> > >  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> > > @@ -134,6 +176,8 @@ void __init mem_init(void)
> > >  	set_max_mapnr(max_low_pfn);
> > >          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> > >  
> > > +	pv_init();
> > > +
> > >  	/* Setup guest page hinting */
> > >  	cmma_init();
> > >    
> >   
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization
@ 2019-04-09 17:18         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:18 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Tue, 9 Apr 2019 12:54:16 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 12:16:47 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:13 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On s390 protected virtualization guests also have to use bounce I/O
> > > buffers.  That requires some plumbing.
> > > 
> > > Let us make sure any device using DMA API accordingly is spared from the  
>                     ^,                                   ^,
> Maybe this helps...
> 
> > > problems that hypervisor attempting I/O to a non-shared secure page would
> > > bring.  
> > 
> > I have problems parsing this sentence :(
> > 
> > Do you mean that we want to exclude pages for I/O from encryption?  
> 
> The intended meaning is:
> * Devices that do use DMA API (properly) to get get/map the memory
>   that is used to talk to hypervisor should be OK with PV (protected
>   virtualizaton). I.e. for such devices PV or not PV is basically
>   transparent.
> * But if a device does not use DMA API for the memory that is used to
>   talk to the hypervisor this patch won't help.
> 
> And yes the gist of it is: memory accessed by the hypervisor needs to
> be on pages excluded from protection (which in case of PV is technically
> not encryption).
> 
> Does that help?

Hm, let me sleep on this. The original sentence was a bit too
convoluted for me...

> 
> >   
> > > 
> > > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > > ---
> > >  arch/s390/Kconfig                   |  4 ++++
> > >  arch/s390/include/asm/Kbuild        |  1 -
> > >  arch/s390/include/asm/dma-mapping.h | 13 +++++++++++
> > >  arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++++
> > >  arch/s390/mm/init.c                 | 44 +++++++++++++++++++++++++++++++++++++
> > >  5 files changed, 79 insertions(+), 1 deletion(-)
> > >  create mode 100644 arch/s390/include/asm/dma-mapping.h
> > >  create mode 100644 arch/s390/include/asm/mem_encrypt.h  
> > 
> > (...)
> >   
> > > @@ -126,6 +129,45 @@ void mark_rodata_ro(void)
> > >  	pr_info("Write protected read-only-after-init data: %luk\n", size >> 10);
> > >  }
> > >  
> > > +int set_memory_encrypted(unsigned long addr, int numpages)
> > > +{
> > > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > > +	/* TODO: do ultravisor calls */
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_memory_encrypted);
> > > +
> > > +int set_memory_decrypted(unsigned long addr, int numpages)
> > > +{
> > > +	/* also called for the swiotlb bounce buffers, make all pages shared */
> > > +	/* TODO: do ultravisor calls */
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(set_memory_decrypted);
> > > +
> > > +/* are we a protected virtualization guest? */
> > > +bool sev_active(void)
> > > +{
> > > +	/*
> > > +	 * TODO: Do proper detection using ultravisor, for now let us fake we
> > > +	 *  have it so the code gets exercised.  
> > 
> > That's the swiotlb stuff, right?
> >   
> 
> You mean 'That' == code to get exercised == 'swiotlb stuff'?
> 
> If yes then the answer is kind of. The swiotlb (i.e. bounce buffers) is
> when we map (like we map the buffers pointed to by the descriptors in
> case of the virtio ring). The other part of it is the memory allocated
> as DMA coherent (i.e. the virtio ring (desc, avail used) itself).

Ok.

> 
> > (The patches will obviously need some reordering before it is actually
> > getting merged.)
> >   
> 
> What do you mean by reordering?
> 
> One reason why this is an early RFC is the missing dependency (i.e. the
> stuff described by most of the TODO comments). As pointed out in the
> cover letter. Another reason is that I wanted to avoid putting a lots of
> effort into fine-polishing before clarifying the getting some feedback
> on the basics from the community. ;)

Sure. I'm just reading top-down and unconditionally enabling this is
something that obviously needs to be changed in later iterations ;)

> 
> 
> > > +	 */
> > > +	return true;
> > > +}
> > > +EXPORT_SYMBOL_GPL(sev_active);
> > > +
> > > +/* protected virtualization */
> > > +static void pv_init(void)
> > > +{
> > > +	if (!sev_active())
> > > +		return;
> > > +
> > > +	/* make sure bounce buffers are shared */
> > > +	swiotlb_init(1);
> > > +	swiotlb_update_mem_attributes();
> > > +	swiotlb_force = SWIOTLB_FORCE;
> > > +}
> > > +
> > >  void __init mem_init(void)
> > >  {
> > >  	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
> > > @@ -134,6 +176,8 @@ void __init mem_init(void)
> > >  	set_max_mapnr(max_low_pfn);
> > >          high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> > >  
> > > +	pv_init();
> > > +
> > >  	/* Setup guest page hinting */
> > >  	cmma_init();
> > >    
> >   
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-04 23:16 ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Halil Pasic
@ 2019-04-09 17:55     ` Cornelia Huck
  2019-04-11 14:15     ` Sebastian Ott
  1 sibling, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:55 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:15 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Virtio-ccw relies on cio mechanisms for bootstrapping the ccw device.

Well, a ccw device is, by definition, using cio mechanisms ;)

Better say: "As virtio-ccw devices are channel devices, we need to use
the dma area for any communication with the hypervisor."
Or something like that.

> Thus we need to make sure any memory that is used for communication with
> the hypervisor is shared.

In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
Hypervisors implement protected virtualization, we probably need to
make sure that all common I/O layer control blocks are in the dma area
(including e.g. QDIO), not just what virtio-ccw devices use.

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/cio/ccwreq.c        |  8 +++----
>  drivers/s390/cio/device.c        | 46 +++++++++++++++++++++++++++++++---------
>  drivers/s390/cio/device_fsm.c    | 40 +++++++++++++++++-----------------
>  drivers/s390/cio/device_id.c     | 18 ++++++++--------
>  drivers/s390/cio/device_ops.c    |  4 ++--
>  drivers/s390/cio/device_pgid.c   | 20 ++++++++---------
>  drivers/s390/cio/device_status.c | 24 ++++++++++-----------
>  drivers/s390/cio/io_sch.h        | 19 ++++++++++++-----
>  8 files changed, 107 insertions(+), 72 deletions(-)
> 

(...)

(just looking at which fields are moved for now)

> diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
> index 90e4e3a7841b..fc3220fede0f 100644
> --- a/drivers/s390/cio/io_sch.h
> +++ b/drivers/s390/cio/io_sch.h
> @@ -9,15 +9,20 @@
>  #include "css.h"
>  #include "orb.h"
>  
> +
> +struct io_subchannel_dma_area {
> +	struct ccw1 sense_ccw;	/* static ccw for sense command */

The ccw makes sense.

> +};
> +
>  struct io_subchannel_private {
>  	union orb orb;		/* operation request block */
> -	struct ccw1 sense_ccw;	/* static ccw for sense command */
>  	struct ccw_device *cdev;/* pointer to the child ccw device */
>  	struct {
>  		unsigned int suspend:1;	/* allow suspend */
>  		unsigned int prefetch:1;/* deny prefetch */
>  		unsigned int inter:1;	/* suppress intermediate interrupts */
>  	} __packed options;
> +	struct io_subchannel_dma_area *dma_area;
>  } __aligned(8);
>  
>  #define to_io_private(n) ((struct io_subchannel_private *) \
> @@ -115,6 +120,13 @@ enum cdev_todo {
>  #define FAKE_CMD_IRB	1
>  #define FAKE_TM_IRB	2
>  
> +struct ccw_device_dma_area {
> +	struct senseid senseid;	/* SenseID info */
> +	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
> +	struct irb irb;		/* device status */
> +	struct pgid pgid[8];	/* path group IDs per chpid*/

Again, ccws, and blocks that will be written by the hypervisor, which
makes sense as well.

> +};
> +
>  struct ccw_device_private {
>  	struct ccw_device *cdev;
>  	struct subchannel *sch;
> @@ -156,11 +168,7 @@ struct ccw_device_private {
>  	} __attribute__((packed)) flags;
>  	unsigned long intparm;	/* user interruption parameter */
>  	struct qdio_irq *qdio_data;
> -	struct irb irb;		/* device status */
>  	int async_kill_io_rc;
> -	struct senseid senseid;	/* SenseID info */
> -	struct pgid pgid[8];	/* path group IDs per chpid*/
> -	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
>  	struct work_struct todo_work;
>  	enum cdev_todo todo;
>  	wait_queue_head_t wait_q;
> @@ -169,6 +177,7 @@ struct ccw_device_private {
>  	struct list_head cmb_list;	/* list of measured devices */
>  	u64 cmb_start_time;		/* clock value of cmb reset */
>  	void *cmb_wait;			/* deferred cmb enable/disable */
> +	struct ccw_device_dma_area *dma_area;
>  	enum interruption_class int_class;
>  };
>  

So, this leaves some things I'm not sure about, especially as I do not
know the architecture of this new feature.

- This applies only to asynchronously handled things, it seems? So
  things like control blocks modified by stsch/msch/etc does not need
  special treatment?
- What about channel measurements? Or are they not supported?
- What about CHSCs? Or would only asynchronous commands (which we
  currently don't implement in QEMU) need special treatment?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
@ 2019-04-09 17:55     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-09 17:55 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:15 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Virtio-ccw relies on cio mechanisms for bootstrapping the ccw device.

Well, a ccw device is, by definition, using cio mechanisms ;)

Better say: "As virtio-ccw devices are channel devices, we need to use
the dma area for any communication with the hypervisor."
Or something like that.

> Thus we need to make sure any memory that is used for communication with
> the hypervisor is shared.

In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
Hypervisors implement protected virtualization, we probably need to
make sure that all common I/O layer control blocks are in the dma area
(including e.g. QDIO), not just what virtio-ccw devices use.

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/cio/ccwreq.c        |  8 +++----
>  drivers/s390/cio/device.c        | 46 +++++++++++++++++++++++++++++++---------
>  drivers/s390/cio/device_fsm.c    | 40 +++++++++++++++++-----------------
>  drivers/s390/cio/device_id.c     | 18 ++++++++--------
>  drivers/s390/cio/device_ops.c    |  4 ++--
>  drivers/s390/cio/device_pgid.c   | 20 ++++++++---------
>  drivers/s390/cio/device_status.c | 24 ++++++++++-----------
>  drivers/s390/cio/io_sch.h        | 19 ++++++++++++-----
>  8 files changed, 107 insertions(+), 72 deletions(-)
> 

(...)

(just looking at which fields are moved for now)

> diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
> index 90e4e3a7841b..fc3220fede0f 100644
> --- a/drivers/s390/cio/io_sch.h
> +++ b/drivers/s390/cio/io_sch.h
> @@ -9,15 +9,20 @@
>  #include "css.h"
>  #include "orb.h"
>  
> +
> +struct io_subchannel_dma_area {
> +	struct ccw1 sense_ccw;	/* static ccw for sense command */

The ccw makes sense.

> +};
> +
>  struct io_subchannel_private {
>  	union orb orb;		/* operation request block */
> -	struct ccw1 sense_ccw;	/* static ccw for sense command */
>  	struct ccw_device *cdev;/* pointer to the child ccw device */
>  	struct {
>  		unsigned int suspend:1;	/* allow suspend */
>  		unsigned int prefetch:1;/* deny prefetch */
>  		unsigned int inter:1;	/* suppress intermediate interrupts */
>  	} __packed options;
> +	struct io_subchannel_dma_area *dma_area;
>  } __aligned(8);
>  
>  #define to_io_private(n) ((struct io_subchannel_private *) \
> @@ -115,6 +120,13 @@ enum cdev_todo {
>  #define FAKE_CMD_IRB	1
>  #define FAKE_TM_IRB	2
>  
> +struct ccw_device_dma_area {
> +	struct senseid senseid;	/* SenseID info */
> +	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
> +	struct irb irb;		/* device status */
> +	struct pgid pgid[8];	/* path group IDs per chpid*/

Again, ccws, and blocks that will be written by the hypervisor, which
makes sense as well.

> +};
> +
>  struct ccw_device_private {
>  	struct ccw_device *cdev;
>  	struct subchannel *sch;
> @@ -156,11 +168,7 @@ struct ccw_device_private {
>  	} __attribute__((packed)) flags;
>  	unsigned long intparm;	/* user interruption parameter */
>  	struct qdio_irq *qdio_data;
> -	struct irb irb;		/* device status */
>  	int async_kill_io_rc;
> -	struct senseid senseid;	/* SenseID info */
> -	struct pgid pgid[8];	/* path group IDs per chpid*/
> -	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
>  	struct work_struct todo_work;
>  	enum cdev_todo todo;
>  	wait_queue_head_t wait_q;
> @@ -169,6 +177,7 @@ struct ccw_device_private {
>  	struct list_head cmb_list;	/* list of measured devices */
>  	u64 cmb_start_time;		/* clock value of cmb reset */
>  	void *cmb_wait;			/* deferred cmb enable/disable */
> +	struct ccw_device_dma_area *dma_area;
>  	enum interruption_class int_class;
>  };
>  

So, this leaves some things I'm not sure about, especially as I do not
know the architecture of this new feature.

- This applies only to asynchronously handled things, it seems? So
  things like control blocks modified by stsch/msch/etc does not need
  special treatment?
- What about channel measurements? Or are they not supported?
- What about CHSCs? Or would only asynchronous commands (which we
  currently don't implement in QEMU) need special treatment?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-09 17:55     ` Cornelia Huck
  (?)
@ 2019-04-10  0:10     ` Halil Pasic
  2019-04-10  8:25         ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10  0:10 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 19:55:48 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:15 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > Virtio-ccw relies on cio mechanisms for bootstrapping the ccw device.
> 
> Well, a ccw device is, by definition, using cio mechanisms ;)

Right, this was supposed to explain to the non-virtio guys why do I
have to deal with this stuff in cio in a series that is about virtio. I
was chiefly alluding to the SENSE stuff.

> 
> Better say: "As virtio-ccw devices are channel devices, we need to use
> the dma area for any communication with the hypervisor."
> Or something like that.

Sure we can come up with a better commit message. Your sentence is
already better than mine ;).

> 
> > Thus we need to make sure any memory that is used for communication with
> > the hypervisor is shared.
> 
> In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
> Hypervisors implement protected virtualization, we probably need to
> make sure that all common I/O layer control blocks are in the dma area
> (including e.g. QDIO), not just what virtio-ccw devices use.
> 

Hypervisor could theoretically be something different than QEMU/KVM. Yet,
as stated before, this series is about getting virtio-ccw working
(modulo the TODOs). 

[..]

> >  
> 
> So, this leaves some things I'm not sure about, especially as I do not
> know the architecture of this new feature.
> 
> - This applies only to asynchronously handled things, it seems? So
>   things like control blocks modified by stsch/msch/etc does not need
>   special treatment?

I had a feeble attempt at explaining this in the cover letter:

* make sure that virtio-ccw specific stuff uses shared memory when
  talking to the hypervisor (except communication blocks like ORB, these
  are handled by the hypervisor)

Unfortunately the last 'hypervisor' was supposed to be 'ultravisor'.

I.e. the ultravisor will take care of exposing the control blocks
to the hypervisor (and of changes as well).

> - What about channel measurements? Or are they not supported?

For virtio-ccw channel measurements are not very relevant. I think
it should work as it used to, but I will give it another look later.

> - What about CHSCs? Or would only asynchronous commands (which we
>   currently don't implement in QEMU) need special treatment?
> 

CHSC is special. AFAIR the code of Vasily and Janosch is handling
what needs to be handled. But I will double-check this as well.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-10  0:10     ` Halil Pasic
@ 2019-04-10  8:25         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 02:10:44 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 19:55:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:15 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > Thus we need to make sure any memory that is used for communication with
> > > the hypervisor is shared.  
> > 
> > In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
> > Hypervisors implement protected virtualization, we probably need to
> > make sure that all common I/O layer control blocks are in the dma area
> > (including e.g. QDIO), not just what virtio-ccw devices use.
> >   
> 
> Hypervisor could theoretically be something different than QEMU/KVM. Yet,
> as stated before, this series is about getting virtio-ccw working
> (modulo the TODOs). 

Sure, just wanted to point it out. If this is "enable the common I/O
layer, except for QDIO" or so, that would sound fine to me :)

> 
> [..]
> 
> > >    
> > 
> > So, this leaves some things I'm not sure about, especially as I do not
> > know the architecture of this new feature.
> > 
> > - This applies only to asynchronously handled things, it seems? So
> >   things like control blocks modified by stsch/msch/etc does not need
> >   special treatment?  
> 
> I had a feeble attempt at explaining this in the cover letter:
> 
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> 
> Unfortunately the last 'hypervisor' was supposed to be 'ultravisor'.
> 
> I.e. the ultravisor will take care of exposing the control blocks
> to the hypervisor (and of changes as well).

Yeah, that "control blocks" or "communication blocks" leaves me a bit
fuzzy :)

So, what is a high-level summary of areas that need the treatment?
What I get from looking at the patches so far, it's:
- stuff that is written by the hypervisor's interrupt injection code:
  IRB, indicators, etc.
- buffers that are filled by a channel program: sense, sense id, etc.
- ccws themselves (because of translation?)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
@ 2019-04-10  8:25         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 02:10:44 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 19:55:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:15 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > Thus we need to make sure any memory that is used for communication with
> > > the hypervisor is shared.  
> > 
> > In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
> > Hypervisors implement protected virtualization, we probably need to
> > make sure that all common I/O layer control blocks are in the dma area
> > (including e.g. QDIO), not just what virtio-ccw devices use.
> >   
> 
> Hypervisor could theoretically be something different than QEMU/KVM. Yet,
> as stated before, this series is about getting virtio-ccw working
> (modulo the TODOs). 

Sure, just wanted to point it out. If this is "enable the common I/O
layer, except for QDIO" or so, that would sound fine to me :)

> 
> [..]
> 
> > >    
> > 
> > So, this leaves some things I'm not sure about, especially as I do not
> > know the architecture of this new feature.
> > 
> > - This applies only to asynchronously handled things, it seems? So
> >   things like control blocks modified by stsch/msch/etc does not need
> >   special treatment?  
> 
> I had a feeble attempt at explaining this in the cover letter:
> 
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> 
> Unfortunately the last 'hypervisor' was supposed to be 'ultravisor'.
> 
> I.e. the ultravisor will take care of exposing the control blocks
> to the hypervisor (and of changes as well).

Yeah, that "control blocks" or "communication blocks" leaves me a bit
fuzzy :)

So, what is a high-level summary of areas that need the treatment?
What I get from looking at the patches so far, it's:
- stuff that is written by the hypervisor's interrupt injection code:
  IRB, indicators, etc.
- buffers that are filled by a channel program: sense, sense id, etc.
- ccws themselves (because of translation?)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
  2019-04-04 23:16 ` [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O Halil Pasic
@ 2019-04-10  8:42     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:42 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:17 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Before virtio-ccw could get away with not using DMA API for the pieces of
> memory it does ccw I/O with. With protected virtualization this has to
> change, since the hypervisor needs to read and sometimes also write these
> pieces of memory.
> 
> Let us make sure all ccw I/O is done through shared memory.
> 
> Note: The control blocks of I/O instructions do not need to be shared.
> These are marshalled by the ultravisor.

Ok, so direct parameters of I/O instructions are handled by the
ultravisor?

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++++----------------
>  1 file changed, 107 insertions(+), 70 deletions(-)
> 
(...)
> @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
>  	return container_of(vdev, struct virtio_ccw_device, vdev);
>  }
>  
> +#define vc_dma_decl_struct(type, field) \
> +	dma_addr_t field ## _dma_addr;  \
> +	struct type *field
> +
> +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> +				   dma_addr_t *dma_handle)
> +{
> +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> +}
> +
> +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> +				 void *cpu_addr, dma_addr_t dma_handle)
> +{
> +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> +}
> +
> +#define vc_dma_alloc_struct(vdev, ptr) \
> +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> +#define vc_dma_free_struct(vdev, ptr) \
> +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))

Not sure I'm a fan of those wrappers... I think they actually hurt
readability of the code.

> +
>  static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
>  {
>  	unsigned long i, flags;
> @@ -322,12 +347,12 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  {
>  	int ret;
>  	unsigned long *indicatorp = NULL;
> -	struct virtio_thinint_area *thinint_area = NULL;
> +	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
> +	dma_addr_t indicatorp_dma_addr;
>  	struct airq_info *airq_info = vcdev->airq_info;
>  
>  	if (vcdev->is_thinint) {
> -		thinint_area = kzalloc(sizeof(*thinint_area),
> -				       GFP_DMA | GFP_KERNEL);
> +		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
>  		if (!thinint_area)
>  			return;
>  		thinint_area->summary_indicator =
> @@ -338,8 +363,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  		ccw->cda = (__u32)(unsigned long) thinint_area;
>  	} else {
>  		/* payload is the address of the indicators */
> -		indicatorp = kmalloc(sizeof(&vcdev->indicators),
> -				     GFP_DMA | GFP_KERNEL);
> +		indicatorp = __vc_dma_alloc(&vcdev->vdev,
> +					    sizeof(&vcdev->indicators),
> +					    &indicatorp_dma_addr);
>  		if (!indicatorp)
>  			return;
>  		*indicatorp = 0;
> @@ -359,8 +385,10 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  			 "Failed to deregister indicators (%d)\n", ret);
>  	else if (vcdev->is_thinint)
>  		virtio_ccw_drop_indicators(vcdev);
> -	kfree(indicatorp);
> -	kfree(thinint_area);
> +	if (indicatorp)
> +		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
> +			       indicatorp, indicatorp_dma_addr);
> +	vc_dma_free_struct(&vcdev->vdev, thinint_area);

Don't you need to check for !NULL here as well?

>  }
>  
>  static inline long __do_kvm_notify(struct subchannel_id schid,
(...)
> @@ -1280,7 +1318,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  
>  	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
>  
> -	vcdev->vdev.dev.parent = &cdev->dev;

Hm?

(You added a line like that in a previous patch; should it simply have
been a movement instead? Or am I misremembering?)

>  	vcdev->vdev.dev.release = virtio_ccw_release_dev;
>  	vcdev->vdev.config = &virtio_ccw_config_ops;
>  	vcdev->cdev = cdev;

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
@ 2019-04-10  8:42     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:42 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:17 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Before virtio-ccw could get away with not using DMA API for the pieces of
> memory it does ccw I/O with. With protected virtualization this has to
> change, since the hypervisor needs to read and sometimes also write these
> pieces of memory.
> 
> Let us make sure all ccw I/O is done through shared memory.
> 
> Note: The control blocks of I/O instructions do not need to be shared.
> These are marshalled by the ultravisor.

Ok, so direct parameters of I/O instructions are handled by the
ultravisor?

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++++----------------
>  1 file changed, 107 insertions(+), 70 deletions(-)
> 
(...)
> @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
>  	return container_of(vdev, struct virtio_ccw_device, vdev);
>  }
>  
> +#define vc_dma_decl_struct(type, field) \
> +	dma_addr_t field ## _dma_addr;  \
> +	struct type *field
> +
> +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> +				   dma_addr_t *dma_handle)
> +{
> +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> +}
> +
> +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> +				 void *cpu_addr, dma_addr_t dma_handle)
> +{
> +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> +}
> +
> +#define vc_dma_alloc_struct(vdev, ptr) \
> +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> +#define vc_dma_free_struct(vdev, ptr) \
> +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))

Not sure I'm a fan of those wrappers... I think they actually hurt
readability of the code.

> +
>  static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
>  {
>  	unsigned long i, flags;
> @@ -322,12 +347,12 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  {
>  	int ret;
>  	unsigned long *indicatorp = NULL;
> -	struct virtio_thinint_area *thinint_area = NULL;
> +	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
> +	dma_addr_t indicatorp_dma_addr;
>  	struct airq_info *airq_info = vcdev->airq_info;
>  
>  	if (vcdev->is_thinint) {
> -		thinint_area = kzalloc(sizeof(*thinint_area),
> -				       GFP_DMA | GFP_KERNEL);
> +		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
>  		if (!thinint_area)
>  			return;
>  		thinint_area->summary_indicator =
> @@ -338,8 +363,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  		ccw->cda = (__u32)(unsigned long) thinint_area;
>  	} else {
>  		/* payload is the address of the indicators */
> -		indicatorp = kmalloc(sizeof(&vcdev->indicators),
> -				     GFP_DMA | GFP_KERNEL);
> +		indicatorp = __vc_dma_alloc(&vcdev->vdev,
> +					    sizeof(&vcdev->indicators),
> +					    &indicatorp_dma_addr);
>  		if (!indicatorp)
>  			return;
>  		*indicatorp = 0;
> @@ -359,8 +385,10 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
>  			 "Failed to deregister indicators (%d)\n", ret);
>  	else if (vcdev->is_thinint)
>  		virtio_ccw_drop_indicators(vcdev);
> -	kfree(indicatorp);
> -	kfree(thinint_area);
> +	if (indicatorp)
> +		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
> +			       indicatorp, indicatorp_dma_addr);
> +	vc_dma_free_struct(&vcdev->vdev, thinint_area);

Don't you need to check for !NULL here as well?

>  }
>  
>  static inline long __do_kvm_notify(struct subchannel_id schid,
(...)
> @@ -1280,7 +1318,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  
>  	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
>  
> -	vcdev->vdev.dev.parent = &cdev->dev;

Hm?

(You added a line like that in a previous patch; should it simply have
been a movement instead? Or am I misremembering?)

>  	vcdev->vdev.dev.release = virtio_ccw_release_dev;
>  	vcdev->vdev.config = &virtio_ccw_config_ops;
>  	vcdev->cdev = cdev;

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-04 23:16 ` [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations Halil Pasic
@ 2019-04-10  8:46     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:46 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:20 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> We can reduce the number of DMA allocations by pulling the bits
> of memory that belongs to virtio_ccw_device and needs to be DMA
> memory into a single area.

That makes a lot of sense (maybe start with introducing the dma area
from the beginning?), but...

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 48 ++++++++++++++--------------------------
>  1 file changed, 16 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index aa45a6a027ae..7268149f2ee8 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -49,12 +49,12 @@ struct vq_config_block {
>  struct vcdev_dma_area {
>  	unsigned long indicators;
>  	unsigned long indicators2;
> +	struct vq_config_block config_block;
> +	__u8 status; /* TODO check __aligned(8); */

...I think that needs attention.

>  };
>  
>  struct virtio_ccw_device {
>  	struct virtio_device vdev;
> -	__u8 *status;
> -	dma_addr_t status_dma_addr;
>  	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
>  	struct ccw_device *cdev;
>  	__u32 curr_io;

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
@ 2019-04-10  8:46     ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  8:46 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:20 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> We can reduce the number of DMA allocations by pulling the bits
> of memory that belongs to virtio_ccw_device and needs to be DMA
> memory into a single area.

That makes a lot of sense (maybe start with introducing the dma area
from the beginning?), but...

> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 48 ++++++++++++++--------------------------
>  1 file changed, 16 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index aa45a6a027ae..7268149f2ee8 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -49,12 +49,12 @@ struct vq_config_block {
>  struct vcdev_dma_area {
>  	unsigned long indicators;
>  	unsigned long indicators2;
> +	struct vq_config_block config_block;
> +	__u8 status; /* TODO check __aligned(8); */

...I think that needs attention.

>  };
>  
>  struct virtio_ccw_device {
>  	struct virtio_device vdev;
> -	__u8 *status;
> -	dma_addr_t status_dma_addr;
>  	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
>  	struct ccw_device *cdev;
>  	__u32 curr_io;

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
@ 2019-04-10  9:20   ` Cornelia Huck
  2019-04-04 23:16 ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Halil Pasic
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  9:20 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri,  5 Apr 2019 01:16:10 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core,  virtio-ccw wasn't changed accordingly.
> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to this is:
> * use some 'new' common virtio stuff
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).

It would be good to have a summary somewhere in the code (or
Documentation/) as to what needs the dma treatment and what doesn't,
for later reference. We don't want people to accidentally break things
(especially if they cannot refer to architecture documentation - or
will at least some of that be published?)

> 
> The series is structured in incremental fashion: some of the changes are
> overridden by following patches. The main reason why is that this is how I
> developed. But I think it ain't bad for the didactic and we are a bit more
> flexible with regards to throwing out some of the stuff in the end.

FWIW, I think reshuffling the patches in the next iteration would ease
review.

> 
> Important notes:
> 
> * This is an early (WIP) RFC that does not add any function to the
>   kernel at his stage, as the ultravisor interactions are left out.
>   The purpose is getting some early feedback ASAP.

I would like some comments from people who have experience with the dma
api.

> 
> * In future these patches will depend on some code interacting with the
>   ultravisor (WIP by Vasily and Janosch).
> 
> * The s390 names are by no means final, and are not properly explained. Should
>   not hamper understanding too much. If it does please ask.
> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.

If we can find some generic names that work well for everyone,
converting seems like a good idea. But following SEV is not that bad,
either (you'll probably find more people who have heard about SEV than
folks familiar with s390 ;)

> 
> 
> Testing:
> 
> Please use iommu_platform=on for any virtio devices you are going
> to test this code with (so virtio actually uses the DMA API).
> 
> Looking forward to your review or any other type of input.

I have now read through the whole series and commented in some places.
But I'd really like to see comments from others as well.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
@ 2019-04-10  9:20   ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10  9:20 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Fri,  5 Apr 2019 01:16:10 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core,  virtio-ccw wasn't changed accordingly.
> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to this is:
> * use some 'new' common virtio stuff
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).

It would be good to have a summary somewhere in the code (or
Documentation/) as to what needs the dma treatment and what doesn't,
for later reference. We don't want people to accidentally break things
(especially if they cannot refer to architecture documentation - or
will at least some of that be published?)

> 
> The series is structured in incremental fashion: some of the changes are
> overridden by following patches. The main reason why is that this is how I
> developed. But I think it ain't bad for the didactic and we are a bit more
> flexible with regards to throwing out some of the stuff in the end.

FWIW, I think reshuffling the patches in the next iteration would ease
review.

> 
> Important notes:
> 
> * This is an early (WIP) RFC that does not add any function to the
>   kernel at his stage, as the ultravisor interactions are left out.
>   The purpose is getting some early feedback ASAP.

I would like some comments from people who have experience with the dma
api.

> 
> * In future these patches will depend on some code interacting with the
>   ultravisor (WIP by Vasily and Janosch).
> 
> * The s390 names are by no means final, and are not properly explained. Should
>   not hamper understanding too much. If it does please ask.
> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.

If we can find some generic names that work well for everyone,
converting seems like a good idea. But following SEV is not that bad,
either (you'll probably find more people who have heard about SEV than
folks familiar with s390 ;)

> 
> 
> Testing:
> 
> Please use iommu_platform=on for any virtio devices you are going
> to test this code with (so virtio actually uses the DMA API).
> 
> Looking forward to your review or any other type of input.

I have now read through the whole series and commented in some places.
But I'd really like to see comments from others as well.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-10  8:25         ` Cornelia Huck
  (?)
@ 2019-04-10 13:02         ` Halil Pasic
  2019-04-10 16:16             ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 13:02 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 10:25:57 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 10 Apr 2019 02:10:44 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Tue, 9 Apr 2019 19:55:48 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Fri,  5 Apr 2019 01:16:15 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > > > Thus we need to make sure any memory that is used for communication with
> > > > the hypervisor is shared.  
> > > 
> > > In this context, does 'hypervisor' always mean 'QEMU/KVM'? If Other
> > > Hypervisors implement protected virtualization, we probably need to
> > > make sure that all common I/O layer control blocks are in the dma area
> > > (including e.g. QDIO), not just what virtio-ccw devices use.
> > >   
> > 
> > Hypervisor could theoretically be something different than QEMU/KVM. Yet,
> > as stated before, this series is about getting virtio-ccw working
> > (modulo the TODOs). 
> 
> Sure, just wanted to point it out. If this is "enable the common I/O
> layer, except for QDIO" or so, that would sound fine to me :)
> 

Right except QDIO. Series cares only about the bits relevant for
virtio-ccw. I guess I should make that clearer in the commit message.


> > 
> > [..]
> > 
> > > >    
> > > 
> > > So, this leaves some things I'm not sure about, especially as I do not
> > > know the architecture of this new feature.
> > > 
> > > - This applies only to asynchronously handled things, it seems? So
> > >   things like control blocks modified by stsch/msch/etc does not need
> > >   special treatment?  
> > 
> > I had a feeble attempt at explaining this in the cover letter:
> > 
> > * make sure that virtio-ccw specific stuff uses shared memory when
> >   talking to the hypervisor (except communication blocks like ORB, these
> >   are handled by the hypervisor)
> > 
> > Unfortunately the last 'hypervisor' was supposed to be 'ultravisor'.
> > 
> > I.e. the ultravisor will take care of exposing the control blocks
> > to the hypervisor (and of changes as well).
> 
> Yeah, that "control blocks" or "communication blocks" leaves me a bit
> fuzzy :)

Yeah. I don't know if there is an official name for the stuff. For
the ORB of SSCH control block seems quite fitting. For the SCHIB of
STSCH communication seems a better fit.

> 
> So, what is a high-level summary of areas that need the treatment?
> What I get from looking at the patches so far, it's:
> - stuff that is written by the hypervisor's interrupt injection code:
>   IRB, indicators, etc.

In interrupt context IRB behaves like a control block. We don't have
to make it shared.

I moved IRB because of snse:


@@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct
irb *irb) /*
         * We have ending status but no sense information. Do a basic
sense. */
-       sense_ccw = &to_io_private(sch)->sense_ccw;
+       sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
        sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
-       sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
+       sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);

as the irb.ecw is used as channel program data. And that needs to be
shared.

> - buffers that are filled by a channel program: sense, sense id, etc.
> - ccws themselves (because of translation?)
> 

Right. The idea is: smallish, basically fixed size a readily available
control blocks (specified as a ) are copied back and forth by the
ultravisor (via SIE SD).

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
  2019-04-10  8:42     ` Cornelia Huck
  (?)
@ 2019-04-10 14:42     ` Halil Pasic
  2019-04-10 16:21         ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 14:42 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 10:42:51 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:17 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > Before virtio-ccw could get away with not using DMA API for the pieces of
> > memory it does ccw I/O with. With protected virtualization this has to
> > change, since the hypervisor needs to read and sometimes also write these
> > pieces of memory.
> > 
> > Let us make sure all ccw I/O is done through shared memory.
> > 
> > Note: The control blocks of I/O instructions do not need to be shared.
> > These are marshalled by the ultravisor.
> 
> Ok, so direct parameters of I/O instructions are handled by the
> ultravisor?
> 

Yes.

> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++++----------------
> >  1 file changed, 107 insertions(+), 70 deletions(-)
> > 
> (...)
> > @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
> >  	return container_of(vdev, struct virtio_ccw_device, vdev);
> >  }
> >  
> > +#define vc_dma_decl_struct(type, field) \
> > +	dma_addr_t field ## _dma_addr;  \
> > +	struct type *field
> > +
> > +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> > +				   dma_addr_t *dma_handle)
> > +{
> > +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> > +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> > +}
> > +
> > +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> > +				 void *cpu_addr, dma_addr_t dma_handle)
> > +{
> > +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> > +}
> > +
> > +#define vc_dma_alloc_struct(vdev, ptr) \
> > +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> > +#define vc_dma_free_struct(vdev, ptr) \
> > +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))
> 
> Not sure I'm a fan of those wrappers... I think they actually hurt
> readability of the code.
> 

By wrappers you mean just the macros or also the inline functions?

If we agree to go with the cio DMA pool instead of using DMA API
facilities for allocation (dma_alloc_coherent or maybe a per ccw-device
dma_pool) I think I could just use cio_dma_zalloc() directly if you like.

I was quite insecure about how this gen_pool idea is going to be received
here. That's why I decided to keep the dma_alloc_coherent() version in
for the RFC.

If you prefer I can squash patches #7 #9 #10 and #11 together and
pull #8 forward. Would you prefer that?


> > +
> >  static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
> >  {
> >  	unsigned long i, flags;
> > @@ -322,12 +347,12 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
> >  {
> >  	int ret;
> >  	unsigned long *indicatorp = NULL;
> > -	struct virtio_thinint_area *thinint_area = NULL;
> > +	vc_dma_decl_struct(virtio_thinint_area, thinint_area) = NULL;
> > +	dma_addr_t indicatorp_dma_addr;
> >  	struct airq_info *airq_info = vcdev->airq_info;
> >  
> >  	if (vcdev->is_thinint) {
> > -		thinint_area = kzalloc(sizeof(*thinint_area),
> > -				       GFP_DMA | GFP_KERNEL);
> > +		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
> >  		if (!thinint_area)
> >  			return;
> >  		thinint_area->summary_indicator =
> > @@ -338,8 +363,9 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
> >  		ccw->cda = (__u32)(unsigned long) thinint_area;
> >  	} else {
> >  		/* payload is the address of the indicators */
> > -		indicatorp = kmalloc(sizeof(&vcdev->indicators),
> > -				     GFP_DMA | GFP_KERNEL);
> > +		indicatorp = __vc_dma_alloc(&vcdev->vdev,
> > +					    sizeof(&vcdev->indicators),
> > +					    &indicatorp_dma_addr);
> >  		if (!indicatorp)
> >  			return;
> >  		*indicatorp = 0;
> > @@ -359,8 +385,10 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
> >  			 "Failed to deregister indicators (%d)\n", ret);
> >  	else if (vcdev->is_thinint)
> >  		virtio_ccw_drop_indicators(vcdev);
> > -	kfree(indicatorp);
> > -	kfree(thinint_area);
> > +	if (indicatorp)
> > +		__vc_dma_free(&vcdev->vdev, sizeof(&vcdev->indicators),
> > +			       indicatorp, indicatorp_dma_addr);
> > +	vc_dma_free_struct(&vcdev->vdev, thinint_area);
> 
> Don't you need to check for !NULL here as well?

Good catch! 

I could take care of it in __vc_dma_free().

void cio_dma_free(void *cpu_addr, size_t size) {
 +	if (!cpu_addr)
 +		return;

also seems to me like a good idea right now.

> 
> >  }
> >  
> >  static inline long __do_kvm_notify(struct subchannel_id schid,
> (...)
> > @@ -1280,7 +1318,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> >  
> >  	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
> >  
> > -	vcdev->vdev.dev.parent = &cdev->dev;
> 
> Hm?
> 
> (You added a line like that in a previous patch; should it simply have
> been a movement instead? Or am I misremembering?)

Right the move was supposed to take place in patch #2. Not sure how
I ended up with this. Maybe a messed up rebase.

> 
> >  	vcdev->vdev.dev.release = virtio_ccw_release_dev;
> >  	vcdev->vdev.config = &virtio_ccw_config_ops;
> >  	vcdev->cdev = cdev;
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-10  8:46     ` Cornelia Huck
  (?)
@ 2019-04-10 15:12     ` Halil Pasic
  2019-04-10 16:36         ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 15:12 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 10:46:49 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:20 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > We can reduce the number of DMA allocations by pulling the bits
> > of memory that belongs to virtio_ccw_device and needs to be DMA
> > memory into a single area.
> 
> That makes a lot of sense (maybe start with introducing the dma area
> from the beginning?), but...
> 

Yeah, as I wrote at #7 once we are in clear how the code should look
like in the end, it might make sense to squash a couple of patches
together.

> > 
> > Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 48 ++++++++++++++--------------------------
> >  1 file changed, 16 insertions(+), 32 deletions(-)
> > 
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index aa45a6a027ae..7268149f2ee8 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -49,12 +49,12 @@ struct vq_config_block {
> >  struct vcdev_dma_area {
> >  	unsigned long indicators;
> >  	unsigned long indicators2;
> > +	struct vq_config_block config_block;
> > +	__u8 status; /* TODO check __aligned(8); */
> 
> ...I think that needs attention.

Yes I wanted to discuss this with you. I could not find anything
in the virtio spec that would put requirements on how this
status field needs to be aligned. But I did not look to hard.

The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
of course we do have alignment requirements).

Apparently status used to be a normal field, and became a pointer with
73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.

So maybe dropping the TODO comment will do just fine. What do you think?

Regards,
Halil

> 
> >  };
> >  
> >  struct virtio_ccw_device {
> >  	struct virtio_device vdev;
> > -	__u8 *status;
> > -	dma_addr_t status_dma_addr;
> >  	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
> >  	struct ccw_device *cdev;
> >  	__u32 curr_io;
> 

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-09 17:14         ` Cornelia Huck
  (?)
@ 2019-04-10 15:31         ` Halil Pasic
  2019-04-10 16:07             ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 15:31 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 9 Apr 2019 19:14:53 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

[..]

> > > At this point, you're always going via the css0 device. I'm
> > > wondering whether you should pass in the cssid here and use
> > > css_by_id(cssid) to make this future proof. But then, I'm not
> > > quite clear from which context this will be called.    
> > 
> > As said before I don't see MCSS-E coming. Regarding the client code,
> > please check out the rest of the series. Especially patch 6. 
> > 
> > From my perspective it would be at this stage saner to make it more
> > obvious that only one css is supported (at the moment), than to
> > implement MCSS-E, or to make this (kind of) MCSS-E ready.  
> 
> I disagree. I think there's value in keeping the interfaces clean
> (within reasonable bounds, of course.) Even if there is no
> implementation of MCSS-E other than QEMU... it is probably a good idea
> to spend some brain cycles to make this conceptually clean.

AFAIU Linux currently does not support MCSS-E. I don't have the
bandwidth to implement MCSS-E support in the kernel.

I fully agree for external interfaces should be MCSS-E conform, so
should we ever decide to implement we don't have to overcome self-made
obstacles.

Kernel internal stuff however, IMHO, should avoid carrying a ballast of
an 20%-30% implemented MCSS-E support. I see no benefit.

But I don't insist. If you have a good idea how to make this more MCSS-E
conform, you are welcome. In think something like this is best done on
top of a series that provides a basic working solution. Especially if the
conceptually clean thing is conceptually or code-wise more complex than
the basic solution. If you have something in mind that is simpler and
more lightweight than what I did here, I would be more than happy to make
that happen.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
  2019-04-10  9:20   ` Cornelia Huck
  (?)
@ 2019-04-10 15:57   ` Halil Pasic
  2019-04-10 16:24       ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 15:57 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 11:20:48 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri,  5 Apr 2019 01:16:10 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > Enhanced virtualization protection technology may require the use of
> > bounce buffers for I/O. While support for this was built into the
> > virtio core,  virtio-ccw wasn't changed accordingly.
> > 
> > Thus what needs to be done to bring virtio-ccw up to speed with
> > respect to this is:
> > * use some 'new' common virtio stuff
> > * make sure that virtio-ccw specific stuff uses shared memory when
> >   talking to the hypervisor (except communication blocks like ORB,
> > these are handled by the hypervisor)
> > * make sure the DMA API does what is necessary to talk through shared
> >   memory if we are a protected virtualization guest.
> > * make sure the common IO layer plays along as well (airqs, sense).
> 
> It would be good to have a summary somewhere in the code (or
> Documentation/) as to what needs the dma treatment and what doesn't,
> for later reference. We don't want people to accidentally break things
> (especially if they cannot refer to architecture documentation - or
> will at least some of that be published?)
> 

I can put documentation on my TODO list. This cover letter was also
supposed to provide a bird's-eye view on what needs to be done.

> > 
> > The series is structured in incremental fashion: some of the changes
> > are overridden by following patches. The main reason why is that
> > this is how I developed. But I think it ain't bad for the didactic
> > and we are a bit more flexible with regards to throwing out some of
> > the stuff in the end.
> 
> FWIW, I think reshuffling the patches in the next iteration would ease
> review.
> 

Can you please tell me more about what is desired here? I mean, I made
some tentative proposals on squashing some patches together. I don't
remember any requests to reorder patches or split.

> > 
> > Important notes:
> > 
> > * This is an early (WIP) RFC that does not add any function to the
> >   kernel at his stage, as the ultravisor interactions are left out.
> >   The purpose is getting some early feedback ASAP.
> 
> I would like some comments from people who have experience with the dma
> api.
> 

I can only second that! 

From the Linux on Z folks Sebastian seems to be the most likely suspect.
I've briefly shown this series to him before posting here, and I asked
him to have another look today. Unfortunately he seems quite busy at the
moment, but he promised me that he will make some time for this
(soonish).

Getting some more feedback from Christoph Hellwig would be nice as well.

> > 
> > * In future these patches will depend on some code interacting with
> > the ultravisor (WIP by Vasily and Janosch).
> > 
> > * The s390 names are by no means final, and are not properly
> > explained. Should not hamper understanding too much. If it does
> > please ask.
> > 
> > * The existing naming in the common infrastructure (kernel internal
> >   interfaces) is pretty much based on the AMD SEV terminology. Thus
> > the names aren't always perfect. There might be merit to changing
> > these names to more abstract ones. I did not put much thought into
> > that at the current stage.
> 
> If we can find some generic names that work well for everyone,
> converting seems like a good idea. But following SEV is not that bad,
> either (you'll probably find more people who have heard about SEV than
> folks familiar with s390 ;)
> 

I agree. We should not swap out the names for worse ones for sure. :)

I think the names discussion can be postponed. I would like to get the
code right first.

> > 
> > 
> > Testing:
> > 
> > Please use iommu_platform=on for any virtio devices you are going
> > to test this code with (so virtio actually uses the DMA API).
> > 
> > Looking forward to your review or any other type of input.
> 
> I have now read through the whole series and commented in some places.
> But I'd really like to see comments from others as well.
>

Thank you so much for the quick review. Thanks to your quality review I will
have a bunch of things to change (for the better) for v1.

I intend to wait a bit for more feedback and also for the my dependencies
to get published. I hope to send out a functional v1 mid next week.

Thank you again!

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-10 15:31         ` Halil Pasic
@ 2019-04-10 16:07             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 17:31:48 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 19:14:53 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> [..]
> 
> > > > At this point, you're always going via the css0 device. I'm
> > > > wondering whether you should pass in the cssid here and use
> > > > css_by_id(cssid) to make this future proof. But then, I'm not
> > > > quite clear from which context this will be called.      
> > > 
> > > As said before I don't see MCSS-E coming. Regarding the client code,
> > > please check out the rest of the series. Especially patch 6. 
> > > 
> > > From my perspective it would be at this stage saner to make it more
> > > obvious that only one css is supported (at the moment), than to
> > > implement MCSS-E, or to make this (kind of) MCSS-E ready.    
> > 
> > I disagree. I think there's value in keeping the interfaces clean
> > (within reasonable bounds, of course.) Even if there is no
> > implementation of MCSS-E other than QEMU... it is probably a good idea
> > to spend some brain cycles to make this conceptually clean.  
> 
> AFAIU Linux currently does not support MCSS-E. I don't have the
> bandwidth to implement MCSS-E support in the kernel.
> 
> I fully agree for external interfaces should be MCSS-E conform, so
> should we ever decide to implement we don't have to overcome self-made
> obstacles.
> 
> Kernel internal stuff however, IMHO, should avoid carrying a ballast of
> an 20%-30% implemented MCSS-E support. I see no benefit.
> 
> But I don't insist. If you have a good idea how to make this more MCSS-E
> conform, you are welcome. In think something like this is best done on
> top of a series that provides a basic working solution. Especially if the
> conceptually clean thing is conceptually or code-wise more complex than
> the basic solution. If you have something in mind that is simpler and
> more lightweight than what I did here, I would be more than happy to make
> that happen.

I haven't asked for adding complete support for MCSS-E, just to keep it
in the back of our minds...

Back to the issue at hand. You're currently hard-wiring this to the
css0 device. My question is: Does it make sense to have a per-css area?
From my limited perspective, I think it does (more css images would
mean more devices, and just adding smaller areas per-css is probably
easier than adding one big area scaling with the number of css images).
In that case, what about tacking the area to the actual css object,
instead of having the global variable? Should be an easy change, and
feels cleaner.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
@ 2019-04-10 16:07             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:07 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 17:31:48 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 9 Apr 2019 19:14:53 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> [..]
> 
> > > > At this point, you're always going via the css0 device. I'm
> > > > wondering whether you should pass in the cssid here and use
> > > > css_by_id(cssid) to make this future proof. But then, I'm not
> > > > quite clear from which context this will be called.      
> > > 
> > > As said before I don't see MCSS-E coming. Regarding the client code,
> > > please check out the rest of the series. Especially patch 6. 
> > > 
> > > From my perspective it would be at this stage saner to make it more
> > > obvious that only one css is supported (at the moment), than to
> > > implement MCSS-E, or to make this (kind of) MCSS-E ready.    
> > 
> > I disagree. I think there's value in keeping the interfaces clean
> > (within reasonable bounds, of course.) Even if there is no
> > implementation of MCSS-E other than QEMU... it is probably a good idea
> > to spend some brain cycles to make this conceptually clean.  
> 
> AFAIU Linux currently does not support MCSS-E. I don't have the
> bandwidth to implement MCSS-E support in the kernel.
> 
> I fully agree for external interfaces should be MCSS-E conform, so
> should we ever decide to implement we don't have to overcome self-made
> obstacles.
> 
> Kernel internal stuff however, IMHO, should avoid carrying a ballast of
> an 20%-30% implemented MCSS-E support. I see no benefit.
> 
> But I don't insist. If you have a good idea how to make this more MCSS-E
> conform, you are welcome. In think something like this is best done on
> top of a series that provides a basic working solution. Especially if the
> conceptually clean thing is conceptually or code-wise more complex than
> the basic solution. If you have something in mind that is simpler and
> more lightweight than what I did here, I would be more than happy to make
> that happen.

I haven't asked for adding complete support for MCSS-E, just to keep it
in the back of our minds...

Back to the issue at hand. You're currently hard-wiring this to the
css0 device. My question is: Does it make sense to have a per-css area?
From my limited perspective, I think it does (more css images would
mean more devices, and just adding smaller areas per-css is probably
easier than adding one big area scaling with the number of css images).
In that case, what about tacking the area to the actual css object,
instead of having the global variable? Should be an easy change, and
feels cleaner.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-10 13:02         ` Halil Pasic
@ 2019-04-10 16:16             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:16 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 15:02:25 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:25:57 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:

> > So, what is a high-level summary of areas that need the treatment?
> > What I get from looking at the patches so far, it's:
> > - stuff that is written by the hypervisor's interrupt injection code:
> >   IRB, indicators, etc.  
> 
> In interrupt context IRB behaves like a control block. We don't have
> to make it shared.
> 
> I moved IRB because of snse:
> 
> 
> @@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct
> irb *irb) /*
>          * We have ending status but no sense information. Do a basic
> sense. */
> -       sense_ccw = &to_io_private(sch)->sense_ccw;
> +       sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
>         sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
> -       sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
> +       sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
> 
> as the irb.ecw is used as channel program data. And that needs to be
> shared.

Ah, I see. So it's more "guest points to some memory where the
hypervisor needs to write data"?

> 
> > - buffers that are filled by a channel program: sense, sense id, etc.
> > - ccws themselves (because of translation?)
> >   
> 
> Right. The idea is: smallish, basically fixed size a readily available
> control blocks (specified as a ) are copied back and forth by the

As a what? :)

> ultravisor (via SIE SD).

Ok. It's really hard to be sure from the outside what falls into which
category :(

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
@ 2019-04-10 16:16             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:16 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 15:02:25 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:25:57 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:

> > So, what is a high-level summary of areas that need the treatment?
> > What I get from looking at the patches so far, it's:
> > - stuff that is written by the hypervisor's interrupt injection code:
> >   IRB, indicators, etc.  
> 
> In interrupt context IRB behaves like a control block. We don't have
> to make it shared.
> 
> I moved IRB because of snse:
> 
> 
> @@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct
> irb *irb) /*
>          * We have ending status but no sense information. Do a basic
> sense. */
> -       sense_ccw = &to_io_private(sch)->sense_ccw;
> +       sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
>         sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
> -       sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
> +       sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
> 
> as the irb.ecw is used as channel program data. And that needs to be
> shared.

Ah, I see. So it's more "guest points to some memory where the
hypervisor needs to write data"?

> 
> > - buffers that are filled by a channel program: sense, sense id, etc.
> > - ccws themselves (because of translation?)
> >   
> 
> Right. The idea is: smallish, basically fixed size a readily available
> control blocks (specified as a ) are copied back and forth by the

As a what? :)

> ultravisor (via SIE SD).

Ok. It's really hard to be sure from the outside what falls into which
category :(

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
  2019-04-10 14:42     ` Halil Pasic
@ 2019-04-10 16:21         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:21 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 16:42:45 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:42:51 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:17 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
> > >  	return container_of(vdev, struct virtio_ccw_device, vdev);
> > >  }
> > >  
> > > +#define vc_dma_decl_struct(type, field) \
> > > +	dma_addr_t field ## _dma_addr;  \
> > > +	struct type *field
> > > +
> > > +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> > > +				   dma_addr_t *dma_handle)
> > > +{
> > > +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> > > +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> > > +}
> > > +
> > > +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> > > +				 void *cpu_addr, dma_addr_t dma_handle)
> > > +{
> > > +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> > > +}
> > > +
> > > +#define vc_dma_alloc_struct(vdev, ptr) \
> > > +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> > > +#define vc_dma_free_struct(vdev, ptr) \
> > > +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))  
> > 
> > Not sure I'm a fan of those wrappers... I think they actually hurt
> > readability of the code.
> >   
> 
> By wrappers you mean just the macros or also the inline functions?

In particular, I dislike the macros.

> 
> If we agree to go with the cio DMA pool instead of using DMA API
> facilities for allocation (dma_alloc_coherent or maybe a per ccw-device
> dma_pool) I think I could just use cio_dma_zalloc() directly if you like.

If we go with the pool (I'm not familiar enough with the dma stuff to
be able to make a good judgment there), nice and obvious calls sound
good to me :)

> 
> I was quite insecure about how this gen_pool idea is going to be received
> here. That's why I decided to keep the dma_alloc_coherent() version in
> for the RFC.
> 
> If you prefer I can squash patches #7 #9 #10 and #11 together and
> pull #8 forward. Would you prefer that?

If that avoids multiple switches of the approach used, that sounds like
a good idea.

(Still would like to see some feedback from others.)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O
@ 2019-04-10 16:21         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:21 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 16:42:45 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:42:51 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:17 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > @@ -167,6 +170,28 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
> > >  	return container_of(vdev, struct virtio_ccw_device, vdev);
> > >  }
> > >  
> > > +#define vc_dma_decl_struct(type, field) \
> > > +	dma_addr_t field ## _dma_addr;  \
> > > +	struct type *field
> > > +
> > > +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size,
> > > +				   dma_addr_t *dma_handle)
> > > +{
> > > +	return dma_alloc_coherent(vdev->dev.parent, size, dma_handle,
> > > +				  GFP_DMA | GFP_KERNEL | __GFP_ZERO);
> > > +}
> > > +
> > > +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
> > > +				 void *cpu_addr, dma_addr_t dma_handle)
> > > +{
> > > +	dma_free_coherent(vdev->dev.parent, size, cpu_addr, dma_handle);
> > > +}
> > > +
> > > +#define vc_dma_alloc_struct(vdev, ptr) \
> > > +	({ ptr = __vc_dma_alloc(vdev, (sizeof(*(ptr))), &(ptr ## _dma_addr)); })
> > > +#define vc_dma_free_struct(vdev, ptr) \
> > > +	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr), (ptr ## _dma_addr))  
> > 
> > Not sure I'm a fan of those wrappers... I think they actually hurt
> > readability of the code.
> >   
> 
> By wrappers you mean just the macros or also the inline functions?

In particular, I dislike the macros.

> 
> If we agree to go with the cio DMA pool instead of using DMA API
> facilities for allocation (dma_alloc_coherent or maybe a per ccw-device
> dma_pool) I think I could just use cio_dma_zalloc() directly if you like.

If we go with the pool (I'm not familiar enough with the dma stuff to
be able to make a good judgment there), nice and obvious calls sound
good to me :)

> 
> I was quite insecure about how this gen_pool idea is going to be received
> here. That's why I decided to keep the dma_alloc_coherent() version in
> for the RFC.
> 
> If you prefer I can squash patches #7 #9 #10 and #11 together and
> pull #8 forward. Would you prefer that?

If that avoids multiple switches of the approach used, that sounds like
a good idea.

(Still would like to see some feedback from others.)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
  2019-04-10 15:57   ` Halil Pasic
@ 2019-04-10 16:24       ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:24 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 17:57:50 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 11:20:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:10 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > Enhanced virtualization protection technology may require the use of
> > > bounce buffers for I/O. While support for this was built into the
> > > virtio core,  virtio-ccw wasn't changed accordingly.
> > > 
> > > Thus what needs to be done to bring virtio-ccw up to speed with
> > > respect to this is:
> > > * use some 'new' common virtio stuff
> > > * make sure that virtio-ccw specific stuff uses shared memory when
> > >   talking to the hypervisor (except communication blocks like ORB,
> > > these are handled by the hypervisor)
> > > * make sure the DMA API does what is necessary to talk through shared
> > >   memory if we are a protected virtualization guest.
> > > * make sure the common IO layer plays along as well (airqs, sense).  
> > 
> > It would be good to have a summary somewhere in the code (or
> > Documentation/) as to what needs the dma treatment and what doesn't,
> > for later reference. We don't want people to accidentally break things
> > (especially if they cannot refer to architecture documentation - or
> > will at least some of that be published?)
> >   
> 
> I can put documentation on my TODO list. This cover letter was also
> supposed to provide a bird's-eye view on what needs to be done.

Some comments in the code to prevent further headscratching are also a
good idea.

> 
> > > 
> > > The series is structured in incremental fashion: some of the changes
> > > are overridden by following patches. The main reason why is that
> > > this is how I developed. But I think it ain't bad for the didactic
> > > and we are a bit more flexible with regards to throwing out some of
> > > the stuff in the end.  
> > 
> > FWIW, I think reshuffling the patches in the next iteration would ease
> > review.
> >   
> 
> Can you please tell me more about what is desired here? I mean, I made
> some tentative proposals on squashing some patches together. I don't
> remember any requests to reorder patches or split.

Just try to avoid to rewrite things multiple times -- if we agree on
the end result, we should be able to go there directly :)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
@ 2019-04-10 16:24       ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:24 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 17:57:50 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 11:20:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:10 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > Enhanced virtualization protection technology may require the use of
> > > bounce buffers for I/O. While support for this was built into the
> > > virtio core,  virtio-ccw wasn't changed accordingly.
> > > 
> > > Thus what needs to be done to bring virtio-ccw up to speed with
> > > respect to this is:
> > > * use some 'new' common virtio stuff
> > > * make sure that virtio-ccw specific stuff uses shared memory when
> > >   talking to the hypervisor (except communication blocks like ORB,
> > > these are handled by the hypervisor)
> > > * make sure the DMA API does what is necessary to talk through shared
> > >   memory if we are a protected virtualization guest.
> > > * make sure the common IO layer plays along as well (airqs, sense).  
> > 
> > It would be good to have a summary somewhere in the code (or
> > Documentation/) as to what needs the dma treatment and what doesn't,
> > for later reference. We don't want people to accidentally break things
> > (especially if they cannot refer to architecture documentation - or
> > will at least some of that be published?)
> >   
> 
> I can put documentation on my TODO list. This cover letter was also
> supposed to provide a bird's-eye view on what needs to be done.

Some comments in the code to prevent further headscratching are also a
good idea.

> 
> > > 
> > > The series is structured in incremental fashion: some of the changes
> > > are overridden by following patches. The main reason why is that
> > > this is how I developed. But I think it ain't bad for the didactic
> > > and we are a bit more flexible with regards to throwing out some of
> > > the stuff in the end.  
> > 
> > FWIW, I think reshuffling the patches in the next iteration would ease
> > review.
> >   
> 
> Can you please tell me more about what is desired here? I mean, I made
> some tentative proposals on squashing some patches together. I don't
> remember any requests to reorder patches or split.

Just try to avoid to rewrite things multiple times -- if we agree on
the end result, we should be able to go there directly :)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-10 15:12     ` Halil Pasic
@ 2019-04-10 16:36         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:36 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 17:12:54 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:46:49 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:20 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > index aa45a6a027ae..7268149f2ee8 100644
> > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > >  struct vcdev_dma_area {
> > >  	unsigned long indicators;
> > >  	unsigned long indicators2;
> > > +	struct vq_config_block config_block;
> > > +	__u8 status; /* TODO check __aligned(8); */  
> > 
> > ...I think that needs attention.  
> 
> Yes I wanted to discuss this with you. I could not find anything
> in the virtio spec that would put requirements on how this
> status field needs to be aligned. But I did not look to hard.
> 
> The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> of course we do have alignment requirements).

I think it needs to be doubleword aligned.

> 
> Apparently status used to be a normal field, and became a pointer with
> 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.

In the beginning, the code used a below-2G-area for all commands.
Rather than adding locking to avoid races there, that commit switches
to allocating the needed structures individually. The status field
needed to be below 2G, so it needed to be allocated separately.

> 
> So maybe dropping the TODO comment will do just fine. What do you think?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
@ 2019-04-10 16:36         ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-10 16:36 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 17:12:54 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 10:46:49 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri,  5 Apr 2019 01:16:20 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > index aa45a6a027ae..7268149f2ee8 100644
> > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > >  struct vcdev_dma_area {
> > >  	unsigned long indicators;
> > >  	unsigned long indicators2;
> > > +	struct vq_config_block config_block;
> > > +	__u8 status; /* TODO check __aligned(8); */  
> > 
> > ...I think that needs attention.  
> 
> Yes I wanted to discuss this with you. I could not find anything
> in the virtio spec that would put requirements on how this
> status field needs to be aligned. But I did not look to hard.
> 
> The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> of course we do have alignment requirements).

I think it needs to be doubleword aligned.

> 
> Apparently status used to be a normal field, and became a pointer with
> 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.

In the beginning, the code used a below-2G-area for all commands.
Rather than adding locking to avoid races there, that commit switches
to allocating the needed structures individually. The status field
needed to be below 2G, so it needed to be allocated separately.

> 
> So maybe dropping the TODO comment will do just fine. What do you think?

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-10 16:07             ` Cornelia Huck
  (?)
@ 2019-04-10 16:52             ` Halil Pasic
  -1 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 16:52 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 18:07:19 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 10 Apr 2019 17:31:48 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Tue, 9 Apr 2019 19:14:53 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > [..]
> > 
> > > > > At this point, you're always going via the css0 device. I'm
> > > > > wondering whether you should pass in the cssid here and use
> > > > > css_by_id(cssid) to make this future proof. But then, I'm not
> > > > > quite clear from which context this will be called.      
> > > > 
> > > > As said before I don't see MCSS-E coming. Regarding the client code,
> > > > please check out the rest of the series. Especially patch 6. 
> > > > 
> > > > From my perspective it would be at this stage saner to make it more
> > > > obvious that only one css is supported (at the moment), than to
> > > > implement MCSS-E, or to make this (kind of) MCSS-E ready.    
> > > 
> > > I disagree. I think there's value in keeping the interfaces clean
> > > (within reasonable bounds, of course.) Even if there is no
> > > implementation of MCSS-E other than QEMU... it is probably a good idea
> > > to spend some brain cycles to make this conceptually clean.  
> > 
> > AFAIU Linux currently does not support MCSS-E. I don't have the
> > bandwidth to implement MCSS-E support in the kernel.
> > 
> > I fully agree for external interfaces should be MCSS-E conform, so
> > should we ever decide to implement we don't have to overcome self-made
> > obstacles.
> > 
> > Kernel internal stuff however, IMHO, should avoid carrying a ballast of
> > an 20%-30% implemented MCSS-E support. I see no benefit.
> > 
> > But I don't insist. If you have a good idea how to make this more MCSS-E
> > conform, you are welcome. In think something like this is best done on
> > top of a series that provides a basic working solution. Especially if the
> > conceptually clean thing is conceptually or code-wise more complex than
> > the basic solution. If you have something in mind that is simpler and
> > more lightweight than what I did here, I would be more than happy to make
> > that happen.
> 
> I haven't asked for adding complete support for MCSS-E, just to keep it
> in the back of our minds...

I will, I promise ;)!

> 
> Back to the issue at hand. You're currently hard-wiring this to the
> css0 device. My question is: Does it make sense to have a per-css area?

What do you mean by area?

> From my limited perspective, I think it does (more css images would
> mean more devices, and just adding smaller areas per-css is probably
> easier than adding one big area scaling with the number of css images).

Currently all devices are hooked up to css0. This is also hard-wired:

int css_register_subchannel(struct subchannel *sch)                             
{                                                                               
        int ret;                                                                
                                                                                
        /* Initialize the subchannel structure */                               
        sch->dev.parent = &channel_subsystems[0]->device;

I think the hard-wiring the DMA properties of the CSS to
channel_subsystems[0] is consistent with prior art.

> In that case, what about tacking the area to the actual css object,
> instead of having the global variable? Should be an easy change, and
> feels cleaner.
> 

I already got rid of the global variable. Instead we have an accessor
now.

My problem with a gen_pool per ccs approach is the airq aiv vectors.
Those vectors are currently shared between devices (at least for
virtio-ccw), and those vectors must be allocated as shared (DMA) memory.
Please take another look at patch #6. Would you like 
airq_iv_create(unsigned long bits, unsigned long flags) to take
a struct channel_subsystem argument?

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-10 16:36         ` Cornelia Huck
  (?)
@ 2019-04-10 17:48         ` Halil Pasic
  2019-04-11  9:24             ` Cornelia Huck
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-10 17:48 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 18:36:43 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 10 Apr 2019 17:12:54 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Wed, 10 Apr 2019 10:46:49 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Fri,  5 Apr 2019 01:16:20 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > > index aa45a6a027ae..7268149f2ee8 100644
> > > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > > >  struct vcdev_dma_area {
> > > >  	unsigned long indicators;
> > > >  	unsigned long indicators2;
> > > > +	struct vq_config_block config_block;
> > > > +	__u8 status; /* TODO check __aligned(8); */  
> > > 
> > > ...I think that needs attention.  
> > 
> > Yes I wanted to discuss this with you. I could not find anything
> > in the virtio spec that would put requirements on how this
> > status field needs to be aligned. But I did not look to hard.
> > 
> > The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> > of course we do have alignment requirements).
> 
> I think it needs to be doubleword aligned.
> 

I've re-read the part of the PoP that describes the ccw formats. And
it reinforced my position: for IDA and MIDA we need proper alignment,
but if the CCW ain't an indirect one there is no alignment requirement.

QEMU also does not seem to check either.

Can you double-check and provide me with a reference that proves me
wrong if I'm wrong.

> > 
> > Apparently status used to be a normal field, and became a pointer with
> > 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> > data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.
> 
> In the beginning, the code used a below-2G-area for all commands.
> Rather than adding locking to avoid races there, that commit switches
> to allocating the needed structures individually. The status field
> needed to be below 2G, so it needed to be allocated separately.
> 

I get it now. The confusing part was that the field 'area' was about
holding the address of the also previously dynamically allocated
below 2G area that was used for talking to the hypervisor via CCW I/O.

> > 
> > So maybe dropping the TODO comment will do just fine. What do you think?
> 

I still think we just need to drop the comment, as we don't have to
align it.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-10 17:48         ` Halil Pasic
@ 2019-04-11  9:24             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-11  9:24 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Wed, 10 Apr 2019 19:48:49 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 18:36:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 10 Apr 2019 17:12:54 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Wed, 10 Apr 2019 10:46:49 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri,  5 Apr 2019 01:16:20 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:  
> >   
> > > > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > > > index aa45a6a027ae..7268149f2ee8 100644
> > > > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > > > >  struct vcdev_dma_area {
> > > > >  	unsigned long indicators;
> > > > >  	unsigned long indicators2;
> > > > > +	struct vq_config_block config_block;
> > > > > +	__u8 status; /* TODO check __aligned(8); */    
> > > > 
> > > > ...I think that needs attention.    
> > > 
> > > Yes I wanted to discuss this with you. I could not find anything
> > > in the virtio spec that would put requirements on how this
> > > status field needs to be aligned. But I did not look to hard.
> > > 
> > > The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> > > of course we do have alignment requirements).  
> > 
> > I think it needs to be doubleword aligned.
> >   
> 
> I've re-read the part of the PoP that describes the ccw formats. And
> it reinforced my position: for IDA and MIDA we need proper alignment,
> but if the CCW ain't an indirect one there is no alignment requirement.
> 
> QEMU also does not seem to check either.
> 
> Can you double-check and provide me with a reference that proves me
> wrong if I'm wrong.

Ah, it was the ccw itself, not the cda. Indeed, there do not seem to be
any requirements for direct addressing.

> 
> > > 
> > > Apparently status used to be a normal field, and became a pointer with
> > > 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> > > data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.  
> > 
> > In the beginning, the code used a below-2G-area for all commands.
> > Rather than adding locking to avoid races there, that commit switches
> > to allocating the needed structures individually. The status field
> > needed to be below 2G, so it needed to be allocated separately.
> >   
> 
> I get it now. The confusing part was that the field 'area' was about
> holding the address of the also previously dynamically allocated
> below 2G area that was used for talking to the hypervisor via CCW I/O.
> 
> > > 
> > > So maybe dropping the TODO comment will do just fine. What do you think?  
> >   
> 
> I still think we just need to drop the comment, as we don't have to
> align it.

Agreed.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
@ 2019-04-11  9:24             ` Cornelia Huck
  0 siblings, 0 replies; 87+ messages in thread
From: Cornelia Huck @ 2019-04-11  9:24 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Farhan Ali, virtualization, Martin Schwidefsky,
	Viktor Mihajlovski, Janosch Frank

On Wed, 10 Apr 2019 19:48:49 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 10 Apr 2019 18:36:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 10 Apr 2019 17:12:54 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> > > On Wed, 10 Apr 2019 10:46:49 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:
> > >   
> > > > On Fri,  5 Apr 2019 01:16:20 +0200
> > > > Halil Pasic <pasic@linux.ibm.com> wrote:  
> >   
> > > > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > > > index aa45a6a027ae..7268149f2ee8 100644
> > > > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > > > >  struct vcdev_dma_area {
> > > > >  	unsigned long indicators;
> > > > >  	unsigned long indicators2;
> > > > > +	struct vq_config_block config_block;
> > > > > +	__u8 status; /* TODO check __aligned(8); */    
> > > > 
> > > > ...I think that needs attention.    
> > > 
> > > Yes I wanted to discuss this with you. I could not find anything
> > > in the virtio spec that would put requirements on how this
> > > status field needs to be aligned. But I did not look to hard.
> > > 
> > > The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> > > of course we do have alignment requirements).  
> > 
> > I think it needs to be doubleword aligned.
> >   
> 
> I've re-read the part of the PoP that describes the ccw formats. And
> it reinforced my position: for IDA and MIDA we need proper alignment,
> but if the CCW ain't an indirect one there is no alignment requirement.
> 
> QEMU also does not seem to check either.
> 
> Can you double-check and provide me with a reference that proves me
> wrong if I'm wrong.

Ah, it was the ccw itself, not the cda. Indeed, there do not seem to be
any requirements for direct addressing.

> 
> > > 
> > > Apparently status used to be a normal field, and became a pointer with
> > > 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> > > data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.  
> > 
> > In the beginning, the code used a below-2G-area for all commands.
> > Rather than adding locking to avoid races there, that commit switches
> > to allocating the needed structures individually. The status field
> > needed to be below 2G, so it needed to be allocated separately.
> >   
> 
> I get it now. The confusing part was that the field 'area' was about
> holding the address of the also previously dynamically allocated
> below 2G area that was used for talking to the hypervisor via CCW I/O.
> 
> > > 
> > > So maybe dropping the TODO comment will do just fine. What do you think?  
> >   
> 
> I still think we just need to drop the comment, as we don't have to
> align it.

Agreed.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations
  2019-04-11  9:24             ` Cornelia Huck
  (?)
@ 2019-04-11 10:10             ` Halil Pasic
  -1 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-11 10:10 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-s390, Martin Schwidefsky, Sebastian Ott,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Thu, 11 Apr 2019 11:24:46 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 10 Apr 2019 19:48:49 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Wed, 10 Apr 2019 18:36:43 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Wed, 10 Apr 2019 17:12:54 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> > >   
> > > > On Wed, 10 Apr 2019 10:46:49 +0200
> > > > Cornelia Huck <cohuck@redhat.com> wrote:
> > > >   
> > > > > On Fri,  5 Apr 2019 01:16:20 +0200
> > > > > Halil Pasic <pasic@linux.ibm.com> wrote:  
> > >   
> > > > > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > > > > > index aa45a6a027ae..7268149f2ee8 100644
> > > > > > --- a/drivers/s390/virtio/virtio_ccw.c
> > > > > > +++ b/drivers/s390/virtio/virtio_ccw.c
> > > > > > @@ -49,12 +49,12 @@ struct vq_config_block {
> > > > > >  struct vcdev_dma_area {
> > > > > >  	unsigned long indicators;
> > > > > >  	unsigned long indicators2;
> > > > > > +	struct vq_config_block config_block;
> > > > > > +	__u8 status; /* TODO check __aligned(8); */    
> > > > > 
> > > > > ...I think that needs attention.    
> > > > 
> > > > Yes I wanted to discuss this with you. I could not find anything
> > > > in the virtio spec that would put requirements on how this
> > > > status field needs to be aligned. But I did not look to hard.
> > > > 
> > > > The ccw.cda can hold an arbitrary data address AFAIR (for indirect,
> > > > of course we do have alignment requirements).  
> > > 
> > > I think it needs to be doubleword aligned.
> > >   
> > 
> > I've re-read the part of the PoP that describes the ccw formats. And
> > it reinforced my position: for IDA and MIDA we need proper alignment,
> > but if the CCW ain't an indirect one there is no alignment requirement.
> > 
> > QEMU also does not seem to check either.
> > 
> > Can you double-check and provide me with a reference that proves me
> > wrong if I'm wrong.
> 
> Ah, it was the ccw itself, not the cda. Indeed, there do not seem to be
> any requirements for direct addressing.
> 
> > 
> > > > 
> > > > Apparently status used to be a normal field, and became a pointer with
> > > > 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O
> > > > data." (Cornelia Huck, 2013-01-07). I could not quite figure out why.  
> > > 
> > > In the beginning, the code used a below-2G-area for all commands.
> > > Rather than adding locking to avoid races there, that commit switches
> > > to allocating the needed structures individually. The status field
> > > needed to be below 2G, so it needed to be allocated separately.
> > >   
> > 
> > I get it now. The confusing part was that the field 'area' was about
> > holding the address of the also previously dynamically allocated
> > below 2G area that was used for talking to the hypervisor via CCW I/O.
> > 
> > > > 
> > > > So maybe dropping the TODO comment will do just fine. What do you think?  
> > >   
> > 
> > I still think we just need to drop the comment, as we don't have to
> > align it.
> 
> Agreed.
> 

Will do. Thanks for double-checking!

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-04 23:16 ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Halil Pasic
@ 2019-04-11 14:15     ` Sebastian Ott
  2019-04-11 14:15     ` Sebastian Ott
  1 sibling, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-11 14:15 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 5 Apr 2019, Halil Pasic wrote:
> @@ -1593,20 +1609,29 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
>  		return ERR_CAST(sch);
>  
>  	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
> -	if (!io_priv) {
> -		put_device(&sch->dev);
> -		return ERR_PTR(-ENOMEM);
> -	}
> +	if (!io_priv)
> +		goto err_priv;
> +	io_priv->dma_area = cio_dma_zalloc(sizeof(*io_priv->dma_area));

This is called very early - the dma pool is not yet initialized.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
@ 2019-04-11 14:15     ` Sebastian Ott
  0 siblings, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-11 14:15 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 5 Apr 2019, Halil Pasic wrote:
> @@ -1593,20 +1609,29 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
>  		return ERR_CAST(sch);
>  
>  	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
> -	if (!io_priv) {
> -		put_device(&sch->dev);
> -		return ERR_PTR(-ENOMEM);
> -	}
> +	if (!io_priv)
> +		goto err_priv;
> +	io_priv->dma_area = cio_dma_zalloc(sizeof(*io_priv->dma_area));

This is called very early - the dma pool is not yet initialized.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-04 23:16 ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Halil Pasic
@ 2019-04-11 18:25     ` Sebastian Ott
  2019-04-11 18:25     ` Sebastian Ott
  1 sibling, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-11 18:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 5 Apr 2019, Halil Pasic wrote:
> To support protected virtualization cio will need to make sure the
> memory used for communication with the hypervisor is DMA memory.
> 
> Let us introduce a DMA pool to cio that will help us in allocating
> deallocating those chunks of memory.
> 
> We use a gen_pool backed with DMA pages to avoid each allocation
> effectively wasting a page, as we typically allocate much less
> than PAGE_SIZE.

I don't think we should use this global DMA pool. I guess it's OK for
stuff like airq (where we don't have a struct device at hand) but for
CCW we should use the device we have. Yes, this way we waste some memory
but all dma memory a device uses should fit in a page - so the wastage
is not too much.

Sebastian

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
@ 2019-04-11 18:25     ` Sebastian Ott
  0 siblings, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-11 18:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 5 Apr 2019, Halil Pasic wrote:
> To support protected virtualization cio will need to make sure the
> memory used for communication with the hypervisor is DMA memory.
> 
> Let us introduce a DMA pool to cio that will help us in allocating
> deallocating those chunks of memory.
> 
> We use a gen_pool backed with DMA pages to avoid each allocation
> effectively wasting a page, as we typically allocate much less
> than PAGE_SIZE.

I don't think we should use this global DMA pool. I guess it's OK for
stuff like airq (where we don't have a struct device at hand) but for
CCW we should use the device we have. Yes, this way we waste some memory
but all dma memory a device uses should fit in a page - so the wastage
is not too much.

Sebastian

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-11 18:25     ` Sebastian Ott
  (?)
@ 2019-04-12 11:20     ` Halil Pasic
  2019-04-12 12:12         ` Sebastian Ott
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-12 11:20 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> On Fri, 5 Apr 2019, Halil Pasic wrote:
> > To support protected virtualization cio will need to make sure the
> > memory used for communication with the hypervisor is DMA memory.
> > 
> > Let us introduce a DMA pool to cio that will help us in allocating
> > deallocating those chunks of memory.
> > 
> > We use a gen_pool backed with DMA pages to avoid each allocation
> > effectively wasting a page, as we typically allocate much less
> > than PAGE_SIZE.
> 
> I don't think we should use this global DMA pool. I guess it's OK for
> stuff like airq (where we don't have a struct device at hand) but for
> CCW we should use the device we have. Yes, this way we waste some memory
> but all dma memory a device uses should fit in a page - so the wastage
> is not too much.
> 

Is what you envision an own gen_pool on for each subchannel (e.g. a
struct io_subchannel member)?

I'm struggling with understanding the expected benefits of a
per-subchannel pool/allocator. Can you please tell me what benefits do
you expect (over the current approach)?

I understand you idea is to keep the CIO global pool for stuff that can
not be tied to a single device (i.e. ariq). So the per device stuff would
also mean more code. Would you be OK with postponing this alleged
enhancement (i.e. implement it as a patch on top of this series)?

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio
  2019-04-11 14:15     ` Sebastian Ott
  (?)
@ 2019-04-12 11:29     ` Halil Pasic
  -1 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-12 11:29 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Thu, 11 Apr 2019 16:15:02 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> On Fri, 5 Apr 2019, Halil Pasic wrote:
> > @@ -1593,20 +1609,29 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv)
> >  		return ERR_CAST(sch);
> >  
> >  	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
> > -	if (!io_priv) {
> > -		put_device(&sch->dev);
> > -		return ERR_PTR(-ENOMEM);
> > -	}
> > +	if (!io_priv)
> > +		goto err_priv;
> > +	io_priv->dma_area = cio_dma_zalloc(sizeof(*io_priv->dma_area));
> 
> This is called very early - the dma pool is not yet initialized.
> 

Hm, I will have to have better look into this. I kind of assumed we
can't/don't create any ccw_device instances before we have a
channel_subsystem, as naive as I am.

Thank you very much for spotting this! Looking forward to more
review ;)

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-12 11:20     ` Halil Pasic
@ 2019-04-12 12:12         ` Sebastian Ott
  0 siblings, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-12 12:12 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 12 Apr 2019, Halil Pasic wrote:
> On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > I don't think we should use this global DMA pool. I guess it's OK for
> > stuff like airq (where we don't have a struct device at hand) but for
> > CCW we should use the device we have. Yes, this way we waste some memory
> > but all dma memory a device uses should fit in a page - so the wastage
> > is not too much.
> > 
> 
> Is what you envision an own gen_pool on for each subchannel (e.g. a
> struct io_subchannel member)?

Either that or if that's too much overhead simply map a page and create
a struct containing the few dma areas for that device.

> I'm struggling with understanding the expected benefits of a
> per-subchannel pool/allocator. Can you please tell me what benefits do
> you expect (over the current approach)?

Logically DMA is a capability of a device and the whole DMA API is build
around devices. Working around that just feels wrong. For practical
matters: DMA debugging will complain about misuse of a specific device or
driver.

> I understand you idea is to keep the CIO global pool for stuff that can
> not be tied to a single device (i.e. ariq). So the per device stuff would
> also mean more code. Would you be OK with postponing this alleged
> enhancement (i.e. implement it as a patch on top of this series)?

I don't like it but it's just in-kernel usage which we can change at any
time. So if it helps you to do it that way, why not..

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
@ 2019-04-12 12:12         ` Sebastian Ott
  0 siblings, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-12 12:12 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 12 Apr 2019, Halil Pasic wrote:
> On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > I don't think we should use this global DMA pool. I guess it's OK for
> > stuff like airq (where we don't have a struct device at hand) but for
> > CCW we should use the device we have. Yes, this way we waste some memory
> > but all dma memory a device uses should fit in a page - so the wastage
> > is not too much.
> > 
> 
> Is what you envision an own gen_pool on for each subchannel (e.g. a
> struct io_subchannel member)?

Either that or if that's too much overhead simply map a page and create
a struct containing the few dma areas for that device.

> I'm struggling with understanding the expected benefits of a
> per-subchannel pool/allocator. Can you please tell me what benefits do
> you expect (over the current approach)?

Logically DMA is a capability of a device and the whole DMA API is build
around devices. Working around that just feels wrong. For practical
matters: DMA debugging will complain about misuse of a specific device or
driver.

> I understand you idea is to keep the CIO global pool for stuff that can
> not be tied to a single device (i.e. ariq). So the per device stuff would
> also mean more code. Would you be OK with postponing this alleged
> enhancement (i.e. implement it as a patch on top of this series)?

I don't like it but it's just in-kernel usage which we can change at any
time. So if it helps you to do it that way, why not..

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
  2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
@ 2019-04-12 13:47   ` David Hildenbrand
  2019-04-04 23:16 ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Halil Pasic
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 87+ messages in thread
From: David Hildenbrand @ 2019-04-12 13:47 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On 05.04.19 01:16, Halil Pasic wrote:
> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core,  virtio-ccw wasn't changed accordingly.

Can you elaborate some more about the general approach (Enhanced
virtualization protection technology, ultravisor, concept, issues, how
to squeeze it into QEMU/KVM/kernel) etc =

For my taste, this cover letter misses some important context :)

> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to this is:
> * use some 'new' common virtio stuff
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).
> 
> The series is structured in incremental fashion: some of the changes are
> overridden by following patches. The main reason why is that this is how I
> developed. But I think it ain't bad for the didactic and we are a bit more
> flexible with regards to throwing out some of the stuff in the end.
> 
> Important notes:
> 
> * This is an early (WIP) RFC that does not add any function to the
>   kernel at his stage, as the ultravisor interactions are left out.
>   The purpose is getting some early feedback ASAP.
> 
> * In future these patches will depend on some code interacting with the
>   ultravisor (WIP by Vasily and Janosch).
> 
> * The s390 names are by no means final, and are not properly explained. Should
>   not hamper understanding too much. If it does please ask.
> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.
> 
> 
> Testing:
> 
> Please use iommu_platform=on for any virtio devices you are going
> to test this code with (so virtio actually uses the DMA API).
> 
> Looking forward to your review or any other type of input.
> 
> Halil Pasic (12):
>   virtio/s390: use vring_create_virtqueue
>   virtio/s390: DMA support for virtio-ccw
>   s390/mm: force swiotlb for protected virtualization
>   s390/cio: introduce cio DMA pool
>   s390/cio: add protected virtualization support to cio
>   s390/airq: use DMA memory for adapter interrupts
>   virtio/s390: use DMA memory for ccw I/O
>   virtio/s390: add indirection to indicators access
>   virtio/s390: use DMA memory for notifiers
>   virtio/s390: consolidate DMA allocations
>   virtio/s390: use the cio DMA pool
>   virtio/s390:  make airq summary indicators DMA
> 
>  arch/s390/Kconfig                   |   5 +
>  arch/s390/include/asm/Kbuild        |   1 -
>  arch/s390/include/asm/airq.h        |   2 +
>  arch/s390/include/asm/cio.h         |   4 +
>  arch/s390/include/asm/dma-mapping.h |  13 ++
>  arch/s390/include/asm/mem_encrypt.h |  18 +++
>  arch/s390/mm/init.c                 |  44 +++++
>  drivers/s390/cio/airq.c             |  18 ++-
>  drivers/s390/cio/ccwreq.c           |   8 +-
>  drivers/s390/cio/css.c              |  63 ++++++++
>  drivers/s390/cio/device.c           |  46 ++++--
>  drivers/s390/cio/device_fsm.c       |  40 ++---
>  drivers/s390/cio/device_id.c        |  18 +--
>  drivers/s390/cio/device_ops.c       |   4 +-
>  drivers/s390/cio/device_pgid.c      |  20 +--
>  drivers/s390/cio/device_status.c    |  24 +--
>  drivers/s390/cio/io_sch.h           |  19 ++-
>  drivers/s390/virtio/virtio_ccw.c    | 310 ++++++++++++++++++++----------------
>  18 files changed, 444 insertions(+), 213 deletions(-)
>  create mode 100644 arch/s390/include/asm/dma-mapping.h
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
@ 2019-04-12 13:47   ` David Hildenbrand
  0 siblings, 0 replies; 87+ messages in thread
From: David Hildenbrand @ 2019-04-12 13:47 UTC (permalink / raw)
  To: Halil Pasic, kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott
  Cc: Eric Farman, Viktor Mihajlovski, Janosch Frank, Vasily Gorbik,
	Farhan Ali, virtualization, Claudio Imbrenda

On 05.04.19 01:16, Halil Pasic wrote:
> Enhanced virtualization protection technology may require the use of
> bounce buffers for I/O. While support for this was built into the virtio
> core,  virtio-ccw wasn't changed accordingly.

Can you elaborate some more about the general approach (Enhanced
virtualization protection technology, ultravisor, concept, issues, how
to squeeze it into QEMU/KVM/kernel) etc =

For my taste, this cover letter misses some important context :)

> 
> Thus what needs to be done to bring virtio-ccw up to speed with respect
> to this is:
> * use some 'new' common virtio stuff
> * make sure that virtio-ccw specific stuff uses shared memory when
>   talking to the hypervisor (except communication blocks like ORB, these
>   are handled by the hypervisor)
> * make sure the DMA API does what is necessary to talk through shared
>   memory if we are a protected virtualization guest.
> * make sure the common IO layer plays along as well (airqs, sense).
> 
> The series is structured in incremental fashion: some of the changes are
> overridden by following patches. The main reason why is that this is how I
> developed. But I think it ain't bad for the didactic and we are a bit more
> flexible with regards to throwing out some of the stuff in the end.
> 
> Important notes:
> 
> * This is an early (WIP) RFC that does not add any function to the
>   kernel at his stage, as the ultravisor interactions are left out.
>   The purpose is getting some early feedback ASAP.
> 
> * In future these patches will depend on some code interacting with the
>   ultravisor (WIP by Vasily and Janosch).
> 
> * The s390 names are by no means final, and are not properly explained. Should
>   not hamper understanding too much. If it does please ask.
> 
> * The existing naming in the common infrastructure (kernel internal
>   interfaces) is pretty much based on the AMD SEV terminology. Thus the
>   names aren't always perfect. There might be merit to changing these
>   names to more abstract ones. I did not put much thought into that at
>   the current stage.
> 
> 
> Testing:
> 
> Please use iommu_platform=on for any virtio devices you are going
> to test this code with (so virtio actually uses the DMA API).
> 
> Looking forward to your review or any other type of input.
> 
> Halil Pasic (12):
>   virtio/s390: use vring_create_virtqueue
>   virtio/s390: DMA support for virtio-ccw
>   s390/mm: force swiotlb for protected virtualization
>   s390/cio: introduce cio DMA pool
>   s390/cio: add protected virtualization support to cio
>   s390/airq: use DMA memory for adapter interrupts
>   virtio/s390: use DMA memory for ccw I/O
>   virtio/s390: add indirection to indicators access
>   virtio/s390: use DMA memory for notifiers
>   virtio/s390: consolidate DMA allocations
>   virtio/s390: use the cio DMA pool
>   virtio/s390:  make airq summary indicators DMA
> 
>  arch/s390/Kconfig                   |   5 +
>  arch/s390/include/asm/Kbuild        |   1 -
>  arch/s390/include/asm/airq.h        |   2 +
>  arch/s390/include/asm/cio.h         |   4 +
>  arch/s390/include/asm/dma-mapping.h |  13 ++
>  arch/s390/include/asm/mem_encrypt.h |  18 +++
>  arch/s390/mm/init.c                 |  44 +++++
>  drivers/s390/cio/airq.c             |  18 ++-
>  drivers/s390/cio/ccwreq.c           |   8 +-
>  drivers/s390/cio/css.c              |  63 ++++++++
>  drivers/s390/cio/device.c           |  46 ++++--
>  drivers/s390/cio/device_fsm.c       |  40 ++---
>  drivers/s390/cio/device_id.c        |  18 +--
>  drivers/s390/cio/device_ops.c       |   4 +-
>  drivers/s390/cio/device_pgid.c      |  20 +--
>  drivers/s390/cio/device_status.c    |  24 +--
>  drivers/s390/cio/io_sch.h           |  19 ++-
>  drivers/s390/virtio/virtio_ccw.c    | 310 ++++++++++++++++++++----------------
>  18 files changed, 444 insertions(+), 213 deletions(-)
>  create mode 100644 arch/s390/include/asm/dma-mapping.h
>  create mode 100644 arch/s390/include/asm/mem_encrypt.h
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-12 12:12         ` Sebastian Ott
  (?)
@ 2019-04-12 15:30         ` Halil Pasic
  2019-04-16 12:50             ` Sebastian Ott
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-12 15:30 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 12 Apr 2019 14:12:31 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> On Fri, 12 Apr 2019, Halil Pasic wrote:
> > On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > I don't think we should use this global DMA pool. I guess it's OK for
> > > stuff like airq (where we don't have a struct device at hand) but for
> > > CCW we should use the device we have. Yes, this way we waste some memory
> > > but all dma memory a device uses should fit in a page - so the wastage
> > > is not too much.

Regarding the wastage. Let us do the math together in search for an
upper (wastage) limit.

#define __MAX_CSSID 0
#define __MAX_SUBCHANNEL 65535
#define __MAX_SSID 3 

that is a maximum of
2^16 devices per subchannel set x 2^2 subchannel sets * 2^0 css
== 2^18 devices at most.

With your a page per device we have max 2^30 = 2^18 + 2^12 bytes.

That is exactly 1GB. And because some of the stuff needs to be
31 bit addressable all of that 1GB would have to be under 2GB. 

Currently we need at least 224 bytes per device that is ~ 6%
of a PAGE_SIZE.

> > > 
> > 
> > Is what you envision an own gen_pool on for each subchannel (e.g. a
> > struct io_subchannel member)?
> 
> Either that or if that's too much overhead simply map a page and create
> a struct containing the few dma areas for that device.

In virtio-ccw we do dynamic allocations of DMA memory. And we could
theoretically do the same elsewhere. So I don't think a struct with
a few dma areas would do (assumed I understood you correctly, which
isn't very likely).

> 
> > I'm struggling with understanding the expected benefits of a
> > per-subchannel pool/allocator. Can you please tell me what benefits do
> > you expect (over the current approach)?
> 
> Logically DMA is a capability of a device and the whole DMA API is build
> around devices. 

I agree the whole DMA API is built around devices. IMHO DMA is indeed on
one hand a capability of a device, but it is also a capability of a
machine (system).

Normally (PCI) limitations can come either from the device side (e.g.
device supports only 32 bit addressing) or from the machine/bus side.

In our case however all ccw devices have the same
limitations/requirements. E.g. if you want to do a SSCH your ORB and
your CCWs will have to have to be 31 bit addressable (physical). Your
data can be above 2G if MIDA is available and used. None of this depends
on the device we are talking to, but dictated by the properties of
the architecture and the machine. Another important thing to notice is
that for CCW IO isolation is done very differently than for PCI. 

> Working around that just feels wrong.

Then the airq iv->vector should be a per-device thing as well, or?

> For practical
> matters: DMA debugging will complain about misuse of a specific device or
> driver.
> 

Do you mean CONFIG_DMA_API_DEBUG and CONFIG_DMA_API_DEBUG_SG? I've been
running with those and did not see any complaints. Maybe we should
clarify this one offline...

> > I understand you idea is to keep the CIO global pool for stuff that can
> > not be tied to a single device (i.e. ariq). So the per device stuff would
> > also mean more code. Would you be OK with postponing this alleged
> > enhancement (i.e. implement it as a patch on top of this series)?
> 
> I don't like it but it's just in-kernel usage which we can change at any
> time. So if it helps you to do it that way, why not..
> 

Right. There should be no external interface implications to changing
this AFAICT. I prefer seeing this matter as not substantial for what I'm
trying to accomplish with this series.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
  2019-04-12 13:47   ` David Hildenbrand
  (?)
@ 2019-04-16 11:10   ` Halil Pasic
  2019-04-16 11:50       ` David Hildenbrand
  -1 siblings, 1 reply; 87+ messages in thread
From: Halil Pasic @ 2019-04-16 11:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On Fri, 12 Apr 2019 15:47:50 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 05.04.19 01:16, Halil Pasic wrote:
> > Enhanced virtualization protection technology may require the use of
> > bounce buffers for I/O. While support for this was built into the virtio
> > core,  virtio-ccw wasn't changed accordingly.
> 
> Can you elaborate some more about the general approach (Enhanced
> virtualization protection technology, ultravisor, concept, issues, how
> to squeeze it into QEMU/KVM/kernel) etc =
> 
> For my taste, this cover letter misses some important context :)
> 

I'm aware. Unfortunately we don't have a decision yet about which parts
of the protected virtualization architecture are going to be PoP
material.

You can get some more context immediately by having a look at Martin's
features branch and looking at the s390/protvirt  and s390/uv patches.

I will try to provide more background information for v1. But having
a remotely complete and reliable documentation will take some time. What
I can offer at the moment is answers to specific questions.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
  2019-04-16 11:10   ` Halil Pasic
@ 2019-04-16 11:50       ` David Hildenbrand
  0 siblings, 0 replies; 87+ messages in thread
From: David Hildenbrand @ 2019-04-16 11:50 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	Sebastian Ott, virtualization, Christian Borntraeger,
	Viktor Mihajlovski, Vasily Gorbik, Janosch Frank,
	Claudio Imbrenda, Farhan Ali, Eric Farman

On 16.04.19 13:10, Halil Pasic wrote:
> On Fri, 12 Apr 2019 15:47:50 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 05.04.19 01:16, Halil Pasic wrote:
>>> Enhanced virtualization protection technology may require the use of
>>> bounce buffers for I/O. While support for this was built into the virtio
>>> core,  virtio-ccw wasn't changed accordingly.
>>
>> Can you elaborate some more about the general approach (Enhanced
>> virtualization protection technology, ultravisor, concept, issues, how
>> to squeeze it into QEMU/KVM/kernel) etc =
>>
>> For my taste, this cover letter misses some important context :)
>>
> 
> I'm aware. Unfortunately we don't have a decision yet about which parts
> of the protected virtualization architecture are going to be PoP
> material.

Oh, okay.

> 
> You can get some more context immediately by having a look at Martin's
> features branch and looking at the s390/protvirt  and s390/uv patches.

As I don't have time to dig through random branches to
discover/reverse-engineer the obvious, I won't be reviewing this patch
series. But as I am not an I/O expert, this might not be bad at all :)

Maybe other people can help.

> 
> I will try to provide more background information for v1. But having
> a remotely complete and reliable documentation will take some time. What
> I can offer at the moment is answers to specific questions.

Waiting for v1 then.

Cheers!

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 00/12] s390: virtio: support protected virtualization
@ 2019-04-16 11:50       ` David Hildenbrand
  0 siblings, 0 replies; 87+ messages in thread
From: David Hildenbrand @ 2019-04-16 11:50 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Vasily Gorbik, linux-s390, Eric Farman, Claudio Imbrenda, kvm,
	Sebastian Ott, Cornelia Huck, virtualization, Martin Schwidefsky,
	Farhan Ali, Viktor Mihajlovski, Janosch Frank

On 16.04.19 13:10, Halil Pasic wrote:
> On Fri, 12 Apr 2019 15:47:50 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 05.04.19 01:16, Halil Pasic wrote:
>>> Enhanced virtualization protection technology may require the use of
>>> bounce buffers for I/O. While support for this was built into the virtio
>>> core,  virtio-ccw wasn't changed accordingly.
>>
>> Can you elaborate some more about the general approach (Enhanced
>> virtualization protection technology, ultravisor, concept, issues, how
>> to squeeze it into QEMU/KVM/kernel) etc =
>>
>> For my taste, this cover letter misses some important context :)
>>
> 
> I'm aware. Unfortunately we don't have a decision yet about which parts
> of the protected virtualization architecture are going to be PoP
> material.

Oh, okay.

> 
> You can get some more context immediately by having a look at Martin's
> features branch and looking at the s390/protvirt  and s390/uv patches.

As I don't have time to dig through random branches to
discover/reverse-engineer the obvious, I won't be reviewing this patch
series. But as I am not an I/O expert, this might not be bad at all :)

Maybe other people can help.

> 
> I will try to provide more background information for v1. But having
> a remotely complete and reliable documentation will take some time. What
> I can offer at the moment is answers to specific questions.

Waiting for v1 then.

Cheers!

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-12 15:30         ` Halil Pasic
@ 2019-04-16 12:50             ` Sebastian Ott
  0 siblings, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-16 12:50 UTC (permalink / raw)
  To: Halil Pasic
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Fri, 12 Apr 2019, Halil Pasic wrote:
> On Fri, 12 Apr 2019 14:12:31 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > On Fri, 12 Apr 2019, Halil Pasic wrote:
> > > On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> > > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > > I don't think we should use this global DMA pool. I guess it's OK for
> > > > stuff like airq (where we don't have a struct device at hand) but for
> > > > CCW we should use the device we have. Yes, this way we waste some memory
> > > > but all dma memory a device uses should fit in a page - so the wastage
> > > > is not too much.
> 
> Regarding the wastage. Let us do the math together in search for an
> upper (wastage) limit.
[...]
> Currently we need at least 224 bytes per device that is ~ 6%
> of a PAGE_SIZE.

Yes, we basically waste the whole page. I'm ok with that if the benefit is
to play nice with the kernel APIs.

> > For practical
> > matters: DMA debugging will complain about misuse of a specific device or
> > driver.
> > 
> 
> Do you mean CONFIG_DMA_API_DEBUG and CONFIG_DMA_API_DEBUG_SG? I've been
> running with those and did not see any complaints. Maybe we should
> clarify this one offline...

I didn't mean to imply that there are bugs already - just that when used
as intended the DMA_DEBUG_API can complain about stuff like "your device
is gone but you have still DMA memory set up for it" which will not work
if you don't use the correct device...

Sebastian

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
@ 2019-04-16 12:50             ` Sebastian Ott
  0 siblings, 0 replies; 87+ messages in thread
From: Sebastian Ott @ 2019-04-16 12:50 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Farhan Ali, linux-s390, Eric Farman, Claudio Imbrenda,
	Vasily Gorbik, kvm, Cornelia Huck, virtualization,
	Martin Schwidefsky, Viktor Mihajlovski, Janosch Frank

On Fri, 12 Apr 2019, Halil Pasic wrote:
> On Fri, 12 Apr 2019 14:12:31 +0200 (CEST)
> Sebastian Ott <sebott@linux.ibm.com> wrote:
> > On Fri, 12 Apr 2019, Halil Pasic wrote:
> > > On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> > > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > > I don't think we should use this global DMA pool. I guess it's OK for
> > > > stuff like airq (where we don't have a struct device at hand) but for
> > > > CCW we should use the device we have. Yes, this way we waste some memory
> > > > but all dma memory a device uses should fit in a page - so the wastage
> > > > is not too much.
> 
> Regarding the wastage. Let us do the math together in search for an
> upper (wastage) limit.
[...]
> Currently we need at least 224 bytes per device that is ~ 6%
> of a PAGE_SIZE.

Yes, we basically waste the whole page. I'm ok with that if the benefit is
to play nice with the kernel APIs.

> > For practical
> > matters: DMA debugging will complain about misuse of a specific device or
> > driver.
> > 
> 
> Do you mean CONFIG_DMA_API_DEBUG and CONFIG_DMA_API_DEBUG_SG? I've been
> running with those and did not see any complaints. Maybe we should
> clarify this one offline...

I didn't mean to imply that there are bugs already - just that when used
as intended the DMA_DEBUG_API can complain about stuff like "your device
is gone but you have still DMA memory set up for it" which will not work
if you don't use the correct device...

Sebastian

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [RFC PATCH 04/12] s390/cio: introduce cio DMA pool
  2019-04-16 12:50             ` Sebastian Ott
  (?)
@ 2019-04-16 13:31             ` Halil Pasic
  -1 siblings, 0 replies; 87+ messages in thread
From: Halil Pasic @ 2019-04-16 13:31 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: kvm, linux-s390, Cornelia Huck, Martin Schwidefsky,
	virtualization, Christian Borntraeger, Viktor Mihajlovski,
	Vasily Gorbik, Janosch Frank, Claudio Imbrenda, Farhan Ali,
	Eric Farman

On Tue, 16 Apr 2019 14:50:14 +0200 (CEST)
Sebastian Ott <sebott@linux.ibm.com> wrote:

> On Fri, 12 Apr 2019, Halil Pasic wrote:
> > On Fri, 12 Apr 2019 14:12:31 +0200 (CEST)
> > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > On Fri, 12 Apr 2019, Halil Pasic wrote:
> > > > On Thu, 11 Apr 2019 20:25:01 +0200 (CEST)
> > > > Sebastian Ott <sebott@linux.ibm.com> wrote:
> > > > > I don't think we should use this global DMA pool. I guess it's OK for
> > > > > stuff like airq (where we don't have a struct device at hand) but for
> > > > > CCW we should use the device we have. Yes, this way we waste some memory
> > > > > but all dma memory a device uses should fit in a page - so the wastage
> > > > > is not too much.
> > 
> > Regarding the wastage. Let us do the math together in search for an
> > upper (wastage) limit.
> [...]
> > Currently we need at least 224 bytes per device that is ~ 6%
> > of a PAGE_SIZE.
> 
> Yes, we basically waste the whole page. I'm ok with that if the benefit is
> to play nice with the kernel APIs.
> 
> > > For practical
> > > matters: DMA debugging will complain about misuse of a specific device or
> > > driver.
> > > 
> > 
> > Do you mean CONFIG_DMA_API_DEBUG and CONFIG_DMA_API_DEBUG_SG? I've been
> > running with those and did not see any complaints. Maybe we should
> > clarify this one offline...
> 
> I didn't mean to imply that there are bugs already - just that when used
> as intended the DMA_DEBUG_API can complain about stuff like "your device
> is gone but you have still DMA memory set up for it" which will not work
> if you don't use the correct device...
> 

Right. In fact the 'real' allocations happen using gen_pool, and the pool
never shrinks.

IMHO as so often in software engineering we have a trade-off here. I'm
still not convinced I traded badly here, but I will take the request of
yours to tie the dma allocations to a more the device requiring the dma
as a maintainers request, and accommodate it in v1.

Are you fine with having a similar gen_pool backed with dma_pages on a
per struct io_subchannel_private basis?

Based on our offline chat I think you are, but but better safe than sorry.

Regards,
Halil

^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2019-04-16 13:31 UTC | newest]

Thread overview: 87+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-04 23:16 [RFC PATCH 00/12] s390: virtio: support protected virtualization Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue Halil Pasic
2019-04-08 11:01   ` Cornelia Huck
2019-04-08 11:01     ` Cornelia Huck
2019-04-08 12:37     ` Michael S. Tsirkin
2019-04-08 12:37       ` Michael S. Tsirkin
2019-04-08 13:20     ` Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 02/12] virtio/s390: DMA support for virtio-ccw Halil Pasic
2019-04-09  9:57   ` Cornelia Huck
2019-04-09  9:57     ` Cornelia Huck
2019-04-09 11:29     ` Halil Pasic
2019-04-09 13:01       ` Cornelia Huck
2019-04-09 13:01         ` Cornelia Huck
2019-04-09 13:23         ` Halil Pasic
2019-04-09 15:47           ` Cornelia Huck
2019-04-09 15:47             ` Cornelia Huck
2019-04-04 23:16 ` [RFC PATCH 03/12] s390/mm: force swiotlb for protected virtualization Halil Pasic
2019-04-09 10:16   ` Cornelia Huck
2019-04-09 10:16     ` Cornelia Huck
2019-04-09 10:54     ` Halil Pasic
2019-04-09 17:18       ` Cornelia Huck
2019-04-09 17:18         ` Cornelia Huck
2019-04-09 12:22   ` Christoph Hellwig
2019-04-09 12:22     ` Christoph Hellwig
2019-04-09 12:39     ` Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 04/12] s390/cio: introduce cio DMA pool Halil Pasic
2019-04-09 10:44   ` Cornelia Huck
2019-04-09 10:44     ` Cornelia Huck
2019-04-09 12:11     ` Halil Pasic
2019-04-09 17:14       ` Cornelia Huck
2019-04-09 17:14         ` Cornelia Huck
2019-04-10 15:31         ` Halil Pasic
2019-04-10 16:07           ` Cornelia Huck
2019-04-10 16:07             ` Cornelia Huck
2019-04-10 16:52             ` Halil Pasic
2019-04-11 18:25   ` Sebastian Ott
2019-04-11 18:25     ` Sebastian Ott
2019-04-12 11:20     ` Halil Pasic
2019-04-12 12:12       ` Sebastian Ott
2019-04-12 12:12         ` Sebastian Ott
2019-04-12 15:30         ` Halil Pasic
2019-04-16 12:50           ` Sebastian Ott
2019-04-16 12:50             ` Sebastian Ott
2019-04-16 13:31             ` Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 05/12] s390/cio: add protected virtualization support to cio Halil Pasic
2019-04-09 17:55   ` Cornelia Huck
2019-04-09 17:55     ` Cornelia Huck
2019-04-10  0:10     ` Halil Pasic
2019-04-10  8:25       ` Cornelia Huck
2019-04-10  8:25         ` Cornelia Huck
2019-04-10 13:02         ` Halil Pasic
2019-04-10 16:16           ` Cornelia Huck
2019-04-10 16:16             ` Cornelia Huck
2019-04-11 14:15   ` Sebastian Ott
2019-04-11 14:15     ` Sebastian Ott
2019-04-12 11:29     ` Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 06/12] s390/airq: use DMA memory for adapter interrupts Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 07/12] virtio/s390: use DMA memory for ccw I/O Halil Pasic
2019-04-10  8:42   ` Cornelia Huck
2019-04-10  8:42     ` Cornelia Huck
2019-04-10 14:42     ` Halil Pasic
2019-04-10 16:21       ` Cornelia Huck
2019-04-10 16:21         ` Cornelia Huck
2019-04-04 23:16 ` [RFC PATCH 08/12] virtio/s390: add indirection to indicators access Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 09/12] virtio/s390: use DMA memory for notifiers Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 10/12] virtio/s390: consolidate DMA allocations Halil Pasic
2019-04-10  8:46   ` Cornelia Huck
2019-04-10  8:46     ` Cornelia Huck
2019-04-10 15:12     ` Halil Pasic
2019-04-10 16:36       ` Cornelia Huck
2019-04-10 16:36         ` Cornelia Huck
2019-04-10 17:48         ` Halil Pasic
2019-04-11  9:24           ` Cornelia Huck
2019-04-11  9:24             ` Cornelia Huck
2019-04-11 10:10             ` Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 11/12] virtio/s390: use the cio DMA pool Halil Pasic
2019-04-04 23:16 ` [RFC PATCH 12/12] virtio/s390: make airq summary indicators DMA Halil Pasic
2019-04-10  9:20 ` [RFC PATCH 00/12] s390: virtio: support protected virtualization Cornelia Huck
2019-04-10  9:20   ` Cornelia Huck
2019-04-10 15:57   ` Halil Pasic
2019-04-10 16:24     ` Cornelia Huck
2019-04-10 16:24       ` Cornelia Huck
2019-04-12 13:47 ` David Hildenbrand
2019-04-12 13:47   ` David Hildenbrand
2019-04-16 11:10   ` Halil Pasic
2019-04-16 11:50     ` David Hildenbrand
2019-04-16 11:50       ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.